Using a grid-search approach to validate the Graves–Pitarka broadband simulation method

This work assesses the ability of the Graves–Pitarka simulation approach to reproduce observed ground motions for 12 California and Baja California earthquakes. A total of 240 realizations are computed for each earthquake and compared with recorded strong motions from near-fault sites. In addition to spatial variability in slip, each realization samples from discrete combinations of average rupture speed, rise time, and down-dip fault width. Ground motions for each realization of a given earthquake are compared to the observations using a pseudo-spectral acceleration goodness-of-fit (GoF) metric. Parameters for the lowest misfit cases are then tabulated to develop relations for estimating median values and ranges for future applications. The results are generally consistent with published scaling relations for estimating rupture dimensions as a function of magnitude. Additionally, I find the average rise time scales as 2.3 × α T × 10 − 9 × M 1 / 3 o ( M o is seismic moment in dyne-cm) and average rupture speed as ( 0.765 ± 0.075 ) × α − 1 T × V s ( V s is local shear wave velocity), where a T is a mechanism adjustment ranging from a value of 1.0 for pure strike-slip to 0.9 for pure reverse-slip cases. There are cases where slightly different combinations of parameters produce equally good fits to the observations, demonstrating the non-uniqueness of using a single GoF metric in this approach. None-theless, the results reinforce the importance of adequately sampling ranges of rupture parameters when performing validations, as well as when simulating ground motions for future events.


Introduction
Recent years have seen the development of several broadband (0-20 Hz) earthquake ground motion simulation techniques (e.g., Hartzell et al. 1999;Liu et al. 2006;Frankel 2009;Graves and Pitarka 2010;Mai et al. 2010;Song et al. 2014;Crempien and Archuleta 2015;Olsen and Takedatsu 2015;Pitarka et al. 2017).These approaches are attractive in that they can be used to simulate situations for which recorded motions are not readily available (e.g., large magnitude and close distance to rupture), and they also provide full waveform time series that can be directly input to dynamic structural analyses (e.g., Zhong et al. 2020).The most desirable formulation for these simulations is the so-called "broadband deterministic" approach in which the full response is computed in a single computation (e.g., Graves and Pitarka 2016;Withers et al. 2019;Rodgers et al. 2020).However, this fully deterministic approach requires specification of the rupture process and geologic media at relatively short length scales (e.g., ~ 10 m), which is not only computationally expensive, but also strains the limits of our knowledge.To circumvent these limitations, most broadband simulation methods use a hybrid technique in which the lowfrequency response is computed deterministically (i.e., the "deterministic" approach) and the high-frequency response is computed stochastically (i.e., the "stochastic" approach), and then these separate responses are combined into a single broadband response using a process called match-filtering (e.g., Hartzell and Heaton 1995;Kamae et al. 1998).Note that since both the "deterministic" and "stochastic" approaches utilize deterministic and stochastic features, the preceding classification is not strictly correct.However, due to its traditional acceptance, I will continue to employ this terminology.
An important component in the development of broadband simulation techniques is validation against observed data, such as that performed by the Southern California Earthquake Center (SCEC) Broadband Simulation Platform (BBP) (e.g., Dreger et al. 2015;Goulet et al. 2015).The BBP (github.com/SCECcode/bbp)provides a transparent validation framework where the codes are installed and verified by the developers, but then the validations can be run independently using a consistent set of input parameters.The most basic kinematic parameters describing an earthquake rupture are magnitude, hypocenter, slip mechanism, rise time, rupture speed, and fault dimensions.Given these inputs for a particular earthquake, the validation runs on the BBP sample a suite of randomized slip distributions (typically ~ 50) generated by each method to see if an acceptable fit to the observations can be found.Most of the simulation approaches on the BBP fall into the hybrid classification, which means a balance must be found between specification of kinematic (deterministic) and stochastic rupture parameters.For the Graves-Pitarka (GP hereafter) hybrid simulation method (Graves and Pitarka 2010, 2015, 2016), this is done by relating the stochastic parameters directly to the kinematic parameters, when possible.While magnitude, hypocenter, slip mechanism, and fault length are usually relatively well-constrained, other parameters, such as rupture speed, slip rise time and down-dip fault width, are less well-constrained (e.g., Mai et al. 2016).However, these latter parameters are very important for ground motion simulation.For example, rupture speed and rise time can trade off with one another, and can also affect the corner frequency, which scales the mid-to higherfrequency motions (e.g., Brune 1970;Ji and Archuleta 2021), and fault width trades off with average slip, which directly scales the lower-frequency motions (e.g., Kanamori and Anderson 1975;Yamada et al. 2009).
In order to address this uncertainty within the BBP framework, I propose a validation approach that not only samples different slip distributions, but also samples variations in average rupture speed, down-dip fault width, and slip rise time.This is done by using a grid of discrete samples of these parameters where each realization gets a unique combination of values, which is then used to generate a rupture model with randomized slip distribution.In this study, I validate this approach using near-fault data from 12 California and Baja California earthquakes (Fig. 1).These events span the magnitude range 5.9 ≤ M w ≤ 7.3 (throughout this article, magnitude refers to moment magnitude, M w , as defined by Kanamori 1977) with nine primarily strike-or oblique-slip events, and three reverse-slip events.All rupture models in this study are created using the most recent version of the GP kinematic rupture generator (Graves and Pitarka 2016), which includes the modification to the shallow slip rate described by Pitarka et al. (2021) to better account for rupture through dynamically weak fault zone materials.One notable event that I have excluded from the current analysis is the 2003 M w 6.5 San Simeon earthquake.This was done due to the sparsity of nearfault (roughly defined as within one fault length of the rupture) recordings for this event, which are necessary to provide sufficient constraints on the rupture process.More details of the 12 validation earthquakes can be found in the Additional file 4. The goals of this validation work are to demonstrate that given a general description of a past earthquake, the methodology can create ruptures sharing similar characteristics to results derived by inversion studies, and also produce ground motions for these ruptures that agree with observed values, on average.Satisfying these goals will thus provide confidence in the predictive capabilities of the method for future earthquakes.
In the sections that follow, I first illustrate the details of this approach by describing its application to modeling the near-fault strong ground motions for the 1994 M w 6.7 Northridge, California earthquake.All simulations are carried out using the GP method as implemented in version 22.4.0 of the BBP with the misfit to the observations computed using a response spectral acceleration goodness-of-fit criterion (e.g., Abrahamson et al. 1990;Schneider et al. 1993).This is followed by a summary of the results for the remaining validation events.I then conclude by providing guidance on parameter specification for additional validation events, as well as for scenario simulation exercises.

Overview
The M w 6.7 Northridge earthquake occurred on 17 January 1994 beneath the western portion of the San Fernando Valley in southern California (Fig. 2).The earthquake was well-recorded by strong motion stations throughout the greater Los Angeles region with more than 50 sites located within 30 km of the rupture.Further, over 60 GPS observations, as well as leveling-line data were available to provide additional constraints on the earthquake rupture.Numerous studies have been conducted on this earthquake, and numerous models of the rupture process have been presented (e.g., Dreger 1994;Hartzell et al. 1996;Hudnut et al. 1996;Shen et al. 1996;Wald et al. 1996;Zeng and Anderson 1996).The common elements of these models indicate primarily reverseslip occurring on a roughly 20-km long blind-thrust fault dipping towards the southwest with the rupture initiating at a depth of about 18 km.Average rupture speed values used in the above studies range from about 75-85% of the local shear wave velocity and the down-dip fault width values range from 20 to 30 km.The average slip rise time from these studies is harder to constrain due to the different types of formulations employed; however, the values are generally around 1 s.

Kinematic rupture models
The kinematic rupture models used for the grid search validation exercise are generated using the approach of Graves and Pitarka (2016) and include the shallow slip rate modification of Pitarka et al. (2021).For all the Northridge rupture models, I fix the magnitude at M w 6.7, fault length at 20 km, depth to top of rupture at 5 km, fault strike at 122°, fault dip at 40°, and hypocenter at − 118.554° longitude, 34.208° latitude and 17.5 km depth.These parameter values are consistent with the consensus values listed in the NGA-West2 Database (Ancheta et al. 2014).Then, I generate a suite of rupture models using discrete combinations of average rupture speed, downdip fault width, and average slip rise time.Specifically, I sample across six values of average rupture speed, five values of fault width, and four values of average rise time.Below, I describe how the values of these parameters are set in the grid search approach.
For the GP kinematic rupture generator, the average rupture speed (V r ) is parameterized as a fraction of the local shear wave velocity (V s ): (1) The sampling of the down-dip fault width is done by first specifying a median value (W med ), and then setting additional values at ± 12.5% and ± 25% with respect to the median value.This results in a set of five discrete fault width values: 3 4 W med , 7 8 W med , W med , 9 8 W med , and 5 4 W med .Setting the maximum and minimum values at ± 25% about the median is a suitable choice based on results from previous finite-fault inversions (e.g., Mai et al. 2016).The choice for additional values at ± 12.5% represents a practical trade-off between the desire to adequately sample the range and also keep the number of simulations at a reasonable level.Based on the previous studies of the Northridge earthquake, I set W med at 24 km.Using the above formulation then yields the following set of fault width values to be sampled for the Northridge case: 18 km, 21 km, 24 km, 27 km, and 30 km.
Combining the six discrete values of f RV and the five discrete values fault width described above results in a grid of 30 unique pairs of these parameters, as illustrated in Fig. 3.This figure also plots sample ruptures generated using selected combinations of the parameters.These plots illustrate the effects of changing the average rupture speed or down-dip width on the resulting kinematic rupture model.In particular, they show that the total rupture time decreases with increasing average rupture speed, and the average (and peak) slip increases with decreasing fault width.Also, note that the slip distribution for each rupture is different, with each case being generated using a different random seed.
The final requirement needed to complete the kinematic rupture description is the specification of the sliprate function.In the GP approach, the slip-rate function is based on the formulation of Liu et al. (2006).This function has a sharp initial pulse followed by a relatively long tail, similar to that seen in dynamic rupture simulations (e.g., Pitarka et al. 2021).The slip rise time is defined as the total time duration of the function.
For the GP method, the average slip rise time (t A ) is given by a modified version of the relation from Somerville et al. (1999): where M o is the seismic moment in dyne-cm, c RT is a scaling constant, and a T is a mechanism scaling factor, which reduces the rise time up to a maximum of 10% in the limiting case of pure reverse faulting and is given by where with c α = 0.1 , and where δ and λ are the average fault dip and rake, respectively.
While the local rise time at any given point on the rupture depends on both depth and amount of slip in the GP approach (see Graves and Pitarka 2016 for details), the slip rise time given by Eq. (2) represents the value averaged across the entire fault.Somerville et al. (1999) originally determined the scaling constant c RT to be 2.0 based on analysis of rupture models inverted from past earthquakes.However, Graves and Pitarka (2010)  c RT = 1.6 provided a reasonable fit in simulating broadband motions of four past earthquakes, although these simulations were restricted to a constant average rupture speed.Recognizing that the average rise time is likely variable for different earthquakes, in the present study I allow the scaling factor c RT to sample across the four discrete values: 1.6, 2.0, 2.4, and 2.8.I chose this range based on results of preliminary simulations of the current event set that indicated the average rise time for the strikeslip events was noticeably underpredicted when using c RT = 1.6.Figure 4 plots slip-rate functions corresponding to these four values of c RT .These plots illustrate that as the rise time increases the peak amplitude of the slip-rate decreases and the function becomes relatively deficient in shorter period energy.
Combining the sets of six f RV , five fault widths and four c RT samples yields 120 unique combinations of these parameters that are used as inputs to the kinematic rupture generator.Additionally, for each of these unique combinations, I generate two slip model realizations.This then results in a total of 240 different rupture models that are considered in the validation exercise.While performing additional realizations (e.g., more densely sampling the range of discrete rupture parameter values or more slip realization per unique combination) might provide a more detailed statistical distribution of the results, the current choice affords a reasonable balance between computational burden and resolution of ground motion trends with respect to the kinematic parameters.

Observed ground motion records
Ground acceleration waveforms for each of the recording sites shown in Fig. 2 were obtained from the NGA-West2 Database (ngawest2.berkeley.edu).These recordings come from various organizations including the U.S. Geological Survey, California Strong Motion Instrumentation Program, University of Southern California, Southern California Edison, and Los Angeles Department of Water and Power.As part of the NGA-West2 project, these Fig. 3 Examples of Northridge earthquake rupture models.Center panel shows grid of 30 discrete rupture velocity factor (f RV ) and fault width combinations (red circles) used to generate the kinematic ruptures.For each pair of f RV and fault width, a unique rupture with random slip distribution is generated.Left and right panels show sample ruptures for different combinations of f RV and fault width.For each rupture model plot the green star is the hypocenter, the color shading denotes slip amount with numbers at upper right indicating the average and maximum slip levels, and the black lines show rupture time contours at 1-s intervals data have been uniformly processed and were subsequently utilized in the development of the NGA-West2 ground motion models (Ancheta et al. 2014).Most of these records have a usable bandwidth of 0.2 to 25 Hz; although some records are reliable down to 0.1 Hz (all recordings were analog so recovery of lower frequency energy requires careful processing, see Ancheta 2014).
In order to concentrate on the effects of the rupture process in the Northridge validation exercise, I have limited the maximum rupture distance of the observations to 30 km.This yields a total of 53 sites as shown by the red and blue triangles in Fig. 2.These sites are located on a variety of geologic settings with V s30 values ranging from 191 m/s (deep basin) to 2016 m/s (hard rock) (Wills et al. 2015).A list of these sites including location, station name, distance to fault rupture, V s30 , and usable frequency bandwidth is provided in the Additional file 2. For the ground motions at each site, I have rotated the horizontal component records to orientations of 122° and 212°, which correspond to fault-parallel (FP) and faultnormal (FN) components, respectively.Using this rotated set of records, I then compute 5% damped pseudo-spectral acceleration (PSA) for the individual components, as well as the RotD50 spectral acceleration parameter for the combined horizontal components.

Broadband simulations
The broadband simulations were run using the hybrid GP method (Graves andPitarka 2010, 2015) as implemented in version 22.4.0 of the BBP.This implementation includes minor modifications to the high-frequency simulation code to (1) eliminate the slight dependence of the computed response to subfault size present in previous implementations, and (2) increase the sensitivity of the high-frequency motions to variations in fault area via an adjustment to the median stress parameter.Details of these modifications are provided in Appendix A. The BBP implementation is limited to plane-layered (1D) velocity structures and for the Northridge simulations I used the Los Angeles region model listed in the Additional file 3.This model was developed by averaging vertical 1D profiles taken from the SCEC Community Velocity Model CVM-S4.26.M01 at locations of seismic stations throughout the Los Angeles region with the reference V s30 value set at 500 m/s.Green's functions for the low-frequency portion (f < 1 Hz) of the simulation are computed using a frequency-wavenumber approach (Zhu and Rivera 2002) with anelastic attenuation model as The 1D velocity model is also used to compute impedance and geometric spreading effects in the high-frequency (f > 1 Hz) portion of the simulation with Q effects modeled using the parameters of Graves and Pitarka (2010) and the high-frequency spectral decay parameter κ 0 (Anderson and Hough 1984) set to 0.04 s.Additionally, the median stress parameter is set to 50 bars for all high-frequency computations.Given the maximum variation in prescribed fault width of ± 25% with respect to the median value, the resulting maximum adjustment to the median stress parameter is about ± 12% following the modification described in Appendix A.
In the hybrid simulation, the separate low-and highfrequency portions are computed independently and then combined using a match-filtering approach (e.g., Hartzell and Heaton 1995) with the matching frequency set at 1 Hz.The final step in the simulation process is to adjust the simulated motions from the reference V s30 condition of 500 m/s to the site-specific value at each location, which is taken from the NGA-West2 database.This site adjustment is done using the implementation of Graves and Pitarka (2010) with the site term from Boore et al. (2014).The result of the simulation is a broadband (0-20 Hz) three-component ground motion waveform at each site for the prescribed rupture model.
Computing the full broadband response at all 53 sites for one rupture model takes about 30 min using a standard implementation of the simulation codes.Because each run is independent, the simulations for different rupture models can be run in parallel.By distributing the computations across multiple CPUs, the total run time to complete all 240 realizations is less than a few hours.
The simulation results for each rupture model are compared with the observations using a spectral acceleration goodness-of-fit (GoF) criteria (e.g., Abrahamson et al. 1990;Schneider et al. 1993).This is done by first computing the 5% damped PSA and RotD50 values from the simulated FP and FN horizontal components of motion at each site.Then, residuals are computed for each site j as a function of period T i in the natural log domain: where and O Xj and S Xj are the observed and simulated responses, respectively, and X = FP, FN or RotD50 to specify the component.The model bias for each component is then given by: and the standard deviation is given by where N X is the number of sites for each component.For the FP and FN components, I limit the distance to within 15 km from the rupture, yielding N FP = N FN = 21 (blue triangles in Fig. 2); and for RotD50, I include all sites out to 30 km from the rupture, yielding N RotD50 = 53 .The reason for limiting the distance for the FP and FN components in this manner is to focus on the component specific response only at near-fault sites where wave propagation and scattering effects that are not captured by our assumed 1D velocity structure may be less significant. (4) In order to distill the large amount of GoF results down to a more manageable number, I compute the average absolute misfit value (|M| avg ) for each of the 240 realizations.This is done by first summing the absolute value of the GoF bias values over the period range 0.1 to 10 s for each of the RotD50, FN, and FP components, then combining these using a weighted sum based on the number of stations in each set.The formal expression is where NP is the number of discrete periods sampled in the range of 0.1 to 10 s, and the weights are w RotD50 = N RotD50 /(N RotD50 + N NF ) and w NF = N NF / (N RotD50 + N NF ) and where I have also defined I use the absolute value in determining |M| avg so that positive and negative biases do not cancel.However, I also compute the average bias as well, to determine if the given realization over-or under-predicts the observed motions, on average.I refer to this later quantity as the misfit trend and it is normalized to the range − 1 to 1 across the 240 realizations.Figure 5 plots the |M| avg values for the 240 realizations as a function of fault width, f RV and c RT with the symbols color-coded by the misfit trend.
The lowest |M| avg case occurs for a fault width of 27 km, f RV = 0.90 and c RT = 1.6.The plots indicate that there is a clearly defined minimum in |M| avg as a function of f RV that occurs near the upper bound value of 0.90.The minimum in |M| avg as a function of c RT is not as sharply defined, but does occur towards the lower bound value of 1.6.There is no apparent minimum in |M| avg as a function of fault width.These results indicate that the GoF results are most sensitive to variations in average rupture speed, and relatively less sensitive to variations in average rise time and fault width, respectively, for the parameterization used here.Additionally, most of the realizations indicate an average bias that under-predicts the observations (blue misfit trend), with the few realizations that show an average over-prediction (red misfit trend) occurring for the smallest fault widths and highest average rupture speeds.
In order to examine possible trends and inter-correlations between the rupture parameters, I plot in Fig. 6 the misfit values as a function of discrete parameter pairings: f RV vs. fault width, f RV vs. c RT , and fault width vs. c RT .Determining the values shown at individual points in the panels of this figure is done by averaging the |M| avg values across all simulations having ruptures with a particular combination of parameters.For example, each point in ( 7) the f RV vs. fault width plot represents the average of the misfit values across four c RT values and two slip realizations.Points in the other panels are determined in a similar fashion.
The most apparent feature seen in the results in Fig. 6 is the strong trend of decreasing misfit with increasing f RV (also clearly observed in Fig. 5).However, the results shown in the panels of Fig. 6 also suggest that in terms of producing a similar misfit level, f RV is somewhat positively correlated with fault width, f RV is somewhat positively correlated with c RT , and fault width exhibits a weak negative correlation with c RT .While these correlations are subtle, they make sense intuitively.That is, with all other parameters held fixed, increasing average rupture speed (f RV ) will generally increase ground motion levels, while increasing fault width (i.e., decreasing average slip) or increasing average rise time (c RT ) will generally decrease ground motion levels.Thus, as values of different parameters are either increased or decreased, their combined impacts on ground motion level can cancel one another, which result in the apparent correlations.
More detailed results for the three realizations with the lowest |M| avg are shown in Fig. 7.The (fault width, f RV , c RT ) combinations for these are Rup-184: (27 km, 0.90, 1.6), Rup-140: (24 km, 0.90, 2.4) and Rup-035: (18 km, 0.85, 2.0), and the |M| avg values are 0.086, 0.094, and 0.095, respectively.The top panels of Fig. 7 show the rupture models for each of the realizations, and the other panels show the detailed GoF results across the period range 0.1 to 10 s for each of the RotD50, FP (122°), and FN (212°) components.This period range encompasses the vast majority of engineered structures where seismic loads are a concern.For each realization, the bias of the RotD50 component is near zero across the full period range, indicating that all of these models provide a good fit to the observations when averaged over the 53 sites.
All of the RotD50 GoFs exhibit a slight over-prediction for periods shorter than 0.2 s, which then increases somewhat at the near-fault sites (FP and FN components).This is likely caused by under-estimating the strength of nonlinear effects through the use of the Boore et al. (2014) site terms in these simulations.Additionally, while the GoFs for the FP and FN components show more variability as compared to the RotD50 component (due in part to the fewer number of stations used to compute the GoF), they all show a systematic tendency to under-predict the FP component and over-predict the FN component for periods greater than 1 s.These systematic features at Fig. 7 Top panels show rupture models for the three realizations with the lowest |M| avg for the Northridge simulations.The labels at the top left of each rupture indicate the f RV and c RT corresponding to the particular realization and numbers at the top right indicate the average and maximum slip, respectively.Below each rupture are the corresponding GoF results in the period range 0.1 to 10 s for the RotD50, FP (122°) and FN (212°) components, respectively.The red line is the bias (Eq.4), the green region is the standard deviation (Eq.5) and the yellow region is the 95% confidence interval of the mean.The |M| avg value for each realization is given at the upper left of the RotD50 GoF panel the longer periods are likely due to over-estimating the strength of rupture directivity effects at the near-fault sites in the simulations by using a 1D velocity structure.It is known that the geologic structure in the region just up-dip of the Northridge rupture has strong lateral heterogeneities (e.g., Hartzell et al. 1999) and these 3D structures can scatter the propagating wave field and diminish the coherence of radiation pattern effects.
While the GoF plots shown in Fig. 7 provide an assessment of how well the simulations perform when averaged across all sites, it is also important to examine if there are any underlying trends with respect to propagation distance or site type.This is done in Fig. 8, which plots the residuals as a function of distance to rupture (R rup ) and V s30 for selected oscillator periods.For brevity, I only show results for the lowest misfit case (Rup-184) in Fig. 8; however, results for the other low |M| avg cases are quite similar to these.Additionally, for this analysis I have divided the sites into two groups: one for sites with V s30 > 400 m/s (22 sites) and the other for sites with V s30 < 400 m/s (31 sites).I chose 400 m/s for this division since it is roughly near the median of the V s30 values of the sites considered in the analysis.I then perform linear regression for each of these V s30 groups at each period to find the best-fitting line along with its 95% confidence bounds.
The results in Fig. 8 show that there are no statistically significant trends as a function of distance for either V s30 group at periods of 1 s and shorter.At periods of 2 s and longer, the residuals for sites with V s30 > 400 m/s still show little trend with distance; however a slight trend of increasing residuals with increasing distance is apparent for the V s30 < 400 m/s site group.This trend is dominated by sites having large positive residuals (i.e., simulation under-predicts observation) at R rup distances greater than about 20 km.These sites having relatively low V s30 at R rup > 20 km are all located in the northern portion of the Los Angeles basin just south of the Santa Monica Mountains (Fig. 2).Previous studies (e.g., Graves et al. 1998;Olsen et al. 2003) have noted that sites within the Los Angeles basin experienced significant amplification of longer period (T > 1 s) motion during the Northridge Oscillator period is shown at the lower right of each panel.Symbols are color-coded to the site V s30 value with blue-shading for the 22 sites having V s30 > 400 m/s and red-shading for the 31 sites having V s30 < 400 m/s.Least-squares fits with 95% confidence bounds are indicated by the solid and dashed lines, respectively, for each V s30 group at each oscillator period earthquake due to the generation of surface waves along the northern margin of the basin.Hence, I surmise that the underprediction of these motions in the current analysis results from the use of 1D Green's functions, which are not able to capture this 3D wave propagation effect.
Figure 9 compares simulated three-component ground velocity waveforms for the lowest misfit ruptures with observations at the three selected sites indicated in Fig. 2. In general, all of the simulated waveforms do reasonably well in matching the amplitude level, shaking duration, and frequency character of the observed waveforms.This includes features such as the strong pulse-like motions on the fault-normal component at station SYL due to rupture directivity (and to a lesser extent at station CNP), and the extended duration, relatively higher frequency motions at station CWC, which is located to the side of the rupture.One notable feature that is not reproduced by the simulations is the large amplitude, late arriving, low-frequency pulse observed at CWC.This is likely a basin phase that is generated by the interaction of the direct waves with the 3D structure of the San Fernando basin, and which is not present in our 1D Green's functions.While the general character of the simulated motions shown in Fig. 9 is quite similar, there are differences among these cases that are apparent in the waveforms.For example, the difference in relative amplitude of the initial pulses at CNP, the amplitude of the fault-parallel motions at CWC, and the timing and amplitude of the large directivity pulse at SYL.These differences do not have a strong effect on the spectral acceleration GoF results; however, they demonstrate that the use of a single metric such as PSA can be non-unique in terms of the waveform features produced by a particular rupture.

Summary of GoF results
I have applied the same modeling approach as that described above for the Northridge earthquake to the other 11 validation events.Details of the fault parameters, station lists and velocity models used for these events are in the Additional files 2, 3, and 4, respectively.Additionally, plots showing |M| avg as a function of the rupture parameters, RotD50 and PSA GoF, sample rupture models and residuals as a function of distance and V s30 for these events are in the Additional file 1: Figures S1-S55.For these other events, I find some of the same general trends as that seen for the Northridge modeling.For example, the level of misfit is much more sensitive to variations in f RV compared to fault width or c RT , and simulations for the lowest 3-5 misfit cases for a particular event produce relatively similar ground motions.
As a summary of the results for all 12 validation events, Fig. 10 plots RotD50 GoF for the simulation having the lowest misfit for each event (for reference, Additional file 1: Figure S56 shows the RotD50 GoF plots with the largest misfit for each event).These plots cover the period range 0.1 to 10 s, with the number of stations and maximum distance varying by event.In general, the bias (red line) is centered around zero for all events, with deviations from zero bias typically on the order of 10 to 20%.The maximum deviations are about 40% and these are limited to rather narrow period bands for just a few events (e.g., near 1 s for Imperial Valley, near 10-s period for Landers, and near 4-s period for M w 7.1 Ridgecrest).There are many trade-offs and approximations in the modeling, so determining the cause of these deviations, whether they be source, path or site related, is difficult.
In an ideal situation, there would be multiple events of differing magnitude occurring in the same geographic region that are also recorded by the same group of stations, which would then help to isolate source, path and site effects.For the set of events I consider here, this situation is only strictly realized for the M w 6.4 and M w 7.1 Ridgecrest sequence events.Examining the GoF plots for these two events does not indicate systematic biases or trends that are repeated for both events, which if such features were present, would suggest modeling deficiencies related to path and/or site effects.This is also supported by the lack of significant and repeatable distance or V s30 trends seen in the residual plots for these events (Additional file 1: Figures S40 and S44).The lack of these systematic features suggests the deviations are more likely related to unmodeled source effects.And, in fact, the roughly 30% over-predication (negative bias) around 3-4 s period for the M w 7.1 Ridgecrest event is a feature that has been previously identified as a source effect (e.g., Ahdi et al. 2020;Parker et al. 2020;Pitarka et al. 2021).The other event groupings that only marginally satisfy the ideal situation are the M w 6.5 Imperial Valley and M w 7.2 El Mayor Cucapah events, and the M w 6.1 North Palm Springs, M w 7.3 Landers and M w 7.1 Hector Mine events.For these event groupings, while the station sets within each group have some overlap, there are also a number of stations that only recorded one event.Additionally, there is a mixture of analog and digital records and this might impact the overlap in usable bandwidth, particularly at the longer periods.Keeping in mind these caveats, there still do not appear to be repeatable features in the GoF plots within these groups that would suggest systematic biases in path and/or site effects.The above results are encouraging since my focus in the present study is primarily on characterization of the earthquake source.
Looking more closely at the variability of the ground motion estimates, the standard deviations shown in the GoF plots of Fig. 10 can be viewed as an approximation of the intra-event variability determined from the simulations.However, this is not a true intra-event standard deviation such as that provided by ground motion prediction equations (GMPEs, e.g., Boore et al. 2014) for several reasons.First, I have considered only a few events having a limited magnitude range and I have used only a subset of recordings (i.e., near fault sites) for each event.Second, it is not possible in the current analysis to determine the contribution to variability that comes directly from the simulations versus the contribution inherent to the data themselves.And finally, the GMPE intra-event standard deviations are determined using a rigorous statistical framework that allows for potential magnitude, distance, and site type dependence of the standard deviation to be extracted from the data.
Within the framework of these caveats, it is still informative to compare the simulation based GoF standard deviation with the GMPE intra-event terms, which is done in Fig. 11.To compute the GoF standard deviation shown in this figure (labeled GoF σ LM ), I first pool the residuals from the lowest misfit cases for the 12 events, then I apply Eqs. ( 5) and ( 6) to this set of combined residuals.The GMPEs used for this comparison are Fig. 10 RotD50 GoF results in the period range 0.1 to 10 s for the lowest misfit simulation for each of the 12 validation events.The maximum distance and number of sites for each event are listed in the upper right of each panel.The red line is the bias (Eq.4), the green region is the standard deviation (Eq.5) and the yellow region is the 95% confidence interval of the mean from the NGA-West2 project (Abrahamson et al. 2014;Boore et al. 2014;Campbell and Bozorgnia 2014;Chiou and Youngs 2014) and the intra-event terms are computed for an event magnitude of 7.0, distance of 40 km and V s30 of 400 m/s.As shown in Fig. 11, the agreement between GoF σ LM and the GMPE intra-event standard deviations is very good across nearly the entire period band, with the exception being at the very shortest periods (T < 0.2 s).Additionally, the trends seen in the GMPE curves of increasing standard deviation from 0.2 to 1 s and then plateauing or decreasing values at longer periods is matched quite well by GoF σ LM .The favorable comparison shown in Fig. 11 suggests that the simulation methodology appears to be capturing the main features of the observed intra-event variability.However, confirmation of this will require additional analyses, which are beyond the scope of the current study.

Rupture area, length and width
For each validation event, I compute an estimate of the best-fitting mean rupture area by averaging the rupture areas from the simulations producing the five lowest |M| avg scores.I chose to use an average across multiple realizations rather than using just the lowest misfit case due to the relative lack of sensitivity to variations in fault width.The specific choice of averaging over five realizations is somewhat arbitrary; however, the result does not change significantly when using anywhere from three to 10 realizations.
Figure 12 compares the best-fitting mean rupture area estimates for the simulations against empirically constrained area-magnitude scaling relations.These relations are from Hanks and Bakun (2002, HB02 hereafter), Shaw (2009, S09 hereafter), andLeonard (2014, L14 hereafter).In making this comparison, I have converted the HB02 and S09 relations from M to M w using the relation M w = M -0.03 (Hanks and Kanamori 1979).Also, for S09, I have used the preferred value of rupture width saturation, which is 15 km, although S09 does provide relations for a saturation width of 19 km, as well.The gray-shaded region on this plot denotes the one standard deviation level of the L14 relation.It is interesting to note that this region encompasses the range of the other empirical relations, as well as all of the estimated mean rupture areas from the current study.
Below about M w 6.7, all three empirical curves are roughly similar, and the areas from the simulations tend to cluster around these curves.For larger magnitudes, the behavior of the empirical curves begins to diverge with the HB02 and S09 relations predicting significantly smaller areas for a given magnitude compared to L14.The areas for the simulated events around M w 7 (Loma Prieta, Hector Mine and M w 7.1 Ridgecrest) appear to follow this trend and are close to the S09 curve; however, the areas for events at larger magnitudes (El Mayor Cucapah and Landers) move closer to the L14 curve.
Figure 13 breaks down the rupture area scaling into separate length and width elements, and again compares these with empirical relations.For this comparison, I have  Campbell and Bozorgnia (2014); and CY14, Chiou and Youngs (2014).The intra-event terms from the GMPEs are determined for an event magnitude of 7.0, site distance of 40 km and V s30 of 400 m/s Fig. 12 Comparison of the mean rupture areas for the five lowest misfit realizations for each of the 12 validation events with published relations: HB02 is from Hanks and Bakun (2002), S09 is from Shaw (2009), and L14 is from Leonard (2014).The gray shading indicates the standard deviation estimate from L14.For the simulations, circles denote strike-slip and oblique-slip events, and squares denote reverse-slip events.Symbols are plotted at the mean value and the vertical bars indicate the minimum and maximum rupture areas across the five lowest misfit realizations (bars hidden behind symbol when minimum and/or maximum are very close to mean) also divided the events into two groups: one for strikeand oblique-slip, and the other for reverse-slip.Note that the HB02 and S09 studies do not provide explicit relations for length or width scaling with magnitude.To estimate these curves, I have followed the simplifying assumption of HB02 that length and width are equal and increase with magnitude until the width reaches a prescribed value, which HB02 set at 15 km (about M w 6.3).Above this magnitude, the length continues to grow, but the width is constant.Furthermore, since both HB02 and S09 are focused on strike-slip events, I do not plot relations from these studies for the reverse-slip events.
Examining first the length-magnitude scaling for strike-slip events indicates that while there is some variability among the empirical relations, they are fairly similar and show the same general trends.This includes an increase in slope related to the width saturation, which occurs at about M w 6.3 for HB02 and S09, and about M w 7.0 for L14.The rupture lengths for the simulations (which were not varied across realizations) generally follow the empirical relations, including the increase in slope, and all fall within the standard deviation of L14 with the exception of Parkfield.The length-magnitude scaling for dip-slip events has only three samples for Fig. 13 Comparison of rupture length (top row) and mean rupture width (bottom row) for the five lowest misfit realizations for each of the 12 validation events with published relations: HB02 is from Hanks and Bakun (2002), S09 is from Shaw (2009), and L14 is from Leonard (2014).The events are further divided into strike-and oblique-slip (left column) and reverse-slip (right column).The gray shading in the rupture length panels indicates the standard deviation estimate from L14, and the yellow shading for the rupture width panels indicates the ± 25% variation used in the simulations.For rupture width, symbols are plotted at the mean value and the vertical bars indicate the minimum and maximum widths across the five lowest misfit realizations.Rupture length is not varied across realizations the validation events, and these generally follow the L14 relation.
The width-magnitude scaling for strike-slip events shows a noticeable difference for the empirical relations, which is directly due to the assumption of width saturation at 15 km for HB02 and S09 compared to 19 km for L14.For M w 6.5 and less, the widths of the simulated events generally follow the L14 curve (although it should be noted that the behavior of HB02 and S09 could be different in this regime if a different aspect ratio scaling were assumed).For the larger magnitude events, the widths for the simulated events lie between the relations of L14 and those of HB02 and S09.The width-magnitude scaling for dip-slip events again has only three samples for the validation events, and these generally follow the L14 relation, but with a fairly large degree of scatter.

Average rise time and rupture speed
Figure 14 plots the mean values of c RT and f RV determined from the five lowest misfit simulations for each validation event as a function of magnitude and fault dip.Here, fault dip is used as a surrogate for rupture mechanism as the events in the validation set have a strong correlation of increasing reverse-slip component with decreasing fault dip.Also shown in these plots are linear regression fits for each set of points along with their associated 95% confidence bounds.These plots show that there is essentially no trend in the c RT or f RV values as a function of magnitude.For fault dip, both c RT or f RV exhibit trends, with c RT decreasing and f RV increasing with decreasing dip.While the trend for c RT is outside the 95% confidence threshold, the trend for f RV is statistically significant at the 95% confidence level.
Recall that the GP method already applies a mechanism scaling factor (Eq. 3) to the rise time such that the rise time is reduced up to 10% with decreasing fault dip.The results in Fig. 14 suggest that a similar mechanism adjustment should be applied in the determination of f RV as well.I propose a relation of the form where a T is the mechanism adjustment given by Eq. ( 3), f RV0 is the input rupture speed scaling factor, and f RV is the resulting factor, which is then used in the generation of the rupture model.
Using relation (8), I have rerun simulation sets for the North Palm Springs, Whittier Narrows, Loma Prieta and Northridge events.I did not rerun the other validation events as the adjustment is negligible given their rupture mechanisms.The results for the updated simulations showing how f RV0 scales as a function of magnitude and dip are plotted in Fig. 15.Also shown in these plots using open symbols are the f RV values determined with (8) Eq. ( 8) for each event.Note that since the adjustment is negligible for primarily strike-slip events so the open symbols for these events plot beneath the filled symbols.As expected, the open symbols show the same trend of increasing f RV with decreasing fault dip seen in Fig. 14.However, this trend is effectively removed for f RV0 .Additionally, the new parameterization significantly reduces the scatter in f RV0 across the events.

Peak ground motions
Recently, Baltay and Hanks (2014) and Ji and Archuleta (2021) compared single-and double-corner-frequency source spectra models with the general characteristics of the NGA-West2 strong motion data to investigate how peak motions scale with earthquake magnitude.Here, I follow a similar analysis and compare ground motions from the lowest misfit simulations for the 12 validation events with these source spectra models.The ground motion metrics analyzed in this comparison are peak ground acceleration (PGA), peak ground velocity (PGV), and dominant frequency (f DOM ), with f DOM defined as the ratio PGA/(2πPGV) (Ji and Archuleta 2021).The PGA and PGV values are determined as the geometric mean of the peak values on the two horizontal component records simulated at each site.As with these previous studies, the PGA and PGV values are normalized to a distance of 10 km by assuming 1/R geometric spreading.I use where R jb is the horizontal distance from the site to the surface projection of the fault and h = 4.5 km is the "fictitious depth" term (Boore et al. 2014).Additionally, I only use records with R jb ≤ 50km in order to minimize the impacts of wave propagation effects on the analysis.Once all the simulated motions are normalized to 10 km distance, I compute the mean and standard deviation of the PGA, PGV and f DOM values from the set of records for each earthquake.
The source spectra models are based on stochastic representations originally proposed by Hanks and McGuire (1981).In this approach, the basic form of the far-field S-wave acceleration and velocity amplitude spectra are given by (e.g., Baltay and Hanks 2014;Ji and Archuleta 2021).
Here, C = √ 2R θϕ A 4πρβ 3 R e −π f κ 0 represents combined effects of radiation pattern ( R θϕ ), crustal amplification (A), geo- metric spreading (1/R) and site attenuation ( κ 0 ), and Ω 0 is the single-or double-corner-frequency source spectrum.For my analysis, I set R θϕ = 0.6, near-source density (9a) and S-wave velocity as ρ = 2.75g/cm 3 and β = 3.6km/s , respectively, and κ 0 = 0.04s , which are all consistent with parameters used in the GP simulation approach.I allow the crustal amplification value to vary in order to optimally adjust the level of the stochastic model spectra to the simulated ground motion values.
The single-and double-corner-frequency source spectra I consider are given by (10a) Equation (10a) is the classic Brune (1970) spectrum with single-corner-frequency f c .Equation (10b) is the double-corner-frequency spectrum from Ji and Archuleta (2021) with frequency corners at f c 1 and f c 2 .In modeling the NGA-West2 data, Baltay and Hanks (2014) employed the single-corner-frequency model to determine the following relation for f c (10b) (11) log 10 f c = 2.427 − 0.5M w Fig. 14 Mean value of c RT (top row) and f RV (bottom row) across the five lowest misfit realizations for each of the 12 validation events plotted as a function of magnitude (left column) and fault dip (right column).In each panel, the horizontal gray line is the average of the data points and the solid and dashed black lines are linear regression fits to the points and 95% confidence bounds, respectively.Symbols are plotted at the mean value of the respective parameter for each earthquake, however, regression fits and confidence bounds are determined using values from the five lowest misfit realizations as input data points and for the double-corner-frequency model Ji and Archuleta (2021) determined the following relations for f c 1 f c 2 when constrained to be self-similar, and when the self-similarity condition is relaxed.Using the nomenclature of Ji and Archuleta (2021), I refer to the source spectra models based on Eqs. ( 11), (12), and (13) as BH14, JA19, and JA19_2S, respectively.The final step is to relate the S-wave acceleration and velocity spectra given in Eqs. ( 9) to the PGA and PGV of the motion.This is done by first using Parseval's theorem to estimate the rms acceleration and velocity from the corresponding S-wave spectra, and then applying random vibration theory to relate the peak values to the rms values (e.g., Hanks and McGuire 1981;Boore 1983).Details of this process are given in Appendix B.
Figure 16 compares the distance-normalized (R = 10 km) PGA, PGV and f DOM from the lowest misfit validation simulations with the BH14, JA19, and JA19_2S models.The median values for each simulated event are plotted as symbols with the error bars denoting the standard deviation.For the theoretical models, I have determined an optimal fit to the simulation results where N = 12 for the 12 validation events, S i is the median value for the simulations of the ith event, M i is the model prediction for the ith event, and σ S i is the standard deviation (in natural log units) of the simulated values for the ith event.The values of E for the optimal fit of each model and for each of PGA, PGV, and f DOM are shown in the respective panels of Fig. 16.One of the clear features in both the simulated values and model predictions for PGA and PGV shown in Fig. 16 is their saturation with increasing magnitude, that is, the slope decreases with increasing magnitude.Furthermore, this effect is significantly stronger for PGA than it is for PGV.Identification of this effect in observed ground motion data is not new (e.g., Abrahamson et al. 2014;Boore et al. 2014;Campbell and Bozorgnia 2014;Chiou and Youngs 2014) and can be explained by the stronger attenuation of the relatively higher frequency waves that control PGA compared to the relatively lower frequency waves that control PGV (e.g., Baltay and Hanks 2014).The strength of this effect becomes greater Note that the adjustment is negligible for primarily strike-slip events (dip ≥ 75°) so the open symbols for these events plot beneath the filled symbols as the fault length increases with magnitude.The fact that the simulations nicely replicate this behavior provides an added degree of confidence in the simulation For the range of magnitudes considered here, the behavior of models BH14 and JA19 are quite similar, and thus, their misfit levels (E) are quite similar, as well.However, model JA19_2S has a flatter slope across this range of magnitudes, which better matches the simulated values of PGA and PGV, and thus results in a lower misfit compared to the other models.The difference in slope between the models is effectively removed in computing f DOM , resulting in nearly identical performance of all three models for this metric.While Ji and Archuleta (2021) determined the JA19_2S model is their preferred representation, they also acknowledge that there is large variability in the data, as well as limited number of observations at the larger magnitudes.These same limitations apply to an even greater extent to the analysis I present here.Thus, I hesitate to draw overly broad conclusions from the results shown in Fig. 16, aside from noting the strong consistency of the simulations with the general features of the theoretical models.

Guidance on parameter specification
One of the objectives of the current work is to use the results of the validation exercises to develop guidelines on parameter ranges for application in future simulation  Ji and Archuleta (2021), and BH14 is the single-corner-frequency model of Baltay and Hanks (2014).A a and A v denote the amplification factors applied to the empirical models for PGA and PGV, respectively, in determining the optimal fit to the simulated ground motions.E denotes the misfit given by Eq. ( 14) exercises.In the present study, I have purposely chosen fairly wide bounds on the parameter ranges in order to adequately capture the that produce the lowest misfit simulations; however, this approach may not be practical or desirable in all applications.
Based on the above comparisons for magnitude vs. rupture area, length and width, it is not possible to choose a clearly preferred empirical scaling relation that best matches the simulations.Much of the variability among the empirical relations comes from the large uncertainties in determining down-dip rupture width for past earthquakes, and hence this uncertainty should be sampled when performing simulations for future earthquakes.My recommendation for determining the rupture length and width when applying the GP method is as follows: first, given a prescribed magnitude (M w ), and if the rupture length is not prescribed, use an average of HB02, S09 and L14 for strike-slip or L14 for dip-slip to determine the rupture length to use in the simulations.Then, to determine the mean rupture width, if the mechanism is strikeslip and M w > 6.8, use an average of HB02, S09 and L14; otherwise if the mechanism is strike-slip and M w ≤ 6.8, or if the mechanism is dip-slip, use L14.Alternatively, if the rupture length was already prescribed, use an average of the HB02, S09 and L14 magnitude-area relations for strike-slip, or the L14 relation for dip-slip, to determine the rupture area and from this calculate the rupture width.Finally, in order to sample the uncertainty, a range of rupture widths spanning ± 25% about the mean value should be used in the simulations.
For rupture speed and risetime scaling, I recommend using median values of c RT = 2.3 and f RV0 = 0.765 in the GP method.Additionally, the mechanism adjustment for the rupture speed factor given by Eq. ( 8) should also be applied.The value of c RT = 2.3 determined here is noticeably larger than the value of c RT = 1.6 determined in Graves and Pitarka (2010).I suspect the value determined in the earlier study may have been biased low by the Northridge earthquake, which has the smallest c RT of the 12 events in the current study (Fig. 14), but was one of only four events examined previously.In terms of ground motion impact, changes in rupture speed and rise time are correlated such that the level of high-frequency ground motion generally increases with increasing rupture speed, as well as with decreasing rise time.Thus, in simulating the range of expected ground motions for future earthquakes, it is probably not necessary to include variations in both rupture speed and rise time.Given that the simulated motions are much more sensitive to changes in rupture speed than rise time (e.g., Fig. 5), I propose that only variations in rupture speed need to be considered.Based on the results in Fig. 15, a range of ± 0.075 about the median value of f RV0 should be adequate.
As a test of the impact of holding c RT fixed in the simulations, I have extracted the subset of results for the case of c RT = 2.4 from the full set of realizations.Then, I compare the mean values of fault width, f RV0 , and |M| avg determined for the five lowest misfit cases in the full set with those from the subset having c RT fixed at 2.4.The results are shown in Fig. 17.For all three parameters, the results show quite good agreement between the two sets, with the mean values of fault width and |M| avg differing by a maximum of about 10% and those for f RV0 differing by a maximum of about 5%.There are no apparent systematic variations seen in the results for the fault width and f RV0 parameters.However, the |M| avg values for the c RT = 2.4 subset are all slightly larger (i.e., slightly poorer fit) than those determined from the full set of realizations.This later result is consistent with the expected outcome when the number of free parameters in the set of realizations is reduced.Nonetheless, given the relatively small changes in mean values of fault width, f RV0 , and |M| avg for the lowest misfit cases when holding c RT fixed provides further support for this choice in future simulations.

Conclusions
In this paper, I have presented a grid search technique for validating broadband strong ground-motion simulations.The validation process begins with the prescription of the main earthquake parameters (e.g., magnitude, hypocenter, slip mechanism, fault location), which are constrained by generally accepted values from previous studies.Then, a suite of possible ruptures to consider is generated by sampling from ranges of prescribed downdip fault width, average rupture speed, and slip rise time values.Each discrete combination of fault width, average rupture speed, and rise time also has a unique slip distribution, which is randomly generated using the Graves and Pitarka (2016) approach.This results in a total of 240 realizations that are analyzed for each validation event.I apply this approach to 12 California region earthquakes spanning the magnitude range 5.9 ≤ M w ≤ 7.3.I assess the fit of the simulations to the observed strong motion records using a pseudo-spectral acceleration goodness-of-fit metric (e.g., Abrahamson et al. 1990;Schneider et al. 1993).For each validation event, the method is able to generate simulations that produce very low misfit scores across the period range 0.1 to 10 s.The process used to specify the simulation parameters for the validation exercises is identical to that used for future hypothetical earthquakes.These results demonstrate that given a general description of the earthquake, the technique can (1) create ruptures that share similar characteristics to results derived by inversion studies, and (2) produce ground motions for these ruptures that closely match observed ground motion values, thus providing in the predictive capabilities of the method.
I find that the simulated motions are much more sensitive to variations in average rupture speed compared to variations in fault width and slip rise time.The result of this is that slightly different combinations of these parameters can produce nearly the same level of fit to the observations.This ambiguity in determining a unique set of best-fitting parameter values is primarily due to the use of a single pseudo-spectral acceleration misfit metric.In fact, one drawback to using this type of GoF metric is that it is relatively insensitive to the details of the slip distribution compared to the other rupture parameters.This is because pseudo-spectral acceleration is not as sensitive to waveform phasing as it is to spectral amplitudes.So, for example, the pseudo-spectral acceleration GoF approach will have difficulty distinguishing multiple waveform pulses from a single pulse as long as their spectral amplitude characteristics are similar.Employing additional misfit metrics, particularly those that combine time-and frequency-domain measures (e.g., Anderson 2004) or performing GoF comparisons utilizing Fourier amplitude spectra (e.g., Hartzell et al. 1999) would potentially increase the resolution of the approach, as well as provide more sensitivity to the details of the slip distribution.Nonetheless, even with the current implementation, the lowest misfit cases tend to have parameter values that cluster around limited ranges of these parameters.Utilizing ranges of rupture parameters in the characterization of ground motions for a future hypothetical earthquake represents part of the epistemic variability can be important for certain seismic hazard analyses.Beyond demonstrating the viability of the Graves-Pitarka simulation method, the results presented here also provide guidance on specifying median values and ranges of these parameters for use in simulating future earthquakes.Comparing the parameters from the lowest misfit simulations to empirical scaling relations for magnitude vs. rupture area, length and width from Hanks and Bakun (2002), Shaw (2013), and Leonard (2014), I find the simulation results are generally consistent with all the relations.Given the limitations on magnitude range and number of events for the simulations, coupled with the large uncertainties associated with the empirical relations, it is not possible to select a preferred relation from among this group.Based on these factors, I propose using an average of these relations for strike-slip mechanisms, and using Leonard (2014) for dip-slip mechanisms, in order to determine mean rupture dimensions for future events.Since the largest source of uncertainty in these scaling relations is rupture width, I propose a uniform sampling of ± 25% about the mean value should be used in future suites of simulations.
For slip rise time averaged across the fault given by the relation τ A = c RT × α T × 10 −9 ×M 1/3 o where M o is the seismic moment in dyne-cm and a T is the mechanism scaling factor (Eq. 3), I find a value of c RT = 2.3 works well.For average rupture speed (V r ) prescribed as a fraction of local shear wave velocity (V s ) using the relation V r /V s = f RV 0 × α −1 T , I find a scaling constant of f RV 0 = 0.765 ± 0.075 works well for all events.Utiliz- ing a T in the above formulations significantly reduces the mechanism dependence on the input parameters, as it acts to reduce the resulting rise time and increase the resulting rupture speed fraction up to 10% in the limiting case of pure reverse faulting.While this apparent mechanism dependence is clear in the current analysis, it should be noted that the four events where this feature is significant are also buried ruptures that do not break the ground surface.Additionally, these events are limited to M w < 7. Thus, the exact physical mechanisms that might give rise to this feature and its scaling with earthquake size are ambiguous.Indeed, recent work by Wang et al. (2016) found that very large (M w > 7.5) strike-slip earthquakes had generally faster rupture speeds than dip-slip events of comparable magnitude.In any case, given the stronger sensitivity of the simulation results to variations in rupture speed compared to rise time, coupled with the inherent trade-offs between these two parameters, I recommend only variations in rupture speed sampled from a uniform distribution given by the above parameterization need to be considered in future suites of simulations.
The analysis of the 12 California and Baja California earthquakes in the current study represents a significant validation effort; however, incorporation of additional events would certainly be beneficial.Future validation work might include events in other regions, such as those in Japan (e.g., 2000 M w 6.6 Tottori, 2007, M w 6.8 Chuetsu-oki, 2008 M w 6.9 Iwate), Turkey (e.g., 1999 M w 7.1 Duzce, 1999 M w 7.5 Kocaeli), China (e.g., 2008 M w 7.9 Wenchuan), New Zealand (e.g., 2010 M w 7.0 Darfield, 2011 M w 6.2 Christchurch), and Taiwan (e.g., 1999 M w 6.2 Chi-Chi).This would not only help to better constrain estimates of parameter medians and distributions, but will also extend the validation range to larger event magnitudes.

Fig. 1
Fig.1Maps showing the locations of the 12 California region earthquakes analyzed in this study.Solid (top edge) and dashed black lines are the fault planes and green star is the epicenter.Red triangles are locations of strong-motion sites considered for goodness-of-fit analysis.Color shading denotes V s30 values fromWills et al. (2015)

Fig. 2
Fig. 2 Map showing Northridge earthquake rupture plane projected to ground surface (black rectangle) and strong motion sites within 15 km (blue triangles) and 30 km (red triangles) of the rupture surface.Epicenter indicated by green star.Color shading denotes V s30 values from Wills et al. (2015).Labeled sites discussed in text

Fig. 4
Fig. 4 Slip-rate functions corresponding to the four values of risetime coefficient (c RT ) considered in this study.Left panel shows functions in the time domain and right panel shows their Fourier amplitude spectra.The amplitudes of the functions are normalized to have unit area, and the time and period axes are normalized by the risetime value of t 0 , which corresponds to c RT = 2.2

Fig. 5
Fig. 5 Average absolute misfit (|M| avg ) for the 240 Northridge earthquake simulations as a function of fault width (left panel), f RV (middle panel), and c RT (right panel).The color scale labeled misfit trend indicates if a realization over-predicts (red shading) or under-predicts (blue shading) the observed motions, on average

Fig. 8
Fig. 8 RotD50 residuals for the Northridge simulation with the lowest |M| avg score plotted as a function of distance at six different oscillator periods.Oscillator period is shown at the lower right of each panel.Symbols are color-coded to the site V s30 value with blue-shading for the 22 sites having V s30 > 400 m/s and red-shading for the 31 sites having V s30 < 400 m/s.Least-squares fits with 95% confidence bounds are indicated by the solid and dashed lines, respectively, for each V s30 group at each oscillator period

Fig. 9
Fig. 9 Three-component, broadband ground velocities for the three lowest |M| avg simulations compared with observed motions at stations CNP, SYL, and CWC.The horizontal components have been rotated to fault-normal (212°) and fault-parallel (122°) orientations.Absolute timing is not available for the observations so records have been shifted to roughly align with first arriving P-wave in the simulations.Peak amplitude shown at top right of each trace

Fig. 11
Fig. 11 Comparison of the standard deviation computed from the residuals for the lowest misfit simulations across all 12 validation events (GoF σ LM ) with the intra-event standard deviations from ground motion prediction equations (GMPEs).GMPEs are as follows: ASK14, Abrahamson et al. (2014); BSSA14, Boore et al. (2014); CB14,Campbell and Bozorgnia (2014); and CY14,Chiou and Youngs (2014).The intra-event terms from the GMPEs are determined for an event magnitude of 7.0, site distance of 40 km and V s30 of 400 m/s (12) log 10 f c 1 = 1.754 − 0.5M w log 10 f c 2 = 3.250 − 0.5M w ,(13) log 10 f c 1 = 1.462 − 0.415M w M w ≤ 5.27 2.357 − 0.585M w M w > 5.27log 10 f c 2 = 3.250 − 0.5M w ,by adjusting the level of crustal amplification factor.This was done separately for PGA (A a ) and PGV (A v ) in order to account for frequency dependent amplification.Note that this only adjusts the amplitude of the curve and does not change its shape.The optimal fit was determined using the misfit function defined by Ji and Archuleta (2021):

Fig. 15
Fig. 15 Same as the bottom row of Fig. 12 except here I plot f RV0 , which are the input values prior to any mechanism adjustment.The open symbols are the f RV values then used in the rupture generator, and are determined by dividing the f RV0 by the mechanism scaling factor a T (Eq.8).Note that the adjustment is negligible for primarily strike-slip events (dip ≥ 75°) so the open symbols for these events plot beneath the filled symbols

Fig. 16
Fig. 16Ground motion values from the lowest misfit validation simulations plotted as a function of magnitude for PGA (top left), PGV (top right), and dominant frequency (bottom left).The ground motions have been normalized to a distance of 10 km.Symbols are plotted at the median level and error bars denote the standard deviation.JA19_S2 and JA19 are double-corner-frequency models ofJi and Archuleta (2021), and BH14 is the single-corner-frequency model ofBaltay and Hanks (2014).A a and A v denote the amplification factors applied to the empirical models for PGA and PGV, respectively, in determining the optimal fit to the simulated ground motions.E denotes the misfit given by Eq. (14)

Fig. 17
Fig. 17Comparison of fault width (upper left), f RV0 (upper right) and |M| avg (lower left) for the five lowest misfit realizations with c RT allowed to vary (vertical axis) and c RT fixed at 2.4 (horizontal axis).Symbols are plotted at the mean values and the horizontal and vertical bars indicate the range of minimum and maximum parameter values across the five lowest misfit realizations.The gray line denotes 1:1 scaling and the yellow shading indicates regions of ± 10%, ± 5% and ± 10% about the 1:1 line for fault width, f RV0 , and |M| avg , respectively