Conventional N-, L-, and R-tests of earthquake forecasting models without simulated catalogs
© The Society of Geomagnetism and Earth, Planetary and Space Sciences (SGEPSS); The Seismological Society of Japan; The Volcanological Society of Japan; The Geodetic Society of Japan; The Japanese Society for Planetary Sciences; TERRAPUB. 2011
Received: 12 January 2010
Accepted: 24 August 2010
Published: 4 March 2011
We propose a new procedure for testing the expected number (N-test), log likelihood (L-test), and log likelihood-ratio (R-test) of seismicity models. In these tests, scores obtained from observed earthquakes are compared with distributions of scores estimated from earthquakes expected from the models under test. We introduce a method to estimate the test score distributions analytically where uncertainties in magnitude and hypocentral parameters are involved. The analytical formulas used to estimate expected values and standard deviations of the test scores for earthquakes conforming to the test models were derived in earlier published studies, which allows normal approximations to the test score distributions to be calculated. Using these two methods simultaneously, we can perform N-, L-, and R-tests for seismicity models without using any simulated catalogs. As a case study, the proposed procedure was applied to two seismicity models for Kanto, central Japan. To compare our procedure with the current one based on the Monte Carlo method, we randomly generated sets of 10,000 earthquake catalogs of two kinds: one set conforming to the model under test, and the other derived from the observed catalog allowing for uncertainties in magnitude and hypocentral parameters. The distributions of L-scores obtained from both sets are in good agreement with those obtained by the proposed procedure. This comparison suggests that the analytical approach presented here could be useful for conducting the N-, L-, and R-tests in a conventional way.
1. Introduction
With the development of statistical seismology, probabilistic predictions of earthquakes are now more common than has hitherto been possible. The probabilities of earthquake occurrence are usually estimated based on specific seismicity models. To provide a reliable probabilistic prediction, the model should pass well-defined tests. The Collaboratory for the Study of Earthquake Predictability (CSEP) Project (Jordan, 2006) has been organized to solve questions related to earthquake predictability and to develop an adequate experimental infrastructure to conduct scientific prediction experiments under rigorous conditions. The aim of the CSEP Project is to evaluate each proposed model with three statistical tests, i.e., the N-, L-, and R-tests (Kagan and Jackson, 1995). In these tests, observed scores are compared with the distributions of respective scores expected from a proposed model. When the observed scores fall within an acceptance range, the model is not rejected.
In calculating observed scores, effects due to uncertainties in the earthquake source parameters of location, depth, and magnitude are also taken into account in the CSEP procedures (Schorlemmer et al., 2007). The underlying rationale for this step is that parameter uncertainties may cause earthquakes to be associated with bins differing from those originally assigned. A serious problem may arise if these uncertainties are ignored. For example, the involvement of a particular earthquake in the score can become largely a matter of chance if it is close to the boundaries of the test region or to the lower magnitude limit of the tests.
A large number of simulated catalogs are generated in the CSEP tests in order to obtain distributions of N-, L-, and R-scores expected from the proposed models (Schorlemmer et al., 2007). The observed scores are estimated using simulated catalogs derived from an original one by allowing for uncertainties in earthquake source parameters. Generating a large number of catalogs consumes a great deal of computational time that is proportional to the number of proposed models and time and space segments necessary for forecasting.
In this paper, we propose a method for conducting these tests analytically without generating simulated catalogs. This method is based on the assumptions that the rate in each cell is far less than unity and that at most one earthquake occurs in one cell. For each test, we analytically derive two sets of the mean and variance of the respective score: that expected from a proposed model (Imoto, 2009; Imoto and Rhoades, 2010) and that of the probable score if uncertainties in hypocenter parameters are taken into account. The central limit theorem allows us to regard the distribution of the scores as approximately normal, with the analytically obtained means and variances, if the number of earthquakes in the test area is large enough.
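As a minimal sketch of this idea, the Python below computes the mean and variance of an L-score under the stated assumption that each cell holds at most one earthquake. It assumes, for illustration only, that the score takes the form L = Σ (x_i ln r_i − r_i), with x_i a 0/1 indicator that occurs with probability r_i. The function names and the cell rates are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist

def l_score_moments(rates):
    """Mean and variance of L = sum_i (x_i * ln(r_i) - r_i), x_i ~ Bernoulli(r_i).

    Assumes every rate satisfies 0 < r_i < 1 and that cells are independent.
    """
    mean = sum(r * math.log(r) - r for r in rates)
    var = sum(r * (1.0 - r) * math.log(r) ** 2 for r in rates)
    return mean, var

def l_score_normal(rates):
    """Normal approximation to the L-score distribution (central limit theorem)."""
    mean, var = l_score_moments(rates)
    return NormalDist(mu=mean, sigma=math.sqrt(var))

# Hypothetical forecast: 2000 cells with small, unequal rates.
rates = [0.002 + 0.00001 * (i % 100) for i in range(2000)]
dist = l_score_normal(rates)

# 5% and 95% points of the approximate score distribution.
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)
```

No catalog is generated at any point: the two sums over cells replace the simulation step.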
The proposed method is applied to two seismicity models for Kanto, central Japan. The first model (Hazmap model) is tentatively introduced as a candidate for testing and is a subset of a seismicity model used in estimating the recent seismic hazard maps for Japan. The second one is the EEPAS model (Rhoades and Evison, 2004, 2005), which was developed into a horizontal multi-layered model for seismicity in the Kanto region, central Japan (Rhoades and Evison, 2006; Imoto and Rhoades, 2010). The Hazmap model is configured to the tectonic setting in Kanto, where three plates (the Eurasian (North American, or Okhotsk) Plate, the Philippine Sea Plate, and the Pacific Plate) converge. Differences in configuration between the Hazmap and EEPAS models could be resolved by re-configuring the EEPAS model to be compatible with the Hazmap under a simple assumption. Thus, the Hazmap model could be compared with the EEPAS model in the R-test.
In order to compare our analytical approach with that involving the use of simulated catalogs, we have compared the distributions of L-scores by both methods. The results show that the means of L-scores for earthquakes conforming to a model under test and for those with parameter uncertainties taken into consideration are similar whether computed by our method or by the method involving simulated catalogs.
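A comparison of this kind can be sketched as follows. The example simulates catalogs cell by cell and compares the sample mean and variance of the L-scores with the analytical values. It assumes the same illustrative score form, L = Σ (x_i ln r_i − r_i) with at most one event per cell. The rates, the seed, and the 2000 catalogs (fewer than the 10,000 used in the study, to keep the run short) are hypothetical.

```python
import math
import random

def analytic_moments(rates):
    """Analytical mean and variance of the L-score for Bernoulli cells."""
    mean = sum(r * math.log(r) - r for r in rates)
    var = sum(r * (1.0 - r) * math.log(r) ** 2 for r in rates)
    return mean, var

def simulate_l_scores(rates, n_catalogs, seed=0):
    """L-scores of simulated catalogs, one Bernoulli draw per cell."""
    rng = random.Random(seed)
    const = -sum(rates)                    # the -r_i terms do not depend on the catalog
    logs = [math.log(r) for r in rates]
    scores = []
    for _ in range(n_catalogs):
        occupied = sum(lg for r, lg in zip(rates, logs) if rng.random() < r)
        scores.append(const + occupied)
    return scores

# Hypothetical forecast: 300 cells with rates between 0.01 and 0.04.
rates = [0.01 + 0.0001 * i for i in range(300)]
ana_mean, ana_var = analytic_moments(rates)

scores = simulate_l_scores(rates, n_catalogs=2000)
sim_mean = sum(scores) / len(scores)
sim_var = sum((s - sim_mean) ** 2 for s in scores) / (len(scores) - 1)

print(f"analytic : mean={ana_mean:.2f} var={ana_var:.2f}")
print(f"simulated: mean={sim_mean:.2f} var={sim_var:.2f}")
```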
2. N-, L-, and R-Tests
In the N-, L-, and R-tests, scores obtained from the observed data are compared with a distribution of scores that could be expected assuming a given model to be correct. If the observed scores are not consistent with these distributions, the null hypothesis that the real earthquake sequences conform to the model may be rejected. If an observed score falls outside an acceptance range, the model should be rejected at the respective level of significance. In the following discussion of the three tests, we will introduce methods to estimate distributions of the test scores for a catalog with uncertain parameters. Using methods already presented in previous papers (Imoto, 2009; Imoto and Rhoades, 2010), these tests can be conducted without using simulated catalogs.
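Once a mean and a variance of the expected score are available, the comparison described above reduces to locating the observed score within a normal distribution. The sketch below illustrates that step under the assumption of a two-sided acceptance range between the 5% and 95% points. The function names and the numbers in the example are hypothetical.

```python
from statistics import NormalDist

def normal_quantile(observed, mean, variance):
    """Quantile of the observed score under the normal approximation."""
    return NormalDist(mu=mean, sigma=variance ** 0.5).cdf(observed)

def is_rejected(observed, mean, variance, lower=0.05, upper=0.95):
    """True if the observed score falls outside the acceptance range."""
    q = normal_quantile(observed, mean, variance)
    return not (lower <= q <= upper)

# Hypothetical N-test: a Poisson forecast of 25 events has mean = variance = 25.
print(is_rejected(27, mean=25.0, variance=25.0))  # within the range
print(is_rejected(40, mean=25.0, variance=25.0))  # three standard deviations above
```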
2.1.1 N-score expected from a proposed model
2.1.2 N-score from observed events with uncertain hypocentral parameters
Under an assumption similar to that in the distribution of g(n), the observed number of earthquakes may be well represented by a normal distribution, f(n), with the mean and the variance given in Eqs. (4) and (5).
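To illustrate how such a mean and variance can arise, suppose each catalogued earthquake k has a probability p_k of truly lying inside the test volume and that the events are independent. The observed number is then a sum of independent 0/1 variables, with mean Σ p_k and variance Σ p_k (1 − p_k). The sketch below uses that standard result and does not reproduce Eqs. (4) and (5) themselves. The probabilities are hypothetical.

```python
import math
from statistics import NormalDist

def observed_count_moments(probs):
    """Mean and variance of the number of events inside the test volume.

    probs[k] is the probability that event k lies inside the volume;
    events are assumed independent.
    """
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    return mean, var

# Hypothetical inclusion probabilities for six events.
probs = [1.0, 1.0, 0.9, 0.5, 0.2, 0.05]
mean, var = observed_count_moments(probs)

# Normal approximation f(n) to the distribution of the observed number.
f_n = NormalDist(mu=mean, sigma=math.sqrt(var))
```

Events that are certainly inside (p = 1) or certainly outside (p = 0) add nothing to the variance. Only events near the spatial boundary or the magnitude cutoff contribute.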
2.2.1 L-score expected from a proposed model
2.2.2 L-score from observed events with uncertain hypocentral parameters
If the event occurs in the j′-th cell rather than in the j-th cell, the difference in likelihood is given by the same formula with j′ in place of j. If the location and/or magnitude parameters contain significant errors, an earthquake could be located in several different cells with certain probabilities.
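The effect of spreading one earthquake over several candidate cells can be sketched as follows. The example assumes, for illustration only, that an event located in cell j adds ln r_j to the L-score, and that any probability not assigned to a listed cell corresponds to the event falling outside the test volume, where it adds nothing. The function name, probabilities, and rates are hypothetical.

```python
import math

def event_l_contribution(cell_probs, cell_rates):
    """Mean and variance of one event's log-likelihood contribution.

    cell_probs[j] is the probability that the event lies in cell j.
    Any remaining probability (1 - sum) is treated as falling outside
    the test volume, where the contribution is zero.
    """
    logs = [math.log(r) for r in cell_rates]
    mean = sum(q * lg for q, lg in zip(cell_probs, logs))
    second_moment = sum(q * lg ** 2 for q, lg in zip(cell_probs, logs))
    return mean, second_moment - mean ** 2

# Hypothetical event: 70% in a cell of rate 0.01, 20% in a cell of rate 0.001,
# and 10% outside the test volume.
mean, var = event_l_contribution([0.7, 0.2], [0.01, 0.001])
```

Summing these means and variances over independent events would give the moments of the observed L-score, which can then be used in a normal approximation.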
2.3.1 R-score expected from two models
2.3.2 R-score from observed events with uncertain hypocentral parameters
3. Models and Data
3.1 Seismic hazard map model
Probabilistic seismic hazard maps were prepared for Japan based on the long-term probability of earthquakes (Fujiwara, 2004; Fujiwara et al., 2009). Future earthquakes in and around Japan are classified into several categories, such as along major and minor inland active faults, thrust faults along subduction zones, and others. The probabilistic seismic hazard estimates from every category are then merged into a total hazard.
In the present study, one candidate seismicity model tentatively considered is a subset of the long-term probability pertaining to the category of earthquakes without specified faults (Hazmap model). In this category, the long-term probability of earthquakes is estimated based on a smoothed seismicity of past earthquakes. The target earthquakes in our tested time-space volume all belong to this category.
3.2 EEPAS model
The total rate density is obtained by summing over all past occurrences, including earthquakes outside R that could affect the rate density within R. More detail is given in previous studies (e.g., Evison and Rhoades, 2002, 2004; Rhoades and Evison, 2004). We do not distinguish the magnitudes of target earthquakes. Therefore, the integral form of the rate density is used. A minor modification of the EEPAS model was made to study three-dimensional seismicity in the Kanto region, Japan (Rhoades and Evison, 2006), where the depth range of 0–120 km is divided into six layers, and the EEPAS model is applied to each layer.
Differences in configuration between the Hazmap and EEPAS models (Fig. 1(b)) prevent us from performing the R-test directly. However, one simple assumption enables us to make the EEPAS model compatible with the Hazmap model. Assuming that the cell size of the EEPAS model is so small that a uniform Poisson process is maintained in each cell, we can divide a cell into multiple pieces of arbitrary size, the hazard rates of which are estimated to be proportional to their volume. In our case, only the depth range for the EEPAS model differs from that of the Hazmap model. Therefore, we performed “divide” and “connect” operations on cells of the EEPAS model to make them compatible in the depth range only, since the horizontal grid spacing of 0.1° × 0.1° is common to the two models. The modified EEPAS model thus obtained is adopted in the present study for comparison with the Hazmap model. Hereafter, we refer to this modified EEPAS model as the EEPAS model.
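The “divide” and “connect” operations can be sketched for a single horizontal cell as below. Because the horizontal grid is shared, volume is proportional to layer thickness, so each source layer's rate is split according to its depth overlap with each target layer. The six equal 20-km layers, the target layer boundaries, and the rates are assumptions made for illustration and are not the actual model configurations.

```python
def rebin_depth(rates, src_edges, dst_edges):
    """Redistribute per-layer rates onto new depth layers.

    Each source layer is assumed to host a uniform Poisson process, so its
    rate is divided in proportion to the overlapping thickness ("divide")
    and the pieces falling in one target layer are summed ("connect").
    """
    out = [0.0] * (len(dst_edges) - 1)
    for i, rate in enumerate(rates):
        s_top, s_bot = src_edges[i], src_edges[i + 1]
        for j in range(len(out)):
            d_top, d_bot = dst_edges[j], dst_edges[j + 1]
            overlap = min(s_bot, d_bot) - max(s_top, d_top)
            if overlap > 0:
                out[j] += rate * overlap / (s_bot - s_top)
    return out

# Hypothetical column: six 20-km layers from 0 to 120 km depth.
src_edges = [0, 20, 40, 60, 80, 100, 120]
src_rates = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

# Hypothetical target layering with three layers.
dst_edges = [0, 30, 60, 120]
dst_rates = rebin_depth(src_rates, src_edges, dst_edges)
```

The total rate in the column is preserved as long as the target layers cover the same depth range as the source layers.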
Table 1. List of target earthquakes. “Prob” in the last column indicates the probability that the earthquake falls in the study volume.
Figure 4(b) compares the observed R-score with that expected from distributions under the EEPAS model. The intersections of broken lines and solid curves remain in the acceptance range (between 5% and 95%). This would be maintained even if we consider the extreme value of R-score that is estimated as side lobes of the normal density function at the bottom. Based on this result, the null hypothesis that earthquake occurrences conform to the EEPAS model rather than the Hazmap model cannot be rejected.
In this paper, we have derived the means and variances of the distributions for N-, L-, and R-scores in the cases of (1) seismicity compatible with the proposed model and (2) catalogs with errors in the observed parameters. With these means and variances, N-, L-, and R-tests can be performed for the proposed models in a simple way, provided that the distributions associated with the tests are well approximated by normal distributions with these means and variances. This approximation is basically guaranteed by the central limit theorem if the number of earthquakes is sufficiently large. For the first case in Section 2.1.1, the Poisson distribution is approximated with a normal distribution. This approximation is likely to hold when the total expected number of earthquakes exceeds 10.
The next case, presented in Section 2.1.2, is that in which binomial distributions are approximated by a normal distribution, and it should be carefully considered. One of the various rules that may be used to decide whether a sample size is large enough for this approximation is that both the expected value and the value of the sample size minus the expected value must be greater than 5. If we consider a case of only errors in location, only a small proportion of earthquakes contributes to the variance of n(O0) in Eq. (5), i.e., only earthquakes near the border of the test area contribute to the variance since only such events are origins of changes in n(O0).
Therefore, the above rule for the approximation may not be satisfied. However, if we consider the case of errors both in locations and magnitude, the rule must be satisfied in most cases. For example, provided that we observe ten earthquakes with magnitudes exceeding the cutoff Mc in the test area and the standard deviation of the magnitude determination is 0.2, we expect about 21 earthquakes in the magnitude range between Mc - 0.4 and Mc + 0.4, where the b-value of the Gutenberg-Richter relation is assumed to be 1. We could expect seven earthquakes larger than Mc and 14 less than this, which may satisfy the above rule.
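The count in this example can be checked with a short calculation based on the Gutenberg-Richter relation, under which the number of events at or above magnitude M scales as 10^(−bM). The sketch below takes the ten events above Mc and b = 1 from the text. The function name is hypothetical.

```python
def gr_count(n_above_mc, b, dm_low, dm_high):
    """Expected number of events with magnitude in [Mc + dm_low, Mc + dm_high).

    n_above_mc is the number of events with magnitude >= Mc; counts scale
    as 10 ** (-b * dm) relative to Mc (Gutenberg-Richter relation).
    """
    return n_above_mc * (10 ** (-b * dm_low) - 10 ** (-b * dm_high))

total = gr_count(10, 1.0, -0.4, 0.4)   # Mc - 0.4 to Mc + 0.4
above = gr_count(10, 1.0, 0.0, 0.4)    # Mc to Mc + 0.4
below = gr_count(10, 1.0, -0.4, 0.0)   # Mc - 0.4 to Mc

print(f"total={total:.1f} above={above:.1f} below={below:.1f}")
```

The total comes to about 21, in line with the figure quoted above. With these particular assumptions the split is roughly 6 above and 15 below Mc, of the same order as the seven and 14 quoted, and either split satisfies the rule.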
In our case, we presume the standard deviation of the magnitude determination to be 0.1, and earthquakes of magnitude ≥4.7 are considered in our test. In total, 51 events are used, among which 24 events are registered with magnitude <5.0. The expected number of events for the five years is 25.4 and is given as the summation of the probabilities in the last column of Table 1. Accordingly, the sample size minus the expected value becomes 25.6, which is much greater than 5.0 and satisfies the above rule.
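The rule of thumb applied here can be written as a small check. The sketch uses the sample size of 51 and the expected value of 25.4 quoted in the text. The function name and the default threshold argument are illustrative.

```python
def normal_approx_ok(sample_size, expected, threshold=5.0):
    """Rule of thumb for approximating a binomial-type count by a normal.

    Both the expected value and (sample size - expected value) must
    exceed the threshold.
    """
    return expected > threshold and (sample_size - expected) > threshold

print(normal_approx_ok(51, 25.4))  # 25.4 > 5 and 25.6 > 5
```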
In summary, we present here methods by which to perform N-, L-, and R-tests without generating random catalogs conforming to the test models or catalogs modified from the observed catalog with uncertainties in the parameters. We applied the proposed method to the Hazmap and EEPAS models in Kanto, central Japan. The N- and L-tests for the years 2004–2008 confirm the self-consistency of both models. The values for the R-test for the last 5 years suggest that the EEPAS model is superior to the Hazmap model over this period. This 5-year comparison is in no way a meaningful test of the Hazmap model itself, which is a long-term model designed for a time period of 30 years. Rather, it suggests that the time-varying EEPAS model contains information about earthquake occurrence on a 5-year timescale that is not contained in longer term estimates of seismicity.
Our comparison of L-scores derived analytically with those derived from simulated catalogs indicates no significant difference between the two sets of scores, thus implying that the proposed method is an alternative to the current one. However, caution is warranted because the analytical method is only reliable when the assumptions on which it is based are satisfied.
This manuscript was greatly improved by the comments of an anonymous reviewer and K. Nanjo.
- Evison, F. F. and D. A. Rhoades, Precursory scale increase and long-term seismogenesis in California and northern Mexico, Ann. Geophys., 454, 479–495, 2002.
- Evison, F. F. and D. A. Rhoades, Demarcation and scaling of long-term seismogenesis, Pure Appl. Geophys., 161, 21–45, 2004.
- Fujiwara, H., National seismic hazard mapping project of Japan, Proceedings of the 5th U.S.-Japan Natural Resources Meeting, U.S. Geological Survey, Open-File Report 2005–1131, 14, 2004.
- Fujiwara, H., S. Kawai, S. Aoi, N. Morikawa, S. Senna, N. Kudo, M. Ooi, K. Hao, K. Wakamatsu, Y. Ishikawa, T. Okumura, T. Ishii, S. Matsushima, Y. Hayakawa, N. Toyama, and A. Narita, A study on “National seismic hazard maps for Japan”, Technical Note National Research Institute for Earth Science and Disaster Prevention, No. 336, 2009 (in Japanese).
- Imoto, M., Comments on the N-, L- and R-tests for seismicity models, Zisin 2, 61, 207–209, 2009 (in Japanese).
- Imoto, M. and D. A. Rhoades, Seismicity models of moderate earthquakes in Kanto, Japan, utilizing multiple predictive parameters, Pure Appl. Geophys., 167, 831–843, 2010.
- JMA, The Seismological and Volcanological Bulletin of Japan, Japan Meteorological Agency, ISSN 1349–8320, 2009.
- Jordan, T. H., Earthquake predictability, brick by brick, Seismol. Res. Lett., 77, 3–6, 2006.
- Kagan, Y. Y. and D. D. Jackson, New seismic gap hypothesis: five years after, J. Geophys. Res., 100, 3943–3960, 1995.
- NIED, J-SHIS at http://wwwold.j-shis.bosai.go.jp/j-shis/index en.html (as of July 1, 2010).
- Rhoades, D. A. and F. F. Evison, Long-range earthquake forecasting with every earthquake a precursor according to scale, Pure Appl. Geophys., 161, 47–72, 2004.
- Rhoades, D. A. and F. F. Evison, Test of the EEPAS forecasting model on the Japan earthquake catalogue, Pure Appl. Geophys., 162, 1271–1290, 2005.
- Rhoades, D. A. and F. F. Evison, The EEPAS forecasting model and the probability of moderate-to-large earthquakes in central Japan, Tectonophysics, 417, 119–130, 2006.
- Schorlemmer, D., M. C. Gerstenberger, S. Wiemer, D. D. Jackson, and D. A. Rhoades, Earthquake likelihood model testing, Seismol. Res. Lett., 78, 17–29, 2007.