Some reasoning on the RELM-CSEP likelihood-based tests
Earth, Planets and Space volume 66, Article number: 4 (2014)
Abstract
The null hypothesis is the essence of any statistical test: a test is basically a comparison of what we observe with what we would expect to see if the null hypothesis were true. In this work, I explore the suitability of the null hypothesis of the likelihood-based tests (LBTs), which are often adopted by the laboratories of the Collaboratory for the Study of Earthquake Predictability (CSEP) to check earthquake forecast models. First, I discuss the LBT in the wider context of classical statistical hypothesis testing. Then, I present some cases in which the null hypothesis of the LBT is not appropriate for determining the merits of earthquake forecast models. I justify these results from a theoretical point of view, within the framework of point process theory. Finally, I propose a possible upgrade of the LBT to enable the correct assessment of the forecasting capability of earthquake models. This study may provide new insights into the CSEP LBT.
Background
The increasing interest of the seismological community in earthquake forecasting has highlighted the need for a proper evaluation of forecast models. This has motivated the birth of the working group on Regional Earthquake Likelihood Models (RELM, Schorlemmer and Gerstenberger 2007) and of the Collaboratory for the Study of Earthquake Predictability (CSEP, Jordan 2006), both designed to evaluate the quality of forecast models. The protocol adopted by RELM/CSEP is based on classical statistical hypothesis testing (Schorlemmer et al. 2007): it rejects or accepts the null hypothesis (hereinafter H_{0}) on the basis of a numerical summary of the data. The RELM/CSEP working groups adopt two main types of testing methods: likelihood-based tests (LBTs) (Schorlemmer et al. 2007; Zechar et al. 2010) and alarm-based tests (ABTs) (Zechar and Jordan 2008). In this study, I focus on LBTs and specifically on the N and L tests (Schorlemmer et al. 2007).
The RELM/CSEP working groups formalized the LBT to test hypotheses that ‘should follow directly the model, so that if the model is valid, the hypothesis should be consistent with data used in a test. Otherwise, the hypothesis, and the model on which it was constructed, can be rejected’ (Schorlemmer et al. 2007). Actually, as I discuss below, this intent was not attained (Lombardi and Marzocchi 2010a; Schorlemmer et al. 2007, 2010a; Werner et al. 2010).
The CSEP testing centers use the N and L tests to check the consistency of expected (Λ = {λ_{(i,j)}}) and observed (Ω = {ω_{(i,j)}}) values of variables X_{(i,j)}, representing the number of earthquakes with magnitude above a threshold M_F, in non-overlapping bins {(T_i, R_j); T_i ∈ 𝒯, R_j ∈ ℛ} of a predetermined spatiotemporal space 𝒮 = ℛ × 𝒯 (Jordan 2006; Zechar et al. 2010). A model is represented by the forecasts Λ, which are the only values provided by the modelers. The correct calculation of the p values of the LBT requires the probability distribution of X_{(i,j)} given by the model and specifically the probabilities

p_n^{ij} = P{X_{(i,j)} = n},  n = 0, 1, 2, …   (1)
As this information is not provided by the modelers, the LBT assumes, as the null hypothesis H_{0}, that the variables X_{(i,j)} are independent and follow a Poisson distribution with mean λ_{(i,j)}. Therefore, the probabilities p_n^{ij} are replaced by the probabilities

q_n^{ij} = [λ_{(i,j)}]^n e^{−λ_{(i,j)}} / n!,  n = 0, 1, 2, …   (2)
and the p values of the LBT are computed accordingly (Schorlemmer et al. 2007).
Specifically, the N test measures the probability of observing N_i^O = Σ_j ω_{(i,j)} events in each forecast time period T_i. The p values of the N test are given by the probabilities (Zechar et al. 2010):

δ_1 = P{X_i ≥ N_i^O}  and  δ_2 = P{X_i ≤ N_i^O}   (3)
where X_i = Σ_j X_{(i,j)}. The RELM/CSEP protocol rejects a model if δ_1 or δ_2 is too small, meaning that the model underpredicts or overpredicts the observed seismicity, respectively. Under H_{0}, X_i is a Poisson variable with expectation N_i^F = Σ_j λ_{(i,j)} (and PDF q_n^i = [N_i^F]^n e^{−N_i^F} / n!), and the percentiles δ_1 and δ_2 are computed from this distribution (see Schorlemmer et al. 2007).
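As a minimal illustration of how δ_1 and δ_2 follow from the Poisson CDF under H_{0}, the sketch below computes both quantile scores; it is not CSEP code, and the forecast and observation numbers are hypothetical:

```python
import math

def poisson_cdf(n, lam):
    # P(X <= n) for a Poisson variable with mean lam (n < 0 gives 0)
    return math.exp(-lam) * sum(lam ** k / math.factorial(k) for k in range(n + 1))

def n_test(n_obs, n_forecast):
    # delta1 = P(X >= n_obs): a small value flags underprediction by the forecast
    delta1 = 1.0 - poisson_cdf(n_obs - 1, n_forecast)
    # delta2 = P(X <= n_obs): a small value flags overprediction by the forecast
    delta2 = poisson_cdf(n_obs, n_forecast)
    return delta1, delta2

# Hypothetical example: forecast of 10 expected events, 18 observed;
# delta1 falls below 0.05, so this forecast would be rejected as underpredicting.
d1, d2 = n_test(18, 10.0)
```

Note that both scores are computed from the single Poisson PDF with mean N_i^F; this is exactly the point where Hyp_2 enters the procedure.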
The L test measures the probability of the joint log-likelihood L(Ω_i|Λ) of observing Ω_i, given the forecast Λ. Under H_{0}, L(Ω_i|Λ) is given by:

L(Ω_i|Λ) = Σ_j [ −λ_{(i,j)} + ω_{(i,j)} log λ_{(i,j)} − log(ω_{(i,j)}!) ]   (4)
The p value of the L test is estimated by comparing L(Ω_i|Λ) with a predetermined number N of synthetic likelihood values L(Ω_i^S|Λ) = {L(Ω_i^{S_l}|Λ), l = 1, …, N}, computed by Equation 4 for simulated catalogs ‘consistent with the forecast’ (Schorlemmer et al. 2007). This means that the forecast grids Ω_i^{S_l} are simulated according to the Poisson hypothesis supposed by H_{0}, and the p value of the L test is given by the proportion of simulated log-likelihoods below the value L(Ω_i|Λ):

γ_i = #{ l : L(Ω_i^{S_l}|Λ) ≤ L(Ω_i|Λ) } / N   (5)
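Under Hyp_2, the synthetic catalogs are drawn bin by bin from Poisson distributions with the forecast means. The Monte Carlo p value can be sketched as follows (a toy forecast grid, not the CSEP implementation):

```python
import math
import random

def log_likelihood(omega, lam):
    # Joint Poisson log-likelihood of bin counts omega given forecast rates lam
    return sum(-l + w * math.log(l) - math.log(math.factorial(w))
               for w, l in zip(omega, lam))

def poisson_draw(rng, lam):
    # Knuth's method; adequate for the small per-bin rates of forecast grids
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def l_test(omega, lam, n_sim=1000, seed=0):
    rng = random.Random(seed)
    l_obs = log_likelihood(omega, lam)
    # Simulate n_sim bin-count grids under the Poisson null hypothesis
    sims = [log_likelihood([poisson_draw(rng, l) for l in lam], lam)
            for _ in range(n_sim)]
    # p value: fraction of simulated log-likelihoods at or below the observed one
    return sum(1 for s in sims if s <= l_obs) / n_sim
```

A very small p value leads to rejection of the forecast; the point at issue in this paper is that the simulated catalogs come from the Poisson surrogate, not from the model itself.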
This shows that the LBT does not check the hypothesis that a forecast model has merit with the given data (marked hereinafter by Hyp_{1}). Actually, the LBT tests whether the {ω_{(i,j)}} are independent random variables from a Poisson population with means {λ_{(i,j)}} (marked hereinafter by Hyp_{2}). When a model is not consistent with Hyp_{2}, i.e., when the set of probabilities {p_n^{ij}} is significantly different from {q_n^{ij}}, the specific computation of the p values of the LBT is misleading, causing a potentially unjustified rejection of the model itself (Lombardi and Marzocchi 2010a).
The CSEP laboratories still systematically use the LBT, but a process of revision has begun. This study is intended to provide a contribution to this process.
Methods
A suitable revision of the LBT requires the full recognition and quantification of the causes and effects of the present inefficiencies. For this purpose, I apply the N and L tests to two classes of 1,000 simulated forecast grids each, generated by different spatiotemporal magnitude models. In this way, the data are perfectly known, and a rejection of H_{0} cannot be ascribed to a failure of the model being tested.
First, I generate two sets of synthetic catalogs. Each catalog covers a time period of 1 month (January 1 to 31, 2012), the Italian collection region, and a magnitude range of [2.5, 9.0], as chosen by CSEP (Schorlemmer et al. 2010b).
The first class of simulations is consistent with a version of the epidemic-type aftershock sequence (ETAS) model (Ogata 1998) submitted to the CSEP-Italy testing region (Lombardi and Marzocchi 2010b). The rate of the model at time t, location (x,y), and magnitude m is given by:

λ_1(t, x, y, m | ℋ_t) = β e^{−β(m−M_0)} / (1 − e^{−β(M_max−M_0)}) × [ μ + Σ_{T_i<t} K e^{α(M_i−M_0)} (t − T_i + c)^{−p} (r_i^2 + d e^{γ(M_i−M_0)})^{−q} ], with β = b ln(10)   (6)
where {μ, K, c, p, α, d, q, γ, b} are the model parameters, M_0 and M_max are the minimum and maximum magnitudes, ℋ_t = {(T_i, X_i, Y_i, M_i); T_i < t} is the history (i.e., the information relative to past events) up to time t, and r_i is the distance between the location (x,y) and the epicenter (X_i, Y_i) of the i-th event (see Lombardi and Marzocchi 2010b, for details). To compute the rate λ_1(t, x, y, m | ℋ_t), I include in the history the seismic bulletin of the Istituto Nazionale di Geofisica e Vulcanologia (INGV) from April 16, 2005 to December 31, 2011. Moreover, I add a synthetic event (T_ms, X_ms, Y_ms, M_ms) at time 00:00:00 on January 1, 2012 (T_ms), with magnitude M_ms = 6.0 and coordinates (X_ms, Y_ms) = (13.384°E, 42.346°N). The parameter values used in this study are μ = 0.7, K = 0.026, p = 1.15, c = 0.01, α = 1.4, d = 0.7, q = 1.5, γ = 0.3, b = 1.0, M_0 = 2.5, and M_max = 9.0.
To generate the ETAS forecasts for day T_i and catalog C_k, I mimic the CSEP real-time experiment: specifically, I include the triggering rate for the events in the history ℋ_{T_i} of C_k and average the triggering rates of 1,000 simulated realizations of the process inside T_i (see Lombardi and Marzocchi 2010b, for details).
The second class of simulations follows a non-stationary Poisson (NP) process. Specifically, the rate λ_2(t, x, y, m) is given by a stationary background plus the triggering effect of the event (T_ms, X_ms, Y_ms, M_ms). The rate of the NP model is as follows:

λ_2(t, x, y, m) = β e^{−β(m−M_0)} / (1 − e^{−β(M_max−M_0)}) × [ μ + K e^{α(M_ms−M_0)} (t − T_ms + c)^{−p} (r^2 + d e^{γ(M_ms−M_0)})^{−q} ], with β = b ln(10)   (7)
where r is the distance between (x,y) and (X_ms, Y_ms). The parameters used here are μ = 0.7, K = 0.1, p = 0.9, c = 0.02, α = 1.4, d = 0.7, q = 1.5, γ = 0.3, b = 1.0, M_0 = 2.5, and M_max = 9.0.
The simulations represent the average seismicity of the first month of a sequence (following a shock with magnitude 6.0), as predicted by the ETAS and NP models. The basic difference between the models is that the rate of the ETAS model depends on the whole history ℋ_t (i.e., the information relative to past events), whereas the rate of the NP model depends on only one event, (T_ms, X_ms, Y_ms, M_ms). Thus, the rate of the NP model is deterministic and decreases in time from T_ms, whereas the rate of the ETAS model has a random, non-monotonic time evolution, depending on the history ℋ_t.
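The deterministic character of the NP rate is easy to visualize from its temporal part alone. The sketch below (spatial and magnitude terms omitted, and the magnitude-productivity factor folded into K, so the numbers are illustrative rather than the actual forecast values) integrates a constant background plus an Omori-type decay over daily bins, using the NP temporal parameters μ = 0.7, K = 0.1, c = 0.02, and p = 0.9:

```python
def expected_count(t0, t1, mu=0.7, K=0.1, c=0.02, p=0.9):
    # Integral over [t0, t1] (days after the mainshock at t = 0) of the
    # temporal rate mu + K * (t + c)**(-p): background plus Omori-type decay.
    # Closed form, valid for p != 1.
    omori = K * ((t1 + c) ** (1 - p) - (t0 + c) ** (1 - p)) / (1 - p)
    return mu * (t1 - t0) + omori

# Daily expected counts over the 1-month forecast window
daily = [expected_count(d, d + 1) for d in range(30)]
# The sequence decreases monotonically toward the background level mu per day.
```

No matter what is observed during the month, these expected counts never change: this is exactly the sense in which the NP forecast is deterministic, in contrast to the history-dependent ETAS forecast.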
For each synthetic catalog, I compute the 1-day binned forecast grids Λ (M_F = 2.5) by integrating (in time, space, and magnitude) the rate of the model used to generate the catalog. The forecast grids Λ cover a period of 1 month (starting from January 1, 2012) and the test spatial grid adopted for the CSEP Italian laboratory (Schorlemmer et al. 2010b). Finally, I apply the CSEP/RELM N and L tests (with significance level α = 0.05 and M_F = 2.5) to all simulated catalogs, using the forecast grids previously computed.
In this paper, I propose a straightforward upgrade of the LBT, which does without the Poisson distribution. First, the discrete log-likelihood function L(Ω_i|Λ) of the variables X_i (Equation 4) is replaced by the continuous-time log-likelihood function (hereinafter, CLF). This is a proper measure of the agreement between model and data, taking into account the features of a model. For a spatiotemporal magnitude earthquake model, it is given by

CLF = Σ_{i=1}^{N_{ℛ×𝒯×[M_0,M_max]}} log λ(t_i, x_i, y_i, m_i) − ∫_𝒯 ∫_ℛ ∫_{M_0}^{M_max} λ(t, x, y, m) dm dx dy dt   (8)
where λ(t, x, y, m) is the rate of the model (Daley and Vere-Jones 2003) and N_{ℛ×𝒯×[M_0,M_max]} is the number of events inside the spatiotemporal magnitude space ℛ×𝒯×[M_0, M_max].
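For a purely temporal process, the CLF reduces to the sum of log-rates at the observed event times minus the integral of the rate over the forecast window. A minimal sketch (with toy event times and a toy rate, for illustration only):

```python
import math

def clf(event_times, rate, integrated_rate):
    # Continuous-time log-likelihood of a temporal point process:
    # sum of log-rates at the events minus the integrated rate over the window.
    return sum(math.log(rate(t)) for t in event_times) - integrated_rate

# Toy example: constant rate of 2 events/day over a 10-day window,
# with 3 observed events; the value equals 3*log(2) - 20.
value = clf([1.0, 3.0, 7.0], lambda t: 2.0, 2.0 * 10.0)
```

For a history-dependent model such as ETAS, `rate` would be the conditional intensity evaluated on the realized history, which is precisely the information the discrete Poisson likelihood of Equation 4 discards.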
Second, the percentiles of the distributions of both the variables X_{ i } and the CLF are derived directly by the model. This information allows the computation of more reliable p values for the tests (Werner and Sornette 2008; Schorlemmer et al. 2010a).
In brief, the new testing procedure presented here consists of the following steps:

1.
For each forecast period T_i, the number of events (Ω_i) and the CLF (CLF_{M,i}) of the model M being tested are computed.

2.
For each T_i, N catalogs given by model M are simulated; the occurrences Ω_{M,i}^S = {Ω_{M,i}^{S_l}, l = 1, …, N} and the likelihoods CLF_{M,i}^S = {CLF_{M,i}^{S_l}, l = 1, …, N} are computed for all catalogs.

3.
The percentiles of the empirical distributions generated in the previous step, used to perform a test at the 95% confidence level, are estimated. Specifically, the 2.5th and 97.5th percentiles (P_{M,i}^Ω[2.5%] and P_{M,i}^Ω[97.5%]) of the values Ω_{M,i}^S and the 5th percentile (P_{M,i}^CLF[5%]) of the quantities CLF_{M,i}^S are identified.

4.
The observed values Ω_i and CLF_{M,i} are compared with the percentiles computed in the previous step. In this way, model M is rejected or retained for T_i. Specifically, model M is rejected if Ω_i < P_{M,i}^Ω[2.5%], if Ω_i > P_{M,i}^Ω[97.5%], or if CLF_{M,i} ≤ P_{M,i}^CLF[5%].
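The four steps above can be sketched as follows. The simulator used here is a toy stand-in (a binomial count paired with a made-up likelihood score), since simulating a full ETAS catalog is beyond the scope of an illustration; the essential point is that the synthetic catalogs come from the model M itself, not from a Poisson surrogate:

```python
import random

def percentile(vals, q):
    # Nearest-rank percentile (q in [0, 100]) of a list of values
    s = sorted(vals)
    idx = min(len(s) - 1, max(0, round(q / 100.0 * (len(s) - 1))))
    return s[idx]

def percentile_test(n_obs, clf_obs, simulate, n_sim=1000, seed=1):
    # simulate(rng) must return (event_count, clf) for one synthetic catalog
    # drawn from the model M being tested (step 2 of the procedure).
    rng = random.Random(seed)
    counts, clfs = zip(*(simulate(rng) for _ in range(n_sim)))
    # Step 3: percentiles of the model's own empirical distributions
    reject_n = n_obs < percentile(counts, 2.5) or n_obs > percentile(counts, 97.5)
    reject_clf = clf_obs <= percentile(clfs, 5)
    # Step 4: True means model M is rejected for this forecast period
    return reject_n or reject_clf

# Toy stand-in for model M: count ~ Binomial(100, 0.1), with -|count - 10|
# as a hypothetical likelihood score
def toy_model(rng):
    k = sum(1 for _ in range(100) if rng.random() < 0.1)
    return k, -abs(k - 10)
```

An observation that is typical of the toy model is retained, while an observation far in the tail is rejected; the same machinery applies unchanged to any model able to simulate its own catalogs.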
In this procedure, the percentiles of model M are estimated by simulations because it is often not possible to derive them analytically. The use of simulations is not mandatory, of course: modelers may provide the required distributions analytically whenever this is possible.
Results
First, I apply the CSEP LBT to the two classes of simulations (ETAS and NP). Figure 1a shows the fraction of rejections F_R (i.e., the proportion of catalogs for which H_0 is rejected) of the N and L tests as a function of time. As shown in Lombardi and Marzocchi (2010a), F_R for the ETAS simulations is well above 5%, which is the threshold justifiable by chance. On the other hand, F_R for the NP simulations is close to or below 5%, suggesting that Hyp_2 is consistent with the NP model.
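The mechanism behind the excess rejections can be reproduced in a few lines of code. Here a negative binomial (a Poisson-gamma mixture) serves as a simple stand-in for the heavy-tailed count distribution of a clustered model; it is not the actual ETAS distribution, but it shares the key property of being overdispersed relative to a Poisson with the same mean. Applying the Poisson N test to such counts rejects far more than 5% of catalogs that the generating model itself produced:

```python
import math
import random

def poisson_cdf(n, lam):
    # P(X <= n) for a Poisson variable with mean lam
    return math.exp(-lam) * sum(lam ** k / math.factorial(k) for k in range(n + 1))

def overdispersed_draw(rng, mean, dispersion):
    # Negative binomial as a Poisson-gamma mixture: same mean, heavier tail
    lam = rng.gammavariate(dispersion, mean / dispersion)
    L, k, p = math.exp(-lam), 0, 1.0   # Knuth Poisson draw with the mixed rate
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def rejection_fraction(draw, mean, n_cat=1000, seed=0):
    # Fraction of synthetic catalogs rejected by a two-sided Poisson N test
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_cat):
        n_obs = draw(rng)
        delta1 = 1.0 - poisson_cdf(n_obs - 1, mean)   # P(X >= n_obs) under H0
        delta2 = poisson_cdf(n_obs, mean)             # P(X <= n_obs) under H0
        rejected += (delta1 < 0.025) or (delta2 < 0.025)
    return rejected / n_cat

# Counts from the overdispersed model, tested against a Poisson null
# with the same mean: the rejection fraction far exceeds the nominal 5%.
f_r = rejection_fraction(lambda rng: overdispersed_draw(rng, 10.0, 2.0), 10.0)
```

The rejections are produced entirely by the mismatch between the null distribution and the true count distribution, not by any defect of the generating model.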
To investigate whether the previous results depend on M_F or on the average seismic rate of the region, I apply the procedure described above to 1,000 new catalogs, reproducing the average seismicity of Japan (which has a seismic rate two orders of magnitude higher than that of Italy). These datasets are simulated using an ad hoc ETAS model of this region. In this experiment, I consider a forecast time span T_i of 3 months, an overall time period of 10 years, and M_F = 4.0. This last value is the threshold magnitude adopted by the Japanese CSEP laboratory for short-term forecasting experiments (Nanjo et al. 2011; Tsuruoka et al. 2012). I find that F_R is equal to 40% to 50% and 60% to 75% for the N and L tests, respectively (see Figure 1b).
I then apply the new testing procedure described previously to the simulated Japanese catalogs. This gives the values of F_R in Figure 1b. The improvement, with respect to the CSEP version of the N and L tests, is clear: F_R is close to or below 0.05 for both tests. To compare the CSEP methodology and the new testing procedure more clearly, Figure 2 shows the PDFs of occurrences and log-likelihoods computed by the CSEP LBT and by the proposed procedure for the first ETAS-simulated Japanese catalog. The observed occurrences (solid black line, Figure 2a,b) are well above or below the confidence bounds (dashed black lines, Figure 2a) of the Poisson PDF (Equation 2) supposed by Hyp_2. This is because the distribution expected by the ETAS model (contour plot, Figure 2b), estimated by the empirical PDF of Ω_{ETAS,i}^S, has a heavy tail, which is clearly not consistent with Hyp_2. Similar results are found for the log-likelihood. The log-likelihoods L(Ω_i|Λ) computed by Equation 4 are well below the values of L(Ω_i^S|Λ) expected by Hyp_2 (contour plot, Figure 2c). However, the log-likelihoods CLF_{ETAS,i} (Equation 8) are fully consistent with the log-likelihoods CLF_{ETAS,i}^S expected by the ETAS model (contour plot, Figure 2d).
Discussion
The rejection of the null hypothesis of a statistical test can occur by chance, because the hypothesis is really false, or because it is probabilistically inadequate (Stark 1997; Luen and Stark 2008). The null hypothesis H_0 of the RELM/CSEP LBT supposes that the X_{(i,j)} are independent (in time and space) Poisson random variables, with means λ_{(i,j)} given by the model. The CSEP protocol interprets the rejection of H_0 as the failure of the model being tested. However, this procedure is misleading because H_0 is not consistent with every model (Lombardi and Marzocchi 2010a).
The above findings may be explained with the help of stochastic point process theory (Daley and Vere-Jones 2003); this is the natural context in which stochastic earthquake models may be discussed. A point process is fully represented by its ‘conditional intensity function’ (CIF) λ(t, x⃗ | ℋ_t), i.e., the probability of observing an event at the instant t ∈ 𝒯 and with additional variables (called marks) x⃗ ∈ 𝒳, given the realization ℋ_t of the process before t (Daley and Vere-Jones 2003). The CIFs of the models described in the previous section are given by Equations 6 and 7; the marks are locations and magnitudes. In the case of an NP process, the CIF is a deterministic function of time and marks, but it is independent of the past history (i.e., λ(t, x⃗ | ℋ_t) = λ(t, x⃗)). Therefore, the numbers of events in non-overlapping subsets of 𝒯 × 𝒳 are independent Poisson random variables (Daley and Vere-Jones 2003), as supposed by the RELM/CSEP LBT. In the most general case, the CIF is also a function of the history ℋ_t, and the variables X_{(i,j)} are not Poisson, unless the history is fully known (Meyer 1971; Papangelou 1972a, 1972b; Daley and Vere-Jones 2003). In a real-time forecast experiment, the history inside the forecast time window T_i is unknown; therefore, for history-dependent models, such as ETAS, Hyp_2 is inadequate.
The hypothesis Hyp_2 has been questioned in several studies (Schorlemmer et al. 2010a; Werner et al. 2010) and, in the specific context of ETAS modeling, by Lombardi and Marzocchi (2010a). Here, I examine the causes and effects of the failure of the LBT. Specifically, I show that the failure of the LBT may be significant even for high values of M_F and that it has heavy consequences for long forecast time windows. This is because the longer the forecast time window T_i, the greater the randomness of the forecasts (due to the effect of the unknown history inside T_i) and the lower the reliability of Hyp_2. This result contradicts the statement that the Poisson distribution is a good approximation of the forecast variability when M_F is large (Werner et al. 2010).
The process of revising the LBT has begun inside the scientific community. Some authors have proposed replacing the Poisson distribution with a negative binomial distribution (Werner et al. 2010) to compute the p values of the tests. However, this solution does not significantly improve the LBT, because the negative binomial distribution (like the Poisson or any other fixed distribution) is not consistent with all models. Inside the CSEP community, some suggest updating the forecasts more regularly, leaving the LBT unchanged (personal communication). I do not think this is the best way to resolve the inefficiencies of the LBT, as these do not derive from the regularity of the forecast calculations.
The procedure described above is a straightforward upgrade of the N and L tests. It accounts for the actual variability of the X_{(i,j)} given by the model being tested. Moreover, it uses the CLF, which is a better tool for checking the agreement between models and data than the discrete log-likelihood (Equation 4) used by the CSEP L test and based on Hyp_2 (Schorlemmer et al. 2007).
This study has focused on short-term forecasts, without analyzing the dependence of the results on the size of the forecast window. From a theoretical point of view, the LBT might also fail for long-term forecasts because of dissimilarities between the sets of probabilities {p_n^{ij}} and {q_n^{ij}} (see Equations 1 and 2), in other words, because of the unsuitability of Hyp_2. This study is not relevant to models that are explicitly supposed to be time-invariant, such as the models tested in the 5-year mainshock RELM experiment (Schorlemmer et al. 2010a; Zechar et al. 2013). However, the failure of the LBT might be significant for medium- to long-term forecast models with strong time-dependent components, especially in testing regions with a high seismic rate. In other words, the present study does not invalidate most of the results of the first RELM/CSEP forecast experiments, which focus on long-term time-invariant models. However, the inclusion of different forecast time spans and time-dependent models in new CSEP experiments requires both an urgent revision of the testing procedure and an effort by modelers to provide the full distributions of the variables being tested.
Conclusions
The main goal of this study was to interpret the failures of the CSEP/RELM LBT and to propose a possible upgrade of the N and L tests. The main findings can be summarized as follows:

1.
All LBTs are based on classical statistical hypothesis testing; therefore, they are intended to reject or not reject a null hypothesis H_0. The null hypothesis of the LBT is that the variables X_{(i,j)} are independent and Poisson-distributed, with the rates given by the forecasts. Therefore, the LBT is inadequate for checking the merits of a forecast model that is inconsistent with Hyp_2.

2.
Specifically, Hyp_2 is not adequate for history-dependent models, such as ETAS, because the unknown history inside the forecast period means that the X_{(i,j)} do not follow a Poisson distribution.

3.
In these cases, the LBT may fail even for large values of M_F, especially for long forecast time windows, for which the effect of the unknown history is greater.

4.
I propose a revised version of the LBT that (1) adopts the CLF and (2) requires the percentiles of the distributions of X_i and CLF_{M,i} given by the model.

5.
The points discussed in this study highlight the need to revise the testing procedure for present and future experiments, which include many time-dependent models. However, they have only a limited effect on the first RELM/CSEP experiments, which mainly focused on long-term time-independent models.
References
Daley DJ, Vere-Jones D: An introduction to the theory of point processes. Springer, New York; 2003.
Jordan TH: Earthquake predictability: brick by brick. Seismol Res Lett 2006, 77(1):3–6. 10.1785/gssrl.77.1.3
Lombardi AM, Marzocchi W: Exploring the performances and usability of the CSEP suite of tests. Bull Seismol Soc Am 2010a, 100: 2293–2300. 10.1785/0120100012
Lombardi AM, Marzocchi W: The ETAS model for daily forecasting of Italian seismicity in the CSEP experiment. Ann Geophys 2010b, 53: 155–164.
Luen B, Stark PB: Testing earthquake predictions. In Probability and Statistics: Essays in Honor of David A. Freedman, IMS Lecture Notes–Monograph Series. Institute of Mathematical Statistics, Beachwood; 2008. pp. 302–315
Meyer P: Démonstration simplifiée d'un théorème de Knight. In Séminaire de Probabilités V, Univ. Strasbourg, Lecture Notes in Math, vol 191; 1971. pp. 191–195
Nanjo KZ, Tsuruoka H, Hirata N, Jordan TH: Overview of the first earthquake forecast testing experiment in Japan. Earth Planets Space 2011, 63(3):159–169. 10.5047/eps.2010.10.003
Ogata Y: Space-time point-process models for earthquake occurrences. Ann Inst Statist Math 1998, 50(2):379–402.
Papangelou F: Summary of some results on point and line processes. In Stochastic Point Processes. Edited by Lewis PAW. Wiley, New York; 1972a. pp. 522–532
Papangelou F: Integrability of expected increments of point processes and a related random change of scale. Trans Amer Math Soc 1972b, 165: 483–506.
Schorlemmer D, Gerstenberger MC: RELM testing center. Seismol Res Lett 2007, 78(1):30–36. 10.1785/gssrl.78.1.30
Schorlemmer D, Gerstenberger MC, Wiemer S, Jackson DD, Rhoades DA: Earthquake likelihood model testing. Seismol Res Lett 2007, 78(1):17–29. 10.1785/gssrl.78.1.17
Schorlemmer D, Zechar JD, Werner MJ, Field EH, Jackson DD, Jordan TH: First results of the Regional Earthquake Likelihood Models experiment. Pure Appl Geophys 2010a, 167: 859–876. 10.1007/s00024-010-0081-5
Schorlemmer D, Christophersen A, Rovida A, Mele F, Stucchi M, Marzocchi W: Setting up an earthquake forecast experiment in Italy. Ann Geophys 2010b, 53: 1–9.
Stark PB: Earthquake prediction: the null hypothesis. Geophys J Int 1997, 131: 495–499. 10.1111/j.1365-246X.1997.tb06593.x
Tsuruoka H, Hirata N, Schorlemmer D, Euchner F, Nanjo KZ, Jordan TH: CSEP Testing Center and the first results of the earthquake forecast testing experiment in Japan. Earth Planets Space 2012, 64(8):661–671. 10.5047/eps.2012.06.007
Werner MJ, Sornette D: Magnitude uncertainties impact seismic rate estimates, forecasts, and predictability experiments. J Geophys Res 2008, 113: B08302. doi:10.1029/2007JB005427
Werner MJ, Zechar JD, Marzocchi W, Wiemer S: Retrospective evaluation of the five-year and ten-year CSEP-Italy earthquake forecasts. Ann Geophys 2010, 53(3):11–30. doi:10.4401/ag-4840
Zechar JD, Jordan TH: Testing alarm-based earthquake predictions. Geophys J Int 2008, 172: 715–724. doi:10.1111/j.1365-246X.2007.03676.x
Zechar JD, Gerstenberger MC, Rhoades DA: Likelihood-based tests for evaluating space-rate-magnitude earthquake forecasts. Bull Seismol Soc Am 2010, 100(3):1184–1195. doi:10.1785/0120090192
Zechar JD, Schorlemmer D, Werner MJ, Gerstenberger MC, Rhoades DA, Jordan TH: Regional Earthquake Likelihood Models I: first-order results. Bull Seismol Soc Am 2013, 103(2A):787–798. doi:10.1785/0120120186
Acknowledgements
The author is grateful to W. Marzocchi (INGV) for stimulating discussions on the topics presented in this paper. The suggestions made by D.D. Jackson (UCLA) and two anonymous referees have significantly improved the quality of the paper. The Italy earthquake data were obtained from the seismic bulletin of the Istituto Nazionale di Geofisica e Vulcanologia (INGV, http://iside.rm.ingv.it). The Japan earthquake data were extracted from the Earthquake Catalog of the Japan Meteorological Agency (JMA, http://www.jma.go.jp/en/quake). Information on CSEP is available at http://www.cseptesting.org.
Additional information
Competing interests
The author declares that she has no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Lombardi, A.M. Some reasoning on the RELM-CSEP likelihood-based tests. Earth Planet Sp 66, 4 (2014). https://doi.org/10.1186/1880-5981-66-4
Keywords
 Statistical tests
 Earthquake forecast
 Point processes