Next-day earthquake forecasts for the Japan region generated by the ETAS model
- Jiancang Zhuang^{1}
https://doi.org/10.5047/eps.2010.12.010
© The Society of Geomagnetism and Earth, Planetary and Space Sciences (SGEPSS); The Seismological Society of Japan; The Volcanological Society of Japan; The Geodetic Society of Japan; The Japanese Society for Planetary Sciences; TERRAPUB. 2011
Received: 30 April 2010
Accepted: 24 December 2010
Published: 4 March 2011
Abstract
This paper presents technical solutions for implementing the space-time epidemic-type aftershock sequence (ETAS) model for short-term (1-day) earthquake forecasts for the all-Japan region in the Collaboratory for the Study of Earthquake Predictability (CSEP) project in Japan. For illustration, a retrospective forecasting experiment is carried out to forecast the seismicity in the Japan region before and after the Tokachi-Oki earthquake (M 8.0) at 19:50:07 (UTC) on 25 September 2003, in the format of contour images. The optimal model parameters used for the forecasts are estimated by fitting the model to the observation records up to the starting time of the forecasting period, and the probabilities of earthquake occurrences are obtained through simulations. To tackle the heavy computation required to fit a complicated point process to a huge dataset, an “off-line optimization” and “online forecasting” scheme is proposed to keep both the estimates of model parameters and the forecasts updated according to the most recent observations. The results show that the forecasts capture the spatial distribution and temporal evolution of the features of future seismicity. These forecasts are tested against a reference Poisson model that is stationary in time but spatially inhomogeneous.
Key words
ETAS model, probability forecast, point process, random simulation, information score
1. Introduction
Statistical models for describing the occurrence process of earthquakes can be used for short-term or real-time earthquake forecasts (Vere-Jones, 1970). The principle of evaluating the probabilities of earthquake occurrence by using point process models, which are formulated with conditional intensity functions, was framed by Vere-Jones (1998). Among the different models for seismicity, the epidemic-type aftershock sequence (ETAS) model, which describes the clustering of foreshocks, mainshocks, and aftershocks, has become a standard model for testing hypotheses and a starting point for short-term earthquake forecasts (see, e.g., Helmstetter and Sornette, 2003; Zhuang et al., 2004, 2008; Hainzl and Ogata, 2005; Lombardi et al., 2010).
In the study reported here, we constructed an “off-line optimization and online forecasting” framework for 1-day earthquake forecasts by using the space-time ETAS model, implemented in both the Collaboratory for the Study of Earthquake Predictability (CSEP) project in Japan, maintained by the Earthquake Research Institute, University of Tokyo (Nanjo et al., 2011, this issue), and the CSEP project of the Southern California Earthquake Center. Similar models and procedures have also been implemented by Helmstetter et al. (2006) and Werner et al. (2011) in their earthquake forecasting experiments. The main difference between our implementation and previously reported ones is the “off-line optimization”, which is explained in Section 3. For illustration, a retrospective forecasting experiment for the Japan region before and after the 2003 Tokachi-Oki earthquake (M_{J} = 8.0) is carried out in Section 4.
2. The Forecasting Model
It is easy to see that such a model is a branching process with immigrants: the background (immigrant) process is a Poisson process; once an event occurs, irrespective of whether it is from the background process or triggered by a previous event, it triggers a non-stationary Poisson process, specified by Eq. (3), as its offspring process. This model is also a kind of self-exciting process (Hawkes, 1971a, b).
The above formulations follow Zhuang et al. (2005) and Ogata and Zhuang (2006), which are improved versions of those in Ogata (1998). Many other forms can be found in studies carried out during the last 20 years (see, e.g., Console et al., 2003; Helmstetter et al., 2003).
3. Model Estimation
To use the model specified by Eq. (2) to forecast seismicity, the following technical problems need to be solved: (1) estimating the background seismicity rate, (2) estimating the model parameters (A, α, c, p, D, q, γ), and (3) forecasting by using the fitted model. The solutions to (1), (2) and (3) are addressed in Subsections 3.1 to 3.3, respectively.
3.1 Estimating time-independent total seismicity and background seismicity
Similar to estimating total seismicity, many approaches have been developed for assessing the background seismicity rate: (1) taking it as proportional to the total seismicity rate of all events, or only of the big events, in the catalog (Musmeci and Vere-Jones, 1986; Console et al., 2003); (2) declustering the catalog and using the total rate in the declustered catalog as the background rate (Ogata, 1998; Helmstetter et al., 2006; Werner et al., 2011); (3) weighting each event by the probability that it is a background event (Zhuang et al., 2002, 2004); and (4) the method introduced by Ogata (2004b), which uses a Bayesian smoothness prior on a tessellation grid to estimate the spatial variation of the background and the model parameters at the same time. In this study, the third method is used because it is relatively simple and gives an unbiased estimate of the intensity function.
To find optimal values of n_{ p } and ε in Eq. (10) and Eq. (11), I apply the above variable kernel functions to estimate the rates of simulated inhomogeneous Poisson point processes. Cross-validation (see, e.g., Picard and Cook, 1984) reveals that the optimal n_{ p } is in the range 3 to 6, and not in the range 20 to 100 as stated in Zhuang et al. (2002). The parameter ε has little influence on the prediction as long as the locations of points are not rounded to a certain precision. This parameter only becomes important when rounding causes some points to overlap at one location. Also, smaller n_{ p } and ε make it easier for Algorithm A to converge.
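The role of n_{ p } and ε can be illustrated with a short Python sketch. Eqs. (10) and (11) are not reproduced in this text, so the sketch assumes the variable-kernel form of Zhuang et al. (2002): the bandwidth of event j is its distance to the n_{ p }-th nearest other event, bounded below by ε, and the rate is a weighted sum of Gaussian kernels divided by the catalog duration. The function names, the `weights` and `duration` arguments, and the use of flat Euclidean distance in degrees are illustrative assumptions, not part of the original implementation.

```python
import math


def variable_bandwidths(points, n_p=5, eps=0.05):
    """Assumed rule: h_j = max(eps, distance from event j to its n_p-th nearest event)."""
    bandwidths = []
    for j, (xj, yj) in enumerate(points):
        dists = sorted(
            math.hypot(xj - xk, yj - yk)
            for k, (xk, yk) in enumerate(points)
            if k != j
        )
        # If fewer than n_p other events exist, fall back to the farthest one.
        kth = dists[min(n_p, len(dists)) - 1] if dists else 0.0
        bandwidths.append(max(eps, kth))
    return bandwidths


def kernel_rate(x, y, points, bandwidths, weights=None, duration=1.0):
    """Weighted Gaussian kernel estimate of the rate at (x, y).

    With weights of 1 this gives a total-seismicity estimate; with
    background probabilities as weights it gives a background estimate.
    """
    if weights is None:
        weights = [1.0] * len(points)
    total = 0.0
    for (xj, yj), h, w in zip(points, bandwidths, weights):
        r2 = (x - xj) ** 2 + (y - yj) ** 2
        total += w * math.exp(-r2 / (2.0 * h * h)) / (2.0 * math.pi * h * h)
    return total / duration


# Two events rounded to the same location: without eps their bandwidth would be 0.
events = [(0.0, 0.0), (0.0, 0.0), (0.3, 0.0), (1.0, 1.0)]
h = variable_bandwidths(events, n_p=1, eps=0.05)
```

In this toy catalog the two coincident events receive the minimum bandwidth ε, which is the situation in which ε matters, as noted above.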
3.2 Iterative algorithm
As pointed out in Section 2, when the background rate is known, the maximum likelihood method can be used to estimate model parameters. However, in most cases, the background rate is unknown. To estimate the model parameters and the background seismicity rate simultaneously, Zhuang et al. (2002) introduced the following iterative algorithm.
- A1.
Given a fixed n_{ p } and ε, say 5 and 0.05° (equivalent to 5.56 km on the Earth's surface, which is close to the location error of earthquakes), calculate the bandwidth h_{ j } for each event (t_{ j }, x_{ j }, m_{ j }), j = 1, 2, · · ·, N.
- A2.
Set l ← 0, and u^{(0)}(x, y) ← 1.
- A3.
Using the maximum likelihood procedure (see, e.g., Ogata, 1998), fit the model with conditional intensity function
λ(t, x, y) = ν u^{(l)}(x, y) + Σ_{i: t_{ i } < t} κ(m_{ i }) g(t - t_{ i }) f(x - x_{ i }, y - y_{ i }, m_{ i })    (12)
to the earthquake data, where κ, g, and f are defined in Eq. (4) to Eq. (6), and ν is the relaxing coefficient, which is introduced to speed up the convergence of this algorithm. The model parameters are (ν, A, α, c, p, D, q, γ).
- A4.
Calculate φ_{ j } for j = 1, 2, · · ·, N by using Eq. (8).
- A5.
Calculate µ(x, y) by using Eq. (11) and record it as u^{(l+1)}(x, y).
- A6.
If max |u^{(l+1)}(x, y) - u^{(l)}(x, y)| > e, where e is a given small positive number, then set l ← l + 1 and go to step A3; otherwise, take νu^{(l+1)}(x, y) as the background rate and also output ρ_{ ij }, ρ_{ i } and φ_{ i }.
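The control flow of steps A2 to A6 can be sketched in Python as follows. This is only a skeleton under stated assumptions: the three callables `fit_model`, `background_probs` and `smooth_background` are hypothetical placeholders for the maximum likelihood fit (A3), the background probabilities of Eq. (8) (A4) and the kernel estimate of Eq. (11) (A5), and the background is represented as a plain list of values at `n_nodes` grid nodes. The `max_iter` guard is an addition for safety and is not part of Algorithm A.

```python
def iterate_background(fit_model, background_probs, smooth_background,
                       n_nodes, tol=1e-3, max_iter=50):
    """Skeleton of the iteration A2-A6 with user-supplied (hypothetical) steps."""
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    u = [1.0] * n_nodes                        # A2: u^(0)(x, y) = 1
    for _ in range(max_iter):
        params = fit_model(u)                  # A3: fit with background nu * u^(l)
        phi = background_probs(params, u)      # A4: background probabilities
        u_new = smooth_background(phi)         # A5: new estimate u^(l+1)
        change = max(abs(a - b) for a, b in zip(u_new, u))
        u = u_new
        if change <= tol:                      # A6: stop when the change is small
            break
    # A6: the background rate is nu times the final u.
    return [params["nu"] * v for v in u], params, phi


# Toy stand-ins that only exercise the loop; they are not seismological models.
toy_fit = lambda u: {"nu": 2.0}
toy_probs = lambda params, u: [v / (v + 1.0) for v in u]
toy_smooth = lambda phi: [1.0 + p for p in phi]
rate, params, phi = iterate_background(toy_fit, toy_probs, toy_smooth,
                                       n_nodes=3, tol=1e-9)
```

Passing the three steps in as functions keeps the expensive likelihood fit separate from the loop, so the same skeleton can be reused whether the fit is run off-line or on a reduced dataset.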
3.3 Simulation method and forecasting procedure
Suppose that observation data are available up to time t, but not including t. The following algorithm (modified from Zhuang et al., 2004) can then be used to simulate the seismicity in the interval [t, t + Δt].
- B1.
Generate the background catalog with the estimated background intensity in Eq. (11). For each event in the background catalog, generate a random variable U_{ i } uniformly distributed in [0, 1], and accept the event if U_{ i } < νφ_{ i } Δt /(t - t_{0}), where ν is as defined in Eq. (12), t_{0} is the starting time of the catalog, and thus t - t_{0} is the length of the period of data fitted to the model. Randomly assign each selected event a new time uniformly distributed in [t, t + Δt], and relocate each selected event by adding a 2D Gaussian random variable whose density is the kernel function Z used in estimating the background seismicity, with the bandwidth d_{ i } corresponding to the selected event. Record these new events as Generation 0, namely G'^{(0)}.
- B2.
Let the initial catalog G^{(0)} be the collection of all the events in G'^{(0)} and all the observed events before t.
- B3.
Set l ← 0.
- B4.
For each event i, namely (t_{ i }, x_{ i }, y_{ i }, m_{ i }), in the catalog G^{(l)}, simulate its N^{(i)} offspring, where N^{(i)} is a Poisson random variable with a mean of κ(m_{ i }), and the occurrence times, locations and magnitudes of the offspring are generated from the probability densities g(· - t_{ i }), f(· - x_{ i }, · - y_{ i }, m_{ i }) and s(·), respectively.
- B5.
- B6.
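The offspring generation in step B4 can be sketched in Python by inverse-transform sampling. Eqs. (4) to (6) are not reproduced in this text, so the sketch assumes the forms commonly used with this parameter set: κ(m) = A exp[α(m - m_c)], an Omori-type density g(t) = ((p - 1)/c)(1 + t/c)^{-p}, an inverse-power spatial density with scale D exp[γ(m - m_c)], and an exponential (Gutenberg-Richter) magnitude density s with rate β. The value of β, the function names, and the loop that repeats B4 until a generation is empty within [t, t + Δt] are assumptions, since steps B5 and B6 are not available here.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's multiplication method; adequate for moderate means (below ~700)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_offspring(parent, par, m_c, rng):
    """Direct offspring of one event (t, x, y, m) under the assumed densities."""
    t_i, x_i, y_i, m_i = parent
    n = poisson(par["A"] * math.exp(par["alpha"] * (m_i - m_c)), rng)
    sigma = par["D"] * math.exp(par["gamma"] * (m_i - m_c))
    children = []
    for _ in range(n):
        # Invert the CDF 1 - (1 + t/c)^(1 - p) of the assumed Omori density.
        dt = par["c"] * ((1.0 - rng.random()) ** (-1.0 / (par["p"] - 1.0)) - 1.0)
        # Invert the radial CDF 1 - (1 + r^2/sigma)^(1 - q); the angle is uniform.
        r = math.sqrt(sigma * ((1.0 - rng.random()) ** (-1.0 / (par["q"] - 1.0)) - 1.0))
        theta = 2.0 * math.pi * rng.random()
        m = m_c + rng.expovariate(par["beta"])
        children.append((t_i + dt, x_i + r * math.cos(theta),
                         y_i + r * math.sin(theta), m))
    return children


def simulate_window(history, background, par, m_c, t, dt, seed=0):
    """Simulate events in [t, t + dt] from past events and new background events."""
    rng = random.Random(seed)
    generation = list(history) + list(background)   # B2: initial catalog G^(0)
    simulated = list(background)
    while generation:                                # assumed stopping rule
        next_generation = []
        for parent in generation:
            for child in simulate_offspring(parent, par, m_c, rng):
                if t <= child[0] <= t + dt:          # keep only the forecast window
                    next_generation.append(child)
        simulated.extend(next_generation)
        generation = next_generation
    return sorted(simulated)


# Illustrative values near the estimates reported in Section 5.1; beta assumes b = 1.
par = {"A": 0.42, "alpha": 1.25, "c": 0.0074, "p": 1.07,
       "D": 0.000133, "q": 1.59, "gamma": 1.33, "beta": math.log(10.0)}
history = [(-0.5, 143.9, 41.8, 8.0)]                 # hypothetical M 8.0 event
events = simulate_window(history, [], par, m_c=4.0, t=0.0, dt=1.0, seed=1)
```

Repeating `simulate_window` with different seeds gives the independent realizations from which occurrence probabilities can be estimated.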
3.4 Off-line optimization and online forecasting
4. A Retrospective Forecasting Experiment on the Seismicity before and after the 2003 Tokachi-Oki Earthquake
4.1 Data
A summary of parameters used for the forecasting experiment.
Earthquake catalog | JMA catalog from 1 January 1965 |
Historical catalog | None |
Polygon region for model fitting | (128.93, 33.54), (130.43, 29.91), (134.75, 33.18), (140.57, 34.36), (143.39, 40.35), (147.43, 43.62), (145.92, 45.00), (140.38, 45.00), (136.44, 43.89), (134.1, 38.17), (131.09, 35.54), (127.52, 33.91) |
Magnitude threshold | 4.0 |
Bandwidths for estimating background rates | n_{ p } = 4, ε = 0.1 |
Bandwidths for smoothing forecast | 0.3 |
Training period | 0–10,500 days from 1 January 1965 |
Model fitting period | 10,500 to 14,712 days from 1 January 1965 |
4.2 Comparison between observations and forecasts
4.3 Evaluation of forecasting performance
There are several ways to evaluate forecast performance: (1) the R-score or Hanssen-Kuiper skill score (see, e.g., Shi et al., 2001; Harte and Vere-Jones, 2005; Console et al., 2010); (2) Molchan’s error diagram (see, e.g., Molchan, 1990); (3) the entropy or information score (see, e.g., Vere-Jones, 1998); and (4) the gambling score (Zhuang, 2010). In this study, we use the entropy score to evaluate the performance of the forecast.
It can be seen from Figs. 4(b) and 6 that, on the occurrence day of the mainshock, the ETAS model has a lower score than the Poisson model. This can be explained as follows: if no foreshock has occurred nearby in the recent past, the ETAS model forecasts essentially with its background rate, which is of course lower than the average rate of the fitted Poisson model, and it therefore gives lower probabilities of earthquake occurrence for the day of the mainshock, which is accompanied by a burst of many aftershocks. In the cases of the Landers earthquake in California and the L'Aquila earthquake in Central Italy, some foreshocks occurred before the mainshocks, and quite high probabilities of earthquakes were forecast by the ETAS model (Helmstetter et al., 2006; Marzocchi and Lombardi, 2009; Werner et al., 2011).
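As a rough illustration of an entropy-type score for probability forecasts, the following Python sketch computes a binary log-likelihood score and the gain over a reference forecast. The exact scoring formula of this study is not reproduced in this text, so the form used here (the sum of log p over cells with events and log(1 - p) over cells without) is an assumption, as are the function names and the clipping constant `eps`, which only guards against probabilities of exactly 0 or 1.

```python
import math


def binary_log_score(probs, outcomes, eps=1e-6):
    """Sum of log p where an event occurred and log(1 - p) where none did."""
    score = 0.0
    for p, occurred in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)      # assumed safeguard, not from the paper
        score += math.log(p) if occurred else math.log(1.0 - p)
    return score


def information_gain(probs, ref_probs, outcomes, eps=1e-6):
    """Score of the forecast minus the score of the reference forecast."""
    return (binary_log_score(probs, outcomes, eps)
            - binary_log_score(ref_probs, outcomes, eps))


# Two hypothetical space-time-magnitude cells; an event occurs only in the first.
gain = information_gain([0.5, 0.1], [0.2, 0.2], [1, 0])
```

A positive gain means the forecast assigned more probability to what happened than the reference did; an event in a cell where the forecast probability is below the reference probability contributes a negative term.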
5. Discussion
5.1 Evolution of model parameters
It is instructive to study the evolution of the parameters over time and with new data. In this article the parameters are estimated just once, and it is therefore questionable how strongly they vary over the forecasting period. Firstly, these parameters do not change very much. To see how the model parameters evolve over time and with new data, I fit the model to the observation data each time before a forecast is made, during the forecast period from 23 September to 22 October 2003. The ranges of the estimates are µ ∈ [0.2691, 0.2743] (events/day/deg^{2}), A ∈ [0.4112, 0.4280] (events), c ∈ [0.00723, 0.00761] (day), α ∈ [1.2349, 1.2665], p ∈ [1.0684, 1.0710], D ∈ [0.000131, 0.000136] (deg^{2}), q ∈ [1.5875, 1.5976], and γ ∈ [1.3076, 1.3656], showing that the variations are quite small. Secondly, in my opinion, the ETAS model is quite a stable model. That is to say, given the observation history, quite reasonable results can be obtained for short-term forecasts by using the ETAS model with some typical parameters, without fitting the model to the past seismicity. The differences between forecasts made using typical parameters and those made using the maximum likelihood estimates can be distinguished only through strict statistical tests, and not visually.
5.2 Stability of simulations
5.3 Other information-based scoring methods
In the evaluation of the forecasting performance, the probabilities that one or more events occur in each forecasting space-time-magnitude window are used as forecasting results. In fact, through simulation, it is not difficult to forecast the full distribution of the numbers of events and use the Poisson information score to evaluate the forecast performance (see, e.g., Vere-Jones, 1998; Werner et al., 2011). In this study, I do not consider this format of forecasts and the Poisson information score because, as shown by Vere-Jones (1998), the binary score is asymptotically equivalent to the Poisson score. The significant superiority of the ETAS model over the Poisson model in the binary information score implies that the ETAS model also performs better than the Poisson model in the Poisson information score.
5.4 Influence of small events
This study does not consider the triggering effect from smaller events that are below the magnitude threshold, 4.0. The reason for this is that there are already more than 19,000 events of M_{J} ≥ 4.0 in the catalog, and if the magnitude threshold is lowered, the number of selected events increases vastly and it takes much more computation time to fit the model to the data, since the computation time is approximately proportional to the square of the number of events. However, as shown by several researchers (see, e.g., Helmstetter and Sornette, 2003; Helmstetter et al., 2003; Werner, 2007; Zhuang et al., 2008), small events are important in triggering large earthquakes. Also, cutting off the triggering effect from events below the magnitude threshold is one of the biggest sources of estimation error (Wang et al., 2010), while including the smaller events in the observation history improves the forecasts (Werner et al., 2011).
6. Concluding Remarks
The space-time ETAS model has been implemented as an “off-line optimization and online forecasting” scheme in the Japan and SCEC CSEP projects. It consists of four components: (1) off-line optimization, which computes optimal model parameters for future use in forecasting; (2) a simulation procedure, which simulates many possible realizations of earthquake occurrence in a future time interval, based on the last updated parameters and the most recent records of seismicity; (3) a smoothing procedure, which smooths the events generated in the simulation step to obtain a stable and smoothed spatiotemporal occurrence rate; and (4) a forecast performance evaluation procedure, which uses the CSEP common evaluation framework.
Using the ETAS model, I have made retrospective experiments on 1-day forecasts of earthquake probabilities in the Japan region before and after the Tokachi-Oki earthquake in September 2003, in the format of contour images. The optimal parameters for the forecasts were obtained by fitting the ETAS model to the previous observations. Once the parameters were obtained, the seismicity for the next forecast interval was simulated many times based on the ETAS model. The probabilities of earthquake occurrences were estimated as the ratio of the number of simulations in which one or more earthquakes occur to the total number of simulations. These forecasts were tested against the reference model, a Poisson process that is stationary in time but spatially inhomogeneous. As expected, the forecasts based on the ETAS model catch the temporal and spatial features of the aftershock sequence, and the ETAS model performs better than the Poisson model.
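The ratio described above can be sketched in Python as follows: each simulated catalog marks the grid cells that contain at least one event, and the probability for a cell is the fraction of simulations that marked it. The regular grid defined by `x0`, `y0`, `cell`, `nx` and `ny`, the event tuple layout (t, x, y, m) and the function name are illustrative assumptions; the smoothing step of the scheme is deliberately left out of this sketch.

```python
import math


def occurrence_probabilities(simulations, x0, y0, cell, nx, ny):
    """Fraction of simulated catalogs with one or more events in each grid cell."""
    counts = [[0] * nx for _ in range(ny)]
    for events in simulations:
        hit = set()                           # cells touched by this simulation
        for _, x, y, _ in events:
            ix = math.floor((x - x0) / cell)
            iy = math.floor((y - y0) / cell)
            if 0 <= ix < nx and 0 <= iy < ny:
                hit.add((iy, ix))
        for iy, ix in hit:                    # count each cell once per simulation
            counts[iy][ix] += 1
    n_sim = len(simulations)
    return [[c / n_sim for c in row] for row in counts]


# Four hypothetical simulated catalogs on a 2 x 1 grid of 1-degree cells.
sims = [
    [(0.1, 0.5, 0.5, 4.2), (0.2, 0.6, 0.4, 4.1)],
    [],
    [(0.3, 1.5, 0.5, 5.0)],
    [(0.4, 0.2, 0.2, 4.0)],
]
probs = occurrence_probabilities(sims, x0=0.0, y0=0.0, cell=1.0, nx=2, ny=1)
```

Counting each cell at most once per simulation is what makes the result a probability of one or more events rather than an expected number of events.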
Declarations
Acknowledgments
This research is supported by Grant-in-Aid Nos. 20240027 for Scientific Research (A) and 20840043 for Young Scientists (Startup), both from the Ministry of Education, Science, Sports and Culture, Japan. The author thanks Denjel Schrommer and Maria Liuks for their assistance in implementing the above procedure in the SCEC CSEP project, and Kazuyoshi Nanjo and Hiroshi Tsuruoka for implementing it in Japan CSEP. Support from Prof. David Jackson from UCLA and Prof. Yosihiko Ogata from ISM is also gratefully acknowledged. The author also thanks Dr. Takashi Iidaka, Dr. Kazuyoshi Nanjo, Dr. Max Werner and Dr. Rodolfo Console for their helpful comments and suggestions.
Authors’ Affiliations
References
- Console, R., M. Murru, and A. M. Lombardi, Refining earthquake clustering models, J. Geophys. Res., 108(B10), 2468, 2003.
- Console, R., M. Murru, F. Catalli, and G. Falcone, Real time forecasts through an earthquake clustering model constrained by the rate-and-state constitutive law: comparison with a purely stochastic ETAS model, Seismol. Res. Lett., 78(1), 49–56, 2007.
- Console, R., M. Murru, and G. Falcone, Probability gains of an epidemic-type aftershock sequence model in retrospective forecasting of M ≥ 5 earthquakes in Italy, J. Seismol., 14(1), 9–26, 2010.
- Daley, D. J. and D. Vere-Jones, An Introduction to the Theory of Point Processes, Volume 1: Elementary Theory and Methods (2nd Edition), Springer, New York, NY, 2003.
- Fletcher, R. and M. J. D. Powell, A rapidly convergent descent method for minimization, Comput. J., 6(2), 163–168, 1963.
- Hainzl, S. and Y. Ogata, Detecting fluid signals in seismicity data through statistical earthquake modeling, J. Geophys. Res., 110, B05S07, 2005.
- Harte, D. and D. Vere-Jones, The entropy score and its uses in earthquake forecasting, Pure Appl. Geophys., 162(6), 1229–1253, 2005.
- Hawkes, A., Spectra of some self-exciting and mutually exciting point processes, Biometrika, 58, 83–90, 1971a.
- Hawkes, A., Point spectra of some mutually exciting point processes, J. R. Statist. Soc.: Ser. B (Statistical Methodology), 33, 438–443, 1971b.
- Helmstetter, A. and D. Sornette, Foreshocks explained by cascades of triggered seismicity, J. Geophys. Res., 108(B10), 2457, 2003.
- Helmstetter, A., G. Ouillon, and D. Sornette, Are aftershocks of large Californian earthquakes diffusing?, J. Geophys. Res., 108(B10), 2483, 2003.
- Helmstetter, A., Y. Y. Kagan, and D. D. Jackson, Comparison of short-term and time-independent earthquake forecast models for southern California, Bull. Seismol. Soc. Am., 96(1), 90–106, 2006.
- Helmstetter, A., Y. Y. Kagan, and D. D. Jackson, High-resolution time-independent forecast for M ≥ 5 earthquakes in California, Seismol. Res. Lett., 78, 59–67, 2007.
- Institute of Seismology and Volcanology, Graduate School of Science, Hokkaido University, Seismic quiescence and activation prior to the 2003 Tokachi-Oki earthquake, in Report of the Coordinating Committee for Earthquake Prediction, edited by Geographical Survey Institute, 72, 118–119, 2004.
- Kagan, Y. and L. Knopoff, Statistical search for non-random features of the seismicity of strong earthquakes, Phys. Earth Planet. Inter., 12, 291–318, 1976.
- Lombardi, A. M., M. Cocco, and W. Marzocchi, On the increase of background seismicity rate during the 1997–1998 Umbria–Marche (central Italy) sequence: Apparent variation or fluid-driven triggering?, Bull. Seismol. Soc. Am., 100(3), 1138, 2010.
- Marzocchi, W. and A. M. Lombardi, Real-time forecasting following a damaging earthquake, Geophys. Res. Lett., 36, L21302, 2009.
- Molchan, G. M., Strategies in strong earthquake prediction, Phys. Earth Planet. Inter., 61(1–2), 84–98, 1990.
- Murase, K., A characteristic in fractal dimension prior to the 2003 Tokachi-Oki earthquake (M_{J} = 8.0), Hokkaido, northern Japan, Earth Planets Space, 56, 401, 2004.
- Murru, M., R. Console, and G. Falcone, Real time earthquake forecasting in Italy, 470, 214–223, 2009.
- Musmeci, F. and D. Vere-Jones, A variable-grid algorithm for smoothing clustered data, Biometrics, 42, 483–494, 1986.
- Nanjo, K. Z., H. Tsuruoka, N. Hirata, and T. H. Jordan, Overview of the first earthquake forecast testing experiment in Japan, Earth Planets Space, 63, this issue, 159–169, doi:10.5047/eps.2010.10.003, 2011.
- Ogata, Y., On Lewis' simulation method for point processes, IEEE Trans. Information Theory, IT-27(1), 23–31, 1981.
- Ogata, Y., Space-time point-process models for earthquake occurrences, Ann. Inst. Statist. Math., 50, 379–402, 1998.
- Ogata, Y., Seismicity quiescence and activation in western Japan associated with the 1944 and 1946 great earthquakes near the Nankai trough, J. Geophys. Res., 109(B4), B04305, 2004a.
- Ogata, Y., Space-time model for regional seismicity and detection of crustal stress changes, J. Geophys. Res., 109(B3), B03308, 2004b.
- Ogata, Y. and J. Zhuang, Space–time ETAS models and an improved extension, Tectonophysics, 413(1–2), 13–23, 2006.
- Picard, R. and D. Cook, Cross-validation of regression models, J. Am. Statist. Assoc., 79, 575, 1984.
- Shi, Y., J. Liu, and G. Zhang, An evaluation of Chinese earthquake prediction, 1990–1998, J. Appl. Prob., 38A, 222–231, 2001.
- Tsukakoshi, Y. and K. Shimazaki, Temporal behavior of the background seismicity rate in central Japan, 1998 to mid-2003, Tectonophysics, 417, 155–168, 2006.
- Vere-Jones, D., Stochastic models for earthquake occurrence, J. R. Statist. Soc.: Ser. B (Statistical Methodology), 32(1), 1–62, 1970.
- Vere-Jones, D., Probability and information gain for earthquake forecasting, Comput. Seismol., 30, 248–263, 1998.
- Wang, Q., D. D. Jackson, and J. Zhuang, Missing links in earthquake clustering models, Geophys. Res. Lett., 37, L21307, 2010.
- Werner, M. J., On the fluctuations of seismicity and uncertainties in earthquake catalogs: Implications and methods for hypothesis testing, Ph.D. thesis, Univ. of Calif., Los Angeles, 2007.
- Werner, M. J., A. Helmstetter, D. D. Jackson, and Y. Y. Kagan, High-resolution Time-independent Grid-based Forecast for M ≥ 5 Earthquakes in California, Bull. Seismol. Soc. Am., 2011 (revising).
- Zhuang, J., Second-order residual analysis of spatiotemporal point processes and applications in model evaluation, J. R. Statist. Soc.: Ser. B (Statistical Methodology), 68(4), 635–653, 2006.
- Zhuang, J., Gambling scores for earthquake predictions and forecasts, Geophys. J. Int., 181, 382–390, 2010.
- Zhuang, J., Y. Ogata, and D. Vere-Jones, Stochastic declustering of space-time earthquake occurrences, J. Am. Statist. Assoc., 97(3), 369–380, 2002.
- Zhuang, J., Y. Ogata, and D. Vere-Jones, Analyzing earthquake clustering features by using stochastic reconstruction, J. Geophys. Res., 109(B5), B05301, 2004.
- Zhuang, J., C.-P. Chang, Y. Ogata, and Y.-I. Chen, A study on the background and clustering seismicity in the Taiwan region by using a point process model, J. Geophys. Res., 110, B05S13, 2005.
- Zhuang, J., A. Christophersen, M. K. Savage, D. Vere-Jones, Y. Ogata, and D. D. Jackson, Differences between spontaneous and triggered earthquakes: their influences on foreshock probabilities, J. Geophys. Res., 113, B11302, 2008.