
Prediction and validation of short-to-long-term earthquake probabilities in inland Japan using the hierarchical space–time ETAS and space–time Poisson process models

Abstract

A hierarchical space–time version of the epidemic-type aftershock sequence (HIST–ETAS) model was constructed for an optimally adapted fit to diverse seismicity features characterized by anisotropic clustering as well as regionally distinct parameters. This manuscript validates this elaborate model for short-term prediction based on several years of recent inland Japan earthquakes as a testing data set, by evaluating the results using a log-likelihood ratio score. To consider intermediate- and long-term performance, several types of space–time Poisson models are compared with the background seismicity rate of the HIST–ETAS model. Results show first that the HIST–ETAS model has the best short-term prediction results for earthquakes in the range of magnitudes from M4.0 to M5.0, although, for the larger earthquakes, sufficient recent earthquake data is lacking to evaluate the performance. Second, for intermediate-term predictions, the optimal spatial nonuniform Poisson intensity model has a better forecast performance than the seismic background intensity of the HIST–ETAS model, while the uniform rate Poisson model throughout all of inland Japan has the worst forecast performance. For earthquakes of M6 or larger, the performance of retrospective long-term forecasts was tested in two ways. First, a retrospective forecasting experiment divided the entire period from 1885 to the present into two parts, with the recent ~ 30 years as the forecast period. Second, the historical damaging earthquake data (599–1884) were spatially validated using century data from 1885 to the present. In both validations, it was determined that the spatial intensity of the inland background seismic activity of the HIST–ETAS model is much better than the best-fit nonuniform Poisson spatial model, leading to the best results. The findings of this study will be critical for regional earthquake hazard planning in Japan and similar locations worldwide.

Introduction

Earthquake occurrence patterns vary greatly from place to place and exhibit a variety of clustering characteristics. To capture this diversity, practical space–time extensions of the epidemic-type aftershock sequence (ETAS) model (Ogata 1985, 1988, 1989; Ogata et al. 1993) have been proposed (Ogata 1998). Such a space–time ETAS model can be sufficiently accurate in the sense that it adapts well in time and space to various local activity patterns and predicts them well. Indeed, in the above space–time ETAS model, the data are automatically modified from the ordinary epicenter positions of the rupture initiation and an isotropic aftershock distribution. For example, the spatial aftershock distributions can have elliptical contours (Utsu 1969) that reflect the length-to-width ratio of the fault and the tilt angle, and centroid main shock epicenters can be used for aftershocks of a unilateral fault rupture. To identify these features after an earthquake of a certain magnitude or higher, all subsequently detected and located earthquakes within a short time span (say, 1 h) are automatically processed using the Akaike Information Criterion (Akaike 1974):

$${\text{AIC}} = ( - 2)\mathop {\max }\limits_{\theta } \left\{ {\log {\text{likelihood}}(\theta )} \right\} + 2\dim \left( \theta \right),$$
(1)

where the maximizing parameter \(\hat{\theta }\) is the maximum likelihood estimate (MLE).
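As a minimal illustration of how Eq. (1) ranks competing fits, the sketch below computes the AIC for two hypothetical candidates. The function name, the log-likelihood values, and the parameter counts are invented for illustration and are not taken from the study.

```python
def aic(max_log_likelihood: float, n_params: int) -> float:
    """Eq. (1): AIC = -2 * (maximized log-likelihood) + 2 * dim(theta)."""
    return -2.0 * max_log_likelihood + 2.0 * n_params


# Hypothetical numbers: an isotropic fit with fewer parameters versus an
# anisotropic fit with more parameters and a higher maximized log-likelihood.
aic_isotropic = aic(max_log_likelihood=-120.0, n_params=5)
aic_anisotropic = aic(max_log_likelihood=-115.0, n_params=8)

# The candidate with the smaller AIC is preferred.
preferred = "anisotropic" if aic_anisotropic < aic_isotropic else "isotropic"
```

The extra parameters are accepted only when the gain in log-likelihood exceeds the penalty of one unit per added parameter.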

In contrast, we know empirically that, as the magnitude threshold decreases and the number of events increases, or as the area becomes wider, the differences in the parameter values of the model between locations become larger. For example, it is clear that the p values of aftershock attenuation vary from place to place, and in particular, the background seismicity rates vary largely from place to place. Therefore, the best-fit case of the candidate space–time ETAS models in Ogata (1998) was extended to a hierarchical version (hierarchical space–time ETAS model, hereafter referred to as the HIST–ETAS model), whose parameters depend on the location of the earthquake (Ogata et al. 2003; Ogata 2004).

The parameter estimation of the HIST–ETAS model by Ogata et al. (2003) and Ogata (2011) as well as the non-homogeneous Poisson model used in this manuscript rely on the Akaike Bayesian Information Criterion (ABIC; Akaike 1980):

$${\text{ABIC}} = ( - 2)\mathop {\max }\limits_{w} \left\{ {\log \int_{\Theta } {{\text{Posterior}}(\theta |w)\,d\theta } } \right\} + 2\dim\left( w \right),$$
(2)

where w is a hyper-parameter vector representing some weights to determine the strengths of the smoothness constraints within the parameter coefficients \(\theta\). The hyper-parameters \(\hat{w}\) minimizing the ABIC will optimally smooth constraints of a large number of parameter coefficients \(\theta\) by maximizing \({\text{Posterior}}(\theta |\hat{w})\); we call this \(\hat{\theta }\), the maximum a posteriori (MAP) estimate. However, it is necessary to demonstrate that the plugged-in forecasting model using the MAP solution has excellent prediction ability in the case of a large number of parameter coefficients. Additional methodological details are provided by Ogata et al. (2003) and Ogata (2004, 2011). Relevant computational FORTRAN codes and practical manuals are also available (Ogata et al. 2021).

In this manuscript, I aim to validate this elaborate Bayesian-type model from the short-, medium-, and long-term forecasting viewpoints using earthquakes that occurred in inland Japan as a testing data set, by evaluating the results using log-likelihood ratio scores. Specifically, I consider shallow earthquakes of M4.0 or greater for the period 1923–2018 in inland Japan (see Fig. 1), selected from the Hypocenter Catalog of the Japan Meteorological Agency (JMA), hereafter referred to as the JMA catalog (JMA, 2021). I further use the Utsu catalog for 1885–1922 (Utsu 1982, 1985), whose magnitude determination method is consistent with that of the JMA catalog, as precursory data to withstand the stationary nature of the ETAS model. Although the Utsu catalog is complete only for earthquakes of M6 or higher, I use these events as the precursory history of the HIST–ETAS model in the target period, since such large earthquakes affect the seismic activity in the target and forecasting periods under study. The time frame concept of short-to-long-term forecasting varies from author to author, but in the context of this study I define short-term forecasting to mean within a few days, medium-term forecasting to mean within a few years, and long-term forecasting to mean longer.

Fig. 1

Epicenter locations within inland Japan (blue boundary curve). a Earthquake locations of the target data from the JMA catalog (1923–2018, M ≥ 4.0) applied to estimate the models, in addition to those of the Utsu catalog (1885–1922, M ≥ 6.0). b Earthquake locations of the target data (2019 to September 2021, M ≥ 4.0) applied for evaluating forecast of models

Probability forecasting and verification methods

For the spatio-temporal element, defined as Δ(t, x, y) = [t, t + dt) × [x, x + dx) × [y, y + dy), in which earthquakes of a certain magnitude threshold or higher may occur, the occurrence probability satisfies the following relationship for calculating the short-term forecast of earthquake occurrence, depending on the history of past occurrences:

$${\text{Probability}}\left\{ {{{\text{an event occurs in}\quad\Delta}}\left( {t,x,y} \right)|H_{t} } \right\} \, = {\lambda}\left( {t, \, x, \, y| \, H_{t} } \right){\text{d}}t{\text{d}}x{\text{d}}y \, + {\text{o}}\left( {{\text{d}}t{\text{d}}x{\text{d}}y} \right),$$
(3)

where \(\lambda (t,x,y|H_{t} )\) is a conditional intensity function, and \(\,H_{t} = \left\{ {(t_{j} ,x_{j}, y_{j} ,M_{j} );\,\,t_{j} < t,\,\,\,M_{j} \ge M_{c} } \right\}\) represents the history of earthquake occurrence times \(t_{j}\) up to time t, together with the corresponding epicenters \((x_{j} ,y_{j} )\) and magnitudes \(M_{j}\).

Here, if a model is independent of history and time t, in such a way that

$${\lambda}\left( {t, \, x, \, y| \, H_{t} } \right) \, = {\lambda}\left( {x, \, y} \right),$$
(4)

this actually characterizes a stationary space–time Poisson process.

As for models for the magnitude sequence, considering that

$${\text{Probability}}\,\left\{ {{\text{earthquake magnitude is in }}\left( {M,M + dM} \right) \, |H_{t} } \right\} \, \approx f\left( {M|H_{t} } \right){\text{d}}M ,$$
(5)

the probability of the occurrence of an earthquake of magnitude M can be provided in principle by the product \(\lambda (t,x,y|H_{t} )\,f(M|H_{t} )\). In fact, the magnitude sequence can mathematically be history-dependent, and such a case has been reported (Ogata et al. 2018). However, only a limited number of models for studying this dependence are available. So far, most previous studies assume that \(f(M|H_{t} )\) is independent of history and follows the Gutenberg–Richter (G–R) law (Gutenberg and Richter, 1944) or its modifications (Utsu, 1999). Therefore, for simplicity, I assume \(f(M|H_{t} ) = f(M) = 10^{{ - \hat{b}\,\,(M - 4.0)}}\) throughout this manuscript.

Suppose that various point-process models \(\lambda_{\theta } (t,x,y|H_{t} )\,\) are obtained from the earthquake occurrence data with magnitudes \(M \ge 4.0\) whose parameters \(\hat{\theta }\) are obtained by the Bayesian maximum a posteriori (MAP) estimate as described in “Estimation of the HIST–ETAS and space–time Poisson process models for predicting seismic activity over a wide area” section. Then, the predictor for earthquakes of \(M_{c}\) and larger is computed by

$$\hat{\lambda }(t,x,y,M_{c} |H_{t} ) = \lambda_{{\hat{\theta }}} (t,x,y|H_{t} )\,10^{{ - \hat{b}\,\,(M_{c} - 4.0)}} ,$$
(6)

with the MLE \(\hat{b}\) of G–R law. Thus, the standard short-, intermediate-, and long-term seismicity forecasts are implemented throughout the inland region of Japan (see Fig. 1) using the specific models introduced in “Estimation of the HIST–ETAS and space–time Poisson process models for predicting seismic activity over a wide area” section.
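The scaling in Eq. (6) can be sketched as a one-line function. The function name is an assumption made here for illustration; the default b value of 0.9 is the uniform value assumed later in the manuscript.

```python
def scale_intensity_to_threshold(rate_m4: float, m_c: float, b_value: float = 0.9) -> float:
    """Eq. (6): convert an occurrence rate of M >= 4.0 events into a rate of
    M >= m_c events by multiplying with the G-R factor 10**(-b * (m_c - 4.0))."""
    return rate_m4 * 10.0 ** (-b_value * (m_c - 4.0))


# Example: a hypothetical rate of 0.02 events/day/deg^2 for M >= 4.0,
# rescaled to the threshold M >= 6.0.
rate_m6 = scale_intensity_to_threshold(0.02, m_c=6.0)
```

With b = 0.9, each unit increase of the threshold reduces the rate by a factor of about 7.9.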

Then, I adopt the space–time log-likelihood score calculated from the occurrence prediction as an evaluation criterion and the result in the forecasting time interval [S, T] is as follows:

$$\log L\left( {\hat{\lambda };\;S,\,T,M_{c} } \right) = \sum\limits_{{\{ i;\,S < t_{i} < T,\,M_{i} \ge M_{c} \} }} {\log \hat{\lambda }\left( {t_{i} ,x_{i} ,y_{i} ,M_{i} |H_{{t_{i} }} } \right) - \int_{{M_{c} }}^{\infty } {\int_{S}^{T} {\iint_{{{\text{Inland}}}} {\hat{\lambda }\left( {t,x,y,M|H_{t} } \right)}\;{\text{d}}x\,{\text{d}}y\,{\text{d}}t} } } {\text{d}}M$$
(7)

Here, it should be noted that, even with truncated or tapered magnitude distributions, there is no mathematical inconsistency when integrating from Mc to infinity.

Alternatively, the spatial log-likelihood score

$$\log L(\hat{\lambda };\;{\text{Inland}},M_{c} ) = \log \prod\limits_{{\{ i;\,\,M_{i} \ge M_{c} \} }}^{{}} {\left\{ {\frac{{\hat{\lambda }(x_{i} ,y_{i} ,M_{i} )}}{{\int_{{M_{c} }}^{\infty } {\iint_{{{\text{Inland}}}} {\hat{\lambda }(x,y,M)}\;{\text{d}}x\,{\text{d}}y{\text{d}}M} \,}}} \right\}} ,$$
(8)

where \(\lambda (x,y,M) = \lambda (x,y) \cdot 10^{{ - b(M - M_{c} )}}\). The likelihood in (8) is actually the likelihood in (7) conditional on the given fixed number of earthquakes that occurred, in the case where the intensity function is history-independent, namely, for Poisson processes. Here, it should be noted that the periods for estimation and prediction should be mutually disjoint for the space–time Poisson processes. For earthquakes of M6.0 or larger, I conducted a retrospective forecasting experiment by dividing the entire period from 1885 to the present into two parts, using the last part of approximately 30 years as the forecast period. Furthermore, as a long-term backcast of large earthquakes, I attempt to cross-validate historical earthquakes against the JMA data to evaluate the performance of the spatial Poisson models.
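A minimal sketch of the spatial score (8) follows, assuming the intensity has been discretized into grid cells with a constant rate per cell. The cell-based discretization, the function name, and all numbers are assumptions for illustration. The magnitude integral of \(10^{-b(M - M_{c})}\) from \(M_{c}\) to infinity equals \(1/(b\ln 10)\), which is used for the normalizer.

```python
import math


def spatial_log_likelihood(cell_rates, cell_areas, event_cells, event_mags,
                           m_c, b_value=0.9):
    """Score (8) for a gridded intensity lambda(x, y) * 10**(-b (M - m_c)).

    cell_rates  : rate density of each cell (must be positive where events fall)
    cell_areas  : area of each cell
    event_cells : index of the cell containing each observed event
    event_mags  : magnitude of each observed event (all >= m_c)
    """
    ln10 = math.log(10.0)
    # Integral over the region and over M >= m_c of the intensity.
    spatial_total = sum(r * a for r, a in zip(cell_rates, cell_areas))
    normalizer = spatial_total / (b_value * ln10)
    score = 0.0
    for cell, mag in zip(event_cells, event_mags):
        log_intensity = math.log(cell_rates[cell]) - b_value * (mag - m_c) * ln10
        score += log_intensity - math.log(normalizer)
    return score


# Two hypothetical models on a two-cell region; both events fall in cell 0.
areas = [1.0, 1.0]
events, mags = [0, 0], [4.0, 4.0]
score_uniform = spatial_log_likelihood([1.0, 1.0], areas, events, mags, m_c=4.0)
score_nonuniform = spatial_log_likelihood([3.0, 1.0], areas, events, mags, m_c=4.0)
score_gain = score_nonuniform - score_uniform
```

Because each term is normalized by the total intensity, the score depends only on the relative spatial pattern, so multiplying all cell rates by a constant leaves it unchanged. When two models share the same b value, the magnitude terms cancel in the score difference.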

Estimation of the HIST–ETAS and space–time Poisson process models for predicting seismic activity over a wide area

HIST–ETAS model

The HIST–ETAS model (Ogata 2015, 2016, 2017a, b, and 2020) is defined by the following equation:

$$\lambda_{\theta } (t,x,y|H_{t} ) = \mu \,(x,y) + \sum\limits_{{\{ i;\,\,t_{i} < t \} }} {\frac{{K(\overline{x}_{i} ,\overline{y}_{i} )}}{{\left( {t - t_{i} + c} \right)^{{p\,\,(\overline{x}_{i} ,\overline{y}_{i} )}} }}\times \left[ {\frac{{(x - \overline{x}_{i} ,y - \overline{y}_{i} )S_{i}^{ - 1} (x - \overline{x}_{i} ,y - \overline{y}_{i} )^{t} }}{{e^{{\alpha (\overline{x}_{i} ,\overline{y}_{i} )\,\,(M_{i} - 4.0 )}} }} + d} \right]^{{ - q(\overline{x}_{i} ,\overline{y}_{i} )}} }$$
(9)

This equation separates the background seismicity rate µ and the superposed space–time clusters. In the clusters, the temporal factor adheres to the Omori–Utsu law characterized by the parameters K, c and p; and the spatial factor assumes the inverse power-law of distance with the parameter q and scaling size is characterized by α, considering the following 2 × 2 covariance matrix depicting the possible anisotropy of spatial clustering distribution:

$$S_{i} = \left( {\begin{array}{*{20}c} {\sigma_{x,\,i}^{2} } & {\rho_{i} \sigma_{x,\,i} \sigma_{y,\,i} } \\ {\rho_{i} \sigma_{x,\,i} \sigma_{y,\,i} } & {\sigma_{y,\,i}^{2} } \\ \end{array} } \right)$$
(10)

Furthermore, sometimes the center of spatial clustering does not coincide with the epicenter coordinates of the triggering earthquake (Additional file 1: Fig. S1) particularly in cases where the earthquake is so large that it is necessary to estimate the centroid coordinates \((\overline{x}_{i} ,\overline{y}_{i} )\). Both \((\overline{x}_{i} ,\overline{y}_{i} )\) and the covariance matrix Si are automatically determined in quasi-real time after a relatively large earthquake of a certain magnitude or higher; for further details, see Section S1 in Additional file 1. Additional file 1: Fig. S1 shows some examples of this type of diverse spatial cluster of earthquakes that occurred within 1 h.
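To make the structure of Eqs. (9) and (10) concrete, the following sketch evaluates the conditional intensity for a short list of past events. It is a simplification under stated assumptions: the parameters μ, K, α, p, and q are held constant instead of location-dependent, and the `Event` class, the function names, and all numerical values are hypothetical. The quadratic form uses the closed-form inverse of the 2 × 2 matrix \(S_{i}\).

```python
import math
from dataclasses import dataclass


@dataclass
class Event:
    t: float              # occurrence time (days)
    x: float              # cluster centroid coordinate x-bar
    y: float              # cluster centroid coordinate y-bar
    mag: float            # magnitude M_i
    sigma_x: float = 1.0  # entries of the matrix S_i in Eq. (10)
    sigma_y: float = 1.0
    rho: float = 0.0


def quadratic_form(dx: float, dy: float, ev: Event) -> float:
    """(dx, dy) S_i^{-1} (dx, dy)^t for the 2 x 2 matrix S_i of Eq. (10)."""
    u, v = dx / ev.sigma_x, dy / ev.sigma_y
    return (u * u - 2.0 * ev.rho * u * v + v * v) / (1.0 - ev.rho ** 2)


def etas_intensity(t, x, y, events, mu, K, c, p, alpha, d, q):
    """Eq. (9) with constant parameters (an illustrative simplification)."""
    rate = mu
    for ev in events:
        if ev.t >= t:  # only past events contribute to the history H_t
            continue
        temporal = K / (t - ev.t + c) ** p  # Omori-Utsu decay
        r2 = quadratic_form(x - ev.x, y - ev.y, ev)
        spatial = (r2 / math.exp(alpha * (ev.mag - 4.0)) + d) ** (-q)
        rate += temporal * spatial
    return rate


# Hypothetical example: one past M4.0 event and one future event (ignored).
history = [Event(t=0.0, x=0.0, y=0.0, mag=4.0), Event(t=5.0, x=0.0, y=0.0, mag=6.0)]
rate_at_centroid = etas_intensity(1.0, 0.0, 0.0, history,
                                  mu=0.1, K=1.0, c=1.0, p=1.0, alpha=1.0, d=1.0, q=1.5)
```

Setting `rho` away from zero or choosing unequal `sigma_x` and `sigma_y` stretches the contours of the spatial factor into ellipses, which is how the anisotropy of a cluster enters the intensity.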

Furthermore, as it is necessary to make accurate predictions that reflect regional characteristics, I made the key parameter functions location-dependent so that the HIST–ETAS model (9) adapts to various local seismicity patterns over a wide region. Therefore, as explained in Section S2 and Fig. S2 in Additional file 1, each of the location-dependent parameters μ(x, y), K(x, y), α(x, y), p(x, y), and q(x, y) is represented by a piecewise linear function on Delaunay triangles. Namely, the value at any location (x, y) is linearly interpolated from the three values (the coefficients) at the locations of the nearest three earthquakes (triangle vertices) on the plane tessellated by the epicenters. When the parameters α, p, and q depend on the location (x, y), as in (9), the model is called the HIST–ETAS–5pa model, and when they are constant, it is called the HIST–ETAS–μK model.
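The linear interpolation within one Delaunay triangle can be sketched with barycentric weights. The function name and the example coordinates are assumptions for illustration, and the sketch does not cover the construction of the tessellation or the search for the triangle that contains a given point.

```python
def interpolate_in_triangle(point, vertices, values):
    """Linearly interpolate vertex values at a point inside a triangle.

    point    : (x, y) location of interest
    vertices : three (x, y) epicenters forming the Delaunay triangle
    values   : parameter coefficients at the three vertices
    """
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = vertices
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights; they sum to one and equal one at their own vertex.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * values[0] + w2 * values[1] + w3 * values[2]


# Hypothetical triangle with coefficients 1, 2, and 3 at its vertices.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
coefficients = [1.0, 2.0, 3.0]
value_at_centroid = interpolate_in_triangle((1.0 / 3.0, 1.0 / 3.0), triangle, coefficients)
```

At a vertex the interpolated value equals the coefficient of that vertex, and on each facet the function is a plane, which is why a flat facet corresponds to the uniformity constraint discussed in the text.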

We are particularly concerned with sensitive spatial estimates of the first two parameter functions of the model. First, the estimated parameter function μ(x, y) of the background activity is useful as the perpetuity probability, as will be discussed for long-term forecasting in “Results and evaluation of short-, intermediate- and long-term predictions in the seismic activity of inland Japan” section. Next, the parameter function K(x, y) represents heterogeneous aftershock productivity in space, which is useful for an accurate short-term prediction, because spatial aftershock intensity could possibly be heterogeneous in and around an asperity zone of fault rupture (Ogata 2004). Fortunately, the coefficients of these two factors are linear with respect to the log-likelihood function (Ogata 1978) such that its maximizing solutions are stably obtained.

Such a large model needs to be estimated with mutually constrained coefficients of parameter functions, which are determined by the ABIC (Akaike 1980). Then, I solve the inverse problem to find the parameter that maximizes the posterior distribution, i.e., the MAP estimate. The coefficients of the parameter functions are simultaneously estimated by maximizing a penalized log-likelihood function described in Additional file 1: Sects. S2 and S3 that determines the optimum trade-off between the goodness of fit to the data and uniformity constraints of the functions (i.e., facets of each piecewise linear function being as flat as possible) as mathematically described in Additional file 1: Section S2. Such an optimum trade-off is objectively attained by minimizing the ABIC in (2) (see “Results and evaluation of short-, intermediate- and long-term predictions in the seismic activity of inland Japan” section), which evaluates the expected predictive error of Bayesian models based on the data used for the estimation (e.g., Ogata 2004).

These parameter coefficients are represented by piecewise linear Delaunay functions, which are estimated by fitting them to the JMA seismic source data for earthquakes of M ≥ 4.0 in the target time interval 1923–2018. To ensure the long-term dependence of the seismic activity model, I further use the Utsu catalog from 1885 to 1922 (Utsu 1982, 1985) as precursory data for the earlier interval to withstand the stationary nature of the ETAS model. The optimal posterior distribution of the coefficients of the piecewise linear Delaunay function is then obtained by minimizing the ABIC, and the inverse problem is solved to obtain the MAP estimate; the conditional intensities with the MAP coefficients are then used for short-term prediction. The prediction programs of the above models, HIST–ETAS–μK and HIST–ETAS–5pa, have already been submitted to the CSEP (Collaboratory for the Study of Earthquake Predictability) Testing Center at the Earthquake Research Institute, University of Tokyo (Tsuruoka et al. 2012), and are undergoing comparative validation in different frameworks together with a number of space–time ETAS derived models as described by Nanjo et al. (2012) and Ogata et al. (2013). In this manuscript, Fig. 2a shows a snapshot of the conditional intensity function in Eq. (3) of the HIST–ETAS–5pa model with the MAP parameters estimated from the target period 1923–2018 and the precursory period 1885–1922, and Fig. 2b shows another snapshot from the forecasting period 2019–2021, obtained from the forecasting model established using the data for 1885–2018. Figure 2a depicts the seismicity of onshore earthquakes about 4 h after the M9 Tohoku–Oki earthquake, and Fig. 2b depicts the predicted seismicity about 3 days after the M6.7 Yamagata–Ken–Oki earthquake.

Fig. 2

Snapshots of the optimal maximum a posteriori (MAP) conditional intensity function in Eq. (3) at the dates and times listed at the top for the HIST–ETAS–5pa model in a learning period 1885–2018 and b forecasting period starting from 2019 to September 2021. Contours in both images are presented with the common equidistant intervals in logarithmic scale. The color bars below the images represent the occurrence rate of a M ≥ 4.0 event per 1.0° × 1.0° cell per day on the ordinary logarithmic scale

Space–time Poisson process models

Similarly, I considered four types of Poisson spatio-temporal models (4) that are stationary in time but nonuniform in space, independent of history. Calculations for these models are implemented for all short-, intermediate-, and long-term forecasts.

First, I consider the inland uniform Poisson process model with the same rate of occurrence only in the inland region and zero rate of occurrence outside the inland region (see Fig. 3a), such that

$$\lambda \left( {x,y} \right) = \begin{cases} \hat{\lambda }_{{{\text{inland}}}} & {\text{if }}\left( {x,y} \right){\text{ is within the inland region}}, \\ 0 & {\text{if }}\left( {x,y} \right){\text{ is outside of the inland region}}, \end{cases}$$
(11)

where the inland region boundary is shown in Fig. 1.

Fig. 3

Poisson space–time models for the long-term seismicity of inland Japan, all of which are shown by the respective MAP estimate that minimizes the ABIC: a inland uniform Poisson process model, b nonuniform spatial Poisson process model, c background μ(x, y) intensities of the HIST–ETAS–μK model, and d HIST–ETAS–5pa model. The colors and contours are in logarithmic scale indicating the expected number of M ≥ 4 earthquakes/deg²/day

Second, I use a spatial nonuniform Poisson process model of

$$\lambda (x,y) = \hat{\lambda }(x,y),$$
(12)

with the optimal MAP estimate of a piecewise linear Delaunay function obtained by the ABIC minimization (see Fig. 3b and Additional file 1: S3b). The third is the background intensity rate of the HIST–ETAS–μK model (see Fig. 3c and Additional file 1: S3c) such that

$$\lambda (x,y) \propto \hat{\mu }_{{{\text{HIST}} - {\text{ETAS}} - \mu K}} (x,y),$$
(13)

where the proportional constant is adjusted using the average number of earthquakes per year in the target estimation period, and the last is

$$\lambda (x,y) \propto \hat{\mu }_{{{\text{HIST}} - {\text{ETAS}} - 5{\text{pa}}}} (x,y),$$
(14)

which is the background intensity rate of the HIST–ETAS–5pa model (see Fig. 3d and Additional file 1: S3d) with constant correction. Both (13) and (14) are spatial nonuniform Poisson process models. The MAP estimates μ(x, y) of the HIST–ETAS model for the background seismic activity are very stable in the sense that they consistently show very similar solutions for data from different selections of the period of interest; see, for example, Additional file 1: Fig. S4.

Results and evaluation of short-, intermediate- and long-term predictions in the seismic activity of inland Japan

The HIST–ETAS models were applied to the target data from the JMA catalog (1923–2018, M ≥ 4.0) together with the large earthquakes of the Utsu catalog (1885–1922, M ≥ 6.0) that occurred in the precursory period, and the optimal Bayesian likelihood was determined by minimizing the ABIC. It was found that the HIST–ETAS–5pa model fitted the wide-area seismic data significantly better than the HIST–ETAS–μK model, with an ABIC difference of 1533.5. However, such a prediction model that inserts the MAP estimate into the posterior distribution model (plug-in model) is not necessarily superior in prediction skill (Akaike 1978).

Therefore, I examined the prediction ability of the proposed models by calculating the space–time log-likelihood score (7) of the prediction results, applying the plug-in predictor (6) with the MAP estimates of the above models to the future data. Here, historical information \(\,H_{t} = \left\{ {(t_{j} ,x_{j} , y_{j} ,M_{j} );\,\,t_{j} < t,\,\,\,M_{j} \ge 4.0} \right\}\) is the information on the occurrence of earthquakes of M4.0 or greater up to time t at the end of 2018. In the case of the Poisson process model, historical information \(H_{t}\) is not necessary. The models can also be compared and evaluated on forecasts for earthquakes of medium and large magnitude (M ≥ 4.0, 4.5, 5.0, 5.5), assuming that the b value of the G–R law is uniformly 0.9 over the entire inland region, i.e., the MLE equals 0.9. In the following, short-term forecasting means within a few days, medium-term forecasting means within a few years, and long-term forecasting means longer than a few years.

Short-term forecasts

I evaluated the short-term forecasting results using the log-likelihood ratio scores relative to the inland uniform Poisson process model (11), which has a location-independent forecast probability. Table 1 shows that the HIST–ETAS–5pa model has the best short-term prediction results for earthquakes up to the M5 class for the last 2 years and 9 months, followed by the HIST–ETAS–μK model; both of the HIST–ETAS models are far superior to the Poisson process models. This suggests that, generally, triggered clusters are forecasted well in the short term by the HIST–ETAS models. However, for earthquakes of M ≥ 5.5 in this table, the inland nonuniform model looks best, though this evaluation is unstable, since it uses only 3 such events. A stable evaluation would require more events of that magnitude and, accordingly, a longer experiment period.

Table 1 Space–time log-likelihood score (7) of short-term forecasts

Intermediate forecasts

Obtaining intermediate-term forecasts that take account of the clustering effect of the HIST–ETAS model may present challenges for further evaluations. This is because I need to simulate future magnitude series. For example, to make intermediate-term forecasts with the HIST–ETAS models, I need to use the G–R law to simulate the magnitudes of such scenario earthquakes many times and observe how the forecasts change. This needs to be repeated a large number of times, say, 10,000 iterations, to show the variation of the spatio-temporally predicted probability density over all possible scenarios.
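The magnitude-simulation step mentioned above can be sketched with inverse-transform sampling from the G–R law. This covers only the magnitude draw, not the full space–time simulation of the HIST–ETAS model; the function name, the seed, and the sample size are assumptions for illustration.

```python
import math
import random


def simulate_gr_magnitudes(n, m_c=4.0, b_value=0.9, seed=0):
    """Draw n magnitudes M >= m_c with P(M >= m) = 10**(-b * (m - m_c)).

    Inverse-transform sampling: if U is uniform on (0, 1], then
    M = m_c - log10(U) / b follows the G-R law above.
    """
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids log10(0).
    return [m_c - math.log10(1.0 - rng.random()) / b_value for _ in range(n)]


magnitudes = simulate_gr_magnitudes(10_000)
# The fraction of simulated events with M >= 5.0 should be close to 10**(-0.9).
fraction_m5 = sum(m >= 5.0 for m in magnitudes) / len(magnitudes)
```

In a scenario simulation, each simulated event would receive a magnitude drawn this way before its triggering contribution is added to the conditional intensity.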

Alternatively, for an intermediate-term forecast for the period of a few years, I apply the space–time Poisson models with the intensity functions (11)–(14), where (13) and (14) are obtained by multiplication of the normalized background intensity of each HIST–ETAS model and the average number of earthquakes of M ≥ 4.0 per day estimated from the target period. All intensities of the competing models are shown in Fig. 3. According to the spatial log-likelihood score (8) in Table 2 for the intermediate-term forecast of 2019 to September 2021, the optimal MAP nonuniform Poisson process model (12) in the range up to M5.5 is superior to the two background intensities of the HIST–ETAS models and much better than the spatially uniform Poisson process model (11) over the entire inland area.

Table 2 Spatial log-likelihood score (8) of intermediate-term forecasts for 2019 to September 2021

Long-term forecasts

Within the last 3 years, the current number of large earthquakes (e.g., M6.0 and above) is insufficient for accurate verification, making evaluation difficult. However, large earthquakes in the long-term seem to have occurred more frequently in highly active background regions of the HIST–ETAS models (Ogata 2008, 2020). Indeed, the background rate models μ(x,y) in (13) and (14) look promising for forecasting events in the 25-year period of 1996–2018 as seen in Fig. 4a, when the optimal MAP estimate is used (Additional file 1: Fig. S4a) and for which I obtained the data from the target interval 1926–1995. Therefore, I fitted the ETAS models to the data in the period 1926–1995 and then forecasted it for the period 1996–2018.

Fig. 4

a Blue circles represent shallow earthquakes of M ≥ 6 that occurred during the period 1996–2019 and the color image with contours represents the μ value distribution of the HIST–ETAS–5pa model for the period 1926–1995. b Blue and red circles show the locations of historical damaging earthquakes before and after 1585, respectively. The horizontal dotted line indicates 38°N. The color image with contours represents the μ value distribution of the HIST–ETAS–5pa model for the period 1926–2018. The color scale for both (a) and (b) is the same as in Fig. 3 and Additional file 1: S4

In fact, Table 3 shows that the optimal nonuniform Poisson process model forecast (12) in the range up to M5.0 again has the best performance, but the background spatial intensity of the HIST–ETAS–5pa model (14) then outperforms the others in M5.5 and larger, better than the nonuniform Poisson model (12). In particular, the inland uniform Poisson model (11) was considerably worse throughout all magnitude ranges. However, for M7.0 and larger earthquakes, the differences are not clear because of the small number of such earthquakes within 30 years.

Table 3 Spatial log-likelihood score (8) of retrospective long-term forecasts for 1996 to September 2021

Alternatively, instead of long-term forecasting in the present case, I try to make “backward predictions” for historical earthquakes (Utsu 1990). In other words, because of the nature of the Poisson process, I can ignore the causality of the time axis. Therefore, I can carry out a cross-validation evaluation using score (8), where \(\hat{\lambda }\) refers to the model derived from the JMA catalog (1923–2018, M ≥ 4.0), and \(\left\{ {(x_{j} , y_{j} ,M_{j} );\,\,M_{j} \ge M_{c} } \right\}\) are the estimated epicenters and magnitudes of damaging historical earthquakes (599–1884) by Utsu (1990), as shown in Fig. 4b. Table 4 presents a comparison of the fit performance. In contrast to the intermediate-term forecast, the background spatial intensity of the HIST–ETAS–5pa model is the best, far better than that of the nonuniform Poisson spatial model. For earthquakes of M ≥ 7.5, the inland uniform model performs best, although the difference in scores is small owing to the small sample size, and the regional dependence of great earthquakes of the M8 class cannot be identified within 10 centuries of historical data. Surprisingly, the performance of the nonuniform Poisson spatial model (b) is remarkably poor compared to the rest of the models. Although the nonuniform Poisson spatial model appears similar to the HIST–ETAS models in Fig. 3 in terms of background seismicity rates, the intensity of the model is more spatially concentrated, i.e., the number of contour lines is higher, compared to the HIST–ETAS models. On the other hand, historical earthquakes in the M7 class are characteristic, considered to be intrinsic to major active faults, and the recurrence interval is very long, such as around 1000 years in Japan (Matsuda 1975); therefore, the spatial density of the seismic activity of the last 100 years, including aftershocks, appears to be strongly biased compared to that of the characteristic earthquakes.

Table 4 Spatial log-likelihood score of long-term “reverse prediction” of historical disaster earthquakes for 599–1884

Records of damaging earthquakes become scarcer further back in history and in some regions (see Additional file 1: Fig. S6). Thus, similar spatial log-likelihood scores are herein calculated by restricting the period of historical data to 1585–1884 and the area to latitudes lower than 38°N to avoid regional bias in the documented earthquakes. Namely, for anciently developed regions around capital cities, such as Kyoto, Nara, and Kamakura, such earthquakes may be better documented before 1585, as seen from the blue circles in Fig. 4b. Furthermore, it is unclear whether earthquakes around Tokyo Bay are shallow, since interplate earthquakes occur there at depths of 30 km or more. Table 5 suggests that the background spatial intensity of the HIST–ETAS–5pa model is best for all ranges of magnitudes, far better than the other Poisson spatial models.

Table 5 Spatial log-likelihood score of long-term “reverse prediction” of historical disaster earthquakes for 1500–1884 that are at a lower latitude than 38°N

Probability forecasting on a cell

The MAP coefficients of the background intensity of the HIST–ETAS–5pa model at the vertices of the Delaunay triangles spanned by M ≥ 4.0 earthquakes are shown in Fig. 5a; this is equivalent to Fig. 3d and Additional file 1: S3d, which show the image interpolated onto a 0.1° × 0.1° grid. Now consider a grid cell (i, j) of a small area Δ² in the inland region, and assume that the integral of the background intensity \(\hat{\mu }(x,y)\) over the cell is approximated as \(\hat{\mu }(i,j) \cdot \Delta^{2}\). Then, according to Eqs. (3)–(6), the long-term probability of a large earthquake of \(M_{c}\) or more for the future predicting period [T, U] is calculated by

$$P\left( {i,j:M \ge M_{c} } \right) \propto \mu \left( {i,j} \right) \cdot \Delta^{2} \left( {U - T} \right) \cdot 10^{{-b(M_{c} - 4.0)}} ,$$
(15)

where the proportionality constant is the ratio of the number of all M ≥ 4.0 earthquakes over the entirety of inland Japan in the training period [S, T] to the expected sum of the background probability \(\hat{\mu }(i,j) \cdot \Delta^{2} (T - S)\) over all inland cells. Setting b = 0.9, \(\Delta = 0.2^{ \circ }\), and \(U - T = 30\) years to forecast M ≥ 6.0 earthquakes on cells of about 20 × 20 km, Fig. 5b shows the approximate long-term probabilities for a 30-year period.
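The arithmetic behind Eq. (15) can be sketched as follows. This is a minimal illustration, not the author's code: the function names, the toy background intensities, and the event count are hypothetical. Because the proportionality constant divides by the sum of the background intensities, only their relative values and the ratio (U − T)/(T − S) matter, provided both periods use the same time unit. The conversion from an expected count to the probability of at least one event uses the standard Poisson relation, which the text does not state explicitly and is included here as an assumption.

```python
import math


def expected_counts_eq15(mu_hat, n_observed, delta_deg, train_period,
                         forecast_period, m_c, b=0.9, m_ref=4.0):
    """Evaluate the right-hand side of Eq. (15) for every cell.

    mu_hat          : nested list of background intensities, one per cell
    n_observed      : number of M >= m_ref inland events in the training period
    delta_deg       : cell side length in degrees
    train_period    : T - S (same time unit as forecast_period)
    forecast_period : U - T
    m_c             : target magnitude threshold
    """
    cell_area = delta_deg ** 2
    # Expected background sum over all cells in the training period.
    expected_background = sum(
        mu * cell_area * train_period for row in mu_hat for mu in row
    )
    constant = n_observed / expected_background
    # Gutenberg-Richter scaling from the reference magnitude to m_c.
    gr_factor = 10.0 ** (-b * (m_c - m_ref))
    return [
        [constant * mu * cell_area * forecast_period * gr_factor for mu in row]
        for row in mu_hat
    ]


def at_least_one_probability(expected_count):
    """Poisson probability of one or more events (ASSUMPTION, see lead-in)."""
    return 1.0 - math.exp(-expected_count)


# Hypothetical toy input: two cells, 400 training events, 96-year training period.
counts = expected_counts_eq15([[1.0, 3.0]], n_observed=400, delta_deg=0.2,
                              train_period=96.0, forecast_period=30.0, m_c=6.0)
for value in counts[0]:
    print(value, at_least_one_probability(value))
```

For small expected counts the two printed numbers in each row nearly coincide, which is why the expected count itself can serve as an approximate probability.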

Fig. 5
a Optimal MAP μ(x, y) estimate of the HIST–ETAS–5pa model on epicenter locations (colored dots) of earthquakes of M ≥ 4.0 in and around Japan for the target period 1923–2018. The color table refers to the linearized frequency, and the scale represents the probability per day and deg². b Probability of an M ≥ 6 shock during the next 30 years in each 0.2° × 0.2° (about 400 km²) cell in inland Japan assuming b = 0.9

Conclusions

This study used HIST–ETAS models and non-homogeneous space–time Poisson models, including derivative models of the HIST–ETAS, for the short-, intermediate-, and long-term probability forecasting of inland earthquakes in Japan. Based on source data from the Utsu and JMA hypocenter catalogs, prospective forecasts of inland Japan earthquakes for the later years and post-dictions of historical damaging earthquakes are presented and evaluated.

The space–time log-likelihood scores were applied to evaluate the results of the short-term prediction for the recent few years, which showed that the location-dependent HIST–ETAS–5pa model provided the best prediction results, followed by the less location-dependent HIST–ETAS–μK model, for earthquakes of sizes ranging from M4.0 to M5.0. Both of the HIST–ETAS models performed far better than the Poisson process models owing to the clustering feature, even for larger earthquakes.

Furthermore, for intermediate- and long-term prediction, the spatial log-likelihood score was adopted. Among the several Poisson process models compared, the optimal nonuniform Poisson process model in the range up to M5.0 was found to be superior to the two background intensities of the HIST–ETAS model, and performed much better than the spatially uniform Poisson process model over the entire inland area, for an intermediate-term forecast of 2019–2021.

For a long-term forecast of large earthquakes in the range up to the M7 class, the training estimation period was reduced to 1885–1995 to evaluate retrospective forecasts for a sufficient number of larger earthquakes in the 26-year prediction period 1996–2021. The spatial log-likelihood score showed that the optimal nonuniform Poisson process model had the best forecast performance in the range up to M5.0, but the background spatial intensity of the HIST–ETAS–5pa model outperformed the others, including the nonuniform Poisson spatial model, in the class of M5.5 and larger.

Finally, for the candidate models estimated from the target data in the period 1923–2018, the spatial log-likelihood scores of historical damaging earthquakes (599–1884) were examined. The results show that the background spatial intensity of the HIST–ETAS–5pa model significantly outperformed the others and was far better than the nonuniform Poisson spatial model. When the historical data are restricted to the period 1585–1884 and to the area south of latitude 38°N, where the historical record is more accurate, the background spatial intensity of the HIST–ETAS–5pa model is the best for all ranges of magnitudes, far better than the other Poisson spatial models. For earthquakes of the M7.5 class or above, the difference in scores was small owing to the small number of historical damaging earthquakes, which represent only a small fraction of the earthquakes that occurred in prehistory, and little significant regional difference could be observed.

The findings of this study can be expected to provide a new approach to forecasting short-, intermediate-, and long-term inland earthquakes with better accuracy and reliability, since the model is based on location-dependent variables. Applications of the proposed HIST–ETAS model will be critical for regional earthquake hazard planning in Japan and similar locations worldwide.

Availability of data and materials

All data sets used in this manuscript have been cited. All computational codes are available as a FORTRAN and R package with a practical manual (Ogata et al. 2021).

Abbreviations

ABIC: Akaike Bayesian Information Criterion

AIC: Akaike Information Criterion

ETAS: Epidemic-type aftershock sequence

G–R: Gutenberg–Richter

HIST–ETAS: Hierarchical space–time ETAS

JMA: Japan Meteorological Agency

M: Magnitude

MAP: Maximum a posteriori

MLE: Maximum likelihood estimate

Acknowledgements

I am grateful to Koichi Katsura for assistance with the implementation of a 3D visualization program. I used the JMA seismic source catalog and the TSEIS seismic activity visualization system (Tsuruoka 1998) for this analysis. Suggestions and queries from the anonymous reviewers were useful for clarifying the manuscript.

Funding

This study was supported by the MEXT Project for Seismology toward Research Innovation with Data of Earthquake (STAR-E) Grant Number JPJ010217.

Author information

Contributions

All aspects of this research were carried out by YO. The author read and approved the final manuscript.

Corresponding author

Correspondence to Yosihiko Ogata.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The author declares that he has no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ogata, Y. Prediction and validation of short-to-long-term earthquake probabilities in inland Japan using the hierarchical space–time ETAS and space–time Poisson process models. Earth Planets Space 74, 110 (2022). https://doi.org/10.1186/s40623-022-01669-4

  • DOI: https://doi.org/10.1186/s40623-022-01669-4

Keywords

  • Bayesian method
  • Cross-validation
  • HIST–ETAS
  • Location-dependent parameters
  • Maximum a posteriori (MAP)
  • Space–time log-likelihood score
  • Spatial log-likelihood score
  • Plug-in predictor