
Volume 63 Supplement 3

Earthquake Forecast Testing Experiment in Japan (II)

Performance of a seismicity model for earthquakes in Japan (M ≥ 5.0) based on P-wave velocity anomalies


Abstract

We consider P-wave velocity perturbations from a standard layered model for Japan as a predictive parameter that may be useful for assessing regional seismogenesis. To assess the performance of a seismicity model with predictive parameters, we used the Kullback-Leibler statistic in terms of information gain per event (IGpe), which measures the distance between two distributions of parameters: the background distribution (parameters over the entire space domain) and the conditional distribution (parameters at earthquake epicenters). We selected 198 epicenters of earthquakes with magnitudes ≥5.0 that occurred between 1961 and 2008 to estimate the conditional distribution. For the background distribution, more than 3,000 points were selected on a 0.1° × 0.1° grid. P-wave perturbations were considered at four depths (10, 15, 20, and 25 km) at each point for both distributions. We compared the two distributions at each depth but found no significant difference in the average value of the perturbations. As these distributions are well approximated by normal distributions, IGpe can be estimated directly from the means and standard deviations of both distributions at each depth. We obtained an IGpe of ≤0.03 using a single parameter. However, when multiple parameters with correlations were considered, an IGpe of 0.3 was estimated, which means that the average probability gain across the 198 earthquakes is 1.35 relative to a Poisson process model.

1. Introduction

Seismicity models provide some of the most useful products in earthquake prediction research. The incorporation of various predictive parameters could result in better-performing models. Utsu (1977, 1982) and many others (Rhoades and Evison, 1979; Aki, 1981; Hamada, 1983; Grandori et al., 1988) have formulated expressions for earthquake probabilities based on precursory anomalies from a variety of measurements. Imoto (2006, 2007) proposed a method to build models based on multiple predictive parameters in which independence among parameters is not necessarily assumed, as it had been in previous studies. His result implies that, under certain conditions, mutual correlations among predictive parameters could produce a better performance than would be expected if the parameters were independent.

With the development of dense seismic networks and computational power, seismic wave velocity structures have been modeled at higher resolutions than was previously possible. Seismogenesis must be closely related to the physical properties (e.g., pressure, temperature, and geological properties) of focal areas. Of these, the P-wave velocity is more generally and systematically sampled than any other parameter. Many issues are involved in the relationship between seismic wave velocity perturbations and seismicity, and some of these have been discussed in only limited terms (e.g., AL-Shukri and Mitchell, 1988; Michael, 1988; Kaufmann and Long, 1996; Hauksson and Haase, 1997).

In the 1970s, a large body of literature was published on changes in seismic velocity before earthquakes. A change in P-wave velocity was interpreted using the dilatancy theory (hypothesis) that rocks underwent dilatation in the last stage before failure (Nur, 1972; Scholz et al., 1973). To date, systematic and homogeneous measurements detecting such variations have not been obtained. Consequently, in the study reported here, variations in seismic velocity over time are not discussed.

After the Hi-net seismic network was established in Japan (Obara et al., 2005), Matsubara et al. (2008) revealed fine structures of P- and S-wave velocities in Japan. Some of their remarkable findings are as follows. The high-velocity Pacific plate and Philippine Sea plate are clearly imaged to a depth of 150 km beneath northeastern and southwestern Japan, respectively. High-Vp/Vs zones (Vp/Vs being the ratio of P-wave velocity to S-wave velocity) are widely distributed beneath the volcanic front, where seismic swarm activities, including moderately sized earthquakes, have often been observed. Non-volcanic tremors occur in the high-Vp/Vs zone at depths of 30–40 km beneath southwestern Japan, where the oceanic crust of the Philippine Sea plate encounters the wedge mantle of the Eurasian plate.

Matsubara and Obara (2008) reported characteristic features of the perturbations in zones beneath active faults, and this information may be incorporated into the construction of a seismicity model that performs better than any previous model. However, before building such a model, it is necessary to appropriately evaluate the information of each contributing parameter.

In the study reported here, we evaluate model performance in terms of information gain per event (IGpe; Daley and Vere-Jones, 2003; Imoto, 2004) in order to incorporate information on the P-wave velocity structure into current seismicity models. The results provide a good example of a case in which correlations among predictive parameters yield greater predictive power than would be obtained without them.

2. Method

The seismic hazard function is expressed as the expectation of the number of earthquakes in a space-time volume dx above some threshold magnitude (Daley and Vere-Jones, 2003). We consider both the unconditional and conditional probabilities of observing a potentially predictive parameter value of θ, which are represented by g(θ) (background density) and f(θ) (conditional density) and which are empirically determined with random samples of cells in the whole study volume and samples conditioned on occurrences of earthquakes in some cells. The hazard function at a space-time point (x), conditioned on a value of θ(x), is given by

$$h(x \mid \theta) = \frac{m_0}{V_0}\,\frac{f(\theta)}{g(\theta)} \tag{1}$$

where m0 is the number of earthquakes above the threshold, and V0 is the space-time volume being studied (see Appendix).

Taking the Poisson model as the baseline, the IGpe (Daley and Vere-Jones, 2003; Imoto, 2004, 2007) for a large number of earthquakes is given by

$$\mathrm{IGpe} = \int_{R} f(\theta)\,\ln\frac{f(\theta)}{g(\theta)}\,d\theta \tag{2}$$

where the integral is taken over the whole domain R on which θ is defined. The above equation represents the fact that IGpe is equivalent to the Kullback-Leibler quantity of information expressing the distance between two probability distributions. Assuming that f(θ) and g(θ) are multivariate normal distributions, Imoto (2007) derived an analytical equation to estimate the IGpe value.

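For the sake of convenience, the main results of the previous study (Imoto, 2007) are summarized below. For a single parameter $\theta_1$, the IGpe can be represented as

$$\mathrm{IGpe}(\theta_1) = \frac{1}{2}\left(-\ln\sigma_1^2 + \sigma_1^2 + \mu_1^2 - 1\right) \tag{3}$$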

where $\mu_1$ is the mean and $\sigma_1^2$ is the variance of f(θ1), and those of g(θ1) are scaled to be 0 (mean) and 1 (variance).
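As a minimal illustration of the single-parameter case, the Kullback-Leibler distance of a normal conditional density (mean μ1, variance σ1²) from a standard normal background density can be evaluated directly. The sketch below assumes natural-log units; the function name `igpe_single` and the numerical inputs are hypothetical and are not taken from Table 2.

```python
import math


def igpe_single(mu, var):
    """Kullback-Leibler distance of N(mu, var) from N(0, 1), natural-log units."""
    if var <= 0:
        raise ValueError("variance must be positive")
    return 0.5 * (-math.log(var) + var + mu * mu - 1.0)


# Hypothetical inputs for illustration only.
print(igpe_single(0.0, 1.0))  # conditional density identical to the background
print(igpe_single(0.2, 0.9))  # small shift in mean and variance
```

A conditional density that coincides with the background gives zero gain, and small departures in mean or variance give correspondingly small values.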

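Next, we consider n variables $\theta_1, \theta_2, \ldots, \theta_n$ with joint density distributions $f(\theta_1, \theta_2, \ldots, \theta_n)$ and $g(\theta_1, \theta_2, \ldots, \theta_n)$; the marginal distributions of $\theta_i$ are denoted $f_i(\theta_i)$ for the conditional distribution and $g_i(\theta_i)$ for the background distribution. If the variables $\theta_1, \theta_2, \ldots, \theta_n$ are mutually independent in both distributions and are normally distributed with mean $\mu_i$ and variance $\sigma_i^2$ for the conditional distribution and with mean 0 and variance 1 for the background distribution, the IGpe can be represented as follows.

$$\mathrm{IGpe} = \frac{1}{2}\left(-\sum_{i=1}^{n}\ln\sigma_i^2 + \sum_{i=1}^{n}\sigma_i^2 + \sum_{i=1}^{n}\mu_i^2 - n\right) \tag{4}$$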

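We assume here that the correlation among the n variables $\theta_1, \theta_2, \ldots, \theta_n$ only occurs in the conditional density distribution $f(\theta_1, \theta_2, \ldots, \theta_n)$:

$$f(\boldsymbol{\theta}) = \frac{1}{(2\pi)^{n/2}\,|C|^{1/2}}\exp\left[-\frac{1}{2}(\boldsymbol{\theta}-\boldsymbol{\mu})^{\mathsf{T}} C^{-1} (\boldsymbol{\theta}-\boldsymbol{\mu})\right] \tag{5}$$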

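where the superscript −1 refers to the inverse of a matrix, and the covariance matrix C can be expressed as

$$C = \begin{pmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1n}\sigma_1\sigma_n \\ \rho_{12}\sigma_1\sigma_2 & \sigma_2^2 & \cdots & \rho_{2n}\sigma_2\sigma_n \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1n}\sigma_1\sigma_n & \rho_{2n}\sigma_2\sigma_n & \cdots & \sigma_n^2 \end{pmatrix} \tag{6}$$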

where $\rho_{ij}$ is the correlation coefficient between $\theta_i$ and $\theta_j$.

By introducing an appropriate transformation of the (θ) coordinate system with an orthogonal matrix, the covariance matrix can be expressed as a diagonal matrix. At the same time, the vector μ is transformed into μ′ with the same orthogonal matrix. Referring to the previous case, the IGpe is represented by

$$\mathrm{IGpe} = \frac{1}{2}\left(-\sum_{i=1}^{n}\ln\sigma_i'^{\,2} + \mathrm{trace}(C) + \sum_{i=1}^{n}\mu_i'^{\,2} - n\right) \tag{7}$$

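where trace denotes the sum of the diagonal elements, which is invariant under a unitary transformation, and $\sigma_i'^{\,2}$ represent the eigenvalues of C. Comparing Eqs. (7) and (4), the first term on the right side of Eq. (7) exceeds that of Eq. (4) unless every $\rho_{ij}$ is zero, because the product of the eigenvalues equals the determinant of C, which is the product of the variances $\sigma_i^2$ multiplied by the determinant of the correlation matrix, and the latter is less than 1 whenever any correlation is non-zero. The other three terms have the same values in both equations. Therefore, the IGpe for a conditional distribution of correlated variables always exceeds that with no correlation.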
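The comparison between Eqs. (7) and (4) can be checked numerically. The sketch below evaluates the Kullback-Leibler distance of a multivariate normal conditional density from a standard normal background, once with correlation and once without. It is an illustration under stated assumptions: the function name `igpe_multivariate` and all numerical values are hypothetical and are not the values of Tables 2 and 3.

```python
import numpy as np


def igpe_multivariate(mu, cov):
    """Kullback-Leibler distance of N(mu, cov) from the standard normal N(0, I)."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        # A valid covariance matrix always has a positive determinant.
        raise ValueError("cov is not a valid covariance matrix")
    # -ln|C| equals minus the sum of the log eigenvalues of C.
    return 0.5 * (-logdet + np.trace(cov) + mu @ mu - mu.size)


# Hypothetical two-parameter example.
mu = np.array([0.2, -0.2])
sigma = np.array([0.9, 0.9])
rho = 0.5
corr = np.array([[1.0, rho], [rho, 1.0]])

cov_correlated = corr * np.outer(sigma, sigma)  # C_ij = rho_ij * sigma_i * sigma_j
cov_independent = np.diag(sigma ** 2)           # same variances, every rho_ij = 0

ig_correlated = igpe_multivariate(mu, cov_correlated)
ig_independent = igpe_multivariate(mu, cov_independent)

# The two values differ by -0.5 * ln(det(corr)), which is never negative.
print(ig_independent, ig_correlated, ig_correlated - ig_independent)
```

Only the log-determinant term changes between the two cases, since the trace and the squared norm of the mean vector are unaffected by the correlation.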

In general, some correlations among parameters may be observed in both distributions. The procedure from Eq. (5) to Eq. (7) could be applied after the covariance matrix for the background distribution is changed into the identity matrix by transformations of the coordinate system with an orthogonal matrix and a diagonal matrix.
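The reduction described above, in which the background covariance is first brought to the identity, can be sketched as follows. The orthogonal matrix is taken from an eigendecomposition of the background covariance and the diagonal matrix from the inverse square roots of its eigenvalues. Function names and test values are hypothetical, and the background mean is subtracted explicitly so that the transformed background has zero mean.

```python
import numpy as np


def whiten_against_background(mu_f, cov_f, mu_g, cov_g):
    """Transform conditional moments so that the background becomes N(0, I)."""
    mu_f, mu_g = np.asarray(mu_f, dtype=float), np.asarray(mu_g, dtype=float)
    cov_f, cov_g = np.asarray(cov_f, dtype=float), np.asarray(cov_g, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(cov_g)  # cov_g = U diag(eigvals) U^T
    if np.any(eigvals <= 0):
        raise ValueError("background covariance must be positive definite")
    w = (eigvecs / np.sqrt(eigvals)).T        # diag(eigvals)^(-1/2) U^T
    return w @ (mu_f - mu_g), w @ cov_f @ w.T


def igpe_general(mu_f, cov_f, mu_g, cov_g):
    """IGpe when both distributions are correlated multivariate normals."""
    mu_w, cov_w = whiten_against_background(mu_f, cov_f, mu_g, cov_g)
    _, logdet = np.linalg.slogdet(cov_w)
    return 0.5 * (-logdet + np.trace(cov_w) + mu_w @ mu_w - mu_w.size)


# Hypothetical moments for illustration only.
mu_f = np.array([0.3, -0.1])
cov_f = np.array([[0.8, 0.3], [0.3, 0.7]])
mu_g = np.array([0.1, 0.0])
cov_g = np.array([[1.2, 0.1], [0.1, 0.9]])

print(igpe_general(mu_f, cov_f, mu_g, cov_g))
```

After the transformation, the remaining steps are the same as in the case where only the conditional distribution is correlated.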

Once we have estimated the means and variances of the parameters together with the correlation matrices for both the conditional and the background distributions, we can represent them by f (θ) and g(θ) and thus calculate the hazard function of Eq. (1). This function estimates the hazard rate at any point of interest conditioned on the parameter values observed at that point.
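A minimal sketch of this last step is given below, assuming that both densities are multivariate normal and that the average (Poisson) rate m0/V0 is supplied by the caller. The function names, argument names, and numerical values are hypothetical.

```python
import numpy as np


def log_mvn_pdf(theta, mu, cov):
    """Log-density of a multivariate normal N(mu, cov) evaluated at theta."""
    theta = np.asarray(theta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    diff = theta - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)  # (theta - mu)^T C^{-1} (theta - mu)
    return -0.5 * (mu.size * np.log(2.0 * np.pi) + logdet + quad)


def hazard_rate(theta, poisson_rate, mu_f, cov_f, mu_g, cov_g):
    """Hazard rate: Poisson rate multiplied by the density ratio f(theta)/g(theta)."""
    log_gain = log_mvn_pdf(theta, mu_f, cov_f) - log_mvn_pdf(theta, mu_g, cov_g)
    return poisson_rate * np.exp(log_gain)


# Hypothetical one-parameter check: a narrower conditional density centred on
# the background mean doubles the rate at theta = 0.
print(hazard_rate([0.0], 1.0, [0.0], [[0.25]], [0.0], [[1.0]]))
```

Working with log-densities and forming the ratio last avoids underflow when a point lies far in the tails of both distributions.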

3. Data

We consider a seismicity model for earthquakes M ≥ 5.0 in Japan based on P-wave velocity perturbation data. We use the hypocenter parameters for 1961–2008 determined by the Japan Meteorological Agency (JMA). This period is selected to balance both the number of earthquakes and the accuracy of estimated locations. In view of the complex tectonic setting in and around Japan, we restricted ourselves to earthquakes shallower than 30 km.

Matsubara et al. (2008) constructed three-dimensional P- and S-wave velocity models beneath all of Japan at depths of 0–40 km, with a 0.2° grid spacing in the horizontal direction and a 5- to 10-km spacing in the vertical direction. They also constructed a velocity model down to the depth of 400 km with less densely spaced grids. In general, velocity variations from a standard velocity model are estimated since P- and S-wave velocities strongly depend on depth. Therefore, comparing variations at the same depth may be useful for obtaining characteristic seismogenic features of a focal area.

Accordingly, we consider a two-dimensional seismicity model in which hazard rates are defined at the points of a horizontal grid. P-wave velocity differences at four depths (10, 15, 20, and 25 km) at each point are used. We consider that this set of four parameters plays an important role as predictive parameters. For the background distributions, more than 3,000 points with reliable velocity anomalies are selected on a 0.1° × 0.1° grid; these points mostly cover inland parts of Japan, with the exception of Hokkaido Island. To estimate the conditional distributions, we select 198 epicenters of earthquakes (Table 1) with magnitudes ≥5.0 that occurred between 1961 and 2008.

Table 1. List of target earthquakes used for the conditional distribution.

4. Information Gain per Event

Figure 1 illustrates the empirical background distributions (dark solid line) for the four parameters and the normal functions fitted to them (light dashed line). Each background distribution is generally well approximated by a normal function. In the same way, Fig. 2 illustrates the empirical conditional distributions (dark solid line) and the normal functions fitted to them (light dashed line). For the conditional distributions at 20 and 25 km, the normal approximation is not a close fit.

Fig. 1. Cumulative background distributions for the four parameters. Empirical background distributions (dark line) and normal distributions (light dashed line) fitted to the background distributions.

Fig. 2. Cumulative conditional distributions for the four parameters. Empirical conditional distributions (dark line) and normal distributions (light dashed line) fitted to the conditional distributions.

The chi-square test for goodness of fit was performed under the null hypothesis that the P-wave velocity differences at each depth are normally distributed. The hypothesis for samples at either 10 or 15 km is accepted at the 10% level of significance. The hypothesis for samples at either 20 or 25 km is accepted at the 1% level of significance, which may appear higher than usual but is assessed to be adequate for fitting with a function of two parameters.
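One way such a test could be set up is sketched below. The binning is an assumption, since the text does not state how the samples were grouped: here the fitted normal distribution is divided into equiprobable bins, and two degrees of freedom are subtracted for the fitted mean and standard deviation. The function name and the synthetic sample are hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev


def chi_square_normal_fit(samples, n_bins=10):
    """Chi-square statistic and degrees of freedom for a fitted normal law."""
    n = len(samples)
    fitted = NormalDist(fmean(samples), stdev(samples))
    # Interior edges of n_bins bins that are equiprobable under the fit.
    edges = [fitted.inv_cdf(k / n_bins) for k in range(1, n_bins)]
    counts = [0] * n_bins
    for x in samples:
        counts[sum(x > e for e in edges)] += 1  # number of edges below x
    expected = n / n_bins
    stat = sum((c - expected) ** 2 / expected for c in counts)
    dof = n_bins - 1 - 2  # two parameters (mean, sd) estimated from the data
    return stat, dof


# Synthetic sample of 198 values, for illustration only.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(198)]
stat, dof = chi_square_normal_fit(sample)
print(stat, dof)
```

With 7 degrees of freedom, the tabulated upper critical values are about 12.02 at the 10% level and 18.48 at the 1% level; normality is not rejected at a given level when the statistic falls below the corresponding value.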

The parameters of these normal distributions are summarized in Table 2. The last column of the table indicates the IGpe for each predictive parameter, calculated using Eq. (3), where both distributions are assumed to be normally distributed. No large differences exist between the conditional and background distributions. If we use each predictive parameter separately, an IGpe of 0.03 at most is expected, for the parameter measured at a depth of 25 km.

Table 2. Terms of normal distributions for each parameter and its IGpe value.

However, correlations among the four parameters are observed for both distributions. Table 3 summarizes the correlation matrices in the background (lower left) and conditional (upper right) distributions. The coefficient in the conditional distribution always exceeds the corresponding one in the background distribution. Specifically, for the correlation between parameters at depths of 10 and 25 km, the coefficient of 0.498 in the conditional distribution is larger than that of 0.103 in the background distribution. These features of the correlation matrices suggest a better predictive power with correlated parameters than that expected from a single parameter, as indicated by Eq. (7). Given the values in Tables 2 and 3, we can construct the background and conditional densities. The hazard function is obtained by multiplying f (θ)/g(θ) by an average rate (Poisson rate). Using the formula developed by Imoto (2007), we can estimate an IGpe of 0.30 for the seismicity model of this hazard function. This value is equivalent to a probability gain of 1.35 across all target earthquakes.
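The correspondence between the two figures quoted above can be checked with a short calculation. It assumes that IGpe is expressed in natural-log units, so that the probability gain is its exponential; this is an assumption, although one that is consistent with the quoted values.

```latex
\[
  \text{probability gain} = \exp(\mathrm{IGpe}) = e^{0.30} \approx 1.35,
  \qquad
  e^{0.03} \approx 1.03 .
\]
```

By the same conversion, the largest single-parameter IGpe of 0.03 corresponds to a gain of only about 1.03.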

Table 3. Correlation matrices. Lower left: Observed in background distribution. Upper right: Observed in conditional distribution.

Figure 3 plots joint distributions between the parameters at 10 and 25 km for both cases. The plot of the conditional distribution (dark circles) is more concentrated than that of the background distribution (light circles). The plot of the conditional distribution is more or less located in a lower right part of the background distribution, which corresponds to the evidence showing that velocity becomes higher beneath an epicenter than the average velocity at a depth of 10 km but becomes lower than the average velocity at a depth of 25 km (Table 2). A moderate correlation between two parameters is observed in the plot of the conditional distribution, whereas no clear correlation is observed in that of the background distribution (Table 3). In the present case, the contributions from every depth sum up to a negligible IGpe value of 0.06, but the correlations among parameters could lead to a useful IGpe value of 0.3.

Fig. 3. Joint distributions of the predictive parameters at depths of 10 and 25 km. The ordinate represents P-wave velocity at a depth of 25 km, and the abscissa represents P-wave velocity at a depth of 10 km. Dark symbols indicate plots of the conditional distributions, and light symbols indicate those of background distributions.

5. Discussion and Conclusions

To confirm the above IGpe estimates, we performed a simulation study to obtain the distribution of IGpe values by a bootstrap method. In generating a set of samples, we consider only the variations of parameter values in the conditional distributions (Table 2, right group, and Table 3, upper right). Two different data sources, the conditional distribution and the background distribution, are adopted. In both simulations, we randomly select 198 samples from the source distribution and treat them as a simulated conditional distribution. After calculating the means, variances, and correlation coefficients among the parameters, we estimate the IGpe. Iterating over 10,000 sets yields the distribution of the IGpe value, from which we obtain its average and standard deviation. When we select samples from the conditional distribution, we obtain an average of 0.36 and a standard deviation of 0.06 (right line in Fig. 4). The observed IGpe thus falls within one standard deviation of the mean. In contrast, when we use the background distribution, we obtain an average of 0.04 and a standard deviation of 0.02 (left line in Fig. 4). These results suggest that the observed IGpe is not obtained by chance from the background distribution of the parameters.
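The resampling procedure can be sketched as follows. This is an illustration on synthetic data, not a reproduction of the study: it assumes that the parameters have already been scaled so that the background is standard normal, it resamples with replacement, and it uses 1,000 iterations by default rather than 10,000 to keep the run short. All names and values are hypothetical.

```python
import numpy as np


def igpe_from_samples(samples):
    """IGpe of a normal fit to samples (rows) against a standard normal background."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)  # requires at least two parameters
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (-logdet + np.trace(cov) + mu @ mu - samples.shape[1])


def bootstrap_igpe(source, n_events=198, n_iter=1000, seed=0):
    """Mean and standard deviation of IGpe over resampled sets of n_events rows."""
    rng = np.random.default_rng(seed)
    values = np.empty(n_iter)
    for k in range(n_iter):
        rows = rng.integers(0, len(source), size=n_events)  # with replacement
        values[k] = igpe_from_samples(source[rows])
    return values.mean(), values.std(ddof=1)


# Synthetic stand-ins for the two data sources (four parameters each).
rng = np.random.default_rng(1)
background = rng.standard_normal((3000, 4))
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)
conditional = rng.multivariate_normal(np.zeros(4), corr, size=198)

mean_bg, sd_bg = bootstrap_igpe(background)
mean_cond, sd_cond = bootstrap_igpe(conditional)
print(mean_bg, sd_bg, mean_cond, sd_cond)
```

Even when samples are drawn from the background itself, the estimated IGpe is slightly positive on average because the moments are estimated from a finite sample, which is why the comparison between the two simulated distributions is informative.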

Fig. 4. Cumulative distributions of IGpe values simulated by a bootstrap method. The left curve with mean 0.04 and standard deviation 0.02 denotes IGpe values obtained based on background distributions, and the right curve with mean 0.36 and standard deviation 0.06 denotes IGpe values based on conditional distributions. The vertical line in the right curve indicates the IGpe estimated in the actual case.

Matsubara and Obara (2008) studied the relationship between the seismic velocity structure and the active tectonic faults in the Japan Islands. They first estimated velocity variations at depths of 5, 10, 15, and 20 km and then compared the values beneath the fault zones with the nationwide averages. They found that velocity becomes higher than the average velocity in the shallow part beneath the fault zones but becomes lower than the average velocity in the deeper part. Based on this finding, they suggested that seismic velocity anomalies could contribute to the detection of blind active faults. Although their finding has not been examined quantitatively, it implies that a P-wave velocity model could contribute to the assessment of the seismogenesis of shallow earthquakes of moderate and large magnitude.

Taking into account the close relationship between large earthquakes and active faults, we focus on epicenters of earthquakes with a magnitude ≥5.0 at a shallow depth as the conditional group. It may be possible to adopt fault zones as a conditional group, but epicenters of earthquakes are more exactly defined and more easily selected than fault zones. Even with these simple selections, we are able to construct a seismicity model that could possibly assess the seismogenesis of shallow earthquakes. It may be possible to propose more effective models after various predictive parameters have been examined. However, how such models would perform remains to be seen.

Figure 5 shows the probability gains at every point of a 0.1 × 0.1° grid. In general, a probability gain is defined by a ratio of the hazard function (Eq. (1)) to the Poisson rate (m0/V0), which becomes equal to the f (θ)/g(θ) value in the present case. Although the map indicates some parts of high probability gain up to 2.0, an average over the values at 198 epicenters (Table 1) becomes 1.35. Imoto and Rhoades (2010) combined two models, namely, the Every Earthquake a Precursor According to Scale model (EEPAS, Rhoades and Evison, 2006) and a three-parameter model (Imoto, 2008), into a better performance model in which the hazard rate of the EEPAS model is treated as a surrogate precursor. In a similar way, we can combine the present parameters and an appropriate seismicity model into a better performance model. A study focusing on this point will be conducted in the future.

Fig. 5. A map of probability gains (f(θ)/g(θ) values) at every point of a 0.1° × 0.1° grid.

In summary, we have attempted to assess the performance of a seismicity model for shallow earthquakes in Japan based on a P-wave velocity model. Applying the formula derived by Imoto (2007) to the P-wave velocity data, we estimated the IGpe of the model to be 0.3 after incorporating the correlations among the parameters. The bootstrap method suggests that this IGpe value could not be obtained by chance from the background distribution.


References

  • Aki, K., A probabilistic synthesis of precursory phenomena, in Earthquake Prediction, edited by D. W. Simpson and P. G. Richards, 566–574, AGU, 1981.

  • AL-Shukri, H. J. and B. J. Mitchell, Reduced seismic velocities in the source zone of New Madrid earthquakes, Bull. Seismol. Soc. Am., 78, 1491–1509, 1988.

  • Daley, D. J. and D. Vere-Jones, An Introduction to the Theory of Point Processes, vol. 1, Elementary theory and methods, Second edition, 469 pp, Springer, New York, 2003.

  • Grandori, G., E. Guagenti, and F. Perotti, Alarm systems based on a pair of short-term earthquake precursors, Bull. Seismol. Soc. Am., 78, 1538–1549, 1988.

  • Hamada, K., A probability model for earthquake prediction, Earthq. Predict. Res., 2, 227–234, 1983.

  • Hauksson, E. and J. S. Haase, Three-dimensional Vp and Vp/Vs velocity models of the Los Angeles basin and central Transverse Ranges, California, J. Geophys. Res., 102, 5423–5453, 1997.

  • Imoto, M., Probability gains expected for renewal process models, Earth Planets Space, 56, 563–571, 2004.

  • Imoto, M., Earthquake probability based on multidisciplinary observations with correlations, Earth Planets Space, 57, 1447–1454, 2006.

  • Imoto, M., Information gain of a model based on multidisciplinary observations with correlations, J. Geophys. Res., 112, B05306, doi:10.1029/2006JB004662, 2007.

  • Imoto, M., Performance of a seismicity model based on three parameters for earthquakes (M≥5.0) in Kanto, central Japan, Ann. Geophys., 51(4), 727–736, 2008.

  • Imoto, M. and D. Rhoades, Seismicity models of moderate earthquakes in Kanto, Japan utilizing multiple predictive parameters, Pure Appl. Geophys., 167, 831–843, doi:10.1007/s00024-010-0066-4, 2010.

  • Kaufmann, R. D. and L. T. Long, Velocity structure and seismicity of southeastern Tennessee, J. Geophys. Res., 101, 8531–8542, 1996.

  • Matsubara, M. and K. Obara, Relationship between the seismic velocity structure and the active tectonic faults in the crust beneath Japan islands, Eos Trans. AGU, 89(53), Fall Meet. Suppl., Abstract T53C-1954, 2008.

  • Matsubara, M., K. Obara, and K. Kasahara, Three-dimensional P- and S-wave velocity structures beneath the Japan Islands obtained by high-density seismic stations by seismic tomography, Tectonophysics, 454, 86–103, 2008.

  • Michael, A. J., Effects of three-dimensional velocity structure on the seismicity of the 1984 Morgan Hill, California, aftershock sequence, Bull. Seismol. Soc. Am., 78, 1199–1221, 1988.

  • Nur, A., Dilatancy, pore fluids and premonitory variations of ts/tp travel times, Bull. Seismol. Soc. Am., 62, 1217–1222, 1972.

  • Obara, K., K. Kasahara, S. Hori, and Y. Okada, A densely distributed high-sensitivity seismograph network in Japan: Hi-net by National Research Institute for Earth Science and Disaster Prevention, Rev. Sci. Instrum., 76, 021301, 2005.

  • Rhoades, D. and F. Evison, Long-range earthquake forecasting based on a single predictor, Geophys. J. R. Astron. Soc., 59, 43–56, 1979.

  • Rhoades, D. and F. Evison, The EEPAS forecasting model and the probability of moderate-to-large earthquakes in central Japan, Tectonophysics, 417, 119–130, 2006.

  • Scholz, C., L. R. Sykes, and Y. P. Aggarwal, Earthquake prediction: a physical basis, Science, 181, 803–810, 1973.

  • Utsu, T., Probabilities in earthquake prediction, Zisin II, 30, 179–185, 1977 (in Japanese).

  • Utsu, T., Probabilities in earthquake prediction (the second paper), Bull. Earthq. Res. Inst., 57, 499–524, 1982 (in Japanese).


Acknowledgments

The author would like to express his thanks to M. Matsubara for providing a convenient tool to use their P-wave velocity data. He also thanks Euan Smith and Masashi Kawamura for their detailed reviews of this article and their valuable comments.

Author information

Corresponding author

Correspondence to Masajiro Imoto.

Appendix A.

The hazard function is expressed as the expectation of the number of earthquakes in a space-time volume dx. For small dx, the expectation becomes equal to the probability of an earthquake occurring in the volume, denoted as P(E|θ). By Bayes' theorem,

$$P(E \mid \theta) = \frac{P(\theta \mid E)\,P(E)}{P(\theta)}$$

where P(θ) and P(θ|E) are the unconditional and conditional probabilities of observing θ. We consider that P(θ) and P(θ|E) are represented by g(θ)dθ and f(θ)dθ, respectively. The probability of an earthquake, P(E), is given by (m0/V0)dx. Thus

$$P(E \mid \theta) = \frac{m_0}{V_0}\,\frac{f(\theta)}{g(\theta)}\,dx$$

is obtained.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


About this article

Cite this article

Imoto, M. Performance of a seismicity model for earthquakes in Japan (M ≥ 5.0) based on P-wave velocity anomalies. Earth Planet Sp 63, 11 (2011).

Key words

  • Seismicity model
  • information gain
  • Kullback-Leibler quantity
  • P-wave velocity structure
  • predictive parameters
  • correlation
  • Japan