### Deviation of the FMD from the GR law

The PDDs of \(\eta\) obtained in this study show that the FMD is generally convex upward and that the GR law is not always strictly valid. This result has important implications, both for the interpretation of our results and for seismology in general. However, any discussion of this point depends on whether \({M}_{\mathrm{c}}\) was estimated accurately. After a large earthquake, \({M}_{\mathrm{c}}\) increases because detectability decreases, and it then decreases as the aftershocks decay. If the change of \({M}_{\mathrm{c}}\) for the initial aftershocks is not accurately estimated, \(\eta\) may be underestimated. Here, we used a method for estimating \({M}_{\mathrm{c}}\) that relies as little as possible on the assumption of the GR law, and we confirmed the validity of the estimation results (Fig. 4). Nevertheless, considering the importance of the results, we used another method to confirm this tendency.

Here, to check for deviations from the GR law, we estimate \(b\)-positive (\({b}^{+}\)), a recently proposed estimator that is insensitive to transient changes in \({M}_{\mathrm{c}}\) (van der Elst 2021). \({b}^{+}\) is estimated from the magnitude difference \(m\) between each earthquake and the one that preceded it, using only differences that satisfy \(m\ge {m}_{\mathrm{th}}>0\), where \({m}_{\mathrm{th}}\) is an arbitrary constant threshold, as follows:

$$\begin{array}{c}{b}^{+}=\frac{{N}_{\mathrm{th}}\log_{10}e}{\sum _{i=1}^{N_{\,\mathrm{th}}}\,\left({m}_{i}-{m}_{\mathrm{th}}+\delta \right)} ,\end{array}$$

(6)

where \({N}_{\mathrm{th}}\) is the number of earthquakes satisfying \(m\ge {m}_{\mathrm{th}}\) and \(\delta\) is the width of the magnitude discretization (here, \(\delta =0.05\)). If the GR law holds, \({b}^{+}\) is equivalent to *b* (van der Elst 2021). The advantage of \({b}^{+}\) is that it can be estimated almost independently of the change in \({M}_{\mathrm{c}}\) because it is sufficient to detect earthquakes of at least a certain magnitude larger than that observed immediately before, even if detectability is reduced after a large earthquake.
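The estimator in Eq. (6) can be illustrated with a minimal Python sketch. The function name `b_positive` and the small floating-point tolerance are assumptions introduced here for illustration and are not part of the original analysis; the default values follow the text (\({m}_{\mathrm{th}}=0.2\), \(\delta =0.05\)).

```python
import math

def b_positive(mags, m_th=0.2, delta=0.05):
    """Estimate b+ (Eq. 6) from a time-ordered list of magnitudes."""
    # Magnitude difference between each event and the one preceding it.
    diffs = [later - earlier for earlier, later in zip(mags, mags[1:])]
    # Keep only differences at or above the threshold m_th.
    # The tolerance (an assumption) guards against round-off such as
    # 2.3 - 2.1 evaluating to slightly less than 0.2.
    kept = [m for m in diffs if m >= m_th - 1e-9]
    if not kept:
        raise ValueError("no magnitude differences at or above m_th")
    n_th = len(kept)
    return n_th * math.log10(math.e) / sum(m - m_th + delta for m in kept)
```

For example, the sequence `[2.0, 2.3, 2.1, 2.6, 2.5, 2.9]` has three differences of at least 0.2 (0.3, 0.5, and 0.4), so only those three enter the sum.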

Assuming that no earthquake larger than the preceding one was missed and that the GR law holds, the \({b}^{+}\) values estimated for different lower limits of earthquake magnitude, \({M}_{\mathrm{min}}\), should be approximately the same on average, even though \({N}_{\mathrm{th}}\) decreases and thus the variance of \({b}^{+}\) increases as \({M}_{\mathrm{min}}\) increases. To test this with the data used in our analysis, we also included the earthquakes in the catalog with magnitudes smaller than \({M}_{\mathrm{th}}\) in all the spatiotemporal regions analyzed for the four sets of \(l\) and \(N\). Using \({M}_{\mathrm{ref}}={M}_{\mathrm{th}}-1.0\) (0.95 for area A and 2.45 for area B) as the reference, we estimated the differences between the mean values of \({b}^{+}\) estimated for different \({M}_{\mathrm{min}}\) with \({m}_{\mathrm{th}}=0.2\). As shown in Fig. 8, in the actual analysis, \({b}^{+}\) tends to become larger as \({M}_{\mathrm{min}}\) increases, and this tendency is stronger in area B. This result is consistent with the PDDs of the estimated \(\eta\)-values and strongly suggests that the FMD is convex upward.

Figure 8 also shows, as a reference, the expected results calculated numerically under the assumption of the GR law. In these calculations, we generated \(M\) above \({M}_{\mathrm{ref}}\) until the number of earthquakes with \(M\) ≥ \({M}_{\mathrm{th}}\) reached \(N\) and then estimated \({b}^{+}\) for each \({M}_{\mathrm{min}}\). Because only earthquakes larger than the preceding one are used to estimate \({b}^{+}\), the number of available data points becomes smaller as \({M}_{\mathrm{min}}\) becomes larger. When \({M}_{\mathrm{min}}={M}_{\mathrm{th}}\), the expected number of events used to estimate \({b}^{+}\) is less than half of \(N\). This reduction in the number of events used to estimate \({b}^{+}\) is reflected in the downturn of the numerically calculated \({b}^{+}\) values on the right side of Fig. 8. Note that the bias of the estimated \({b}^{+}\) for small event numbers differs from that of the estimated \(b\), which tends to be larger than the true value when the number of events is small (e.g., Ogata and Yamashina 1986), because \({b}^{+}\) is estimated using only earthquakes that are larger in magnitude than the preceding earthquake. The actual analysis results (colored points in Fig. 8) seem to be only slightly influenced by this reduction in the number of events used to estimate \({b}^{+}\).
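The numerical reference described above can be sketched as follows. This is only an illustration under stated assumptions: the true \(b\)-value of 1.0, the rounding of magnitudes to 0.1 units, and all function names (`simulate_gr_catalog`, `b_positive_vs_m_min`) are hypothetical choices made here, and the number of trials and other settings of the original calculation are not reproduced.

```python
import math
import random

def b_positive(mags, m_th=0.2, delta=0.05):
    """b+ of Eq. (6); returns NaN when no difference reaches m_th."""
    diffs = [later - earlier for earlier, later in zip(mags, mags[1:])]
    kept = [m for m in diffs if m >= m_th - 1e-9]  # tolerance for round-off
    if not kept:
        return float("nan")
    return len(kept) * math.log10(math.e) / sum(m - m_th + delta for m in kept)

def simulate_gr_catalog(n_target, m_ref, m_cat_th, b_true, rng):
    """Draw GR magnitudes above m_ref until n_target events reach m_cat_th."""
    beta = b_true * math.log(10.0)  # GR law as an exponential distribution
    mags, n_above = [], 0
    while n_above < n_target:
        # Assumption: magnitudes are reported in 0.1 units.
        m = round(m_ref + rng.expovariate(beta), 1)
        mags.append(m)
        if m >= m_cat_th:
            n_above += 1
    return mags

def b_positive_vs_m_min(mags, m_mins):
    """Estimate b+ after discarding events below each lower limit M_min."""
    return {m_min: b_positive([m for m in mags if m >= m_min])
            for m_min in m_mins}

rng = random.Random(0)
catalog = simulate_gr_catalog(2000, 0.95, 1.95, 1.0, rng)
estimates = b_positive_vs_m_min(catalog, [0.95, 1.45, 1.95])
```

Under the GR law, the entries of `estimates` should scatter around the assumed true value for every \({M}_{\mathrm{min}}\), with larger scatter at larger \({M}_{\mathrm{min}}\) because fewer magnitude differences remain.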

Both the result of the \(\eta\)-value analysis and the above result that \({b}^{+}\) indicates an upward convex FMD, unlike that expected from the GR law, raise an important issue. In the context of comparing pre- and post-earthquake activities and estimating stress-field changes through such activities, \(b\) is often estimated immediately after a large earthquake. If, for this purpose, a large value of \({M}_{\mathrm{th}}\) is chosen to avoid the effect of reduced detectability, the estimated \(b\)-value may be too high because of the upward convex shape of the FMD, even if the FMD has not truly changed. In such a case, instead of simply assuming the ideal GR law and comparing the estimation results obtained with different \({M}_{\mathrm{th}}\), the shape of the FMD needs to be fully considered. The present results suggest that in many cases it is preferable to analyze the data with a constant \({M}_{\mathrm{th}}\).

### Comparison between anomalous seismicity and other physical phenomena

In this study, we used statistical analyses to identify spatial regions where the PDDs of index values for seismicity are significantly different from those in other regions. In other words, the seismicity in these regions may be considered to include unusual or "anomalous" seismicity, whereas regions where the PDDs are indistinguishable from those in other regions may be considered to experience "typical" seismicity. Hereafter, we use "anomalous" and "typical" in this sense. Specifically, when the \(p\)-value obtained by the analysis is below the threshold value of 0.05, activity is considered anomalous. The regions where such anomalous seismicity was mainly observed in the \(b\)-values (i.e., regions where the low \(p\)-values are spatially clustered) are indicated in Fig. 7a, b by arrows and letters (lower-case letters correspond to high \({f}_{\mathrm{lp}}\) and upper-case letters to low \({f}_{\mathrm{lp}}\)). Here, we first discuss the temporal variation of these activities, highlighting the case of \(l=0.4^\circ\) and \(N=50\) (Fig. 9). We then discuss anomalous index values shown in Fig. 7a–d by comparing them with different observation results and evaluating the universality of the typical seismicity obtained here.

#### Anomalous \({\varvec{b}}\)-values of shallow inland earthquakes (area A)

Figure 7a shows that many of the anomalous \(b\)-values are located near the epicenters of large earthquakes in area A (*M*6.1 or greater; C–H and a–d in Fig. 7a). Figure 9a, b shows the change of \(b\)-values with elapsed time since the occurrence of large earthquakes that were accompanied by the most aftershocks in these regions. The figures also show the PDDs of the anomalous \(b\)-values compared to those of typical \(b\)-values. Figure 9a, b separately shows data for regions of anomalously low and high \(b\)-values, respectively (hereafter referred to as the low and high \(b\)-value regions).

In Fig. 9a, although the change in \(b\)-value varies from event to event, the median and the 10th and 90th percentile values (in black), which should reflect the change in the PDD of \(b\), collectively suggest that many low \(b\)-values are estimated for active aftershocks immediately after the mainshock. After that, there is a weak tendency for \(b\)-values to gradually increase, and at \(10^{7}\)–\(10^{8}\) s (months to years) after the mainshock, the distribution becomes equivalent to that of typical \(b\)-values. The decrease in \(b\) after \(10^{8}\) s corresponds to the occurrence of other large earthquakes and their aftershocks, again consistent with the above result that low \(b\)-values correspond to active aftershocks. Note that the very low \(b\)-values obtained when the analysis period (shown by the horizontal bar) includes an elapsed time of 0 or slightly precedes an event (spanning the left and middle boxes in Fig. 9a, b), especially when the right end of the horizontal bar is at less than ~\(10^{3}\) s, are cases where the estimation of \({M}_{\mathrm{c}}\) is difficult, as shown in Additional file 1: Fig. S5. In the present analysis, such cases cannot be completely excluded, and there are cases where \({M}_{\mathrm{c}}\) and thus the \(b\)-value are underestimated. However, the number of such cases is small and has little effect on the overall statistical analysis.

The activity in the high \(b\)-value regions associated with large earthquakes has a slightly higher overall distribution (Fig. 9b). Although this is not clear because of the small amount of data, the slightly higher average \(b\)-value may correspond to the generally longer interval between earthquakes, as seen from the fact that there are far fewer estimated \(b\)-values in Fig. 9b (476) than in Fig. 9a (3243). In this case, too, the \(b\)-value tends to increase with elapsed time. In the “Common temporal variation of typical b-values” section, we show that this trend does not change even after minimizing the effect of the detectability reduction that results from the short time intervals of active aftershocks.

Some of the anomalous \(b\)-value regions in area A do not seem to be the result of large earthquakes in their vicinity. In these regions, the anomalous \(b\)-values have been observed stably for a long time. Figure 9c, d shows the temporal variation of \(b\)-values and PDDs for these stably low and high \(b\)-value regions, respectively. Among the examples of low \(b\)-values, seismicity in regions I, J, and K increased after the \(M\) 9.0 Tohoku-oki earthquake in March 2011, as can be seen from the many points after the earthquake, and the \(b\)-values remain low in all analysis periods. The stably low \(b\)-values in region L suggest the influence of slow slip because much of the analyzed data in region L consists of hypocenters of the seismicity that became active during the Boso slow slip events (e.g., Ozawa et al. 2019) in this region (the blue vertical arrows in Fig. 9c). The large proportion of such activated seismicity is seen from the fact that much of the analysis period in region L includes these slow slip events. The stably high \(b\)-value regions (Fig. 9d) include the Izu Islands (region g), where many earthquakes associated with volcanic activity, such as the eruption of Miyakejima in 2000, were observed during the analysis period (e.g., Toda et al. 2002); the Wakayama swarm area (h), where high \(b\)-values associated with high-temperature fluids have been reported (Yoshida et al. 2011); and the vicinity of Sakurajima (j), where volcanic activity is intense. In addition, the Yamagata-Fukushima border swarm (e), which became active after the Tohoku-oki earthquake and has been linked to hydrothermal fluids (e.g., Yoshida et al. 2019), was also extracted as a high \(b\)-value area, suggesting that seismicity in many of the high \(b\)-value regions with different FMD characteristics is related to the influence of fluids. In Fig. 9d, the PDD excludes the activity in region g because it is significantly more active than the other regions, but the distribution still has a peak at a much higher value than the PDD of the typical \(b\)-values.

#### Anomalous \({\varvec{b}}\)-values of all earthquakes in and around Japan (area B)

The temporal variation of the anomalous \(b\)-values in area B shows that seismicity in many of the low \(b\)-value regions increased after the Tohoku-oki earthquake (Fig. 9e). In regions where \(b\)-values were estimated before the Tohoku-oki earthquake, we can see that the \(b\)-values remained low after the Tohoku-oki earthquake. Among these cases, region R in area B overlaps with region g in area A. However, region g shows a high \(b\)-value (with \({M}_{\mathrm{th}}=1.95\)), whereas region R shows a low \(b\)-value (with \({M}_{\mathrm{th}}=3.45\)). These results are consistent with the high \(\eta\)-values in region g (Fig. 7c), suggesting that there is some characteristic scale corresponding to a magnitude range larger than 1.95 that deviates from the GR law. In Fig. 9e, the PDDs also exclude seismic activity in region R. In contrast, the high \(b\)-value regions mostly include the area where activity increased after the Tohoku-oki earthquake (Fig. 9f). Again, the very low \(b\)-values obtained when the analysis period extends backward from March 2011 are probably due to underestimated \(b\)-values in cases where \({M}_{\mathrm{c}}\) is difficult to estimate, as shown in Additional file 1: Fig. S5. In the analysis period before the Tohoku-oki earthquake, the \(b\)-values in region m (near land) were similar to or slightly lower than the typical distribution, whereas those in region p (far offshore) were higher.

Compared with the area of large coseismic slip during the Tohoku-oki earthquake (Suito et al. 2012), shown in Fig. 7b with a green solid line, the high \(b\)-value regions (red in Fig. 7b) are distributed along the landward and seaward extension of the large-slip area, and the low \(b\)-value regions (blue in Fig. 7b) are distributed within or along strike from the extension of the large-slip area, suggesting a correspondence. Previous studies have pointed out a relationship between \(b\)-values and differential stresses, both from laboratory experiments and wide-area observations (Scholz 1968, 2015), and this relationship has been reported to manifest in the relationship between \(b\)-values and the slip deficit rate on a plate interface (Nanjo and Yoshida 2018) or in the relationship between \(b\)-values and styles of faulting (Schorlemmer et al. 2005). It is possible that the present analysis also captures changes in the \(b\)-value that reflect these effects.

To evaluate this possibility, we chose the average slip rate on the plate interface based on a similar earthquake catalog (Igarashi 2020) and the fault rake angle, which corresponds to style of faulting, as observational data that can be directly compared with our analysis. Igarashi (2020) constructed a catalog of similar earthquakes and small repeating earthquakes from 1981 to 2019 in central Japan and from 2001 to 2019 throughout the Japanese Islands and showed that there is little difference in the spatial distribution of average slip rates estimated from these two catalogs. Here, we use a similar earthquake catalog to estimate the average slip rate in the same rectangular regions used in the seismicity analysis. The method for estimating the slip rate is similar to that used by Igarashi (2010, 2020). The average slip rate in the rectangular region is estimated by taking the mean of the average slip rate of all similar earthquake groups in the region. The period for estimating the slip rate is from the last event before 2000 (or the first after 2000 if there were none before) to the last event in the same group. In the estimation, the empirical relationship between magnitude and the amount of coseismic slip by Nadeau and Johnson (1998) is used. We note that the analysis period includes the Tohoku-oki earthquake and its afterslip and therefore includes much non-stationary slip.
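The slip-rate estimation described above can be sketched in Python as follows. This is a hedged illustration, not the procedure of Igarashi (2010, 2020): the constants are the commonly quoted form of the Nadeau and Johnson (1998) relation, \(d={10}^{-2.36}{M}_{0}^{0.17}\) (with \(d\) in cm and \({M}_{0}\) in dyne cm), combined with \(\log {M}_{0}=1.5M+16.1\), and should be checked against the original papers. The function names, the assumption that each group has already been trimmed to the estimation period, and the choice to exclude the slip of the first event of each group are all assumptions made here.

```python
from statistics import mean

def coseismic_slip_cm(magnitude):
    """Slip (cm) of one similar earthquake; constants are assumed (see text)."""
    log10_m0 = 1.5 * magnitude + 16.1        # seismic moment in dyne cm
    return 10.0 ** (-2.36 + 0.17 * log10_m0)

def group_slip_rate(events):
    """Average slip rate (cm/yr) of one group of similar earthquakes.

    events: list of (time_in_years, magnitude), already limited to the
    estimation period. The first event only starts the clock (assumption).
    """
    if len(events) < 2:
        raise ValueError("at least two events are needed")
    events = sorted(events)
    duration = events[-1][0] - events[0][0]
    if duration <= 0:
        raise ValueError("events must span a positive time interval")
    slip = sum(coseismic_slip_cm(m) for _, m in events[1:])
    return slip / duration

def region_slip_rate(groups):
    """Mean of the group slip rates within one rectangular region."""
    return mean([group_slip_rate(g) for g in groups])
```

With these assumed constants, a group of three \(M\) 3.0 events spaced 5 years apart gives roughly 2.8 cm/yr, which illustrates the order of magnitude involved.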

Figure 10a compares the estimated \(b\)-values with the logarithm of the estimated average slip rate. All results for each pair of \(l\) and \(N\) are shown together. Darker symbol colors correspond to results with a high ratio of the number of similar earthquakes \({n}_{\mathrm{similarEQ}}\) to \(N\), that is, results for data including many interplate events. In these plots, the \(b\)-value is positively correlated with the average slip rate. Similarly, Nanjo and Yoshida (2018) showed a negative correlation between the \(b\)-value and the slip deficit rate in the Nankai Trough. In the present analysis, the \(b\)-value appears to be linearly correlated with the logarithm of the average slip rate, unlike the linear correlation between \(b\)-values and slip deficit rates in Nanjo and Yoshida (2018), probably due to the inclusion of the afterslip of the Tohoku-oki earthquake. Despite such differences, these results are qualitatively consistent in terms of the relationship between \(b\)-values and interplate coupling. The present analysis, where most target activity is along the Japan Trench, suggests that the \(b\)-values tend to be higher in regions with weak interplate coupling and lower in regions with strong interplate coupling, and regions of significantly higher and lower \(b\)-values are detected statistically as anomalous \(b\)-values. The \(b\)-values of seismicity without similar earthquakes, most of which are considered to be intraplate earthquakes, show no clear correlation with the average slip rate.

Fault rake angle, which corresponds to the style of faulting, is another factor expected to correlate with \(b\) (Schorlemmer et al. 2005). To examine the relationship, we extracted the rake angles between \(-90^\circ\) and \(90^\circ\) of all the earthquake mechanisms in the F-net catalog (Kubo et al. 2002) whose epicenters are in the spatiotemporal domain used for the estimation of index values. Rake angles of about \(-90^\circ\) indicate normal faulting, \(0^\circ\) strike-slip faulting, and \(90^\circ\) reverse faulting. Figure 11a shows the PDDs of rake angles extracted in the high, low, and typical \(b\)-value regions north of \(34.5^\circ\) N. The latitudinal range was set to avoid the area around the Izu Islands (region R in Fig. 7b), where many earthquakes associated with volcanic activity occur, and to include activities with a clear regional difference between high and low \(b\)-value regions. The results for all pairs of \(l\) and \(N\) plotted in the figure show no significant difference. Figure 11a shows that, in the low \(b\)-value regions (blue lines), the proportion of normal faults is small and most earthquakes are reverse faults, whereas, in the high \(b\)-value regions (red lines), the proportion of reverse faults is small and the number of normal faults is relatively large. This result is consistent with that of Schorlemmer et al. (2005), who found that reverse-fault earthquakes have low \(b\)-values and normal-fault earthquakes have high \(b\)-values. As mentioned previously, the \(b\)-values of seismicity that includes many interplate earthquakes, most of which are reverse-fault earthquakes, are related not only to the style of faulting but also to the slip rate. The results for the other, intraplate earthquakes are consistent with the conventional idea that \(b\)-values correspond to the stress field that causes differences in the style of faulting.

#### Anomalous \({\varvec{\eta}}\)-values

The anomalous \(\eta\)-value regions appear to be near the anomalous \(b\)-value regions (Fig. 7a–d). Figure 12 shows the proportions of high, low, and typical \(\eta\)-value regions in the high, low, and typical \(b\)-value regions, respectively, for \(N=50\) and \(l=0.4^\circ\). No clear relationship is apparent between the characteristics of the \(b\)- and \(\eta\)-value distributions in each region. Although there are some regions where both of the \(b\)-value and \(\eta\)-value PDDs are anomalous, such as in region g (Fig. 7a), which has a high \(b\)-value and a high \(\eta\)-value, such anomalies probably reflect regional characteristics.

Figure 13 shows the relationship between each \(b\)-value and \(\eta\)-value. The anomalous \(b\)- and \(\eta\)-value regions (purple dots) may show some correlation, such as with high \(b\)-values and high \(\eta\)-values (Fig. 13a), which probably reflects the regional characteristics of region g. However, if we look at the plot of typical \(b\)- and \(\eta\)-value regions (green dots), which excludes the possibility of the influence of such regional characteristics, these index values seem to be uncorrelated. Moreover, the correlation coefficient between typical \(b\)-values and typical \(\eta\)-values is 0.02 for area A and 0.11 for area B, indicating no significant correlation.

It is often pointed out that mixing distinct types of seismicity with different characteristic \(b\)-values may cause deviations from the GR law (e.g., Wiemer and Wyss 2000). However, it is difficult to know the true \(b\)-value of each type of seismicity in advance, so one way to monitor seismicity without making arbitrary assumptions is to observe it through multiple index values, such as the \(\eta\)- and \(b\)-values. In particular, because the \(b\)- and \(\eta\)-values are basically uncorrelated, monitoring them simultaneously should improve the accuracy of anomaly detection.

Although we found no clear cause for a change in the \(\eta\)-value, it may be meaningful to look at the relationship between the \(\eta\)-value and interplate coupling or type of faulting. Figure 10b shows the relationship between the \(\eta\)-value and the average interplate slip rate. It is difficult to find a systematic correspondence in this figure, but the \(\eta\)-values are particularly low when the average slip rate is very high (> 400 mm/yr, much higher than the plate convergence velocity), most likely because of the afterslip of the Tohoku-oki earthquake. In these locations, the *b*-values are large (Fig. 10a), indicating that the number of large earthquakes is very small relative to the number of small earthquakes. Similarly, Vorobieva et al. (2016) compared the shape of the FMD and the creep rate along the San Andreas fault and showed that the FMD tends to become more convex upward as the creep rate increases. One interpretation is that, in the vicinity of areas sliding at sufficiently high velocities compared to the relative motion between the plates, small earthquakes are more likely to occur because of localized stress concentration without widespread stress accumulation, whereas relatively large earthquakes are less likely to occur.

In Fig. 11b, which shows the frequency distribution of rake angles in the high, low, and typical \(\eta\)-value regions, the low \(\eta\)-value regions (blue lines) have a relatively large proportion of reverse-fault-type earthquakes. Nevertheless, there are also many reverse-fault-type earthquakes in the high \(\eta\)-value regions (red lines), and no simple relationship was found. Normal-fault earthquakes seem to be relatively common in the high \(\eta\)-value regions.

### Probability density distribution of the index values of typical seismicity

#### Common temporal variation of typical *b*-values

The present analysis allowed us to extract activities in regions where there was no significant difference in the PDDs of index values compared to those of most other areas. These activities may include temporal changes in the index values that are common throughout the analysis area. Here, as the only example where a systematic dependence of the \(b\)-value was found, we highlight the relationship between \(b\) and the earthquake occurrence interval in the typical \(b\)-value regions.

Because \(b\) is an average property of the \(N\) analyzed hypocenters, we used \(\mathrm{min}{T}_{N/4}\), which was used to analyze \(D\)-values, as an index of the characteristics of the earthquake occurrence interval of those \(N\) earthquakes for comparison with \(b\). Figure 14 shows the relationship between \(b\) and \(\mathrm{min}{T}_{N/4}\). The results for area A (Fig. 14a) show that the entire \(b\)-value distribution increases with \(\mathrm{min}{T}_{N/4}\). The slope decreases as \(\mathrm{min}{T}_{N/4}\) increases, becoming almost constant above \(\sim {10}^{7}\) s and slightly decreasing near \(\sim {10}^{8}\) s. A small \(\mathrm{min}{T}_{N/4}\) reflects active aftershocks, and \(\mathrm{min}{T}_{N/4}\) increases with aftershock decay. Therefore, the relationship between \(b\) and \(\mathrm{min}{T}_{N/4}\) is consistent with the temporal variations of \(b\) in the anomalous \(b\)-value regions near the epicenters of large earthquakes in area A, where \(b\) is low for the initial active aftershocks and increases as the aftershocks decay (Fig. 9a, b). Many of these anomalous \(b\)-value regions near the epicenters of large earthquakes tend to be low \(b\)-value regions, where many aftershocks occur over short time intervals (Fig. 9a), whereas regions that have relatively few short-interval aftershocks are extracted as high \(b\)-value regions (Fig. 9b). The results for area B (Fig. 14b) also show a similar positive correlation between \(b\) and \(\mathrm{min}{T}_{N/4}\) in the range of \({10}^{4}\)–\({10}^{7}\) s. The relatively high median \(b\)-values for \(\mathrm{min}{T}_{N/4}\) smaller than \({10}^{4}\) s seem to be due to a lack of low \(b\)-values. The relationship between \(b\) and \(\mathrm{min}{T}_{N/4}\) becomes an inverse correlation at \(\mathrm{min}{T}_{N/4}\) > \({10}^{7}\) s. In the subduction zone included only in area B, large earthquakes occur over shorter time intervals than those on inland active faults in area A. The \(\mathrm{min}{T}_{N/4}\) value at which the slope changes in Fig. 14a, b might be related to the period required for the transition from aftershock decay to the accumulation of stress before the next large earthquake. The physical interpretation of the relationship between \(b\) and \(\mathrm{min}{T}_{N/4}\) obtained here and the difference between the results for areas A and B are of great interest but are beyond the scope of this paper, so further detailed discussion of these issues is left for future work.

Regarding the relationship between \(b\) and \(\mathrm{min}{T}_{N/4}\), there is still a concern that small earthquakes may be less likely to be detected when the time intervals between successive events are short. In fact, some of the \(b\)-values are extremely small, especially for small \(\mathrm{min}{T}_{N/4}\) in area A. Although the number is small, the analysis may include data with \({M}_{\mathrm{c}}>{M}_{\mathrm{th}}\). For this reason, we also checked the relationship between \(\mathrm{min}{T}_{N/4}\) and \({b}^{+}\) (\({M}_{\mathrm{min}}={M}_{\mathrm{th}}-0.5\), \({m}_{\mathrm{th}}=0.2\)) in the same way (Additional file 1: Fig. S7). Similar to Fig. 14, Additional file 1: Fig. S7 also shows a weak positive correlation between \({b}^{+}\) and \(\mathrm{min}{T}_{N/4}\), except in the case of very small \(\mathrm{min}{T}_{N/4}\) (< 300 s) for area A (Additional file 1: Fig. S7a) and large \(\mathrm{min}{T}_{N/4}\) (> \({10}^{7}\) s) for area B (Additional file 1: Fig. S7b). Note that, when the FMD is convex upward, \({b}^{+}\) would be larger for lower detectability of earthquakes, as is probably seen in the case of very small \(\mathrm{min}{T}_{N/4}\) for area A (Additional file 1: Fig. S7a).

#### Typical probability density distributions of *b*, *η*, and *D*

The typical PDDs of the index values obtained from seismicity during the past 20 years in and around Japan are expected to be applicable as a representation of the characteristics of typical seismicity. For example, when monitoring future seismicity and detecting anomalies, it would be straightforward to use the typical PDDs of index values as a reference to quantify the degree of anomaly.

In the present study, to perform analyses that are as comprehensive as possible with a finite amount of data, the statistical tests were performed for eight non-overlapping patterns of gridding for four different pairs of \(l\) and \(N\). As a result, eight typical PDDs, estimated from independent data and thus available for statistical tests, were obtained for each pair of \(l\) and \(N\) (Figs. 15, 16, 17). These PDDs depend on \(N\), but not on \(l\), and vary according to the number of index values obtained by the analyses (maximum for \(N=50, l=0.4^\circ\) in area A, minimum for \(N=100, l=0.2^\circ\) in area B). Therefore, the FMD and the timing of earthquake occurrence related to the tidal response, as seen through these indices, have certain characteristics regardless of the location or spatial scale, and these results may be explained in a unified manner as simply having index values that vary according to the number of analyzed earthquakes, \(N\). If such a feature can be modeled, it will be useful, especially when applied to anomaly detection.

Therefore, in the following section, we present simple models for the FMD and tidal correlation of seismicity that explain the typical PDDs of \(b\), \(\eta\), and \(D\).

### Simple models explaining the observed typical PDDs

#### Frequency-magnitude distribution

The PDD of \(b\), which is the value obtained from Eq. (2) and corresponds to the slope of the FMD around \(M={M}_{\mathrm{th}}\), is not very different from that expected from the GR law with constant \(b\)-values (black dashed lines, Fig. 15). This means that the range of variation of the true \(b\)-value is not very wide in most of the analyzed seismicity (it rarely varied by more than \(\pm 0.1\) in area A or \(\pm 0.2\) in area B). In contrast, the PDD of the typical \(\eta\)-value (Fig. 16) is lower than the distribution expected from the GR law. This PDD shift does not indicate a deviation from the GR law only for large magnitude events, as often modeled (e.g., Hirose et al. 2019b); rather, it reflects the shape of the upward convex FMD whose slope gradually changes with increasing magnitude.

We use the equation proposed by Lomnitz-Adler and Lomnitz (1979) (hereafter referred to as the L-L formula) as a model to express the convex shape of the FMD for a small number of earthquakes. In the L-L formula, the number of earthquakes above a certain magnitude \(M\) is described as

$$\begin{array}{c}\log N\left(M\right)=A-c \exp\left(HM\right) ,\end{array}$$

(7)

where \(A,c,\) and \(H\) are positive parameters. The physical background of the derivation of this equation is not considered here; the reason for its adoption is simply that it is suitable for describing the observations. Because the slope of the FMD varies according to both \(M\) and \(H\) in this equation, it is convenient to transform it using the slope \(b^{\prime }\left( {M_{{{\text{th}}}} } \right)\) at \(M={M}_{\mathrm{th}}\) for comparison with the observed \(b\)-value. That is

$$\begin{array}{c}\log N\left(M\right)=A-\frac{{b}{^{\prime}}\left({M}_{\mathrm{th}}\right)}{H} \exp\left({H}\cdot \left(M-{M}_{\mathrm{th}}\right)\right),\end{array}$$

(8)

$$\begin{array}{c}{b}{^{\prime}}\left({M}_{\mathrm{th}}\right)=-{\left.\frac{d\log N}{dM}\right|}_{M={M}_{\mathrm{th}}}=cH \exp\left(H{M}_{\mathrm{th}}\right) .\end{array}$$

(9)
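For reference, the step from Eq. (7) to Eq. (8) can be written out as follows. This is a sketch that takes \({b}^{\prime}\) to be the magnitude of the (negative) slope of \(\log N\) and uses the positive parameters \(c\) and \(H\) of Eq. (7); the small-\(H\) expansion in the last line is added here only as an illustration.

$$\begin{aligned}
-\frac{d\log N}{dM} &= cH\exp\left(HM\right) \quad\Rightarrow\quad {b}^{\prime}\left({M}_{\mathrm{th}}\right)=cH\exp\left(H{M}_{\mathrm{th}}\right),\\
c\exp\left(HM\right) &= c\exp\left(H{M}_{\mathrm{th}}\right)\exp\left(H\left(M-{M}_{\mathrm{th}}\right)\right)=\frac{{b}^{\prime}\left({M}_{\mathrm{th}}\right)}{H}\exp\left(H\left(M-{M}_{\mathrm{th}}\right)\right),\\
\log N\left(M\right) &= A-\frac{{b}^{\prime}\left({M}_{\mathrm{th}}\right)}{H}\exp\left(H\left(M-{M}_{\mathrm{th}}\right)\right)\approx \left(A-\frac{{b}^{\prime}\left({M}_{\mathrm{th}}\right)}{H}\right)-{b}^{\prime}\left({M}_{\mathrm{th}}\right)\left(M-{M}_{\mathrm{th}}\right)\quad \left(H\left|M-{M}_{\mathrm{th}}\right|\ll 1\right).
\end{aligned}$$

In this limit the expression reduces to a straight line with slope \(-{b}^{\prime}\left({M}_{\mathrm{th}}\right)\), that is, to the form of the GR law, whereas a larger \(H\) makes the FMD more strongly convex upward.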

In Eqs. (8) and (9), \(b{^{\prime}}\left({M}_{\mathrm{th}}\right)\) and \(H\) correspond to the observed \(b\)- and \(\eta\)-values, respectively. From the results of the present analysis, it is considered that \(b\)-values fluctuate within a narrow range. Hence, \(b{^{\prime}}\left({M}_{\mathrm{th}}\right)\) is assumed to be normally distributed. From the present observations of the \(\eta\)-values and from the general observation that the GR law is almost valid, it seems that \(H\) fluctuates slightly and has a small positive value. Hence, a lognormal distribution is assumed here. In this case, the probability density functions of \({b}{^{\prime}}\) and \(H\) can be expressed as

$$\begin{array}{c}f\left({b}{^{\prime}}\right)=\frac{1}{\sqrt{2\pi {\sigma }_{{b}{^{\prime}}}^{2}}}\exp\left(-\frac{{\left({b}{^{\prime}}-{\mu }_{{b}{^{\prime}}}\right)}^{2}}{2{\sigma }_{{b}{^{\prime}}}^{2}}\right),\end{array}$$

(10)

$$\begin{array}{c}f\left(H\right)=\frac{1}{\sqrt{2\pi }{\sigma }_{H}{H}}\exp \left(-\frac{{\left(\ln H - {\mu }_{H}\right)}^{2}}{2{\sigma }_{H}^{2}}\right) ,\end{array}$$

(11)

where \({\mu }_{{b}{^{\prime}}}\) and \({\sigma }_{{b}{^{\prime}}}\) are the mean and standard deviation of \({b}{^{\prime}}\), respectively, and \({\mu }_{H}\) and \({\sigma }_{H}\) are the mean and standard deviation of \(\ln H\), respectively.
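As an illustration of how Eqs. (8), (10), and (11) can be combined, the following Python sketch draws one \(b^{\prime}\) and one \(H\) and then generates magnitudes by inverse-transform sampling. It is not the code used in this study. It assumes that \(\log\) is the base-10 logarithm and that \(N(M)\) is the cumulative number of events with magnitude \(\ge M\); the function names are hypothetical, and magnitude discretization is ignored.

```python
import numpy as np

def sample_magnitudes(n, b_prime, H, m_th, rng):
    """Draw n magnitudes >= m_th from the FMD of Eq. (8).

    Assumption: log is base 10 and N(M) is cumulative, so the survival
    function is S(M) = 10 ** (-(b_prime / H) * (exp(H * (M - m_th)) - 1)).
    Setting S(M) = u and solving for M gives the expression returned below.
    """
    u = 1.0 - rng.random(n)      # uniform on (0, 1], avoids log10(0)
    x = -np.log10(u)             # x >= 0
    return m_th + np.log1p(H * x / b_prime) / H

def sample_sequence(n, mu_b, sigma_b, mu_H, sigma_H, m_th, rng):
    """Generate one magnitude sequence with its own b' and H."""
    # Eq. (10); a non-positive draw is possible in principle but
    # negligible when mu_b is many standard deviations above zero.
    b_prime = rng.normal(mu_b, sigma_b)
    # Eq. (11); mu_H and sigma_H are the mean and std of ln H.
    H = rng.lognormal(mu_H, sigma_H)
    return sample_magnitudes(n, b_prime, H, m_th, rng)

rng = np.random.default_rng(0)
seq = sample_sequence(100, 0.875, 0.09, -2.7, 0.2, 1.95, rng)
```

Each call to `sample_sequence` corresponds to one synthetic activity of `n` events, from which \(b\) and \(\eta\) would then be estimated in the same way as for the observations.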

Here, we searched for the values of \({\mu }_{{b}{^{\prime}}}\), \({\sigma }_{{b}{^{\prime}}}\), \({\mu }_{H}\), and \({\sigma }_{H}\) for which the simulated PDDs of the \(b\)- and \(\eta\)-values fit the observed ones well. For each combination of search parameters, the generation of \(N\) \(M\)-sequences and the estimation of their \(b\)- and \(\eta\)-values were repeated 30,000 times. The goodness of fit was evaluated by the weighted least squares method, which takes the variance of the observed values in each bin as the observation error and the inverse of the variance as the weight. That is, let \({g}_{k}\) be the estimated value of the \(k\)-th bin and \({y}_{kn}\) the observed values, where \(k\ (=1, 2, \dots , K)\) runs over the bins of the PDDs shown in Figs. 15a, b and 16a, b for area A and Figs. 15c, d and 16c, d for area B, excluding bins with a probability density of 0, and \(n\ (=1, 2, \dots , 8)\) indexes the observed values in each bin. The combination of parameters that minimizes the following equation is then obtained:

$$\begin{array}{c}{S}_{w}=\frac{1}{K}{\sum }_{k}\frac{1}{8}{\sum }_{n}\frac{{\left({g}_{k}-{y}_{kn}\right)}^{2}}{{\sigma }_{k}} ,\end{array}$$

(12)

where \({\sigma }_{k}\) is the variance of \({y}_{kn}\) in each bin.
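The misfit of Eq. (12) is straightforward to evaluate numerically. The sketch below is one possible implementation, not the code used in this study; the function name is hypothetical, and the use of the population variance (`ddof=0`) is an assumption, since the text does not specify the estimator.

```python
import numpy as np

def weighted_misfit(g, y):
    """S_w of Eq. (12).

    g : model PDD values, shape (K,)
    y : observed PDD values, shape (K, n_obs) with n_obs = 8 in the text
    Bins whose observed variance is zero must be excluded beforehand,
    otherwise the division below fails.
    """
    g = np.asarray(g, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = y.var(axis=1)               # per-bin variance (ddof=0 is an assumption)
    sq = (g[:, None] - y) ** 2          # squared residuals, shape (K, n_obs)
    return np.mean(sq.mean(axis=1) / sigma)

# One bin, observations alternating between 0 and 2 (variance 1):
y = [[0, 2, 0, 2, 0, 2, 0, 2]]
s_exact = weighted_misfit([1.0], y)     # residuals^2 are all 1
s_off = weighted_misfit([3.0], y)       # residuals^2 are 9 and 1
```

Because the residuals are scaled by the per-bin variance, bins with large scatter among the observed PDDs contribute less to the fit.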

The results of a grid search around the obtained parameter values are shown in Fig. 18. Although each parameter has a small range of acceptable values, the values can be almost uniquely obtained as \({\mu }_{{b}{^{\prime}}}=0.875\), \({\sigma }_{{b}{^{\prime}}}=0.09\), \({\mu }_{H}=-2.7\), and \({\sigma }_{H}=0.2\) for area A, where \({M}_{\mathrm{th}}=1.95\), and \({\mu }_{{b}{^{\prime}}}=0.75\), \({\sigma }_{{b}{^{\prime}}}=0.105\), \({\mu }_{H}=-1.35\), and \({\sigma }_{H}=0.75\) for area B, where \({M}_{\mathrm{th}}=3.45\). In Figs. 15 and 16, the PDDs of the 30,000 \(b\)- and \(\eta\)-values estimated in the simulation using these parameters are shown as black lines. The gray bars show the range containing 90% of 10,000 PDDs, each estimated from almost the same number of \(b\)- and \(\eta\)-values as observed. These results show that the estimated parameter values, used with Eqs. (9–11), reproduce all observed PDDs of \(b\) and \(\eta\) well simultaneously.
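For orientation, the fitted \({\mu }_{H}\) values can be converted into a typical \(H\). This is a side calculation rather than a result of the analysis; it uses only the standard property that the median of a lognormal distribution with log-mean \({\mu }_{H}\) is \(\exp \left({\mu }_{H}\right)\):

$$\exp\left(-2.7\right)\approx 0.067\;\;\left(\text{area A}\right),\qquad \exp\left(-1.35\right)\approx 0.26\;\;\left(\text{area B}\right).$$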

The above model reproduces the characteristics of the FMD expressed using two indices, \(b\) and \(\eta\), but does not guarantee that the FMD itself is well represented by the L-L formula (Eq. 7). However, the cumulative frequency distribution of all magnitudes generated by the proposed model is consistent with the overall observed FMD (Fig. 19), and the L-L formula seems to be a good choice for representing the entire distribution in a unified formula. It should also be emphasized that, as shown in Fig. 19, even if individual activities that consist of \(N\) events have a convex shape, the linear frequency distribution expressed by the GR law is almost reproduced when all events are grouped together.

#### Tidal correlation

As mentioned above, when seismic activity is uncorrelated with the tidal response, the PDD of \(D\) is expressed by Eq. (5). The \(D\)-values that we actually obtained are generally slightly larger than expected from Eq. (5), even if we exclude seismicity with anomalous PDDs (Fig. 17). Here, we show that these slightly larger \(D\)-values can be explained by considering the sequential nature of earthquakes.

In the present analysis, the condition of \(\mathrm{min}{T}_{N/4}<\mathrm{21,600}\; \mathrm{s}\) is used to exclude in advance seismic activity clustered within a short period relative to the tidal cycle (i.e., high \(D\)-value activity resulting from aftershocks). However, this condition does not account for the effect of several earthquakes occurring in succession. As described by the Omori-Utsu law, the probability of aftershocks is highest immediately after a previous earthquake, and several earthquakes are often observed in succession.

To account for the effect of such a succession of earthquakes, we assume that \({N}{^{\prime}}=rN\) (\(0<r\le 1\)) of the observed \(N\) earthquakes are uncorrelated with the tide and occur at intervals sufficiently longer than the tidal period, whereas each of the remaining \(N-{N}{^{\prime}}\) earthquakes follows the previous one after an interval sufficiently shorter than the tidal period. In this case, as in Eq. (4), the \(D\)-value of the \({N}{^{\prime}}\) earthquakes is described as

$$\begin{array}{c}{D}{^{\prime}}=\left\{{\left({\sum }_{j=1}^{{N}{^{\prime}}}\mathrm{cos}{\theta }_{j}\right)}^{2}+{\left({\sum }_{j=1}^{{N}{^{\prime}}}\mathrm{sin}{\theta }_{j}\right)}^{2}\right\}^{1/2} ,\end{array}$$

(13)

where \({\theta }_{j} \left(j=1, 2,\dots ,N{^{\prime}}\right)\) is the tidal phase angle of the \(j\)-th event. The PDD of \({D}{^{\prime}}\) is approximated as

$$f\left({D}{^{\prime}}\right)=\frac{2{D}{^{\prime}}}{{N}{^{\prime}}}\mathrm{exp}\left(-\frac{{{D}{^{\prime}}}^{2}}{{N}{^{\prime}}}\right).$$

(14)

Assuming that the remaining \(N-{N}{^{\prime}}\) earthquakes are successive and their phase angles are approximately equal to those of their preceding earthquakes, the relationship between \(E\left[D\right]\), the expected value of \(D\) estimated by Eq. (5), and \(E\left[D{^{\prime}}\right]\) estimated by Eq. (14) can be approximated by \(E\left[D\right]={\int }_{0}^{\infty }Df(D)dD\approx E\left[D{^{\prime}}\right]/r={\int }_{0}^{\infty }D{^{\prime}}f(D{^{\prime}})dD{^{\prime}}/r\). The following PDD of \(D\) satisfies this relationship:

$$f\left(D\right)\approx \frac{2rD}{N}\mathrm{exp}\left(-\frac{r{D}^{2}}{N}\right).$$

(15)

In other words, when successive occurrences are considered, the PDD of \(D\) is represented by a Rayleigh distribution that becomes wider than expected from Eq. (5) as \(r\) becomes smaller.
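The scaling behind Eq. (15) can be checked with a small Monte Carlo experiment. The Python sketch below is illustrative only and is not the simulation code of this study. It assumes uniformly random tidal phases for the \(N^{\prime}\) independent events and applies the approximation \(D\approx D^{\prime}/r\), that is, each independent event is taken to carry \(N/N^{\prime}\) events at the same phase on average; the function names are hypothetical.

```python
import numpy as np

def simulate_D(N, r, trials, rng):
    """Simulate D-values under the succession model leading to Eq. (15)."""
    n_indep = max(1, round(r * N))                  # N' = rN independent events
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(trials, n_indep))
    # Eq. (13): resultant length of the N' independent phase vectors
    d_prime = np.hypot(np.cos(theta).sum(axis=1), np.sin(theta).sum(axis=1))
    # Successive events share the phase of their predecessor: D ~ D' / r
    return d_prime * N / n_indep

def rayleigh_mean(N, r):
    """Mean of the Rayleigh PDD in Eq. (15): sqrt(pi * N / (4 r))."""
    return np.sqrt(np.pi * N / (4.0 * r))

rng = np.random.default_rng(1)
d = simulate_D(100, 0.5, 20000, rng)
print(d.mean(), rayleigh_mean(100, 0.5))
```

Setting `r = 1` reproduces the uncorrelated case of Eq. (5), and decreasing `r` widens the simulated distribution, as stated above.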

Here, we searched for the value of \(r\) for which the simulated PDD of \(D\) fits the observed one well, by repeating the generation of \(N\) \(D\)-values expected from Eq. (15) 30,000 times for each \(r\). That is, \(r\) is estimated by minimizing \({S}_{w}\) in Eq. (12), where \({g}_{k}\) is taken from the PDD of \(D\)-values obtained in the simulations, \({y}_{kn}\) is taken from the observed PDDs (colored symbols in Fig. 17a, b for area A and Fig. 17c, d for area B), and \(k\) runs over the \(K\) bins for which the probability density of the observed \(D\)-value is non-zero.

The results of a grid search around the obtained parameter values are shown in Fig. 20; we estimate that \(r=0.67\) for area A and \(r=0.71\) for area B. Within the range 0.6–0.7 for area A and 0.65–0.8 for area B, there is no significant difference from the observations. Figure 17 shows, as black lines, the PDDs of the 30,000 \(D\)-values estimated in the simulation using \(r=0.67\) or \(r=0.71\). The gray bars show the range containing 90% of 10,000 PDDs, each estimated from almost the same number of \(D\)-values as observed. The results show that Eq. (15) with the estimated value of \(r\) explains all observed PDDs of \(D\) well. The model PDD is rather wide for the data with \(N=50\), and, strictly speaking, it may be better to change the model depending on the value of \(N\). Considering the error range of the observations, however, the same model appears to be a sufficient approximation for \(N\) = 50–100.

The estimated \(r\) values suggest that, in activity satisfying \(\mathrm{min}{T}_{N/4}\ge \mathrm{21,600}\; \mathrm{s}\), earthquakes that follow another earthquake within an interval shorter than the tidal cycle account for about 30–40% of events in area A and 20–30% in area B. Figure 21 shows the average ratio of successive events that occurred within 3 h of the previous event (i.e., less than 1/4 of the main tidal cycle of about 12 h) to \(N\) in the typical, high, and low \(D\)-value regions. The values in the typical \(D\)-value regions, about 0.3 for area A and about 0.2 for area B, are consistent with the corresponding values obtained from the model. In area A, the proportion of successive earthquakes is lower in the case of \(N\) = 50 than in the case of \(N\) = 100, which may be why the model PDD is slightly too wide in the case of \(N\) = 50 when the two cases are modeled together. Furthermore, the proportion of successive earthquakes is higher in the high \(D\)-value regions and lower in the low \(D\)-value regions. Therefore, the PDD of \(D\) depends primarily on the proportion of successive earthquakes, and slight changes in the PDD caused by changes in that proportion appear as anomalous \(D\)-values.
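The ratio plotted in Fig. 21 can be computed directly from origin times. The following Python sketch shows one way to do so; the function name and the input format (origin times in seconds) are assumptions made for illustration, and the strict inequality at exactly 3 h is a choice not specified in the text.

```python
import numpy as np

def successive_ratio(times_s, window_s=3 * 3600):
    """Fraction of the N events that follow the previous event within window_s.

    times_s : origin times of the N events in seconds (any order)
    """
    t = np.sort(np.asarray(times_s, dtype=float))
    gaps = np.diff(t)                        # N - 1 inter-event times
    return np.count_nonzero(gaps < window_s) / t.size

# Five events; two of them follow their predecessor within 3 h
ratio = successive_ratio([0, 1000, 20000, 21000, 50000])
```

Under the model above, this ratio plays the role of \(1-r\), which is why values of about 0.3 and 0.2 are compared with the estimated \(r\).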

In this analysis, we could not find any tidal correlation. If a real tidal correlation exists, however, it should be possible to detect it as an anomaly in the \(D\)-value by using the typical PDD of \(D\) obtained here as a reference. In that case, it is necessary to check whether there is an extreme increase or decrease in the number of successive earthquakes occurring at intervals shorter than the tidal cycle. The \(D\)-value may also be useful as an index of the increase or decrease of successive earthquakes during such short intervals.

Finally, we show the relationship between \(D\) and \(b\) or \(\eta\) (Fig. 22). Because the \(b\)-values also show a weak dependence on the time interval of earthquake occurrence (Fig. 14), one might expect a very weak negative correlation between \(b\) and \(D\). However, there is no clear correlation in any of the plots in Fig. 22. This is probably because we excluded in advance seismic activity with clearly short earthquake intervals (\(\mathrm{min}{T}_{N/4}<\mathrm{21,600}\; \mathrm{s}\)) in our analysis of the \(D\)-values. Therefore, these three index values can be treated as independent, at least for typical activity as defined here.