Integration of Machine learning and equal differential time method for enhanced hypocenter localization in earthquake early warning systems: application to dense seismic arrays in Taiwan

Abstract

The Earthquake Early Warning System (EEWS) acts as a vital instrument for reducing seismic risks in regions with high seismic vulnerability. A rapid and accurate hypocenter estimation is pivotal for the EEWS, providing the groundwork for more reliable magnitude and intensity assessments necessary for effective earthquake warnings. This study presents an algorithm that integrates machine-learning-based (near) real-time phase picking with an Equal Differential Time (EDT) rapid hypocenter location algorithm, applying it to a 3D velocity model. The phase-picking model, refined through data augmentation, enhances the precision of phase detection in continuous recordings and simultaneous multiple events while ensuring the swift detection of the P-phase, which is critical for early earthquake warnings. Our rapid earthquake location method calculates theoretical P arrivals from potential hypocenters, which are grid points in a 3D velocity model, to stations that are close to their grid points, with the arrivals being stored by the station. As P arrivals are detected, the differences in arrival times across stations are utilized in EDT for estimating hypocenters. Furthermore, our earthquake location algorithm is adept at localizing multiple seismic events, a capability that can diminish the risk of unreported cases in scenarios where events occur in close temporal and spatial succession in high seismicity regions. We applied the algorithm to real waveform recordings of recent earthquakes in Taiwan that satisfied the early warning criteria. The results suggest that our algorithm consistently yields more reliable hypocenter estimates compared to those from the currently operational EEWS in Taiwan. Moreover, our algorithm succeeded in locating an earthquake that the current EEWS overlooked due to its failure to recognize P arrivals. 
These results showcase the potential of our algorithm to provide more accurate hypocenter estimates and to locate earthquake events with complex seismic recordings.

1 Introduction

Earthquake Early Warning Systems (EEWS) are pivotal solutions aimed at promptly detecting earthquakes and issuing alerts before significant ground shaking arrives (e.g., Gasparini et al. 2007; Kodera et al. 2016; Allen and Melgar 2019; Cremen and Galasso 2020; Yamada et al. 2021; Saunders et al. 2022). These systems enable individuals and communities to take appropriate actions, thereby minimizing casualties and infrastructure damage. Typically, EEWS estimates earthquake source parameters, such as location and magnitude, by detecting the fastest seismic P-waves emanating from the earthquake. These estimated parameters form the basis for predicting ground shaking intensity and determining the necessity of issuing an alert. In seismically active regions, such as Mexico, Korea, Japan, and Taiwan, EEWS have been developed and implemented to mitigate losses and reduce the impact of significant seismic events (Allen and Melgar 2019).

Taiwan, located in an active orogenic zone adjacent to two subduction zones, frequently experiences seismic activity (Wu et al. 2007; Hsiao et al. 2009). The complexity of its geological structures increases the risk of seismic disasters. To enhance early warning capabilities for major earthquakes, thereby reducing damage and ensuring public safety, the Central Weather Administration (CWA) of Taiwan has established an EEWS. This system comprises a dense seismic monitoring network designed for rapid detection of significant seismic events. Currently, the EEWS of Taiwan employs a multi-step process for estimating the earthquake hypocenter. Initially, P-wave arrivals are selected using a Short-Term Average/Long-Term Average (STA/LTA)-based method (Allen 1982), incorporating multiple parameters to prevent false detections, such as signal-to-noise ratio, the number of zero crossings, and amplitude thresholds. When the number of triggered P-wave arrivals exceeds five, these arrivals are used for hypocenter estimation. The earthquake's epicenter is determined using Geiger's method (Geiger 1912), while the focal depth is estimated through a grid search with 10 km intervals. For rapid estimation, a half-space velocity model, based on the average velocity of the three-dimensional model of Wu et al. (2009), is employed to calculate travel times (Chen et al. 2015, 2019). The hypocenter estimate is updated whenever new P arrivals are available to ensure a more reliable hypocenter estimation for the EEWS in Taiwan. Currently, the hypocenter used for the EEWS is taken from at least the third update. Additionally, if the azimuthal gap (GAP) of the estimated hypocenter exceeds 150°, at least 11 P-wave arrivals are required. The magnitude estimation in the current EEWS of Taiwan is based on regression equations that utilize the hypocentral distance and the peak amplitude of displacement (Pd) from different types of sensors in the array (Wu and Zhao 2006; Chen et al. 2017, 2019).
In regions where earthquake depths range from tens to hundreds of kilometers, using hypocentral distance more accurately represents the spatial distance between the source and the station. This approach can provide more reliable regression equations for magnitude estimates for the EEWS (Chen et al. 2017; Wu et al. 2023). Once the earthquake source parameters are determined, seismic intensity for various regions is estimated, and earthquake warnings are issued for areas meeting the warning threshold (Chen et al. 2015, 2019). Presently, Taiwan's EEWS effectively issues public alerts within 10 s of earthquake detection within the monitoring network.

The Earthquake Early Warning System (EEWS) in Taiwan typically provides timely alerts for significant local earthquakes. Nevertheless, due to the high seismicity in Taiwan (Fig. 1a), the system sometimes encounters difficulties in identifying P-wave arrivals in complex waveform recordings. For example, this occurs in the case of earthquakes that happen closely in time and space, or major quakes that are preceded by foreshocks (Chen et al. 2019). Moreover, the existing algorithm for source localization is challenged in localizing multiple events concurrently, which results in missed alerts for potentially significant earthquakes. Additionally, the complex geological structure of Taiwan suggests that the use of a more precise three-dimensional velocity model for earthquake localization would improve the accuracy of hypocenter estimates. Consequently, there is a need to develop a more efficient and accurate algorithm for hypocenter localization within the EEWS to enhance the monitoring of major earthquakes, thereby improving subsequent assessments of magnitude and intensity.

Fig. 1

a Distribution of earthquakes with a magnitude (ML) greater than 4.0 in the Taiwan region from 1990 to October 2023. Earthquakes meeting the early warning criteria are marked with stars. b Distribution of potential hypocenter grid points at a depth of 10 km. Grid points are more closely spaced in areas with high station density or high seismicity. Stars indicate specific earthquake events selected for demonstrating our hypocenter location algorithm (as shown in Figs. 5, 6 and 7) and red triangles represent stations

Recent advancements in machine learning have demonstrated significant potential in recognizing seismic signals and identifying seismic phases from recordings (Zhu and Beroza 2018; Mousavi et al. 2020; Liao et al. 2021, 2022; Kubo et al. 2024). Traditional phase-picking methods, such as STA/LTA, rely on changes in seismic waveform amplitude for phase identification, which can lead to errors or omissions, particularly in abnormal situations or during dense seismic activity. Conversely, machine learning-based seismic phase identification utilizes the characteristics of seismic waves and can include abnormal signals and background noise in its training set, thereby reducing misjudgments. Furthermore, data augmentation techniques can be applied to enhance the model's ability to adapt to various seismic scenarios, significantly improving the identification of multiple seismic phases in a short time, and the real-time identification of P-wave arrivals (Liao et al. 2022). These improvements, especially in real-time P wave phase picking, are vital for effective earthquake early warning.

For rapid earthquake warning alerts, the first-arrived P waves recorded by stations during major earthquakes are primarily utilized for fast hypocenter estimation (Gasparini et al. 2007; Chen et al. 2019; Allen and Melgar 2019). This estimation is crucial for assessing earthquake magnitude and seismic intensity, thereby informing the decision to issue alerts. Once the machine learning model identifies P-wave arrivals, earthquake localization can proceed using the Equal Differential Time (EDT) method (Font et al. 2004; Lomax 2005; Satriano et al. 2007, 2008). This method leverages the differences in P-wave arrival times between station pairs. Because EDT localization relies on phase arrival time differences, it does not require precise information on the origin time of an earthquake. Additionally, EDT is robust in the presence of data outliers, a key feature when dealing with limited data, which underscores its practical utility in EEWS applications (Satriano et al. 2008). In this study, we combine reliable P arrivals identified by a machine learning model with an EDT-based method for (near) real-time hypocenter localization. The effectiveness of this algorithm is demonstrated through its application to earthquake events and dense seismic recordings in Taiwan (Fig. 1b).

2 Methods

Within EEWS, obtaining reliable hypocenter estimates is crucial for subsequent calculations of earthquake magnitude and ground motion predictions. In this study, we integrate RED-PAN, a machine-learning driven phase picking model, which is designed to rapidly and precisely discern phase arrivals from real-time seismic recordings. These identified P phase picks are then incorporated into an Equal Differential Time (EDT) (e.g., Font et al. 2004; Lomax 2005; Satriano et al. 2008) based method for hypocenter localization. EDT capitalizes on phase arrival differences between station pairs to probabilistically determine potential hypocenter locations (Fig. 2). Notably, the EDT method operates independently of the event's origin time and remains resilient against potential phase arrival errors or outliers, highlighting its aptness for EEWS (Satriano et al. 2008). In our approach, the EDT method is applied to potential hypocenter grids within our study area, and the P arrivals are pre-calculated using a 3D velocity model. In regions with intricate velocity structures, such as Taiwan, the integration of a 3D velocity model proves instrumental in enhancing the accuracy of hypocenter determinations.

Fig. 2

Workflow of the rapid hypocenter estimation algorithm developed in this study

The success of EEWS hinges upon rapid and robust detections of the initial P phase of major earthquakes. This leads to more accurate estimates of earthquake source parameters and subsequently more reliable hazard evaluations. Recent advancements in machine learning applied to seismic phase picking have shown considerable reliability in automated phase detections (e.g., Zhu and Beroza 2018; Mousavi et al. 2020; Liao et al. 2021, 2022). In this study, we employ Real-time Earthquake Detection and Phase Picking with Multitask Attention Network (RED-PAN) (Liao et al. 2022) for seismic phase picking in EEWS. Various data augmentation techniques have been deployed to enhance the model's performance for rapid P phase picking on real-time continuous recordings. These include superimposed earthquake waveforms for detecting seismic phases of multiple events, random shifts of waveforms for phase detection in continuous recordings, and utilizing only the front part of P arrival waveforms for rapid P phase detections (Liao et al. 2022). The RED-PAN model outperforms the STA/LTA-based method currently used by the EEWS of Taiwan by providing more reliable and precise P arrival picks (Liao et al. 2022).

In this study, we utilize RED-PAN to process real-time waveforms and employ the P-wave probability derived from the latest 5 s of waveforms to identify P-wave arrivals. When the EEWS receives real-time data, RED-PAN updates the phase probability functions. The 5-s window is used to detect and track P waves from the most recently received seismic recordings for earthquake early warning. Additionally, the duration of the detected P phase must be at least 0.3 s, with the corresponding P-wave probability exceeding a threshold of 0.6 (Fig. 3). These thresholds were determined by applying RED-PAN to continuous recordings and ensuring that the recall, precision, and F1 score of the P-wave picks are higher than 0.97. We have conducted offline simulations for real-time data processing tests using a workstation equipped with an Intel Xeon W-2125 CPU @ 4.00 GHz and an Nvidia GeForce RTX 2080 Ti GPU. Our simulations demonstrate that RED-PAN can update every 0.5–1.0 s for seismic recordings from more than 900 stations with a 100 Hz sampling rate.

Fig. 3

Diagram illustrating the machine learning model, RED-PAN, picking P-wave arrivals in real-time seismic records. The gray window denotes the prediction time range of RED-PAN. The pink area represents the P-wave triggering window; when the P-wave reaches the probability threshold (dashed line) and its amplitude meets the triggering criteria, it is determined as a P-wave arrival (red line)
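The probability-based triggering rule described above (P-wave probability above 0.6, sustained for at least 0.3 s) can be illustrated with a minimal sketch. The function name `first_p_trigger` and the choice to report the start of the sustained run as the trigger sample are assumptions made for illustration; they are not taken from RED-PAN, which may define the pick time differently.

```python
def first_p_trigger(prob, dt=0.01, threshold=0.6, min_duration=0.3):
    """Return the index of the first sample that starts a run in which the
    P-wave probability stays above `threshold` for at least `min_duration`
    seconds, or None if no such run exists.

    prob : sequence of P-wave probabilities (one value per sample)
    dt   : sample interval in seconds (0.01 s for 100 Hz data)
    """
    needed = int(round(min_duration / dt))  # samples required above threshold
    run = 0
    for k, p in enumerate(prob):
        if p > threshold:
            run += 1
            if run >= needed:
                # Hypothetical convention: report the start of the run.
                return k - needed + 1
        else:
            run = 0
    return None


# Probability trace for the latest 5 s window would be passed in here.
trace = [0.1] * 100 + [0.9] * 40 + [0.1] * 10
onset = first_p_trigger(trace)
```

In a real-time setting this check would be repeated each time the phase probability functions are updated, followed by the amplitude criterion described next.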

To mitigate false detections of P phases due to random or anthropogenic noise, and to exclude P phases from small earthquakes, we established a threshold value for the amplitude after P phase identification. As most distinctive features of P waves are recorded on the vertical component, we processed continuous seismic records from each monitoring station in the vertical direction using the following steps to determine this threshold value:

  1. Remove the mean and trend for each 20-min segment.

  2. Take the absolute value of the entire waveform.

  3. Implement a moving average over the amplitude for a duration of 0.3 s.

  4. Derive the median from the moving average results.

We adopted the aforementioned approach to analyze multi-day data from each station, enabling us to observe trends in background noise levels. The threshold for determining P phases at each station is five times the 90th percentile of all the medians obtained for that station. This ensures that the amplitude of a detected P phase is significantly larger than the background noise.
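The four processing steps and the threshold rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the trend is removed with a least-squares straight line, and the 90th percentile uses linear interpolation between order statistics. The source does not specify these details.

```python
import statistics


def detrend(x):
    """Remove the mean and a least-squares linear trend from a segment."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = sxy / sxx if sxx else 0.0
    return [v - x_mean - slope * (t - t_mean) for t, v in enumerate(x)]


def segment_noise_level(segment, fs=100.0, window_s=0.3):
    """Steps 1-4 for one segment: detrend, rectify, smooth, take the median."""
    rectified = [abs(v) for v in detrend(segment)]
    w = max(1, int(round(window_s * fs)))  # moving-average length in samples
    csum = [0.0]
    for v in rectified:
        csum.append(csum[-1] + v)
    smoothed = [(csum[k + w] - csum[k]) / w
                for k in range(len(rectified) - w + 1)]
    return statistics.median(smoothed)


def percentile(values, q):
    """q-th percentile with linear interpolation (an assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def station_threshold(segment_levels, factor=5.0, q=90.0):
    """Amplitude threshold: `factor` times the q-th percentile of the medians."""
    return factor * percentile(segment_levels, q)
```

Here `segment_levels` would hold one median per 20-min segment over several days of vertical-component data for a single station.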

In this study, we have developed a rapid hypocenter location method based on the EDT algorithm (Zhou 1994; Font et al. 2004; Satriano et al. 2008) for use in EEWS. The EDT algorithm assesses the likelihood of potential hypocenter locations using observed phase arrival time differences (\(Ot_{i} - Ot_{j}\)) and the corresponding calculated arrival differences (\(Ct_{g,i} - Ct_{g,j}\)) between station i and station j at a source g, expressed as:

$$ Dt_{g,i,j} = \left( {Ot_{i} - Ot_{j} } \right) - \left( {Ct_{g,i} - Ct_{g,j} } \right) $$
(1)

Areas where these two differences are equal represent potential hypocenters, known as the EDT surface. In this study, we postulate the probability of the hypocenter to follow a Gaussian probability distribution with respect to Dt, represented as:

$$ H_{g,i,j} = \frac{1}{{\sigma \sqrt {2\pi } }} {\text{e}}^{{ - \frac{{Dt_{g,i,j}^{2} }}{{2\sigma^{2} }}}} $$
(2)

Here, \(\sigma\) is a tunable parameter that modifies the probability model according to various factors specific to the study area, such as uncertainty of phase arrivals, and the accuracy of the velocity model. For this research, we assign a value of 1.0 to \(\sigma\). Within the framework of EEWS, we can utilize the differences in P-wave arrivals triggered by each station to assess the probability of the hypocenter. As the count of triggered P arrivals for a significant earthquake increases, we can progressively update the probabilities of the hypocenter, Q, at location, g, by multiplying together all probability functions for all pairs:

$$ Q_{g} = \mathop \prod \limits_{i = 1}^{N - 1} \mathop \prod \limits_{j = i + 1}^{N} H_{g,i,j} $$
(3)

Or,

$$ Q_{g} = \left( {\frac{1}{{\sigma \sqrt {2\pi } }}} \right)^{{N\left( {N - 1} \right)/2}} {\text{e}}^{{ - \frac{{\mathop \sum \nolimits_{i = 1}^{N - 1} \mathop \sum \nolimits_{j = i + 1}^{N} Dt_{g,i,j}^{2} }}{{2\sigma^{2} }}}} $$
(4)

In the above equations, 'N' denotes the total number of triggered P arrivals associated with the earthquake event. The location associated with the highest probability, based on the triggered P arrivals, is deemed the optimal hypocenter of the event.
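The grid search implied by Eqs. (1) to (4) can be sketched in a few lines. The sketch works with the logarithm of the pairwise product, which avoids numerical underflow when many station pairs are multiplied and does not change the location of the maximum. The function names and the dictionary layout are hypothetical, and skipping grid points that lack a pre-calculated arrival for any triggered station is an assumption made to keep the comparison between grid points fair.

```python
import math
from itertools import combinations


def edt_log_probability(obs, calc_g, sigma=1.0):
    """Log of the product of Gaussian terms H over all station pairs at one
    grid point g.

    obs    : {station: observed P arrival time in s}
    calc_g : {station: calculated P arrival (travel) time in s from g}
    """
    if any(s not in calc_g for s in obs):
        return float("-inf")  # assumed rule: grid point cannot be evaluated
    log_norm = -math.log(sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, j in combinations(sorted(obs), 2):
        # Eq. (1): observed difference minus calculated difference
        dt = (obs[i] - obs[j]) - (calc_g[i] - calc_g[j])
        # log of Eq. (2)
        total += log_norm - dt * dt / (2.0 * sigma ** 2)
    return total


def best_grid_point(obs, calc, sigma=1.0):
    """Return the grid point id with the highest (log) probability."""
    return max(calc, key=lambda g: edt_log_probability(obs, calc[g], sigma))


# Two candidate grid points and three triggered stations (synthetic values).
calc = {
    "A": {"s1": 1.0, "s2": 2.0, "s3": 3.5},
    "B": {"s1": 2.0, "s2": 1.0, "s3": 1.5},
}
obs = {s: 10.0 + t for s, t in calc["A"].items()}  # origin time of 10 s
best = best_grid_point(obs, calc)
```

Because only arrival-time differences enter the calculation, the unknown origin time (10 s in the synthetic example) cancels out, which is the property that makes EDT attractive for early warning.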

Recognized for its robustness, the EDT method has been adopted for real-time earthquake location by grid searching within a layered velocity model (Satriano et al. 2007, 2008). In regions characterized by pronounced topographic differences and intricate velocity structures, like Taiwan, adopting a 3D velocity model can enhance the precision of source localization. Consequently, we have incorporated the EDT approach for real-time hypocenter determination within a 3D velocity model. Currently, the monitoring areas of the EEWS in Taiwan are delineated by longitudes 119°–123° and latitudes 21°–26°, extending to a depth of 100 km. We have divided this range into grid points, which act as potential hypocenters for EDT localization. On the horizontal plane, the grid points are partitioned based on station density, with spacings of 0.2°, 0.4°, and 0.8° as station density coverage decreases. Vertically, the grid is divided into intervals of 2 km from 2 to 20 km, 4 km from 34 to 70 km, and 6 km from 76 to 100 km. To enhance location accuracy in areas of high seismic activity, we have halved the horizontal grid spacing larger than 0.2° within a 10 km radius of the hypocenters of earthquakes greater than magnitude 4.0 that occurred between 1990 and October 2023 (Fig. 1a). Figure 1b shows the distribution of grid points at a depth of 10 km.

Due to the crucial need for rapid earthquake alerts, EEWS primarily utilizes P-wave recordings from stations in closest proximity to the earthquake source for source parameter assessments (Chen et al. 2015, 2019). In our study, we compute the P-wave arrivals from hypocenter grid points to the 30 nearest stations. If the temporal difference between P-wave arrivals at the first and last station is less than 2.0 s, we expand the number of stations used for calculation, but not exceeding 60 stations. The P-wave arrivals for each grid point are then stored by station for later use in source localization. In particular, station i stores: {\(Ct_{1,i}\), \(Ct_{2,i}\), …, \(Ct_{M,i}\)}, where 1, 2, …, M index all grid points for which a P arrival at station i was calculated. Additionally, each station maintains a list of neighboring stations. Thus, station i also retains: \(S_{i}\) = {\(s_{1}\), \(s_{2}\), …, \(s_{k}\)}, where the closest 'k' stations to station i are stored, with 'k' being 60 in this study. The time difference ranges for each station pair on potential grids are also retained.

$$ [{\text{min}}(Dt_{1,i,k} ,Dt_{2,i,k} , \ldots , Dt_{N,i,k} ) - {\text{Tt}},{\text{ max}}(Dt_{1,i,k} ,Dt_{2,i,k} , \ldots , Dt_{N,i,k} ) + {\text{Tt}}] $$

Here, 'N' is the number of potential hypocenter grid points whose stored P arrivals include both station i and station k. Tt is the tolerable time difference, set to 0.5 s in this study. Typically, EEWS employs both temporal and spatial criteria to trigger P-wave arrivals for phase association, facilitating rapid hypocenter localization. In our study, the list of neighboring stations and the range of arrival time differences serve as the spatial and temporal criteria for the phase association of triggered P-wave arrivals in our EDT location. The aforementioned materials are pre-calculated and stored. When the code runs, they are loaded into memory for rapid hypocenter localization.
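One possible reading of the pre-calculated time-difference ranges is sketched below: for every station pair, the minimum and maximum calculated P arrival difference over all grid points shared by the two stations is stored, widened by Tt. The function names and data layout are hypothetical, and the interpretation that the ranges are built from calculated (not observed) arrivals is an assumption based on the statement that these quantities are pre-calculated.

```python
def pair_time_ranges(calc, tolerance=0.5):
    """Build {(i, k): (min - Tt, max + Tt)} of calculated arrival differences
    t_i - t_k over all grid points that store arrivals for both stations.

    calc : {grid_id: {station: calculated P arrival time in s}}
    Pairs are keyed with station names in sorted order.
    """
    ranges = {}
    for times in calc.values():
        names = sorted(times)
        for a in range(len(names)):
            for b in range(a + 1, len(names)):
                i, k = names[a], names[b]
                d = times[i] - times[k]
                lo, hi = ranges.get((i, k), (d, d))
                ranges[(i, k)] = (min(lo, d), max(hi, d))
    return {pair: (lo - tolerance, hi + tolerance)
            for pair, (lo, hi) in ranges.items()}


def pair_is_plausible(ranges, i, t_i, k, t_k):
    """Check whether an observed arrival difference fits the stored range."""
    if (i, k) in ranges:
        lo, hi = ranges[(i, k)]
        return lo <= t_i - t_k <= hi
    if (k, i) in ranges:
        lo, hi = ranges[(k, i)]
        return lo <= t_k - t_i <= hi
    return False  # the two stations share no grid point


calc = {"g1": {"A": 1.0, "B": 2.0}, "g2": {"A": 3.0, "B": 2.5}}
ranges = pair_time_ranges(calc)
```

Storing only two numbers per station pair keeps the association test to a dictionary lookup and two comparisons at run time.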

The methodologies described above collectively form our rapid hypocenter location workflow for EEWS (Fig. 2). Initially, real-time seismic recordings from the stations undergo P phase-detection via RED-PAN. If a P phase meets both the amplitude and probability thresholds, it proceeds to the next step of earthquake event identification.

  (1) If there is no earthquake event, a new event is established with the triggering station.

  (2) If existing events are present, it is determined whether the triggered P phase belongs to one of these earthquake events.

For rapid hypocenter determination in EEWS, the association of triggered P arrivals typically hinges on their temporal and spatial relationships (Chen et al. 2019). In the present study, the temporal and spatial criteria are as follows:

  (1) If the station triggering the P-wave already exists in an earthquake event, the new pick does not belong to that event.

  (2) If the triggering station, s, is within the intersection of the neighboring station sets of the stations in the event, s \(\in \bigcap\nolimits_{i = 1}^{n} {S_{i} }\), where n is the number of stations in the event, and the P-wave arrival time difference is within the range of the minimum and maximum time differences across all stations, then it will be added to the event. These conditions are for stations that are close to the hypocenter.

  (3) If the P arrival at a triggered station occurs within Td seconds of the estimated P arrival from a hypocenter on the list, it will be incorporated into the event. In this study, Td represents the allowable time error and is set to 1.0 s. This criterion is specifically for stations that are not close to the hypocenter. The estimated P-wave arrivals are calculated using the layered velocity model of Taiwan.

If the triggered P arrival is not associated with any existing event, it initiates a new event. Otherwise, the triggered P arrival is used to estimate or update the probabilities of potential hypocenter grids through EDT. After updating the potential hypocenter probabilities, the system checks whether the event reporting conditions are met.
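The three association criteria can be sketched as a single decision function. All names and data structures here are hypothetical. The sketch also assumes that the time-difference test in criterion (2) is applied to each pair formed by the new station and a station already in the event, and that the range table is keyed by ordered pairs `(a, b)` holding the allowed range of `t_a - t_b`.

```python
def belongs_to_event(station, t_pick, event_picks, neighbors, pair_ranges,
                     predicted=None, td=1.0):
    """Decide whether a newly triggered P arrival joins an existing event.

    event_picks : {station: P arrival time} already in the event
    neighbors   : {station: set of neighboring stations}
    pair_ranges : {(a, b): (lo, hi)} allowed range of t_a - t_b
    predicted   : {station: predicted P arrival from the current hypocenter}
    td          : allowable time error in s (1.0 s in the text)
    """
    # Criterion (1): a station contributes at most one P pick per event.
    if station in event_picks:
        return False

    # Criterion (2): near-source stations. The new station must be a neighbor
    # of every station in the event, and every pairwise arrival difference
    # must fall inside its pre-calculated range.
    if event_picks and all(station in neighbors.get(m, set())
                           for m in event_picks):
        in_range = True
        for m, t_m in event_picks.items():
            rng = pair_ranges.get((station, m))
            if rng is None or not (rng[0] <= t_pick - t_m <= rng[1]):
                in_range = False
                break
        if in_range:
            return True

    # Criterion (3): distant stations. Compare with the arrival predicted
    # from the current hypocenter estimate.
    if predicted and station in predicted:
        return abs(t_pick - predicted[station]) <= td
    return False
```

A pick rejected by every existing event would then start a new event, which is what allows two earthquakes that are close in time to be tracked separately.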

The conditions based on the current EEWS in Taiwan are applied as follows:

  (1) The earthquake event should include a minimum of 5 P-wave arrivals.

  (2) If the azimuthal gap of data is larger than 220°, at least 10 P-wave arrivals are required.

  (3) The root mean square (RMS) differential between the observed and calculated P arrivals at the grid point, g, with the highest likelihood, should fall below a specific threshold. In this study, the threshold is set at 0.3 s.

    $$ {\text{RMS}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {Ot_{i} - Ct_{g,i} } \right)^{2} } $$
    (5)

Here, 'n' refers to the number of picks used in hypocenter estimation. Additionally, to remove outdated earthquake events, the system performs a check on existing events every second. If an existing event has not recorded any new P-wave arrivals for more than 30 s, that event is then deleted.
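The reporting conditions can be sketched as follows. The function names are hypothetical, and the azimuth calculation uses a flat-earth approximation chosen for brevity, which is an assumption and not a detail given in the source. The observed and calculated arrivals passed to the RMS function are assumed to share the same time reference.

```python
import math


def azimuthal_gap(epi_lon, epi_lat, stations):
    """Largest angular gap (degrees) between stations as seen from the
    epicenter. `stations` is a list of (lon, lat) pairs in degrees."""
    scale = math.cos(math.radians(epi_lat))  # shrink longitude differences
    az = sorted(
        math.degrees(math.atan2((lon - epi_lon) * scale, lat - epi_lat)) % 360.0
        for lon, lat in stations
    )
    if len(az) < 2:
        return 360.0
    gaps = [b - a for a, b in zip(az, az[1:])]
    gaps.append(360.0 - az[-1] + az[0])  # wrap-around gap
    return max(gaps)


def rms_residual(observed, calculated):
    """Eq. (5): RMS of observed minus calculated P arrivals."""
    n = len(observed)
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(observed, calculated)) / n)


def ready_to_report(n_picks, gap_deg, rms_s, min_picks=5, gap_limit=220.0,
                    min_picks_large_gap=10, rms_limit=0.3):
    """Apply conditions (1) to (3) with the thresholds quoted in the text."""
    if n_picks < min_picks:
        return False
    if gap_deg > gap_limit and n_picks < min_picks_large_gap:
        return False
    return rms_s < rms_limit


# Four stations placed north, east, south and west of the epicenter.
gap = azimuthal_gap(121.0, 24.0,
                    [(121.0, 25.0), (122.0, 24.0), (121.0, 23.0), (120.0, 24.0)])
```

The check is cheap, so it can be repeated after every hypocenter update without adding noticeable latency.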

3 Results

To evaluate the efficiency and accuracy of our developed rapid hypocenter location algorithm, we applied it to earthquakes in Taiwan from January to October 2023 that met the early warning criteria (Fig. 4; Table 1). We compared our hypocenter estimates with those from the CWA earthquake catalog and its EEWS (Table 1). The CWA hypocenter locations are determined using all available P and S picks from the Central Weather Administration Seismographic Network (CWBSN), which are manually selected and verified by experts. The hypocenter locations in the CWA catalog are primarily derived from either a layered velocity model or the 3D velocity model by Wu et al. (2009), depending on the misfit of the location results. The magnitude of earthquakes is based on these CWA catalog hypocenter locations. Therefore, we used the hypocenter locations from the CWA catalog as a benchmark for optimal hypocenter location. Figure 4 shows the differences in hypocenter locations determined by different methods. Generally, our algorithm provides more accurate hypocenter estimates compared to those from the CWA EEWS. In regions with a high density of seismic stations and extensive azimuthal coverage, our algorithm typically estimated hypocenters using data from only 5 P arrivals, with calculation times for hypocenter estimates ranging from 0.11 to 0.55 s (Table 1). For earthquakes in regions requiring more observations for reliable hypocenter estimates, the calculation time was longer but still within 1.5 s (Table 1).

Fig. 4

Comparisons of hypocenter estimates between different methods. a Comparison between the earthquake catalog of the CWA (solid blue circles) and its EEWS (open black circles). b Comparison between the CWA earthquake catalog (solid blue circles) and our rapid hypocenter estimation algorithm (open red circles)

Table 1 Comparison of hypocenter estimates among different methods

Since 2015, the CWA has been annually enhancing the real-time transmission capabilities of the stations in the Taiwan Strong Motion Instrumentation Program network (TSMIP). The primary objective of TSMIP is to provide detailed seismic intensity information, particularly in densely populated areas, resulting in a station distribution that is highly correlated with population density (Fig. 1b). Our algorithm proves to be effective for earthquakes occurring in areas with dense station coverage, facilitating rapid and accurate hypocenter estimations. This can significantly contribute to reducing the time required for earthquake early warnings in densely populated regions. For instance, the ML 5.6 earthquake on September 5, 2023 (Fig. 5; Event 1 in Fig. 1b) demonstrated the effectiveness of our algorithm in an area with dense station distribution near the hypocenter. Our algorithm required only about 0.11 s to estimate the hypocenter, with the hypocenter differences compared to the CWA catalog being less than 1 km (Table 1).

Fig. 5

Comparison of hypocenters for the September 5, 2023, ML 5.6 earthquake (Event 1 in Fig. 1b). a Normalized probability maps of the hypocenter as derived from our algorithm. Hypocenters determined by different methods are indicated: CWA catalog (black star), CWA EEWS (black hollow star), and our algorithm (yellow star). b Seismic recordings utilized for rapid hypocenter estimation by our algorithm, showcasing the probability functions of P-wave arrivals and the P-wave picks for the earthquake

In determining hypocenter locations, large azimuthal gaps in data typically lead to increased uncertainty. To achieve more reliable hypocenter estimations, both our algorithm and the CWA EEWS have established criteria concerning the azimuthal gap of the data. In Taiwan, events with large azimuthal gaps frequently occur outside the coverage of the seismic network. Additionally, earthquakes occurring in mountainous areas sometimes exhibit large azimuthal gaps, owing to the limited number of seismic stations in these regions (Fig. 1b). A case in point is the ML 5.8 earthquake on October 11, 2023 (Fig. 6, Event 2 in Fig. 1b). The earthquake was located on the mountain side, while the majority of seismic stations were in the valley (Fig. 1b). The initial hypocenter estimate for this event, based solely on data from 5 P arrivals, showed low depth precision (Fig. 6a). However, once the event met the reporting criteria, we were able to achieve a more accurate hypocenter estimation (Fig. 6b). Our results consistently indicated smaller hypocentral difference compared to that of the CWA EEWS.

Fig. 6

Comparison of hypocenter results for the October 11, 2023, ML 5.8 earthquake (Event 2 in Fig. 1b). a Normalized probability maps based on 5 P-wave arrivals and b on 10 P-wave arrivals, as derived from our algorithm. Hypocenters determined by various methods are marked: CWA catalog (black star), CWA EEWS (black hollow star), and our algorithm (yellow star). c Showcases seismic recordings used for our rapid hypocenter estimation, the probability functions of P-wave arrivals and the P-wave picks for the earthquake

The high seismicity in Taiwan sometimes leads to multiple earthquakes occurring closely in time and space. The CWA EEWS currently uses an STA/LTA-based method for P-phase detection. STA/LTA relies mainly on changes in the amplitude ratio and may therefore fail to pick phases for events that occur in close succession. Furthermore, the CWA EEWS is currently unable to localize multiple earthquake events simultaneously. As a result, earthquakes occurring in close temporal proximity may not be accurately located, leading to missed alerts. An example of this is the ML 5.3 offshore earthquake on September 15, 2023 (see Fig. 7c, d, Event 4 in Fig. 1b), which occurred merely 5 s after an ML 4.4 offshore earthquake (Fig. 7a, b, Event 3 in Fig. 1b). Due to their close temporal and spatial proximity, the waveforms of these earthquakes overlapped. The STA/LTA-based method predominantly detected the P arrivals of the first earthquake, neglecting the second event, which satisfied the early warning criteria. In contrast, our machine-learning-based phase-picking algorithm, RED-PAN, successfully identified most P-wave arrivals for both events, and these picks were utilized to estimate the hypocenters (Fig. 7). This highlights the potential of our algorithm to reduce the risk of missed alerts in areas with high seismic activity, such as Taiwan.

Fig. 7

a Comparison of hypocenter results for the September 15, 2023, ML 4.4 earthquake (Event 3 in Fig. 1b) and c the subsequent ML 5.3 earthquake occurring approximately 5 s later (Event 4 in Fig. 1b). Displayed is a normalized probability map of the hypocenter estimates derived from our algorithm. The hypocenters as determined by different methods are marked: CWA catalog (black star), CWA EEWS (black hollow star), and our algorithm (yellow star). Notably, the CWA EEWS missed the ML 5.3 event. b, d Seismic recordings of the two earthquakes used for our rapid hypocenter estimation, and the probability functions of P-wave arrivals. P3 and P4 denote the P-wave arrivals for events 3 and 4 detected by RED-PAN, respectively

4 Discussion

In EEWS, the rapid and accurate estimation of seismic source parameters is crucial for determining whether to issue an alert. Therefore, rapid and precise identification and picking of P-wave arrivals are critical. Traditional phase pickers, such as the widely used STA/LTA (Allen 1982), can generate fast and robust P picks with properly tuned parameters for seismic recordings containing a single earthquake (Table 1). However, machine learning-based models are adept at learning features in seismic recordings indicative of P-waves, eliminating the need to adjust model parameters for individual stations (e.g., Zhu and Beroza 2018; Mousavi et al. 2020; Liao et al. 2021). Moreover, the performance of machine learning models across various scenarios can be enhanced by increasing training data or through data augmentation specific to those scenarios. Figure 8 shows the P-wave picking results of RED-PAN and another machine learning method, EqTransformer (Mousavi et al. 2020), for seismic recordings in different scenarios. RED-PAN, trained with various data augmentation techniques, performs better in both the number and the speed of P-wave detections (Fig. 8). This indicates that even machine learning-based methods require data augmentation to improve performance, including the rapid detection and the identification of P-waves in complex waveforms (Liao et al. 2022). We have also applied the RED-PAN model to the seismic recordings of the 2016 Kaikoura earthquake (Fig. 8d), where the seismic recordings of New Zealand were not included in our training dataset. This demonstrates the generalizability of RED-PAN without the need for parameter adjustments. In this study, the RED-PAN model demonstrates the ability to rapidly identify P-waves in both real-time and complex waveform data (Fig. 7). This capability is largely attributed to the data augmentation methods used during training, which include an increased amount of data relevant to such situations (Liao et al. 2022).
Given the high seismicity in Taiwan, closely timed earthquakes may occur (Fig. 7). Consequently, the RED-PAN model is able to accurately identify P-wave arrival times in complex waveform data, aiding in hypocenter localization for early warning systems and thus reducing the likelihood of missed alerts.
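
For readers unfamiliar with the traditional picker mentioned above, the following is a minimal sketch of an STA/LTA trigger. It is not the implementation evaluated in this study: the window lengths (`n_sta`, `n_lta`), the trigger threshold, and the synthetic trace are illustrative assumptions, and both averaging windows are taken to end at the current sample.

```python
import numpy as np

def sta_lta_ratio(trace, n_sta, n_lta, eps=1e-12):
    """Short-term / long-term average of signal energy.

    Both windows end at the current sample; samples before the first
    full long-term window keep a ratio of zero.
    """
    energy = np.asarray(trace, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    ratio = np.zeros(energy.size)
    end = np.arange(n_lta, energy.size + 1)  # exclusive window end per valid sample
    sta = (csum[end] - csum[end - n_sta]) / n_sta
    lta = (csum[end] - csum[end - n_lta]) / n_lta
    ratio[end - 1] = sta / np.maximum(lta, eps)
    return ratio

def first_trigger(ratio, threshold):
    """Index of the first sample whose ratio reaches the threshold, or None."""
    hits = np.flatnonzero(ratio >= threshold)
    return int(hits[0]) if hits.size else None

# Synthetic example (assumed values): weak background, strong onset at sample 500.
t = np.arange(1000)
trace = 0.01 * np.sin(0.7 * t)
trace[500:] += np.sin(0.7 * t[500:])
onset = first_trigger(sta_lta_ratio(trace, n_sta=10, n_lta=200), threshold=4.0)
print("first trigger at sample", onset)
```

The window lengths and threshold are exactly the station-dependent parameters that need tuning in practice, which is the burden the machine learning pickers discussed above avoid.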

Fig. 8
The figure illustrates the performance of two machine learning-based phase pickers, RED-PAN and EqTransformer (Mousavi et al. 2020), in picking P-waves at different probability thresholds using seismic waveforms from various events: (a) Event 1, (b) Event 3, (c) Event 4, and (d) the 2016 Kaikoura earthquake. For each event, the bar charts display the number of triggered stations out of the nearest 20 stations at P-wave probability thresholds ranging from 0.1 to 0.8 (blue bars for EqTransformer and red bars for RED-PAN). The line graphs show the average trigger time required for the stations to detect P-waves, along with the standard deviation (triangles for EqTransformer and circles for RED-PAN).

In EEWS, epicenter estimates are generally faster than hypocenter estimates. However, in regions where earthquake depths span a wide range, such as Japan, magnitude estimates based on epicentral distance follow different linear relationships depending on earthquake depth (Sokolov et al. 2009). Several studies have employed hypocentral distance rather than epicentral distance for earthquake magnitude estimates in EEWS (e.g., Wu and Zhao 2006; Chen et al. 2017, 2019; Wu et al. 2023). This preference is likely due to the increased density of seismic arrays, which allows for rapid estimation of the hypocenter and therefore a more accurate representation of the spatial distance between the source and the seismometer. Consequently, providing a more accurate hypocenter location in (near) real-time can enhance the reliability of magnitude assessments. Additionally, the hypocenter is a critical input for intensity and ground shaking estimates in various EEWS applications (e.g., Iwakiri et al. 2011; Suzuki et al. 2017; Kodera et al. 2018; Chen et al. 2019; Saunders et al. 2022). When the hypocenter of an earthquake is mislocated or inaccurately estimated, errors in the predicted ground motion intensity can follow. In our application to Taiwan, EDT is applied to P-wave arrivals predicted with a 3D velocity model. This approach generally offers more accurate hypocenter estimates (Fig. 4; Table 1). Such accuracy helps improve the precision of subsequent estimates of earthquake magnitude and intensity in EEWS.
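
The difference between the two distance measures can be made concrete with a small worked example. This is a sketch under simplifying assumptions that are not taken from the study: a flat-Earth geometry, a station at zero elevation, and illustrative distances.

```python
import math

def hypocentral_distance_km(epicentral_km, depth_km):
    """Straight-line source-to-station distance for a station at the surface.

    Flat-Earth approximation; station elevation is ignored (assumption).
    """
    return math.hypot(epicentral_km, depth_km)

# Illustrative values: the same 30 km epicentral distance at two depths.
for depth in (5.0, 40.0):
    r = hypocentral_distance_km(30.0, depth)
    print(f"depth {depth:4.1f} km -> hypocentral distance {r:5.1f} km")
```

For the deeper source the hypocentral distance (50 km) is well above the 30 km epicentral distance, so an attenuation relationship driven by epicentral distance alone would treat the station as closer to the source than it really is.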

Currently, the hypocenter location algorithm in the EEWS of Taiwan is designed to track only one earthquake at a time and continues to monitor that event for 60 s. Consequently, for earthquakes that occur in quick succession (within 60 s), the system is unable to estimate the hypocenters of the subsequent earthquakes, even if they are significantly larger. Given Taiwan's high seismicity, scenarios such as a moderate earthquake occurring shortly before a larger one (Liao et al. 2022), or multiple warning-level earthquakes happening soon after a major earthquake (Chen et al. 2019), have already occurred. Our algorithm uses temporal and spatial conditions to associate the triggered P-wave arrivals with potential earthquakes. It is capable of estimating sources for more than one earthquake simultaneously, which could enhance the performance of EEWS in Taiwan.
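
To illustrate what associating picks by temporal and spatial conditions can look like, here is a deliberately simple greedy sketch. It is not the association rule used in this study: the function name, the pick tuple layout, and the thresholds `max_dt_s` and `max_dist_km` are all hypothetical choices made for the example.

```python
import math

def associate_picks(picks, max_dt_s=10.0, max_dist_km=100.0):
    """Greedy grouping of P picks into candidate events.

    picks: iterable of (station, x_km, y_km, t_s).
    A pick joins the first group whose earliest pick is within max_dt_s
    in time and max_dist_km in distance; otherwise it opens a new group.
    Both thresholds are hypothetical illustration values.
    """
    groups = []
    for pick in sorted(picks, key=lambda p: p[3]):
        _, x, y, t = pick
        for group in groups:
            _, x0, y0, t0 = group[0]  # earliest pick of the group
            if t - t0 <= max_dt_s and math.hypot(x - x0, y - y0) <= max_dist_km:
                group.append(pick)
                break
        else:
            groups.append([pick])
    return groups

# Two overlapping-in-time events about 300 km apart (made-up coordinates).
picks = [
    ("A", 0.0, 0.0, 0.0),
    ("B", 20.0, 0.0, 2.0),
    ("C", 300.0, 0.0, 1.0),
    ("D", 310.0, 10.0, 3.5),
]
for group in associate_picks(picks):
    print([station for station, *_ in group])
```

Each resulting group can then be passed to the location step independently, which is what allows more than one source to be tracked at the same time.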

While our proposed method has demonstrated improvements in hypocenter estimates, there are situations where it may fail or underperform. It is crucial to be aware of the method's limitations and to adjust and test it as needed for specific applications. The EDT method utilizes the arrival time differences among P-wave arrivals from triggered stations rather than relying on individual arrivals. This approach can provide more constraints for hypocenter estimates when observations are limited, such as in earthquake early warning scenarios (e.g., Satriano et al. 2007, 2008). However, when the number of observations is very high, the EDT method may struggle to maintain computational efficiency, since N picks yield N(N-1)/2 station pairs to evaluate. Currently, we use a grid system for potential hypocenter locations and calculate travel times based on a 3D velocity model to improve hypocenter estimates in regions with complex velocity structures (e.g., Huang et al. 2014). However, this approach imposes resolution limitations on location assessments. The precision of hypocenter localization is constrained by the grid spacing; finer grids can improve resolution but also increase computational demand. We propose using station density and seismicity as references for grid planning, recognizing the trade-off between resolution and computational efficiency. In regions with significant gaps in station distribution, the EDT method, like other source location algorithms, suffers from large errors: where azimuthal gaps are large, the differential arrival times provide limited constraints on the hypocenter probabilities, resulting in larger errors in the estimated hypocenter.
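
The grid-based EDT idea described above can be sketched in a few lines. This is a minimal illustration, not the code used in this study: it assumes a homogeneous P velocity in place of the 3D velocity model, a Gaussian weighting of the differential-time residual with an assumed width `sigma`, and a made-up station layout and grid.

```python
import itertools
import numpy as np

def edt_stack(travel_times, picks, sigma=0.2):
    """EDT stack per grid node.

    travel_times: (n_grid, n_sta) predicted P travel times in seconds.
    picks: (n_sta,) observed P arrival times in seconds.
    For every station pair, a node scores high when the observed arrival
    time difference matches the predicted travel time difference; the
    unknown origin time cancels in the difference.
    """
    stack = np.zeros(travel_times.shape[0])
    for i, j in itertools.combinations(range(len(picks)), 2):
        residual = (picks[i] - picks[j]) - (travel_times[:, i] - travel_times[:, j])
        stack += np.exp(-residual**2 / (2.0 * sigma**2))
    return stack

# Candidate hypocenters on a 5 km grid (illustrative extent, km).
xs = np.arange(0.0, 41.0, 5.0)
ys = np.arange(0.0, 41.0, 5.0)
zs = np.arange(0.0, 31.0, 5.0)
grid = np.array([(x, y, z) for x in xs for y in ys for z in zs])

# Surface stations (made-up coordinates, km).
stations = np.array([[0, 0, 0], [40, 5, 0], [10, 40, 0],
                     [35, 35, 0], [20, 20, 0]], dtype=float)

# Precomputed travel-time table; a homogeneous 6 km/s medium stands in
# for the 3D velocity model (assumption).
v_p = 6.0
travel_times = np.linalg.norm(grid[:, None, :] - stations[None, :, :], axis=2) / v_p

# Synthetic picks from a source on the grid with a 3 s origin time.
true_node = np.array([20.0, 15.0, 10.0])
true_idx = int(np.flatnonzero((grid == true_node).all(axis=1))[0])
picks = 3.0 + travel_times[true_idx]

best = grid[np.argmax(edt_stack(travel_times, picks))]
print("best grid node (km):", best)
```

The loop visits N(N-1)/2 station pairs for N picks and evaluates each pair at every grid node, so the cost scales with both the number of picks and the number of nodes. The sketch also shows why the location can never be finer than the grid spacing: the estimate is always one of the predefined nodes.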

Earthquake early warning systems are a crucial disaster mitigation tool in regions with high seismic risk. However, these systems still face numerous challenges. In this study, we integrate the advantages of the machine learning-based phase picker, RED-PAN, and the EDT method for data processing and hypocenter estimation to enhance the performance of earthquake early warning systems. (1) The implementation of RED-PAN has significantly improved the robustness of P-wave phase picking, particularly in complex seismic waveforms (Fig. 7), as well as the rapid detection of P arrivals (Fig. 8). (2) EDT utilizes the arrival time differences of P-waves among different stations, providing better constraints for hypocenter estimation when the number of stations is limited. (3) By using a three-dimensional velocity model, we pre-calculate and store the travel times from potential hypocenter grid points to stations. This approach enhances the accuracy and computational speed of hypocenter estimation, achieving near real-time hypocenter estimates suitable for earthquake early warning applications. For future developments, optimization techniques such as parallel processing and GPU acceleration should be explored to enhance the performance of both the phase picking and the localization algorithms. When applying our algorithm to other earthquake early warning systems, RED-PAN can first be applied to test data to evaluate its phase picking. Because RED-PAN was trained with diverse types of data and various data augmentation techniques, the model exhibits good generalization capabilities (Fig. 8d). RED-PAN can also serve as a pre-trained model, requiring only a small amount of regional seismic waveform data to provide robust and accurate P-wave detection results, further improving the reliability and effectiveness of earthquake early warning systems.

5 Conclusion

This study combines the advantages of machine learning models in seismic phase picking with the strengths of EDT in earthquake source localization, applying these to rapid hypocenter estimates in EEWS. The machine learning model used in this study, RED-PAN, utilizes data augmentation techniques during its training phase to enhance its ability to identify P-waves in continuous and real-time data (Liao et al. 2022). Consequently, RED-PAN can provide rapid and accurate P-wave arrivals in complex seismic records, potentially reducing the incidence of missed P-wave arrivals for closely timed earthquakes. Upon detecting P-waves that reach the triggering threshold, we categorize them based on temporal and spatial conditions. For each set of P-wave arrivals, hypocenter estimation is conducted using EDT, which relies on the differences in P-wave arrivals between stations. EDT is performed on the grid points of potential earthquakes in the study area, which are configured according to the distribution density of stations and other considerations, such as the seismicity of the monitored area. The developed rapid hypocenter location algorithm has been applied to several earthquakes that meet the early warning criteria in Taiwan. Generally, the results demonstrate more reliable hypocenter estimates than those provided by the current EEWS of Taiwan. In Taiwan, seismic velocity varies significantly across areas due to its complex geological structures (e.g., Huang et al. 2014). Employing a 3D velocity model for calculating predicted P-wave arrivals at potential source grid points could enhance the accuracy of hypocenter estimations. In areas with dense station distribution, reliable hypocenter estimates can be rapidly achieved due to the availability of more P-wave arrival observations in a short time. These more accurate hypocenter estimates could improve subsequent earthquake magnitude and ground shaking assessments in EEWS (e.g., Chen et al. 2017, 2019; Kodera et al. 2018; Saunders et al. 2022; Wu et al. 2023). Furthermore, our method is capable of simultaneously localizing multiple earthquake events that occur in close succession, offering a significant advantage in reducing missed alerts in Taiwan and other regions with frequent seismic activity.

Availability of data and materials

The seismic waveform data utilized in this study were sourced from the Taiwan Seismological and Geophysical Data Management System (GDMS), which is managed by the Central Weather Administration (CWA) (Central Weather Administration, 2012). The website for accessing this data is https://gdms.cwa.gov.tw/ (last accessed in March 2024).

Abbreviations

EEWS:

Earthquake Early Warning System

EDT:

Equal differential time

CWA:

Central Weather Administration

STA/LTA:

Short-term average/long-term average

RED-PAN:

Real-time earthquake detection and phase picking with multitask attention network

RMS:

Root mean square

TSMIP:

Taiwan Strong Motion Instrumentation Program network

Acknowledgements

The authors would like to acknowledge the Science College of National Cheng Kung University (NCKU Science) and the National Science and Technology Council (NSTC), Taiwan, Republic of China, for a fellowship to support Wu-Yu Liao’s Ph.D. study. The authors also thank the National Center for High Performance Computing (NCHC) in Taiwan, for providing some computational and storage resources.

Funding

Jia-Xiang Lian, Wu-Yu Liao, and En-Jui Lee are supported by the National Science and Technology Council, R.O.C., under contract 112-2116-M-006-014. This research was partly supported by the Central Weather Administration, R.O.C., under contract MOTC-CWB-112-E-06.

Author information

Contributions

Jia-Xiang Lian: Algorithm improvement, programming, program application testing, and result comparison. Wu-Yu Liao: Data processing, machine learning model training. En-Jui Lee: Conceptualization of algorithms, data processing, analysis of test results, manuscript writing. Da-Yi Chen: Algorithm discussion, providing CWA result comparisons. Po Chen: Algorithm discussion, manuscript writing and discussion.

Corresponding author

Correspondence to En-Jui Lee.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lian, JX., Liao, WY., Lee, EJ. et al. Integration of Machine learning and equal differential time method for enhanced hypocenter localization in earthquake early warning systems: application to dense seismic arrays in Taiwan. Earth Planets Space 76, 94 (2024). https://doi.org/10.1186/s40623-024-02037-0

  • DOI: https://doi.org/10.1186/s40623-024-02037-0

Keywords