Estimating errors in autocorrelation functions for reliable investigations of reflection profiles

Abstract

Autocorrelation functions (ACFs) of vertically incident seismic waves are used to image subsurface reflectors. However, the reflection responses derived from ACFs usually contain many false signals. We present a method to quantify the errors in ACFs and extract true reflectors with high reliability. We estimated the errors for each earthquake at each station as follows. We calculated the amplitude of the observed waveform within the noise window and generated 1000 random noise traces that have this amplitude. By subtracting the random noise traces from the observed waveform, we created 1000 candidate earthquake waveforms. We computed the ACF for each of the 1000 waveforms and calculated the ensemble average and standard deviation of the 1000 different ACF amplitudes at each lag time. Then, we applied weighted stacking to the ACFs of many earthquakes to obtain the reflection response at the station. We calculated the standard deviation of the weighted stack to estimate errors in the reflection response. We evaluated the method by applying it to seismic data from the metropolitan area of Japan. The subsurface structure of the study area has been studied extensively and consists of a strong velocity discontinuity between sedimentary and basement layers. Following our method, the discontinuity was imaged as a clear reflector with an amplitude that was substantially greater than three times the standard deviation, which corresponds to statistical significance at the 99% confidence level. At other depths where reflectors are not expected to be present, the amplitudes of the peaks were less than or close to three times the standard deviation. The signal of the discontinuity was clearly visible at frequencies below 10 Hz and was less prominent at higher frequencies.

Introduction

Seismic interferometry (Campillo and Paul 2003; Shapiro and Campillo 2004) has been used extensively to study subsurface structures in various fields of research. It can be used to retrieve a Green's function between two stations by cross-correlating the seismic waveforms at the stations caused by earthquakes or ambient noise. Many studies have focused on surface waves, while several studies have focused on body waves, including waves reflected from subhorizontal layer boundaries; either ambient noise (e.g., Roux et al. 2005; Draganov et al. 2007, 2013; Poli et al. 2012) or natural earthquakes (e.g., Abe et al. 2007; Tonegawa et al. 2009; Ruigrok et al. 2010) have been used to retrieve body waves. A pseudoreflection profile can be constructed by plotting the reflected waves on a depth section.

An autocorrelation function (ACF) of the waveform at a single station provides better body wave retrievals (Gorbatov et al. 2013). The ACF gives the reflection response of the station to a virtual shot at the same site. This ACF-based interferometry is called seismic daylight imaging (Schuster et al. 2004) and uses the principle first proposed by Claerbout (1968):

$$R\left(t\right)+R\left(-t\right)=\delta \left(t\right)-T\left(t\right)*T\left(-t\right),$$
(1)

where \(R(t)\) is the reflection response, \(T(t)\) is the transmission response at the station to a vertically incident impulsive wave, \(\delta \left(t\right)\) is the Dirac delta function, and * is a convolution. If the incident wave is impulsive, the observed record and its ACF are approximated by \(T\left(t\right)\) and \(T\left(t\right)*T\left(-t\right)\), respectively. Then, following Eq. (1), \(R(t)\) can be retrieved from the causal part (\(t>0\)) of the difference between the delta function and the ACF. This principle was tested before the widespread use of seismic interferometry (e.g., Scherbaum 1987; Tsutsui 1992; Daneshvar et al. 1995). In recent years, seismic daylight imaging based on natural earthquakes has been widely applied to the study of various targets, including basement (e.g., Watanabe et al. 2011; Plescia et al. 2020), the Moho (e.g., Ruigrok and Wapenaar 2012; Sun and Kennett 2016; Delph et al. 2019), volcanoes (e.g., Chaput et al. 2012; Heath et al. 2018), ice sheets (e.g., Pham and Tkalčić 2017; Yan et al. 2020), the inner core (Wang et al. 2015), and the moon (Nishitsuji et al. 2016). Many studies have also used the ACFs of ambient noise to estimate the reflection response (e.g., Ito et al. 2012; Tibuleac and von Seggern 2012; Gorbatov et al. 2013; Kennett et al. 2015; Oren and Nowack 2017; Saygin et al. 2017).
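The relation in Eq. (1) can be illustrated with a minimal numerical sketch. The toy transmission response below (a unit direct arrival followed by one reverberation of amplitude −0.2) and the function names are hypothetical, and an ideal full-band delta function is assumed rather than a band-limited one.

```python
import numpy as np

def normalized_acf(trace):
    """One-sided autocorrelation (lags >= 0), normalized to 1 at zero lag."""
    trace = np.asarray(trace, dtype=float)
    full = np.correlate(trace, trace, mode="full")
    causal = full[trace.size - 1:]          # zero lag sits at index n - 1
    return causal / causal[0]

def reflection_response(acf_causal):
    """Causal part of delta(t) - ACF, following Eq. (1).

    Assumption: an ideal delta, i.e., 1 at zero lag and 0 elsewhere.
    """
    delta = np.zeros_like(acf_causal)
    delta[0] = 1.0
    return delta - acf_causal

# Hypothetical transmission response: direct arrival plus one reverberation
# of amplitude -0.2 arriving 30 samples later.
toy_transmission = np.zeros(100)
toy_transmission[0] = 1.0
toy_transmission[30] = -0.2

toy_reflection = reflection_response(normalized_acf(toy_transmission))
```

In this toy case the negative ACF peak at a lag of 30 samples maps to a positive peak in the reflection response, which illustrates the sign reversal implied by Eq. (1).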

The reflection response derived from the ACF usually contains many positive and negative peaks; these peaks may indicate the presence of waves reflected from subsurface discontinuities but may also be false signals generated by unwanted waves and noise. Separating true signals from false signals remains a challenge. The false signals are partly attributed to complex media in which multiple scattering occurs. To distinguish multiple scattering from reflection in horizontally stratified media, a random matrix theory (e.g., Aubry and Derode 2009; Shahjahan et al. 2014; Blondel et al. 2018) has been proposed and applied. Both single and multiple scatterings are structural responses to the input earthquake signal. Another cause of false signals is the noise from various origins (e.g., human activity or wind) that overlap the earthquake records. Interpreting the noise as a structural response results in misdetection of subsurface features. In this study, we present a method to evaluate errors in ACFs caused by noise and identify signals in reflection responses on the basis of statistical significance. Our study area is in the metropolitan area of Japan; the subsurface structure of the study area has been examined in numerous studies (e.g., Koketsu and Higashi 1992; Yamanaka and Yamada 2006), including those based on seismic daylight imaging (Yoshimoto et al. 2008; Chimoto and Yamanaka 2020), and consists of a distinct velocity discontinuity between sedimentary and basement layers. The existence of this boundary has been confirmed by drilling down to the basement layer at several sites (e.g., Ohta et al. 1978; Yamamizu et al. 1981; Suzuki et al. 1983), and seismic data from stations proximal to the drilling sites are available, meaning that we almost know the “correct” subsurface structure with a high level of confidence. This situation makes the study area an ideal region to evaluate this method. In this study, we focus on ACFs based on natural earthquakes. 
The analysis of ambient noise requires a different approach, which will be examined in future studies. We focus on a horizontally stratified medium throughout this study; multiple scattering in heterogeneous media is beyond the scope of the present work. Although data and knowledge available in the metropolitan area of Japan are abundant, our goal is to establish a method applicable to poorly studied areas where the identification of major discontinuities with a high reliability is the primary task. In these areas, the station network is generally sparse, meaning that techniques that require dense arrays are not applicable. The ACF based on a single station is a powerful tool for investigating the subsurface structure in these areas.

Data and method

Data

We used data from the Metropolitan Seismic Observation network (MeSO-net; National Research Institute for Earth Science and Disaster Resilience (NIED) 2021; Sakai and Hirata 2009; Aoi et al. 2021), whose stations are distributed around the metropolitan area of Japan (Fig. 1). The stations were deployed approximately 20 m beneath the ground surface. We used the vertical component of the continuous accelerograms, sampled at 200 Hz and obtained from the NIED website.

Fig. 1

Seismic network of MeSO-net (triangles) and Hi-net (squares) stations. Blue, red, and purple triangles represent the stations used for the ENE–WSW profile, ESE–WNW profile, and both profiles, respectively. Circles represent the hypocenters of \(M\ge 2\) earthquakes at depths greater than 80 km from May 16, 2008, to June 30, 2021, from the unified catalog of the JMA; brown indicates earthquakes used in this study. Stars indicate the hypocenters of the earthquakes used in Fig. 2

From the unified catalog of the Japan Meteorological Agency (JMA), we extracted \(M\ge 2\) earthquakes that occurred between May 16, 2008 (start date of available MeSO-net data), and June 30, 2021 (Fig. 1). To keep S–P times > 10 s, we selected earthquakes at depths greater than 80 km. For each station, we used earthquakes with incident angles of ≤ 5° in accordance with the JMA2001 structure model (Ueno et al. 2002).

Method for estimating the ACFs and errors

For each earthquake i at each station, we created a 240-s waveform that started from the origin time of the event; the origin time was obtained from the unified catalog of the JMA. We manually identified the arrival time of the P-wave (\({T}_{i}^{p}\); Fig. 2a). If the arrival time could not be identified because of low signal-to-noise ratios or unclear onsets, the event was skipped. We removed the mean and whitened the spectrum (Fig. 2b and d) as follows: for each frequency, we divided the content by the average amplitude of 11 consecutive samples (0.0305-Hz width) centered on the frequency. We calculated the standard deviation (\({\sigma }_{i}^{obs}\)) of the whitened trace within the noise window, which was defined as 10.5–0.5 s before \({T}_{i}^{p}\) (arrow in Fig. 2b). The bandwidth for the whitening (0.0305 Hz) was narrower than that used in previous studies (e.g., Oren and Nowack 2017; Chimoto and Yamanaka 2019). As a result, our spectra were flatter than those in previous studies and were more consistent with the following assumption used throughout our method: each sample in the noise window independently obeyed a normal distribution (i.e., white noise).
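The whitening step can be sketched as follows. This is an illustration under stated assumptions, not the processing code used in this study: the FFT length, the treatment of the spectrum edges (zero-padded running mean), and the floor that avoids division by zero are all choices made here for the example, and `whiten` is a hypothetical name.

```python
import numpy as np

def whiten(trace, n_smooth=11):
    """Remove the mean and divide each spectral sample by the average
    amplitude of n_smooth consecutive samples centered on it."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()
    spec = np.fft.rfft(x)
    amp = np.abs(spec)
    kernel = np.ones(n_smooth) / n_smooth
    smooth = np.convolve(amp, kernel, mode="same")   # running mean of |spectrum|
    smooth = np.maximum(smooth, 1e-20)               # assumed floor, avoids 0/0
    return np.fft.irfft(spec / smooth, n=x.size)

# Synthetic "red" trace (cumulative sum of white noise) as a stand-in for data.
rng = np.random.default_rng(0)
red_trace = np.cumsum(rng.standard_normal(4096))
whitened = whiten(red_trace)

# Noise level from a window of the whitened trace; the indices are hypothetical
# and would correspond to 10.5-0.5 s before the P-wave pick in practice.
sigma_obs = whitened[100:2100].std()
```

A narrow smoothing window forces the amplitude spectrum to be nearly flat, which is what makes the white-noise assumption for the noise window more reasonable.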

Fig. 2

Estimation of ACFs and errors using the largest earthquake at E.STHM as an example. a Raw waveform. The horizontal axis is time measured from the earthquake origin time. The arrow indicates the manually identified P-wave arrival time \({T}_{i}^{p}\). b Whitened waveform. The arrow indicates the noise window, which is defined as 10.5–0.5 s before \({T}_{i}^{p}\). c Bandpass-filtered (1–10 Hz) waveform. The orange square indicates the P-wave window, which is defined as between 0.5 s before and 9.5 s after \({T}_{i}^{p}\). d Fourier amplitude spectra of the raw (brown), whitened (gray), and bandpass-filtered (blue) waveforms. e Filtered waveform in the P-wave window after a 0.5-s cosine taper was applied to both ends (\({u}_{i}^{obs}(t)\)). f Filtered noise traces (\({u}_{i,j}^{noise}(t)\)) generated by applying the same bandpass filter and taper that were used to generate \({u}_{i}^{obs}(t)\) to 1000 candidates of random traces. g 1000 candidates of \({u}_{i,j}^{eq}(t)\) (Eq. 2). h 1000 candidates of \({a}_{i,j}^{eq}(\tau )\) (Eq. 3; pink), their ensemble average (\({a}_{i}^{ave}(\tau )\); black), and three times their standard deviation (\(3{\sigma }_{i}^{a}(\tau )\); purple). i \(3{\sigma }_{i}^{a}(\tau )\) for 100, 1000, and 10,000 random traces. j Effects of the noise level on the estimates of \(3{\sigma }_{i}^{a}(\tau )\). The results from two and three times the actual noise level are shown. Common amplitude scales are used for e–g and h–j

We applied a two-pole zero-phase Butterworth bandpass filter of 1–10 Hz (Fig. 2c) to the whitened trace. We extracted the waveform in the P-wave window, which was defined as between 0.5 s before and 9.5 s after \({T}_{i}^{p}\); this window was based on picking errors in \({T}_{i}^{p}\) and expected S-P times for earthquakes deeper than 80 km. We applied a 0.5-s cosine taper to both sides of the waveform in the P-wave window (Fig. 2e). This trace was denoted as \({u}_{i}^{obs}(t)\).

We generated 1000 random traces, which obeyed a normal distribution with a mean of zero and a standard deviation of \({\sigma }_{i}^{obs}\); each trace had a duration of 10 s. We applied the same bandpass filter and taper that was used to generate \({u}_{i}^{obs}(t)\) to the 1000 random traces. The results were 1000 candidates of the filtered noise traces \({u}_{i,j}^{noise}(t)\) (Fig. 2f); \(j\) is an index of the random trace. We assumed that the observed waveform (\({u}_{i}^{obs}(t)\)) was a superposition of the earthquake signal and random noise (\({u}_{i,j}^{noise}(t)\)). Based on this assumption, we obtained 1000 candidates for the earthquake signal (Fig. 2g) as follows:

$${u}_{i,j}^{eq}\left(t\right)={u}_{i}^{obs}\left(t\right)-{u}_{i,j}^{noise}\left(t\right).$$
(2)

For each \({u}_{i,j}^{eq}(t)\), we calculated the ACF (\({a}_{i,j}^{eq}\left(\tau \right)\)) by:

$${a}_{i,j}^{eq}\left(\tau \right)=\frac{\left[\int {u}_{i,j}^{eq}\left(t\right){u}_{i,j}^{eq}\left(t+\tau \right)dt\right]}{\left[\int {u}_{i,j}^{eq}{\left(t\right)}^{2}dt\right]}.$$
(3)

There were 1000 candidate ACFs corresponding to the 1000 random noise traces for each earthquake at each station. We calculated the ensemble average \({a}_{i}^{ave}\left(\tau \right)\) and standard deviation \({\sigma }_{i}^{a}\left(\tau \right)\) of the 1000 ACFs \({a}_{i,j}^{eq}\left(\tau \right)\) for earthquake i at this station.
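The procedure of Eqs. (2) and (3) can be sketched in a few lines. The example below is a simplified illustration: the bandpass filter applied to the noise traces is omitted (only a cosine taper is applied), the ACF is computed through a zero-padded FFT, and the synthetic "observed" trace, `cosine_taper`, and `acf_ensemble` are hypothetical names and inputs introduced here.

```python
import numpy as np

def cosine_taper(n_samples, n_edge):
    """Window that rises and falls over n_edge samples with a half-cosine ramp."""
    window = np.ones(n_samples)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_edge) / n_edge))
    window[:n_edge] = ramp
    window[n_samples - n_edge:] = ramp[::-1]
    return window

def acf_ensemble(u_obs, sigma_obs, n_trials=1000, taper=None, seed=0):
    """Ensemble mean and standard deviation of normalized ACFs (lags >= 0)."""
    u_obs = np.asarray(u_obs, dtype=float)
    n = u_obs.size
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma_obs, size=(n_trials, n))
    if taper is not None:
        noise = noise * taper          # bandpass filtering is omitted in this sketch
    u_eq = u_obs - noise               # Eq. (2): candidate earthquake signals
    n_fft = 2 * n                      # zero padding gives the linear (non-circular) ACF
    power = np.abs(np.fft.rfft(u_eq, n_fft, axis=1)) ** 2
    acf = np.fft.irfft(power, n_fft, axis=1)[:, :n]
    acf = acf / acf[:, :1]             # Eq. (3): normalize by the zero-lag value
    return acf.mean(axis=0), acf.std(axis=0)

# Synthetic observed trace: two arrivals 100 samples apart plus weak noise.
n = 400
taper = cosine_taper(n, 20)
rng = np.random.default_rng(1)
u_obs = rng.normal(0.0, 0.01, n)
u_obs[50] += 1.0
u_obs[150] -= 0.3
u_obs = u_obs * taper

a_ave, sigma_a = acf_ensemble(u_obs, 0.01, n_trials=200, taper=taper)
```

Because every candidate ACF equals 1 at zero lag, the standard deviation vanishes there, which is why ratios of amplitude to standard deviation are unstable for small lag times.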

Finally, we calculated the weighted average for all earthquakes:

$${a}^{ave}\left(\tau \right)=\frac{\left\{{\sum }_{i}\left[\frac{{a}_{i}^{ave}\left(\tau \right)}{{{\sigma }_{i}^{a}\left(\tau \right)}^{2}}\right]\right\}}{\left\{{\sum }_{i}\left[\frac{1}{{{\sigma }_{i}^{a}\left(\tau \right)}^{2}}\right]\right\}}.$$
(4)

We subtracted the weighted average from a band-limited Dirac delta function to obtain the reflection response (Eq. 1) at this station. The standard deviation of this reflection response is:

$${\sigma }^{a}\left(\tau \right)={\left\{{\sum }_{i}\left[\frac{1}{{{\sigma }_{i}^{a}\left(\tau \right)}^{2}}\right]\right\}}^{-1/2}.$$
(5)
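Equations (4) and (5) amount to an inverse-variance weighted mean and its standard deviation, evaluated independently at each lag time. A minimal sketch is given below; the floor `eps` on the standard deviation is an assumption added here to avoid division by zero near zero lag, and `weighted_stack` is a hypothetical name.

```python
import numpy as np

def weighted_stack(a_ave_i, sigma_i, eps=1e-12):
    """Inverse-variance weighted stack over events (axis 0).

    a_ave_i, sigma_i: arrays of shape (n_events, n_lags).
    Returns the stacked ACF (Eq. 4) and its standard deviation (Eq. 5).
    """
    a = np.asarray(a_ave_i, dtype=float)
    s = np.maximum(np.asarray(sigma_i, dtype=float), eps)  # assumed floor
    weights = 1.0 / s ** 2
    a_stack = (weights * a).sum(axis=0) / weights.sum(axis=0)
    sigma_stack = weights.sum(axis=0) ** -0.5
    return a_stack, sigma_stack

# Two hypothetical events at a single lag: the event with the smaller error dominates.
a_stack, sigma_stack = weighted_stack([[1.0], [0.0]], [[0.1], [0.2]])
```

For this example the weights are 100 and 25, so the stacked amplitude is 0.8 and the stacked standard deviation is 125^(−1/2) ≈ 0.089, smaller than either individual error.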

Evaluation of the method

We first applied our method to data from Station E.STHM (Fig. 1). This station is 1.6 km to the northwest of Station N.SHMH in the High Sensitivity Seismograph Network (Hi-net; NIED 2019a). A borehole was drilled at N.SHMH down to a depth of 2.3 km, and detailed descriptions and records of drilling were documented (Ohta et al. 1978; Suzuki et al. 1983). The sound velocity and density measured in the well logs indicate that the subsurface can be adequately modeled with a two-layer model. In this model, a sedimentary layer extends down to a depth of 1.5 km; the P-wave velocity (Vp) is approximately 2 km/s, and the density (\(\rho\)) is approximately 2000 kg/m³; underneath the sedimentary layer is a basement layer with a Vp of approximately 5 km/s and a \(\rho\) of approximately 2600 kg/m³. The resistivity measured in the well logs also shows an abrupt change at a depth of 1.5 km, while the resistivity is relatively stable from 0.5 to 1.5 km. The basement rock was encountered when drilling proceeded to a depth of 1.514 km (Suzuki et al. 1983). All these data consistently support the two-layer model. Although several minor discontinuities at depths other than 1.5 km have been pointed out by Ohta et al. (1978) and Suzuki et al. (1983), these discontinuities are not prominent compared to the sedimentary–basement interface. Given the proximity of Stations N.SHMH and E.STHM, it is expected that the subsurface structures at the two stations are similar and that the reflection profile at E.STHM has a single reflector at a depth of 1.5 km. Using the ACFs of the S-waves of vertically incident earthquakes, a reflector at a depth of approximately 1.5 km has been identified in this region (Chimoto and Yamanaka 2020). Therefore, we applied our method to data from Station E.STHM to test whether the reflector at 1.5 km could be identified and other (false) reflectors could be rejected on the basis of statistical significance.

We next applied our method to data from stations along an east-northeast to west-southwest (ENE–WSW) transect (blue and purple triangles in Fig. 1) and an east-southeast to west-northwest (ESE–WNW) transect (red and purple triangles in Fig. 1) to create two-dimensional reflection profiles. To convert the lag times of ACFs to depths, we used the deep subsurface structure model of the Japan Seismic Hazard Information Station (J-SHIS; NIED 2019b). To highlight reflected signals with amplitudes that are statistically significantly larger than the errors, a color scale based on the estimated errors (Eq. 5) was used. We also conducted an analysis following a conventional procedure; in this procedure, errors were not taken into account when stacking and plotting the ACFs. We compared the reflection profiles obtained from our method and those from the conventional procedure.
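The conversion from lag time to depth can be sketched for a horizontally layered model by treating the lag time as a two-way vertical travel time. The function below is an illustration only: `lag_to_depth` is a hypothetical name, the station burial depth is ignored, and the rounded two-layer values quoted above for N.SHMH are used in place of the well-log or J-SHIS velocity profiles.

```python
import numpy as np

def lag_to_depth(tau, thickness_km, vp_km_s, halfspace_vp_km_s):
    """Convert two-way vertical travel time tau (s) to depth (km).

    thickness_km, vp_km_s: thickness and P-wave velocity of each layer,
    listed from the surface downward; the half-space lies below them.
    """
    thickness = np.asarray(thickness_km, dtype=float)
    vp = np.asarray(vp_km_s, dtype=float)
    # Two-way travel time and depth at the bottom of each layer.
    t_bottom = np.concatenate([[0.0], np.cumsum(2.0 * thickness / vp)])
    z_bottom = np.concatenate([[0.0], np.cumsum(thickness)])
    tau = np.asarray(tau, dtype=float)
    depth = np.interp(tau, t_bottom, z_bottom)
    # Below the deepest interface, continue with the half-space velocity.
    extra = 0.5 * halfspace_vp_km_s * (tau - t_bottom[-1])
    return np.where(tau > t_bottom[-1], z_bottom[-1] + extra, depth)

# Rounded two-layer model: 1.5 km of sediment at 2 km/s over a 5 km/s basement.
depth_at_interface = lag_to_depth(1.5, [1.5], [2.0], 5.0)
depth_in_basement = lag_to_depth(2.0, [1.5], [2.0], 5.0)
```

With these rounded values, a lag time of 1.5 s maps to the interface at 1.5 km, and each additional 0.1 s below it corresponds to 0.25 km in the basement.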

Results

The pink lines in Fig. 2h show the ACFs (\({a}_{i,j}^{eq}\left(\tau \right)\)) (Eq. 3) for the largest (M4.5) earthquake at Station E.STHM estimated from the 1000 candidates of the random noise traces. The black line shows their ensemble average (\({a}_{i}^{ave}\left(\tau \right)\)). The amplitudes of the ensemble average are less than or only slightly greater than three times the standard deviation (\(3{\sigma }_{i}^{a}\left(\tau \right)\); purple lines) for most lag times, except for the large signal around lag time \(\tau =0\) and a negative peak at \(\tau =1.450\) s. Because ACFs take the maximum value of 1 at \(\tau =0\) and this signal leaks to a small \(\tau\) due to the bandpass filter, the large amplitudes for a small \(\tau\) are inevitable and can be regarded as artifacts.

Figure 2i shows the effects of the number of random traces on the standard deviations of the ACF. The standard deviations vary little with the number of traces, although the uniformity of the standard deviations increases with the number of traces. Figure 2j shows the effects of the signal-to-noise ratios of the waveforms on the standard deviations of the ACF. To simulate waveforms contaminated by larger noise, we generated noise waveforms \({u}_{i,j}^{noise}(t)\) that had two or three times the actual noise amplitudes (i.e., \(2{\sigma }_{i}^{obs}\) or \(3{\sigma }_{i}^{obs}\)). The results show that the standard deviations are approximately proportional to the noise level (Fig. 2j).

We converted the ACFs at E.STHM to reflection responses using Eq. (1). Using the P-wave velocity structure from well logs at N.SHMH (Fig. 3a; Ohta et al. 1978; Suzuki et al. 1983), we converted the reflection responses to depth sections. The leftmost panel of Fig. 3b shows the variation in \({a}_{i}^{ave}(\tau )/{\sigma }_{i}^{a}\left(\tau \right)\) with depth for the largest earthquake at E.STHM, which was calculated from the ACF shown in Fig. 2h. The \({a}_{i}^{ave}(\tau )/{\sigma }_{i}^{a}\left(\tau \right)\) ratio is a measure of the significance of the amplitude of the reflection response; a value greater than three indicates an amplitude that is more than three times the standard deviation. This figure shows that the amplitudes of most reflectors are less than or only slightly greater than three times the standard deviation, except for the large signal close to the surface and a small positive peak at a depth of 1.5 km; they were derived from the large signal around \(\tau =0\) and the negative peak at \(\tau =1.450\) s in Fig. 2h, respectively. Although the peak at \(\tau =0\) was removed during the conversion to the reflection response (Eq. 1), the standard deviation \({\sigma }_{i}^{a}\left(\tau \right)\) also approaches zero near \(\tau =0\) (Fig. 2h–j) because \({a}_{i,j}^{eq}\left(\tau \right)\) for all \(j\) approaches the same value of 1. For this reason, \({a}_{i}^{ave}(\tau )/{\sigma }_{i}^{a}\left(\tau \right)\) is unstable for small \(\tau\) values.

Fig. 3

Depth sections of the reflection responses at Station E.STHM. a Velocity profile (blue) and the relationship between lag time \(\tau\) and depth (green) that were used to generate the depth sections. b Estimated reflection responses for different numbers of stacks based on magnitude thresholds. Blue lines indicate the ratios of averages to standard deviations of individual earthquakes (\({a}_{i}^{ave}\left(\tau \right)/{\sigma }_{i}^{a}(\tau )\)). Red lines show the ratios of averages to standard deviations of weighted stacks (\({a}^{ave}\left(\tau \right)/{\sigma }^{a}(\tau )\)). Black lines indicate \(\pm 3\). The frequency band is 1–10 Hz. c Reflection responses for various frequency bands estimated from all 33 earthquakes. d Reflection responses for various frequency bands estimated from all earthquakes deeper than 60 km (purple; 113 events), 80 km (red; 33 events), and 100 km (green; 20 events)

We applied weighted stacking (Eqs. 4 and 5) to earthquakes. Figure 3b shows the ratios of the averages to standard deviations for the weighted stacks (\({a}^{ave}\left(\tau \right)/{\sigma }^{a}(\tau )\)) for different numbers of stacks; stacking was applied sequentially in decreasing order of earthquake magnitude. The prominence of the positive peak at 1.5 km increases with an increasing number of stacks. For all other peaks at depths greater than 0.5 km, their amplitudes do not increase with an increasing number of stacks; amplitudes remain less than or only slightly greater than three. These results show that the peak at 1.5 km is the only significant reflector at depths greater than 0.5 km and is consistent with the likely subsurface structure in this region (Fig. 3a; Ohta et al. 1978; Suzuki et al. 1983).

Figure 3c shows the variation in \({a}^{ave}\left(\tau \right)/{\sigma }^{a}(\tau )\) with depth for different frequency bands. Below 10 Hz, the prominence of the positive peak at 1.5 km is high. The prominence decreases with increasing frequency. At 15–20 Hz, the amplitudes of peaks at 1.5 km and other depths are comparable.

Figure 3d shows the variation in \({a}^{\mathrm{ave}}\left(\tau \right)/{\sigma }^{a}(\tau )\) with depth for different choices of earthquake depth ranges. When we used only the earthquakes at depths of \(\ge\) 100 km (20 events; 12.5 s for the P-wave window), the amplitude of \({a}^{ave}\left(\tau \right)/{\sigma }^{a}(\tau )\) at a depth of 1.5 km was smaller (Fig. 3d, green) than that of \(\ge\) 80 km earthquakes (red; 33 events), probably because of a smaller number of stacks. When we used earthquakes with depths \(\ge\) 60 km (113 events; 7.5 s for the P-wave window), the ratio \({a}^{\mathrm{ave}}\left(\tau \right)/{\sigma }^{a}(\tau )\) exceeded three at several depths other than 1.5 km (Fig. 3d, purple), where reflectors are not expected to be present. These false reflectors make it difficult to identify the true reflector at 1.5 km. In conclusion, the depth limit of 80 km showed the clearest image at this station. It is difficult to establish a criterion for the optimal depth limit that is generally applicable to various sites. By decreasing the depth limit, the number of stacks increases while the time window length decreases. The simplicity of the waveforms also affects the result. The waveforms of shallow earthquakes beneath E.STHM were more complicated than those of deep earthquakes of similar magnitudes, and may have violated the requirement of impulsive incident waves assumed in daylight imaging (Additional file 1: Text S1 and Fig. S1).

We investigated the reflection responses at all stations along the ENE–WSW and ESE–WNW transects in Fig. 1. Figure 4b shows the estimated reflection responses plotted on depth sections along the transects. To highlight the significance of the reflectors, different colors are used for the \({a}^{\mathrm{ave}}\left(\tau \right)/{\sigma }^{a}\left(\tau \right)\) ratio. White is used for amplitudes of less than three to suppress reflectors that are not significant. There are clear west-dipping reflectors from 139° 45′ E (depth: 3 km) to 140° 20′ E (1 km) along the ENE–WSW transect (Fig. 4b, left) and from 139° 50′ E (2 km) to 140° 20′ E (1 km) along the ESE–WNW transect (Fig. 4b, right). Along the ENE–WSW transect, there are also weak east-dipping reflectors from 139° 20′ E (0.5 km) to 139° 35′ E (1 km) and from 139° 20′ E (1.5 km) to 139° 35′ E (2.5 km) (Fig. 4b, left). These reflectors are also visible in the results from the conventional procedure (Fig. 4c). However, in Fig. 4c, there are many other positive and negative peaks; most of them are deemed false signals and obscure the signals from the west-dipping and east-dipping reflectors.

Fig. 4

Reflection responses along the ENE–WSW and ESE–WNW transects that are shown in Fig. 1. a Number of stacks. Arrows indicate Station E.STHM. b Reflection responses estimated using our method. The color is based on the ratios of the averages to standard deviations of weighted stacks (\({a}^{ave}\left(\tau \right)/{\sigma }^{a}(\tau )\)). White is used for amplitudes of less than three. c Reflection responses estimated using the conventional procedure, where stacking was performed without weights. The color is based on the ratio of the amplitudes to the maximum amplitude for \(\tau \ge 0.2\) s at each station. d J-SHIS P-wave velocity model (NIED 2019b) used to convert lag time to depth in b and c. Gradations of red to orange indicate layers with low velocity (\(\le\) 2400 m/s). Yellow indicates an intermediate-velocity (3200 m/s) layer. Blue indicates a high-velocity (5500 m/s) layer. e Boundaries between low-, intermediate- and high-velocity layers from d superposed on b

We also evaluated the proposed method using synthetic waveforms (Additional file 1: Text S2). The results showed that velocity discontinuities with reflection coefficients of more than 0.03 and 0.1 were detectable from waveforms with signal-to-noise ratios similar to those of the largest and smallest earthquakes that were used, respectively (Additional file 1: Figs. S2 and S3). The simulation also showed that random noise causes false reflectors with amplitudes less than three times the standard deviation (Additional file 1: Fig. S2).

Discussion

Figure 4d shows the P-wave velocity model of the J-SHIS (NIED 2019b) along the transects. Two distinct boundaries lie between the low- (\(\le\) 2400 m/s; red to orange) and intermediate- (3200 m/s; yellow) velocity layers and between the intermediate- and high- (5500 m/s; blue) velocity layers. There are also minor boundaries with smaller velocity contrasts within the low-velocity layer. Likely candidates for the low-, intermediate- and high-velocity layers are the Kazusa and Miura layers and the basement (e.g., Koketsu et al. 2009). Figure 4e shows the boundaries superposed on the reflection response obtained using our method. The reflectors are clearly imaged at the sites where the intermediate (Miura) layer is thin enough to be negligible (from 139° 55′ E to 140° 20′ E along both transects); the subsurface approximates a two-layer medium with a large velocity contrast at these sites. The reflectors are less prominent at sites that approximate a three-layer medium. This could be because the intermediate (Miura) layer reduces the impedance contrast at the interfaces. Additionally, the western part of the ESE–WNW transect is close to the study area of Yoshimoto and Takemura (2014), who reported gradual rather than abrupt increases in the velocity in the sedimentary layer. These three-layer or gradually increasing velocity structures are possible explanations for the absence of prominent reflectors in our reflection responses. The eastern part was not imaged well mainly because the number of earthquakes was small (Fig. 4a).

We assumed that the noise obeyed a normal distribution. This was a good approximation, as shown in Fig. 5a and b; the cumulative distribution of absolute amplitudes of the observed (whitened) waveform in the noise window (black) was fitted well by that from a normal distribution (green). Our statistical tests showed that the assumption of the normal distribution is not rejected at the 5% significance level (Additional file 1: Text S4). Maeda et al. (2020) indicated that the normal distribution was a good approximation for the noise waveforms in a different region. The covariance in the noise waveform is another factor that needs to be evaluated. As we used a narrow frequency band (0.0305-Hz width) for whitening (“Method for estimating the ACFs and errors” section), the covariance was expected to be small. Additional file 1: Text S3 and Fig. S4 indicate that the covariance was indeed small, except for \(\tau \le 0.075\) s. As we used \(\tau >\) 0.075 s for the interpretation of the subsurface structure (Figs. 2 and 3), the covariance would not affect the result significantly, although it would be ideal to design a noise waveform that has the same covariance as that of the data.
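A comparison of the kind shown in Fig. 5a and b can be sketched by measuring the largest gap between the empirical cumulative distribution of absolute amplitudes and that of a zero-mean normal distribution with the same standard deviation. This is an illustrative measure only and is not the statistical test described in Additional file 1; `abs_amplitude_cdf_misfit` is a hypothetical name, and the inputs are synthetic.

```python
import math
import numpy as np

def abs_amplitude_cdf_misfit(samples):
    """Maximum gap between the empirical CDF of |x| and the CDF of |x|
    expected for a zero-mean normal distribution with the same std."""
    x = np.asarray(samples, dtype=float)
    x = x - x.mean()
    sigma = x.std()
    a = np.sort(np.abs(x))
    empirical = np.arange(1, a.size + 1) / a.size
    # For a normal variable, P(|x| <= a) = erf(a / (sigma * sqrt(2))).
    theoretical = np.array([math.erf(v / (sigma * math.sqrt(2.0))) for v in a])
    return float(np.max(np.abs(empirical - theoretical)))

rng = np.random.default_rng(0)
misfit_normal = abs_amplitude_cdf_misfit(rng.normal(0.0, 1.0, 2000))
misfit_uniform = abs_amplitude_cdf_misfit(rng.uniform(-1.0, 1.0, 2000))
```

A small misfit is consistent with the normal approximation, whereas clearly non-normal samples such as the uniform ones above produce a visibly larger gap.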

Fig. 5

Evaluating the normal distribution approximation. a, b Cumulative distributions of absolute amplitudes for the observed (whitened) waveform in the noise window (black) and for the normal distribution (green). c, d Cumulative distribution of \(|({a}_{i,j}^{eq}\left(\tau \right)-{a}_{i}^{ave}\left(\tau \right))/{\sigma }_{i}^{a}(\tau )|\) for different values of \(\tau\) (from blue to red) and for the normal distribution (green). The results for the largest earthquake at Station E.STHM are shown in a and c; those for the deepest earthquake of the smallest (M2.0) magnitude at Station E.STHM are shown in b and d

We generated \({u}_{i,j}^{noise}(t)\) using a normal distribution, calculated \({a}_{i,j}^{eq}\left(\tau \right)\) from Eqs. (2) and (3) and used three times the standard deviation of \({a}_{i,j}^{eq}\left(\tau \right)\) as a measure of the errors in the ACFs. This threshold is consistent with the 99% confidence level if the ACFs obey a normal distribution. Theoretically, \({a}_{i,j}^{eq}\left(\tau \right)\) derived from Eqs. (2) and (3) obeys the normal distribution because \({a}_{i,j}^{eq}\left(\tau \right)\) is expressed by:

$${a}_{i,j}^{eq}\left(\tau \right)=\frac{\left[{a}_{i}^{oo}\left(\tau \right)-{a}_{i,j}^{on}\left(\tau \right)-{a}_{i,j}^{no}\left(\tau \right)+{a}_{i,j}^{nn}\left(\tau \right)\right]}{\int {u}_{i,j}^{eq}{\left(t\right)}^{2}dt},$$
(6)
$${a}_{i}^{oo}\left(\tau \right)=\int {u}_{i}^{obs}\left(t\right){u}_{i}^{obs}\left(t+\tau \right)dt,$$
(7)
$${a}_{i,j}^{on}\left(\tau \right)=\int {u}_{i}^{obs}\left(t\right){u}_{i,j}^{noise}\left(t+\tau \right)dt,$$
(8)
$${a}_{i,j}^{no}\left(\tau \right)=\int {u}_{i,j}^{noise}\left(t\right){u}_{i}^{obs}\left(t+\tau \right)dt,$$
(9)

and

$${a}_{i,j}^{nn}\left(\tau \right)=\int {u}_{i,j}^{noise}\left(t\right){u}_{i,j}^{noise}\left(t+\tau \right)dt.$$
(10)

\({a}_{i}^{oo}\left(\tau \right)\) is not a random value, and \({a}_{i,j}^{on}\left(\tau \right)\) and \({a}_{i,j}^{no}\left(\tau \right)\) obey the normal distribution because they are linear combinations of \({u}_{i,j}^{noise}\left(t\right)\) that were generated using the normal distribution. To evaluate \({a}_{i,j}^{nn}\left(\tau \right)\) (Eq. 10), we arrange the equation as:

$$\frac{{a}_{i,j}^{nn}\left(\tau \right)}{T}=E\left[{u}_{i,j}^{\mathrm{noise}}\left(t\right){u}_{i,j}^{\mathrm{noise}}\left(t+\tau \right)\right],$$
(11)

where \(T\) is the length of the time window of the integration in Eq. (10) and \(E[]\) is an expected value. Although \({u}_{i,j}^{noise}\left(t\right){u}_{i,j}^{noise}\left(t+\tau \right)\) does not obey the normal distribution, its distribution is independent of \(t\), meaning that \(E\left[{u}_{i,j}^{noise}\left(t\right){u}_{i,j}^{noise}\left(t+\tau \right)\right]\) approaches the normal distribution as the number of samples in \(T\) increases, according to the central limit theorem. In summary, all terms in Eq. (6) obey the normal distribution, so that \({a}_{i,j}^{eq}\left(\tau \right)\) obeys the normal distribution. Figure 5c and d shows the cumulative distribution of \(|({a}_{i,j}^{eq}\left(\tau \right)-{a}_{i}^{ave}\left(\tau \right))/{\sigma }_{i}^{a}(\tau )|\) and that from the normal distribution. The excellent fit between the two distributions and our statistical tests (Additional file 1: Text S4) indicate that \({a}_{i,j}^{eq}\left(\tau \right)\) indeed obeys the normal distribution.

A trade-off exists between avoiding the detection of false reflectors and discovering possible minor reflectors. This trade-off can be controlled by the threshold. For example, using two times the standard deviation instead of three increases the number of reflectors imaged, including both true and false ones. The threshold should be chosen according to the purpose of the study. In poorly studied areas, a high threshold is preferable because even the identification of only the major reflectors is valuable, and other information to validate the existence of the reflectors is limited. In extensively studied areas, where the discovery of minor reflectors is more important and abundant knowledge and data other than ACFs are available for verification, the threshold can be lowered. Note that the threshold only affects the visual images; numerical values for the ratio of the amplitude to the standard deviation are available at all depths regardless of the choice of the threshold.
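The effect of the threshold can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function name `significant_peaks` and the amplitude and standard deviation values are made up for illustration, and the absolute amplitude is compared with the threshold.

```python
import numpy as np

def significant_peaks(amplitude, std, k=3.0):
    """Flag samples whose |amplitude| exceeds k times the standard deviation.

    Returns the boolean mask and the amplitude-to-standard-deviation ratio,
    which does not depend on the choice of k.
    """
    amplitude = np.asarray(amplitude, dtype=float)
    std = np.asarray(std, dtype=float)
    ratio = np.abs(amplitude) / std
    return ratio > k, ratio

# Hypothetical reflection-response amplitudes and their standard deviations.
amp = np.array([0.02, -0.05, 0.30, 0.07, -0.01])
std = np.array([0.02, 0.02, 0.02, 0.03, 0.02])

mask3, ratio = significant_peaks(amp, std, k=3.0)  # strict threshold
mask2, _ = significant_peaks(amp, std, k=2.0)      # relaxed threshold
print(mask3.sum(), mask2.sum())
```

Lowering `k` from 3 to 2 can only add flagged samples, never remove them, while `ratio` stays the same for any threshold.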

In the conventional method, the meaning of the amplitudes of reflection responses from ACFs was unclear. For example, an amplitude of 0.1 only meant that the correlation coefficient was −0.1; whether this value was large or small was evaluated in a relative manner, often qualitatively. Our method enables a quantitative evaluation of the amplitude; for example, if the amplitude is 2.0 times the standard deviation, the reflector is significant at the 95% confidence level. Because this evaluation is possible from a single station, the method can be applied in various areas regardless of the density of stations.
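The correspondence between a multiple of the standard deviation and a confidence level follows from the normal distribution. As a minimal sketch, assuming a two-sided interval and a hypothetical helper name `two_sided_confidence`, the probability that a normally distributed value lies within k standard deviations of its mean is erf(k/√2):

```python
import math

def two_sided_confidence(k):
    """Probability that a normal variable lies within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

for k in (2.0, 3.0):
    print(k, round(two_sided_confidence(k), 4))
```

This gives approximately 0.954 for k = 2 and 0.997 for k = 3, consistent with the 95% and 99% levels quoted in the text.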

In this study, we focused on ACFs for seismic daylight imaging. Following this work, future studies can apply our method to other earthquake-based imaging techniques, including cross-correlation and receiver functions, to calculate errors, although a validation study is needed. Our method is not applicable to ambient noise-based techniques, and alternative approaches will need to be developed in future studies.

Conclusions

We presented a method to estimate errors in the ACFs of earthquakes for seismic daylight imaging. We assumed that the observed waveform of each earthquake at each station is a superposition of the earthquake signal and random noise. We generated 1000 candidates of random noise traces and subtracted them from the observed waveform to generate 1000 candidates of the earthquake signal. We calculated the ACFs of these signals and the ensemble average and standard deviation of the ACFs. We applied weighted stacking to the ACFs from all earthquakes to obtain the reflection response at the station. The standard deviation of this weighted stack is considered a measure of the error of the reflection response.
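The procedure summarized above can be sketched as follows. This is an illustrative outline rather than the authors' program: the function names are hypothetical, the noise amplitude is taken as the standard deviation of the noise window, the ACF is left unnormalized, filtering and the conversion to a reflection response are omitted, and inverse-variance weights are assumed for the stack because the weighting scheme is not restated in this section.

```python
import numpy as np

def acf(u, max_lag):
    """Unnormalized autocorrelation of u for lags 0..max_lag (in samples)."""
    n = len(u)
    return np.array([np.dot(u[: n - k], u[k:]) for k in range(max_lag + 1)])

def acf_ensemble(u_obs, noise_window, max_lag, n_trials=1000, rng=None):
    """Ensemble mean and standard deviation of candidate-signal ACFs.

    Each trial subtracts one Gaussian noise trace (amplitude taken from the
    noise window, an assumption of this sketch) from the observed waveform.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.std(noise_window)
    acfs = np.empty((n_trials, max_lag + 1))
    for j in range(n_trials):
        noise = rng.normal(0.0, sigma, size=len(u_obs))
        acfs[j] = acf(u_obs - noise, max_lag)
    return acfs.mean(axis=0), acfs.std(axis=0, ddof=1)

def weighted_stack(means, stds):
    """Inverse-variance weighted stack over earthquakes (assumed weighting).

    Returns the stacked ACF and its standard deviation at each lag,
    treating the earthquakes as independent.
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    w = 1.0 / stds**2
    stack = (w * means).sum(axis=0) / w.sum(axis=0)
    stack_std = np.sqrt((w**2 * stds**2).sum(axis=0)) / w.sum(axis=0)
    return stack, stack_std
```

A typical use would call `acf_ensemble` once per earthquake at a station, collect the returned means and standard deviations in lists, and pass them to `weighted_stack`; the ratio of the stacked amplitude to `stack_std` can then be compared with the chosen threshold.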

We evaluated this method using seismic data from the metropolitan area of Japan. The subsurface structure of the study area has been studied extensively and consists of a distinct velocity discontinuity between the sedimentary and basement layers. Our results show that the discontinuity is imaged as a reflector with an amplitude that is substantially greater than three times the standard deviation. At other depths where discontinuities are not expected to be present, the amplitudes of reflectors are less than or close to three times the standard deviation. The reflector at the depth of the discontinuity was clearly imaged in frequency bands below 10 Hz and at sites where the subsurface approximates a two-layer medium. The reflector was less prominent at higher frequencies and at sites that approximate a three-layer medium or where velocity increases gradually with depth. The method gives a quantitative measure of the reliability of each reflector from the observations at a single station, which is especially advantageous for poorly studied areas with limited stations.

Availability of data and materials

All the waveform data used in this study are available from the NIED website. Computer programs and derived data shown in this paper are available from the corresponding author upon request.

Abbreviations

ACF: Autocorrelation function

Hi-net: High Sensitivity Seismograph Network

JMA: Japan Meteorological Agency

J-SHIS: Japan Seismic Hazard Information Station

MeSO-net: Metropolitan Seismic Observation network

NIED: National Research Institute for Earth Science and Disaster Resilience

Acknowledgements

We used waveform data from MeSO-net (NIED, 2021) and Hi-net (NIED, 2019a) and the unified earthquake catalog of the JMA. We used HinetPy (Tian, 2021) to download the MeSO-net data. We used a deep subsurface structure model of the J-SHIS (NIED, 2019b) to convert lag times of ACFs to depths. This work was supported by JSPS KAKENHI Grant Number JP19K04016. Comments by two anonymous reviewers helped to improve the manuscript.

Funding

This work was funded by JSPS KAKENHI Grant Number JP19K04016.

Author information

Contributions

YM performed the analysis and drafted the manuscript. TW participated in discussions. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Yuta Maeda.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Maeda, Y., Watanabe, T. Estimating errors in autocorrelation functions for reliable investigations of reflection profiles. Earth Planets Space 74, 48 (2022). https://doi.org/10.1186/s40623-022-01606-5

Keywords

  • Seismic interferometry
  • Error estimation
  • Basement