- Technical report
- Open Access
Development and examination of new algorithms of traveltime detection in GPS/acoustic geodetic data for precise and automated analysis
Earth, Planets and Space volume 68, Article number: 143 (2016)
A GPS/acoustic (GPS/A) geodetic observation technique allows us to determine plate motion far offshore in order to understand the mechanism of megathrust earthquakes. In this technique, the distance between a sea-surface platform and seafloor transponders is estimated using the two-way traveltimes (TWTs) of acoustic signals. TWTs are determined by maximizing the cross-correlation coefficient between the transmitted and returned signals. However, this analysis produces significantly wrong TWTs when the correlogram has an enlarged secondary envelope, which arises from large-amplitude multipath signals whose strength depends on the relative spatial geometry between the ship and the transponder. Manually rereading thousands of correlograms to obtain correct TWTs takes an enormous amount of time and may introduce human errors. To overcome these difficulties, an automated TWT determination procedure is needed to process numerous GPS/A data efficiently, without human errors and with high precision.
We developed automated methods for precisely analyzing GPS/A data. Method 1: The maximum peak in the observed correlogram is read, and a synthetic correlogram is then subtracted from the observation. The same operation is then applied to the subtracted waveform. This procedure is iterated until the correlation coefficient falls below a predefined threshold. The true traveltime is defined as the earliest traveltime found during the iterations. Method 2: The observed correlograms are divided into several groups based on their similarity through cluster analysis, and a master waveform is selected in each group. Then, the traveltime residual between the maximum and true peaks in the master waveform is manually evaluated. The obtained residual is employed as the correction value for each slave waveform. Further, we employed a display technique used for seismic data (a record section) to visually inspect the reliability of the obtained results.
We confirmed that both new methods accurately correct misreadings by the current method, which amount to 0.4–0.8 ms, roughly corresponding to a 30–60 cm difference in the slant range.
Thus, the proposed algorithms significantly improve the estimation of the transponder location. Further analyses are required to determine the arbitrary threshold values and to construct fully automated algorithms.
Off Miyagi Prefecture, NE Japan, ocean-bottom geodetic observatories have been in place since 2002. GPS/acoustic observation is a combined technique using acoustic ranging and kinematic GPS positioning for precise seafloor geodetic measurement. The GPS/A measurement helped discover an anomalously large displacement of the shallowest portion of the interplate fault during the 2011 Tohoku-oki earthquake (Kido et al. 2011; Sato et al. 2011). In order to understand the spatial and temporal development of the postseismic deformation of such large interplate earthquakes, geodetic observation in and around the focal area is extremely important.
Against this background, Tohoku University has extended the GPS/A network around the source area of the 2011 earthquake (Fig. 1a) with 20 additional sites using a new type of transponder. This effort was funded by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) in 2012 (Kido et al. 2015). The new transponder differs from the old type in the height of its acoustic element above the seafloor and in its acoustic directivity; in particular, the extended acoustic directivity enables a transponder on the far side of the array from the ship to respond reliably during a walk-around observation. Each GPS/A site is composed of three to six ocean-bottom transponders arranged in a triangle, square, or double triangle. The array position of the site is defined as the centroid of the constituent transponders after the position of each transponder has been determined. To estimate the precise position of each array, we collected two types of data through different measurement styles: a walk-around measurement around each array to determine each transponder position, and a point measurement at the center of the array to estimate the centroid of the array. For acoustic ranging measurement, a maximum correlation (MC) method between the transmitted and returned signals is usually applied to detect two-way traveltimes (TWTs). Although this conventional analysis worked well for acoustic data obtained with the old instruments, TWTs detected with the new instruments showed a significant traveltime residual, indicating that the maximum peak in the correlogram was misread as the true peak. This misreading is possibly caused by the later peak having a larger amplitude than the true peak in the correlogram (shown in Fig. 1e, f, h, j). These misreadings, which amount to as many as thousands of shots for each transponder, were reanalyzed manually; therefore, human errors may have been introduced into the checked dataset.
Moreover, because this procedure has to be performed for all 20 sites, manual reanalysis takes a large amount of time. To solve these issues and obtain highly reliable results, new analysis methods need to be developed. Beyond the present demand, an automated analysis system will in future be useful for real-time GPS/A measurement without cruise observations and therefore for monitoring seafloor motions for earthquake early warning.
In this study, we introduce newly developed methods that automatically process acoustic signals with high precision, and we verify their validity.
Problem of the waveform of the observed signals
In GPS/A measurement, the seafloor transponder records the signal transmitted from the ship into its internal memory and then returns the recorded data to the ship. Our GPS/A technique adopts a 10-kHz carrier wave, encoded by binary phase-shift keying every two cycles with a seventh-order M-sequence, which amounts to a total duration of 25.4 ms (127 chips of 0.2 ms each). The TWT is determined from a correlogram obtained by cross-correlating the transmitted and returned signals. Correlograms collected at the extended sites are often split into two envelopes of direct and later arrivals (Fig. 1e–j). This characteristic of the correlograms tends to be visible and identifiable at deep sites (e.g., G06 and G07); however, at shallow sites, the true peak is difficult to read because the later envelope overlaps the first (e.g., at G14). One possible explanation for this depth-dependent feature is that higher-frequency signals are selectively attenuated, especially at the timing of the phase change in the carrier wave, due to inelastic absorption in seawater for longer ranges and hence deeper sites.
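The cross-correlation step and the effect of a strong later arrival can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing code used in this study: the feedback taps of the shift register, the arrival delays (`DIRECT`, `MULTIPATH`), and the two amplitudes are hypothetical, and the 100-kHz sampling rate is the value quoted later in the text.

```python
import math

FS = 100_000                      # sampling rate (Hz), value quoted later in the text
FC = 10_000                       # carrier frequency (Hz)
SAMPLES_PER_CHIP = 2 * FS // FC   # two carrier cycles per chip -> 20 samples

def m_sequence(order=7):
    """Maximal-length sequence from a Fibonacci shift register.
    The feedback taps (last two register bits) are an assumed choice."""
    state = [1] * order
    chips = []
    for _ in range(2 ** order - 1):
        chips.append(state[-1])
        feedback = state[-1] ^ state[-2]
        state = [feedback] + state[:-1]
    return chips

def bpsk(chips):
    """Binary phase-shift keying: the carrier sign follows each chip."""
    return [(1.0 if chips[n // SAMPLES_PER_CHIP] else -1.0)
            * math.sin(2 * math.pi * FC * n / FS)
            for n in range(len(chips) * SAMPLES_PER_CHIP)]

def correlogram(tx, rx, n_lags):
    """Cross-correlation of rx against tx, scaled by the energy of tx."""
    energy = sum(x * x for x in tx)
    return [sum(t * rx[i + lag] for i, t in enumerate(tx)) / energy
            for lag in range(n_lags)]

def add_arrival(rx, tx, delay, amplitude):
    """Add a delayed, scaled copy of tx to rx in place."""
    for i, t in enumerate(tx):
        rx[delay + i] += amplitude * t

tx = bpsk(m_sequence())
N_LAGS = 250
rx = [0.0] * (len(tx) + N_LAGS)
DIRECT, MULTIPATH = 100, 160          # hypothetical delays in samples (0.6 ms apart)
add_arrival(rx, tx, DIRECT, 0.5)      # weaker direct arrival (assumed amplitude)
add_arrival(rx, tx, MULTIPATH, 0.7)   # stronger later arrival (assumed amplitude)

corr = correlogram(tx, rx, N_LAGS)
mc_pick = max(range(N_LAGS), key=corr.__getitem__)   # maximum-correlation pick
print("MC pick:", mc_pick, "samples; offset from direct arrival:",
      (mc_pick - DIRECT) / FS * 1e3, "ms")
```

In this synthetic case the maximum-correlation pick lands on the later arrival rather than the direct one, which is the kind of misreading discussed in this section.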
We investigated the variation in the appearance of the split envelopes in more detail by displaying shot-gathered correlograms as a “pasteup” or “record section,” a display commonly used in seismic surveys. We examined the data obtained in July 2013 at G06 (Fig. 1a, depth: 4770 m), which is composed of four transponders (Fig. 2a). At this site, the measurement started at the center of the array. The ship (1) walked around the array counterclockwise from G06-1, (2) turned at G06-1 after one round and walked around clockwise, and then (3) went back to the center of the array via G06-1 (Fig. 2a). After the round trips, the point observation was conducted at the center of the array. During the observation, the carrier wave was transmitted every 30 s. Figure 2b shows the pasteup of correlograms for the transponder G06-1, aligned so that the TWT based on the maximum amplitude detected by the MC method is at T = 0. During analysis, we generally took the maximum peak in the first envelope group as the true peak, considering the sidelobes in the synthetic correlogram (Fig. 2c). This assumption may not always be correct; however, the important point is to pick the same peak across all ranging shots. Shot #568 (Fig. 2f) is a good illustration of a case in which the true arrival peak was identified correctly. On the other hand, the influence of envelope splitting can be seen in shots #405, #486, and #2058 (Fig. 2d, e, g), where later peaks have amplitudes greater than that of the true peak. We found that the maximum peak fell in either the first or the later envelope for certain periods, and that this strongly depended on the relative spatial geometry between the ship and the transponder (Fig. 2a, b). In the case of transponder G06-1, the correlation coefficient was at a maximum near the first envelope during shots #200–350 and #550–700 (Fig. 2b, e, f), where the ship was walking around the far side of the transponder, i.e., around G06-3 (Fig. 2a).
However, it shifted to the later envelope during the other periods, when the ship was close to the transponder (Fig. 2a), resulting in misreadings (Fig. 2d, e, g). The time lag between the first and secondary envelopes also varies from ~0.8 ms at the far side to ~0.4 ms at the near side. Thus, the path difference of the secondary envelope (hereafter called the multipath) appears to be on the order of 30–60 cm depending on the incident angle, which is most probably due to reflection off the glass sphere itself rather than the seafloor, as illustrated in Fig. 3. We are not sure why the multipath problem, including its depth dependency, is prominent only in the new seafloor transponder. It may be related to the difference in directivity of the acoustic element in the transducer (ca. ±60° (−10 dB) for the new one, considerably wider than the ca. ±45° of the previous one), or to the difference in the geometrical position of the transducer relative to the glass sphere. Regardless, this misreading of ~0.8 ms during the walk-around observation and ~0.4 ms during the point observation corresponds to ~60 and ~30 cm in range, respectively, and may degrade the reliability of the position calculation of each transponder.
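As a quick check of these figures, the conversion from a TWT error to a slant-range error can be sketched as follows. The nominal sound speed of 1500 m/s and the function name are assumptions made for illustration only; an actual analysis would rely on a measured sound-speed profile.

```python
SOUND_SPEED_M_S = 1500.0  # assumed nominal sound speed in seawater (not from the text)

def twt_error_to_range_error_cm(dt_ms, sound_speed=SOUND_SPEED_M_S):
    """Convert a two-way traveltime error (ms) into a one-way slant-range error (cm)."""
    one_way_s = (dt_ms * 1e-3) / 2.0      # half of the two-way time, in seconds
    return sound_speed * one_way_s * 100.0

for dt_ms in (0.4, 0.8):
    print(f"{dt_ms} ms -> {twt_error_to_range_error_cm(dt_ms):.0f} cm")
```

Under this assumed sound speed, misreadings of 0.4 and 0.8 ms map to about 30 and 60 cm, in line with the values quoted above.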
To improve the quality of GPS/A data, improvements to the instruments and/or the data processing methods can be considered. As mentioned above, the likely source of the multiple reflections, judging from the time lag between the direct and multipath signals, is the surface of the pressure-resistant glass sphere housing the acoustic control unit of the transponder (Fig. 3). Such a multipath effect must be identified and reduced in actual field experiments. However, even if improving the acoustic unit of the instrument reduced the multipath effect, replacing or repairing the instruments at all sites would require considerable money and time because each site comprises 3–6 transponders. Thus, we developed a new solution that avoids misreading large-amplitude multipath signals, and we provisionally applied it to waveform analysis.
We designed two different algorithms that avoid misreading of the maximum peak caused by the MC method. Both methods detect a true TWT by reprocessing the correlograms obtained by the MC method.
Peak subtraction (PS) method
This method detects the TWT by subtracting pseudo peaks that exceed an arbitrary threshold \(C_{\min}\). The detailed process is as follows.
(1) Calculate the autocorrelation function \(f_{\mathrm{syn}}(t)\) (Fig. 3a) of the synthetic signal.
(2) Calculate the observation correlation function \(f_{\mathrm{obs}}(t)\) by taking a cross-correlation of the synthetic and returned signals. Then, derive the correlation coefficient \(C_0\) of the maximum peak and its traveltime \(t_0\) (denoted by an arrow in Fig. 3b).
(3) Normalize \(f_{\mathrm{syn}}(t)\) by \(C_0\), subtract the normalized \(f_{\mathrm{syn}}(t)\) from \(f_{\mathrm{obs}}(t)\) after aligning \(f_{\mathrm{syn}}(t)\) and \(f_{\mathrm{obs}}(t)\) at \(t_0\), and obtain \(f_1(t)\).
(4) Determine the maximum peak \(C_1\) and its traveltime \(t_1\) (denoted by an arrow in Fig. 3c).
(5) Normalize \(f_{\mathrm{syn}}(t)\) by \(C_1\), align the peaks of \(f_{\mathrm{syn}}(t)\) and \(f_1(t)\) at \(t_1\), and then subtract \(f_{\mathrm{syn}}(t)\) from \(f_1(t)\) to obtain \(f_2(t)\) (Fig. 3d).
(6) Steps 2 to 5 are iterated until \(C_n < C_{\min}\) (Fig. 3). The smallest \(t_n\) is taken as the true traveltime \(t_p\). Any \(f_{\mathrm{obs}}(t)\) with \(C_0 < C_{\min}\) is excluded from the analysis.
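The iteration described in the steps above can be sketched as follows. This is a simplified illustration rather than the implementation used in this study: the synthetic correlogram is replaced by an idealized pulse (triangular envelope times a 10-kHz carrier sampled at 100 kHz, without sidelobes), and the peak positions, amplitudes, `max_iter`, and function names are hypothetical.

```python
import math

def f_syn(center, n):
    """Idealized synthetic correlogram with unit peak at sample `center`.
    Assumed shape: triangular envelope (20 samples per side) times the carrier."""
    return [max(0.0, 1 - abs(k - center) / 20) * math.cos(2 * math.pi * (k - center) / 10)
            for k in range(n)]

def peak_subtraction(f_obs, c_min, max_iter=10):
    """Return the earliest picked peak (sample index), or None if C_0 < c_min."""
    residual = list(f_obs)
    n = len(residual)
    picks = []
    for _ in range(max_iter):
        t_n = max(range(n), key=residual.__getitem__)   # position of the maximum peak
        c_n = residual[t_n]                             # its amplitude
        if c_n < c_min:
            break                                       # stop once C_n < C_min
        picks.append(t_n)
        scaled = f_syn(t_n, n)                          # template aligned at t_n
        residual = [r - c_n * s for r, s in zip(residual, scaled)]
    return min(picks) if picks else None                # smallest t_n is kept

# Hypothetical correlogram: direct arrival (0.45) at sample 100,
# stronger multipath arrival (0.60) at sample 160.
N = 300
f_obs = [0.45 * d + 0.60 * m for d, m in zip(f_syn(100, N), f_syn(160, N))]

mc_pick = max(range(N), key=f_obs.__getitem__)
print("MC pick:", mc_pick)
print("PS pick, C_min = 0.2:", peak_subtraction(f_obs, 0.2))
print("PS pick, C_min = 0.5:", peak_subtraction(f_obs, 0.5))
```

In this toy case a threshold below the direct-arrival amplitude recovers the earlier peak, while a threshold above it stops after the first subtraction and keeps the later peak, which mirrors the threshold dependence of the method.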
Cluster analysis (CA) method
This method applies a specific traveltime correction to each group of similar correlograms. All observation correlograms are grouped by cluster analysis using the k-means method (Hartigan and Wong 1979), in which the user determines the number of groups in advance.
(1) Determine the observation correlation function of each returned signal.
(2) Calculate the cross-correlation between the observation correlation functions for all combinations.
(3) Perform cluster analysis using the k-means method (Hartigan and Wong 1979) on the database obtained in the preceding step (Fig. 5a). The number of groups into which the database should be divided is determined by trial and error; we used 20 groups in this analysis, which is large enough to represent most types of waveforms. It should be noted that the number of groups does not significantly affect the final result if the number is sufficient.
(4) In each group, choose as the master correlogram the one whose average cross-correlation coefficient with all the other correlograms in the group is highest.
(5) Determine \(\Delta t\) between the true peak in the first envelope group and the maximum peak in the master correlogram (Fig. 5b). \(\Delta t\) equals zero if the correlation coefficient of the direct arrival is the largest.
(6) Obtain the true TWT of each slave correlogram by applying the correction \(\Delta t\) at the lag where the cross-correlation coefficient between the master and slave correlograms is largest (Fig. 5c).
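The steps above can be sketched on a handful of synthetic correlograms. The sketch makes several assumptions that should be read as such: the waveforms are idealized two-pulse correlograms, the clustering is a plain Lloyd-type k-means with farthest-point seeding applied to the rows of the similarity matrix (not the Hartigan and Wong algorithm), only two groups are used, and the known direct-arrival index of the master (`true_idx`) stands in for the manual reading of \(\Delta t\).

```python
import math

def pulse(shift, amp, n=200):
    """Idealized correlogram pulse (assumed shape): triangular envelope times carrier."""
    return [amp * max(0.0, 1 - abs(k - shift) / 20) * math.cos(2 * math.pi * (k - shift) / 10)
            for k in range(n)]

def correlogram(direct_idx, a_direct, a_multi, gap=60):
    """Direct arrival plus a later arrival `gap` samples behind it."""
    return [x + y for x, y in zip(pulse(direct_idx, a_direct), pulse(direct_idx + gap, a_multi))]

def ncc(a, b, lag):
    """Normalized correlation between a[i] and b[i + lag]."""
    num = sum(a[i] * b[i + lag] for i in range(len(a)) if 0 <= i + lag < len(b))
    return num / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def best_match(a, b, max_lag=20):
    """Largest coefficient and the lag at which it occurs."""
    return max((ncc(a, b, lag), lag) for lag in range(-max_lag, max_lag + 1))

def kmeans(rows, k, iters=20):
    """Plain Lloyd iteration with farthest-point seeding (a stand-in, see lead-in)."""
    dist = lambda p, q: sum((x - y) ** 2 for x, y in zip(p, q))
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(dist(r, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def choose_master(ids, sim):
    """Group member with the highest mean similarity to the other members."""
    if len(ids) == 1:
        return ids[0]
    return max(ids, key=lambda i: sum(sim[i][j] for j in ids if j != i) / (len(ids) - 1))

true_idx = [80, 84, 78, 82, 77, 85]            # hypothetical direct-arrival samples
amps = [(0.60, 0.30), (0.60, 0.32), (0.58, 0.30),   # direct arrival dominant
        (0.40, 0.60), (0.42, 0.60), (0.40, 0.62)]   # later arrival dominant
waves = [correlogram(t, d, m) for t, (d, m) in zip(true_idx, amps)]

sim = [[best_match(a, b)[0] for b in waves] for a in waves]   # step 2
labels = kmeans(sim, k=2)                                     # step 3
picked = {}
for g in set(labels):
    ids = [i for i, lab in enumerate(labels) if lab == g]
    m = choose_master(ids, sim)                               # step 4
    master_max = max(range(len(waves[m])), key=waves[m].__getitem__)
    delta_t = true_idx[m] - master_max                        # step 5 (manual in the text)
    for s in ids:                                             # step 6
        lag = best_match(waves[m], waves[s])[1]
        picked[s] = master_max + lag + delta_t
print(labels, picked)
```

In this toy data set the correction is zero for the group in which the direct arrival is largest and equals the envelope separation for the other group, so every slave inherits the reading made once on its master.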
The result of the PS method depends on the choice of \(C_{\min}\): a larger \(C_{\min}\) tends to select a peak with larger amplitude, while a smaller \(C_{\min}\) selects a more appropriate one (Fig. 4). In contrast, the CA method determines a unique \(\Delta t\) for each group (Fig. 5). Several values of \(C_{\min}\) should be examined to find a suitable one, as shown in Fig. 6. Both the PS and CA methods work automatically except for the steps of determining the threshold in the PS method and selecting \(\Delta t\) in the CA method. Details of the results are discussed in the next section.
Results and verification
We obtained TWTs using the MC, PS, and CA methods. The corrections obtained by the new methods and the pasteups of the corrected correlograms are shown in Figs. 6 and 7, respectively. From the pasteup produced with the MC method, we found that the accepted multipath peak, which is the largest-amplitude peak in the secondary envelope, was delayed by 0.4–0.8 ms relative to the true peak (Fig. 2b). In contrast, the TWTs corrected by both new procedures, and the correlograms aligned with them, showed no significant misidentification (Fig. 7c, d).
Figure 6a–c compares the time lag between the peaks picked by the MC and PS methods. The TWTs obtained with the threshold \(C_{\min}\) = 0.30 were dispersed over more than three carrier periods and mainly fell on a peak in the later envelope, even during the point observation, whereas almost all outputs for \(C_{\min}\) = 0.15 and 0.20 were distributed in the first envelope and dispersed over roughly one or sometimes two wavelengths. This indicates that a large threshold (0.30) makes the result unstable because peaks around the direct arrival with coefficients smaller than the threshold are overlooked, whereas a smaller threshold reliably picks peaks in the first envelope. For the shots during the walk-around observation (Fig. 6b, c), a dispersion within one period is still recognizable and seems to arise from differences in the degree of correlation of each shot. The distribution of the lag time given by the CA method (Fig. 6d) is also dispersed during the walk-around observation and is similar to the cases of \(C_{\min}\) = 0.15 and 0.20 (Fig. 6b, c), but shots in the same group (plotted in the same color) line up on an individual peak. Thus, the dispersion of the peaks accepted by the PS method shown in Fig. 6b, c probably reflects slight differences in the correlograms between neighboring groups. On the other hand, the lag times from the CA method during the point observation are closely aligned because almost all correlograms are classified into a single group (Fig. 6d), whereas those from the PS method are still dispersed (Fig. 6b, c). Because \(\Delta t\) is currently determined by manually reading the direct wave in the master correlogram during the CA procedure, it is natural to find a difference in lag time of one wavelength between the new methods. This is also recognizable in Fig. 7d, the pasteup after aligning the correlograms with the corrected TWT at T = 0, as a gap of one wavelength between neighboring groups around shots #150–350 and #450–700.
This systematic offset arises because \(\Delta t\) is not determined quantitatively, i.e., the peak of the direct wave in the master correlogram is misread. The TWT improvements of up to 0.8 ms during the walk-around observation and 0.4 ms on average during the point observations (Figs. 6, 7) are consistent with 60 and 30 cm in slant range, respectively. Therefore, we conclude that both devised procedures successfully avoid the misidentification of the multipath peak that occurred in the MC procedure; however, gaps of roughly one or sometimes two wavelengths between neighboring shots or groups remain.
Incidentally, assuming that the TWTs determined by the PS method are the true ones, we find peaks that alternate around T = 0 with a time difference of less than ~0.01 ms in both types of observation: the walk-around observation at shots #0–800 and the point observation after shot #800 (Fig. 7c). We consider that this gap is possibly caused by an instrumental limitation due to the 100-kHz sampling rate. The sample nearest to a peak in the digital data can be off by a maximum of 0.005 ms (half of a sample interval) from the peak top. As a result, the TWT of the maximum peak in \(f_{\mathrm{obs}}\), as well as that of each subtracted correlogram \(f_n\), would be shifted to the time of a nearby sample. In brief, the PS method would produce this gap during iteration. Therefore, we concluded that the cause of the apparent gap in the peaks is an instrumental issue. A recording system with a higher sampling rate for returned signals would be required to improve the results of the PS method. On the other hand, such instability in TWT estimation was not recognized in the results of the CA method (Fig. 5d). This difference originates from the difference in the approach to handling correlograms, i.e., the CA method applies a single correction value within a group, whereas the PS method processes each correlogram individually. Thus, the difference between the two methods deviates from one wavelength by up to one sampling interval. From the viewpoint of picking the same peak across all ranging shots, the CA method is more robust than the PS method.
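The half-sample timing error discussed above can be illustrated with a short sketch. The pulse shape (triangular envelope of ±0.2 ms times a 10-kHz carrier) and the peak time of 1.2345 ms are hypothetical values chosen only to show the effect of 100-kHz sampling.

```python
import math

FS_HZ = 100_000.0          # sampling rate quoted in the text
DT_MS = 1e3 / FS_HZ        # sample interval: 0.01 ms

def sampled_peak_time(true_peak_ms, n_samples=400):
    """Time (ms) of the largest sample of an idealized correlogram pulse
    whose continuous peak sits at true_peak_ms (assumed pulse shape)."""
    def value(k):
        t = k * DT_MS - true_peak_ms                         # time relative to the peak, ms
        envelope = max(0.0, 1 - abs(t) / 0.2)                # one chip (0.2 ms) per side
        return envelope * math.cos(2 * math.pi * 10.0 * t)   # 10 kHz = 10 cycles per ms
    k_best = max(range(n_samples), key=value)
    return k_best * DT_MS

true_peak = 1.2345                                           # hypothetical peak time, ms
error_ms = sampled_peak_time(true_peak) - true_peak
print(f"picked {sampled_peak_time(true_peak):.2f} ms, error {error_ms:+.4f} ms")
```

The picked time snaps to the nearest sample, so the error stays within half a sample interval (0.005 ms), which is the size of the alternation described in this paragraph.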
Next, we verified the accuracy of the absolute TWTs identified by the newly developed methods. We examined the traveltime residual between the observed signals and synthetic signals estimated from a fixed transponder position. In Fig. 8b, we found that the residuals from the PS method occasionally jumped by ~0.4 ms (e.g., at shot #200 in Fig. 8b), most probably because of low correlation. Such jumps occurred in 0.06 % of the total number of shots, and such anomalous data would be excluded from the subsequent geodetic analysis. On the other hand, the residuals of the absolute TWT derived by the CA method showed a narrower distribution than those determined by the PS method. The comparison of the TWT residual distribution at each transponder also shows that the CA method determines the true TWT in a more stable manner than the PS method (Figs. 9A(b, d), 9B(b, d), 10A(b, d), 10B(b, d)), as shown in Fig. 6. In any case, the difference between the residuals of the two results (\(\Delta dt\)) was only ~0.07 ms, less than or equal to one wavelength (Figs. 9A(c), 9B(c), 10A(c), 10B(c)), which corresponds to a distance of ~5 cm in slant range in the case of G06. This difference is approximately a single shot for one transponder, and therefore, both methods succeeded in decreasing traveltime dispersion.
In summary, the new methods only rarely caused a slight gap in the peak of the observation correlogram (Fig. 7c, d); they determined TWTs near the identical peak and stably and effectively corrected the misreadings of the MC method. The new methods improved TWT residuals by 0.4–0.8 ms, consistent with 30–60 cm in slant range, compared with those derived by the MC method (Figs. 7, 8, 9, 10). We therefore conclude that the developed methods greatly improved analysis precision. In addition, we suggest that parallel processing with these methods would allow comparison and verification of the reliability of the results.
Discussion for future work
Finally, we discuss current issues with the new methods. Several problems must be solved before the proposed methods can be automated. The PS method requires a threshold (\(C_{\min}\)) on the correlation coefficient to search for the direct peak in the correlogram. The user must determine this threshold before analysis. If the threshold is too large, a peak near the direct signal with a coefficient lower than the criterion might be overlooked (Fig. 6a). In contrast, if the threshold is too small, a peak in front of the direct arrival may be detected. To accurately determine the true peak without overlooking peaks with coefficients lower than the threshold, the optimum threshold must be estimated quantitatively. We consider that the decrease in correlation is probably caused by a decrease in the signal-to-noise ratio, which depends on the environment and on distance attenuation. Nevertheless, \(C_{\min}\) values of 0.15–0.20 determined a peak within one or two wavelengths, with little difference from the CA output (Fig. 6b–d); that is, they can determine the TWT of direct arrivals with a precision of one period of the correlogram even though the slant range varies by up to ~1.5 times the shortest range (note that the maximum incident angle exceeds 45°). On this point, we therefore conclude that the suggested \(C_{\min}\) values are comparably stable in both walk-around and point observations no matter how deep the site is. At least from the data of G06 examined in this study, it is difficult to find (or rather extract) the effect of environmental dependence. Further investigation of waveforms recorded under different environments is necessary to select the optimal \(C_{\min}\) quantitatively. Meanwhile, in the CA method, it may be difficult to determine a suitable number of groups. Moreover, the time correction \(\Delta t\) might also be unsuitable because it is defined manually.
To overcome the limitations of the CA method, the most suitable number of clusters should be considered and \(\Delta t\) must be detected objectively. To fully automate the analyses, it is necessary to compensate for their weak points, i.e., by determining the threshold in the PS and \(\Delta t\) in the CA methods, through a detailed analysis of the waveform and result stability, respectively. One possible way is to detect the \(\Delta t\) of the master wavelet by the PS method.
However, although the threshold problem remains, the procedure for GPS/A data can become almost completely automated. Full automation will be useful for processing enormous volumes of waveform data. Recently, the GPS/A technique has changed its observation style from offline campaign observation by cruises to online data transfer via sea-surface stations and satellites. Therefore, fully automatic processing could lead to real-time monitoring of seafloor displacement that can contribute to an earthquake and tsunami early warning system and provide priceless information on geophysical phenomena of the seafloor.
Problems in TWT detection exist for received signals collected at GPS/A seafloor sites deployed in 2012. The most significant problem with the conventional MC method has been the misreading of a multipath arrival with the largest-amplitude peak as the true peak in the observed correlogram. We verified this by creating pasteups of the correlograms and found that the amplitude and peak splitting of the correlograms vary depending not only on the water depth of the sites but also on the incident angle of the transmitted signal, i.e., the relative spatial geometry between the ship and the transponder. To avoid the harmful influence of the large multipath signal and to improve the precision of the transponder position detection, we designed two methods that reanalyze the observation correlograms obtained by the MC method. The pasteups of the correlograms after reprocessing showed that both new methods accurately identified the peak around the direct arrival. The comparison of TWTs estimated by the new methods suggests that their differences converged within ~0.07 ms, less than or equal to one wavelength of the correlogram. Therefore, parallel processing using these methods will help verify the reliability of the identification of direct arrivals. We believe that the new techniques proposed in this study are effective for high-precision TWT detection in acoustic data processing. Furthermore, we recommend using pasteup views for visually verifying the validity of analysis results. Further improvement of the remaining subjective parameters of these methods will contribute to the full automation of GPS/A data processing and, in future, to the real-time monitoring of seafloor displacement to provide precise information on seafloor phenomena for an earthquake and tsunami early warning system.
Hartigan JA, Wong MA (1979) Algorithm AS 136: a k-means clustering algorithm. Appl Stat 28:100–108
Kido M, Osada Y, Fujimoto H, Hino R, Ito Y (2011) Trench-normal variation in observed seafloor displacements associated with the 2011 Tohoku-oki earthquake. Geophys Res Lett 38:L24303. doi:10.1029/2011GL050057
Kido M, Fujimoto H, Hino R, Ohta Y, Osada Y, Iinuma T, Azuma R, Wada I, Miura S, Suzuki S, Tomita F, Imano M (2015) Achievement of the project for advanced GPS/acoustic survey in the last four years. In: Hashimoto M (ed) International symposium on geodesy for earthquake and natural hazards (GENAH). Int Assoc Geod Symp, vol 145. Springer, Heidelberg. doi:10.1007/1345_2015_127
Sato M, Ishikawa T, Ujihara N, Yoshida S, Fujita M, Mochizuki M, Asada A (2011) Displacement above the hypocenter of the 2011 Tohoku-Oki earthquake. Science 332(6036):1395. doi:10.1126/science.1207401
Wessel P, Smith WHF (1998) New, improved version of the Generic Mapping Tools released. Eos Trans AGU 79(47):579
RA suggested the method for the visual inspection of correlograms, assessed the result of each method, and drafted the manuscript. FT suggested and developed the CA method and analyzed the data. TI suggested and developed the PS method. MK arranged the seafloor geodetic network and collected GPS/A data. RH contributed to discussions on the scientific content and suggested revisions to the manuscript. All authors read and approved the final manuscript.
The GPS/A surveys and benchmarks were financially supported by MEXT, Japan. This work was also partly supported by the Council for Science, Technology, and Innovation, the Cross-Ministerial Strategic Innovation Promotion Program, and the “Enhancement of social resiliency against natural disasters” program (fundamental agency: JST). We used the waveform analysis tool “PASTEUP” (personal communication with Dr. G. Fujie) for viewing and editing correlograms. We would like to thank Editage (www.editage.jp) for English language editing. All figures were prepared by using Generic Mapping Tools (GMT 4.5.3) (Wessel and Smith 1998).
The authors declare that they have no competing interests.
Azuma, R., Tomita, F., Iinuma, T. et al. Development and examination of new algorithms of traveltime detection in GPS/acoustic geodetic data for precise and automated analysis. Earth Planet Sp 68, 143 (2016). https://doi.org/10.1186/s40623-016-0521-2
- Marine geodesy
- GPS/acoustic positioning
- Acoustic ranging
- Two-way traveltime
- Multipath effects
- Cluster analysis