Variation of teleseismic short-period waveforms at Matsushiro Seismic Array System

We investigate the origin of variation of teleseismic P phase waveforms recorded at Matsushiro Seismic Array System. Short-period, vertical component waveforms from three source regions were analyzed separately in the frequency domain. Observed variation is mainly attributed to the site effects, which are frequency dependent and strongly depend on the incoming azimuth. Such variation is not seen for the pre-event noise spectra. The secondary wavefield that is excited by the incident P phase is the likely cause of the observed waveform variation.


Introduction
Microearthquake networks that have been originally developed to monitor the local seismicity can also be utilized as large aperture arrays to study the velocity structure in the Earth's mantle (e.g., Walck, 1984;Nakanishi, 1988;Vidale and Benz, 1992;Sugiyama and Nakanishi, 2001). One major advantage to use the seismic array recordings in global structural seismology is that we can achieve an efficient signal enhancement (Rost and Thomas, 2002). Such array techniques also provide estimates of the slowness vector of these phases, which are crucial in determining their origins.
When the array techniques are applied to a massive quantity of short-period recordings, it is often implicitly assumed that the observed recordings are composed only of the signal and the random noise; by stacking the array waveforms we can suppress the incoherent noise, and the signal-tonoise ratio for the target phase would improve as the number of traces becomes large (Kanesewich, 1975). One way to represent teleseismic records is to assume that an observed seismogram, O, is an output of a linear system, O = S * P * R + N , where S, P, R and N represent the source, propagation, and receiver characteristics and the noise, respectively, and * is the convolution operator. The receiver characteristics is defined as the product of sensor response and site response. In global structural seismology, we often stack waveforms from more than one events together to enhance weak phases from deep Earth, or in other words, to enhance the propagation characteristic P. To ensure an effective enhancement, it is required to deconvolve source characteristics from each waveform prior to stacking, and we usually estimate the source wavelet empirically from the observed direct P waveforms. Such empirical estimation of the source wavelet, however, should be effec- tive only when variations of receiver characteristics, R, are small among the array stations. This is not always guaranteed, as variations of the amplitude of short-period recordings are usually large. Among two terms in receiver characteristics, we usually know sensor response parameters and difference of the sensor responses among array stations can be corrected if necessary. Effect of the remaining site effect appears both on the phase as well as on the amplitude of the spectrum, and waveform processing in time domain, which appears to be a common practice, might not be sufficient to cancel out their variations. It would be desirable to recognize how site effects distort the waveforms before we apply array techniques to the observed wavefield, since our goal in processing teleseismic waveforms is to evaluate properties of incident wavefield.
We investigate effects of the receiver characteristics on teleseismic recordings of P phases at Matsushiro Seismic Array System (MSAS) in Japan. Despite the small diameter of MSAS, correlations among the observed waveforms are not high, which clearly shows the influence of the receiver characteristics at each station (Kato and Nakanishi, 2000). Our approach is to identify the relative receiver characteristics among the array stations in frequency domain. Site effects at MSAS stations appear to be the major contributor on waveform variation, and are stably estimated from the observed spectra, which are frequency dependent and depend also on the back azimuth strongly. On the other hand, such site effects are not seen in the pre-event noise spectra. These observations imply the local secondary wavefield that is excited by the incoming (primary) wavefield is the primary constituent of the site characteristics.

Method
The observed power spectrum of the i-th event recorded at the j-th station at an array, O i j ( f ), is expressed as the product of two terms, and P j ( f ) are the spectrum of the incident wave-field for the i-th event, and of the receiver terms at the j-th station, respectively. In the notation originally used in strong motion seismology (Andrews, 1982), S i ( f ) and P j ( f ) are the source and propagation terms, respectively, the latter of which include anelastic dissipation (Andrews, 1986;Irikura, 1986, 1988). In using the array recording of teleseismic events, we cannot isolate the source characteristics from the propagation characteristics so that the incident wavefield term would include anything that is not station dependent. By taking logarithm of both sides, we can obtain a set of linear equations, log O i j ( f ) = log S i ( f ) + log P j ( f ), for each frequency. When I events are recorded at all of J stations, the linearized system is expressed as Am = d + n, where the model vector m consists of I + J unknowns, the data vector d consists of I × J observations (knowns), and n is the noise vector. At least one singular value of A is zero because of its symmetry, and the inversion for m is an underconstrained problem (e.g., Andrews, 1982;Irikura, 1986, 1988). This implies that we cannot estimate the absolute receiver characteristics solely from the observations. Two approaches are often used to solve this system, both of which are mathematically equivalent. One approach is to take spectral ratios. A reference station is designated in the array and the observed spectra for each event are normalized by the spectrum at the reference station to cancel out the incident wavefield terms. By solving the normalized set of the equations, where R indicates the reference station, we can obtain the relative receiver characteristics. In strong motion seismology, for example, a hard rock site is taken as the reference station to isolate the effect of the soft surface layers at other stations.
Another approach to solve this system of equations which we take in this study is the array average method. We estimate for the i-th source spectrum by taking the log mean of the J observed power spectra , the average of observed spectra over the array). Similarly, we estimate the receiver characteristics at the j-th station as the average of the residuals These estimates are elements of the model vector m; the residual model vector δm = m − m satisfies the system Aδm = δd + n, where δd = d − Am. We can pursue the model perturbation vector δm via inversion of the system if necessary. For the present study, the initial estimates m explained more than 90% of the observations, and considering the uncertainty of the spectral estimates (see below), we did not pursue the model perturbation vector δm. The array average method is shown successful when the number of stations is large (Wilson and Pavlis, 2000). We utilized the multitaper method in estimating the power spectra. This method is suited to estimate a spectrum of a short time series. A set of spectral estimates is obtained using tapers that are orthogonal to each other, whose weighted average is the final spectral estimate (Thomson, 1982;Park et al., 1987;Percival and Walden, 1993;Lees and Park, 1995). The first 7 tapers of 4-π prolate taper series were used in this study. An outline with the pertinent equations is presented in Park et al. (1987). We also applied a jackknife method to estimate errors in the spec-   tral estimates (Vernon et al., 1991), though this is known to provide a conservative estimate of variance (e.g., Bear and Pavlis, 1997).

Observations
MSAS, operated since 1983 by Japan Meteorological Agency (JMA) (Osada et al., 1984), is located in central Japan (Fig. 1), and is the main contributor to the global CTBT monitoring network in Japan. MSAS consists of seven stations and the diameter of the array is approximately 10 kilometers (Fig. 1). The array is located in a highland area, and the elevations of the sensors vary within the array (Table 1). At station MAT the sensor is located in the horizontal vault that houses the broadband sensors of GSN station MAJO, and at the other six stations they are installed in boreholes at depths of 40 to 70 meters (Table 1). Each station is equipped with three component velocitytype sensors of the same type, which are of the standard short-period type, with the damping factor of 0.7 and the characteristic frequency of 1 Hz (Table 1). Calibration signals are monitored in the routine operation and no severe change of these sensor parameters has been detected during our target period. Sampling frequency is 80 Hz.
Reflecting the tectonic history in this area, the surface geology at stations is heterogeneous. The compressional velocity of the drilling samples varies from 3.6 km/sec to 6.1 km/sec (Table 1), and significant lateral velocity changes, in particular irregular subsurface layers, is detected by the refraction surveys (Asano et al., 1969). Locations of teleseismic events determined solely by MSAS data with array seismology methodology often disagree with those determined by the national network of JMA or US Geological Survey (USGS) even after the observed travel times are empirically corrected for the shallow structure (Nagai and Kashiwabara, 1985;Wakui and Nishiwaki, 1986;Maki et al., 1987;Kobayashi et al., 1993).
Receiver characteristics is the product of receiver and site terms. In a narrow sense the site effect is effects of surface soft layer, as is used in strong motion seismology. Our definition of the site in this report is that everything that is station dependent and causes variation of waveforms at array stations. MSAS stations are installed at different  Fig. 3. Thick and dash lines denote estimates for P phase and pre-event noise, respectively. Shaded areas indicate ±1σ error bound for the P phase spectra, estimated by jackknife method. The signal-to-noise ratios are sufficiently high, and the variances are adequately small below 5 Hz.
buried depths and amplification at the surface for the vault station and superposition of the incoming and surface reflected waves for boreholes should contribute to differences of the waveforms. These factors should be peculiar to each station and would affect the results of the array processing, and thus are treated as elements of the site characteristics in the following analysis. We analyzed the vertical component P wave signals from the shallow events whose epicentral distances from MSAS are larger than 40 degrees (Fig. 2); waveforms from these events are less susceptible to triplicated wave propagation in the upper mantle transition zone. Events in three separate regions occuring between 1984 and 1995 were selected, Alaskan events for northeastern Japan (NEJ) path, Indonesian events for western Philippine Sea (PHS) path, and New Guinea events for eastern Philippine Sea (MAR) path (Fig. 2). Numbers of events were 20, 17, and 20, respectively. These events were carefully chosen so that the ray paths of the P phase were similar and the propagation effects were equalized within each group. The records show adequately high signal-to-noise ratio in the frequency bands that are often used in high-frequency studies of the global structural seismology (∼1 Hz), as is shown below. We set the length of the analysis time window to 6 seconds, which should be long enough to cover the source process time for the events of these sizes (mb 5.0-6.1). Small change of the window length do not change our conclusions qualitatively. Figure 3 shows examples of observed P waveforms for an Alaskan event, along with the calculated spectra (Fig. 4). A high correlation among seven waveforms is seen only at the first half cycle at the arrival of the P phase. Variation of the waveforms becomes large in the following coda part, and the correlation is extremely low. For this event the maximum amplitude in the time window for the P phase at station MAT is almost twice as large as that at IRK. Such large contrast of the amplitude between MAT and IRK is a common feature in the NEJ records. A large amplification at MAT is seen over the entire frequency range in the power spectrum (Fig. 4). The spectra of the pre-event noise waveforms vary little in these frequencies and the signal-of-noise ratio is adequately high below 5 Hz for the present dataset. The receiver characteristics apparently have an azimuthal dependence. For an Indonesian event for the PHS path, the amplitude at MAT is close to the array average (Figs. 5 and 6), which is different from the records for the NEJ path. Similar amplitude variation among the array stations is repeatedly seen in the same group, and these variations are unlikely to be due to the complex source process or the wavefield influenced by the structure in the source region. These observations also implies that differences in sensor response parameters do not solely cause the variation of observed waveforms. Data from three regions therefore were separately treated in the subsequent analysis.
We did not apply any prior constraints on the shape of the spectra of incoming waveforms, and the data for each frequency band was treated independently. The frequency range for the analysis was set to 0.2 to 5 Hz. The upper limit is determined by the signal-to-noise ratio. The lower limit is set below the frequency band that the sensor response is flat (2 Hz and up), but this frequency range is often used in recent studies of global structural seismology, and thus of our interest. Results were plotted as the normalized values relative to station JZO, which is located at the center of the array, for the purpose of comparison. We obtain similar results for cases that sensor responses are corrected beforehand using the published parameters and are cases they are not corrected, as difference of sensor responses at stations are negligible (these tabulated sensor response parameters explain at most one third of the observed variation). In the subsequent figures, we display results for cases that sensor responses are not corrected.

Results and Discussion
Estimated receiver characteristics are both functions of frequency and azimuth (Fig. 7). For NEJ path, the amplification at MAT is higher than those at the other stations, and the amplification at IRK the lowest, both of which agree with the observation in the time domain. Although the azimuths from MSAS to the source regions are only slightly different for events of the PHS and the MAR paths, results for these paths do not agree well, the difference being largest at MAT and SGD. The relatively high amplification at high frequencies at all stations would imply that the amplification at the JZO, the reference station, is lower than the array average in this frequency band, but no other common pattern is seen in the three independent results. These results indicate that the incoming azimuth is one of the controlling parameters of the receiver effects. This supports our findings with waveforms, that is, sensors are not solely responsible for variation of the observed waveforms. We adopted the jackknife approach to estimate uncertainties of these estimates. I subsets of data, in each of which the spectra for the i-th event are discarded (i = 1 ∼ I ), were made from the original dataset, and each of I subsets was analyzed for the receiver responses in the same manner. Statistics of the estimates for these subsets would provide an appropriate estimate of the error of the results with the entire dataset. Such an empirical approach is suited to test, for example, whether the signal-to-noise ratio is adequate to infer the result or whether results were influenced by a particular observation. Results for NEJ path are shown in Fig. 8 Figure 5. Thick and dash lines denote estimates for P phase and pre-event noise, respectively. Shaded areas indicate ±1σ error bound for the P phase spectra, estimated by jackknife method. The signal-to-noise ratios are sufficiently high, and the variances are adequately small below 5 Hz.
for the subsets are very small, and the major features of the results appear to be robust. The stability of the results suggests that the observed variations of the spectra are unlikely to be caused by chance due to the large noise or an unknown random wavefield. The site characteristics are most likely due to the secondary wavefield that is excited by the incident direct P phase. When the power spectra of the pre-event noise waveforms of the same length, 6 seconds, were analyzed in the same manner, we find that variation of the site characteristics are similar among three independent datasets (Fig. 9). In the first 6 second from the onset of direct P phase, energy of P phase and immediate coda is contained. Stability of the results imply that a deterministic wavefield in the P coda is included in the time window, which is likely to be excited at the onset of the P phase.
We should note that results for pre-event noise have similar shapes for three datasets at all stations; relative responses are flat below approximately 1.2 Hz, and gradually increase until 2 or 3 Hz. For conventional short-period sensors, small changes in sensor parameters produce large changes in amplitude and phase responses at the vicinity of their characteristic frequency, which is for our case 1 Hz. Though no such change has been reported, there is a possibility that sensor response parameters might have small changes from the published values, and are partially responsible for the observed spectral variations (Fig. 9). Knowledge on properties of noises at these stations would be required to further verify this idea. This approach could be used to investigate properties of array stations when sensor responses are inaccurately known.
An ideal situation for a seismic array is that the site re-  Fig. 8. Receiver response spectra that are inverted from 20 subsets of events for NEJ corridor, with a jackknife approach. These estimates agree well with each others as well as with our estimates from full dataset (Fig. 7), suggesting small variances of the estimates. sponses are similar at all stations; in such case scaling the amplitude in the time domain would effectively equalize the site effects so that a linear stacking will favorably enhance the incoming wavefield. This is probably not the case for MSAS, as the site responses for array stations are strongly site dependent and frequency dependent. Caution must be taken in detecting weak later phases when the array seismograms are contaminated by the site effects. Onsets of phases should be clear when the high-frequency energy is abundant, and stations with relative high amplifications at high frequencies might strongly influence appearance of weak phases on the stacked trace. Variation of site responses appears to be stronger in high frequencies, and it is desirable to have an empirical estimate on how severe local effects are before analyzing array data (e.g., Der et al., 1987;Der et al., 1990). Robustness of results from array processing should be tested with statistical means.
On the other hand, we demonstrate that the site effects are likely to represent the local secondary wavefield that is deterministically excited by the primary P wavefield. Stability of our estimates of the site characteristics in the frequency domain implies that the removal of the site effects in the time domain could be realistically possible. Such attempts have been made previously to detect converted phase in the source region (e,g., Yamazaki et al., 1996). We are currently working on to extend our formulation so that the deconvolution of the relative site effects from the array seismograms could be enabled.

Conclusions
We investigated receiver characteristics at MSAS in central Japan. Multiple taper was used for the spectral estimate, results of which were used to estimate the site characteristics using the array average method. Results of three subsets of data indicate that site effect is strong, frequency dependent, and azimuthal dependent. Errors that were estimated with an empirical approach are small, indicating the site characteristics represent the secondary wavefield that is deterministically excited by the primary wavefield. Lack of such features in the pre-event noises suggests that these site effects are likely to reflect the local wavefield that is excited by the direct P phase in the vicinity of the stations.