Forward modeling of quake’s infrasound recorded in the stratosphere on board balloon platforms

Acoustic waves generated by seismic waves contain information on the internal structure of planets, and can be sensed by pressure sensors onboard high-altitude balloons. To identify the various contributions (infrasound signal, noise, balloon response, etc.) in such pressure records, a full waveform modeling is implemented and completed by infrasound ray tracing and additional data analysis. Here, we analyze the Stratéole-2 pressure data associated with two earthquakes (Garcia et al. Geophys. Res. Lett. 49(15):e98844, 2022) and compared these to full waveform simulations by SPECFEM2D-DG-LNS software. Even if our simulations do not precisely reproduce the waveform observed in the frequency range [0.05, 0.3] Hz, we show that the waveform presents more sensitivity to quake and internal structure parameters than to atmospheric structure, and that seismic surface wave dispersion is observed in balloon pressure records. The long-duration pressure oscillations observed after the main infrasonic signal cannot be fully reproduced by our one-dimensional input model even when source time function complexity and after-shocks are considered. These features are ascribed mainly to the complex vertical ground movements below the balloon and partly to late secondary infrasound arrivals excited by the interactions of seismic waves with the topography. These results enhance the advantages and limitations of quake-related infrasound observations on board terrestrial and planetary balloon platforms.


Introduction
Seismic activity is a source of pressure disturbances in the atmosphere due to ground movements (Lognonné et al. 1998).These acoustic waves, especially in the low frequencies range ( ≤ 10 Hz) are detectable at more than a thousand kilometers from their source (Drob et al. 2003).At those frequencies, other natural events such as volcanic eruptions (Johnson and Ripepe 2011;Thelen et al. 2022), microbaroms (Le Pichon et al. 2004;Bowman and Lees 2018;De Carlo et al. 2021), meteor entry (Edwards 2009;Ens et al. 2012) are also detectable.Thus, the observation of infrasound in the Earth's atmosphere represents a real challenge in various fields of geophysical applications.These observations can be extended to the study of other planets.For instance, on Venus which is a hostile planet for seismometers because of its high surface temperature and pressure conditions (Wood et al. 1968), the study of infrasound can provide a potential source of information on the seismicity and internal structure of the planet (Garcia et al. 2005;Stevenson et al. 2015;Krishnamoorthy et al. 2020;Garcia et al. 2021).
Hence, the detection in the Earth's atmosphere of infrasound generated by ground movements following quakes can be used as a first step toward similar detection on other planets.Such signals can be probed in the upper atmosphere using airglow variations as markers of infrasound (Garcia et al. 2009;Sutin et al. 2018).Another possibility is to sense the pressure variations induced by the infrasound on board atmospheric balloons.Many efforts have been made recently to develop onboard pressure sensors on stratospheric balloons (Bowman and Lees 2015;Bowman et al. 2017;Lees and Bowman 2017;Krishnamoorthy et al. 2020) and to demonstrate the concept of balloon-borne observations of infrasound on Earth (Krishnamoorthy et al. 2018(Krishnamoorthy et al. , 2019;;Garcia et al. 2021;Brissaud et al. 2021;Bowman et al. 2022).These efforts have been recently granted by the detection of a magnitude 4.2 quake at epicentral distances smaller than 100 km (Brissaud et al. 2021), and by the detection of two quakes of magnitudes larger than 7.0 by a network of long-duration balloons in the stratosphere with epicentral distances up to 3000 km (Garcia et al. 2022).
This study aims at providing full waveform simulations of the pressure records of infrasound generated by these two large quakes through the SPECFEM2D-DG-LNS numerical code based on a hybrid high order continuous/discontinuous Galerkin discretization of the coupling between the solid Earth and the windy atmosphere (Martire et al. 2022).This forward modeling approach will allow us to better understand the data content, validate the simulation tool and estimate the relative amplitudes of the infrasound generated below the balloon with respect to those created by ground movement at the epicenter of the quake.
"Data" presents the different types of data recorded for the two earthquakes by seismometers and pressure sensors on board balloons.We present in "Simulation tool and post-processing methods" the modeling tool, the atmosphere and ground models, the earthquake source parameters and the parameters of the 2D simulations used to simulate the two earthquakes, as well as the corrections applied to convert 2D simulations into 3D synthetics.The comparison between simulations, ground data, and balloon flight data are shown in "Results".Additional data analysis and ray tracing simulations are performed in "Discussion" to interpret the long coda, which is not properly reproduced by full waveform simulations.Conclusions and prospects are described in "Conclusions".

Data
The core data set of this study is composed of pressure records on board balloons of the Stratéole-2 project (Haase et al. 2018).During its first campaign in fall 2021, the Stratéole-2 project launched seventeen super-pressure balloons flying at constant altitude in the 18-20 km range.Each flight was equipped with a TSEN instrument (Podglajen et al. 2014) which integrated pressure sensors continuously acquiring data with a sampling frequency of 1 Hz.Data are available at Hertzog (2023).Some sensors were sufficiently near the two earthquakes considered in this study: the first one occurred in northern Peru on 28 November 2021 at 10:52:25 UTC and was detected by balloon number 09 (983 km away from the source), the second one occurred in Flores Sea on 14 December 2021 at 03:20:35 UTC and was observed with a good signal-tonoise ratio by balloon number 17 (675 km away from the source) and balloon number 16 (1736 km away from the source) (Garcia et al. 2022).Figures 1 and 2 are providing the geometrical positions of the quakes and sensors, respectively, for the Peru and Flores quakes.To add confidence in our synthetics, we use records of the seismometers located below the balloon sensors.Seismometer CZSB from the Brasilia University network, and located in the Amazonian basin, 583 km away from the epicenter, is used for the Peru quake.Although CZSB is further from the point below the Balloon 09 than other stations (NNA, LPAZ), it has a similar local crustal structure, and the CZSB signal seems more in line with the balloon's observations (Figs.S1 and S2).Consequently, the choice of stations to be selected for our analysis should, to some extent, focus on a station whose underlying crustal structure is similar to that beneath the balloon, rather than proximity to the balloon.
Fig. 1 Background map displays the position of the Peru quake epicenter (blue rectangle), the balloon (red circle), and the seismic station (green triangle).The inserts are describing the normalized source time function and centroid of the quake (a), the internal structure model given by CRUST2.0(b) and the atmosphere model (c) used in the simulations.In b, the dark green curve is associated with the density, the plain light green curve with the S-wave velocity and the light green curve with the P-wave velocity Seismometers SANI and DAV, respectively, from GEO-FON and GSN networks, and located on oceanic islands, are used for the quake in Flores Sea because they were located approximately below Balloon 17 and Balloon 16 during this event, respectively, at 684 and 1723 km from the quake.The instrument response of the seismometers is removed, and the signals are filtered in the 0.02-1 Hz range before analysis for the seismometers and in the 0.03 − 0.5 Hz range for the balloon-borne pressure sen- sors, since the noise is above the infrasound signal at frequencies below 0.03 Hz (Garcia et al. 2022).

SPECFEM2D-DG-LNS, a numerical tool for modeling wave propagation
We simulate both earthquakes using the SPECFEM2D-DG-LNS software (Martire et al. 2022).This software models the 2D wave propagation in coupled solid-fluid systems.It solves the elastic equation for the solid part and the linearized Navier-Stokes equations for the fluid part, including wind and acoustic attenuation.
The integration in time is explicit.We use a 5-stage 4th-order low-storage Runge-Kutta scheme (Carpenter and Kennedy 1994).The integration in space is different depending on the area.Continuous Galerkin spectral elements are used for the solid part, whereas discontinuous Galerkin spectral elements are used for the fluid part.Stability conditions require a time step dt = 0.004 s for a spatial element dx = 1 km.Each ele- ment is defined by 5 nodes in each direction of space to respect the dispersion condition and to correctly sample the different wavelengths of the seismic and acoustic signals.
Figure 3 shows snapshots of the modeling of the Flores quake and illustrates the propagation of different waves in a dispersive medium.The medium is here defined by vertical gradients of the model properties in both the Earth's shallow layers and the atmosphere.We distinguish several types of infrasound: those produced after the S-wave (S-generated infrasound) and surface waves, and the spherical infrasound generated at the epicenter, called "epicentral infrasound".

2D simulation domain
Let E be the epicenter of the studied earthquake, and R the receiver for which we have data.We model the waveform in a 2D plane linking R and E and perpendicular to the plane tangent at the surface of the Earth in E. The distance between E and R corresponds to the distance between the epicenter and the receiver, considering the Earth's curvature.
The domain contains a solid part and a fluid part.The limits of the domain are x = [−1300, 1900] km for Peru simulation, x = [−100, 3100] km for Flores simulation, and z = [−300, 130] km.

Atmospheric model
The atmospheric state is assumed constant along the x-axis.We extract the atmospheric state at the location and time of both earthquakes from the ERA5 database of ECMWF (Hersbach et al. 2023) for the 77 first kilometers and from MSISE (Picone et al. 2002) for altitudes between 92 km and 130 km.Between 77 km and 92 km, we use spline interpolation.Wind is projected in the plane perpendicular to the earth's surface, containing the source and receiver.The sound speed and the resulting effective sound speed are shown as a function of altitude in Figs.1c and 2d, e, respectively, for the simulation with Balloon 09, Balloon 17 and Balloon 16.

Internal structure model
As for the atmospheric model, the internal structure model varies only along the z-axis.We extract the density, the P-wave and S-wave velocities for the shallow layers at the receiver location from the CRUST2.0seismic model (Bassin 2000).We use the AK135 internal structure model (Kennett et al. 1995) to define the attenuation parameters Q κ and Q µ and the mantle density and wave velocities.Figures 1b and 2b, c describe the internal structure models chosen down to a depth of 40 km.

Seismic source
We do not use the same source modeling method for the Flores earthquake as for the Peru one, due to the difficulty to simulate the Flores earthquake.
Note that the moment tensor is 3D.The projection in the 2D plane is possible with only the parameter M xx , M xz and M zz of the 3D moment tensor in the far-field PSV system, according to formula (12) of Li et al. (2014).
Concerning the Flores earthquake, since two major aftershocks of Mw 5.7 and 5.8 happened in the 15 min following the main shock, we decide to model these two aftershocks (Supendi et al. 2022).Thus, three different simulations must be performed with three different source time functions and moment tensors.The three resulting synthetics acquired at the receiver are summed to get the complete signal.For the main shock, Fig. 4 illustrating the source time function given by SCARDEC highlights the difficulty to model the complexity of the earthquake source.We choose to use a simplified version of the SCARDEC source time function (illustrated in Fig. 4, because the SCARDEC function excites frequencies larger than 0.1 Hz that are not observed on DAV and SANI seismometer records.Thus, the source time function is approximated by the sum of three gaussians.Each gaussian is given by the following equation: where T max is the final time of the simulation, t 0 = 1.2 f 0 and f 0 are respectively a time delay and the dominant and representative frequency of the source that define the size of the gaussian.f 0 is equal to 0.07, 0.09, and 0.11 Hz, respectively, for the three different gaussians.The sum of the three gaussians is normalized to scale the synthetic source to the full moment tensor. (1) The aftershocks source time functions are approximated by only one gaussian (following Eq. 1).The dominant frequency of the source ( f 0 ) is computed from the seismic moment estimated by Supendi et al. (2022) by using the relation between these two parameters defined by GlobalCMT (Dziewonski et al. 1981;Ekström et al. 2012).To ensure consistency, this relation is compared to GlobalCMT estimates of all the earthquakes of magnitude 5.7 and 5.8 between 2020 and 2024.
Concerning the moment tensor, we convert the strike, dip, and rake parameters to get the 3D moment tensors of the main shock and the aftershocks that are projected along the station azimuth for 2D simulations.
Table 1 summarizes the different parameters we use.

2D to 3D conversion
The corrections of 2D synthetics for a comparison to 3D data are implemented with formula (3) of Li et al. (2014).These corrections include geometrical spreading effects and source effects.
Table 1 Source parameters for modeling the main shock and first two main aftershocks of the Flores quake on the 2021/12/14 Source time function of the main shock is described in "Simulation tool and post-processing methods" and Fig. 4 Name Time (UTC)

Scaling factor correction
The ground velocity and pressure record synthetics of the aftershocks of the Flores earthquake are multiplied by scaling factors to fit the observations at each station, detailled in Table 2.These scaling factors for synthetic records of the ground station SANI and DAV are chosen so that the amplitude of the synthetics matches that of the observations.The scaling factors for the Balloon 17 and Balloon 16 synthetic records are identical to those chosen for SANI and DAV synthetic records 2.
No scaling factor is used for the modeling of the Peru earthquake.

Peru quake
Usually, synthetic signals are compared with observations in a fairly low frequency range (period greater than 20-30 s).However, for the purposes of this study, we cannot use this frequency range for balloon signals, due to the high noise level below 0.03 Hz.For this reason, we subsequently compare time series in the frequency band [0.05, 0.3] Hz.However, Figure S3 shows an adequate fit of our synthetics with seismometer data in the [0.02, 0.06] Hz frequency band for the ground station CZSB: amplitude of synthetics corresponds to the one of the observations, arrival times of synthetics match the observed arrival time at CZSB ground station with 1 s of difference, the frequency content of synthetics is similar to that of the data.Figure 5 presents a comparison of data and synthetics for the ground station signal (a-c) and for the balloon signal (d-f).Note that a correction factor (detailed in the figure legend) is applied in (a-d) to compare the waveform with observations.The first minutes of ground displacement and velocity signals are reproduced by simulations: the arrival times of the P-waves and S-waves are reproduced within 1 s.The synthetics amplitude is also reproduced (see Fig. 5 where the scaling factor 1 is applied to synthetics).The The cyan curve in panels (e) and (f) of Fig. 5 corresponds to the upward propagation of the observed vertical velocity recorded by the ground seismometer.Assuming that the atmosphere is non-absorptive, homogeneous and windless, we use the conservation of the energy to express the vertical velocity at altitude z as a function of the vertical velocity at ground level.We also use the acoustic impedance to convert the velocity into pressure.The system of equations and the resulting expression of pressure are described as follows: where ρ 0 and c 0 are, respectively, the atmospheric den- sity and the sound speed at ground level, P 0 and A v are, respectively, the pressure perturbation at ground level and the ground vertical velocity.P (z) , ρ(z) and c (z) are the pressure perturbation, the density and the sound speed at altitude z.
We also add a delay, corresponding to the infrasound propagation time from the ground to the balloon altitude and the propagation time over the distance that still needs to be covered to reach the balloon's location.
The upward propagation of these ground oscillations reproduces better the frequency content (Fig. 5f ) and the arrival times and amplitudes of late arrivals (Fig. 5e) than the full waveform simulations performed in a one-dimensional model.The similarity between balloon records and upward propagated vertical ground velocity indicates that the pressure waves are mainly generated below the balloon and that their atmospheric propagation does not distort their waveform.
Thus, the infrasound signal and its frequency content are more sensitive to the ground parameters than to those of the atmosphere.

Flores Sea quake
The Flores Sea quake was observed by two balloons.For each balloon, the same analysis as for the Balloon 09 is performed.Figure 6 is for Balloon 17 and the SANI ground station, whereas Fig. 7 is for Balloon 16 and the DAV ground station.Note that a correction factor (detailed in the legend of each figure) is applied in panels (a), (b), (c) and (d) to compare the waveform with observations.Figures S5 and S6 present the results of the modeling for the ground station at low frequency.
First, Figures S5 and S6 show two different points at low frequency.First, the study at low frequency confirms that we can estimate amplitudes, arrival time and frequency content with small errors, either for SANI or DAV seismometer.However, the waveform for the DAV instrument is not properly reproduced, the long coda ( 2) frequency content between [0.02, 1] Hz is similar between synthetics and records of the vertical velocity recorded by CZSB.In particular, the peak of the source time function around 0.25 Hz is observed for both data and synthetic.However, the seismometer records exhibit oscillations that are not reproduced in the simulations after 10:56:30.This signal is probably related to either seismic wave reflections on 3D structures present in this subduction zone area, or waves trapped in the Amazonian sedimentary basin.
The waveforms associated with pressure waves observed by Balloon 09 (Fig. 5d, f) are not as well reproduced as those at the ground stations.First, the amplitude of Balloon 09 synthetics is approximately 1.71 smaller than the real records.Then, the estimation of the arrival time is more complex in the case of the balloon-borne sensors than the ground station, as the amplitude of the S-waves is not very significant compared to the noise level.On the other hand, different arrival trains (around 10:55:57, 10:57:11,10:58:18) are observed in both observations and synthetics.Concerning the frequency content, the energy peak around 0.25 Hz is reproduced and associated with the energy peak in the source time function.Between 0.03 and 0.3 Hz, the frequency content is similar.Pressure observations are thus sensitive to the frequency content of the source time function.However, synthetics present a lack of energy at the frequencies greater than 0.3 Hz.
Figure S3 describes in more detail the seismic signal at CZSB and the main infrasound signal at Balloon 09.Despite a significant noise level at lower frequencies for the Balloon 09 records and the horizontal distance separating the ground station and balloon sensor, we can see some similarities between the spectrogram of the Balloon 09 and the one of CZSB.Synthetics of the Balloon 09 signal reproduce the excitation around 0.2 Hz at 10:57:00 and 10:58:30.The quake being deep, no dispersion is visible in the data.
2D simulations in the frequency range [0.05, 0.3] Hz cannot precisely reproduce the full observed waveform but estimate seismic information like arrival time, amplitude, energy peaks in the frequency content.
in SANI and DAV data cannot be explained either.This illustrates the greater complexity in modeling the Flores Sea quake rather than the one of Northern Peru.
Besides, since sensors onboard balloons present noise at frequencies lower than 0.03 Hz, the analysis is extended to the comparison of synthetics and observations at frequencies between 0.05 and 0.3 Hz.We break down our analysis by first studying the synthetics of the ground stations (SANI and DAV) and then those of the pressure sensors on board the balloons (Balloon 16 and Balloon 17).

Ground stations (SANI and DAV)
Concerning the time series recorded by SANI seismometer in Fig. 6a, b, simulations reproduce important quake characteristics: synthetics arrival times of P-waves and S-waves are less than 1 s from the ones in the observation of the SANI instrument.Ratio of S-waves over P-waves amplitudes are similar in synthetics and data (25.06 for synthetics against 26.46 for observations).However, the synthetics amplitude is underestimated compared to the observations of SANI sensor since a correction factor of 0.51 is applied to synthetics in panels a and b of Fig. 6.
Moreover, the first wave trains are well estimated until 3:27:04.These wave trains are associated with the three gaussians of the source time function of the main shock.This last point illustrates the sensitivity of the observations to the frequency content of the source time function.However, the synthetic signal from 3:27:04, associated with aftershocks, is not correctly reproduced, although the aftershock signals correspond to certain wave trains in the observations.First, the amplitude of the first aftershock is 9 times smaller than the observation (Table 2).For the second aftershock, although the scaling factor is only 4, the frequency content is different from that of the observations: the aftershocks have too many high frequencies compared with the observations.Furthermore, the frequency content of the synthetics in Fig. 6c presents similarities with the one of SANI data.The maximum energy is between 0.1 and 0.15 Hz either for synthetics or data.Synthetics and data show the same peaks at the frequency of 0.08, 0.12, and 0.15 Hz.These peaks are close to the ones present in the frequency content of the source time function.However, synthetics have more energy around 0.3 Hz.This energy is associated with signals generated by aftershocks.
Concerning the time series recorded by DAV seismometer in Fig. 7a, b, vertical displacement and velocity are poorly estimated: amplitude of DAV synthetics is multiplied by a correction factor of 0.16 and estimated arrival time of P-waves are 14 s late, despite the good estimation at low frequency.Nonetheless, some wave trains in the synthetics of the DAV instrument correspond to the ones The first three wave trains are associated with the three gaussians in the source time function of the main shock.Moreover, the same remark applies to aftershocks.Their amplitudes are too small (scaling factor ≥ 10 in Table 2) to explain the observed signal.
Concerning the frequency content of DAV signal, even though frequencies between 0.05 and 0.08 Hz are not excited enough, the frequency content presents similar patterns as shown by Fig. 7c (peaks around 0.08, 0.11, 0.17 and 0.24 Hz).These peaks can be related to the source time function frequency content.
This study emphasizes that, when considering a frequency range [0.05, 0.3] Hz, 2D simulations are not sufficient for retrieving the observed full waveform recorded by SANI or DAV seismometer, but it still allows to get some information like arrival time and amplitudes with relatively small errors.Considering the aftershocks have not significantly improved our synthetic estimation: either the associated frequency content is too high, or the time series amplitude does not match the observations.The difference between simulations and observations can be explained by the peculiar situation of this quake.Indeed, the SCARDEC source time function shows several shocks that are unusual.If these shocks are badly estimated, simulations will present a different waveform and frequency content from observations.The finiteness of source can oversimplify synthetics.In addition, the complex geography of the area with the back arc fault (as illustrated by Hall (2002); Audley-Charles (2004); Supendi et al. (2020)) can be a source of 3D effects impacting waveforms.As a fact supports this hypothesis, if we consider the first two surface wave trains in Figs.S5 and S6, the time separating these two wave trains is 1 min 50 s for SANI recording and 3 min 30 s for DAV recording.The time between the two wave trains evolves with distance, making the hypothesis of 3D effects more plausible than source effects.

Pressure sensors on board balloons (Balloon 17 and Balloon 16)
Figures 6d-f and 7d-f present the analysis for pressure sensors onboard Balloon 17 and Balloon 16.
First, the upward propagation of the ground vertical velocity recorded at SANI and DAV station shows a good agreement with the data of Balloon 17 and Balloon 16 respectively, in terms of frequency content (Figs.6f, 7f ), arrival times and amplitudes of main arrival and late arrivals (Figs.6e, 7e).It clearly demonstrates, also in this second example, that the arrivals after the main seismic surface wave infrasound can be mainly explained by the complex vertical ground movements below the balloon.
Concerning the full waveform simulations plotted in Fig. 6d, synthetics of Balloon 17 instrument are able to reproduce the arrival time of S-waves (difference of 1 s with the data), and the first four wave trains at 3:24:50, 3:25:40, 3:26:24 and 3:27:12.These wave trains are associated with the peaks in the source time function of the main shock.Moreover, late arrivals (after 3:28:00) in the Balloon 17 data are not explained by aftershocks because periods and amplitude of the observed late arrivals do not match the synthetic aftershock signals.The estimated amplitude of Balloon 17 records is consistent with the error done in the ground station simulation.
The frequency content of Balloon 17 is similar to that of the observations (Figs.6f ): high frequencies are ten times smaller than the signal between 0.07 and 0.15 Hz.The frequency content of the synthetics presents energy peaks at some frequency (0.08, 0.11, 0.15 Hz), these peaks are related to the frequency content of the source time function.However, these energy peaks do not match those in the data.
As for DAV ground station simulations, the synthetic waveform of Balloon 16 instrument is different from data, even though some wave trains (at 3:32:33 and 3:34:32) are identified in synthetics and data.
The frequency content of the synthetics of Balloon 16 is similar to data in the frequency range 0.09 and 0.25 Hz.Some energy peaks at frequencies of 0.08 and 0.15 Hz are present in the data and synthetics spectrum.These peaks are close to those of the source time function.
Finally, for the main arrival, the synthetic waveform differing from the observations and the source dependence in frequency content point toward an insufficiently precise estimation of the source time function.Moreover, the late arrivals clearly observed on both ground seismometers and balloon pressure records cannot be ascribed to aftershocks, but can possibly be ascribed to reflections of seismic surface waves on 3D structures in this subduction area.
The same conclusions as in the Peru quake case apply for this case: • 2D simulations are not able to reproduce the full waveform in the frequency band [0.05, 0.3] Hz but are able to reproduce some characteristics (arrival time, energy peak in the frequency content).• Pressure signals are sensitive to the frequency content of the source time function.• The infrasound signal and its frequency content are more sensitive to the ground parameters than to those of the atmosphere.

Time-frequency analysis
Figures 8 and 9 present a more in-depth analysis of the time-frequency content through spectrograms computed with continuous wavelet transform method.
All these spectrograms illustrate well the points raised above.The similarity of the spectrograms from ground stations with those from pressure sensors highlights the fact that propagation in the atmosphere does not alter the  time-frequency content of the signal as much as propagation in the ground.Moreover, synthetics show similar patterns to those of the observations, but the synthetics excite more at frequencies higher than 0.1 Hz.
In addition, a white curve has been added in each panel of Figs. 8 and 9 to represent the dispersion of the seismic surface wave train.The white curve, called the "dispersion curve", is deduced from the synthetics of each sensor (SANI, Balloon 17, DAV and Balloon 16), then reproduced on the spectrogram of the observed data.To get the white curve, we narrow-band filter the synthetic signal in each frequency interval shown in the spectrogram.Then, we select the time of the maximum of the envelope of the filtered signal.Finally, we plot on the spectrogram the 5-point moving mean of the acquired dispersion curve.The analysis of the four spectrograms of Figs. 8 and 9 point out two points.First, the dispersion curve acquired from synthetics (either from ground station or balloon instrument) in panels (c) and (d) fits, respectively, the observations in panels (a) and (b).Then, the dispersion curve of synthetic records of balloonborne instruments in panel (d) is similar to that obtained from synthetic records of ground sensors in the panels (c).Consequently, the time-frequency content of the synthetics of the Balloon 16 and Balloon 17 instruments present patterns of the dispersion of the seismic surface wave train.Note that the procedure for obtaining the dispersion curve is automatic and may be applied to the data without any knowledge of the underlying internal structure.On the other hand, this study underline the fact that we should restrict ourselves to the first 2-3 min after the first arrival, so as not to consider the aftershocks signal.Consequently, the method may encounter difficulties if an aftershock wave arrives within this time range at the receiver.In addition, we can see on Fig. 9b some energy at low frequency.This energy can be ascribed to dispersion effect despite the noise level of the instrument.Finally, pressure sensors onboard high-altitude balloons are instruments that can reveal information about dispersion and thus the internal structure of the crust, just as seismometers.However, this method is only available if dispersion occurs.Thus, it could not be done for the Peru quake, since spectrograms do not show a clear dispersion (Fig. S4).

Discussion
As shown in the previous section, the SPECFEM2D-DG-LNS software may partly explain the frequency content and arrival time of the first and main signal recorded by pressure sensors which correspond to the arrival of infrasound generated by seismic surface waves.However, the balloon records contain energy during a much longer duration, up to 25 min after the first arrival.We investigate in this section potential other phenomena explaining these signals.

Secondary infrasound sources
Secondary infrasound sources produced by seismic waves interacting with mountains ranges (Le Pichon et al. 2003;Shani-Kadmiel et al. 2018) or sedimentary basins (Marchetti et al. 2016) are potential sources to explain the late arriving signals on the balloons.To investigate what parts of the signals can be impacted by these sources, we compute ray trajectories and propagation times between the ground and the balloons by using the InfraGA software (Blom andWaxler 2012, 2017).To simplify the computations, we locate the source signal at balloon position, invert the horizontal winds and determine what position at ground level is receiving the signal for rays defined on a grid of 5 • in back azimuth and 0.5 • in inclination angle.An example of ray trajectories between the Balloon 17 and the ground level is provided in Fig. 10 for an azimuth of 300 • (ground emission point at an azimuth of 300 • relative to North at balloon position).Figure 10b clearly exhibits all the possible ray trajectories between the ground and the balloon.First, rays coming straight from below the balloon appear with the shortest propagation times.Then a clear shadow zone is visible extending over more than 150 km.We do not expect a significant contribution from infrasound scattered in the shadow zone because these signals are usually observed at frequencies above 0.3 Hz (Vorobeva et al. 2023).The first arrivals after this shadow zone are coming from the wind duct at 60 km altitude with positive inclination angles at the balloon position, followed by rays refracted in the stratosphere at negative inclination angles.Then, various arrivals from the thermosphere duct with positive inclination angles at the balloon position are preceding the rays with a rebound on the Earth's surface.This simulation confirms that all the infrasounds arriving within the first 5 min after the first arrival are generated just below the balloon.
To analyze further these simulations, we project all emission points at ground level on a map and select only the rays emitted by topographic features above sea level for Flores case and at altitudes larger than 500 m for Peru.This selection is justified by the fact that only topography slopes can emit rays at inclination angles smaller than 90 • .In addition, we select only rays with a transmission loss larger than −15 dB relative to the trans- mission loss of the first arrival, and we add to the infrasound propagation times the seismic wave propagation time between the quake and the ground emission point.The origin points and arrival times of these infrasounds are presented in Figs. 11,12,and 13, respectively, for Balloons 17 and 16 of the Flores quake, and Balloon 09 of the Peru quake.For the records of the Flores event, the secondary infrasound can explain the duration of the signals because some energetic arrivals are expected up to 20 min after the first arrival from oceanic islands.However, the energy between 03:28 and 03:33 on Balloon 17 and between 03:35 and 03:44 for Balloon 16 is within the shadow zone and cannot be explained by secondary infrasound.Particularly for these two balloons, a second energy packet with a dominant frequency above 0.1 Hz is observed just after the main arrival produced by seismic surface waves.
For the Peru quake (Fig. 13), secondary infrasounds are predicted to be radiated by the Andes but arrive more than 15 min after the first arrival in a time range for which no clear pressure waves are observed above the background noise.
Overall, the secondary infrasound created by interactions between seismic waves and topography slopes can partly explain the long coda observed for the Flores quake in the time range 8-35 min after the first infrasound arrival.They have amplitudes similar to those observed in the balloon records.However, the signals in the 4-8 min range after the first infrasound arrival are in the shadow zone of these secondary infrasounds.

Epicentral infrasound
To assess whether the signal presents infrasound generated at the earthquake epicenter (so-called "epicentral infrasound"), we use SPECFEM2D-DG-LNS simulations.With this software, we can estimate the amplitude and the arrival time of the epicentral infrasound at each point of the domain.From these simulations, we cannot identify any epicentral infrasound at distances larger than 40 km.To confirm these results, ray tracing is performed with a method similar to that used in the previous section to study the secondary infrasound.Figure 14 presents an example of these computations for rays traveling in the plane passing through the epicenter and the Balloon 17.A shadow zone is present between ≈ 50 and ≈200 km.For dis- tances larger than 200 km, the balloon would be able to perceive rays that had passed through the thermosphere and partly through the stratosphere.However, the transmission coefficient for these rays is at least 20 dB less than the infrasound generated under the balloon.Their amplitude is too small to be observed.In addition, some rays are also connecting the ground surface to the balloon at large distances by passing through the stratosphere wave guide.However, these rays correspond to low inclination angles at the surface and cannot be excited by seismic waves which mainly emit in the vertical direction, at high inclination angles.Overall, both full waveform simulations and ray tracing demonstrate that epicentral infrasounds are difficult to measure by balloons in the stratosphere when epicentral distances of the balloons are larger than 60 km due to their small amplitude.

Balloon response to acoustic forcing
The balloon oscillating around its equilibrium altitude can be described by a pendulum oscillating at a period around 210 s.Forcing this system in the period range 2-25 s can force the harmonics of the pendulum (Garcia et al. 2022), but it is low pass filtering the forcing and it cannot thus produce the additional oscillations observed 4-8 min after the first arrival.

Perturbations induced by balloon oscillations
Even in the absence of forcing by acoustic waves, the analysis of pressure records reveals that the frequency range 0.04 − 0.09 Hz is perturbed by balloon oscillations around its equilibrium position.Examples of such perturbations are provided in Fig. 15 for Balloon 16 records before Flores quake.Such perturbations are generated mainly when the gondola, located about 13 m below the balloon, is reaching the minimum altitude of the balloon at the previous oscillation (red vertical bars on panel 15d) and disappear when the balloon vertical velocity is zero at the apex of the balloon (magenta vertical bars on panel Fig. 15d).As observed in this example, these perturbations are greatly reduced when the vertical oscillation amplitude is low, and enhanced when the balloon presents large oscillations.This effect is mainly observed for balloons at 18 km altitude (Balloons 16 and 17) but much lower for balloons at 20 km altitude (Balloon 09).Such perturbations are tentatively associated with pressure perturbations induced by the up and down movements of the balloon due to either air flow or vortex shedding behind the balloon envelope.However, these perturbations are not observed above 0.1 Hz, and thus cannot explain the signals above 0.1 Hz for the Peru quake and their amplitude in the 0.04 − 0.09 Hz range is below 0.03 Pa, which is much lower than the observed signals during Flores quake on Balloons 16 and 17.

Seismic forcing not explained by 2D full waveform simulations with a 1D structure model
As shown in Figs.5e, 6e and 7e, and partly discussed, the late infrasound arrivals are mainly explained by vertical ground movements below the balloons that are not reproduced by the 2D full waveform simulations with a 1D structure model.As determined in "Results", these signals cannot be explained by aftershocks or by complexities in the source time function of the main shock.These signals are therefore likely due to additional seismic waves not considered in our full waveform modeling and possibly induced by surface wave reflections on the numerous plate boundaries present in this region.This is particularly true for the Flores quake which presents the longest coda on both ground sensors and balloon records.Even if contributions by secondary infrasound cannot be completely excluded, their absence on Balloon 09 which is close to the Andes, as well as the low Simulations with 3D internal structure and 3D atmosphere models including topography are required to explain the full waveform.Inchin et al. (2021Inchin et al. ( , 2022) ) have applied such a simulation approach to total electron content signal using GNSS techniques, to observe the signal in the atmosphere following an earthquake.Furthermore, we use point sources to model the earthquake sources located in a plane containing both the source and the receiver, which does not seem to be accurate enough to reproduce adequately the observed wave signals.For large earthquakes (magnitudes greater than 6), rupture extent processes involving radiation patterns and directivity effects should be considered (Pacor et al. 2005;Böse et al. 2012).Obviously, the directivity of the sources is not always aligned with the 2D plane of the simulation, and finite fault rupture process in extended faults should be taken into account.A more realistic source time function should take into account the finite path of the rupture, the rupture velocity, the distance between the receiver and each point defining the plane of the rupture, the apparent corner frequency defined as a function of the apparent duration of the rupture, the angle between the direction to the receiver and the direction of rupture propagation.The total source spectrum over the finite fault length can be seen as the integration of the different spectra over the path of the fault plane for all the point sources that define the fault plane.This defines an envelope that decays in frequency squared ( f 2 ).Simi- lar frequency decays are observed, for instance, in the data spectra obtained for the Flores quake studied here (see the spectra decays recorded at SANI and DAV on Figs. 6 and 7 after 0.1 Hz and 0.05 Hz, respectively).Besides, the fault rupture processes and the heterogeneity of the rupture can be also simulated as a series of interconnected sub-faults triggered successively which provides different frequency contents and amplitudes at the different locations on or near the plane of the rupture as well as at the receivers locations depending on if they are in the direction of the rupture propagation or not.For instance, at low frequency, the solutions can be explained by a whole system of faults; while at high frequency, the recorded signals can be mainly explained by site effect due to shallow sub-surface or by shallow crust heterogeneities as well as by the complexity of the fault rupture.For more details about these methods related to fault rupture process and propagation modeling, we can refer the reader to the interesting works and summaries of Pacor et al. (2005) and Malagnini et al. (2021).In our 2D version, many of these parameters could be tested in a future study, but we should pay particular attention to the crucial issue related to the angle between the direction of the rupture and the direction of the receiver.This is not trivial because of the 3D nature of the real configuration.However, this is beyond the scope of this study because the 3D version of the Linearized Navier-Stokes atmosphere modeling is not yet coupled to the SPECFEM3D modeling tool.

Conclusions
The simulations of infrasound pressure records generated by quakes estimate with small errors the arrival time, amplitudes and frequency content of the first signals observed by sensors on board Stratéole-2 balloons.In particular, the dispersion patterns of infrasound generated by seismic surface waves are reproduced.We show that the structure of the ground has a greater impact on the high-altitude pressure waveforms than the atmosphere.Moreover, our simulations demonstrate the strong sensitivity of these pressure signals to seismic source parameters, in particular the frequency content of the source time function.On the other hand, since the signals recorded by the balloon sensors present a good signal-to-noise ratio only in the 0.05 − 0.3 Hz frequency range, the waveform estimation is more complex than the analysis of seismometer signals at low frequencies.The balloon-borne pressure oscillations observed after the main infrasound signals are not reproduced by our full waveform simulations in a one-dimensional model.We show that, even if the first two aftershocks generate signals during the long coda of the observation, their frequency content and magnitude cannot explain the observed signal.Further analysis demonstrates that these signals are coming mainly from the complex vertical ground movements below the balloons with a minor contribution from secondary infrasound sources for time ranges larger than 10 min after the main arrival.Overall, the pressure records of seismic infrasound on board stratospheric balloons can be interpreted as records of vertical ground velocity of the surface just below the balloon.These sensors act like a seismometer in the atmosphere, but they are sensitive to noise sources intrinsic to the balloon/gondola system.
This study demonstrates the interest of these quake infrasound observations for terrestrial and planetary applications.In particular the capability of these records to recover a surface wave dispersion curve which is a marker of internal structure.However, further work is required to understand the relative influences of the 3D effects and the balloon/gondola system on the pressure records of the infrasound wave field.In addition, in the future, we would like to integrate fault rupture modeling to better model the source to improve the data fit by synthetic waveforms.

Fig. 2
Fig. 2 Background map displays the position of the Flores quake epicenter (blue rectangle), the balloons (red circle), and the seismic stations (green triangle).The inserts are describing the normalized source time function and centroid of the quake (a), the internal structure models given by CRUST2.0 (b, c) and the atmosphere model (d, e) used in the simulations.b, d Are used to simulate the SANI seismometer and the Balloon 17 as pressure sensor and (c) and (e) are used to simulate the DAV seismometer and the Balloon 16 pressure sensor.In b and c, the dark green curve is associated with the density, the plain light green curve with the S-wave velocity and the dotted light green curve with the P-wave velocity

Fig. 3
Fig. 3 Snapshots of the simulation of the Flores quake for the Balloon 17 pressure sensor observation.Snapshots represent vertical velocity in the ground and pressure perturbations in the atmosphere.Red indicates values larger than 1% of maximum amplitude, while blue indicates values smaller than −1% .The yellow cross indicates the position of the source, while the green dots indicate the position of the receivers.The width and height of the figure correspond to 1885 km and 430 km, respectively.PW P-wave, SW S-wave, SfW Surface waves, EI epicentral infrasound, SI S-generated infrasound, SfI infrasound generated by surface waves

Fig. 4
Fig. 4 Illustration of the source time function (STF).a shows the temporal source function.SCARDEC STF is the solid gray line and the STF used in our simulations is the dotted blue line.b shows the ASD of these source time functions.Added to this is the amplitude spectral domain (ASD) of the vertical velocity ( m • s −1 • (Hz) −1 ) observed at the SANI station (solid dark red line) and the ASD of the vertical velocity observed ( m • s −1 • (Hz) −1 ) at the DAV station (dotted orange line)

Fig. 5
Fig. 5 Comparison of observation and synthetics for the CZSB seismometer (on the left) and for the Balloon 09 pressure sensor (on the right).a shows vertical displacement on the ground station CZSB, whereas b illustrates the vertical velocity.c is the spectral analysis of vertical velocities.d, e highlight the pressure observed at the location of Balloon 09.f is the spectral analysis of pressure waves.In all panels, blue curves are associated with synthetic data and red curves with observed data.Cyan curves are the upward propagation of pressure perturbations induced by the observed ground vertical velocity.In a, b, d, e black vertical solid lines show the estimated arrival time of P-waves issued of the main shock and the two aftershocks, whereas black vertical dashed lines highlight the S-wave estimated arrival times.In c, f, the black dashed curve represents the ASD of the time derivative of the source time function.The vertical black dotted line shows the frequency peak of the source.A [0.05, 0.3] Hz band-pass filter is applied for signals in a, b, d, e, a [0.02, 1] Hz band-pass filter is applied for signals in c, and a [0.03, 0.5] Hz band-pass filter is applied for signals in f.Gray areas in the time series plot delimits the signal that is used for the spectral density analysis.Gray area in the spectrum illustrates the band-pass filter used for the time series.Synthetics are multiplied by a scaling factor of 1. in a and b for comparison purposes, by 1.71 in d and e. Upward propagation of CZSB records is multiplied by 0.33 in e

Fig. 6
Fig. 6 Comparison of observations and synthetics for seismometers (on the left) and for the balloon pressure sensor (on the right).a shows vertical displacement on the ground station SANI, whereas b illustrates the vertical velocity.c is the spectral analysis of vertical velocities.d, e highlight the pressure observed at the location of Balloon 17. f is the spectral analysis of pressure waves.In all panels, blue curves are associated with synthetic data and red with the observed data.Cyan curves are the upward propagation of pressure perturbations induced by the observed ground vertical velocity.In a, b, d and e, black vertical solid lines show the estimated arrival time of P-waves issued of the main shock and the two aftershocks, whereas black vertical dashed lines highlight the S-wave estimated arrival times.In c and e, the black dotted curve represents the ASD derivative of the source time function.A [0.05, 0.3] Hz band-pass filter is applied for signals in a, b, d and e, a [0.02, 1] Hz band-pass filter is applied for signals in c, and a [0.03, 0.5] Hz band-pass filter is applied for signals in f.Gray areas in the time series plot delimits the signal that is used for the spectral density analysis.Gray area in the spectrum illustrates the band-pass filter used for the time series.Synthetics are multiplied by a scaling factor of 0.51 in a, b for comparison purposes, by 1.88 in d and e. Upward propagation of SANI records is multiplied by 4.08 in e

Fig. 7
Fig. 7 Comparison of observation and synthetics for seismometers (on the left) and for the balloon pressure sensor (on the right).a shows vertical displacement on the ground station DAV, whereas panel b illustrates the vertical velocity.c is the spectral analysis of vertical velocities.d, e highlight the pressure observed at the location of Balloon 16. f is the spectral analysis of pressure waves.In all panels, blue curves are associated with synthetic data and red with the observed data.Cyan curves are the upward propagation of pressure perturbations induced by the observed ground vertical velocity.In a, b, d and e, black vertical solid lines show the estimated arrival time of P-waves issued of the main shock and the two aftershocks, whereas black vertical dashed lines highlight the S-wave estimated arrival times.In c and f, the black dotted curve represents the ASD derivative of the source time function.A [0.05, 0.3] Hz band-pass filter is applied for signals in a, b, d, e, a [0.02, 1] Hz band-pass filter is applied for signals in c, and a [0.03, 0.5] Hz band-pass filter is applied for signals in f.Gray areas in the time series plot delimits the signal that is used for the spectral density analysis.Gray area in the spectrum illustrates the band-pass filter used for the time series.Synthetics are multiplied by a scaling factor of 0.17 in a, b for comparison purposes, by 0.56 in d, e. Upward propagation of DAV records is multiplied by 3.72 in e

Fig. 9
Fig. 9 Comparisons of spectrograms of synthetic and observed data acquired at the DAV station position and at the Balloon 16 position.a, c show, respectively, the wavelet spectrograms of observed and synthetic data recorded at the DAV station.b, d show, respectively, the wavelet spectrograms of the observed and synthetic data recorded at Balloon 16.The white curve illustrates the dispersion of the seismic surface waves generated by the main shock

Fig. 8
Fig. 8 Comparisons of spectrograms of synthetic and observed data acquired at the SANI station position and at the Balloon 17 position.a, c show respectively the wavelet spectrograms of observed and synthetic data recorded at the SANI station.b, d show respectively the wavelet spectrograms of the observed and synthetic data recorded at Balloon 17.The white curve illustrates the dispersion of the seismic surface waves generated by the main shock

Fig. 10
Fig. 10 Outputs of InfraGA computations for ray tracing from Balloon 17 to the ground along azimuth 300 • .a Atmosphere model with sound speed (dashed line) and effective sound speed (plain line).b Ray trajectories color coded by relative amplitude (in dB).c Propagation time between the ground and the balloon color coded with inclination angle of rays at the balloon

Fig. 11
Fig. 11 Estimates of infrasound ray arrivals emitted by seismic waves interacting with topographic features for Flores quake and Balloon 17. a Position of emission points on the Earth's surface.b Pressure records filtered in the 0.04 − 0.3 Hz frequency range and arrival times and relative amplitude estimates from transmission loss scaled to maximum amplitude of the records (magenta vertical bars)

Fig. 13
Fig. 13 Estimates of infrasound ray arrivals emitted by seismic waves interacting with topographic features for Peru quake and Balloon 09. a Position of emission points on the Earth's surface.b Pressure records filtered in the 0.04 − 0.3 Hz frequency range and arrival times and relative amplitude estimates from transmission loss scaled to maximum amplitude of the records (magenta vertical bars)

Fig. 14
Fig. 14 Outputs of InfraGA computations for ray tracing along the quake to Balloon 17 great circle.a Atmosphere model with sound speed (dashed line) and effective sound speed (plain line).b Ray trajectories color coded by relative amplitude (in dB).c Propagation time between the ground and the balloon color coded with the inclination angle of rays arriving at the ground

Fig. 15
Fig. 15 Altitude (a), pressure (b), time derivative of pressure (c) and pressure filtered in the 0.04 − 0.09 Hz range (d) for Balloon 16 before the Flores quake.Red and blue dots indicate, respectively, maxima and minima of pressure and pressure derivatives on b, c.Red and magenta vertical bars on d indicate, respectively, times for which the gondola is reaching the altitude of the balloon at its previous minimum position, and times for which the balloon is at its maximum altitude

Table 2
Scaling factor for synthetics of the Flores quake on the 2021/12/14