### Correction with SH models

In a second stage of the processing, we correct the measurements for the main and external magnetic fields (process 2a, 2b, and/or process 3 in Fig. 1). Explicit models relying on the parameterization of the main internal and external fields in SH are most appropriate for such data correction. The Swarm processing chains developed within the SCARF consortium (Olsen et al. 2013) provide various models for these source fields (Hamilton 2013; Rother et al. 2013; Sabaka et al. 2013). However, data corrections based on models derived from different datasets may lead to bias and errors (e.g., Sabaka and Olsen 2006). For example, the CI core field model relies on a selection that uses both night and day data, which is different from the strict LT and Sun elevation selection criteria we use here. This model also attempts to separate the core and the internal fields induced in the Earth’s mantle by the external fields (Sabaka et al. 2015). As a consequence, it is preferable to use either the full CI model or a dedicated core field model together with an additional model for the ionospheric field (Chulliat et al. 2013), to ensure that the contribution of the ionosphere induced in the Earth’s mantle is also corrected for. Additional magnetospheric field corrections can then also be implemented based on the so-called Model of the MAgnetosphere (MMA) SH model (Hamilton 2013). The Dedicated COre field model (DCO) is an attractive alternative to the CI model. It is derived from night-side data and does not separate core and internally induced fields (Rother et al. 2013). It thus provides the correction sought with a minimum risk of bias. For the current release of L2 models to ESA, the workflow was designed to ensure that the DLFI model is only computed once the DCO and CI models are available. For the present study, we use the DCO model to correct the measurements for the Earth’s core field over the time span of interest.

In order to complement this first-order correction and to allow the DLFI chain to also operate independently of the SCARF sequences of model production, an alternative processing block (process 3) is used. We derive an additional built-in model of the Earth’s main and lithospheric internal field up to SH degree 60, the main field secular variation and acceleration up to SH degrees 10 and 8, respectively, and a simple model for the magnetospheric field to SH degree 1 scaled with the hourly Dst index [details about this relatively standard parameterization may be found in Thébault et al. (2010a)]. The internal part of the model up to SH degree 15 and its magnetospheric part can then be used to correct the measurements for both these main and magnetospheric field contributions.

For the present study, since the DCO model had been made available to us, a combined approach is in fact used. The data are first corrected for the DCO model over the time span of interest (process 2a). The residuals are then used to compute a model, which is in turn used to further correct the data in the way described above (process 3). The root-mean-square (RMS) of the final residuals is then plotted within 1° longitudinal bands over all latitudes to identify satellite tracks showing abnormal RMS values.
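The RMS screening by longitude band can be sketched as follows. This is a minimal illustration, not the processing chain itself: the function names are hypothetical, and the rule used to flag a band (RMS above a multiple of the median band RMS) is an assumption, since the text only states that abnormal values are identified from the plot.

```python
import math
from collections import defaultdict


def rms_by_longitude_band(longitudes_deg, residuals_nt, band_width_deg=1.0):
    """Return {band_start_longitude: RMS} for residuals binned by longitude."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lon, res in zip(longitudes_deg, residuals_nt):
        lon = lon % 360.0  # wrap longitudes to [0, 360)
        band = math.floor(lon / band_width_deg) * band_width_deg
        sums[band] += res * res
        counts[band] += 1
    return {band: math.sqrt(sums[band] / counts[band]) for band in sums}


def flag_abnormal_bands(band_rms, factor=3.0):
    """Flag bands whose RMS exceeds `factor` times the median band RMS.

    The threshold rule is a hypothetical choice made for this sketch.
    """
    values = sorted(band_rms.values())
    median = values[len(values) // 2]
    return sorted(b for b, v in band_rms.items() if v > factor * median)
```

In practice the flagged bands would only point to the tracks that deserve a closer look; the decision to reject a track remains with the operator.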

### Empirical corrections

The magnetospheric field signal from the ring current, which is at least of the same order of magnitude as the lithospheric field, is dynamic and varies from one orbit to the next. The Dst is an hourly index, and its parameterization for the modeling of the magnetospheric field in SH to degree 1 (in process 2b and/or 3) is not sufficient to account for all its time and space variations. This is especially true for time variations shorter than 1 h, which are responsible for systematic offsets between the magnetic field measurements on nearby satellite orbits. To correct for this, we introduce an additional processing block (process 4). For each low-latitude satellite track, we fit an SH external dipole (three parameters) with its internally induced zonal dipole part (one parameter) in the inclined internal dipole reference frame and remove this model from the measurements. Such along-track filtering must be applied with care as it also tends to filter out genuine lithospheric magnetic field structures (e.g., Maus et al. 2006a, b) and to introduce artifacts (Thébault et al. 2012). To minimize its adverse effect, we first subtract the lithospheric field contribution from SH degrees 16–45 of the model derived in process 2 from the scalar and vector measurements before estimating the along-track correction. The along-track estimation of the external field parameters is performed using jointly Swarm A and Swarm C measurements whenever possible. Since Swarm A and Swarm C travel along slightly different orbits, this joint inversion introduces a longitudinal constraint that minimizes the correction error. The along-track primary and induced external field correction is the one we apply to the selected and corrected data.
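The four-parameter along-track fit can be illustrated with a linear least-squares sketch. It rests on several assumptions that are not taken from the text: positions are already expressed in the dipole reference frame, only vector data are used (scalar data would require linearization), the reference radius is 6371.2 km, and the names `design_matrix` and `fit_track` are hypothetical. The columns follow from $B = -\nabla V$ with an external degree-1 potential $V_e = r\,[q_1^0\cos\theta + (q_1^1\cos\phi + s_1^1\sin\phi)\sin\theta]$ and an internal zonal potential $V_i = a\,(a/r)^2 g_1^0\cos\theta$.

```python
import numpy as np

A_REF_KM = 6371.2  # reference radius (assumed value for this sketch)


def design_matrix(r_km, theta_rad, phi_rad):
    """Stack B_r, B_theta, B_phi rows; columns are q10, q11, s11, g10."""
    r = np.asarray(r_km, float)
    th = np.asarray(theta_rad, float)
    ph = np.asarray(phi_rad, float)
    ct, st, cp, sp = np.cos(th), np.sin(th), np.cos(ph), np.sin(ph)
    k = (A_REF_KM / r) ** 3  # radial decay of the internal dipole field
    zero = np.zeros_like(r)
    g_r = np.column_stack([-ct, -st * cp, -st * sp, 2.0 * k * ct])
    g_t = np.column_stack([st, -ct * cp, -ct * sp, k * st])
    g_p = np.column_stack([zero, sp, -cp, zero])
    return np.vstack([g_r, g_t, g_p])


def fit_track(r_km, theta_rad, phi_rad, b_r, b_theta, b_phi):
    """Fit the four coefficients to one track and return them with the residuals."""
    G = design_matrix(r_km, theta_rad, phi_rad)
    d = np.concatenate([b_r, b_theta, b_phi])
    m, *_ = np.linalg.lstsq(G, d, rcond=None)
    return m, d - G @ m
```

A joint Swarm A and C estimate would, under the same assumptions, simply stack the rows of both satellites for the same track before calling the solver.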

At mid-latitudes, this processing considerably reduces the offsets between adjacent satellite tracks, although not perfectly, particularly along the magnetic equator and over South East Asia (see Fig. 2). However, initial tests with this Swarm dataset demonstrated that more complex data processing along the satellite tracks would distort the lithospheric field structures and lead to models with less power than expected.

At polar latitudes, this SH along-track analysis is also imperfect. The data selection process minimizes the ionosphere diurnal variations but cannot avoid all ionospheric field contributions. In particular, the field produced by the polar electrojets persists at night times. It is confined within the auroral oval (between 55° and 75° absolute latitude) during magnetically quiet times, and along-track correction with low SH degree models is not as efficient. Maus et al. (2008) propose an additional line leveling procedure that consists in minimizing the differences at crossover points between satellite tracks. The leveling is effective for mapping the lithospheric field. However, the corresponding inverse problem is ill-posed because one can always add a constant to the satellite tracks without affecting the difference between the tracks (Menke 2012). Setting this constant value a priori is challenging, and an erroneous value can lead to a lithospheric field model with a biased offset. For this reason, we do not apply the line leveling procedure for the present model.
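The non-uniqueness of line leveling can be made concrete with a toy example. In this sketch (track indices and crossover pairs are invented for illustration), each crossover constrains only the difference between two track offsets, so the design matrix has one fewer independent direction than there are tracks and a constant shift of all tracks leaves every crossover difference unchanged.

```python
import numpy as np


def crossover_matrix(n_tracks, crossovers):
    """One row per crossover (i, j), encoding the offset difference c_i - c_j."""
    G = np.zeros((len(crossovers), n_tracks))
    for row, (i, j) in enumerate(crossovers):
        G[row, i] = 1.0
        G[row, j] = -1.0
    return G


# Four hypothetical tracks linked by five crossover points.
G = crossover_matrix(4, [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)])
rank = np.linalg.matrix_rank(G)   # one less than the number of tracks
shift_effect = G @ np.ones(4)     # effect of adding the same constant to all tracks
```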

Rather, at polar latitudes, we stick to the SH along-track correction and implement an additional empirical correction based on singular spectral analysis (SSA). SSA is a numerical method that decomposes the signal into its principal components (Golyandina and Zhigljavsky 2013). The vector and scalar measurements of each Swarm A and C satellite are analyzed independently track by track. For each polar portion of a satellite half orbit, the measurements are treated as a series $B_i(r)$, where $r$ represents the location of the $i$th magnetic component (vertical, north, east, or scalar). In a first step, the covariance matrix of the series $B_i(r)$ is computed and decomposed by singular value decomposition (SVD). This SVD provides a set of eigenvalues sorted by order of magnitude together with a matrix of empirical orthogonal functions (EOF). The projection of the signal $B_i(r)$ onto each EOF that is multiplied by the corresponding set of eigenvalues is then used to identify and filter out unwanted features. The filtering is done without a priori information about the nature or the shape of magnetic field structures; the procedure is model independent. This flexibility is both the main advantage and the main disadvantage of SSA. Compared to SH filtering, no information is required with respect to the typical length scale of the noise along the satellite orbit. SSA can handle automatically changes in the spatial scales of the external field signal due to changes in the magnetic activity. Contrary to techniques based on Fourier analyses [see, for instance, Langel and Hinze (1998)], SSA is also less prone to severe ringing and aliasing that often produce artificial NS oscillations in the corrected measurements. However, SSA first requires the identification of the set of EOFs that carry the external fields. This is a crucial yet fairly arbitrary step that is left to the operator who must decide which sets of EOFs best represent the signal to be filtered out.
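A basic SSA decomposition of one track segment can be sketched as below. This is a textbook variant given for illustration only: it applies the SVD to the trajectory (lagged) matrix of the series, which yields the same EOFs as an eigen-decomposition of the lag-covariance matrix, and the window length, the function names, and the choice of how many leading components to remove are assumptions of the sketch rather than settings reported in the text.

```python
import numpy as np


def ssa_components(series, window):
    """Split a 1-D series into additive SSA components, one per EOF."""
    x = np.asarray(series, float)
    n = x.size
    k = n - window + 1
    # Trajectory matrix: column j holds the lagged segment x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    counts = np.zeros(n)
    for a in range(window):
        counts[a:a + k] += 1.0  # number of entries on each anti-diagonal
    comps = np.zeros((s.size, n))
    for idx in range(s.size):
        elem = s[idx] * np.outer(u[:, idx], vt[idx])  # rank-one contribution
        for a in range(window):
            comps[idx, a:a + k] += elem[a]            # sum along anti-diagonals
        comps[idx] /= counts                          # anti-diagonal averaging
    return comps, s


def remove_leading_components(series, window, n_remove=2):
    """Subtract the reconstruction carried by the first `n_remove` EOFs."""
    comps, _ = ssa_components(series, window)
    return np.asarray(series, float) - comps[:n_remove].sum(axis=0)
```

By construction the components sum back to the input series, so removing a subset of EOFs amounts to subtracting their reconstructed contribution from the measurements along the segment.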

The SSA filtering is applied on polar data after correction for the main, magnetospheric, and lithospheric fields up to SH degree 45 and after implementing the dedicated along-track SH correction. Synthetic tests and inspection of the residual plots (see Fig. 3) reveal that the contaminating external fields are mostly represented by the first two EOFs in the polar regions. This, we note, is often violated for satellite track segments shorter than about 8° (typically corresponding to the wavelength at SH degree 45) if one is not careful to first remove a lithospheric field model to SH degree 45 in the way we did. This additional SSA along-track correction is applied to all selected data in the high-latitude regions (but not in the mid-latitude regions). Figure 3 shows the measurements after the SH and SSA corrections above the polar regions. Once this series of dedicated corrections is made, the lithospheric field model from SH degree 16 to 45 that was first subtracted is added back and the measurements temporarily selected above magnetic latitude |±52°| in the mid-latitude dataset and below magnetic latitude |±52°| in the polar dataset are removed.

Across-track vector and scalar differences are next computed by selecting Swarm A and C data measured at the same universal time (UT) with the additional condition that the distance between the measurements does not exceed 1.4° in great circle distance (this corresponds to the angular distance between Swarm A and C at the equator crossings). These across-track differences approximate the gradient of the measurements. However, we do not divide the difference by the actual distance between the measurements so that vector, scalar, and “gradient data” are all in units of nanotesla. This leads to an inverse problem statistically easier to handle (e.g., Thébault et al. 2013). The across-track difference computed from the Swarm A and C synchronous measurements is not exactly EW oriented and contains a significant amount of NS contributions near the geographic poles where Swarm A and C no longer fly side by side. This construction of the gradients aims at canceling out all large-scale contributions but also very rapid and transient field fluctuations related to external field sources remaining in the data. This differs from the approach chosen by Olsen et al. (2015) where a more exact separation of EW and NS gradients is sought numerically. Their approach does not preclude the contamination of gradients by rapid external field variations (under a few seconds) although it opens the interesting, but not yet fully exploited, possibility to build an optimal combination from all gradient components depending on their sensitivity to the lithospheric field (Kotsiaros and Olsen 2012).
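The pairing of synchronous Swarm A and C samples can be sketched as follows. The record layout `(ut_seconds, lat_deg, lon_deg, value_nT)`, the function names, and the sign convention (A minus C) are assumptions made for this illustration. As in the text, the difference is left undivided by the separation, so it stays in nanotesla.

```python
import math


def great_circle_deg(lat1, lon1, lat2, lon2):
    """Angular separation in degrees between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def across_track_differences(swarm_a, swarm_c, max_sep_deg=1.4):
    """Return (ut, A - C) for samples sharing the same UT and close enough.

    Each record is (ut_seconds, lat_deg, lon_deg, value_nT).
    """
    c_by_time = {rec[0]: rec for rec in swarm_c}
    diffs = []
    for t, lat, lon, val in swarm_a:
        match = c_by_time.get(t)
        if match is None:
            continue  # no synchronous Swarm C sample
        _, lat_c, lon_c, val_c = match
        if great_circle_deg(lat, lon, lat_c, lon_c) <= max_sep_deg:
            diffs.append((t, val - val_c))
    return diffs
```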

Differences along each of the Swarm A and C tracks, which approximate the NS gradients at equator crossing, are computed by selecting data measured by the same satellite with a time stamp difference of 20 s. This time difference corresponds to a spacing of about 1.2° or 140 km (at 460 km altitude) between the measurements. This distance is chosen to be smaller than, but close to, the angular EW separation of Swarm A and C of about 1.4° at the equator. Kotsiaros et al. (2015) performed a retrospective analysis of the last 2 years of the CHAMP satellite measurements and concluded that computing NS gradients from measurements separated by not more than 30 s is close to optimum. Contrary to the across-track gradients, we found that uncorrected external fields were enhanced at high latitudes when including the satellite A and C along-track gradients, particularly in the auroral ovals. For this reason, along-track differences were considered only in mid-latitude regions and not in polar regions.
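The spacing implied by a 20 s lag can be checked with a rough calculation, and the differencing itself is a simple lagged subtraction. The sketch below assumes a circular orbit, a mean Earth radius of 6371.2 km, and hypothetical function names; it gives an angle and ground distance close to the values quoted above, not an exact reproduction of them.

```python
import math

GM_M3_S2 = 3.986004418e14   # Earth's gravitational parameter
EARTH_RADIUS_M = 6371.2e3   # mean reference radius (assumed)


def along_track_separation(altitude_m, lag_s):
    """Return (angle in degrees, ground distance in km) for a circular orbit."""
    r = EARTH_RADIUS_M + altitude_m
    angular_rate = math.sqrt(GM_M3_S2 / r ** 3)  # rad/s
    angle = angular_rate * lag_s
    return math.degrees(angle), angle * EARTH_RADIUS_M / 1e3


def along_track_differences(times_s, values_nt, lag_s=20):
    """Return (t, value(t + lag) - value(t)) wherever both samples exist."""
    by_time = dict(zip(times_s, values_nt))
    return [
        (t, by_time[t + lag_s] - v)
        for t, v in zip(times_s, values_nt)
        if t + lag_s in by_time
    ]
```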

### Data weighting

The scalar, vector, and gradient data we use are statistically different and undergo various types of processing in the algorithm. Defining which weight should be assigned to each dataset is therefore not trivial. For example, it is known that a Sun-related perturbation affects the VFM vector measurements on all three Swarm satellites and that an empirical model is currently being used to correct and calibrate the official 0408 and 0409 vector data, using an approach similar to the one initially proposed by Lesur et al. (2015) (see https://earth.esa.int/web/guest/swarm/data-access/dataset-history and related documents for more details). In addition, because of an anomaly affecting the ASM instrument on Swarm C, no absolute scalar data have been acquired on that satellite since November 5, 2014, and calibration of Swarm C data must also rely on absolute scalar ASM measurements made on the nearby Swarm A satellite. Scalar measurements are also less sensitive to external field variations and errors due to the satellite attitude uncertainties (Holme and Bloxham 1996). Overall, it is thus fair to state that the error budget of vector measurements must a priori be considered as being larger than that of scalar measurements. On the other hand, it is important to avoid putting too much weight on scalar measurements in order to limit possible Backus effects near the equatorial regions (Backus 1970). More generally, we note that regional geomagnetic field models can also suffer from similar non-uniqueness issues when derived from scalar measurements only. To guarantee uniqueness of the derived models, regional modeling thus also requires vector measurements in all spherical caps, even in polar regions where they are comparatively noisier than in the mid-latitude ones.

Across-track gradient data are less sensitive to large-scale external fields, in particular to those responsible for leveling issues between nearby measurements. Therefore, putting more weight on the across-track gradient data seems an attractive option. However, handling the gradient data is not trivial because NS and EW differences of Swarm measurements only provide information about the horizontal gradient and not about the vertical gradient. In addition, since gradient data are mostly sensitive to the small spatial scales of the lithospheric fields, vector and scalar measurements are much needed for constraining the larger-scale lithospheric field structures (Friis-Christensen et al. 2006). For these two reasons, it is not certain that the incomplete gradient data carry enough information to completely constrain the vector magnetic field at every data location.

The different experimental and processing errors thus justify weighting each data type differently. We do not try to set realistic variances to each data type, but we rather define weights with the priority order $w_{\delta F} \geq w_{F} \geq w_{\delta V} \geq w_{V}$, with $w_{\delta F}$ the weight on the scalar gradients, $w_{F}$ the weight on the scalar measurements, $w_{\delta V}$ the weight on the vector gradients, and $w_{V}$ the weight on the vector measurements, respectively. Setting the numerical values is arbitrary, but taking into account the data quality and theoretical limitations, we choose $w_{\delta F} = 10$, $w_{F} = 5$, $w_{\delta V} = 5$, $w_{V} = 1$. These a priori weights are used in the inverse problem.
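One way such a priori weights can enter a linear inverse problem is sketched below. The text does not state how the weights are applied, so this illustration assumes that each squared residual is multiplied by the weight of its data type; the dictionary keys, the function name, and the toy design matrix are hypothetical.

```python
import numpy as np

# A priori weights per data type, as quoted in the text.
WEIGHTS = {"dF": 10.0, "F": 5.0, "dV": 5.0, "V": 1.0}


def weighted_least_squares(G, d, data_types):
    """Minimise sum_k w_k (d_k - G_k m)^2, with w_k set by the data type of row k."""
    G = np.asarray(G, float)
    d = np.asarray(d, float)
    w = np.array([WEIGHTS[t] for t in data_types])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) weights the squared residuals by w
    m, *_ = np.linalg.lstsq(G * sw[:, None], d * sw, rcond=None)
    return m
```

With a single unknown observed once through a scalar-gradient datum (value 0) and once through a vector datum (value 11), the estimate under this assumption is the weighted mean of the two values, pulled strongly toward the scalar-gradient datum.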