Co-estimation of geomagnetic field and in-orbit fluxgate magnetometer calibration parameters

For the past 20 years, state of the art geomagnetic core field models have relied heavily on magnetic measurements made from space-based instrumentation. These models have revealed rapid global magnetic field variations on sub-decadal timescales originating in Earth’s core. With the end of the CHAMP mission in 2010 and the launch of Swarm in late 2013, there has been a 3-year gap in high-quality satellite measurements of the geomagnetic field. Geomagnetic field models have therefore relied on ground observatory data to fill in this gap period. However, ground observatories are unable to provide a truly global picture of the core field and its temporal changes. Many satellites in operation carry vector fluxgate “platform” magnetometers for attitude control, which can offer an alternative to relying on ground observatory measurements during the gap period. However, these instruments need to be carefully calibrated in order to provide meaningful information on Earth’s core field. Some previous studies attempted to calibrate such instruments with a priori geomagnetic field models. This approach has several disadvantages: (1) errors in the model will introduce errors in the calibration parameters, and (2) relying on an a priori model may not be feasible in the post-Swarm era. In this paper, we develop a novel approach to build a time-dependent geomagnetic field model from platform magnetometer data, by co-estimating their calibration parameters with the internal field parameters. This method does not require an a priori geomagnetic field model, but does require a dataset of previously calibrated data. We use CHAMP, Swarm, and ground observatory measurements to supply this dataset, and incorporate platform magnetic measurements from DMSP and Cryosat-2 during the gap years. We find that the calibration parameters of DMSP and Cryosat-2 can be reliably estimated, and these missions provide meaningful information on rapid core field variations during the gap period.


Introduction
Earth's internal magnetic field is constantly changing due to what is believed to be a geodynamo inside the liquid outer core. According to this theory, cooling of the core when combined with gravitational forces drives convective fluid motion. Earth's rotation imposes additional forces on the core fluid, and a fraction of this total kinetic energy is converted to magnetic energy through induction. If the magnetic energy generated is sufficient to overcome viscous and ohmic dissipation, the dynamo can be self-sustaining. In recent decades, the numerical geodynamo modeling community has demonstrated the viability of self-sustaining dynamos by rotating convection (Glatzmaier and Roberts 1995;Schaeffer et al. 2017;Zhang and Busse 1988). However, geodynamo simulations are thus far unable to reach a dynamical regime similar to planetary cores, and their predictive capabilities are extremely limited. An alternative approach for studying geomagnetic core field variations is to analyze measurements recorded by ground observatories as well as satellite missions carrying magnetic instrumentation in Low-Earth Orbit (LEO). Ground observatories are well suited to capture temporal variations at specific locations, while satellites provide a much richer spatial picture of the geomagnetic field. While many satellite missions carry fluxgate magnetometers, these instruments are subject to drift due to mechanical effects (such as vibrations), temperature variations, or aging of sensor material, and extreme care must be taken before using these data in geomagnetic field modeling.
Magsat (1979)(1980) was the first mission to carry both a vector fluxgate magnetometer and a Cesium vapor absolute scalar magnetometer which was used to calibrate the fluxgate instrument in-flight. Since the scalar instrument accuracy depends on well-known atomic energy transitions, it does not suffer from drift and can be used to calibrate fluxgate instruments. After the end of the Magsat mission, it took nearly 20 years before another mission was launched with both a fluxgate and absolute scalar magnetometer. However, since 1999, the geomagnetic core field modeling community has enjoyed unprecedented and nearly continuous high-quality global magnetic measurements from space. Ørsted (1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013), SAC-C (2000SAC-C ( -2004, CHAMP (2000CHAMP ( -2010, and Swarm (2013-present) have provided high-accuracy vector magnetic measurements from LEO since 1999 with the exception of a roughly 3-year gap period between 19 September 2010 and 22 November 2013. This 20-year period of satellite observations has revealed a global picture of secular variation (SV), and perhaps most interestingly, a series of localized secular acceleration (SA) pulses on sub-decadal timescales (Aubert and Finlay 2019;Lesur et al. 2008;Chulliat and Maus 2014;Chulliat et al. 2010Chulliat et al. , 2015Olsen and Mandea 2007;Torta et al. 2015).
Due to these discoveries, there has been significant interest in improving core field models during the 3-year gap period between CHAMP and Swarm, particularly because one of the secular acceleration pulses occurred in 2012.5 . The purpose of this study is to investigate the possibility of using fluxgate magnetometers carried by other satellite missions (without scalar reference magnetometers) to provide global information on secular variation and secular acceleration during this gap period. The idea of using lesser-quality fluxgate data for core field modeling is not new. Ridgeway et al. (1989) combined DMSP F-7 fluxgate data with Magsat and ground observatory measurements, co-estimating a subset of fluxgate calibration parameters with internal field coefficients. The authors of this study encountered significant challenges in accurately determining crustal biases for the observatories, as well as removing large spacecraft fields from the DMSP spacecraft body, and stated "the results obtained are inconsistent and contradictory". Sabaka et al. (1997) similarly applied a coestimation of calibration and core field parameters to DMSP F-12 and F-13 data. Their study used an a priori core field model (IGRF95) to constrain the least-squares minimization, in order to perform a vector calibration of the satellite data. Langel et al. (1997) analyzed fluxgate measurements from the UARS, POGS, and DE-1 satellites. They used an a priori core field model built from Magsat data to act as the scalar reference for the fluxgate measurements, and used only the scalar magnitudes of the fluxgate measurements since attitude information on those satellites was limited. Their study focused only on calibration, and they did not take the next step to build a model from their calibrated data, instead analyzing the calibrated residuals against the same field model used for the calibration. Alken et al. (2014) used a similar approach on DMSP satellite fluxgate measurements, using an a priori model built from CHAMP data to act as a scalar reference in order to estimate the calibration parameters. Then in a second step, they built a core field model from the calibrated data, but noted that it was likely the scalar reference model would contaminate the calibration parameters, resulting in a final model similar to the original reference model.
To avoid the problems inherent in using a priori models to calibrate fluxgate instruments as described above, we have developed an approach to co-estimate the fluxgate calibration parameters along with the internal core field coefficients. Our method completely removes the need for an a priori calibration model, instead relying on combining the uncalibrated fluxgate data with an additional dataset of previously calibrated data. Since we now have multiple years of high-quality calibrated data from CHAMP and Swarm on both sides of the gap period, as well as ground observatory data inside and outside the gap period, we have an ideal opportunity to fill in the gap with platform fluxgate measurements.
In the "Instrumentation" section, we review the satellite missions and ground observatories used for this study. In the "DMSP data processing" section, we discuss the data cleaning and preprocessing applied to the DMSP satellites. The "Data selection and preprocessing" section presents the data selection and preprocessing steps for all satellite missions and ground observatories to build a final dataset for the model inversion. "Coestimation of internal field and calibration parameters" describes our methodology for co-estimating the internal core field and fluxgate calibration parameters, as well as other model parameters. Model results are presented in "Results and discussion" and validation against independent data and models is discussed in "Model validation". We examine the secular acceleration signal observed during the gap period in "Secular acceleration". Finally, we make concluding remarks in the "Conclusions" section.

Instrumentation
We make use of magnetic field observations recorded by four different satellite missions (CHAMP, DMSP, Cryosat-2, and Swarm) as well as ground-based measurements extracted from the World Data Center at Edinburgh (Macmillan and Olsen 2013). CHAMP (CHAllenging Minisatellite Payload) and Swarm are dedicated magnetic field missions, whose primary science goals include observing all sources of the geomagnetic field. These missions were designed from the beginning to conform to high magnetic cleanliness standards in order to be able to characterize and remove signals of non-geophysical origin, such as spacecraft fields, from the measurements. The primary purpose of the DMSP (Defense Meteorological Satellite Program) series, in contrast, is to measure tropospheric weather with a high-resolution visible and infrared imaging system. Early DMSP missions did not include fluxgate magnetometers, as these were added later in the program to study magnetic signatures of ionospheric field-aligned currents (FACs). Since the magnetic signature of these currents can reach several hundreds to thousands of nanoTeslas, later DMSP spacecraft designs made little effort to implement magnetic cleanliness standards. Cryosat-2, whose primary mission is to monitor polar ice thickness, similarly exhibits significant spacecraft-generated fields in its fluxgate measurements. An earlier mission, named Cryosat-1, was lost due to a launch failure in 2005. We therefore will use Cryosat to refer to Cryosat-2 in the remainder of this paper.
CHAMP was launched on 15 July 2000 into a near circular, polar orbit (inclination 87.2 • ) with an initial altitude of about 450 km. Its payload included an Overhauser scalar magnetometer at the end of a 4-m boom, as well as a fluxgate vector magnetometer and two star cameras, mounted on an optical bench and situated half-way along the boom (Reigber et al. 2002). The scalar magnetometer was used to perform daily absolute calibrations of the fluxgate instrument. CHAMP provided scientific-quality geomagnetic observations for 10 years, decaying to about 250 km altitude, until it re-entered the atmosphere in September 2010.
The first DMSP satellite equipped with a triaxial fluxgate magnetometer, known as the special sensor magnetometer (SSM), was Flight 7 (F-7), launched on 18 November 1983. However, the SSM instrument did not become a standard part of the DMSP payload until the launch of F-12 on 29 August 1994. The SSM sensors on F-7, F-12, F-13, and F-14 were mounted on the spacecraft body, but due to excessive noise from the spacecraft, these data were found to be unsuitable for studying signals originating in Earth's core (Rich et al. 2007). Therefore, starting with F-15 (launched 12 December 1999), the SSM sensors were mounted on 5-m booms directed anti-radially from the spacecraft. F-16 (18 October 2003), F-17 (4 November 2006), and F-18 (18 October 2009 similarly contain boommounted fluxgate magnetometers. DMSP spacecraft fly in polar near-circular orbits of about 840 km altitude and inclinations of 98.8 • . The satellites are sun-synchronous, and starting with DMSP F-10, the ascending node is in the dusk-evening local time sector (Rich et al. 2007). While DMSP does carry some sensors onboard for attitude control, these data are unavailable to the scientific community, and so accurate attitude determination for DMSP is challenging (see "Attitude determination"). In the present study, we use data from the F-16, F-17, and F-18 boom-mounted SSM instruments.
Cryosat was launched on 8 April 2010 into a 92 • inclination orbit with an altitude of about 720 km. Its primary payload consists of radar instrumentation to record ice elevation. It also carries three platform fluxgate magnetometers, named FGM1, FGM2, and FGM3 for attitude control. These magnetometers are mounted on the satellite body, and are therefore subject to significant spacecraft fields and other noise. Cryosat is additionally equipped with three star cameras for attitude determination, although these cameras are not in proximity to the fluxgate instruments. For this study, we use a Cryosat dataset which has been extensively corrected for magnetotorquer fields, temperature drift variations, and other effects, which are detailed in Olsen et al. (2020).
The Swarm mission consists of three satellites (A, B, and C), launched on November 2013 into near-circular orbits. Swarm A and C fly in a side-by-side configuration with an inclination of 87.4 • and an initial altitude of about 460 km, while Swarm B flies in a higher orbit of about 530 km altitude and has an inclination of 88 • . Each satellite has a 4-m long boom, at the end of which is a dual redundancy absolute scalar magnetometer (ASM) package, which features an optional vector mode (Léger et al. 2015). At the center of the boom is an optical bench containing a fluxgate vector magnetometer and three star cameras for attitude determination (Friis-Christensen et al. 2006;Merayo et al. 2008). The ASM instruments are used to calibrate the fluxgate magnetometers. One of the redundant ASM instruments on Swarm C never switched on after launch, and the other ASM failed on 5 November 2014, and so calibration of Swarm C's fluxgate instrument with onboard absolute measurements is not possible after this date. For this study, we use only data from Swarm A and B.
Finally, the International Real-Time Magnetic Observatory Network (INTERMAGNET) is a global consortium which promotes the operation of magnetic observatories. The consortium has also developed standards for data exchange, data processing, and quality control to ensure that the provided data meet the stringent standards required for geomagnetic research. Many ground observatories providing data today have been operating for decades, making them ideally suited to record signals originating in the core on long timescales. Modern observatories use fluxgate magnetometers to record vector geomagnetic field observations, which are routinely calibrated against scalar proton precession magnetometers as well as theodolites (Love 2008). In this study, we use a ground observatory dataset prepared by the British Geological Survey following the methodology of Macmillan and Olsen (2013). This dataset relies heavily on INTERMAGNET stations, but also includes data from other ground observatories. Figure 1 shows the altitude profiles of the satellites considered here. Note that the DMSP and Cryosat missions continue to the present but we have only processed data until 31 December 2016 for DMSP and 31 December 2018 for Cryosat.

DMSP data processing
Ephemeris DMSP satellites do not typically carry GPS receivers, and so their orbital positions are determined through radar tracking and orbital propagation. Measurements from the Space Surveillance Network (SSN) are fitted for each DMSP orbit to determine the six Keplerian elements and the ballistic coefficient. Orbital equations are then solved, which include effects of the Earth's gravitation (Lemoine et al. 1998), Sun and moon gravitation, solar radiation pressure, ocean tidal forces, and finally atmospheric drag. The atmospheric density model combines a modified Jacchia 1970(Jacchia 1970) model with the US Air Force's High Accuracy Satellite Drag Model (HASDM) (Storz et al. 2002), which processes trajectory data of 75 to 80 calibration satellites to calculate global corrections to the thermospheric and exospheric neutral density. The accuracy of the DMSP ephemeris has been estimated to be 30 m at 1 standard deviation (B. Bowman, personal communication).

Fig. 1
Altitude evolution of the satellite missions used for this study. Solid curves are daily mean altitudes. Envelope curves are the daily minimum and maximum altitudes. Only one DMSP satellite is shown since the altitude profiles of the other two are nearly identical Attitude determination Rich et al. (2007) states that DMSP "attitude is maintained to within 0.01 • of the local vertical-forward orientation. " This is accomplished using a reaction wheel assembly, magneto-torquers, and a suite of instruments including star trackers, sun sensors, and horizon sensors. The attitude sensor data are unavailable to the scientific community, and so we will instead rely on the 0.01 • pointing accuracy specification to define a quasispacecraft-fixed coordinate system from which we can rotate into an Earth-fixed geographic frame. The "vertical-forward" orientation on DMSP is defined as a local vertical geodetic frame, which we call S to indicate it is the spacecraft frame. In this frame, the X S axis is a line from the spacecraft normal to the reference ellipsoid on the near side, which for DMSP is the WGS66 ellipsoid, having a mean equatorial radius of 6,378,145 m and an inverse flattening of 298.25 (Manuel Valenzuela, personal communication, 12 March 2019). The Z S axis is normal to both X S and the satellite velocity vector and is positive toward the sun. The Y S axis completes the right hand set. Next we define basis vectors ŝ 1 ,ŝ 2 ,ŝ 3 lying along the (X S , Y S , and Z S ) axes, respectively: Here, ê µ is an outward normal unit vector to the WGS66 ellipsoid surface, pointing toward the satellite and v is the satellite velocity vector. This is similar to a local vertical, local horizontal (LVLH) frame (Schaub and Junkins 2009), except r is replaced with ê µ . The components of ê µ in a desired frame can be determined by projecting the position vector of the satellite onto the ellipsoid along a normal to its surface. The location of the projected point on the ellipsoid surface can be determined by solving a system of three nonlinear equations which are detailed in Feltens (2009) andLigas (2012). We solve the "Case 3" set of equations described by Ligas (2012, Eqs. 17) in the usual Cartesian Earth Centered Earth Fixed (ECEF) frame, using Newton's method to determine the projected point for any given satellite position vector. The ECEF coordinates of the projected point can be used to calculate ê µ , using for example Ligas (2012, Eq. 7). With ê µ and thus ŝ 1 determined, the remaining two basis vectors can be readily computed using the velocity vector and Eqs. (2), (3). The attitude control system on DMSP attempts to keep the satellite body within 0.01 • of the frame S we have defined above, but ultimately we are interested in the orientation of the fluxgate magnetometer instrument, which is located on a 5-m boom. The fluxgate axes are mechanically aligned to be within 0.5 • of the spacecraftfixed axes ŝ 1 ,ŝ 2 ,ŝ 3 (Rich et al. 2007). Later during the model inversion, we will solve for a slow time-varying set of rotation parameters which bring the fluxgate axes into alignment with the S frame. Allowing these parameters to vary slowly in time can help account for seasonal thermal effects and mechanical noise. It is likely that the alignment between the fluxgate frame and S experiences more rapid temporal variations due to boom twisting, boom oscillations, and solar heating, but a treatment of these effects is beyond the scope of this study (see Miller and Sexton (2001) for a more detailed discussion of these phenomena). It remains to define a rotation matrix from the S frame to a local Earth-fixed North-East-Center (NEC) frame at each point along the satellite orbit, since the NEC frame is more convenient for expressing the equations representing the geomagnetic field. In the following, we will adopt the notation convention R B A to denote a rotation matrix which rotates a vector from components with respect to frame A to components with respect to the frame B . We therefore seek the matrix R NEC S which we will construct in two steps. Since both S and NEC are non-inertial, it is useful to rotate first from the spacecraft frame to a quasi-inertial frame, such as Earth Centered Inertial (ECI) [also known as Geocentric Equatorial Inertial (GEI) (Hapgood 1992)], and then to perform a second rotation from ECI to NEC: The matrix R ECI S can be written simply as where the column vectors ŝ i are expressed with respect to ECI components. The matrix R NEC ECI can be written as where R ECEF ECI requires a single rotation by the Greenwich sidereal angle about the Earth's rotation axis [see Hapgood (1992, Eq. 2) or Schaub and Junkins (2009, Eq. 9.42)], and R NEC ECEF is the standard transformation from ECEF Cartesian coordinates to NEC, which is simply a permutation of spherical coordinates.

Data cleaning
While later DMSP missions with boom-mounted magnetic sensors benefited from significantly reduced spacecraft noise, the magnetic data from these missions still contain significant signals of non-geophysical origin, primarily due to onboard systems, such as solar panels, heaters, and magnetotorquers. Unfortunately, the housekeeping data related to these systems are unavailable to the community, and so we adopt an empirical approach to identify and correct these spacecraft-generated signals. Figure 2 (top panel) shows two orbits of the residual X component (geodetic vertical) recorded by DMSP F-17 on 28 March 2012. Here, the CHAOS-6-x9 ) core field model using spherical harmonic degrees 1 to 15 was rotated into the S frame using the transformation described in "Attitude determination" and then subtracted from the measurements. Level shifts on the order of 30 nT are readily visible in this field component, which are likely due to the magnetotorquer coils (Rich 1984). These level shifts, if left uncorrected, would have adverse effects on the instrument calibration, and so we developed a two-step methodology to (1) identify and (2) correct these shifts. There are numerous ways to approach the problem of detecting and correcting level shifts of the type shown in Fig. 2 (top panel). One could use a spectral method, transforming the residual time series into the frequency domain, applying a low-pass filter, and transforming back to the time domain. Such a method would potentially modify samples in the time series which are not directly part of a level shift. We prefer a more minimalist approach, modifying as few measurements as possible to obtain a clean time series free of spacecraft fields. Our algorithms for the two steps are detailed below.

Level shift detection
To detect level shifts in the magnetometer time series, we follow the moving window approach detailed in Fried (2007) with some modifications. Given a discrete time series y j = y(t j ), j = 1, . . . , n , we define a window around the ith sample: where H and J are non-negative integers specifying the number of samples to include before and after the sample i, respectively. The total size of the window is K = H + J + 1 . We assume that an ideal shift (Fried 2007, Eq. 2) occurs between two adjacent samples at times t i and t i+1 , and test whether the difference between two level estimates (one looking backward from sample i and the other looking forward from sample i + 1 ) is statistically significant. Specifically, we test α is a threshold to define a significant level shift, σ 2 i is an estimate of the variance of the combined window , and τ i is a standardization so that the test statistic is asymptotically normal under the null hypothesis. We use window medians instead of arithmetic means to make the level estimates ŷ − i ,ŷ + i+1 robust to outliers which occasionally occur in our magnetic sensor time series. To estimate the variance σ 2 i , we use the Q n statistic Rousseeuw and Croux 1992) which is also robust to outliers and attains a high statistical efficiency of 82%. We found Q n to produce more accurate results on small window sizes compared with alternative robust-scale estimates, such as the median absolute deviation (MAD) due to its higher efficiency. To avoid problems with the two windows containing different levels, we first remove the level estimates ŷ − i ,ŷ + i+1 from their respective windows prior to calculating the variance, so that Estimating the dispersion from the combined windows in this manner assumes that the variances of both windows are identical, i.e., the application of the magnetotorquer signal does not change the noise characteristics of the observed signal. Based on inspection of the DMSP SSM data, we found no discernable changes in the signal noise after a level shift. For signals where level shifts produce known changes in the variance, more sophisticated methods can be used to estimate τ i (Fried 2007). We apply the procedure described above to each of the three 1 Hz magnetic sensor residual time series in the S frame, after removing the CHAOS-6-x9 core field model from each vector component. We choose H = J = 10 to define the moving window, and truncate the window as the signal endpoints are approached. We choose α = 10 to define the test criteria in Eq. (8), and further require ŷ − i −ŷ + i+1 > 5 nT to attempt to reduce false-positives. All samples i which pass the test criteria are flagged as potential level shifts to be analyzed in the second correction step. Figure 2 (bottom panel) plots the test statistic [left-hand side of Eq. (8)] in green for the same time series shown in the top panel. We see that all level shifts in this example produce a large value of the test statistic which can easily be flagged using the chosen threshold value α . There are additionally some false-positives, for example around 05:00 UT, which are caused by rapid field changes at high latitudes due to the polar ionospheric current system. While these features are flagged in the detection stage, we perform additional tests in the correction stage designed to prevent modification of the signal during these false-positive events.

Level shift correction
As discussed in the previous section, the shift detection algorithm can suffer from false-positives in regions where the signal is changing rapidly (particularly at high latitudes). However, the level shifts of interest are due to torquer coils and other instruments switching on for brief periods of time, manifesting as a squarewave type feature in the time series. In some cases, multiple instruments can switch on during the same time period, leading to multiple level shifts of different magnitudes. For the majority of the time, we could search for two level shifts of opposite sign and roughly equal in magnitude, occurring within some allowed time interval. To allow for multiple shifts, we designed a correction algorithm based on a double-ended queue (deque) data structure. Algorithm 1 presents the level shift correction, which proceeds by looping through all potential shift candidates identified in the first step. For each candidate, the queue is first emptied of all previous candidates which have fallen outside of some prescribed time window. Then, the sum of all level shifts still in the queue is compared with the shift of the current candidate, to determine if the sequence of level shifts has returned to the signal baseline (see Algorithm 2). If so, all elements of the queue are flagged as belonging to the same sequence of level shifts, and appropriate offsets are added to the time series to bring it down to the signal baseline, after which they are removed from the queue. If the current shift candidate does not match the other elements of the queue, it is appended to the front of the queue and the search continues. The level shifts detected with this algorithm are shown as blue triangles in Fig. 2 (bottom panel). The corrected data are shown in red, overlaying the original data.

Algorithm 1 Level Shift Correction algorithm
while DEQUE is not empty do purge old samples from DEQUE 6: if nshift > 0 then detected at least one shift 13: offset := δ i offset between shifted data and baseline 14: for k = 1, . . . ,nshift do loop over detected shifts 15:

Satellite data
Our combined dataset of CHAMP, DMSP, Cryosat, and Swarm observations is sub-sampled to a rate of 1 sample per 60 s. For each satellite orbit, we divide the data into half-orbital sections, spanning from the north to south pole, or vice versa. We will refer to these half-orbital sections as tracks. For each track, we compute a root-meansquare (rms) difference with respect to spherical harmonic degrees 1 to 15 of the CHAOS-6-x9 core field model ) for each vector component as well as the total field component. Tracks whose rms difference exceeds 200 nT in any component are rejected. This procedure helps to remove anomalous data recorded during satellite maneuvers. We further select data for geomagnetically quiet periods using the Kp index and the RC index (Olsen et al. 2014), which tracks the strength of the magnetospheric ring current. We use the following criteria: • Kp index does not exceed 2 from a time period starting 2 h before the track start time to the track end time. • The temporal change of the RC index |dRC/dt| does not exceed 3 nT/h from 3 h prior to the track start time to the track end time.
We additionally attempt to minimize perturbations due to the ionospheric field using local time criteria at low and mid latitudes and the solar zenith angle at high latitudes. These criteria are applied only to CHAMP and Swarm data, since we found the calibration parameters for DMSP and Cryosat are more robustly estimated when using both ascending and descending orbits. For CHAMP and Swarm, we use the following criteria: • For quasi-dipole (QD) (Richmond 1995) latitudes equatorward of ±55 • , the track's local time of ascending or descending node is between midnight and 05:00. • For QD latitudes poleward of ±55 • , the sun must be at least 10 • below the horizon, according to the solar zenith angle.
DMSP satellites are sun-synchronous in near dawn-dusk orbits, and due to their higher altitude we expect minimal ionospheric signals at low and mid latitudes. Therefore, we perform no local time selection for DMSP. Cryosat drifts in local time, so it can record equatorial electrojet signals of a few tens of nT on the dayside. To mitigate this, we discard Cryosat data between ±20 • QD latitude from 06:00 to 18:00 local time. The remaining mid-latitude Cryosat data could contain signals from the Sq current system (both primary and induced), but we expect these signals to be on the order of a few nT at Cryosat altitude, and we ignore them for this study. Finally, for CHAMP and Swarm vector measurements, we use only data for which two star cameras were available to provide attitude information. We use both vector and scalar measurements at mid and low latitudes (equatorward of ±55 • QD latitude). Poleward of ±55 • QD latitude, we use only scalar data to attempt to minimize signals due to field-aligned currents in the polar regions. Figure 3 shows temporal histograms of the vector and scalar measurements from each satellite after data selection.

Observatory data
We also process measurements from the global magnetic observatory network, both to add calibrated data during the gap period and also to validate our results. While geomagnetic observatories record the full magnetic field vector, these measurements contain a significant contribution from local crustal anomalies, which can be challenging to determine accurately, and satellite-based crustal models do not contain the required spatial resolution to correctly remove these "crustal biases". We therefore opt to calculate time series of secular variation as determined from magnetic recordings by the ground observatory network. Ground observatories are ideally suited to measure secular changes in the geomagnetic field, since they record long time series at a fixed location, so we can easily compare measurements at two different times to estimate a linear change in the field. Since the local crustal anomalies are assumed to remain fixed on our timescales of interest, they can be easily removed by subtracting measurements from different times. Our method of calculating secular variation time series from the ground observatory network is as follows. We start from the observatory dataset of hourly means provided by Macmillan and Olsen (2013), version 0121. We use

Co-estimation of internal field and calibration parameters
Most previous work on geomagnetic core field modeling has used a two-step approach: (1) calibrate satellite fluxgate magnetometer instruments using scalar reference values provided by another onboard instrument (Lancaster et al. 1980;Mandea et al. 2010;Olsen et al. 2003;Tøffner-Clausen et al. 2016), or using an a priori core field model (Alken et al. 2014;Langel et al. 1997;Sabaka et al. 1997), and then (2) fit internal core field parameters to the calibrated dataset Lesur et al. 2008;Maus et al. 2010;Olsen et al. 2014;Sabaka et al. 2004Sabaka et al. , 2015Sabaka et al. , 2018Thébault et al. 2015a, and references therein). Our new approach is to combine these two steps into a single co-estimation of both the core field parameters and calibration parameters. In the sections which follow, we discuss each set of parameters and the least-squares procedure used to perform the co-estimation.

Internal field parameterization
We describe the geomagnetic field originating in Earth's core and lithosphere using a scalar potential V of internal origin, where r, θ , φ are the geocentric radius, co-latitude, and longitude, respectively, a = 6371.2 km is a reference radius, g m n (t),h m n (t) are the Gauss coefficient time series, P m n (cos θ) are the Schmidt-normalized associated Legendre functions (Schmidt 1917;Winch et al. 2005), and N is an integer specifying the truncation level of the series expansion. This expression can be written compactly by making the definitions so that with (14) V (r, θ , φ, t) = a N n=1 n m=0 a r n+1 g m n (t) cos mφ +h m n (t) sin mφ P m n (cos θ), only hourly values between 01:00 A.M. and 05:00 A.M. local time to minimize ionospheric field contributions, and discard data for which |dRC/dt| > 4 nT/h to avoid storm periods. Daily means are then computed from the hourly mean values remaining for each day. We then estimate the secular variation at a given time t (in nT/year) as where B (t) is the daily mean value for time t. If a daily mean value is not available for ±6 months from the day of interest, due to our data selection criteria, we do not compute a secular variation for that day. The temporal histogram of secular variation data derived from the ground observatories is shown in Fig. 3 The corresponding magnetic field B = −∇V is then The Gauss coefficients g m n (t) describe both the spatial structure and temporal evolution of the internal geomagnetic field. Of particular interest in our study is the secular acceleration of the core field, which is given by the second time derivative of the field vector, Predicting the temporal evolution of the core field, even for short periods of time, is a challenging task and beyond the scope of this study. Therefore, we use a generic parameterization in terms of basis splines (B-splines) (De Boor 2001) to model each Gauss coefficient. A commonly used alternative defines a low-degree Taylor series expansion of each Gauss coefficient about the time period midpoint (Alken et al. 2014Chulliat and Maus 2014;Chulliat et al. 2015;Maus et al. 2006Maus et al. , 2010, but this typically works best when modeling the core field during time intervals of a few years or less. In the present study, we include data recorded by CHAMP, DMSP, Cryosat, Swarm, and ground observatories spanning 20 years, for which a spline-based approach is more appropriate for tracking temporal changes in the field. To capture the salient features of the secular acceleration, we choose to use cubic B-splines between uniform knots separated by 6 months to represent g m n (t) , which have proven successful in previous modeling efforts Olsen et al. 2014). Cubic splines for the second time derivatives require fifth degree polynomials (order 6 B-splines) for the Gauss coefficients g m n (t) . Therefore, we have where N i,k (t) are the normalized B-spline basis functions of order k = 6 (De Boor 2001), and g m n,i are the spline coefficients (also known as control points), which will be determined by inverting the observations. The functions N i,k (t) additionally depend on a knot vector which defines the interface between each piecewise polynomial used to construct the spline. The knot vector can be used to change continuity conditions on the spline, as well as define periodicity. In this work, we use open uniform knot vectors, which feature uniformly spaced simple knots, so that the spline is continuous to order k − 1 . The number of coefficients g m n,i depends on the spline order and the number of knots spanning the time interval of interest.

Alignment
Vector fluxgate magnetometers (VFMs) carried onboard satellites record magnetic field measurements along the fluxgate instrument axes. We will refer to measurements in the instrument frame as B VFM . However, satellite orientation data collected by star cameras or other attitude sensors are typically referenced to another coordinate frame, which we will call the common reference frame (CRF). Therefore, we must account for a rotation between the VFM and CRF frames, Here, the rotation matrix depends on a set of alignment parameters α(t) which are permitted to change slowly in time to account for thermal variations and mechanical noise. Alignment parameters have been routinely coestimated with internal geomagnetic field models from the Magsat era to the present Langel et al. 1981;Olsen et al. 2000;Rother et al. 2013;Sabaka et al. 2004). There exist numerous ways to parameterize an arbitrary rotation matrix in three dimensions (Schaub and Junkins 2009, Ch. 3). We choose a 1-2-3 Euler angle representation for all satellites, so that the matrices R CRF VFM (α(t)) can be written in the form We use order k = 4 (cubic) B-splines to define the time variation of the alignment parameters, where the B-splines N j,k (t) are defined with 30-day uniform knots. A set of alignment coefficients α j is estimated individually for the CHAMP and Swarm satellites. For DMSP and Cryosat, the alignment parameters are combined with the calibration parameters (see "Fluxgate calibration"). The number of coefficients α j depends on the time span of each satellite dataset.

Fluxgate calibration
We adopt the approach of Olsen et al. (2003) to define the calibration of the fluxgate magnetometer instruments onboard the DMSP and Cryosat satellites. We assume that these vector fluxgate magnetometers have a linear response to the background field, so that the magnetic field vector in physical units in the instrument frame, B VFM , is related to the raw instrument output vector in engineering units (eu), E VFM , as where o = (o 1 , o 2 , o 3 ) T is a vector of offsets (in eu), S = diag(s 1 , s 2 , s 3 ) a diagonal matrix of scale factors (nT/ eu), and is a matrix which maps a vector from the orthogonal magnetic axes coordinate system to the non-orthogonal magnetic sensor axes coordinate system. Alternative formulations of this transformation can be found in Langel et al. (1997) and Merayo et al. (2000). The 9 parameters s i , o i , u i , i = 1, 2, 3 completely determine the response of a linear magnetometer; however, estimating the individual The matrix elements m ij depend nonlinearly on the scales, non-orthogonalities, and alignment parameters; however, these can be easily extracted by performing a QL factorization of the matrix M(m) , identifying the Q factor with R CRF VFM (α) and the L factor with P −1 (u)S(s). The calibration parameters can change in time due to temperature variations and other effects such as mechanical noise. In addition, when co-estimating these parameters along with the geomagnetic field, unmodeled sources could contaminate the calibration parameters and introduce additional time dependencies. To allow for time variations in these parameters, we again use B-splines, where i, j = 1, 2, 3 and l is summed over the number of coefficients of each spline. For these calibration splines, we use order k = 4 corresponding to cubic polynomials between adjacent knots. We also chose uniform knots with 30-day intervals. This knot spacing allows us to track seasonal variations in the calibration parameters. There are likely to be variations in the calibration parameters on orbital time periods due to day/night differences; however, these are ignored in the present study.

Co-estimation approach
The Gauss coefficients representing the internal core field are co-estimated with the alignment and fluxgate (30) calibration parameters using a least-squares minimization approach. The least-squares residuals are defined with respect to a model vector representing geomagnetic field sources defined as where B ext (r, t) is the CHAOS-6 model of the magnetospheric ring and tail current sources and their induced counterparts Olsen et al. 2014) and B int (r, t; g) contains the internal (core and lithospheric) geomagnetic field sources, as given in Eqs. (19), (20), parameterized by a set of Gauss coefficients g . We allow the Gauss coefficients g m n (t) to vary in time up to spherical harmonic degree and order 15 to capture the time variations of the core field, after which they are restricted to be static to represent the lithospheric field. Therefore, we have The time dependence of g m n (t) is parameterized with B-splines as described in Eq. (21) and surrounding text. We choose N = 50 as the upper spherical harmonic degree for the static internal field primarily due to lithospheric sources. Including the CHAOS-6 magnetospheric field model, B ext (r, t) in Eq. (35) is equivalent to removing this from the magnetic measurements prior to the model fitting. This model was constructed from a combined dataset of CHAMP, Ørsted, Swarm, and ground observatory measurements Olsen et al. 2014) which covers our time period of interest, and so we have not attempted to co-estimate a model of magnetospheric sources in this study. Due to the inclusion of Ørsted data, we believe this model provides valid estimates of the magnetospheric and induced sources at Cryosat and DMSP altitudes. With the geomagnetic field model vector defined, we can now present the residuals to be minimized in the least-squares problem. There are five classes of residuals: ǫ Vector residuals for previously calibrated measurements (CHAMP/Swarm) ξ Vector residuals for uncalibrated measurements (Cryosat/DMSP) δ Vector residuals for secular variation time series recorded by observatories κ Scalar residuals for previously calibrated measurements (CHAMP/Swarm) ψ Scalar residuals for uncalibrated measurements (Cryosat/DMSP) These residuals are defined as (37) For CHAMP, Swarm, and Cryosat, the common reference frame is defined with respect to their star camera axes, and the matrices R NEC CRF which rotate from CRF to NEC are provided in the Level-1b data files. For DMSP, the CRF is the S frame defined in Eqs. (1)-(3). For CHAMP and Swarm, B VFM i is the calibrated vector fluxgate measurement made at location r i and time t i , while F i is the corresponding scalar measurement made by the absolute scalar magnetometer instrument. For DMSP and Cryosat, E VFM i is the uncalibrated vector fluxgate measurement at (r i , t i ) . For Cryosat, the vector E VFM i has additionally been corrected for temperature effects, spacecraft fields, and magnetometer non-linearities, as detailed in Olsen et al. (2020). Ḃ i is the secular variation measurement made from a ground observatory at (r i , t i ) using the methodology described in "Observatory data".
The model parameter vector to be determined consists of the coefficients of all timedependent Gauss splines g t for degrees 1 through 15, the time-independent Gauss coefficients g static for degrees 16 through N, coefficients of the alignment splines for CHAMP and Swarm, α , and finally the coefficients for the calibration splines for DMSP and Cryosat, c = (m, β) T . These parameters are determined by minimizing the following least-squares penalty function, are weights assigned to each residual (see "Data weighting"), and is a regularization matrix. Further details on the model regularization are given in the following section.

Model regularization
We perform regularization of the model parameters to give preference to solutions with desired physical properties. With the exception of g static , all model parameters vary in time, and so we apply temporal smoothing to prevent large and unphysical oscillations in their time series. Because all time variations are parameterized in terms of B-splines, we introduce the following B-spline Gram matrix, which will appear in the temporal regularization terms, In most cases, we will compute the integrals over the full support of the B-spline basis N i,k (t) so that t a , t b correspond to the minimum and maximum knot, respectively, and use the notation G (p) ij for simplicity. These matrices are square, symmetric, and banded, with lower bandwidth k − 1 . Their size is equal to the number of control points (coefficients) on the B-spline basis. The matrix G (0) is additionally positive definite, but for p > 0 the matrices are indefinite. Since the N i,k (t) are polynomials of degree k − 1 , the Gram matrix elements can be accurately and efficiently computed with Gauss-Legendre quadrature.
The time-dependent internal field parameters g t are regularized by minimizing the squared third time derivative of B r averaged over the core mantle boundary (CMB) with geocentric radius c = 3485 km, where S(c) is a spherical surface of radius c and [t min , t max ] is the time span (in years) of the data used in the modeling (see "Results and discussion" and Table 1). The spatial integral over S(c) is based on the so-called minimum energy norm (Gubbins 1983). The time integral helps to ensure a smooth temporal variation in the core field Gauss coefficients. We additionally found when co-estimating calibration scale factors and the time-dependent internal field, there were significant cross-correlations between the spherical harmonic degree 1 splines and scale factors (see "Results and discussion"). Therefore, we additionally minimize the dipole part of the squared second time derivative of B r , averaged over the CMB, across the gap period, Here, ⊗ denotes the Kronecker product (Van Loan 2000),

, gap 2
are regularization parameters, D is a diagonal matrix whose size is equal to the number of spherical harmonic coefficients parameterizing the time-dependent internal field, and whose entries are and D dipole corresponds only to the dipole portion of D, No regularization is applied to the static internal field coefficients g static .
The alignment parameter splines are regularized by minimizing an approximation to their curvature (the second time derivative), integrated over the time period of interest, where t is the time span of the corresponding satellite (CHAMP or Swarm). Equation 50 can be written as α , . . . is a block diagonal matrix with one block per satellite fluxgate instrument. Each satellite block has the form We use the same regularization parameter for each alignment parameter α 1 , α 2 , α 3 and also for all satellites, so that 2 α 1 = 2 α 2 = 2 α 3 . Finally, the fluxgate calibration parameter splines are regularized in a manner similar to the alignment parameters by minimizing their curvature, integrated over the time interval. This is equivalent to minimizing the term c , . . . is block diagonal, with one block per Cryosat/DMSP instrument. Each satellite instrument has a block of the form We use the same regularization parameter m for each matrix element spline and for all Cryosat/DMSP instruments, and similarly for the offsets ( β ).

The full regularization matrix is then given by
The regularization parameters 3 , gap 2 , α1 , α2 , α3 , m , β are chosen heuristically to prevent large and unphysical oscillations in the time series of the model parameters.

Data Weighting
The data weights appearing in Eq. (43) are a product of three factors: w (s) spatial factor to achieve a uniform weighting over all latitudes and longitudes w (d) dataset factor to account for the relative quality of different data sources w (r) a robust weight designed to reduce the effect of outliers in the data Since polar orbiting satellites sample the polar regions more frequently than low and mid latitudes, the factor w (s) attempts to assign larger weight to sparsely sampled (equatorial) regions and smaller weights to densely sampled (polar) regions. For the ground observatory network, this factor will assign larger weight to observatories in oceanic regions which are sparsely covered and lower weight to places like Europe and North America which have dense concentrations of ground observatories. To define spatial weights, we construct an equal area grid over Earth's surface, and count the number of data collected in each grid cell. The spatial weight of all data falling into a grid cell (i, j) is then defined to be proportional to the reciprical number of data in that cell, w (s) ij ∝ŵ (s) ij with ŵ (s) ij = 1/n ij , where n ij is the number of data in cell (i, j). The proportionality constant is chosen so that the spatial weights sum to unity: kl . The dataset-specific weight factors w (d) are used to account for the significantly higher noise characteristics of DMSP relative to the other satellites, due to its poorer attitude knowledge and larger spacecraft fields. We also use w (d) to assign higher weight to ground observations, since these are spatially scarcer than satellite measurements. For CHAMP, Swarm, and Cryosat, we select w (d) = 1 . For DMSP, we select w (d) = 0.1 , and finally for ground observatories we chose w (d) = 2.
The weights w (s) and w (d) are fixed once all the model data are known. The robust weights w (r) change during the course of the iterative least-squares procedure, since they (53) are designed to downweight large model outliers. These weights are discussed in more detail in "Least squares inversion".

Least-squares inversion
The cost function in Eq. (43) is minimized using a Levenberg-Marquardt nonlinear least-squares approach, combined with robust iterative re-weighting of the residuals to reduce the effect of outliers. We start with an initial guess where the core field parameters are set according a b Fig. 4 a Residual histograms for all satellites and field components used in Model A. b Residual histograms for ground observatory secular variation dataset used in Model A. Histograms are normalized to have unit area. Nonpolar histograms are calculated from residual data equatorward of 55 • QD latitude, while polar histograms use residuals poleward of 55 • to the IGRF prediction (Thébault et al. 2015b), the alignment parameters are set to zero, and the fluxgate calibration parameters are set to their ideal values (scale factors = 1, offsets = 0, non-orthogonalities = 0). Then proceeding from this initial guess, we solve a trust region subproblem for each iteration k, where ζ k is the total residual vector for iteration k, including vector residuals, scalar residuals, and regularization terms, J k = ∂ζ k /∂x k is the Jacobian matrix of the residuals and regularization terms, W k is the data weighting matrix, A k =J T k W kJk is an approximation to the Hessian matrix, µ k is the Levenberg-Marquardt parameter, and δx k is the step direction. We use the notation J to distinguish from the Jacobian matrix of the residuals without regularization terms, J, which is discussed later in the "Resolution" section. The model parameter vector is updated according to x k+1 = x k + δx k , provided the new step δx k reduces the cost function χ 2 . If a computed step does not reduce the cost function, it is rejected and the parameter µ k is increased to reduce the size of the trust region. If the step does reduce the cost function, then it is accepted and µ k is decreased to enlarge the trust region and take more ambitious steps in future iterations. The parameter µ k is updated according to the procedure given in Nielsen (1999). For each iteration, we test for convergence by checking for a small step vector δx k . These "inner" Levenberg-Marquardt iterations are capped at a maximum of 10, since we also implement "outer" iterations based on iterative re-weighting of the residuals (Farquharson and Oldenburg 1998). For each outer iteration, we re-weight the residuals using Huber's function (Huber 1996). This function assigns small weights to large residuals and large weights to small residuals, eventually ensuring that large outliers provide a negligible contribution to the final model parameters. We perform a total of 10 outer robust iterations.

Results and discussion
We built five core field models, listed in Table 1, which utilized different combinations of datasets and regularization methods. Model A, spanning 20 July 2000 to 31 December 2019, included measurements from CHAMP, Swarm A and B, DMSP F-16, F-17 and F-18, Cryosat FGM1, FGM2, and FGM3, and ground observatories. In this model, the CHAMP, Swarm, and observatory datasets have been calibrated previously against absolute magnetic field measurements, and therefore act as truth data to enable a robust estimation of the Cryosat/DMSP calibration parameters. Model A includes previously calibrated data during the entire time interval, completely covering the time spans of the Cryosat and DMSP datasets. Model A is regularized by minimizing the third time derivative of the radial field component averaged over the CMB ( 3 ), smoothing the alignment splines for CHAMP and Swarm ( α 1,2,3 ), and finally smoothing the calibration splines for Cryosat and DMSP ( m , β ). The values of these regularization parameters are given in Table 2. Figure 4 shows the Model A residual histograms (normalized so that integrated area equals unity) for each dataset and magnetic field component. The CHAMP and Swarm vector and nonpolar scalar (F) residuals present tall narrow peaks centered on a zero mean, indicating the low noise and general superior quality of those datasets. The DMSP and Cryosat residuals exhibit broader peaks, indicating the higher noise level of these data and possible contamination by non-geophysical field sources. All satellites exhibit asymmetries in the polar scalar residuals, due to unmodeled ionospheric fields at high latitudes. The observatory nonpolar histograms show peaks with near-zero mean for the time derivatives of each of the magnetic field components. The dB θ /dt histogram exhibits a broader peak than dB r /dt, dB φ /dt , which is likely due to effects of the magnetospheric ring current which are not fully removed from the observatory measurements. The ring current has the largest effect in the B θ component at low latitudes and climatological models do not fully account for its variability. The observatory polar histograms exhibit broader peaks than their nonpolar counterparts, indicating the higher noise level at highlatitudes due to ionospheric current systems and their induced fields which are not accounted for in our modeling or pre-processing. The residual statistics for Model A are provided in Table 3. Models B and C cover the same time interval as Model A, but use only satellite data, excluding the observatory dataset. They have the same number of model parameters as Model A. These models were built to investigate the quality of the co-estimated calibration parameters of Cryosat and DMSP during the gap period between CHAMP and Swarm when previously calibrated truth  shows the time series of the internal dipole Gauss coefficient g 0 1 (t) for CHAOS-6-x9 (red), Model A (blue), and Model B (green). We find close agreement between CHAOS-6-x9 and Model A during the full time interval, with only small differences during the gap period between CHAMP and Swarm, indicated by the blue rectangular region in the figure (see also inset zoomed view). CHAOS-6-x9 relies entirely on observatory data during this gap period, while Model A uses both observatories and Cryosat/DMSP data, which could account for these small differences. Interestingly, the g 0 1 (t) of Model B matches Model A closely during the   CHAMP and Swarm periods, but exhibits a deviation of several tens of nanoTeslas in the gap period. To investigate further, we present the time series of the Cryosat FGM1 scale factor s 3 (t) (middle panel) and DMSP F-16 scale factor s 1 (t) (bottom panel). In these panels, the red curves are the result of performing a vector calibration of the respective satellite dataset against the CHAOS-6-x9 model using the procedure detailed in the "Fluxgate calibration" section. The blue and green curves are the solutions obtained from our co-estimation approach for Models A and B. We find that the scale factors of Model A closely follow the independent CHAOS-6-x9 calibration. The scale factors of Model B agree with Model A during the CHAMP and Swarm periods, but exhibits clear deviations of up to a tenth of 1% during the gap. A tenth of 1% error in the scale factors could easily translate into tens of nanoTesla errors in the dipole strength. We found that this error in the calibration parameters during the gap period of Model B was restricted only to the scale factors; the offsets, non-orthogonalities, and alignment parameters (not shown) exhibited close agreement with Model A during the full time interval. We further inspected the time series of the other Gauss coefficients and found reasonable agreement between Models A and B, and so only the dipole term g 0 1 (t) exhibits such a disturbance during the gap period.
For Model A, the ground observatory dataset is acting as a constraint on the dipole term through the gap period. However, to build forward-looking models using platform magnetometers in an era without a mission like Swarm, it may be necessary to use other strategies, in addition to observatory data. Many observatories take months, and sometimes years, to provide their data to INTERMAGNET. Also, to estimate secular variation at an observatory location, our method requires measurements from 6 months past the day of interest. Therefore, we would like the ability to build geomagnetic field models from platform magnetometer data accounting for possible delays in the availability of ground observatory measurements. To prevent anomalous behavior in g 0 1 (t) during periods without previously calibrated measurements, we built Model C, which was calculated from exactly the same datasets as Model B, except we apply an additional regularization to minimize the second time derivative of the dipole part of B r averaged over the CMB during the gap period ( gap 2 ), as discussed in "Model regularization". This constraint essentially forces the internal dipole terms g 0 1 (t), g 1 1 (t), g −1 1 (t) to follow a near-linear trend through the gap period, preventing the large deviation seen in Model B. We believe this is justified since the dipole part of the field changes slowly on the timescales of interest here, and can be well approximated by a linear fit. The resulting Model C curve for g 0 1 (t) is shown in Fig. 5 (top panel, orange). The zoomed inset view shows that Model C closely follows Model A and CHAOS-6-x9. The middle and bottom panels additionally demonstrate that the scale factors predicted by Model C agree remarkably well with Model A and CHAOS-6-x9. These results  Table 4 show that the platform magnetometers on Cryosat and DMSP can be accurately calibrated using one of the following two methods: 1 Inclusion of previously calibrated data spanning the full calibration time interval (e.g., Model A). 2 Specification of the internal dipole during the calibration time interval, either through regularization constraints or an a priori model (e.g., Model C).
We built two additional models (D and E). Model D was built only from CHAMP and Swarm A/B data, using the gap 2 regularization, to test whether Cryosat and DMSP are in fact providing significant information on secular variation and acceleration during the gap years, or if regularization is playing a dominant role during this time. Model E was built entirely from Cryosat and DMSP data, again using the gap 2 regularization, to investigate the necessity of using the CHAMP and Swarm datasets to achieve a robust estimation of the calibration parameters. Due to the higher altitudes of Cryosat and DMSP, we do not fit a static internal field to Model E. These models will be discussed further in the "Model validation" section. Figure 6 shows time series of the fluxgate magnetometer calibration parameters, after removing their mean values, estimated from the three fluxgate instruments onboard Cryosat (top three rows) and the DMSP F-16,  Table 4. The Cryosat FGM2 offset o 3 (t) exhibits large annual variations compared with FGM1 and FGM3. Also, the FGM2 non-orthogonality angle u 2 (t) shows several localized spike features which are not present in FGM1 or FGM3. We do not have an explanation for this, but we believe these features are related to the significantly larger σ in the B r component of FGM2 as seen in Table 3 compared with FGM1 and FGM3. The DMSP satellites exhibit larger variations in nearly all calibration parameters compared with Cryosat, which is indicative of the higher noise characteristics of the DMSP measurements. Nevertheless, our analysis of the relationship between the calibration and dipole solution g 0 1 (t) as previously discussed indicates that both the DMSP and Cryosat calibration parameters are robustly estimated. Figure 7 shows the alignment parameters computed to rotate the vector magnetic field measurement from the instrument (VFM) frame to the common reference frame. We use the Euler angle 1-2-3 convention for all satellites. The mean value of each Euler angle time series is removed prior to plotting. These mean values are provided in Table 4. Variations of the alignment parameters about their mean values can be caused by multiple phenomena, such as thermal variations, mechanical noise, boom twisting, boom oscillations, spacecraft fields, and unmodeled geomagnetic sources, particularly from the ionosphere. We find variations of several tens of arcseconds for the CHAMP, Swarm, and Cryosat instruments over multiple years of observations. DMSP presents the largest variations, spanning several hundred arcseconds over its model time period. This is likely due to the less accurate attitude knowledge on DMSP due to a lack of star cameras, as well as the higher noise level of the DMSP instruments, including large spacecraft fields which could not be removed from the data. Similar variations in CHAMP and Swarm alignment parameters were analyzed by Maus (2015) who hypothesized they could be due to either unmodeled ionospheric fields or a stellar aberration effect which is not fully corrected in the star camera data. Herceg et al. (2017) performed a detailed analysis of this effect on the Swarm mission, ruling out a flawed stellar aberration correction, and attributing the effect to thermal variations in the optical bench system. We find similar variations in our DMSP alignment parameters, and since DMSP does not carry star cameras for attitude determination, we can say at least for these satellites that the effect is not due to a flawed stellar aberration correction. Whether the source of these variations comes from unmodeled ionospheric fields, thermal variations, or some other phenomena is beyond the scope of this study.  Alken et al. Earth, Planets and Space (2020) 72:49 Model validation

Model comparisons with ground observatories
Model A is the only model which includes measurements from the ground observatory network, and so these ground stations offer a convenient way of evaluating the quality of the satellite-only models, particularly during the gap period. Due to the difficulties posed by large localized crustal anomalies in observatory measurements, we opt to use secular variation time series as discussed in "Observatory data" in order to validate our models. We are particularly interested in the quality of Model C, which is derived from all four available satellite missions, and contains no previously calibrated data during the gap period. Figure 8 shows examples of secular variation recorded by five ground observatories (KOU, MBO, ASC, HON, HER) along with our prediction of Model A (green) and Model C (red) for each field component. The blue shaded region indicates the gap period between the CHAMP and Swarm missions. We can see by visual inspection that Model A exhibits excellent agreement with the secular variation at each of these stations both inside and outside the gap period, particularly in  the Ḃ r and Ḃ φ components. This is unsurprising since these same ground observatory data are used in building Model A. The Ḃ θ datasets show larger variance due to external fields which are difficult to remove from the data. Model C, which uses no observatory data, shows reasonable agreement with the secular variation time series, although there are some exceptions, such as the B r component of Mbour during the gap period. It additionally exhibits undulations in the gap region, which indicates there may be a need to regularize higher degree spherical harmonics in the model beyond the dipole regularization ( gap 2 ). To quantify the secular variation predictions of our models during the gap period, we computed the mean, standard deviation, median, and median absolute deviation (MAD) of the residuals between the observatory SV time series and our models. We include the median and MAD since they are robust to a small number of outliers, which occasionally occur in some station data due to instrument errors. For this calculation, we used only ground observatories equatorward of ±40 • QD latitude to minimize noise due to the high-latitude polar ionospheric current systems. These criteria resulted in a total of 99 observatories for the analysis. The results are shown in Table 5. Model A (all data sources) exhibits residual median absolute deviations of 3. 70, 8.69, and 4.56 nT/year for the B r , B θ , B φ components respectively throughout the gap interval 19 September 2010 to 22 November 2013. These numbers increase for Model C, which relies only Cryosat and DMSP during the gap to 4. 50, 8.85, and 4.82 nT/year, respectively. This modest increase is likely due to some remaining ambiguity in separating the Cryosat and DMSP calibration parameters from the internal field without having a previously calibrated dataset like INTERMAGNET available. Model D, built from CHAMP and Swarm only shows even larger residual MAD values of 5.11, 10.62, and 4.85 nT/year, respectively. Model D uses the gap 2 regularization during the gap but does not include any measurements during this period. These numbers represent increases in all components relative to Model C, which confirms that Cryosat and DMSP are providing meaningful information on secular variation during the gap period, beyond what regularization can provide. Model E, built from only Cryosat/DMSP with no previously calibrated datasets, shows significantly larger residuals in all components compared with Model C. While Model E applies the gap 2 regularization similar to Model C, it is clear that the model quality deteriorates without the inclusion of previously calibrated datasets such as CHAMP and Swarm. Figure 9 shows the Lowes-Mauersberger spectra (Mauersberger 1956;Lowes 1966Lowes , 1974 for the main field (red), secular variation (green), and secular acceleration (blue) for four different epochs at the Earth's surface. Solid lines present the spectra for Model A, dashed lines represent CHAOS-6-x9, and dotted lines are the differences between the two models. We present the spectra only for the time-varying portion of the field, up to spherical harmonic degree 15. We find close agreement for the main field at all degrees up to 15, with differences of 1 nT 2 or less, except for the epoch 2012.5, which shows slightly larger main field differences. This is likely due to the different datasets used to fill in the gap period in the two models. The secular variation spectra exhibit close agreement for all spherical harmonic degrees with differences of order 1 (nT/year) 2 or less. The secular acceleration spectra show agreement to within about 1 (nT/year 2 ) 2 for epochs 2006, 2009.5, and 2017, with the exception of the degree 1 terms in 2017. The degree 1 secular acceleration can vary between models due to the way in which internal fields are separated from external magnetospheric fields. CHAOS co-estimates a spherical harmonic degree 2 external field (Olsen et al. 2014) while Model A does not. For epoch 2012.5, we see visible differences in the secular acceleration for all spherical harmonic degrees, with Model A showing more power at all degrees. This could be due to the inclusion of Cryosat and DMSP data in Model A, as well as differences in model regularization across the gap period.

Resolution
Resolution analysis provides a mechanism to investigate the ability of the data to determine (or resolve) the model parameters. The model resolution matrix R relates the unknown true model parameters x true to the estimated parameters x through the relation x = Rx true (Menke 2012;Aster et al. 2013). For our nonlinear inverse problem, if we assume that the model is approximately linear in the vicinity of the estimated model parameters (Tarantola 2005), then the resolution matrix is defined as where the Jacobian matrix J is evaluated at the final estimate of the model parameters x . However, following the approach of Sabaka et al. (2004), we can partition our data into subsets and define model resolution matrices for each subset i, as well as a resolution matrix for the a Here, the J i matrices are constructed only from the ith data subset and similarly for the weight matrices W i . The trace of R i gives the number of model parameters resolved by the ith data subset, while the trace of R gives the number of parameters resolved by the a priori regularization (Tarantola 2005;Sabaka et al. 2004). In the present study, we are particularly interested in the timevarying internal field parameters resolved by Cryosat and DMSP beyond what the ground observatories alone can resolve during the gap period. We therefore define four data subsets for Model A: (1) CHAMP, (2) Swarm A and B, (3) ground observatories, and (4) Cryosat and DMSP. The trace of the resolution matrices for these subsets is given in Table 6. Column N i in the table provides the total number of measurements in data subset i, while for the regularization norm , N i = rank(�) . The matrices R i in the table refer to the diagonal block of R i corresponding to only the time-varying internal field parameters. Here, we can see that CHAMP is resolving about 8.0% of the time-dependent internal field parameters compared to Swarm's 5.5%. This is primarily due to CHAMP spanning a longer time interval than Swarm. Interestingly, we find that the ground observatories are resolving about 1.7% time-varying internal field parameters compared with 1.9% for Cryosat and DMSP. This indicates that these satellites are indeed providing additional information about the internal field parameters beyond what ground observatories alone can provide, particularly since these ground measurements span the full 20-year time period, while Cryosat and DMSP combined span only 10 years. However, we note that because the data weighting of subsets is included in the resolution matrices (Eqs. (56), (57)), we must exercise caution when interpreting these numbers especially due to the large number of measurements contained in the Cryosat/DMSP subset relative to the ground observations. Additional insight can be gained by plotting the diagonal entries of the resolution matrices. Figure 10a presents the diagonal entries of R i (Model A) for each of our four data subsets, as well as the full resolution matrix [ diag(R) , Eq. (55)], plotted as a function of coefficient index. The full resolution diagonal entries are shown in grey, CHAMP in blue, Swarm A/B in red, observatories in orange, and Cryosat/DMSP in green. The figure is partitioned into four sections, as indicated by the arrows at the top. The time-dependent internal part is characterized by a series of spikes in resolution followed by a rapid decrease. The ordering of coefficients in this section is as follows. There are a total of 43 B-spline coefficients g m n,i for each spherical harmonic degree n and order m as written in Eq. (21). Since we expand the time-varying coefficients to degree and order 15, there are a total of 255 spherical harmonic coefficients for each of the 43 B-spline parameters. We organize the coefficient indices into 43 blocks, each block containing 255 SH coefficients for that block, with the n = 1 dipole terms first, then n = 2 and so on. The spikes seen in Figure 10a indicate an increase in resolution for the low degree SH coefficients, followed by a decrease in resolution as we move toward the degree 15 parameters. This resolution decrease is the result of the 3 regularization which primarily aims to reduce the uncertainty in the higher SH degrees at the cost of decreased resolution. Because the time-varying internal field coefficients are organized in blocks according to the B-spline coefficients, as we move higher in coefficient index, the resolution values correspond approximately to a time series, due to the local nature of the B-spline basis functions. The resolution of the time-varying parameters is explicitly plotted as a function of time and spherical harmonic degree for datasets: CHAMP and Swarm (b), Cryosat and DMSP (c), observatories (d), and all data sources (e). Returning to panel (a), we can clearly see that the total resolution (grey) exhibits a dip below 0.8 in the gap period between CHAMP (blue) and Swarm (red). This indicates that the 3 regularization is playing a larger role in the gap period and that the combined ground observatory, Cryosat, and DMSP datasets cannot provide the same resolution as the CHAMP and Swarm datasets. In the inset zoom view in the right of the figure, we expand this gap region further to visualize the differences between observatories (orange) and Cryosat/ DMSP (green). The amplitude of the spikes (due to the low SH degree parameters) are higher for Cryosat/DMSP, but as mentioned previously, this could be due to relative weighting differences between the two datasets. What is more interesting is that the the observatory curves drop to nearzero resolution discernibly faster than the Cryosat/DMSP curves as we move to higher SH degrees. This feature is less likely to be caused by relative weighting differences, as it shows that ground stations are unable to provide resolution for the internal field parameters past some threshold, while Cryosat/DMSP continue to provide a meaningful resolution. For example, we find that during the gap period, the observatory resolution drops below 0.1 at spherical harmonic degree 5, while Cryosat/DMSP resolution drops below the same threshold at SH degree 8. This observation, combined with the validation results of the "Model comparisons with ground observatories" section, allows us to conclude that Cryosat/DMSP are indeed providing valuable information about the time-varying internal field parameters to spherical harmonic degrees higher than what ground observatories and regularization alone can provide.
The static field portion of Fig. 10a shows CHAMP (blue) contributing significantly higher resolution than the other datasets, with its resolution increasing at the higher SH degrees. This is expected due to CHAMP's lower altitude at the end of its mission life. CHAMP's ability to resolve higher degrees of the lithospheric field is further analyzed by Olsen et al. (2017). The total resolution (grey) is exactly Fig. 11 Time/longitude maps of secular acceleration in the B r component (top row) and third time derivative of B r (bottom row) at the core mantle boundary along the geographic equator. Maps are generated using spherical harmonic degrees 1 to 10 for Model A (left column) and CHAOS-6-x9 (right column). Dashed lines indicate gap period between CHAMP and Swarm one for these parameters, since we have not applied any regularization to the static part of the model. The next portion shows the alignment parameters due to CHAMP and Swarm, which exhibit significant variation between about 0.4 and 0.9 resolution. We have not fully investigated the cause of this, but unmodeled signals from the ionosphere and magnetosphere are often responsible for uncertainties in alignment parameters, and these variations could likely be reduced by applying heavier regularization to the alignment splines. A similar statement could be made for the Cryosat/DMSP calibration parameters shown in the last portion of the figure. Here, we see that Cryosat's parameters (the first three signals shown in green, one for each of the fluxgate instruments) are quite well resolved, above 0.9 for all three instruments. We attribute this to the accurate attitude knowledge provided by Cryosat's star cameras, as well as the careful data cleaning and preprocessing performed by Olsen et al. (2020). The DMSP calibration parameters (last three signals shown in green, one for each satellite F-16, F-17, F-18) show much larger variations, which are likely due to the poor attitude knowledge and significant spacecraft noise present on these satellites.

Secular acceleration
Several recent studies have identified pulses of secular acceleration occurring on sub-decadal timescales (Olsen and Mandea 2007;Lesur et al. 2008;Chulliat et al. 2010;Chulliat and Maus 2014;Chulliat et al. 2015;Torta et al. 2015;Finlay et al. 2016). Secular acceleration pulses have been identified in models built from CHAMP, Swarm, and the ground observatory network in , 2009.5 (Chulliat et al. 2015. The occurrence of this latter pulse during the gap period between CHAMP and Swarm is one of the motivations of calibrating other space-based fluxgate instruments in the present study to obtain a global picture of these rapid field changes. To analyze secular acceleration, it is necessary to downward continue the radial component of the field to its outer source region at the core mantle boundary (CMB, 3485 km radius). To do this, we assume electrical currents in the mantle are negligible on our timescales of interest, so that the magnetic field can be represented as a potential field at the CMB, allowing a straightforward downward continuation. Figure 11 (top row) presents longitude vs time maps of the secular acceleration to spherical harmonic degree 10 along the geographic equator at the CMB for our Model A (left) and CHAOS-6-x9 (right). The black dashed lines indicate the gap period between CHAMP and Swarm data, and the signal during this time represents the previously reported 2012.5 pulse. Model A and CHAOS-6-x9 exhibit some agreement during this period, although there are some small-scale differences in the patches of secular acceleration at different longitudes. In particular, Model A shows more power in the secular acceleration pulses during this time, which could be due to the addition of the Cryosat and DMSP datasets for Model A, while CHAOS-6-x9 uses only ground observatories during the gap period. They could also be due to differences in regularization applied to the different models. A careful analysis of these differences is beyond the scope of the present study, but our validation results of the "Model validation" section indicate that DMSP and Cryosat are providing valuable information about secular acceleration during this period. Figure 11 also indicates the presence of a fourth secular acceleration pulse around 2017. The bottom row of Fig. 11 presents longitude vs time plots of the third time derivative of B r at the CMB along the geographic equator for Model A (left) and CHAOS-6-x9 (right). Here, we see more clearly the geomagnetic jerks occurring in 2005, 2008, 2011, and 2014 in the Atlantic sector. We also see indications of a jerk around 2018 to 2019, which we expect will become more visible in the coming years as more satellite data are recorded. These figures again demonstrate close agreement in the large-scale features between the two models, with some small-scale differences. In particular, we note the significantly stronger signal in 2011 for Model A compared with CHAOS. This again is possibly due to the increased resolution provided by Cryosat and DMSP at the higher spherical harmonic degrees. Figure 12 presents spatial maps of the secular acceleration at the CMB for Model A at epochs corresponding to strong equatorial pulses. We see pulses along the equator which change polarity roughly every 3-4 years. We additionally see these pulses localized in the Atlantic region in 2006 and moving toward the Pacific into 2017.

Conclusions
We have presented a novel approach to co-estimating the calibration parameters of space-based fluxgate magnetometers simultaneously with internal spherical harmonic coefficients representing Earth's geomagnetic core field. Our approach eliminates the need for an a priori internal field model for calibration, such as IGRF, which could contaminate the calibration parameters with model errors, though at present we do use a prior external field model which could affect the final model parameters. Instead, our method requires a reference dataset of previously calibrated measurements. The present study uses CHAMP, Swarm, and ground observatories to provide this reference dataset. We supplemented these reference data with six additional space-based magnetic instruments (three DMSP satellites and three fluxgates carried by Cryosat). We found that we could robustly estimate calibration parameters for each of these six instruments, as well as obtain internal field model parameters which closely match existing state-of-the-art field models. To obtain a satisfactory separation of these different sets of model parameters during the gap period between CHAMP and Swarm, we found it was critical to either (1) include previously calibrated data from ground observatories (e.g., Model A) or (2) apply an aggressive regularization to the dipole internal field parameters (e.g., Model C).
Including DMSP and Cryosat data provided improved predictions of secular variation at 99 low-and mid-latitude ground observatories during the gap period, and a resolution analysis confirms that these datasets can resolve the time-varying internal field to a higher spherical harmonic degree than ground observatories. The DMSP and Cryosat data were also able to record a global picture of the secular acceleration pulse occurring in 2012.5 to spherical harmonic degree 10. We believe that future studies on the origin of these secular acceleration pulses would benefit significantly from including DMSP and Cryosat data during the gap period.