 Full paper
 Open Access
Kalmag: a high spatiotemporal model of the geomagnetic field
Earth, Planets and Space volume 74, Article number: 139 (2022)
Abstract
We present the extension of the Kalmag model, proposed as a candidate for IGRF-13, to the twentieth century. The dataset serving its derivation has been complemented by new measurements coming from satellites, ground-based observatories and land, marine and airborne surveys. Like its predecessor, this version is derived from a combination of a Kalman filter and a smoothing algorithm, providing mean models and associated uncertainties. These quantities permit a precise estimation of the locations where mean solutions can be considered reliable. The temporal resolution of the core field and the secular variation was set to 0.1 year over the 122 years spanned by the model. Nevertheless, ensembles sampled a posteriori show that this resolution is effectively achieved only for a limited number of spatial scales and during certain time periods. Unsurprisingly, the highest accuracy in both space and time of the core field and the secular variation is achieved during the CHAMP and Swarm eras. In this version of Kalmag, a particular effort was made to resolve the small-scale lithospheric field. Under specific statistical assumptions, the latter was modeled up to spherical harmonic degree and order 1000, and signal from both satellite and survey measurements contributed to its development. External and induced fields were jointly estimated with the rest of the model. We show that their large scales could be accurately extracted from direct measurements whenever the latter exhibit a sufficiently high temporal coverage. Temporally resolving these fields down to 3 hours during the CHAMP and Swarm missions gave us access to the link between induced and magnetospheric fields. In particular, the period dependence of the driving signal on the induced one could be directly observed. The model is available through various physical and statistical quantities on a dedicated website at https://ionocovar.agnld.uni-potsdam.de/Kalmag/.
Introduction
Separating the different contributions to the Earth’s magnetic field from direct measurements of it is a difficult task. The main difficulty is the wide range of overlapping spatial and temporal scales. The core field, which is sustained by dynamo action in the Earth’s outer core, is the dominant large-scale field at the Earth’s surface, and it evolves on timescales ranging from months to millennia. In contrast, the lithospheric field is dominant at small scales. Emanating from the remanent magnetization of the rocks lying within the crust, it follows the motions of the latter and therefore varies very slowly with time. External sources, such as the magnetospheric fields or the ionospheric field, are driven by thermospheric winds and solar radiation. Their direct link to solar activity makes them subject to intense variations on timescales ranging from very short to decadal. These fluctuations induce currents within the electrically conducting parts of the crust and the mantle, which in turn generate a secondary magnetic field. Induction processes also occur within the oceans. The circulation or tidal motions of the latter within the ambient magnetic field create electrical currents which also produce a secondary field.
From the seventeenth century to today, geomagnetic data have been continuously accumulated. First collected during marine and land surveys, measurements of the Earth’s magnetic field were quickly complemented by instrumentation installed within ground-based observatories [see Jackson and Finlay (2007)]. The development of aviation in the 1950s offered another platform from which to measure the field. But the biggest step in geomagnetic monitoring certainly came with the rise of low-orbiting satellite missions. Starting in 1965 with the POGO mission, many spacecraft dedicated to geomagnetic field modeling were later launched. These include, non-exhaustively, the MagSat, Oersted and CHAMP spacecraft and the Swarm constellation.
Technical constraints on building geomagnetic field models strongly depend on the type of data to be assimilated. Satellite missions provide measurements at a high frequency. The algorithms they feed therefore need to be adapted to treat a large number of observations. Land, marine and airborne surveys operate at or slightly above the Earth’s surface. As a consequence, the contribution of the small-scale lithospheric field to the data they produce is important. Accounting for this field requires modeling it at a very high resolution, which is a technical challenge.
Many models of the geomagnetic field have been proposed over the last decades [see Hulot et al. (2015)], and most of them were obtained with a regularized least-squares approach. This is the case for the CHAOS model series from Olsen et al. (2006) to Finlay et al. (2020), the comprehensive models by Sabaka et al. (2002, 2015, 2018, 2020), the GRIMM models by Lesur et al. (2008, 2010, 2015), the POMME models by Maus et al. (2005, 2010), and gufm1 by Jackson et al. (2000). Least-squares methods are numerically very efficient, but the commonly used reweighted version can only provide a single solution. In contrast, Bayesian inversions are computationally demanding, but their results are expressed in terms of posterior distributions and therefore provide mean solutions together with their associated uncertainties. Bayesian inversion in the context of geomagnetic field modeling was initiated by Gillet et al. (2013). Considering ground-based observatory, survey and satellite data, they derived the COVOBS model spanning the \(1840-2010\) time window, a period which was recently extended to 2020 by Huder et al. (2020). Similar efforts were pursued by Holschneider et al. (2016) in a study where emphasis was put on better characterizing the spatial properties of the different magnetic sources through correlation kernels. Extending this work to the time domain and making the problem sequential, Baerenzung et al. (2020) and Ropp et al. (2020) derived geomagnetic field models from the combination of a Kalman filter and a smoothing algorithm. This approach preserves all the advantages of the Bayesian method proposed by Gillet et al. (2013) and alleviates most of its drawbacks. In particular, the dimension of the system, the number of observations to be assimilated, and the nonlinear link between certain magnetic sources are no longer strong limiting factors.
In this paper, we present the extension of the Kalmag model by Baerenzung et al. (2020) to the twentieth century. Derived only from CHAMP and Swarm data, Kalmag covered the \(2000.5-2020\) time period and was a candidate for the IGRF-13 model [see Alken et al. (2021)]. The present version results from the assimilation of additional measurements taken by ground-based observatories, by the POGO, MagSat and Oersted satellites, and during land, airborne and marine (L.A.M.) surveys. To assimilate the latter type of data, we introduced a statistical approximation within the Kalman filter algorithm enabling us to resolve the lithospheric field up to spherical harmonic degree and order \(\ell =1000\). Therefore, there is no need to subtract the lithospheric contribution from L.A.M. survey observations with high-resolution models such as the WDMAM by Lesur et al. (2016), the EMM model by Maus (2010) or the recent model of Thébault et al. (2021) to build the model. In addition, a small-scale lithospheric field model could be recovered without preprocessing of the data.
The article is organized as follows. In the first part, the dataset used to construct the model and the selection criteria applied are presented. In the second part, the different magnetic sources, their prior characterization and their dynamical behavior are detailed. At the end of that part, the formulations used to assimilate data, update the model and sample it are provided. In the "Results" section, the properties of our model for the core field, the secular variation, the lithospheric field and the external and induced fields are discussed. The article ends with a discussion and some concluding remarks.
Data
The proposed model was derived from either vector field or intensity measurements of the geomagnetic field taken from 1900.0 to today by satellites, ground-based observatories and during land, airborne and marine surveys. Satellite observations from five different missions were considered: POGO (1965–1971) (e.g., Cain and Sweeney 1973), MagSat (1979–1980) (e.g., Langel and Estes 1985a), Oersted (since 1999) (e.g., Neubert et al. 2001), CHAMP (2000–2010) (e.g., Rother et al. 2000), and Swarm (since 2013) (e.g., Olsen et al. 2013). For ground-based observatories, hourly mean vector fields provided by the World Data Center for Geomagnetism from 1886 (e.g., Macmillan and Olsen 2013), and selected through the procedure detailed in the following, were used to derive secular variation data, the latter being used only to constrain the core field evolution. These types of observations, which also feed other models such as the CHAOS series by Olsen et al. (2006) and Finlay et al. (2020), the C3FM by Wardinski and Holme (2011) and Wardinski et al. (2020), or the COVOBS model by Gillet et al. (2013) and Huder et al. (2020), were obtained here by first averaging vector field measurements over 0.1-year time windows. The resulting mean values \(\bar{\mathbf{b }}(t)\) were then used to derive secular variation data \({\gamma }(t)\) through the relation \({\gamma }(t) = \bar{\mathbf{b }}(t+0.5\,\mathrm{yr}) - \bar{\mathbf{b }}(t-0.5\,\mathrm{yr})\). The location of each observatory taken into account is displayed with black triangles in Fig. 1. For aeromagnetic, land and marine survey data, three compilations served the model derivation (e.g., Quesnel et al. 2009). The first one is provided by the British Geological Survey at www.wdc.bgs.ac.uk/, the second one by the National Oceanic and Atmospheric Administration at maps.ngdc.noaa.gov and the third one is made accessible by the U.S. Geological Survey at www.mrdata.usgs.gov. The positions of L.A.M. data are typically given through the latitude, longitude and altitude of the measuring vessel. For airborne measurements, whenever altitude was provided by a radar altimeter, it was corrected above land surfaces with the ETOPO1 global relief model of Amante and Eakins (2009).
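The derivation of secular variation data described above can be sketched as follows. This is a minimal illustration and not the authors' processing code: the function name, its arguments and the handling of empty bins are assumptions made for the example. It averages vector data over 0.1-year windows and then differences means that are one year apart, so that \({\gamma }(t)\) is expressed in nT/yr.

```python
import numpy as np

def secular_variation(times, b, t0, window=0.1, lag_bins=5):
    """Sketch of gamma(t) = bbar(t + 0.5 yr) - bbar(t - 0.5 yr).

    times : (N,) decimal years, all >= t0 (hypothetical input layout)
    b     : (N, 3) vector field components in nT
    t0    : start of the first averaging window
    Assumes the data span more than 2 * lag_bins windows.
    """
    times = np.asarray(times, dtype=float)
    b = np.asarray(b, dtype=float)
    idx = np.floor((times - t0) / window).astype(int)
    n_bins = idx.max() + 1
    # Window means; windows without data stay NaN (an assumption of this sketch).
    means = np.full((n_bins, b.shape[1]), np.nan)
    for k in range(n_bins):
        sel = idx == k
        if sel.any():
            means[k] = b[sel].mean(axis=0)
    centres = t0 + (np.arange(n_bins) + 0.5) * window
    # 0.5 yr corresponds to lag_bins windows of length `window`.
    gamma = means[2 * lag_bins:] - means[:-2 * lag_bins]
    return centres[lag_bins:-lag_bins], gamma
```

Because the two means are exactly one year apart, a field that changes linearly at 10 nT/yr yields \({\gamma } = 10\) nT/yr at every epoch, which is a convenient sanity check.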
Before being assimilated, each datum containing vector information, such as North, East, Down components or declination, inclination and intensity, was projected in geographic spherical coordinates. The resulting dataset was then subject to selection. The main purposes of this procedure are to avoid the contribution of the dayside ionospheric field, which is not modeled, to operate during low geomagnetic activity and, for satellite observations, to be weakly perturbed by the substorm auroral electrojet. The latter two criteria were fulfilled through a selection based on the values of independently derived indices: respectively, a given threshold on the Kp geomagnetic index and the required positiveness of the z-component of the interplanetary magnetic field (IMF). The Kp threshold was set to \(2^-\) for satellite data and to \(4^-\) for all other observations. To limit the contribution of the dayside ionospheric field, only nighttime measurements (when the sun is below the horizon) were kept at magnetic latitudes lying between \(\pm 60^\circ\). This constraint was nevertheless relaxed for MagSat data, for which the satellite followed a dawn–dusk orbit, and for some land survey data which were either not dated precisely enough to determine their local solar time, or only used to derive the lithospheric field model. Note also that for the CHAMP and Swarm satellites, measurements were required to be taken when both the vector field magnetometer and the star tracker were functioning in nominal mode.
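The selection logic can be illustrated with a small predicate. This is only a sketch under stated assumptions: the `Datum` record and its field names are hypothetical, the Kp thresholds are passed in by the caller rather than hard-coded, data exactly at a threshold are assumed to be kept, and the IMF criterion is applied to satellite data only, as the text suggests.

```python
from dataclasses import dataclass

@dataclass
class Datum:
    # Hypothetical record layout, introduced for illustration only.
    kp: float             # Kp index at the time of the measurement
    imf_bz: float         # z-component of the IMF (nT)
    mag_lat: float        # magnetic latitude (degrees)
    sun_elevation: float  # solar elevation at the measurement point (degrees)
    is_satellite: bool

def keep(d, kp_max_sat, kp_max_other, lat_limit=60.0, require_night=True):
    """Return True if the datum passes the selection sketched in the text."""
    kp_max = kp_max_sat if d.is_satellite else kp_max_other
    if d.kp > kp_max:                      # low geomagnetic activity
        return False
    if d.is_satellite and d.imf_bz <= 0.0:  # positive IMF Bz (satellite data)
        return False
    # Night-side only within +/- lat_limit of magnetic latitude; the flag
    # require_night=False mimics the relaxation used for some datasets.
    if require_night and abs(d.mag_lat) <= lat_limit and d.sun_elevation >= 0.0:
        return False
    return True
```

Passing `require_night=False` reproduces the relaxed treatment described for MagSat and for poorly dated land survey data.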
Finally, each L.A.M. survey and satellite dataset was subsampled. For the POGO, MagSat and Oersted satellites, a rate of 1 datum every 10 s (0.1 Hz) was chosen. For the CHAMP satellite, the sampling rate was increased to 0.2 Hz. For Swarm, only satellites Alpha and Bravo are considered, with a simultaneous sampling rate of 0.1 Hz. Distance criteria were applied to subsample L.A.M. survey data. In a first selection, a minimum distance of 5 km between any data points within 1 h time windows was imposed. Every measurement lying too close to the previously selected ones was removed. The resulting dataset was then split into 8 subsets in which the minimum distance was set to 40 km. Therefore, at a given epoch within the Kalman filter algorithm, data from each of these subsets were sequentially assimilated whenever they were available.
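A greedy thinning of this kind can be sketched as follows. The code is illustrative and rests on assumptions: points are (latitude, longitude) tuples in degrees, distances are great-circle distances on a sphere of radius 6371.2 km, the 1 h time windows are left out for brevity, and the way the 8 subsets are filled (repeated thinning of the remaining points) is a guess at one possible procedure.

```python
import math

EARTH_RADIUS_KM = 6371.2  # assumed mean radius for the distance computation

def great_circle_km(p, q):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def thin(points, min_dist_km):
    """Keep a point only if it is far enough from every point already kept."""
    kept = []
    for p in points:
        if all(great_circle_km(p, q) >= min_dist_km for q in kept):
            kept.append(p)
    return kept

def split_into_subsets(points, min_dist_km, n_subsets):
    """Fill subsets one after the other by thinning what remains (assumption)."""
    subsets, remaining = [], list(points)
    for _ in range(n_subsets):
        chosen = thin(remaining, min_dist_km)
        subsets.append(chosen)
        remaining = [p for p in remaining if p not in chosen]
    return subsets
```

The quadratic cost of `thin` is acceptable for a sketch; a production implementation would typically rely on a spatial index.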
In Table 1, the time period, the selection criteria and the type and total number of measurements associated with each dataset are summarized.
Magnetic sources
Seven sources compose the Kalmag model. These are a core field (\(b_c\)), a lithospheric field (\(b_l\)), an induced/residual ionospheric field (\(b_{ii}\)), remote (\(b_{rm}\)), close (\(b_m\)) and fluctuating (\(b_{fm}\)) magnetospheric fields, and a source associated with field-aligned currents (\(b_{fac}\)). Except for \(b_{fac}\), each of these sources \(b_s\) is assumed to derive from a potential \(V_s\) such that \(b_s = -\nabla V_s\). For \(b_{fac}\), as in Sabaka et al. (2004), the currents themselves are assumed to derive from a potential \(V_{fac}\). Waters et al. (2001) have shown that under this assumption the resulting magnetic field can be expressed as \(b_s = \mathbf{r} \times \nabla V_s\).
The potentials \(V_s\) are then expanded in spherical harmonics (SH) such that potentials of internal and external origin are, respectively, given by:
where \(Y_{\ell ,m}\) are Schmidt semi-normalized spherical harmonics of degree \(\ell\) and order m considered, respectively, up to \(\ell _{max}\) and \(m_{max}\), \(a_s\) is a reference radius, and \(g_{s,\ell ,m}(t)\) (later referred to as \(g_s\)) are the spherical harmonic coefficients expressed at \(a_s\). Each field is projected in a given spherical coordinate system \(\{r,\theta _s,\phi _s\}\) as indicated in Table 2. These systems can either be geographic (GEO), magnetic (MAG), solar magnetic (SM), or geocentric solar magnetospheric (GSM) (see Laundal (2017)).
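For reference, expansions of this kind are conventionally written as below. This is the standard textbook form, given here as an assumption consistent with the symbols defined above rather than as a quotation; in particular, the use of negative orders m for the sine terms is a notational choice of this sketch.

\[
V_s^{\mathrm{int}}(r,\theta _s,\phi _s,t) = a_s \sum _{\ell =1}^{\ell _{max}} \left( \frac{a_s}{r}\right) ^{\ell +1} \sum _{|m| \le \min (\ell ,m_{max})} g_{s,\ell ,m}(t)\, Y_{\ell ,m}(\theta _s,\phi _s),
\]

\[
V_s^{\mathrm{ext}}(r,\theta _s,\phi _s,t) = a_s \sum _{\ell =1}^{\ell _{max}} \left( \frac{r}{a_s}\right) ^{\ell } \sum _{|m| \le \min (\ell ,m_{max})} g_{s,\ell ,m}(t)\, Y_{\ell ,m}(\theta _s,\phi _s).
\]

The radial factors \((a_s/r)^{\ell +1}\) and \((r/a_s)^{\ell }\) make internal potentials decay away from their sources below the observer and external potentials decay towards the Earth's interior.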
The spatial resolution of the lithospheric field varies with the observations being assimilated. Whereas for CHAMP and Swarm data the latter is expanded up to \(\ell =150\), it is only modeled up to \(\ell =100\) for other satellite measurements. Since L.A.M. survey data are taken close to the Earth’s surface, they contain a strong contribution from the small-scale lithospheric field. To assimilate such measurements, the lithospheric field is therefore parameterized up to spherical harmonic degree \(\ell = 1000\), with an approximation of the associated covariance matrix for \(100 < \ell \le 1000\) as detailed in the following.
Sequential modeling
The Kalmag model is constructed sequentially through a Kalman filter approach [see Kalman (1960)]. This technique proceeds in two alternating steps, namely a forecast and an analysis. In the forecast, the model is propagated in space and time until some measurements become available. Then the analysis takes place and the model is updated according to them. Because this method provides the posterior distribution of the model given only the previously assimilated data, it is complemented by a smoothing algorithm. Proceeding backward in time, this algorithm enables us to correct the model at any time according to the complete dataset.
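The alternation of forecast and analysis, followed by a backward smoothing pass, can be illustrated on a toy problem. The sketch below uses the textbook linear Kalman filter and Rauch-Tung-Striebel smoother with small dense matrices. It is not the Kalmag implementation: all names and numerical values are chosen for illustration, and the toy state is a single coefficient following a first-order stochastic process.

```python
import numpy as np

def kalman_filter(z0, P0, F, Q, H, R, data):
    """Forward pass; `data` holds an observation vector or None per step."""
    z, P = z0, P0
    forecast, analysis = [], []
    for d in data:
        # Forecast: propagate mean and covariance to the next step.
        z = F @ z
        P = F @ P @ F.T + Q
        forecast.append((z, P))
        # Analysis: Bayesian update, only when measurements are available.
        if d is not None:
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
            z = z + K @ (d - H @ z)
            P = (np.eye(len(z)) - K @ H) @ P
        analysis.append((z, P))
    return forecast, analysis

def rts_smoother(F, forecast, analysis):
    """Backward pass correcting each epoch with the complete dataset."""
    smoothed = list(analysis)
    for k in range(len(analysis) - 2, -1, -1):
        z_a, P_a = analysis[k]
        z_f, P_f = forecast[k + 1]
        z_s, P_s = smoothed[k + 1]
        G = P_a @ F.T @ np.linalg.inv(P_f)
        smoothed[k] = (z_a + G @ (z_s - z_f), P_a + G @ (P_s - P_f) @ G.T)
    return smoothed

# Toy setup (illustrative values): one coefficient, stationary variance 4.
tau, dt, var_inf = 10.0, 0.5, 4.0
F = np.array([[np.exp(-dt / tau)]])
Q = var_inf * (1.0 - F @ F.T)
H = np.array([[1.0]])
R = np.array([[0.25]])
data = [None, np.array([1.2]), None, None, np.array([0.4]), None]
forecast, analysis = kalman_filter(np.zeros(1), np.array([[var_inf]]), F, Q, H, R, data)
smoothed = rts_smoother(F, forecast, analysis)
```

In this toy run the smoothed variance at every epoch is no larger than the filtered one, which is the practical benefit of the backward pass for epochs that precede well-observed periods.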
Dynamical model
The spatiotemporal evolution of the various sources composing the geomagnetic field is complex. Because it involves nonlinear couplings, a large range of spatial and temporal scales, and regimes which are either not yet numerically achievable or not sufficiently well characterized, the dynamics of the Earth’s magnetic field cannot be directly simulated. This is why, as initiated by Gillet et al. (2013) in the context of geomagnetic modeling, we chose simplified stochastic equations, namely autoregressive processes (or ARPs), to predict the evolution of the different fields. Mimicking dispersion and memory effects occurring within dynamical systems, such processes are computationally cheap to simulate and are formulated within a Gaussian framework, as required by the Kalman filter approach. A priori, each source is characterized by its own process, which is independent of the others. As shown in Appendix A, ARPs in their sequential form can be described by the following general relation:
where \(z_s\) is a quantity characterizing the \(s\)th source to be propagated, \(F_s(\Delta t)\) is the parameter of the ARP and \(\xi _i(t,\Delta t)\) is a temporal Gaussian white noise spatially characterized by the distribution \({\mathcal {N}}\left( {0},\Sigma _{z_s}^\infty - F_s\Sigma _{z_s}^\infty F_s^T \right)\), where \(\Sigma _{z_s}^\infty\) is the stationary state covariance matrix associated with \(z_s\). Except for the lithospheric field, which is assumed to be static, and for the core field, whose evolution is prescribed by a second-order process, the dynamics of each source is controlled by a first-order ARP. In this case, \(z_s(t)\) simply corresponds to the vector of SH coefficients \(g_s(t)\) associated with the \(s\)th field and the parameter of the process is given by:
where \(\tau _s(\ell )\) is a parameterized scale-dependent characteristic time which is specified for each source in the following. For the core field, the use of a second-order ARP induces a coupling between the field itself (\(g_c\)) and its first time derivative (\(\partial _t g_c\)). Therefore, \(z_c = (g_c,\partial _t g_c)^T\) and the parameter of the process is given by:
where \(\tau _c(\ell )\) is also chosen to be scale dependent. Contrary to first-order ARPs, where the stationary state covariance matrices are given by \(\Sigma _{z_s}^\infty = \Sigma _{g_s}^\infty\), for the core field it reads:
as shown by Hulot and Le Mouël (1994). With the proposed setup, \(\Sigma _{z_s}^\infty\) and \(\tau _s(\ell )\) completely define the dynamical behavior of the ARPs. The covariance matrices characterizing the stationary state of each source are assumed to derive from energy spectra \(E_s(\ell ,a_s)\) expressed at given radii \(a_s\) such that:
where \(N_m\) is the number of modeled spherical harmonic coefficients per degree \(\ell\), and R is given by \(R(\ell ) = \ell +1\) and \(R(\ell ) = \ell\) for internal and external sources, respectively. The shape of each energy spectrum is imposed. It can either be flat, such that \(E_s(\ell ) = A_s^2\), or identical to the correlation kernels proposed by Holschneider et al. (2016), which we refer to as the C-based type, with \(E_s = A_s^2 (2\ell +1) R(\ell )\), where \(A_s\) is the magnitude of the spectrum. For most sources, the dipole part is assumed to be independent of the rest of the spectrum, such that \(E_s(\ell =1) = D_s^2\). Under these assumptions, the radii \(a_s\), the amplitudes \(A_s\) and the dipole magnitudes \(D_s\) form the free parameters of the stationary state covariance matrices \(\Sigma _{z_s}^\infty\). Characteristic timescales are parameterized by power laws such that \(\tau _s(\ell ) = M_s \ell ^{\alpha _s}\), with given magnitudes (\(M_s\)) and slopes (\(\alpha _s\)) which, for some sources, are allowed to vary continuously from one range of spherical harmonic degrees to the other. The ARP parameters were estimated through a machine learning algorithm with a subsample of CHAMP and Swarm data, as detailed in Baerenzung et al. (2020). The same values are used in this study. They are reported in Table 3.
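To make the chain of definitions above easier to follow, the relations can be written in the following standard form. This is a reconstruction under assumptions, based on the symbols defined in the text and on the usual first-order and critically damped second-order stochastic processes; it should not be read as a quotation of the original equations. The general sequential relation and the first-order parameter would read

\[
z_s(t+\Delta t) = F_s(\Delta t)\, z_s(t) + \xi _i(t,\Delta t), \qquad F_s(\Delta t) = \exp \left( -\frac{\Delta t}{\tau _s(\ell )}\right) ,
\]

applied coefficient by coefficient. For the second-order process acting on \(z_c = (g_c,\partial _t g_c)^T\), the corresponding forms would be

\[
F_c(\Delta t) = e^{-\Delta t/\tau _c(\ell )} \begin{pmatrix} 1 + \dfrac{\Delta t}{\tau _c(\ell )} & \Delta t \\ -\dfrac{\Delta t}{\tau _c(\ell )^2} & 1 - \dfrac{\Delta t}{\tau _c(\ell )} \end{pmatrix}, \qquad \Sigma _{z_c}^\infty = \begin{pmatrix} \Sigma _{g_c}^\infty & 0 \\ 0 & \Sigma _{g_c}^\infty /\tau _c(\ell )^2 \end{pmatrix},
\]

and a variance per coefficient consistent with the definitions of \(N_m\) and \(R(\ell )\) is

\[
\Sigma _{g_s}^\infty (\ell ) = \frac{E_s(\ell ,a_s)}{R(\ell )\, N_m}.
\]

With this last form, a C-based spectrum with \(N_m = 2\ell +1\) gives the same variance \(A_s^2\) to every coefficient at the reference radius \(a_s\). The noise covariance \(\Sigma _{z_s}^\infty - F_s\Sigma _{z_s}^\infty F_s^T\) guarantees that a process started from its stationary state remains in it.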
Note that here, the energy spectrum of the lithospheric field is split into two ranges. In the first one, between \(\ell =1\) and \(\ell =74\), the spectrum is of the C-based type and exhibits a characteristic radius of \(a_l=6287\) km and a magnitude of \(A_l=0.16\) nT. These are the values obtained by Baerenzung et al. (2020). In the second range, between \(\ell =75\) and \(\ell =1000\), the spectrum is flat with \(a_l=6367.9\) km and \(A_l=6.5\) nT. In this case, the parameters were estimated through a least-squares fit, between \(\ell =75\) and \(\ell =400\), of the energy spectrum associated with the WDMAM model of Lesur et al. (2016).
The source associated with field-aligned currents, as well as the components at SH degree larger than \(\ell =1\) of the fluctuating magnetospheric field, exhibit very small characteristic timescales of, respectively, \(\tau _{fac}(\ell ) = 1\) min and \(\tau _{fm}(\ell > 1) = 18\) min. Since these timescales are smaller than the time step of the Kalman filter algorithm (here set to 30 min), the associated fields are assumed to evolve temporally as a white noise but are correlated in space and time during the analysis. Setting a priori a zero mean for both fields, their covariance can be expressed as:
Filtering, smoothing, sampling
With the prior statistical properties and the dynamics of the different magnetic sources characterized, assimilation can be initiated. As a first step, a vector \(\mathbf{z }\) containing the spherical harmonic coefficients of each field is constructed. For the full model, \(\mathbf{z }\) is composed of \(N_M = 1002696\) entries. The lithospheric field, which is expanded up to \(\ell =1000\), fills more than \(99.9\%\) of the \(\mathbf{z }\) vector. With such a model dimension, the size of the covariance matrix associated with \(\mathbf{z }\), namely \(\varvec{\Sigma }_{\mathbf{z }}\), would be \(N_M \times N_M \sim 10^{12}\). Computations with such a matrix would be numerically impossible. This is why we approximate the predicted uncertainties of the small-scale lithospheric field (for \(101 \le \ell \le 1000\)) by only keeping its variance information (the one associated with each of its spherical harmonic coefficients). Under such an assumption, the number of entries of \(\varvec{\Sigma }_{\mathbf{z }}\) to be stored reduces to \(\sim 10^8\), a computationally conceivable size. This strong approximation, which induces a complete loss of the predicted spatial correlations of the lithospheric field beyond \(\ell =100\), is evaluated in the "Lithospheric field" section.
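The orders of magnitude quoted above can be checked with a short count of spherical harmonic coefficients. The sketch below assumes that every order m is modeled for the lithospheric field (\(2\ell +1\) coefficients per degree) and that all remaining entries of \(\mathbf{z }\) keep their full covariances; the variable names are illustrative.

```python
def n_sh_coefficients(l_min, l_max):
    """Number of SH coefficients for degrees l_min..l_max (2l + 1 per degree)."""
    return sum(2 * l + 1 for l in range(l_min, l_max + 1))

N_M = 1002696                                  # model dimension quoted in the text
n_lith_full = n_sh_coefficients(1, 1000)       # lithospheric field up to l = 1000
n_lith_diag = n_sh_coefficients(101, 1000)     # small scales kept as variances only
n_dense = N_M - n_lith_diag                    # entries keeping full covariances
n_stored = n_dense ** 2 + n_lith_diag          # dense block plus diagonal part

lith_fraction = n_lith_full / N_M              # share of z filled by the lithosphere
full_size = N_M ** 2                           # size of the unapproximated matrix
```

Under these assumptions the lithospheric field accounts for 1002000 of the 1002696 entries, the dense block has dimension 10896, and the stored covariance information amounts to about \(1.2 \times 10^8\) values instead of about \(10^{12}\).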
To forecast \(\mathbf{z }\), the parameter matrices \(F_s\) of equation 3 are incorporated in a global matrix \(\mathbf{F }\). The same operation is performed for the stationary state covariance matrices \(\Sigma ^\infty\), which are assembled into the covariance matrix \(\varvec{\Sigma }^\infty\). Given \(\mathbf{F }\) and \(\varvec{\Sigma }^\infty\), the covariance matrix associated with the Gaussian white noise of the full model forecast step reads \({\tilde{\varvec{\Sigma }}} =\varvec{\Sigma }^\infty - \mathbf{F }\varvec{\Sigma }^\infty \mathbf{F }^T\). The evolution of the mean model and its covariance from time step \(k-1\) to step k is therefore given by:
After the forecast, whenever measurements are available, the model is updated accordingly. This operation is performed through a Bayesian inversion which reads:
where \(\mathbf{R }_k\) is the covariance matrix associated with measurement errors, \(\mathbf{K }_k\) is the Kalman gain matrix and \(\mathbf{H }_k\) is the operator projecting the model onto the observations \(d_k\) at iteration k. \(\mathbf{R }_k\) is chosen to be diagonal, with constant standard deviations of 0.1 nT for intensity data [see Quesnel et al. (2009)] and vector field measurements, and of 4.85 nT/yr for each component of secular variation data, a value we estimated with an algorithm similar to the one used to calibrate Kalmag (see Baerenzung et al. (2020)). When \(d_k\) corresponds to intensity measurements, the linearization approach proposed by Mauerberger et al. (2020) and Schanner et al. (2022) is applied. They showed that, to first order, the predicted intensity \(I_k\) can be related to the predicted magnetic field \(\mathbf{B }_k\) through the relation \(I_k \sim E[\mathbf{B }_k]^T\mathbf{B }_k/{\tilde{I}}_k\), where \({\tilde{I}}_k\) is the intensity derived from the mean magnetic field \(E[\mathbf{B }_k]\). Note that this projection is carried out with the mean magnetic field prediction. Therefore, no iteration over the updated solutions is required. When \(d_k\) corresponds to secular variation data, \(\mathbf{H }_k = \mathbf{0}\) for each source except the core field, for which \(\mathbf{H }_k\) projects the associated secular variation onto the data. Once all data have been assimilated, the different modeled epochs are corrected through a smoothing algorithm (see Rauch et al. (1965)). Starting at the final step of the Kalman filter, it proceeds iteratively backward in time through the following relations:
where \(\mathbf{d }\) corresponds to the full dataset. The smoothing algorithm only provides snapshots of the posterior model; the resulting solution therefore does not contain information about temporal correlations. Although the posterior covariance between the model at different epochs can be analytically derived, obvious storage limitations make this option numerically inapplicable. This is why we introduced a formulation to sample ensembles from the posterior model which are correlated both in space and time. Starting with an ensemble \(\mathbf{z }^e\) randomly drawn from the last state of the Kalman filter solution, the algorithm proceeds similarly to the smoothing algorithm, backward in time, with:
where \(\zeta ^e\) is a random realization from the Gaussian distribution characterized by a \(\mathbf{0}\) mean and a covariance matrix given by:
Note that to correct deviations, due to sampling errors, between the ensembles and the true posterior means, the ensembles were recentered at each epoch according to the mean smoothing solutions. For this study, we used an ensemble of 1024 members.
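The idea of drawing temporally correlated members backward in time, and of recentering them on the smoothing means, can be sketched with the standard backward-simulation recursion for a scalar state. This is an assumption about the general technique and not the formulation used for Kalmag: the recursion, the function names and the toy values are all illustrative.

```python
import numpy as np

def forward_filter(F, Q, R, P0, data):
    """Scalar Kalman filter storing forecast and analysis means/variances."""
    z, P = 0.0, P0
    zf, Pf, za, Pa = [], [], [], []
    for d in data:
        z, P = F * z, F * P * F + Q
        zf.append(z); Pf.append(P)
        if d is not None:
            K = P / (P + R)
            z, P = z + K * (d - z), (1.0 - K) * P
        za.append(z); Pa.append(P)
    return np.array(zf), np.array(Pf), np.array(za), np.array(Pa)

def sample_backward(F, zf, Pf, za, Pa, n_members, rng):
    """Draw an ensemble correlated in time, recentered on the smoothing mean."""
    n = len(za)
    ens = np.empty((n, n_members))
    zs = za.copy()                                   # smoothing means
    ens[-1] = za[-1] + np.sqrt(Pa[-1]) * rng.standard_normal(n_members)
    ens[-1] += zs[-1] - ens[-1].mean()
    for k in range(n - 2, -1, -1):
        G = Pa[k] * F / Pf[k + 1]                    # smoother gain
        zs[k] = za[k] + G * (zs[k + 1] - zf[k + 1])
        var = Pa[k] - G * Pf[k + 1] * G              # variance of the random term
        zeta = np.sqrt(var) * rng.standard_normal(n_members)
        ens[k] = za[k] + G * (ens[k + 1] - zf[k + 1]) + zeta
        ens[k] += zs[k] - ens[k].mean()              # recentering step
    return ens, zs
```

Each member is conditioned on its own value at the following epoch, which is what introduces temporal correlations that independent draws from the smoothing snapshots would lack.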
Model construction
To construct the model, the time step of the Kalman filter algorithm was set to \(\Delta t = 30\) min. Nevertheless, whenever the time interval between two analysis windows exceeded this value, \(\Delta t\) was increased accordingly. With a dataset covering the \(20^{th}\) century and the last 22 years, the direct approach would have been to start assimilating measurements in 1900.0 and to progress forward in time until today. However, we did not proceed this way. Instead, the Kalman filter simulation was initiated in 2000.5 to first assimilate ground-based observatory, CHAMP and then Swarm data until 2022.18, the last epoch for which measurements were available. The smoothing algorithm was then applied to update the model within this time window. In a third step, the smoothing solution in 2000.5 was used as a restart file to assimilate, backward in time, the measurements taken prior to this date. Finally, the smoothing algorithm was applied from 1900 to today with a slight modification beyond 2000.5 which is detailed in Appendix B.
Two reasons motivated this choice of splitting the assimilation process. The first one is to possess a well-resolved large-scale lithospheric field before assimilating survey data, in order to be able to distinguish the gain brought by such observations on this part of the field. To this end, the lithospheric field was fully modeled up to SH degree \(\ell = 150\) during the CHAMP and Swarm eras. The full solution (mean and covariance) was then truncated at \(\ell = 100\) in 2000.5 to restart the Kalman filter between 2000.5 and 1900.0. Beyond \(\ell =100\), only the mean and variance were kept from the CHAMP and Swarm solution. This part was finally extrapolated with a zero mean and the prior variance of equation 7 between \(\ell =150\) and \(\ell =1000\). The second reason for splitting the assimilation process is that the older the measurements, the lower their accuracy and spatial coverage. Yet, before assimilating any measurement, an outlier detection is performed. This process consists in checking that the measurements do not deviate excessively from their predicted values, in particular that each vector field or intensity measurement lies within the \(95.6\%\) confidence interval of the model prediction. On top of this selection, the misfit of the sequentially assimilated tracks was evaluated. Whenever the misfit value exceeded the imposed threshold of 3, the corresponding track was dismissed. The algorithm to detect outliers performs better when the model accuracy is high, which occurs when the data quality and coverage are good. This is why starting with a very well constrained solution in 2000.5 and assimilating data backward in time enabled us to optimize the detection process.
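An outlier test of the kind described above can be sketched as follows. The details are assumptions made for illustration: the predicted spread combines the projected model covariance with the measurement error, the confidence level is converted to a number of standard deviations with a Gaussian quantile, and the track misfit is taken to be a normalized root-mean-square residual. None of these choices is confirmed by the text.

```python
import numpy as np
from statistics import NormalDist

def n_sigma_for(level):
    """Half-width, in standard deviations, of a two-sided Gaussian interval."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)

def within_interval(d, H, z, P, r_std, n_sigma):
    """True where a datum lies within n_sigma predicted standard deviations.

    d: (n,) data, H: (n, p) observation operator, z: (p,) mean model,
    P: (p, p) model covariance, r_std: measurement error std (assumed added).
    """
    pred = H @ z
    pred_var = np.einsum("ij,jk,ik->i", H, P, H) + r_std ** 2  # diag(H P H^T) + R
    return np.abs(d - pred) <= n_sigma * np.sqrt(pred_var)

def track_misfit(residuals, sigma):
    """Normalized r.m.s. misfit of a track (definition assumed for this sketch)."""
    return float(np.sqrt(np.mean((residuals / sigma) ** 2)))
```

A track would then be dismissed when `track_misfit(...)` exceeds the chosen threshold, and individual data rejected where `within_interval(...)` is False. Such a test is only as sharp as the predicted spread, which is why a well-constrained starting solution helps.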
Over the entire time span of the model, each source, except the one associated with field-aligned currents, is stored every 0.1 year, setting the temporal resolution of the model to this time step. However, to better track the evolution of rapidly evolving sources, such as the close and fluctuating magnetospheric fields or the residual ionospheric/induced fields, the latter were stored every 3 hours during the CHAMP and Swarm eras and every 5 days between 1900 and 2000.5.
Results
Main field and secular variation
The Kalman filter and smoothing algorithms provide a model in terms of a mean solution and an associated covariance matrix. Combining these two quantities gives a precise knowledge of the locations where the solution is reliable and where it is not. As an illustration for the main field, i.e., the sum of the core field and the lithospheric field expanded up to SH degree \(\ell =20\), Fig. 2 shows at different epochs the radial component of the mean field (isocontours) and its associated standard deviation (color maps). Locations where the maps are red correspond to locations where the mean solution is likely to deviate strongly from the true field. In contrast, within blue and purple areas the model predicts that the true and the mean predicted fields are close. These maps are complemented, at their bottom right, by a global measure of the predicted uncertainty. It corresponds to the r.m.s. standard deviation given in nT and expressed as:
where \(\sigma\) is the standard deviation associated with the radial component of the field and \(\Omega\) is the Earth’s surface.
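A global measure of this kind can be approximated numerically from a gridded uncertainty map. The sketch below assumes a regular latitude-longitude grid and an area-weighted root-mean-square over the sphere; the function name and grid layout are illustrative, and the exact normalization used for the figures is not confirmed here.

```python
import numpy as np

def rms_std(sigma, lat_deg):
    """Area-weighted r.m.s. of a standard deviation map.

    sigma   : (n_lat, n_lon) standard deviation of the radial field in nT
    lat_deg : (n_lat,) latitudes of the cell centres in degrees
    """
    sigma = np.asarray(sigma, dtype=float)
    # On a regular grid the cell area is proportional to cos(latitude).
    w = np.cos(np.radians(lat_deg))[:, None] * np.ones_like(sigma)
    return float(np.sqrt(np.sum(w * sigma ** 2) / np.sum(w)))
```

For instance, a map with 25 nT over one hemisphere and 50 nT over the other gives an r.m.s. value of about 39.5 nT, closer to the larger value because variances, not standard deviations, are averaged.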
Until the 1960s, uncertainty maps exhibit a strong dichotomy between the Northern and Southern hemispheres. Whereas in the North the standard deviation associated with the radial component of the field does not generally exceed 25 nT, it reaches and even exceeds 50 nT in the South. The difference in predicted uncertainties is particularly large between land and oceanic surfaces, reflecting the lack of measurements taken over the latter. The location where the field is best resolved is Europe, a benefit of the high density of ground-based observatories operating there during this time period. When looking at the r.m.s. standard deviation, the year 1920 slightly stands out with \({\bar{\sigma }} = 44\) nT, whereas this value oscillates around \(\bar{\sigma } \sim 50\) nT in 1910, 1930 and 1940. This can be explained by the multiple land and marine surveys carried out at and around this epoch, which offer a large data coverage of the globe (see Fig. 1). In 1960, the global resolution of the model is improved and the North–South dichotomy mostly disappears. Two reasons explain this gain in accuracy. The first one is the dense spatial coverage of survey data at this epoch (see Fig. 1). The second one is the proximity in time of the POGO mission, which started in 1965. One can also observe that observatories still play an important role in reducing the posterior variability, as is the case in and around Europe and Japan. In 1970, the jump in accuracy of the model is striking. At this epoch, lying within the POGO era, the standard deviation associated with the Kalmag solution is strongly reduced. However, the model predicts a higher possible variability around the magnetic dip equator. This phenomenon is the expression within the model of the Backus effect, or more generally of the “perpendicular error” effects. Indeed, as first recognized by Backus (1970) and then generalized by Lowes (1975), when constructing a geomagnetic field model with intensity measurements alone, larger errors will contaminate the model near the equator. This effect surely affects our mean solution, but covariance information enables us to quantify it. With MagSat observations, which cover less than a year (\(1979-1980\)), the model precision is equivalent to the one obtained with POGO data, except around the dip equator where vector field measurements eliminate the “perpendicular error” effects induced by the assimilation of intensity data. The map in 1990 highlights the importance of low-orbiting satellites for recovering the Earth’s magnetic field. Lying between the MagSat and Oersted missions, in the middle of almost 20 years without satellite measurements, the solution obtained at this time is strongly degraded. It presents levels of uncertainty equivalent to those of 1960, except in Northern America and Russia where the coverage with ground-based observatories had since increased. The situation improves with Oersted measurements and becomes even better with CHAMP and Swarm observations. With the high-quality instrumentation of the CHAMP and Swarm satellites, the model is extremely precise almost everywhere at the Earth’s surface. It is however worth noting that the constellation of Swarm satellites yields a slightly more accurate solution than the single CHAMP spacecraft.
When looking at the mean secular variation (SV) and its associated standard deviation, displayed at similar epochs in Fig. 3, one can observe that the dichotomy in accuracy between the North and the South is also present for this quantity. The dichotomy persists until the year 2000, but with a lower contrast after 1960. Ground-based observatory data are of particular importance to constrain the secular variation, as locations where their density is high always coincide with areas of low posterior variability. Globally, uncertainties decrease with time except between 1970 and 2000, where the r.m.s. standard deviation fluctuates due to the lack of persistent low-orbiting satellite missions. In addition, the distribution of uncertainties over the different spatial scales is not homogeneous. Instead, small scales typically exhibit a higher posterior variability relative to their mean signal than large scales. This effect can be observed in Fig. 4, where time series between 1900 and 2022 of the \(68.2\%\) confidence interval associated with some selected SH coefficients are displayed in red. In this figure, it is clearly visible that the larger the degree of the coefficient (from left to right and top to bottom), the larger its posterior standard deviation relative to its mean values. The COVOBS.x2 model of Huder et al. (2020) exhibits a similar behavior, as its predicted \(68.2\%\) confidence intervals (blue areas) show. Although the two models are mostly consistent with one another, small differences can nevertheless be distinguished, in particular in the predicted standard deviations. Until \(\sim 1920\) they are lower for COVOBS.x2; the two models then reach equivalent levels until \(\sim 1960\), after which they are lower for Kalmag.
To precisely characterize the spatiotemporal resolution of the secular variation over the model time span, we computed the ratio \(C_{{\dot{g}}}(\ell ,k)\) between the Fourier power spectra of the mean secular variation and its associated standard deviation over 20-year time periods. This quantity, which was proposed by Gillet et al. (2015), can be expressed as:
where \(\hat{{\dot{g}}}_{c,\ell ,m}(k)\) is the Fourier transform of the secular variation, and \(\sigma _{\hat{{\dot{g}}}_c,\ell ,m}(k)\) is its associated standard deviation. To estimate the latter quantity, we used an ensemble of 1024 Fourier transforms of secular variation time series. In Fig. 5, \(C_{{\dot{g}}}(\ell ,k)\) is displayed for 6 different time windows. The blue and red areas correspond to spatiotemporal scales which are, respectively, well resolved and not resolved. At early times, between 1900 and 1920, only a limited number of temporal scales of the SV up to SH degree \(\ell =4\) are resolved. The situation slightly improves between 1920 and 1960, where some signal up to SH degree \(\ell =6\) can be accurately recovered, down to periods of a few years for the largest spatial scales. The emergence of satellite missions and the increase in ground-based observatory and survey data help improve the model resolution between 1960 and 2000. During this time interval, some spherical harmonic coefficients up to degree \(\ell =5\) are either partially or fully resolved down to time periods shorter than a year. Reaching such a temporal resolution is impossible with secular variation data derived from annual differences of observatory measurements. It can therefore only be achieved thanks to the high temporal coverage of satellite and survey data. In agreement with our previous results and with the study of Gillet (2019), the secular variation is best resolved during the CHAMP and Swarm eras, where spatial scales up to \(\ell =15\) can be partially resolved down to periods of approximately 5 years, and 2-year fluctuations can be very well captured up to \(\ell =10\).
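To make the construction of such a resolution diagnostic more concrete, the sketch below computes, for one SH degree, the ratio between the Fourier power of the ensemble-mean secular variation and the ensemble variance of the Fourier coefficients. The summation over the coefficients of the degree, the array layout and the synthetic ensemble are assumptions made for illustration; the exact normalization used for \(C_{{\dot{g}}}(\ell ,k)\) may differ.

```python
import numpy as np

def sv_resolution_ratio(ensemble):
    """Ratio between the Fourier power of the mean SV and its Fourier-domain variance.

    ensemble : array (n_members, n_coeffs, n_times) holding SV time series of all
               coefficients of one SH degree (hypothetical layout).
    Returns an array of shape (n_freqs,), one ratio per temporal frequency.
    """
    spectra = np.fft.rfft(ensemble, axis=-1)                    # FFT of every member
    mean_hat = spectra.mean(axis=0)                             # FFT of the mean model
    var_hat = np.mean(np.abs(spectra - mean_hat) ** 2, axis=0)  # ensemble variance
    # Sum over the coefficients of the degree before taking the ratio.
    return np.sum(np.abs(mean_hat) ** 2, axis=0) / np.sum(var_hat, axis=0)

# Synthetic test: a 10-year oscillation shared by all members plus white noise.
rng = np.random.default_rng(0)
t = np.arange(200) * 0.1                                  # 20-year window, 0.1-year step
signal = np.sin(2.0 * np.pi * t / 10.0)
ensemble = signal + rng.standard_normal((256, 9, 200))    # 9 coefficients, as for degree 4
ratio = sv_resolution_ratio(ensemble)
```

In this toy case the ratio is large at the frequency of the common oscillation (index 2, i.e. two cycles per window) and close to zero at frequencies where only ensemble noise is present, which is the behavior one would read as "resolved" versus "not resolved".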
Lithospheric field
As previously mentioned, the lithospheric field model was built in multiple steps. During the CHAMP and Swarm eras, it was fully modeled up to SH degree \(\ell =150\). After applying the smoothing algorithm, the lithospheric field in 2000.5 was divided into three parts. In the first one, between \(\ell =1\) and \(\ell =100\), the full smoothing solution (mean and associated covariance matrix) was kept. In the second part, between \(\ell =101\) and \(\ell =150\), only mean and variance information was retained. Finally, between \(\ell =151\) and \(\ell =1000\), a zero mean and the variance derived from equation 7 with the parameters of Table 3 were imposed a priori. The Kalman filter algorithm was then launched backward in time with this prior lithospheric field between 2000.5 and 1900.
Keeping only variance information within the Kalman filter algorithm is a strong approximation. Before implementing it, this approximation was tested during the CHAMP and Swarm eras. For this evaluation phase, the lithospheric field was fully modeled up to \(\ell =30\) and partially modeled (keeping only variance information) between \(\ell =31\) and \(\ell =150\). The remaining part of the model was simulated normally and the dataset used is the one described in the "Data" section. The resulting model is referred to as the PR model. With this setup, comparisons with the solution obtained at full resolution (FR model) can be performed. In a first simulation, it was observed that the posterior variance associated with the approximated solution tended to be underestimated. In particular, the degree variance (the sum of the variances at a given degree) exhibited a pronounced discontinuity between SH degrees \(\ell =30\) and \(\ell =31\). To partially correct this effect, variances beyond the transition were increased by a multiplication factor. The latter was imposed to vary linearly with the degree of the SH expansion, and enforced a smooth transition as well as a level of variance at the last modeled degree corresponding to the stationary state variance of equation 7. Because of the latter operation, the lithospheric field resolution was increased to \(\ell =200\), a degree at which the signal at satellite altitude becomes very low as shown by Olsen et al. (2017).
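The variance rescaling described above can be sketched as follows. This is only one possible reading: the end values of the linear factor (continuity of the degree variance at the transition, and the stationary variance at the last degree) are assumptions, as are the function name and the toy numbers; the actual factor used for Kalmag is not specified in this section.

```python
import numpy as np

def linear_inflation(degree_var, l_transition, stationary_var_last):
    """Rescale degree variances beyond l_transition by a factor linear in degree.

    degree_var          : array indexed by degree l (index 0 unused), posterior degree variances
    l_transition        : last fully modeled degree (30 in the evaluation phase)
    stationary_var_last : target degree variance at the last modeled degree
    """
    l_max = degree_var.size - 1
    degrees = np.arange(l_transition + 1, l_max + 1)
    # Assumed factor at the first approximated degree: removes the jump at the transition.
    f_first = degree_var[l_transition] / degree_var[l_transition + 1]
    # Assumed factor at the last degree: brings the variance to its stationary level.
    f_last = stationary_var_last / degree_var[l_max]
    factors = f_first + (f_last - f_first) * (degrees - degrees[0]) / (degrees[-1] - degrees[0])
    inflated = degree_var.copy()
    inflated[degrees] *= factors
    return inflated

# Toy spectrum: variance underestimated by a factor of 2 beyond degree 30.
toy_var = np.ones(201)
toy_var[31:] = 0.5
rescaled = linear_inflation(toy_var, 30, 2.0)
```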
The results of this evaluation phase are displayed in Fig. 6. On the left panels, the mean downward component of the lithospheric field at the Earth’s surface is shown for both the solution obtained at full resolution (top) and the one obtained at partial resolution (bottom). These two maps look very similar and most features which can be recovered by the FR model are present in the PR model. This aspect is confirmed by the map of the difference between the two mean solutions (top right). Discrepancies become quite intense only over Antarctica, Eastern Europe and Western Russia. These discrepancies coincide with relatively large-scale errors (up to \(\ell =70\)), as shown with crosses by the energy spectrum at the Earth’s surface of the difference between the two mean models (bottom right panel). Beyond \(\ell =70\), the level of error decreases. The proximity of the two models is also highlighted by their degree correlation, as introduced by Langel and Hinze (1998):
\[\rho _\ell = \frac{\sum _{m}\left( g_{\ell ,m}^{FR}\, g_{\ell ,m}^{PR} + h_{\ell ,m}^{FR}\, h_{\ell ,m}^{PR}\right) }{\sqrt{\sum _{m}\left[ \left( g_{\ell ,m}^{FR}\right) ^2 + \left( h_{\ell ,m}^{FR}\right) ^2\right] \,\sum _{m}\left[ \left( g_{\ell ,m}^{PR}\right) ^2 + \left( h_{\ell ,m}^{PR}\right) ^2\right] }},\]
where \(g_{\ell ,m}\) and \(h_{\ell ,m}\) are the Gauss coefficients of the mean FR and PR solutions. This correlation reaches a minimum of 0.915 at \(\ell =66\) and stabilizes around a mean value of 0.979 beyond \(\ell =100\). The energy spectra associated with the standard deviations show that the model where only variance information was updated tended to underestimate the level of predicted uncertainties. Although the variance rescaling technique mentioned previously was applied, it did not completely resolve this issue. Nevertheless, the fact that the small-scale lithospheric field was only marginally affected by the proposed modeling approximation encouraged us to adopt it for the complete model derivation.
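For readers who wish to reproduce this type of comparison, a minimal sketch of the degree correlation is given below. It uses the standard definition (normalized scalar product of the Gauss coefficients of each degree); the dictionary-based storage of the coefficients is a hypothetical choice made for brevity.

```python
import numpy as np

def degree_correlation(coeffs_a, coeffs_b):
    """Degree correlation between two sets of Gauss coefficients.

    coeffs_a, coeffs_b : dict mapping degree l -> 1-D array of the 2l+1 coefficients
                         (cosine and sine terms) of that degree.
    """
    rho = {}
    for l in coeffs_a:
        a, b = coeffs_a[l], coeffs_b[l]
        # Normalized scalar product of the two coefficient vectors at degree l.
        rho[l] = float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))
    return rho

# Toy models: identical at degree 1, orthogonal at degree 2.
model_a = {1: np.array([1.0, 2.0, 3.0]), 2: np.array([1.0, 0.0, 0.0, 0.0, 0.0])}
model_b = {1: np.array([1.0, 2.0, 3.0]), 2: np.array([0.0, 1.0, 0.0, 0.0, 0.0])}
rho = degree_correlation(model_a, model_b)
```

A value of 1 indicates coefficients that are proportional with a positive factor at that degree, so the correlation is insensitive to a global difference in amplitude between the two models.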
The lithospheric field resulting from the assimilation of the entire dataset is first analyzed through energy spectra at the Earth’s surface. In the left part of Fig. 7, the spectra of the mean, the standard deviation and the prior standard deviation of the lithospheric field are displayed with black lines. In this solution, energy populates the entire range of modeled scales. However, the mean field is predicted to be globally reliable only up to SH degree \(\ell \sim 450\), where the spectrum of the mean and the spectrum of the standard deviation cross one another. In addition, the discontinuity in the spectrum of the mean at SH degree \(\ell =150\) indicates that even up to \(\ell \sim 450\) a non-negligible portion of the crustal signal remains unmodeled. Nevertheless, comparisons with the FR model previously discussed (blue lines and dots) demonstrate that the assimilation of survey data helps to better constrain the large-scale lithospheric field. Indeed, the mean signal of the final solution has gained in intensity, and its standard deviation has decreased. In the same figure, the spectra of the difference with two other lithospheric models, the WDMAM model by Lesur et al. (2016) (red dots), and the LCS1 model by Olsen et al. (2017) (green dots), are also shown.
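The reliability criterion used here (the degree at which the spectrum of the mean meets the spectrum of the standard deviation) can be illustrated with the short sketch below. It assumes the usual Lowes-Mauersberger form of the spectrum at the reference radius, \(R_\ell = (\ell +1)\sum _m (g_{\ell ,m}^2 + h_{\ell ,m}^2)\), applied both to the mean coefficients and to their standard deviations; function names and toy values are hypothetical.

```python
import numpy as np

def surface_spectrum(coeffs):
    """Lowes-Mauersberger spectrum at the reference radius, one value per degree."""
    return {l: (l + 1) * float(np.sum(np.square(c))) for l, c in coeffs.items()}

def first_crossing_degree(mean_spec, sd_spec):
    """Lowest degree at which the spectrum of the SD reaches the spectrum of the mean."""
    for l in sorted(mean_spec):
        if sd_spec[l] >= mean_spec[l]:
            return l
    return None  # the mean dominates at every modeled degree

# Toy coefficients (in nT): the mean dominates at degree 1, the SD at degree 2.
mean_coeffs = {1: np.array([3.0, 0.0, 4.0]), 2: np.array([1.0, 1.0, 1.0, 1.0, 0.0])}
sd_coeffs = {1: np.array([1.0, 1.0, 1.0]), 2: np.array([2.0, 2.0, 2.0, 2.0, 2.0])}
mean_spec = surface_spectrum(mean_coeffs)
sd_spec = surface_spectrum(sd_coeffs)
crossing = first_crossing_degree(mean_spec, sd_spec)
```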
Although our solution is apparently closer at any degree to the LCS1 model than to the WDMAM model, the examination of the degree correlation (right panel of Fig. 7) indicates that this is only true up to \(\ell =150\). Beyond this value, even if \(\rho _\ell\) is relatively low, the correlation between Kalmag and WDMAM (red line) is higher. Contrary to the degree correlation between LCS1 and Kalmag, which decays smoothly, the one associated with Kalmag and WDMAM presents two transitions. One of them is at SH degree \(\ell =100\), the spatial scale separating the satellite data solution (\(\ell \le 100\)) from the survey data solution (\(\ell > 100\)) of the WDMAM model. The other transition occurs at \(\ell =150\), the degree beyond which our model is only constrained by survey data. This second drop in \(\rho _\ell\) may be explained by the lower spatial resolution that our solution exhibits in certain areas. This phenomenon can be observed in Fig. 8, where the downward components of WDMAM (top left) and Kalmag (bottom left) expanded up to \(\ell =450\) (the resolution up to which we predict a globally well-resolved solution) are displayed.
The intense signals predicted by WDMAM in the Southern parts of the Pacific, the Atlantic and the Indian oceans, or on large portions of continental areas, are mostly absent in our solution. It is however worth noting that WDMAM does not only derive from direct measurements of the geomagnetic field, but also from the combination of an ocean floor age map, relative plate motions and the geomagnetic polarity time scale (see Dyment et al. (2015)). Logically, the difference between the downward components of both models (top right of Fig. 8) is larger at these oceanic and land locations than anywhere else. Conversely, discrepancies are reduced in most areas where the standard deviation associated with the large-scale part of the field (up to \(\ell =100\)) is low (map on the bottom right). These uncertainty predictions, which are tied to data coverage (see Fig. 1), therefore provide a good approximation of locations where the Kalmag model is likely to be well resolved.
Since the model is expressed in terms of posterior distributions, it can be used as prior information to assimilate new data when they become available, and therefore be updated accordingly. To illustrate this aspect, airborne intensity measurements taken above Afghanistan in 2006 and 2008 were set aside from the dataset serving the model derivation. They are now used to update the lithospheric field following the method detailed in Appendix C. The locations at which each measurement was taken during these surveys are shown with colored dots (blue for 2006, red for 2008) in the bottom left panel of Fig. 9. The downward component of the mean prior lithospheric field, which comes from the smoothing solution taken up to \(\ell =1000\) in 2006.0, is shown on the top left panel. Its resolution was increased to \(\ell =2000\) before the Kalman filter simulation was launched. The result of the assimilation process is shown through the downward component of the mean posterior field in 2009.0 in the second panel of the top row of Fig. 9. On this map, it can be seen that structures which were completely invisible in the prior model appear in the posterior one. In particular, high-intensity anomalies could be detected along the Southern and Western borders of Afghanistan. The field in the central part of the country is globally weaker. Such patterns are also predicted by the EMM 2017 model of Maus (2010), as shown on the third panel of the top row. They are nevertheless of lower magnitude, and less detailed due to the resolution of the model, which is limited to \(\ell =790\). To make the comparison with the EMM solution possible, the posterior mean was truncated at SH degree \(\ell =790\). The resulting downward component is shown in the top right of the figure. The two models now look more alike. Nevertheless, discrepancies in predicted intensity still remain.
In order to assess the degree of compatibility of the different models with the observations, the absolute value of the difference between a subset of the measurements and the intensities predicted by the sum of the core and the different lithospheric field solutions was computed. The results are shown in the bottom panels, below the corresponding downward components. The model with the most degrees of freedom, displayed in the second column, is unsurprisingly the one which best explains the data. As shown at the bottom of the map, the r.m.s. difference between the model and the measurements is 18 nT. Globally, the predictions of the truncated model (right column) are closer to the data than the EMM predictions (third column). Of course, Afghanistan is a particular location and no claim is made here that the Kalmag model would be globally more accurate than the EMM model, since this is certainly not the case. However, this example shows that the method proposed in this study is well suited to construct regional high-resolution models of the lithospheric field, even when data coverage is not optimal.
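The principle of reusing a posterior distribution as the prior of a later assimilation can be sketched with a generic linear Gaussian analysis step. This is not the procedure of Appendix C, which is not reproduced here: intensity measurements depend nonlinearly on the model coefficients, whereas the sketch assumes a linear observation operator \(H\). All names are hypothetical.

```python
import numpy as np

def kalman_update(x_prior, P_prior, H, y, R):
    """Linear Gaussian analysis step: returns the posterior mean and covariance.

    x_prior, P_prior : prior mean (n,) and covariance (n, n), e.g. an earlier posterior
    H                : observation operator (p, n), assumed linear here
    y, R             : new data (p,) and their error covariance (p, p)
    """
    S = H @ P_prior @ H.T + R                    # innovation covariance
    K = np.linalg.solve(S, H @ P_prior).T        # Kalman gain P H^T S^-1 (P and S symmetric)
    x_post = x_prior + K @ (y - H @ x_prior)
    P_post = (np.eye(x_prior.size) - K @ H) @ P_prior
    return x_post, P_post

# Scalar toy case: prior N(0, 1), one direct observation y = 2 with unit error variance.
x_post, P_post = kalman_update(np.array([0.0]), np.array([[1.0]]),
                               np.array([[1.0]]), np.array([2.0]), np.array([[1.0]]))
```

In the scalar case the posterior mean lies halfway between prior and observation and the variance is halved, which illustrates how new surveys can sharpen the solution only where they provide information.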
Magnetospheric and induced fields
With the proposed approach, magnetospheric and induced fields are jointly estimated with the rest of the model. A priori, the field generated by the currents flowing in the outer magnetosphere (\(g_{rm}\)) is predicted to evolve slowly with time (\(\tau _{g_{rm}} = 10.3\) years) in comparison to other external sources. A posteriori, such a behavior is confirmed, as illustrated by the evolution of the annual mean dipole component of \(E[g_{rm}]\) projected in magnetic coordinates and shown in the left panel of Fig. 10 with circles. Note that prior to 1953, our model cannot correctly extract this field, which oscillates around 0 with a large posterior variance. However, \(g_{rm}\) alone cannot explain the decadal variations of external sources as they can be detected at the Earth’s surface or at the altitude of low-orbiting satellites. The rapidly evolving magnetospheric components also exhibit long-term trends whenever the latter can be captured. This effect can be observed when comparing the annual mean dipole component of \(E[g_{rm}]\) to the one of \(E[g_{rm} + g_{m}+ g_{fm}]\) shown with a continuous line in Fig. 10. During satellite eras, the latter is always found to be more intense than the former, meaning that the ring current can generate some persistent annual signal, as already documented by Lühr and Maus (2010). With our current method, this signal can only be recovered when temporal data coverage is high enough, because \(E[g_{m}]\) and \(E[g_{fm}]\) exhibit very short memory timescales. A possible way to improve the AR processes characterizing these sources would be to consider some extra timescales accounting for the slowly varying part of the field generated by the ring current. The cycle of approximately 10.5 years highlighted by Huder et al. (2020) with the COVOBS.x2 model (shown with dashed lines in Fig. 10) is also present in our solution.
Although the mean solutions of both models slightly differ from one another, the COVOBS.x2 dipole always lies within the \(68.2\%\) confidence interval predicted by our model (gray areas in Fig. 10).
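The role played by the memory timescale of an autoregressive prior can be illustrated with the sketch below. It assumes a first-order process, which is the simplest choice and not necessarily the order used for each Kalmag source; the timescale of 10.3 years and the 0.1-year step are taken from the text, the other values are hypothetical.

```python
import numpy as np

def ar1_coefficient(tau, dt):
    """One-step correlation of a first-order autoregressive process with memory time tau."""
    return np.exp(-dt / tau)

def simulate_ar1(tau, sigma, dt, n_steps, rng):
    """Stationary AR(1) realization with memory time tau and standard deviation sigma."""
    phi = ar1_coefficient(tau, dt)
    noise_sd = sigma * np.sqrt(1.0 - phi**2)      # keeps the variance stationary
    x = np.empty(n_steps)
    x[0] = sigma * rng.standard_normal()
    for i in range(1, n_steps):
        x[i] = phi * x[i - 1] + noise_sd * rng.standard_normal()
    return x

# Remote magnetospheric field: tau = 10.3 years, sampled every 0.1 year.
phi_rm = ar1_coefficient(10.3, 0.1)
series = simulate_ar1(1.0, 2.0, 0.1, 50000, np.random.default_rng(1))
lag1 = np.corrcoef(series[:-1], series[1:])[0, 1]
```

With a short memory time, the prediction relaxes quickly toward its zero prior mean when no data are assimilated, which is consistent with the observation that rapidly varying sources can only be recovered when temporal data coverage is high.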
To evaluate the model over short periods of time and when all sources are predicted to be well separated, we now compare predictions of the azimuthal component of the model with ground-based observatory measurements taken at four different locations: Hermanus, Niemegk, Canberra and Kakioka. Since observatory data are only assimilated to constrain the core field secular variation, they can be considered as independent measurements for external and induced fields. In order to make visual comparisons possible and to remain within the conditions the model was built in, only hourly nighttime measurements and predictions were kept, and then averaged over 10-day periods. The results are reported in the right panel of Fig. 10 with red lines for observatory data, black lines for the full model predictions and blue lines for the predictions of the core field alone. Globally, monthly and annual variations of \(B_\theta\) are well captured by the model. Only during the time gap between the CHAMP and Swarm missions, when external sources are no longer updated, do predictions and observations differ strongly. One can also notice that the core field does not seem to be contaminated by external or induced fields, as its evolution does not reproduce the rapid variations observed in the data. The largest discrepancies between predictions and observations are in the magnitude of the signals. Intense excursions are not predicted by the model. The reason for this is that the model was trained on a dataset selected for very quiet magnetic conditions [see Baerenzung et al. (2020)]. Therefore, the selection algorithm of the Kalman filter prevents the assimilation of data containing too strong a signal from external sources. A recalibration of the model for more general conditions would certainly solve this issue.
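The data reduction applied before this comparison (nighttime selection followed by 10-day averaging) can be sketched as follows. The nighttime window of 23:00 to 05:00 local time is an assumption chosen for illustration, as the selection criterion is not given in this section; function and variable names are hypothetical.

```python
import numpy as np

def nighttime_block_means(t_days, values, local_time_h, block_days=10.0, night=(23.0, 5.0)):
    """Average hourly nighttime values over consecutive blocks of `block_days` days."""
    start, end = night
    is_night = (local_time_h >= start) | (local_time_h < end)   # window wraps around midnight
    t, v = t_days[is_night], values[is_night]
    block = np.floor((t - t_days[0]) / block_days).astype(int)
    sums = np.bincount(block, weights=v)
    counts = np.bincount(block)
    centres = (np.arange(counts.size) + 0.5) * block_days + t_days[0]
    keep = counts > 0                                            # skip blocks without data
    return centres[keep], sums[keep] / counts[keep]

# Toy hourly series over 30 days whose value is the day index.
t_days = np.arange(720) / 24.0
local_time = (np.arange(720) % 24).astype(float)
centres, means = nighttime_block_means(t_days, np.floor(t_days), local_time)
```

The same function would be applied to both the observatory series and the model predictions sampled at the same epochs, so that the two averaged curves are directly comparable.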
Finally, our model contains a source for induced/residual ionospheric fields. The latter is a priori uncorrelated with magnetospheric fields. Yet rapid variations of external fields generate currents within the Earth’s interior, which in return induce a secondary magnetic field (e.g., Schmucker 1985; Langel and Estes 1985b; Olsen et al. 2005; Finlay et al. 2020). The intensity and temporal evolution of the induced field depend on the conductivity of the crust, the mantle and the core. Under the assumption that conductivity only depends on depth, each spherical harmonic coefficient of the induced field is linked to the same coefficient of the external field through the relation:
\[{\iota }_{l,m}(t) = \int {\mathcal {Q}}_l^m(t-t^\prime )\,\epsilon _{l,m}(t^\prime )\,\mathrm {d}t^\prime , \tag{22}\]
where \(\iota\) is the induced field, \(\epsilon\) the external fields, and \({\mathcal {Q}}\) is referred to as the \({\mathcal {Q}}\)-response. In our model, \(\iota = g_{ii}\) and \(\epsilon =g_{rm} + g_{m}+ g_{fm}\), where \(g_{rm}\) is projected in magnetic coordinates.
In the particular case discussed by Olsen et al. (2005), where the mantle is assumed to be insulating down to a given depth d and underlain by a superconductor, \({\mathcal {Q}}_l^m(t-t^\prime ) = \tilde{{\mathcal {Q}}}_l^m\delta (t-t^\prime )\) and therefore \({\iota }_{l,m}(t) = \tilde{{\mathcal {Q}}}_{l,m}\epsilon _{l,m}(t)\). Focusing on the dipole component of induced and external fields, and assuming a depth of \(d = 1200\) km, leads to \({\iota }_{1,0}(t) = \tilde{{\mathcal {Q}}}_{1,0}\epsilon _{1,0}(t)\) with \(\tilde{{\mathcal {Q}}}_{1,0} = 0.27\) as estimated by Langel and Estes (1985b) with POGO data. In the left panel of Fig. 11, the evolution of \(\epsilon _{1,0}\) and \({\iota }_{1,0}(t)/\tilde{{\mathcal {Q}}}_{1,0}\) between 2019.45 and 2019.65 is displayed with red and black lines, respectively. In order to concentrate on rapid variations only (we recall that external and induced fields were stored every 3 hours during the CHAMP and Swarm eras), temporal scales larger than 15 days have been filtered out from both time series. Furthermore, we chose the [2019.45, 2019.65] time period because the temporal coverage of Swarm data is optimal during this interval. The two time series in Fig. 11 follow one another quite closely, and \(\tilde{{\mathcal {Q}}}_{1,0}^{-1}\) seems appropriate to rescale the induced field. Over the current Swarm time span, induced and external fields exhibit a Pearson correlation \(\rho = \mathrm {Cov}(\epsilon ,\iota )/(\sigma _\epsilon \sigma _\iota )\), calculated here with the mean Kalmag solutions, of \(\rho = 0.79\). It is \(\rho = 0.84\) over the time interval of Fig. 11 and \(\rho = 0.73\) over the CHAMP era. This lower correlation value is probably caused by the uncertainty level of external and induced fields, which is higher during the CHAMP mission than during the Swarm one. However, the particular 1D conductivity model leading to \(\tilde{{\mathcal {Q}}}_{1,0}\) is known to be imperfect.
More complex conductivity profiles are required to better model induction processes within the Earth’s interior.
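The comparison between external and induced dipoles described above can be mimicked with the following sketch, which removes temporal scales longer than 15 days and then evaluates the Pearson correlation. The sharp Fourier cut-off is an assumption (the filter actually used is not specified), and the synthetic series only serve to exercise the functions.

```python
import numpy as np

def highpass(series, dt_days, cutoff_days=15.0):
    """Remove temporal scales longer than cutoff_days with a sharp Fourier filter."""
    spec = np.fft.rfft(series - series.mean())
    freq = np.fft.rfftfreq(series.size, d=dt_days)       # cycles per day
    spec[freq < 1.0 / cutoff_days] = 0.0
    return np.fft.irfft(spec, n=series.size)

def pearson(a, b):
    """Pearson correlation Cov(a, b) / (sigma_a * sigma_b)."""
    return float(np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std()))

# Synthetic 3-hourly series over 120 days: a slow 60-day and a fast 5-day oscillation.
dt = 0.125
t = np.arange(960) * dt
slow = 10.0 * np.sin(2.0 * np.pi * t / 60.0)
fast = np.sin(2.0 * np.pi * t / 5.0)
eps_fast = highpass(slow + fast, dt)
iota_fast = 0.27 * eps_fast            # instantaneous response with Q = 0.27
rho = pearson(eps_fast, iota_fast)
```

For a purely instantaneous response the correlation equals 1 whatever the value of the scaling factor, so correlations below 1 reflect either model uncertainties or a departure from this simple conductivity model.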
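We now investigate the \({\mathcal {Q}}\)-response predicted by our model when keeping the assumption that the conductivity within the Earth is only depth-dependent, but relaxing the constraint on its profile. For this evaluation, we operate in spectral space. Considering only the dipole components of \(\iota\) and \(\epsilon\) and applying a Fourier transform to equation 22, the latter becomes:
\[\hat{\iota }(k) = \hat{{\mathcal {Q}}}(k)\,\hat{\epsilon }(k).\]
From this equation, the real and imaginary parts of \(\hat{{\mathcal {Q}}}(k)\) are, respectively, given by:
\[{\text {Re}}\{\hat{{\mathcal {Q}}}(k)\} = \frac{{\text {Re}}\{\hat{\iota }\}\,{\text {Re}}\{\hat{\epsilon }\} + {\text {Im}}\{\hat{\iota }\}\,{\text {Im}}\{\hat{\epsilon }\}}{|\hat{\epsilon }|^2}, \qquad {\text {Im}}\{\hat{{\mathcal {Q}}}(k)\} = \frac{{\text {Im}}\{\hat{\iota }\}\,{\text {Re}}\{\hat{\epsilon }\} - {\text {Re}}\{\hat{\iota }\}\,{\text {Im}}\{\hat{\epsilon }\}}{|\hat{\epsilon }|^2}.\]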
To evaluate these two quantities, we considered induced and external fields during the [2015.0, 2021.0] time interval, when the model reaches its peak accuracy. In the right panel of Fig. 11, \({\text {Re}}\{\hat{{\mathcal {Q}}}(2\pi /k)\}\) and \({\text {Im}}\{\hat{{\mathcal {Q}}}(2\pi /k)\}\) averaged at period \(T_i=2\pi /k_i\) over \([T_i,2T_i]\) are, respectively, displayed with red and black continuous lines. For comparison, the real and imaginary parts of \(\hat{\tilde{{\mathcal {Q}}}}_{1,0}\) as well as the \({\mathcal {Q}}\)-response (referred to as \({\mathcal {Q}}^O\)) estimated by Olsen et al. (2005) with a realistic conductivity model are, respectively, shown with dashed and dotted lines. The general behavior of the \({\mathcal {Q}}\)-response we recover is consistent with our prior knowledge about it. Indeed, for short periods the real part of \(\hat{{\mathcal {Q}}}\) is much more intense than its imaginary part, and its decay pattern is close to the one predicted by Olsen et al. (2005). However, in comparison to \({\text {Re}}\{\hat{{\mathcal {Q}}}^O\}\), \({\text {Re}}\{\hat{{\mathcal {Q}}}\}\) is globally underestimated. This effect might be due to the fact that induced fields vary rapidly with time: when no data are feeding the model, their mean value tends quickly toward 0, contrary to the remote and close magnetospheric fields, which evolve more slowly. The behavior of the imaginary part of \(\hat{{\mathcal {Q}}}\), which reflects the temporal lag of the induced field response, is on the contrary very similar to the one predicted by the direct model of Olsen et al. (2005).
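A minimal sketch of such a spectral estimate is given below: the response is taken as the ratio of the Fourier transforms of the induced and external dipole series, and then averaged over periods in \([T_i,2T_i]\). The use of a plain (unwindowed) FFT and the synthetic input are assumptions; note also that the sign of the imaginary part depends on the Fourier convention (NumPy uses \(e^{-i\omega t}\)).

```python
import numpy as np

def q_response(iota, eps, dt):
    """Spectral ratio Q(k) = iota_hat(k) / eps_hat(k), returned as real and imaginary parts."""
    iota_hat = np.fft.rfft(iota)
    eps_hat = np.fft.rfft(eps)
    freq = np.fft.rfftfreq(eps.size, d=dt)
    power = np.abs(eps_hat) ** 2
    q_re = (iota_hat.real * eps_hat.real + iota_hat.imag * eps_hat.imag) / power
    q_im = (iota_hat.imag * eps_hat.real - iota_hat.real * eps_hat.imag) / power
    return freq[1:], q_re[1:], q_im[1:]        # the zero frequency is dropped

def band_average(freq, q, period):
    """Mean of q over the periods lying in [period, 2 * period]."""
    periods = 1.0 / freq
    mask = (periods >= period) & (periods <= 2.0 * period)
    return float(q[mask].mean())

# Synthetic 3-hourly external dipole and an instantaneous response with Q = 0.27.
rng = np.random.default_rng(2)
eps = rng.standard_normal(512)
iota = 0.27 * eps
freq, q_re, q_im = q_response(iota, eps, dt=0.125)
q_re_band = band_average(freq, q_re, period=2.0)   # periods between 2 and 4 days
```

For an instantaneous response the estimate is real and independent of period; a frequency-dependent real part and a non-zero imaginary part are therefore signatures of a more complex conductivity profile.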
Conclusion
In this study, we proposed a method to assimilate different types of geomagnetic data in order to construct a model of the Earth’s magnetic field with high spatiotemporal resolution. Since the model is expressed in terms of posterior distributions, it reflects the quality and spatial coverage of the measurements it is derived from. At the beginning of the twentieth century, the main field and the secular variation are quite uncertain in the Southern hemisphere, and more particularly in oceanic areas and in Antarctica. With the first data collected by low-orbiting satellites, these two fields gain in precision, and they become very reliable during the CHAMP and Swarm eras. We demonstrated that the rapid dynamics of the core field could be captured by the model. However, the resolution at which short timescale fluctuations are recovered is not constant over time and strongly depends on the spatial scale considered. Typically, rapid variations can only be accurately modeled at large spatial scales. Conversely, fluctuations of the secular variation at high spherical harmonic degree can only be resolved over long periods of time. The model reaches its peak accuracy both spatially and temporally during the CHAMP and Swarm eras. It is therefore essential that such satellite missions are continued in the future to better understand the nonlinear and wave dynamics occurring within the Earth’s outer core (see Aubert and Gillet (2021), Gillet et al. (2021)).
To be able to consider land, airborne and marine survey observations, which contain an intense contribution of the small-scale lithospheric field, the latter was modeled up to spherical harmonic degree \(\ell =1000\). However, this operation could not be performed directly, since the dimension of the associated covariance matrix would have made any numerical computation impossible. We therefore introduced, and conclusively evaluated, a statistical approximation where only mean and variance information was updated beyond \(\ell =100\). The resulting mean solution exhibits highly detailed structures in every area where data coverage was dense enough. Furthermore, the part of the covariance which is still fully modeled (up to \(\ell =100\)) provides a rough estimate of locations where the mean is likely to be well resolved.
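The order of magnitude behind this limitation can be checked with a few lines. The sketch below counts the Gauss coefficients of an internal field expanded from degree 1 to \(\ell _{max}\), namely \(\ell _{max}(\ell _{max}+2)\), and the storage a dense covariance matrix would need, assuming double precision (8 bytes per value) and considering the lithospheric component alone.

```python
def n_coefficients(l_max):
    """Number of Gauss coefficients of an internal field expanded from degree 1 to l_max."""
    return l_max * (l_max + 2)

def dense_covariance_bytes(l_max, bytes_per_value=8):
    """Storage of a dense covariance matrix for all coefficients up to l_max."""
    n = n_coefficients(l_max)
    return n * n * bytes_per_value

def hybrid_covariance_bytes(l_full, l_max, bytes_per_value=8):
    """Dense block up to l_full plus one variance per coefficient between l_full + 1 and l_max."""
    n_full = n_coefficients(l_full)
    n_diag = n_coefficients(l_max) - n_full
    return (n_full * n_full + n_diag) * bytes_per_value

full_tb = dense_covariance_bytes(1000) / 1e12          # about 8 terabytes
hybrid_gb = hybrid_covariance_bytes(100, 1000) / 1e9   # below 1 gigabyte
```

Under these assumptions, a dense covariance up to degree 1000 would require roughly 8 TB, whereas keeping the full covariance only up to degree 100 and variances beyond reduces the storage to less than 1 GB.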
An important aspect of the proposed approach is that whenever new observations become available, the model can be updated accordingly without restarting the entire assimilation process. The example presented with the dataset taken above Afghanistan demonstrates the flexibility of the method.
As for the core field, the accuracy of external and induced fields is not constant over the model time span. While the signal of the remote magnetospheric field could be extracted from 1953 on, rapidly evolving sources such as the close and fluctuating magnetospheric fields or the induced field could only be separated from the data when the latter exhibit a high temporal coverage. In general, optimal solutions for such fields were obtained during satellite eras, and in particular during the CHAMP and Swarm ones. The global behavior of external fields is in agreement with previous studies (see Lühr and Maus (2010); Huder et al. (2020)). However, the training of the model under very quiet magnetic conditions prevents the reproduction of the most intense external field variations. A recalibration of the model under more general conditions therefore appears necessary. Although magnetospheric and induced fields were a priori assumed to be independent, their connection revealed itself a posteriori. Through the proposed approach we showed that external and induced fields could be jointly estimated from direct measurements of the geomagnetic field, although the processes characterizing their evolution remain quite simplistic. A refined parametrization of their dynamical behavior would certainly enhance the ability of the algorithm to extract such sources from the data.
The model will be frequently updated (at least once every 2 months), in particular with Swarm and observatory data. Furthermore, it can be accessed through different physical and statistical properties on a dedicated website at: https://ionocovar.agnld.unipotsdam.de/Kalmag/.
Availability of data and materials
The Kalmag model can be downloaded at https://ionocovar.agnld.unipotsdam.de/Kalmag/
POGO data can be downloaded at ftp://ftp.spacecenter.dk/data/magneticsatellites/POGO/
MagSat data can be downloaded at ftp://ftp.spacecenter.dk/data/magneticsatellites/MagSat/
Oersted data can be downloaded at ftp://ftp.spacecenter.dk/data/magneticsatellites/Oersted/
CHAMP data can be downloaded at https://isdc.gfzpotsdam.de/champisdc/accesstothechampdata/
Swarm data can be downloaded at ftp://swarmdiss.eo.esa.int/Level1b/Entire_mission_data/MAGx_LR/
The Kp index can be downloaded at ftp://ftp.gfzpotsdam.de/pub/home/obs/kpap/
The IMF indices can be downloaded at https://spdf.gsfc.nasa.gov/pub/data/omni/low_res_omni/
Land, marine and airborne survey data can be downloaded at:
\(\cdot\) http://www.wdc.bgs.ac.uk/data.html
\(\cdot\) https://www.ncei.noaa.gov/maps/geophysics/
\(\cdot\) https://mrdata.usgs.gov/magnetic/
Abbreviations
SH: Spherical harmonics
SV: Secular variation
SD: Standard deviation
ARP: Autoregressive process
References
Alken P, Thébault E, Beggan CD, Amit H, Aubert J, Baerenzung J, Bondar TN, Brown WJ, Califf S, Chambodut A, Chulliat A, Cox GA, Finlay CC, Fournier A, Gillet N, Grayver A, Hammer MD, Holschneider M, Huder L, Hulot G, Jager T, Kloss C, Korte M, Kuang W, Kuvshinov A, Langlais B, Léger JM, Lesur V, Livermore PW, Lowes FJ, Macmillan S, Magnes W, Mandea M, Marsal S, Matzka J, Metman MC, Minami T, Morschhauser A, Mound JE, Nair M, Nakano S, Olsen N, Pavón-Carrasco FJ, Petrov VG, Ropp G, Rother M, Sabaka TJ, Sanchez S, Saturnino D, Schnepf NR, Shen X, Stolle C, Tangborn A, Tøffner-Clausen L, Toh H, Torta JM, Varner J, Vervelidou F, Vigneron P, Wardinski I, Wicht J, Woods A, Yang Y, Zeren Z, Zhou B (2021) International geomagnetic reference field: the thirteenth generation. Earth Planets Space 73(1):49. https://doi.org/10.1186/s4062302001288x
Amante C, Eakins B (2009) ETOPO1 1 arc-minute global relief model: procedures, data sources and analysis. NOAA technical memorandum NESDIS NGDC-24. Natl Geophys Data Center. https://doi.org/10.7289/V5C8276M
Aubert J, Gillet N (2021) The interplay of fast waves and slow convection in geodynamo simulations nearing Earth’s core conditions. Geophys J Int 225(3):1854–1873. https://doi.org/10.1093/gji/ggab054
Backus GE (1970) Non-uniqueness of the external geomagnetic field determined by surface intensity measurements. J Geophys Res 75(31):6339. https://doi.org/10.1029/JA075i031p06339
Baerenzung J, Holschneider M, Wicht J, Lesur V, Sanchez S (2020) The Kalmag model as a candidate for IGRF-13. Earth Planets Space 72(1):163. https://doi.org/10.1186/s4062302001295y
Cain JC, Sweeney RE (1973) The POGO data. J Atmos Terr Phys 35:1231. https://doi.org/10.1016/00219169(73)900214
Dyment J, Choi Y, Hamoudi M, Lesur V, Thebault E (2015) Global equivalent magnetization of the oceanic lithosphere. Earth Planet Sci Lett 430:54–65. https://doi.org/10.1016/j.epsl.2015.08.002
Finlay CC, Kloss C, Olsen N, Hammer MD, TøffnerClausen L, Grayver A, Kuvshinov A (2020) The CHAOS7 geomagnetic field model and observed changes in the South Atlantic Anomaly. Earth Planets Space 72(1):156. https://doi.org/10.1186/s40623020012529
Gillet N (2019) Spatial and temporal changes of the geomagnetic field: insights from forward and inverse core field models. Geomagn Aeron Space weather. https://doi.org/10.48550/arXiv.1902.08098
Gillet N, Jault D, Finlay CC, Olsen N (2013) Stochastic modeling of the Earth’s magnetic field: inversion for covariances over the observatory era. Geochem Geophys Geosyst 14:766–786. https://doi.org/10.1002/ggge.20041
Gillet N, Jault D, Finlay CC (2015) Planetary gyre, timedependent eddies, torsional waves, and equatorial jets at the Earth’s core surface. J Geophys Res 120:3991–4013. https://doi.org/10.1002/2014JB011786
Gillet N, Gerick F, Angappan R, Jault D (2021) A dynamical prospective on interannual geomagnetic field changes. Surv Geophys. https://doi.org/10.1007/s10712021096642
Holschneider M, Lesur V, Mauerberger S, Baerenzung J (2016) Correlationbased modeling and separation of geomagnetic field components. J Geophys Res 121:3142–3160. https://doi.org/10.1002/2015JB012629
Huder L, Gillet N, Finlay CC, Hammer MD, Tchoungui H (2020) COVOBS.x2: 180 years of geomagnetic field evolution from groundbased and satellite observations. Earth Planets Space 72(1):160. https://doi.org/10.1186/s40623020011942
Hulot G, Le Mouël JL (1994) A statistical approach to the Earth’s main magnetic field. Phys Earth Planet Inter 82(3–4):167–183. https://doi.org/10.1016/00319201(94)900701
Hulot G, Sabaka T, Olsen N, Fournier A (2015) 5.02 - The present and future geomagnetic field. In: Schubert G (ed) Treatise on geophysics, 2nd edn. Elsevier, Oxford, pp 33–78. https://doi.org/10.1016/B9780444538024.000968
Jackson A, Finlay CC (2007) Geomagnetic secular variation and its applications to the core. Treatise Geophys 5:147–193. https://doi.org/10.1016/B9780444527486.000900
Jackson A, Jonkers ART, Walker MR (2000) Four centuries of geomagnetic secular variation from historical records. Philos Trans R Soc Lond Ser A 358:957. https://doi.org/10.1098/rsta.2000.0569
Kalman RE (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82:35–45. https://doi.org/10.1115/1.3662552
Langel RA, Estes RH (1985) The nearearth magnetic field at 1980 determined from magsat data. J Geophys Res 90(B3):2495–2510. https://doi.org/10.1029/JB090iB03p02495
Langel RA, Estes RH (1985) Largescale, nearfield magnetic fields from external sources and the corresponding induced internal field. J Geophys Res 90(B3):2487–2494. https://doi.org/10.1029/JB090iB03p02487
Langel RA, Hinze WJ (1998) The magnetic field of the earth’s lithosphere. Cambridge University Press, Cambridge
Laundal R (2017) Magnetic coordinate systems. Space Sci Rev 206:27–59. https://doi.org/10.1007/s112140160275y
Lesur V, Wardinski I, Rother M, Mandea M (2008) GRIMM: the GFZ reference internal magnetic model based on vector satellite and observatory data. Geophys J Int 173:382–394. https://doi.org/10.1111/j.1365246X.2008.03724.x
Lesur V, Wardinski I, Hamoudi M, Rother M (2010) The second generation of the GFZ reference internal magnetic model: GRIMM2. Earth Planets Space 62:765–773. https://doi.org/10.5047/eps.2010.07.007
Lesur V, Whaler K, Wardinski I (2015) Are geomagnetic data consistent with stably stratified flow at the coremantle boundary? Geophys J Int 201:929–946. https://doi.org/10.1093/gji/ggv031
Lesur V, Hamoudi M, Choi Y, Dyment J, Thébault E (2016) Building the second version of the World Digital Magnetic Anomaly Map (WDMAM). Earth Planets Space 68:27. https://doi.org/10.1186/s4062301604046
Lowes FJ (1975) Vector errors in spherical harmonic analysis of scalar data. Geophys J Int 42(2):637–651. https://doi.org/10.1111/j.1365246X.1975.tb05884.x
Lühr H, Maus S (2010) Solar cycle dependence of quiettime magnetospheric currents and a model of their nearEarth magnetic fields. Earth Planets Space 62(10):843–848. https://doi.org/10.5047/eps.2010.07.012
Macmillan S, Olsen N (2013) Observatory data and the Swarm mission. Earth Planets Space 65(11):1355–1362. https://doi.org/10.5047/eps.2013.07.011
Mauerberger S, Schanner M, Korte M, Holschneider M (2020) Correlation based snapshot models of the archeomagnetic field. Geophys J Int 223(1):648–665. https://doi.org/10.1093/gji/ggaa336
Maus S (2010) An ellipsoidal harmonic representation of Earth’s lithospheric magnetic field to degree and order 720. Geochem Geophy Geosyst 11(6):Q06015. https://doi.org/10.1029/2010GC003026
Maus S, Lühr H, Balasis G, Rother M, Mandea M (2005) Introducing POMME, the POtsdam magnetic model of the earth. Earth Obs Champ. https://doi.org/10.1007/3540268006_46
Maus S, Manoj C, Rauberg J, Michaelis I, Lühr H (2010) NOAA/NGDC candidate models for the 11th generation International geomagnetic reference field and the concurrent release of the 6th generation Pomme magnetic model. Earth Planets Space 62:729–735. https://doi.org/10.5047/eps.2010.07.006
Neubert T, Mandea M, Hulot G, von Frese R, Primdahl F, Jørgensen JL, FriisChristensen E, Stauning P, Olsen N, Risbo T (2001) Ørsted satellite captures highprecision geomagnetic field data. EOS Trans 82(7):81–88. https://doi.org/10.1029/01EO00043
Olsen N, Sabaka TJ, Lowes F (2005) New parameterization of external and induced fields in geomagnetic field modeling, and a candidate model for IGRF 2005. Earth Planets Space 57:1141–1149. https://doi.org/10.1186/BF03351897
Olsen N, Lühr H, Sabaka TJ, Mandea M, Rother M, ToeffnerClausen L, Choi S (2006) CHAOSa model of the Earth’s magnetic field derived from CHAMP, Oersted, and SACC magnetic satellite data. Geophy J Int 166:67–75. https://doi.org/10.1111/j.1365246X.2006.02959.x
Olsen N, FriisChristensen E, Floberghagen R, Alken P, Beggan CD, Chulliat A, Doornbos E, da Encarnação JT, Hamilton B, Hulot G, van den IJssel J, Kuvshinov A, Lesur V, Lühr H, Macmillan S, Maus S, Noja M, Olsen PEH, Park J, Plank G, Püthe C, Rauberg J, Ritter P, Rother M, Sabaka TJ, Schachtschneider R, Sirol O, Stolle C, Thébault E, Thomson AWP, TøffnerClausen L, Velímský J, Vigneron P, Visser PN (2013) The Swarm Satellite Constellation Application and Research Facility (SCARF) and Swarm data products. Earth Planets Space 65(11):1189–1200. https://doi.org/10.5047/eps.2013.07.001
Olsen N, Ravat D, Finlay CC, Kother LK (2017) LCS1: a highresolution global model of the lithospheric magnetic field derived from CHAMP and Swarm satellite observations. Geophys J Int 211(3):1461–1477. https://doi.org/10.1093/gji/ggx381
Quesnel Y, Catalán M, Ishihara T (2009) A new global marine magnetic anomaly data set. J Geophys Res 114(B4):B04106. https://doi.org/10.1029/2008JB006144
Rauch HE, Striebel CT, Tung F (1965) Maximum likelihood estimates of linear dynamic systems. AIAA J 3(8):1445–1450. https://doi.org/10.2514/3.3166
Ropp G, Lesur V, Baerenzung J, Holschneider M (2020) Sequential modelling of the Earth’s core magnetic field. Earth Planets Space 72(1):153. https://doi.org/10.1186/s40623020012301
Rother M, Michaelis K, Olsen N (2000) Resolution studies of fluid flow models near the coremantle boundary using bayesian inversion. In: Hansen P, Jacobsen B, Mosegaard K (eds) Methods and applications of inversion, lecture notes in earth sciences, vol 92. Springer, Berlin, pp 255–275. https://doi.org/10.1007/BFb0010296
Sabaka TJ, Olsen N, Langel RA (2002) A comprehensive model of the quiettime, nearEarth magnetic field: phase 3. Geophy J Int 151:32–68. https://doi.org/10.1046/j.1365246X.2002.01774.x
Sabaka TJ, Olsen N, Purucker ME (2004) Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data. Geophys J Int 159(2):521–547. https://doi.org/10.1111/j.1365246X.2004.02421.x
Sabaka TJ, Olsen N, Tyler RH, Kuvshinov A (2015) CM5, a preSwarm comprehensive geomagnetic field model derived from over 12 yr of CHAMP, Oersted, SACC and observatory data. Geophys J Int 200:1596–1626. https://doi.org/10.1093/gji/ggu493
Sabaka TJ, TøffnerClausen L, Olsen N, Finlay CC (2018) A comprehensive model of Earth’s magnetic field determined from 4 years of Swarm satellite observations. Earth Planets Space 70:130. https://doi.org/10.1186/s4062301808963
Sabaka TJ, TøffnerClausen L, Olsen N, Finlay CC (2020) CM6: a comprehensive geomagnetic field model derived from both CHAMP and Swarm satellite observations. Earth Planets Space 72(1):80. https://doi.org/10.1186/s40623020012105
Schanner M, Korte M, Holschneider M (2022) ArchKalmag14k: A Kalmanfilter based global geomagnetic model for the Holocene. J Geophys Res 127(2):e23166. https://doi.org/10.1029/2021JB023166
Schmucker U (1985) Magnetic and electric fields due to electromagnetic induction by external sources. In: Schmucker (ed) Landolt-Börnstein, New Series, 5/2b. Springer, Berlin, pp 100–125
Thébault E, Hulot G, Langlais B, Vigneron P (2021) A spherical harmonic model of Earth’s lithospheric magnetic field up to degree 1050. Geophys Res Lett 48(21):e95147. https://doi.org/10.1029/2021GL095147
Wardinski I, Holme R (2011) Signal from noise in geomagnetic field modelling: denoising data for secular variation studies. Geophys J Int 185(2):653–662. https://doi.org/10.1111/j.1365246X.2011.04988.x
Wardinski I, Saturnino D, Amit H, Chambodut A, Langlais B, Mandea M, Thébault E (2020) Geomagnetic core field models and secular variation forecasts for the 13th International geomagnetic reference field (IGRF13). Earth Planets Space 72(1):155. https://doi.org/10.1186/s40623020012547
Waters CL, Anderson BJ, Liou K (2001) Estimation of global field aligned currents using the iridium® System magnetometer data. Geophys Res Lett 28(11):2165–2168. https://doi.org/10.1029/2000GL012725
Acknowledgements
We particularly wish to thank Professor Niels Olsen from DTU Space for providing us with \({\mathcal {Q}}\)-response models. This work has been supported by the German Research Foundation (DFG) within the Priority Program SPP1788 “Dynamic Earth”.
Funding
Open Access funding enabled and organized by Projekt DEAL. This work has been funded by the German Research Foundation (DFG) within the Priority Program SPP1788 “Dynamic Earth”.
Author information
Contributions
All the authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Sequentialization
The evolution of a given quantity g under continuous first- and second-order autoregressive processes can be described through the following relations:
where \({\dot{\omega }}\) is a Gaussian white noise scaled by the factor \(\sigma\). Introducing \(z=g\) for the first-order ARP and \(z=\left( g,\partial _t g \right) ^T\) for the second-order ARP, equations (A.1) and (A.2) can be written as:
with \(A = -1/\tau\) and \(\zeta = \sigma {\dot{\omega }}\) for the first-order ARP and:
for the second-order ARP.
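For orientation, relations of this kind are commonly written as sketched below. This is a reconstruction consistent with the definitions given here (\(z\), \(\zeta\), and \(A = -1/\tau\) so that the first-order process decays), not a transcription of equations (A.1) to (A.3). In particular, the second-order form assumes a critically damped process governed by a single timescale \(\tau\), which is an assumption and is not stated in this appendix.

```latex
\begin{align}
  % (A.1) first-order ARP
  \partial_t g &= -\frac{g}{\tau} + \sigma \dot{\omega}, \\
  % (A.2) second-order ARP, ASSUMED critically damped with a single timescale \tau
  \partial_t^2 g &= -\frac{2}{\tau}\,\partial_t g - \frac{g}{\tau^2} + \sigma \dot{\omega}, \\
  % (A.3) common state-space form for both processes
  \partial_t z &= A z + \zeta, \\
  % second-order case, under the same assumption, with z = (g, \partial_t g)^T
  A &= \begin{pmatrix} 0 & 1 \\ -1/\tau^2 & -2/\tau \end{pmatrix},
  \qquad
  \zeta = \begin{pmatrix} 0 \\ \sigma \dot{\omega} \end{pmatrix}.
\end{align}
```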
The homogeneous solution of equation A.3 is given by:
\(z(t+\Delta t) = \exp (A \Delta t)\, z(t)\),
where \(F=\exp (A \Delta t)\) is the parameter of the ARP as expressed in equations 4 and 5. The general solution of equation A.3 is simply \(z(t+\Delta t) = F z(t) +\xi\), where the white noise \(\xi\), characterized by the distribution \({\mathcal {N}}(0,{\tilde{\Sigma }})\), is chosen here to force the process to remain stationary. Under such a constraint, one can write that \(\Sigma (t) = E\left[ \left( z(t)-E\left[ z(t)\right] \right) \left( z(t)-E\left[ z(t)\right] \right) ^T\right] = \Sigma (t=\infty ) = \Sigma _\infty\). Therefore, calculating the covariance of both sides of the solution \(z(t+\Delta t) = F z(t) +\xi\) and rearranging the result gives \({\tilde{\Sigma }} = \Sigma _\infty - F \Sigma _\infty F^T\).
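The sequentialization described in this appendix can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the stationary variance \(\sigma^2\tau/2\) of the first-order process follows from the standard Ornstein-Uhlenbeck result, and the second-order case again assumes a critically damped process with a single timescale \(\tau\), for which \(\exp(A\Delta t)\) and \(\Sigma_\infty\) have closed forms.

```python
import math


def discretize_ar1(tau, sigma, dt):
    """First-order ARP: return F, the stationary variance and the noise variance."""
    F = math.exp(-dt / tau)                # F = exp(A * dt) with A = -1/tau
    var_inf = sigma ** 2 * tau / 2.0       # stationary variance Sigma_inf
    noise_var = var_inf - F * var_inf * F  # Sigma_tilde = Sigma_inf - F Sigma_inf F^T
    return F, var_inf, noise_var


def matmul(a, b):
    """Product of two small matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a small matrix stored as nested lists."""
    return [list(col) for col in zip(*a)]


def discretize_ar2(tau, sigma, dt):
    """Second-order ARP for z = (g, dg/dt), ASSUMING critical damping.

    With A = [[0, 1], [-1/tau^2, -2/tau]], exp(A*dt) has the closed form
    exp(-dt/tau) * [[1 + dt/tau, dt], [-dt/tau^2, 1 - dt/tau]].
    """
    x = dt / tau
    decay = math.exp(-x)
    F = [[decay * (1.0 + x), decay * dt],
         [-decay * x / tau, decay * (1.0 - x)]]
    # Stationary covariance solving A S + S A^T + diag(0, sigma^2) = 0
    cov_inf = [[sigma ** 2 * tau ** 3 / 4.0, 0.0],
               [0.0, sigma ** 2 * tau / 4.0]]
    propagated = matmul(matmul(F, cov_inf), transpose(F))
    noise_cov = [[cov_inf[i][j] - propagated[i][j] for j in range(2)]
                 for i in range(2)]
    return F, cov_inf, noise_cov
```

With this choice of noise covariance, one step of the discrete process leaves the covariance unchanged, since \(F \Sigma_\infty F^T + {\tilde{\Sigma}} = \Sigma_\infty\) by construction.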
Appendix B: Smoothing and merging
Because the model is constructed in multiple steps, the smoothing algorithm had to be adapted. In particular, information gained from the assimilation of data prior to 2000.5 had to be propagated to the model constructed after this date with CHAMP, Swarm and ground-based observatory data. To do so, the solution in 2000.4 of the smoothing algorithm running between 1900 and 2000.4 was taken as a reference model. It is referred to as \(\mathbf{z }_r\), with mean \(E[\mathbf{z }_r]\) and covariance \(\varvec{\Sigma }_{\mathbf{z }_r}\). The information accumulated within this snapshot is then transferred to the first smoothing solution (the one running from 2022.2 to 2000.5). The algorithm performing this task proceeds iteratively in time through the relations:
where \(\mathbf{z }_{k}\) is a snapshot taken at iteration k of the \(2000.5-2022.2\) smoothing model, \(\mathbf{F }_{k}\) is the parameter of the ARP that forecasts \(\mathbf{z }_r\) to iteration k, and \({\tilde{\varvec{\Sigma }}}_{\mathbf{z }_r}\) is the covariance of the ARP white noise derived from \(\mathbf{F }_k\).
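As a hedged illustration of the forecasting ingredient only, the sketch below propagates a reference snapshot to a later iteration for a scalar first-order ARP, using \(F_k\) and the stationarity-preserving noise variance of Appendix A. The function name, the scalar setting and the parameter values are hypothetical, and the combination of the forecast with the snapshot \(\mathbf{z }_{k}\) is not shown.

```python
import math


def forecast_reference(mean_r, var_r, tau, var_inf, t_r, t_k):
    """Forecast a scalar reference snapshot from epoch t_r to epoch t_k.

    Assumes a stationary first-order ARP with timescale tau and stationary
    variance var_inf (an illustrative assumption, not the full model state).
    """
    F_k = math.exp(-abs(t_k - t_r) / tau)      # ARP parameter for this time gap
    noise_var = var_inf - F_k * var_inf * F_k  # Sigma_tilde derived from F_k
    mean_k = F_k * mean_r                      # forecast mean
    var_k = F_k * var_r * F_k + noise_var      # forecast variance
    return mean_k, var_k
```

In this toy setting the forecast equals the reference snapshot when the two epochs coincide, and relaxes towards zero mean and the stationary variance as the gap grows, so the reference model carries less and less information to iterations far from 2000.4.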
Appendix C: Lithospheric field update
Whenever new survey data become available, one may wish to assimilate them to improve the lithospheric field model. One possibility would be to merge the new dataset with the global one and relaunch the entire Kalman filter/smoothing algorithm, but this operation is extremely time consuming. A better option is to use the posterior model resulting from the smoothing algorithm as prior information to assimilate the new dataset, and then to propagate the information gained on the lithospheric field to the entire model. To perform this task, let us assume that new data become available within the time interval \([k-1,k+1]\), where k corresponds to a stored snapshot of the smoothing solution. Assimilating data between \(k-1\) and k is straightforward: one can simply run the Kalman filter algorithm with the smoothing solution \(\mathbf{z }_{k-1}\) taken as a restart file. On arriving at k, if the new dataset offers only limited coverage of the Earth’s surface, the accuracy of the core field and other sources will likely be lower than that of the smoothing solution at this epoch. It would therefore be beneficial to use this solution \(\mathbf{z }_k\) as a prior model. At the same time, however, the information gained on the lithospheric field between \(k-1\) and k needs to be transferred to it. To perform this operation we proceeded as follows:
where \(g_{k}^u\) and \(g_{k}^s\) are the vectors of SH coefficients associated with the lithospheric field for, respectively, the Kalman filter and the smoothing solution, and \(\varvec{\Sigma }_{\mathbf{z }_{k},g_{k}^s}\) is the smoothing cross-covariance matrix between \(\mathbf{z }_{k}\) and \(g_{k}^s\). These updated smoothing solutions at epoch k are then used as prior information for the Kalman filter running between k and \(k+1\). This operation is then repeated every 0.1 year, as constrained by the chosen temporal resolution of the model.
Once the entire new dataset has been assimilated, the Kalman filter solution in 1900 is updated with the new lithospheric field through equations C.1 to C.3 and the smoothing algorithm between 1900 and 2022 is simulated again.
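The idea of transferring information gained on the lithospheric coefficients to the rest of the state through the smoothing cross-covariance can be illustrated with a standard Gaussian regression update of the kind used in Rauch-Tung-Striebel smoothing. The sketch below is a toy scalar version under that assumption: the function name, the scalar setting and the exact form of the gain are illustrative choices and should not be read as equations C.1 to C.3.

```python
def transfer_lithospheric_update(mean_z, var_z, cov_zg,
                                 mean_g_s, var_g_s,
                                 mean_g_u, var_g_u):
    """Propagate an updated lithospheric coefficient to another state variable.

    (mean_z, var_z): smoothing statistics of a state variable z at epoch k.
    cov_zg: smoothing cross-covariance between z and the coefficient g.
    (mean_g_s, var_g_s): smoothing statistics of g.
    (mean_g_u, var_g_u): statistics of g after the Kalman filter update.
    """
    gain = cov_zg / var_g_s                                 # regression of z on g
    new_mean_z = mean_z + gain * (mean_g_u - mean_g_s)      # shift by the innovation in g
    new_var_z = var_z + gain * (var_g_u - var_g_s) * gain   # shrink by the variance gained on g
    return new_mean_z, new_var_z
```

In this toy setting, z is left unchanged when the update brings no new information on g, and its variance decreases whenever the updated variance of g is smaller than the smoothing one.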
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Baerenzung, J., Holschneider, M., Saynisch-Wagner, J. et al. Kalmag: a high spatiotemporal model of the geomagnetic field. Earth Planets Space 74, 139 (2022). https://doi.org/10.1186/s40623-022-01692-5
DOI: https://doi.org/10.1186/s40623-022-01692-5
Keywords
 Geomagnetic field
 Lithospheric field
 Secular variation
 Magnetospheric field
 Induced field
 Assimilation
 Kalman filter
 Machine learning