Use of the Comprehensive Inversion method for Swarm satellite data analysis
Earth, Planets and Space volume 65, Article number: 2 (2013)
Abstract
An advanced algorithm, known as the “Comprehensive Inversion” (CI), is presented for the analysis of Swarm measurements to generate a consistent set of Level-2 data products to be delivered by the Swarm “Satellite Constellation Application and Research Facility” (SCARF) to the European Space Agency (ESA). This new algorithm improves on a previously developed version in several ways, including the ability to process ground-based observatory data, estimation of rotations describing the alignment of vector magnetometer measurements with a known reference system, and the inclusion of ionospheric induction effects due to an a priori 3-dimensional conductivity model. However, the most substantial improvements entail the application of a mechanism termed “Selective Infinite Variance Weighting” (SIVW), which mitigates the effects of non-zero mean systematic noise and allows for the exploitation of gradient information from the low-altitude Swarm satellite pair to determine small-scale lithospheric fields, and an improvement in the treatment of attitude error due to noise in star-tracking systems over previously established methods. The advanced CI algorithm is validated by applying it to synthetic data from a full simulation of the Swarm mission, where it is found to significantly exceed all mandatory and most target accuracy requirements.
1. Introduction
The European Space Agency (ESA) is scheduled to launch the Swarm mission (Friis-Christensen et al., 2006) in 2013, a constellation of three satellites to map the Earth’s magnetic field to unprecedented accuracy. During its multi-year lifetime, two low orbiting spacecraft will act as a magnetic gradiometer while a third at higher altitude monitors the main and external fields at other local times. ESA has established the Swarm “Satellite Constellation Application and Research Facility” (SCARF) for the purposes of generating derived Level-2 products from the single-satellite Level-1b data. The “Comprehensive Inversion” (CI) method of Sabaka and Olsen (2006) is a major processing chain of SCARF, producing one version of each of five defined items: core, lithospheric, magnetospheric, and ionospheric spherical harmonic expansions, time-varying when appropriate, and Euler angles describing the alignment between the vector fluxgate magnetometer frame (VFM) system and that of the Common Reference Frame (CRF) system of the star imager.
The basic CI algorithm is presented in Sabaka and Olsen (2006) where the magnetic fields from all major near-Earth current sources are parameterized and then co-estimated to obtain optimal field separation. This co-estimation approach is the key to the superior results obtained because it eliminates ambiguities between parameter spaces. Technically, the basic CI algorithm is an iterative Gauss-Newton (GN) least-squares estimator (Seber and Wild, 2003), which derives the model parameters from Swarm vector magnetometer measurements. While its application to the Swarm E2E simulator (Sabaka and Olsen, 2006) showed promising performance in field recoverability, it still lacked several features that would render it a truly competent algorithm for the generation of actual Level-2 products. For instance, the basic algorithm did not allow for surface measurements such as observatory hourly-means (OHM) data, which are known to greatly enhance field separation. Estimation of the rotation between the VFM and CRF system for vector measurements mentioned above was not included in the basic algorithm. The a priori conductivity model assumed for ionospheric induction was 1-dimensional (1D) rather than 3-dimensional (3D) in its variation. Finally, the formal treatment of measurement and theory errors was very primitive and not considered versatile enough for actual mission application.
In this paper an attempt will be made to remedy the aforementioned inadequacies of the basic algorithm by developing an advanced CI algorithm. While this new algorithm now admits the OHM data, estimates the Euler angles describing the VFM to CRF rotations, and includes ionospheric induction due to 3D conductivity structure, its greatest advancement is in the area of formal error treatment. Here a methodology, termed “Selective Infinite Variance Weighting” (SIVW), is developed to handle non-zero mean error due to, for instance, theory inadequacies through the use of bias estimation which exploits Signal-to-Noise ratio (SNR) levels in different data subsets in order to extract the best models. In addition, the methodology of Holme and Bloxham (1995, 1996) and Holme (2000) to account for attitude error present in vector magnetometer measurements due to star tracker instabilities is revised in order to improve parameter estimates in the CI algorithm. The entire error treatment mechanism is then placed in the context of robust estimation by applying a Huber weighting scheme to mitigate the effects of outliers (Constable, 1988; Walker and Jackson, 2000; Olsen, 2002).
The structure of the paper is as follows: a brief overview of the basic CI algorithm will be presented in Section 2, followed by the development of SIVW in Section 3. In Section 4 the improved attitude error framework will be derived, followed by the development of the advanced CI algorithm in Section 5. The results of the application of the advanced algorithm to synthetic data from a mission simulation, known as “Version-2” (V2) (Olsen et al., 2013), and a discussion are in Section 7, followed by conclusions in Section 8. Finally, Appendices A and B are provided that contain some technical information and derivations of formulae presented in Sections 4 and 5, respectively.
2. Overview of the Basic CI Algorithm
The basic CI algorithm essentially employs the same field source parameterizations as the CM4 model of Sabaka et al. (2004), except for the magnetosphere and its associated induced field. These latter fields were instead discretized into contiguous bins through time and considered static within each bin. The magnetospheric and associated induced fields are modeled independently with the intent of discovering something of the underlying conductivity structure in Earth’s outer shell. These overall parameterizations have been presented in Sabaka and Olsen (2006) where specific details may be found. However, a synopsis of this parameterization is given in Table 1, where the spherical harmonic (SH) expansions begin at degree 1 and are truncated at maximum degree/order N_{max}/M_{max}.
Because the induced field associated with the magnetosphere is modeled as an internal field to full degree/order 3, it can mimic the internal core field as its bin duration decreases. As a consequence, separation of induced fields at periods longer than, say, a few months from rapid core field changes is not possible. This happens even though the induced field is expressed in dipole SH functions while the core field is expressed in geographic SH functions. This means that co-linearity potentially exists between the parameters of these two fields. To alleviate this, the basic CI algorithm employs equality constraints that force point-wise orthogonality over the measurement times between any linear functionals of the core and induced SH parameters. This will be derived again in this paper for clarity. Let

$$\zeta_{c}(t, \mathbf{r}) = \boldsymbol{\ell}_{c}(\mathbf{r})^{T}\mathbf{p}_{c}(t) = \boldsymbol{\ell}_{c}(\mathbf{r})^{T}\mathbf{P}_{c}\mathbf{b}_{c}(t), \tag{1}$$

$$\zeta_{i}(t, \mathbf{r}) = \boldsymbol{\ell}_{i}(\mathbf{r})^{T}\mathbf{p}_{i}(t) = \boldsymbol{\ell}_{i}(\mathbf{r})^{T}\mathbf{P}_{i}\mathbf{b}_{i}(t), \tag{2}$$
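where the subscripts “c” and “i” indicate core and induced fields, respectively, ζ(t, r) is the linear functional at time t and position r, ℓ(r) is the SH linear mapping at position r, p(t) is the vector of SH coefficients at time t, P is the matrix of temporal coefficients for each SH coefficient such that (P)_{ ij } is the j-th temporal coefficient for the i-th SH coefficient, and b(t) is the vector of temporal basis functions at time t. Specifically, the first element of b_{ c }(t) is a one (for the static term) followed by a series of B-spline terms defined over the time domain of the model, while b_{i}(t) is of length N_{b} (the number of time bins for the induced field) with a one in the position of the bin where t resides and zeros elsewhere. The objective is to make the following summation vanish over the set of measurement times t_{j}

$$\sum_{j}\zeta_{c}(t_{j}, \mathbf{r})\,\zeta_{i}(t_{j}, \mathbf{r}') = \boldsymbol{\ell}_{c}(\mathbf{r})^{T}\mathbf{P}_{c}\Big[\sum_{j}\mathbf{b}_{c}(t_{j})\mathbf{b}_{i}(t_{j})^{T}\Big]\mathbf{P}_{i}^{T}\boldsymbol{\ell}_{i}(\mathbf{r}') = \boldsymbol{\ell}_{c}(\mathbf{r})^{T}\mathbf{P}_{c}\mathbf{G}\mathbf{P}_{i}^{T}\boldsymbol{\ell}_{i}(\mathbf{r}'), \tag{3}$$

where B_{c} and B_{i} are the matrices whose j-th columns are b_{c}(t_{j}) and b_{i}(t_{j}), respectively, and G = B_{c}B_{i}^{T}. This leads to the desired condition

$$\mathbf{G}\mathbf{P}_{i}^{T} = \mathbf{0}. \tag{4}$$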
A more convenient form is realized by stacking the N_{ i } columns of P_{i}^{T} (or the N_{ i } rows of P_{i}) to make a vector ι, which itself is a sub-vector of the parameter vector for the entire model x, so that
If b_{c}(t) is of length N_{ c }, then the matrix multiplying x in Eq. (5) is a sparse matrix containing the appropriate pattern of G such that its N_{c} ∙ N_{i} rows enforce a like number of constraints specified in Eq. (4) when multiplying x. Therefore, it is Eq. (5) that is used to constrain the least-squares solution so that the induced magnetospheric field is subjugated to the core field to be point-wise orthogonal to it over the measurement times. Because the core spline basis is broad-scale in time, this means that ι will reflect more high-frequency behavior in the internal field, which is reasonable for much of the induced effects.
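The stacking step can be checked numerically. The following is a minimal sketch under stated assumptions: the sizes are small and hypothetical, random numbers stand in for the B-spline basis, G is taken as the core basis matrix times the transpose of the induced basis matrix (the only dimensionally consistent reading), and the Kronecker-product construction and the names `Gamma`, `B_c`, `B_i`, `iota` are illustrative rather than the CI implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N_c, N_b, N_i, N_t = 4, 3, 5, 50   # hypothetical sizes (N_t = number of measurement times)

# Columns hold the temporal basis vectors at each measurement time t_j.
B_c = rng.standard_normal((N_c, N_t))      # stand-in for the core spline basis
bins = rng.integers(0, N_b, size=N_t)      # bin index containing each t_j
B_i = np.zeros((N_b, N_t))
B_i[bins, np.arange(N_t)] = 1.0            # one in the active bin, zeros elsewhere

G = B_c @ B_i.T                            # N_c x N_b

P_i = rng.standard_normal((N_i, N_b))      # temporal coefficients of the induced field
iota = P_i.reshape(-1)                     # stacked rows of P_i (columns of P_i^T)

# Block-diagonal pattern of G: one copy of G per induced SH coefficient.
Gamma = np.kron(np.eye(N_i), G)            # (N_c * N_i) x (N_b * N_i)

lhs = Gamma @ iota                         # constraint residual in stacked form
rhs = (G @ P_i.T).reshape(-1, order="F")   # column-stacked G P_i^T
```

In this layout `Gamma @ iota` equals the column-stacked entries of G P_i^T, so requiring it to vanish imposes N_c ∙ N_i scalar constraints, and `Gamma` is mostly zeros, which is why a sparse representation would be natural in practice.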
The result is that the basic CI algorithm solves the following least-squares problem with linear equality constraints (LSLE) at the k-th GN step (LSLE-GN)
where ‖ ∙ ‖_{2} is the ℓ_{2} norm, Δd_{k} ≡ Δd(x_{ k }) = d − a(x_{ k }) are the residuals of the data vector d with respect to the non-linear model vector a(x_{ k }) evaluated at x_{ k }, A_{ k } ≡ A(x_{ k }) is the Jacobian of the model vector evaluated at x_{ k }, Δx_{ k } are the adjustments to the current parameter vector x_{ k }, L is the square-root factor of the data noise covariance matrix C = LL^{T}, and F_{ j } is the square-root factor of the j-th a priori covariance matrix that, along with the Lagrange multiplier λ_{ j }, specifies the deviation of the solution from the j-th preferred a priori model vector. The solution to LSLE-GN may be found through Lagrange multiplier theory (Toutenburg, 1982; Golub and van Loan, 1989; Bertsekas, 1999) to be
where , and is the unconstrained solution given by
Note that only the linear LSLE-GN was discussed in Sabaka and Olsen (2006). In practice, the N_{ q } quadratic norms in Eq. (6) are smoothing terms whose preferred solutions are zero. These smoothing norms affect the secular variation of the core field, conductivity structure of the ionospheric E-region, polar gaps, etc., and will be discussed briefly in Section 5.4.
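The structure of a constrained solution written as a correction to the unconstrained one can be illustrated with a small example. This is a simplified sketch, not the LSLE-GN step itself: it assumes a single linear step, an identity data covariance, and no smoothing norms, and the function name `constrained_lstsq` and the constraint matrix `Gamma` are hypothetical. The result is cross-checked against a direct solve of the Lagrange-multiplier (KKT) system.

```python
import numpy as np

def constrained_lstsq(A, d, Gamma):
    """Minimize ||d - A x||_2 subject to Gamma x = 0.

    Returns the constrained solution and the unconstrained one.
    """
    N = A.T @ A                                    # normal matrix
    x_u = np.linalg.solve(N, A.T @ d)              # unconstrained solution
    N_inv_Gt = np.linalg.solve(N, Gamma.T)         # N^{-1} Gamma^T
    # Lagrange multipliers that remove the constraint violation of x_u
    mult = np.linalg.solve(Gamma @ N_inv_Gt, Gamma @ x_u)
    return x_u - N_inv_Gt @ mult, x_u

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 6))
d = rng.standard_normal(30)
Gamma = rng.standard_normal((2, 6))                # two linear equality constraints

x_c, x_u = constrained_lstsq(A, d, Gamma)

# Cross-check: solve the full KKT system for (x, multipliers) in one go.
K = np.block([[A.T @ A, Gamma.T],
              [Gamma, np.zeros((2, 2))]])
x_kkt = np.linalg.solve(K, np.concatenate([A.T @ d, np.zeros(2)]))[:6]
```

The correction form reuses the factorization of the unconstrained normal matrix, which is convenient when the number of constraints is small relative to the number of parameters.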
What is not included in the basic CI is a mechanism to exploit the enhanced lithospheric SNR in the differences of the vector measurements made by the Swarm low satellite pair. It is more complicated than simply using only the low pair vector differences to make models since the complementary data set (the low pair vector sums) is needed in order to resolve other field constituents. In the next section such a mechanism is indeed developed. The application of this mechanism to Swarm will be discussed in Section 5.
3. Selective Infinite Variance Weighting
The near-Earth magnetic field is a highly dynamic system containing signals that vary over a large range of spatial and temporal scales. Even a constellation like Swarm cannot completely decouple all of the space and time modes. A prominent example is the mis-modeling of time-varying external fields which can manifest itself as spurious static structure, for instance, in the lithospheric field. While this structure may be spatially broad-scale, it can vary rapidly in time and represents a systematic noise bias that can contaminate the nominal estimate of the lithospheric field. Different philosophies exist on how to enhance the recovery of signals of interest, like the lithosphere, while mitigating the effects of unwanted or contaminating signals in the estimation procedure. Several recent models have employed direct data selection techniques to derive good descriptions of core and crustal fields (Maus et al., 2007, 2008; Thomson and Lesur, 2007; Lesur et al, 2008; Olsen et al, 2011). The “Comprehensive Modelling” (CM) approach (Sabaka et al., 2002, 2004) has not generally relied upon this practice, with the exception of gross outliers, etc., but rather has focused on using as much data as possible to ensure a stable co-estimation of parameters from all sources. Comparison of the CMs with these models, however, has revealed effects of contamination, particularly of ionospheric “leakage” into lithospheric fields.
The Swarm constellation offers the ability of taking differences between the vector magnetometer measurements of the satellite low pair, which effectively eliminates most of the broad-scale external contamination, thus leading to a high SNR of the crustal field signal. However, the complementary data set, i.e., the summation of the vector magnetometer measurements of the satellite low pair, is necessary for a full co-estimation of the other field constituents, but it has a much lower SNR for its lithospheric field because it suffers from the aforementioned systematic noise bias. With this in mind, a more sophisticated error treatment will be required of the CI models in order to exploit the high SNR in one data set while limiting damage done by retaining low SNR data. The SIVW mechanism is now developed by first showing its ability to eliminate bias from systematic noise of a particularly common form that would otherwise contaminate estimates if treated in traditional ways, and secondly, how to combine this property with selection of data subsets that exhibit different SNRs for different parameter sets in order to obtain optimal solutions.
3.1 Mitigating biases
This challenge can be stated mathematically by how best to handle additive systematic error terms so as to not bias the estimation of signals of interest. The name “systematic” is used here to describe a vector error term that has the form z = By, where B is a matrix having more rows than columns and y is a vector of Gaussian random errors having a mean vector μ and covariance matrix Q, indicated by y ∼ N(μ, Q). Thus, z represents noise that cannot be reduced to arbitrary levels by repeated experiments because it has a non-zero mean, but can be eliminated in certain subspaces due to the fact that q = dim(z) > p = dim(y).
Consider the following model
in which a vector of parameters of interest x is related to a vector of measurements d through a linear operator A in the presence of an additive error term ν = z + η, where z is the systematic error defined previously and η is a random error independent of z such that η ∼ N(0, C). If a zero-mean assumption is made about ν, then the data noise weight matrix will be given by
and the least-squares estimate of x will be
but this is a biased estimate since E[d] = Ax + Bμ, so that the expectation of the estimate is x + (A^{T}WA)^{−1}A^{T}WBμ, where E[∙] is the expectation operator.
Now consider the case in which y is treated as deterministic rather than stochastic and is co-estimated with x giving
The partitioned solution for x may be written as
where
Thus, W_{∞}B = 0 and the estimate is now unbiased since . In this case, y is treated as a vector of “nuisance” parameters that are co-estimated with the nominal parameters x in order to absorb error biases. If Q is rewritten as and W in Eq. (11) is expanded by the Sherman-Morrison-Woodbury formula (Toutenburg, 1982), then
that is, it is the limit of W as the variance of the systematic noise goes to infinity, and hence the name “infinite variance weighting”. Thus, the least-squares estimate of x in Eq. (9) given by Eq. (14) with explicit use of W_{∞} given in Eq. (17) is a form that will mitigate the effects of systematic errors like z (e.g., time-varying external field leakage into the lithospheric field). However, because W_{∞} can be large and dense, the least-squares estimates of x and y in Eq. (10) given by Eq. (13), where y are treated as nuisance parameters, yields the same solution for x while using the simpler noise covariance matrix C and is the preferred form used in the CI approach.
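The equivalences described above can be verified on a toy problem. The sketch below is illustrative only: the dimensions, the random matrices, and the helper `wls` are hypothetical, the weight matrices are written in their standard weighted least-squares forms (an assumption, since the displayed equations are not reproduced here), and the data contain only signal plus a realization of the systematic term so that any bias is visible directly.

```python
import numpy as np

rng = np.random.default_rng(2)
q, n, p = 40, 3, 2                                  # data, nominal, nuisance dimensions
A = rng.standard_normal((q, n))
B = rng.standard_normal((q, p))
C = np.diag(rng.uniform(0.5, 2.0, q))               # covariance of the random error
x_true = np.array([1.0, -2.0, 0.5])
mu = np.array([5.0, -3.0])                          # non-zero mean of y
d = A @ x_true + B @ mu                             # signal plus systematic term

def wls(A, W, d):
    """Weighted least-squares estimate (A^T W A)^{-1} A^T W d."""
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ d)

C_inv = np.linalg.inv(C)
# Infinite-variance weight: zero weight on the column space of B.
W_inf = C_inv - C_inv @ B @ np.linalg.solve(B.T @ C_inv @ B, B.T @ C_inv)

x_inf = wls(A, W_inf, d)                            # explicit W_inf
x_co = wls(np.hstack([A, B]), C_inv, d)[:n]         # co-estimate y as nuisance parameters
x_big = wls(A, np.linalg.inv(1e6 * B @ B.T + C), d) # large but finite variance, Q = 1e6 I
x_zero_mean = wls(A, np.linalg.inv(B @ B.T + C), d) # zero-mean assumption with Q = I
```

Here `x_inf` and `x_co` both recover `x_true`, `x_big` approaches them as the assumed variance grows, and `x_zero_mean` is offset by the bias term. Co-estimation only needs the inverse of C, which in this example is diagonal, whereas `W_inf` is dense.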
There are two interesting properties about the estimate given in Eq. (14) that should be mentioned. First, there is no need to actually specify Q, that is, the covariance of the systematic noise term. The weight matrix W_{∞} gives zero weights to the directions defined by the column space of B rendering a specification of Q completely unnecessary. It is also interesting to note that W_{∞} not only eliminates Bμ in the mean, but also individual realizations of the systematic error z.
The second property becomes apparent by first letting C = LL^{T} be the Cholesky factorization of C, which must exist assuming C is full-rank. Now, rewrite Eq. (20) such that
where U_{ N } is a matrix whose orthogonal columns span the null-space of the columns of L^{−1}B, and . This means that Eq. (14) is now the least-squares solution to
but this system consists of only q − p equations; the p-dimensional subspace where the columns of B, and hence z, reside has been eliminated. Therefore, the action of W_{∞} can also be interpreted as a selection mechanism that only admits “data” that are not contaminated by z. Note that if q = p, that is, if B is a square full-rank matrix, then the entire data set is eliminated from consideration.
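The interpretation of W_{∞} as a data selection operator can also be checked numerically. In this sketch the sizes and matrices are hypothetical, and the null-space basis is obtained from a complete QR factorization, which is one convenient choice among several. It forms the (q − p) × q operator U_N^{T}L^{−1} and confirms that it annihilates B and reproduces W_{∞}.

```python
import numpy as np

rng = np.random.default_rng(3)
q, p = 12, 3
B = rng.standard_normal((q, p))
M = rng.standard_normal((q, q))
C = M @ M.T + q * np.eye(q)                 # full-rank noise covariance
L = np.linalg.cholesky(C)                   # C = L L^T

LiB = np.linalg.solve(L, B)                 # L^{-1} B
Q_full, _ = np.linalg.qr(LiB, mode="complete")
U_N = Q_full[:, p:]                         # orthonormal columns orthogonal to L^{-1} B

T = U_N.T @ np.linalg.inv(L)                # (q - p) x q selection operator
W_from_null = T.T @ T                       # L^{-T} U_N U_N^T L^{-1}

C_inv = np.linalg.inv(C)
W_inf = C_inv - C_inv @ B @ np.linalg.solve(B.T @ C_inv @ B, B.T @ C_inv)
```

Applying `T` to both sides of the observation equations leaves q − p transformed equations with unit noise covariance, none of which contain the systematic term.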
3.2 Subset selection
The idea of solving for sets of nominal and nuisance parameters from different subsets of data depending upon their SNRs in a hierarchical framework is the basis for the “selective” part of the method. Consider again Eq. (10), but now partition the data into two subsets and let x_{1} and x_{2} be vectors of parameters of interest where x_{2} is related to the data through the matrix B such that
where ν_{2} = B_{2}y + η_{2} is systematic with independent random errors , and such that . Taking the infinite variance limit on Q to mitigate the bias in y leads to the data weight matrix
which can be used in the usual weighted least-squares solution. However, this is completely equivalent to solving the following system
with data weight matrix
Solving Eq. (27) with Eq. (28) is preferable to solving Eq. (25) with Eq. (26) because W is typically much less dense than W_{∞}.
A useful property may now be derived by performing the following parameter transformation on the system in Eq. (27) such that
where I are appropriately dimensioned identity matrices. Because the parameter transformation is full-rank, the least-squares solution of Eq. (30) with Eq. (28) is equivalent to that of Eq. (27) with Eq. (28) in the nominal parameters and , but is equal to the sum in the nuisance parameters. The advantage of solving Eq. (30) over (27) is that the Jacobian of the former is more sparse. Again, the use of either Eq. (27) or (30) with the weight matrix defined in Eq. (28) in the CI is preferable to the usual weighted least-squares solution using the weight matrix defined in Eq. (26).
Clearly a hierarchy of nominal/nuisance parameter combinations could be distributed over the observation equations in order to mitigate systematic errors. Obviously, in extreme cases where a data subset is contaminated by such errors in all parameter subsets, it would be wise to simply eliminate those data.
4. Attitude Error
The observation equations relating spherical harmonic coefficients in a local spherical system (North, East, Center or NEC (Olsen et al., 2013)) to the VFM will rely upon coordinate transformations provided by star-imagers (STRs). As such, these transformations will be degraded by random errors due to physical limitations of the STRs and should therefore be accounted for in the error analysis of the estimators. In a series of papers (Holme and Bloxham, 1995, 1996; Holme, 2000) a mechanism was developed in order to account for these errors that will henceforth be referred to as “HB theory”. This theory has been used successfully in such models as the Oersted Initial Field Model (OIFM) (Olsen et al., 2000) and Comprehensive Model-4 (CM4) (Sabaka et al., 2004) for instance. However, it turns out that simplifying assumptions have been made in the HB theory that render the forms used in these models (equations (13) and (18) of Holme and Bloxham (1996)) less suitable for describing the actual attitude error encountered. While Holme and Bloxham (1996) provide a form that is technically always applicable (their equation (20)), the quantities involved are non-intuitive and require prior knowledge of the eigen-structure of the attitude covariance; something that is not obvious, even in isotropic error cases. In this section an attempt is made to remedy the situation by applying a slightly more generalized treatment to attitude errors in the CI algorithm.
4.1 Covariance under general, finite rotations
Consider the case of a compound rotation matrix R representing successive rotations about three normalized axes , and ŵ such that
where a general, elemental rotation matrix describing a positive rotation of angle Ψ about the axis ê is given by (Wertz and Larson, 1999)
and a general cross-product matrix E_{ u } of u is given by
The goal is then to derive the covariance matrix of a vector B_{2} due to random perturbations about finite, non-zero angles of the rotation matrix R in Eq. (31) such that
This is derived in Appendix A and is given by
where
with the vector of random angular perturbations . The vectors , and are the vectors rotated into reference frame “2”, respectively, as will be shown in Appendix A.
4.2 Covariance under HB theory assumptions
The HB theory essentially considers the case of zero-mean random perturbations about a set of orthogonal axes , which is to say χ = δ = λ = 0. This is equivalent to having
such that
and is shown in Appendix A to reduce to the various covariance matrices derived in the HB theory.
It turns out that the covariance matrix corresponding to the “no equal variances” case in Eq. (A.24) of Appendix A is a general expression that is always true and is equivalent to setting the three rotation axes and corresponding angular variances equal to the eigenvectors and corresponding eigenvalues, respectively, of the symmetric positive semi-definite matrix AC_{a}A^{T} in Eq. (35). The “three variances equal” case in Eq. (A.22) and “two variances equal” case in Eq. (A.23) of Appendix A are true when either all three or just two out of three eigenvalues are equal, respectively.
4.3 Inertial to Common Reference Frame transformations
The STR reference system, or in the case of multiple-head STRs the CRF, is often constructed so that the Z-axis points along the bore-sight direction in the case of a single STR or some average direction in the case of the CRF while the X and Y axes lie in the plane perpendicular to this, e.g., in the plane of a charge-coupled device (CCD) in the case of a single STR. What is usually provided are rotations from the CRF to some inertial reference system like J2000. This coordinate system is right-handed with the X-axis directed towards the mean vernal equinox at noon on January 1, 2000, and whose Z-axis points along the Earth’s rotation axis in the northern hemisphere. This naturally appears to be a compound rotation of the form
such that
where δ and λ are the colatitude and longitude of the J2000 Z-axis in the CRF, and χ is the rotation around the bore-sight axis. With this definition, the R matrix can be written as
where the notation “C_{[∙]}” and “S_{[∙]}” denote cos(∙) and sin(∙), respectively. From Eq. (42) it can be seen that
The corresponding A matrix is then
Although the rotation axes are normalized, they only form an orthonormal set when δ = π/2.
Uncertainties in the STR or CRF are usually quoted in terms of errors in the angles comprising the rotation matrices, such as errors in bore-sight pointing angles δ and λ and errors in rotation angles χ about bore-sights. If these angles are finite, then one begins to see the pitfalls in using the HB theory expressions because the columns of the matrix A in Eq. (44) are rarely orthonormal. This means that one has to compute the eigen-decomposition of AC_{a}A^{T} in order to use the “no equal variance” or general formula of the HB theory (equation (20) of Holme and Bloxham (1996)). If two or all three of the eigenvalues are equal, then one can use the more specialized HB formulas (equations (18) and (13) of Holme and Bloxham (1996), respectively), but this is also likely a very rare event. Consequently, while the HB theory general formula is always available, it is non-intuitive because an eigen-decomposition must be computed before it can be applied. Alternatively, the general covariance formula in Eq. (35) follows directly from the statements of error that are presumed to be provided for the STR or CRF and is thus very intuitive.
4.4 Application to CHAMP transformations
To test the accuracy of the attitude covariances derived from the CI versus HB theory in realistic cases, they were computed and compared with covariances generated via Monte Carlo simulation for 3217 actual quaternions, and corresponding B_{J2000}, describing the rotation in Eq. (39) for a set of CHAMP satellite data used in the derivation of the CHAOS-3 model (Olsen et al, 2010). Each quaternion was first expressed as a rotation matrix in the form of Eq. (42) and then the angles χ, δ, and λ were extracted via Eq. (43). These angles were then perturbed by zero-mean Gaussian noise and used in Eq. (39) to produce N = 1000 samples of (B_{crf})_{ j }, j = 1, …, N. Because these quaternions were selected only during times when both heads of the dual-head STR were in operation, the standard deviations of all angular perturbations were set to σ = 10 arcsecs. The Monte Carlo estimate of the covariance is then given by
where
The covariances from CI, denoted C_{CI}, were computed for each case by using Eq. (35) with A from Eq. (44) and C_{a} = σ^{2}I. For HB theory, they were computed for each case using their isotropic attitude error formula (Holme and Bloxham, 1996)

$$\mathbf{C}_{HB} = \sigma^{2}\left(B_{CRF}^{2}\,\mathbf{I} - \mathbf{B}_{CRF}\mathbf{B}_{CRF}^{T}\right),$$

where the scalar B_{CRF} = |B_{CRF}| is the magnitude of the vector B_{CRF}. This formula was chosen because it appears, for instance, that Holme (2000) would advocate its use in this case. In both the CI and HB cases, B_{CRF} was computed from Eq. (39) using the actual values of R from Eq. (42) and B_{J2000}.
The comparison is shown graphically in Fig. 1 where there are three panels such that the top shows the six independent elements of C_{MC} on the vertical axis for each of the 3217 CHAMP rotation cases on the horizontal axis, but sorted by their cos δ value. Recall from Eq. (44) that the columns of A become orthonormal when δ = π/2 or cos δ = 0. The six matrix elements are in the order of the first column, followed by the bottom two elements of the second column, followed by the lower-right corner element. The middle panel is the same except that the matrix is now C_{HB} − C_{MC}, and the last panel is also the same except the matrix is now C_{CI} − C_{MC}. The scales are equivalent in all three panels. Clearly there is a high level of agreement between C_{CI} and C_{MC} for all rotation cases, but much less agreement between C_{HB} and C_{MC}, except in the vicinity of cos δ = 0. While agreement would be expected near cos δ = 0 in the isotropic case, it is not clear that agreement would occur in the anisotropic case, even though A has orthonormal columns. The maximum absolute deviation from C_{MC} is 5.77 nT^{2} for C_{HB} and it occurs in element (2, 2) when cos δ = 0.99359. For C_{CI} this number is 0.35 nT^{2} and it also occurs in element (2, 2) when cos δ = −0.99953. Figure 1 shows that many of the C_{HB} − C_{MC} values are larger than the corresponding values of C_{MC}, particularly in the (2, 2) elements (matrix element 4 on the vertical axis of Fig. 1). While the use of C_{HB} has proven beneficial in magnetic field modeling to date, the use of C_{CI} could be a significant improvement, especially as models attempt to describe finer details of Earth’s magnetic field.
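A comparison of this kind can be reproduced in miniature. The sketch below is not the CHAMP experiment: it uses a single hypothetical rotation and field vector, assumes a z-y-z ordering of the angles (χ, δ, λ) with one particular sign convention, and replaces the closed-form covariance of Eq. (35) with a first-order propagation through a finite-difference Jacobian. It then compares both that linearized covariance and the isotropic formula against a Monte Carlo estimate for a case where δ is far from π/2.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def rotate(angles, b):
    """Apply the assumed z-y-z compound rotation for angles (chi, delta, lam)."""
    chi, delta, lam = angles
    return rot_z(chi) @ rot_y(delta) @ rot_z(lam) @ b

def linearized_cov(angles, b, C_a, h=1e-6):
    """First-order covariance J C_a J^T with a central-difference Jacobian J."""
    J = np.empty((3, 3))
    for k in range(3):
        e = np.zeros(3)
        e[k] = h
        J[:, k] = (rotate(angles + e, b) - rotate(angles - e, b)) / (2.0 * h)
    return J @ C_a @ J.T

rng = np.random.default_rng(4)
sigma = np.deg2rad(10.0 / 3600.0)               # 10 arcsec, as in the text
angles = np.array([0.7, 0.1, -1.2])             # hypothetical; delta far from pi/2
b_in = np.array([20000.0, -5000.0, 40000.0])    # hypothetical inertial-frame vector (nT)
C_a = sigma**2 * np.eye(3)

N = 20000                                       # Monte Carlo sample size
samples = np.array([rotate(angles + sigma * rng.standard_normal(3), b_in)
                    for _ in range(N)])
C_mc = np.cov(samples, rowvar=False)            # sample covariance (N - 1 normalization)

C_lin = linearized_cov(angles, b_in, C_a)
b_out = rotate(angles, b_in)
C_iso = sigma**2 * ((b_out @ b_out) * np.eye(3) - np.outer(b_out, b_out))

err_lin = np.abs(C_lin - C_mc).max()            # linearized vs Monte Carlo
err_iso = np.abs(C_iso - C_mc).max()            # isotropic formula vs Monte Carlo
```

With δ this small, two of the three rotation axes are nearly parallel, so the effective angular error is strongly anisotropic and the isotropic formula departs from the Monte Carlo estimate by far more than the linearized propagation does.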
5. Development of the Advanced CI Algorithm
An advanced CI algorithm will now be built upon the foundation outlined in Section 2 with improvements from SIVW in Section 3 and the revised treatment of vector magnetometer attitude errors in Section 4. The modified parameterization will be described as well as how Swarm gradient information is to be exploited for improved lithospheric recovery, which entails a more sophisticated error analysis than was used in the basic algorithm.
5.1 Parameterization
The parameterization for the advanced algorithm is listed in Table 2 and is similar to the basic algorithm in the core field except for splines with higher temporal resolution, that is, higher order and knot density. The lithosphere is now split into two parameter types, “nominal” and “nuisance”, that will be discussed in the next section. The magnetosphere is the same as in the basic algorithm while the ionosphere is different in that its a priori conductivity model now has 3D structure (Kuvshinov, 2011). If ϵ(ω) and ι(ω) are the vectors of SH coefficients for the inducing and induced ionospheric fields, respectively, at frequency ω, then the a priori coupling via the conductivity model is manifested in the relationship ι(ω) = Q(ω)ϵ(ω), where Q(ω) is the coupling matrix at frequency ω. In a 1D treatment, as in Sabaka et al. (2002, 2004) and Sabaka and Olsen (2006), Q(ω) is diagonal and square and its elements are only dependent upon SH degree. In the full 3D treatment, Q(ω) is a dense, generally rectangular matrix allowing for very complicated induced structure to result from relatively smooth inducing structure. Therefore, the change from 1D to 3D comes from simply using a different set of Q(ω). In addition, toroidal magnetic fields due to meridional currents that exist within the satellite sampling shells are also modeled in the advanced CI algorithm. These follow the parameterization of CM4 (Sabaka et al., 2004). Finally, because OHM data are now processed in the advanced algorithm, static vector biases are now included in the parameter set for each observatory in order to absorb effects such as local crustal anomalies (Sabaka et al., 2002, 2004).
The truly new parameters are those that describe the magnetometer alignment, that is, the rotation of the vector magnetometer measurement B_{VFM} in the VFM frame to CRF
For Swarm, this rotation is parameterized by a set of 3 positive counter-clockwise Euler angles of type (XYZ) for each satellite such that (Olsen et al., 2013)
In the advanced CI algorithm the observation equations involving vector magnetometer measurements are expressed in the CRF. If the model parameter vector at the k-th GN step x_{ k } is split into two subsets, the “geophysical” parameters in vector z_{ k } and the Euler parameters for a particular satellite in vector e_{ k }, then for the i-th vector measurement of that satellite, the observation equation is
where η_{ i } is the error vector, and g_{ i }(z_{ k }) and a_{ i }(x_{ k }) are the geophysical and total model vectors in the CRF, respectively. The reason for solving in the CRF rather than the VFM frame is to decouple the product that exists in the latter system, thus decreasing the level of non-linearity in the estimation process. Recalling Eq. (6), it can be seen that Eq. (51) is in a form that is equivalent to having d_{ i } = 0.
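The idea of estimating alignment angles with Gauss-Newton steps on residuals whose "data" are zero can be sketched as follows. Several simplifications are assumed and should not be read as the CI implementation: the geophysical model vectors are held fixed and known rather than co-estimated, the data are noise-free and synthetic, the Jacobian is formed by finite differences, and the ordering and sign conventions in `euler_xyz` are one plausible reading of a type (XYZ) rotation. All function names are hypothetical.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def euler_xyz(e):
    """Assumed convention: rotate about X, then Y, then Z."""
    return rot_z(e[2]) @ rot_y(e[1]) @ rot_x(e[0])

def residuals(e, B_vfm, g_crf):
    """Stacked residuals R(e) B_vfm_i - g_i, all expressed in the CRF."""
    return (B_vfm @ euler_xyz(e).T - g_crf).ravel()

def gauss_newton(e0, B_vfm, g_crf, n_iter=10, h=1e-7):
    e = e0.copy()
    for _ in range(n_iter):
        r = residuals(e, B_vfm, g_crf)
        J = np.empty((r.size, 3))
        for k in range(3):
            de = np.zeros(3)
            de[k] = h
            J[:, k] = (residuals(e + de, B_vfm, g_crf)
                       - residuals(e - de, B_vfm, g_crf)) / (2.0 * h)
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)   # linearized least-squares step
        e = e + step
    return e

rng = np.random.default_rng(5)
e_true = np.deg2rad([0.5, -0.3, 0.8])              # hypothetical misalignment angles
g_crf = 40000.0 * rng.standard_normal((200, 3))    # synthetic model vectors in the CRF (nT)
B_vfm = g_crf @ euler_xyz(e_true)                  # rows are R^T g_i, i.e. VFM-frame vectors

e_hat = gauss_newton(np.zeros(3), B_vfm, g_crf)
```

Because the residual is linear in the model vectors once everything is expressed in the CRF, the only non-linearity left in this toy problem comes from the rotation angles themselves, and the iteration converges in a few steps from a zero starting guess.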
Note that while the magnetospheric and associated induced field parameters described so far are estimated by iteratively solving LSLE-GN in Eq. (6) using Eqs. (7) and (8), they do not represent the final Level-2 product MMA_SHA_2 because they are only estimated during geomagnetic quiet times. Rather, they provide a crucial step in the generation of these products, which is elaborated upon further in Section 7.5. This is the reason for using the term “precursor” in Table 2.
5.2 Exploiting Swarm gradient information
One of the great advantages of the Swarm constellation is that the low satellite pair have orbits that differ only by 1.4° in the values of their Right Ascension of the Ascending Node (RAAN), thus allowing for east-west gradiometry to be carried out at low-mid latitudes. Let the Swarm low pair, denoted “A” and “B”, be at positions (r, θ, ϕ) and (r, θ, ϕ + Δϕ), respectively, where r, θ, and ϕ are the radius, colatitude and longitude, respectively, and Δϕ is a longitude increment. Assume that they provide vector measurements B_{ECEF}(r, θ, ϕ) and B_{ECEF}(r, θ, ϕ + Δϕ) that have been rotated into the Earth Centered Earth Fixed (ECEF) frame, where the z axis points to the north geographic pole, the x axis points along the prime meridian, and the y axis completes the right-handed system. If these vectors are further rotated into the local spherical NEC frame at the midpoint of the satellite positions, then for small Δϕ certain components of their difference behave as a negative gradient of a potential function whose SH coefficients are multiplied by a gain factor of approximately
as compared to the potential coefficients leading to the individual field measurements. These components correspond to the direction of the ECEF z axis in the NEC frame and the direction of the average of the two measurement vectors in the NEC frame. If these two directions are coincident, then all components will exhibit this gain enhancement. It can be seen that and that within the range 0 ≤ m ≤ 150, the maximum gain for vector difference measurements is found at approximately m = 129. Conversely, certain components of their sum behave as a negative gradient of a potential function whose SH coefficients are multiplied by a gain factor of approximately
These components correspond to the direction of the ECEF z axis in the NEC frame and the direction of the difference of the two measurement vectors in the NEC frame. Again, if these two directions are coincident, then all components will exhibit this gain. Note that these gain factors are out of phase such that . The gain factors are shown in Fig. 2 for the order range of the core and crustal fields and are derived in Appendix B.
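The order dependence of the two gain factors can be illustrated numerically. The sketch below assumes that, up to a constant amplitude that is not reproduced here, the difference gain varies as |sin(mΔϕ/2)| and the sum gain as |cos(mΔϕ/2)|, which is what differencing or summing two cos(mϕ) terms separated by Δϕ produces. The function names are hypothetical, and only Δϕ = 1.4° and the range 0 ≤ m ≤ 150 are taken from the text.

```python
import math

DELTA_PHI_DEG = 1.4  # longitude separation of the low pair, from the text


def difference_gain(m, delta_phi_deg=DELTA_PHI_DEG):
    """Assumed shape of the difference gain (amplitude factor omitted)."""
    return abs(math.sin(m * math.radians(delta_phi_deg) / 2.0))


def sum_gain(m, delta_phi_deg=DELTA_PHI_DEG):
    """Assumed shape of the sum gain (amplitude factor omitted)."""
    return abs(math.cos(m * math.radians(delta_phi_deg) / 2.0))


def peak_order(gain, m_max=150):
    """Return the SH order in 0..m_max at which the given gain is largest."""
    return max(range(m_max + 1), key=gain)


if __name__ == "__main__":
    print("difference gain peaks at m =", peak_order(difference_gain))
    print("sum gain peaks at m =", peak_order(sum_gain))
```

Under this assumed shape the difference gain vanishes for zonal terms and peaks where mΔϕ/2 is closest to 90°, while the sum gain is largest at m = 0 and smallest where the difference gain is largest.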
If one were only interested in recovering high degree/order lithospheric signals, then based on the gain factors one might naively exclude the vector summation data and focus only on the vector differences. However, the summation data are critical for determining broad-scale, highly time varying fields such as the magnetospheric and the high-frequency induced fields, which if not properly modeled can cause spurious signals that mimic lithospheric signal. This strongly suggests using the SIVW mechanism in order to preserve the vector summation data, but account for systematic bias in its high degree/order lithospheric signal that must certainly exist, especially given its low gain factors at high orders. Because the CRFs of Swarm A and B (CRF_{A} and CRF_{B}, respectively) cannot be considered the same, the CI algorithm first rotates the observation equations for each satellite to the local NEC coordinate system at the mid-point between the two satellites before adding and subtracting. Given the rotations from CRF_{A} and CRF_{B} to the mid-point NEC frame, a given pair of vector measurements from Swarm A and B is transformed to differences and sums via the following orthogonal transformation
The covariances are similarly transformed as
where the notation C(∙) is now used to indicate auto or cross-covariance.
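As a rough illustration of the sum and difference construction, the sketch below assumes that both measurement vectors and their covariances have already been rotated into the common mid-point NEC frame, that the two satellites' errors are mutually uncorrelated, and that the orthogonal transformation carries a 1/√2 normalization. These assumptions and the function names are illustrative and do not reproduce Eqs. (54) and (55) themselves.

```python
import math


def sum_difference(b_a, b_b):
    """Form normalized difference and sum of two 3-vectors in a common frame."""
    s = 1.0 / math.sqrt(2.0)  # assumed normalization making the map orthogonal
    d_minus = [s * (a - b) for a, b in zip(b_a, b_b)]
    d_plus = [s * (a + b) for a, b in zip(b_a, b_b)]
    return d_minus, d_plus


def transformed_covariances(c_aa, c_bb):
    """Propagate 3x3 covariances, assuming errors of A and B are uncorrelated.

    Returns (C_minus_minus, C_plus_plus, C_minus_plus).
    """
    c_mm = [[0.5 * (c_aa[i][j] + c_bb[i][j]) for j in range(3)] for i in range(3)]
    c_pp = [row[:] for row in c_mm]  # identical to C_minus_minus in this case
    c_mp = [[0.5 * (c_aa[i][j] - c_bb[i][j]) for j in range(3)] for i in range(3)]
    return c_mm, c_pp, c_mp


if __name__ == "__main__":
    d_minus, d_plus = sum_difference([3.0, 0.0, 0.0], [1.0, 0.0, 0.0])
    print(d_minus, d_plus)
```

Under these assumptions the cross-covariance between differences and sums is half the difference of the two single-satellite covariances, so it vanishes only when the two covariances are equal.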
At this point, the observation equations and covariance of Eqs. (54) and (55) could be formally introduced into the LSLE-GN framework of Eq. (6), with the modification of an additional infinite variance term in the covariance to account for high degree/order lithospheric systematic bias in the vector summations. In practice, however, it is much more feasible to perform this through co-estimation of nuisance parameters as shown in Section 3. This means that while Eq. (6) is strictly followed, Eqs. (7) and (8) are modified to include the crustal nuisance parameters. This essentially modifies the A_{ k } matrix in the previous equations and renders the linearized observation equations at GN step k to be
where the subscript “k” has been suppressed, the subscripts “−”, “+”, “C”, and “OHM” indicate vector differences, summations, Swarm satellite C (the high satellite), and the ground observatories, respectively, and the superscripts “h” and “r” indicate the high degree/order lithospheric field parameters containing systematic bias in the summation data, and the remainder of the parameters, respectively. Likewise, Δx_{h}, Δn_{h}, and Δx_{r} are the vector adjustments to the nominal and nuisance high degree/order lithospheric fields, and the remaining nominal parameters, respectively. Note that because of its high altitude, the measurements from Swarm C are assumed to have a low SNR in the high degree/order lithospheric field, and are therefore eliminated from the nominal model. In the case of the OHM measurements, the static vector biases that are solved for effectively decouple this data from the static lithosphere and so it is not affected by either the nominal or nuisance lithospheric parameters, at least for n > 20. The C matrix in Eqs. (7) and (8) is now that in Eq. (55). However, in the current implementation of the CI algorithm, C_{−+} is ignored. Again, it should be stated that when solving LSLE-GN, only the nominal parameters are used to calculate Δx_{ k } and A_{ k } and are the only parameters updated. The nuisance parameters are only included to expedite the use of the dense SIVW covariance matrix. In this study, the high degree/order lithospheric nuisance field is defined to be in the range n, m > 20, as shown by the green vertical line in Fig. 2, and so does not include the time varying part of the internal field.
5.3 Weighting and robust estimation
The next task is to define C in Eqs. (7) and (8) for each measurement type. For the vector differences and summations, this is commensurate to defining C_{AA} and C_{BB} in Eq. (55). Beginning with the simplest case, the OHM measurement noise covariance is taken to be isotropic, with a variance that is a function of geomagnetic latitude, polar stations having higher variance than lower latitude stations. Thus the noise is treated as isotropic and uncorrelated between vector components and other data. Likewise, satellite scalar measurements are treated as uncorrelated with all other data and are assigned a single scalar variance. For satellite vector measurements, the formalism of Section 4 is employed to account for the CRF attitude error while an additional isotropic term is added to account for instrument noise (Holme, 2000) and is chosen here to match the scalar variance, which is assumed the same for each spacecraft fluxgate magnetometer. Therefore, C_{AA} and C_{BB} are each the sum of an isotropic instrument-noise term and an attitude-error term. Notice that because the attitude error (second term) is a function of B_{crf}(x_{ k }), C_{AA} and C_{BB} change at each GN iteration. Specifically, the attitude-error terms of both are in the form of Eq. (35) under the assumptions of Section 4.3. These vector measurements are also assumed uncorrelated with all other data.
If the linearized model residuals are Gaussian distributed, then a weighted least-squares estimate, which minimizes the ℓ_{2} norm of a vector, as does LSLE-GN, would provide the maximum-likelihood estimate. Since this is rarely the case in real-world problems, the estimator can suffer deleterious effects due to excessive influence of outliers. To combat this, a robust estimation procedure known as “Iteratively Reweighted Least-Squares” (IRLS) with Huber weighting will be employed here (Constable, 1988). This method has been used successfully in such models as CM4 (Sabaka et al., 2004) where details may be found. What is important here is that the IRLS formulation is defined for uncorrelated, scalar measurements. IRLS assigns Huber weights to the i-th measurement at the k-th GN iteration as a function of its standard deviation σ_{ i } and current residual e_{ i, k } according to
where the underlying Huber distribution is defined as having a Gaussian core for ǀe_{ i, k }ǀ ≤ cσ_{ i }, and Laplacian tails (Constable, 1988). A value of c = 1.5 is used by the CI algorithm. All measurement types conform to the structure of Eq. (57) except the vector differences and summations. To rectify this, the Huber weighting is applied to the principal components of the covariance in Eq. (55), which follows readily from the eigendecompositions of C_{AA} and C_{BB}.
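The Huber reweighting can be sketched with the commonly used form of the weights: unity inside the Gaussian core and cσ/|e| in the Laplacian tails. That form is assumed here as the conventional one and is not copied from Eq. (57). The helper `irls_mean` is a hypothetical toy problem (a robust mean of scalar data) meant only to show how the weights are recomputed at each iteration; it is not the CI estimator.

```python
def huber_weight(residual, sigma, c=1.5):
    """Huber weight: 1 in the Gaussian core, c*sigma/|e| in the tails."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    scaled = abs(residual) / sigma
    return 1.0 if scaled <= c else c / scaled


def irls_mean(data, sigma=1.0, c=1.5, n_iter=20):
    """Toy IRLS: robust mean of scalar data with a common standard deviation."""
    estimate = sum(data) / len(data)  # start from the ordinary mean
    for _ in range(n_iter):
        weights = [huber_weight(d - estimate, sigma, c) for d in data]
        estimate = sum(w * d for w, d in zip(weights, data)) / sum(weights)
    return estimate


if __name__ == "__main__":
    samples = [0.0, 0.0, 0.0, 0.0, 100.0]  # one gross outlier
    print("ordinary mean:", sum(samples) / len(samples))
    print("IRLS mean    :", irls_mean(samples))
```

The single outlier dominates the ordinary mean but is strongly downweighted after a few reweighting passes, which is the behavior the Huber weights are meant to provide.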
In Appendix A it is shown that E_{u}u = 0. It follows from Eq. (35) that , which means that exists in a 2-dimensional subspace that is spanned by the columns of a 3 × 2 matrix that are orthogonal to . The eigendecomposition of C_{AA} is then given by
where is the unit vector in the direction of is a 3 × 2 matrix whose columns span the range of , and is the eigendecomposition of where the 2×2 matrix U_{A} is orthogonal and the 2×2 matrix Λ_{A} is diagonal with positive eigenvalue entries. Of course a similar development leading to Eq. (58) applies to C_{BB}. Thus, using Eq. (55) and Eq. (58) the residuals are rotated into the principal axes of the covariance matrix in Eq. (55) where and supply the σ_{ i } needed for determining the Huber weighting in Eq. (57).
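The property E_{u}u = 0 used above can be checked directly. The sketch below assumes the standard skew-symmetric convention, in which E_{u}v equals the cross product u × v, and also verifies the identity E_{u}E_{u}^T = |u|²I − uu^T, which shows why a covariance term built from E_{u} has no variance along u. All function names are illustrative.

```python
def cross_matrix(u):
    """Skew-symmetric matrix E_u such that E_u v = u x v (assumed convention)."""
    ux, uy, uz = u
    return [[0.0, -uz, uy],
            [uz, 0.0, -ux],
            [-uy, ux, 0.0]]


def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def cross(u, v):
    """Ordinary vector cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def outer_form(u):
    """E_u E_u^T, which equals |u|^2 I - u u^T."""
    e = cross_matrix(u)
    return [[sum(e[i][k] * e[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    u = [1.0, 2.0, 3.0]
    print(matvec(cross_matrix(u), u))   # null direction along u
    print(matvec(outer_form(u), u))     # no variance along u
```

Because the null direction is along u, the remaining variance lives in the two-dimensional subspace orthogonal to u, which is the subspace in which the 2 × 2 eigendecomposition is carried out.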
5.4 Regularization
What remains is to define the quadratic norms in Eq. (8) that are used to regularize the system. For the core SV and ionosphere the norms are similar to those used in the CM4 model (Sabaka et al., 2004) and earlier Swarm simulation studies (Sabaka and Olsen, 2006). A combination of the mean-squared magnitude of over the sphere at the core-mantle boundary (CMB) and at Earth’s surface was used to constrain the core SV, while the nightside ionospheric E-region currents were minimized by including a norm that measures the mean-squared magnitude of the E-region equivalent current, J_{eq}, flowing at 110 km over the night-time sector defined as 2100–0500 hrs local time. In addition, these currents are further smoothed by minimizing the mean-squared magnitude of the surface divergence of the diurnally varying portion of J_{eq} at mid-latitudes at all local times.
For this study two additional norms were employed. The first is motivated by the presence of a gap in the coverage of the satellites resulting in a polar cap of a few degrees in half-angle. Because zonal SH terms are most affected by these gaps, a norm which minimizes the square of the magnetic potential of the lithospheric field for degrees n ≥ 60 at both the north and south geographic poles was developed. The final norm minimizes the sum of squared deviations of the Euler angles in each time bin from the average value over the entire mission domain as determined from the current nominal values. This is done separately for each of the three angles.
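The Euler-angle norm described above can be written down directly. The following is a minimal sketch, assuming the angles are stored as one (α, β, γ) triple per time bin and that the mission average is the plain arithmetic mean of the current nominal values; the function name and data layout are hypothetical.

```python
def euler_deviation_norms(angles_per_bin):
    """Sum of squared deviations from the mission mean, one value per angle.

    angles_per_bin: sequence of (alpha, beta, gamma) triples, one per time bin.
    """
    n_bins = len(angles_per_bin)
    norms = []
    for k in range(3):  # each of the three Euler angles is treated separately
        series = [triple[k] for triple in angles_per_bin]
        mean = sum(series) / n_bins
        norms.append(sum((a - mean) ** 2 for a in series))
    return norms


if __name__ == "__main__":
    bins = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0)]
    print(euler_deviation_norms(bins))
```

An angle that is constant across all bins contributes nothing to the norm, so the constraint penalizes only bin-to-bin variation and not the mean alignment itself.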
In summary, N_{ q } = 6 quadratic norms are applied in Eq. (8), four of which are similar to those used in previous studies, and two of which are experimental. It is expected that similar norms will be used for the actual mission analysis, but development is continuing on the CI algorithm and could quite possibly lead to better regularization techniques. The advanced CI algorithm has now been developed and will next be applied to the V2 simulation in Section 6.
6. The V2 Simulation
The V2 closed-loop simulation is one of several levels of synthetic mission data required by ESA to validate the algorithms of the Level-2 data facility. This simulation uses synthetic data (“Test Data Set-1” or TDS-1) described in Olsen et al. (2013) for testing the various chains of SCARF. For testing the CI chain, a data subset was used representative of geomagnetic quiet times during a 4.5 year period from July 1998 through December 2002. These quiet periods are defined as times when the geomagnetic activity index K_{ p } ≤ 2o and the D_{st}-index, measuring the strength of the magnetospheric ring-current, does not change by more than 2 nT/hr. The data sampling period was set to 30 secs. The test data set contains contributions from the core field and secular variation (SV), lithospheric field, and primary and secondary ionospheric and magnetospheric fields. Note that toroidal fields were omitted from the synthetic data. Not only was Swarm satellite constellation data synthesized, but also a complementary OHM data set. In addition, random noise has been added to the satellite data. The requirements for the accuracy of the estimated models with respect to the reference models are listed in Table 3. The “target” and “threshold” requirements refer to desired and mandatory levels of accuracy, respectively, based upon modeling experience.
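The quiet-time criterion stated above can be expressed as a simple predicate. In this sketch the K_p index is assumed to be given on the usual numeric scale in thirds (so that “2o” is 2.0 and “2+” is about 2.33), and the D_st rate is approximated by the difference between two consecutive values divided by their spacing in hours. The function and argument names are hypothetical.

```python
def is_quiet(kp, dst_prev, dst_curr, hours=1.0, kp_max=2.0, ddst_max=2.0):
    """Return True when Kp <= 2o and |dDst/dt| <= 2 nT/hr.

    kp       : Kp on the numeric thirds scale (assumed encoding)
    dst_prev : Dst value (nT) at the previous sample
    dst_curr : Dst value (nT) at the current sample
    hours    : time between the two Dst samples, in hours
    """
    dst_rate = abs(dst_curr - dst_prev) / hours
    return kp <= kp_max and dst_rate <= ddst_max


if __name__ == "__main__":
    print(is_quiet(2.0, -10.0, -11.0))   # quiet: both conditions hold
    print(is_quiet(1.0, -10.0, -15.0))   # not quiet: Dst changes too fast
```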
The core field is defined for SH degrees n = 1–20 and consists of snap-shots derived from order 6 spline models (Lesur et al., 2010) that run from 2003.0 to 2008.0 inclusive, but are shifted by 5 years (i.e. to 1998.0 to 2003.0) in order to be compatible with the data period used for TDS-1. However, for SH degrees n = 14–20 the core field static terms have been replaced by those of the lithospheric field. The lithospheric field consists of SH degrees n = 14–250, where degrees n = 14–15 are taken from model POMME-6.1 (Maus et al., 2010), degrees n = 16–90 are taken from model MF7 (Maus, 2010a), and degrees n = 91–250 are taken from model NGDC-720 (version 3p1) (Maus, 2010b) scaled by a factor of 1.1. The primary ionospheric field is based on that of CM4 (Sabaka et al., 2004), and the secondary field SH coefficient vector ι(ω) is computed from the primary vector ϵ(ω) at each frequency ω via the relationship ι(ω) = Q(ω)ϵ(ω) as discussed in Section 5.1. The coupling matrices Q(ω) come from a 3D mantle conductivity model (Kuvshinov, 2011). The magnetospheric primary field is similar to that of the E2E+ simulation (Tøffner-Clausen et al., 2010) and is based on an hour-by-hour spherical harmonic analysis of worldwide distributed observatory data. The secondary magnetospheric field is computed from the same set of coupling matrices as the ionospheric field.
For synthetic satellite data, the VFM systems have been rotated with respect to their CRFs via R_{crf←vfm} from Eq. (49) by the amounts shown in Table 4. Note, however, that TDS-1 does not include attitude error in the rotation R_{crf←j2000}. Their synthetic instrument noise is based upon CHAMP experience and Swarm specifications and is correlated in time, but uncorrelated among vector components. More details are given in Olsen et al. (2013). The standard deviations of the noise are (0.07, 0.1, 0.07) nT for (B_{ r }, B_{ θ }, B_{ ϕ }), in agreement with Swarm performance requirements. For synthetic OHMs, isotropic noise has been applied such that the standard deviations are (7, 7, 7) nT and (15, 15, 15) nT for locations equator-ward and pole-ward of ±50° geomagnetic latitude, respectively, for (B_{ r }, B_{ θ }, B_{ ϕ }). For the CI analysis, the attitude error was treated as isotropic and uncorrelated in Eq. (35), with ω_{ a } = 10 arcsecs, even though no attitude error technically exists in the TDS-1 data. The standard deviation of the isotropic instrument noise was set to ω_{ F } = 3 nT, which is much larger than that present in the TDS-1 data. Although no noise has been added to the TDS-1 OHM synthetic data, the noise treatment in the CI follows what is actually expected from real data, that is, ω_{OHM} = 7 and 15 nT for station locations equator-ward and pole-ward of ±50° geomagnetic latitude, respectively.
7. Results and Discussion
The results of applying the advanced CI algorithm to the TDS-1 data set of the V2 simulation will now be briefly discussed. While these were found to be quite favorable, it should be stressed that this is a simulation based upon synthetic data containing contributions from forward models similar to those used in the CI, such as the ionospheric primary and secondary fields. Therefore, it is possible that performance could be degraded when analyzing data from the actual mission. The three metrics employed by Sabaka and Olsen (2006) will be used here, which are defined in terms of the real and imaginary parts of the complex SH coefficients of a field.
7.1 Metrics
The first metric is the Lowes-Mauersberger spectrum, R_{ n }(r), of Lowes (1966)
where a and r are the reference and evaluation radii, respectively. The R_{ n }(r) are the mean-square magnitude of an internal field over a sphere of radius r. The second metric is the degree correlation, ρ_{ n }, of two fields given by
where the two sets of coefficients are taken from the first and second fields, respectively. It has a range of −1 ≤ ρ_{ n } ≤ 1 and is invariant to scale factors on the degree n part of the fields. The third metric is the sensitivity matrix, S(n, m), given by
where the two sets of coefficients are taken from the recovered and true fields, respectively. Thus, S(n, m) represents the percentage degree-normalized error in a recovered coefficient of degree n and order m.
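The first two metrics can be sketched in their commonly quoted real-coefficient forms. The example assumes Schmidt semi-normalized Gauss coefficients stored per degree as a list of (g, h) pairs for m = 0..n, the Lowes-Mauersberger form R_n(r) = (n + 1)(a/r)^(2n+4) Σ_m (g² + h²), and a reference radius a = 6371.2 km. These conventions are assumptions of this sketch; the paper's own expressions in terms of complex coefficients may be normalized differently.

```python
import math

EARTH_REF_RADIUS_KM = 6371.2  # conventional geomagnetic reference radius (assumed)


def lowes_spectrum(coeffs_n, n, a=EARTH_REF_RADIUS_KM, r=EARTH_REF_RADIUS_KM):
    """Mean-square field of degree n at radius r (internal sources)."""
    power = sum(g * g + h * h for g, h in coeffs_n)
    return (n + 1) * (a / r) ** (2 * n + 4) * power


def degree_correlation(first_n, second_n):
    """Correlation of two fields over all orders of a single degree."""
    cross = sum(g1 * g2 + h1 * h2
                for (g1, h1), (g2, h2) in zip(first_n, second_n))
    power_1 = sum(g * g + h * h for g, h in first_n)
    power_2 = sum(g * g + h * h for g, h in second_n)
    return cross / math.sqrt(power_1 * power_2)


if __name__ == "__main__":
    field = [(3.0, 0.0), (0.0, 4.0)]                  # toy degree-1 coefficients
    scaled = [(2.0 * g, 2.0 * h) for g, h in field]   # same field, doubled
    print(lowes_spectrum(field, n=1))
    print(degree_correlation(field, scaled))
```

Doubling every coefficient leaves the degree correlation unchanged, which illustrates the scale invariance noted above, whereas the spectrum of the difference between two fields responds directly to such amplitude errors.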
There is one additional metric that will be used to evaluate the V2 results. This is the “squared-magnitude coherence”, or coh^{2}, whose range is 0 ≤ coh^{2} ≤ 1 and measures the similarity between input and output signals of a system. For constant parameter linear systems, coh^{2} = 1, but this can decrease due to a number of issues, particularly the presence of noise in the system. This has been used by Olsen (1998) to analyze C-responses describing electrical conductivity of the mantle beneath Europe, and details may be found therein.
7.2 MCO core field
The R_{ n } spectra of the MCO reference field (black) and difference between reference and recovered fields (blue) at epoch 2000.0 at Earth’s surface are shown in the top panel of Fig. 3. The blue line falls several orders of magnitude below the black line and indicates excellent agreement between the two models near the center of the model domain. The same lines are shown for SV in the bottom panel where the blue line is below the black until degree n > 19. The accumulated error in SV is found to be less than 0.2 nT/yr for all times within the model domain for degree n = 2–20, and thus easily meets both the threshold and target values specified in Table 3.
To aid in the comparison, the first time derivatives of the MCO Gauss coefficients for degrees n = 1–4 are shown in Fig. 4 for the reference (black) and recovered (red) fields. The fields agree very well with most of the variations in the lines exhibited over small ranges. In fact, a comparison of the actual Gauss coefficients shows them to be almost indiscernible over the range of coefficient values. The advanced CI algorithm is evidently recovering the MCO core field and SV to within specifications.
7.3 MLI lithospheric field
In addition to assessing the CI algorithm performance with respect to V2 accuracy requirements, it was of interest to see the direct benefits of taking explicit advantage of the magnetic gradient information for determination of the small-scale lithospheric field. Therefore, two types of models were derived from exactly the same magnetic field observations. The first model, denoted as “field only”, was constructed by considering the Swarm data as coming from three single satellites, whereas the second model, denoted as “field plus gradient”, was constructed by explicit use of the constellation aspect of Swarm by using magnetic gradient information with SIVW.
The R_{ n } spectra of the MLI reference field (black) and the difference between reference and recovered fields for the “field only” case (red) and the “field plus gradient” case (blue) at Earth’s surface are shown in the left panel of Fig. 5. The plots indicate a roughly two-fold reduction in power in the error per degree above about n = 45 when difference data are exploited. The right panel indicates a far superior recovery of coefficients in the case of “field plus gradient” (blue) over the “field only” case (red) with regards to phase of the models, i.e., degree correlation ρ_{ n }. The former case is now recovering a field that is positively correlated with the true field at the 0.93 level for n = 150, the degree limit of the synthetic signal. The accumulated error at ground for degrees n = 16–150 is only about 12 nT, which again is well below both the target and threshold values for V2 accuracy.
A more complete view of the performance of the traditional field value method versus the gradient method can be seen by examining the sensitivity matrix S(n, m) of the latter in Fig. 6. The “field plus gradient” recovery is excellent for mid-valued m, but is nearly identical to the traditional approach (though not shown) for the m ≤ 20 regime, which, recall, conforms to the structure of lithospheric bias treatment used in SIVW. There is also some improvement in the purely sectorial terms (left and right edges).
Finally, a physical sense of the quality of the MLI lithospheric field recovery can be seen in difference maps of B_{ r } at ground between the reference and “field plus gradient” models in Fig. 7. Most differences are just a few nT, except in regions around the geographic poles where they are several tens of nT. This is no doubt due to polar gaps in the Swarm orbital coverage.
While these results suggest an optimistic outlook that the Swarm constellation is capable of accurately recovering small-scale lithospheric structure, the application to real data will be more challenging, especially at high latitudes. However, if V2 performance is any indication of real performance, then Swarm will go far in closing the gap in intermediate lithospheric wavelength coverage that exists now between satellites and aeromagnetic surveys.
7.4 MIO ionospheric field
The accuracy target requirement for the MIO primary ionospheric field is 10% average relative error in ǀBǀ on ground. Table 5 actually subdivides these numbers across local time sectors (midnight, morning, noon, and evening) and across seasons (February to April, May to July, August to October, and November to January) for magnetic latitudes equator-ward of ±50° and all latitudes. It can be seen that every subdivision is performing almost twice as well as the 10% requirement, with overall errors of 4.16% and 4.40% for the low latitude and all latitudes regions, respectively.
The primary MIO ionospheric field source is modeled as an equivalent sheet current at 110 km altitude in the CI algorithm (Sabaka et al., 2002, 2004). Its corresponding current function has been generated from the ionospheric coefficients recovered from the V2 test and is shown for different universal times (UT) at vernal equinox in Fig. 8. This is also in very good agreement with the current function of the MIO reference field. The advanced CI algorithm appears to be performing satisfactorily for this field source as well.
7.5 MMA magnetospheric/induced fields
Recall that the test data are selected for magnetically quiet times such that the CI products for core, lithosphere, and ionosphere (primary and secondary) all reflect this. However, the determination of continuous time series of the spherical harmonic expansion coefficients of magnetospheric and corresponding induced sources requires data taken during all geomagnetic conditions, as required for Level-2 product MMA_SHA_2. Therefore, this is achieved by applying a second processing step, the MMA_SHA_2 analysis step, after CI in the following way: Predictions are made from the output CI core, lithospheric, and primary and secondary ionospheric models derived during quiet times and subtracted from each 1 min Swarm satellite measurement and OHMs from all available ground observatories. The resulting residuals (observations minus model values) are expected to contain the magnetospheric (primary and secondary) field plus errors due to improper removal of all other sources. From those residuals of each day estimates are made of the spherical harmonic expansion coefficients describing the external (magnetospheric) sources for N_{max} = 3 and M_{max} = 1, and coefficients describing the induced field for N_{max} = M_{max} = 5. This is done in bins of 1.5 hrs for the two axial dipole coefficients (which means that 16 values for each of those coefficients are determined per day) and in bins of 6 hrs for the remaining 42 coefficients (resulting in 4 × 42 = 168 values per day). In total, 200 coefficients are estimated for each day using IRLS with Huber weights.
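The coefficient bookkeeping in this step can be checked with a few lines of arithmetic. The sketch assumes a real SH expansion in which each degree n contributes one zonal term plus a cosine and a sine term for every order m = 1..min(n, M_max); the function name is hypothetical.

```python
def n_coeffs(n_max, m_max):
    """Number of real SH coefficients for degrees 1..n_max, orders 0..m_max."""
    return sum(1 + 2 * min(n, m_max) for n in range(1, n_max + 1))


external = n_coeffs(3, 1)   # magnetospheric expansion, N_max = 3, M_max = 1
induced = n_coeffs(5, 5)    # induced expansion, N_max = M_max = 5

fast = 2                               # the two axial dipole coefficients
slow = external + induced - fast       # everything else
bins_fast = int(24 / 1.5)              # 1.5 hr bins per day
bins_slow = int(24 / 6)                # 6 hr bins per day

per_day = fast * bins_fast + slow * bins_slow

if __name__ == "__main__":
    print(external, induced, slow, per_day)
```

With these truncation levels the two expansions contain 9 and 35 coefficients, which leaves 42 coefficients in the 6 hr bins and reproduces the daily total of 200 quoted above.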
Figure 9 shows squared-magnitude coherence (coh^{2}) between the input and the estimated time series for external (red) and induced (blue) coefficients corresponding to N_{max} = 3 and M_{max} = 1. There is generally excellent coherency for the external coefficients (coh^{2} well above 0.95) and good coherence (coh^{2} > 0.8) for the induced coefficients, in particular for periods shorter than one month or so. When using the coefficients for determination of mantle conductivity, periods up to one month correspond to resolving conductivity down to 1200 km or so. Püthe and Kuvshinov (2013a, b) describe the estimation of mantle conductivity from the Level-2 product MMA_SHA_2 determined in this way. Although this is a comparison between the reference magnetospheric and induced fields and the output of the MMA_SHA_2 analysis step, it still indicates that good separation must exist between the magnetospheric and induced fields and the others in the CI estimation. As for the accuracy requirements, the external field recovery exceeds the target values for all periods while the internal field recovery generally exceeds the target for periods shorter than one month and is mostly above the threshold requirement for other periods.
8. Conclusions
This paper has presented an advanced CI algorithm that includes many improvements over the basic algorithm described in Sabaka and Olsen (2006). Most important among these is the introduction of the SIVW mechanism for mitigating systematic, non-zero mean bias in the observations, which allows for optimal combinations of data to achieve improved models. In addition, this paper has pointed out a way to improve on the handling of attitude error in vector magnetometer data put forth in the HB theory. This will hopefully allow for even better error characterization, and thus, model quality.
The performance of the advanced CI algorithm was evaluated using a synthetic test data set (TDS-1) from a full mission simulation. In general, it was found to perform well above both threshold and target accuracy requirements for core, lithospheric, ionospheric, and magnetospheric and induced fields. Only the recovery of induced fields at some periods longer than one month was of suspect quality. This may be due to the point-wise orthogonality condition with respect to the core field that has been imposed on this field, which of course would affect the longer periods since SV is on the order of these longer periods. Although there were no toroidal field contributions in the TDS-1 data, these fields were nonetheless co-estimated and found to be negligible, thus indicating a proper treatment in the model. All of this suggests, at least from the standpoint of V2, that the advanced CI algorithm will be quite competent in delivering high quality Level-2 products.
Although the advanced algorithm is a great improvement over the basic algorithm, certain issues should still be dealt with to further enhance performance. Recall that when using the covariance matrix describing the error in the vector differences and summations in Eq. (55) the cross-covariance matrices C_{−+} have been ignored. This was done to simplify the algorithm during development, but should now be instated. The increase in deviations between reference and recovered B_{ r } from the lithosphere around the geographic poles in Fig. 7 indicates that the polar gap problem should be further addressed. In addition, several issues surrounding SIVW should be explored, such as the optimal SH order m that delineates between the nominal and nuisance lithospheric fields, which could easily lead to better models. In addition, SIVW could be used to account for dayside bias when modeling the lithosphere, which would reflect the current best methods for crustal field modeling (Thomson and Lesur, 2007; Maus et al., 2007, 2008; Lesur et al., 2008, 2013; Olsen et al., 2011). Finally, SIVW could be applied to high degree SV modeling by mitigating the bias due to the poor distribution of ground observatories. It is planned to implement and test several of these ideas.
References
Bertsekas, D. P., Nonlinear Programming, Athena Scientific, 1999.
Constable, C. G., Parameter estimation in non-Gaussian noise, Geophys. J., 94, 131–142, 1988.
Friis-Christensen, E., H. Lühr, and G. Hulot, Swarm: A constellation to Study the Earth’s Magnetic Field, Earth Planets Space, 58, 351–358, 2006.
Golub, G. H. and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, 1989.
Holme, R., Modelling of attitude error in vector magnetic data: application to Ørsted data, Earth Planets Space, 52, 1187–1197, 2000.
Holme, R. and J. Bloxham, Alleviation of the Backus effect in geomagnetic field modelling, Geophys. Res. Lett., 22, 1641–1644, 1995.
Holme, R. and J. Bloxham, The treatment of attitude errors in satellite geomagnetic data, Phys. Earth Planet. Inter., 98, 221–233, 1996.
Kuvshinov, A. V., Deep electromagnetic studies from land, sea, and space: Progress status in the past 10 years, Surv. Geophys., 33, 169–209, doi:10.1007/s10712-011-9118-2, 2011.
Lesur, V., I. Wardinski, M. Rother, and M. Mandea, GRIMM: the GFZ Reference Internal Magnetic Model based on vector satellite and observatory data, Geophys. J. Int., 173, 382–394, 2008.
Lesur, V., I. Wardinski, M. Hamoudi, and M. Rother, The second generation of the GFZ reference internal magnetic model: GRIMM-2, Earth Planets Space, 62, 765–773, doi:10.5047/eps.2010.07.007, 2010.
Lesur, V., M. Rother, F. Vervelidou, M. Hamoudi, and E. Thébault, Postprocessing scheme for modeling the lithospheric magnetic field, Solid Earth, 4, 105–118, doi:10.5194/se-4-105-2013, 2013.
Lowes, F. J., Mean-square values on sphere of spherical harmonic vector fields, J. Geophys. Res., 71, 2179, 1966.
Maus, S., Magnetic Field Model MF7, http://www.geomag.us/models/MF7.html, 2010a.
Maus, S., NGDC-720 lithospheric magnetic model, http://www.geomag.us/models/ngdc720.html, 2010b.
Maus, S., H. Lühr, M. Rother, K. Hemant, G. Balasis, P. Ritter, and C. Stolle, Fifth generation lithospheric magnetic field model from CHAMP satellite measurements, Geochem. Geophys. Geosyst., 8(5), Q05013, doi:10.1029/2006GC001521, 2007.
Maus, S., F. Yin, H. Lühr, C. Manoj, M. Rother, J. Rauberg, I. Michaelis, C. Stolle, and R. D. Müller, Resolution of direction of oceanic magnetic lineations by the sixth-generation lithospheric magnetic field model from CHAMP satellite magnetic measurements, Geochem. Geophys. Geosyst., 9(7), Q07021, doi:10.1029/2008GC001949, 2008.
Maus, S., C. Manoj, J. Rauberg, I. Michaelis, and H. Lühr, NOAA/NGDC candidate models for the 11th generation International Geomagnetic Reference Field and the concurrent release of the 6th generation Pomme magnetic model, Earth Planets Space, 62, 729–735, 2010.
Olsen, N., Estimation of C-Responses (3 h to 720 h) and the electrical conductivity of the mantle beneath Europe, Geophys. J. Int., 133, 298–308, 1998.
Olsen, N., A model of the geomagnetic field and its secular variation for epoch 2000 estimated from Ørsted data, Geophys. J. Int., 149(2), 454–462, 2002.
Olsen, N., R. Holme, G. Hulot, T. Sabaka, T. Neubert, L. Tøffner-Clausen, F. Primdahl, J. Jørgensen, J.-M. Léger, D. Barraclough, J. Bloxham, J. Cain, C. Constable, V. Golovkov, A. Jackson, P. Kotzé, B. Langlais, S. Macmillan, M. Mandea, J. Merayo, L. Newitt, M. Purucker, T. Risbo, M. Stampe, A. Thomson, and C. Voorhies, Ørsted initial field model, Geophys. Res. Lett., 27, 3607–3610, 2000.
Olsen, N., M. Mandea, T. J. Sabaka, and L. Tøffner-Clausen, The CHAOS-3 geomagnetic field model and candidates for the 11th generation of IGRF, Earth Planets Space, 62, 719–727, 2010.
Olsen, N., H. Lühr, T. J. Sabaka, I. Michaelis, J. Rauberg, and L. Tøffner-Clausen, The CHAOS-4 geomagnetic field model, Geochem. Geophys. Geosyst., 2011 (in preparation).
Olsen, N., E. Friis-Christensen, R. Floberghagen, P. Alken, C. D. Beggan, A. Chulliat, E. Doornbos, J. T. da Encarnação, B. Hamilton, G. Hulot, J. van den IJssel, A. Kuvshinov, V. Lesur, H. Lühr, S. Macmillan, S. Maus, M. Noja, P. E. H. Olsen, J. Park, G. Plank, C. Püthe, J. Rauberg, P. Ritter, M. Rother, T. J. Sabaka, R. Schachtschneider, O. Sirol, C. Stolle, E. Thébault, A. W. P. Thomson, L. Tøffner-Clausen, J. Velímský, P. Vigneron, and P. N. Visser, The Swarm Satellite Constellation Application and Research Facility (SCARF) and Swarm data products, Earth Planets Space, 65, this issue, 1189–1200, 2013.
Püthe, C. and A. Kuvshinov, Determination of the 1-D distribution of electrical conductivity in Earth’s mantle from Swarm satellite data, Earth Planets Space, 65, this issue, 1233–1237, 2013a.
Püthe, C. and A. Kuvshinov, Determination of the 3-D distribution of electrical conductivity in Earth’s mantle from Swarm satellite data: Frequency domain approach based on inversion of induced coefficients, Earth Planets Space, 65, this issue, 1247–1256, 2013b.
Sabaka, T. J. and N. Olsen, Enhancing comprehensive inversions using the Swarm constellation, Earth Planets Space, 58, 371–395, 2006.
Sabaka, T. J., N. Olsen, and R. A. Langel, A comprehensive model of the quiet-time near-Earth magnetic field: Phase 3, Geophys. J. Int., 151, 32–68, 2002.
Sabaka, T. J., N. Olsen, and M. E. Purucker, Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data, Geophys. J. Int., 159, 521–547, doi:10.1111/j.1365-246X.2004.02421.x, 2004.
Seber, G. A. F. and C. J. Wild, Nonlinear Regression, Wiley-Interscience, 2003.
Thomson, A. W. P. and V. Lesur, An improved geomagnetic data selection algorithm for global geomagnetic field modelling, Geophys. J. Int., 169(3), 951–963, 2007.
Tøffner-Clausen, L., T. J. Sabaka, and N. Olsen, End-To-End Mission Simulation Study (E2E+), Proceedings of the Second International Swarm Science Meeting, ESA, Noordwijk/NL, 2010.
Toutenburg, H., Prior Information in Linear Models, John Wiley & Sons, New York, 1982.
Walker, M. R. and A. Jackson, Robust modelling of the Earth’s magnetic field, Geophys. J. Int., 143, 799–808, 2000.
Wertz, J. R. and W. J. Larson (eds.), Space Mission Analysis and Design, Kluwer Academic Publishers, 1999.
Wessel, P. and W. H. F. Smith, Free software helps map and display data, Eos Trans. AGU, 72, 441, 1991.
Acknowledgments
We thank Richard Holme and Vincent Lesur for fruitful reviews. The NASA Center for Climate Simulation at Goddard Space Flight Center provided computer support. Some figures were produced with GMT (Wessel and Smith, 1991).
Appendices
Appendix A.
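The purpose of this Appendix is to present a derivation of Eq. (35), that is, the attitude error covariance under general, finite rotations, and from that, the various forms used in the HB theory. First, several useful properties of the cross-product matrix will be given. The cross-product of two vectors u and v can be expressed as a matrix-vector multiplication as follows
$$\mathbf{u} \times \mathbf{v} = \mathbf{E}_{u}\,\mathbf{v},$$
where u, v, and the cross-product matrix E_{u} are given by
$$\mathbf{u} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}, \qquad \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \qquad \mathbf{E}_{u} = \begin{pmatrix} 0 & -u_3 & u_2 \\ u_3 & 0 & -u_1 \\ -u_2 & u_1 & 0 \end{pmatrix}.$$
If R is an arbitrary rotation matrix, such that
$$\mathbf{R} = \begin{pmatrix} \mathbf{R}_1 & \mathbf{R}_2 & \mathbf{R}_3 \end{pmatrix}^{T},$$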
where R_{ j } is a column vector containing the elements of the j-th row of R, then the following properties emerge
and also
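The cross-product matrix properties can be checked numerically. The sketch below is illustrative only: the helper names and test vectors are assumptions, and the three identities tested (E_u v = u × v, antisymmetry of E_u, and R E_u Rᵀ = E_{Ru} for a proper rotation R) are standard results that may not coincide one-to-one with the properties listed in this Appendix.

```python
import math

def cross_matrix(u):
    """Cross-product matrix E_u, defined so that E_u v equals u x v."""
    return [[0.0, -u[2], u[1]],
            [u[2], 0.0, -u[0]],
            [-u[1], u[0], 0.0]]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# Illustrative vectors and an arbitrary proper rotation about the z axis.
u = [1.0, -2.0, 0.5]
v = [0.3, 4.0, -1.0]
c, s = math.cos(0.7), math.sin(0.7)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

E_u = cross_matrix(u)
# E_u v = u x v
err_product = max(abs(a - b) for a, b in zip(matvec(E_u, v), cross(u, v)))
# E_u^T = -E_u = E_{-u}
err_antisymmetry = max_abs_diff(transpose(E_u), cross_matrix([-x for x in u]))
# R E_u R^T = E_{Ru}
err_rotation = max_abs_diff(matmul(matmul(R, E_u), transpose(R)),
                            cross_matrix(matvec(R, u)))
print(err_product, err_antisymmetry, err_rotation)
```

All three printed residuals should be at the level of floating-point round-off.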
A.1 Covariance of a vector under general, finite rotations due to random, zero-mean angular perturbations
Consider the case of a compound rotation matrix R representing successive rotations about three normalized axes , û, and such that
where a general, elemental rotation matrix describing a positive rotation of angle Ψ about the axis ê is given by (Wertz and Larson, 1999)
Using the properties of the cross-product matrix, the following additional property of interest may be derived
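As a hedged illustration of an elemental rotation, the following sketch builds the rotation about a unit axis from Rodrigues' formula and checks numerically that its derivative with respect to the angle equals the cross-product matrix of the axis times the rotation. The active (vector-rotating) sign convention is an assumption; a frame-rotation convention, which the cited reference may use, corresponds to the transpose. The axis, angle, and function names are illustrative.

```python
import math

def cross_matrix(e):
    return [[0.0, -e[2], e[1]],
            [e[2], 0.0, -e[0]],
            [-e[1], e[0], 0.0]]

def rotation(e, psi):
    """Rotation by angle psi about the unit axis e (Rodrigues' formula, active sense)."""
    c, s = math.cos(psi), math.sin(psi)
    E = cross_matrix(e)
    return [[c * (i == j) + (1.0 - c) * e[i] * e[j] + s * E[i][j]
             for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

e = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # illustrative unit axis
psi, h = 0.4, 1e-6                      # angle (rad) and finite-difference step

R = rotation(e, psi)
R_plus, R_minus = rotation(e, psi + h), rotation(e, psi - h)

# Central-difference derivative dR/dpsi compared with E_e R.
dR_numeric = [[(R_plus[i][j] - R_minus[i][j]) / (2.0 * h) for j in range(3)]
              for i in range(3)]
dR_analytic = matmul(cross_matrix(e), R)
err_derivative = max_abs_diff(dR_numeric, dR_analytic)

# A rotation matrix must be orthogonal: R R^T = I.
identity = [[float(i == j) for j in range(3)] for i in range(3)]
err_orthogonal = max_abs_diff(matmul(R, transpose(R)), identity)
print(err_derivative, err_orthogonal)
```

The derivative identity is what allows angular perturbations to be written as cross products with the rotated vector.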
The goal is now to derive the covariance of a vector B_{2} due to random, zero-mean perturbations about non-zero angles of the rotation matrix R defined in Eq. (A.8) such that
The linear-tangent approximation is used to relate the differential of B_{2} with those of the angles χ, δ, and λ such that
The covariance of B_{2} is then assumed to be
where . Taking the derivatives of B_{2} with respect to χ, δ, and λ and using the various properties previously derived yields
where and are the vectors , û, and rotated into reference frame “2”, respectively. It follows from Eq. (A.13) that the covariance of B_{2} is given by
where
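The linear-tangent propagation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the ordering of the three rotations, the active sign convention, the coordinate axes, the angles, the field vector (nT), and the 10 arcsec standard deviation are all hypothetical. The Jacobian columns are formed as (axis carried into frame "2") × B_{2}, checked against finite differences, and the covariance is J C_a Jᵀ with a diagonal C_a.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def rotation(e, psi):
    """Active rotation by psi about unit axis e (Rodrigues' formula)."""
    c, s = math.cos(psi), math.sin(psi)
    E = [[0.0, -e[2], e[1]], [e[2], 0.0, -e[0]], [-e[1], e[0], 0.0]]
    return [[c * (i == j) + (1.0 - c) * e[i] * e[j] + s * E[i][j]
             for j in range(3)] for i in range(3)]

def rotate(axes, angles, B1):
    """B2 = R3 R2 R1 B1; the ordering of the rotations is an assumption."""
    B = B1
    for e, psi in zip(axes, angles):
        B = matvec(rotation(e, psi), B)
    return B

def jacobian(axes, angles, B1):
    """Columns dB2/d(angle_k) = (axis_k carried into frame 2) x B2."""
    B2 = rotate(axes, angles, B1)
    R2, R3 = rotation(axes[1], angles[1]), rotation(axes[2], angles[2])
    carried = [matvec(R3, matvec(R2, axes[0])), matvec(R3, axes[1]), axes[2]]
    cols = [cross(e, B2) for e in carried]
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def jacobian_fd(axes, angles, B1, h=1e-6):
    """Central-difference Jacobian used only as a cross-check."""
    cols = []
    for k in range(3):
        up, dn = list(angles), list(angles)
        up[k] += h
        dn[k] -= h
        Bp, Bm = rotate(axes, up, B1), rotate(axes, dn, B1)
        cols.append([(p - m) / (2.0 * h) for p, m in zip(Bp, Bm)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def covariance(J, variances):
    """C_B = J C_a J^T for a diagonal angle covariance C_a."""
    return [[sum(J[i][k] * variances[k] * J[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]

axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
angles = [0.3, -0.2, 0.5]                 # illustrative non-zero angles (rad)
B1 = [20000.0, -3000.0, 40000.0]          # illustrative field vector (nT)
sigma = math.radians(10.0 / 3600.0)       # assumed 10 arcsec angular noise
variances = [sigma ** 2] * 3

J = jacobian(axes, angles, B1)
J_fd = jacobian_fd(axes, angles, B1)
err_jacobian = max(abs(J[i][j] - J_fd[i][j]) for i in range(3) for j in range(3))

C_B = covariance(J, variances)
B2 = rotate(axes, angles, B1)
# Attitude noise only turns the vector, so C_B has no component along B2.
leak = max(abs(x) for x in matvec(C_B, B2))
print(err_jacobian, leak)
```

Because every Jacobian column is a cross product with B_{2}, the resulting covariance is singular along the field direction, which is the characteristic signature of attitude error.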
A.2 Covariance of a vector within the HB theory framework
The treatment of attitude errors in satellite geomagnetic data put forth in Holme and Bloxham (1996) derives covariance expressions resulting from small random, zero-mean angular perturbations about a set of orthonormal axes , û, and , which is equivalent to setting χ = δ = λ = 0 in Eq. (A.8). This leads to
such that
They discuss three special cases, depending on the values of the three perturbation variances, that will now be shown as special cases of Eq. (A.18). Note that the isotropic components in their formulae will be excluded from these derivations. In all cases it is assumed that C_{a} is a diagonal matrix. The subscript “2” will also be omitted as it should be understood in which reference system the covariance is expressed.
A.2.1 Three equal variances
Here it is assumed that σ^{2} = σ^{2}_{χ} = σ^{2}_{δ} = σ^{2}_{λ}, and if B = |B|, then this leads to
A.2.2 Two equal variances
Here it is assumed that σ^{2}_{χ} ≠ σ^{2}_{δ} = σ^{2}_{λ}, which leads to
A.2.3 No equal variances
Here it is assumed that σ^{2}_{χ} ≠ σ^{2}_{δ}, σ^{2}_{χ} ≠ σ^{2}_{λ}, and σ^{2}_{δ} ≠ σ^{2}_{λ}, which leads to
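The three special cases can be explored with a short numerical sketch. It assumes small perturbations about orthonormal axes e_k, so that the covariance is the sum over k of σ_k² (e_k × B)(e_k × B)ᵀ, and it checks two closed forms that follow algebraically from that sum: σ²(B² I − B Bᵀ) for three equal variances, and the same expression plus a rank-one correction for two equal variances. The axes, field vector, and variances are illustrative assumptions, and the closed forms may be arranged differently from the expressions in the original equations.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def attitude_covariance(axes, variances, B):
    """Sum over axes of var_k (e_k x B)(e_k x B)^T, small-angle case."""
    C = [[0.0] * 3 for _ in range(3)]
    for e, var in zip(axes, variances):
        d = cross(e, B)
        for i in range(3):
            for j in range(3):
                C[i][j] += var * d[i] * d[j]
    return C

def isotropic_part(var, B):
    """var * (|B|^2 I - B B^T), the three-equal-variances closed form."""
    B_sq = sum(x * x for x in B)
    return [[var * (B_sq * (i == j) - B[i] * B[j]) for j in range(3)]
            for i in range(3)]

def max_abs_diff(A, M):
    return max(abs(A[i][j] - M[i][j]) for i in range(3) for j in range(3))

# Illustrative orthonormal axes and field vector (arbitrary units).
axes = [[1.0 / 3, 2.0 / 3, 2.0 / 3],
        [2.0 / 3, 1.0 / 3, -2.0 / 3],
        [2.0 / 3, -2.0 / 3, 1.0 / 3]]
B = [2.0, -1.0, 4.0]

# Case 1: three equal variances.
var = 0.01
err_equal = max_abs_diff(attitude_covariance(axes, [var] * 3, B),
                         isotropic_part(var, B))

# Case 2: the first variance differs, the other two are equal.
var_1, var_23 = 0.04, 0.01
d1 = cross(axes[0], B)
closed = isotropic_part(var_23, B)
for i in range(3):
    for j in range(3):
        closed[i][j] += (var_1 - var_23) * d1[i] * d1[j]
err_two_equal = max_abs_diff(attitude_covariance(axes, [var_1, var_23, var_23], B),
                             closed)

# Case 3: no equal variances; only the general sum applies.
C_general = attitude_covariance(axes, [0.04, 0.01, 0.02], B)
print(err_equal, err_two_equal)
```

The equal-variance result does not depend on the choice of axes, whereas the other two cases retain an explicit dependence on the axis directions.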
Appendix B.
The purpose of this Appendix is to present a derivation of Eqs. (52) and (53) and establish the conditions under which they are valid. The magnetic field vector in the ECEF frame at position (r, θ, ϕ), denoted B(r,θ,ϕ ), is related to the field vector in the local spherical NEC frame, denoted B(r, θ, ϕ), through the rotation
where
and (r, θ, ϕ) are the spherical coordinates of radius, co-latitude, and longitude, respectively. This rotation can be shifted to a new position (r, θ, ϕ′) such that ϕ′ − ϕ = Δϕ through a rotation about the ECEF z axis
where
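The two rotations just described can be sketched numerically. The matrix below assumes the usual NEC convention (North, East, Center, with Center pointing toward the Earth's center), so its columns are the NEC unit vectors expressed in ECEF coordinates; the function names and test angles are illustrative. The check confirms that shifting the longitude by Δϕ is the same as pre-multiplying by a rotation about the ECEF z axis.

```python
import math

def nec_to_ecef(theta, phi):
    """Rotation from local spherical NEC to ECEF (co-latitude theta, longitude phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    north = [-ct * cp, -ct * sp, st]     # minus the co-latitude unit vector
    east = [-sp, cp, 0.0]                # longitude unit vector
    center = [-st * cp, -st * sp, -ct]   # minus the radial unit vector
    return [[north[i], east[i], center[i]] for i in range(3)]

def rot_z(angle):
    """Rotation by angle about the ECEF z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

theta = math.radians(60.0)   # illustrative co-latitude
phi = math.radians(15.0)     # illustrative longitude
dphi = math.radians(1.4)     # illustrative longitude shift

R = nec_to_ecef(theta, phi)
R_shifted = nec_to_ecef(theta, phi + dphi)

# Shifting in longitude equals a rotation about the ECEF z axis.
err_shift = max_abs_diff(matmul(rot_z(dphi), R), R_shifted)

# The NEC-to-ECEF matrix is orthogonal.
identity = [[float(i == j) for j in range(3)] for i in range(3)]
err_orthogonal = max_abs_diff(matmul(R, transpose(R)), identity)
print(err_shift, err_orthogonal)
```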
Now, Swarm A and B provide two magnetic field vector measurements at positions (r_{A}, θ_{A}, ϕ_{A}) and (r_{B}, θ_{B}, ϕ_{B}), respectively, that differ only in longitude such that ϕ_{B} − ϕ_{A} = Δϕ. If all quantities that are evaluated at (r_{A}, θ_{A}, ϕ_{A}) and (r_{B}, θ_{B}, ϕ_{B}) are indicated by a subscript “A” and “B”, respectively, then using Eq. (B.3), the difference in their ECEF vector measurements can be expressed as
This can also be expressed as
Note that to first-order in Δϕ
and
which leads to
where and are the unit vectors in the direction of the ECEF z axis in the ECEF and NEC frames, respectively. Using Eqs. (B.5)–(B.9) leads to the following expression that is first-order accurate in Δϕ
where is the rotation from the local spherical NEC frame at the mid-point position of the satellites to the ECEF frame. A similar expression may be derived for the sum of the Swarm low satellite ECEF vector measurements that is first-order accurate in Δϕ, which gives the pair
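A numerical sketch can illustrate the size of the first-order term. It assumes one form of the expansion that follows from the steps above, namely that the ECEF difference is approximately R_M [ΔB + (Δϕ/2) ẑ × ΣB], where R_M is the mid-point NEC-to-ECEF rotation, ΔB and ΣB are the difference and sum of the NEC vectors, and ẑ is the ECEF z axis expressed in the NEC frame. This form, the field values (nT), and the 1.4° separation are assumptions for illustration and may differ in arrangement from the original equation.

```python
import math

def nec_to_ecef(theta, phi):
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    north = [-ct * cp, -ct * sp, st]
    east = [-sp, cp, 0.0]
    center = [-st * cp, -st * sp, -ct]
    return [[north[i], east[i], center[i]] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

theta = math.radians(60.0)      # common co-latitude (illustrative)
phi_mid = math.radians(15.0)    # mid-point longitude (illustrative)
dphi = math.radians(1.4)        # assumed longitudinal separation

# Illustrative NEC field vectors at the two satellites (nT).
B_A = [20000.0, 1000.0, 40000.0]
B_B = [20050.0, 980.0, 40030.0]

R_A = nec_to_ecef(theta, phi_mid - dphi / 2.0)
R_B = nec_to_ecef(theta, phi_mid + dphi / 2.0)
R_M = nec_to_ecef(theta, phi_mid)

# Exact difference of the ECEF vectors.
exact = [b - a for a, b in zip(matvec(R_A, B_A), matvec(R_B, B_B))]

# ECEF z axis expressed in the mid-point NEC frame.
z_nec = [math.sin(theta), 0.0, -math.cos(theta)]
dB = [b - a for a, b in zip(B_A, B_B)]
sB = [a + b for a, b in zip(B_A, B_B)]
correction = cross(z_nec, sB)

first_order = matvec(R_M, [d + 0.5 * dphi * c for d, c in zip(dB, correction)])
zero_order = matvec(R_M, dB)

err_first = max(abs(x - y) for x, y in zip(exact, first_order))
err_zero = max(abs(x - y) for x, y in zip(exact, zero_order))
print(err_first, err_zero)
```

With these illustrative numbers the first-order expression agrees with the exact difference to well under 1 nT, whereas dropping the cross-product term leaves an error of several hundred nT, because that term scales with the full field strength rather than with the small field difference.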
If the magnetic field vector in the local spherical NEC frame is represented as the negative gradient of a potential function such that and , then Eq. (B.11) may be rewritten as
Now, consider the magnetic potential due to sources internal to the satellite sampling shell at position (r,θ, ϕ)
where a is a reference radius, is an associated Legendre function of spherical harmonic degree n and order m, and is a complex coefficient of degree n and order m such that , where the overbar indicates complex conjugation. If the positions of the Swarm satellite low pair differ only in longitude by Δϕ, then the difference in potentials at (r, θ, ϕ) is given by
where is the set of modified coefficients . The gain factor, , is defined here to be the magnitude of the ratio of the modified to original coefficients given by
One can also consider the sum in potentials at (r, θ, ϕ) given by . Following a similar derivation, this leads to the following gain factors
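The gain factors can be illustrated with a few lines of code. The sketch assumes that the longitude dependence of each harmonic is exp(imϕ), so that a shift of Δϕ multiplies the order-m coefficient by exp(imΔϕ); the difference and sum of the potentials then carry the factors exp(imΔϕ) − 1 and exp(imΔϕ) + 1, whose magnitudes are 2|sin(mΔϕ/2)| and 2|cos(mΔϕ/2)|. The 1.4° separation and the function names are illustrative assumptions.

```python
import cmath
import math

def gain_difference(m, dphi):
    """Magnitude of the factor applied to an order-m coefficient by differencing."""
    return abs(cmath.exp(1j * m * dphi) - 1.0)

def gain_sum(m, dphi):
    """Magnitude of the factor applied to an order-m coefficient by summing."""
    return abs(cmath.exp(1j * m * dphi) + 1.0)

dphi = math.radians(1.4)   # assumed longitudinal separation of the low pair

gains = []
for m in (0, 1, 10, 50, 100, 150):
    g_dif, g_sum = gain_difference(m, dphi), gain_sum(m, dphi)
    gains.append((m, g_dif, g_sum))
    print(f"m = {m:3d}   difference gain = {g_dif:.4f}   sum gain = {g_sum:.4f}")

# Closed forms for comparison.
err_dif = max(abs(g - 2.0 * abs(math.sin(0.5 * m * dphi))) for m, g, _ in gains)
err_sum = max(abs(g - 2.0 * abs(math.cos(0.5 * m * dphi))) for m, _, g in gains)
```

The difference gain vanishes for m = 0 and grows with order, which is consistent with east-west differences being insensitive to zonal terms and most sensitive to small-scale, high-order structure; the sum gain behaves in the complementary way, since the squares of the two gains add to 4.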
Thus, Eq. (B.12) may now be finally rewritten as
If u_{SUM} and u_{DIF} are the unit vectors in the directions of ∇(ΣV_{A}) and ∇(ΔV_{A}), respectively, then the Swarm low pair vector differences in Eq. (B.17) become exclusive functions of ΔV_{A} in their and u_{SUM} components since the second term has no contribution in these directions. Similarly, the Swarm low pair vector sums in Eq. (B.17) become exclusive functions of ΣV_{A} in their and u_{DIF} components since the second term has no contribution in these directions.
Equation (B.17) holds when the Swarm low satellite pair is aligned in an approximately east-west orientation, which is true at low to mid latitudes. At higher latitudes the two orbital planes intersect, requiring one satellite to lag behind the other in order to avoid collision. There the differences are more in the along-track direction, which can contain a significant north-south component.
Cite this article
Sabaka, T.J., Tøffner-Clausen, L. & Olsen, N. Use of the Comprehensive Inversion method for Swarm satellite data analysis. Earth Planet Sp 65, 2 (2013). https://doi.org/10.5047/eps.2013.09.007
Key words
- Swarm
- Earth’s magnetic field
- comprehensive modeling
- core
- lithosphere
- ionosphere
- magnetosphere
- electromagnetic induction