The effect of initial and prior models on phase tensor inversion of distorted magnetotelluric data

Magnetotelluric (MT) data are often distorted by near-surface small-scale lateral heterogeneities. Inverting distorted MT data may produce artifacts or false anomalies, leading to unreliable interpretation. This problem can be avoided by inverting the phase tensor (PT), which is known to be free of galvanic distortion. However, PT inversion is known to strongly depend on the initial or prior model, because the PT itself does not contain absolute resistivity information. To obtain a reliable inversion result from a set of PT data, a proper initial or prior model is crucial. In this study, a one-dimensional mean resistivity profile estimated from the average sum-of-squared-elements impedance was chosen as an initial model, because it was proven to be less sensitive to galvanic distortion. Examples with synthetic data showed that PT inversion using such an initial and prior model is a viable approach for inverting galvanically distorted MT data. In addition, the present paper considers a situation, where the distortion is not purely galvanic. A simple synthetic study indicated that the PT is affected by inductive distortion, and thus, such inversion results should be interpreted with caution.


Introduction
Magnetotellurics (MT) is a passive electromagnetic method for imaging the Earth's subsurface electrical structure. It has been proven to be an efficient tool in geophysical studies with a wide range of applications, from engineering applications (e.g., Amatyakul et al. Rung-Arunwan et al. Earth, Planets and Space (2022) 74:51 2015; Gresse et al. 2021) to deep Earth studies (e.g., Baba et al. 2010;Boonchaisuk et al. 2013). Hardware, data processing methods, and inversion schemes have been developed for MT.
Galvanic distortion (Utada and Munekane 2000), a frequency-independent alteration in regional (undistorted) data, is a problem in MT data inversion that may result in unreliable data interpretation. It is a kind of geologic noise (Bahr 1991) caused by charge accumulation at the interfaces of near-surface small-scale heterogeneities (hereafter called distorters). Käufl et al. (2018) recently demonstrated that variations in surface topography also cause distortions. In the present study, we did not explicitly treat topographic variation but assumed that topographic effects are included in the distortion due to the near-surface heterogeneous distribution of resistivity. Significant distortion is caused by these heterogeneities, which lie in a layer that is shallower and thinner than the inductive scale length of interest and are smaller than the site spacing (spatial aliasing). Hence, distorters are deemed to be structures unresolvable by a set of MT measurements, and thus, inverting distorted data without appropriate treatment may lead to false results with artifacts.
In general, the galvanic distortion problem is mathematically under-determined (McNeice and Jones 2001). The problem of MT inversion with distorted data cannot be solved unless some assumptions or constraints based on external information are included (e.g., DeGroot-Hedlin 1995;Ogawa and Uchida 1996;Sasaki and Meju 2006). For layered-earth structures, galvanic distortion can be modeled as a simple scalar applied to the rotation-invariant impedance of the observation (e.g., Berdichevsky et al. 1989) and solved using a statistical approach. There are other approaches, where regional (undistorted) impedance can be determined by decomposing the observed impedance tensor (e.g., Groom and Bailey 1989;McNeice and Jones 2001;Bibby et al. 2005) if the regional structure can be assumed to be two-dimensional (2D).
Thus, MT studies dealing with galvanic distortion problems were initially limited to one-dimensional (1D) or 2D interpretation due not only to theoretical but also to data acquisition and computational considerations; MT observations in a dense 2D array that cover a wide area and improvements in computer performance that allow three-dimensional (3D) inversion are required. Several studies have recently demonstrated the possibility of obtaining a reliable 3D interpretation from distorted data. There are three main strategies: (1) removal of galvanic distortion from the data (Utada and Munekane 2000;Tang et al. 2018; (2) simultaneous inversion of the distorted data and the distortion operator (Avdeeva et al. 2015;Moorkamp et al. 2020);and (3) inversion of the phase tensor (PT) (Caldwell et al. 2004), which is the galvanic distortion-free data (e.g., Koyama 2009;Patro et al. 2013;Tietze et al. 2015).
In the present study, we focused on inverting the PT, which is intrinsically free of galvanic distortion. Although PT inversion is accepted and favored in several studies, it has been shown to be strongly affected by the choice of the initial or prior (also called reference) model, because the PT itself does not contain amplitude (electrical resistivity) information. A method for estimating a proper initial/prior model is thus needed for PT inversion.
There are two ways of estimating initial and/or prior models in MT inversion: one is purely based on assumptions and the other is based on other information. One example of the former is to try inversion with several models of a uniform half-space (UHS) of different resistivity values and to select the best one among them. Tietze et al. (2015) examined in detail how the choice of initial UHS model affects the result of PT inversion. In this approach, one must define the "best" inversion result among a number of trials. Most of past works select a model yielding the smallest misfit to data as the best, but other definitions are possible. This brings further complexity to the inverse problem and the selection of the best result is necessarily subjective. Thus, the present study attempted the latter approach, that is, estimating a good initial/prior model based on other information. Estimation of the initial/prior model based on geological and geophysical studies can be regarded as one example of the latter approach. Because such an approach tends to be highly subjective, we propose here a more objective approach that is based on observed data.
Mathematically, the optimal 1D profile that minimizes the lateral contrast of an arbitrary 3D resistivity structure is expected to yield a better conditioned system of equations in 3D EM forward calculation (Avdeev 2005), if it is used for the reference model. However, this optimal profile is not used for an initial/prior model, because it cannot be obtained until the 3D structure is accurately estimated. Thus, we propose here to use the regional mean 1D profile, which can be directly calculated from a set of observed impedances. Among several possible ways to estimate the regional mean 1D profile, we choose what is derived from the sum of squared elements (SSQ; Szarka and Menvielle 1997) of the observed impedance averaged over the observation array, because it is shown to be a good approximation of the optimal 1D profile (Rung-Arunwan et al. 2017). Thus, the main purpose of this paper is to examine how the choice of an initial or prior model will affect the PT inversion of distorted MT data. By testing, we further examined the performance of PT inversion with a proper initial/prior model for a case, where distortion is not purely galvanic.

MT data inversion
At each MT observation site, we collect the orthogonal components of the natural electric and magnetic field fluctuations. From the observation data, the MT impedance tensor Z , which is the transfer function between orthogonal horizontal components of the electric field E and magnetic field B , is estimated as a function of frequency.
where ω denotes the angular frequency and r s is the position vector of the sth station.
The 2 × 2 complex-valued and frequency-dependent tensor Z in Eq. (1) is called the observed impedance. In the absence of distortion, the observed impedance reflects the regional structure of interest. In contrast, if the observation data contain galvanic distortion, the observed or distorted impedance tensor Z can be mathematically written as a product of the distortion tensor C and the undistorted impedance tensor Z U : where the distortion operator C is a 2 × 2 real-valued and frequency-independent tensor. Note that, for simplicity, the dependence on the angular frequency ω and the position vector r s in Eq. (2) is implicit hereafter.
MT inversion is a mathematical procedure for finding the optimal distribution of model parameters (electrical resistivity) that explains the observed MT data by solving the minimization problem. The MT inversion scheme used in the present study is based on WSINV3DMT, a 3D MT inversion code (Siripunvaraporn et al. 2005;Siripunvaraporn and Egbert 2009), which is a data space Occam's inversion algorithm that minimizes the unconstrained objective function: where d and m are the data misfit and the model constraint, respectively. The Lagrange multiplier is a parameter for tuning the trade-off between the data misfit and the model constraint. In this code, the data parameter d is an N-dimensional vector consisting of the observed MT responses to be inverted. Each data point is inversely weighted with the data uncertainty (observation error) e . Note that the dimension N is the product of the number of observation sites, the number of frequencies, and the number of impedance elements used in the inversion. In this study, we examine various definitions of (1) E(r s ; ω) = Z(r s ; ω)B(r s ; ω), the data vector, as described below. F [m] is the forward operator that calculates the forward response from the model parameter m corresponding to each site location and frequency. The model parameter m describes the whole model space divided into M cubic cells with constant resistivity ρ i , where i = 1, ..., M , making m in Eq.
(3) an M-dimensional vector. Each element of m , m i , is defined as the resistivity measured on a logarithmic scale, m i = log 10 ρ i . The model evolves during the inversion process, which consists of a series of iterations, from the initial model. The model constraint is applied to the model at each iteration relative to the prior model, m p (see Siripunvaraporn and Egbert 2000;Siripunvaraporn et al. 2005). In theory, the prior model can be different from the initial model. The influence of the prior model was discussed by Tietze et al. (2015). In the present study, the initial and prior models are assumed to be the same except in the last section, where we examine their effects independently.
A detailed review of inversion algorithms can be found in the report of Siripunvaraporn (2012) and references therein. To find the optimal model parameter m s , we need to differentiate the objective function with respect to the model parameter. This gives us the partial derivative of the response with respect to the model parameter, ∂ m F [m] , called the model sensitivity. The exact derivations for the MT responses considered in this study are presented in the following two subsections.

Inversion of elements of the MT impedance tensor
In general, most existing inversion schemes invert the MT impedance estimated from observed electromagnetic data, which can be expressed as where the subscripts j and k denote the orthogonal directions x or y. Each element of the observed impedance tensor Z jk is a complex number that can be written either as the sum of real R jk and imaginary I jk parts or as a polar form using the magnitude |Z jk | and the phase φ jk , where the magnitude |Z jk | = R 2 jk + I 2 jk , phase φ jk = arctan(I jk /R jk ) , and i denotes the imaginary unit.
In the present study, we invert the impedance tensor using logarithmic scaling, because this is consistent with the conventional representation of the frequency dependence of the apparent resistivity and phase. We consider Z jk = |Z jk |e iφ jk , the inversion of the elements of the impedance tensor, and thus the forward solution F consists of the elements of the calculated impedance tensor measured on a logarithmic scale. From Eq. 6, the logarithmically scaled Z C jk is where the superscript C denotes the data derived from the forward calculation. The sensitivity of Z C jk with respect to the model parameter m , which is the logarithmically scaled resistivity, can be formally given in complex form as and the sensitivity of log Z jk is where the sensitivities of log |Z jk | and φ jk can be written as and Details of the derivation are given in Appendix A.
In addition, one may choose to invert either all elements of the MT impedance tensor or only the off-diagonal elements, Z xy and Z yx (e.g., Siripunvaraporn et al. 2005;Tietze and Ritter 2013;Kiyan et al. 2013). Inverting all elements of the MT impedance tensor has been recommended, because it allows for the best recovery of the underlying structure (Campanya et al. 2016). However, this result was obtained in the case without distortion. Indeed, a reasonable inversion of impedance elements naturally converges to an accurate solution if the effect of distortion (expressed by C ) is negligible or weak, specifically, Z ≈ Z U . The main concern in the present paper is how to obtain an accurate solution of the MT inversion when the observed impedances are significantly distorted.

Phase tensor inversion in arctangent scaling
The phase tensor of any MT impedance can be read as where R jk and I jk are the real and imaginary parts of the impedance tensor. See Caldwell et al. (2004) for more details of the PT definition. Here, we propose inverting the PT with arctangent scaling, as expressed below, to be consistent with a conventional plot of the impedance phase as Mathematically, the arctangent of any values bounded to The arctangent scaling is thought to help avoid the singularity of ψ jk and its derivatives near the numerical bounds. In general, the impedance phase is defined as arg(Z jk ) = arctan(I jk /R jk ) in MT practice, which is simply because the impedance is relevant to logarithmic scaling (Eq. 6). Here, we applied the same functional form to each of the PT elements. Hereafter, we denote the PT with arctangent scaling as the PTATAN. The sensitivity of the PTATAN with respect to the model parameters m is simply written as where the sensitivity of the PT, ∂ m ψ jk , is available from Patro et al. (2013) or Tietze et al. (2015).

Data error calculation
The impedance is statistically estimated from the observed time series of the electromagnetic fields (e.g., Chave and Thomson 2003) with observation error. For a fair comparison of data misfit in which the observation error is used as a weight (see Eq. 3), we calculate the synthetic observation error for each data type from the synthetic error of the impedance by error propagation. Here, we assume a constant percent error, from which the synthetic error is calculated using the Frobenius norm of the impedance ( ||Z|| f = |Z xx | 2 + |Z xy | 2 + |Z yx | 2 + |Z yy | 2 , Szarka and Menvielle 1997) as (13) ψ xx = 1 det(R) (R yy I xx − R xy I yx ), ψ xy = 1 det(R) (R yy I xy − R xy I yy ), where SPE is the synthetic percent error. The error on a logarithmic scale is calculated following the error propagation rule (Ch. 3 of Taylor 1997): The error for each element of the PT, δψ jk , is calculated following the appropriate propagation rule given by Patro et al. (2013). The error for the PTATAN can be simply derived as

Synthetic model
For the synthetic study, we used a model with 3D anomalies embedded in a layered earth, similar to that presented in the study of Campanya et al. (2016), as illustrated in Fig. 1. A comprehensive study was conducted by Campanya et al. (2016) using this synthetic model, but the effect of galvanic (18) δτ jk = 1 1 + ψ 2 jk δψ jk . An array of 40 non-uniformly distributed MT stations with a mean site spacing of 1.6 km (each station represents an area of 1.6 km × 1.6 km), covering an area of 8.0 km × 12.8 km in the x and y directions, was used. The location of the sth station (x s , y s ) is given by (Rung-Arunwan et al. 2017) where x c,s and y c,s are the coordinates of the center of each mesh area that the sth site represents, x r,s and y r,s are uniform random numbers in the range (− 0.5,+ 0.5), and s x and s y are the typical site spacing values in the x and y directions, respectively, both of which are taken to be 1.6 km in the present study (Fig. 1).
The synthetic MT responses for this setting were calculated using WSINV3DMT (Siripunvaraporn et al. 2005;Siripunvaraporn and Egbert 2009). Examples of the MT impedance and the PTATAN for the station syn-0403 over conductive anomaly L1 are shown in Figs. 2 and 3 , respectively.

Galvanic distortion of the MT impedance tensor
In the present study, we use the parameterization proposed by Groom and Bailey (1989) to simulate the galvanic distortion, in which the distortion operator C is defined in a manner analogous to deformation theory as where the scalar g is the site gain and the operators T S and A are, respectively, the twist, shear, and splitting operators, expressed as 1 e e 1 , in which t , e , and s are the twist, shear, and splitting parameters, respectively, which are real-valued. Each operator is normalized so that the norm is unity. As a consequence, the definitions of scaling (change in magnitude) and geometric distortion (change in impedance dimensionality) are clearly separated. An alternative to the GB model is the perturbed identity matrix (PIM) model (e.g., Tietze et al. 2015). The difference between the PIM and GB models of galvanic distortion is discussed in the study of Rung-Arunwan et al. (2017). The (g, t, e, s) cohorts are randomly generated following the normal distribution. Here, we test cases with three standard deviation (SD) values of the normal distribution, namely, 0.1, 0.3, and 0.5 (Fig. 4), to represent cases of weak, moderate, and strong distortion, respectively, according to Rung-Arunwan et al. (2017). Note  Fig. 6. A data set with stronger distortion shows larger and more scattered average LDIs. The error bar of the average LDI is the standard deviation of the LDI as a function of frequency at each station.

Results
In this section, we compare the inversion results obtained using the PTATAN and full and off-diagonal elements of MT impedances (called ZLOGA and ZLOGO, respectively) calculated from the synthetic model ( Fig. 1) with and without galvanic distortion. The 3D MT inversion software WSINV3DMT (Siripunvaraporn et al. 2005; Siripunvaraporn and Egbert 2009) was modified to invert PTATAN, ZLOGA, and ZLOGO data types using the formulation described in "MT data inversion" section.
In this study, an SPE of 5% was applied to the synthetic noise-free data and the errors were calculated by following the derivation in "Data error calculation" section.
We ran 10 iterations of synthetic inversion for each data type (PTATAN, ZLOGA, and ZLOGO). We show the inverted models derived from the iteration that provided the minimum data root-mean-square misfit (dRMS), which is defined as To select the optimal initial model, we compare the inversion results obtained using various initial models, namely, UHS models with resistivity values of 10, 100, and 1000 m , denoted as UHS10, UHS100, and UHS1000, respectively (Fig. 7), and the regional mean 1D resistivity profile estimated from the average ssq impedance, denoted as SSQ1D (Fig. 7). Note that the prior model is assumed to be the same as the initial model in each case.
SSQ1D is estimated by following Rung-Arunwan et al. (2016) and Rung-Arunwan et al. (2017). We first calculate the average ssq impedance from the given MT array: where S is the total number of MT stations in the array, which is 40 in the present case. The average ssq impedance is then inverted to estimate a 1D profile using Occam's inversion (Constable et al. 1987), in which the first-order roughness is penalized. The result from the undistorted data converged within a dRMS value of unity (Fig. 7), which means that model responses agree with the observed ones within the observation error. As shown in Fig. 7, the SSQ1D can be regarded as a smoothed version of the optimal mean 1D profile.
The SSQ1D models from the distorted data set were also calculated. They were essentially the same as those obtained from the undistorted data, which is consistent with the study of Rung-Arunwan et al. (2017).
To quantify the extent to which the inversion recovers the synthetic 3D model, here we define the model recovery factor using the normalized cross-correlation (NCC) proposed by Lewis (1995). The NCC is used to measure image similarity in template matching (Penney et al. 1998, for comparison, see). Here, we modify the original 2D NCC definition to suit the present problem. We define the model recovery factor as where m syn,i and m are the mean values of the synthetic and inverted model parameters, respectively.
When distortion is not included, inverting the impedance data, either ZLOGA or ZLOGO, using the UHS100 model as the initial model results in reasonable convergence (dRMS < 1) with the inverted models closer to the synthetic model ( Fig. 8) than those obtained using other UHS initial models (Campanya et al. 2016). Inversion using SSQ1D as the initial model exhibits similarly good or slightly better performance.
Next, we consider the effect of distortion. The synthetic MT responses were distorted with three sets of (g, t, e, s) cohorts, which were derived from the normal random numbers with SD values of 0.1, 0.3, and 0.5. They were inverted in the same manner as that for the undistorted data.
We found that the inversion results of distorted data show a number of artifacts depending on the degree of distortion (Figs. 9 and 10).
As clearly indicated by the model recovery factors (Fig. 11), inverting ZLOGO may produce less spurious artifacts and a more accurate model than inverting ZLOGA, because the effect of phase mixing (Jones 2011) is stronger for on-diagonal elements.
As shown in Fig. 11  3.096 (8.513 and 12.73) using SSQ1D as the initial model, respectively, when the SD of (g, t, e, s) is 0.1 (0.3 and 0.5).
A reasonable solution (dRMS < 1) is obtained from the inversion of ZLOGO if distortion is weak (SD = 0.1). However, it becomes more difficult for the inversion to converge and resolve main features of the given structure when distortion is more intense. The selection of a good initial model (UHS100 or SSQ1D in this case) is essential for good inversion results. If we employ a less accurate initial model (UHS10 or UHS1000), the results are unreliable, producing a large dRMS and a small model recovery factor.
Next, we inverted PTATAN data, which are free from galvanic distortion. As expected, we obtain inversion results comparable to those obtained by inverting undistorted impedances if a good initial model (UHS100 or SSQ1D) is employed. However, as has previously been pointed out (Patro et al. 2013;Tietze et al. 2015), inverting PTATAN data strongly depends on the choice of initial model.
In contrast to inverting the impedance element data, inverting the PTATAN data barely recovers the given structure if the initial model is too conductive (UHS10) or, especially, too resistive (UHS1000), as shown in Fig. 8.
Similar to the case of inverting the impedance data, inverting the PTATAN data using UHS100 or SSQ1D as the initial model easily converges to a solution that is close to the synthetic model. Quantitatively speaking, using UHS100 and SSQ1D as initial models gives comparable model recovery factors (Fig. 12) for all data types, although the data misfit from using SSQ1D tends to be slightly smaller. The only difference is that SSQ1D can be directly estimated from the data, while the mean value of resistivity must be assumed for UHS models. The UHS100 has no vertical variation in the resistivity, but the value is close to the vertically averaged resistivity of the synthetic model (see "Selection of initial model" section for more details). This suggests that the lack of resistivity information of the PT can be compensated for by giving an appropriate initial/prior model (see also Tietze et al. 2015). In addition, the investigation of using SSQ1D in the case, where the regional structure is not clearly 1D, as shown by Samrock et al. (2018), would be of interest. One may need to try both SSQ1D and a UHS model in such cases.
Generally speaking, the inversion of either the PT or impedance element data is affected by the choice of initial/prior model regardless of the presence of galvanic distortion.
The results of the present synthetic inversion indicate that SSQ1D is a practical choice of initial model that can be estimated from only observation data by assuming that the distortion is a random phenomenon (Berdichevsky et al. 1980). Our numerical results show that it is difficult to recover the given structure by inverting distorted impedance element data, and inverting distortion-free responses, such as the PT, with a proper initial/prior model is more promising.

Selection of initial model
From the results of synthetic inversion presented in the previous section, the 100 m half-space (UHS100) model and the SSQ1D model are regarded as appropriate for the initial/prior model to provide acceptable results.
We can simply calculate the vertical average resistivity on a base 10 logarithmic scale of the SSQ1D model as where M z is the number of model parameters in the vertical direction and σ iz and d iz are, respectively, the (25) resistivity and thickness of the izth layer. Using Eq. (25), the vertically averaged resistivity of the SSQ1D and the optimal 1D profiles ( Fig. 7) can be estimated as 2.16 and 2.25 on a base 10 logarithmic scale, respectively. The resistivity value of UHS100 model differs only by about 8.0% and 12.5% , respectively, from these values. This is the reason why the UHS100 model outperforms the UHS10 and UHS1000 models. This result suggests that using a UHS model with an arbitrary resistivity as the initial model is viable if the chosen resistivity value is close to the vertically averaged resistivity of the optimal 1D profile. However, knowing the optimal mean 1D resistivity profile beforehand is impossible in practice, and therefore, the initial guess of a 100 m UHS may not always be applicable. When the resistivity structure is unknown, one must try various initial resistivity values, depending on the area of study. In contrast, the SSQ1D profile is obtained unambiguously from observational data, and hence a trial-and-error process with various UHS models is not necessary to determine the best inversion model.

Model and data direction vectors
So far, the inversion performance has been examined in terms of the dRMS. For example, the convergence of inverting PTATAN data using SSQ1D as the initial model is confirmed by the dRMS having its minimum value in the third iteration (Fig. 14c). In the synthetic study, the model recovery factor is also regarded as a measure of inversion performance. Both the dRMS data misfit and the model recovery factor measure the distance between multidimensional vectors: the former between the vectors of observed and calculated responses and the latter between the vectors of estimated and synthetic model parameters. Here, we propose parameters that describe the direction angles of the model and data vectors.
First, we define the model correction vector angle (MCVA) α k at iteration k as (26) α k = arccos δm k · δm k,syn |δm k ||δm k,syn | , where the model correction vectors are given by where m k is the vector of model parameters (resistivity on a base 10 logarithmic scale in this case) at the kth iteration and m syn is the vector of model parameters derived from the given synthetic model. As illustrated in Fig. 13, α k is the angle between the model correction vector at the kth iteration ( δm k ) and the model vector from a model of the kth iteration to the synthetic model (δm k,syn ). The 2D illustration in Fig. 13a suggests that the parameter after correction will be closer to the synthetic parameter than the parameter before correction, if the MCVA is smaller than 90°. As the inversion evolves and approaches conversion, the dRMS is  expected to reach a minimum. If the initial guess is very good, there is little room for improving the parameters and thus the dRMS might only slightly decrease from the beginning. In such a case, the inversion takes only a few iterations to converge (e.g., in the case of inverting the PTATAN with SSQ1D as the initial model in Fig. 14c). After the minimum dRMS is reached, the model vector fluctuates around the synthetic model vector due to the presence of observation error. Consequently, the MCVA fluctuates around 90° (see Fig. 14b and c, for example), which can be used as an indication of convergence.
The MCVA can thus indicate inversion performance. However, the MCVA depends on the true 3D resistivity structure, which is available only in a synthetic study. We, therefore, define the data correction vector angle (DCVA) β k at iteration k in a form similar to that of the MCVA: where and where d k is the vector of the calculated data at the kth iteration and d obs is the vector of the observed data, which in this case is synthetic data. The physical meaning of the DCVA is similar to that of the MCVA. In the following paragraphs, the performance of inversion results described in the previous section is discussed in terms of dRMS and the MCVAs and DCVAs.
In PTATAN inversion (Fig. 14), the dRMS gradually decreases and reaches a minimum if an appropriate initial model (UHS100 or SSQ1D) is chosen. The MCVAs and DCVAs start from small values and approach 90°. When (29) β k = arccos δd k · δd k,obs |δd k ||δd k,obs | , an inappropriate initial model (UHS10 or UHS1000) is chosen, the dRMS fluctuates from the beginning, making it difficult to recognize the minimum. This behavior may suggest that the inversion becomes stagnant at a certain local minimum. Figure 15 shows a 2D representation of the model and data vectors derived from inverting PTATAN data using SSQ1D as the initial model. Note that these vectors are not the correction vectors illustrated in Fig. 13. Here, the norms and angles of the model/data vectors are measured relative to the synthetic model/data vectors. From the initial model/data vectors, the inversion optimizes the objective function so that the model/data vectors move toward the synthetic ones (blue arrows). After the inversion reaches convergence (indicated by red arrows), the solutions slightly diverge.
As shown in the previous section, if distortion is negligible, inverting PTATAN and impedance data gives comparable results (Fig. 8); that is, the behaviors of the MCVAs and DCVAs of ZLOGA and ZLOGO inversion (Figs. 16 and 17) are similar to those of PTATAN inversion. When distortion is significant, however, the dRMS for ZLOGA and ZLOGO inversion decreases rapidly in the first few iterations but then stagnates at a high level (Figs. 18 and 19), particularly in cases of moderate (SD = 0.3) and strong (SD = 0.5) distortion.
Overall, reasonable convergence occurs if the MCVAs start from small values (much smaller than 90°). In other words, this condition guarantees that the model (and data) correction vector δm k (and δd k ) points in the direction that reduces the data misfit (approaching the solution).
If the initial MCVAs are large (but still less than 90°), the initial models may be very close to the solution. Only slight improvement is required by the inversion (e.g., Fig. 14c). Conversely, rapid and stable convergence is not expected if the initial MCVAs are obtuse.

Phase tensor inversion under inductive distortion
The present study with synthetic data showed that PTA-TAN inversion with a proper initial/prior model is a reliable approach if the distortion is purely galvanic. In this section, we demonstrate the limitation of PTATAN inversion using a model with near-surface small-scale heterogeneities (Fig. 20) Fig. 21. The average LDI suggests that station syn-0505 is strongly distorted. This may be due to the station being located near a distorter with a strong resistivity contrast (Fig. 20). At this station, the PT elements in arctangent scaling deviate from those of the undistorted case for a period of up to approximately 100 s (Fig. 22). The corresponding LDI as a function of period from this station is shown in Fig. 23. The frequency-dependent feature and the large phase angles of the LDI derived from the distorted data suggest the presence of induction effects from the distorters. The distortion due to the near-surface heterogeneity cannot always be expressed by GB parameterization, which expresses the effect by a real-valued distortion tensor, C . It is applicable only if the distortion is galvanic. Galvanic distortion is realized when the induction scale length of the electromagnetic field variation is sufficiently larger than the actual scale (1000 m in this case) of the distorters (Utada and Munekane 2000). In other words, the distortion may be inductive if this condition is violated. In such a case, the distortion must be expressed by a complex-valued tensor, and, therefore, the condition that makes the PT distortion-free no longer holds. In other words, the PT with distorted impedance is no longer the same as the PT with undistorted impedance. The significant inductive distortion in the present synthetic data set also affects the accuracy of the estimation of the mean 1D profile from SSQ1D (Fig. 24). As a result, the SSQ1D model calculated from the present synthetic data set is slightly different from that without distortion (Fig. 7) at a depth range of approximately 30-700 m. This difference can be ascribed to the presence of inductive distortion. In the following discussion, we invert the data set with inductive distortion in a manner similar to that described in "Results" section.
The inversion results obtained from PTATAN, ZLOGA, and ZLOGO are shown in Fig. 25, and the inversion performance is summarized in Table 1. Compared with the case of using the GB parameterization model, inverting the PTATAN from the model with distorters gives more near-surface artifacts (Fig. 26). This is due to the inductive effect generated from the distorters, which may also alter the induction effects from deep structures by mutual coupling. Regarding the inversion of impedance data, the result of inverting ZLOGO is qualitatively better than that for ZLOGA. Overall, inverting the PTATAN data provides inversion results that are both qualitatively  et al. Earth, Planets and Space (2022) 74:51 and quantitatively better than those obtained by inverting the impedance data (ZLOGA and ZLOGO), and the process takes only a few iterations to converge. The dRMS, MCVA and DCVA from inverting the distorter model are shown in Fig. 27.The behavior of MCVAs and DCVAs from ZLOGA and ZLOGO inversion are similar to the case of parametric distortion (Figs. 18, 19). Those from PTATAN inversion are similar to MCVA and DCVA shown in Fig. 14c, which shows reliable convergence.The results of inverting PT with a linear scale are given in Appendix C for comparison.
In the absence of distortion, inverting all elements of the MT impedance tensor has the greatest capacity to recover all features of the underlying structure (Campanya et al. 2016). If galvanic distortion exists, however, inverting only the off-diagonal elements is preferable as long as the distortion is relatively weak. The results obtained in this paper suggest that inverting the PTA-TAN with an appropriate initial model is quite reliable when distortion is purely galvanic. However, this condition cannot always be expected in practice. In general, near-surface distorters (of specific scales and contrasts) generate both inductive and galvanic effects. We demonstrated that the inductive effect weakens with increasing Synthetic data (blue circles) and inversion responses (blue solid curves) corresponding to the SSQ1D model from the case with distorters. They are shown in comparison with corresponding responses for the case without distorters period but may still cause considerable inductive distortion. Unlike the galvanic effect, the inductive effect is frequency-dependent and exists in the magnetic field as well (e.g., Chave and Smith 1994;Nolasco et al. 1998). Inductive distortion in MT data is more problematic than galvanic distortion and, therefore, has rarely been dealt with, particularly for on-land observations. Careful interpretation of PT inversion results is recommended, because the inversion of a data set with inductive distortion may cause some artifacts and alteration of the regional model. Checking the LDI before performing inversion is important in practice, as it diagnoses the presence of galvanic or inductive distortion at each observation site, at least to some extent.

Importance of the prior model for phase tensor inversion
In general, the initial and prior models are identical in the inversion ( m p in Eq. 3), unless some a priori information, such as data from geological or previous studies, is known. In this paper, we showed that the SSQ1D model could be a proper estimation of the initial/prior model for the PT-based inversion. In some inversion codes, such as WSINV3DMT, the initial and prior models can be treated as independent. Therefore, in this section, we examine how the initial or prior model affects the inversion performance separately, using a PTATAN data set from a synthetic model (Fig. 1).
First, SSQ1D was chosen as the initial model, and the prior models UHS10, UHS100, and UHS1000 were tested to simulate the situation (denoted, respectively, as SSQ1D/UHS10, SSQ1D/UHS100, and SSQ1D/ UHS1000), where the prior model is less proper than the initial model. As a result, the inversion did not converge in the case of SSQ1D/UHS10. In the cases of SSQ1D/ UHS100 and SSQ1D/UHS1000, the inversion did converge and recover the main features of the input model (Fig. 28) comparable to the case using SSQ1D as the initial and prior model (Fig. 25a) qualitatively. However, these tests yielded larger dRMS misfits and smaller model recovery factors than those from the SSQ1D/SSQ1D case (Table 2) quantitatively.
Second, we kept the prior model fixed to the SSQ1D model, while UHS10, UHS100, and UHS1000 were chosen as the initial model to examine the case of using the proper prior model. The results showed that the inversion converged reasonably well qualitatively (Fig. 29) and quantitatively ( Table 2). From these two tests, we may conclude that the choice of a good prior model is more crucial. However, this possibly depends on the inversion  (Fig. 20). All used SSQ1D as the initial model, but the prior models, a UHS100 and b UHS1000, were examined. The result of SSQ1D/UHS10 is not shown, as its inversion did not converge  Fig. 1) with near-surface distorters (Fig. 20). All used SSQ1D as a prior model, but the initial models, a UHS10, b UHS100, and c UHS1000, were examined  For the case of SSQ1D/UHS10, the inversion did not converge even if the initial DCVA was approximately 67° (< 90°), while the initial MCVA in this case exceeded 90°. Thus, the MCVA is more diagnostic, but it cannot be used in real data inversion, as discussed in "Model and data direction vectors" section. In practice we can say that the assumed prior model is not appropriate, and the inversion may hardly converge, when the initial DCVA is relatively large. However, all these results were obtained only for WSINV3DMT code with a data-space Occam's inversion algorithm. The behavior and diagnostic power of the MCVA and DCVA on other inversion algorithms, e.g., conjugate gradient based or Gauss-Newton approaches is an interesting topic that should be thoroughly investigated in the future.

Conclusions
Galvanic distortion is geological noise and a kind of spatial aliasing in MT observation caused by near-surface small-scale heterogeneities and topographic variations. If distortion effects in MT data are not treated properly, the inversion of such data may produce artifacts or false anomalies. One approach for dealing with distorted MT data is to invert the PT, which is free of galvanic distortion.
However, estimation of appropriate initial and prior models is crucial for reliable PT inversion, because the PT does not contain any information on the absolute value of Earth's resistivity.
In the present study, we demonstrated that the use of the regional mean 1D profile estimated from the average ssq impedance (SSQ1D) is a promising solution to this problem. We conducted several tests with synthetic data to quantitatively show that PT inversion using SSQ1D as the initial/prior model gives reliable results provided that the distortion is galvanic. Most importantly, we confirmed that the SSQ1D model and the PT are obtained from only the observation data, and thus no other a priori information is required.
However, PT inversion results should still be interpreted with caution when a real MT data set, which may include inductive distortion, is used. Using a synthetic model with near-surface distorters, we showed that the PT responses may be altered, particularly at high frequencies, due to inductive distortion and that the inversion of such a data set may produce artifacts. Nevertheless, the inversion performance is still better than the cases of inverting MT impedance elements. We conclude that PT inversion with SSQ1D as the initial/prior model is a viable choice for dealing with distorted MT data when distortion is purely galvanic or weakly inductive. More work is needed to solve the problem of severely inductive distortion. from which we can write the sensitivity of log |Z C jk | as follows: From the definition of the MT impedance phase in terms of its real and imaginary parts: the sensitivity of the impedance phase can be written as

Appendix B. Near-surface heterogeneities by the oblong distorter model
In general, near-surface distorters can have any shape and size. Here, oblong-shaped heterogeneity was chosen, which allowed us to control the typical scale and shape of the distorters in a simple fashion. Given that the model space has lengths (in meters) of L x and L y in the horizontal directions x and y, respectively, and is divided into N x N y cubic cells, as shown in Fig. 1, the distorters are embedded in the near-surface layer with a thickness of d and a uniform background resistivity of ρ 0 . The total number of distorters was K = L x L y /M x M y , where M x and M y are the typical dimensions of the oblongs in the orthogonal directions. The typical size M of the distorters was defined as M = M x M y . The distorters were distributed uniformly over the space. The shape of each distorter was assumed to be anisotropic. The controlling parameters were the azimuth θ k and the aspect ratio (or flatness) µ k . Both were assumed to be uniformly distributed, over [0, 360) and (0, 1], respectively. The resistivity of each distorter ρ k was generated from a Gaussian random number following the normal distribution with the given SD [see Eq. (32)].
In this paper, the typical size of oblongs M was assumed to be 1000 m (smaller than the average site spacing of 1600 m). The area of each oblong was approximately 1000 × 1000 m 2 . The ratios between the width and length of each oblong were uniformly distributed over [0.2, 1.8]. The parameters used to generate the oblong distorter model (shown in Fig. 20) are summarized in Table 3.

Appendix C. Example of inverting the PT with a linear scale
We show the results of inverting the PT with a linear scale (Fig. 30) from the synthetic model with and without nearsurface distorters. The inversion metrics, data misfit, and model recovery factors are given in Table 4 in comparison with PTATAN. Overall, the performance of linear PT inversion is comparable to that of PTATAN inversion. This is probably because phase values in the present synthetic tests are in the moderate range (< 60°) to avoid the singularity problem.