- Letter
- Open Access
- Published:

# Statistical analysis of extreme auroral electrojet indices

*Earth, Planets and Space*
**volume 67**, Article number: 153 (2015)

## Abstract

Extreme auroral electrojet activities can damage electrical power grids due to large induced currents in the Earth, degrade radio communications and navigation systems due to the ionospheric disturbances and cause polar-orbiting satellite anomalies due to the enhanced auroral electron precipitation. Statistical estimation of extreme auroral electrojet activities is an important factor in space weather research. For this estimation, we utilize extreme value theory (EVT), which focuses on the statistical behavior in the tail of a distribution. As a measure of auroral electrojet activities, auroral electrojet indices AL, AU, and AE, are used, which describe the maximum current strength of the westward and eastward auroral electrojets and the sum of the two oppositely directed in the auroral latitude ionosphere, respectively. We provide statistical evidence for finite upper limits to AL and AU and estimate the annual expected number and probable intensity of their extreme events. We detect two different types of extreme AE events; therefore, application of the appropriate EVT analysis to AE is difficult.

## Findings

### Introduction

Extreme auroral electrojet activities can induce rapid variations in geomagnetic fields and ionospheric disturbances. These rapid variations in the geomagnetic fields induce large geomagnetically induced currents (GIC), which can damage high-voltage power transformers of power grids and increase steel corrosion of pipeline networks (e.g., Lanzerotti 2001). The ionospheric disturbances due to Joule heating associated with auroral electrojets can degrade radio communications and interfere with precise navigation (e.g., Ding et al. 2008). The characterization of extreme GIC events is central to quantifying the technological impacts and societal consequences of extreme space weather. Recently, Thomson et al. (2011), Viljanen et al. (2013), and Pulkkinen et al. (2012) used the European magnetic observatory network to study extreme geomagnetic activities and to assess their hazard in Europa. In the present study, we focus on auroral latitude geomagnetic activity caused by auroral electrojets. Because they cause intense GIC, the effect of auroral latitude geomagnetic activity on GIC has been intensively studied by many researchers (e.g., Beggan 2015). This auroral activity also causes polar-orbiting satellite anomalies due to the enhanced auroral electron precipitation. Auroral electrojets are horizontal electric currents that flow in the ionosphere of the auroral zone; their indices were introduced by Davis and Sugiura (1966) as a measure of global electrojet activity. These indices are derived from geomagnetic variations in the horizontal component observed at 12 observatories in a geomagnetic latitude range of 61°–70° in the Northern Hemisphere. The AU and AL indices are defined by the largest and the smallest values of the data, respectively. The symbols AU and AL represent values forming the upper and lower envelopes of the superposed plots of all data from these stations as functions of universal time (UT). The AU and AL indices are believed to describe the maximum strength of eastward and westward electrojet currents in the auroral latitude ionosphere, respectively. The difference, AU minus AL, defines the auroral electrojet (AE) index (AE = AU − AL). The AE index describes the sum of the maximum current strength of the two oppositely directed currents at two different points in local time and is commonly used as an index of global aurora activities. When a substorm is initiated, auroral electrojets are developed. Westward (eastward) electrojets lead to decreases (increases) in AL (AU), resulting in an increase in AE at the substorm onset. A commonly used method for identifying substorm onsets based on the AE index is to detect prompt decreases (increases) in AL (AE). Moreover, these extreme auroral electrojet activities represent extreme substorm activities. These substorms cause the stored energy in the magnetosphere to be released from the magnetotail to be injected into the auroral latitude ionosphere. The mechanisms of this energy injection have not been fully resolved. The study of extreme auroral geomagnetic activities allows for GIC hazard assessment in the auroral latitude and provides physical insight into the energy release and injection mechanisms in place during extreme substorms.

To estimate the occurrence of extreme auroral electrojet activities, we utilize extreme value theory (EVT) for AL, AU, and AE. EVT, a statistical theory that focuses on the behavior of the upper tail of a distribution function, has been previously applied to geomagnetic data for space weather studies. Tsubouchi and Omura (2007) evaluated long-term occurrence probabilities of extremely intense geomagnetic storm events by using the Dst index, which is a global geomagnetic index used in measuring geomagnetic storm magnitude. Thomson et al. (2011) studied extreme variations in magnetic fields by using digital 1-min data from 28 European observatories in a geomagnetic latitude range of 40.3°–73.9°. They evaluated the return level of the horizontal and declination of magnetic field variations at each observatory that might be observed once every 100 and 200 years. Other applications of EVT have been used in the context of space weather (energetic electrons) for satellite designs. O’Brien et al. (2007) estimated the extremely high fluxes of MeV electrons in the outer zone of radiation belts. In addition, Nakamura and Yoneda (2014) estimated the worst electron temperatures in the geostationary orbit.

In the present study, we begin with an analysis of AL and AU indices by outlining a statistical model of EVT. Next, we analyze the AE index, and we discuss the physical meaning of the statistical results.

### Data

The AL, AU, and AE datasets used in this study are available from the World Data Center for Geomagnetism (WDCG), Kyoto University, Japan (http://swdcwww.kugi.kyoto-u.ac.jp/index.html). The datasets consist of 1-min values of the auroral indices in 1996–2012, giving a total of 8,942,400 data points for each index. These datasets have been systematically calculated from digitally recorded data. Older datasets from 1975 to 1995 are also available from WDCG. However, because much of the older data is digitized from analog magnetogram paper records, it is not clear whether their accuracy is sufficient for an extreme region.

### AL and AU indices

For the auroral electrojet index dataset, we utilized the peak-over-threshold (POT) method of EVT (e.g., Coles 2001; Reiss and Thomas 2001) in the same manner as that used by Tsubouchi and Omura (2007). For convenience, a negative AL value (−AL) is used hereafter. Figure 1 shows the probability density function (PDF) of −AL and AU. The probabilities sum to 1. Very large −AL and AU events are located in the upper tail of the distribution function and are regarded as extreme events that rarely occur. To precisely describe the statistical behavior in such an extreme range, EVT can exclude a bias due to the bulk of the distribution. The main aim of EVT is to create a parametric model of the distribution function focused on the extreme data, which is determined by the use of the POT method.

There are subtleties in applying EVT to geophysical data such that clusters of extreme values may occur during a single substorm or groups of substorms; that is, each sample is not statistically independent. We first assume the data as independent random variables for simplicity and later discuss the influence of clustering. We assume that the dataset *X* = {*x*
_{
i
}} consists of independently and identically distributed random variables that are governed by the distribution function *F*(*x*). The POT method requires a sufficiently large threshold *μ*, but one that is smaller than the right end point sup [*x*: *F*(*x*) < 1]. *F*
^{[μ]}(*x*) is a conditional (cumulative) distribution function for the *x* > *μ* data and can be written as

A theorem in EVT shows that *F*
^{[μ]}(*x*) asymptotically approaches the generalized Pareto distribution (GPD),

where *μ*, *γ*, and *σ* represent location, shape, and scale parameters, respectively. We fit the auroral electrojet index dataset to this formula to determine these parameters. Determining the appropriate threshold *μ* is critically important in checking the validity of the POT method. Although *μ* should be sufficiently large to assure the extreme properties, larger *μ* can lead to insufficient data for conducting meaningful analysis. There is no general, systematic way of finding the best *μ* estimation. Ideally, *μ* should be the lowest value above which the exceedance distribution obeys the same GPD function. One of the methods commonly used is to examine a mean excess function *M*(*u*) = *E*[*X* − *u* | *X* > *μ*], which represents the mean value of the exceedance data subset over the threshold *u*. The GPD characteristic indicates that *M*(*u*) = (*σ* + *γu*)/(1 − *γ*) is a linear function of *u*. Then, the parameter *μ* can be identified by the minimum *u* that conforms to this linearity. Figure 2 shows the mean excess function *M*(*u*) of −AL as a function of the threshold *u*. The graph shows a linearly decreasing tendency beyond *u* of 3000. The decreasing tendency indicates a negative value for the parameter *γ*. The data subset of −AL >3000 nT can be expected to obey a distribution to that represented by the GPD formula. The number of exceedance data points at −AL >3000 nT is 84 and confirms the rare occurrence (~0.001 %) of such events. Figure 3 shows the mean excess function *M*(*u*) of AU and a linearly decreasing tendency (*γ* < 0) beyond *u* of 1500. The number of exceedance data points at AU >1500 nT is 34 and confirms the rare occurrence (~0.0004 %) of such events.

The other GPD parameters (*γ*, *σ*) are estimated by conventional maximum likelihood methods for the given threshold *μ*. The appropriate estimation is taken to maximize the likelihood function for *x*
_{
i
} > *μ*:

Numerical techniques are required to solve this. We use the R package “ismev” and “extRemes”; “R” is a free software environment for statistical computing and graphics (http://www.r-project.org/).

Table 1 summarizes the estimated GPD parameters −AL and AU. The distribution functions can be obtained by substituting the estimated parameters (*μ*, *γ*, *σ*) into the GPD formula. One important characteristic of the GPD formula is that the parameter *γ* determines the behavior in the extreme limit (e.g., O’Brien et al. 2007; Tsubouchi and Omura 2007). If *γ* < 0, (2) gives a finite upper limit, *μ* + *σ*/|*γ*|. On the contrary, if *γ* ≥ 0, the distribution is unbounded. The results indicate that *γ* < 0 for both −AL and AU, which suggests −AL and AU have upper limits that become approximately 4202 and 2063 nT, respectively.

We checked the validity of the estimated GPD parameters (*γ*, *σ*) over a range of thresholds *μ*; Figs. 4 and 5 show those when a different threshold *μ* is used for −AL and AU, respectively. The parameter *γ* is approximately constant at *μ* ≥ 3000 for −AL and at *μ* ≥ 1450 for AU. Because the parameter *σ* for −AL is approximately constant at 3000 ≤ *μ* ≤ 3300, −AL is distributed according to the GPD formula. However, the parameter *σ* becomes smaller at *μ* ≥ 3400. In such a case, the number of exceedance data points is 40, which is smaller than half the number of *μ* = 3000; thus, the statistical accuracy is decreasing. The parameter *σ* for AU shows a decreasing tendency at *μ* ≥ 1450, which means that AU is not identically distributed according to the GPD formula. Figure 6 shows the estimated upper limits of −AL and AU for different threshold *μ* values. Although the parameter *σ* shows a decreasing tendency, it has little influence on the statistical results; thus, the estimated upper limits become almost the same values at *μ* ≥ 3000 for −AL and at *μ* ≥ 1450 for AU.

The occurrence probability is calculated from the GPD formula (2) (Coles 2001) by evaluating the following probability:

The exceedance probability Pr [*X* > *x*] can be approximated by *k*/*n*, where *k* represents the number of exceedances (*X* > *x*), and *n* represents the total number of data points. The GPD formula indicates the conditional probability of *X* < *x* in the subset of *x* ≥ *μ* data, Pr [*X* < *x*|*x* > *μ*]. Therefore, Pr [*X* > *x*|*X* > *μ*] is equivalent to 1 − Pr[*X* < *x*| *x* > *μ*] = 1 − *W*
_{
μ;σ,γ
}(*x*). By these formulae, the annual expected number *Y*, that is, the number of events in which *x* > *x*
_{0} is expected per year, is written as

where *m* represents the number of the data points per year. Figure 7 shows the annual expected number of the extreme events over the given −AL and AU values. For example, the annual expected numbers of the events with −AL >4000 nT and AU >2000 nT are 0.2. and 0.3, respectively.

We calculate the probable intensity *S*
_{
T
}, which presents the return level of a given period, as

where *δ*
_{
n
} is the exceedance probability Pr [*X* > *μ*], and *N*
_{
T
} is the total number of data points in the given period. Figure 8 shows the return levels of −AL and AU at the upper and lower 95 % confidence limits. Those of −AL for once in 10, 20, and 100 years were determined as 4064 (4005 − 4208), 4105 (4047 − 4289), and 4157 (4047 − 4289) nT, respectively. The return levels of AU for once in 10, 20, and 100 years were determined as 2046, 2056, and 2062 nT, respectively, in which widths of the 95 % confidence intervals were less than 1 nT. These values are close to their upper limit values. Therefore, for industrial purposes concerned with hazard assessment in the auroral latitude, it is better to consider their upper limits.

### AE index

We show the PDF and the mean excess function *M*(*u*) of AE in Figs. 9 and 10, respectively. The mean excess function appeared to be a linearly decreasing tendency (*γ* < 0) at 2700 < *u* < 3500, a linearly increasing tendency (*γ* > 0) at 3500 < *u* < 3800, and a sharp decreasing tendency at *u* ≥ 3800. The number of the excess data points at *u* ≥ 3800 is only three and is inadequate for providing adequate statistical accuracy. The mean excess functions decrease to zero and usually show a negative tendency in the range of the last few of the largest data points. Therefore, we selected the threshold *μ* = 3500. The number of exceedance data points at AE >3500 nT is 26. The estimated parameters (*μ*, *γ*, *σ*) are summarized in Table 2. The results reveal *γ* > 0, which suggests that the AE distribution is unbounded and that AE has no upper limit. These results are inconsistent with the statistical results such that both −AL and AU have upper limits and that AE should not exceed a maximum upper limit of approximately 6265 nT, which is the sum of the AU and −AL upper limits. We discuss the physical meaning of the results in a subsequent section.

### Influence of clustering

Table 3 shows the event lists of −AL >3000 nT and AU >1500 nT. We detected numerous clusters of extreme values and analyzed their influence. We used run de-clustering to define the clusters, which assumes that the exceedance data points above a given threshold belong to the same cluster if they are separated by fewer than a given number (run length) of consecutive values below the threshold. Only the maximums within each cluster are retained. For the −AL, we set the threshold at 2000 nT and the run length at 30 min to detect 92 clusters, which is originally 881 data points over the threshold. For AU, we set the threshold at 1000 nT and the run length at 30 min to detect 25 clusters, which is originally 226 data points over the threshold. We utilized the POT model for these de-clustered datasets. We plotted the mean excess function *M*(*u*) in Fig. 11 and calculated the GPD parameters. For the de-clustered −AL data, in the case of *μ* = 3300, the number of the exceedances, *γ*, and *σ* were determined to be 5, −1.563 ± 0.001, and 1314 ± 0, respectively. Those for the de-clustered AU data in the case of *μ* = 1500 were determined to be 4, −1.618 ± 0.001, and 830 ± 0, respectively. The very small number of the clusters, which are distributed on a line over the GPD threshold *μ*, significantly decreases its statistical accuracy. However, the negative values for the parameter *γ* still indicate that the AU and −AL have upper limits of approximately 4141 and 2063 nT, respectively. Figure 12 shows the return levels of the de-clustered −AL and AU data. Those of −AL for once in 10, 20, and 100 years were determined to be 3985, 4088, and 4137 nT, respectively. The return levels of AU for once in 10, 20, and 100 years were determined to be 1932, 2023, and 2060 nT, respectively. The 95 % confidence intervals were not plotted owing to a shortage of data points. The clustering is shown to significantly decrease statistical accuracy, although it does not change the essential results.

### Discussion

We utilized the POT model of EVT for the AL, AU, and AE indices and presented statistical evidence for finite upper limits to AL and AU. The results suggest that the westward and eastward electrojet currents in the auroral latitude ionosphere have finite upper limits. These results appear to be contradictory to those reported by Thomson et al. (2011). In their research, the return levels of the horizontal magnetic field variations in the auroral latitude gradually increased with the return periods and appeared to have no upper limits. We attribute this discrepancy to differences in determining the GPD threshold *μ*. Although they set the threshold *μ* at 99.97 % for each variable at most observatories, we set the value at ~99.999 % for −AL. Because the auroral indices were derived from 12 stations and provided a substantially larger amount of information than that obtained by using the data of a single station, the number of data points in the extreme range was insufficient in the previous study.

Thomson et al. (2011) determined that the magnetic field return levels in the mid-latitude range of 53°–62° increase more than those in other latitude ranges. They suggest that the extreme auroral electrojet currents move southward from the normal auroral latitude to the mid-latitude as the active auroral oval extends and moves southward. The southward movement of the auroral electrojet currents is a possible explanation for the upper limits of −AL and AU because the induced field variations decrease with distance even if the auroral electrojet currents strengthen over mid-latitudes. However, the limited amount of further extreme data prevents us from concluding that the magnetic field variation has no upper limits. Therefore, we offer an additional possible hypothesis: the auroral electrojet currents themselves have upper limits. In text books (e.g., Baumjohann and Treumann 1997), substorms cause part of the stored energy in the magnetosphere to be released from the magnetotail to be injected into the auroral latitude ionosphere. The energy is dissipated by Joule heating due to auroral electrojet current flows in the ionosphere and the additional heating of the upper atmosphere. Although these mechanisms have not been fully resolved, it appears reasonable that the aural electrojet currents have upper limits because the stored energy in the earth’s finite-sized magnetosphere should have a physical upper limit. These analyses will provide physical insight into the energy release and injection mechanisms in place during extreme substorms.

Although the AL and AU analyses indicate that AE should have an upper limit, the AE analysis suggests that the AE distribution is unbounded and that AE has no upper limit. A scatter plot of the relationship between −AL and AU is shown in Fig. 13. Two types of extreme large AE data were detected. Type 1 data are located in the lower and rightmost area of the scatter plot, where −AL is significantly larger than AU. Table 3 shows that extreme −AL and AU events are not generated during the same substorms. Most of the extreme AE data points belong to type 1. The results of the POT analysis using only type 1 data indicate the existence of an upper limit because their extreme AE values are roughly equal to their extreme −AL values. Conversely, type 2 data are distributed in the central area of the scatter plot and have large but not extreme −AL and AU. Only few of the extreme AE data belong to type 2. However, the existence of these few data indicates that AE has no upper limit. The amount of type 2 data is insufficient for appropriate analysis and make a fatal error in the analysis of major type 1 data. This may explain why no upper limit was obtained for AE. For the largest AE in the type 2 data (2003/10/29:06:16 UT, AE = 4056 nT, −AL = 2703 nT, AU = 1353 nT), the AL and AU observations were conducted at approximately midnight and noon, respectively. The two stations are widely separated in local time. Kamide and Rostoker (2004) reported that the nature of the eastward electrojet as described by AU is quite different from that of the westward electrojet as described by AL. AE practically gives the sum of the maximum current densities of the two oppositely directed currents at two different points that are widely separated in local time. The weak dependency of the eastward and westward electrojets may be considered as a possible cause of the two types of the extreme AE data.

Finally, we discuss the older dataset of the auroral indices. We determined that the mean excess function of −AL including the older dataset before 1995, particularly that before 1992, does not show a simple negative tendency linear slope shape in the extreme range; rather, the slope shape resembled that of AE. However, there is no method available to check the validity of the extreme dataset owing to difficulties in the digitization of the analog magnetogram data and the calibration of the data. Our findings of the auroral electrojet analysis are given as preliminary and not as final results. That is, new datasets should be added, and the influence of the clusters of extreme values requires more precise evaluation. Further research should include such improvements.

## References

Baumjohann W, Treumann RA (1997) Basic space plasma physics. Imperial College Press, London

Beggan CD (2015) Sensitivity of geomagnetically induced currents to varying auroral electrojet and conductivity models. Earth, Planets and Space 67:24. doi:10.1186/s40623-014-0168-9

Coles S (2001) An introduction to statistical modeling of extreme values. Springer, London

Davis TN, Sugiura M (1966) Auroral electrojet activity index AE and its universal time variations. J Geophys Res 71:785–801. doi:10.1029/JZ071i003p00785

Ding F, Wan W, Liu L, Afraimovich EL, Voeykov SV, Perevalova NP (2008) A statistical study of large-scale traveling ionospheric disturbances observed by GPS TEC during major magnetic storms over the years 2003–2005. J Geophys Res 113:A00A01. doi:10.1029/2008JA013037

Kamide Y, Rostoker G (2004) What is the physical meaning of the AE index? Eos 85:188–192. doi:10.1029/2004EO190010

Lanzerotti LJ (2001) Space weather effects on technologies. In: Song P, Singer HJ, Siscoe GL (eds) Space weather. American Geophysical Union, Washington. doi:10.1029/GM125p0011

Nakamura M, Yoneda A (2014) A statistical analysis of the worst GEO plasma environment —extreme value analysis of the high electron temperatures—. In: Proceeding of the 13th Spacecraft Charging Technology Conference (2014 SCTC), Pasadena, CA, 23–27 June 2014

O’Brien TP, Fennell JF, Roeder JL, Reeves GD (2007) Extreme electron fluxes in the outer zone. Space Weather 5:S01001. doi:10.1029/2006SW000240

Pulkkinen A, Bernabeu E, Eichner J, Beggan C, Thomson AWP (2012) Generation of 100-year geomagnetically induced current scenarios. Space Weather 10:S04003. doi:10.1029/2011SW000750

Reiss RD, Thomas M (2001) Statistical analysis of extreme values. Birkhauser, Boston

Thomson AWP, Dawson EB, Reay SJ (2011) Quantifying extreme behavior in geomagnetic activity. Space Weather 9:S10001. doi:10.1029/2011SW000696

Tsubouchi K, Omura Y (2007) Long-term occurrence probabilities of intense geomagnetic storm events. Space Weather 5:S12003. doi:10.1029/2007SW000329

Viljanen A, Pirjola R, Prácser E, Ahmadzai S, Singh V (2013) Geomagnetically induced currents in Europe: characteristics based on a local power grid model. Space Weather 11:575–584. doi:10.1002/swe.20098

## Acknowledgements

The authors thank Dr. Masahito Nosé for providing information on the extreme auroral electrojet indices events. The authors are also grateful to the referees for their constructive comments. The authors acknowledge World Data Center for Geomagnetism (Kyoto) for the use of the auroral electrojet index database. This work was partly supported by the NICT Science Cloud at National Institute of Information and Communications Technology.

## Author information

## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

MN, AY, and MO performed the statistical data analysis and participated in the interpretation and discussion of the results. KT contributed to the interpretation of the results. MN wrote the manuscript. All authors read and approved the final manuscript.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

#### Received

#### Accepted

#### Published

#### DOI

### Keywords

- Auroral electrojet activity
- AE AU AL index
- Extreme value theory
- Space weather