Extreme geomagnetic activities: a statistical study

Statistical distributions are investigated for magnetic storms, sudden commencements (SCs), and substorms to identify the possible amplitude of the one in 100-year and 1000-year events from a limited data set of less than 100 years. The lists of magnetic storms and SCs are provided by Kakioka Magnetic Observatory, while the lists of substorms are obtained from SuperMAG. It is found that the majority of events essentially follow the log-normal distribution, as expected for the random output of a complex system. However, it is uncertain whether large-amplitude events follow the same log-normal distributions; they may instead follow power-law distributions. Based on the statistical distributions, the probable amplitudes of the 100-year (1000-year) events can be estimated for magnetic storms, SCs, and substorms as approximately 750 nT (1100 nT), 230 nT (450 nT), and 5000 nT (6200 nT), respectively. The possible origin of the statistical distributions is also discussed by consulting other space weather phenomena such as solar flares, coronal mass ejections, and solar energetic particles.


Introduction
It is important to understand the characteristics and possible amplitudes of extreme events of substorms, sudden commencements (SCs), and magnetic storms to mitigate space weather hazards, especially those from geomagnetically induced currents (Kataoka and Ngwira 2016; Pulkkinen et al. 2017). However, it is still hard to predict the amplitude of unprecedented extreme events by physics-based simulations, and statistical analysis is necessary to estimate the amplitude of possible extreme events quantitatively.
One of the extreme geomagnetic activity events was observed in association with an episodic solar flare on 1 September 1859 (Carrington 1859), and it has been considered a measure of extreme events. A low-latitude magnetometer measured a spike of ~1600 nT during the magnetic storm on 2 September 1859 (Tsurutani et al. 2003). Siscoe et al. (2006) estimated the 1-h averaged value as − Dst = 850 nT, which is well below the theoretical upper limit of − Dst = 2500 nT (Vasyliunas 2011). From a statistical comparison among several space weather phenomena, namely magnetic storms, solar flares, and coronal mass ejections, Riley (2012) estimated a probability of 12% of having another Carrington event in the coming 10 years. Kataoka (2013) applied the same analysis to the 90-year list of magnetic storms and estimated the probability of another Carrington storm in 10 years as 4-6%.
It is possible that we will have extreme magnetic storms even larger than the Carrington storm in the future. For example, from the detailed analysis of an auroral painting from Kyoto, Japan, Kataoka and Iwahashi (2017) estimated that the amplitude of a historical magnetic storm that occurred on 17 September 1770 could be comparable to or even larger than that of the Carrington storm. As the latest example, the amplitude of a magnetic storm in May 1921 was estimated to be comparable to the Carrington event (Hapgood 2019; Love et al. 2019), suggesting that Carrington-class events may be more frequent than previously expected.
Kataoka Earth, Planets and Space (2020) 72:124
Another useful concept for defining extreme events is the so-called "one in 100-year event". Tsubouchi and Omura (2007) applied extreme value theory to estimate the amplitude of the 100-year event as − Dst = 550 ~ 750 nT. More recently, Love et al. (2015) estimated the 100-year event as − Dst = 880 nT, i.e., the Carrington event corresponds to the 100-year event. Repeated efforts have been made to estimate the 100-year event and even the 1000-year event by carefully extrapolating the tail distributions, although such results are highly uncertain, especially given the limited data set of the Dst index covering only half a century (Love 2020).
In this study, taking advantage of Japan's long-term monitoring of geomagnetic activity at Kakioka Magnetic Observatory (KAK), the compiled KAK event lists of magnetic storms and SCs are used to estimate the possible amplitudes of the 100-year and 1000-year events. Using the KAK lists, the possible power-law distributions of the amplitude of magnetic storms as well as SCs were discussed by Minamoto et al. (2015).
A similar statistical analysis can also be applied to substorms. It has long been discussed that the amplitude of substorms basically follows log-normal distributions (e.g., Liou et al. 2018). From a statistical analysis, Nakamura et al. (2015) estimated the possible maximum amplitude of substorms as AE = ~4100 nT.
The purpose of this paper is to estimate the possible amplitude of extreme events of magnetic storms, SCs, and substorms from their statistical distributions. The event lists used in this study are explained in detail in "Data set" section. The method used to fit the log-normal and power-law distributions to the data set is described in "Method of analysis" section. The results are provided in "Results" section. The possible origins of the log-normal and power-law distributions are discussed in "Discussion" section by consulting the statistical distributions of other space weather phenomena such as solar flares, coronal mass ejections (CMEs), and solar energetic particle (SEP) events. Finally, concluding remarks are given in "Conclusions" section.

Data set
The event lists of magnetic storms and SCs are available at the KAK website (https://www.kakioka-jma.go.jp/obsdata/metadata/ja/products/list/event/kak). The lists are manually updated every day by the professional operators at KAK. The occurrence properties of the identified magnetic storms and SCs are displayed in Figs. 1 and 2. Although the lists essentially include local variations, the long-term 96-year data set with unchanged identification criteria is uniquely suited to investigating the 100-year and 1000-year extreme events. Further, in the view-point of space weather countermeasure against the extreme
The standard data for measuring the amplitude of substorms is the AE index. However, the AE index becomes unreliable when large substorms occur at magnetic latitudes lower than those of the AE stations, which are located at high latitudes of 65-70 deg. Recently, the SME index was developed from globally distributed magnetometers ranging from 40 deg to 80 deg magnetic latitude (Gjerloev 2012) to provide a better alternative for evaluating the amplitude of such large substorm events. The substorm list was also provided by the SuperMAG website (http://supermag.jhuapl.edu/). The data set used in this study is the 34-year data from January 1986, when the number of SME stations became large enough (> 30 stations) to better identify extreme events. A total of ~6 × 10^4 substorm events were identified in the list for the 34-year time interval. The substorm amplitude of each event was calculated as the 15-min mean value of the SME index starting 10 min after the substorm onset (Newell and Gjerloev 2011). The basic occurrence property and the amplitude distribution of the SME index were documented by Newell and Gjerloev (2011).
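The substorm amplitude definition above (15-min mean of SME starting 10 min after onset) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `substorm_amplitude`, the use of minutes as the time unit, the half-open averaging window, and the synthetic 1-min series are all hypothetical and are not taken from the SuperMAG processing.

```python
import numpy as np

def substorm_amplitude(times_min, sme_nt, onset_min, delay_min=10.0, window_min=15.0):
    """Mean SME (nT) over [onset + delay, onset + delay + window).

    The half-open window and the minute-based time axis are assumptions
    made for this sketch.
    """
    times_min = np.asarray(times_min, dtype=float)
    sme_nt = np.asarray(sme_nt, dtype=float)
    start = onset_min + delay_min
    mask = (times_min >= start) & (times_min < start + window_min)
    if not mask.any():
        return float("nan")  # no samples fall inside the averaging window
    return float(np.nanmean(sme_nt[mask]))

# Synthetic 1-min SME series: 100 nT background, 1000 nT between minute 20 and 34.
times = np.arange(0, 60)
sme = np.where((times >= 20) & (times < 35), 1000.0, 100.0)

amp_full = substorm_amplitude(times, sme, onset_min=10)   # window covers minutes 20..34
amp_partial = substorm_amplitude(times, sme, onset_min=0)  # window covers minutes 10..24
```

With real data, missing samples would need explicit handling; here `nanmean` simply ignores NaN values inside the window.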
In this paper, some additional space weather event lists are consulted to discuss the possible origins of the statistical distributions. The event list of solar flares is available at NOAA's website (https://www.ngdc.noaa.gov/stp/solar/solarflares.html). The event list of CMEs is available at NASA's website (https://cdaw.gsfc.nasa.gov/CME_list/). The list of SEP events is available at NOAA and NASA (https://umbra.nascom.nasa.gov/SEP/).

Method of analysis
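In general, interactions among many elements or their non-linearity bring out a new characteristic from the complex network. In the complex system, large-amplitude events tend to follow a power-law distribution (e.g., Riley 2012; Gopalswamy 2018). The power-law distribution of the event amplitude $x$ is defined as
$$p(x) = A x^{-\alpha}, \tag{1}$$
where $\alpha$ denotes the spectral index and $A$ is a constant. A useful way to investigate the large-amplitude rare events is a cumulative distribution function (CDF), which is defined as
$$P(x \ge x_{\mathrm{crit}}) = \int_{x_{\mathrm{crit}}}^{\infty} p(x')\,dx' = \frac{A}{\alpha - 1}\, x_{\mathrm{crit}}^{-\alpha + 1}. \tag{2}$$
We use the maximum likelihood estimation (Riley 2012; Kataoka 2013) to obtain the slope as
$$\alpha - 1 = N \left[ \sum_{i=1}^{N} \ln \frac{x_i}{x_{\min}} \right]^{-1}, \tag{3}$$
where $x_{\min}$ is the minimum value used for the fitting and $N$ is the number of events with $x_i \ge x_{\min}$.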
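As a minimal sketch of the maximum likelihood slope estimate described above, the following Python snippet computes the spectral index from all events at or above a chosen threshold and checks it on synthetic power-law data. The function names, the synthetic sample, and the normalisation of the cumulative counts to the number of tail events at `x_min` are assumptions made for illustration, not the procedure of the cited works.

```python
import numpy as np

def powerlaw_slope_mle(amplitudes, x_min):
    """Return (alpha, n_tail) using only events with amplitude >= x_min."""
    x = np.asarray(amplitudes, dtype=float)
    tail = x[x >= x_min]
    log_sum = np.sum(np.log(tail / x_min))
    if tail.size == 0 or log_sum <= 0.0:
        raise ValueError("need events strictly above x_min")
    alpha = 1.0 + tail.size / log_sum
    return alpha, tail.size

def powerlaw_ccdf_counts(x, x_min, alpha, n_tail):
    """Expected number of events >= x, scaled so that n_tail events lie at or above x_min.

    This count normalisation is an assumption of the sketch.
    """
    return n_tail * (np.asarray(x, dtype=float) / x_min) ** (-(alpha - 1.0))

# Synthetic check: draw from p(x) ~ x^(-3) above x_min by inverse-transform sampling.
rng = np.random.default_rng(0)
true_alpha, x_min = 3.0, 200.0
samples = x_min * (1.0 - rng.random(50_000)) ** (-1.0 / (true_alpha - 1.0))

alpha_hat, n_tail = powerlaw_slope_mle(samples, x_min)
```

The estimate depends on the choice of `x_min`; in practice it is worth repeating the fit for several thresholds to see how stable the slope is.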
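Rare events or large-amplitude events always arise from the majority. The majority usually follows a log-normal distribution in a complex system because it is characterized as a random walk of multiplications rather than that of additions. The log-normal distribution is defined as
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma x} \exp\left[ -\frac{(\ln x - \mu)^2}{2\sigma^2} \right], \tag{4}$$
where $\mu$ and $\sigma$ are the mean and standard deviation of $\ln x$, so that $e^{\mu}$ is the geometric mean of $x$. The CDF of the log-normal distribution can be written as
$$P(x \ge x_{\mathrm{crit}}) = \frac{N_T}{2} \left[ 1 - \mathrm{erf}\left( \frac{\ln x_{\mathrm{crit}} - \mu}{\sqrt{2}\,\sigma} \right) \right], \tag{5}$$
where $N_T$ is the total number of events and the error function is
$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt. \tag{6}$$
A standard method of minimum variance fitting (scipy.optimize.curve_fit) is used in this study to find the best-fit CDF.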
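A minimal sketch of such a fit is given below, assuming that the fitted quantity is the number of events at or above each observed amplitude and that the fit is an unweighted least-squares fit. The helper names, the initial guess, and the synthetic sample are hypothetical; only `scipy.optimize.curve_fit` is named in the text.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def lognormal_ccdf(x, n_total, mu, sigma):
    """Expected number of events with amplitude >= x for a log-normal population."""
    return 0.5 * n_total * (1.0 - erf((np.log(x) - mu) / (np.sqrt(2.0) * sigma)))

def empirical_ccdf(amplitudes):
    """Sorted amplitudes and the number of events at or above each of them."""
    x = np.sort(np.asarray(amplitudes, dtype=float))
    counts = np.arange(x.size, 0, -1)
    return x, counts

def fit_lognormal_ccdf(amplitudes):
    """Unweighted least-squares fit of (n_total, mu, sigma) to the empirical counts."""
    x, counts = empirical_ccdf(amplitudes)
    logs = np.log(x)
    p0 = (x.size, logs.mean(), logs.std())  # start from the sample moments of ln x
    popt, _ = curve_fit(lognormal_ccdf, x, counts, p0=p0)
    return popt

# Synthetic check with known parameters (not real observatory data).
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=3.0, sigma=0.8, size=20_000)
n_fit, mu_fit, sigma_fit = fit_lognormal_ccdf(sample)
```

Because the fit is unweighted, it is dominated by the bulk of the distribution; restricting the input to events above a threshold, as done for the > 200 nT storms in this paper, changes which part of the curve drives the fit.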
In order to estimate the possible amplitudes of the 100-year or 1000-year events from a limited data set of less than 100 years, both the log-normal CDF (Eq. 5) and power-law CDF (Eq. 2) are fitted to the data set. The time-stationarity of the distributions is then assumed in order to extend the limited time interval of the data set to 100 years or 1000 years. The log-normal distribution gives relatively conservative estimates, while the power-law distribution generally gives upper-limit estimates (Riley and Love 2017).

Results
Figure 3 shows that the majority of magnetic storms roughly follows the log-normal distribution, and the large-amplitude population of − dH > 200 nT may also follow the power-law. Note that the log-normal fit above the 200 nT level can give a meaningful estimate of the extreme amplitude because those large storms were caused only by CMEs, while relatively weak magnetic storms are driven by both CMEs and corotating interaction regions (Richardson et al. 2006; Kataoka and Miyoshi 2006). The excess from the log-normal distribution at the relatively weak level can therefore be interpreted as a mixed population of magnetic storms. The log-normal fit for > 200 nT storms gives the 100-year and 1000-year events as − dH = 750 nT and 1100 nT, respectively, while the power-law fit gives the 100-year and 1000-year events as − dH = 1100 nT and 2200 nT, respectively. The largest amplitude in the list is − dH = 661 nT, which occurred on 24 March 1940. Note also that the record largest event of − dH > 700 nT on 4 July 1941 at KAK was not used in this study because of its ambiguity. The largest events of − dH > 400 nT are listed in Table 1. The 13 March 1989 storm is the largest Dst event since 1957 with the peak of final Dst index
Figure 4 shows that SC events follow a log-normal distribution within the amplitude range from 5 nT to 70 nT, while the large-amplitude SC events deviate from the log-normal distribution and follow a power-law distribution. The log-normal fit gives the 100-year and 1000-year events as dH = 160 nT and 240 nT, respectively. The power-law fit gives the 100-year and 1000-year events as dH = 230 nT and 450 nT, respectively. The largest amplitude is dH = 220 nT, which occurred on 13 November 1960. Note also that the record largest event of dH = 310 nT on 24 March 1940 is not in the KAK list because the list does not include the second SC according to the rule of KAK (Araki 2014). The largest events of dH > 100 nT are listed in Table 2.
Figure 5 shows that the substorm events essentially follow the log-normal distribution. The log-normal fit gives the 100-year and 1000-year events as 5000 nT and 6000 nT, respectively. The power-law fit gives the 100-year and 1000-year events as 6200 nT and 8500 nT, respectively. The record largest amplitude is SME = 3929 nT, which occurred on 30 October 2003 during the so-called Halloween event. The largest events of SME > 3000 nT are listed in Table 3.
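One plausible reading of the time-stationarity assumption is that the T-year event is the amplitude at which the fitted cumulative count, rescaled from the observation interval to T years, equals one event. The sketch below inverts the log-normal and power-law cumulative counts under that reading. The function names and all parameter values in the usage lines are hypothetical placeholders, not the fitted values of this study.

```python
import numpy as np
from scipy.special import erfcinv

def lognormal_return_level(n_total, mu, sigma, t_obs_years, t_return_years):
    """Amplitude exceeded on average once per t_return_years (log-normal counts).

    Solves 0.5 * n_total * erfc(z) = t_obs / t_return for z, then maps back to x.
    """
    target = t_obs_years / t_return_years  # expected count within the observed interval
    z = erfcinv(2.0 * target / n_total)
    return float(np.exp(mu + np.sqrt(2.0) * sigma * z))

def powerlaw_return_level(n_tail, x_min, alpha, t_obs_years, t_return_years):
    """Amplitude exceeded on average once per t_return_years (power-law counts).

    Assumes the counts are n_tail * (x / x_min) ** (-(alpha - 1)) over t_obs_years.
    """
    target = t_obs_years / t_return_years
    return float(x_min * (n_tail / target) ** (1.0 / (alpha - 1.0)))

# Hypothetical parameters for illustration only.
t_obs = 96.0
ln_100 = lognormal_return_level(1000, 4.0, 0.7, t_obs, 100.0)
ln_1000 = lognormal_return_level(1000, 4.0, 0.7, t_obs, 1000.0)
pl_100 = powerlaw_return_level(50, 200.0, 3.0, t_obs, 100.0)
pl_1000 = powerlaw_return_level(50, 200.0, 3.0, t_obs, 1000.0)
```

For these illustrative parameters the power-law tail gives a larger estimate than the log-normal one, and its 1000-year level grows faster relative to the 100-year level, in line with the power-law estimates being treated as upper limits.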

Solar flares, CMEs, and SEP
Consulting other space weather phenomena, the same statistical analysis is applied to solar flares, CMEs, and SEP events. The results shown in Figs. 6, 7, and 8 basically follow the analysis of Gopalswamy (2018), and the only difference is the types of distributions fitted to the cumulative distribution function. Gopalswamy (2018) used the Weibull distribution (exponential curve) and power-law distribution, while this study uses the log-normal and power-law distributions, which give somewhat larger estimates as follows. The log-normal fit shown in Fig. 6 gives the 100-year and 1000-year event sizes as X70 and X200, respectively. These values are larger than the estimates of Gopalswamy (2018), in which the 100-year and 1000-year event sizes are ~X40 and ~X100, respectively. The log-normal fit shown in Fig. 7 gives the 100-year and 1000-year speeds as 4500 km/s and 6000 km/s, while Gopalswamy (2018) gives the 100-year and 1000-year events as 3800 km/s and 4700 km/s, respectively. The log-normal fit shown in Fig. 8 gives the 100-year and 1000-year events as ~2.5 × 10^5 pfu and ~1.5 × 10^6 pfu, while the extrapolated curve of Gopalswamy (2018) gives the 100-year and 1000-year events as ~2 × 10^5 pfu and ~1 × 10^6 pfu, respectively.

Discussion
In summary, conservative amplitudes can be estimated from the log-normal distribution, while the possible excess (likely upper limit to the behavior of the tail) can be estimated from the power-law distribution. The possible amplitudes of 100-year and 1000-year events based on these distributions are summarized in Table 4. The possible origins of the statistical distributions are briefly discussed as follows.
The power-law fitting can be meaningful for rare events or large-amplitude events, and there are possible reasons for the excess from the log-normal distribution at the tail. For SCs, the main cause of the excess from the log-normal distribution is spikes (Araki 1997, 2014), which can be interpreted as the amplification of the preliminary impulse phase due to the dominance of the velocity-jump effect over the density-jump effect, based on the parameter survey of a global magnetohydrodynamic (MHD) simulation (Kubota et al. 2015). Although it has been well known that the amplitudes of SCs are proportional to the change of dynamic pressure, the density-jump effect dominates the change of dynamic pressure for weak SCs, while the velocity-jump effect dominates for large SCs. The rapidly changing large-amplitude spike appears when the shock downstream speed becomes high (Kubota et al. 2015). In addition, we must admit that a few extreme events are missing from the statistics. For example, an extreme SC event on 24 March 1940 is not included in the KAK list because it was the second SC event (Araki 2014). This particular example reminds us of the complex interaction among multiple CMEs and the ambient solar wind that enhances the geo-effectivity (e.g., Kataoka et al. 2015; Shiota and Kataoka 2016), which may additionally contribute to a further excess from the standard log-normal distribution.
For magnetic storms, unusual spikes may also cause the excess, although the physics and time-scale are totally different from those of SCs. For example, a huge spike of > 1600 nT appeared in the Carrington storm, to which additional ionospheric current or field-aligned current may have contributed locally (Akasofu and Kamide 2005). Even if the spikes do not originate from the ring current, it is important to prepare for the possible hazards. Those spikes may also contribute to an excess from the log-normal extrapolation at the tail.

Table 2 Largest SC events of dH > 100 nT at KAK since May 1924
It was clearly demonstrated that substorms of the Earth's magnetosphere essentially follow the log-normal distribution (Fig. 5). At the other end of the solar-terrestrial system, solar flares and CMEs may be regarded as plasma explosions analogous to substorms. Here, we discuss whether there are similarities among their statistical distributions.
It is interesting to note here that the majority of solar flares ranging from C-class to X10-class follows a power-law distribution rather than a log-normal distribution, although C1-class flares and the tail beyond X10 flares show deviations (Fig. 6). The smaller number of C1-class flares may reflect missed counts when the background level is comparable to C1 flares during highly active conditions. Note also that we have only a few samples of > X10 flares in the last 40 years. The power-law distribution over a wide range may be interpreted by considering a difference between the Earth's magnetosphere and sunspots' magnetosphere, i.e., the active region. Active regions have a large variation of spatial scales spanning multiple orders of magnitude, and fractal reconnection patterns naturally arise in the scale-free MHD system, in contrast to the single magnetospheric system of the Earth. In other words, if significant limitations exist in the scale-free system, a log-normal distribution may clearly appear.
Figures 7 and 8 show that CMEs and SEP events essentially follow the log-normal distributions. These distributions are different from that of solar flares, which may originate from the fact that only selected active regions can launch CMEs against the strong solar gravitation, and only selected CMEs can produce SEP events. For SEP events, however, an additional factor, the complex interplanetary propagation of energetic particles, may broaden the distribution relative to the standard log-normal distribution.
In a simplified view, geomagnetic activities are the product of the interaction between the solar wind and the magnetosphere. The solar wind parameters follow log-normal distributions (Burlaga and Lazarus 2000; Burlaga 2001), while the Earth's magnetic moment does not essentially change on short time scales. This is one of the reasons that the majority of the geomagnetic activities follows the log-normal distribution rather than the power-law distribution. In other words, if the magnetic moment changed rapidly and followed the log-normal distribution like sunspots, the occurrence of substorms might essentially follow a power-law distribution. This idea can be tested by a global MHD simulation of the Earth's magnetosphere by changing the magnetic moment.
Recently, it became possible to continuously run the global MHD simulation of the magnetosphere for more than several months, using the observed solar wind data as the input to reproduce a number of substorms. For example, the occurrence properties of the simulated substorms were statistically compared against the observed ones for the whole month of January 2015 (Haiducek et al. 2017, 2020). Future works should also pursue a similar direction with different simulation codes to examine the difference of the statistical distributions.
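The multiplicative argument used in this paper (a random walk of multiplications rather than additions yields a log-normal majority) can be illustrated with a toy numerical experiment. The sketch below multiplies independent positive factors and compares the skewness of the product with that of its logarithm. The uniform factor range, the number of factors, and the helper names are arbitrary assumptions and do not model any specific solar wind or magnetospheric process.

```python
import numpy as np

def multiplicative_samples(n_samples, n_factors, rng):
    """Product of n_factors independent positive random factors per sample.

    The uniform(0.5, 1.5) factors are a hypothetical choice for illustration.
    """
    factors = rng.uniform(0.5, 1.5, size=(n_samples, n_factors))
    return factors.prod(axis=1)

def skewness(values):
    """Sample skewness: third central moment over variance to the power 3/2."""
    v = np.asarray(values, dtype=float)
    d = v - v.mean()
    return float(np.mean(d**3) / np.mean(d**2) ** 1.5)

rng = np.random.default_rng(1)
x = multiplicative_samples(20_000, 30, rng)

skew_linear = skewness(x)       # strongly positive: long tail of large products
skew_log = skewness(np.log(x))  # close to zero: ln x is a sum of many terms
```

Since ln x is a sum of many independent terms, the central limit theorem makes it approximately normal, which is equivalent to x being approximately log-normal.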

Conclusions
It was found that the amplitudes of magnetic storms, SCs, and substorms essentially follow the log-normal distributions, with the large-amplitude events showing a possible excess from the log-normal distributions that follows the power-law distributions. This is interpreted as a natural consequence of the random output from a complex system. Based on both the log-normal and power-law distributions, the amplitudes of the 100-year (1000-year) events can be approximately estimated for magnetic storms, SCs, and substorms as 750 nT (1100 nT), 230 nT (450 nT), and 5000 nT (6200 nT), respectively.