Statistical analysis of ionospheric total electron content (TEC): long-term estimation of extreme TEC in Japan

Ionospheric total electron content (TEC) is one of the key parameters for users of radio-based systems, such as the Global Navigation Satellite System, high-frequency communication systems, and space-based remote sensing systems, since total ionospheric delay is proportional to TEC through the propagation path. It is important to know extreme TEC values in readiness for hazardous ionospheric conditions. The purpose of this study is to estimate extreme TEC values with occurrences of once per year, 10 years, and hundred years in Japan. In order to estimate the extreme values of TEC, a cumulative distribution function of daily TEC is derived using 22 years of TEC data from 1997 to 2018. The extreme values corresponding to once per year and 10 years are 90 and 110 TECU, respectively, in Tokyo, Japan. On the other hand, the 22-year data set is not sufficient to estimate the once-per-100-year value. Thus, we use the 62-year data set of manually scaled ionosonde data for the critical frequency of the F-layer (foF2) at Kokubunji in Tokyo. First, we study the relationship between TEC and foF2 for 22 years and investigate the slab thickness. Then the result is applied to the statistical distribution of foF2 data for 62 years. In this study, two methods are applied to estimate the extreme TEC value. In the first method, the distribution of slab thickness is artificially inflated to estimate extreme TEC values. In the second method, extreme slab thicknesses are applied to estimate extreme TEC values. The result shows that the once-per-100-year TEC is about 150–190 TECU at Tokyo. The value is also estimated to be 180–230 TECU in Kagoshima and 120–150 TECU in Hokkaido, in the southern and northern parts of Japan, respectively.


Introduction
The ionospheric condition is one of the most important space weather features for users of radio-based systems, such as navigation systems based on the Global Navigation Satellite System (GNSS), high-frequency (HF) communication systems, and space-based remote sensing systems. Radio waves propagating in the ionosphere experience a delay in group velocity and advance in phase velocity due to the electrons in the ionosphere.
The ionospheric delay is proportional to the ionospheric total electron content (TEC) along the propagation path. The easiest way to correct the ionospheric delay is to utilize broadcast ionospheric delay models based on simple empirical TEC models such as the Klobuchar (1987) and NeQuick (Hochegger et al. 2000, Radicella andLeitinger 2001) models. The TEC value is determined by many factors, such as solar activity, the season, local time, and geomagnetic activity. There is also latitudinal dependence in TEC variations. TEC variations caused by solar activity, the season, and local time may be estimated using these simple models but those caused by geomagnetic storms and other phenomena cannot be fully removed from these models. Therefore, users of radio-based systems may be affected by positive and/or negative ionospheric storms. During negative ionospheric storms, TEC is ≥ 0 TECU even if the negative storm is extremely severe. On the other hand, extreme TEC values during positive storms are not unknown and should be studied.
For the design and operation of systems that may be impacted by space weather phenomena, it is important to know the possible extent of the impact and how often such events are likely to occur. Thus, it is important to study extreme values related to various space weather phenomena. For users of trans-ionosphere radio-based systems, the extreme TEC value is a key value.
Extreme values of some space weather parameters have been studied. For example, that of the Dst index was investigated using extreme value modeling (Tsubouchi and Omura 2007). Those of the solar flare X-ray flux, speed of coronal mass ejection, Dst index, and proton energy in proton events were studied by Riley (2012) using complementary cumulative distribution functions. More recently, that of short-wave fadeout by a solar flare was examined on the basis of long-term ionosonde observation data (Tao et al. 2020).
However, extreme TEC values of once per long period of time have not yet been quantitatively estimated. Several countries have prepared documents with space weather benchmarks. The US White House published "Space Weather Phase 1 Benchmarks" in June 2018 (US White House 2018). Although it lists three factors that cause ionospheric disturbances, such as geomagnetic storms, quantitative benchmarks were not provided because the ionospheric effects of geomagnetic storms on the ionosphere largely differ from event to event and even their mechanism is not completely understood.
Another reason why extreme TEC values have not been fully studied is that only 20 years has passed since the start of fully fledged TEC observations. TEC observations started with measurements of the Faraday rotation or Doppler effect many decades ago (Bauer and Daniels 1959;Evans 1977). Since these observations were conducted by a few transmitters and receivers, it is difficult to study TEC behavior statistically. With the spread of GNSS and its ground-based receivers, the number of TEC observations dramatically increased. Thanks to the GNSS-TEC observation systems, we have learned a lot about TEC behavior during the last 20 years (for example Foster 2007;Nishioka et al. 2009;Maruyama et al. 2013). The purpose of this study is to estimate extreme values of TEC with their occurrence rates. We investigate the occurrence rates of extreme values of TEC in Japan in the short, mid-, and long term, which are once per year, 10 years, and 100 years, respectively.
To evaluate TEC corresponding to an occurrence rate of once per 100 years, 20 years of data is obviously insufficient. Furthermore, solar activity in the last 20 years has on average been moderate, although several intense geomagnetic storms occurred during solar cycle 24. Compared with GNSS-TEC observation, ionosonde observation has a much longer history. This technique was developed in the late 1920s and began to be implemented in the 1940s in order to monitor shortwave propagation (Gladden 1959). In Japan, ionosonde observation began in 1931. After going through various changes, routine ionosonde observation was started by the predecessor of National Institute of Information and Communications Technology (NICT) in 1951 using an automatic system. Ionospheric parameters derived from the long-term ionosonde observation are archived by World Data Center for the Ionosphere at NICT (http:// wdc.nict.go.jp/IONO/wdc/). Long-term ionosonde data have been used for various studies such as a study of the long-term trends of the ionosphere (Xu et al. 2004) and for the development of empirical models (Bilitza 2018;Yue et al. 2006;Maruyama 2011). As the TEC and the maximum density of the F region derived from ionosonde observation (NmF2) are known to be correlated, NmF2 can be a proxy of TEC. In this study, about 60 years of data of ionospheric parameters derived from the long-term ionosonde observation are used. Although the data period is still shorter than 100 years, we investigate statistical characteristics of extreme TEC values in order to estimate the ionospheric once-per-100-year condition.
The TEC value over Japan depends on the latitude, normally with a larger value in southern Japan. Japan is mainly located in the lower mid-latitude region with a latitudinal range of about 20°. The southern part of Japan is located at the poleward slope of the equatorial ionospheric anomaly (EIA) crest. On the other hand, the northern part is hardly affected by EIA variation and may rather be affected by phenomena originating from the polar region (Cherniak et al. 2015). Therefore, extreme TEC values should also differ among the center, southern, and northern parts of Japan.
Details of the data set used in this study and the analysis method are described in "Data set" and "Methods", respectively. In "Results", the result obtained using about 20 years of TEC data collected in Tokyo, which is almost in the center of Japan, is shown as the first step. Then long-term ionosonde data are analyzed. On the basis of the result, extreme TEC values with probabilities of once per year, 10 years, and 100 years are estimated for Tokyo. In the last part of the section, the extreme TEC values in southern and northern Japan are also estimated. In "Discussion", the results are discussed in comparison with those of case studies of geomagnetic storms in previous papers. The last section provides the summary of this study.

Data set
In this study, we use TEC data derived from the nationwide GNSS network over Japan, which is called the GNSS Earth Observation Network System (GEONET) and operated by the Geospatial Information Authority of Japan, and ionosonde observation data collected over Tokyo.
GNSS-TEC data derived from GEONET have been archived by NICT since 1997. Using the network data, the slant TEC along the line of sight between the receiver and the satellite was derived from pseudo-range and carrier-phase measurements by dual-frequency GPS receivers (Saito et al. 1998). The instrumental bias of the TEC associated with the inter-frequency bias of the satellite and receiver was obtained by a technique proposed by Otsuka et al. (2002), in which the daily bias values are derived by assuming that hourly averaged TEC values are uniform within the field of view of a given GNSS receiver. The slant TEC is converted to the vertical TEC after removing the instrumental bias. The TEC data from small satellite elevation angles, which is smaller than 35° is neglected to reduce cycle slips and errors due to conversion from slant to vertical TEC. The median value of the vertical TEC whose ionospheric pierce point is located within 100 km from a given location over 1 h is derived as an hourly TEC. The largest hourly TEC in a given day is noted as the daily TEC in this paper. The daily TECs of 22 years from 1997 to 2018 are used in this study and studied in Sect. 4.1.
Ionospheric conditions have been monitored for about 70 years by NICT using ionosondes in Kokubunji,Tokyo (36.7°N,139.5°E,26.8°N in Mag.Lat) and other stations. Ionospheric parameters have been manually scaled from ionograms. In order to ensure uniform quality of data, the scalers have discussed and established scaling rules, although automatic scaling tools have been developed in recent years. Thanks to the substantial efforts of the scalers, ionospheric parameters from the 1950s to the present are now available. In this study, the manually scaled critical frequency of the F-layer (foF2), which corresponds to the peak density of the F-layer, is used. In order to study foF2 with the daily TEC, we refer to the maximum foF2 in a given day as the daily foF2. In Sect. 4.2, a 22-year data set of daily foF2 values from 1997 to 2018 is used. In the same section, a 62-year data set of daily foF2 values from 1957 to 2018 is also used.

Methods
In order to find extreme values of TEC corresponding to an occurrence frequency of once every certain number of years, the cumulative distribution function (CDF) of daily TEC occurrence is investigated (Riley 2012;Kataoka 2020). The CDF of the daily TEC occurrence is a distribution function of daily TEC values that are greater than or equal to a critical TEC. One of the advantages of investigating the CDF instead of a simple occurrence probability is that it is easy to find TEC values with an occurrence frequency of once per long period (Riley 2012). In other words, the CDF of the daily TEC occurrence provides an occurrence probability of a daily TEC that is greater than or equal to a certain value, while a normal distribution provides the occurrence probability of a daily TEC between two values.
Although a data set of TEC values over 22 years may be sufficient to investigate TEC values with occurrence frequency of once per year and 10 years, it would not be sufficient to investigate the TEC value with an occurrence frequency of once per 100 years.
To compensate the insufficient number of TEC data, we utilized a 62-year data set of foF2 values in order to calculate NmF2 and study a property of the relationship between TEC and NmF2. The relationship between TEC and foF2 is given by the following equation: where S is the slab thickness. In this study, characteristics of slab thickness are studied using the 22-year data set of TEC and foF2 values. By utilizing the characteristics of the slab thickness and the 62 years of foF2 data, we deduce CDFs of TEC values over 62 years, from which we estimate the TEC value corresponding to occurrence frequency of once per 100 years.
Even if the 62-year data are utilized to estimate the TEC values with occurrence frequency of once per hundred years, the amount of the data is still not enough. The occurrence rate of a single event in 62-year data set is 1/((365.25 × 62)) = 0.0044%. This occurrence rate is larger than that of once-in-100-year event, 1/ ((365.25 × 100)) = 0.003%. In order to compensate the insufficient number data set, the distribution was extrapolated in two ways in order to deduce CDFs of TEC values over 62 years in this study. In the former method, which we call Method I, the following four steps are taken to derive the CDF using the 62-year data set of NmF2. For the first step, probability function of slab thickness, P s , is presumed with the 22-year slab thickness data set. The presumed P s is (1) TEC = S × NmF2, used to calculate a probability function of TEC for a given i-th day, P i T , with NmF2 observed on the day, NmF2 i . In the third step, P i T is converted to CDF i , which is a CDF of TEC for i-th day. Finally, CDF i is derived with all NmF2 values in 62 years and integrated to deduce CDF of TEC values over 62 years.
Here, in the step one, we assume that slab thickness follows a normal distribution, e.g., S ∼ N (µ S , σ 2 S ) where µ s and σ S are mean and standard deviation of slab thickness based on 22 years. The probability function of P s for slab thickness of s [km] is described as follows: One of the problems in estimating extreme TEC value of once-in-100-years is that the number of TEC data, or slab thickness data is insufficient compare to a 100 years. Therefore, the normal distribution, N (µ S , σ 2 S ) , cannot reproduce extreme slab thickness. In order to compensate the lack of extreme values with N (µ S , σ 2 S ) , we introduce an inflated sigma, which is described as σ s , to model the slab thickness. Inflation factor, σ s σ s , is determined by comparing TEC values of once-in-10-years deduced with various inflation factors with that based on 22-year TEC data set.
As the step 2, a probability function of TEC for i-th day, P i T is calculated on the assumption that NmF2 and slab thickness are independent parameters. The TEC i follows a normal distribution with mean and standard deviation of NmF2 i × µ S and NmF2 i × σ S , respectively. That is, is expressed as the following equation: Since TEC i follows normal distribution, CDF of TEC i , CDF i , is given using error function, erf, In the final step, CDF i is calculated for each day in the 62 years and added to obtain CDF, that is, where N is the total number of the day in 62 years.
In the latter method, which we call Method II, CDF TEC of extreme case was deduced by multiplying the extreme slab thickness, which could occur once in 10 and 100 years, by the 62-year data set of daily foF2. By assuming that the slab thickness has a normal distribution with a mean µ and a standard deviation σ , the value corresponding to occurrence of once per 10 and 100 years, or 0.03% and 0.003%, are µ + 3σ and µ + 4.2σ , respectively. CDF TEC for the 62 years can be deduced by multiplying the CDF of NmF2 for the 62 years by the extreme values of slab thickness. Since the slab thickness is known to have seasonal dependence, a single value of the slab thickness is not appropriate for estimating TEC from foF2. In order to estimate P ST in Method I, data set of slab thickness is divided into four seasons, that is, from February to April, from May to July, from August to October, from November to January. Four seasonal P ST are used to estimated CDF TEC in Eqs. (3), (4) and (5). Three-month data are used to derive P ST in Method I to obtain sufficient number of data for the inflation. On the other hand, monthly data is used to calculate the mean μ and the standard deviation σ in Method II. Figure 1 shows the CDF of the daily TEC occurrence at Tokyo. The occurrence rate is shown on the left axis. The occurrence rate on the left-hand axis of the ordinate is days per 100 years, which is obtained by dividing the occurrence days by the total number of days in 22 years and then multiplying those in 100 years. Therefore, an occurrence rate of one day means an occurrence rate of once per 100 years. The occurrence rate is converted to the occurrence percentage and shown on the right-hand axis of the ordinate. An occurrence probability of 0.3%, which corresponds to a frequency of once per year, is shown as a solid horizontal line. It is found that the daily TEC can reach about 90 TECU with a frequency of once per year. The occurrence probabilities of once per 10 years and once per 100 years correspond to 0.03% and 0.003% and are shown with dotted and dashed horizontal lines, respectively. It is found that a daily TEC of more than 100 TECU occurs with a frequency of once per 10 years. The TEC values with frequencies of once per year and once per 10 years are summarized in Table 1.

Statistical analysis of TEC over 22 years
On the other hand, the daily once-per-100-year TEC value cannot be appropriately estimated from Fig. 1 because the distribution is based on only 22 years of data.
The colors in the histograms in Fig. 1 represent the classifications based on solar and geomagnetic activity: red, pink, blue, and light blue represent days of high solar activity and high geomagnetic activity (HSHG), high solar activity and low geomagnetic activity (HSLG), low solar activity and high geomagnetic activity (LSHG), and low solar activity and low geomagnetic activity (LSLG), respectively. Solar and geomagnetic activities are, respectively, defined on the basis of the solar sunspot number (SSN) and disturbance stormtime (DST) index, which are provided as sunspot data from the World Data Center SILSO, Royal Observatory of Belgium, Brussels (http://sidc.be/silso /datafi les) and WDC for Geomagnetism, Kyoto (http://wdc.kugi.kyoto -u.ac.jp/dstdi r/index .html), respectively. HS (LS) days are defined as days for which the average daily SSN for the previous 27 days is ≥ (<) 50. HG (LG) days are defined as days for which the average daily DST of the current day and the previous day is ≤ (>) − 50 nT. It can be seen that a TEC of 60 TECU or larger is most likely to be observed when either the solar activity or the geomagnetic activity is high, while those exceeding 100 TECU are observed only when the solar activity is high.

Statistical analysis of foF2 over 22 and 62 years
Here, CDFs of the daily foF2 occurrence are studied in order to estimate once-per-100-year values. First, a CDF of the daily foF2 occurrence over the same period as in Fig. 1, from 1997 to 2018, were examined in comparison with that of the 22 years of TEC data in Fig. 1. Figure 2 shows a CDF of the daily foF2 occurrence, that is, the distribution of the daily foF2 that is greater than or equal to some critical foF2. As in Fig. 1, the occurrence rate per 100 years is shown on the left-hand axis of the ordinate and the occurrence rate in percentage is shown on the right axis. The occurrence frequencies of once per year, 10 years, and 100 years of 0.3%, 0.03%, and 0.003% are shown as solid, dotted, and dashed horizontal lines, respectively. The colors in Fig. 2 represent solar and geomagnetic activities similarly to in Fig. 1; red, pink, blue, and light blue represent days of HSHG, HSLG, LSHG, and LSLG, respectively. The largest foF2 was about 17.5 MHz. It is found that foF2 was higher than 15 MHz for only HSHG and HSLG days, which is similar to the result in Fig. 1.
The same analysis is carried out for the 62-year foF2 data set from 1957 to 2018. The result is shown in Fig. 3 in the same format as Fig. 2. The maximum observed foF2 is about 18.7 MHz, which is slightly larger than that obtained from the 22-year data set in Fig. 2. The maximum foF2 18.7 MHz was observed during geomagnetic storm in November 1960 when DST index reached − 333 nT (Cliver and Svalgaard 2004). Moreover, the occurrence rate of daily foF2 values larger than 16.8 MHz in Fig. 3, which corresponds to the rightmost bar in the histogram, is about twice of that in Fig. 2.

Estimation of extreme TEC from slab thickness using Method I
As the characteristics of the CDFs of the daily foF2 occurrence are different for the 22-and 62-year data sets, the once-per-100-year TEC value cannot be estimated by extrapolating the CDF of the daily TEC occurrence obtained from the 22-year data set. In this sub-section, we estimate the once-per-100-year TEC value by using the 62-year foF2 data set with Method I.
The value of foF2 is proportional to the square root of the maximum ionospheric density, NmF2. NmF2 is given by the following equation. Figure 4 shows the correlation between daily TEC and NmF2 derived from the daily foF2. All data collected over 22 years are shown in this scatter plot. It can be seen that TEC and NmF2 have a strong correlation. The red line is the least-squares linear approximation of all data. As shown in Eq. (1), the slope, which is about 250 km, is (6) NmF2 m −3 = 1.24 × 10 10 × foF2 2 [MHz].

Fig. 1
Cumulative distribution function (CDF) of daily TEC occurrence at Tokyo from 1997 to 2018. The occurrence rate, which is the number of days per 100 years, and the occurrence percentage are shown on the left and right axes, respectively. Red, pink, blue, and light blue represent days of HSHG, HSLG, LSHG, and LSLG, respectively. The solid, dotted, and dashed horizontal lines represent occurrence rates of 0.3%, 0.03%, and 0.003%, which correspond to occurrence frequencies of once per year, 10 years, and 100 years, respectively equivalent to the thickness of the ionosphere that gives a TEC value with a density of NmF2. This parameter, which is called the ionospheric slab thickness, is used to deduce TEC from NmF2 because of the strong correlation between daily TEC and daily foF2.
In order to derive CDFs of TEC values over 62 years with Method I, distribution of slab thickness is examined. Figure 5 shows distribution of slab thickness from 1997 to 2018. Mean and standard deviation of the distribution is 215 km and 52 km, respectively. The red curve represents the normal distribution with the mean and the standard deviation. The distribution in 3 months from May to July is shown in Fig. 6. The mean and standard deviation of the distribution is 273 km and 45 km. The mean is larger than that in Fig. 5, which is one of the seasonal effects. The red curve represents a normal distribution with the mean and the standard deviation. The curve roughly fits the slab thicknesses but does not cover large values such as more than 400 km. Mean values and standard deviations of other seasons, that is, from February to April, from August to October, and November to January, are listed in Table 2. Normal distributions with the mean and the standard deviations for each season listed in Table 2 are applied for P ST in Method I. The result of CDF TEC is shown with black histograms in Fig. 7. TEC of once per 10 and 100 years, that is, TEC of 0.03% and 0.003% was 35 TECU and 45 TECU, respectively, which are smaller than those can be read in Fig. 1. This is because of the assumption of the normal distribution, which cannot cover the large slab thickness. In order to cover them, normal distributions are inflated using an inflation factor. The blue curves in Fig. 6 show the inflated normal distribution with inflation factors. The dashed and solid lines are derived with inflation factors of 2.0 and 3.8, respectively. The inflated normal distribution with an inflation factor of 3.8 overbounds the large slab thickness around 480 km while that of 2.0 is too small to cover the large slab thicknesses. In order to optimize the inflation factor,

Table 1 Estimated TEC of once per one, 10, and 100 years in Tokyo, Kagoshima, and Hokkaido
The unit is in TECU  once-per-10-year TEC value is calculated using various inflation factors to obtain CDF TEC . Figure 8 shows the once per 10-year TEC value as a function of the inflation factor. It increases as the inflation factor increases and exceeds 110 TECU, which is the once-per-10-year TEC value based on the 22-year TEC data set, when the inflation factor changes from 3.7 to 3.8. In this paper, therefore, the inflation factor of 3.8 is adopted based on the 22-year TEC data set. Using the inflated normal distribution with the inflation factor of 3.8, CDF of TEC are derived as blue histogram in Fig. 7. TEC of once per 10 and 100 years, that is, TEC of 0.03% and 0.003% was 110 TECU and 150 TECU, respectively.

Estimation of extreme TEC from slab thickness using Method II
In this sub-section, we estimate the once-per-100year TEC value by using the 62-year foF2 data set with Method II. Here, we calculated the mean and the standard deviation of slab thickness for each month. Figure 9 shows the slab thickness against the day of the year for 22 years from 1997 to 2018. Data are sparser from June to August compared with other months, because foF2 values of 10 cannot be obtained owing to masking by the sporadic E-layer, which often appears in these months. The red polyline is the monthly mean of the slab thickness. The monthly mean slab thickness is about 180 km in winter and 280 km in summer. Blue and red vertical lines indicate the ranges of ± 3σ and ± 4.2σ. These ranges are equivalent to probabilities of once per 10 and 100 years, respectively, when the estimated slab thickness is assumed to have a normal distribution, that is, occurrence probability of the values larger than average + 3σ and + 4.2σ are 0.13% and 0.001% Here, we estimate the daily TEC from the daily NmF2 data, assuming the slab thickness has only seasonal   dependence. Figure 10 shows the CDFs of the estimated daily TEC occurrence obtained using the monthly mean slab thickness and observed NmF2 from 1957 to 2018. The black histograms are distributions of the daily TEC estimated with the monthly mean slab thickness, which is shown with a red polyline in Fig. 9. The number of days per 100 years and the occurrence rate are shown on the left-and right-hand axes of the ordinate, respectively. The black solid, dotted, and dashed horizontal lines correspond to 0.3% (once a year), 0.03% (once every 10 years), and 0.003% (once every 100 years), respectively. The blue histograms in Fig. 10 are the distribution of TEC estimated with the average + 3σ slab thickness (upper value of the blue vertical line in Fig. 9), which corresponds to a slab thickness with a frequency of once per 10 years. According to this histogram, the TEC with a frequency of once per 10 years is 130 TECU or more. Furthermore, the red histograms in Fig. 10 are derived from the average + 4.2σ slab thickness (upper limit of the red vertical line in Fig. 3). This result indicates that TEC values of more than 190 TECU can be observed with a frequency of once per 100 years. These TEC values are summarized in Table 1. Figures 1, 2, 3, 4, 5, 6, 7, 8 and 9 are results based on data obtained in Tokyo. Here we estimate extreme TEC values for southern and northern Japan because TEC behavior is expected to be different at different magnetic latitudes. Figure 11 shows the correlations of daily TEC between Tokyo and Kagoshima (31. 2°N,130.6°E,21.7°N in Mag. Lat) and between Tokyo and Hokkaido (45.2°N, 141.8°E 36.4°N in Mag. Lat) for 22 years from 1997 to 2018. Basically, the TEC in Tokyo is smaller than that in Kagoshima and larger than that in Hokkaido. The red line represents the linear approximation of these data and reveals that the TECs in Kagoshima and Hokkaido are, on average, 1.2 and 0.8 times that in Tokyo, respectively. From these results, the TEC values with probabilities of once per year, 10 years, and 100 years are estimated as 110, [90][91][92][93][94][95][96][97][98][99][100][101][102][103][104][105], respectively, in Kagoshima (Hokkaido) as round to the nearest multiple of five. The numbers are summarized in the second and third rows in Table 1.

Discussion
It is important to estimate the occurrence rates of extreme values of TEC in Japan in the short, mid-, and long term, which are once per year, 10 years, and 100   years, respectively, in readiness for hazardous ionospheric conditions. "Space Weather Phase 1 Benchmarks", which was published by the USA White House in June 2018, lists three factors that cause ionospheric disturbances: solar flares, proton events, and geomagnetic storms. However, quantitative benchmarks are difficult to derive because the effects of geomagnetic storms largely differ from event to event. Furthermore, the mechanism of ionospheric storms is not yet completely understood. Although the results in this paper are limited to the region around Japan, they are a starting point for evaluating benchmarks in other regions.
One of the challenges is to estimate extreme TEC value such as once per a 100 year with a limited data set. In this study, we have 22-year TEC data set and 62-year foF2 data set. Method I assumes the probability distribution of slab thickness as a normal distribution. First, raw σ is used to model the slab thickness with the 22-year data set. The resulting CDF which is shown with black histograms is Fig. 7 underestimates the observed CDFs in Fig. 1. The TEC values of once-per-year, for example, were about 90 TECU in Fig. 1 while that of black histograms in Fig. 7 was < 30 TECU. One of the reasons that the values underestimate extreme TEC values is that comes that the normal distribution cannot reproduce large value of slab thickness such as over 400 km. In order to cover the large slab thickness, the slab thickness distribution was approximated by inflated normal distributions. The inflation factor is a key parameter which affects the extreme TEC values. The solid and dashed lines in Fig. 8 show TEC values which would occur once per 10 and 100 years, respectively, as a function of the inflation factor. If the inflation factor is chosen as 5, the onceper-10-year TEC value is more than 150 TECU, which is comparable to the once-per-100-year TEC value for the inflation factor of 3.8. Inflation factor largely affects the extreme TEC value in Method I while this study adopts the inflation factor of 3.8 based on 22-year TEC data set.
In Fig. 6, the inflated normal distribution with an inflation factor of 3.8 overbounds the large slab thickness around 480 km while that of 2.0 does not. A discussion should be done for the assumption of normal distribution for the slab thickness. As shown in Figs. 5 and 6, the distribution of slab thickness has long tail, the tail cannot be reproduced by normal distributions even if the σ is inflated. Alternative approach would be to model the distribution in a different way. The distribution in Figs. 5 and 6 could be fitted by a sum of two normal functions which centers the core part and the tail parts, so-called double Gaussian, instead of multiplying an inflation factor to the standard deviation, which is left for future studies.
Comparing Method I and Method II, Method II is more conservative than Method I because Method II takes out the extreme slab thickness multiplies it with TEC values. Method I has an advantage in order to grasp the overall distribution while extreme large values are not reproduced, which may depends on how to determine the inflation factor. Method II has an advantage in estimating extreme values while overall distribution is not very accurate. Fig. 9 Slab thickness against day of year: the red polyline is the monthly mean value of slab thickness. Blue and red vertical bars represent ± 3σ and ± 4.2σ, respectively Fig. 10 CDFs of the daily TEC occurrence estimated with Method II. The occurrence rate, which is the number of days per 100 years, and the occurrence rate in percentage are shown on the left-and right-hand axes of the ordinate, respectively. The black histograms are derived from the average slab thickness shown in Fig. 9. The blue and red histograms are derived with slab thicknesses of average + 3σ and + 4.2σ, which are shown with blue and red vertical lines, respectively. The solid, dotted, and dashed horizontal lines represent occurrence rates of 0.3%, 0.03%, and 0.003%, which correspond to frequencies of once per year, 10 years, and 100 years, respectively In this study, we estimated extreme TEC values by assuming that the slab thickness has only seasonal dependence. The seasonal dependence of the slab thickness shown in Fig. 9 is consistent with the results of previous studies (Jin et al. 2007;Huang et al. 2016). Another factor determining the slab thickness is the dynamics and/or composition change caused by geomagnetic disturbances. According to Stankov and Warnant (2009), the slab thickness is systemically enhanced during geomagnetic disturbances for both positive and negative ionospheric storms. Extreme values of TEC estimated by blue or red histograms in Fig. 10 would be recorded during geomagnetic storm conditions. Extreme positive storms are thought to be caused by a geomagnetic disturbance that induces prompt penetration of the electric field (Tsurutani et al. 2004). The largest reported TEC is about 330 TECU to our knowledge, which was recorded by a GPS receiver onboard the CHAMP satellite at an altitude of about 400 km during the October 2003 Halloween storm (Mannucci et al. 2005). Magnetic latitude where the 330 TECU was observed was about 25°S. Although the observation was in the south hemisphere, the magnetic latitude is similar to that of Tokyo (26.8°N). The TEC value of 330 TECU reported in Mannucci et al. (2005) is much higher than our result of 190 TECU, which is conservatively estimated in Method II.
Before discussing possible reasons for the discrepancy between our result and that reported in Mannucci (2005), we have to discuss estimation accuracy of the instrumental bias to derive absolute value of TEC. In estimating instrumental bias, we assume that the hourly average of vertical TEC is uniform within an area covered by a receiver; this area approximately corresponds to a surrounding of 1000 km (Otsuka et al. 2002). It is reported that the technique can derive absolute values of TEC with the accuracy of ∼3 TECU in the daytime and ∼1 TECU in the nighttime, respectively, during quiet and moderated disturbed day. It is also reported the characteristics of temporal and spatial distribution of absolute TEC are consistent with the previous studies during a geomagnetic storm day. Nonetheless, during the geomagnetic disturbed condition, TEC tends to have spatial gradient and large-scale traveling ionospheric disturbances (LSTIDs) could appear. The horizontal scale of LSTIDs is more than 2000 km, which is larger than the assumption of TEC uniformity. Therefore, there is a possibility that the assumption of the TEC uniformity tends to be invalid during severe geomagnetic storm days. Zhang et al. (2009) investigated influences of geomagnetic storms on the estimation of GPS instrumental biases. The bias errors are in order of a few TECU while the errors are different among geomagnetic storms and its duration. Since the order of the errors in estimating instrumental bias is < 10 TEC, we speculate that the error would not reverse the difference between our result (190 TECU) and that in Manucci et al. (330 TECU) while further quantitative investigation would be necessary in order to clarify the estimation errors.
Here we discuss possible reasons for the difference between these values. One possibility is differences in observation opportunities. The characteristics of ionospheric storms are not always similar among geomagnetic storms, with their magnitude varying greatly from event to event. Mannucci et al. (2008) analyzed four intense geomagnetic storms in 2003 including the event for which the extreme value of 330 TECU was observed by the CHAMP satellite. A dramatic increase in TEC was observed in only one event. The observed TEC on the other three storm days was around 100 TECU or less. If the event-to-event difference is too large, 70 years of data might not be enough to estimate TEC values for onceper-100-year or once-per-1000-year events.
Another possibility accounting for the difference between the extreme value of 330 TECU in Mannucci et al. (2003) and our result is the longitude dependence of the ionospheric influence on geomagnetic storms. Immel and Mannucci (2013) analyzed global TEC maps during geomagnetic storms over 7 years. Their analysis confirmed that on average the American sector exhibits larger TEC enhancements regardless of the onset UT. Greer et al. (2017) used the Global Ionosphere-Thermosphere Model to carry out an experiment on a geomagnetic storm by modifying the storm arrival UT. The result indicated that the strongest enhancements of TEC during storms are found in the American and Pacific longitude sectors. They suggested that the longitudinal dependences were due to Earth's asymmetrical geomagnetic topology in the American and Pacific sectors. The difference between our results and that of Mannucci et al. (2003) may originate from the difference between the Japanese and American/Pacific sectors. In order to clarify whether the longitudinal dependence results in the large difference between the results of this study and that of Mannucci et al. (2008), long-term observational data in addition to data over oceans are necessary.
This study focuses on positive ionospheric storms, which may significantly affect GNSS users. On the other hand, the effect of negative storms on space weather users may also be significant, particularly for HF communicators, who may experience blackouts during negative ionospheric storms. In addition, parameters other than TEC, such as maximum usable frequency (MUF) and scintillation indices, should be studied for extreme cases.

Summary
In this study, extreme values of TEC with frequencies of once per year, 10 years, and 100 years were investigated. The results are summarized as follows: The CDF of daily TEC values was studied for a 22-year data set observed in Tokyo in order to estimate TECs with frequencies of once per year and 10 years. The obtained once-per-year and once-per-10-year TECs were 90 and 110 TECU, respectively.
• In order to estimate the once-per-100-year TEC value, 62 years of manually scaled ionosonde data were used to augment the insufficient observation period of TEC. The slab thickness was assumed to have only seasonal variation and was used to estimate TEC from 62 years of foF2 data. In this study, two methods were tested in order to compensate the insufficient number of data. • In Method I, the slab thickness distribution is modeled with artificially inflated normal distributions. The inflation factor was determined by calibrating the once-in-10-year TEC value deduced with various inflation factors with that based on 22-year TEC data set. The once-per-10-year TEC was result as 150 TECU. • In Method II, extreme slab thickness is applied to deduce the extreme TEC values. Slab thickness of the average + 3σ and + 4.2σ, which correspond to onceper-10-years and once-per-100-years, respectively, to deduce the extreme values of TEC. The result was 190 TECU. In Method II, once-per-10-year TEC is also derived and was 130 TECU. • Extreme TEC values were also studied for Kagoshima and Hokkaido in southern and northern Japan, respectively. In Kagoshima, those which occur once per one, 10, and a 100 years are 110 TECU, 130-155 TECU, and 180-230 TECU, respectively. In Hokkaido, they are 70 TECU, 90-105 TECU, and 120-150 TECU, respectively.