Findings on celestial pole offsets predictions in the second earth orientation parameters prediction comparison campaign (2nd EOP PCC)

In 2021, the International Earth Rotation and Reference Systems Service (IERS) established a working group tasked with conducting the Second Earth Orientation Parameters Prediction Comparison Campaign (2nd EOP PCC) to assess the current accuracy of EOP forecasts. From September 2021 to December 2022, EOP predictions submitted by participants from various institutes worldwide were systematically collected and evaluated. This article summarizes the campaign’s outcomes, concentrating on the forecasts of the dX, dY, and dψ, dε components of celestial pole offsets (CPO). After detailing the campaign participants and the methodologies employed, we conduct an in-depth analysis of the collected forecasts. We examine the discrepancies between observed and predicted CPO values and analyze their statistical characteristics such as mean, standard deviation, and range. To evaluate CPO forecasts, we computed the mean absolute error (MAE) using the IERS EOP 14 C04 solution as the reference dataset. We then compared the results obtained with forecasts provided by the IERS. The main goal of this study was to show the influence of the different methods used on prediction accuracy. Depending on the evaluated prediction approach, the MAE values computed for day 10 of forecast were between 0.03 and 0.16 mas for dX, between 0.03 and 0.12 mas for dY, between 0.07 and 0.91 mas for dψ, and between 0.04 and 0.41 mas for dε. For day 30 of prediction, the corresponding MAE values ranged between 0.03 and 0.12 mas for dX, and between 0.03 and 0.14 mas for dY. This research shows that machine learning algorithms are the most promising approach in CPO forecasting and provide the highest prediction accuracy (0.06 mas for dX and 0.08 mas for dY for day 10 of prediction).


Introduction
The irregularities in the Earth's rotation are observed as variations in the rotation rate, polar motion, and alterations in the direction of the rotation axis in space, known as precession and nutation. The Earth's precession and nutation are largely generated by the lunisolar tidal torque. Diurnal retrograde variations in the atmospheric and oceanic angular momenta in an Earth-fixed reference system, combined with the free core nutation effect, induce additional nutation motions (Dehant et al. 2015). The precession-nutation effect pertains to the movement of the celestial intermediate pole (CIP) within the celestial reference frame (McCarthy and Petit 2004). This motion occurs within a frequency range from −0.5 cycles per sidereal day (cpsd) to +0.5 cpsd, as detailed by Capitaine et al. (2005).
In contrast, polar motion encompasses the CIP's motion within the celestial frame across all other frequency ranges or its motion within the terrestrial frame for all frequencies, excluding those falling between −1.5 cpsd and −0.5 cpsd. This distinction incorporates retrograde, nearly diurnal ocean tidal terms into nutation, as observed from the terrestrial reference frame. In addition, polar motion encompasses nutation terms with frequencies below −0.5 cpsd or above +0.5 cpsd, as perceived within the celestial reference frame (Gross 2015).
Earth orientation parameters (EOP) include corrections to the conventional precession-nutation model, i.e., celestial pole offsets (CPO), polar motion, differences between universal time and coordinated universal time (UT1-UTC), and Length-of-Day (derivative of UT1-UTC). They are necessary for transformation between the International Celestial and Terrestrial Reference Frames (ICRF and ITRF, respectively). However, the complexity and time-consuming nature of the required data processing invariably results in reporting delays. Currently, the official and most accurate EOP solution obtained from the combination of observations from different space geodesy techniques is provided by the International Earth Rotation and Reference Systems Service (IERS) with a delay of up to 6 weeks. Less accurate and more quickly processed data are available with a delay of one to several days. Consequently, accurately predicting EOP based on past observed data in conjunction with geophysical phenomena is of great scientific and practical significance. Short-term predictions of EOP are routinely used for many real-time advanced geodetic and astronomical tasks, such as navigation and positioning on Earth and in space.
The CPO signifies the disparity between the observed position of the celestial pole and its position predicted by a precession-nutation model. The IERS consistently monitors and reports the ongoing differences between the observed and modeled celestial pole positions. The newest CPO definition, introduced in 2000 by the International Astronomical Union (IAU), assumes CPO as the corrections dX and dY applied to the coordinates of the CIP within the ICRF (Resolution B1.6, McCarthy and Capitaine 2003). The IAU 2000 recommendations introduced a new parametrization of the CPO based on the non-rotating origin of the Earth's orientation matrix (McCarthy and Capitaine 2003). The IERS regularly publishes the CPO based on the IAU 2000A precession-nutation model. The conventional offsets expressed in terms of longitude (dψ) and obliquity (dε), associated with the former IAU 1980 theory of nutation and the IAU 1976 precession model (Kaplan 2005), can still be accessed from the IERS website.
Accurate determination of CPO through very-long-baseline interferometry (VLBI) measurements has been possible since 1984. Today, VLBI is widely recognized as the most accurate technique for observing CPO (Kiani Shahvandi et al. 2024). In addition, combined solutions are calculated by integrating VLBI with other space-geodetic techniques. While some models solely include CPO determined from geodetic measurements, others also offer predictions. Among the many utilized CPO models accessible to the public are the United States Naval Observatory (USNO) combined CPO series produced by the IERS Rapid Service/Prediction Center (Dick and Thaller 2015; Wooden et al. 2010), the International VLBI Service for Geodesy and Astrometry (IVS) combined CPO series produced by the IVS Combination Center (Böckmann et al. 2010), and the IERS EOP 14 C04 combined CPO series developed by the IERS Earth Orientation Product Center at the Paris Observatory (Bizouard and Gambis 2009). Comparative analyses of these different CPO series have been conducted by Malkin (2010a, b, 2013, 2014, 2017), demonstrating substantial differences among them, reaching several tens of μas.
At present, EOP predictions are regularly provided by the IERS Rapid Service/Prediction Centre (Luzum et al. 2001) and many other research groups working on EOP predictions (Kiani Shahvandi et al. 2023; Belda et al. 2018; Modiri et al. 2024). However, the predictions provided by these institutes differ in terms of input data, forecasting method, and prediction horizon, leading to different levels of accuracy for each prediction.
Since the beginning of this century, major progress has been made in processing geodetic observations for estimating EOP (Bizouard et al. 2019; Karbon et al. 2017; Nilsson et al. 2014). The First Earth Orientation Parameters Prediction Comparison Campaign (1st EOP PCC), which was conducted in 2006-2008, aimed to assess and compare the accuracy of different prediction methods (Kalarus et al. 2010). These methods included the least-squares (LS) extrapolation and autoregression (AR) (Wu et al. 2019; Xu et al. 2015), spectral analysis combined with LS (Zotov et al. 2018; Guo et al. 2013), artificial neural networks (ANN) (Schuh et al. 2002), wavelet decomposition and auto-covariance method (Kosek et al. 2006), and Kalman filtering (Xu et al. 2012; Gross et al. 1998). The main conclusion from this campaign was that no single prediction technique could be considered optimal for all EOP components and all prediction intervals. It was also proved that the prediction accuracy benefits from the use of atmospheric and oceanic angular momentum (AAM and OAM, respectively) data and forecasts.
At present, there is increased understanding of the influence of the Earth's surficial fluid layers (i.e., atmosphere, oceans, and hydrosphere) on the rotational changes of the solid Earth (Schindelegger et al. 2016; Nastula et al. 2019). As additional data in the EOP forecasting process, teams often use not only AAM and OAM data and predictions but also hydrological angular momentum (HAM) and sea-level angular momentum (SLAM). Moreover, the number of research groups actively developing advanced methods for EOP
This paper summarizes the results of evaluation of predictions of CPO components (dX, dY and dψ, dε) collected during the 2nd EOP PCC. The analyses are based on comparison between observed CPO taken from the IERS 14 C04 solution and predicted values. We study in detail statistics of prediction residuals as well as the mean absolute error (MAE) of predictions.
The remainder of the paper is structured as follows. Section 2 presents an overview of CPO predictions and their preliminary assessment, specifically, statistics of prediction methods, input data and submitted files (Sect. 2.1) and the analysis of the prediction residuals (Sect. 2.2). Detailed evaluation of the accuracy of CPO forecasts and the benefits of transformation of dψ, dε to dX, dY parameters is presented in Sect. 3. Finally, Sect. 4 presents the ranking of all CPO predictions, summarizing the campaign results and identifying the most reliable forecasting techniques for dX, dY predictions.

Prediction methods, input data and statistics of submitted files
An overview of the prediction techniques, input data, and prediction horizons exploited by campaign participants is presented in Table 1. A full description of each approach is provided in Table 6. Each campaign participant could apply more than one prediction technique, and no recommendations for predictions were given, allowing participants freedom in the choice of the prediction method, forecast horizon, and input data. The prediction methods used by the campaign participants were LS extrapolation, AR (both methods are used alone or in combination), Kalman filter, empirical free core nutation (FCN), and machine learning (ML). Participants sent their predictions for time periods of 11, 31, 90, 179, 364, and 365 days. Each registered prediction approach was assigned an individual ID by the EOP PCC Office. All IDs predicted dX, dY parameters and only IDs 100 and 101 additionally provided forecasts of dψ, dε (Table 1). It should be noted that the dψ, dε predictions provided by ID 101, according to the participant's declaration, were not directly forecasted but transformed from their dX and dY predictions. During the campaign period, all participants sent 559 predictions of dX and dY and 185 predictions of dψ and dε using nine different forecasting approaches. In addition, we used CPO predictions provided by the Rapid Service/Prediction Centre of IERS as a comparative dataset. These predictions received the ID 200. The IERS forecasts are sourced from the regularly updated files finals.daily, based on the previous IAU1980 convention for precession-nutation, and finals.2000A.daily, based on the current IAU2000A convention for precession-nutation (https://www.iers.org/IERS/EN/DataProducts/EarthOrientationData/eop.html, accessed on May 1, 2023). These forecasts are collected by the EOP PCC Office every Wednesday, following the same procedure used for submissions from other participants.
The entire set of submitted forecasts was tested to find erroneous predictions, which cannot be used in further processing. A two-step approach was applied to eliminate outlier predictions separately for the 10- and 30-day prediction horizons. In the initial stage of data selection, known as the "σ criterion", we calculated the standard deviation $S_j$ of the differences between reference (IERS 14 C04) and prediction ($x^{\mathrm{obs}} - x^{\mathrm{pred}}$) for each submitted prediction independently. Subsequently, we computed the overall standard deviation of differences for all submissions ($S_{\mathrm{total}}$). Individual predictions with $S_j > S_{\mathrm{total}}$ were excluded from further analysis. This process targets highly inaccurate predictions that deviate significantly from observational data and other submissions, possibly due to factors such as incorrect units, errors in algorithms, or incorrect use of input data.
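The σ criterion can be illustrated with a minimal sketch. It assumes that $S_{\mathrm{total}}$ is the standard deviation of all residuals pooled over all submissions and that the sample standard deviation (`ddof=1`) is used; neither detail is stated in the text, and the function name `sigma_criterion` is hypothetical.

```python
import numpy as np

def sigma_criterion(residuals):
    """Flag predictions that pass the sigma criterion.

    residuals: list of 1-D arrays, one per submitted prediction, each
    holding x_obs - x_pred over the prediction horizon.
    Assumption: S_total is the standard deviation of all residuals pooled
    together, and ddof=1 is used for every standard deviation.
    """
    # Standard deviation of each individual prediction (S_j)
    s_j = np.array([np.std(r, ddof=1) for r in residuals])
    # Overall standard deviation over all submissions (S_total)
    s_total = np.std(np.concatenate(residuals), ddof=1)
    # A prediction is kept unless S_j > S_total
    keep = s_j <= s_total
    return keep, s_j, s_total
```

In this sketch a prediction with wrong units (residuals of several mas instead of a few hundredths of a mas) inflates its own $S_j$ far more than it inflates the pooled $S_{\mathrm{total}}$, so it is the one that gets flagged.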
In the second step of data selection, to exclude individual predictions from a specific participant that significantly deviate from the rest of the predictions provided by the same participant, we applied a criterion based on the β parameter, computed separately for every single prediction as described in Kalarus et al. (2010) (Eq. 1), where I denotes the length of prediction (I = 10 or 30), MDAE is the median absolute prediction error defined for the i-th day in the future, and the prediction residuals $\varepsilon_{i,j} = x_i^{\mathrm{obs}} - x_{i,j}^{\mathrm{pred}}$ are the differences between the observed EOP data and the i-th point of the j-th prediction. If $\beta_j < 0$, the prediction was rejected and not included in further processing. The α parameter was determined empirically, and in this study, its value was chosen as α = 3.
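A minimal sketch of the β criterion follows. The functional form is an assumption and is not quoted from Kalarus et al. (2010): $\beta_j$ is taken here as the mean over the I prediction days of $\alpha \cdot \mathrm{MDAE}_i - |\varepsilon_{i,j}|$, with $\mathrm{MDAE}_i$ computed as the median of $|\varepsilon_{i,j}|$ over all predictions of the same participant. The function name `beta_criterion` is hypothetical.

```python
import numpy as np

def beta_criterion(eps, alpha=3.0):
    """Flag predictions of one participant that pass the beta criterion.

    eps: array of shape (n_pred, I); eps[j, i-1] is the residual of the
    j-th prediction on prediction day i.
    Assumed form (not taken from the source):
        beta_j = mean_i(alpha * MDAE_i - |eps[j, i]|)
    """
    abs_eps = np.abs(eps)
    # Median absolute error for each prediction day, over all predictions
    mdae = np.median(abs_eps, axis=0)
    # One beta value per prediction
    beta = np.mean(alpha * mdae - abs_eps, axis=1)
    # Predictions with beta_j < 0 are rejected
    keep = beta >= 0
    return keep, beta
```

With α = 3, a prediction is rejected under this assumed form when its absolute residuals exceed, on average, three times the participant's typical (median) absolute error.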
Table 2 shows the number of rejected and total submitted predictions of dX, dY and dψ, dε for the 10- and 30-day prediction horizons, together with the percentage of rejection. For the 10-day horizon, the set of submitted files was reduced by 4.9% because of highly inaccurate forecasts, while for 30-day predictions, the percentage of rejection was 3.5%. In general, the highest percentage of outlier forecasts was detected for ID 127, while the lowest was detected for ID 101.
Figure 1 shows the number of accepted predictions for each submission day after applying the sigma and beta criteria. It can be seen from the plot that in the case of dψ, dε, the number of accepted files is rather stable during the whole campaign duration (2-4 submissions), while for dX, dY the number of uploaded files increased after April 2022. This is probably due to the addition of new methods by one of the participants (IDs 154 and 155).

Analysis of prediction residuals for dX and dY
This part of our study presents basic statistics of prediction residuals ($\varepsilon_{i,j}$) between observed and predicted values of the parameters dX and dY.
Figures 2 and 3 show the time variability of prediction residuals for day 1 ($\varepsilon_{1,j}$) and day 8 ($\varepsilon_{8,j}$) computed for each ID over the entire campaign duration. Depending on the ID, the range of the differences between the reference and predicted dX, dY series for day 1 of prediction is between 0.13 and 0.61 mas (Fig. 2), while for day 8 of prediction it is between 0.24 and 0.58 mas (Fig. 3).
Figure 4 shows the minimum, mean, maximum and range of prediction residuals for dX and dY, computed for day 1, day 8, day 15, and day 22 of prediction for each ID. Since the predictions from IDs 127 and 134 are 11 days long, their statistics were computed only for day 1 and day 8. The maximum range of prediction residuals for all considered days is obtained for ID 100 for both dX and dY (Fig. 4). For dX, the ranges of $\varepsilon_{i,j}$ for ID 100 are 0.44, 0.49, 0.52, and 0.52 mas for day 1, day 8, day 15, and day 22, respectively. The corresponding ranges for dY are equal to 0.61, 0.58, 0.57, and 0.54 mas, respectively. ID 200 has the smallest range of differences for day 1 of prediction, both for dX and dY (equal to 0.13 mas for dX and 0.16 mas for dY), but this value visibly increases for day 8, day 15, and day 22 and is comparable to those received for other IDs.
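The per-day statistics described above can be sketched as follows. The array layout (one row per prediction, column `d - 1` holding prediction day `d`) and the function name `residual_stats` are assumptions made for illustration only.

```python
import numpy as np

def residual_stats(eps, days=(1, 8, 15, 22)):
    """Minimum, mean, maximum and range of residuals for selected days.

    eps: array of shape (n_pred, horizon) for one ID; column d-1 is
    assumed to hold the residuals for prediction day d.
    Days beyond the prediction horizon are skipped, which mirrors the
    treatment of the 11-day predictions in the text.
    """
    stats = {}
    for d in days:
        if d > eps.shape[1]:
            continue  # this ID does not predict that far ahead
        col = eps[:, d - 1]
        lo, hi = float(np.min(col)), float(np.max(col))
        stats[d] = {"min": lo, "mean": float(np.mean(col)),
                    "max": hi, "range": hi - lo}
    return stats
```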
As a next step, the distribution of the prediction residuals ($\varepsilon_{i,j}$) of dX, dY parameters was studied by analysing their histograms (Fig. 5). The histograms display a symmetric, bell-shaped curve with a single peak around the mean, showing that the data follow an approximately normal distribution, but with an additional tail. In the case of day 8, the distribution of individual values is more dispersed than for day 1, for which we observe more consistent values of prediction residuals. Figure 5 shows that for day 1, the most common values of differences between the reference and predicted series (indicated by peaks in the histograms) are from −0.07 to 0.03 mas for dX and from −0.05 to 0.10 mas for dY. For day 8, the most frequent values of differences are from −0.06 to 0.00 mas for dX and from −0.03 to 0.05 mas for dY. It is also visible that the prediction residuals are mostly negative for dX, while for dY there is a greater balance between positive and negative values. Moreover, a larger deviation of residual values is observed for dY than for dX.
Fig. 3 Prediction residuals for (a) dX and (b) dY for day 8 of prediction
We now analyse relations between prediction residuals obtained for day 1 and corresponding residuals received for day 8, day 15 and day 22. To do so, for each ID separately, we computed correlation coefficients between prediction residuals $\varepsilon_{1,j}$ and $\varepsilon_{8,j}$, between $\varepsilon_{1,j}$ and $\varepsilon_{15,j}$, and between $\varepsilon_{1,j}$ and $\varepsilon_{22,j}$, which are presented in Table 3.
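A minimal sketch of this computation is given below, assuming the Pearson correlation coefficient (the type of coefficient is not specified in the text) and an array layout in which column `d - 1` holds the residuals for prediction day `d`. The function name `day1_correlations` is hypothetical.

```python
import numpy as np

def day1_correlations(eps, days=(8, 15, 22)):
    """Correlate day-1 residuals with residuals of later prediction days.

    eps: array of shape (n_pred, horizon) for one ID; row j holds the
    residuals of the j-th prediction, column d-1 holds prediction day d.
    Assumption: Pearson correlation, computed over the predictions j.
    """
    day1 = eps[:, 0]
    result = {}
    for d in days:
        if d > eps.shape[1]:
            continue  # horizon too short for this day
        # np.corrcoef returns the 2x2 correlation matrix; take the
        # off-diagonal element
        result[d] = float(np.corrcoef(day1, eps[:, d - 1])[0, 1])
    return result
```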
For ID 100 there is a strong positive correlation between prediction residuals for day 1 and corresponding residuals for other days (between 0.77 and 0.84 for both dX and dY). A weak relationship between residuals for day 1 of prediction and residuals for other days was found for ID 200 (correlation coefficients ranging between 0.15 and 0.26 for both dX and dY). This may be due to a different behavior of the prediction accuracy for ID 200 in the first days of the forecast than in the following days (see also Fig. 4 with statistics of prediction residuals for day 1, day 8, day 15, and day 22). Notably, for dX of ID 155, the correlations between residuals for day 1 and other prediction days are above 0.50, while for the dY component the corresponding correlations are lower and decrease with subsequent days of prediction (i.e., the correlation between residuals for day 1 and day 8 is higher than the correlation between those of day 1 and day 15). This may suggest that the residuals of predicted dY values do not change substantially with prediction day. Overall, we do not observe a change in correlation larger than 40%, except for ID 104, where there is an increase of 60% from day 15 to day 22 (dY), for ID 155, where there is a decrease of 50% (dY) and of 42% from day 8 to day 15 (dY), and for ID 155, where there is a decrease of 42% (dX) from day 8 to day 15. This may indicate that the accuracy of prediction does not change as the prediction day increases.
In the following, we analyse correlations between participants' prediction residuals separately for day 1, day 8, day 15, and day 22 (Fig. 6). For day 1 of prediction for both dX and dY, a strong positive correlation (between 0.80 and 1.00) was found for the following pairs: IDs 155 and 102, IDs 134 and 102, IDs 127 and 104, IDs 154 and 104, and IDs 155 and 134 (Fig. 6a, b). The highest correlation coefficients are detected for ML-based methods, either between prediction residuals from two ML-based methods (between IDs 155 and 134) or between prediction residuals from ML and from other techniques (between IDs 155 and 102, between IDs 134 and 102, between IDs 127 and 104, between IDs 154 and 104). For day 1, predictions from ID 200, disseminated by IERS, are characterized by lower correlations (between −0.20 and 0.40) than those of other IDs (except for correlation between IDs 101 and 134 and between IDs 101 and 102). For those pairs of IDs that had the highest correlations for day 1, the correlations are also high for day 8 (see Fig. 6c, d), day 15 (see Fig. 6e, f) and day 22 (Fig. 6g, h). There is no noticeable decrease in the correlation between different IDs for day 15 and day 22 of prediction compared with the values received for day 1 and day 8, and no negative correlations are noted. Despite the use of different prediction methods and different forecast horizons, there is a positive correlation between prediction residuals obtained for different IDs.
Table 3 Correlation coefficients between prediction residuals $\varepsilon_{1,j}$ and $\varepsilon_{8,j}$, between $\varepsilon_{1,j}$ and $\varepsilon_{15,j}$, and between $\varepsilon_{1,j}$ and $\varepsilon_{22,j}$, computed for each ID separately. In brackets, the percentage change in correlation coefficients relative to the previous one is shown
The correlation between residuals for ID 200 and residuals from the other participants increased for day 8, day 15 and day 22 compared to the correlations received for day 1, especially for the dY component. Taking into account all considered prediction days, residuals of ID 200 have the highest agreement with the residuals for IDs 101 and 154 and the lowest correspondence with the residuals for ID 102.

MAE and its time evolution
In this section, we assess the quality of CPO predictions from all IDs based on MAE computed according to the following equation (Kalarus et al. 2010):
$$\mathrm{MAE}_i = \frac{1}{n_p} \sum_{j=1}^{n_p} \left| \varepsilon_{i,j} \right| \qquad (2)$$
where $n_p$ is the number of predictions related to the same ID and the same dX, dY or dψ, dε data. We consider MAE for the 10-day and 30-day prediction horizon (Figs. 7 and 8, respectively). Figures 7 and 8 additionally include MAE values for day 0, which represents the day of submission (the last observational data record). Day 0 is used to assess whether participants encountered any errors during the preparation of observational data, which could affect their forecast accuracy. Since the final IERS 14 C04 solution is usually published with a delay of around 6 weeks, participants usually base their predictions on IERS 14 C04 supplemented with different rapid solutions that are not as accurate as the final IERS 14 C04 series due to limited access to all data and shorter processing time. Therefore, differences at day 0 between various participants may result from the diverse rapid data used or from different methods of processing that data. Large errors at day 0 may indicate problems with correct data preparation or limitations in access to the latest observational data. Except for IDs 100 (Fig. 7a) and 101 (Fig. 7b), there were no issues at the data preparation stage, as MAE for day 0 is relatively low for most IDs. For day 0, the MAE for ID 200 is lower than for other participants; however, for this ID MAE increases rapidly after day 1 of prediction for both dX and dY, suggesting some modelling errors.
For the dX component, most IDs show a similar course of the MAE change, with little increase in error between day 1 and day 10 (Fig. 7a). However, MAE for ID 100 is visibly higher than that of the other IDs (between 0.14 and 0.16 mas) for the whole prediction horizon. Notably, for ID 200, MAE increases almost linearly (MAE equal to 0.04 mas for day 1, and 0.13 mas for day 10). The MAE of dX forecast from ID 200 is higher than that of any other ID after day 2 of prediction. Of all IDs, ID 154 provides the lowest MAE value on day 10 (about 0.05 mas).
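The MAE curves discussed in this section can be computed with a short sketch like the one below. It assumes that the residuals of one ID are stored with one row per prediction and one column per prediction day (column 0 being day 0), and that rejected or missing values are marked as NaN; both conventions and the function name `mae_per_day` are illustrative assumptions.

```python
import numpy as np

def mae_per_day(eps):
    """Mean absolute error for each prediction day.

    eps: array of shape (n_p, n_days); eps[j, i] is the residual of the
    j-th prediction on day i (column 0 is assumed to be day 0).
    NaN entries (e.g. rejected predictions) are ignored.
    """
    # Average |eps| over the predictions (rows), separately for each day
    return np.nanmean(np.abs(eps), axis=0)
```

An almost linear growth of the returned values with the day index would correspond to the behavior described for ID 200, while a nearly flat curve corresponds to the behavior of most other IDs.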
For the dY component, the MAE for the consecutive forecast days remains relatively stable for all IDs except 200 (Fig. 7b). For IDs 100 and 101, the MAE is greater than that for other IDs and reaches 0.09-0.12 mas for the whole prediction horizon. The forecasts provided by IDs 134 and 127 are the most accurate for day 10 of the prediction (MAE values about 0.06 mas). Similar to the results for dX, MAE for the dY parameter provided by ID 200 is lower than corresponding values for other IDs only for day 1 and day 2 of the prediction. The almost linear increase in error causes the MAE of ID 200 to reach 0.09 mas on day 10 of prediction.
Figure 8 shows that, for a 30-day prediction horizon, the MAE values for dX and dY do not increase linearly; however, about every seventh day of prediction there are peaks of increased prediction errors. The nature of these peaks is not entirely known, but they appear practically for every ID, so it might be a matter of the C04 data. For ID 104, these peaks might indeed be somewhat different, but generally in dX, they are not as pronounced as in dY, and more varied depending on the ID. In dY, however, distinct peaks appear practically for all IDs. Similar to the 10-day prediction horizon, MAE for forecasts from ID 200 rises most rapidly for the first 10 days of prediction; however, for the subsequent days, the change in MAE as the prediction day increases is of a similar course as in the case of MAE for other participants. For the dX component, the lowest MAE on day 30 of the prediction is found for IDs 104 and 154 (0.05 mas). For the dY component, the lowest MAE is provided by IDs 117 (0.08 mas) and 102 (0.06 mas).
We also investigated whether participants improved their methods throughout the campaign by plotting the MAE for a 10-day prediction horizon for dX and dY (Figs. 9 and 10) for individual 2-month periods (P1-P8). To quantify the change in MAE in each period relative to the previous one, the percentage change (PCh) of MAE for each of the above periods was calculated according to Eq. (3), where $\mathrm{MAE}_i$ is the value for the i-th point of prediction computed for the n-th period. If PCh > 0, the MAE is lower than in the preceding period (predictions are improved). If PCh < 0, the MAE is higher than in the preceding period (predictions are worsened). The $\mathrm{PCh}_n$ values are shown in Fig. 11.
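The percentage change can be sketched as follows. The exact form of Eq. (3) is an assumption here: MAE is averaged over the prediction days of each period, and the sign is chosen so that positive values correspond to improved predictions, in line with the interpretation given in the text. The function name `percentage_change` is hypothetical.

```python
import numpy as np

def percentage_change(mae_by_period):
    """Percentage change of MAE between consecutive periods.

    mae_by_period: array of shape (n_periods, n_days); row n holds MAE_i
    for the n-th period.
    Assumed form (not taken from the source):
        PCh_n = 100 * (mean(MAE^(n-1)) - mean(MAE^(n))) / mean(MAE^(n-1))
    so PCh_n > 0 means that MAE dropped relative to the previous period.
    """
    # Mean MAE over the prediction horizon, one value per period
    mean_mae = np.mean(mae_by_period, axis=1)
    # One value for each period except the first
    return 100.0 * (mean_mae[:-1] - mean_mae[1:]) / mean_mae[:-1]
```

For example, under this assumed form a drop of the period-mean MAE from 0.10 to 0.08 mas gives PCh = 20%, and a subsequent rise from 0.08 to 0.12 mas gives PCh = −50%.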
Figure 9 shows that for dX, the mean value of MAE computed for the whole campaign duration and the mean value of MAE obtained for each period are comparable in the P1 (Fig. 9a) and P7 periods (Fig. 9g). Conversely, in the P8 period (Fig. 9h), the mean MAE for this 2-month period is substantially higher than for the previous periods and the whole campaign duration. This is due to the high value of MAE detected for IDs 100 and 200. The MAE of predictions from IDs 200 and 117 is higher than the mean MAE in the P1 (Fig. 9a) and P2 periods (Fig. 9b). Over the following months, the accuracy of both forecasts increased considerably. However, the MAE value for ID 200 increased substantially again for the last 4 months (P7-P8) of the campaign, while ID 117 maintained a high forecast accuracy. During the period of increased forecast errors for ID 200 (P1, P2, P8), there was also a clear linear increase in MAE for this prediction, especially between day 2 and day 6 of forecast. In other periods, the behavior of MAE for ID 200 is similar to that observed for the other IDs. From January 1, 2022 to June 30, 2022 (P3-P5), the MAE values for each ID are below 0.15 mas and remain stable for the whole prediction horizon. Starting from around the middle of the campaign duration, the average MAE for the period is changed only by single outlier IDs, for which the errors are visibly higher than for the others [IDs 100 and 117 for P5 (Fig. 9e), IDs 101 and 102 for P6 (Fig. 9f), IDs 100 and 102 for P7 (Fig. 9g), IDs 100 and 200 for P8 (Fig. 9h)].
For the dY predictions (Fig. 10), the mean MAE computed for the whole campaign duration and for each period separately is comparable for the P4-P6 periods (Fig. 10d-f). The mean MAE for the P3 (Fig. 10c) and P8 periods (Fig. 10h) is higher than the MAE for the whole campaign period, which relates to the high MAE of ID 101. For P3 (Fig. 10c), which covers the period between January 1, 2022 and February 28, 2022, all MAE values are very high (above 0.05 mas starting from day 1). In the P7 period (Fig. 10g), the highest MAE values are for IDs 100 and 101. In the P6 period (Fig. 10f), ID 101 has the highest MAE values, whereas IDs 101 and 104 have the highest MAE values in the P5 period (Fig. 10e). The highest MAE value was observed for ID 101 in the P3 and P8 periods (Fig. 10c, h, respectively).
Fig. 10 MAE for dY prediction for individual 2-month periods (a-h). The thick black line represents the mean value of MAE over the 2-month period ("Mean for period"), the thick magenta line ("Mean for all") represents mean MAE for the whole campaign duration
The values of percentage change of MAE in the analysed periods are shown in Fig. 11. It can be seen that the accuracy of predictions of the dX component varies between the 2-month periods for all IDs; we do not observe a constant decrease or increase in MAE, but rather alternating periods of improvement and deterioration in accuracy (Fig. 11a). The period P8 exhibits a clear increase in accuracy for almost all IDs, as most values of PCh are positive. For ID 134, after some decrease of accuracy in P2 (November-December 2021) and P3 (January-February 2022), there is a prominent MAE improvement in the P5 (May-June 2022) and P8 (November-December 2022) periods. In the case of the P7 (September-October 2022) period, we note a decrease of prediction accuracy for all IDs except for ID 101.
Figure 11b shows that the accuracy of dY predictions increased for most IDs in most periods. A decrease of MAE from one period to the next was observed in the following cases: in period P4 (November-December 2021) for all IDs; in period P5 (May-June 2022) for ID 100 (73%), and ID 200 (40%); for period P7 (September-October 2022) for all IDs excluding IDs 100 and 200; for period P8 (November-December 2022) for all IDs excluding IDs 101 and 127. In period P5, for most IDs the deterioration in accuracy is noticeable. In general, for dY predictions of most IDs, after declines in prediction accuracy in the first half of the campaign, the accuracy improves in the majority of cases in the last months of the 2nd EOP PCC duration. In contrast, for dX forecasts, periods of increased and decreased prediction accuracy alternated with each other.

Transformation between dX, dY and dψ, dε
Many of the existing algorithms that are applied to positional astronomy are reliant on conventional transformations (Hohenkerk 2017). These transformations involve expressing the sequence of rotations between the terrestrial and celestial systems using familiar angular quantities based on the equinox and sidereal time. Even though the IAU 2000A precession-nutation model and the new definition of UT1 can be implemented without adopting the (X, Y) coordinate scheme for pole coordinates used by the IERS, the new models still describe the pole's position using conventional angles (Kaplan 2005). The X and Y components must be derived from these angular quantities. Consequently, even users implementing the new IAU models may need to convert dX and dY values to their equivalent dψ and dε values.
This section discusses the influence of conventional transformation between dψ, dε and dX, dY components of CPO on the MAE values for the 10-day forecast horizon. To do this, we compare MAE for original dX, dY predictions with MAE of dX, dY predictions obtained by transformation from dψ, dε forecasts. This analysis is conducted only for IDs that provided forecasts for both dX, dY and dψ, dε components of CPO (IDs 100, 101, and 200).
To perform a transformation of CPO from IAU 1980 (dψ, dε) to IAU 2000 (dX, dY) model, we used the package of subroutines, uai2000.package, available at the Earth Orientation Center of Paris Observatory (https://hpiers.obspm.fr). The programs, originally written by Christian Bizouard from Systèmes de Référence Temps Espace (SYRTE), are based upon the International Astronomical Union's SOFA (Standards of Fundamental Astronomy) matrix transformations. SOFA service (http://www.iausofa.org/) provides astronomical software packages that contain sets of algorithms and procedures for implementing standard models used in fundamental astronomy (Wallace 1998).
First, we analyse the accuracy of original predictions of dψ, dε by showing their MAE over the 10-day prediction horizon (Fig. 12). One can see in the figure that in the case of dψ, MAE values for day 0 for IDs 100 and 101 are much higher than in the case of ID 200 and reach 0.46 and 0.66 mas, respectively (Fig. 12a). In contrast, the MAE for day 0 of ID 200 equals 0.07 mas and increases almost linearly for the next 10 days. The MAE for day 0 for the dε parameter is equal to 0.04 mas for ID 200 and 0.10 mas for ID 101 and does not change noticeably for the next 10 days of prediction. Conversely, the MAE for day 0 for ID 100 reaches 0.23 mas, increases until day 7, reaching a maximum value of 0.41 mas, and then decreases again to 0.32 mas at day 10.
We now come to the comparison of the accuracy of original dX, dY forecasts (shown in Fig. 7) with the accuracy of forecasts received by transformation from dψ, dε. MAE for the transformed dX, dY are plotted in Fig. 13a, b, while the MAE differences between original dX, dY predictions and transformed dX, dY predictions are shown in Fig. 13c, d. Note that, as declared by the participant submitting predictions under ID 101, their predictions of dψ, dε are not direct forecasts of these components, but they are transformed values of the dX, dY predictions developed by that participant. Therefore, in this case we deal with a double transformation.
The transformation results show that, in all cases, the MAE values are higher for the transformed than for the directly predicted dX and dY values. The smallest difference in MAE between original and transformed predictions of dX, dY is obtained for ID 200. This might suggest that for ID 200 both dX, dY and dψ, dε are predicted with a similar level of accuracy. The differences between the MAE computed for original and transformed dX, dY predictions are highest for ID 101 (in the case of dX transformed from dψ) and ID 100 (in the case of dY transformed from dε). In other words, the MAE increases after the transformation compared with the untransformed predictions. This also suggests issues with the prediction of dψ, dε by IDs 100 and 101, which contribute to the increased MAE of the dX, dY predictions after transformation. As a result, replacing predicted dψ, dε with their transformation to dX, dY is not recommended. This analysis illustrates the influence of differences in accuracy between dX, dY and dψ, dε predictions on the results of the parameter transformation, rather than the impact of the transformation itself on the accuracy of the transformed predictions.

Summary and conclusions
In this study, we analyzed the accuracy of CPO predictions collected during the 2nd EOP PCC, using the IERS 14 C04 solution as a reference. The campaign's primary objective was to evaluate the current potential of EOP prediction. This involved exploring emerging methodologies such as ML, which have seen rapid advances in recent years. The 2nd EOP PCC was an excellent and innovative opportunity for scientists from a range of countries and institutes to collaborate and compete in enhancing EOP predictions. With the participation of 23 institutions worldwide, the operational phase of the campaign spanned 70 weeks and yielded an unprecedented collection of EOP forecasts. The 2nd EOP PCC served as a valuable endeavor to assess different prediction techniques within a standardized framework and under consistent rules and conditions. During the campaign, CPO were predicted by 6 groups with 9 different approaches, and more than 500 predictions were submitted to the EOP PCC Office. It was found that the ML and Kalman filter approaches achieved the highest accuracy for CPO prediction. Depending on the evaluated prediction approach, the MAE values computed for day 10 of the forecast are between 0.03 and 0.16 mas for dX, between 0.03 and 0.12 mas for dY, between 0.07 and 0.91 mas for dψ, and between 0.04 and 0.41 mas for dε. For day 30 of prediction, the corresponding MAE values range between 0.03 and 0.12 mas for dX, and between 0.03 and 0.14 mas for dY.
To summarize the achievements of the 2nd EOP PCC in CPO prediction, we devised a ranking of IDs based on the following criteria:
1) Percentage of rejected submissions: this criterion evaluates the credibility of predictions by measuring the proportion of unreliable or inconsistent submissions;
2) Range of differences between the reference values and the prediction: this criterion examines the repeatability of predictions by assessing the range of prediction residuals. Forecasts with high stability over time should exhibit a small range of prediction residuals;
3) MAE values for day 1, day 6, day 7 and day 10: this criterion evaluates the quality of predictions at the beginning, middle, and end of a 10-day prediction horizon. Predictions for 30 days into the future were not considered, so that all IDs could be included in the ranking;
4) Median PCh: this criterion assesses the stability of the method's accuracy.
Under this classification, each ID has been assigned points (from 0 to 10) corresponding to its position, with the understanding that a lower number of points indicates a higher position in the ranking. The rankings for dX and dY are shown in Tables 4 and 5, respectively. Overall, predictions made by ML algorithms (IDs 127, 134, 154, and 155) are at the top of the ranking, indicating the credibility of this approach in CPO forecasting. For the dX and dY parameters, the lowest positions are occupied by prediction techniques based on LS + AR (except for ID 101, which took fourth place for dX).
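The point-based classification described above can be sketched as follows. This is only an illustration under stated assumptions: for every criterion a smaller value is taken to be better, the points equal the zero-based position within that criterion, ties are broken by input order, and the identifiers "A", "B", "C" and their values are hypothetical rather than campaign results. The exact scoring rules behind Tables 4 and 5 may differ.

```python
def rank_ids(scores):
    """Rank IDs by summed position points (lower total = higher rank).

    scores: dict mapping criterion name -> dict mapping ID -> metric value,
            where a smaller metric value is assumed to be better.
    Returns a list of (ID, total_points) sorted from best to worst.
    """
    totals = {}
    for values in scores.values():
        # Position within this criterion: 0 points for the best ID, 1 for the next, ...
        ordered = sorted(values, key=values.get)
        for position, pid in enumerate(ordered):
            totals[pid] = totals.get(pid, 0) + position
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical metric values for three illustrative IDs.
example_scores = {
    "rejected_percent": {"A": 0.0, "B": 5.0, "C": 2.0},
    "mae_day10_mas": {"A": 0.05, "B": 0.04, "C": 0.10},
}
print(rank_ids(example_scores))
```

Summing positions rather than raw metric values keeps criteria with different units (percentages, mas) comparable, at the cost of discarding how large the gaps between IDs are.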
One of the main conclusions of this study is that the CPO predictions provided by the IERS are not sufficiently reliable, especially for the first days of prediction, due to an almost linear increase of the MAE for up to 10 days into the future. Overall, the results of the 2nd EOP PCC are promising, as most of the CPO predictions processed by campaign participants achieve an accuracy similar to or better than that of the forecasts provided by the IERS, especially after around the third day of prediction. Moreover, in contrast to the forecasts disseminated by the IERS, predictions developed by 2nd EOP PCC participants do not show a significant increase in prediction errors with increasing prediction day. Therefore, ML-based forecasts can be successfully used in operational applications where accurate predictions for the first days of the forecast horizon are most important.

Appendix
See Table 6.

Table 6 (continued)
Names and affiliations of participants: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja (ETH Zurich, Institute of Geodesy and Photogrammetry, Zurich, Switzerland)
Predicted parameters: dX, dY
Description of method: The architecture used is based on the first-order neural ordinary differential equations (Neural ODEs). In this architecture the hidden state in the hidden layer should follow a differential equation. To apply this concept to the EOP, it is assumed that EOP follow first-order differential equations, the exact form of which should be determined by fitting neural networks to the observations. The general approach of Neural ODE differential learning (Kiani Shahvandi et al. 2022a) is modified (i.e., in a way that does not require using the EOP rates) and used as the primary architecture. A variation of this architecture is the so-called simple recursive method (Kiani Shahvandi et al. 2022b), in which an attempt is made to incorporate the uncertainties in the observational data in the training for a more reliable estimation of parameters of the neural networks (Kiani Shahvandi and Soja 2022). As a result, the loss function here is the mean squared error. The architecture does not require any preprocessing of the input features. The forecasting horizon includes both 10 and 30 days. The input sequence length is 10 days. Each architecture is trained for each prediction epoch to take advantage of the most recent available EOP and EAM data.

Fig. 1 Number of accepted predictions for each submission day after applying σ and β criteria

Fig. 4 Minimum, mean, maximum, and range of prediction residuals for a dX and b dY, computed for day 1, day 8, day 15, and day 22 of prediction for each ID. Note that the predictions of IDs 127 and 134 are 12 days long, so data for day 15 and day 22 are omitted

Fig. 5 Distributions of the prediction residuals for a dX and b dY for day 1 and day 8 of predictions for all IDs with their respective best-fitted normal distributions

Fig. 7 MAE for a dX and b dY predictions for up to 10 days into the future for each ID

Fig. 9 MAE for dX predictions for individual 2-month periods (a-h). The thick black line represents the mean value of MAE over the 2-month period ("Mean for period"); the thick magenta line ("Mean for all") represents the mean MAE for the whole campaign duration

Fig. 11 Percentage change (PCh) of MAE for a dX and b dY predictions in individual analysis periods (P2-P8) in relation to the previous periods (P1-P7). Periods where data are not available are marked in grey; red colors indicate an MAE reduction and green colors indicate an MAE increase

Fig. 13 Impact of transformation from dψ, dε to dX, dY on MAE: a MAE for dX obtained by transformation of dψ, b MAE for dY obtained by transformation of dε, c differences of MAE between the originally submitted dX predictions and the dX predictions transformed from dψ, and d differences of MAE between the originally submitted dY predictions and the dY predictions transformed from dε

Table 1
List of predicted parameters, length of prediction, prediction techniques, and input data for each ID. A more detailed description of prediction techniques is given in Table 6. *Dobslaw and Hill, 2018

Table 2
Number N/M of rejected (N) and total submitted (M) predictions of dX, dY and dψ, dε for the 10- and 30-day forecast horizons

Table 4
Ranking of IDs according to the adopted criteria and the number of points awarded to each ID in individual categories for dX

Table 5
Ranking of IDs according to the adopted criteria and the number of points awarded to each ID in individual categories for dY