### The global production rate of cosmogenic radionuclides

The global production rate of cosmogenic radionuclides depends on the cosmic-ray flux coming to the Earth which is modulated by solar activity (i.e., solar shielding of GCRs in the heliosphere) and the shielding effects of the geomagnetic field (Masarik and Beer 1999). Concentrations of stable and radioactive nuclides in meteorites and terrestrial archives suggest that, on time scales of about 0.5 million years, the GCR flux outside the heliosphere has remained constant within ± 10% over the past ∼ 10 million years (Wieler et al. 2013). Therefore, the assumption of a constant local interstellar GCR spectrum outside the heliosphere is usually made for any reconstruction of solar and geomagnetic field activity using cosmogenic radionuclides. The global production rate of cosmogenic radionuclides (*Q* [atoms cm^{−2} s^{−1}]) can then be modeled as a function of solar modulation potential (*ϕ* [MV]) and GDM denoted as *M* [10^{22} A m^{2}]:

$$Q\left(t\right)=f\left(\phi \left(t\right),M\left(t\right)\right),$$

(1)

where *t* represents time. In this study, we employed the tabulated function established in Kovaltsov et al. (2012) for the ^{14}C production rate (Fig. 3). Since the tabulated function was published with discrete data, we approximated it with a polynomial function (details in section 1 of the Additional file 1). This allowed us to work with continuous values of *ϕ* and GDM.

The function established by Kovaltsov et al. (2012) was based on the LIS model from Usoskin et al. (2005). According to Herbst et al. (2017), *ϕ* from one LIS can be converted to *ϕ* in another LIS by means of linear regression functions. We then used the following regression function to convert from *ϕ*_{US05} to the more recent LIS model by Herbst et al. (2017), i.e., *ϕ*_{HE16}:

$${\phi }_{\mathrm{HE}16}=1.025\times {\phi }_{\mathrm{US}05}+24.18\mathrm{ MV}.$$

(2)

Theoretical ^{14}C production rates and the normalized ^{14}C data need to be connected via a normalizing constant. This constant can be estimated through a comparison of the ^{14}C production rate data with theoretically expected ^{14}C production rates inferred from independent records of solar modulation potential and GDM. As mentioned above, the GDM from 1840 CE provided by COV-OBS. × 2 is suitable for our purpose. On the other hand, *ϕ*_{HE16} provides a record of solar activity inferred from instrumental data. However, it only overlaps with the radiocarbon data for a short period from 1939 to 1950 CE. Therefore, we rely on the solar modulation record inferred from the GSNs as an alternative.

Figure 4 shows the global ^{14}C production rate estimated by combining the mean of *ϕ*_{GSN} and the mean GDM predicted by COV-OBS. × 2 from 1840 to 1950 CE. The significant 11-year solar variation can be observed in the estimated mean production rate, while the variation has a lower amplitude in the processed ^{14}C data (Fig. 4). The main reason is the smoothening which occurred during the construction of the average IntCal20 ^{14}C record (Heaton et al. 2020; Reimer et al. 2020) and the fact that the 11-year cycle variability is strongly dampened and close to the detection limit in tree-ring based reconstructions of the atmospheric ^{14}C concentration (Brehm et al. 2021). The smoothening of the ^{14}C data will have consequences for our modeling approach later on. Assessing this smoothening with a moving average filter showed that a 9-year moving average version of the theoretical ^{14}C production rate (Fig. 4) approximates best the variations in the ^{14}C data (*r* = 0.76). A normalizing constant was then computed by comparing the averages of the estimated ^{14}C production rate filtered with a 9-year moving average filter, and the averages of the ^{14}C production data for the period of overlap. The model uses this normalizing constant to connect the input ^{14}C data to the global production rates generated with Eq. 1.

### A Bayesian approach for sampling the past solar modulation potential and geomagnetic dipole moment

Consider a model with a set of parameters (*θ*) and observed data (*y*); a set of parameters that fit with the observed data can be found using Bayes’ theorem (Gelman et al. 2004):

$$p\left(\theta |y\right)\propto p\left(y|\theta \right)\times p\left(\theta \right).$$

(3)

Within the Bayes’ framework, *p(θ|y)* is the unnormalized posterior distribution of the parameters after considering the observed data. *p(y|θ)* is the distribution of the data conditional on *θ*. It is also called the likelihood function if it is treated as a function of *θ*. *p(θ)* is the prior distribution of the model parameters before any observation. Although the posterior distribution *p(θ|y)* cannot always be solved analytically, Eq. 3 allows for an approximation by generating samples from the posterior distribution via different sampling methods. This method of statistical inference or Bayesian inference allows us to incorporate our additional knowledge into the prior distribution of the parameters. Moreover, the parameters’ distribution could be updated continuously as more observations become available. This is particularly useful in cases where the sample size is not fixed and/or when the users want to incorporate additional information.

We employed the Hamiltonian Monte Carlo (HMC) method which utilizes the Hamiltonian dynamic simulation to efficiently generate samples from the posterior distribution (Neal 2011). We developed and executed our model via Stan (Carpenter et al. 2017), a probabilistic programming language for statistical modeling and high-performance statistical computation. In addition, we used the No-U-Turn Sampler (NUTS), an extension of HMC developed by (Hoffman and Gelman 2014) to sample from the posterior distribution of solar variations and GDM. NUTS provides an auto-tuning of difficult and highly sensitive parameters of the HMC sampler.

If the solar modulation potential (*ϕ*) and the GDM (*M*) are considered as parameters in formula 3, the Bayesian approach can be used to find *ϕ* and *M* that fit with the observed global production rate (*Q*):

$$p\left(\phi ,M|Q\right)\propto p\left(Q|\phi ,M\right)\times p\left(\phi ,M\right),$$

(4)

and if *ϕ* and *M* are independent, the formula can be re-written as:

$$p\left(\phi ,M|Q\right)\propto p\left(Q|\phi ,M\right)\times p\left(\phi \right)\times p\left(M\right),$$

(5)

where the likelihood function *p(Q|ϕ,M)* can be established based on the global production rate function of Kovaltsov et al. (2012) (Fig. 3). This Bayesian method allows us to input our information on the characteristics of the Sun and the geomagnetic field intensity via the prior distribution *p(ϕ)* and *p(M)*. Combining this information with the observed production rate data offers the possibility to reduce the uncertainty and improve the reconstructions.

### Setting up the prior distributions

#### Prior distribution of solar variability

The prior distribution of solar activity was set up using the framework of a Gaussian process (GP) described by Rasmussen and Williams (2006):

$$\phi \left(t\right)\sim GP\left({m}_{\phi }\left(t\right), {k}_{\phi }\left(t,{t}^{\mathrm{^{\prime}}}\right)\right),$$

(6)

where *ϕ(t)* represents the solar modulation potential at time *t*. *t* and *t'* are separated by a time difference *r*. *m*_{ϕ}*(t)* is the mean function and *k*_{ϕ}*(t,t')* is the covariance function of *ϕ(t)*. The mean function and the covariance function are given by:

$${m}_{\phi }\left(t\right)=E\left[\phi \left(t\right)\right],$$

(7)

$${k}_{\phi }\left(t,{t}^{\mathrm{^{\prime}}}\right)= E\left[\left(\phi \left(t\right)-{m}_{\phi }\left(t\right)\right)\times \left(\phi \left({t}^{\mathrm{^{\prime}}}\right)-{m}_{\phi }\left({t}^{\mathrm{^{\prime}}}\right)\right)\right],$$

(8)

where *E* represents the expected values. The GP is a generalization of the multivariate Gaussian probability distribution (Rasmussen and Williams 2006), where each point *t* is described by a mean and a joint Gaussian distribution with the surrounding points. Consequently, every point/event in time is influenced by (correlated to) the data before and afterwards. Characteristics of the correlation are determined by the covariance function. The covariance matrix, generated using a specific covariance function (discussed later), is the collection of vectors defining the correlation of every point in time with the surrounding points. In other words, the covariance matrix includes the information on the characteristic timescales and variances of the physical processes that we aim to reconstruct.

The most common covariance function within the machine learning field is the squared exponential (SE) (Rasmussen and Williams 2006). The covariance (*k*_{SE}) and spectral density (*S*_{SE}) of the SE covariance function have the forms:

$${k}_{\mathrm{SE}}\left(r\right)={\sigma }^{2}\times \mathrm{exp}\left(-\frac{{r}^{2}}{2{l}^{2}}\right),$$

(9)

$${S}_{\mathrm{SE}}\left(s\right)={\sigma }^{2}\times {\left(2\pi {l}^{2}\right)}^\frac{D}{2}\times \mathrm{exp}\left(-2{\pi }^{2}{l}^{2}{s}^{2}\right),$$

(10)

where *D* is the dimensionality, *σ*^{2} is the signal variance and *l* is the length-scale that determines how quickly the correlation diminishes with time. *s* in Eq. 10 represents frequencies. Additional file 1: Fig. S3 shows an example of the covariance *k*_{SE}* (r)* as a function of the input distance *r* in time.

In this study, a new covariance function for the solar variations was created by adding two SE covariance functions with different characteristic timescales:

$$k\left(r\right)={\sigma }_{\mathrm{short}}^{2}\mathrm{exp}\left(-\frac{{r}^{2}}{2{l}_{\mathrm{short}}^{2}}\right)+{\sigma }_{\mathrm{long}}^{2}\mathrm{exp}\left(-\frac{{r}^{2}}{2{l}_{\mathrm{long}}^{2}}\right).$$

(11)

One SE covariance function simulates the observed short-term variations (e.g., 11-year cycle), while the other one simulates the observed centennial variations (e.g., 88–100 years) of the Sun. This combined covariance function was established based on the variations of the solar modulation parameters (i.e., *ϕ*_{GSN}) which is the longest observational record of solar activity. The record shows solar activity for the last ~ 400 years which was dominated by the 11-year cycle and the centennial variations (Fig. 1). The short length of *ϕ*_{GSN} is a weakness of the model since the record is not long enough to capture possible quasi-periodicities of ~ 400 years to up to ~ 2400 years in solar variations (e.g., Bond et al. 2001; Snowball and Muscheler 2007; Usoskin et al. 2016; Dergachev and Vasiliev 2019). However, we want to avoid including long-term cycles inferred from the radionuclide records in our prior information which would lead to circular reasoning as the model is applied to radionuclides. Moreover, solar variations inferred from radionuclide records on longer time scales are rather ambiguous since it is challenging to completely eliminate influences from geomagnetic field variations, transport and deposition processes for ^{10}Be and carbon cycle effects on ^{14}C (Vonmoos et al. 2006; Snowball and Muscheler 2007). In addition, we lack a longer direct observational record of solar activity that could help us to obtain a better constrained prior for solar variability on timescales longer than the GSN record. Nevertheless, the GSN record still shows indications of the 200-year cycle such as the Maunder minimum, a period with an almost complete lack of sunspots, which can be considered as an expression of the 200-year cycle (Fig. 1). The millennial-scale cycles can possibly be characterized as bundling of larger-amplitude centennial-scale variations followed by periods of weaker centennial-scale variability, and, in such cases, the proposed covariance function can reproduce the millennial-scale variations to some extent.

The histograms in Fig. 5 show that *ϕ*_{GSN} has a positive distribution (i.e., the adiabatic energy loss of GCR cannot be negative) that is skewed toward higher values. Therefore, the symmetrical Gaussian distribution, which also allows for negative values, implied by Eq. 6 is not an appropriate approximation of *ϕ*. Moreover, the polynomial approximation (Additional file 1: Fig. S2) of the ^{14}C production rate function starts to become unrealistic and produces negative values at *ϕ* smaller than -362 MV which would be problematic to the modeling process. For *ϕ* within the range of − 362 to 0 MV, which also implies unphysically negative shielding, the approximation produces unrealistically large values of ^{14}C production rate which would then mostly be rejected by the Bayesian sampler. However, the sampling process would be inefficient if the model has to frequently reject negative values of *ϕ*. Thus, we considered modeling *ϕ* using a log-normal distribution which is also a skewed positive distribution:

$$\mathrm{log}\left(\phi \right)\sim \mathrm{ N}\left({\mu }_{\mathrm{log}\left(\phi \right)}, {\sigma }_{\mathrm{log}\left(\phi \right)}^{2}\right),$$

(12)

where *μ*_{log(ϕ)} represents the mean and *σ*^{2}_{log(ϕ)} represents the variance of the log transformation of *ϕ*. A disadvantage of using a log-normal distribution is that the model will occasionally generate extremely high values of *ϕ* due to the nonsymmetrical characteristic (Fig. 5a). However, the proposed *ϕ* will be rejected by the model when it is unrealistically high and cannot be explained by the radionuclide production rate. Another problem of the log-normal distribution is a bias toward lower values of *ϕ* as demonstrated by the histograms. Consequently, values above the mean of *ϕ*_{GSN} will be generated with a lower probability. A solution for this is to add a constant (*c* [MV]) to *ϕ*_{GSN} before fitting with a log-normal distribution:

$$\mathrm{log}\left(\phi +c\right)\sim N\left({\mu }_{\mathrm{log}\left(\phi +c\right)}, {\sigma }_{\mathrm{log}\left(\phi +c\right)}^{2}\right).$$

(13)

After sampling from this distribution, we exponentiate and subtract *c* to obtain *ϕ*. This approach allows the model to generate a more flexible distribution agreeing well with the *ϕ*_{GSN} distribution (Fig. 5). This minimizes the bias toward lower values of *ϕ* at the cost of allowing for negative values of *ϕ* (i.e., as low as minus *c*) with a low prior probability. We assessed the fitted distribution for the case of *c* equal to 100, 200 and 300 MV (Additional file 1: Fig. S4) and decided to choose 200 MV as this value shows a good balance between the pros and cons. The fitted distribution to log*(ϕ* + *c)* with *c* equal to 200 MV is shown in Fig. 5. Parameters of the fitted log-normal distribution such as mean and variance were assessed using the method of maximum likelihood estimation. In summary, the prior distribution allows for negative values of solar modulation as low as − 200 MV but with a low probability in exchange for a better model performance with less bias toward lower values of *ϕ*. Negative solar modulation values, while unphysical, could be generated associated with the biases in radionuclide data (e.g., climate impact), extreme spikes/enhancement in the radionuclide records (e.g., SPEs), or simply the data uncertainties. Data uncertainties are included in the model and we assume that climate biases are minor for the ^{14}C production rate. However, SPEs are not included in our model and, therefore, we removed the known SPE production peaks (i.e., the 774/775 and the 993/994 peaks) from the ^{14}C data. In addition, Fig. 5 shows that the prior distribution (green line) still slightly underestimates the probability of solar modulation from around 500 to 1000 MV. This could still lead to a slight bias toward lower values of solar modulation. However, the posterior distribution of solar activity will ultimately be evaluated and selected based on the radionuclide data. Therefore, a small inclination toward low solar modulation values of the prior distribution will not bias the results.

We then replaced Eq. 6 with:

$$log\left({\phi }_{c}\left(t\right)\right)\sim GP\left({m}_{\mathrm{log}\left({\phi }_{c}\right)}\left(t\right), {k}_{\mathrm{log}\left({\phi }_{c}\right)}\left(t,{t}^{\mathrm{^{\prime}}}\right)\right),$$

(14)

where *ϕ*_{c}*(t)* = *ϕ(t)* + *c*. The mean function and the covariance function are adjusted accordingly:

$${m}_{\mathrm{log}({\phi }_{c})}\left(t\right)=E\left[\mathrm{log}({\phi }_{c}(t))\right],$$

(15)

$${k}_{\mathrm{log}({\phi }_{c})}\left(t,{t}^{\mathrm{^{\prime}}}\right)=E\left[\left(\mathrm{log}({\phi }_{c}\left(t\right))-{m}_{\mathrm{log}({\phi }_{c})}\left(t\right)\right)\times \left(\mathrm{log}({\phi }_{c}\left({t}^{\mathrm{^{\prime}}}\right))-{m}_{\mathrm{log}({\phi }_{c})}\left({t}^{\mathrm{^{\prime}}}\right)\right)\right].$$

(16)

The mean of our proposed prior distribution (*m*_{log(ϕc)}) for the modeling period was equal to the mean of the fitted log-normal distribution.

The short-term and long-term variations of the Sun were defined in Eq. 11 by the characteristic length-scale (*l*_{short}, *l*_{long}) and the signal variance (*σ*^{2}_{short}*, σ*^{2}_{long}). These parameters were determined using *ϕ*_{GSN}, particularly the power spectrum of log*(ϕ*_{GSN}*)* (Fig. 6a). In other words, we investigate how log*(ϕ*_{GSN}*)* behaves in the frequency range and adjust our prior to resemble it. The short-term variations are reflected as a bump and changes in the slope of the power spectrum around the 6- to 16-year period. The centennial variation could be observed for periods longer than 55 years, but the changes in the slope were not as strong as the short-term variation. First, we determined *l*_{short} and *σ*^{2}_{short} by tuning our covariance function for the period from 10 to 12 years (highlighted in Fig. 6a). This ensures that our prior captures the short-term variations which are most prominent for period lengths from 10 to 12 years. We then tuned *l*_{long} and *σ*^{2}_{long} for the period from 50 to 136 years where the transition in power occurs (highlighted in Fig. 6a). Details of the parameterization process are outlined in section 2 of the Additional file 1.

The spectral density of the combined covariance function with tuned parameters is shown in Fig. 6a. The power decreases stepwise with a constant period between the steps as a result of the combination of our two SE covariance functions. As expected, the combination of the two tuned covariance functions does not fully simulate the bump-like structure in power generated by the short-term variation of the Sun around a 10- to 12-year period. The power was instead raised to a higher level before and after the bump. A fixed periodic signal could be introduced to simulate the bump-like property as a peak in the power spectrum. However, this would also limit the short-term variations to a narrowly defined cycle as, for example, an 11-year cycle. We here chose to apply a more relaxed prior since the short-term variations of the Sun vary around the 11-year timescale rather than being a cycle with constant frequency (Friis-Christensen and Lassen 1991; Solanki et al. 2002; Petrovay 2010; Brehm et al. 2021). Moreover, the short-term variations could have changed further back in time. Overall, despite this drawback, we still chose the SE covariance function because the flexibility in the range of the short-term variations allows for a variable frequency for the “quasi” 11-year cycle. Therefore, the solar cycle can be determined by the data instead of being imposed by the prior.

Figure 6b shows a comparison of the power spectra of *ϕ*_{GSN} and solar modulation potential realizations that were randomly generated with our tuned covariance function for the same time period. As expected, the covariance function generates short-term variations that are not fixed to, but rather vary around, the 11-year cycle. The transition in the power spectrum of *ϕ*_{GSN} from 50 to 136 years is well simulated. The covariance function generates curves with higher power than *ϕ*_{GSN} for periodicities with 16–40 years cycle lengths. This is a drawback of the SE covariance function that cannot be avoided with this rather simple approach, but on the other hand it is necessary to generate high enough power to capture the short-term variations. Variations on timescales shorter than 4 years can be observed in the power spectrum of *ϕ*_{GSN}. We interpret these variations as not relevant for our radionuclide data interpretation and decided to exclude them in our parameterization. Therefore, it is not a problem when the power spectrum of our generated solar modulation potential diminishes rapidly for periodicities shorter than 4 years.

We generated 1000 realizations of *ϕ* from 1 to 1938 CE using Eq. 14 to simulate our prior distribution of solar activity. We found that 1000 realizations were enough to capture the main aspect of our covariance matrix, mainly around the diagonal. The covariance matrix and hence the realizations were connected to the present solar activity of *ϕ*_{HE16}. Every realization was binned (i.e., taking an average of every 10 years) prior to 1600 CE to have the same time resolution as the processed ^{14}C data. Post 1600 CE the realizations were smoothened with a 9-year moving average filter to match the treatment of the ^{14}C data, as discussed in Sect. 3.1. A new adapted covariance matrix for solar variations over the period 1 to 1938 CE was computed from these 1000 processed realizations to account for the lower resolution and smoothening. The final prior distribution of solar activity was estimated via 1000 realizations generated using the adapted covariance matrix (Fig. 7a). Again, we tested and found that 1000 realizations were enough to represent the adapted covariance matrix. The new solar realizations have comparable variations to the 9-year moving average version of *ϕ*_{GSN} during the same period (Fig. 6c). The approach described here allows us to directly adjust/adapt our covariance matrix without having to adjust/re-parameterize the covariance function. Thus, it will be helpful for modeling radionuclide data with different temporal resolution or smoothening when applied to long-term radionuclide records.

#### Prior distribution of geomagnetic field intensity

We set up the prior distribution for GDM following Bouligand et al. (2016). We simplified the approach using just the axial component to approximate the GDM. This was justified because the axial dipole is the component that dominates the geomagnetic shielding of galactic cosmic rays (Masarik and Beer 1999). The covariance function for changes in the axial component is given by:

$$k\left(r\right)=\frac{{\sigma }^{2}}{2\xi }*\left(\left(\chi +\xi \right){e}^{-\left(\chi -\xi \right)\left|r\right|}-\left(\chi -\xi \right){e}^{-\left(\chi +\xi \right)\left|r\right|}\right),$$

(17)

with *ξ*^{2} = *χ*^{2}* − ω*^{2}. *χ* and *ω* are parameters representing frequencies. *σ*^{2} and *r*, again, symbolize signal variance and difference in time. The power spectral density of the covariance function is as follows:

$$P\left(f\right)=\frac{4\chi {\omega }^{2}{\sigma }^{2}}{{({\omega }^{2}-{(2\pi f)}^{2})}^{2}+{(4\pi \chi f)}^{2}}.$$

(18)

The power spectrum (*P*) has an arc shape with the power decreasing with increasing frequency. It is often divided into several frequency ranges which can then be approximated locally with various spectral indices (*p*) (Bouligand et al. 2016; Hellio and Gillet 2018):

$$P\left(f\right)\propto {f}^{-p}.$$

(19)

At very low frequency where *p* ≅ 0, the spectrum is almost flat indicating that it has the largest power at long-term periods such as periods longer than 50,000 years as shown in Additional file 1: Fig. S5. The power decreases faster at shorter periods (i.e., larger spectral indices). The decrease in power spectrum is simulated by changes in the slope which are determined by the cut-off frequencies *T*_{s} and *T*_{f}. The cut-off frequencies indicate the time periods where transitions into steeper slopes occur and their formulas are given by Bouligand et al. (2016):

$${T}_{s}=\frac{2\pi (\chi +\xi )}{{\omega }^{2}},$$

(20)

$${T}_{f}=\frac{2\pi (\chi -\xi )}{{\omega }^{2}}.$$

(21)

*χ* and *ω* are then chosen to achieve the desired cut-off frequencies.

A recent geomagnetic field model COV-LAKE spanning the last 3000 years has *T*_{s} ~ 100,000 years and *T*_{f} ~ 60 years (Hellio and Gillet, 2018). The model is based on measurements of the magnetic field directions and intensity stored in archeological artifacts, igneous rocks and sediment records. The power spectral densities for the axial component of COV-LAKE are shown in Additional file 1: Fig. S5 and the GDM provided by COV-LAKE is shown in Fig. 10. The COV-OBS. × 2 model (Additional file 1: Fig. S5) has the same *T*_{s} but higher *T*_{f} (~ 235 years). This earlier transition (at longer time period) in the slope resulted in lower power for variations on timescales shorter than 200 years.

We tested generating a prior distribution of GDM using *T*_{s} and *T*_{f} from COV-LAKE and COV-OBS. × 2 models (results are shown in Additional file 1: Fig. S6). However, comparisons to the reconstructed GDM based on COV-LAKE suggest the prior distribution is rather conservative. We also compared our prior distribution with pfm9k.1b another major reconstruction for the last 2000 years (Nilsson et al. 2014; Muscheler et al. 2016). pfm9k.1b provides a low temporal GDM reconstruction (300–400 years) over the Holocene based on almost the same underlying dataset as COV-LAKE (COV-LAKE was extended to include additionally new sediment and archeological intensity data). The GDM reconstructed by pfm9k.1b over the last 2000 is shown in Fig. 10. The GDM indicated by pfm9k.1b was mostly above and outside the prior distribution (Additional file 1: Fig. S6). Too conservative prior distributions could result in a bias toward values of GDM lower than the range seen in the previous GDM models. Therefore, we adjusted the parameters to widen our prior distribution (Additional file 1: Fig. S6c) and allow for more variability in the prior for GDM. We used a lower *T*_{s} of 50,000 years, a higher *T*_{f} of 433 years and also a larger signal variance. The variations for timescales between 100 and 10,000 years of our prior are larger compared to the prior used for COV-OBS. × 2 (Additional file 1: Fig. S5). For variations shorter than 100 years, our prior agrees with the prior used for COV-OBS. × 2. The parameters of COV-OBS. × 2, COV-LAKE and from this study for the axial component are shown in Additional file 1: Table S1. The variations of the axial dipole are insignificant at timescales shorter than 20 years. Therefore, the prior distribution of the GDM can be used directly without any adjustment (e.g., smoothening) of the covariance matrix. In addition, the prior distribution of GDM was connected to present GDM from 1939 to 2020 CE predicted by COV-OBS. × 2. We generated 1000 realizations of GDM simultaneously with *ϕ* to estimate the prior distribution (Fig. 7).

Results of palaeomagnetic field models could be used to further constrain the prior distribution. This would reduce the reconstruction uncertainty for periods where past GDM variations are well constrained, but would introduce uncertainties associated with the chosen palaeomagnetic field model. Disagreement in GDM reconstructed by different palaeomagnetic field models would lead to discrepancies in solar activity reconstructions. In addition, this also defeats our purpose of being independent of GDM models and providing a radionuclide-based reconstruction for GDM. Therefore, we did not further constrain our prior distribution based on results of palaeomagnetic field models.

### Evaluation of the proposed solar activity and GDM

The samples of *ϕ* and *M* drawn from the prior distribution were evaluated via the likelihood function:

$${Q}_{t} \sim N\left(f\left({\phi }_{t},{M}_{t}\right),{\sigma }_{Q,t}^{2}\right),$$

(22)

where *Q*_{t} is the data vector of the observed global production rates and *σ*^{2}_{Q,t} is the vector representing uncertainty (i.e., variance) of the observed data. *ϕ*_{t} and *M*_{t} are the vectors of the solar modulation potential and GDM, proposed by the Bayesian model as the solution for *Q*_{t}. The link between *Q*_{t} and *ϕ*_{t} and *M*_{t} [i.e., *f(ϕ*_{t}*,M*_{t})] was established based on the global production rate function in Kovaltsov et al. (2012) (see Fig. 3 and section 1 in the Additional file 1). In summary, the Bayesian model combines our knowledge about the parameters (i.e., incorporated in the prior distributions) with the additional constraints provided by the observations (here, the global production rate) to yield a new (posterior) distribution of the parameters.

The prior distributions allow for an overlap of geomagnetic field and solar variability, which is the main challenge our method addresses. However, this leads to the fact that the reconstruction for geomagnetic field and solar variability on timescales significantly longer than 200 years is to some extent ambiguous since the prior distribution of solar activity was established based on variations observed in the 400-year-long GSN record. Nevertheless, the prior distribution is based on flexible SE covariance functions so that the choice of timescales does not prevent the model from finding longer periodicities, if the data requires them. The power spectrum of the tuned covariance function in Fig. 6 is essentially flat for longer timescales, but presumably the actual power spectrum decreases at very long periods. The model will compare and decide if the variations in the radionuclide record at timescales longer than 200 years can or cannot be explained by geomagnetic field variations (characterized in the prior distribution) and, if not, the model will likely consider those as solar variations. Overall, the model has larger uncertainties for variations on timescales longer than 200 years, but the prior information allows for some separation of solar and geomagnetic field influences also on these timescales. This is an advantage of the model over a simple band-pass frequency filter in disentangling solar and geomagnetic field variations on timescales longer than 200 years.

We also evaluate the correlation coefficient (*r*_{ϕt,Mt}) between the proposed curves of *ϕ* and GDM:

$${r}_{{\phi }_{t},{M}_{t}} \sim N\left({\mu }_{r},{\sigma }_{r}^{2}\right).$$

(23)

We set the mean (*μ*_{r}) and standard deviation (*σ*_{r}) of the correlation coefficient equal to 0 and 0.01, respectively. This ensures the independence of the reconstructed solar activity and geomagnetic field strength which was the initial assumption of the Bayesian approach (Eq. 6). It is worth mentioning that we tested different values of *σ*_{r} and chose 0.01 as the value providing the best independence constraint. *σ*_{r} larger than 0.01 would result in *ϕ* and GDM correlating more strongly than we expect (we do not expect a link between solar activity and GDM variations), while *σ*_{r} lower than 0.01 would be too severe a constraint. The model will then reject the majority of the proposed samples albeit some low correlation can occur just by coincidence. It is also worth mentioning that some chance correlation between *ϕ* and GDM is more likely if we investigate a short period of time (e.g., shorter than 1000 years), especially if both processes exhibit a long-term trend over the investigated period.