April 25, 2024
Less extreme and earlier outbursts of ice-dammed lakes since 1900 – Nature

Less extreme and earlier outbursts of ice-dammed lakes since 1900 – Nature

Developing a database of ice-dam failures

We define an ice-dammed lake as any water body at the margin of and impounded by glacier ice. We exclude all other glacier-fed lakes, such as lakes dammed by moraines or landslides, internal water pockets, supraglacial ponds, lakes below glaciers and lakes formed by geothermal activity. Ice dams often trap runoff from catchments and cause loss and damage when the impounded water is suddenly released and reaches human settlements59. We focus on six major glaciated mountain regions that support 99% of the contemporary ice volume outside polar regions47. Our evidence for the 1,569 ice dam failures reported between 1900 and 2021 comes from a total of 446 different sources of information. We distinguish between primary sources, where we have direct observations or access to raw data, and secondary sources that describe, interpret or summarize previous findings on historical GLOFs. Primary sources are available for 64% of all reported cases and include original scientific publications (36%), annals and reports published by local and regional authorities (18%), written communications in emails with eyewitnesses and experts (9%), and newspapers and internet resources such as video footage and posts on social media platforms (altogether 1%). Secondary sources (36% of all cases) include second-hand information derived from reviews and summaries of previous publications (19%) and entries in databases with no direct access to the primary source (17%). Half of all sources report only a single event, whereas a few studies include information on dozens of GLOFs. For example, ref. 60 identified more than 100 outbursts from lakes impounded by Kennicott Glacier during the twentieth century by studying stream gauge data and historical aerial photographs. Still, decades may elapse between the date of a GLOF and the reporting, especially where researchers reprocessed and reanalysed historical data (Extended Data Fig. 1). In half of all cases, however, GLOFs were reported within a decade after they occurred. A few studies contributed significantly to improving the regional record of GLOFs in the first decades of the twentieth century, when the overall reporting rate was low3,7,11,60,61,62. In a few cases, reporting may have changed due to resettlement in this period. For example, repeated flooding of an ice-dammed lake below the Folgefonni Ice Cap (Norway) forced farmers to abandon their homes in the mid-1960s16. It is only since the 1970s that research activity on GLOFs has become more consistent24: 63% GLOFs in our databases were first documented in that period (Extended Data Fig. 1).

In collecting historic GLOFs, we only considered cases with either a documented date (at least the year of occurrence) or a time interval of failure, centroidal coordinates of the lake and at least one reference. For each GLOF, we documented the mountain range, the country from which the flood originated, the name of the parent glacier (local name and the ID in the Randolph Glacier Inventory RGI V6.063) and the name of the burst glacier lake. We extracted the surface elevation of each lake centroid from the Advanced Land Observing Satellite (ALOS) V3.2 digital elevation model (DEM)64. We collected the reported flood volume (n = 510, 33% of all cases), peak discharge (n = 506, 33%) and timing (n = 940, 61%). Finally, we documented any impacts, damage, and losses for events in our database.

Lake mapping

We could verify the origin, timing and area of 53% of all reported lake outbursts that have occurred since the Landsat 5 mission began to continuously obtain satellite images in the late-1980s65. Later generations of satellites, such as RapidEye and the Planet Cubesat constellation (launched in 2009 and 2015, respectively), have higher spatial resolution and thus capture GLOF features in greater detail. Ice-dammed lakes decrease in size, or even empty entirely, during GLOFs, exposing either partly or fully the lake floor, which is often covered with stranded icebergs (Fig. 5c). Ice, clouds, and shadows compromise the automatic detection of water and lakes66. Hence, we manually digitized the extents of the lakes from satellite images in QGIS V3.16 software. To this end, we scanned the archive of satellite images to retrieve the last available image before and the first available image after the lake drained. Mapping lakes from optical images was limited in Iceland because of frequent cloud cover. For each lake with repeat outbursts, we used a simple linear regression model of the mapped lake area before the outbursts versus time to obtain the change in lake area (Fig. 5). We then estimated the mean percent change in lake area between the first and last reported outburst in the time series. For manually mapped lake areas, uncertainties are generally assumed to be between 4% and 10% of the mapped lake area48,55 or between 0.5 and 1 pixel of the sensor resolution18,42,57, that is, 3–5 m for Planet and RapidEye, and 30 m for Landsat images.

In Fig. 4, we compare elevations of lakes that have had historic outbursts with elevations of ice-dammed lakes that currently (2018–2020) exist according to regional inventories18,48,55,56,57,58. The lakes in these databases were either manually digitized or refined after automatic mapping. We chose a minimum mapping unit of 0.01 km2 to ensure comparability between inventories. No contemporary lake inventory is available for Iceland. Thus, we manually mapped ice-dammed Icelandic lakes from high-resolution Google Earth imagery obtained between May and September 2021. We added the elevations obtained from the ALOS V3.2 DEM64 to both the previous lake inventories and our manually mapped lakes.

Bayesian hierarchical modelling

To assess temporal changes in GLOF magnitude, elevation and timing, we fitted Bayesian hierarchical models that distinguish between regional (that is, our six study regions) or local (that is, lakes with repeat outbursts) groups. Bayesian models form a compromise between the likelihood of observing the data under the assumption of specific model parameters and prior knowledge about these parameters. We encode our uncertainty in this prior knowledge before obtaining a posterior distribution for all parameters conditioned on the data. The benefit of hierarchical models is that they account for group-level structure in the data within a single model. We can thus estimate regional or local effects with respect to a population-level model learned from all data regardless of their location. Hierarchical models offer a compromise between separate models that overemphasize regional and local effects and that are independently learned for each group, and pooled models that are learned from all data without distinction67. Hierarchical models improve parameter estimates for individual groups, especially if sample sizes are low or vary across the groups. This trait is particularly desirable for our study as the number of reported GLOFs differs widely among both regions and lakes. The model learns the population-level parameters from the data, and these parameters are shared among groups. In this way, the groups inform each other, achieving an effective sample size that greatly exceeds that of single groups. The joint posterior density in a hierarchical model is proportional to the product of the likelihood and the joint prior density (pi ({theta }_{1},ldots ,{theta }_{J},tau {|D})propto )(pi ({D|}{theta }_{1},ldots ,{theta }_{J},tau )pi ({theta }_{1},ldots ,{theta }_{J},tau )=pi (tau ){prod }_{{j=1}}^{J}pi ({D}_{j}|{theta }_{j})pi ({theta }_{j}|tau )), where D are the observed data and Dj is the subset of D that contains all data in group j. The group-level parameters ({theta }_{j}=({theta }_{1j},ldots ,{theta }_{{Kj}})) define the model on the group-level with K individual parameters, which vary for (j=1,ldots ,J) groups. These parameters are drawn from distributions (pi left({theta }_{j}|tau right)) and specified by population-level (hyper-) parameters (tau =left({tau }_{1},ldots ,{tau }_{L}right)), which consist of L individual parameters and are also learned from the data. The hierarchical model estimates the posterior distribution for both population-level and group-level parameters conditional on the reported GLOFs.

Assessing regional and local trends of V
0, Q
p and Z

Continental and global studies of river floods point to climate-driven changes in both median and extreme discharges in past decades68,69,70,71. Similarly, repeated inventories reveal recent shifts in the regional size distribution of ice-dammed lakes, but little about trends in lake outbursts18,48,49. By using quantile regression, we examined whether and by how much the quantiles in the conditional distributions of reported peak discharge, Qp, and flood volume, V0, have changed in our six study regions. Extreme (>90th percentile) flood discharges are important for planning flood protection measures and informing flood risk zoning69. Ordinary least squares regression estimates the conditional mean, but quantile regression makes no assumptions about the distribution of the response variable and is robust against outliers. In our model specification, we used an asymmetric Laplace likelihood (AL) with fixed quantiles. The AL has density

$${f}_{{rm{AL}}}(,y{rm{| }}mu ,kappa ,p)=frac{p(1-p)}{kappa }exp left(-{rho }_{p}left(frac{y-mu }{kappa }right)right),$$

(1)

where y is the response variable, μ is a location parameter, κ is a positive scale parameter, 0 < p < 1 is a percentile, and ({rho }_{p}(x)=x(p-I(x < 0))) with indicator function I(∙). The exponent takes the values (-frac{,|{y}-mu |}{kappa }p) for y ≥ μ and (-frac{|y-mu ,|}{kappa }(1-p)) for y < μ. We modelled the conditional distributions of Qp and V0 with decimal year t for two fixed values of p as

$${y}_{ji}sim {rm{A}}{rm{L}}({mu }_{ji},kappa ,p),{rm{f}}{rm{o}}{rm{r}},j=1,ldots ,,J,{rm{a}}{rm{n}}{rm{d}},i=1,ldots ,{n}_{j}$$

(2)

$${{rm{mu }}}_{ji}={alpha }_{j}+{beta }_{j}{t}_{ji},{rm{for}},j=1,ldots ,,J,{rm{and}},i=1,ldots ,{n}_{j}$$

(3)

$$[begin{array}{c}{alpha }_{j}\ {beta }_{j}end{array}]sim {rm{M}}{rm{V}}{rm{N}}{rm{o}}{rm{r}}{rm{m}}{rm{a}}{rm{l}},[(begin{array}{c}alpha \ beta end{array}),S],,{rm{f}}{rm{o}}{rm{r}},j=1,ldots ,J$$

(4)

$$S=left(begin{array}{cc}{sigma }_{alpha } & 0\ 0 & {sigma }_{beta }end{array}right)Rleft(begin{array}{cc}{sigma }_{alpha } & 0\ 0 & {sigma }_{beta }end{array}right)$$

(5)

$$R=left(begin{array}{cc}1 & varsigma \ varsigma & 1end{array}right)$$

(6)

where yji are reported values of Qp or V0 with yji referring to the ith observation in group j, nj is the number of observations in group j, and J is the number of groups. The parameters αj and βj are the group-level intercepts and slopes, respectively, and α and β are the corresponding population-level parameters. The covariance matrix S is composed of group-level standard deviations σα and σβ, and R, the correlation matrix with correlation ς. In line with decadal analyses of median and extreme flood discharges68,69,70, we considered fixed values of p = 0.5 (median) and p = 0.9 (90th percentile), and estimated all other parameters from the data.

The Bayesian framework demands prior distributions for each parameter that enters at another model level. We chose the following priors

$$kappa sim N(0,2.5)$$

(7)

$$alpha sim N(0,2.5)$$

(8)

$$beta sim N(0,2.5)$$

(9)

$${sigma }_{alpha }sim N(0,2.5)$$

(10)

$${sigma }_{beta }sim N(0,2.5)$$

(11)

$$Rsim {rm{L}}{rm{K}}{rm{J}}{rm{C}}{rm{h}}{rm{o}}{rm{l}}{rm{e}}{rm{s}}{rm{k}}{rm{y}}(1).$$

(12)

These priors refer to standardized, log10-transformed response variables Qp and V0, and a standardized predictor (decimal year of observation) with zero mean and unit standard deviation. Our choice of a zero-mean Gaussian with standard deviation of 2.5 admits both negative and positive trends for β, expressing the contrasting trends reported for the size of ice-dam failures in recent decades1,6,19,22,72. The Lewandowski–Kurowicka–Joe (LKJ) Cholesky correlation distribution prior for R makes all correlation matrices equally likely.

For a fixed quantile, the joint posterior distribution can be thus written as

$$begin{array}{c}pi ({alpha }_{1},…,{alpha }_{J},{beta }_{1},…,{beta }_{J},kappa ,alpha ,beta ,{sigma }_{alpha },{sigma }_{beta },R|y,t,p)propto pi (kappa ,alpha ,beta ,{sigma }_{alpha },{sigma }_{beta },R)\ ,({prod }_{j=1}^{J}{prod }_{i=1}^{{n}_{j}}{f}_{{rm{A}}{rm{L}}}(,{y}_{ji}|{alpha }_{j}+{beta }_{j}{t}_{ji},p,kappa )pi ({alpha }_{j},{beta }_{j}|alpha ,beta ,{sigma }_{alpha },{sigma }_{beta },R)),end{array}$$

(13)

where (pi left({alpha }_{j},{beta }_{j}| alpha ,beta ,{sigma }_{alpha },{sigma }_{beta },Rright)) is the prior distribution for the group-level parameters as defined in equations (4)–(6) and (pi left(kappa ,alpha ,beta ,{sigma }_{alpha },{sigma }_{beta },Rright)) is the joint prior distribution for the independent population-level parameters as defined in equations (7)–(12). We numerically approximate the posterior distribution using a Hamiltonian sampling algorithm implemented in Stan73 that is called via the software package brms74 within the statistical programming language R75.

We used the same multi-level structure of the quantile regression model to determine how flood discharges have changed for individual lakes. To this end, we conditioned the model on J = 22 lakes (Qp) and J = 26 lakes (V0) that produced at least five outbursts each between 1900 and 2021. For these models, we focused only on the trends in median flood discharges, given that lakes with few reported GLOFs have few samples that exceed high thresholds.

Finally, to test whether more recent GLOFs originated from progressively higher elevations, we selected for each lake the year of the first reported GLOF, given that many ice-dammed lakes burst out repeatedly. As for our hierarchical models on GLOF magnitudes, we used quantile regression to determine the trend in the median elevation Z of glacier-dammed lakes against the decimal year of GLOF occurrence t, conditioned on the J = 6 regions.

We ran all regional and local quantile regression models on Qp, V0 and Z with three parallel chains that drew 6,000 samples with 2,000 warm-up runs each. We found no numerical divergences after warm-ups, which were supported by the Gelman–Rubin potential scale reduction factor (hat{R}=1.0) in all models, indicating that the Markov chains have converged. We report the posterior distributions of the pooled model parameters in Supplementary Tables 13, and the group-level regression slopes in Extended Data Figs. 2, 3 and 5. These regression slopes are conventionally deemed ‘credibly’ different from zero if a select interval of the posterior probability mass, in our case the 95% HDI, excludes zero. This choice of the HDI is arbitrary, and we also report how much of the probability mass of the posterior distribution is on either side of zero (Extended Data Figs. 25).

Assessing trends in doy

Measuring temporal changes in GLOF timing must consider that the response variable doy can recycle through the observation period, given that calendar day 0 is equal to day 365. The von Mises (VM) distribution is a close approximation to a wrapped normal distribution for circular data and allows estimating the angular response d (doy) from the linear predictor year t (ref. 64). The VM distribution has density

$${f}_{{rm{VM}}}(,y| {vartheta },phi )=frac{{rm{exp }}(phi ,cos (,y-{vartheta }))}{2pi {I}_{0}(phi )},$$

(14)

where y are the observed data, ϑ is a location parameter, φ is a positive precision parameter and I0 is the modified Bessel function of the first kind of order 0.

Following the parameterization in brms, we rescaled the response d to ({y}_{i}in (-pi ,pi )), standardized t to a zero mean and unit standard deviation, and estimated the trend in GLOF timing as

$${y}_{ji}sim {rm{V}}{rm{M}}({vartheta }_{ji},{phi }_{j}),,{rm{f}}{rm{o}}{rm{r}},j=1,,ldots ,,J,{rm{a}}{rm{n}}{rm{d}},i=1,,ldots ,,{n}_{j}$$

(15)

$${{vartheta }}_{ji}={zeta }_{j}+{eta }_{j}{t}_{ji},,{rm{for}},j=1,,ldots ,,J,{rm{and}},i=1,,ldots ,,{n}_{j}$$

(16)

$${zeta }_{j}sim N(0,2.5)$$

(17)

$${eta }_{j}sim N(0,2.5)$$

(18)

$${phi }_{j}sim {rm{G}}{rm{a}}{rm{m}}{rm{m}}{rm{a}}(2,0.01),$$

(19)

where yji are reported values of doy, with yji referring to the ith observation in group j, ϑji is the corresponding location parameter, and ζj and ηj are the estimated intercepts and slopes, respectively. We used the default Gamma prior in brms to estimate the precision parameter φj. This model was fit to each group separately without partial pooling of parameters across groups, because the distinct seasonal differences in GLOF timing in the two hemispheres (Extended Data Fig. 6) caused numerical divergences. Using a VM likelihood function, the joint posterior distribution can be written as

$$begin{array}{c}pi ({zeta }_{j},{eta }_{j},{phi }_{j}|{y}_{j},{t}_{j})propto ({prod }_{i=1}^{{n}_{j}}{f}_{{rm{V}}{rm{M}}}({y}_{ji}|{zeta }_{j}+{eta }_{j}{t}_{ji},{phi }_{j})),pi ({zeta }_{j},{eta }_{j},{phi }_{j}),\ ,,,{rm{f}}{rm{o}}{rm{r}},j=1,,ldots ,,Jend{array}$$

(20)

where yj and tj are the observations and corresponding years of group j and (pi ({zeta }_{j},{eta }_{j},{phi }_{j})) is the joint prior of the regression parameters. We maintained the set-up of the Hamiltonian sampler in brms, using three parallel chains each with 6,000 samples after 2,000 warm-up runs. We found that (hat{R}=1.0) in all runs, indicating convergent Markov chains. We report the posterior distributions of all model parameters in Supplementary Table 4.

Estimating local trends between glacier elevation change and V
0 and Q
p

The length and thickness of the glacier dam are important diagnostics for predicting flood magnitudes and timing76. Changes in dam geometry could therefore alter the pattern of lake outbursts, in both frequency and magnitude. In QGIS, we manually trimmed the area of all glaciers in the RGI V6.0 that impound a lake to the area below the dam. We define a dam as the area between the glacier terminus and the elevation contour on the glacier directly upstream of the lake. We then extracted the cumulative elevation change of all glacier dams between 2000 and 2019, following the procedure by ref. 8. In summary, this method first generates DEMs from all available stereo-pairs of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) satellite instrument intersecting glaciers during the period of 2000 to 2019. We stacked the ASTER DEMs in temporal order and further added high-resolution data from the ArcticDEM77 to the time series. Data from ArcticDEM cover glacier dams in Alaska, Iceland and Scandinavia (Supplementary Table 6). Each pixel in the DEM time series is then interpolated using a Gaussian process regression model to estimate the elevation at any date within this period. We extracted the elevation time series for the areas of the glacier dams, resulting in an area-weighted average of 45 independent DEMs between 2000 and 2019 for each dam. Individual glacier dams are covered by at least 95% of measured pixels, drawing on at least 13 independent DEMs per pixel. We calculated the associated elevation differences between successive time steps, interpolated spatial data gaps hypsometrically78 and finally aggregated them to the cumulative mean elevation changes with respect to 1 January 2000. On average, our glacier dams thin at a rate of 3 m yr−1 during the period 2000–2019, with an average uncertainty of ±0.23 m yr−1 (95% confidence level). The elevation change uncertainties were estimated using a framework that accounts for heteroscedasticity (that is, variability in precision), which more reliably describes errors depending on terrain slope and the quality of stereo-correlation. The framework also accounts for spatial correlation of errors using multiple correlation ranges that better describe ASTER instrument noise79. These uncertainties were extensively validated using independent high-precision data from NASA’s Ice, Cloud and land Elevation Satellite (ICESat) and IceBridge missions, as well as LiDAR acquisitions8. Supplementary Table 6 reports statistics on spatial coverage, the temporal density of DEM data, thinning rates between 2000 and 2019, and associated uncertainties for each glacier dam.

To assess the impact of glacier thinning on GLOF magnitudes, we selected only J = 13 glacier lakes that burst out repeatedly (n > 5) between 2000 and 2019. For each dated outburst within a GLOF cycle, we obtained the closest temporal estimate of the cumulative elevation change of the glacier dam. We then fitted a hierarchical quantile regression model, that is, the same as in equations (1)–(13), to estimate the median of reported V0 and Qp at a given glacier dam from the associated cumulative elevation change h. As we did when calculating temporal trends of V0, Qp and Z, we chose an AL likelihood to estimate the conditional median of V0 and Qp, with the same (hyper-)prior distributions as in equations (7)–(12). We ran three parallel chains with 6,000 samples and 2,000 warm-up runs and found no numerical divergences. We report the posterior distributions of all model parameters in Supplementary Table 5.

Source link