May 11, 2024
Global hotspots of traded phylogenetic and functional diversity – Nature

Global hotspots of traded phylogenetic and functional diversity – Nature

Trade data

To determine whether a species was traded we used the global terrestrial vertebrate dataset compiled by Scheffers et al.2, filtering for birds and mammals. This contains extensive data from CITES and the International Union for Conservation of Nature (IUCN) red list. The IUCN list was generated via text-string search and manual reading to confirm trade. All species listed on CITES Appendix II as being ‘look-alike species’ and that are not traded themselves were considered as not traded in this study; eight species recently listed as extinct or extinct in the wild were removed. This resulted in a database of 4,265  avian and 1,189 mammalian species known to be traded from a total of 10,267 avian and 5,419 mammalian species (see Supplementary Table 9 for numbers of species used in each analysis). From this dataset we also extracted information on whether a species was traded as a product (that is, dead when used) or as a pet (that is, alive when used). A species can be traded as both a product and a pet.

Spatial analyses

We divided the world into 111 × 111 km2 grid cells using a cylindrical equal-area projection and removing coastal cells consisting of less than 30% land. Species range maps were obtained from the IUCN Red List48 and superimposed onto this grid, with their presence/absence within each cell being recorded. A species was recorded as being present if its distribution overlapped any part of the cell. Each taxon was recorded as either traded or not (4,265 traded avian and 1,189 traded mammalian species). To compare geographical patterns in trade across biologically meaningful regions, we assigned each grid cell to one of the 11 biogeographical realms classified by Holt et al.49.

Hotspots of traded PD and EDGE richness

In calculation of PD we used the most comprehensive global, time-calibrated species-level phylogenetic trees available. For birds this was derived from Jetz et al.50, overlaid on a Hackett family-level backbone containing 9,993 species; for mammals, the phylogenetic tree provided by Upham et al.51 containing 5,325 species. Nomenclature of species was standardized according to the corresponding phylogenies, resulting in phylogenetic analyses undertaken using 9,792 avian species and 5,325 mammalian. In total, 432 avian and 94 mammalian species were lost from the dataset used for phylogenetic analyses because they were not present in the phylogenetic trees.

To determine the hotspots of traded PD, Faith’s PD (from now on, PD, which is the sum of all branch lengths on a phylogenetic tree for a given set of species4), was calculated for each grid cell using the R package picante52. For birds and mammals, separately, this was calculated for all species within a grid cell followed by traded species only. To assess whether there are regional differences across different types of trade, PD was finally calculated for species traded as pets and those traded as products. To account for phylogenetic uncertainty, PD was calculated using 500 trees (for birds, 250 based on the Ericson backbone and 250 based on the Hackett backbone), with the median value for each grid cell being selected. Hotspot thresholds were set at the top 25 and 5% of grid cells, as per Scheffers et al2, to identify where trade is highest and to provide a measure of spread between these hotspots.

Because PD is correlated with species richness, to control for this and identify regions where a greater proportion of PD is traded than would be expected, we calculated ses.PD using the R package picante52. ses.PD compares the observed PD (PDobs) with that of null communities having the same species richness (equation (1)) to assess whether the observed PD is overdispersed (greater) or underdispersed (lower) in comparison with what would be expected under the null expectation of PD (PDexp) for a given community, and is calculated as

$$({rm{Observed}},{rm{PD}}-{rm{mean}},{rm{expected}},{rm{PD}})/{rm{SD}}({rm{expected}},{rm{PD}})$$

(1)

The expected PD values for each grid cell were determined by calculating the mean PD of 999 random null communities. Null communities were generated by randomization of species at the tips of the phylogeny but restricted to the regional pool of species present in the biogeographical realm of a given grid cell, giving geographically plausible null communities with species richness maintained. Due to the computational load of calculating ses.PD, the PD of null communities was calculated using the median value from 200 phylogenetic trees as opposed to 500 for other metrics (for birds, 100 based on the Ericson backbone and 100 based on the Hackett backbone). Hotspot thresholds were again set at the top 25 and 5% of grid cells.

For all avian and mammalian species, evolutionary distinctiveness was calculated using the ‘fair proportion’ method in the R package picante52. This method divides the value of each branch length of a phylogenetic tree by the number of species at its tip53. Evolutionary distinctiveness measures the isolation of a species on a phylogenetic tree, usually expressed in units of time (per million years)53. As with PD, to ensure that our results were robust to phylogenetic uncertainty, evolutionary distinctiveness values were calculated using 500 trees (for birds, 250 based on the Ericson backbone and 250 based on the Hackett backbone), with the median value for each species being selected. Using these evolutionary distinctiveness metrics, EDGE scores for all species in the phylogeny (excluding species listed as data deficient on the IUCN Red list) were calculated by weighting a species’ evolutionary distinctiveness by its IUCN threat status (equation (2)). We used the method proposed by Isaac et al.53, whereby global endangerment is a species IUCN Red list Category, weighted as follows: least concern, 0; near threatened, 1; vulnerable, 2; endangered, 3; critically endangered, 4. Data-deficient species were excluded (see Supplementary Table 9 for number of species used):

$${rm{E}}{rm{D}}{rm{G}}{rm{E}}=log (1+{rm{E}}{rm{D}})+{rm{G}}{rm{E}}times log (2)$$

(2)

Following calculation of EDGE scores for all species, the values of each metric for traded species present within each grid cell were summed to measure the cumulative levels of EDGE traded within each cell. As with PD, this process was additionally undertaken for all species within each cell to allow for comparisons between the two.

The top 25% of EDGE traded species was then extracted (see Supplementary Table 9 for species numbers) and their range maps overlaid to calculate their species richness in each grid cell. This was done separately for birds and mammals, allowing identification of the regions with the highest richness of traded EDGE species. To determine whether these regions differ from overall species trends, this process was then repeated using the top 25% of all EDGE species (see Supplementary Table 7 for species numbers). We also generated richness maps for those traded as pets and those as products. Hotspot thresholds were again set at the top 25 and 5% of grid cells with the highest species richness levels. We repeated this process using species evolutionary distinctiveness, and present these results in Extended Data Fig. 7.

Regional differences in proportion traded

Along with identification of hotspots for PD, evolutionary distinctiveness and EDGE, we also tested whether biogeographic realms differed in the proportion of each metric that was traded. To assess this, the proportions of PD, cumulative EDGE and cumulative evolutionary distinctiveness traded per grid cell were first calculated and then we fitted beta-regression models using the R package betareg54. Because our data did not fulfil the assumption that all values must fall between 0 and 1 (in some grid cells 0 and 100% of the community was traded), we transformed the proportions using the method proposed by Smithson and Verkuilen55 whereby n is sample size and y is the proportion of the respective measure subject to trade (equation (3)). The models were fit with biogeographic realm as the sole fixed effect. Model fit was assessed via diagnostic plots following Cribari-Neto and Zeilis54. Likelihood ratio tests were used to assess whether the effect of biogeographic realm was significant, followed by post hoc Tukey tests to evaluate differences between specific realms:

Hotspots of traded FD

Functional trait data for birds and mammals were extracted from Wilman et al.56 and assigned at species level. Four functional traits were used to calculate FD: (1) body mass (log transformed); (2) dietary composition; (3) foraging strata; and (4) activity period (details given in Supplementary Table 10). These traits cover a large proportion of Eltonian niche space56, providing information on multiple aspects of resource use and ecosystem interactions, such as the quantity and type of resources consumed by each species and where, within ecosystems, these interactions take place. They are thus representative of important functional dimensions of both birds and mammals.

Although all bird species within our phylogeny had trait data, 351 mammalian species present in the Upham phylogeny were missing trait data. For these species, missing traits were phylogenetically imputed using maximum-likelihood ancestral state reconstruction with the R package Rphylopars57 assuming Brownian motion. This imputation method increases the accuracy of the estimation of missing traits in comparison with other imputation approaches that do not consider phylogenetic information.

To ensure that this imputation process was appropriate we undertook the following steps. First we checked whether the traits showed a phylogenetic signal. For the two continuous traits (body mass and proportion of diet) we calculated Pagels lambda using the R package phytools58 and tested whether this was significantly different from the scenario in which the trait had evolved randomly. For the two discrete traits (foraging strata and activity period) we calculated the D-statistic using the R package caper59. The phylogenetic signal of all traits significantly differed than if the trait had evolved randomly (values presented in Supplementary Table 10).

Following this, we fitted three phylogenetic linear models (pGLMs) using the R package Rphylopars57. Each pGLM was fitted using using a different evolutionary model: Brownian motion, Pagel’s lambda, Ornstein–Beck and Kappa. When compared using the Akaike information criterion, the pGLM using a Brownian motion evolutionary model was found to fit our data best and this was the one selected to conduct our imputations. The categorical foraging stratum trait was dummy coded, setting ground, scansorial or aerial foraging to 1 for species that forage on each respective stratum. If imputed values for a species were 0 across all three foraging strategies, that species was set as arboreally foraging.

Given that simulations have shown that phylogenetic imputation is not always suitable even when phylogenetic signal is strong60, we then performed leave-one-out cross-validation on the pGLM to assess the accuracy of predictions. The results from this are presented in Supplementary Table 11. We evaluated the accuracy of our predictions using the mean absolute error and prediction coefficient as defined in ref. 61. Following this, all imputed traits were checked to ensure that they contained values consistent with a given trait type (for example, imputed dietary traits when rounded to the nearest 5 should sum to 100 to represent the proportion of a species diet). Through this, we identified the western sucker-footed bat (Myzopoda schliemanni) as having errors in its imputed trait values given that its predicted dietary traits summed to less than 10% of the overall proportion. Given this, we removed this species from our FD analyses. Finally we manually checked the imputed values of a random subset of 100 species to ensure that they were plausible given the information available on the species in the scientific literature. Following these processes, traits of 350 mammalian species were imputed and used in our functional analyses. This ultimately led to 15 avian and 81 mammalian species in our dataset being dropped from the FD analyses due to a lack of trait data and not being present in our phylogenies (Supplementary Table 9).

The metric used for FD was functional richness (from now on, FD), as described by Villéger et al.62, and has been used in similar global-scale analyses21. This index relies on placing a species within a multidimensional niche space in which the axes represent a combination of traits. FD quantifies the volume of this niche space occupied by the convex hull of a given set of species62. Higher FD values thus indicate a community having a wider range of trait values. FD relies on the assumption that species richness is greater than the number of traits, and thus cells with fewer than four species were removed. Because a combination of categorical and continuous traits was used, a pairwise species dissimilarity matrix of Gower distances was first calculated using the R package gawdis63, weighting traits so that each trait value contributed equally to the dissimilarity matrix. Principal coordinate analyses were then run, using three principal coordinate analyses axes, to gain the transformed coordinates, which were then used to calculate FD in the package mFD64. Hotspot thresholds were once again set at the top 25 and 5% of grid cells.

Regional differences in proportion traded

Regional differences in the proportion of FD were also measured as above.

ses.FD

Given that FD is also correlated with species richness62, to account for this we also calculated ses.FD of each grid cell following the same process as with ses.PD above.

Precautionary re-analysis

Our primary analysis uses all species identified as being in trade to identify epicentres of traded PD, EDGE and FD. This approach reduces potential omission-driven errors because it captures all locations in which species may be traded. However, it has the potential to introduce commission-driven errors in the identification of epicentres by inclusion of species that are not actually traded across their entire range. This may be a particular problem for species with very large ranges that are traded across only a smaller portion of that range. We thus repeated all geographic analyses (PD, EDGE and FD), using only realm-endemic species, to substantially reduce the risk of commission-driven errors. The results of this are presented in full in Supplementary Information. However, we caveat that this re-analysis could introduce omission-driven discrepancies, in which species are traded across much of their range that spans realms or in which widespread species are traded in an area appropriately identified as an epicentre of trade (and not traded in areas that were not identified as epicentres) in our primary analysis, resulting in the loss of that epicentre.

Prevalence of dietary or foraging traits in trade

Bayesian phylogenetic multilevel models were used to investigate whether any dietary or foraging-activity traits are associated with a species presence in trade. All species present in the global phylogenies were included in the models (9,835 avian and 5,325 mammalian species). We fit models using a Bernoulli distribution (logit link function) using the package brms65,66. Species presence in trade was the response variable, with each of the functional traits used in our functional diversity analyses (Supplementary Table 8) as explanatory variables (fixed effects). Dietary and functional traits represented as proportions were set as binary if they represented over 25% of a species’ diet/foraging stratum (Supplementary Table 4). Although body mass has already been identified as a key predictor of trade2, it was included as a fixed effect to account for correlations between size and other functional traits, and to assess whether the effect of body mass is still present when other functional traits are considered. log(Body mass) was standardized to have a mean of 0 and standard deviation of 1, to allow comparison with other traits in the model. Given that trade shows a phylogenetic signal2, the likelihood of a species being traded is non-independent and hence we computed a phylogenetic covariance matrix where the diagonal elements are equal to 1 using the R package ape67. The phylogenetic dependence of species was thus included as a random effect using this matrix.

Models were run with 4,000 iterations and 2,000 warm-up iterations in four Markov chains. All priors are zero-centred and diffuse to regularize parameter estimates and still explore plausible parameter space. A normal (0, 0.5) prior puts a priori weight on there being a 0.5 probability (inverse logit of 0) of the reference category being traded with a standard deviation of 0.5 on the logit scale. This incorporates almost the full range of values between 0 and 100% of being traded, without putting unnecessary weight on extremely high or low values: for example, an intercept prior centred on 3 on the logit scale would reflect the a priori expectation that over 95.2% of the reference category is traded. To ensure that chains were mixing and reached stable convergence, both models were visually assessed. Rhat (potential scale reduction values) values are below 1.02 for all model parameters, indicating convergence of both between- and within-chain estimates. Finally, to assess model adequacy, posterior predictive checks were undertaken using the predictive distribution in the R package Bayesplot68. These first compared our response variable with the simulated predictions from the model to ensure that the model had faithfully captured response distribution. We further checked the mean of the simulated data distribution with our response data to ensure that it was accurately recovered from the posterior. Finally we checked that no underlying patterns or discrepancies were present in the predictive error of predictive distribution.

For measures of uncertainty, the posterior distributions of each trait were summarized using medians and the 90% credible interval (highest density intervals). To assess the effect of functional traits, MPE estimates were computed for coefficients using the R package bayestestR69. MPE estimates—which range from 0.5 to 1.0—indicate the certainty of the direction of an effect and are generated from posterior distributions69. This index is highly correlated with the commonly used frequentist one- and two-sided P values and can therefore be useful for interpretation70. The MPE of parameters, alongside summaries of posterior distributions, was thus used to interpret the effect of having particular functional traits on a species’ likelihood of being traded. We assessed an MPE as being substantial where the probability of an effect going in a certain direction was over 97.50%. All analyses were undertaken using R v.4.2.1 (ref. 71).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Source link