Local species richness may be declining after all

Recently two papers seemed to turn what we thought we knew about changes in biodiversity on its head. These papers, by Vellend et al. and Dornelas et al., collated data from multiple sources and suggested that species richness at local scales is not currently declining. This was counter-intuitive because we all know that species are going extinct at unprecedented rates. However, it is possible that the introduction of non-native species and the recovery of previously cultivated areas offset extinctions, leading to relatively little net change in local species richness.

This week a paper has been published that calls these findings into question. The paper, by Andy Gonzalez and colleagues published in the journal Ecology, suggests that there are three major flaws with the analyses. These flaws mean that the question ‘Is local-scale species richness declining?’ remains unanswered and, with the current data, unanswerable.

The papers of Vellend et al. and Dornelas et al. were meta-analyses of previously published studies. One issue with meta-analysis is that it is very prone to bias. As in any study, if the samples (in this case ecological studies) are not representative of the population (in this case locations around the globe), then the results will be flawed. To test the representativeness of the datasets used by Vellend and Dornelas, Gonzalez et al. examined how well they represented biodiversity and the threats it faces. This analysis (see below) showed that the papers were representative of neither (though curiously, the dataset of Dornelas et al. overrepresented areas heavily affected by human activity).

Figure 1 – Spatial bias of the Vellend et al. (2013) and Dornelas et al. (2014) data syntheses. For more information see the paper by Gonzalez et al. (2016).

The paper also suggests that using short time series can underestimate losses. By analysing the relationship between study duration and changes in species richness (see below), Gonzalez et al. claim that longer study durations were associated with declines in species richness. This supports previous theory suggesting that there is often a time lag between disturbance events and species extinctions – termed ‘extinction debt.’ However, I’d be intrigued to see the results of removing the studies with the longest duration from this analysis, since the authors admit that the analysis is sensitive to their inclusion. I’ve seen recent similar work suggesting the same kind of relationship might be seen in studies monitoring individual animal populations.

Figure 2 – The effect of study duration on apparent changes in species richness.

Thirdly, Gonzalez et al. assert that including studies of ecosystems recovering from disturbance (e.g. regrowth on former agricultural fields), without accounting for the historical losses that occurred during or after the disturbance, biases estimates of change. The paper by Vellend et al. in particular combined studies of the immediate response of biodiversity to disturbances such as fire and grazing with studies of recovery from those very same disturbances. Gonzalez et al. show that once the recovering systems are removed from Vellend et al.'s analysis, the trend in species richness change is negative.

The biases prevalent in the Vellend and Dornelas papers lead Gonzalez et al. to suggest that those papers cannot tell us what the net change in local species richness is at a global scale. However, they note that the results of Dornelas and Vellend are in sharp contrast to other syntheses of biodiversity change which used undisturbed reference sites, such as those by Newbold et al. and Murphy and Romanuk, which reported average losses of species richness of 14% and 18% respectively.

In their conclusion Gonzalez et al. suggest that though meta-analysis is a powerful tool, it needs to be used with great care. Or to put it another way, with great power comes great responsibility. As someone who regularly uses meta-analysis to form generalisations about how nature works, I completely agree with this statement. Traditionally scientists have used funnel plots (graphs with study sample size on the y-axis and effect size on the x-axis) to identify biases in their analyses. I’ve always been skeptical of this approach, especially in ecology where there is always a large amount of variation between sites. In the future, syntheses would do well to follow the advice of Gonzalez et al. and really interrogate the data they are using to find any taxonomic, geographic, climatic or other biases that might limit their ability to generalise. I know it’s something I’ll be taking more seriously in the future.

Gonzalez et al. also point out that most ecological research is carried out in Europe and North America. If we want to monitor biodiversity we need to increase efforts in biodiverse tropical regions, as well as boreal forests, tundra and deserts. We need to identify where these gaps need filling most, and then relevant organisations need to prioritise efforts to carry out monitoring. I am positive that this can be achieved, but it will cost a lot of money, needs to be highlighted as a priority and will need a lot of political good will. Even with this effort some of the gaps in biodiverse regions, such as the Democratic Republic of Congo, will be extremely difficult to fill due to ongoing armed conflict.

My take-home message from this paper is that we need to be more careful about how we do synthesis. However, I also think that species richness isn’t the only metric that we should focus on when talking about biodiversity change. Studies have shown that measures of the traits of species present in a community are generally more useful for predicting changes in ecosystem function than just using species richness. Species richness is the iconic measure of biodiversity, but it probably isn’t the best. Ecologists should view species richness in the same way as doctors view a thermometer – it’s a useful tool but you still need to be able to monitor blood pressure, take biopsies and listen to a patient’s lungs before you diagnose them*.



*Thanks to Falko Bushke whose analogy I stole from a comment he made on my blog post here.


Guest post: Responses of functional groups to forest recovery suggests irreplaceability of old-growth forests

Today we have a guest post from Becks Spake who is doing her PhD at Southampton University (as an aside, Becks is currently on the lookout for a post-doc, so feel free to get in touch with her). Last week a paper she and I (and others) have been working on together on recovery of temperate and boreal forests was published in Conservation Biology. I think it’s a really neat bit of work that emphasises the importance of old-growth forests for biodiversity. I’ll leave Becks to tell you the rest.

Forest restoration measures are being increasingly implemented worldwide to enhance biodiversity and ecosystem service provision in degraded landscapes. These measures range from active restoration, such as planting, to passive restoration, where natural recovery is promoted following the removal of some environmental stressor such as grazing. Forest restoration is also used as a biodiversity offsetting mechanism to mitigate the loss of habitat incurred by development; such ‘restoration offsets’ generate new habitat at an offset site to compensate for the loss of habitat to development at the impact site.

Concerns about the value of replacing old-growth forest with plantations and young regenerating forest have motivated research on biodiversity recovery as forest stands age. Several reviews have quantified the recovery times required for biodiversity, including measures of species diversity and composition, to reach equivalence to some reference state, typically undisturbed old-growth forest. Such reviews have been produced for relatively charismatic taxonomic groups recovering in secondary tropical forests (e.g. Dunn 2004; Chazdon et al. 2009; Martin et al. 2013*). These syntheses have made important contributions to restoration science, showing that different taxonomic groups exhibit different patterns and rates of recovery with stand age and that these must be acknowledged by forest management strategies.

In our study, we attempted to assess the recovery of species richness in restored forests outside of the tropics. We synthesised data from empirical studies measuring species richness differences between old-growth and secondary forest in temperate, boreal and Mediterranean regions. We focussed on studies that investigated species-rich functional groupings of fungi, beetles and lichens (Fig. 1), chosen for their relative unsexiness (and consequent under-representation in existing reviews), their importance to ecosystem function, and their sensitivity to stand-level processes.


Fig.1 – Species from the functional groupings investigated. The functional groups were: saproxylic and non-saproxylic beetles, epiphytic lichens and deadwood, litter, and ectomycorrhizal fungi. Photos: Simon Currie.

Key findings

1. Functional group-specific responses to forest recovery

We found that functional groups responded differently to forest recovery (Fig. 2). Ectomycorrhizal fungi took an average of 90 years to recover to old-growth values, and epiphytic lichens took 180 years to reach 90% of old-growth values. Non-saproxylic beetle richness, in contrast, decreased as the stand age of broadleaved forests increased.


Fig. 2 – Influence of stand age on percent change in species richness for 7 functional groups in planted and secondary forest relative to old-growth forest stands (horizontal dashed line: no difference between undisturbed old-growth forest and treatment [planted and secondary] forest stands; gray: 95% prediction intervals based on uncertainty in fixed effects only).

2. Pseudo-replication is widespread amongst empirical research investigating forest management impacts on biodiversity

Our systematic review yielded just 33 publications (90 individual studies) in which old-growth was compared with planted or secondary forests in a statistically robust way. For some functional groups, this led to small sample sizes and low precision in lnR values (Fig. 2). This low number of studies was due to the fact that a high proportion of studies were pseudo-replicated, with a lack of independence across replicates. Many forms of pseudo-replication exist (Fig. 3), the most common being simple segregation sensu Hurlbert (1984), in which multiple samples from a single contiguous treatment unit are analysed as if they were independent replicates that were interspersed with control replicates (Fig. 3 B-1). Differences between control and treatment replicates that are simply segregated cannot be unambiguously distinguished from other sources of spatial variation.


Figure 3. Schematic representation of various acceptable modes (A) of interspersing replicates (boxes) of two treatments (shaded, unshaded) and various ways (B) in which the principle of interspersion can be violated. From Hurlbert (1984).

Despite the widespread recognition of the problems associated with pseudo-replication, it still features prominently in peer-reviewed studies. The situation is the same in the tropics; Ramage et al. (2013) reviewed recent studies of the effects of logging on biodiversity in tropical forests (n = 77) and found that 68% of the studies were definitively pseudoreplicated, and only 7% were definitively free of pseudoreplication. Whilst data from pseudoreplicated studies can inform management in very local contexts, conclusions from these studies must not be generalised and data must not be included in meta-analyses.
Conservation implications

The primary goal of biodiversity offsetting is to achieve no net loss of biodiversity. Our results show that through restoration offsetting, this goal is unachievable within a reasonable time frame. The slow recovery of species richness for some functional groups essential to ecosystem functioning makes old-growth forest an effectively irreplaceable biodiversity resource that should never be compromised by development.

Our findings support the value of protecting old-growth forest through reserve creation, but also recognise the potential for planted and secondary forests to support biodiversity, if given enough time to recover. Our findings therefore also support the setting aside of overmature planted forest for biodiversity conservation, and the implementation of schemes that extend rotation-length of planted forests within production forest landscapes.

* For my blog post about this paper see here.

Logging intensity drives species richness loss

An area bigger than the entire Indian landmass is now used for timber production in the tropics. This logging is largely selective and leads to degradation, with loss of specialist species and of ecosystem services like carbon storage. However, many have also argued that these forests should be considered a priority for protection, since they are in danger of conversion to other land-uses such as agriculture. In addition, the impact of logging on tropical forest biodiversity appears to be less negative than that of other human impacts.

However, simply saying that logging is a less damaging option compared to other ways in which humans exploit forests misses a lot of what is going on. Logging operations differ massively from place to place in the volume of trees cut for timber, the area affected by logging, the distance between logged and unlogged areas – I could go on. All of these differences have the potential to influence the effect of logging on biodiversity.


How can we value the studies used in meta-analysis?

Not by doing this to primary researchers. So let’s change things. Photo credit to Killer Cars on flickr.

I signed a letter this week asking ISI, Google Scholar and Scopus to recognise the articles used in meta-analyses as if they were regular citations. I, and many other people who use data we haven’t collected ourselves, feel that those who did the primary research are not being fully recognised and given enough credit. Research is ranked by citations, and it is perverse to award someone a gold star for a citation that may support a single statement, but not for supplying the data that forms the basis of an entire study.

I agree with what I signed but, even if successful, it will take a while to implement.

My question is: What should I do about the problem that will make a difference now?

As far as I can see I have three options, none of them perfect:

  1. Continue as before, ignoring this issue
  2. Cite papers I used in the main text so credit is given to primary researchers
  3. Offer co-authorship to those authors that provided me with data

Really I don’t know which is best.

The first would be the easiest to do and I’m sure many researchers will continue to do this – their lives are already complicated enough. I’m not really happy doing that though – it undermines valuable work by people in the field, without whom I wouldn’t have a job.

The second, for me, will never really work. I have a meta-analysis that I recently carried out that has >80 papers as data sources. I couldn’t cite these papers unless I wanted to have a reference list of >100 papers. This kind of thing doesn’t make publishers happy.

The third seems to me like a good compromise, but is the most difficult to do. For example, I am currently working on something using the data of others that has potentially controversial conclusions. What do I do in this case? Do I offer people co-authorship, even though they may well disagree with me about my analysis and conclusions?

It’s been running through my head for a while now and I’d like to get a few opinions from others about this. If you think any of these options is particularly appealing, tell me. Or do you have other ways to fix this problem in the short term? What would you do?

Whatever your thoughts, give me some feedback so I can work out the best path to take and please sign the open letter.

Using metafoR and ggplot together: part 2

In the last post I introduced you to the basics of meta-analysis using metafor together with ggplot2. If you haven’t seen that post, I suggest you go back and read it since this post will be hard to follow otherwise.

In this post I will concentrate on trying to explain differences between different sites/studies.

Going back to our previous example, we saw that logging tended to have a negative effect on species richness.

Luckily my completely made up data had information on the intensity of logging (volume of wood removed per hectare) and the method used (conventional or reduced impact).

We can make a model looking at the effects of logging intensity and method on species richness. First we can start with our most complex model
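As a sketch, with simulated stand-in data (the column names `yi`, `vi`, `Intensity` and `Method` are my inventions, since I'm not sharing the real data), the call would look something like:

```r
library(metafor)

set.seed(1)
# stand-in for the made-up logging dataset (30 comparisons)
dat <- data.frame(
  yi = rnorm(30, -0.4, 0.6),               # log response ratios
  vi = runif(30, 0.005, 0.05),             # sampling variances
  Intensity = runif(30, 5, 60),            # volume of wood removed per ha
  Method = rep(c("CL", "RIL"), each = 15)  # conventional vs reduced impact
)

# most complex model: intensity, method and their interaction as moderators
# (method = "ML" so models with different moderators can be compared later)
Model1 <- rma(yi, vi, mods = ~ Intensity * Method, method = "ML", data = dat)
summary(Model1)
```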


This spits out the following:

Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-15.8430  121.7556   41.6860   48.6920

tau^2 (estimated amount of residual heterogeneity):     0.1576 (SE = 0.0436)
tau (square root of estimated tau^2 value):             0.3970
I^2 (residual heterogeneity / unaccounted variability): 96.52%
H^2 (unaccounted variability / sampling variability):   28.74

Test for Residual Heterogeneity:
QE(df = 26) = 1151.1656, p-val < .0001

Test of Moderators (coefficient(s) 2,3,4):
QM(df = 3) = 38.2110, p-val < .0001

Model Results:

                     estimate      se     zval    pval    ci.lb    ci.ub
intrcpt                0.1430  0.3566   0.4011  0.6883  -0.5558   0.8419
Intensity             -0.0281  0.0099  -2.8549  0.0043  -0.0474  -0.0088  **
MethodRIL              0.1715  0.4210   0.4074  0.6837  -0.6536   0.9966
Intensity:MethodRIL    0.0106  0.0138   0.7638  0.4450  -0.0166   0.0377

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Our residual plots seem to conform to the assumption of constant variance.
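A quick residual check for a metafor model can be sketched like this (again with simulated stand-in data and invented column names, so the snippet runs on its own):

```r
library(metafor)

set.seed(1)
# simulated stand-in data (column names are my inventions)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05),
                  Intensity = runif(30, 5, 60),
                  Method = rep(c("CL", "RIL"), each = 15))
Model1 <- rma(yi, vi, mods = ~ Intensity * Method, method = "ML", data = dat)

# standardised residuals against fitted values -- we want no obvious pattern
plot(fitted(Model1), rstandard(Model1)$z,
     xlab = "Fitted values", ylab = "Standardised residuals")
abline(h = 0, lty = 2)
```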


Our model suggests that the more wood is extracted from a forest, the greater the negative effect on species richness, but there seems to be little evidence of an effect of method or of the interaction between intensity and method.

So, let's get rid of the interaction and see what happens:
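Sketched with the same kind of simulated stand-in data (column names are my inventions), the call simply drops the interaction term:

```r
library(metafor)

set.seed(1)
# simulated stand-in data (column names are my inventions)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05),
                  Intensity = runif(30, 5, 60),
                  Method = rep(c("CL", "RIL"), each = 15))

# additive model: main effects of intensity and method only
Model2 <- rma(yi, vi, mods = ~ Intensity + Method, method = "ML", data = dat)
summary(Model2)
```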



Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-16.1323  122.3341   40.2646   45.8694

tau^2 (estimated amount of residual heterogeneity):     0.1606 (SE = 0.0443)
tau (square root of estimated tau^2 value):             0.4008
I^2 (residual heterogeneity / unaccounted variability): 96.75%
H^2 (unaccounted variability / sampling variability):   30.80

Test for Residual Heterogeneity:
QE(df = 27) = 1151.4310, p-val < .0001

Test of Moderators (coefficient(s) 2,3):
QM(df = 2) = 36.9762, p-val < .0001

Model Results:

           estimate      se     zval    pval    ci.lb    ci.ub
intrcpt     -0.0416  0.2648  -0.1570  0.8753  -0.5606   0.4774
Intensity   -0.0228  0.0070  -3.2617  0.0011  -0.0364  -0.0091  **
MethodRIL    0.4630  0.1803   2.5681  0.0102   0.1096   0.8163   *

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now both intensity and method are statistically significant, but method less so. Let’s take it out:
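With the same stand-in data as before (column names are my inventions), the intensity-only model is:

```r
library(metafor)

set.seed(1)
# simulated stand-in data (column names are my inventions)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05),
                  Intensity = runif(30, 5, 60),
                  Method = rep(c("CL", "RIL"), each = 15))

# intensity as the only moderator
Model3 <- rma(yi, vi, mods = ~ Intensity, method = "ML", data = dat)
summary(Model3)
```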



Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-19.1526  128.3747   44.3052   48.5088

tau^2 (estimated amount of residual heterogeneity):     0.1962 (SE = 0.0536)
tau (square root of estimated tau^2 value):             0.4429
I^2 (residual heterogeneity / unaccounted variability): 97.45%
H^2 (unaccounted variability / sampling variability):   39.27

Test for Residual Heterogeneity:
QE(df = 28) = 1258.6907, p-val < .0001

Test of Moderators (coefficient(s) 2):
QM(df = 1) = 25.2163, p-val < .0001

Model Results:

           estimate      se     zval    pval    ci.lb    ci.ub
intrcpt      0.4678  0.1927   2.4276  0.0152   0.0901   0.8455    *
Intensity   -0.0324  0.0065  -5.0216  <.0001  -0.0451  -0.0198  ***

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Unsurprisingly, intensity is now more significant. However, we have taken out method, which may have some value in explaining differences between studies. The tool I use most for model selection is AIC, which basically aims to find the best-fitting model with the fewest explanatory variables. This is why I did the last step of removing method: it could be that the effect of method is significant but adds relatively little explanatory power to the model.

Let’s test this by calculating their AICs.



> AIC(Model1)
[1] 41.68604
> AIC(Model2)
[1] 40.26459
> AIC(Model3)
[1] 44.30516

We choose the model with the lowest AIC, in this case Model2.

To calculate the fit of our model, we can compare it to our intercept-only model (the original meta-analysis from part 1), but we have to refit with method="ML", since REML likelihoods are not comparable between models with different fixed effects.


Using the two tau-squared statistics we can calculate the proportion of heterogeneity in the intercept-only model that is explained by our moderators (a pseudo-R²):

1-\frac{\tau^{2}_{Model2}}{\tau^{2}_{ROM.ma2}}
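In code, sketched with simulated stand-in data (object and column names are my inventions), that calculation is just a ratio of the two `tau2` values stored in the fitted model objects:

```r
library(metafor)

set.seed(1)
# simulated stand-in data (column names are my inventions)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05),
                  Intensity = runif(30, 5, 60),
                  Method = rep(c("CL", "RIL"), each = 15))

# intercept-only model and moderator model, both fitted with ML
ROM.ma2 <- rma(yi, vi, method = "ML", data = dat)
Model2  <- rma(yi, vi, mods = ~ Intensity + Method, method = "ML", data = dat)

# proportion of between-study heterogeneity accounted for by the moderators
pseudo.R2 <- 1 - Model2$tau2 / ROM.ma2$tau2
pseudo.R2
```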

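The code that built `in.plot2` isn't shown; a minimal reconstruction with simulated stand-in data (column names and the plot layers are my guesses, not the original script) might be:

```r
library(ggplot2)

set.seed(1)
# simulated stand-in data (column names are my guesses)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05),
                  Intensity = runif(30, 5, 60),
                  Method = rep(c("CL", "RIL"), each = 15))

# bubble plot: point size proportional to weight (1/vi), colour by logging method
in.plot <- ggplot(dat, aes(x = Intensity, y = yi, colour = Method)) +
  geom_point(aes(size = 1 / vi)) +
  geom_smooth(method = "lm", se = FALSE)
in.plot2 <- in.plot + theme_bw() + theme(legend.position = "none")
```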
in.plot3<-in.plot2+ylab("log response ratio")+xlab("Logging intensity (m3 ha-1)")

Very pretty.

Circle size is relative to weight for each study, since we weight the most precise studies most heavily. Blue points and line refer to reduced impact logging sites, while red refers to conventionally logged sites.

We can also transform it all to proportions for more intuitive interpretations:

in.plot3<-in.plot2+ylab("Proportional change\nin species richness")+xlab("Logging intensity (m3 ha-1)")

The graphs show that, for my made up data, reduced impact logging has a less negative effect than conventional logging on species richness when carried out at similar intensities but that increasing intensity leads to reductions in species richness for both conventional and reduced impact methods.

I think that wraps it up for now. I hope you’ve found this useful. As ever any comments or questions are welcome.
I will be back to writing non-stats posts again soon!

Using metafoR and ggplot together: part 1

I really like meta-analysis, as some of you might already know.

It has its critics, but used well it can help us achieve one of the major goals of applied ecology: generalisation. This way we can say how we think things generally work and draw conclusions bigger than individual studies.

However, for some reason people have got it into their heads that meta-analysis is hard. It’s not. The hardest thing is the same as always: coming up with a good question/problem.

However, for a first timer it can seem daunting. Luckily there are a few good R packages for doing meta-analysis, my favourite of which is metafor.

The aim of these posts is to take some of the mystery out of how to do this and was prompted by a comment from Jarret Byrnes on a previous blog post.

In the first post I will deal with effect sizes, basic meta-analysis, how to draw a few plots and some analyses to explore bias. In the second I will go through meta-regression/sub-group analysis and a few more plots. It shouldn’t be too hard to understand.

For this I will use a dataset based on some work I am doing at the moment looking at the effects of logging on forest species richness. However, I don’t really feel like sharing unpublished data on the web so I’ve made some up based on my hypotheses prior to doing the work.

First of all we need to install the right packages and open them
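Something like this (ggplot2 comes in later for the plotting):

```r
# run the install line once if you don't already have the packages
# install.packages(c("metafor", "ggplot2"))
library(metafor)
library(ggplot2)
```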



Next we need the data, download it from here and get R to load it up:
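The loading step is just `read.csv()`. Since I can't reproduce the download here, this sketch fakes a file with the structure described below (the column names are my guesses: a mean, SD and sample size for each of the unlogged and logged groups):

```r
# stand-in for the downloaded file: write a fake CSV, then read it back in
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(
  Study = paste("Study", 1:3),
  X1 = c(25, 30, 22), SD1 = c(4, 5, 3), N1 = c(10, 12, 8),  # unlogged (G1)
  X2 = c(18, 24, 20), SD2 = c(5, 4, 4), N2 = c(10, 12, 8)   # logged (G2)
), tmp, row.names = FALSE)

logging <- read.csv(tmp)
head(logging)
```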


This file contains data on species richness of unlogged (G1) and logged (G2) forests. It’s not perfect, as all of the data is made up, but it should be useful for an exercise.

First we need to calculate the effect size. This is essentially a measure of the difference between a treatment and control group, in our case logged (treatment) and unlogged (control) forests. We can do this in different ways; the most commonly used in ecology are the standardised mean difference (SMD) and the log response ratio (lnRR).

The SMD is calculated like this:

\frac{\bar{X_{1}}-\bar{X_{2}}} {SD_{pooled}}

where \bar{X_{1}} and \bar{X_{2}} are the two group means and SD_{pooled} is the pooled standard deviation (I won’t explain this here, but there are good explanations elsewhere). Essentially this gives an effect size based on the difference between groups, with more imprecise studies having a smaller effect size.

The lnRR is calculated as:

\ln\left(\frac{\bar{X_{1}}}{\bar{X_{2}}}\right)
which gives the log of the proportional difference between the groups. I quite like this as it is easier to interpret than the SMD: the values can be back-transformed to percentage changes.

In metafor you calculate SMD by doing this:
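Roughly like this: `escalc()` does the work. The column names and toy numbers below are mine, and I put the logged-forest (G2) mean first, so that a negative effect size means fewer species after logging:

```r
library(metafor)

# toy data in the structure described above (column names are my guesses)
logging <- data.frame(
  X1 = c(25, 30, 22), SD1 = c(4, 5, 3), N1 = c(10, 12, 8),  # unlogged (G1)
  X2 = c(18, 24, 20), SD2 = c(5, 4, 4), N2 = c(10, 12, 8)   # logged (G2)
)

# standardised mean difference (Hedges' g); logged forest goes first
SMD <- escalc(measure = "SMD", m1i = X2, sd1i = SD2, n1i = N2,
              m2i = X1, sd2i = SD1, n2i = N1, data = logging)
SMD$yi  # effect sizes; escalc also adds their sampling variances as vi
```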


and lnRR like this:
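The same call with a different `measure` argument: "ROM" (ratio of means) is metafor's name for the lnRR. Column names and toy numbers are again my own, with the logged group first:

```r
library(metafor)

# toy data in the structure described above (column names are my guesses)
logging <- data.frame(
  X1 = c(25, 30, 22), SD1 = c(4, 5, 3), N1 = c(10, 12, 8),  # unlogged (G1)
  X2 = c(18, 24, 20), SD2 = c(5, 4, 4), N2 = c(10, 12, 8)   # logged (G2)
)

# log response ratio: yi = log(logged mean / unlogged mean)
ROM <- escalc(measure = "ROM", m1i = X2, sd1i = SD2, n1i = N2,
              m2i = X1, sd2i = SD1, n2i = N1, data = logging)
ROM$yi
```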


From now on in this post I will use the lnRR effect size.

Next we carry out a meta-analysis, using the variability and effect sizes of each set of comparisons to calculate an overall mean effect size, giving most weight to the most precise studies.

In metafor you do that by doing this:
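A sketch, rebuilding toy effect sizes first so the call runs on its own (the object name `ROM.ma` follows the output below; the data are my stand-ins):

```r
library(metafor)

# toy data and lnRR effect sizes, as in the escalc step above
logging <- data.frame(
  X1 = c(25, 30, 22), SD1 = c(4, 5, 3), N1 = c(10, 12, 8),  # unlogged (G1)
  X2 = c(18, 24, 20), SD2 = c(5, 4, 4), N2 = c(10, 12, 8)   # logged (G2)
)
ROM <- escalc(measure = "ROM", m1i = X2, sd1i = SD2, n1i = N2,
              m2i = X1, sd2i = SD1, n2i = N1, data = logging)

# random-effects model: weights each study by its precision, estimates the mean effect
ROM.ma <- rma(yi, vi, method = "REML", data = ROM)
summary(ROM.ma)
```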


This is the code for a random-effects meta-analysis, which basically assumes there are real differences between study effect sizes and then calculates a mean effect size. This seems sensible in ecology, where everything is variable, even in field sites which are quite close to each other.

The output it spits out can be a bit daunting. It looks like this:

Random-Effects Model (k = 30; tau^2 estimator: REML)

  logLik  deviance       AIC       BIC
-27.8498   55.6997   59.6997   62.4343

tau^2 (estimated amount of total heterogeneity): 0.3859 (SE = 0.1044)
tau (square root of estimated tau^2 value):      0.6212
I^2 (total heterogeneity / total variability):   98.79%
H^2 (total variability / sampling variability):  82.36

Test for Heterogeneity:
Q(df = 29) = 2283.8442, p-val < .0001

Model Results:

estimate       se     zval     pval    ci.lb    ci.ub
 -0.4071   0.1151  -3.5363   0.0004  -0.6327  -0.1815      ***

Essentially what it’s saying is that logged forests tend to have lower species richness than unlogged forests (the model results bit), and that there is quite a lot of heterogeneity between studies (the test for heterogeneity bit).

We can calculate the percentage difference between groups by taking the estimate from the model results, along with its se, and doing this:
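With the estimate and se from the output above, the back-transformation is plain base R (exp the log ratio, subtract one, multiply by 100):

```r
estimate <- -0.4071  # model estimate from the output above
se       <-  0.1151  # and its standard error

# back-transform the log response ratio to a percentage change
(exp(estimate) - 1) * 100

# the same for the approximate 95% confidence limits
(exp(estimate + c(-1.96, 1.96) * se) - 1) * 100
```

Note that the back-transformed confidence limits are not quite symmetrical around the estimate.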


to get things in percentages. So now we can say that logged forests have 33% fewer species than unlogged forests, +/- a confidence interval of around 25%.

We can summarise this result using a forest plot in metafor, but this is a bit ugly.
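The call is just `forest()` on the fitted model object; here with a quick stand-in model (simulated data, invented names) so the snippet runs on its own:

```r
library(metafor)

set.seed(1)
# quick stand-in model (simulated effect sizes and variances)
dat <- data.frame(yi = rnorm(30, -0.4, 0.6), vi = runif(30, 0.005, 0.05))
ROM.ma <- rma(yi, vi, method = "REML", data = dat)

forest(ROM.ma)  # metafor's built-in forest plot
```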



What did I tell you? Pretty ugly right?

Much better to use ggplot2 to do this properly. But this is a bit of a pain in the arse:

forrest_data$Study2<-factor(forrest_data$Study, levels=rev(levels(forrest_data$Study)) )
# plot1 was not shown above -- a reconstruction (column names are guesses):
plot1<-ggplot(forrest_data, aes(x=Study2, y=estimate, ymin=lower, ymax=upper, colour=type, size=type))+geom_pointrange()
plot2<-plot1+coord_flip()+geom_hline(aes(x=0), lty=2,size=1)+scale_size_manual(values=c(0.5,1))
plot3<-plot2+xlab("Study")+ylab("log response ratio")+scale_colour_manual(values=c("grey","black"))


There. Much better.

With a little tweak we can even have it displayed as percentage change, though your confidence intervals will not be symmetrical any more.

plot2<-plot1+coord_flip()+geom_hline(aes(x=0), lty=2,size=1)+scale_size_manual(values=c(0.5,1))
plot3<-plot2+xlab("Study")+ylab("Percentage change in richness\n as a result of logging")+scale_colour_manual(values=c("grey","black"))
plot3+theme(legend.position="none")+ scale_y_continuous(breaks=c(-50,0,50,100,150,200))


There that’s a bit more intuitively understandable, isn’t it?

Right, I will sort out the next post on meta-regression and sub-group analysis soon.

In the meantime feel free to make any comments you like.

How bad is logging for tropical biodiversity?

How bad is logging like this for biodiversity?
(Image by flickr user Wakx)

Logging of tropical forests affects an area 10 times greater than the area converted to agriculture each year. Around 400 million hectares of tropical forest have been set aside for permanent logging – an area twice the size of Russia. Or one hundred and ninety two and a half times the size of Wales – if that’s your thing*.

Shocking, right?

But just how bad is this logging?

For starters, it is obviously not as bad as agricultural conversion. When land is cleared for farming all trees are removed. Logging, however, is generally selective – only trees that are valuable for timber are removed, though many others can be damaged in the process. These differences between logging and agricultural conversion change the structure of ecosystems in different ways and thus affect the species present in them differently.

Forest converted for agriculture is largely dominated by generalist species. Logged forest, on the other hand, retains some of the conservation value of undisturbed forests. However, answering just how bad logging actually is for tropical forest biodiversity is tricky.

In the biggest study of its kind, Gibson et al. (2011) found that logging was the least harmful of the human impacts on tropical forest biodiversity they investigated. However, this meta-analysis brought together lots of different measures of biodiversity, including population sizes, species richness, demographics and community structure, and used them to come up with a single metric. Whilst this serves to give an overall understanding of ‘forest health’ following different human disturbances, it tells us little about the general changes in particular features of biodiversity.

Effect of different disturbances on tropical forest biodiversity. Boxes represent median +/- 95% confidence intervals. Taken from Gibson et al 2011

The simplest measure of biodiversity is species richness. On the whole, logged forests seem to have pretty similar richness to neighbouring undisturbed forests for most taxonomic groups. Richness is not a very useful metric, though. It tells us nothing about which species you find in logged forests. On one hand they could all be non-threatened generalist species; on the other they could all be endangered species. By looking at species richness alone we have no idea about these details.

This is key to working out the conservation value of this forest since conservationists usually want to protect the rarest species to stop them from going extinct. So, how good is logged forest for these species? And do the communities resemble those of unlogged forest?

The truth is we’re not sure. Some work has suggested there is little difference in the communities and numbers of endangered species, while other work suggests differently. Whatever the reality, a new piece of work has found that >60% of studies on the effect of logging on community composition are flawed. The paper, in Conservation Biology, looked at the design of studies of logging published between 2000 and 2012 and found that nearly all had designs that couldn’t differentiate the effects of logging from potential differences between the forests even before logging. This apparently was all down to (the dreaded) pseudoreplication.

To have a properly replicated design you need the logged and unlogged sites to be scattered throughout the landscape. However, most study sites were sampled so that all the logged sites fell in one area and all the unlogged sites in another area. This means that simply because samples are close to each other they are more likely to be similar to their respective group. In tropical forests this is a problem because species composition can change over relatively short distances.

Sampling designs of a hypothetical pseudoreplicated study and an idealised well replicated study investigating community change in logged forests

In addition, few studies sampled more than one area of unlogged forest to test how similar unlogged forest communities are to each other. The authors suggest that, for some studies, a possible way around this problem is to model the relationship between the similarity of plots and the distance between them. However, this option is second best: properly replicated studies would give us a much better idea of the effect of logging on tropical forest species.

Given how large an area has been logged, and will be logged in the near future, we need to work out what’s going on in these forests. Many logging companies are open to reducing biodiversity loss so they can qualify for certification such as FSC, which allows timber to be sold at a premium. We need partnerships with these companies, as has been done between the SAFE project and oil palm companies in Malaysia. Only then will we be able to produce experimentally robust designs that allow us to draw proper conclusions about the future of tropical forest species in logged forests.

*If any US citizens want this relative to Rhode Island, I calculated it: it’s 1273.8 Rhode Islands.

Inexpert opinion

This post was inspired by an amazing workshop given by Mark Burgman at the recent Student Conference on Conservation Science in Cambridge. I have done my best to get across what I learnt from it here, but it is not the final word on this issue.

Some experts. Yesterday.

It turns out experts aren’t necessarily all that good at estimation. They are often wrong and overconfident in their ability to get stuff right. This matters. A lot.

It matters because experts, particularly scientists, are often asked to predict something based on their knowledge of a subject. These predictions can be used to inform policy or other responses. The consequences of bad predictions can be dramatic.

For example, seismologists in L’Aquila, Italy, were asked by the media whether earthquakes in the area posed a risk to human life. They famously told reporters there was ‘no danger.’ They were wrong.

Not all cases are so dramatic, but apparently experts make these mistakes all the time. This has profound implications for conservation.

Expert opinion is used all the time in ecology and conservation, where empirical data are hard or impossible to collect. For example, the well-known IUCN Red List draws on large pools of expert knowledge to determine range and population sizes for species. If these estimates are very inaccurate then we have a problem.

Fortunately, there may be a solution.

This solution was first noticed in 1906 at a country fair, where people were taking part in a contest to guess the weight of a prize ox. Of the 800 or so people who took part, nobody guessed the correct weight. However, the average of all the guesses was closer to the true weight than most individual guesses, including most of the cattle experts’. As a group, these non-experts outperformed the experts.

Apparently this phenomenon is now widely recognised.
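A crude simulation of why this works (in Python; the numbers are invented, not the original fair data): individual errors are large, but they partly cancel in the average, so the crowd mean beats almost every individual.

```python
import random

random.seed(1)

true_weight = 1200  # the ox's true weight (invented for this sketch)

# 800 fair-goers guess, each with a lot of individual error
guesses = [random.gauss(true_weight, 75) for _ in range(800)]

crowd_mean = sum(guesses) / len(guesses)
crowd_error = abs(crowd_mean - true_weight)

# count how many individuals guessed better than the crowd's average
better = sum(abs(g - true_weight) < crowd_error for g in guesses)

print(round(crowd_error, 1))  # small: individual errors partly cancel
print(better)                 # few individuals beat the average
```

This only works when errors are reasonably independent, which is part of why structured elicitation methods try to collect estimates before group discussion.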

Building on this, a technique has been developed called the Delphi method. It aims to improve people’s estimates by getting them to make an estimate, discuss it with the other people in their assigned group, and then make a second estimate. You then take the mean estimate of the group.

Mark Burgman and colleagues have come up with a modified version of the technique. People estimate a quantity, give the highest and lowest values they consider reasonable, and state their confidence (50-100%) that their limits contain the true value. Each group then discusses its estimates, members revise theirs, and the revised estimates are used to derive a group mean. This can be repeated many times, and estimates seem to improve with more iterations.

I think this is a great idea, but you can take it even further. You can pose a series of questions, some of which you already know the answer to. Respondents’ answers to these calibration questions let you gauge how expert your experts actually are. You can then weight people’s estimates by the confidence you have in them, as in the example below.

Estimates of the time-to-failure of an earth dam, once the core starts to leak. Taken from Aspinall 2010.
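A toy version of that calibration idea (in Python; the experts, scores and estimates are all invented): score each expert by how often their calibration intervals contained the known answer, then weight their estimates on the real question by that score.

```python
# Each (invented) expert answered 10 calibration questions with known
# answers; "hits" counts how often their interval contained the truth.
experts = {
    "A": {"hits": 9, "total": 10, "estimate": 120.0},
    "B": {"hits": 5, "total": 10, "estimate": 300.0},
    "C": {"hits": 8, "total": 10, "estimate": 150.0},
}

def calibration_weighted(experts):
    # weight = proportion of calibration intervals containing the truth
    weights = {name: e["hits"] / e["total"] for name, e in experts.items()}
    total = sum(weights.values())
    return sum(weights[n] * experts[n]["estimate"] for n in experts) / total

unweighted = sum(e["estimate"] for e in experts.values()) / len(experts)

print(round(unweighted, 1))
print(round(calibration_weighted(experts), 1))  # pulled toward the well-calibrated experts
```

Here the poorly calibrated expert B still contributes, but their outlying estimate counts for less.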

This is an idea pretty similar to meta-analysis. We give more weight to the estimates we are more confident about.
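In a fixed-effect meta-analysis that weighting is made explicit: each estimate is weighted by the inverse of its variance. A minimal sketch (Python, with invented numbers):

```python
# (mean, variance) pairs: smaller variance = more confidence
estimates = [(10.0, 1.0), (14.0, 4.0), (11.0, 0.5)]

# inverse-variance weights
weights = [1.0 / var for _, var in estimates]
pooled = sum(w * m for (m, _), w in zip(estimates, weights)) / sum(weights)
pooled_var = 1.0 / sum(weights)

# the pooled mean sits closest to the most precise estimate,
# and the pooled variance is smaller than any single estimate's
print(round(pooled, 2), round(pooled_var, 2))
```

The same logic applies whether the "estimates" come from published studies or from calibrated experts.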

These approaches have been around for a while but appear to have been used only rarely in ecology and conservation. Given how often expert opinion is used in conservation, it is important we think hard about how reliable it actually is. It will never be perfect, but it can be better. This work is a step in the right direction.

What traits drive response of birds to tropical land-use change?

Could fruit eating species like the black-mandibled Toucan be disproportionately affected by land-use change? (Photo credit to Ettore Bacocchi on flickr)

Everyone pretty much knows about the crisis of biodiversity loss facing the tropics.

In case you missed it: tropical forests are being rapidly cleared as human populations, and their consumption, increase. All this has led to large losses of biodiversity in the tropics.

So far, so boring.

However, up until recently we didn’t have much of an idea how the characteristics of species in the tropics influenced their response to land-use change.

‘Why would we want to know that?’ I hear you ask. Well, if you’ve seen my blog before you will know that traits are a good way of linking biodiversity change to changes in ecosystem function and services. This is the first step in working out the consequences of the massive changes in biodiversity we have seen over the last century. Simply put, we need to know this stuff.

Given what I think, it was great to find out at the recent BES 2012 annual meeting in Birmingham about a paper looking at how bird species with different traits respond to land-use change in the tropics.

Tim Newbold, a postdoc at the World Conservation Monitoring Centre in Cambridge, and colleagues compiled an impressive dataset of >4500 records of >1300 bird species from 23 studies of land-use change in the tropics. They then used data on habitat preferences, migratory status, diet, generation length and body size to determine how differences in these traits related to birds’ responses to land-use change.

They found that long-lived, non-migratory, primarily frugivorous or insectivorous forest specialists were likely to be less abundant and less likely to occur in intensively used habitats.

Probabilities of presence and abundance relative to primary forest based on dietary preferences of tropical bird species

Of these characteristics, dietary preference is perhaps the easiest to link to changes in ecosystem function and services.

The loss of insect-eating species may weaken the control of pest species, with potentially negative consequences for tropical agriculture. However, this depends on pest abundances not declining in line with the birds’: if pest species also become less abundant, forest loss may lead to little change in crop damage.

The reduction in fruit-eating bird species may have consequences for forest regeneration and the maintenance of plant diversity. Many secondary forests that are isolated from primary forest have been shown to lack large-seeded tree species. Any reduction in the abundance of fruit-eating birds suggests another barrier preventing the recovery of plant communities in secondary forests.

I really liked this paper. It shows the value of large datasets for making generalisations, and the results are potentially important for investigating change in ecosystem function and services in tropical forest ecosystems. The good news is it looks like there is a lot more of this type of work on the way, with the PREDICTS project aiming to take a similar approach to many questions related to land-use change. I’m excited to see what they come up with next, provided they don’t scoop me in the process…

R code for Meta-analysis summary plots

Meta-analysis summary plot

I think meta-analysis is great. I am aware that some out there are a bit less complimentary about it, but I like it. Sure, it can be used stupidly and give spurious results, but then so can ANOVAs or GLMs if you don’t know what you’re doing.

I also love R and have recently been converted to using ggplot2, one of the fantastic packages put together by Hadley Wickham. For me ggplot2 is the most aesthetically pleasing way to plot your data in R. It also has the ability to draw error bars easily, which in base graphics can be a bit of a faff. It’s endlessly customisable and can produce things of beautiful simplicity. I was quite proud of the rarefaction curves I produced the other day for example:

Some rarefaction curves we came up with the other day

Anyway, I recently did a meta-analysis and found the best way to plot the results was in ggplot2. It wasn’t easy to get the data into the form commonly used for summary plots, so I thought I’d stick up some sample code here in the hope that it might help people stuck with the same problems. The data used here is all made up, but the code for the graphs should prove useful.

#this is a file to create the standard meta-analysis
#summary plot using ggplot2

#first load in/create some data, this should be in the form of
#summarised means and upper and lower confidence intervals

#these are the mean values
means<-rnorm(5)
#and this is a bit of code to come up with randomly sized CIs
ci<-runif(5,min=0.2,max=1)
#this is the upper 95% confidence interval
upper<-means+ci
#this is the lower 95% confidence interval
lower<-means-ci
#and this is a bunch of different taxonomic groups for my imaginary meta-analysis
treatment<-c("Birds","Mammals","Amphibians","Insects","Plants")
#stick all the data together
summary_data<-data.frame(treatment,means,upper,lower)

#now load ggplot2 (install.packages("ggplot2") first if you don't have it)
library(ggplot2)

#the easiest way for me to make a ggplot figure is to build things up
#a bit at a time. That way if you only need to change one bit you can do it
#without hunting through a heap of code

#this defines the elements to go in the plot, both the x and y and upper and lower CIs
a<-ggplot(summary_data,aes(x=treatment,y=means,ymin=lower,ymax=upper))
#this defines the plot type - a point for each mean with a bar for its CI
b<-a+geom_pointrange()
#this flips the co-ordinates so your x axis becomes your y and vice versa
c<-b+coord_flip()
#this puts in a dotted line at zero, the point of no difference between groups
d<-c+geom_hline(yintercept=0,linetype=2,linewidth=1)
#all of this gets rid of the grey grid, the axis ticks and the legend
#(opts() and theme_blank() from older ggplot2 versions are now theme() and element_blank())
e<-d+theme_bw()+theme(panel.grid.minor=element_blank(),panel.grid.major=element_blank(),axis.ticks=element_blank(),legend.position="none")
#this sets x and y axis titles
f<-e+xlab('Taxonomic group')+ylab('Change following removal of herbivores')
#this sets axis label size
g<-f+theme(axis.text.x=element_text(size=16,colour='black'),axis.text.y=element_text(size=16,colour='black'))
#this sets axis title size and there is your finished summary plot!
g+theme(axis.title.x=element_text(size=20,colour='black'),axis.title.y=element_text(size=20,colour='black'))

Created by Pretty R at inside-R.org

At the end you should have something that looks a bit like this:

Meta-analysis summary plot

though your error bars will obviously be a different width.

Hope someone finds this useful. Drop me a comment if you did.


I have been asked below to supply the code for the rarefaction curves. The data for the curves was produced using a program other than R (shock horror – do they even exist?). I think my friend used estimateS. Anyway once you have done that you can get the data in R and do something like this to it:

#script for drawing rarefaction curves#

#read in data#
leah<-read.csv("C:/Users/Phil/Documents/My Dropbox/rare2.csv")

#load the ggplot2 library (or install it if you don't have it yet!)
library(ggplot2)

#this sorts the order of the separate panels for each plot
leah$sampleNo2<-factor(leah$sampleNo2,levels=unique(leah$sampleNo2))

#this tells ggplot what data to plot and the limits for the 'ribbon' around
#the curve (the column names here will depend on your own data)
a<-ggplot(leah,aes(x=samples,y=richness,ymin=lower,ymax=upper))

#this plots your line along with the ribbon and different 'facets' or panels and sets the y axis limits
b<-a+geom_line(linewidth=0.5)+geom_ribbon(colour=NA,fill="blue",alpha=0.2)+facet_wrap(~sampleNo2,scales="free_x")+ylim(0,80)
#this changes the background of your plots to white and gets rid of gridlines
c<-b+theme_bw()+theme(panel.grid.major=element_blank())
#this changes the angle of your x axis labels
d<-c+theme(axis.text.x=element_text(angle=90,hjust=1))
#this changes the size of your axis titles
e<-d+theme(axis.title.x=element_text(size=20,colour='black'),axis.title.y=element_text(angle=90,size=20,colour='black'))
#this changes your axis title names
f<-e+ylab('Mean richness')+xlab('Sample size')
#this plots the final plot
f
#and this saves it
setwd("C:/Documents and Settings/PMART/My Documents/Dropbox/")
ggsave("rarefaction_curves.png")


The data we used can be found here.