R squared for mixed models – the easy way

Earlier this year I wrote a post on calculating R squared values for mixed models.

It turned out a lot of people had been having the same problem that I had been having – basically we didn’t know how well our mixed models fit our data.

Thankfully a paper in Methods in Ecology and Evolution  by Nakagawa & Schielzeth helped to get close to solving this problem.

The only problem was that the code was a little difficult to implement.

Now Kamil Barton has written a function for this calculation in my current favourite R package – MuMIn. (Edit – as Erika Berenguer pointed out on twitter this doesn’t work for all versions of MuMin only for versions later than 1.19. So if you’ve got an older version use the update.packages() function.)

Basically the function works like this:

  • You give it your mixed model
  • It spits out marginal and conditional R squared values

The marginal R squared values are those associated with your fixed effects, the conditional ones are those of your fixed effects plus the random effects. Usually we will be interested in the marginal effects.

To show how this works we’ll use the example given in the paper using beetle body length.

First get the dataset we will use from here.

#first load in the packages you want

#Next the data

# Fit a model including fixed and all random effects
mF <- lmer(BodyL ~ Sex + Treatment + (1 | Population) + (1 | Container), data = BeetleBody)


This should spit out the following:

R2m R2c
0.3913021 0.7406447

Showing that your marginal R squared is 0.39 and your conditional R squared is 0.74 – not bad.

Of course you should do model simplification or model averaging in an attempt to get a parsimonious model before you do any of this, but I just wanted to flag this up.

Apparently this function is still in it’s ‘experimental stages’ so if you manage to break it let the people who put MuMIn together know.

If you have any comments, as ever put them below.

I’ll get back to writing a non-stats blog post soon.

Using metafoR and ggplot together:part 2

In the last post I introduced you to the basics of meta-analysis using metafor together with ggplot2. If you haven’t seen that post, I suggest you go back and read it since this post will be hard to follow otherwise.

In this post I will concentrate on trying to explain differences between different sites/studies.

Going back to our previous example, we saw that logging tended to have a negative effect on species richness.

Luckily my completely made up data had information on the intensity of logging (volume of wood removed per hectare) and the method used (conventional or reduced impact).

We can make a model looking at the effects of logging intensity and method on species richness. First we can start with our most complex model


This spits out the following:

Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-15.8430  121.7556   41.6860   48.6920

tau^2 (estimated amount of residual heterogeneity):     0.1576 (SE = 0.0436)
tau (square root of estimated tau^2 value):             0.3970
I^2 (residual heterogeneity / unaccounted variability): 96.52%
H^2 (unaccounted variability / sampling variability):   28.74

Test for Residual Heterogeneity:
QE(df = 26) = 1151.1656, p-val &amp;lt; .0001

Test of Moderators (coefficient(s) 2,3,4):
QM(df = 3) = 38.2110, p-val &amp;lt; .0001

Model Results:

                     estimate      se     zval    pval    ci.lb    ci.ub
intrcpt                0.1430  0.3566   0.4011  0.6883  -0.5558   0.8419
Intensity             -0.0281  0.0099  -2.8549  0.0043  -0.0474  -0.0088  **
MethodRIL              0.1715  0.4210   0.4074  0.6837  -0.6536   0.9966
Intensity:MethodRIL    0.0106  0.0138   0.7638  0.4450  -0.0166   0.0377

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Looking at our residual plots, they seem to conform to our assumptions of constant variance.


Our models suggests that the more wood is extracted from a forest the greater the negative effect on species richness, but that seems to be little evidence about the effect of method or the the interaction between intensity and method.

So, lets get rid of the interaction and see what happens:



Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-16.1323  122.3341   40.2646   45.8694

tau^2 (estimated amount of residual heterogeneity):     0.1606 (SE = 0.0443)
tau (square root of estimated tau^2 value):             0.4008
I^2 (residual heterogeneity / unaccounted variability): 96.75%
H^2 (unaccounted variability / sampling variability):   30.80

Test for Residual Heterogeneity:
QE(df = 27) = 1151.4310, p-val &amp;lt; .0001

Test of Moderators (coefficient(s) 2,3):
QM(df = 2) = 36.9762, p-val &amp;lt; .0001

Model Results:

           estimate      se     zval    pval    ci.lb    ci.ub
intrcpt     -0.0416  0.2648  -0.1570  0.8753  -0.5606   0.4774
Intensity   -0.0228  0.0070  -3.2617  0.0011  -0.0364  -0.0091  **
MethodRIL    0.4630  0.1803   2.5681  0.0102   0.1096   0.8163   *

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now both intensity and method are statistically significant, but method less so. Let’s take it out:



Mixed-Effects Model (k = 30; tau^2 estimator: ML)

  logLik  deviance       AIC       BIC
-19.1526  128.3747   44.3052   48.5088

tau^2 (estimated amount of residual heterogeneity):     0.1962 (SE = 0.0536)
tau (square root of estimated tau^2 value):             0.4429
I^2 (residual heterogeneity / unaccounted variability): 97.45%
H^2 (unaccounted variability / sampling variability):   39.27

Test for Residual Heterogeneity:
QE(df = 28) = 1258.6907, p-val &amp;lt; .0001

Test of Moderators (coefficient(s) 2):
QM(df = 1) = 25.2163, p-val &amp;lt; .0001

Model Results:

           estimate      se     zval    pval    ci.lb    ci.ub
intrcpt      0.4678  0.1927   2.4276  0.0152   0.0901   0.8455    *
Intensity   -0.0324  0.0065  -5.0216  &amp;lt;.0001  -0.0451  -0.0198  ***

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Unsurprisingly, intensity is now more significant. However, we have taken out method which may have some value in explaining differences. The tool I use most for model selection is AIC. This basically aims to get the best fitting model with the least explanatory variables. This is why I did the last step of removing method. It could be that the effect of method is significant but adds relatively little explanatory power to our model.

Let’s test this by calculating their AICs.



[1] 41.68604
> AIC(Model2)
[1] 40.26459
> AIC(Model3)
[1] 44.30516

We choose the model with the lowest AIC, in this case Model2.

To calculate the fit of our model, we can compare it to our intercept only model (the original meta-analysis from part 1), but we have to set method=”ML” since REML is a poor estimator of deviance.


Using the two Tau squared statistics we can calculate the proportion of deviance not explained by the intercept only model using the sum:
1-\frac{tau^{2}_{Model2}} {tau^{2}_{ROM.ma2}}

in.plot3<-in.plot2+ylab("log response ratio")+xlab("Logging intensity (m-3 ha-1)")

Very pretty.

Circle size is relative to weight for each study, since we weight the most precise studies most heavily. Blue points and line refer to reduced impact logging sites, while red refers to conventionally logged sites.

We can also transform it all to proportions for more intuitive interpretations:

in.plot3<-in.plot2+ylab("Proportional change\nin species richness")+xlab("Logging intensity (m-3 ha-1)")

The graphs show that, for my made up data, reduced impact logging has a less negative effect than conventional logging on species richness when carried out at similar intensities but that increasing intensity leads to reductions in species richness for both conventional and reduced impact methods.

I think that wraps it up for now. I hope you’ve found this useful. As ever any comments or questions are welcome.
I will be back to writing non-stats posts again soon!

Using metafoR and ggplot together:part 1

I really like meta-analysis, as some of you might already  know.

It has it’s critics but used well it can help us achieve one of the major goals of applied ecology: generalisation. This way we can say how we think things generally work and draw conclusions bigger than  individual studies.

However, for some reason people have got it into their head  that meta-analysis is hard. It’s not. The hardest thing is the same as always, coming up with a good question/problem.

However, for a first timer it can seem daunting. Luckily there are a few good R packages for doing meta-analysis, my favourite of which is metafor.

The aim of these posts is to take some of the mystery out of how to do this and was prompted by a comment from Jarret Byrnes on a previous blog post.

In the first post I will deal with effect sizes, basic meta-analysis, how to draw a few plots and some analyses to explore bias. In the second I will go through meta-regression/sub-group analysis and a few more plots. It shouldn’t be too hard to understand.

For this I will use a dataset based on some work I am doing at the moment looking at the effects of logging on forest species richness. However, I don’t really feel like sharing unpublished data on the web so I’ve made some up based on my hypotheses prior to doing the work.

First of all we need to install the right packages and open them



Next we need the data, download it from here and get r to load it up:


This file contains data on species richness of unlogged (G1) and logged (G2) forests, it’s not perfect as all of the data is made up – it should be useful for an exercise though.

First we need to calculate the effect size. This is essentially a measure of the difference between a treatment and control group, in our case logged (treatment) and unlogged (control) forests. We can do this in different ways, the most commonly used in ecology are the standardised mean difference (SMD) and log mean ratios (lnRR).

The SMD is calculated like this:

\frac{\bar{X_{1}}-\bar{X_{2}}} {SD_{pooled}}

where \bar{X_{1}}and \bar{X_{2}} are the different group means, SD_{pooled} is the pooled standard deviation ( I won’t explain this here but there are good explanations elsewhere). Essentially what this does is to give an effect size based on the difference between groups with more imprecise studies having a smaller effect size.

The lnRR is calculated as:


which gives the log of the proportional difference between the groups. I quite like this as it is easier to interpret than the SMD as the units can be back-transformed to percentages.

In metafor you calculate SMD by doing this:


and lnRR like this:


From now on in this post I will use the lnRR effect size.

Next we carry out a meta-analysis using the variability and effects sizes of each set of comparisons to calculate an overall mean effect size, using the idea that we want to give most weight to the most precise studies.

In metafor you do that by doing this:


This is the code for a mixed effects meta-analysis, which basically assumes there are real differences between study effect sizes and then calculates a mean effect size. This seems sensible in ecology where everything is variable, even in field sites which quite close to each other.

The code it spits out can be a bit daunting. it looks like this:

Random-Effects Model (k = 30; tau^2 estimator: REML)

  logLik  deviance       AIC       BIC
-27.8498   55.6997   59.6997   62.4343

tau^2 (estimated amount of total heterogeneity): 0.3859 (SE = 0.1044)
tau (square root of estimated tau^2 value):      0.6212
I^2 (total heterogeneity / total variability):   98.79%
H^2 (total variability / sampling variability):  82.36

Test for Heterogeneity:
Q(df = 29) = 2283.8442, p-val < .0001

Model Results:

estimate       se     zval     pval    ci.lb    ci.ub
 -0.4071   0.1151  -3.5363   0.0004  -0.6327  -0.1815      ***

Essentially what it’s saying is that logged forests tend to have lower species richness than unlogged forests (the model results bit), and that there is quite a lot of heterogeneity between studies (the test for heterogeneity bit).

We can calculate the percentage difference between groups by doing taking the estimate part of the model results along with the se and doing this:


to get things in percentages. So, now we can say that logged forests have 33% less species than unlogged forests  +/- a confidence interval of 25%.

We can summarise this result using a forrest plot in metafor, but this is a bit ugly.



What did I tell you? Pretty ugly right?

Much better to use ggplot2 to do this properly. But this is a bit of a pain in the arse:

forrest_data$Study2<-factor(forrest_data$Study, levels=rev(levels(forrest_data$Study)) )
plot2<-plot1+coord_flip()+geom_hline(aes(x=0), lty=2,size=1)+scale_size_manual(values=c(0.5,1))
plot3<-plot2+xlab("Study")+ylab("log response ratio")+scale_colour_manual(values=c("grey","black"))


There. Much better.

With a little tweak we can even have it displayed as percentage change, though your confidence intervals will not be symmetrical any more.

plot2<-plot1+coord_flip()+geom_hline(aes(x=0), lty=2,size=1)+scale_size_manual(values=c(0.5,1))
plot3<-plot2+xlab("Study")+ylab("Percentage change in richness\n as a result of logging")+scale_colour_manual(values=c("grey","black"))
plot3+theme(legend.position="none")+ scale_y_continuous(breaks=c(-50,0,50,100,150,200))


There that’s a bit more intuitively understandable, isn’t it?

Right, I will sort out the next post on meta-regression and sub-group analysis soon.

In the meantime feel free to make any comments you like.

R squared for mixed models

Goodness of fit

As a disclaimer, this is probably the geekiest post I have written on here yet. It’s about statistics. You have been warned.

Don’t disappear just yet. It does get better.

For a while I have been dissatisfied with mixed models and how they have been used in ecology. On the one hand they are wonderful and they allow you to deal with nasty problems like pseudoreplication. On the other hand I have always been concerned about the lack of any measure of goodness of fit. Many papers now use mixed models without reporting something like an R2 statistic and so we have little idea how good (or bad) these models are at describing the patterns you are looking at. R2 statistics are important (as Jeremy Fox (sorry I misatributed credit here – thanks to Nick Golding on Twitter for the correction) Brian McGill has pointed out at the ever excellent dynamic ecology blog) and if our R2 is small we may well be barking up the wrong tree when we’re looking to explain how our system works. In his blog post Jeremy Brian points out research that suggests ecologists work tends to have an R2 of 0.02-0.05. We should be ashamed of this.

Anyway, papers with mixed models often report AIC values which as a means of model selection are really useful. However, they don’t inherently tell you much about the fit of your model.  This means that for a whole swathe of new papers we have little idea how good the fit of models really is.

This is where a new paper by Shinichi Nakagawa and Holger Schielzeth published in Methods in Ecology and Evolution comes in.

They have come up with R2 equivalents for mixed models. And I love them for it.

These new methods allow us to calculate the R2 values for fixed effects in our models and fixed effects and random effects combined. Being statistical types they gave them catchy names like R2GLMM(m) and R2GLMM(c). Even though the names are clunky, if you are an ecologist (or even non-ecologist) that uses mixed models you must read this paper. It includes good worked examples and importantly an R script which will allow you to calculate the statistic (find this in the supplementary bit which you can access for free, I have included it below as well but please go and read the paper).

I hope that this will lead to papers which use mixed models to calculate their goodness of fit more regularly so we can assess how good people’s models actually are. I for one will certainly be using it.

*Update – I have added a blog post here showing how to calculate R2 for mixed models more easily, here.

R code for Meta-analysis summary plots

Meta-analysis summary plot

I think meta-analysis is great. I am aware that some out there are a bit less complementary about it, but I like it. Sure, it can be used stupidly and give spurious results, but then so can ANOVAs or GLMs if you don’t know what you’re doing.

I also love R and have recently been converted to using ggplot2, one of the fantastic packages put together by Hadley Wickham. For me ggplot2 is the most aesthetically pleasing way to plot your data in R. It also has the ability to draw error bars easily, which in base graphics can be a bit of a faff. It’s endlessly customisable and can produce things of beautiful simplicity. I was quite proud of the rarefaction curves I produced the other day for example:

Some rarefaction curves we came up with the other day
Some rarefaction curves we came up with the other day

Anyway, I recently did a meta-analysis and found the best way to plot the results was in ggplot2. It wasn’t really easy to get into the form commonly used for summary plots so I thought I’d stick up some sample code here in the hope that it might help some people stuck with the same problems. The data I used here is all made up but the code for the graphs should prove useful.

#this is a file to create the standard meta-analysis 
#summary plot using ggplot2

#first load in/create some data, this should be in the form of 
#summarised means and upper and lower confidence intervals

#these are the mean values
#and this is a bit of code to come up with randomly sized CIs
#this is the upper 95% confidence interval
#this is the lower 95% confidence interval
# and this is a buch of different taxonomic groups for my imaginary meta-analysis
treatment<-c("Birds","Mammals","Amphibians", "Insects", "Plants") 
#stick all the data together

#now install and run ggplot from the library

#the easiest way for me to make a ggplot figure is to build things up
#a bit at a time. That way if you only need to change one bit you can do it
#without hunting through a heap of code

#this defines the elements to go in the plot, both the x and y and upper and lower CIs
#this defines the plot type
#this flips the co-ordinates so your x axis becomes your y and vice versa
#this puts in a dotted line at the point of group difference
d<-c+geom_hline(aes(x=0), lty=2,size=1)
#all of this gets rid of the grey grid and legends
e<-d+opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank())+opts(axis.ticks = theme_blank(), axis.text.x = theme_blank())+ theme_bw()+opts(legend.position = "none")
#this sets x and y axis titles
f<-e+ xlab('Taxonomic group') +ylab ('Change following removal of herbivores')
#this sets axis label size
g<-f+opts(axis.text.x = theme_text(size = 16, colour = 'black')) +opts(axis.text.y = theme_text(size = 16, colour = 'black'))
#this sets axis title size and there is your finished summary plot!
g+opts(axis.title.x = theme_text(size = 20, colour = 'black'))+opts(axis.title.y = theme_text(size = 20, colour = 'black'))

Created by Pretty R at inside-R.org

At the end you should have something that looks a bit like this:

Meta-analysis summary plot

though your error bars will obviously be a different width.

Hope someone finds this useful. Drop me a comment if you did.


I have been asked below to supply the code for the rarefaction curves. The data for the curves was produced using a program other than R (shock horror – do they even exist?). I think my friend used estimateS. Anyway once you have done that you can get the data in R and do something like this to it:

#script for drawing rarefraction curves#

#read in data#
leah<-read.csv("C:/Users/Phil/Documents/My Dropbox/rare2.csv")

#load ggplot2 librar (or install it if you don't have it yet!)

#this sorts the orders of the seperate panels for each plot

#this tells ggplot what data to plot and the limits for the 'ribbon' around the curve

#this plots your line along with the ribbon and different 'facets' or panels and sets the y axis limits
b<-a+geom_line(shape=16,size=0.5)+geom_ribbon(colour=NA,fill="blue",alpha=0.2)+facet_wrap(~sampleNo2,scales="free_x")+ylim(0,80)+opts(panel.grid.major = theme_line(colour = "white"))
#this changes the colour of your plots to white and gets rid of gridlines
c<-b+ theme_bw()+opts(panel.grid.major = theme_line(colour =NA))
#this changes the angle of your x axis labels
d<-c+opts(axis.text.x=theme_text(angle=90, hjust=1))
#this changes the size of your x axis labels
e<-d+opts(axis.title.x = theme_text(size = 20, colour = 'black'))+opts(axis.title.y = theme_text(angle=90,size = 20, colour = 'black'))
#this changes you x axis title names
f<-e+ylab ('Mean richness')+xlab ('Sample size')
#this plots the final plot

#and this saves it
setwd("C:/Documents and Settings/PMART/My Documents/Dropbox/")

Created by Pretty R at inside-R.org

The data we used can be found here.