Dr. Mark Gardener
Dr. Christine Gardener



Community Ecology

Analytical Methods Using R and Excel

by: Mark Gardener

Available now from Pelagic Publishing.

On this page you will find the support files to accompany the text. There are Excel spreadsheets containing example calculations, and Excel spreadsheets and CSV files containing data examples. There is also an R data file, which contains data and custom R commands (see the notes on the R data file). A ZIP file containing all the files is provided to make things easier to manage.

There are notes about the various files (datasets and spreadsheets) so that you can use them as a handy reference. The custom R commands are already indexed, with notes and examples. Some of the commands are also illustrated on my Writer's Bloc page (with full code and notes). See also the book outline and Table of Contents.

See the News page for details about updated files, new custom commands and so on.



Support Files

There are various sorts of file associated with the book:

  • Data examples – these are presented as Excel or CSV files; either way, they open in a spreadsheet.
  • Excel calculations – these are copies of those presented in the text and illustrate a computation of some kind; most accompany the Have a Go exercises.
  • R data and commands – this file contains the data and custom R commands in one single R file (get it here).
  • All files – this ZIP archive contains all the Excel and CSV files as well as the RData file (get it here).

There are notes about the datasets and spreadsheet files, which I hope you'll find useful.

RData | ZIP archive of all files | Data notes | Spreadsheet notes



RData File

This contains the data and custom R commands as a single .RData file – get it here.

To get the contents into your copy of R you need to use one of the following methods:

Use the load() command

If you use Windows or Mac you can use the load() command like so:

load(file.choose())

A "chooser" window will open and you can navigate your way to the CERE.RData file and select it. The contents will be added to anything you already have in R.

If you use Linux then you will have to replace the file.choose() part with "CERE.RData". You will also need to prepend the full file path to the name (inside the quotes).
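
As a rough illustration of the Linux form, the sketch below builds the full path and loads the file only if it exists. The folder (a Downloads directory under your home folder) is an assumption for the example, not something stated in the book; substitute wherever you saved the file.

```r
# Hypothetical location: replace with the folder where you saved the file
rdata_path <- file.path(path.expand("~"), "Downloads", "CERE.RData")

if (file.exists(rdata_path)) {
  # load() invisibly returns the names of the objects it restored
  loaded <- load(rdata_path)
  print(loaded)
} else {
  message("File not found: ", rdata_path)
}
```

Printing the returned names is a quick way to confirm what was added to the workspace.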

If you want to clear everything from memory before you start then Windows and Mac versions have a menu item for the purpose; you can also use:

rm(list = ls())

This works on all versions of R and removes everything from the workspace – there is no warning.

Drag "n" Drop

In Windows and Mac versions of R you can drag and drop the CERE.RData file onto the R program icon. Double clicking the file sometimes works but also sometimes doesn't! Drag the CERE.RData file onto the desktop shortcut, taskbar or dock icon to open the contents in R. There are two possible outcomes:

  • If R is not running it will open and your workspace will only contain the CERE.RData items.
  • If R is already running the contents of the CERE.RData file will be appended to the workspace. Any items that have the same name as an element of the CERE.RData file will be overwritten without warning.
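
Because existing items are replaced silently, it may be worth checking for name clashes before loading. The following is a minimal sketch, not part of the book's custom commands: the helper name find_clashes is hypothetical, and the demonstration uses a throwaway file rather than CERE.RData.

```r
# Return the names in an .RData file that already exist in 'envir'
find_clashes <- function(path, envir = globalenv()) {
  scratch <- new.env()                      # temporary home, not the workspace
  incoming <- load(path, envir = scratch)   # names of the objects in the file
  intersect(incoming, ls(envir = envir))
}

# Self-contained demonstration with a throwaway file
demo_env <- new.env()
assign("beetles", 1:3, envir = demo_env)
demo_file <- tempfile(fileext = ".RData")
save(list = "beetles", file = demo_file, envir = demo_env)

# A stand-in workspace that already holds an object called 'beetles'
workspace <- new.env()
assign("beetles", "my own object", envir = workspace)

clashes <- find_clashes(demo_file, envir = workspace)
print(clashes)
```

With the real file you would call find_clashes() on the path to CERE.RData and rename or save any clashing items first.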

Managing the file

Once you've got the CERE.RData file opened you will have a workspace populated with data items and custom R commands. You can treat all the elements like regular R items. See the save() command for details of saving items in an RData file. The load() command appends RData elements to your workspace.
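
As a small sketch of save(), the example below writes two made-up objects to an .RData file and then reloads them into a separate environment to confirm the contents. The object names and the temporary file path are placeholders for illustration only.

```r
# Two example objects standing in for items from your workspace
site_names <- c("Wood", "Grass", "Edge")
counts <- c(12, 7, 9)

# Save only the named objects (temporary path used for the demonstration)
out_file <- tempfile(fileext = ".RData")
save(site_names, counts, file = out_file)

# Load into a fresh environment to confirm what the file contains
check <- new.env()
restored <- load(out_file, envir = check)
print(restored)
```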

File Contents

Use ls() to view the contents of the CERE.RData file once it is in your R workspace. Type the name of an element to view it (remember that R is case sensitive). Data items will be viewed in rows and columns – some are quite large so your display may scroll quite a bit. Custom R commands will display the code. Feel free to re-distribute the commands but please leave my "credits" intact.
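
Since the workspace mixes data items and commands, it can help to list them separately. This is a minimal sketch using a stand-in environment with made-up items; with CERE.RData loaded you would point it at your own workspace instead.

```r
# Stand-in workspace; with CERE.RData loaded you would use globalenv() instead
ws <- new.env()
assign("example_data", data.frame(a = 1:2), envir = ws)
assign("example_command", function(x) x + 1, envir = ws)

all_items <- ls(envir = ws)
is_command <- vapply(all_items,
                     function(nm) is.function(get(nm, envir = ws)),
                     logical(1))

commands <- all_items[is_command]    # custom commands (functions)
datasets <- all_items[!is_command]   # everything else
print(commands)
print(datasets)
```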

The custom R commands are mentioned in the text and covered in detail in the custom R commands support page, where you can find a listing similar in layout to the regular R help entries. There are notes on usage and examples, which mostly use data already in the CERE.RData file. Any updates and changes are listed in the News page.

The data files are all referred to and explained in the book text as well as here.

Additional packages

Some commands require additional command packages. These are mentioned in the text and in the custom R commands support page. To get a command package the simplest way is to use the install.packages() command. This works for all versions of R:

install.packages("pkgname")

Make sure you have an Internet connection and then simply replace the "pkgname" with the name of the package you want (you need the quotes).

Some packages are already part of R so do not need to be downloaded and installed; these include boot, cluster, and MASS.

Packages that you will need are: vegan and BiodiversityR.

To "prepare" a package and make the commands within it available for use, you will need the library() command:

library(pkgname)

Simply replace pkgname with the name of the library you want (you do not need quotes).
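
One way to combine the two steps is to install only what is missing. The sketch below is one possible approach rather than the method used in the book; the helper name missing_packages is hypothetical, and installation is attempted only in an interactive session.

```r
# Return the subset of 'pkgs' that is not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("vegan", "BiodiversityR")
to_install <- missing_packages(needed)

# Only attempt installation in an interactive session (needs Internet access)
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

After installation, library(vegan) and library(BiodiversityR) make the commands available as described above.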



ZIP Archive of all files

This file is a standard ZIP file and should open easily in Windows, Mac or Linux. The archive contains all the Excel and CSV files as well as the CERE.RData file.

Once you have the ZIP file you can open it to extract the files:

  • In Windows 7 right–click the file and select Extract (you can then select the destination).
  • In Mac OSX the archive may automatically decompress; if not, double–click it (the files open in a new folder).
  • In Linux the file should open in your archive manager and you can view and extract the files.

The archive contains all the files you need to follow the examples in the book and to carry out the exercises. Most of the data are in CSV form, which will open in your spreadsheet and will also import into R. Some data are Excel (.xls) files and some are in both formats.
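
As an illustration of importing a CSV file into R, the sketch below writes a tiny made-up file and reads it back with read.csv(). The species and site names are invented; the real files have their own layouts, described in the data notes.

```r
# Write a tiny example CSV (a stand-in for one of the book's files)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("Species,SiteA,SiteB",
             "Sp.one,3,0",
             "Sp.two,1,5"), csv_file)

# row.names = 1 uses the first column (species names) as row labels
comm <- read.csv(csv_file, row.names = 1)
print(comm)
print(dim(comm))
```

For a real file you could replace csv_file with file.choose() or the full path to the file.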

The CERE.RData file is for use with R – see here for instructions.

Some of the Excel (.xls) files are not data but are calculation worksheets. These are "constructed" by following the Have a Go exercises – I've provided you with completed versions for convenience and to check your work. Try not to cheat!



Notes for Datasets

A variety of datasets are included in the support material. Data are available in various formats, Excel (.xls), Comma Separated Values (.csv) and in R format (part of the CERE.RData file). Some of the data were sourced from the Internet and published papers. Some were collected by undergraduates on field trips with the Field Studies Council. Some were kindly donated by students working towards an MSc in Biological Recording (formerly run by the University of Birmingham, now run by Manchester Metropolitan).

The following is a simple list of the datasets with notes on the formats the data are available in and what they represent.


Data Notes

Ants and Fire regime:

R data, CSV

These data show the abundance of various ant species at sites with different soil type and fire regimes: Hoffmann, B.D. (2003) Austral Ecol. 28, p.182-195. There are 2 soil types (brown and red) and 5 fire regimes (these differ in interval between burning and so on):

  • E2 = burnt every 3yr with grazing early (May)
  • E3 = burnt, spelled & burnt in 2 successive yr
  • L2 = burnt every 3yr with late grazing (Oct)
  • L3 = burnt, spelled, burnt in 2 successive yr
  • U = unburnt control

The CSV file includes 2 extra columns, row totals for the two sampling years. The data are in "community" format with species as rows and sites as columns.
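
Some analysis commands (those in the vegan package, for example) expect sites as rows and species as columns. Where a dataset is laid out the other way round, t() swaps the orientation. A minimal sketch with made-up counts:

```r
# Made-up counts laid out with species as rows and sites as columns
by_species <- data.frame(Site1 = c(4, 0), Site2 = c(2, 6),
                         row.names = c("Sp.one", "Sp.two"))

# t() swaps rows and columns; the result is a matrix with sites as rows
by_site <- t(by_species)
print(by_site)
```

Any extra non-numeric columns (such as totals or labels in the CSV versions) would need to be dropped before transposing.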

Brazilian Forest trees:

R data, CSV, XLS

These data show the relative density of forest trees in a region of Brazil: adapted from Pereira, I.M. (2003) Biotropica 35, 154-165. There are 4 types of forest fragment:

  • LD = little disturbed
  • GF = grazed fragment
  • OR = old regrowth (30yrs of growth after agriculture)
  • NR = new regrowth (20yrs of growth after agriculture)

The CSV and XLS files also contain an additional column showing the plant habit (DT = dominant trunk, ST = several trunks). The data are in "community" format with rows as species and columns as sites.

Bryophytes and churchyards:

R data, CSV

These data show the abundance (scale 0–9) of various mosses at churchyard sites. The associated dataset shows various environmental variables at these same sites (thanks to: Mark Latham).

The data are in "community" format with rows as sites and columns as species (or environmental variable).

Butterflies and year:

R data, CSV

These data show the abundance of various butterfly species at a site in Scotland for several years (thanks to: Jessie MacKay).

The data are in biological recording format with columns for species name, year and abundance.
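
Data in biological recording format can be rearranged into "community" format by cross-tabulation. The sketch below uses xtabs() on made-up records; the column names are assumptions and may not match those in the actual file.

```r
# Made-up records: one row per species per year
rec <- data.frame(
  species = c("Sp.one", "Sp.two", "Sp.one", "Sp.two"),
  year = c(2001, 2001, 2002, 2002),
  abundance = c(5, 2, 7, 0)
)

# Sum abundance for each year (rows) and species (columns)
comm <- xtabs(abundance ~ year + species, data = rec)
print(comm)
```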

Dartmoor plants:

R data, CSV

These data show the presence and absence of various plant species at a valley bog on Dartmoor. There are 100 quadrats (thanks to: Field Studies Council).

The data are in "community" format with the species as the rows and the quadrats as the columns. The CSV version includes extra columns for common names and abbreviations.

Freshwater invertebrates:

R data, CSV

These data show the abundance of freshwater invertebrates at 18 sites (thanks to: Hing Kin Lee). The sites are split into 3 habitat types (River, Stream, Ditch) with 6 replicates in each. The associated dataset shows various environmental variables at these 18 sites.

The data are in "community" format with rows as samples and columns as species (abbreviated names).

Ground beetles:

R data, CSV

These data show abundances of ground beetles at three habitat types in the UK (thanks to: Robin Cure). There are 6 replicates for each habitat (Wood, Grass, Edge). The associated dataset shows vegetation height and the habitat type for each sample.

The data are in "community" format with rows as samples and columns as species (abbreviated names). However, there are several versions of the data and a couple are in biological recording format – these are designed to help you practice using Lookup and Pivot Tables.

Plants on Golf courses:

R data, CSV

These data show the abundance of various plant species at golf course habitats in the UK (thanks to: John Handley). The associated dataset shows the habitat type, as a name and abbreviation, for each of the 71 quadrats.

The data are in "community" format with rows as samples and columns as species (abbreviated names).

Hornbill fruit diet:

R data, CSV, XLS

These data show the species of fruit found in the diet of three hornbill species in India, adapted from: Datta, A. & Rawat, G.S. (2003) Biotropica 35, 208-218. The rows show the various fruit species (presence or absence) and the columns the three bird species:

  • GH = Great hornbill
  • WH = Wreathed hornbill
  • OPH = Oriental pied hornbill

The CSV and XLS versions have additional columns showing the plant family and the fruit colour.

Hydrosere plants:

R data, CSV

These data show the abundance of various plant species in a moorland hydrosere (thanks to: Field Studies Council). There are two versions, one shows abundance for 50 quadrats. These were collected at various distances from a fixed point (open water). There are 10 transect stations, each with 5 replicates. This gives rise to the second version of the dataset, where the replicates are amalgamated to 10 samples.

The datasets are in "community" format with species as rows and samples as columns. There is also an associated dataset that shows pH values at each of the 10 transect stations.
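
To show how replicates might be amalgamated by transect station, here is a small sketch with made-up data. It assumes the replicates are combined by summing; the source does not say how the amalgamated version was built, so treat the method as an assumption.

```r
# Made-up data: 2 species (rows) by 4 quadrats (columns),
# i.e. 2 transect stations with 2 replicates each
quads <- matrix(c(1, 0,  2, 1,  0, 3,  1, 4), nrow = 2,
                dimnames = list(c("Sp.one", "Sp.two"),
                                c("T1a", "T1b", "T2a", "T2b")))
station <- c("T1", "T1", "T2", "T2")   # station label for each column

# rowsum() adds rows by group, so transpose, sum, and transpose back
by_station <- t(rowsum(t(quads), group = station))
print(by_station)
```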

Mangrove fungi:

CSV

These data show the abundance of several species of polypore fungi in mangrove swamps in the Caribbean, adapted from: Gilbert, G.S. & Sousa, W.P. (2002) Biotropica 34, 396-404. There are three types of mangrove: Black, Red and White.

The data are in "community" format with species in rows and samples in columns.

Moorland plant presence/absence:

XLS

This file corresponds to a Have a Go exercise. The data are presence and absence of moorland plant species in 20 quadrats (thanks to: Field Studies Council). The exercise is to calculate the species co-occurrence and then the expected values under the null hypothesis that there are no associations between species.
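
The same calculation can be sketched in R. The example below uses made-up presence-absence data and assumes the expected joint occurrence of two species under independence is the product of their occurrences divided by the number of quadrats; the spreadsheet in the exercise may be laid out differently.

```r
# Made-up presence (1) / absence (0) data: 5 quadrats (rows) by 3 species
pa <- matrix(c(1, 1, 0, 1, 0,
               1, 0, 0, 1, 1,
               0, 1, 1, 0, 0), ncol = 3,
             dimnames = list(paste0("Q", 1:5), c("A", "B", "C")))

n_quadrats <- nrow(pa)
occurrences <- colSums(pa)      # number of quadrats occupied by each species

# Observed co-occurrence: quadrats shared by each pair of species
observed <- crossprod(pa)

# Expected co-occurrence if species are distributed independently
expected <- outer(occurrences, occurrences) / n_quadrats

print(observed)
print(expected)
```

Only the off-diagonal cells are meaningful for comparing pairs; the diagonal of observed simply repeats each species' own occurrence count.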

Moss on trees:

R data, CSV

These data show the importance values (a measure of abundance) for various mosses on trees in a forest in North Carolina: Palmer, M.W. (1986) The Bryologist 89, 59–65.

The data are in "community" format with species in rows and sites as columns.

Neotropical butterflies:

R data

These data show the abundance of various butterfly species in two habitats, canopy and understorey: Jost, L. (2006) Oikos 113, 363–375 (adapted from DeVries, P. & Walla, T. (1996) Biol. Linn. Soc. 74, 1–15).

The data are in "community" format with samples as rows and species as columns.

Plant Quadrat Data:

R data, XLS, CSV

These data show the abundance of some upland plant species at two sites (thanks to: Field Studies Council). Each site has 5 replicates.

The basic data are in biological recording format. There are also versions in "community" format (presence-absence and abundance).

Plant Species data (NVC):

R data, XLS, CSV

These data show the abundance of various plant species at ten sites (thanks to: Field Studies Council). There are several versions of the data. One version has no abundance data at all and is used in a Have a Go exercise for error checking.

Mostly the data are in biological recording format but one version (as R data) is in "community" format and shows presence-absence, with species as columns and sites as rows.

Ridge Psammosere succession:

R data, CSV

These data show the abundance of various plant species in a dune succession (thanks to: Field Studies Council). Data were collected along transects from the beach, working inland, with 10 transect stations. Point quadrats were used so the data are the number of "hits", a measure of frequency.

There are two versions of the data: the "full" version contains 80 quadrats and the "condensed" version amalgamates replicates at the transect stations to make 10 samples. These files are in "community" format with species as rows. The CSV files have additional columns for common names and abbreviations as well as the "community type":

  • P = Pioneer
  • H = Hardy generalist
  • MS = Maritime specialist
  • MC = Meadow community
  • SC = Shrub community

There is a separate R data file for the common names, abbreviations and community types. There is also a version of the data in biological recording format for the 80-quadrat samples. The associated dataset shows four environmental variables for the 10-quadrat data (i.e. at each transect station).

 


Notes for Spreadsheets

There are several spreadsheets included with the support materials. These generally reflect the Have a Go exercises, which you are encouraged to try out. Completed versions are supplied for you to check your working. The following is a simple list; all these files are included in the archive. They are listed in approximately the order they appear in the text.


Filenames and notes

Plant species lists with errors.xls

This contains two columns: site names and species names. The species column contains spelling mistakes (and additional spaces). This is intended to give you some practice at error checking and correction. You can use Pivot Tables and Filters to help manage and eliminate the errors.

Ground beetles.xls
Ground beetles and habitat.xls

These files are data but are used in an exercise to practice use of Lookup tables to add grouping variables. The second file is a completed version after the exercise. You can also use these data to practice Pivot Tables.

Diversity Simpson D.xls

This is the completed version of the exercise in building a spreadsheet to calculate Simpson's diversity index. It leads on to...

Diversity Simpsons SEven.xls

This adds to the previous exercise and includes calculations of Inverse Simpson's index (effective species) and Evenness. It leads on to...

Diversity Shannon.xls

This adds to the previous exercise and builds calculations for Shannon entropy (using natural and base-2 logs). It leads on to...

Diversity HEven.xls

This adds to the previous exercise and builds calculations for true diversity (effective species) and two measures of evenness, J-Evenness and E-evenness. It leads on to...

Diversity BergerParker.xls

This adds to the previous exercise and builds calculations for the Berger Parker dominance index (and effective species). It leads on to...

Diversity Renyi.xls

This adds to the previous exercise and builds calculations for Rényi entropy and Hill numbers equivalents. A simple graph is included to visualise the diversity profile. This leads on to...

Diversity calculator.xls

This adds to the previous exercise and completes the spreadsheet calculations for diversity. It is essentially the same as the previous spreadsheet but I added examples of Rényi profiles for different communities and a graph.

Use this spreadsheet as a basic calculator of diversity and to visualise Rényi diversity profiles. The spreadsheet is not protected so you can insert additional rows. This also means you can alter the formulae so be careful!
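
For comparison with the spreadsheet, the indices covered by the diversity files can be sketched in a few lines of R. The formulas below are common textbook definitions and are an assumption here; the spreadsheets may use different variants (for example, another form of Simpson's index or of evenness). The counts are made up and must all be greater than zero.

```r
counts <- c(10, 10, 10, 10)         # made-up community: 4 equally common species
p <- counts / sum(counts)           # proportional abundance
S <- length(p)                      # species richness

simpson_D     <- 1 - sum(p^2)       # Simpson's index (1 - D form)
inv_simpson   <- 1 / sum(p^2)       # inverse Simpson (effective species)
shannon_H     <- -sum(p * log(p))   # Shannon entropy, natural logs
shannon_H2    <- -sum(p * log2(p))  # Shannon entropy, base-2 logs
true_div      <- exp(shannon_H)     # effective species from Shannon entropy
j_evenness    <- shannon_H / log(S) # entropy relative to its maximum
e_evenness    <- true_div / S       # effective species relative to richness
berger_parker <- max(p)             # dominance of the commonest species

# Renyi entropy for a scale value 'a' (not valid at a = 1);
# exp() of the result gives the corresponding Hill number
renyi <- function(p, a) log(sum(p^a)) / (1 - a)

print(c(simpson_D = simpson_D, inv_simpson = inv_simpson,
        shannon_H = shannon_H, true_div = true_div))
print(exp(renyi(p, 2)))
```

Evaluating renyi() over a range of scale values and plotting the results gives a diversity profile similar in spirit to the graph in the spreadsheet.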

Shannon stats test.xls
Simpson stats test.xls

These spreadsheets are presented in exercises in computing the significance of differences in diversity indices between two samples. There is a spreadsheet for Shannon entropy and one for Simpson's index. They are versions of the t-test. Each contains 3 worksheets, one for each of two samples and a final summary.

The final summary includes calculations of confidence intervals and a graphical representation of the two samples.
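
A t-test for the difference in Shannon entropy between two samples can be sketched as follows. This assumes the Hutcheson form of the test (variance of H estimated from the abundances, with approximate degrees of freedom); the source only says the spreadsheets are versions of the t-test, so the exact method used there may differ. The function names and counts are hypothetical.

```r
# Shannon entropy, its estimated variance and the sample total
shannon_summary <- function(counts) {
  counts <- counts[counts > 0]
  n <- sum(counts)
  p <- counts / n
  h <- -sum(p * log(p))
  v <- (sum(p * log(p)^2) - sum(p * log(p))^2) / n +
    (length(p) - 1) / (2 * n^2)
  list(H = h, var = v, N = n)
}

# Hutcheson-style t-test comparing two samples of species counts
hutcheson_t <- function(a, b) {
  sa <- shannon_summary(a)
  sb <- shannon_summary(b)
  t_val <- (sa$H - sb$H) / sqrt(sa$var + sb$var)
  df <- (sa$var + sb$var)^2 / (sa$var^2 / sa$N + sb$var^2 / sb$N)
  p_val <- 2 * pt(-abs(t_val), df)   # two-sided p-value
  c(t = t_val, df = df, p = p_val)
}

# Made-up counts for two samples
result <- hutcheson_t(c(30, 12, 5, 3), c(14, 13, 12, 11))
print(result)
```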

Moorland example.xls

This spreadsheet contains two worksheets. The first (Data only) contains data on the presence or absence of some moorland plant species in 20 quadrats.

The exercise is to calculate the species co-occurrence and then the expected values under the null hypothesis that there are no associations between species.

The second worksheet (Completed version) contains the calculations so that you can check your workings.

Polar ordination data.xls

This spreadsheet accompanies the exercise on Polar Ordination, a multivariate analysis. In the exercise you take raw data and use a Pivot Table to rearrange the data. Then you compute a dissimilarity matrix and axis scores before visualising the results in a graph.

The spreadsheet contains several worksheets so that you can check your working as you work through the exercise.
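
The core of the calculation can also be sketched in R. The example assumes Bray-Curtis dissimilarity and the usual projection of each sample onto the axis between the two most dissimilar samples; the exercise in the book may use a different dissimilarity measure or choice of end points. The data are made up.

```r
# Made-up community data: 4 samples (rows) by 3 species (columns)
comm <- matrix(c(10, 0,  0,
                  5, 5,  0,
                  0, 5,  5,
                  0, 0, 10), nrow = 4, byrow = TRUE,
               dimnames = list(paste0("S", 1:4), c("A", "B", "C")))

# Bray-Curtis dissimilarity between two samples
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)

n <- nrow(comm)
d <- matrix(0, n, n, dimnames = list(rownames(comm), rownames(comm)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray(comm[i, ], comm[j, ])
  }
}

# End points: the first pair of samples with the largest dissimilarity
ends <- which(d == max(d), arr.ind = TRUE)[1, ]
a <- ends[1]
b <- ends[2]
len <- d[a, b]                       # length of the axis

# Axis score of each sample: distance along the line from a to b
scores <- (len^2 + d[, a]^2 - d[, b]^2) / (2 * len)
print(round(d, 2))
print(scores)
```

The first end point scores 0 and the second scores the full axis length, with the other samples placed in between according to their dissimilarity to each end.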

These cover the majority of the spreadsheet exercises. I may include some additional files; if so, I will list them here and provide them as individual downloads.
