Type of investigation | Number of samples | Matched pairs | Type of data | Distribution | Amount of data | Decision tree
Dr Mark Gardener
Associate Lecturer
Ecology and Environmental Science
The Open University

Choosing a stats test

Don't Panic - it's just a matter of following a simple flowchart

Which stats test should I use?

The decision about which stats test to use for your project or investigation can seem daunting as there are quite a few to choose from. However, by following a few simple steps you can pin down the correct one quite easily. It is important to put this decision into the planning stage of your investigation and not leave it until you have collected a pile of data!

It is easy to break down the decision into a series of steps - at each stage you answer a simple question about your investigation. To do this you need to understand a very small amount of terminology. If you are familiar with some of the terms then you can go straight to the decision tree right now. Otherwise you may wish to work through the following paragraphs to familiarize yourself with some of the terms.


Type of investigation:

The first step is to decide what sort of investigation you are dealing with. Investigations split into two main types - differences or similarities.

In a differences investigation you are comparing one factor across two places (or categories); in other words you have two separate samples. For example, you may wish to look at the size of ivy leaves on the north facing side of a wall and compare with the size of the leaves on the south facing side. You may wish to see if there are more mayflies in slow moving pools than in fast parts of a stream. You may want to see if the height of nettles is different in shady areas of a hedgebank when compared to open sunlit areas. Each set of measurements (or counts) you collect is called a sample.

In all cases you have measured or counted one thing and wish to compare this factor across two samples. It is possible to have more than two samples, e.g. you may have looked at how many bluebells there were in different woods, oak, ash and beech. In general it's best to stick with just two things. These sorts of investigation can be best summed up using a bar chart.

In a similarities investigation you are looking to compare things to see if there is a relationship between them - it's pretty much the opposite of the differences type of investigation. We can separate the similarities into two main themes: correlation and association.

In a correlation investigation you are looking to compare two factors to see if there is a relationship. You may also wish to examine how one thing changes across a gradient of something. For example, you may wish to see if the abundance of a plant changes as you move from the bottom of a hill to the top. Here you have both a physical gradient (the hill) but also perhaps an environmental one (soil moisture - it may be damper towards the bottom). You may wish to examine how the dissolved oxygen content of a stream changes as you move away from a point source of pollution. You have two variables here, the oxygen concentration and the distance from source. As another example, you may want to look at how the abundance of mayflies varies with the velocity of the stream. In all these cases you can represent your findings using a scatterplot. In most cases you will be looking to see how strongly correlated your two factors are.
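To make "how strongly correlated" concrete, here is a minimal sketch of Pearson's correlation coefficient written out by hand. The velocity and mayfly numbers are invented for illustration, and the function name pearson_r is just a label chosen here; it is not taken from any particular package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of readings")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical readings: stream velocity (m/s) and mayfly counts at five sites.
velocity = [0.1, 0.2, 0.3, 0.4, 0.5]
mayflies = [2, 4, 5, 9, 10]

r = pearson_r(velocity, mayflies)
print(round(r, 3))
```

A value near +1 or -1 means the points on the scatterplot lie close to a straight line; a value near 0 means there is little or no linear relationship.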

In an association investigation you are looking to see if there is an association between several "things" but they do not neatly break down into measurable variables. For example you are interested to see if marine snails are associated with a particular habitat on a rocky beach. You have two sets of "things" to compare, the type of snail (e.g. periwinkles, topshells, dog whelks) and the habitat (e.g. in the water of rock pools, on bare rock, on seaweed). You have two sets of stuff but you cannot easily represent the results on a scatterplot! Neither "thing" is a variable as such. Both snail-type and habitat are categorical data and they are hard to represent neatly using a graph - you would probably use two bar charts, one where the habitat was the x-axis and one where the snail-type was the x-axis. It's common to display the results in a table (called a contingency table).
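As a rough sketch of how a contingency table is put together, the snippet below tallies field records into a snail-by-habitat table. The records are made up, and contingency_table is a hypothetical helper name used only for this example.

```python
from collections import Counter

# Hypothetical field records: one (snail type, habitat) pair per snail found.
records = [
    ("periwinkle", "rock pool"), ("periwinkle", "seaweed"),
    ("topshell", "bare rock"), ("dog whelk", "bare rock"),
    ("periwinkle", "rock pool"), ("topshell", "seaweed"),
]

def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that were never seen.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}

table = contingency_table(records)
for snail, habitat_counts in table.items():
    print(snail, habitat_counts)
```

Each cell holds a single count, which is why this kind of data cannot be summarized with an average.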


Number of samples:

Usually you will have two samples - you will be comparing two things. For example you might be comparing the size of oak leaves from trees in a plantation to trees out in the open (by themselves), or the number of millipedes in leaf litter from conifer woods to deciduous woods. Sometimes you may wish to look at more than two things. You may want to look at how the type of soil affects plant growth. You might have peat, coir, compost and regular soil - four things! It is perfectly possible to do this but there are pitfalls. Your final analysis will first tell you if there is a difference somewhere among the four (in this case) but not necessarily which ones differ. You will have to do further analyses to pick out where the differences are. More importantly, your sampling effort will be spread out over four samples - you will have fewer data in each sample than if you stuck to two things. It would be a lot better to collect 10 readings from two of the soil types than to get 5 from each of the four.


Matched pairs:

Sometimes the readings/data you collect form what are known as matched pairs. This is where each reading from one sample is matched up with one specific reading from the second sample. An example might be where you looked at the abundance of moss on trees - you determined how much moss was on the north-facing side and the south-facing side of a number of trees. The readings from each tree naturally form matching pairs. Perhaps you looked at the amount of lichen on the two sides of gravestones in a churchyard - the readings from the two sides are naturally closely connected.

You may have looked at the size of ivy leaves on two sides of a wall (north and south facing). In this case the two samples are not matched pairs! There is no reason why any particular measurement on the north side should be tied to a particular measurement on the south. If you had several walls and each pair was at the same position along the wall then you'd be safe to say they were matched. With a single wall you are on dodgy ground (although it's a perfectly good example of a "differences" investigation).

Sometimes you may look at a situation over time - that may be a matched pair too. For example, you might wish to look at butterfly visits to patches of flowers at different times of day (morning and afternoon). If you had several patches of flowers then your "morning" and "afternoon" readings could be considered as matched pairs (same place, different time).

In general it's best to treat all situations as not matched unless you can be absolutely certain that your readings really "match up".
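One way to picture matched pairs is that each reading has an obvious partner, so you can subtract one from the other. The sketch below does this for some invented moss readings. The helper name paired_differences is hypothetical, and it assumes both lists are in the same tree order.

```python
from statistics import mean

# Hypothetical moss cover (%) on the north and south sides of five trees.
# Position i in each list refers to the same tree.
north = [40, 35, 50, 20, 45]
south = [30, 30, 35, 15, 40]

def paired_differences(first, second):
    """Subtract each reading in 'second' from its partner in 'first'."""
    if len(first) != len(second):
        raise ValueError("matched pairs need the same number of readings in each sample")
    return [a - b for a, b in zip(first, second)]

diffs = paired_differences(north, south)
print(diffs)        # one difference per tree
print(mean(diffs))  # average north-minus-south difference
```

Tests for matched pairs work on these per-pair differences. If you cannot say which reading partners which (as with leaves from a single wall), the subtraction has no meaning and the samples should be treated as unmatched.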


Type of data:

There are three main types of data (and you thought they were all just numbers!). They are: interval, ordinal and categorical.

Interval data are what you think of as "real" numbers e.g. weight, height, how many. Most data will be like this. You can take the final list of data and put the numbers into size order (this is called rank order). You can also do "proper" mathematics on the data. The "interval" part you can think of as being that you can tell how far apart the individual data items are from one another. If you measured size in millimetres then the interval is a millimetre, if you measured weight in grams then the interval is a gram. You can tell how many mm or g your items are apart from one another.

Ordinal data are slightly different from interval data in that you cannot do proper maths on the numbers. It's common in ecological work to convert measurements of the abundance of plants into a different scale. For example, instead of working out exactly how much grass you have you estimate the coverage and convert to an abundance scale. The ACFOR scale is a common scale. A is for "abundant" and may equate to >50% coverage. C is for "common" and may equate to 25-49% coverage. F is for "frequent" and may relate to 5-24% coverage. O is for "occasional" and may equate to <5% (but several individuals). R is for "rare" and may equate to <5% (but only 1-2 individuals). The exact relationship can be set at anything convenient but there are a number of ACFOR scales in common use for different organisms (e.g. seaweeds, barnacles, snails, flowering plants). It is easy to say that "A" (abundant) is bigger than "O" (occasional) but you cannot do proper maths on the values because the intervals are not exact.

The bottom line is that you can order the data into size order but you cannot do proper maths on the data because the interval varies between the categories.
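The following is a small sketch of putting ACFOR scores into rank order. The numeric coding (R = 1 up to A = 5) is an assumption made for the example, and tied values share the average of the ranks they would occupy, which is the usual convention for rank-based tests. The function name average_ranks is just a label used here.

```python
# Assumed coding for this example only: R = 1 (rare) up to A = 5 (abundant).
ACFOR_SCORE = {"R": 1, "O": 2, "F": 3, "C": 4, "A": 5}

def average_ranks(values):
    """Rank values from smallest (rank 1) upward; ties share the mean rank."""
    ordered = sorted(values)
    ranks = []
    for value in values:
        first = ordered.index(value) + 1   # first 1-based position of this value
        count = ordered.count(value)       # how many readings are tied with it
        ranks.append(first + (count - 1) / 2)
    return ranks

# Hypothetical ACFOR estimates for one plant species in five quadrats.
quadrats = ["A", "O", "F", "O", "R"]
scores = [ACFOR_SCORE[q] for q in quadrats]
ranks = average_ranks(scores)
print(scores)
print(ranks)
```

The ranks keep the order of the readings but drop any pretence that the gap between "O" and "F" is the same size as the gap between "C" and "A".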

Generally the data you collect will be interval but the way you set out the data may make them categorical. It's much easier to use an example to explain this. Imagine that you want to see if there is an association between bird species and habitat type. You might be interested in sparrows, great tits, starlings and any number of other species. The habitats could be woodland, gardens, grassland and arable fields. You could go out and count birds in those different habitats and end up with a big table of numbers. Surely the numbers are interval data because they are counts (so we know how far apart they are)? The problem comes with how your data are arranged/collected. In order to be interval data you should be able to summarize your sample using an average (you can do this to ordinal data but you have to convert to a "rank" first). What you have is a simple table with birds down the side and habitats across the top. Each cell of your table has a number, so you may have the number of sparrows in gardens, the number of starlings in fields and so on. You cannot summarize the data using averages because each cell only has one number.

What you have is categorical data - the birds are one set of categories and the habitats are the other. It's the arrangement that makes the data categorical. Another example of this sort of arrangement comes when you are looking for the presence or absence of something. You may take a small square (a quadrat) and put it on the ground. There will be a number of plant species in the square - you mark a tick if a species is present, leave a blank if it's not. You do this lots more times. Later on you can determine how many quadrats contained each species. You can also determine how many quadrats contained two particular species at the same time (and how many contained neither). Your categories are now the presence and absence of species one and the presence and absence of species two. Another sort of categorical experiment crops up in genetics. You might expect a certain pattern to occur in a hybrid of two different plants for example. So, if you know the ratio of plant types (e.g. flower colour) expected by the genetics you can carry out a goodness of fit test on the actual ratio of observed plant types in your experiment. In this case the categories would be the plant types (there is only one set of categories here, as opposed to the two sets in the previous examples).
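To show what a goodness of fit test actually compares, here is a minimal sketch that computes the chi-square statistic, the sum of (observed - expected)^2 / expected over the categories. The 3:1 ratio and the flower counts are invented, and chi_square_statistic is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    if len(observed) != len(expected_ratio):
        raise ValueError("need one expected ratio value per observed category")
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    # Scale the ratio so the expected counts add up to the observed total.
    expected = [total * part / ratio_total for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical cross: genetics predicts 3 purple : 1 white.
# Observed: 70 purple and 30 white flowers out of 100 plants.
stat = chi_square_statistic([70, 30], [3, 1])
print(round(stat, 3))
```

With two categories there is one degree of freedom, for which the usual 5% critical value is 3.841; a statistic below that gives no grounds to say the observed ratio departs from the expected one.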


Distribution of data:

Whenever you collect some data you end up with a whole bunch of numbers. The distribution refers to how many times each number crops up - it's shorthand for frequency distribution. Often you don't count up how many times each individual number occurs but rather you create small categories or bins - each bin contains a range of numbers (e.g. 0-3, 4-7, 8-11 and so on). The usual thing is to create at least 9-10 of these bins. If you draw a bar chart of the frequencies (properly called a histogram) you can see how many times each number (or rather each range of numbers) crops up. If you see a pattern where the middle of the graph shows a distinct hump, with the sides tailing off equally, you have probably got what is called a normal distribution. The features of this sort of distribution are well known mathematically and some statistical tests make use of this (such tests are called parametric, and normally distributed data are often loosely called parametric too). Things that tend to be normally distributed are weight and height.

However, you may see that the hump where the greatest number of data items lie is not in the middle but rather nearer to one end. This sample is showing a skewed distribution. The numbers are not distributed symmetrically around the middle point but lumped towards one end. Data like this are referred to here as non-parametric, after the non-parametric tests used to analyse them.
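The binning described above can be sketched in a few lines. The counts below are invented, the bin width of 4 matches the 0-3, 4-7, 8-11 example, and bin_counts is a hypothetical helper name. Comparing the mean with the median is only a rough, informal hint of skew, not a formal test.

```python
from statistics import mean, median

def bin_counts(data, bin_width, start=0):
    """Count the values in each bin; keys are the lower edge of each bin."""
    counts = {}
    for value in data:
        index = int((value - start) // bin_width)
        low = start + index * bin_width
        counts[low] = counts.get(low, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical counts of freshwater shrimps in eleven net sweeps.
shrimps = [0, 1, 1, 2, 3, 3, 4, 5, 6, 9, 14]

histogram = bin_counts(shrimps, bin_width=4)
for low, count in histogram.items():
    # Crude text histogram: one '#' per reading in the bin.
    print(f"{low:>2}-{low + 3:<2} {'#' * count}")

# A mean well above the median hints at a long tail of high values.
print(mean(shrimps) > median(shrimps))
```

With real data you would want more readings and more bins than this toy example before judging the shape of the distribution.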

Of course you will not know in advance if your data are going to be normally distributed or not. However, it is possible to have a good guess. This is where a pilot study is useful as it can help to determine if your data are parametric or not. Some sorts of data are not usually parametric. If you were comparing the numbers of freshwater shrimps in open water to shady sites your data are not likely to be parametric. Counts of animals rarely are - you tend to get sudden "lumps". If you are measuring the coverage of plant species and using the % of the ground occupied by each plant you are also unlikely to get normally distributed data. In this case the % can never be less than zero or much above 100% (you can get >100% with overlapping plants) - this means that the ends of your distribution are "fixed". Ordinal data that have been converted to a rank order are also not normally distributed. You could convert an ACFOR scale to a numerical scale (0-5) with A = 5, C = 4 and so on but the data would not be parametric.

In summary: you should check the frequency distribution of your data. Interval data may be normally distributed (i.e. symmetrical around a central hump) or may be skewed. Ordinal data are always non-parametric.


Amount of data:

In general the more data you collect the better. However, in practice you have limited time to spend collecting data. Each statistical test has a set of requirements; some tests need >30 measurements, for example, so if you had fewer you would have to select a different analysis. It's a good idea to estimate how long it's going to take you to collect some data and then work out how much you can do in the time available. If you do a pilot study you will have a good idea if your target is achievable.


Selecting the right statistical analysis - flowchart/decision tree

Start with the top line - you will have a choice. Make your choice and go to the next line down. As you move down the table you will make more choices until you have settled on the best statistical approach for your investigation/project.

Start here: is your analysis concerned with Differences or Similarities?

Differences: do you have two samples to compare or more than two?
    More than two samples: are your data normally distributed or non-parametric?
        Normally distributed: you need ANOVA, analysis of variance
        Non-parametric: use the Kruskal Wallis test
    Two samples only: are your samples in matched pairs?
        Matched pairs: do you have >25 pairs of data?
            >25 pairs: use the z test for matched samples
            <25 pairs: are your data normally distributed or non-parametric?
                Normally distributed: use the t test for matched pairs
                Non-parametric: use Wilcoxon matched pairs analysis
        Not in matched pairs: do you have >25 data items in each sample?
            >25 data per sample: you need the z test for unmatched samples
            <25 data items per sample: are your data normally distributed or non-parametric?
                Normally distributed: use the t test for unmatched samples
                Non-parametric: use the Mann Whitney U test

Similarities: are your data categorical? Are you looking for an association between categories or do you have two factors (can you draw a scatter graph)?
    Categorical - Association: do you have two sets of categories or only one?
        Two sets of categories: use Chi Square analysis to look for association between the various categories
        One set of categories: use Goodness of fit test to look for a fit between the expected ratio between the categories and the observed ratio
    Two variables to correlate: are both variables normally distributed or non-parametric?
        Both normally distributed: use regression analysis and Pearson's correlation coefficient to determine the strength of the relationship and also to be able to use one variable to predict the other
        Non-parametric: use Spearman Rank correlation to determine the strength of the relationship
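For readers who like to see the flowchart as a set of rules, here is a rough sketch of the same decisions as a small function. The function name choose_test and its parameter names are invented for this example. It also assumes that exactly 25 readings counts as the small-sample branch, because the flowchart only says >25 and <25.

```python
def choose_test(kind, samples=2, matched=False, n=0, normal=False,
                categorical=False, category_sets=2):
    """Return the test the flowchart points to.

    kind          -- "differences" or "similarities"
    samples       -- number of samples being compared (differences only)
    matched       -- True if the two samples form matched pairs
    n             -- number of pairs (matched) or data items per sample
    normal        -- True if the data (or both variables) are normally distributed
    categorical   -- True for an association between categories
    category_sets -- 2 for two sets of categories, 1 for a single set
    """
    if kind == "differences":
        if samples > 2:
            return "ANOVA" if normal else "Kruskal Wallis test"
        if n > 25:  # assumption: exactly 25 falls in the small-sample branch
            return ("z test for matched samples" if matched
                    else "z test for unmatched samples")
        if matched:
            return ("t test for matched pairs" if normal
                    else "Wilcoxon matched pairs analysis")
        return "t test for unmatched samples" if normal else "Mann Whitney U test"
    if kind == "similarities":
        if categorical:
            return ("Chi Square analysis" if category_sets >= 2
                    else "Goodness of fit test")
        return ("Regression analysis and Pearson's correlation coefficient"
                if normal else "Spearman Rank correlation")
    raise ValueError("kind must be 'differences' or 'similarities'")

# Example: moss on the north and south sides of 12 trees, skewed data.
print(choose_test("differences", matched=True, n=12, normal=False))
```

The function is only as good as the answers you feed it, so the earlier sections on matched pairs, data type and distribution still need to be worked through first.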