Bees and Visits to Garden Flowers
These data are from an undergraduate student project (Open University, Environmental Science) carried out in 2011. The data show counts of visits by various bee species to several different garden flowers. The observations were made over several days, at the same time of day (mid-morning).
|  | B.ter | B.hor.pra | B.lap | A.mel | B.pas |
|---|---|---|---|---|---|
| Cir.vul | 10 | 8 | 18 | 12 | 8 |
| Ech.vul | 1 | 3 | 9 | 13 | 27 |
| Koe.pan | 37 | 19 | 1 | 16 | 6 |
| Med.fal | 5 | 6 | 2 | 9 | 32 |
| Rub.fru | 12 | 4 | 4 | 10 | 23 |
In the table, the rows are plant species and the columns are bee species. The names are abbreviations of the scientific binomial names. You should be able to find the full names quite easily.
Download
You can download the dataset as a CSV file using this link: bees-garden-flowers. Alternatively, you might copy the table to the clipboard and paste into a spreadsheet.
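If you would rather read the data in code than in a spreadsheet, the following is a minimal sketch using Python's standard `csv` module. The layout is an assumption, not something confirmed by the download: a header row of bee abbreviations, then one row per plant with the plant abbreviation in the first column. The function name `parse_counts` and the `sample` text are illustrative only, so check them against the actual file.

```python
import csv
import io


def parse_counts(text):
    """Parse CSV text into (bee_names, {plant: [counts]}).

    Assumes the first row holds the bee abbreviations (after an empty or
    label cell) and every later row starts with a plant abbreviation.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    bees = [name.strip() for name in rows[0][1:]]
    counts = {
        row[0].strip(): [int(cell) for cell in row[1:]]
        for row in rows[1:]
    }
    return bees, counts


# First two data rows of the table, typed in the assumed layout.
sample = """,B.ter,B.hor.pra,B.lap,A.mel,B.pas
Cir.vul,10,8,18,12,8
Ech.vul,1,3,9,13,27
"""

bees, counts = parse_counts(sample)
print(bees)
print(counts["Cir.vul"])
```

For a real file you would read its contents first (for example with `open(path, newline="")`) and pass the text to `parse_counts`.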
Usage
You can use these data to practice/illustrate various topics:
- Simple summary statistics.
- Graphical summary (e.g. bar chart).
- Test of association (e.g. chi-squared test).
Keywords
Invertebrate, pollinator, bee, association, chi squared, graphics, bar chart.
Examples
The following examples will give you a few ideas about how you might explore these data.
Summary statistics
The raw data show counts of bee visits to various plant species. Averages are not really appropriate for counts like these; the most useful thing you can do is to compute the row and column totals.
|  | B.ter | B.hor.pra | B.lap | A.mel | B.pas | Sum |
|---|---|---|---|---|---|---|
| Cir.vul | 10 | 8 | 18 | 12 | 8 | 56 |
| Ech.vul | 1 | 3 | 9 | 13 | 27 | 53 |
| Koe.pan | 37 | 19 | 1 | 16 | 6 | 79 |
| Med.fal | 5 | 6 | 2 | 9 | 32 | 54 |
| Rub.fru | 12 | 4 | 4 | 10 | 23 | 53 |
| Sum | 65 | 40 | 34 | 60 | 96 | 295 |
You could also calculate the proportions of visits, by row, by column, or overall.
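As a rough illustration, the totals and the three kinds of proportion can be computed with plain Python. The variable names (`BEES`, `COUNTS`, `row_props` and so on) are just labels chosen for this sketch; the counts are typed in from the table.

```python
BEES = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
COUNTS = {
    "Cir.vul": [10, 8, 18, 12, 8],
    "Ech.vul": [1, 3, 9, 13, 27],
    "Koe.pan": [37, 19, 1, 16, 6],
    "Med.fal": [5, 6, 2, 9, 32],
    "Rub.fru": [12, 4, 4, 10, 23],
}

# Row totals: all bee visits to each plant.
row_totals = {plant: sum(row) for plant, row in COUNTS.items()}

# Column totals: all visits made by each bee species.
col_totals = {
    bee: sum(row[j] for row in COUNTS.values())
    for j, bee in enumerate(BEES)
}

grand_total = sum(row_totals.values())

# Proportions by row: each plant's visits split between the bee species.
row_props = {
    plant: [n / row_totals[plant] for n in row]
    for plant, row in COUNTS.items()
}

# Proportions by column: each bee's visits split between the plants.
col_props = {
    bee: [row[j] / col_totals[bee] for row in COUNTS.values()]
    for j, bee in enumerate(BEES)
}

# Overall proportions: each cell as a share of all visits.
overall_props = {
    plant: [n / grand_total for n in row]
    for plant, row in COUNTS.items()
}

print(row_totals)
print(col_totals)
print(grand_total)
print([round(p, 3) for p in row_props["Koe.pan"]])
```

Row proportions answer "which bees visit this plant?", whereas column proportions answer "which plants does this bee visit?", so it is worth deciding which question you are asking before picking one.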
Chi Squared test of association
These data are suitable for a test of association. The Pearson Chi Squared test is the analytical test of choice. The overall result could be summarized like so: X² = 120.65, df = 16, p < 0.001. This shows that the visits are not random with respect to plant species and that certain bee species have preferences (both positive and negative). The Pearson residual values show you the positive and negative associations.
|  | B.ter | B.hor.pra | B.lap | A.mel | B.pas |
|---|---|---|---|---|---|
| Cir.vul | -0.67 | 0.15 | 4.54 | 0.18 | -2.39 |
| Ech.vul | -3.12 | -1.56 | 1.17 | 0.68 | 2.35 |
| Koe.pan | 4.70 | 2.53 | -2.69 | -0.02 | -3.89 |
| Med.fal | -2.00 | -0.49 | -1.69 | -0.60 | 3.44 |
| Rub.fru | 0.09 | -1.19 | -0.85 | -0.24 | 1.39 |
Positive Pearson residuals indicate a positive association (the bees visit more often than expected), whilst negative values indicate a negative association (the bees visit less often than expected). However, only residuals with an absolute value larger than about 2 are statistically significant (roughly the 5% level).
You can see various interesting patterns in the results. For example, honeybees (A.mel) don’t seem to have any particular preferences, whereas all the bumblebees show at least some degree of choice.
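The calculation behind these results can be sketched in a few lines of plain Python. For each cell the expected count is (row total × column total) / grand total, the Pearson residual is (observed − expected) / √expected, and the test statistic is the sum of the squared residuals. The p-value shortcut used here is an assumption of this sketch that only works when the degrees of freedom are even (as they are for a 5 × 5 table); in practice you would more likely use a statistics package.

```python
import math

BEES = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
COUNTS = {
    "Cir.vul": [10, 8, 18, 12, 8],
    "Ech.vul": [1, 3, 9, 13, 27],
    "Koe.pan": [37, 19, 1, 16, 6],
    "Med.fal": [5, 6, 2, 9, 32],
    "Rub.fru": [12, 4, 4, 10, 23],
}

row_totals = {plant: sum(row) for plant, row in COUNTS.items()}
col_totals = [sum(row[j] for row in COUNTS.values()) for j in range(len(BEES))]
n = sum(row_totals.values())

chi2 = 0.0
residuals = {}
for plant, row in COUNTS.items():
    residuals[plant] = []
    for j, observed in enumerate(row):
        # Expected count if bee species and plant species were independent.
        expected = row_totals[plant] * col_totals[j] / n
        r = (observed - expected) / math.sqrt(expected)
        residuals[plant].append(r)
        chi2 += r * r  # the statistic is the sum of squared residuals

df = (len(COUNTS) - 1) * (len(BEES) - 1)

# Upper-tail p-value. For EVEN df the chi-squared survival function has
# the closed form exp(-x/2) * sum((x/2)**i / i!) for i = 0 .. df/2 - 1.
half = chi2 / 2
p_value = math.exp(-half) * sum(
    half**i / math.factorial(i) for i in range(df // 2)
)

# Cells whose residual is larger than the rule-of-thumb cutoff of 2 in size.
notable = [
    (plant, BEES[j], round(r, 2))
    for plant, rs in residuals.items()
    for j, r in enumerate(rs)
    if abs(r) > 2
]

print(f"X2 = {chi2:.2f}, df = {df}, p = {p_value:.2g}")
for cell in notable:
    print(cell)
```

The statistic and residuals should come out close to the values quoted above; small differences in the last decimal place are just rounding.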
Graphics
There are a number of ways to visualize count data. A bar chart (what Excel would call a column chart) is one sensible way. However, if there are large differences in count value, some bars end up very tall compared to others.
Figure 1. Bee visits to some garden flower species.
Rather than attempt to visualize the raw data, it may be better to look at the results. The Pearson residual values are a good indicator of the various associations; they are also more likely to be on a similar scale.
Figure 2. Pearson residuals for bee visits to garden flowers.
The residual plot for the bee data shows the positive and negative associations. Residuals beyond ±2 are statistically significant. The plot here groups the data by bee species, but you could easily switch this around and group by plant species.
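One way to prepare and draw such a chart is sketched below. The helpers `pearson_residuals` and `transpose` use only the standard library; `plot_grouped_bars` assumes the third-party matplotlib package is installed and imports it only when called. All three function names are made up for this example and are not how the figures above were produced.

```python
import math

BEES = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
PLANTS = ["Cir.vul", "Ech.vul", "Koe.pan", "Med.fal", "Rub.fru"]
COUNTS = [  # rows follow PLANTS, columns follow BEES
    [10, 8, 18, 12, 8],
    [1, 3, 9, 13, 27],
    [37, 19, 1, 16, 6],
    [5, 6, 2, 9, 32],
    [12, 4, 4, 10, 23],
]


def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [
        [(obs - rt * ct / n) / math.sqrt(rt * ct / n)
         for obs, ct in zip(row, col_totals)]
        for row, rt in zip(table, row_totals)
    ]


def transpose(table):
    """Swap rows and columns, which swaps the grouping in the chart."""
    return [list(col) for col in zip(*table)]


def plot_grouped_bars(values, group_names, series_names, ylabel, cutoff=None):
    """Draw one cluster of bars per group; values[g][s] is a bar height.

    matplotlib is imported here so the helpers above work without it.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    width = 0.8 / len(series_names)
    for s, name in enumerate(series_names):
        xs = [g + s * width for g in range(len(group_names))]
        ax.bar(xs, [row[s] for row in values], width=width, label=name)
    # Centre each tick label under its cluster of bars.
    ax.set_xticks([g + 0.4 - width / 2 for g in range(len(group_names))])
    ax.set_xticklabels(group_names)
    ax.set_ylabel(ylabel)
    if cutoff is not None:
        # Dashed guide lines at the plus/minus significance cutoff.
        for y in (cutoff, -cutoff):
            ax.axhline(y, linestyle="--", linewidth=0.8, color="grey")
    ax.legend()
    return fig


resid_by_plant = pearson_residuals(COUNTS)   # one group per plant species
resid_by_bee = transpose(resid_by_plant)     # one group per bee species
```

Calling `plot_grouped_bars(resid_by_bee, BEES, PLANTS, "Pearson residual", cutoff=2)` gives a chart grouped by bee species; passing `resid_by_plant, PLANTS, BEES` instead groups by plant species. The same function works for the raw counts if you pass `COUNTS` and leave out `cutoff`.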
References
Open University (2011), Environmental Science (S216). Undergraduate project.