# Bees and Visits to Garden Flowers

These data come from an undergraduate student project (Open University, Environmental Science) carried out in 2011. They show counts of visits by various bee species to several garden flowers. The observations were collected over several days, at the same time of day (mid-morning).

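|         | B.ter | B.hor.pra | B.lap | A.mel | B.pas |
|---------|------:|----------:|------:|------:|------:|
| Cir.vul |    10 |         8 |    18 |    12 |     8 |
| Ech.vul |     1 |         3 |     9 |    13 |    27 |
| Koe.pan |    37 |        19 |     1 |    16 |     6 |
| Med.fal |     5 |         6 |     2 |     9 |    32 |
| Rub.fru |    12 |         4 |     4 |    10 |    23 |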

In the data the species names are abbreviations of their scientific binomial names. You should be able to find the full names quite easily.

You can download the dataset as a CSV file using this link: bees-garden-flowers. Alternatively, you can copy the table to the clipboard and paste it into a spreadsheet.

## Usage

You can use these data to practice/illustrate various topics:

• Simple summary statistics.
• Graphical summary (e.g. bar chart).
• Test of association (e.g. chi-squared test).

## Keywords

Invertebrate, pollinator, bee, association, chi squared, graphics, bar chart.

## Examples

The following examples will give you a few ideas about how you might explore these data.

### Summary statistics

The raw data show counts of bee visits to various plant species. Averages are not really appropriate for counts like these; the most useful summary is the set of row and column totals.

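|         | B.ter | B.hor.pra | B.lap | A.mel | B.pas | Sum |
|---------|------:|----------:|------:|------:|------:|----:|
| Cir.vul |    10 |         8 |    18 |    12 |     8 |  56 |
| Ech.vul |     1 |         3 |     9 |    13 |    27 |  53 |
| Koe.pan |    37 |        19 |     1 |    16 |     6 |  79 |
| Med.fal |     5 |         6 |     2 |     9 |    32 |  54 |
| Rub.fru |    12 |         4 |     4 |    10 |    23 |  53 |
| Sum     |    65 |        40 |    34 |    60 |    96 | 295 |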

You could also calculate the proportions of visits, by row, by column, or overall.
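As a minimal sketch of these summaries, the following Python snippet recomputes the totals and one set of proportions from the counts in the table. The variable names (`counts`, `row_totals`, `row_props` and so on) are illustrative choices, not part of the dataset, and the same sums could be done just as easily in a spreadsheet.

```python
# Visit counts copied from the table: rows are plants, columns are bees.
bees = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
counts = {
    "Cir.vul": [10, 8, 18, 12, 8],
    "Ech.vul": [1, 3, 9, 13, 27],
    "Koe.pan": [37, 19, 1, 16, 6],
    "Med.fal": [5, 6, 2, 9, 32],
    "Rub.fru": [12, 4, 4, 10, 23],
}

# Row totals: visits received by each plant species.
row_totals = {plant: sum(row) for plant, row in counts.items()}

# Column totals: visits made by each bee species.
col_totals = {bee: sum(col) for bee, col in zip(bees, zip(*counts.values()))}

# Grand total of all recorded visits.
grand_total = sum(row_totals.values())

# Row proportions: the share of each plant's visits made by each bee species.
row_props = {
    plant: [n / row_totals[plant] for n in row]
    for plant, row in counts.items()
}

print(row_totals)
print(col_totals)
print(grand_total)
print({bee: round(p, 2) for bee, p in zip(bees, row_props["Koe.pan"])})
```

Column proportions follow the same pattern, dividing each count by its column total, and overall proportions divide each count by `grand_total`.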

### Chi Squared test of association

These data are suitable for a test of association, and the Pearson chi-squared test is the usual choice. The overall result can be summarized like so: X² = 120.65, df = 16, p < 0.001. This indicates that the visits are not distributed independently of flower species; in other words, certain bee species show preferences (both positive and negative). The Pearson residuals show where the positive and negative associations lie.
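The test statistic can be checked by hand. The sketch below, in plain Python, computes the expected counts under independence (row total × column total ÷ grand total), sums (observed − expected)² ÷ expected over all cells, and derives the degrees of freedom. The helper `chi2_sf_even_df` is a hypothetical name introduced here; it uses a closed-form expression for the upper-tail probability that is valid only when the degrees of freedom are even, which happens to be the case for this table. A statistics package would normally do all of this for you.

```python
import math

# Visit counts: rows are plants, columns are bees.
counts = [
    [10, 8, 18, 12, 8],   # Cir.vul
    [1, 3, 9, 13, 27],    # Ech.vul
    [37, 19, 1, 16, 6],   # Koe.pan
    [5, 6, 2, 9, 32],     # Med.fal
    [12, 4, 4, 10, 23],   # Rub.fru
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

# Expected count for each cell if bee and plant species were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(counts, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; this closed form needs even df."""
    if df % 2 != 0:
        raise ValueError("closed form only applies to even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


p_value = chi2_sf_even_df(chi2, df)
print(f"X2 = {chi2:.2f}, df = {df}, p = {p_value:.2e}")
```

The printed statistic and degrees of freedom should agree with the summary quoted above, and the p-value should be far below 0.001.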

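|         | B.ter | B.hor.pra | B.lap | A.mel | B.pas |
|---------|------:|----------:|------:|------:|------:|
| Cir.vul | -0.67 |      0.15 |  4.54 |  0.18 | -2.39 |
| Ech.vul | -3.12 |     -1.56 |  1.17 |  0.68 |  2.35 |
| Koe.pan |  4.70 |      2.53 | -2.69 | -0.02 | -3.89 |
| Med.fal | -2.00 |     -0.49 | -1.69 | -0.60 |  3.44 |
| Rub.fru |  0.09 |     -1.19 | -0.85 | -0.24 |  1.39 |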

Positive Pearson residuals indicate a positive association (the bees visit more often than expected), whilst negative values indicate a negative association (the bees visit less often than expected). However, as a rule of thumb, only residuals with an absolute value greater than 2 are statistically significant.
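Each Pearson residual is (observed − expected) ÷ √expected, where the expected count is row total × column total ÷ grand total. The following sketch recomputes the residuals in plain Python and lists the pairings that pass the rule-of-thumb threshold of 2. The function name `pearson_residual` and the list `notable` are illustrative names, not part of the dataset. Note that a residual shown as ±2.00 after rounding may fall on either side of the threshold before rounding.

```python
import math

bees = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
plants = ["Cir.vul", "Ech.vul", "Koe.pan", "Med.fal", "Rub.fru"]
counts = [
    [10, 8, 18, 12, 8],
    [1, 3, 9, 13, 27],
    [37, 19, 1, 16, 6],
    [5, 6, 2, 9, 32],
    [12, 4, 4, 10, 23],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)


def pearson_residual(observed, expected):
    """Signed distance of an observed count from its expected count."""
    return (observed - expected) / math.sqrt(expected)


# One residual per cell, using expected = row total * column total / n.
residuals = [
    [pearson_residual(o, r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(counts, row_totals)
]

# Pairings whose residual exceeds the threshold of 2 in absolute value.
notable = [
    (plant, bee, round(res, 2))
    for plant, row in zip(plants, residuals)
    for bee, res in zip(bees, row)
    if abs(res) > 2
]

for plant, bee, res in notable:
    direction = "more" if res > 0 else "fewer"
    print(f"{bee} on {plant}: {res:+.2f} ({direction} visits than expected)")
```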

You can see various interesting patterns in the results. For example, honeybees don’t seem to have any especial preferences, whereas all the bumblebees show at least some degree of choice.

### Graphics

There are a number of ways to visualize count data. A bar chart (what Excel would call a column chart) is one sensible way. However, if there are large differences in count value, some bars end up very tall compared to others.

Figure 1 (bees-garden-flowers-raw.png). Bee visits to some garden flower species.

Rather than attempt to visualize the raw data, it may be better to look at the results. The Pearson residual values are a good indicator of the various associations; they are also more likely to be on a similar scale.

Figure 2 (bees-garden-flowers-resid.png). Pearson residuals for bee visits to garden flowers.

The residual plot for the bee data shows the positive and negative associations. As a rule of thumb, residuals with an absolute value greater than 2 are statistically significant. The plot here groups the data by bee species, but you could just as easily switch this around and group by plant species.
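If you have no plotting software to hand, a rough text version of the residual chart can be sketched in plain Python. This is not how the figure was produced; it is only an illustration, and the helpers `bar` and `render` are hypothetical names. The residuals are copied from the table in the chi-squared section, and transposing that table is all it takes to switch the grouping from plant species to bee species.

```python
bees = ["B.ter", "B.hor.pra", "B.lap", "A.mel", "B.pas"]
plants = ["Cir.vul", "Ech.vul", "Koe.pan", "Med.fal", "Rub.fru"]

# Pearson residuals copied from the table (rows: plants, columns: bees).
residuals = [
    [-0.67, 0.15, 4.54, 0.18, -2.39],
    [-3.12, -1.56, 1.17, 0.68, 2.35],
    [4.70, 2.53, -2.69, -0.02, -3.89],
    [-2.00, -0.49, -1.69, -0.60, 3.44],
    [0.09, -1.19, -0.85, -0.24, 1.39],
]


def bar(value, scale=4, width=20):
    """Draw one horizontal bar to the left or right of a central zero axis."""
    hashes = "#" * min(width, round(abs(value) * scale))
    if value < 0:
        return hashes.rjust(width) + "|" + " " * width
    return " " * width + "|" + hashes.ljust(width)


def render(groups, members, table):
    """Build the chart text: one block per group, one bar per member."""
    lines = []
    for group, values in zip(groups, table):
        lines.append(group)
        for member, value in zip(members, values):
            lines.append(f"  {member:<10}{bar(value)} {value:+.2f}")
    return "\n".join(lines)


# Transpose so that each block of bars belongs to one bee species.
by_bee = [list(col) for col in zip(*residuals)]
print(render(bees, plants, by_bee))
```

Calling `render(plants, bees, residuals)` instead gives the same chart grouped by plant species.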

## References

Open University (2011), Environmental Science (S216). Undergraduate project.