Boxplot with Parametric data
This function makes a box-whisker plot but uses parametric statistics rather than the usual non-parametric ones. In general the non-parametric statistics of the regular boxplot() function are what you require, but there are occasions when you may wish to present a box-whisker plot using mean, standard error and standard deviation instead.
Keywords
Graphics, parametric, boxplot, box-whisker.
Download
You can download the function code using this link (click to view the text, right-click to select download): boxplot-parametric.R. The file is an R source code file, which is readable by any text editor.
To get the function working for your copy of R you'll need to use the source() function. Put the boxplot-parametric.R file in your working directory and type:
source("boxplot-parametric.R")
Alternatively, you can use:
source(file.choose())
This will open a file browser so you can find and select the file. If you are using RStudio you can use the menu: Code > Source File... The source code generates a new function called box.para().
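As a quick check that the function loaded, something like the following sketch can be used. It assumes the file sits in the working directory under the name given above; the variable names fn_file and loaded are illustrative only.

```r
# Assumed location: the current working directory, as described above
fn_file <- "boxplot-parametric.R"

if (file.exists(fn_file)) {
  source(fn_file)                                   # defines box.para() in the workspace
  loaded <- exists("box.para", mode = "function")   # TRUE if the function is now available
} else {
  loaded <- FALSE
  message("File not found in ", getwd(), "; check the working directory.")
}
loaded
```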
Description
This function creates a box-whisker plot but uses parametric statistics rather than the usual non-parametric ones. The data must be in a stacked layout with a response variable and a predictor (grouping) variable. The statistics used compare to the regular boxplot() like so:
|         | boxplot | box.para  |
| Stripe  | Median  | Mean      |
| Box     | IQR     | Std. Err. |
| Whisker | Min-Max | Std. Dev. |
The function box.para() creates the statistics and then passes the data to bxp(), which does the plotting. You can pass additional parameters to bxp(), but these are slightly different from those of boxplot().
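The hand-off to bxp() can be sketched by building the input list manually. This is a minimal illustration using the built-in InsectSprays data, not the function's own code: the variable names are made up for the example, and the rows are arranged in the order documented for bxp() (lower whisker, lower hinge, middle line, upper hinge, upper whisker).

```r
# Split the response by group, then compute the parametric summaries
groups   <- split(InsectSprays$count, InsectSprays$spray)
grp_mean <- vapply(groups, mean, numeric(1))
grp_sd   <- vapply(groups, sd, numeric(1))
grp_n    <- vapply(groups, length, integer(1))
grp_se   <- grp_sd / sqrt(grp_n)              # standard error of the mean

# One column per group; whiskers at mean +/- SD, box at mean +/- SE
para_stats <- rbind(grp_mean - grp_sd,        # lower whisker
                    grp_mean - grp_se,        # lower edge of box
                    grp_mean,                 # stripe
                    grp_mean + grp_se,        # upper edge of box
                    grp_mean + grp_sd)        # upper whisker

# bxp() reads the plotting values from $stats and the group labels from $names
para_bxp <- list(stats = para_stats, n = grp_n, names = names(groups))
bxp(para_bxp)
```

Because the list has no out or group components, no outlier points are drawn; only the box, stripe and whiskers appear.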
Usage
box.para(formula, data, ...)
Arguments
The function requires a formula and the data it refers to.
| formula | A formula of the form response ~ predictor. |
| data    | A data.frame which contains the variables referred to in the formula. |
| ...     | Additional arguments to pass to bxp(). |
Note that the arguments to pass to bxp() are not entirely the same as for boxplot(); see Examples.
Value
The result is a boxplot drawn using the parametric statistics: mean, std. error and std. deviation. In addition, a numeric matrix of the parametric statistics (mean, std. deviation, std. error and number of replicates for each group) is printed to the console.
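To see where the values in that matrix come from, the statistics for a single group can be computed directly. The sketch below uses spray A from the built-in InsectSprays data; the variable names are illustrative only.

```r
# Counts for one group only
spray_a <- InsectSprays$count[InsectSprays$spray == "A"]

n_a  <- length(spray_a)      # number of replicates
sd_a <- sd(spray_a)          # standard deviation
se_a <- sd_a / sqrt(n_a)     # standard error = SD / sqrt(n)

summary_a <- c(Mean = mean(spray_a), "Std. Dev." = sd_a,
               "Std. Err." = se_a, Replicates = n_a)
round(summary_a, 6)
```

The four values (14.5, 4.719399, 1.362373 and 12) should match column A of the matrix shown under Examples.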
See Also
See boxplot() for regular non-parametric statistics, and bxp() for the underlying function used to draw box-whisker plots.
Code
Here is the code for the function in full:
# Boxplot with parametric data
# Mark Gardener Sep 2011 rev. Apr 2010
# https://www.gardenersown.co.uk

box.para <- function(formula, data, ...)
  # formula = the formula to use e.g. y ~ x
  # data = the data where the variables are (at least 2-columns, response, predictor)
  # ... = additional parameters for bxp()
{ # start function code

  ## Check inputs
  if(missing(data)) # check to see if data given
    stop('Please specify data\n\n') # if not then stop
  if(missing(formula)) # check that formula entered
    stop('Please enter a formula in the form y ~ x\n\n') # if not then stop

  ## Make summary stats
  bx.mean <- aggregate(formula, data = data, FUN = mean)[,2] # the mean
  bx.sd <- aggregate(formula, data = data, FUN = sd)[,2] # Std. Dev.
  bx.n <- aggregate(formula, data = data, FUN = length)[,2] # no. replicates
  bx.names <- as.character(aggregate(formula, data = data, FUN = mean)[,1]) # make names
  bx.se <- bx.sd/sqrt(bx.n) # Std. Err.

  ## Make result object for screen display
  bx.result <- rbind(bx.mean, bx.sd, bx.se, bx.n) # a matrix of results
  colnames(bx.result) = bx.names # column names
  rownames(bx.result) = c("Mean", "Std. Dev.", "Std. Err.", "Replicates") # row names

  ## Make boxplot stats
  bx.sd.up <- bx.mean + bx.sd # upper whisker
  bx.sd.dn <- bx.mean - bx.sd # lower whisker
  bx.se.up <- bx.mean + bx.se # upper box
  bx.se.dn <- bx.mean - bx.se # lower box

  ## Make boxplot object
  bx.stats <- rbind(bx.sd.up, bx.se.up, bx.mean, bx.se.dn, bx.sd.dn) # main results for plot
  bx.bx <- list(bx.stats, bx.names) # make bxp list object
  names(bx.bx) <- c("stats", "names") # set the names for the bxp list object

  ## Display results
  bxp(bx.bx, ...) # draw boxplot and use additional parameters if entered
  print(bx.result) # show result summary on screen

} # end function code
## END
Examples
head(InsectSprays)
  count spray
1    10     A
2     7     A
3    20     A
4    14     A
5    14     A
6    12     A
box.para(count ~ spray, data = InsectSprays, las = 1, boxfill = "violet")
                   A         B          C          D         E         F
Mean       14.500000 15.333333  2.0833333  4.9166667  3.500000 16.666667
Std. Dev.   4.719399  4.271115  1.9752253  2.5030285  1.732051  6.213378
Std. Err.   1.362373  1.232965  0.5701984  0.7225621  0.500000  1.793648
Replicates 12.000000 12.000000 12.0000000 12.0000000 12.000000 12.000000
title(xlab = "Spray type", ylab = "Mean Insects killed")
Figure 1. The box.para() function uses mean, std. err. and std. dev. to create a boxplot.
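Note that the parameters for bxp() are slightly different to those for boxplot(). For example, the fill colour of the boxes is set with boxfill, as in the call above, whereas boxplot() uses col.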
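The difference in parameter names can be tried out without box.para() at all. The sketch below takes the regular summaries from boxplot() with plot = FALSE and draws them with bxp(); the colours and the object name bp are arbitrary choices for illustration.

```r
# Compute the regular (non-parametric) summaries without drawing anything
bp <- boxplot(count ~ spray, data = InsectSprays, plot = FALSE)

# boxplot() takes the box fill colour via col
boxplot(count ~ spray, data = InsectSprays, col = "violet", las = 1)

# bxp() takes it via boxfill, and has separate settings for the
# middle line (medcol) and the whisker line type (whisklty)
bxp(bp, boxfill = "violet", medcol = "darkblue", whisklty = 1, las = 1)
```

Any of the bxp() settings used here can be passed through the ... argument of box.para() in the same way as boxfill in the example above.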