 ### Dr. Mark Gardener Dr. Christine Gardener

# Beginning R: The Statistical Programming Language

## Support and Outline

Beginning R is available from the publisher, Wrox, or see the entry on Amazon.co.uk.


## Beginning R: The Statistical Programming Language

### by: Mark Gardener

Conquer the complexities of this open source statistical language

R is fast becoming the de facto standard for statistical computing and analysis in science, business, engineering, and related fields. This book examines this complex language using simple statistical examples, showing how R operates in a user-friendly context. Both students and workers in fields that require extensive statistical analysis will find this book helpful as they learn to use R for simple summary statistics, hypothesis testing, creating graphs, regression, and much more. It covers formula notation, complex statistics, manipulating data and extracting components, and rudimentary programming.

• R, the open source statistical language increasingly used to handle statistics and produce publication-quality graphs, is notoriously complex
• This book makes R easier to understand through the use of simple statistical examples, teaching the necessary elements in the context in which R is actually used
• Covers getting started with R and using it for simple summary statistics, hypothesis testing, and graphs
• Shows how to use R for formula notation, complex statistics, manipulating data, extracting components, and regression
• Provides beginning programming instruction for those who want to write their own scripts

Beginning R offers anyone who needs to perform statistical analysis the information necessary to use R with confidence.

The following outline covers each chapter of the book.

Download the Beginning.RData file for the example data used in the book

# 1 Introducing R: What It Is and How to Get It

## What you will learn in this chapter

• Discovering what R is
• Getting to the R program
• Installing it on your computer
• Starting to run the program
• Using the help system and finding help from other sources
• Obtaining additional libraries of commands

In this chapter you see how to get R and install it on your computer. You also learn how to access the built-in help system and find out about additional packages of useful analytical routines that you can add to R.


# 2 Starting Out: Becoming Familiar With R

## What you will learn in this chapter

• How to use R for simple math
• How to store results of calculations for future use
• How to create data objects from the keyboard, clipboard, or external data files
• How to see the objects that are ready for use
• How to look at the different types of data objects
• How to make different types of data objects
• How to save your work
• How to use previous commands in the history

This chapter builds some familiarity with working with R, beginning with some simple math and culminating in importing and making data objects that you can work with (and saving data to disk for later use).
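As a rough illustration of the kind of session this chapter describes (not taken from the book), the sketch below does some simple math, stores the result, makes a small data object from the keyboard, and saves it. The names `total`, `heights`, and `out` and all the values are made up for the example.

```r
# Simple math, with the result stored in a named object
total <- 3 + 4 * 2        # multiplication is done before addition
total                     # typing the name displays the value

# A data object typed in from the keyboard (values are made up)
heights <- c(12.5, 14.1, 13.8, 15.2)
mean(heights)

# See which objects are ready for use
ls()

# Save an object to disk for later use; tempfile() is used here only
# so that the sketch does not write into your working directory
out <- tempfile(fileext = ".RData")
save(heights, file = out)
```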


# 3 Starting Out: Working With Objects

## What you will learn in this chapter

• How to manipulate data objects
• How to select and display parts of data objects
• How to sort and rearrange data objects
• How to construct data objects
• How to determine what form a data object is
• How to convert a data object from one form to another

This chapter deals with manipulating the data that you have created or imported. These are important tasks that underpin many of the later exercises. The skills you learn here will be put to use over and over again.


# 4 Data: Descriptive Statistics and Tabulation

## What you will learn in this chapter

• How to summarize data samples
• How to use cumulative statistics
• How to create summary tables
• How to cross-tabulate
• How to test for different object types

This chapter is all about summarizing data. Here you learn about basic summary methods, including cumulative statistics. You also learn about cross-tabulation and how to create summary tables.
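A minimal sketch of these ideas using base R commands; the data (`counts`, `site`, `color`) are hypothetical and are not from the book's example file.

```r
# Hypothetical sample of counts
counts <- c(4, 7, 2, 9, 5)

summary(counts)   # minimum, quartiles, mean, and maximum
cumsum(counts)    # cumulative sum
cummax(counts)    # cumulative maximum

# Cross-tabulation of two hypothetical categorical variables
site  <- c("A", "A", "B", "B", "B", "A")
color <- c("red", "blue", "red", "red", "blue", "red")
tab <- table(site, color)
tab
```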


# 5 Data: Distribution

## What you will learn in this chapter

• How to create histograms and other graphics of sample distribution
• How to examine various distributions
• How to test for the normal distribution
• How to generate random numbers

In this chapter you look at visualizing data using graphical methods—for example, histograms—as well as mathematical ones. This chapter also includes some notes about random numbers and different types of distribution (for example, normal and Poisson).


# 6 Simple Hypothesis Testing

## What you will learn in this chapter

• How to carry out some basic hypothesis tests
• How to carry out the Student’s t-test
• How to conduct the U-test for non-parametric data
• How to carry out paired tests for parametric and non-parametric data
• How to produce correlation and covariance matrices
• How to carry out a range of correlation tests
• How to test for association using chi-squared
• How to carry out goodness-of-fit tests

In this chapter you learn how to carry out some basic statistical methods such as the t-test, correlation, and tests of association. Learning how to do these is helpful for when you have to carry out more complex analyses and also illustrates a range of techniques for using R.
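The following sketch shows how a few of these tests might be run with base R commands. The two samples `a` and `b` are invented for illustration, and the calls use R's defaults rather than any particular settings from the book.

```r
# Two hypothetical samples of equal length
a <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.2)
b <- c(6.8, 7.1, 6.5, 7.4, 6.9, 7.7)

tt <- t.test(a, b)                 # two-sample t-test (Welch version by default)
ut <- wilcox.test(a, b)            # U-test for non-parametric data
pt <- t.test(a, b, paired = TRUE)  # paired t-test
ct <- cor.test(a, b)               # correlation test (Pearson by default)

tt          # typing the name of a result displays it
ct$estimate # individual components can be extracted with $
```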


# 7 Introduction to Graphical Analysis

## What you will learn in this chapter

• How to create a range of graphs to summarize your data and results
• How to create box-whisker plots
• How to create scatter plots, including multiple correlation plots
• How to create line graphs
• How to create pie charts
• How to create bar charts
• How to move graphs from R to other programs and to save graphs as files on disk

In this chapter you learn how to produce a range of graphs including bar charts, scatter plots, and pie charts. This is a “first look” at making graphs, but you return to this subject in Chapter 11, where you learn how to turn your graphs from merely adequate to simply stunning.


# 8 Formula Notation and Complex Statistics

## What you will learn in this chapter

• How to use formula notation for simple hypothesis tests
• How to use formula notation in graphics
• How to carry out analysis of variance (ANOVA)
• How to conduct post-hoc tests
• How the formula syntax can be used to define complex analytical models
• How to carry out complex ANOVA
• How to draw summary graphs of ANOVA
• How to create interaction plots

As your analyses become more complex, you need a more complex way to tell R what you want to do. This chapter is concerned with an important element of R: how to define complex situations. The chapter has two main parts; the first part shows how the formula notation can be used with simple situations. The second part uses an important analytical method, analysis of variance, as an illustration. The rest of the chapter is devoted to ANOVA. This is an important chapter because the ability to define complex analytical situations is something you will inevitably require at some point.
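To give a flavor of the formula notation, here is a small sketch in which a response variable is placed to the left of `~` and a grouping variable to the right. The data frame `dat` and its values are hypothetical; only the column names echo the "height by water" result shown later on this page.

```r
# Hypothetical data: plant height under three watering treatments
dat <- data.frame(
  height = c(10, 12, 11, 15, 16, 14, 20, 19, 21),
  water  = factor(rep(c("lo", "mid", "hi"), each = 3))
)

# Formula notation in a simple hypothesis test (two groups only)
two <- droplevels(subset(dat, water != "mid"))
tt <- t.test(height ~ water, data = two)

# The same notation defines a one-way analysis of variance
fit <- aov(height ~ water, data = dat)
summary(fit)

# A post-hoc test on the fitted model
TukeyHSD(fit)

# The formula also works in graphics, for example:
# boxplot(height ~ water, data = dat)
```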


# 9 Manipulating Data and Extracting Components

## What you will learn in this chapter

• How to create data frames and matrix objects ready for complex analyses
• How to create or set factor data
• How to add rows and columns to data objects
• How to use simple summary commands to extract column or row information
• How to extract summary statistics from complex data objects

This chapter builds on the previous one. Now that you have seen how to define more complex analytical situations, you learn how to make and rearrange your data so that it can be analyzed more easily. This also builds on knowledge gained in Chapter 3. In many cases, when you have carried out an analysis you will need to extract data for certain groups; this chapter also deals with that, giving you more tools that you will need to carry out complex analyses easily.


# 10 Regression (Linear Modeling)

## What you will learn in this chapter

• How to carry out linear regression (including multiple regression)
• How to carry out curvilinear regression using logarithmic and polynomial models as examples
• How to build a regression model using both forward and backward stepwise processes
• How to plot regression models
• How to add lines of best-fit to regression plots
• How to determine confidence intervals for regression models
• How to plot confidence intervals
• How to draw diagnostic plots

This chapter is all about regression. It builds on earlier chapters and covers various aspects of this important analytical method. You learn how to carry out basic regression as well as complex model building and curvilinear regression. It is also important because it illustrates some useful aspects of R (for example, how to dissect results). The later parts of the chapter deal with graphical aspects of regression, such as how to add lines of best-fit and confidence intervals.
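A brief sketch of what basic and polynomial regression can look like in R, including dissecting the result. The vectors `x` and `y` are made-up values, so the fitted numbers mean nothing beyond the illustration.

```r
# Hypothetical predictor and response
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

fit <- lm(y ~ x)   # simple linear regression
summary(fit)

# Dissecting the result: coefficients and their confidence intervals
coef(fit)
confint(fit)

# A polynomial (quadratic) model; I() protects the arithmetic in a formula
quad <- lm(y ~ x + I(x^2))

# Predicted value with a confidence interval for a new x
predict(fit, newdata = data.frame(x = 9), interval = "confidence")
```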


# 11 More About Graphs

## What you will learn in this chapter

• How to add error bars to existing graphs
• How to add legends to plots
• How to add text to graphs, including superscripts and subscripts
• How to add mathematical symbols to text on graphs
• How to add lines to graphs, including mathematical expressions
• How to plot multiple series on a graph
• How to draw multiple graphs in one window
• How to export graphs to graphics files and other programs

This chapter builds on the earlier chapter on graphics (Chapter 7) and also on the previous chapter on regression. It shows you how to produce more customized graphs from your data. For example, you learn how to add text to plots and axes, and how to make superscript and subscript text and mathematical symbols. You learn how to add legends to plots and how to add error bars to bar charts or scatter plots. Finally, you learn how to export graphs to disk as high-quality graphics files, suitable for publication.


# 12 Writing Your Own Scripts - Beginning to Program

## What you will learn in this chapter

• How to store series of commands as snippets to be used with copy/paste
• How to make your own help file
• How to create simple customized functions
• How to edit, store, and recall customized functions
• How to create complex program code

In this chapter you learn how to start producing customized functions and simple scripts that can automate your workflow and make complex and repetitive tasks a lot easier.


## Beginning R: Example data file

The book includes many examples and these are included in the Beginning.RData file.

### Get the example data

You can download that file by clicking on the link. This one file contains all the example datasets and scripts you need for the whole book.

### Install the example data

Once you have the file on your computer you can load it into R by one of several methods:

• For Windows or Mac you can drag the Beginning.RData file icon onto the R program icon; this will open R if it is not already running and load the data. If R is already open, the data will be appended to anything you already have in R; otherwise, only the data in the file will be loaded.

If you have Windows or Macintosh you can load the file using menu commands or use a command typed into R:

• For Windows use File, Load Workspace, or type the following command in R:

• For Mac use Workspace, Load Workspace File, or type the following command in R (same as in Windows):

• If you have Linux then you can use the load() command but must specify the filename (in quotes) exactly, for example:

The Beginning.RData file must be in your default working directory; if it is not, you must specify the location as part of the filename. Alternatively, you can find the working directory in R by using the getwd() command:

getwd()

Then drag the Beginning.RData file into that directory and use the load() command:
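The commands that these steps refer to are not shown on this page. As a hedged sketch of the load() approach only, the snippet below saves a small stand-in object to a temporary .RData file and loads it back by its quoted filename. The names `demo_obj` and `demo_path` are hypothetical; with the real file you would pass "Beginning.RData" (or its full path) instead.

```r
# Stand-in for Beginning.RData so that the sketch runs anywhere
demo_obj <- c(1, 2, 3)
demo_path <- file.path(tempdir(), "Demo.RData")
save(demo_obj, file = demo_path)
rm(demo_obj)               # remove it so that load() visibly restores it

getwd()                    # the directory load() searches for a bare filename
loaded <- load(demo_path)  # returns the names of the restored objects
loaded

# With the book's file in the working directory, the call would be:
# load("Beginning.RData")
# To choose the file from a dialog instead:
# load(file.choose())
```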

### Using the example data

R uses named objects so everything gets a name. You can see what is included in the Beginning.RData file by using the ls() command:

ls()

This will show you everything currently in the memory of R. Remember that names are case sensitive so that Qty is not the same as qty. There are four main kinds of object in the Beginning.RData file:

• Data
• Results
• One-line functions
• Complex functions/scripts

You can look at an object simply by typing its name.
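A small sketch of listing and viewing objects, using two made-up objects whose names differ only in case (they are created here for illustration and are not claimed to be in the data file):

```r
# Two hypothetical objects; the names differ only in case
Qty <- 10
qty <- 99

ls()                  # lists the objects currently in memory
Qty                   # typing a name displays that object
identical(Qty, qty)   # FALSE: these are two separate objects
```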

### Data

Many of the objects in the Beginning.RData file are data. For example the bv object shows some results for visits of bees to various colors of flower.

    > bv
           ratio visit
    Red     10.0   100
    Blue     5.0    33
    White   15.0    12
    Green   10.0    16
    Yellow   5.0    22
    Orange   2.5     7
    Pink     6.0    23
    Purple  12.0    17

These data are used to carry out a goodness-of-fit test by comparing the observed visits to the theoretical ratio expected.
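One way such a test might be run is with the chisq.test() command, giving the observed counts and the expected ratio. This is a sketch and may differ from the exact call used in the book; the bv data frame is rebuilt here from the values shown above so that the example is self-contained.

```r
# Rebuild the bv data from the values listed above
bv <- data.frame(
  ratio = c(10, 5, 15, 10, 5, 2.5, 6, 12),
  visit = c(100, 33, 12, 16, 22, 7, 23, 17),
  row.names = c("Red", "Blue", "White", "Green",
                "Yellow", "Orange", "Pink", "Purple")
)

# Observed visits against the expected ratio; rescale.p = TRUE converts
# the ratio into proportions that sum to 1
gof <- chisq.test(bv$visit, p = bv$ratio, rescale.p = TRUE)
gof

gof$expected   # expected visits under the theoretical ratio
```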

### Results

Some of the objects in the Beginning.RData file are results. For example the pw.kw object shows the results of a Kruskal-Wallis test.

> pw.kw

Kruskal-Wallis rank sum test

data: height by water
Kruskal-Wallis chi-squared = 15.205, df = 2, p-value = 0.0004992

The results of analyses are sometimes used for further analyses and to draw graphs.

### One-line functions

R is very flexible and one useful aspect is the ability to create simple functions. For example the pn object is a function that applies a polynomial formula to any numerical value.

> pn
function(x) (2.06*x)+(-0.04 * x^2)-2

In this case the polynomial formula was taken from a previous analysis and is used to draw a line of best-fit onto a graph.
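As a sketch of how such a function can be used, the snippet below recreates pn as shown above and applies it to a few x values; `xs` is a hypothetical set of values chosen only for illustration.

```r
# The one-line function as displayed above
pn <- function(x) (2.06 * x) + (-0.04 * x^2) - 2

xs <- seq(0, 20, by = 5)   # hypothetical x values
pn(xs)                     # the function is applied to every value

# On an existing scatter plot, the fitted curve could then be added with:
# lines(xs, pn(xs))
```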

### Complex functions/scripts

If you require a more complex task or want to automate your workflow, you can create a longer "script". The cum.fun object is an example of such a script.

    > cum.fun
    function(x, fun = median, ...) {
       tmp = seq_along(x)
       for(i in 1:length(tmp)) tmp[i] = fun(x[1:i], ...)
       cat('\n', deparse(substitute(fun)),'of', deparse(substitute(x)),'\n')
       print(tmp)
    }

This script allows you to generate a cumulative statistic for a set of numbers. The default uses the median but you can specify any sensible function (for example, the mean, to create a running mean).
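A short usage sketch: the function is redefined here (with its closing brace) so that the example stands alone, and `vals` is a made-up set of numbers.

```r
# cum.fun as displayed above
cum.fun <- function(x, fun = median, ...) {
  tmp = seq_along(x)
  for (i in 1:length(tmp)) tmp[i] = fun(x[1:i], ...)
  cat('\n', deparse(substitute(fun)), 'of', deparse(substitute(x)), '\n')
  print(tmp)
}

vals <- c(2, 4, 9)   # hypothetical data

cum.fun(vals)                          # running median: 2 3 4
run_mean <- cum.fun(vals, fun = mean)  # running mean: 2 3 5
```

Because print() returns its argument, the result can also be stored in an object, as `run_mean` shows.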


## Instructor Support Materials

Instructors (teachers, lecturers, professors) can now access a range of support materials via the Instructor Companion Site on the Wiley Higher Education website (you have to register but it is free).

The materials include:

• An annotated syllabus split into 30 sections, each intended to take approximately 1 hour.
• A series of PowerPoint decks. Each deck is linked to a section of the annotated syllabus.
• Classroom exercises. These complement the 30 sections of the syllabus and form a structured approach to teaching R.
• Questions and Answers. Each of the 12 chapters has 12 questions (the answers are separate). The questions are in 3 forms:
1. TRUE or FALSE?
2. Multiple choice
3. Fill in the missing word

If you are an instructor and are teaching R then these materials can help you structure your course and provide you with additional materials that you can press into service as you like.
