April 2013 update: this post is part 5 of a 6-part series designed to help beginning R programmers get up and running with some simple data analyses. The posts were originally private for a specific course in Summer 2012, but they’re now public in case the tips might be useful for a broader audience. -Brian

This week’s focus is on estimating and plotting a linear regression.

- Load the cars data from previous weeks and name the variables (see the week 3 tasks for details).
- We want to explore the relationship between engine displacement (which I assume is an indicator of an engine’s size) and gas mileage. Make a scatterplot of mileage (‘mpg’) against engine displacement (‘displacement’).
- Fit a linear regression of gas mileage on engine displacement. If you are familiar with linear regression, awesome. If not, think of it as the line that best fits the data points you just plotted. The command to do this is *lm*, which is short for *linear model*. We want to save the output so we can access it later, so we assign it to a new variable (I call mine ‘fit’): *fit = lm(cars$mpg ~ cars$displacement)*.
- Display the results of the regression. The quick way to do this is with the *summary* command, applied to the saved model: *summary(fit)*.
- Add the regression line to your plot. If your plot is still open, the command *abline(fit)* is built to do this automagically. Make the line red and thicker for easier viewing (the *col* and *lwd* arguments control color and line width).
- The ‘fit’ object now has a bunch of stuff in it. Use the command *names(fit)* to list this stuff. Any of these components can be accessed with the $ operator. Make a plot of the regression residuals against displacement: *plot(cars$displacement, fit$residuals)*.
- **Bonus**: Access the standard error of the intercept term. *Hint*: the output of *summary(fit)* can also be treated as an object, whose components can be accessed with the $ operator.
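
The tasks above can be sketched end to end as follows. This is one possible walk-through, not the official solution. Because the course data file isn't reproduced here, the sketch assumes a stand-in: R's built-in *mtcars* data, with its *disp* column renamed to ‘displacement’ so the commands match the ones in this post. The names *fit_summary* and *intercept_se* are made up for illustration.

```r
# Stand-in for the course data (an assumption): the built-in mtcars dataset,
# with columns renamed to match the variable names used in this post.
cars = data.frame(mpg = mtcars$mpg, displacement = mtcars$disp)

# Scatterplot of mileage against engine displacement
plot(cars$displacement, cars$mpg, xlab = "displacement", ylab = "mpg")

# Fit the linear regression and save the result for later use
fit = lm(cars$mpg ~ cars$displacement)

# Display the regression results
summary(fit)

# Add the fitted line to the open plot: red, and thicker than the default
abline(fit, col = "red", lwd = 2)

# List the components stored in the fit object
names(fit)

# Plot the residuals against displacement, with a dashed reference line at zero
plot(cars$displacement, fit$residuals)
abline(h = 0, lty = 2)

# Bonus: the summary is itself an object; its coefficient table is a matrix
# with one row per term and a "Std. Error" column.
fit_summary = summary(fit)
intercept_se = fit_summary$coefficients["(Intercept)", "Std. Error"]
intercept_se
```

With your own data, replace the first line with whatever loading and naming steps you used in week 3; everything after that should work unchanged as long as the columns are called ‘mpg’ and ‘displacement’.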
