In weeks 1-3 we installed and opened R, imported data from a text file, and started editing and saving code in source files. This week’s focus is R’s basic plot functions. One of the most important parts of any statistical analysis is communicating the results to other people, and plots are often a very effective way to do this.
- Open R and load the car data from weeks 2 and 3. Name the variables in the car data appropriately (see the week 3 tasks for the variable names).
- Last week we used the table command to see how many cars there are in our data set for each number of cylinders. Now we’ll check out the same information visually, using the command hist to produce a histogram. In the command window, type hist(cardata$cylinders).
- Save this plot by using a sequence of three commands: 1. png("[filename].png"); 2. hist([some variable]); 3. dev.off(). The file name must be a quoted string, and the file is only written once dev.off() is called. If you want a pdf file instead of a png image, you can substitute the command pdf in place of png (and use a .pdf file name). Similarly, we will work with various plot commands in addition to hist; just substitute your command in place of hist.
- Figure out how to make the plot pretty: change the default title and axis labels. Hint: use the help(hist) command to see descriptions of the arguments to hist.
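- Now make a histogram of the weight variable. Notice that hist chooses a default number of break points, because the weights don’t naturally fall into a nice number of bins like the number of cylinders. Redo the plot with 20 bins. Note that when you give hist a single number of breaks, it treats that number as a suggestion, so the plot may not have exactly 20 bins.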
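- Suppose we want to investigate how mpg changes over the years. Make a scatterplot of mpg vs year with the plot(cardata$year, cardata$mpg) command. The first argument goes on the horizontal axis and the second on the vertical axis.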
- Bonus: Add a horizontal line to the plot that shows where the overall mean of mpg is. Use the command abline(h=mean(cardata$mpg)). Try to make this line thicker than the default, and try to make it red (default is black).
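
The tasks above can be sketched end to end as follows. This is only one possible approach, not the official solution. Because the course data file is not reproduced here, the sketch builds a small stand-in data frame with made-up values; the column names cylinders, weight, year and mpg are assumed to match the week 3 names, and the output file names are arbitrary choices.

```r
# Stand-in data with made-up values, NOT the course car data.
# Column names are assumed to match the week 3 variable names.
set.seed(1)
n <- 100
cardata <- data.frame(
  cylinders = sample(c(4, 6, 8), n, replace = TRUE),
  weight    = round(runif(n, 1600, 5000)),
  year      = sample(70:82, n, replace = TRUE),
  mpg       = round(rnorm(n, mean = 23, sd = 6), 1)
)

# Histogram of cylinders with a custom title and axis labels, saved as a PNG.
# The file name is a quoted string; swap png() for pdf() to get a PDF instead.
out_file <- file.path(tempdir(), "cylinders_hist.png")
png(out_file)
hist(cardata$cylinders,
     main = "Cars by number of cylinders",
     xlab = "Cylinders",
     ylab = "Number of cars")
invisible(dev.off())  # closes the device and writes the file

# Histogram of weight with exactly 20 bins: passing 21 explicit break points
# forces the bin count, whereas breaks = 20 is only a suggestion to hist.
weight_breaks <- seq(min(cardata$weight), max(cardata$weight), length.out = 21)
png(file.path(tempdir(), "weight_hist.png"))
h <- hist(cardata$weight,
          breaks = weight_breaks,
          main = "Car weights",
          xlab = "Weight")
invisible(dev.off())

# Scatterplot of mpg against year, with a thick red line at the mean mpg.
png(file.path(tempdir(), "mpg_by_year.png"))
plot(cardata$year, cardata$mpg,
     main = "MPG by model year",
     xlab = "Year",
     ylab = "Miles per gallon")
abline(h = mean(cardata$mpg), col = "red", lwd = 3)  # lwd sets the line width
invisible(dev.off())
```

The files are written to R's temporary directory here so the sketch leaves nothing behind in your working folder; in your own work you would normally give png a file name in your project folder. The object h returned by hist holds the break points and counts, so length(h$counts) is a quick way to check how many bins were actually used.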