In week 1 we looked at installing and running R, and in week 2 we focused on importing data. This week we start exploring how to write reusable R code.
- There are many advantages to doing statistical analysis with a programming language like R. Obviously, we use it to make the computer do what we want, but more importantly R enables us to write reusable code. This means three things: first, we want to be able to come back and run our programs long after we wrote them; second, we want to save time by not rewriting everything from scratch for each analysis; and third, we want our analyses to be reproducible by other researchers.
- Open R, and open the text editor of your choice. If you’re using R on Windows or Mac, you can use R’s built-in text editor. On Macs, go to File >> New Document. It should be similar on Windows. You can also use a separate text editor like Notepad (Windows), gedit (Linux), or TextEdit (Mac).
- In the new document, enter the code for opening the car MPG data (from week 2; you should have it saved as ‘cars.txt’): cardata = read.table('cars.txt')
- On the next line, set the names of the variables in the cardata data frame: names(cardata) = c('mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'year', 'origin', 'name')
- On the next line, compute and display a table that shows the unique values in the cylinders column and how many observations there are for each number of cylinders: print(table(cardata$cylinders)) Note that the $ syntax refers to a variable (column) by name within the cardata data frame. Type the quotes in all of these lines as plain straight quotes; R will not accept curly “smart” quotes.
- Save your program with a .R extension. For example, mine is called “mpg_analysis.R”.
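- Run your program. Switch back to the R command line and type source('[your program name]'), with the file name in quotes. For me, this is source('mpg_analysis.R'). You should see the table of cylinder values and counts come up. R looks for both the program and ‘cars.txt’ in its current working directory, so if it reports that it cannot open a file, check that directory with getwd().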
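
Put together, the steps above can be sketched as a single runnable example. So that it works on its own, it first writes a tiny stand-in ‘cars.txt’ into R's temporary folder; those three rows are made up for illustration and are not the week 2 data. The names work_dir and script_file are likewise just placeholders chosen for this sketch. In practice you would type the three program lines into your text editor, as described above, rather than writing them with writeLines().

```r
# Stand-in for 'cars.txt': three made-up rows in the nine-column layout
# used above (hypothetical values, NOT the real week 2 data).
work_dir <- tempdir()
writeLines(c(
  '18 8 307 130 3504 12.0 70 1 "car a"',
  '24 4 113 95 2372 15.0 70 3 "car b"',
  '27 4 97 88 2130 14.5 71 3 "car c"'
), file.path(work_dir, "cars.txt"))

# The program lines from the steps above, saved with a .R extension.
script_file <- file.path(work_dir, "mpg_analysis.R")
writeLines(c(
  "cardata = read.table('cars.txt')",
  "names(cardata) = c('mpg', 'cylinders', 'displacement', 'horsepower',",
  "                   'weight', 'acceleration', 'year', 'origin', 'name')",
  "print(table(cardata$cylinders))"
), script_file)

# Run the saved program. chdir = TRUE makes R evaluate the script from the
# folder it lives in, so the relative path 'cars.txt' resolves there.
source(script_file, chdir = TRUE)
```

With the stand-in rows, the printed table should report two 4-cylinder cars and one 8-cylinder car; with the real ‘cars.txt’ the counts will differ. Because the program lives in a file, running the analysis again later is just another call to source().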