Stat programming workshop – tutorial for a very simple R choropleth map

Eye candy is the best way to get your poster noticed at a conference, and choropleth maps are some of the tastiest morsels. Here’s a quick tutorial on how to make one in R. It’s not a beginner topic, so bear with it. The approach below probably requires the least overhead in terms of new libraries or plotting paradigms, but it is also probably the least powerful method.
-Brian

  1. Install the maps library. The command install.packages("maps") usually does the trick, although I’m not too familiar with non-Linux environments.
  2. Load the necessary libraries and data objects. The last line loads the state names and some data to play with.
    library(maps)
    data(state)
  3. Make a basic map, just to see how it looks.
    map("state", fill=FALSE, boundary=TRUE, col="red")
  4. Now come the hard parts. Presumably you want to color each state according to some numerical variable. I’m going to use land area, which is called state.area (loaded in step 2). The problem is that the map command actually draws 63 regions that comprise the continental US, while our data vector has one entry for each of the 50 states.
  5. To get the 63 regions, make an invisible map.
    mapnames <- map("state", plot=FALSE)$names
  6. Note that each region’s name is either a state or a state followed by a colon and then a sub-region, such as an island. We’ll use the colon to isolate the state names.
    region_list <- strsplit(mapnames, ":")
    mapnames2 <- sapply(region_list, "[", 1)
  7. Now we need to match our data to the state names as they appear in the map.
    m <- match(mapnames2, tolower(state.name))
    map.area <- state.area[m]

    Now each of the 63 regions is associated with the land area corresponding to its state (except poor Washington, DC, which is missing from the original state.area variable).
  8. The next step is to define the colors for our map. Of the basic R palettes, I think the heat.colors() look best. I use 8 colors; use a higher number if you want more subtle distinctions. I also reverse the color vector so that higher land area states show up redder.
    clr <- rev(heat.colors(8))
  9. Now we need to collapse our map areas into bins, one per color.
    area.buckets <- cut(map.area, breaks=8)
  10. And finally, for all the glory…
    map("state", fill=TRUE, col=clr[area.buckets])
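
To see what steps 6 through 9 are doing, here is a minimal sketch that runs them on a handful of region names. The names are typed in by hand, in the style that map() returns, so the maps package isn't needed; sample_regions, n_bins and pal are made-up names for this illustration only.

```r
# Hand-typed sample of region names, in the style of
# map("state", plot=FALSE)$names (an assumption for this sketch).
sample_regions <- c("massachusetts:main",
                    "massachusetts:martha's vineyard",
                    "district of columbia",
                    "michigan:north",
                    "michigan:south",
                    "texas")

# Step 6: keep only the text before the colon.
state_part <- sapply(strsplit(sample_regions, ":"), "[", 1)

# Step 7: for each region, look up its position in state.name.
# match() gives NA when nothing matches (here: district of columbia).
idx <- match(state_part, tolower(state.name))
region_area <- state.area[idx]

# Steps 8-9: cut the areas into equal-width bins, one color per bin.
n_bins <- 3
pal <- rev(heat.colors(n_bins))
buckets <- cut(region_area, breaks = n_bins)
region_color <- pal[buckets]  # a factor indexes by its integer bin number

print(data.frame(region = sample_regions, state = state_part,
                 area = region_area, color = region_color))
```

Both Massachusetts regions end up with the same area and color, and the District of Columbia row is NA throughout. match() compares strings exactly, which is why the state names are lowercased first; if your own data comes back all NA, mismatched names are a likely place to look.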


3 Responses to Stat programming workshop – tutorial for a very simple R choropleth map

  1. Swetha says:

    Hi,

    Is there any way this sort of thing can be done with a world map? I tried using data(world.cities),
    but am finding it hard to access the data I want to show from another data frame.

    I was also wondering how to represent the data on the map. I am trying to use data that I
    have in a data frame, rather than data from the package and am struggling to get it to work.
    I am a little confused about what m is and what the line map.area <- state.area[m] is doing. I think
    the state.area is the data you want to represent, but I don't know what is happening exactly.
    This is where I got stuck with my example as I ended up with all NA values.

    Any help would be much appreciated.

    And thank you for the great post. R is such a powerful tool.

  2. Ari says:

    The choroplethr package in R is designed to simplify this process quite a bit: https://github.com/trulia/choroplethr

  3. Pingback: Broadgate Consultants » Blog Archive » Data Analysis – An example of using R
