R Lesson 22: US Mapmaking with R (pt. 1)

Hello everybody,

Michael here, and today’s post (which is my last post for 2020) will be a lesson on basic US mapmaking with R. Today we’ll focus only on US maps, but don’t worry, I intend to do a post on global mapmaking with R later on.

Before we get started mapmaking, install two packages in R: ggplot2 and usmap. Once both packages are installed, load them with library(ggplot2) and library(usmap), then run this code and see the output:

plot_usmap(regions = "states") + labs(title="US States") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

As you can see, we have created a basic map of the US (with Alaska and Hawaii), complete with a nice blue ocean.
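If you want the setup step in script form, here is a minimal sketch of it. The loop and the requireNamespace() check are my own additions (they are not required by usmap); they simply skip the install when a package is already present.

```r
# Install each package only if it is missing, then load both.
for (pkg in c("ggplot2", "usmap")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

library(ggplot2)  # provides labs(), theme(), element_rect()
library(usmap)    # provides plot_usmap() and fips()
```

With both packages loaded, every plot_usmap() call in this post can be run as written.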

However, this isn’t the only way you can plot a basic map of the US. The regions parameter has four different options for plotting the US map: states (which I just plotted), state, counties, and county.

Let’s see what happens when we use the state option:

plot_usmap(regions = "state", include=c("TN")) + labs(title="Tennessee") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

In this example, I used the state option for the regions parameter to create a plot of the state of Tennessee (but left everything else unaltered).

How did I manage to get a plot of a single state? The plot_usmap function has several optional parameters, one of which is include. To plot the state of Tennessee, I passed a vector to the include parameter that consisted of a single element: "TN".

  • Whenever you want to plot a single state (or several states), I recommend using the state’s two-letter postal code rather than its full name; that is what every example in this post does.
  • Another parameter is exclude, which allows you to exclude certain states from a multi-state map plot (I’ll discuss multi-state map plots later).
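To make the exclude parameter concrete, here is a small sketch that drops Alaska and Hawaii from the national map. The variable name lower_48 and the plot title are just my choices for illustration.

```r
library(ggplot2)
library(usmap)

# Start from the full states map and remove two states by postal code.
lower_48 <- plot_usmap(regions = "states", exclude = c("AK", "HI")) +
  labs(title = "Contiguous US States") +
  theme(panel.background = element_rect(color = "black", fill = "lightblue"))

# Printing the object draws the map.
print(lower_48)
```

The exclude vector takes the same two-letter postal codes as include.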

Awesome! Now let’s plot our map with the counties option:

plot_usmap(regions = "counties") + labs(title="US Counties") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

This map looks just like the first map, except it shows all of the county lines in each state.

Last but not least, let’s plot our map with the county option:

plot_usmap(regions = "county", include=c("Davidson County")) + labs(title="Davidson County") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

In this example, I attempted to create a plot of Davidson County, TN, but that didn’t work out. The plot didn’t work because, even though I told R to include Davidson County in the plot, R didn’t know which state Davidson County was in, as there are two counties named Davidson in the US: one in Tennessee and another in North Carolina.

This shows you that using the county name alone when using the county argument for the regions parameter won’t work, since there are often multiple counties in different states that share the same name. The most common county name in the US is Washington County, which is shared by 31 states.

So, how do I correctly create a county plot in R? First, I would need to retrieve the county’s FIPS code.

To give you some background, FIPS stands for Federal Information Processing Standards, and FIPS codes are 2- or 5-digit codes that uniquely identify states or counties. State FIPS codes have 2 digits and county FIPS codes have 5 digits; the first two digits of a county FIPS code are the corresponding state’s FIPS code. Here’s an example of county FIPS codes using the two Davidson Counties I discussed earlier:

> fips(state="TN", county="Davidson")
[1] "47037"
> fips(state="NC", county="Davidson")
[1] "37057"

In this example, I printed out the county FIPS codes for the two Davidson Counties. The FIPS code for Davidson County, TN is 47037; it starts with 47 because Tennessee’s FIPS code is 47. Similarly, the FIPS code for Davidson County, NC is 37057; it starts with 37 because North Carolina’s FIPS code is 37.
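You can check the "first two digits are the state code" rule yourself. Here is a quick sketch using base R’s substr() together with the same fips() function; the variable names are mine.

```r
library(usmap)

# County FIPS code for Davidson County, TN (a 5-character string).
davidson_tn <- fips(state = "TN", county = "Davidson")

# State FIPS code for Tennessee (a 2-character string).
tennessee <- fips(state = "TN")

# The first two characters of the county code should match the state code.
state_part <- substr(davidson_tn, 1, 2)
print(state_part == tennessee)
```

Note that fips() returns character strings, not numbers, so leading zeros in codes are preserved.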

Now that we know the FIPS code for Davidson County, TN, we can create a plot for the county. Here’s the code to do so:

plot_usmap(regions = "county", include=c(fips(state="TN", county="Davidson"))) + labs(title="Davidson County, TN") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

When I create a map plot of an individual US county, I get the shape of the county.

  • A more readable way to write the code for this plot would have been plot_usmap(regions = "county", include=c(davidson_fips)) + labs(title="Davidson County, TN") + theme(panel.background = element_rect(color="black", fill = "lightblue")) where davidson_fips is a variable created beforehand with davidson_fips <- fips(state="TN", county="Davidson") .

So how can we get a map of several US counties, or rather, a state map broken down by counties? Here’s the code to do so:

plot_usmap(regions = "counties", include=c("TN")) + labs(title="Tennessee's 95 counties") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

To create a state map broken down by counties, set regions to counties and set the include parameter to include the state you want to plot (TN in this case). As you can see, I have created a map plot of the state of Tennessee that shows all 95 county boundaries in the state.

What if you wanted to plot several states at once? Well, the usmap package has built-in region vectors (as defined by the US Census Bureau) that you can pass to include to plot a region consisting of several states. The regions you can plot include:

  • .east_north_central: Illinois, Indiana, Michigan, Ohio, and Wisconsin
  • .east_south_central: Alabama, Kentucky, Mississippi, and Tennessee
  • .midwest_region: any state in the East North Central and the West North Central regions
  • .mid_atlantic: New Jersey, New York, and Pennsylvania
  • .mountain: Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, and Wyoming
  • .new_england: Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont
  • .northeast_region: any state in the New England or Mid-Atlantic regions
  • .north_central_region: any state in the East and West North Central regions
  • .pacific: Alaska, California, Hawaii, Oregon, and Washington
  • .south_atlantic: Delaware, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, Washington DC, and West Virginia
  • .south_region: any state in the South Atlantic, East South Central, and West South Central regions
  • .west_north_central: Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, and South Dakota
  • .west_region: any state in the Mountain and Pacific regions
  • .west_south_central: Arkansas, Oklahoma, Louisiana, and Texas
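Since each of these regions is an ordinary character vector of postal codes, you can inspect and combine them like any other vector. Here is a short sketch; the names midwest_from_parts, same_midwest, and custom_region are made up for this example, and the comparison assumes the larger regions are built from the divisions exactly as listed above.

```r
library(usmap)

# Printing a region shows the postal codes it contains.
print(.east_south_central)

# Rebuild the Midwest from its two divisions and compare it
# with the built-in vector (ignoring order).
midwest_from_parts <- c(.east_north_central, .west_north_central)
same_midwest <- setequal(.midwest_region, midwest_from_parts)
print(same_midwest)

# Regions can be combined with c() to make a custom multi-state group.
custom_region <- c(.new_england, .mid_atlantic)
print(custom_region)
```

A vector like custom_region can be passed to the include parameter just like the built-in regions.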

Let’s plot out a simple region map for the .east_south_central region. Here’s the code to do so:

plot_usmap(include = .east_south_central) + labs(title="East South Central US", size=10) + theme(panel.background = element_rect(color="black", fill = "lightblue"))

Simple enough, right? All I did was set the include parameter to .east_south_central.

  • Remember to always include the dot in front of the region name so R reads it as one of the built-in regions in usmap; without the dot, R will look for an object that doesn’t exist (or, if you put the name in quotes, treat it as an ordinary character string), which will generate errors in your code.

Now let’s break up the region map by counties. Here’s the code to do so:

plot_usmap(regions = "counties", include = .east_south_central) + labs(title="East South Central US", size=10) + theme(panel.background = element_rect(color="black", fill = "lightblue"))

To show all of the county lines in a specific region, simply set the regions parameter to counties. Also (and you probably noticed this already), if you don’t set a value for the regions parameter, regions defaults to states.
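The include and exclude parameters can also be used together on a multi-state plot. Here is a sketch that plots the East South Central region without Kentucky; the choice of state to drop and the variable name are arbitrary, and I’m assuming (per the usmap documentation as I understand it) that include is applied first and exclude is then removed from the result.

```r
library(ggplot2)
library(usmap)

# Plot the region by state, then drop one state from it.
esc_without_ky <- plot_usmap(regions = "states",
                             include = .east_south_central,
                             exclude = c("KY")) +
  labs(title = "East South Central US (without Kentucky)") +
  theme(panel.background = element_rect(color = "black", fill = "lightblue"))

print(esc_without_ky)
```

Swapping regions = "states" for regions = "counties" should give the same three states broken down by county.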

OK, so I’ve covered the basics of US map plotting with the usmap package. But did you know you can display state and county names on the map plot? Here’s the code to add state name labels to a map plot of the whole US:

plot_usmap(regions = "states", labels=TRUE) + labs(title="US States") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

The code I used to create this map plot is almost identical to the code I used to create the first map plot, with one major exception: I included a labels parameter and set it to TRUE. If the labels parameter is set to TRUE, a label will be displayed on each state (the label being the state’s 2-letter postal code).

Now let’s display county names on a map, using the state map of Tennessee. Here’s the code to do so:

plot_usmap(regions = "counties", labels=TRUE, include=c("TN")) + labs(title="Tennessee's 95 counties") + theme(panel.background = element_rect(color="black", fill = "lightblue"))

As you can see, by setting labels to TRUE, I was able to include all of Tennessee’s county names on the map (and most of them fit quite well, though there are a few overlaps).

Thanks for reading,

Michael

Also, since this is my last post of 2020, thank you all for reading my content this year. I know it’s been a crazy year, but I hope you all learned something from my blog in the process. Have a happy, healthy, and safe holiday season, and I’ll see you all in 2021 with brand new programming content (including a part 2 to this lesson)!