Accessing the US Census API

Nov 7, 2017 00:00 · 3073 words · 15 minutes read R US Census

Introduction

Since 2010, the US Census Bureau's American Community Survey (ACS) has replaced the long form of the decennial census, with the aim of releasing updated population statistics in a timely manner. The ACS is an ongoing survey in which approximately 295,000 addresses are sampled per month, and the results are released annually as 1-year and 5-year series (RIP 3-year series). Although the ACS is not as well known as its older sibling, the decennial census, many entities in both the private and public sectors rely on it for important functions such as resource allocation, emergency planning and trendy visualizations.

So how do I get this data?

The US Census designates data centers across the country to help potential users understand and access its data using tools such as the American FactFinder. One example is the workshops offered in Hawaii by the Department of Business, Economic Development & Tourism. A preferable way to obtain analysis-ready ACS data is to query the US Census API from R (so hot right now) using one of the packages created by smart people like Ezra Glenn, Sebastian Daza and Hannah Recht.

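# Install censusapi from GitHub if it is not already installed
# devtools::install_github("hrecht/censusapi")
library(censusapi)
# Store the API key in the CENSUS_KEY environment variable, which getCensus() reads by default
my_api_key <- "input your API key here"
Sys.setenv(CENSUS_KEY=my_api_key)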

library(data.table)
library(stringr)
library(rgdal)
library(proj4)
library(spdep)
library(maptools)
library(sf)

library(ggplot2)
library(viridis)
library(leaflet)
library(scales)
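
Hard-coding the key in a script makes it easy to leak when the script is shared. Here is a minimal sketch of an alternative, assuming the key was stored beforehand (for example in `~/.Renviron`) under the same `CENSUS_KEY` name used above. The helper name `census_key_or_stop` is hypothetical and not part of censusapi.

```r
# Hypothetical helper: return the Census API key from the environment,
# or stop with a readable message if it has not been set.
census_key_or_stop <- function(var = "CENSUS_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}
```

Calling `census_key_or_stop()` at the top of a script fails early, before any request is sent, if the key is missing.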

Data

GIS Data

The GIS data on tract areas was provided by the State of Hawaii Office of Planning.

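# Download the zipped tract shapefile to a temporary file and unzip it into the working directory
tmp <- tempfile()
download.file("http://files.hawaii.gov/dbedt/op/gis/data/tracts10.shp.zip", tmp)
unzip(tmp)
# Read the tract boundaries as an sf object
shp <- st_read("tracts10.shp")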
## Reading layer `tracts10' from data source `C:\Hugo\Sites\thomasyokota.com\content\post\tracts10.shp' using driver `ESRI Shapefile'
## Simple feature collection with 325 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 371108.6 ymin: 2094231 xmax: 940276.1 ymax: 2458640
## epsg (SRID):    26904
## proj4string:    +proj=utm +zone=4 +datum=NAD83 +units=m +no_defs

ACS Data

Table B19113: MEDIAN FAMILY INCOME IN THE PAST 12 MONTHS (IN 2015 INFLATION-ADJUSTED DOLLARS).

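# Request the estimate (suffix E) for line 001 of table B19113
# for every census tract in every county of Hawaii (state FIPS code 15)
df <- getCensus(name="acs5", vintage=2015,
                vars=c("GEOID", "NAME", paste('B19113_', sprintf('%03i', seq(1, 1)), 'E', sep='')),
                region="tract:*",
                regionin="state:15+county:*")
# Strip the summary-level prefix so GEOID matches GEOID10 in the shapefile
setDT(df)[, GEOID := str_replace(GEOID, "14000US", "")]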
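
The `vars` argument and the GEOID clean-up above both come down to simple string handling. The sketch below uses base R only to show how a variable name is assembled (table ID, a zero-padded three-digit line number, and the `E` suffix for estimates) and how the prefix is stripped from a tract identifier. The helper names and the sample GEOID value are made up for illustration.

```r
# Build ACS estimate variable names such as "B19113_001E"
acs_vars <- function(table, lines) {
  paste0(table, "_", sprintf("%03i", lines), "E")
}

# Remove the "14000US" prefix, leaving state (2) + county (3) + tract (6) digits
strip_geo_prefix <- function(geoid) {
  sub("^14000US", "", geoid)
}

acs_vars("B19113", 1)                    # "B19113_001E"
strip_geo_prefix("14000US15003000100")   # "15003000100"
```

Anchoring the pattern with `^` removes the prefix only at the start of the string, which is slightly stricter than the unanchored `str_replace` call.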

Leaflet

shp <- merge(shp, df, by.x="GEOID10", by.y="GEOID", all.x=FALSE)
shp <- st_transform(shp, "+proj=longlat +datum=WGS84")
shpf <- fortify(shp, region='GEOID10')
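
Because `all.x=FALSE` keeps only tracts that have a matching ACS row, tracts can drop out of the map without any warning. Here is a small sketch, in base R with toy identifiers rather than the real tract data, of checking the key overlap before merging. `unmatched_keys` is a hypothetical helper.

```r
# Hypothetical helper: list the keys that appear on only one side of a join
unmatched_keys <- function(x_keys, y_keys) {
  x_keys <- as.character(x_keys)  # also handles factor columns
  y_keys <- as.character(y_keys)
  list(
    only_in_x = setdiff(x_keys, y_keys),
    only_in_y = setdiff(y_keys, x_keys)
  )
}

# Toy identifiers for illustration only
shp_ids <- c("15001020100", "15001020200", "15003000100")
acs_ids <- c("15001020100", "15003000100", "15009030100")
unmatched_keys(shp_ids, acs_ids)
```

With the real objects, the equivalent check would be `unmatched_keys(shp$GEOID10, df$GEOID)`, run before the merge. Two empty vectors mean every tract has a partner.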

pal1 <- colorNumeric(
  palette = "RdYlBu",
  domain  = shp$B19113_001E,
  reverse = TRUE)

popup1 <- paste0("<span style='color: #7f0000'><strong>Median Family Income</strong></span>",
                 "<br><span style='color: salmon;'><strong>Tract Name: </strong></span>", 
                 shp$Name, 
                 "<br><span style='color: salmon;'><strong>Estimate: </strong></span>", 
                 dollar(shp$B19113_001E))

leaflet(shp) %>% 
  addProviderTiles(providers$Esri.NatGeoWorldMap, 
                   group="ESRI",
                   options=providerTileOptions(opacity=0.35)) %>%
  addProviderTiles(providers$Stamen.TonerLines,
                   group="Toner") %>%
  addProviderTiles(providers$Stamen.TonerLabels,
                   group="Toner") %>%
  addPolygons(fillColor=~pal1(B19113_001E),  
              fillOpacity=0.6,
              color="darkgrey",
              weight=1.5,
              popup=popup1,
              group="<span style='color: #7f0000; font-size: 11pt'><strong>Median Family Income</strong></span>") %>%
  addLayersControl(
    baseGroups=c("Toner", "ESRI"),
    overlayGroups=c(
    "<span style='color: #7f0000; font-size: 11pt'><strong>Median Family Income</strong></span>"
    ),
    options=layersControlOptions(collapsed=FALSE)) %>%
  # hideGroup(c(
  #   "<span style='color: #7f0000; font-size: 11pt'><strong>Median Family Income</strong></span>")
  #   ) %>%
  addLegend("bottomright", 
            pal=pal1, 
            values=~B19113_001E,
            title="Est. Median Family Income",
            labFormat=labelFormat(
              prefix='$', suffix='', between=', '
            ),
            opacity=1,
            group="<span style='color: #7f0000; font-size: 11pt'><strong>Median Family Income</strong></span>")