This tutorial will show you how to use the free NHGIS service to obtain demographic data that you can use in GIS analysis and mapping.
The National Historical Geographic Information System (NHGIS), a service provided by the Minnesota Population Center at the University of Minnesota, is one of the best ways to access data from the Census Bureau. You likely want a shapefile that you can use for analysis in ArcGIS or QGIS. Data from NHGIS can also be adapted for web maps, interactive applications, and cartography.
Another advantage of NHGIS: It can be used to obtain data from as far back as the first U.S. Census in 1790. Not only does it provide the raw numbers, but you can also obtain shapefiles containing historical Census boundaries.
NHGIS can be intimidating to navigate. While the team behind it has done an excellent job balancing usability and comprehensiveness, demographic data is multifaceted and complex.
The goal of this tutorial will be to create a map of Milwaukee County (or some other county) showing the percentage of Black residents in each Census Tract using the most current data. The process for examining other locations, years, variables, and enumeration units will not be much different. Feel free to follow along by using the same data as our examples, or other data of your choosing. We will use ArcGIS Pro to work with the data we obtain.
The following tutorial is suitable for people with GIS experience. If you need additional help with a step, reference the beginner version of this tutorial.
AND operator if you are using more than one. In our example, the SQL statement was "US_tract_2010.STATEFP10" = '55' AND "US_tract_2010.COUNTYFP10" = '079'. Run the query.
Other ways to retrieve NHGIS Data:
The following tutorial is suitable for beginners. If you need concise instructions, please look here.
Watch this quick (2:22) video providing an overview of the data extract process on NHGIS.
You should have already set up an NHGIS profile as part of the workshop setup, but if not, please do so now.
On the NHGIS homepage, click the Browse and Select Data link on the left menu. Alternatively, you may click on the Get Data button on the homepage or the Select Data link on the top navigation menu. You will see a page similar to this:
These filters will allow us to tell NHGIS what specific data we want to add to our cart.
Select Geographic Levels to specify your enumeration unit. In our example, we want to obtain data on a Census Tract level.
Click on the green plus sign next to Census Tract. This will add that level to the Selected Geographic Level Filters area at the top of the page. Note that you can select as many units as you want, but this may add a layer of confusion to your process. Let's stick to Census Tract.
Click Submit at the bottom of the window.
Multiple rows will now appear on the page. There are far too many to look through with such a limited filter; let's filter further.
Click on Years.
If you only want to look at U.S. Census data, the Decennial Years column will be your only concern; U.S. Census data is collected every ten years. Some options will be greyed out, but you can still select them. The greyed-out options are years for which data is not available given the other filters you've selected. For example, since we chose Census Tracts as our geographic level, every decennial year prior to 1910 is grey. This is because Census Tracts were never used before 1910!
Select the year(s) you're interested in, and hit Submit. We'll use the most current decennial census data, which is from the 2010 Census.
The rows under Select Data will update once more, but there are still a lot of records.
Click on Topics to narrow results even further.
Find the topic(s) you wish to map in the list that pops up. Many researchers are concerned with Population data, but there are other categories that can be accessed using the tabs on the left. Again, options may be greyed out if data for those topics cannot be obtained given the other filters you have selected.
We're interested in data about the Black population in Milwaukee County, so we will select race as our topic.
The first plus sign next to Race is the Table Topic Filter; the second is the Breakdown Filter. You may click the green question mark next to each of these terms at the top of the table to learn more. For the purposes of this tutorial, we will use Table Topic Filter. Click on the first plus sign and Submit once more.
We could select Datasets to narrow down our search even further, but it is not necessary considering the specificity of the other three filters. Take a look at the dataset filters anyway and notice how many are now unavailable. That's because of our previous filters for geography, year, and topic.
Underneath the Apply Filters section is a table containing the results of your search. (You should be looking underneath the Source Tables tab, not Time Series Tables or GIS Boundary Files.) Chances are you will still have a significant number of rows to search, and on multiple pages too!
Many researchers want to map race data. If we click on the Popularity column heading, we can see that one row is much more popular than the others.
We want to select the most popular option, but first confirm what information the data table contains. Click on the table name, P3. Race, to open a window detailing the contents of this dataset.
We can now determine in the Universe field that this table measures the entire population of a geographic level. "Census Tract" is one of those available geographic levels. Finally, it looks like "Black or African American alone" is one of the variables available to us. In other words, this dataset contains the total population of people who identify as Black or African American in each Census Tract. That is exactly what we're looking for.
Close the Data Table Details window to return to the filter search results. Click on the green plus sign on the left to add this data to our Data Cart. You can select multiple datasets if you would like — this could be useful if, for instance, you are not entirely certain that you've selected the right dataset.
You may notice that we've been using the term "data table." That's because when you download data from NHGIS, it is delivered to you in the form of a table — the kind of thing you could open in Microsoft Excel or Google Sheets. It has no associated geographic coordinates, and it's definitely not a shapefile.
In the same area where you selected which dataset you want, click on the GIS FILES tab. If you've provided adequate Geographic Levels and Years filters, the list of results will be mercifully short. In fact, our query only returns two boundary files!
The boundary file that we are interested in is the 2010 TIGER/Line +. This boundary file contains all Census Tract boundaries used in the 2010 U.S. Census. We'll add it to our Data Cart by clicking on the green plus sign. The cart will update accordingly.
NHGIS will provide a data table and boundary shapefile, and it will be our job to join them. Our data will include every Census Tract in the U.S., but we are only concerned with Milwaukee County — so we will need to trim the data too.
But we're not going to worry about any of that until we actually have our data. Click Continue in the Data Cart, then click on it again. Keep all of the options as they are by default. You may want to add a brief description of your project, so you can find this dataset later to modify or re-download. Click on Submit to be brought to your Extracts History.
Note that the status of your new extract is queued. It takes a little time for NHGIS to process a simple dataset like ours; more complicated extracts will, of course, take longer. Take a break for a few minutes. By default, NHGIS will send you an e-mail once your data is ready for download. Refresh the page until your extract's status is completed.
You will need to download the table and gis as .ZIP files separately. Do so, and then extract them once they have finished. (If you're unable to extract your data from the .ZIP file, try 7-Zip.) Note that boundary shapefiles come in the form of a .ZIP file within another .ZIP file, and you'll need to extract both.
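If you prefer to script the extraction, the two-step unzip described above can be sketched in Python with the standard library. This is a minimal sketch, not part of the NHGIS workflow itself: the function name `extract_nested_zip` and the file names in the usage note are hypothetical, and it handles only one level of nesting.

```python
import zipfile
from pathlib import Path


def extract_nested_zip(outer_zip, dest_dir):
    """Extract outer_zip into dest_dir, then extract every .zip found inside it.

    Returns the paths of the files extracted from the inner archives.
    Only one level of nesting is handled.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Step 1: unpack the archive downloaded from the extract page.
    with zipfile.ZipFile(outer_zip) as outer:
        outer.extractall(dest)

    # Step 2: unpack each inner archive into a folder named after it.
    extracted = []
    for inner_path in sorted(dest.rglob("*.zip")):
        target = inner_path.with_suffix("")  # "tracts.zip" -> folder "tracts"
        with zipfile.ZipFile(inner_path) as inner:
            inner.extractall(target)
            extracted.extend(target / name for name in inner.namelist())
    return extracted
```

A call such as `extract_nested_zip("my_extract_shape.zip", "nhgis_data")` (both names are placeholders) should leave the .shp, .dbf, .shx, and .prj files in a subfolder of `nhgis_data`. Keep the downloaded archive outside the destination folder, since the sketch unpacks every .zip it finds there.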
If some time has passed and you still haven't received an e-mail to download your data, you can download the data from the AGSL's sharepoint site:
Click here to download sample data. (0.5 GB, compressed)
The ability to easily perform joins in GIS software is one of the advantages of using NHGIS, and it's a big one.
First take a look at the table you downloaded. Along with the .CSV file containing the data itself, you received a codebook (.TXT file) in the same directory. Unfortunately, the columns in the data table you just downloaded are often not self-explanatory: for this project, there is no clearly labeled "% Black or African American" column. This codebook is a necessary part of understanding your data. Open it and take a look.
You can quickly glance at the Context Fields. What we're mostly interested in is the part of the document starting with "Breakdown."
Our variable of interest, "Black or African American alone," is associated with the column heading H7X003. Because we want to map the Black population as a percentage of the total population in a Census Tract, we should also know that H7X001 is the column for the total population in a tract.
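The percentage itself is simple arithmetic: divide H7X003 by H7X001 and multiply by 100. The sketch below shows that calculation in Python, assuming each row arrives as a dictionary of strings (as `csv.DictReader` would produce). The function name `percent_black` and the sample counts are made up for illustration; the one detail worth noting is the guard for tracts with a total population of zero.

```python
def percent_black(row, black_col="H7X003", total_col="H7X001"):
    """Return Black residents as a percentage of total population.

    Returns None when the tract has no population, to avoid dividing by zero.
    """
    total = int(row[total_col])
    if total == 0:
        return None
    return 100.0 * int(row[black_col]) / total


# Illustrative rows with made-up identifiers and counts.
rows = [
    {"GISJOIN": "EXAMPLE_A", "H7X001": "2000", "H7X003": "500"},
    {"GISJOIN": "EXAMPLE_B", "H7X001": "0", "H7X003": "0"},
]
for row in rows:
    print(row["GISJOIN"], percent_black(row))
```

In ArcGIS Pro the same calculation would typically go into a new field, with the zero-population tracts handled separately.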
Hint: Remember to check the Universe!
Take the time to note all of the columns that you will want to use.
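If a table has many columns, you could pull the code-to-label pairs out of the codebook with a short script instead of copying them by hand. The sketch below is hedged: the sample text is an illustrative excerpt modeled on the codebook layout, not copied from a real extract, and the pattern assumes that each column line is indented and that the code ends in three digits followed by a colon. Check both assumptions against your own codebook.

```python
import re

# Illustrative excerpt modeled on the codebook layout (an assumption, not real output).
SAMPLE_CODEBOOK = """\
Table 1:     Race
Universe:    Total population
NHGIS code:  H7X
        H7X001:      Total
        H7X003:      Black or African American alone
"""

# Assumed shape of a column line: indentation, a code ending in three digits, a colon, a label.
COLUMN_LINE = re.compile(r"^\s+([A-Z0-9]+\d{3}):\s+(.+?)\s*$")


def column_labels(codebook_text):
    """Map each column code found in the codebook text to its label."""
    labels = {}
    for line in codebook_text.splitlines():
        match = COLUMN_LINE.match(line)
        if match:
            labels[match.group(1)] = match.group(2)
    return labels


print(column_labels(SAMPLE_CODEBOOK))
```

For a real codebook you would read the .TXT file into a string first, for example with `Path(...).read_text()`, and pass that string to `column_labels`.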
Keep both the boundary file and data table you downloaded somewhere on your computer where you'll be able to find them later. Your downloads folder or desktop are good options.
It's time to open ArcGIS Pro. Note that what we'll be doing can be done in pretty much any GIS software, such as the free QGIS.
We'll work on a new, blank map template. Assign the project any name you want. Navigate to the Map tab. Click the Add Data button and select Data to add both the .CSV table and shapefile you downloaded from NHGIS to your document.
This next part will require some patience. Your boundary shapefile contains boundaries from the entire United States, so ArcGIS may take a little while to render it all. If your computer is having difficulty processing so much data, you can temporarily pause map drawing on the bottom left corner of the map view.
We need to associate the data in the .CSV data table with the boundaries in our shapefile. This will be done by joining the tables. In GIS, the term join has a very specific meaning. In our case, it involves finding a column that both our .CSV data table and our shapefile's attribute table have in common, and merging the two into one big table.
If you would like a refresher on joins, check out the ArcGIS documentation: About joining and relating tables.
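To make the idea of a join concrete, here is a minimal Python sketch of what happens conceptually: each boundary record is matched to the table row that shares its key, and the two sets of columns are merged. This illustrates the concept only, not how ArcGIS implements it; the function name `join_on_key`, the identifiers, and the counts are all hypothetical.

```python
def join_on_key(features, table_rows, key="GISJOIN"):
    """Attach the matching table row's columns to each feature.

    Features with no matching row are kept unchanged (a left join).
    """
    lookup = {row[key]: row for row in table_rows}
    joined = []
    for feature in features:
        match = lookup.get(feature[key], {})
        joined.append({**feature, **match})
    return joined


# Made-up records standing in for the shapefile's attribute table and the .CSV table.
boundaries = [
    {"GISJOIN": "EXAMPLE_A", "NAME10": "1"},
    {"GISJOIN": "EXAMPLE_B", "NAME10": "2"},
]
table = [{"GISJOIN": "EXAMPLE_A", "H7X001": "2000", "H7X003": "500"}]

for record in join_on_key(boundaries, table):
    print(record)
```

The second boundary record has no match in the table, so it comes through without the population columns. Keeping unmatched features is a choice made for this sketch; it makes missing data easy to spot.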
Right click on one of the file names in the contents pane — in our example, US_tract_2010 and nhgis0017_ds172_2010_tract.csv — and open the (attribute) table. Thoroughly examining the table of any data you download is good practice. Do you see which field we're going to use to join the tables?
NHGIS very conveniently includes a GISJOIN column in all of its data. In other words, they've done a big part of the hard work for us! Right-click on the shapefile name (for us, US_tract_2010), hover over Joins and Relates, and click on Add Join.
For the Input Table field, US_tract_2010 should be selected. If it is not, you can use the drop-down menu to add it manually. Next, set Input Join Field to GISJOIN using the drop-down menu. The Join Table drop-down should already have your data table selected, but if not, select it. Finally, the last field, Join Table Field, asks which field you're going to base the join on. Again, that would be GISJOIN. Your window should look something like this:
Open the attribute table for your boundary shapefile once more. Scroll all the way to the right. These columns should be the same ones you examined in the codebook earlier. Congratulations — all of your data is now in one convenient (and, frankly, huge) shapefile!
If you want to make a map of the entire United States, you're pretty much ready to go. If not, we still have some work to do. Regardless, take some time to appreciate the fact that you can now, with very little effort, make a detailed thematic map of the whole country. And you could do it with anything in the U.S. Census. Hopefully you feel that all of the effort you put into getting here was worth it!
There are a few ways to trim your data so it only includes your area of interest. The easiest method uses FIPS codes, which are already included in NHGIS data.
Check out the data table for your shapefile. You will find two columns: one begins with STATEFP, the other with COUNTYFP. They will be followed by a number; in our case, this is 10, because we are using data from 2010. Each state and county in the United States has a unique combination of FIPS codes. A simple way to find the FIPS code for an area is to use the official U.S. Census website: 2010 FIPS Codes for Counties and County Equivalent Entities. The FIPS code for Wisconsin is 55; for Milwaukee County, 079.
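Note the leading zero in 079: FIPS codes are identifiers rather than quantities, which is why they appear as quoted text in the query later on. If your codes ever come in as plain numbers (from a spreadsheet, for instance), a small helper like the hypothetical `fips_text` below restores the padding, assuming two digits for states and three for counties.

```python
def fips_text(state, county):
    """Return state and county FIPS codes as zero-padded strings."""
    return f"{int(state):02d}", f"{int(county):03d}"


print(fips_text(55, 79))       # ('55', '079')
print(fips_text("55", "079"))  # already-padded text passes through unchanged
```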
Click on the icon on the top left of the attribute table window, then select Select by Attributes. Alternatively, you can use Select by Attributes under the Map tab in the Selection group.
This is where you will tell ArcGIS which data you want. Leave the Selection type as the default, New selection. Click on the New expression button with the green + sign next to it. In the first drop-down menu of the Where clause, find STATEFP10 (it may be preceded by your shapefile name) and select it. The second drop-down menu should be set to is equal to. In the third drop-down menu, we're looking for Wisconsin's code, 55. Find it toward the bottom of the list and select it. Your WHERE statement in the expression box should now look like the image below.
We are interested in all Census Tracts that have both the Wisconsin state FIPS code and the Milwaukee County FIPS code. This means our query will include the word AND, which means all results of our query must match both FIPS codes. In SQL, AND is a logical operator: it combines two conditions and returns only the records that satisfy both. For more information, see SQL AND & OR Operators at W3Schools.
Complete the second half of the query by clicking the Add clause button below the expression you just created. Leave the first drop-down with the value of And. Now use the same method above but with the COUNTYFP field instead, using the FIPS code 079. If you did it correctly, your final expression should look something like this:
Alternatively, you may use the SQL button to write an SQL query that will perform this task. The final statement should look something like this:
US_tract_2010.STATEFP10 = '55' And US_tract_2010.COUNTYFP10 = '079'
ArcGIS will read this query like instructions, in much the same way you are reading this tutorial. In plain English, what you have effectively told the software is: select all of the records in the US_tract_2010 shapefile where the state FIPS code is 55 and the county FIPS code is 079. If the state FIPS code for a Census Tract is 55, but the county FIPS code is not 079, that Tract will not be included.
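The same logic can be written as a short Python filter, which may help if you think in code rather than SQL. This is an illustration of the AND condition only, not something ArcGIS runs; the function name `select_tracts` and the sample records are hypothetical.

```python
def select_tracts(records, state_fips, county_fips):
    """Keep only the records whose state AND county FIPS codes both match."""
    return [
        r for r in records
        if r["STATEFP10"] == state_fips and r["COUNTYFP10"] == county_fips
    ]


# Made-up records: only the first matches both codes.
records = [
    {"STATEFP10": "55", "COUNTYFP10": "079", "NAME10": "tract A"},
    {"STATEFP10": "55", "COUNTYFP10": "025", "NAME10": "tract B"},  # right state, wrong county
    {"STATEFP10": "27", "COUNTYFP10": "079", "NAME10": "tract C"},  # right county code, wrong state
]

for r in select_tracts(records, "55", "079"):
    print(r["NAME10"])
```

The third record shows why both conditions are needed: county codes repeat from state to state, so 079 alone does not identify Milwaukee County.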
Click Apply. It may be difficult to see, but ArcGIS has selected the entirety of Milwaukee County. Zoom and pan your map to see for yourself.
As you may notice, however, the rest of the U.S. is still present. Indeed, selecting alone does not isolate your data. Right-click on your shapefile name in the Table of Contents on the left, hover over Selection, and click on Make Layer From Selected Features. Remember this option — it will save you a lot of time and energy throughout your GIS career!
You should have a new layer listed on your Table of Contents; it will be called something like "US_tract_2010 selection." Go ahead and remove your original shapefile and data table (right click them and select Remove), since all of our data is now concentrated in this brand new layer. Zoom in a bit if necessary; you will notice that you have isolated your area of interest!
You could save this layer as its own shapefile, or create new selections to repeat the process for the entire state or different counties.
You have the data you need. The rest is in your hands. Symbolization, projection, ancillary map elements — these are all things you will need to consider when constructing your final map. If you are enrolled in a GIS class, you will undoubtedly learn the skills to symbolize your map to your needs and liking.
Other ways to retrieve NHGIS Data:
This is just one way to get data from the Census Bureau and one of many ways to download data from data.census.gov. If you want to download census data in a format ready for GIS software, such as a shapefile, we strongly recommend the National Historical Geographic Information System (NHGIS.org).
This tutorial shows you the steps to download Median Household Income data from the 2020 American Community Survey 5-year estimates at the Census Tract geography for Milwaukee County, Wisconsin. Follow along or choose different data if you wish.
1. Navigate to https://data.census.gov/cedsci/
2. Click "Advanced Search"
3. Choose "Topics" underneath "Find a Filter"
4. Choose "Income and Poverty" > "Income and Earnings" > "Income (Households, Families, Individuals)"
5. Choose "Geography" underneath "Find a Filter"
6. Choose "Tract" > "Wisconsin" > "Milwaukee County, Wisconsin" > "All Census Tracts within Milwaukee County, Wisconsin"
7. Click the blue "SEARCH" icon in the lower right-hand corner.
8. In the search results, click the "+" icon underneath "S1903 | MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2020-INFLATION-ADJUSTED DOLLARS)" and click the blue arrow icon next to "2020: ACS 5-Year Estimates Subject Tables"
9. You may come across a message stating that the table is too large to display. In this case, you can either:
Research using NHGIS data should cite it as:
Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS. 2019. http://doi.org/10.18128/D050.V14.0