Stata - Making Maps

Articles: Pisati, M. (2004). Simple thematic mapping. The Stata Journal 4(4): 361-378.
Websites: How do I graph data onto a map with tmap?

For advanced mapping and even some spatial applications, Stata isn't the best software to use.  For mapping, I use ESRI's ArcGIS because it's the industry standard and I haven't found a reliable open source package yet.  For spatial analysis, you might consider GeoDa or R.  With those disclaimers in mind, there are still some cool things you can do in Stata using tmap (or spmap) and other commands.  To start, type:

. ssc install tmap
. ssc install shp2dta

Now you've got the tmap command installed to create maps in Stata, along with shp2dta, which converts a *.shp file to *.dta (there's also mif2dta for MapInfo files).  On my computer, these add-ons are installed under C:\ado\plus, in subfolders named for each command's first letter (for example, C:\ado\plus\t for tmap).  To plot any relationships, you still need the underlying map shapefiles.
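
If you want to confirm where the packages ended up on your own machine, a quick check is sketched below.  The paths reported will differ by installation; the commands only report locations and change nothing.

. which tmap       /* path of the installed tmap.ado */
. which shp2dta    /* path of the installed shp2dta.ado */
. sysdir           /* lists Stata's system directories, including PLUS */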

If your map files are in SHP rather than DTA format, convert them first.  Let's look for a map of counties in Florida.  Search Google for "Florida counties .shp".  When I ran the search (results may differ later as the index changes), the 2000 US Census was the first hit.  Download the Florida shapefile from them.  Unzip the files and place them in the directory you'll be using.  For instance,

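. pwd
. mkdir "c:\maps_Florida"
. copy "<downloaded file>" "c:\maps_Florida", replace  /* source file name is not given here; substitute your own */
. !del co12_d00.*  /* del is a Windows shell command, so it needs the ! prefix */
. cd  "c:\maps_Florida"
. !pkunzip -e <zip file>  /* works only if you've got pkunzip.exe */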

The quotes around file and directory names matter when a name contains a space.  There are no spaces in this case, but it's good practice to use them anyway.  Next, convert the SHP file to DTA files.

. shp2dta using "co12_d00.shp", database(florida) coordinates(florida_coord) genid(id)

You have successfully created a polygon map!  The map file is "florida_coord.dta" and the data file is "florida.dta."  By design, the two are linked by the polygon identifier: "id" in the data file, stored as "_ID" in the map file.  Examine the data file, noting how the rows are grouped by "STATE" and then "COUNTY."  Do you see any problem?

. use "florida.dta", clear  /* shp2dta writes the files but does not load them */
. browse
. duplicates list COUNTY

Each polygon has its own "id," but some "COUNTY" codes appear more than once because a county can be made up of several polygons (islands, for instance).  In this example, that is not a problem.  Because every shape on the map must be a closed polygon, eliminating the duplicates would remove some of the closed polygons for the Florida Keys and other areas.  The moral of this lesson is to be careful about eliminating things without knowing what they represent!  If they really were duplicates, we could collapse over "COUNTY" and use the centroid of the polygons as the (x,y) coordinates to run tmap propsymbol.
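
The centroid idea can be sketched roughly as follows.  This is only an illustration under several assumptions: it reruns shp2dta with its gencentroids() option (the stub name "c" is arbitrary and yields x_c and y_c), it uses a made-up placeholder variable x, and it averages the polygon centroids within each county, which is a crude stand-in for a true county centroid.  The xcoord() and ycoord() option names are as I understand them from the tmap help file, so check "help tmap" before relying on them.  The preserve/restore pair keeps the collapse from disturbing the data used in the rest of this page.

. shp2dta using "co12_d00.shp", database(florida) coordinates(florida_coord) genid(id) gencentroids(c) replace
. use "florida.dta", clear
. preserve
. bysort COUNTY: gen npoly = _N                      /* number of polygons per county */
. list COUNTY id npoly if npoly > 1, sepby(COUNTY)   /* counties split across several polygons */
. gen x = mod(id,4)                                  /* placeholder values, not real data */
. collapse (mean) x_c y_c (first) x, by(COUNTY)      /* one row per county, averaged centroids */
. tmap propsymbol x, xcoord(x_c) ycoord(y_c) map("florida_coord.dta")
. restore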

Another thing we need is a variable to map, such as population size, income, or race.  Let's create a fictitious one and plot it as a choropleth.

. gen x=mod(id,4)
. tmap choropleth x, map("florida_coord.dta") id(id)
. graph export "floridamap.eps", replace
. !epstopdf floridamap.eps

For LaTeX users, the last step uses Stata's shell escape (!) to convert the EPS graphic into a PDF; it requires epstopdf to be available on your system path.  If you work in MS Word, use a graphics program (GIMP or Photoshop) to save it as a JPG or other format.  As a challenge, see if you can get your final picture to look like the one on the right.  Add white borders and you'll see the Florida Keys disappear because they're so thin.
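
For readers who prefer spmap, one possible version of the same map is sketched below.  It assumes "florida.dta" is in memory with the x variable created above; the color scheme and line width are arbitrary choices of mine, not settings taken from the picture.  Exporting straight to PNG is one way to skip the separate graphics program.

. ssc install spmap
. spmap x using "florida_coord.dta", id(id) clmethod(unique) fcolor(Blues) ocolor(white ..) osize(vthin ..)
. graph export "floridamap.png", replace   /* PNG can be inserted into Word directly */

Here clmethod(unique) gives each distinct value of x its own class, which suits a variable with only four values.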

Once you're comfortable with the previous steps, experiment with tmap's other map types in place of choropleth: propsymbol, deviation, dot, and label.  Later, you can work on cleaning up the IDs from before and finding real statistics to display on the map.
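
Attaching real statistics might look like the sketch below.  The file "county_pop.dta" and its variable pop are hypothetical; the sketch assumes that file has one row per county and a COUNTY variable with the same storage type and coding as the one in "florida.dta."  The merge m:1 syntax needs Stata 11 or later; older versions use a different merge syntax.

. use "florida.dta", clear
. merge m:1 COUNTY using "county_pop.dta"   /* many polygons to one county record */
. tabulate _merge                           /* check how many rows matched */
. drop if _merge == 2                       /* drop counties that have no polygon */
. tmap choropleth pop, map("florida_coord.dta") id(id)

Merging on COUNTY rather than id keeps every polygon of a multi-polygon county, so the islands are shaded along with the mainland.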

DO file: making maps (don't look at the code until the end)

Update: Maurizio Pisati gave an excellent presentation on overlaying data visually on a map (link to PDF).