Plotting geographic data on a world map with Python

By Chris McDowall 12/08/2011 9


Last week I promised to write a blog post detailing how I created this public transport animation. On reflection, it’s a topic best dealt with over a few sessions. Let’s start simple. How might you plot lots of geographic data on a map? In this post I will show you how to programmatically create a map of the World’s top ten most populated cities. It will end up looking something like this.

Top ten world cities map

This tutorial involves programming with the Python language and some additional modules. If you have never programmed before it might be a tad confusing. I suggest you first read one of the many fine introductory Python tutorials to get your head around the language. The rest of this post will assume that you understand how to install and import modules, how to write and run a script and the fundamentals of Python data structures.

Before we begin you will need to install the following:

  • Python 2.5, 2.6 or 2.7.
  • NumPy (a Python extension that adds support for multi-dimensional arrays and a host of whiz-bang high-level mathematical operations).
  • Matplotlib (a flexible 2D plotting library for Python that “tries to make easy things easy and hard things possible”).
  • Basemap (a Matplotlib extension for plotting data on geographic projections).

Right, let’s make a map. Wikipedia has a list of World metropolitan areas by population that can serve as a nice test dataset. I will demonstrate how to turn this data into a labelled proportional symbol world map.

First we need to import our various libraries:

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

Next we will set up our map. Basemap supports many different geographic projections. For this exercise I am using the Robinson projection.

# lon_0 is central longitude of robinson projection.
# resolution = 'c' means use crude resolution coastlines.
m = Basemap(projection='robin',lon_0=0,resolution='c')
#set a background colour
m.drawmapboundary(fill_color='#85A6D9')

Basemap comes packaged with some global geographic data at four resolutions: ‘crude’, ‘low’, ‘medium’ and ‘high’. Let’s draw the continent and country datasets using the ‘crude’ outlines.

# draw coastlines, country boundaries, fill continents.
m.fillcontinents(color='white',lake_color='#85A6D9')
m.drawcoastlines(color='#6D5F47', linewidth=.4)
m.drawcountries(color='#6D5F47', linewidth=.4)

We can also ask Basemap to draw lines of longitude and latitude.

# draw lat/lon grid lines every 30 degrees.
m.drawmeridians(np.arange(-180, 180, 30), color='#bbbbbb')
m.drawparallels(np.arange(-90, 90, 30), color='#bbbbbb')

Populate three arrays of equal length with the latitude, longitude and population values for each city. Normally we would read the data from a file, database or service but in the interest of simplicity I have typed them directly into the script.

# lat/lon coordinates of top ten world cities
lats = [35.69,37.569,19.433,40.809,18.975,-6.175,-23.55,28.61,34.694,31.2]
lngs = [139.692,126.977,-99.133,-74.02,72.825,106.828,-46.633,77.23,135.502,121.5]
populations = [32.45,20.55,20.45,19.75,19.2,18.9,18.85,18.6,17.375,16.65] #millions

Use our basemap object to convert the latitude/longitude values into map display coordinates.

# compute the native map projection coordinates for cities
x,y = m(lngs,lats)

Multiply each population by itself to create a scaled list of values. These will be our circle display sizes.

#scale populations to emphasise different relative pop sizes
s_populations = [p * p for p in populations]

Use the matplotlib scatter function to plot the circles. Note the use of the zorder parameter. This ensures that the scattered circles will be rendered on top of the continents.

#scatter scaled circles at the city locations
m.scatter(
    x,
    y,
    s=s_populations, #size
    c='blue', #color
    marker='o', #symbol
    alpha=0.25, #transparency
    zorder = 2, #plotting order
    )

Loop though the unscaled population values and the display coordinates. Label each circle with the city population rounded to the nearest million people.

# plot population labels of the ten cities.
for population, xpt, ypt in zip(populations, x, y):
    label_txt = int(round(population, 0)) #round to 0 dp and display as integer
    plt.text(
        xpt,
        ypt,
        label_txt,
        color = 'blue',
        size='small',
        horizontalalignment='center',
        verticalalignment='center',
        zorder = 3,
        )

Finally, add a title and display the map on your screen.

#add a title and display the map on screen
plt.title('Top Ten World Metropolitan Areas By Population')
plt.show()

Run your script and you should see a map. Lovely. I recommend you experiment further by tweaking parameters, importing other datasets and using alternative plotting methods.

This is the basic overview of how I typically plot geographic data in Python. Next time I’ll take this a step further and show you how to map, export and animate temporal geographic data.


9 Responses to “Plotting geographic data on a world map with Python”

  • Hi Chris – this is really excellent stuff! We at http://www.mesa.ac.nz are always on the look out for more interesting examples of Python being used in science! Just though that i’d add if people want a introductory python tut written specifically for scientists (here in New Zealand none-the-less) we have one posted here: http://mesa.ac.nz/?page_id=423 and would really like to increase our database. So if anyone has any examples they would be willing to share to help newbies learn python, just get in touch! We’re always keen for new contributors too!

  • @Elf Thanks for your comment. That is a great list of resources. People should definitely check them out.

    Unless otherwise stated, everything I post here is licensed under CC-BY-SA ( https://sciblogs.co.nz/seeing-data/about/ ). Feel free to grab tutorial material if you see something that you think might be useful.

  • Very nice Chris,

    I’ve done all my big maps ‘by hand’ – munging together graphs with pre-done maps in Inkscape. Maybe I should dig a little deeper into matplotlib…

    Elf,

    You should add the Biopython wiki and tutorial to your list – great for some basic manipulation of sequences and a few simple analyses

    • @David – There is a bit of a setup involved installing all the necessary modules, but once that’s done it becomes very simple to create custom maps.

  • @Eliezer I intend to! It’s been a crazy time lately due to moving jobs and cities. Now that things are (vaguely) starting to settle down I am starting to get my free time back. Stay tuned….

  • Awesome. Looking forward.
    In the meantime, I’m running into a particular issue. I’m finding that even the full dataset that comes with basemap isn’t detailed enough to get a coastline at the city level. Did you find that to be true as well, or am I missing something?

  • Sorry for the multi-comments. Can’t go back and edit here. I’ve tuned my issue. The coastline is precise enough – my problem is that the city that I’m looking to map – New York City – is interlaced with rivers – the Hudson river, the East river. The map that’s being generated, even with the drawrivers() call, doesn’t show the major rivers of the area.

    Not expecting you to debug it, just wanted to update on my findings.