Say Goodbye to Flat Maps with Pydeck

Author:Murphy  |  View: 25779  |  Time: 2025-03-23 18:04:34
Image by Google-Deepmind on Unsplash

A 3D extrusion map is a type of data visualization where 3D bars or columns are positioned on a map based on their geographic coordinates. The height of each bar represents a numerical value, such as population or temperature, associated with that specific location. Here's an example showing urban population density on the Hawaiian Islands:

Population density (people/square kilometer) for Hawaii (all remaining images by the author)

Maps of this type are presented with a "tilted" perspective so that the height of the bars is apparent. By combining the geographical information provided by the map with the vertical dimension represented by the bars, a 3D extrusion map can convey information and patterns in an interesting spatial context. Relative relationships are often more important than absolute values.

In this Quick Success Data Science project, we'll use Python and the pydeck library to easily create 3D extrusion maps for population distribution in the United States and Australia. After finishing this short tutorial, you'll have no problem creating stunning visualizations of your own geospatial datasets.

The Population Datasets

In this project, we'll plot population data for the United States and Australia. For the US, we'll use the free Basic United States Cities Database at simplemaps.com [1].

This dataset contains information on 30,844 towns and cities that make up the bulk of the US population as of January 31, 2023. It's provided under a Creative Commons Attribution 4.0 license and can be redistributed and used commercially. For convenience, I've already downloaded the data and stored it in a Gist.

For Australia, we'll use a 2020 Kaggle dataset derived from the simplemaps.com World Cities Database [2]. It includes 1,035 prominent cities in Australia that contain most of its population. It's released for free under an MIT license and Creative Commons Attribution 4.0 license. For convenience, this dataset has also been stored in a Gist.

The pydeck Library

The pydeck graphics library is a set of Python bindings, optimized for a Jupyter Notebook environment, for making spatial visualizations using deck.gl. The latter is a WebGL (GPU)-powered framework for visually exploring large datasets using a layered approach.

The pydeck library grants you access to the full deck.gl layer catalog in Python. You can create beautiful deck.gl maps without using a lot of JavaScript, and you can embed these maps in a Jupyter notebook or export them to a stand-alone HTML file. The library uses Carto by default but can also work well with other base map providers, like Mapbox.

A pydeck thematic map is meant to be used interactively. As with Plotly Express maps, you can pan and zoom. Passing the cursor over a bar will also launch a hover data window revealing details such as the name of the data point, its value, location, and so on.

To install pydeck with conda, enter the following in the command line:

conda install -c conda-forge pydeck

To install with pip enter:

pip install pydeck

For more on installing pydeck, and to see the gallery of examples, visit the Gallery page of the pydeck 0.6.1 documentation.

The Code

The following code was entered into JupyterLab cell by cell.

Importing Libraries

In addition to pydeck, we'll use the pandas data analysis library to load and manipulate the data. You can install it with either:

conda install pandas

or

pip install pandas

Here are the imports:

import pandas as pd
import pydeck as pdk

Preparing the US Population Data

The following code reads the US cities dataset into a pandas DataFrame and keeps only columns for the city name, its latitude, longitude, estimated population, and density (in population per square kilometer). Because there is such a huge range in population values, it also makes a new column by dividing the population value by 100. This will make it easier to compare 3D bars between the US and Australia, which we'll do later in the project.

# Specify the column names to keep:
columns_to_keep = ["city", "lat", "lng", "population", 'density']

# Load the CSV file into a DataFrame and keep only the specified columns:
df_us = pd.read_csv('https://bit.ly/3ObClvP', usecols=columns_to_keep)

# Scale the population column for easier comparison to Australia:
df_us['popl_div_100'] = df_us['population'] / 100

display(df_us)

Display of the US cities DataFrame
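Before plotting, it can be worth confirming that the columns pydeck will read contain no missing values, since a row without coordinates cannot be placed on the map. The sketch below is an optional check rather than part of the original workflow. It uses a tiny made-up CSV in place of the real download (the city names and numbers are placeholders) and assumes the same column names as the dataset above.

```python
import io
import pandas as pd

# Made-up stand-in for the downloaded CSV (placeholder values only):
csv_text = """city,state,lat,lng,population,density
Alpha,AA,40.0,-75.0,1000000,4000
Beta,BB,,-100.0,50000,900
Gamma,CC,35.0,-90.0,,300
"""

columns_to_keep = ["city", "lat", "lng", "population", "density"]
df = pd.read_csv(io.StringIO(csv_text), usecols=columns_to_keep)

# Count missing values per column:
missing = df.isna().sum()

# Drop rows that lack a coordinate or a population value:
df_clean = df.dropna(subset=["lat", "lng", "population"]).copy()

# Recreate the scaled column on the cleaned data:
df_clean["popl_div_100"] = df_clean["population"] / 100

print(missing)
print(df_clean)
```

If the counts are all zero for your data, the dropna() step changes nothing and can be skipped.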

Plotting the US Population Data

The following code creates the thematic map in three steps. The first step instantiates a pydeck Layer object. The second step sets the ViewState parameters, such as the map's center point location, zoom level, pitch angle, and bearing. The final step instantiates a Deck object and renders the map in HTML.

The first argument used in the Layer() class is type. Here, we use the ColumnLayer type, which creates bars (technically, cylindrical columns). To see other options, such as heatmap layers and icon layers, visit the pydeck gallery.

Among the other important arguments for the Layer() class are get_elevation, which is the DataFrame column used for the bar height; elevation_scale, which scales the bar height; pickable, which turns on hover data when the cursor lands on a bar; and coverage, which sets the width of the bar. These arguments, along with the one for get_fill_color, will let you fine-tune the appearance of the map.

The ViewState() class arguments are straightforward. The bearing controls the view orientation, and pitch sets the view angle (0 = straight down).
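To get a feel for how get_elevation and elevation_scale interact, here is a minimal sketch of the arithmetic. It assumes that the rendered column height, in meters, is the data value multiplied by elevation_scale, which is how the deck.gl ColumnLayer describes it. The helper name and the population figures are made up for illustration and are not values from the dataset.

```python
# Hypothetical helper: estimate the rendered column height in meters,
# assuming height = data value * elevation_scale.
def column_height_m(value, elevation_scale):
    return value * elevation_scale

# Made-up round populations, for illustration only:
examples = {"large metro": 10_000_000, "small town": 10_000}

for label, population in examples.items():
    height_km = column_height_m(population, elevation_scale=0.03) / 1000
    print(f"{label}: {height_km:,.1f} km tall")
```

Under that assumption, a scale of 0.03 turns ten million people into a column roughly 300 km tall and ten thousand people into one about 300 m tall. With a range this wide, a scale that keeps small towns visible makes the largest metros tower over the map, so the value is worth tuning for each dataset.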

# Build the map layer:    
layer = pdk.Layer(type='ColumnLayer',
                  data=df_us,
                  get_position=['lng', 'lat'],
                  get_elevation='population',
                  auto_highlight=True,
                  elevation_scale=0.03,
                  pickable=True,
                  get_fill_color=['population', 255],
                  coverage=5)

# Set the view parameters:
view_state = pdk.ViewState(longitude=-95, 
                           latitude=36,
                           zoom=3.8,
                           min_zoom=3,
                           max_zoom=15,
                           pitch=45.0,
                           bearing=0)

# Render the map:
r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('usa_popl.html')

Population map for 30,000+ US cities

Although we're only plotting about a third of the cities in the US, the map is still impressive. One of the most obvious features is the 100th Meridian, an imaginary vertical line that divides the more populated eastern half of the US from the more sparsely populated western interior.

A somewhat misleading aspect is the ultra-tall columns for places like New York City and Los Angeles. The free version of the database we're using provides urban populations, rather than municipal populations, which means it reports the population of the municipality and its surrounding suburbs and industrial areas, known as the greater metropolitan area. This is a bit of double-dipping, but it can be useful in its own right, as you don't have to identify and sum up the components of this larger area.

As far as functionality goes, you can intuitively manipulate this map using your mouse or keyboard. The scroll wheel lets you zoom. The first mouse button (MB1) will let you pan. SHIFT-MB1 lets you tilt the viewing angle or rotate the map. Finally, you can hover over a bar with your mouse to get detailed information on the data point (you'll probably want to zoom in first).

The "pickable" pop-up window for the city of Cut and Shoot, Texas

Note: to make a color bar or legend in pydeck, you have to use an external library like Matplotlib and then render it beside your pydeck visualization rather than within it. Standalone color bars are covered in the Matplotlib documentation.
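A standalone color bar of the kind mentioned in the note might be built as follows. Treat this as a rough sketch: the value range, the "viridis" colormap, and the output filename are all placeholders, and the bar is only meaningful if the layer's fill colors were generated from the same colormap and range, which is not the case for the maps in this article.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Placeholder data range; match it to the column used for the colors:
norm = Normalize(vmin=0, vmax=30_000)

# A short, wide figure that holds nothing but the color bar:
fig, ax = plt.subplots(figsize=(6, 1))
fig.subplots_adjust(bottom=0.5)

# Draw the color bar directly into the axes (cax=ax):
cbar = fig.colorbar(ScalarMappable(norm=norm, cmap="viridis"),
                    cax=ax,
                    orientation="horizontal")
cbar.set_label("Population density (people per square kilometer)")

# Save the image so it can be placed beside the pydeck map:
fig.savefig("density_colorbar.png", dpi=150)
```

The saved image can then be displayed next to the exported HTML map, for example in an adjacent notebook cell.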

Plotting the US Population Density Data

The following code plots the density data. I've tweaked some of the arguments to improve the display.

# Build the map layer:
layer = pdk.Layer(type='ColumnLayer', 
                  data=df_us,
                  get_position=['lng', 'lat'],
                  get_elevation='density',
                  auto_highlight=True,
                  elevation_scale=20,
                  pickable=True,
                  get_fill_color=['density', 220],
                  coverage=2)

# Set the view parameters:
view_state = pdk.ViewState(longitude=-95,
                           latitude=36,
                           zoom=3.8,
                           min_zoom=3,
                           max_zoom=15,
                           pitch=45.0,
                           bearing=0)

# Render the map:
r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('usa_density.html')

Population density map for 30,000+ US cities

The map zoomed to show the population density in the northeastern US

In the preceding figure, the tallest bar is for the island of Manhattan in New York City, which houses a whopping 28,654 people per square kilometer. But this is nothing compared to Manila, which has the world's highest population density at 46,178 people per square kilometer.

Preparing the Australian Population Data

The following code reads the Australian cities dataset into a pandas DataFrame and keeps only columns for the city name, its latitude and longitude, and its estimated population. Because there is such a huge range in population values, it also makes a new column by dividing the population value by 100. This will make it easier to compare 3D bars between the US and Australia later.

# Specify the column names to keep:
columns_to_keep = ["city", "lat", "lng", "population"]

# Load the Australia CSV file into a DataFrame:
df_au = pd.read_csv('https://bit.ly/3PXwziA', usecols=columns_to_keep)

# Scale the population column for easier comparison to the US:
df_au['popl_div_100'] = df_au['population'] / 100

display(df_au)

Display of the Australian cities DataFrame

Plotting the Australian Population Data

To plot the Australian data, we just repeat the plotting code with arguments tailored to the dataset. An important one is changing the view state's longitude and latitude!

# Build the map layer:      
layer = pdk.Layer(type='ColumnLayer',
                  data=df_au,
                  get_position=['lng', 'lat'],
                  get_elevation='population',
                  auto_highlight=True,
                  elevation_scale=0.2,
                  pickable=True,
                  get_fill_color=['popl_div_100', 220],
                  coverage=6)

# Set the view parameters:
view_state = pdk.ViewState(longitude=138,
                           latitude=-33,
                           zoom=3.6,
                           min_zoom=3,
                           max_zoom=15,
                           pitch=55.0,
                           bearing=310)

# Render the map:
r = pdk.Deck(layers=[layer], initial_view_state=view_state)
r.to_html('au.html')

Population map for 1,000+ Australian cities

Australia has been described as a collection of coastal city-states, and you can see why. About 86% of the population lives in urban areas, with 72% in major cities, such as Melbourne, Sydney, and Perth. There's a reason for this, of course. The interior is barren, and they don't call it "The Red Centre" for nothing!

Changing the Map Style

By default, pydeck plots using a dark background (specifically, Carto's "Dark Matter" map). This is set using the map_style parameter of the Deck() class. To change to a light background, pass it pdk.map_styles.LIGHT. Other options are for satellite, roads, or the dark and light versions with no labels.

Here's an example of the US dataset plotted with a light background, the elevation set to the popl_div_100 column, and the bar fill color set to black (using the RGB color code [0, 0, 0]):

# Build the map layer:      
layer = pdk.Layer(type='ColumnLayer',
                  data=df_us,
                  get_position=['lng', 'lat'],
                  get_elevation='popl_div_100',
                  auto_highlight=True,
                  elevation_scale=30,
                  pickable=True,
                  get_fill_color=[0, 0, 0],
                  coverage=3)

# Set the view:
view_state = pdk.ViewState(longitude=-95,
                           latitude=36,
                           zoom=3,
                           min_zoom=3,
                           max_zoom=15,
                           pitch=0,
                           bearing=0)

# Render the map:
r = pdk.Deck(layers=[layer], initial_view_state=view_state,
             map_style=pdk.map_styles.LIGHT)
r.to_html('us_popl_light.html')

The US cities population map with a light background and black bars

Comparing the Populations of Australia and the United States

If you repeat the previous code using the df_au DataFrame with a longitude of 138 and latitude of -26, you'll produce a map of Australia that can be compared to the previous map of the US:

Comparison of the populations of US and Australian cities at the same scale

Despite being similar in size to the continental United States, Australia is much less populated. Its two largest cities hold 5–6 million each and are comparable in population to American cities like Houston, Miami, and Atlanta.

Summary

Thematic maps, such as 3D extrusions, help you highlight a specific theme tied to a physical space. All the relevant geospatial data is extracted and projected on the map, enabling your audience to quickly grasp the connection between the theme and the locations.

The pydeck library makes it easy to create interesting 3D thematic visualizations with Python. It's optimized for working with Jupyter Notebook, popular libraries like pandas, and large datasets.

Python has a large ecosystem of geospatial libraries in addition to pydeck. To see a summary of the most important ones – including guidance on how to pick the best one for your needs – check out my latest book, Python Tools for Scientists: An Introduction to Using Anaconda, JupyterLab, and Python's Scientific Libraries.

Citations

  1. US Cities Database (2023), https://simplemaps.com/data/us-cities.
  2. Australia Cities Database, Kaggle (2020), derived from https://simplemaps.com/data/world-cities.

Thanks!

Thanks for reading and please follow me for more Quick Success Data Science projects in the future.
