Geospatial Data Analysis with OSMnx

This is the fourth article of the series regarding Geospatial Data Analysis:
- Geospatial Data Analysis using QGIS
- Guide for getting started with OpenStreetMap
- Geospatial Data Analysis with GeoPandas
- Geospatial Data Analysis with OSMnx (this post)
- Geocoding for Data Scientists
- Geospatial Data Analysis with Geemap
In the previous tutorials, I covered various aspects of Geospatial Data Analysis. I started by showing practical examples of geospatial data without using any code at all, so that the concepts could sink in first. Geospatial data analysis is a widespread field dedicated to working with a special type of data: geospatial data.
Geospatial data is data with a location attached to it, and examples are everywhere: cafes, hospitals, roads, rivers, satellite imagery and much more. Even when you search for a place on Google Maps, you are interacting with geospatial data.
This time I am going to focus on downloading, visualising and analysing data from OpenStreetMap, the biggest free and editable geographic database, maintained by volunteers from all over the world. This whole tutorial is possible thanks to a Python package called OSMnx. Let's get started!
Table of contents:
- Introduction to OSMnx
- Download and Visualize OSM data
- Convert graph to GeoDataFrame
- Extract Points of Interest
- Find the shortest route
Introduction to OSMnx
OSMnx is a library for downloading, analyzing and visualizing network data from OpenStreetMap. It depends on two libraries, NetworkX and GeoPandas. In particular, it relies on NetworkX graph objects to represent the street network, with intersections as nodes and street segments as edges.
Moreover, it allows us to interact with two OpenStreetMap APIs:
- Nominatim for geocoding, which consists of finding locations by name and address.
- Overpass API for extracting network data and points of interest, such as roads, schools, and parks.
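To give an idea of what these two services receive, here is a minimal sketch that only builds the request URLs and never sends them. OSMnx does all of this for us, and its actual requests may differ, so treat the helper names, the chosen parameters, and the bounding box as my own assumptions for illustration.

```python
from urllib.parse import urlencode

# Public endpoints of the two services (OSMnx's own defaults may differ by version).
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_nominatim_url(query: str) -> str:
    """Return a Nominatim search URL for a free-text place query."""
    params = {"q": query, "format": "json", "limit": 1}
    return f"{NOMINATIM_SEARCH_URL}?{urlencode(params)}"


def build_overpass_url(overpass_ql: str) -> str:
    """Return an Overpass API URL carrying a query in the `data` parameter."""
    return f"{OVERPASS_URL}?{urlencode({'data': overpass_ql})}"


# Hypothetical Overpass QL query: nodes tagged amenity=fuel inside a bounding
# box written as (south, west, north, east). The box is a rough guess around Bologna.
fuel_query = '[out:json];node["amenity"="fuel"](44.45,11.25,44.55,11.45);out;'

print(build_nominatim_url("Bologna, Italy"))
print(build_overpass_url(fuel_query))
```

Nothing is downloaded here. The point is simply that geocoding is a text search, while Overpass receives a small query describing which tagged elements we want.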
Download and Visualize OSM data
Instead of manually downloading the data from the website or from Geofabrik, we can directly do it with OSMnx.
First, we need to import four libraries that will be used later in the tutorial:
import osmnx as ox
import folium
import contextily as cx
import matplotlib.pyplot as plt
In addition to OSMnx and matplotlib, we are going to use folium, which is well known for creating interactive Leaflet maps, and contextily to add a background map. A basemap can matter a lot when we want maps that look realistic.
As in the previous tutorial, we read and visualize the OSM street network data of Bologna, one of the biggest cities in Italy.
PLACE_NAME = 'Bologna, Italy'
G = ox.graph_from_place(PLACE_NAME, network_type='drive')
ox.plot_graph(G)

From the black-and-white visualization, we can observe the points, which represent the nodes, and the lines, which represent the edges. Compared to OpenStreetMap's website, it can seem very static and basic. Folium comes to the rescue with its interactive and easy-to-read maps:
# plot_graph_folium is available in OSMnx 1.x; it was removed in OSMnx 2.0
ox.plot_graph_folium(G)

This is much better, don't you think? The bright colours and the ability to interact with the map are the same features we rely on when we use Google Maps to reach unknown places.
If you take a closer look at the OpenStreetMap website, you can notice that the Standard layer is the default. Besides the Standard layer, there are other layers, such as Cycle Map and Transport Map. It's incredible how we can exploit different layers depending on our purposes.

If we are passionate about bikes, we would be more interested in the Cycle Map. With OSMnx, we can download the cycling network with a single line of code:
G_bike = ox.graph_from_place(PLACE_NAME, network_type='bike')
The bike network is stored in a separate variable, so in the next sections we keep working with the driving network stored in G.
Convert graph to GeoDataFrame
Dealing with graphs isn't as intuitive as working with DataFrames and GeoDataFrames. For this reason, we might want to convert the graph to GeoDataFrames:
# graph_to_gdfs returns the nodes first (called area here), then the edges
area, edges = ox.graph_to_gdfs(G)
area.head()

edges.head()

You can see that we have obtained two GeoDataFrames, one for the nodes and one for the edges. The difference is clear if you look at the geometry column: each row of the nodes GeoDataFrame (called area here) holds a Point, that is, a single pair of coordinates, while each row of the edges GeoDataFrame holds a LineString made of two or more pairs of coordinates.
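To make the difference between the two geometry types more concrete, here is a small sketch in plain Python, without OSMnx. A node is a single (longitude, latitude) pair, an edge is a sequence of such pairs, and the length of an edge in meters can be approximated by summing great-circle (haversine) distances along it. The coordinates, the function names, and the Earth radius constant are assumptions I made up for the illustration.

```python
import math

EARTH_RADIUS_M = 6_371_009  # mean Earth radius in meters (assumed constant)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def linestring_length_m(coords):
    """Sum the haversine distances between consecutive (lon, lat) pairs."""
    return sum(haversine_m(*start, *end) for start, end in zip(coords, coords[1:]))


# Made-up coordinates near the centre of Bologna.
node = (11.3426, 44.4949)  # a node: one coordinate pair
edge = [(11.3426, 44.4949), (11.3430, 44.4952), (11.3437, 44.4955)]  # an edge: several pairs

print(f"node: {node}")
print(f"edge length: {linestring_length_m(edge):.1f} m")
```

In the real edges GeoDataFrame you don't need to compute this yourself, since OSMnx already stores a length column in meters for every edge.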
Extract Points of Interest
When working on Data Science projects, we try to add information to our dataset by searching open data on the Internet. From OSM data, it's possible to extract Points of Interest (POI), which are places we might find interesting depending on the purpose of our analysis. Examples are restaurants, churches, museums, and parks.
Suppose we would like to analyze the traffic in Bologna to optimize and cut the cost of transportation. In this context, it would be useful to know the highways, gas stations, parking garages, and other places that are linked to possible bottlenecks.
Let's take all the gas stations in the city. This is possible by specifying fuel as the value of the amenity key.
# geometries_from_place is the OSMnx 1.x name; OSMnx 2.0 calls it features_from_place
fuel_stations = ox.geometries_from_place(
    PLACE_NAME,
    {"amenity": "fuel"},
)
fuel_stations.head()
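The tags dictionary is worth a closer look. In OSMnx, each value can be True (any value of that key), a single string, or a list of strings, and the result is the union of all the keys. The sketch below mimics that matching logic on a few made-up elements in plain Python. The function name and the sample data are hypothetical, and the real filtering happens on the Overpass side, not in a loop like this.

```python
def matches_tags(element_tags: dict, tags: dict) -> bool:
    """Return True if the element satisfies at least one key of the tags filter."""
    for key, wanted in tags.items():
        if key not in element_tags:
            continue
        value = element_tags[key]
        if wanted is True:  # any value of this key is accepted
            return True
        if isinstance(wanted, str) and value == wanted:
            return True
        if isinstance(wanted, (list, tuple, set)) and value in wanted:
            return True
    return False


# Made-up OSM-like elements, each with its own tags.
elements = [
    {"name": "Station A", "amenity": "fuel"},
    {"name": "Car park B", "amenity": "parking"},
    {"name": "Museum C", "tourism": "museum"},
]

only_fuel = [e["name"] for e in elements if matches_tags(e, {"amenity": "fuel"})]
fuel_or_parking = [e["name"] for e in elements if matches_tags(e, {"amenity": ["fuel", "parking"]})]
any_tourism = [e["name"] for e in elements if matches_tags(e, {"tourism": True})]

print(only_fuel)        # ['Station A']
print(fuel_or_parking)  # ['Station A', 'Car park B']
print(any_tourism)      # ['Museum C']
```

So if we also wanted the parking garages mentioned above, passing a list of amenity values would be one way to ask for both in a single call.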

Now that we have extracted all the gas stations, it would be more useful to understand where they are located by plotting them on a map. Moreover, we can add a basemap to put our results into context.
# Reproject everything to Web Mercator (EPSG:3857), the CRS of the basemap tiles
area_crs = area.to_crs(epsg=3857)
edges_crs = edges.to_crs(epsg=3857)
fuel_stations_crs = fuel_stations.to_crs(epsg=3857)
fig, ax = plt.subplots(figsize=(10, 14))
area_crs.plot(ax=ax, facecolor='white')
edges_crs.plot(ax=ax, linewidth=1, edgecolor='blue')
fuel_stations_crs.plot(ax=ax, color='red', alpha=0.9, markersize=12)
plt.tight_layout()
# Add the background map underneath the plotted layers
cx.add_basemap(ax, crs=area_crs.crs.to_string())

That's great! We can notice that most of the fuel stations are concentrated on the outskirts of the city. Furthermore, we can distinguish different clusters of service stations, which should be taken into account when measuring the traffic outside the center.
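For the curious, the reprojection to EPSG:3857 (Web Mercator, the CRS used by most web basemaps) can be sketched in a few lines of plain Python. This is only an illustration of the spherical Web Mercator formula, not what GeoPandas runs internally, since it delegates the transformation to pyproj. The function name and the sample coordinates for Bologna are my own assumptions.

```python
import math

WGS84_RADIUS_M = 6_378_137.0  # equatorial radius used by Web Mercator, in meters


def to_web_mercator(lon_deg: float, lat_deg: float) -> tuple:
    """Project a (lon, lat) pair in degrees to Web Mercator (x, y) in meters."""
    x = WGS84_RADIUS_M * math.radians(lon_deg)
    y = WGS84_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate coordinates of Bologna (assumed for the example).
x, y = to_web_mercator(11.3426, 44.4949)
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

This also explains why the axes of the plot show large numbers in meters instead of degrees of longitude and latitude.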
Find the shortest route
Another helpful feature of the OSMnx library is the ability to calculate the shortest path between two points. First, we geocode the origin and the destination, take the centroid of each geometry, and look up the nearest graph node for both:
origin = (
    ox.geocode_to_gdf("Parco della Montagnola, Bologna, Italy")
    .to_crs(edges.crs)
    .at[0, "geometry"]
    .centroid
)
destination = (
    ox.geocode_to_gdf("Esso, Bologna, Italy")
    .to_crs(edges.crs)
    .at[0, "geometry"]
    .centroid
)
# nearest_nodes expects x (longitude) first, then y (latitude)
origin_node_id = ox.nearest_nodes(G, origin.x, origin.y)
destination_node_id = ox.nearest_nodes(G, destination.x, destination.y)
The route itself is computed with the shortest_path() function, which by default uses Dijkstra's algorithm, weighted by edge length, to find the path between the source node and the target node.
route = ox.shortest_path(G, origin_node_id, destination_node_id)
route
#[400881920,
# 250763178,
# 250763179,
# 250763533, ...
# 1694666466]
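If you have never seen Dijkstra's algorithm, the idea can be sketched on a toy graph in plain Python. This is a textbook version written for illustration, not the implementation that OSMnx and NetworkX actually use. The graph, the node names, and the edge lengths are made up.

```python
import heapq


def dijkstra_path(graph, source, target):
    """Return the list of nodes on the shortest path, or None if unreachable.

    `graph` maps each node to a dict of {neighbor: edge_length}.
    """
    distances = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue  # skip stale queue entries
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph.get(node, {}).items():
            new_dist = dist + length
            if new_dist < distances.get(neighbor, float("inf")):
                distances[neighbor] = new_dist
                previous[neighbor] = node
                heapq.heappush(queue, (new_dist, neighbor))
    if target not in distances:
        return None
    # Walk back from the target to the source to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Toy directed graph with edge lengths in meters (made up).
toy_graph = {
    "A": {"B": 100, "C": 400},
    "B": {"C": 150, "D": 500},
    "C": {"D": 120},
    "D": {},
}

print(dijkstra_path(toy_graph, "A", "D"))  # ['A', 'B', 'C', 'D']
```

The direct-looking options A-C-D (520 m) and A-B-D (600 m) lose against A-B-C-D (370 m), which is exactly the kind of trade-off the algorithm resolves on the real street network, where the result is a list of OSM node IDs like the one above.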
We can also visualize both the graph and the shortest path on a single map:
# plot_route_folium is available in OSMnx 1.x; it was removed in OSMnx 2.0
ox.plot_route_folium(G, route, route_linewidth=6, node_size=0)

Et voilà! It's as if we had used Google Maps to find the way, but instead we relied on the OSMnx library to look for it.
Final thoughts:
This was a guide to show you how to work with OSM data using Python. I have found that OSMnx is the most complete Python library for dealing with OpenStreetMap data. Of course, it's more suitable for exploring smaller places, like cities. For bigger datasets, it's better to use more specialized software, like QGIS, to visualize them. Have you tried other libraries for working with OSM data? Let me know in the comments if you know any. Check out the code here. Thanks for reading! Have a nice day!