A comprehensive guide for getting started with OpenStreetMap

Author:Murphy  |  View: 23887  |  Time: 2025-03-23 18:50:56
Photo by [Unsplash+](https://unsplash.com/photos/w98knetr8EA) on Unsplash

This is the second article in the series on Geospatial Data Analysis:

  1. Geospatial Data Analysis using QGIS
  2. Guide for getting started with OpenStreetMap (this post)
  3. Geospatial Data Analysis with GeoPandas
  4. Geospatial Data Analysis with OSMnx
  5. Geocoding for Data Scientists
  6. Geospatial Data Analysis with Geemap

This article continues the story A Practical Introduction to Geospatial Data Analysis with QGIS. In the previous article, I introduced the magic world of geospatial data analysis, a subfield of Data Science that involves manipulating and extracting information from a special type of data, called geospatial data.

Unlike ordinary tabular data, each row of geospatial data corresponds to a specific location and can be drawn on a map. The simplest case is a data point whose location is described just by latitude and longitude, but there are also more complex features, like roads, rivers, country boundaries, and terrain morphology, for which a single pair of coordinates is no longer enough.

This time I am focusing on what OpenStreetMap is, the concepts behind it, and how to download its data. First of all, OpenStreetMap (OSM) is the biggest free and editable geographic database and project, to which anyone can contribute, even you, now that you know this map of the world exists. It is also known as the Wikipedia of the mapping world, since both are maintained by volunteers from all over the world. Has this captured your attention? Let's get started!


Table of Contents:

  • Basic components of OpenStreetMap
  • Formats of OSM data
  • Start to play with OpenStreetMap
  • How to download OSM data

Basic components of OpenStreetMap

In OpenStreetMap, there are three main components: nodes, ways, and relations. The simplest data type is the node, which is described by a pair of coordinates, latitude and longitude. Examples of nodes are restaurants, pubs, shops, libraries, banks, museums, and so on.

Screenshot by Author. Example of Node obtained on OpenStreetMap

In the figure above, I selected the Unicredit bank, which is a node, as you can see from the left sidebar. If you select different features on the map, you will notice that every real-world feature has tags that describe its geographic attributes.

Like a Python dictionary, the tags are a collection of key-value pairs, where the keys correspond to properties of the node, way, or relation. In the example, the properties of the Unicredit bank are amenity, atm, and name. In particular, amenity is typically used to specify the type of facility used by residents and tourists, such as cafes, schools, bars, and restaurants.
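To make the dictionary analogy concrete, here is a minimal sketch of how the bank node could be modeled in plain Python. Only the tag keys (amenity, atm, name) come from the example; the id, the coordinates, and the tag values are placeholders made up for illustration.

```python
# Illustrative model of an OSM node: an id, one coordinate pair, and tags.
# The id, coordinates and tag values are placeholders, not real OSM data.
node = {
    "id": 1,
    "lat": 44.4949,
    "lon": 11.3426,
    "tags": {"amenity": "bank", "atm": "yes", "name": "Unicredit"},
}

# Tags behave like any other dictionary of key-value pairs.
for key, value in node["tags"].items():
    print(f"{key} = {value}")

# A key that is absent simply means the feature does not carry that tag.
has_opening_hours = "opening_hours" in node["tags"]
```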

Screenshot by Author. Example of Way (Linear Feature) obtained on OpenStreetMap

Now it's time to talk about the way, which is represented as an ordered collection of nodes. A way can be a linear feature or a polygon feature. When we deal with linear features, there is always a starting node and an ending node. Common examples are roads and railways.

The other case is the polygon feature, where the first and the last node coincide. Polygon features come in two typical kinds: large buildings, like churches and palaces, and areas used for residential, industrial, or commercial purposes. It's important to note that a way can have a maximum of 2,000 nodes.

Screenshot by Author. Example of Way (Polygon Feature) obtained on OpenStreetMap
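The distinction between linear and polygon ways can be sketched as a small check on the list of node ids. This follows the simplified rule used in this article (a way whose first and last node coincide is a polygon); on the real map, whether a closed way represents an area also depends on its tags. The function name `classify_way` is hypothetical.

```python
MAX_NODES_PER_WAY = 2000  # the limit mentioned in the text


def classify_way(node_ids):
    """Return "polygon" for a closed way and "linear" otherwise.

    `node_ids` is the ordered list of node ids that make up the way.
    """
    if len(node_ids) < 2:
        raise ValueError("a way needs at least two nodes")
    if len(node_ids) > MAX_NODES_PER_WAY:
        raise ValueError("a way cannot have more than 2,000 nodes")
    # A closed ring needs at least three distinct corners plus the
    # repeated first node, hence four entries.
    is_closed = len(node_ids) >= 4 and node_ids[0] == node_ids[-1]
    return "polygon" if is_closed else "linear"


road = [1, 2, 3]         # starts at node 1, ends at node 3
building = [4, 5, 6, 4]  # first and last node coincide
print(classify_way(road), classify_way(building))
```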

The third type of data is the relation, a special structure used to organize many nodes or ways into a larger whole. Classic examples are the boundaries of a country or a city. As with ways, you can distinguish linear and polygon features. A relation can also be a multipolygon, which describes an area made up of several polygons.

Screenshot by Author. Example of Relation (Multipolygon) obtained on OpenStreetMap

Here is an example of a residential area in Bologna, which contains 9 ways, each composed of different nodes.
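A relation like this one can be pictured as a list of members that point to other elements. The sketch below is only an illustration: the class names, the ids, and the tag values are hypothetical, and the "outer" role is the one conventionally used for the outline of a multipolygon.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    kind: str       # "node", "way" or "relation"
    ref: int        # id of the element this member points to
    role: str = ""  # e.g. "outer" for the outline of a multipolygon


@dataclass
class Relation:
    id: int
    tags: dict
    members: list = field(default_factory=list)

    def way_ids(self):
        """Ids of all members that are ways."""
        return [m.ref for m in self.members if m.kind == "way"]


# Hypothetical residential area made of 9 ways (ids 101..109 are placeholders).
residential = Relation(
    id=1,
    tags={"type": "multipolygon", "landuse": "residential"},
    members=[Member("way", 100 + i, "outer") for i in range(1, 10)],
)
print(len(residential.way_ids()))
```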

Formats of OSM data

In my previous tutorial, I showed that the most popular formats for representing vector data and raster data are, respectively, the Shapefile and GeoTIFF. For OSM data, the most common formats are PBF and XML. The PBF file is usually preferred to the XML file because it is highly compressed, which makes it smaller on disk and faster to read.
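PBF is a binary format and needs a dedicated reader, while the XML format can be inspected with Python's standard library. The sketch below parses a tiny hand-written sample that follows the layout of OSM XML (node elements with lat/lon attributes, way elements with nd references, and tag children with k/v attributes). All ids, coordinates, and tag values in the sample are placeholders.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout; the values are placeholders.
OSM_XML = """<osm version="0.6">
  <node id="1" lat="44.4949" lon="11.3426">
    <tag k="amenity" v="bank"/>
    <tag k="name" v="Unicredit"/>
  </node>
  <node id="2" lat="44.4950" lon="11.3430"/>
  <node id="3" lat="44.4955" lon="11.3428"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>
"""


def tags_of(element):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {t.get("k"): t.get("v") for t in element.findall("tag")}


root = ET.fromstring(OSM_XML)

# node id -> (lat, lon, tags)
nodes = {
    int(n.get("id")): (float(n.get("lat")), float(n.get("lon")), tags_of(n))
    for n in root.findall("node")
}

# way id -> ordered list of node ids
ways = {
    int(w.get("id")): [int(nd.get("ref")) for nd in w.findall("nd")]
    for w in root.findall("way")
}

print(nodes[1][2])
print(ways[10])
```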

Start to play with OpenStreetMap

Once the concepts of OpenStreetMap are clear, it's time to move on to the most fun part of the tutorial. You don't need to install anything. You just need to go to OpenStreetMap's website.

We can search for Bologna, the lively city in Emilia-Romagna, Italy, well-known for having the world's oldest university and famous dishes, like tortellini and lasagne. The procedure is the following:

  • Type the name of the city you prefer in the search box and click Go
  • Select the option "City Bologna, Emilia-Romagna, Italy" in the left sidebar

GIF by Author. Example of Relation (boundaries of Bologna) obtained on OpenStreetMap

We can move around and select one of the map elements present in the city. For example, we can go to Piazza Maggiore, press the "query features" button, and click "Artwork Il Nettuno", a fountain dedicated to Neptune (Nettuno), the god of the sea.

GIF by Author. Example of Node (Artwork Il Nettuno) obtained on OpenStreetMap

This is how you can see the main information about the elements you are interested in on OpenStreetMap. For example, if you want to extract all the restaurants in Bologna, it helps to first understand what characteristics these places have in common. It's worth exploring the website like this before extracting your points of interest.

How to download OSM data

There are many ways to download data from OpenStreetMap. The most suitable approach depends on the size of the dataset. If you just want to download a small dataset, like a bar, a park, or a residential area, you can do it directly from OpenStreetMap's website. You have two options: you can click "Download XML" or the "Export" button.

Screenshot by Author. Download data from OpenStreetMap's website.
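A small extract like this can also be requested programmatically. The sketch below only builds the request URL for the `map` call of the OSM API, which returns the XML for a bounding box; it does not contact the server. The helper name `map_url` and the coordinates are hypothetical, and the API only accepts small areas, so this approach is an assumption that fits the "small dataset" case rather than a general download method.

```python
API_MAP_ENDPOINT = "https://api.openstreetmap.org/api/0.6/map"


def map_url(min_lon, min_lat, max_lon, max_lat):
    """Build the URL that returns OSM XML for a small bounding box.

    The box is given as west, south, east, north in decimal degrees.
    """
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError("expected west < east and south < north")
    bbox = ",".join(str(v) for v in (min_lon, min_lat, max_lon, max_lat))
    return f"{API_MAP_ENDPOINT}?bbox={bbox}"


# Placeholder box roughly around the centre of Bologna.
print(map_url(11.34, 44.49, 11.35, 44.5))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the same kind of XML file as the export on the website.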

If you want OSM data for entire continents or countries, you can download it from Geofabrik. Click the link to the file for a continent, or the link to the sub-region if you want the data for a specific country.

Screenshot by Author. Download data from geofabrik's website.

The third way is to directly download the OSM data with Python. There is a Python library, called Pyrosm, that is able to download and read PBF data from a huge number of locations all over the world.

from pyrosm import get_data
# Download the PBF extract for Lisbon and get the path to the local file
fp = get_data("Lisbon")

Unfortunately, not every city is available. You can check by printing the list of cities that can be downloaded:

from pyrosm.data import sources
print(sources.cities.available)

output:

['Aachen', 'Aarhus', 'Adelaide', 'Albuquerque', 'Alexandria', 'Amsterdam', 'Antwerpen', 'Arnhem',...]
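Because only some cities are available, it can be handy to check the list before requesting a download. The sketch below wraps that check. `resolve_city` and `download_city` are hypothetical helper names, not part of Pyrosm; the only Pyrosm calls used are `get_data` and `sources.cities.available`, the same ones shown above.

```python
def resolve_city(name, available):
    """Return the catalogue spelling of `name`, or None if it is not listed.

    The comparison ignores case and surrounding whitespace.
    """
    lookup = {city.lower(): city for city in available}
    return lookup.get(name.strip().lower())


def download_city(name):
    """Download the PBF extract for `name` and return the local file path."""
    # Imported here so that resolve_city can be used without Pyrosm installed.
    from pyrosm import get_data
    from pyrosm.data import sources

    city = resolve_city(name, sources.cities.available)
    if city is None:
        raise ValueError(f"{name!r} is not among the downloadable cities")
    return get_data(city)


# The check itself needs no network access:
print(resolve_city("lisbon", ["Aachen", "Lisbon"]))
print(resolve_city("Bologna", ["Aachen", "Lisbon"]))
```

Calling `download_city("Lisbon")` would then download the file, while a city that is not listed raises an error before any request is made.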

Note that these aren't the only possible methods to download OSM data. You can check other ways in the resources I suggest at the end.

Final thoughts:

That's it! This was just an overview of the OpenStreetMap world! It can be challenging to start analyzing this type of data without knowing the main data types and without playing around with the website through intuitive examples. There are a lot of resources, but I found them scattered, since each one covers only a few aspects. I hope this tutorial helped you begin your journey of analyzing geospatial data. Thanks for reading! Have a nice day!


Useful resources:

