Visualizing Geospatial Network Graphs using Basemap and mplleaflet

In my previous articles on network graphs, I showed you the basics of plotting directed and undirected graphs using the NetworkX and pyvis packages. In this article, I will use the flights delay dataset to visualize the flight paths between the different airports, and specifically show you how to visualize them using a geospatial network graph.
Using the Flights Delay Dataset
As usual, I am going to use the 2015 Flights Delay dataset.
2015 Flights Delay Dataset (airports.csv). Source: https://www.kaggle.com/datasets/usdot/flight-delays. Licensing – CC0: Public Domain
There are two files from this dataset that I will be using:
- flights.csv
- airports.csv
First, let's load the flights.csv file into a Pandas DataFrame:
import pandas as pd
df = pd.read_csv('flights.csv',
                 usecols = ["ORIGIN_AIRPORT", "DESTINATION_AIRPORT", "YEAR"])
df.head()
We don't have to load all the columns in this file (it is a large file!) – three columns are sufficient for this article. You should see the dataframe as follows:

Once the dataframe is loaded, I will go ahead and count the number of flights from one airport to another:
df_between_airports = df.groupby(by=["ORIGIN_AIRPORT", "DESTINATION_AIRPORT"]).count()
df_between_airports = df_between_airports['YEAR'].rename('COUNT').reset_index()
df_between_airports = df_between_airports.query('ORIGIN_AIRPORT.str.len() <= 3 & DESTINATION_AIRPORT.str.len() <= 3')
df_between_airports = df_between_airports.sort_values(by="COUNT",
                                                      ascending=False)
df_between_airports
The resultant output is as shown:

As there are more than 4500 combinations of flights, let's only select the top 800 combinations:
top = 800
df_between_airports = df_between_airports.head(top)
df_between_airports
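To see what the grouping step does on a small scale, here is a minimal sketch using a made-up five-row DataFrame (the rows are hypothetical, not taken from flights.csv). It uses groupby(...).size(), which counts the rows in each group directly and therefore does not need the YEAR column. Treat it as an alternative way of getting the same kind of count, not as the method used above.

```python
import pandas as pd

# Hypothetical toy data standing in for flights.csv
toy = pd.DataFrame({
    "ORIGIN_AIRPORT":      ["SFO", "SFO", "LAX", "SFO", "LAX"],
    "DESTINATION_AIRPORT": ["LAX", "LAX", "SFO", "JFK", "SFO"],
})

# size() counts the rows in each (origin, destination) group
toy_counts = (toy.groupby(["ORIGIN_AIRPORT", "DESTINATION_AIRPORT"])
                 .size()
                 .rename("COUNT")
                 .reset_index()
                 .sort_values(by="COUNT", ascending=False))
print(toy_counts)
```

Each row of toy_counts is one origin/destination pair with the number of flights on that route, which is the same shape as df_between_airports.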

Creating the Graph
The NetworkX package has a function called from_pandas_edgelist() that you can use to create a graph from a Pandas DataFrame containing an edge list. It returns a graph object:
import networkx as nx
G = nx.from_pandas_edgelist(df_between_airports,
                            'ORIGIN_AIRPORT',
                            'DESTINATION_AIRPORT',
                            create_using = nx.DiGraph())
In the above statement, G is a directed graph (networkx.classes.digraph.DiGraph). If you want to create an undirected graph (networkx.classes.graph.Graph), simply leave out the create_using parameter.
The graph now contains all the nodes and edges which it derives from the supplied dataframe. In our case, the nodes are all the airports from the ORIGIN_AIRPORT and DESTINATION_AIRPORT columns.
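The difference between the directed and undirected versions is easier to see on a tiny example. The sketch below uses three made-up routes (the counts are hypothetical). It also passes edge_attr="COUNT", which is not used in this article but is one way to keep the flight count on each edge.

```python
import networkx as nx
import pandas as pd

# Hypothetical route counts (not real figures from the dataset)
routes = pd.DataFrame({
    "ORIGIN_AIRPORT":      ["SFO", "LAX", "SFO"],
    "DESTINATION_AIRPORT": ["LAX", "SFO", "JFK"],
    "COUNT":               [120, 110, 45],
})

# Directed graph: SFO->LAX and LAX->SFO are separate edges
DG = nx.from_pandas_edgelist(routes, "ORIGIN_AIRPORT", "DESTINATION_AIRPORT",
                             edge_attr="COUNT", create_using=nx.DiGraph())

# Undirected graph (the default): the two directions collapse into one edge
UG = nx.from_pandas_edgelist(routes, "ORIGIN_AIRPORT", "DESTINATION_AIRPORT")

print(DG.number_of_edges(), UG.number_of_edges())
print(DG["SFO"]["LAX"]["COUNT"])
```

In the directed graph each direction of a route is its own edge, so it has one more edge than the undirected graph here.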
You can now examine the nodes in the graph:
G.nodes()
You should see the following:
NodeView(('SFO', 'LAX', 'JFK', 'LAS', 'LGA', 'ORD', 'OGG', 'HNL', 'ATL',
'MCO', 'DFW', 'SEA', 'BOS', 'DCA', 'FLL', 'PHX', 'DEN', 'TPA', 'SAN',
'PHL', 'KOA', 'ANC', 'MSP', 'SJC', 'MIA', 'CLT', 'HOU', 'DAL', 'OAK',
'SLC', 'LIH', 'BWI', 'MSY', 'SMF', 'JAX', 'EWR', 'DTW', 'IAH', 'MKE',
'ITO', 'RDU', 'SAT', 'AUS', 'MDW', 'SJU', 'SNA', 'PBI', 'PDX', 'CLE',
'CVG', 'RSW', 'IND', 'BUR', 'IAD', 'BNA', 'RIC', 'STL', 'MCI', 'CMH',
'DSM', 'PIT', 'RNO', 'BHM', 'CHS', 'MSN', 'GEG', 'SAV', 'MEM', 'GRR',
'ONT', 'CID', 'GRB', 'SDF', 'CHA', 'OKC', 'DAY', 'CAE', 'ORF', 'GSO',
'TUS', 'TUL', 'GRK', 'XNA', 'PVD', 'BTR', 'GSP', 'ABQ', 'HSV', 'BUF',
'AGS', 'BDL', 'ABI', 'JAN', 'LEX', 'SHV', 'PNS', 'FWA', 'MOB', 'SGF',
'MHT', 'VPS', 'MGM', 'ICT', 'PIA', 'LFT', 'PSP', 'CRP', 'TLH', 'FAR',
'TYS', 'SBA', 'GNV', 'COS', 'OMA', 'MAF', 'CAK', 'FSD', 'LIT'))
Likewise, you can examine the edges:
G.edges()
Here is a partial list of edges returned:
EdgeView([('SFO', 'LAX'), ('SFO', 'JFK'), ('SFO', 'LAS'), ('SFO', 'ORD'),
('SFO', 'SAN'), ('SFO', 'SEA'), ('SFO', 'DEN'), ('SFO', 'EWR'),
('SFO', 'PHX'), ('SFO', 'DFW'), ('SFO', 'SNA'), ('SFO', 'PDX'),
('SFO', 'BOS'), ('SFO', 'IAD'), ('SFO', 'IAH'), ('SFO', 'SLC'),
('SFO', 'ATL'), ('SFO', 'MSP'), ('SFO', 'ONT'), ('SFO', 'PSP'),
('SFO', 'SBA'), ('SFO', 'PHL'), ('SFO', 'HNL'), ('SFO', 'AUS'),
('LAX', 'JFK'), ('LAX', 'LAS'), ('LAX', 'ORD'), ('LAX', 'SEA'),
('LAX', 'PHX'), ('LAX', 'DFW'), ('LAX', 'SJC'), ('LAX', 'OAK'),
('LAX', 'DEN'), ('LAX', 'ATL'), ('LAX', 'SMF'), ('LAX', 'SLC'),
...
...
Plotting the Graph
You can now plot the network graph showing the top 800 routes between airports:
import matplotlib.pyplot as plt
plt.figure(figsize=(8, 8))
options = {
    'node_color': 'yellow',
    'node_size': 1500,
    'width': 1,
    'arrowstyle': '-|>',
    'arrowsize': 18,
}
nx.draw_circular(G, with_labels = True, **options)
The network graph looks like this:

Argh! Obviously, there are too many airports, which makes the network graph really messy. Let's reduce the number of airports by changing the number of rows we fetch from the dataframe to 140:
top = 140
df_between_airports = df_between_airports.head(top)
df_between_airports
After re-creating the graph G from the reduced dataframe and plotting it again, the network graph looks like this:

This is now a much cleaner graph!
Geospatial Mapping
Our dataset contains geospatial data, and it would not do the dataset justice if we did not plot the data on a map!
In this section, I will show how you can plot the network graph onto a map. I will be using the following packages:
- Basemap
- mplleaflet
Installing basemap
The first map that I will be using is basemap.
Basemap is a matplotlib extension that is very useful for creating maps in Python.
To install basemap, use the pip command:
!pip install basemap
If you are using Windows, the installation should be uneventful. However, on the Mac, you may get some error messages about a missing geos library. To fix this, perform the following steps:
$ brew install geos
Go to https://brew.sh/ if you do not have Homebrew installed.
Note where geos is installed. On my machine, geos is installed in /opt/homebrew/Cellar/geos/3.11.1. Next, type the following command in Terminal to set GEOS_DIR to the directory where geos is installed:
$ export GEOS_DIR=/opt/homebrew/Cellar/geos/3.11.1
Finally, restart your Jupyter Notebook. Basemap should now be installed correctly.
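If you want to confirm that the package can be found before moving on, a quick check like the following may help. This is only a sketch: is_installed is a helper name I made up for illustration, and it only tells you whether Python can locate the module, not whether the geos library underneath is working.

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can locate module_name (hypothetical helper)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. mpl_toolkits) is itself missing
        return False

print("basemap available:", is_installed("mpl_toolkits.basemap"))
```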
Loading the airport locations
In order to plot the locations of the airports on a map, you need the latitude and longitude of each airport. Fortunately, this is already available in the airports.csv file:
import pandas as pd
df_airports = pd.read_csv('airports.csv')
df_airports
This CSV file contains the IATA_CODE of all the airports and their corresponding latitudes and longitudes:

However, there is one thing you need to be careful about here. There are three airports that have no location information. You can verify this using the following statements:
# check which airports do not have location information
df_airports[df_airports['LATITUDE'].isna() |
            df_airports['LONGITUDE'].isna()]
You can now see that the following airports have no location details – ECP, PBG, and UST:

There are two ways to solve this:
- Delete all the airports with no location information, or
- Supply the location information for the three airports
We shall do the latter by filling in the missing location information:
# ECP airport
df_airports.at[96,'LATITUDE'] = 30.354984
df_airports.at[96,'LONGITUDE'] = -85.79934
# PBG airport
df_airports.at[234,'LATITUDE'] = 44.6597091
df_airports.at[234,'LONGITUDE'] = -73.46722069999998
# UST airport
df_airports.at[313,'LATITUDE'] = 29.954352
df_airports.at[313,'LONGITUDE'] = -81.342935
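The statements above rely on the row positions (96, 234, and 313) staying the same. If you would rather not depend on them, one option is to select the rows by airport code instead. The sketch below shows the idea on a made-up three-row DataFrame; the ECP coordinates are the ones used above, and the other rows are there only for illustration.

```python
import pandas as pd

# Hypothetical miniature of airports.csv with one missing location
airports = pd.DataFrame({
    "IATA_CODE": ["ABE", "ECP", "SFO"],
    "LATITUDE":  [40.65236, None, 37.619],
    "LONGITUDE": [-75.4404, None, -122.37484],
})

# Coordinates for ECP as used in the article
missing = {"ECP": (30.354984, -85.79934)}

for code, (lat, lon) in missing.items():
    # Select rows by airport code rather than by positional index
    mask = airports["IATA_CODE"] == code
    airports.loc[mask, ["LATITUDE", "LONGITUDE"]] = [lat, lon]

print(airports[airports["IATA_CODE"] == "ECP"])
```

The same loop would work on df_airports with all three airports in the dictionary, assuming the IATA_CODE values are unique.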
Plotting the base map
We are now ready to plot the basemap:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
plt.figure(figsize = (10,9))
basemap = Basemap(
    projection = 'merc',
    llcrnrlon = -180,
    urcrnrlon = -50,
    llcrnrlat = -10,
    urcrnrlat = 70,
    lat_ts = 0,
    resolution = 'l',
    suppress_ticks = True)
The above code snippet displays the basemap using the Mercator projection (merc).
For more details on configuring basemap, see: https://basemaptutorial.readthedocs.io/en/latest/
In order to display the airport locations on the basemap, you need to convert each latitude and longitude into x and y map-projection coordinates:
# pass in lon, lat to convert to x/y map projection coordinates
basemap_x, basemap_y = basemap(df_airports['LONGITUDE'].values,
                               df_airports['LATITUDE'].values)
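If you are curious about what this conversion does, here is a rough sketch of a Mercator projection written by hand. It assumes a spherical Earth with a radius of 6,370,997 m and places the origin at the lower-left corner of the map (longitude -180, latitude -10, as configured above). Both are my assumptions for illustration; Basemap's own implementation may differ in its details, so use basemap() for the real values.

```python
import math

R = 6370997.0  # assumed spherical Earth radius in metres

def mercator_xy(lon, lat, llcrnrlon=-180.0, llcrnrlat=-10.0):
    """Spherical Mercator, shifted so the lower-left map corner is (0, 0)."""
    def merc_y(lat_deg):
        return R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    x = R * math.radians(lon - llcrnrlon)
    y = merc_y(lat) - merc_y(llcrnrlat)
    return x, y

# ABE airport: longitude -75.4404, latitude 40.65236
x, y = mercator_xy(-75.4404, 40.65236)
print(x, y)
```

The x value grows linearly with longitude, while the y value stretches more and more as you move away from the equator, which is why areas near the poles look oversized on a Mercator map.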
Next, we need to create a dictionary in the following format: {IATA_CODE: (x,y)}. You can do so using the following statements:
pos = {}
for i, IATA_CODE in enumerate(df_airports['IATA_CODE']):
    pos[IATA_CODE] = (basemap_x[i], basemap_y[i])
The pos variable now looks like this:
{'ABE': (11626491.577256551, 6073282.907509623),
'ABI': (8930961.032284452, 4930788.720997522),
'ABQ': (8160681.891600923, 5282318.28670927),
'ABR': (9071074.35752435, 6803760.994159843),
'ABY': (10653083.864127252, 4815986.333659503),
'ACK': (12224744.463780478, 6161722.945706454),
...
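As a side note, the same kind of dictionary can be built without an explicit loop by pairing the codes with their coordinates using zip(). The sketch below uses three values rounded from the output above; codes, xs, ys, and pos_demo are names made up for this example.

```python
# Three airports with coordinates rounded from the output above
codes = ["ABE", "ABI", "ABQ"]
xs = [11626491.6, 8930961.0, 8160681.9]
ys = [6073282.9, 4930788.7, 5282318.3]

# Pair each code with its (x, y) tuple in one pass
pos_demo = dict(zip(codes, zip(xs, ys)))
print(pos_demo["ABI"])
```

With the real data, this would correspond to something like dict(zip(df_airports['IATA_CODE'], zip(basemap_x, basemap_y))).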
Drawing the nodes, labels, and edges
With the x and y map-projection coordinates obtained, you can now start to plot the network graph onto the basemap:
fig = plt.figure(figsize=(13, 13))
nx.draw_networkx_nodes(G = G,
                       pos = pos,
                       nodelist = G.nodes(),
                       node_color = 'r',
                       alpha = 0.7,
                       node_size = [sum(df_between_airports.query(f'DESTINATION_AIRPORT == "{x}"')['COUNT']) / 400 for x in G.nodes()]
                       )
nx.draw_networkx_labels(G = G,
                        pos = pos,
                        labels = {x:x for x in G.nodes()},
                        font_size = 10
                        )
nx.draw_networkx_edges(G = G,
                       pos = pos,
                       edge_color = 'g',
                       alpha = 0.2,
                       arrows = False)
basemap.drawcoastlines(linewidth = 0.5)
The above code snippet draws the nodes, labels, and edges onto the basemap:

You can also plot the map using the Orthographic (ortho) projection:
basemap = Basemap(projection = 'ortho',
                  lon_0 = -105,
                  lat_0 = 40,
                  resolution = 'l')
After recomputing pos with the new basemap object and redrawing the graph, the map looks like this:

Cool, isn't it!
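One thing worth noting about the drawing code earlier: the node_size expression runs one query() per airport. A possible alternative is to total the arrivals once with groupby() and then look up each node. The sketch below uses made-up route counts and a plain list in place of G.nodes(), so treat it as an illustration of the idea and not as the code used for the maps above.

```python
import pandas as pd

# Hypothetical route counts (not real figures from the dataset)
routes = pd.DataFrame({
    "ORIGIN_AIRPORT":      ["SFO", "LAX", "SFO", "JFK"],
    "DESTINATION_AIRPORT": ["LAX", "SFO", "JFK", "LAX"],
    "COUNT":               [1200, 1100, 400, 800],
})
nodes = ["SFO", "LAX", "JFK", "HNL"]  # stand-in for G.nodes(); HNL has no arrivals here

# Total arrivals per destination, computed once
arrivals = routes.groupby("DESTINATION_AIRPORT")["COUNT"].sum()

# Look up each node, using 0 for airports that never appear as a destination
node_sizes = [float(arrivals.get(n, 0)) / 400 for n in nodes]
print(node_sizes)
```

The resulting list is in the same order as the nodes, so it can be passed to the node_size parameter in the same way.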
Plotting using mplleaflet
While you can plot the network graph using basemap, the greatest drawback is that you can't really interact with it. It would be useful to be able to pan the map to examine each airport in more detail. This is where mplleaflet comes in.
mplleaflet is a Python library that converts a matplotlib plot into a webpage containing a pannable, zoomable Leaflet map. You can embed the Leaflet map in your Jupyter Notebook.
Let's first prepare the locations for each airport by converting all the locations into a dictionary:
import matplotlib.pyplot as plt
import mplleaflet
import networkx as nx
# load the nodes and edges
G = nx.from_pandas_edgelist(df_between_airports,
                            'ORIGIN_AIRPORT',
                            'DESTINATION_AIRPORT')
# create a dictionary of this format: { IATA_CODE: [LONGITUDE, LATITUDE] }
pos = df_airports[['IATA_CODE','LONGITUDE','LATITUDE']].set_index('IATA_CODE').T.to_dict('list')
The pos variable is now a dictionary of the following format: { IATA_CODE: [LONGITUDE, LATITUDE] }:
{'ABE': [-75.4404, 40.65236],
'ABI': [-99.6819, 32.41132],
'ABQ': [-106.60919, 35.04022],
'ABR': [-98.42183, 45.44906],
'ABY': [-84.19447, 31.53552],
'ACK': [-70.06018, 41.25305],
'ACT': [-97.23052, 31.61129],
'ACV': [-124.10862, 40.97812],
...
...
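If the chained set_index().T.to_dict('list') call looks a little cryptic, the following sketch builds the same kind of dictionary with a dictionary comprehension. It uses two rows with the coordinates shown above; airports and pos_demo are names made up for this example. Note that the longitude comes first because it is used as the x coordinate.

```python
import pandas as pd

# Two rows with coordinates as shown in the output above
airports = pd.DataFrame({
    "IATA_CODE": ["ABE", "ABI"],
    "LONGITUDE": [-75.4404, -99.6819],
    "LATITUDE":  [40.65236, 32.41132],
})

# Build {IATA_CODE: [LONGITUDE, LATITUDE]} without transposing the frame
pos_demo = {row.IATA_CODE: [row.LONGITUDE, row.LATITUDE]
            for row in airports.itertuples(index=False)}
print(pos_demo)
```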
You can now draw the nodes and edges on the mplleaflet map:
fig, ax = plt.subplots(figsize=(15,15))
# draw the nodes
nx.draw_networkx_nodes(G,
                       pos = pos,
                       node_size = [sum(df_between_airports.query(f'DESTINATION_AIRPORT == "{x}"')['COUNT']) / 1500 for x in G.nodes()],
                       node_color = 'red',
                       alpha = 0.8)
# draw the edges
nx.draw_networkx_edges(G,
                       pos = pos,
                       edge_color = 'gray',
                       alpha = 0.3)
# display the map
mplleaflet.display(fig=fig)
The size of each node is proportional to the number of flights arriving at the airport. The map now shows the network graph:

If you get an error like "AttributeError: 'XAxis' object has no attribute '_gridOnMajor'", you may need to downgrade matplotlib using the pip command: pip install matplotlib==3.3.2.
You can zoom into the map as well as pan it:

If you comment out the earlier part of the code where we selected only the top 140 flight combinations:
# top = 140
# df_between_airports = df_between_airports.head(top)
# df_between_airports
This is how the map looks with all the airports displayed:

Summary
I hope you have enjoyed trying out the code in this article! To draw the network graph on the basemap, the main thing you need to do is convert the latitudes and longitudes into a dictionary of map-projection coordinates, while for the mplleaflet map, you put the raw longitudes and latitudes into a dictionary. Basemap allows you to experiment with different types of projections (such as Mercator, Orthographic, and more), but its main drawback is that it is not interactive. On the other hand, mplleaflet allows you to interact with the map, but it does not support projections like basemap does.