Building Interactive Network Graphs using pyvis

Author:Murphy  |  View: 27711  |  Time: 2025-03-23 19:33:18
Photo by DeepMind on Unsplash

In my previous article on creating network graphs, I showed how you can build one using the NetworkX package. The key problem with NetworkX is that the graph it generates is static: once the graph is plotted, the user cannot interact with it (for example, by rearranging the nodes). A network graph is more intuitive (and fun!) when the user can interact with it, and that is the main focus of this article.

Plotting Network Graphs using Python

In this article, I will show you how you can create an interactive network graph using the pyvis package.

The pyvis package is a wrapper for the popular vis.js JavaScript library, and it allows you to easily generate visual network graphs in Python.

Installing pyvis

To install the pyvis package, use the pip command:

!pip install pyvis

Creating a network

First, create a new graph using the Network class in pyvis:

from pyvis.network import Network

net = Network(
    notebook=True,
)

To display the graph in Jupyter Notebook, set the notebook parameter to True. The above code snippet creates an undirected graph.

Adding nodes

You can now add nodes to the graph:

net.add_node("Singapore")
net.add_node("San Francisco")
net.add_node("Tokyo")
net.add_nodes(["Riga", "Copenhagen"],
              color=['lightgreen', 'yellow'])

The add_node() function adds a single node while the add_nodes() function adds multiple nodes to the graph. You can also set the optional color parameter for both functions to set the color of the node(s).

To display the graph, call the show() function with a name for the output:

net.show('mygraph.html')

The nodes should now be displayed:

All images by author

Adding Edges

With the nodes added to the graph, you can now add the edges to connect the nodes:

net.add_edge("Singapore","San Francisco") 
net.add_edge("San Francisco","Tokyo")
net.add_edges(
    [
        ("Riga","Copenhagen"),
        ("Copenhagen","Singapore"),
        ("Singapore","Tokyo"),
        ("Riga","San Francisco"),
        ("San Francisco","Singapore"),
    ]
)

net.show('mygraph.html')

The add_edge() function adds a single edge connecting two nodes, while the add_edges() function takes in a list of tuples connecting the various nodes.

The graph should now display the edges connecting the various nodes. Try dragging each node and see how it is pulled back after you release it:

Directed Graph

If you want a directed graph, you should set the directed parameter in the Network class:

net = Network(
    notebook=True,
    directed=True
)

If you modify the earlier code and rerun all the code snippets, you should now see a directed graph:

Modifying the physics of the graph

If you click and drag the nodes in the graph, you will notice that the nodes bounce around. When you release your mouse, the nodes snap back into their original positions. All this behaves very much like real balls (the nodes) connected by springs (the edges). You can customize the physics behind the graph (how the nodes snap back, the damping of the springs, etc.) using the repulsion() function. The following statement shows the default values of all the parameters in the repulsion() function:

net.repulsion(
    node_distance=100,
    central_gravity=0.2,
    spring_length=200,
    spring_strength=0.05,
    damping=0.09,
)

Here are the uses of the various parameters:

  • node_distance – This is the range of influence for the repulsion.
  • central_gravity – The gravity attractor to pull the entire network to the center.
  • spring_length – The rest length of the edges.
  • spring_strength – The strength of the edge springs.
  • damping – A value ranging from 0 to 1 of how much of the velocity from the previous physics simulation iteration carries over to the next iteration.

Source: https://pyvis.readthedocs.io/en/latest/documentation.html?highlight=repulsion
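To build some intuition for the damping parameter, here is a toy one-dimensional spring simulation in plain Python. It is only a sketch under my own assumptions (a single node on a single spring, with damping acting as friction on the velocity) and is not the solver that vis.js actually uses. The late_amplitude() function and its steps and dt arguments are hypothetical names of my own.

```python
def late_amplitude(damping, spring_strength=0.05, steps=200, dt=0.5):
    """Toy 1-D spring: return the largest displacement seen in the last 50 steps."""
    x, v = 1.0, 0.0  # start stretched by one unit, at rest
    peak = 0.0
    for step in range(steps):
        # the spring pulls the node back toward 0; damping slows it down
        a = -spring_strength * x - damping * v
        v += a * dt
        x += v * dt
        if step >= steps - 50:
            peak = max(peak, abs(x))
    return peak

default_damping = late_amplitude(0.09)  # the default value shown above
low_damping = late_amplitude(0.01)      # a much weaker damping

print(f"damping=0.09 -> late amplitude {default_damping:.3f}")
print(f"damping=0.01 -> late amplitude {low_damping:.3f}")
```

In this toy model, the node with the lower damping value is still swinging long after the other one has settled, which matches what you see in the graph when you lower damping.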

The best way to understand the use of the various parameters is to try them out. The following example sets the spring_length and damping parameters:

net.repulsion(
    spring_length=400,
    damping=0.01,
)
net.show('mygraph.html')

Here's what the graph looks like:

The following video shows how the graph behaves when the nodes are dragged and released:

You can also display a UI that lets you dynamically alter the physics of the graph by using the show_buttons() function:

net.show_buttons(filter_='physics')
net.show('mygraph.html')

The filter_ parameter also accepts a list containing any of the following options:

show_buttons(filter_=['nodes', 'edges', 'physics'])

If you want to show all filters, set it to True:

net.show_buttons(filter_= True)

I will leave it to you to try out the filters and see how they look and work.

Visualizing the Flights Delay Dataset

Now that you are familiar with the basics of using the pyvis package, we will use it to visualize the flights between the various airports in the 2015 Flights Delay dataset.

2015 Flights Delay Dataset (airports.csv). Source: https://www.kaggle.com/datasets/usdot/flight-delays. Licensing: CC0 (Public Domain)

First, load the flights.csv file into a Pandas DataFrame. Because this CSV file is large, I will only load the three columns that I need to do my work:

import pandas as pd
df = pd.read_csv('flights.csv', 
                 usecols = ["ORIGIN_AIRPORT", "DESTINATION_AIRPORT","YEAR"])

Once the dataframe is loaded, I will go ahead and count the number of flights from one airport to another:

df_between_airports = df.groupby(by=["ORIGIN_AIRPORT", "DESTINATION_AIRPORT"]).count()
df_between_airports = df_between_airports['YEAR'].rename('COUNT').reset_index() 
df_between_airports = df_between_airports.query('ORIGIN_AIRPORT.str.len() <= 3 & DESTINATION_AIRPORT.str.len() <= 3')
df_between_airports = df_between_airports.sort_values(by="COUNT", ascending=False)
df_between_airports

The resultant output is as shown:
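If the chain of calls above is hard to follow, here is the same idea as a sketch on a tiny, made-up DataFrame (the rows are invented for illustration and are not taken from flights.csv). I use a boolean mask instead of query() for the length filter; the effect should be the same.

```python
import pandas as pd

# made-up sample rows, only for illustration
df = pd.DataFrame({
    "YEAR": [2015] * 6,
    "ORIGIN_AIRPORT": ["SFO", "SFO", "LAX", "SFO", "LAX", "10397"],
    "DESTINATION_AIRPORT": ["LAX", "LAX", "SFO", "JFK", "SFO", "LAX"],
})

# count the rows for each (origin, destination) pair
counts = (
    df.groupby(["ORIGIN_AIRPORT", "DESTINATION_AIRPORT"])["YEAR"]
      .count()
      .rename("COUNT")
      .reset_index()
)

# keep only rows where both airport codes have at most three characters
mask = (
    (counts["ORIGIN_AIRPORT"].str.len() <= 3)
    & (counts["DESTINATION_AIRPORT"].str.len() <= 3)
)
counts = counts[mask].sort_values(by="COUNT", ascending=False).reset_index(drop=True)
print(counts)
```

Each row of the result is one origin-destination pair with its number of flights, which is exactly the shape needed for the edges of the graph later on.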

As there are more than 4500 combinations of flights, let's only select the top 130 combinations:

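top = 130
# .copy() gives an independent DataFrame, so the assignment below does not
# trigger pandas' SettingWithCopyWarning
df_between_airports = df_between_airports.head(top).copy()
df_between_airports['COUNT'] = df_between_airports['COUNT'] / 5000
df_between_airports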

Notice that I divide the values in the COUNT column by 5000. Later on, I will use these values as the line width of the edges linking two airports, so they need to be reduced to a smaller range. The top 130 combinations are now as shown:
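Dividing by a fixed 5000 works for this dataset. If you reuse the idea with other data, one alternative (not what this article does) is to map the counts linearly onto a range of widths that you choose. The following is a minimal sketch; scale_to_range() is a hypothetical helper of my own and is not part of pyvis or pandas.

```python
def scale_to_range(values, lo=1.0, hi=10.0):
    """Linearly map values onto [lo, hi] (hypothetical helper, not part of pyvis)."""
    smallest, largest = min(values), max(values)
    if smallest == largest:
        # all values are equal: avoid dividing by zero, use the middle of the range
        return [(lo + hi) / 2 for _ in values]
    span = largest - smallest
    return [lo + (v - smallest) / span * (hi - lo) for v in values]

counts = [10, 20, 30]  # made-up flight counts
widths = scale_to_range(counts, lo=1, hi=5)
print(widths)
```

The advantage of this approach is that the thinnest and thickest edges always get the widths you picked, whatever the size of the raw counts.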

Next, I will sum up all the flights originating from each airport (remember that the counts were scaled down in the previous step):

node_sizes = df_between_airports.groupby('ORIGIN_AIRPORT')['COUNT'].sum()
node_sizes

The number of flights originating from each airport will be used as the size of its node. The larger the number of flights from an airport, the bigger the node:

ORIGIN_AIRPORT
ANC     1.2766
ATL    20.2544
BOS     6.3382
BWI     1.1674
CLT     1.2614
DAL     1.2524
DCA     4.0138
DEN    11.5638
DFW     5.5244
EWR     2.0252
FLL     2.5436
HNL     5.1544
HOU     1.2592
JAX     1.0192
JFK     6.1684
KOA     1.2694
LAS     6.8754
LAX    21.0822
LGA     7.3132
LIH     1.1710
MCO     2.7096
MIA     2.2936
MSP     2.3608
MSY     1.1186
OAK     1.2562
OGG     1.6626
ORD    12.6836
PHL     2.3876
PHX     7.2886
SAN     2.4130
SEA     7.3736
SFO    12.2998
SJC     1.2678
SLC     3.4424
SMF     1.1148
TPA     1.4166
Name: COUNT, dtype: float64

Plotting the graph

You can now plot the graph:

from pyvis.network import Network

net = Network(
    notebook = True,
    directed = True,            # directed graph
    bgcolor = "black",          # background color of graph 
    font_color = "yellow",      # use yellow for node labels
    cdn_resources = 'in_line',  # make sure Jupyter notebook can display correctly
    height = "1000px",          # height of chart
    width = "100%",             # fill the entire width    
    )

# get all the nodes from the two columns
nodes = list(set([*df_between_airports['ORIGIN_AIRPORT'], 
                  *df_between_airports['DESTINATION_AIRPORT']
                 ]))

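# extract the size of each airport; an airport that appears only as a
# destination has no entry in node_sizes, so fall back to 0
values = [node_sizes.get(node, 0) for node in nodes]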

# extract the edges between airports
edges = df_between_airports.values.tolist()

# use this if you don't need to set the width of the edges
# edges = df_between_airports.iloc[:,:2].values.tolist()

# add the nodes, the value is to set the size of the nodes
net.add_nodes(nodes, value = values)

# add the edges
net.add_edges(edges)

net.show('flights.html')

The graph looks like the following:

Let's zoom in a little:

You can see that ATL and LAX have the most originating flights (they are the two largest nodes). You can highlight these two nodes by changing their color to red. To do so, iterate through all the nodes using the nodes attribute and examine the value of the value key. If the value is more than 20, set the node color to red using the color key:

...
...

# add the edges
net.add_edges(edges)

# color the nodes red if their count is more than 20
for n in net.nodes:
    if n['value'] > 20:
        n['color'] = 'red'

net.show('flights.html')

The ATL and LAX nodes are now in red:


Summary

In this article, you learned how to create an interactive network graph using the pyvis package. The most interesting aspect of the pyvis package is that it makes your network graph come to life. Interactive network graphs are ideal for social networks, corporate structures, or other networks where you want to visualize the relationships between entities. Have fun with pyvis and let me know the type of data you use with it!

Tags: Edges Interactive Network Graph Networkx Nodes Pyvis
