Four Visualisation Libraries That Seamlessly Integrate With Pandas Dataframe

Author:Murphy  |  View: 24390  |  Time: 2025-03-23 11:52:44
Image Created in Canva by Author

Introduction

A few weeks ago, I wrote an article about using Pandas to plot its dataframes directly, without importing any data visualisation library. Under the hood, Pandas uses Matplotlib as the default "Plotting Backend".

You Don't Need Matplotlib When Pandas Is Enough for Data Visualisation

The Pandas Plotting Backend works like an API that other libraries can implement. So, apart from Matplotlib, several other excellent libraries can be set as the backend, which means they can act as the visualisation tool when we plot a Pandas dataframe directly.

In this article, I'll introduce four libraries that implement the Pandas plotting backend API, so they can plot directly from a dataframe without being imported explicitly.
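To make the "API" idea concrete, here is a minimal sketch of a toy backend. It assumes that Pandas accepts any importable module exposing a top-level plot function as a backend. The module name toy_backend and its text output are made up for illustration and are not part of any real library.

```python
import sys
import types

import pandas as pd

# Hypothetical toy backend: a module object exposing a top-level plot() function.
toy_backend = types.ModuleType("toy_backend")


def plot(data, kind, **kwargs):
    """Return a text summary instead of drawing a chart."""
    x, y = kwargs.get("x"), kwargs.get("y")
    return f"{kind} plot of {y} against {x} ({len(data)} rows)"


toy_backend.plot = plot
sys.modules["toy_backend"] = toy_backend  # make the module importable by name

df = pd.DataFrame({"Index": range(10), "Value": [x**2 for x in range(10)]})

# The backend keyword selects the backend for this single call only.
result = df.plot(x="Index", y="Value", kind="line", backend="toy_backend")
print(result)
```

Real backends follow the same pattern, except that their plot function builds an actual chart instead of a string.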

0. Preparation

As mentioned earlier, some Python data visualisation libraries integrate more tightly with Pandas dataframes because they support the Pandas Plotting Backend. We can usually switch the backend library by setting the following option, with the backend's name between the quotes.

pd.options.plotting.backend = ""
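Setting the option fails when the named backend cannot be found. The sketch below assumes Pandas raises a ValueError in that case, and wraps the assignment in a hypothetical helper called try_backend so that a script can keep its current backend instead of crashing.

```python
import pandas as pd


def try_backend(name):
    """Try to switch the plotting backend; return True on success."""
    try:
        pd.set_option("plotting.backend", name)
        return True
    except (ValueError, ImportError):
        # The backend could not be found, so the previous option is kept.
        return False


print(pd.get_option("plotting.backend"))  # the backend currently in use

# A deliberately non-existent backend name, to show the fallback path.
ok = try_backend("no_such_backend_xyz")
print(ok, pd.get_option("plotting.backend"))
```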

Before we can demonstrate the libraries, let's import the Pandas module and define a simple Pandas dataframe.

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'Index': range(10),
    'Value': [x**2 for x in range(10)]
})

1. Matplotlib

By default, the Pandas plotting backend is Matplotlib. That's exactly why we can directly plot a dataframe using Matplotlib.

df.plot(x='Index', y='Value', kind='line', title='matplotlib as backend');
Screenshot Created by Author

Matplotlib doesn't need to be introduced in detail. It is considered one of the most foundational data visualisation tools in Python. Its API is relatively low-level, so there is lots of room to customise the presentation, but that customisation usually takes more effort.
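As a small illustration of that extra room, df.plot returns a Matplotlib Axes object when Matplotlib is the backend, so we can keep customising the chart afterwards. This is only a sketch. The y-axis label, the grid styling and the non-interactive Agg renderer are my own choices so that it also runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; not needed inside a notebook

import pandas as pd

df = pd.DataFrame({"Index": range(10), "Value": [x**2 for x in range(10)]})

# backend="matplotlib" forces Matplotlib for this call, whatever the global option is.
ax = df.plot(
    x="Index",
    y="Value",
    kind="line",
    title="matplotlib as backend",
    backend="matplotlib",
)

# From here on, it is the plain Matplotlib Axes API.
ax.set_ylabel("Value (Index squared)")
ax.grid(True, linestyle="--", alpha=0.5)
```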

2. Plotly

This is my favourite data visualisation tool in Python. It supports the Pandas plotting backend straightforwardly. If you don't have it installed in your environment, make sure to do that before using it.

pip install plotly

Then, we just need to set the option to "plotly". We don't even need to import it before use.

pd.options.plotting.backend = "plotly"

df.plot(x='Index', y='Value', title='Plotly as backend')
Screenshot by the Author

The key feature of Plotly is its interactivity. The GIF below demonstrates only its selecting and panning gestures. There is more waiting for you to discover.

GIF created by the Author

Apart from that, in my opinion, Plotly is one of the easiest-to-use visualisation tools in Python. It supports a wide range of chart types through a high-level API. Most of the time, we just need to pass a few parameters to get a stunning graph. It is highly recommended if you have never tried it.

3. hvPlot

The hvPlot library is probably less well known. However, I found it very competitive with Plotly, and it is very interactive, too. Of course, before we can use it, the library needs to be installed.

pip install hvplot

Similarly, since hvPlot supports the Pandas plotting backend, we don't need to import it. The code below can plot directly from a Pandas dataframe.

pd.options.plotting.backend = "hvplot"

df.plot.line(x='Index', y='Value', title='hvPlot as backend')
Screenshot Created by Author

Apart from df.plot.line, hvPlot also supports most of the common chart types, such as scatter, bar, histogram, box plot and even violin charts. However, to enable more chart types like the violin, we do need to import hvplot.pandas as follows.

import hvplot.pandas

The full code snippet is as follows. Please note that the syntax changes from df.plot to df.hvplot in order to make use of the extended methods from hvPlot, and that display is available in notebook environments such as Jupyter.

import numpy as np
import hvplot.pandas

df = pd.DataFrame({
    'Group': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Values': np.random.randn(10)
})

violin_plot = df.hvplot.violin(y='Values', by='Group')
display(violin_plot)
Screenshot by the Author

4. Bokeh

Bokeh is one of the most popular and powerful visualisation libraries in Python. It is designed for interactive plots and is also famous for its beautiful presentation. Let's have a look at how to enable it as the Pandas plotting backend.

Of course, if you don't have it installed, please run pip to install it.

pip install bokeh

One thing to be aware of is that the latest Pandas (v2.1.4) seems incompatible with the latest Bokeh (v3.5.1). To avoid conflicts when using Bokeh as the Pandas plotting backend, we can downgrade Bokeh to v2.4.3 as follows.

pip install bokeh==2.4.3

This is why I don't recommend Bokeh as strongly, though it is a great library. Hopefully, this compatibility issue will be fixed later.
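Since the issue depends on which versions are installed, it can help to check them before deciding whether to downgrade. Here is a small sketch using only the standard library. The helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("pandas", "bokeh"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```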

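Once the library is installed, the code to plot a dataframe is as follows, and it is not much different from the others. Note that the backend name pandas_bokeh refers to the separate pandas-bokeh package, which also needs to be installed (pip install pandas-bokeh).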

pd.set_option('plotting.backend', 'pandas_bokeh')

df.plot_bokeh(x="Index", y="Value", kind="line", title="Bokeh as backend")
Screenshot Created by Author

While getting this to work, I found that the behaviour can vary between environments. If you find that the plot doesn't show, it might be necessary to explicitly ask Bokeh to output in a notebook environment. In that case, we have to import pandas_bokeh.

import pandas_bokeh
pandas_bokeh.output_notebook()

pd.set_option('plotting.backend', 'pandas_bokeh')
df.plot_bokeh(x="Index", y="Value", kind="line", title="Bokeh as backend")

Summary

Image Created in Canva by Author

In this article, I have introduced four data visualisation libraries that support the Pandas plotting backend. They are arguably among the easiest ones to integrate with Pandas dataframes. Apart from Matplotlib, which is the default, there are more interactive and beautiful libraries available, and Plotly, hvPlot and Bokeh are among the best. I hope this article helps you plot Pandas dataframes easily.
