7 Steps to Help You Make Your Matplotlib Bar Charts Beautiful

Author:Murphy | View: 24323 | Time: 2025-03-23 19:11:48

Matplotlib horizontal bar chart after changing several features to make it more visually appealing. Image by the author.

Bar charts are a commonly used data visualisation tool where categorical features are represented by bars of varying lengths/heights. The height or length of the bar corresponds to the value being represented for that category.

Bar charts can easily be created in [matplotlib](https://matplotlib.org/). The library is often regarded as one that produces unexciting charts and can be challenging to work with. However, with perseverance, exploration, and a few extra lines of Python code, we can generate distinctive, aesthetically pleasing and informative figures.

If you want to see what matplotlib is capable of with a little bit of extra work, then you may be interested in checking out my previous article:

3 Unique Charts You Wouldn't Think Were Created with Matplotlib

Within this article, we will see how we can go from a boring figure like this:

Matplotlib horizontal bar plot displaying porosity values for different reservoir intervals. Image by the author.

To one that looks like this:

Before and after transforming our bar chart from a bland figure to one much more visually appealing. Image by the author.

We will see how we can improve the story we are trying to tell with a few simple extra lines of Python code.

Importing Libraries and Setting up Data

The first step is to import the libraries we are going to work with. In this case, we will be using pandas to store our data and matplotlib to create the figures.

import pandas as pd
import matplotlib.pyplot as plt

Next, we will create some data, which has been derived from the Xeek and FORCE 2020 Lithology Machine Learning competition. These data represent individual wells, with average porosity values for sandstone lithology within the Hugin Fm. These wells are located on the Norwegian Continental Shelf.

See the bottom of the article for further details on this dataset.

Rather than loading data from a CSV file, we can build the dataframe from a simple dictionary by passing its items to the pd.DataFrame() function.

wells_porosity = {'15/9-13': 18.2, '16/10-1': 26.0, 
 '16/10-2': 21.8, '16/10-3': 16.7, '16/2-16': 19.8,
 '25/2-13 T4': 13.3, '25/2-14': 11.6, '25/2-7': 10.7, 
 '25/3-1': 6.9, '25/4-5': 12.0, '25/5-1': 8.9,
 '25/5-4': 15.0, '25/6-1': 18.9, '25/7-2': 6.5, 
 '25/8-5 S': 21.2, '25/8-7': 26.1, '25/9-1': 23.0,
 '26/4-1': 13.9}

df = pd.DataFrame(wells_porosity.items(), columns=['well', 'porosity'])

Creating a Basic Bar Plot with Matplotlib

Now that we have our pandas dataframe set up, we can move on to creating our very first bar plot. There are a few ways to create a bar plot, one of which involves using the dataframe directly (df.plot(kind='bar'....)). However, for this article, we will focus on using matplotlib-focused code to build our plot.

To create a basic bar chart with matplotlib, we first need to set up our fig and ax variables, which will be set to plt.subplots(). Within this function, we can pass in the figure size.

Next, we will create a new variable called bars, and assign it to plt.bar(). Within this function, we can simply pass in our categorical variable, in this case the well names, and the average porosity values.

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.bar(df['well'], df['porosity'])

plt.show()

When it is run, we are presented with the following bar plot. As you can see it is very basic and not very appealing.

A basic barplot generated with matplotlib. Image by the author.

If we take a closer look at the plot, we will start to see more issues:

It is difficult to read the labels on the x-axis
We have to work our brains more to understand the values of each of the bars
It is difficult to compare bars

Let's see how we can create a much more effective and aesthetically pleasing visualisation.

1. Rotate Your Chart

The first step in improving our bar chart is to rotate it 90 degrees.

This makes it easier to read longer labels like the ones we have. Another option we could consider is rotating the labels on the x-axis. However, that requires readers to tilt their heads to read them.

Additionally, horizontal bar charts are a great way to save on space in a report or presentation whilst maintaining readability. This is especially useful if you have a large number of categories.

To rotate our bar chart, we have to change the plot type we are calling in matplotlib from .bar() to .barh().

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.barh(df['well'], df['porosity'])

plt.show()

What we get back is the following chart with the category labels (well names) in a much nicer and easier-to-read format.

We can now tell which bar belongs to what well.
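For comparison, the alternative mentioned earlier, keeping the vertical chart and rotating the x-axis labels, might look like the sketch below. It is only an illustration: it uses three of the wells from the data above and the variable name df_demo is my own, not part of the original code. Adding plt.show() at the end displays the figure as in the other snippets.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Three wells taken from the article's data, enough to show the idea
df_demo = pd.DataFrame({'well': ['15/9-13', '25/2-13 T4', '25/8-5 S'],
                        'porosity': [18.2, 13.3, 21.2]})

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(df_demo['well'], df_demo['porosity'])

# Rotate the x tick labels by 45 degrees instead of rotating the whole chart
ax.tick_params(axis='x', labelrotation=45)
```

The labels no longer overlap, but they are still harder to read than the horizontal labels produced by barh().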

2. Arrange Bars in Order

The next step for improving our plot is to sort the bars by their values. This can help improve the readability of our chart considerably.

Before applying any sorting to the data, you first need to consider if it is a sensible option.

If your bars relate to categories that have a natural order, such as days of the week, months of the year or age groups, then sorting the data from longest to shortest may not be the best option.
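When the categories do have a natural order, one possible approach is to declare that order explicitly and sort by the category rather than by the value. The sketch below uses made-up weekday data (the days dataframe and its visits column are hypothetical and not part of the porosity dataset) together with an ordered pandas Categorical.

```python
import pandas as pd

# Hypothetical data: values recorded per weekday, listed out of order
days = pd.DataFrame({'day': ['Wed', 'Mon', 'Fri', 'Tue', 'Thu'],
                     'visits': [7, 3, 9, 5, 4]})

# Declare the natural order once, then sort by the category, not the value
day_order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
days['day'] = pd.Categorical(days['day'], categories=day_order, ordered=True)
days = days.sort_values(by='day')

print(days)
```

The rows now follow the weekday order regardless of their values, and can be passed to the plotting call in that order.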

Ordering bars from longest to shortest can make bar charts easier to read by allowing the reader to easily compare the different bars. This is especially true when the bars are of similar lengths.

It also creates a more aesthetically pleasing chart to look at by giving the data a sense of order.

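To sort the data, we need to go back to the dataframe and sort the values by porosity. Because barh() draws the first row at the bottom of the chart, sorting in ascending order places the longest bar at the top.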

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.barh(df['well'], df['porosity'])
plt.show()

When we run the above code, we get the following plot returned.

Matplotlib horizontal bar plot displaying porosity values for different reservoir intervals in descending order. Image by the author.

3. Remove Spines and Axes

Unnecessary chart elements such as gridlines and borders (commonly known as "chart junk") can distract the reader and make the chart take longer to understand.

Removing this extra chart junk improves not only the readability of the chart, but also its aesthetics and the message we are trying to get across to the reader.

For our chart, we will remove the top, bottom and right edges of the chart by calling upon ax.spines[['right', 'top', 'bottom']].set_visible(False).

We will also hide the x-axis. You may be wondering why we would want to remove the numbers on the axis. Wouldn't that harm readability?

It would. However, in the next step we will see how to present the values in a way that is easier for the reader to understand.

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.barh(df['well'], df['porosity'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

plt.show()

When we run the above code, we get back the following plot.

This now looks much cleaner than the previous step.

Matplotlib horizontal bar plot after removing extra chart junk. Image by the author.
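The list-style indexing used above, ax.spines[['right', 'top', 'bottom']], is assumed here to need a reasonably recent matplotlib release (3.4 or newer, to the best of my knowledge). If that form raises an error in your environment, a simple loop over the individual spines should give the same result. The sketch below uses three wells from the data and is an illustration rather than the original code.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 8))
ax.barh(['25/7-2', '25/3-1', '25/5-1'], [6.5, 6.9, 8.9])

# Hide each spine individually instead of indexing with a list
for side in ['right', 'top', 'bottom']:
    ax.spines[side].set_visible(False)

# Hide the x-axis ticks and numbers, as in the step above
ax.xaxis.set_visible(False)
```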

4. Add Data Labels

In the above image, we removed the x-axis ticks and numbers. This does reduce readability. However, if the x-axis were to remain, we would be expecting our reader to do extra work when trying to understand the absolute values and compare the different bars.

To make the chart more effective, we can add data labels to each of the bars with the absolute values. This improves clarity, saves space, and improves precision.

To make adding labels easier on bar charts, the developers of matplotlib introduced the [bar_label()](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.bar_label.html) function in version 3.4. We pass in the bars returned by our plotting call, and it automatically adds a label to each one.

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.barh(df['well'], df['porosity'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars)

plt.show()

When the above code is run, we get the following chart. We can see the absolute values directly at the end of the bars, which improves readability significantly. For example, if we went on bar length alone for the top two bars, we would say they were the same. The labels, however, show that they are very slightly different (26.1 versus 26.0).

Matplotlib horizontal bar chart with data labels displayed to improve readability. Image by the author.

Controlling the Label Format of matplotlib's bar_label Function

The bar_label() function allows us to provide a number of keyword arguments.

In the example below, I have changed the font size, colour and font weight.

Also, in order to place the labels on the inside edge of the bar, we can call upon the padding parameter. If we use a negative number, the labels are shifted back inside the bars.

The fmt parameter allows us to control how the labels are displayed. Using %.1f%% means we are using 1 decimal place and including a % sign at the end of the label.
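To see what that format string does on its own, here is a small sketch that applies the same printf-style pattern by hand to three of the porosity values. The variable names are mine and purely illustrative.

```python
# The same printf-style pattern that is passed to fmt, applied by hand
fmt = '%.1f%%'

# %.1f formats the number with one decimal place; %% prints a literal % sign
labels = [fmt % value for value in [18.2, 26.0, 6.9]]

print(labels)  # ['18.2%', '26.0%', '6.9%']
```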

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,8))

bars = plt.barh(df['well'], df['porosity'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

plt.show()

When the above code is run, we get the following plot.

Matplotlib horizontal bar chart with data labels displayed on the inside edge of the bar to improve readability using bar_label. Image by the author.
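The negative padding above is a fixed offset, so it may need adjusting if the font size or the bar lengths change. One alternative worth trying is label_type='center', which places each label in the middle of its bar. The sketch below is an illustration under that assumption, using three wells from the data rather than the full dataframe.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.barh(['25/7-2', '25/3-1', '25/5-1'], [6.5, 6.9, 8.9])

# label_type='center' positions each label in the middle of its bar,
# so no offset has to be tuned by hand
labels = ax.bar_label(bars, label_type='center', fmt='%.1f%%',
                      color='white', fontsize=12, fontweight='bold')
```

bar_label() returns the text objects it creates, which can be handy if individual labels need further styling.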

5. Increase Spacing Between Bars

Another step in improving readability is to increase the space between the bars. It allows us to create a less cluttered and more aesthetically pleasing chart.

To increase the spacing, we first need to increase the height of our figure in the plt.subplots() function call, and then add the height parameter to the plt.barh() function.

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7)

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

plt.show()

When the plot is generated, we now have a chart that has a little more room to breathe and is easier on the eye.

Matplotlib horizontal bar chart after increasing the spacing between the bars. Image by the author.
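Each category occupies one unit along the y-axis, so a height of 0.7 leaves a gap of 0.3 units between neighbouring bars (the default height is 0.8). If the number of bars varies between charts, the figure height could also be derived from the bar count rather than typed in. The sketch below is one possible rule of thumb: the inches_per_bar value and the minimum height of 3 are my own assumptions, not settings from the original code.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Four wells taken from the article's data
df_demo = pd.DataFrame({'well': ['25/7-2', '25/3-1', '25/5-1', '25/2-7'],
                        'porosity': [6.5, 6.9, 8.9, 10.7]})

# Hypothetical rule of thumb: allow a fixed number of inches per bar,
# with a lower limit so small charts do not collapse
inches_per_bar = 0.65
fig_height = max(3, inches_per_bar * len(df_demo))

fig, ax = plt.subplots(figsize=(8, fig_height))

# height=0.7 leaves a 0.3 unit gap between neighbouring bars
bars = ax.barh(df_demo['well'], df_demo['porosity'], height=0.7)
```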

6. Choosing Colours for Bar Plots

Choosing colours for charts can be quite subjective and time-consuming. We ideally want to try and avoid overwhelming the reader with a rainbow palette of colours. Not only will the plot look poor, but it can hinder the readability and impact the message you are trying to get across.

There are a few ways we can use colour in our bar chart:

We can keep colours consistent, such as the blue in the previous charts
Use colour to draw attention to the top bar or bottom bar
Use colour to draw attention to a specific bar
Use colour to highlight bars that meet certain criteria
Use a colour that is associated with category branding, for example using blue for Facebook, and red for YouTube
Use colour to show groupings
Improve accessibility for readers with colour vision deficiencies

And there are many more ways.

Check out this article if you are looking for a tool to help with colour selection:

4 Essential Tools to Help You Select a Colour Palette for Your Data Visualisation

Let's have a closer look at a few of these different options of using colour in a matplotlib bar plot.

Drawing Attention to a Single Bar using Colour

If we want to draw the reader's attention to a specific bar, we can use the following code.

Rather than creating a list for our colours, we will add the colours directly to the dataframe using the apply function and a lambda function. Here we are highlighting a specific well.

well_name = "16/2-16"
highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

df['colours'] = df['well'].apply(lambda x: highlight_colour if x == well_name else non_highlight_colour)

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7, color=df['colours'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)
ax.yaxis.set_tick_params(labelsize=14)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

plt.show()

When the above code is run, we get the following plot. We can see that well 16/2-16 is highlighted in orange and immediately grabs your attention.

Matplotlib horizontal bar plot after applying colour to a single bar in order to draw the reader's focus. Image by the author.
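The same pattern can be adapted to another option from the list, drawing attention to the top bar, by comparing each value with the maximum instead of matching a well name. This is a sketch using four wells from the data. The df_demo and is_top names are illustrative and not part of the original code.

```python
import pandas as pd

df_demo = pd.DataFrame({'well': ['16/10-1', '25/8-7', '25/9-1', '25/7-2'],
                        'porosity': [26.0, 26.1, 23.0, 6.5]})

highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

# Compare every value with the maximum so only the top bar is highlighted
is_top = df_demo['porosity'] == df_demo['porosity'].max()
df_demo['colours'] = is_top.map({True: highlight_colour,
                                 False: non_highlight_colour})

print(df_demo)
```

Swapping max() for min() would highlight the bottom bar instead. The resulting colours column can be passed to the color argument of barh() in the same way as before.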

Apply Colour to Above a Cutoff Value

Another way to apply colour is by highlighting specific categories/bars that meet or exceed a cutoff.

In this example, we want to highlight the bars where porosity is 20% or greater. We could just let the readers use the labels to identify the bars. However, to make it easier and quicker for the reader, we can highlight those bars.

This is done using the apply function within pandas and a lambda function which checks if each value is greater than or equal to our cutoff value.

porosity_cutoff = 20
highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

df['colours'] = df['porosity'].apply(lambda x: highlight_colour if x >= porosity_cutoff else non_highlight_colour)

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7, color=df['colours'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)
ax.yaxis.set_tick_params(labelsize=14)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

plt.show()

When the code is run, we get the following plot, and our eyes are immediately drawn to the top 5 bars.

Matplotlib bar chart showing bars that are greater than a 20% porosity cutoff. Image by the author.

However, the reader might not know why these five bars are highlighted, so we can add a text annotation to help them.

porosity_cutoff = 20
highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

df['colours'] = df['porosity'].apply(lambda x: highlight_colour if x >= porosity_cutoff else non_highlight_colour)

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7, color=df['colours'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

ax.yaxis.set_tick_params(labelsize=14)

ax.axvline(x=20, zorder=0, color='grey', ls='--', lw=1.5)

ax.text(x=20, y=1, s='20% Porosity Cutoff', ha='center', 
        fontsize=14, bbox=dict(facecolor='white', edgecolor='grey', ls='--'))
plt.show()

Using matplotlib's ax.text and ax.axvline functions, we can add a label and a vertical cutoff line to explain why the top 5 bars are highlighted.

Matplotlib bar chart showing bars that are greater than a 20% porosity cutoff and after adding a text annotation. Image by the author.
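As an aside, the cutoff colouring does not have to go through apply and a lambda. A vectorised alternative is numpy.where, sketched below on three wells from the data. This swaps in a different technique from the one used in the article, and the df_demo name is illustrative.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({'well': ['16/2-16', '16/10-2', '25/6-1'],
                        'porosity': [19.8, 21.8, 18.9]})

porosity_cutoff = 20
highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

# Vectorised comparison: one colour where the condition holds, another elsewhere
df_demo['colours'] = np.where(df_demo['porosity'] >= porosity_cutoff,
                              highlight_colour, non_highlight_colour)

print(df_demo)
```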

Applying a Traffic-Light-Type Scale to a Bar Chart

If we have multiple cutoffs or target values, we can use a traffic-light-type scale to indicate where each bar falls. Caution should be exercised when using conventional red, green and yellow colours, as they are not suitable for everyone, especially readers with colour vision deficiencies.

In this example, we are going to have three colours, which have been picked from ColorBrewer 2.0, to indicate where we have wells with good, average and poor porosity.

Instead of using a lambda function like in the previous example, we can create a new function called bar_highlight and pass in three parameters: our actual value (value), the average value cutoff (average_value) and the good value cutoff (good_value).

We will then check the actual value against these cutoffs and assign a colour to it.

def bar_highlight(value, average_value, good_value):
    if value >= good_value:
        return '#1b9e77'
    elif value >= average_value:
        return '#d95f02'
    else:
        return '#7570b3'

#cutoff values
good = 20
average = 10

df['colours'] = df['porosity'].apply(bar_highlight, args=(average, good))

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7, color=df['colours'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

ax.yaxis.set_tick_params(labelsize=14)

ax.axvline(x=good, zorder=0, color='grey', ls='--', lw=1.5)
ax.axvline(x=average, zorder=0, color='grey', ls='--', lw=1.5)

ax.text(x=good, y=18, s=f'{good}% Porosity Cutoff', ha='center', 
        fontsize=14, bbox=dict(facecolor='white', edgecolor='grey', ls='--'))

ax.text(x=average, y=18, s=f'{average}% Porosity Cutoff', ha='center', 
        fontsize=14, bbox=dict(facecolor='white', edgecolor='grey', ls='--'))

plt.show()

To help the reader, we can add a new label and vertical line to indicate where these cutoff values are. In order to save typing the value multiple times, we can use f-strings in the calls to ax.text.

When we run the code, we get the following plot.

We can immediately see that the data has been split into three groups of colour, which helps tell our story to the reader.

Using colouring to indicate the performance of different bars to cutoff/benchmark values. Image by the author.
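When there are more than a couple of cutoffs, writing out every branch by hand can become tedious. One possible alternative to the bar_highlight function is pandas' cut function, which bins the values and assigns a label to each bin. The sketch below is an illustration of that approach, not the article's method, and the porosity values are made up to show what happens exactly at the cutoffs.

```python
import numpy as np
import pandas as pd

# Illustrative values, including two that sit exactly on the cutoffs
porosity = pd.Series([6.9, 10.0, 12.0, 20.0, 26.1])

good = 20
average = 10

# right=False makes each bin closed on the left, e.g. [10, 20),
# which matches the >= comparisons used in bar_highlight
colours = pd.cut(porosity,
                 bins=[-np.inf, average, good, np.inf],
                 labels=['#7570b3', '#d95f02', '#1b9e77'],
                 right=False).astype(str)

print(colours.tolist())
```

Converting the result with astype(str) gives plain colour strings that can be passed to the color argument of barh().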

7. Add a Title

If we carry forward our chart where we are using a single 20% porosity cutoff, we can polish off our chart by adding an informative title. This tells the reader directly what the chart is about.

Matplotlib horizontal bar plot after a title has been added. Image by the author.

We can do this simply by adding a call to ax.set_title in our code.

porosity_cutoff = 20
highlight_colour = '#d95f02'
non_highlight_colour = '#768493'

df['colours'] = df['porosity'].apply(lambda x: highlight_colour if x >= porosity_cutoff else non_highlight_colour)

df = df.sort_values(by='porosity')

fig, ax = plt.subplots(figsize=(8,12))

bars = plt.barh(df['well'], df['porosity'], height=0.7, color=df['colours'])

ax.spines[['right', 'top', 'bottom']].set_visible(False) 
ax.xaxis.set_visible(False)

ax.bar_label(bars, padding=-45, color='white', 
             fontsize=12, label_type='edge', fmt='%.1f%%',
            fontweight='bold')

ax.yaxis.set_tick_params(labelsize=14)

ax.axvline(x=20, zorder=0, color='grey', ls='--', lw=1.5)

ax.text(x=20, y=1, s='20% Porosity Cutoff', ha='center', 
        fontsize=14, bbox=dict(facecolor='white', edgecolor='grey', ls='--'))

ax.set_title('Wells With > 20% Porosity in the Hugin Formation', fontsize=16,
              fontweight='bold', pad=20)

plt.show()

Summary

Even though matplotlib appears daunting at first, it can be a very powerful library for creating effective visualisations.

With a few extra lines of code and the matplotlib library, we have seen how we can go from an ugly and boring bar plot to one that is more aesthetically pleasing to look at and helps tell a story to our readers.

Why not give these examples a try on your next bar chart?

I would love to hear in the comments about any tips that you have for working with matplotlib and making beautiful data visualisations.

Dataset Used in this Tutorial

The dataset used for this tutorial is a subset of a training dataset used as part of a Machine Learning competition run by Xeek and FORCE 2020.

Bormann, Peter, Aursand, Peder, Dilib, Fahad, Manral, Surrender, & Dischington, Peter. (2020). FORCE 2020 Well well log and lithofacies dataset for machine learning competition [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4351156

This dataset is licensed under a Creative Commons Attribution 4.0 International license.
