Chart Wars: Pie Chart vs. Sorted Radial Bar Chart

Author:Murphy | View: 26950 | Time: 2025-03-22 21:00:44

Quick Success Data Science

A sorted radial bar chart (by the author)

Bar charts may be the King of Charts, but let's face it, they're boring. Some popular alternatives, like pie charts, aren't much better. Here's a pie chart showing where Germany got its electricity in 2023:

Pie chart of German electricity sources in 2023 (by the author from Wikipedia)

This is boring as well. Flat and static. Worse, humans aren't great at discriminating by area, making this diagram hard to parse.

Now let's try a little thing called a sorted radial bar chart:

Radial bar chart of German electricity sources in 2023 (by the author from Wikipedia)

This chart is fetching and full of movement. Both the largest and smallest wedges draw the eye. You want to explore this chart!

Python's popular Matplotlib library produces radial bar charts by plotting bars on a polar axis. In this Quick Success Data Science project, we'll use Python, NumPy, pandas, and Matplotlib to produce a sorted radial bar chart of German energy sources.

Key Programming Subjects Covered:

Making a pandas DataFrame from a Python dictionary
Sorting a pandas DataFrame
Plotting polar coordinates with Matplotlib

The Code

We'll start by reproducing the previous figure. Next, we'll add grid lines for more precision. Then we'll select and highlight a single bar to draw attention to a specific energy source.

Importing Libraries

Start by importing the following libraries:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Preparing the DataFrame

Next, we'll enter open-source Wikipedia data into a Python dictionary, which we'll convert into a pandas DataFrame. A key step here is applying the pandas sort_values() method, as sorted values are required for making a sorted radial bar chart.

# Dictionary of German energy sources 2023:
sources = {'Brown Coal': 17.7, 
            'Hard Coal': 8.3, 
            'Natural Gas': 10.5,
            'Wind': 32,
            'Solar': 12.2, 
            'Biomass': 9.7, 
            'Nuclear/other': 5.1, 
            'Hydro': 4.5}

# Create pandas DataFrame:
df = pd.DataFrame(sources.items(), 
                  columns=['Source', 
                           'Contribution Percent'])

# Sort the DataFrame:
df = df.sort_values(by='Contribution Percent')
df = df.reset_index(drop=True)

# Create a custom color list (one color per row, in sorted order):
colors = ['blue', 'purple', 
          'black', 'green', 
          'orange', 'yellow', 
          'brown', 'skyblue']

# Add colors to DataFrame as a column:
df['Color'] = colors

df.head(10)

The German energy source DataFrame (by the author)
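
Because the colors are assigned by row position, it can help to confirm that the sort behaved as expected before plotting. The following is a minimal sketch, not part of the original project, that rebuilds the DataFrame and checks that the values are in ascending order and sum to roughly 100 percent. The 0.1 tolerance is an assumption.

```python
import pandas as pd

sources = {'Brown Coal': 17.7, 'Hard Coal': 8.3, 'Natural Gas': 10.5,
           'Wind': 32, 'Solar': 12.2, 'Biomass': 9.7,
           'Nuclear/other': 5.1, 'Hydro': 4.5}

df = pd.DataFrame(list(sources.items()),
                  columns=['Source', 'Contribution Percent'])
df = df.sort_values(by='Contribution Percent').reset_index(drop=True)

# Values should rise from the first row to the last:
is_sorted = df['Contribution Percent'].is_monotonic_increasing

# Percentages should sum to about 100 (the 0.1 tolerance is an assumption):
total = df['Contribution Percent'].sum()
sums_to_100 = abs(total - 100) < 0.1

print(is_sorted, sums_to_100)
print(df['Source'].tolist())
```

If either check prints False, the color list and the bar order will likely be out of step with the data.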

Plotting the Chart

Normal bar charts use coordinates of bar height (y-value) and bar position (x-value). Radial bar charts, on the other hand, use polar coordinates. The x-value becomes an angle (called theta), and the y-value becomes a radius, representing the bar height.

Although a radial bar chart looks much like a pie chart, the width of the bars is constant, and we'll calculate it with the equation:

2 * np.pi / len(df.index)

We're dividing 360 degrees (2π radians) by the number of energy sources, given by the length of the DataFrame's index. Every bar therefore spans the same angle, so radial bar charts still rely on bar height to convey their values, whereas pie charts rely on the area of each slice. This addresses the parsing problem.
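
To make the arithmetic concrete, here is a small sketch that works out the bar width and the bar positions for the eight energy sources and converts them to degrees. The variable name n_bars is introduced here for illustration only.

```python
import numpy as np

n_bars = 8  # Number of energy sources (rows in the DataFrame)
width = 2 * np.pi / n_bars  # Angular width of each bar, in radians

# Bar positions, using the same 1-based index scheme as the plotting code:
theta = [i * width for i in range(1, n_bars + 1)]

# Convert to degrees for easier reading:
width_deg = np.rad2deg(width)
theta_deg = np.rad2deg(theta).round(1).tolist()

print(width_deg)
print(theta_deg)
```

With eight sources, each bar spans 45 degrees, and the bar positions step around the circle in 45-degree increments, ending at 360 degrees. Since Matplotlib centers bars on their x-value by default, each bar extends half a width on either side of its theta.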

For a more attractive chart, we'll leave off the axis grid and add value annotations to the top of each bar. This way, the chart will cleanly communicate both relative and absolute values.

Here's the annotated plotting code:

# Constants:
LOWER_LIMIT = 0  # Start bar at center
LABEL_PAD = 0.8  # Separate label from bar
FIG_SIZE = (8, 8)

# Set the figure:
plt.figure(figsize=FIG_SIZE)
ax = plt.subplot(polar=True)
plt.axis('off')

# Set height and width of bars:
heights = df['Contribution Percent']
width = 2 * np.pi / len(df.index)  # Note: 2pi radians = 360 degrees.

# Create a list of indexes and 
# calculate theta angle for a polar plot:
indexes = list(range(1, len(df.index)+1))
theta = [i * width for i in indexes]

# Create a radial bar plot:
bars = ax.bar(x=theta, 
              height=heights, 
              width=width, 
              bottom=LOWER_LIMIT,
              linewidth=1, 
              edgecolor='grey', 
              color=df.Color)

# Loop through parameters and 
# set bar labels to df column values:
for bar, theta, label1, label2 in zip(bars, 
                                      theta, 
                                      df['Source'], 
                                      df['Contribution Percent']):

    # Combine df column values into a label
    # (such as, "Biomass 9.7%"):
    label = f"{label1} {label2}%"

    # Orient labels based on semicircle location:
    # Convert to degrees:
    rotation = np.rad2deg(theta)

    # Rotate label based on position:
    if np.pi/2 <= theta < 3*np.pi/2:
        alignment = 'right'
        rotation = rotation + 180
    else: 
        alignment = 'left'

    # Add label text:
    ax.text(x=theta, 
            y=LOWER_LIMIT + bar.get_height() + LABEL_PAD,
            s=label, 
            ha=alignment, 
            va='center', 
            rotation=rotation, 
            rotation_mode='anchor')    

# Set the title:
ax.set_title('2023 German Energy Sources', 
             y=0.9)

# Cite the data source:
ax.text(x=0.40, 
        y=0.87, 
        s='Source: Wikipedia', 
        fontstyle='italic', 
        fontsize=10,
        transform=ax.transAxes);

The sorted radial bar chart with no axis grid (by the author)
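
The label-orientation rule inside the loop can be easier to follow in isolation. Below is a minimal sketch that pulls it into a helper function. The name label_orientation is hypothetical and does not appear in the original code; the logic mirrors the if/else block in the listing.

```python
import numpy as np


def label_orientation(theta):
    """Return (rotation in degrees, horizontal alignment) for a label
    placed at angle theta (in radians) on a polar axis."""
    rotation = np.rad2deg(theta)
    if np.pi / 2 <= theta < 3 * np.pi / 2:
        # Left half of the circle: flip the text 180 degrees so it
        # does not read upside down, and anchor it on its right edge.
        return rotation + 180, 'right'
    # Right half of the circle: keep the rotation, anchor on the left edge.
    return rotation, 'left'


print(label_orientation(np.pi / 4))  # A bar in the right half
print(label_orientation(np.pi))      # A bar in the left half
```

Labels in the right half of the circle read outward from the bar, while labels in the left half are flipped so that they still read left to right.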

In this example, all the labels are black. To match the label color to the bar color, pass the color returned by bar.get_facecolor() to the color argument of the ax.text() method, like so:

    ax.text(x=theta, 
            y=LOWER_LIMIT + bar.get_height() + LABEL_PAD,
            s=label, 
            ha=alignment, 
            va='center', 
            rotation=rotation, 
            rotation_mode="anchor", 
            color=bar.get_facecolor())

Now the label and bar colors match, though at the expense of readability:

The sorted radial bar chart with colored labels (by the author)

Adding the contribution percent value to the bar annotations removed the need for an axis grid, resulting in a cleaner plot. In the next section, we'll add these grid lines to produce a plot suitable for scientific publications.

Adding Grid Lines

To add grid lines, we'll make the following edits to the previous plotting code:

Turn on the plot axis in the figure set-up block: plt.axis('on')

Add the following lines to the end of the code:

Turn off the polar grid and degree labels: ax.set_thetagrids([], labels=[])

Set the position of the radius labels: ax.set_rlabel_position(15)

With this data, Matplotlib's default tick placement draws grid lines at intervals of 5%, labeled at a 15-degree angle measured counter-clockwise from the x-axis.
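
If the grid intervals need to be pinned down rather than left to Matplotlib's defaults, a sketch along the following lines sets them explicitly with ax.set_rgrids(). The single placeholder bar, the 5-to-30 range, and the percent-sign labels are assumptions chosen to fit this dataset, not part of the original code.

```python
import matplotlib
matplotlib.use('Agg')  # Headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(polar=True)

# Placeholder bar so the radial axis has a data range (32 = largest value):
ax.bar(x=[np.pi / 4], height=[32], width=np.pi / 4)

# Draw radial grid lines every 5 percent, labeled along the 15-degree ray:
radii = np.arange(5, 35, 5)
ax.set_rgrids(radii, labels=[f'{r}%' for r in radii], angle=15)

# Hide the angular grid and degree labels, as in the tutorial:
ax.set_thetagrids([], labels=[])
```

Passing angle=15 to set_rgrids() has the same effect as calling ax.set_rlabel_position(15) separately.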

Since we no longer need to post the percent value in the label, we'll turn it off by passing s=label1 to the ax.text() method.

Here's the edited plotting code:

# Constants:
LOWER_LIMIT = 0  # Start bar at center
LABEL_PAD = 0.8  # Separate label from bar
FIG_SIZE = (8, 8)

# Set the figure:
plt.figure(figsize=FIG_SIZE)
ax = plt.subplot(polar=True)
plt.axis('on')

# Set height and width of bars:
heights = df['Contribution Percent']
width = 2 * np.pi / len(df.index)  # Note: 2pi radians = 360 degrees.

# Create a list of indexes and 
# calculate theta angle for polar plot:
indexes = list(range(1, len(df.index)+1))
theta = [i * width for i in indexes]

# Create a radial bar plot:
bars = ax.bar(x=theta, 
              height=heights, 
              width=width, 
              bottom=LOWER_LIMIT,
              linewidth=1, 
              edgecolor='grey', 
              color=df.Color)

# Loop through parameters and 
# set bar labels to df column values:
for bar, theta, label1, label2 in zip(bars, 
                                      theta, 
                                      df['Source'], 
                                      df['Contribution Percent']):

    # The source name (label1) alone serves as the label,
    # since the grid now conveys the percent values.

    # Orient labels based on semicircle location:
    rotation = np.rad2deg(theta)
    if np.pi/2 <= theta < 3*np.pi/2:
        alignment = 'right'
        rotation = rotation + 180
    else: 
        alignment = 'left'

    ax.text(x=theta, 
            y=LOWER_LIMIT + bar.get_height() + LABEL_PAD,
            s=label1, 
            ha=alignment, 
            va='center', 
            rotation=rotation, 
            rotation_mode='anchor')    

# Set the title and add text for the data source:
ax.set_title('2023 German Energy Sources', 
             y=0.9)
ax.text(x=0.40, 
        y=0.87, 
        s='Source: Wikipedia', 
        fontstyle='italic', 
        fontsize=10,
        transform=ax.transAxes);

# Set the background grid:
ax.set_thetagrids([], labels=[])

# Set the position of the radius labels:
ax.set_rlabel_position(15);

The sorted radial bar chart with the background grid displayed (by the author)

While busy, this display facilitates comparing bar heights around the circle.

Highlighting Specific Bars

Sometimes you want to draw a viewer's eye to a specific data point. For example, how much natural gas does Germany use?

Let's highlight natural gas use by coloring that bar red for emphasis while coloring the others grey. We'll build a bar_colors list using a conditional expression inside a list comprehension, which we then pass to the ax.bar() method. Here's the code:

# Constants:
LOWER_LIMIT = 0  
LABEL_PAD = 0.8
FIG_SIZE = (8, 8)

# Set the figure:
plt.figure(figsize=FIG_SIZE)
ax = plt.subplot(polar=True)
plt.axis('off')

# Set bar colors to highlight the Natural Gas source:
bar_colors = ['red' if source == 'Natural Gas' else 'gray'
              for source in df.Source]

# Set height and width of bars:
heights = df['Contribution Percent']
width = 2 * np.pi / len(df.index)  # Note: 2pi radians = 360 degrees.

# Create a list of indexes and 
# calculate theta angle for polar plot:
indexes = list(range(1, len(df.index)+1))
theta = [i * width for i in indexes]

# Create a radial bar plot:
bars = ax.bar(x=theta, 
              height=heights, 
              width=width, 
              bottom=LOWER_LIMIT,
              linewidth=1, 
              edgecolor='white', 
              color=bar_colors)

# Loop through parameters and 
# set bar labels to df column values:
for bar, theta, label1, label2 in zip(bars, 
                                      theta, 
                                      df['Source'], 
                                      df['Contribution Percent']):

    # Combine df column values into a label
    # (such as, "Biomass 9.7%"):
    label = f"{label1} {label2}%"

    # Orient labels based on semicircle location:
    rotation = np.rad2deg(theta)
    if np.pi/2 <= theta < 3*np.pi/2:
        alignment = 'right'
        rotation = rotation + 180
    else: 
        alignment = 'left'

    ax.text(x=theta, 
            y=LOWER_LIMIT + bar.get_height() + LABEL_PAD,
            s=label, 
            ha=alignment, 
            va='center', 
            rotation=rotation, 
            rotation_mode='anchor')    

# Set the title and add text for the data source:
ax.set_title('2023 German Energy Sources', 
             y=0.9)
ax.text(x=0.40, 
        y=0.87, 
        s='Source: Wikipedia', 
        fontstyle='italic', 
        fontsize=10,
        transform=ax.transAxes);

The sorted radial bar chart with the new color scheme (by the author)

For additional emphasis, we could color the "Natural Gas" label red using code we reviewed previously.
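
One way to do this, sketched below, is to build a list of label colors with the same pattern used for bar_colors. The names HIGHLIGHT and label_colors are introduced here for illustration, and the source list is written out in sorted order so the snippet stands alone.

```python
# Source names in the same (sorted) order as the DataFrame rows:
sources = ['Hydro', 'Nuclear/other', 'Hard Coal', 'Biomass',
           'Natural Gas', 'Solar', 'Brown Coal', 'Wind']

HIGHLIGHT = 'Natural Gas'  # Source to emphasize

# Red for the highlighted source, black for everything else:
label_colors = ['red' if source == HIGHLIGHT else 'black'
                for source in sources]

print(label_colors)
```

In the plotting code, label_colors could be added to the zip() call in the labeling loop and each value passed to the color argument of ax.text().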

Summary

Sorted radial bar charts make for engaging and punchy infographics with more movement than traditional bar charts and pie charts. They are easier for humans to parse than pie charts because, like bar charts, they rely on length, rather than area, to communicate data values. They really raise the bar for visualizations (sorry).

When called on a polar axis, Matplotlib's bar() method produces radial bar charts. An angle is substituted for the bar's x-position and a radius for the bar height.

Like pie charts, radial bar charts work best with a limited number of categories. You also want to avoid anomalously large or small values.