Step-by-Step Guide for Building Bump Charts in Plotly

Author:Murphy | View: 27435 | Time: 2025-03-22 19:23:55

Bump Charts in Plotly (Image by the Author)

Plotly is one of the most complete libraries for visualizing data in Python and, without a doubt, my favorite. It offers a wide range of predefined charts, from basic ones such as bar charts and pie charts to more specialized ones from statistics and data science, such as box plots and dendrograms.

The visualization options available in Plotly are immense; however, some chart types are not available in the library. This does not mean that we cannot build them. With a little ingenuity, and using the customization options Plotly provides, it is possible to create many visualizations that at first seem impossible. One of them is the bump chart.

This article will explain how to create bump charts using Plotly. Starting with a scatter plot and a little imagination and creativity, we will see that creating this type of visualization is easier than it seems.

1. Why Choose Bump Charts?

Bump charts, also known as ranking charts, are designed to explore changes in a ranking over time. This type of chart allows you to quickly identify trends, see which elements sit at the top or bottom of the ranking, and follow how their positions change.

This display is handy when you want to know the ranking position of different categories rather than the values themselves and you want to identify transitions in the ranking rapidly.

In addition, in this type of visualization, colors can be used to emphasize changes in the position of the elements and create a narrative around the visualization.

2. Step-by-Step Guide to Building Bump Charts in Plotly

This article explains step by step how to create the following bump chart. It shows the districts of Valencia ordered according to the average net income per person in 2015 and 2022. These are the earliest and most recent years for which data is available from the Spanish National Statistics Institute (INE).

Ranking of Valencia Districts by Income (Image by the Author)

Data Acquisition from the National Institute of Statistics

The data were obtained from the Spanish National Statistics Institute (INE), available at the following link (CC BY 4.0). The INE is responsible for producing the country's official statistics.

Indicadores de renta media y mediana

In the "Territorial units" section, select only the districts of Valencia. The indicator to select is the average net income per person, and the period is first 2015 and then 2022, one download per year.

Spanish National Statistics Institute Portal (Image by the Author).

Once the two data sets have been downloaded, they are read with Pandas.
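The reading code is not reproduced here. As a rough sketch only, assuming a semicolon-separated export with columns named District, Year, and Income (hypothetical names, and the income values below are invented placeholders), the step could look like this. An in-memory buffer stands in for the downloaded file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a downloaded INE file.
# Column names, the ";" separator, and the values are assumptions.
sample_2022 = io.StringIO(
    "District;Year;Income\n"
    "4625001 València district 01;2022;19.500\n"
    "4625002 València district 02;2022;21.250\n"
)

# With a real file, pass its path instead of the in-memory buffer.
df_2022 = pd.read_csv(sample_2022, sep=";")

# The "." is a thousands separator, but pandas parses it as a decimal
# point, so "19.500" is read as the float 19.5.
income_as_read = df_2022["Income"].tolist()
```

Reading the values this way is what makes the later multiplication by one thousand necessary.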

Average Net Income per Person in Valencia Districts in 2022 (Image by the Author)

Average Net Income per Person in Valencia Districts in 2015 (Image by the Author)

Data Transformation for Bump Chart Visualizations

The two downloaded datasets must be transformed before they can be used to build the bump chart. First, the district code (e.g. 01) is extracted from the names used for the districts (e.g. 4625001 València district 01). Next, the column with the average net income is multiplied by one thousand, because the point in the data is a thousands separator that was erroneously read as a decimal point. This column is then used to calculate the ranking of each district. Finally, the districts are sorted by ranking and the Year column is converted to object type.

The clean_and_rank_district_data function performs all of the above transformations. It is applied to the two data sets, which are then concatenated into a single data frame.
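The function body is not shown in this version of the article. The following is a minimal sketch of the four steps described above, not the original implementation. The column names (District, Year, Income, District_Code, Ranking) are assumptions, and the sample values are invented for illustration.

```python
import pandas as pd


def clean_and_rank_district_data(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the cleaning steps described above (column names assumed)."""
    out = df.copy()
    # Keep only the trailing two-digit district code, e.g. "01".
    out["District_Code"] = out["District"].str.extract(r"(\d{2})$", expand=False)
    # Undo the misread thousands separator: 19.5 becomes 19500.
    out["Income"] = (out["Income"] * 1000).round().astype(int)
    # Rank 1 is the district with the highest income.
    out["Ranking"] = out["Income"].rank(ascending=False, method="min").astype(int)
    out = out.sort_values("Ranking").reset_index(drop=True)
    # Store the year as a string rather than a number.
    out["Year"] = out["Year"].astype(str)
    return out


# Invented placeholder values, for illustration only.
names = ["4625001 València district 01", "4625002 València district 02"]
raw_2015 = pd.DataFrame({"District": names, "Year": [2015, 2015], "Income": [15.0, 17.5]})
raw_2022 = pd.DataFrame({"District": names, "Year": [2022, 2022], "Income": [19.5, 21.25]})

# Clean each year separately, then stack both into one data frame.
df = pd.concat(
    [clean_and_rank_district_data(raw_2015), clean_and_rank_district_data(raw_2022)],
    ignore_index=True,
)
```

Ranking each year separately before concatenating keeps the positions independent per year, which is what the chart compares.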

Cleaned Dataset: Average Net Income per Person and Ranking Position (Image by the Author)

Finally, a mapping must be made between the district number and its name to create a new column with the names of all the districts. The codes and names of the districts can be found on the official page of the Valencia City Council.

Districtes / Distritos

The create_district_mapping function reads the file and converts it into a mapping, where the keys are the codes and the values are the district names. This dictionary is used to add an extra column with the district name.
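As an illustration only, a mapping function of this kind could be sketched as follows. The file layout is an assumption: a semicolon-separated file with columns named code and name, which may differ from the council's actual file. The column District_Name is also a hypothetical name.

```python
import io

import pandas as pd


def create_district_mapping(file) -> dict:
    """Build {district code: district name} from a file with assumed columns."""
    # dtype=str keeps the codes as text so they can be zero-padded.
    districts = pd.read_csv(file, sep=";", dtype=str)
    codes = districts["code"].str.zfill(2)  # "1" becomes "01"
    return dict(zip(codes, districts["name"]))


# In-memory stand-in for the council's file; the layout is assumed.
sample_file = io.StringIO("code;name\n1;Ciutat Vella\n2;L'Eixample\n")
district_mapping = create_district_mapping(sample_file)

# Add the district name as an extra column via the mapping.
ranked = pd.DataFrame({"District_Code": ["01", "02"]})
ranked["District_Name"] = ranked["District_Code"].map(district_mapping)
```

Codes without an entry in the mapping would end up as missing values, so it is worth checking the new column for gaps.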

Cleaned DataFrame Including District Names (Image by the Author)

We now have the data ready to create our bump chart.

Creating a Scatter Plot to Visualize Rankings

The base visualization in Plotly to create the bump plot is a scatter plot. Each district of Valencia is a trace of this scatter plot. The direct legends with the district names are made using annotations. The annotations are also used to create the subtitle and footer of the visualization. The creation of the following chart is explained in detail below, showing also the code needed to create it.

The first step in creating the above visualization is to create each of the traces that are part of the graph, using a scatter plot. Each trace is composed of two points: (1) the district's ranking position in 2015 and (2) in 2022. The add_district_traces function is responsible for the creation of these traces. In the visualization, a custom hover tooltip has been added that shows each district's position in the ranking and its income.

Initial Bump Chart Schema (Image by the Author)

To obtain the display colors, the function get_custom_colors has been used. In this function, two different schemes have been defined, intended for a light or dark background. Each scheme consists of a list of colors. In this article, only a light background will be used, but it is interesting to define a color scheme for a dark background, in case you decide to use it for the display.
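A function of this shape could be sketched as below. The hex values are placeholders chosen for illustration, not the palette used in the article, and the background parameter name is an assumption.

```python
def get_custom_colors(background: str = "light") -> list:
    """Return a color list for the given background (hex values are placeholders)."""
    schemes = {
        # Darker tones that stay readable on a white background.
        "light": ["#1f77b4", "#d62728", "#2ca02c", "#9467bd", "#8c564b", "#7f7f7f"],
        # Brighter tones that stand out on a dark background.
        "dark": ["#8ecae6", "#ffb703", "#90be6d", "#f4a261", "#cdb4db", "#e5e5e5"],
    }
    if background not in schemes:
        raise ValueError(f"Unknown background {background!r}; use 'light' or 'dark'.")
    return schemes[background]


colors = get_custom_colors("light")
```

Keeping both schemes in one dictionary means switching themes only requires changing the argument.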

The previous visualization already has the main outline of a bump chart; however, several adjustments still need to be made. The most important one for reading the chart correctly is to place a direct label next to each trace, instead of using a legend with the district names. This makes it easier to identify which district each trace corresponds to.

Creation of Direct Labels Next to Each Trace

The direct labels are created together with the traces: after creating a trace, we create its label. To do this, we need to define the X and Y of the label. The X corresponds to the last year of the visualization, in this case 2022. Because the years are strings rather than integers, the X axis is treated as categorical and the position is given as an integer index starting from 0. Here, 2015 corresponds to X equal to 0 and 2022 to X equal to 1. The Y corresponds to the district's position in the ranking in 2022. Each direct label is drawn in the same color as its trace. In addition, since the legend is no longer necessary, it is removed from the plot.
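The positioning logic above can be sketched with plain annotation dictionaries, which Plotly accepts in the layout. The helper name build_direct_labels, the column names, the pixel offset, and the sample values are all hypothetical.

```python
import pandas as pd


def build_direct_labels(df: pd.DataFrame, color_by_district: dict,
                        last_year: str = "2022") -> list:
    """Return one annotation dict per district, placed right of the last year."""
    years = sorted(df["Year"].unique())
    x_index = years.index(last_year)  # 0 for "2015", 1 for "2022"
    last = df[df["Year"] == last_year]
    labels = []
    for row in last.itertuples(index=False):
        labels.append(dict(
            x=x_index, y=int(row.Ranking), xref="x", yref="y",
            text=row.District_Name,
            font=dict(color=color_by_district[row.District_Name]),
            # Anchor the text on its left edge and nudge it off the marker.
            xanchor="left", xshift=15, showarrow=False,
        ))
    return labels


# Invented placeholder values, for illustration only.
df = pd.DataFrame({
    "District_Name": ["District A", "District B", "District A", "District B"],
    "Year": ["2015", "2015", "2022", "2022"],
    "Ranking": [1, 2, 2, 1],
})
color_by_district = {"District A": "#1f77b4", "District B": "#d62728"}

labels = build_direct_labels(df, color_by_district)
```

The list can then be passed to the figure with fig.update_layout(annotations=labels, showlegend=False), which also removes the legend.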

Initial Bump Chart Schema with Direct Labels (Image by the Author)

The main schema of the visualization, together with the direct labels, is already created. Now all that remains is to finalize the details of the visualization.

Finalizing the Visualization Details

The final details are based on the creation of a title, subtitle, footer, and modification of the layout. Annotations have been used to create the subtitle and footer. Regarding the modification of the layout, (1) a white background has been set in the visualization, (2) the X and Y axis grids have been removed, (3) the Y axis has been reversed so that the districts that are in higher positions in the ranking are at the top of the visualization, and (4) finally, the font of the visualization has been modified. The result of applying all these changes is the final visualization.

We have now obtained our bump chart: interactive, elegant and, as you have seen, easy to make. In the next section, we will explain some modifications we can make to this chart to give it a more informative touch.

3. Customization of the Previously Created Base Bump Chart

As we know, Plotly offers many customization options. In this section, we will explore some of them in detail to create a new customized chart from the base visualization created earlier. We will evaluate 3 types of customizations; however, I invite you to be creative and try your own.

Adjusting the Marker Size in the Chart

The first adjustment we will make is to modify the size of the markers, which in the previous visualization was always constant. Now, we will adjust them according to the average income per person, which is the variable by which the districts are ranked. This can be done by making the marker size depend on the value of the Income column: marker=dict(size=district_data["Income"]/1000, opacity=1).

Ranking of Valencia Districts by Income with Marker Size Adjustment (Image by the Author)

Over these seven years, the average income per person in the districts has increased, as the markers for 2015 are smaller than those for 2022. However, the districts' positions in the ranking have hardly changed, which shows that, although income has increased over these years, the privileged and non-privileged districts remain the same.

Adding Annotations to Markers Indicating Ranking Position

Another possible adjustment to the base graph is adding a number indicating the ranking position. This makes it easier to see where each district ranks by average net income per person. The following graph shows the result after adding the annotations with the ranking position.

Ranking of Valencia Districts by Income with Annotated Markers Showing Position (Image by the Author)

The annotations with the position in the ranking make it easy to identify the location of each district. For example, the district of La Saidia occupied the 11th position in both 2015 and 2022.

To add annotations, an additional function has been created, which generates an annotation for each of the defined markers.
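That function is not reproduced here. One way such a helper could be sketched is shown below. The name build_rank_annotations, the column names, the font settings, and the sample values are hypothetical.

```python
import pandas as pd


def build_rank_annotations(df: pd.DataFrame) -> list:
    """One annotation per marker, showing the ranking position inside it."""
    # Map each year string to its index on the categorical X axis.
    year_index = {year: i for i, year in enumerate(sorted(df["Year"].unique()))}
    return [
        dict(
            x=year_index[row.Year], y=int(row.Ranking), xref="x", yref="y",
            text=str(row.Ranking), showarrow=False,
            font=dict(color="white", size=11),
        )
        for row in df.itertuples(index=False)
    ]


# Invented placeholder values, for illustration only.
df = pd.DataFrame({
    "District_Name": ["District A", "District B", "District A", "District B"],
    "Year": ["2015", "2015", "2022", "2022"],
    "Ranking": [1, 2, 2, 1],
})

rank_annotations = build_rank_annotations(df)
```

Adding each item with fig.add_annotation(**annotation) appends it to the figure without replacing annotations that are already there, such as the direct labels.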

Adjusting Trace Color Based on Ranking Changes

The last adjustment we will make consists of modifying the color of each trace according to the positions that the district has advanced or regressed in relation to the average income level in 2022 with respect to 2015. Districts that have not changed their position compared to 2015 are displayed in a neutral color, in this case, gray. Districts that have improved their position in the ranking are represented with bluish shades, while those that have worsened their position are shown with reddish shades. The intensity of the color adjusts to the change in position; for example, the greater the positive change in ranking, the darker the blue chosen to represent the district.

For this purpose, the function get_trace_color has been created. It returns an RGB string with the color the trace should have, which is applied to both the line and the markers: line=dict(color=trace_color), marker=dict(size=20, color=trace_color).
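A possible shape for get_trace_color is sketched below. The signature, the cap of three positions, and the RGB stops are assumptions for illustration, not the values used for the chart.

```python
def get_trace_color(rank_start: int, rank_end: int, max_change: int = 3) -> str:
    """Map a ranking change to an RGB string (the color stops are placeholders)."""
    change = rank_start - rank_end  # positive: the district moved up
    if change == 0:
        return "rgb(150, 150, 150)"  # neutral gray for no change
    # Intensity between 0 and 1, capped at max_change positions.
    intensity = min(abs(change), max_change) / max_change
    light, dark = 170, 20
    faded = round(light - (light - dark) * intensity)
    if change > 0:
        return f"rgb({faded}, {faded}, 200)"  # blue, darker for bigger gains
    return f"rgb(200, {faded}, {faded})"  # red, darker for bigger drops


improved = get_trace_color(17, 14)   # up three positions
worsened = get_trace_color(16, 18)   # down two positions
unchanged = get_trace_color(11, 11)  # same position in both years
```

Lowering the two non-dominant channels as the change grows is what makes larger moves appear darker.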

Ranking of Valencia Districts by Income with Trace Color Adjustments Based on Ranking Changes (Image by the Author)

Most of the districts maintain their position in the ranking seven years later. The district that has improved its position the most is Poblats Marítims, which goes from position 17 to 14. On the other hand, the district that has experienced the greatest drop in position is L'Olivereta, which moves from 16th to 18th position. However, as mentioned above, there are no significant changes in the ranking positions.

In this section, we have presented three possible customizations, but I invite you to be creative and explore all the options that Plotly offers to create your custom graphics.

4. Beyond Bump Charts: Exploring Additional Data Visualizations You Can Create

Calendar Charts

Bump Charts are not the only visualization not available in Plotly that you can create with a little ingenuity. Many other visualizations can be developed from the charts already in the library. For example, using heatmaps as a base visualization, it is possible to create calendars.

The following calendar, created in Plotly, shows all the holidays in Barcelona in 2024. As you can see, different colors have been used to represent working days, weekends, and holidays. The months have been represented in four columns and three rows; however, all these elements are customizable and you can adapt them according to your design criteria.

Barcelona 2024 Holidays Calendar (Image created by the author)

If you want to see all the steps and the code needed to create the above calendar, I recommend reading the following article, where I explain all the details step by step.

Step-by-Step Guide for Building Interactive Calendars in Plotly

Waffle Charts

Plotly does not offer a dedicated waffle chart either, but one can be created using heatmaps as a base visualization. In addition, visualizations in Plotly can also be customized in a dark theme, such as the following graph, where a waffle chart is used to show the proportion of the population with different educational levels in Barcelona.

Barcelona's Educational Landscape in Dark Theme (Image Created by the Author)

The following article explains step by step how to create waffle charts in Plotly. Different examples are shown, from the creation of a single waffle chart in the visualization to the creation of multi-plot waffle charts.

Step-by-Step Guide for Building Waffle Charts in Plotly

Hexagon Maps

Hexagon maps are another visualization that can be created from existing charts. This type of map is an interesting alternative to administrative choropleth maps, as it allows a better visualization of how a variable is distributed over a territory. In choropleth maps, the larger administrative areas tend to have a greater weight in the representation. Hexagon maps, by contrast, divide the territory into equal areas using a hexagonal grid. This allows a homogeneous representation of the variable throughout the territory and facilitates the detection of areas where data are concentrated.

The following hexagon map shows the distribution of hotels in Barcelona. The hexagons with more hotels are represented in the graph with reddish shades. On the contrary, the hexagons with few hotels are shown in light tones.

Hotel Distribution Hexagon Map of Barcelona City (Image created by the author)

The following article shows in detail all the steps to create the above visualization, including the code needed to perform it.

Constructing Hexagon Maps with H3 and Plotly: A Comprehensive Tutorial

Bump charts are widely used in presentations and the media, such as digital newspapers, as they allow you to quickly visualize not only the position in the ranking of different elements but also the changes that occur in the ranking over time. In this article, we have detailed step-by-step how to create these diagrams with Plotly. To do so, the average income per person in the different districts of Valencia in 2015 compared to 2022 has been visualized. Different ways to customize the created graphs have also been presented, to show the results with the design that best suits your tastes. This is a simple example, but it can serve as a basis for your future projects and visualizations.

Thanks for reading,

Amanda Iglesias
