Visualize Endangered Animal Populations with Python

Author: Murphy  |  Views: 22,842  |  Time: 2025-03-22 23:01:00

Quick Success Data Science

Original image by Mathias Appel (CC0 1.0 Universal)

Data Journalism is a field of journalism that uses data analysis, visualization, and interpretation to tell compelling and informative stories. Like data scientists, data journalists leverage data and statistical techniques to uncover trends, patterns, and insights within datasets. When done well, their work provides context and depth to news reporting.

In 2008, the World Wildlife Fund Japan ran an award-winning print campaign that used image pixelation to depict endangered species populations. Each image of an endangered animal was decimated until its number of pixels equaled the number of existing members of that species. The more pixelated and indistinct an image, the fewer animals remain.

This was a great way to draw attention to the plight of many species, and others have replicated the campaign. In this Quick Success Data Science project, we'll write code that lets you reproduce this award-winning technique using the Pillow fork of the Python Imaging Library (PIL).

We'll also refine the process to show both the original and altered picture of the animal. This design is useful when (sadly) the animal is no longer recognizable in the altered image. Here's an example using an endangered California condor:

California condor (public domain, USFWS)

Each pixel in the right-hand image represents a living condor. This is a powerful way to demonstrate how few remain.


Key Programming Subjects Covered

We'll cover three important coding skills in this project:

  • How to programmatically load images from the internet.
  • How to programmatically manipulate images with Pillow.
  • How to programmatically annotate images of different sizes.

What You'll Need

To run the code, you'll need the Pillow and Matplotlib third-party libraries, images of endangered animals, and a count of the number of each animal alive today.

Installing Libraries:

To install Pillow use:

python3 -m pip install --upgrade Pillow

or, if you're an Anaconda user:

conda install pillow

or

conda install -c anaconda pillow

To install NumPy and Matplotlib use:

pip install numpy matplotlib

or

conda install numpy matplotlib

Population Counts

Because population estimates vary widely, I've "triangulated" some numbers of my own using values reported by Wikipedia, the World Wildlife Fund, the Smithsonian, and several other sources. While not "official," these counts will let us examine a nice range of pixelation results.

  • Black-footed Ferret: 800
  • Mountain Gorilla: 1,000
  • Black Rhinoceros: 5,500
  • Red Panda: 10,000

Determining the number of living specimens of a given species is difficult. While it's relatively easy to count the number of large mammals, such as elephants, tracking down every little ferret is a challenge. Reporting is also a problem, as some sources count only animals in the wild, while others include animals in zoos and breeding centers.

Digital Images

A good place to find images is Wikimedia Commons, a media repository of free-to-use images, sounds, videos, and other media. I've already selected some files, but if you want to use different animals, search for the animal in Wikimedia, select an image, click the Download button, and copy the File URL (highlighted in gray in the following figure).

Wikimedia Commons download dialog with File URL highlighted in gray (Original tiger image by B_cool courtesy of Wikimedia Commons (CC By 2.0 Deed))

The Process

The workflow is as follows:

  1. Create Python lists of the animal names, the image file URLs, and the number of remaining animals.
  2. Load the images from the internet with urllib.
  3. Use PIL to open, resize, and resample (pixelate) the images.
  4. Use PIL to concatenate the original and pixelated image.
  5. Use NumPy and Matplotlib to label and display the twinned images.

The Code

The following code was written in JupyterLab. If you're using JupyterLab, I suggest you disable scrolling in output cells, so you won't have to scroll to see the pixelated pictures. Right-click in a cell and select Disable Scrolling for Outputs. If you're using the classic Jupyter Notebook, select Cell > Current Outputs > Toggle Scrolling.

Importing Libraries

We'll use urllib to load the images from Wikimedia Commons, PIL to resize and pixelate images, Matplotlib to add text (labels) to the images, and NumPy to convert the pixel data of the image into a NumPy array for display using Matplotlib.

import urllib.request
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

Listing the Input Data

We need Python lists of the animal names, their image URLs, and the estimated number of each species alive in both the wild and in captivity. The items in these lists must share the same order. That is, if "Black-footed Ferret" is first in one list, its related items should be first in the other lists.

# List of endangered animals:
animals = ['Black-footed Ferret', 'Mountain Gorilla', 
           'Black Rhino', 'Red Panda']
# List of image file locations on Wikimedia Commons:
pics = [
    'https://upload.wikimedia.org/wikipedia/commons/c/cf/Mustela_nigripes_2.jpg',
    'https://upload.wikimedia.org/wikipedia/commons/5/50/'
    'Mountain_Gorilla%2CBwindi%2C_Uganda_%2815135296098%29.jpg',
    'https://upload.wikimedia.org/wikipedia/commons/3/35/'
    'Black_Rhino_%2815797036788%29.jpg',
    'https://upload.wikimedia.org/wikipedia/commons/8/81/'
    'Red_Panda_%2831350780004%29.jpg'
]

# List of number of specimens believed to exist:
remaining = [800, 1_000, 5_500, 10_000]

As mentioned previously, all the images are from Wikimedia Commons. Attributions are provided in the following table.

Image attributions (by author)

Defining a Function to Load Images

Since the workflow will involve using a loop to process multiple images, we'll first define a function to load the images, and then another to manipulate them.

The following function takes a URL as an argument, opens the address with urllib, loads the image using PIL's Image module, and then returns the Image object.

def load_image(url):
    """Open an image URL and return a PIL image object."""
    with urllib.request.urlopen(url) as url_response:
        img = Image.open(url_response)
    return img
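If `urlopen()` ever fails with an HTTP 403 error, it's often because the server rejects urllib's default "Python-urllib" user agent. Here's a hedged variation of the loader that passes an explicit header — the function signature and header string are my own illustrative additions, not part of the original workflow:

```python
import urllib.request
from io import BytesIO

from PIL import Image


def load_image(url, user_agent="Mozilla/5.0 (pixelation-demo)"):
    """Open an image URL with an explicit User-Agent and return a PIL Image."""
    # Some hosts reject urllib's default agent with HTTP 403; a browser-like
    # header usually avoids this. The string above is just an illustration --
    # use one that identifies your own script.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        # Copy the bytes into memory so the image outlives the connection:
        return Image.open(BytesIO(response.read()))
```

Because the function reads the whole response into a `BytesIO` buffer, the returned image doesn't depend on the (now closed) network connection.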

Defining a Function to Manipulate the Images

Next, we define the workhorse function that will resample each animal image so that it has (approximately) the same number of pixels as the animals remaining. After that, we'll make a new image with the original image on the left and the pixelated image on the right. A detailed description follows the code.

def create_pixelated_image(original_img, num_remaining):
    """Return original and pixelated images side-by-side."""
    # Determine scalar for pixelating original image:
    ini_size = original_img.size
    ini_num_pixels = ini_size[0] * ini_size[1]
    scalar = (num_remaining / ini_num_pixels) ** 0.5

    # Downsample the original image:
    im_resample = original_img.resize((round(ini_size[0] * scalar),
                                       round(ini_size[1] * scalar)),
                                      resample=Image.NEAREST)

    # Resample back up to the original image size:
    im_rescale = im_resample.resize((ini_size[0],
                                     ini_size[1]),
                                    resample=Image.NEAREST)

    # Bundle the original and pixelated images:
    im_merge = Image.new('RGB', (2 * ini_size[0], ini_size[1]))
    im_merge.paste(original_img, (0, 0))
    im_merge.paste(im_rescale, (ini_size[0], 0))

    return im_resample, im_merge

We defined the function using arguments for the original image and the remaining number of animals. Next, we used PIL to get the initial image's size (ini_size), a tuple of width and height. We multiplied these together to get the total number of pixels, divided the number of remaining animals by this pixel count, and took the square root of the quotient to produce a scaling factor (scalar).

Next, we resampled the original image by resizing its width and height by the scalar. This resampled image will be too small to easily see, so we rescaled it back to the size of the original image using "nearest neighbor" sampling, which preserved its pixelated nature.
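As a quick sanity check on that math, here's the scaling calculation for a hypothetical 1200 × 800 photo and a population of 800 (the numbers are illustrative, not from the article's datasets):

```python
# Hypothetical input: a 1200 x 800 photo and 800 remaining animals.
ini_w, ini_h = 1200, 800
num_remaining = 800

ini_num_pixels = ini_w * ini_h                    # 960,000 pixels
scalar = (num_remaining / ini_num_pixels) ** 0.5  # ~0.0289

# Pillow needs whole pixels, so the new dimensions are rounded:
new_w, new_h = round(ini_w * scalar), round(ini_h * scalar)
print(new_w, new_h)   # 35 23
print(new_w * new_h)  # 805 pixels vs. 800 animals
```

Because of the rounding, the pixelated image ends up with 805 pixels rather than exactly 800, which is why the quality-control check later in the code is worth keeping.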

To compare the original image to the pixelated version, we bundled the images together. The first step was to create a new image, named im_merge, that is twice as wide as the original image. We then pasted the original image (original_img) to the left side of this new image and the rescaled pixelated image (im_rescale) to the right side.

NOTE: While it's considered best practice to write short functions that perform a single task, there are times when this can become awkward. For example, in the previous function, the ini_size and original_img variables are used throughout the process. Refactoring into multiple functions would require that these be either redundantly recalculated or repeatedly passed as arguments. (The latter can trigger "redefining name from outer scope" warnings from linters.) As all the tasks in this function fall under the umbrella of creating the final image, bundling them in the same function reduces both the amount of code you need to write and its complexity.

Building the Final Images with a Loop

To complete the code, we'll loop through the data in our three input lists (animals, pics, remaining) and use the built-in zip() function to link them together. A detailed explanation follows the code.

# Zip and loop through previous lists and load images:
for animal, pic, num in zip(animals, pics, remaining):
    im = load_image(pic)

    # Pixelate image and display with original image:
    im_resampled, im_merged = create_pixelated_image(im, num)

    # Add text and display using Matplotlib:
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.imshow(np.asarray(im_merged))

    # Hide x and y-axis labels and ticks:
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_xticklabels([])
    ax.set_yticklabels([])

    # Assign the text to overlay on the image:
    textstr1 = f"{animal}"
    textstr2 = f"{num:,} is too few"

    # Set the text box properties as a dictionary:
    props = {"boxstyle": 'round', 
             "facecolor": 'wheat', 
             "alpha": 0.7}

    # Place a text box in the upper left of each paired image:
    ax.text(0.01, 0.98, textstr1, transform=ax.transAxes, 
            fontsize=12, verticalalignment='top', bbox=props)
    ax.text(0.51, 0.98, textstr2, transform=ax.transAxes, 
            fontsize=12, verticalalignment='top', bbox=props)

    # Print stats and display the merged image:
    # print(f"\nResampled size: {im_resampled.size[0] * im_resampled.size[1]}")
    # print(f"{animal}: {num} remaining.")    
    plt.show()

After starting the for loop, we called our pre-defined functions. We first loaded an image from the list and then called the create_pixelated_image() function, which produced the final image. Note that we returned the resampled image (im_resampled) as an optional quality control step so that we can check the number of pixels versus the number of animals remaining (in the final print calls currently commented out).

While it's easy to display the images using PIL, it's not as easy to annotate them with attractive text, such as the animal's name and the number remaining. To facilitate this, we used Matplotlib to annotate and display the final images.

After setting up the figure (fig) and axes (ax) objects, we used NumPy's asarray() function to convert the PIL object into a NumPy array, then displayed it with Matplotlib's imshow() method. The latter displays data on a 2D regular raster, that is, as an image.

Next, we turned off the axis tick marks and labels, and assigned text variables for the name and the number of specimens remaining. To post them in attractive boxes, we created a dictionary of properties (props), that we passed to Matplotlib's text() method, used for posting text strings. The left-hand image will contain the animal's name, and the pixelated right-hand image will hold the number of specimens remaining.

Two important arguments to text() are transform=ax.transAxes and bbox=props. The first lets Matplotlib know that you want to use relative x and y coordinates (called axes coordinates). Resulting values range from 0 to 1, where 0.5 would be halfway across the image.

Using axes coordinates is the secret to posting text at the same relative place on each image. Otherwise, the text string will move around with changes to the x or y limits. For more on this, see the Transformations Tutorial.
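To see the difference, here's a minimal sketch (the data limits and text are arbitrary, chosen only for illustration): with transform=ax.transAxes, the label stays anchored near the top center no matter what the data limits are.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 500)  # in data coordinates, the point (0.5, 0.9) would
ax.set_ylim(0, 250)  # sit near the lower-left corner of these limits

# Axes coordinates ignore the data limits entirely:
label = ax.text(0.5, 0.9, "always near the top", transform=ax.transAxes,
                ha="center", va="top")
print(label.get_position())  # (0.5, 0.9)
```

Change the limits to anything you like and the text will keep its relative position — exactly the behavior we need for images of different sizes.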

The bbox argument tells Matplotlib how to draw the bounding boxes that hold the text. I chose to use boxes here so that the text would be apparent against the changing backgrounds of each image.

The final bit of code, currently commented out, is used to compare the number of pixels in the resampled (pixelated) image to the number of animals remaining. Because pixels have to be integers, sometimes these will be a little off, as shown in the table below.

Quality control table comparing the number of animals to the number of pixels (by author)

If you're not satisfied with the match, try cropping the image (width * height = number of pixels).
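Cropping with Pillow is a one-liner. In this sketch, a blank image stands in for a loaded photo, and the crop box is hypothetical — adjust it to frame your animal:

```python
from PIL import Image

# Stand-in for a loaded photo; crop() takes a (left, upper, right, lower) box.
img = Image.new('RGB', (1200, 800))
cropped = img.crop((150, 100, 1050, 700))
print(cropped.size)  # (900, 600) -> 540,000 pixels instead of 960,000
```

Passing the cropped image to create_pixelated_image() changes the pixel budget, and thus how closely the pixel count can match the population count.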

Here's the final display. It may take a few seconds for all the images to draw.

The final merged display (by author)

Ideally, you'll want to crop the image closely around the animal so you're not "wasting" pixels on the background. However, you'll need to balance this against loss of composition and "artistry."


Summary

The final images represent some very effective data journalism. It's sobering to think that every pixel in the resampled images represents one living animal. It's even more sobering to consider that some of these animals live in zoos, making the wild breeding population even smaller.

Thankfully, through the efforts of organizations like the World Wildlife Fund (WWF), the populations of some endangered animals – including the ones featured in the Population by Pixel campaign – have been increasing in recent years.

If you're a fan of data journalism, consider getting a subscription to The Economist magazine. Subscribers receive emails and other links that provide behind-the-scenes insights into how their data journalists operate, including how they design their weekly magazine covers. Affordable subscriptions can be found at discountmags.com.


Thanks!

Thanks for reading and follow me for more Quick Success Data Science projects in the future. To help me choose content, please scroll down and clap if you found this article useful or otherwise engaging.
