Death In The Himalayas

Author:Murphy  |  View: 26997  |  Time: 2025-03-22 23:08:18

Learning D3

I've been meaning to learn D3 for a while. To be honest, D3 has always been overkill for the types of problems I've worked on (where visualizing the data was just a means to an end, not the final product itself). As a Python developer I often use tools like matplotlib, plotly, seaborn, pandas (or geopandas), and bokeh to "get the job done". Recently, however, I've been spending time creating data visualizations just for fun, and it seemed like the perfect time to start learning D3.

In this article I'll show you how I created a graphic like the one above for 5 peaks (Everest, Ama Dablam, Cho Oyu, Lhotse, and Manaslu) using Python, D3, and Illustrator. I will go over:

  1. Inspiration.
  2. Getting the data.
  3. Initial data preparation.
  4. Selecting 5 peaks to visualize.
  5. Preparing the data for plotting.
  6. Creating an SVG with D3.
  7. Saving the SVG and importing into Illustrator.
  8. Working with the SVG in Illustrator.
  9. Adding final touches.
  10. Lessons learned.

1. Inspiration

This visualization was inspired by "Gisa's Timeline" created by Barbara Rebolledo. I was looking for a nonstandard way to visualize the number of deaths during Himalayan expeditions and thought Barbara's timeline looked interesting (and it provided the perfect excuse to use D3 since creating something like that in Python would've been a nightmare).


2. Getting The Data

The data I used was obtained from the Himalayan Database and is the same dataset I used for the article "Visualizing Everest Expeditions".

The Himalayan Database is a compilation of records for all expeditions that have climbed in the Nepal Himalaya.

Specifically, I extracted information on Himalayan expeditions by following the instructions on the Himalayan Database website. The dataset is a small CSV file (with a little under 11,200 rows) containing expedition records. Here are the first 5 rows:

       expid peakid  year  season  host            route1            route2 route3 route4      nation               leaders                                        sponsor  success1  success2  success3  success4 ascent1 ascent2 ascent3 ascent4  claimed  disputed     countries                                   approach   bcdate   smtdate  smttime  smtdays  totdays  termdate  termreason                                         termnote  highpoint  traverse    ski  parapente  camps  rope  totmembers  smtmembers  mdeaths  tothired  smthired  hdeaths  nohired  o2used  o2none  o2climb  o2descent  o2sleep  o2medical  o2taken  o2unkwn                           othersmts                                          campsites                           accidents achievment agency  comrte  stdrte  primrte  primmem  primref primid   chksum
0  ANN260101   ANN2  1960       1     1  NW Ridge-W Ridge               NaN    NaN    NaN          UK      J. O. M. Roberts                                            NaN      True     False     False     False     1st     NaN     NaN     NaN    False     False  India, Nepal           Marshyangdi->Hongde->Sabje Khola  3/15/60   5/17/60   1530.0       63        0     -   -           1                                              NaN       7937     False  False      False      6     0          10           2        0         9         1        0    False    True   False     True      False     True      False    False    False  Climbed Annapurna IV (ANN4-601-01)  BC(15/03,3350m),ABC(4575m),C1(5365m),C2(5800m)...                                 NaN        NaN    NaN   False   False    False    False    False    NaN  2442047
1  ANN269301   ANN2  1969       3     1  NW Ridge-W Ridge               NaN    NaN    NaN  Yugoslavia          Ales Kunaver                Mountaineering Club of Slovenia      True     False     False     False     2nd     NaN     NaN     NaN    False     False           NaN           Marshyangdi->Hongde->Sabje Khola  9/25/69  10/22/69   1800.0       27       31  10/26/69           1                                              NaN       7937     False  False      False      6     0          10           2        0         0         0        0    False   False    True    False      False    False      False    False    False  Climbed Annapurna IV (ANN4-693-02)  LowBC(25/09,3950m),BC(27/09,4650m),C1(27/09,53...  Draslar frostbitten hands and feet        NaN    NaN   False   False    False    False    False    NaN  2445501
2  ANN273101   ANN2  1973       1     1    W Ridge-N Face               NaN    NaN    NaN       Japan       Yukio Shimamura  Sangaku Doshikai Annapurna II Expedition 1973      True     False     False     False     3rd     NaN     NaN     NaN    False     False           NaN        Marshyangdi->Pisang->Salatang Khola  3/16/73    5/6/73   2030.0       51        0     -   -           1                                              NaN       7937     False  False      False      5     0           6           1        0         8         0        0    False   False    True    False      False    False      False    False    False                                 NaN  BC(16/03,3300m),C1(21/03,4200m),C2(10/04,5000m...                                 NaN        NaN    NaN   False   False    False    False    False    NaN  2446797
3  ANN278301   ANN2  1978       3     1    N Face-W Ridge               NaN    NaN    NaN          UK  Richard J. Isherwood                British Annapurna II Expedition     False     False     False     False     NaN     NaN     NaN     NaN    False     False           NaN        Marshyangdi->Pisang->Salatang Khola   9/8/78   10/2/78      NaN       24       27   10/5/78           4  Abandoned at 7000m (on A-IV) due to bad weather       7000     False  False      False      0     0           2           0        0         0         0        0     True   False    True    False      False    False      False    False    False                                 NaN                   BC(08/09,5190m),xxx(02/10,7000m)                                 NaN        NaN    NaN   False   False    False    False    False    NaN  2448822
4  ANN279301   ANN2  1979       3     1    N Face-W Ridge  NW Ridge of A-IV    NaN    NaN          UK           Paul Moores                                            NaN     False     False     False     False     NaN     NaN     NaN     NaN    False     False           NaN  Pokhara->Marshyangdi->Pisang->Sabje Khola    -   -  10/18/79      NaN        0        0  10/20/79           4             Abandoned at 7160m due to high winds       7160     False  False      False      0     0           3           0        0         0         0        0     True   False    True    False      False    False      False    False    False                                 NaN  BC(3500m),ABC,Biv1,Biv2,Biv3,Biv4,Biv5,xxx(18/...                                 NaN        NaN    NaN   False   False    False    False    False    NaN  2449204

3. Initial Data Preparation

I wanted to show the evolution of the number of deaths over time and, if possible, I also wanted to add information on summit success rate and death rate (these ended up being just small dots in the final visualization). I decided to focus on the following columns:

  • year – When did the expedition take place?
  • expid – The expedition ID (unique when combined with year).
  • peakid – Peak ID.
  • totmembers + tothired – The number of members in the expedition.
  • smtmembers + smthired – The number of members that summited.
  • mdeaths + hdeaths – The number of members that passed away.

Including expid is useful because it lets us look up additional information about an expedition whenever clarification is needed. For example, there are expeditions with 0 members. I assumed this was an error, but it's also possible that these records aren't expeditions at all and instead represent some other kind of entry. We can check by looking up some of these expeditions online. Take expedition ANNS7130, for example. In the dataset it appears to have 0 members. However, the Himalayan Database Online shows that it had exactly one member: Tomoyo Minegishi.

The expedition has one member.
Number of members listed as 0 in the "Total Mbrs" field.

Clearly there is an issue with the dataset and I decided to drop these records (expeditions with 0 members) from the analysis.

After dropping NaNs, grouping by year and peakid, counting the total number of members, summits, and deaths, and dropping any (year, peakid) combinations with 0 members (there were only 23 such combinations), this is what the data (exp_df) looked like:

>>> exp_df

   year peakid  no_summits  no_members  no_deaths  no_exped
0  1905   KANG           0           9          5         1
1  1907   KABN           0           2          0         1
2  1909   JONG           0           1          0         1
3  1909   LNPO           1           1          0         1
4  1910   KANG           0           1          0         1

  • year and peakid are defined as before.
  • no_members is the number of members.
  • no_summits is the number of members that summited.
  • no_deaths is the number of member deaths.
  • no_exped is the number of expeditions.
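The aggregation described above can be sketched with pandas as follows. This is a minimal sketch rather than the exact code I ran: the function name summarize_expeditions and the variable raw are hypothetical, and the demo rows are made up purely to exercise the logic. Only the column names come from the dataset.

```python
import pandas as pd

COUNT_COLUMNS = ["totmembers", "tothired", "smtmembers", "smthired", "mdeaths", "hdeaths"]


def summarize_expeditions(raw: pd.DataFrame) -> pd.DataFrame:
    """Aggregate expedition records into one row per (year, peakid)."""
    df = raw[["year", "expid", "peakid"] + COUNT_COLUMNS].dropna()
    # Members and hired staff are counted together
    df = df.assign(
        no_members=df["totmembers"] + df["tothired"],
        no_summits=df["smtmembers"] + df["smthired"],
        no_deaths=df["mdeaths"] + df["hdeaths"],
    )
    # Drop expedition records that list 0 members; this also removes any
    # (year, peakid) combination whose expeditions all list 0 members
    df = df[df["no_members"] > 0]
    return df.groupby(["year", "peakid"], as_index=False).agg(
        no_summits=("no_summits", "sum"),
        no_members=("no_members", "sum"),
        no_deaths=("no_deaths", "sum"),
        no_exped=("expid", "count"),
    )


# Made-up rows for illustration only (not taken from the Himalayan Database)
raw = pd.DataFrame({
    "year": [1960, 1960, 1969, 1970],
    "expid": ["A", "B", "C", "D"],
    "peakid": ["ANN2", "ANN2", "ANN2", "EVER"],
    "totmembers": [10, 5, 10, 0],
    "tothired": [9, 0, 0, 0],
    "smtmembers": [2, 0, 2, 0],
    "smthired": [1, 0, 0, 0],
    "mdeaths": [0, 1, 0, 0],
    "hdeaths": [0, 0, 0, 0],
})
summary = summarize_expeditions(raw)
```

In this toy example the two 1960 expeditions collapse into a single row, and the 1970 record disappears because it lists 0 members.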

There are 406 peaks left in the database. I thought picking just a few of them would be best for the type of plot I wanted to create.


4. Selecting 5 Peaks To Visualize

Looking at the number of expeditions for each peak I saw that just a handful of peaks contained the bulk of the expeditions since 1905:


>>> key_exp = exp_df.groupby(by='peakid')[['no_exped']].sum().reset_index()
>>> key_exp.no_exped.describe()

mean       27.485222
min         1.000000
25%         2.000000
50%         3.000000
75%         8.000000
max      2303.000000

75% of all peaks have had 8 or fewer expeditions since 1905 (at least according to the Himalayan Database)! For example, here are 10 peaks with only one expedition since 1905:

>>> key_exp.sort_values(by='no_exped', ascending=False).tail(10)

    peakid  no_exped
252   NALS         1  # Nalakankar South
68    DHEC         1  # Dhechyan Khang
186   KUML         1  # Khumbutse
343   SAUL         1  # Saula
342   SATO         1  # Sat Peak 
71    DOGA         1  # Dogari
340   SANK         1  # Sano Kailash
254   NAN2         1  # Nangamari II
75    DOR2         1  # Dorje Lakpa II 
129   HMLE         1  # Himlung East

I decided to focus on the 5 peaks with the most expeditions: Everest, Ama Dablam, Cho Oyu, Manaslu, and Lhotse.

>>> key_exp.sort_values(by='no_exped', inplace=True, ascending=False, ignore_index=False)
>>> key_exp.iloc[:5, :]

    peakid  no_exped
84    EVER      2303  # Everest
1     AMAD      1525  # Ama Dablam
45    CHOY      1350  # Cho Oyu
233   MANA       754  # Manaslu
210   LHOT       497  # Lhotse

5. Preparing The Data For Plotting

To make my life easier when plotting, I made a few changes to the DataFrame.

Create all year/peak combinations for the 5 chosen peaks

I chose to remove years before 1921 because there weren't any expeditions before that for the 5 chosen peaks. Adding every (year, peak) combination for these peaks introduces NaN values wherever a peak had no expeditions in a given year, and these can be replaced with 0:

   year peakid  no_summits  no_members  no_deaths  no_exped
0  1921   AMAD         NaN         NaN        NaN       NaN
1  1921   CHOY         NaN         NaN        NaN       NaN
2  1921   EVER         0.0        30.0        2.0       1.0
3  1921   LHOT         NaN         NaN        NaN       NaN
4  1921   MANA         NaN         NaN        NaN       NaN

At this point our dataset is a DataFrame with a little over 500 rows.

NOTE: Adding all year and peakid combinations was not strictly necessary, but I wasn't sure whether I wanted to include years and peaks with no expeditions in the visualization. I decided to leave all combinations in to start with and make a decision after seeing the visualizations.
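One way to build all the combinations is to reindex against the full product of years and peaks. The sketch below is an assumption about how this could be done, not necessarily how I did it: add_all_combinations is a hypothetical helper, the last year is inferred from the data, and the demo rows are made up except for the 1921 Everest row shown above.

```python
import pandas as pd


def add_all_combinations(exp_df, peaks, first_year=1921):
    """Return one row per (year, peakid) pair, filling missing pairs with 0."""
    subset = exp_df[exp_df["peakid"].isin(peaks) & (exp_df["year"] >= first_year)]
    last_year = int(subset["year"].max())
    # Every year in the range crossed with every chosen peak
    full_index = pd.MultiIndex.from_product(
        [list(range(first_year, last_year + 1)), sorted(peaks)],
        names=["year", "peakid"],
    )
    return (
        subset.set_index(["year", "peakid"])
        .reindex(full_index)  # missing pairs become NaN rows
        .fillna(0)            # no expeditions means 0 for every count
        .reset_index()
    )


# Small demo: only the 1921 Everest row mirrors the table above
demo = pd.DataFrame({
    "year": [1905, 1921, 1923],
    "peakid": ["KANG", "EVER", "AMAD"],
    "no_summits": [0, 0, 1],
    "no_members": [9, 30, 5],
    "no_deaths": [5, 2, 0],
    "no_exped": [1, 1, 1],
})
full = add_all_combinations(demo, ["EVER", "AMAD"])
```

This relies on each (year, peakid) pair appearing at most once in exp_df, which holds after the grouping step.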

Add "is_good_seas" flag

I added a column called is_good_seas ("seas" stands for "season") with values that will be set to True whenever a (year, peak) combination has at least one expedition but no deaths (i.e., is a "good" season):

     year peakid  no_summits  no_members  no_deaths  no_exped  is_good_seas
510  2023   AMAD        27.0       126.0        0.0       8.0          True
511  2023   CHOY         5.0         9.0        0.0       1.0          True
512  2023   EVER       677.0      1251.0       18.0      50.0         False
513  2023   LHOT       107.0       153.0        0.0      20.0          True
514  2023   MANA         8.0        44.0        0.0       5.0          True
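As a quick illustration, the flag boils down to a single boolean expression. The rows below are the 2023 values from the table above; the DataFrame name df is just a placeholder.

```python
import pandas as pd

# The 2023 rows shown above
df = pd.DataFrame({
    "year": [2023] * 5,
    "peakid": ["AMAD", "CHOY", "EVER", "LHOT", "MANA"],
    "no_deaths": [0.0, 0.0, 18.0, 0.0, 0.0],
    "no_exped": [8.0, 1.0, 50.0, 20.0, 5.0],
})

# A "good" season has at least one expedition and no deaths
df["is_good_seas"] = (df["no_exped"] > 0) & (df["no_deaths"] == 0)
```

Rows with no expeditions at all come out as False, which matches the zero-filled rows shown later in this section.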

Add "death rate" and "success rate" columns

"Success rate" is defined as succrate = no_summits / no_members, and "death rate" is simply deathrate = no_deaths / no_members. I added these rates as new columns in the DataFrame along with two other columns: a column flagging when the death rate was higher than 10%, and a column flagging when the success rate was higher than 70% (these are numbers I played with after creating a first draft of the visualization). This is what the DataFrame looked like at this point:

     year peakid  no_summits  no_members  no_deaths  no_exped  is_good_seas  deathrate  high_deathrate  succrate  high_succrate
200  1961   AMAD         4.0         5.0        0.0       1.0          True   0.000000           False  0.800000           True
450  2011   AMAD       284.0       402.0        1.0      79.0         False   0.002488           False  0.706468           True
466  2014   CHOY       231.0       328.0        0.0      45.0          True   0.000000           False  0.704268           True
477  2016   EVER       678.0       935.0        5.0      80.0         False   0.005348           False  0.725134           True
481  2017   CHOY        77.0       105.0        0.0       6.0          True   0.000000           False  0.733333           True
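Here is a small sketch of how these four columns could be computed. The helper add_rate_flags and its threshold parameters are hypothetical names; the one assumption worth flagging is that rows with 0 members get a NaN rate (and therefore False flags) instead of raising a division error.

```python
import numpy as np
import pandas as pd


def add_rate_flags(df, death_threshold=0.10, success_threshold=0.70):
    """Add deathrate, succrate, and their boolean "high" flags."""
    out = df.copy()
    # Treat 0 members as missing so the division yields NaN, not an error
    members = out["no_members"].replace(0, np.nan)
    out["deathrate"] = out["no_deaths"] / members
    out["succrate"] = out["no_summits"] / members
    # Comparisons against NaN evaluate to False
    out["high_deathrate"] = out["deathrate"] > death_threshold
    out["high_succrate"] = out["succrate"] > success_threshold
    return out


# Two rows from the table above plus one zero-filled row
demo = pd.DataFrame({
    "year": [1961, 2016, 1921],
    "peakid": ["AMAD", "EVER", "AMAD"],
    "no_summits": [4.0, 678.0, 0.0],
    "no_members": [5.0, 935.0, 0.0],
    "no_deaths": [0.0, 5.0, 0.0],
})
rates = add_rate_flags(demo)
```

The thresholds are keyword arguments because, as mentioned, they are numbers I tuned after seeing a first draft.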

Drop unnecessary columns, sort, and add time idx

Dropping unnecessary columns is not strictly necessary, but it helps keep things clean. To this end, I removed the no_summits, no_members, succrate, and deathrate columns. I also sorted by year and peakid (ascending) and added a temporal index (idx) to each peakid:

   year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx
0  1921   AMAD        0.0       0.0         False           False          False    0
1  1922   AMAD        0.0       0.0         False           False          False    1
2  1923   AMAD        0.0       0.0         False           False          False    2
3  1924   AMAD        0.0       0.0         False           False          False    3
4  1925   AMAD        0.0       0.0         False           False          False    4

The idx column serves the same conceptual purpose as the year column, but I thought it might be useful when plotting.
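A minimal sketch of the sort and index step, assuming every (year, peakid) combination is still present (so a per-peak running count is equivalent to year minus 1921). The toy DataFrame is illustrative only.

```python
import pandas as pd

# Toy data: two peaks, three consecutive years, deliberately unsorted
demo = pd.DataFrame({
    "year": [1922, 1921, 1923, 1923, 1921, 1922],
    "peakid": ["EVER", "AMAD", "EVER", "AMAD", "EVER", "AMAD"],
})

# Sort so each peak's rows are contiguous and chronological
demo = demo.sort_values(["peakid", "year"]).reset_index(drop=True)

# Running count within each peak: 0 for the first year, 1 for the next, ...
demo["idx"] = demo.groupby("peakid").cumcount()
```

Computing demo["year"] - 1921 directly gives the same result and keeps working even if rows are filtered out first.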

In the end I decided to remove records where there were no expeditions whatsoever (this is something I decided to do after creating a first draft of the plot, where I realized that including these records resulted in the plot looking too cluttered). After filtering, I dropped the no_exped column:

    year peakid  no_deaths  is_good_seas  high_deathrate  high_succrate  idx
37  1958   AMAD        0.0          True           False          False   37
38  1959   AMAD        2.0         False            True          False   38
40  1961   AMAD        0.0          True           False           True   40
57  1978   AMAD        0.0          True           False          False   57
58  1979   AMAD        1.0         False           False          False   58

Log-transform and normalize no_deaths

I wanted the line thickness of each square in the plot to represent the number of deaths relative to every other year and peak combination. In other words, I wanted peaks with fewer expeditions (and therefore fewer deaths) to be made up of thinner squares than peaks with more expeditions (and therefore more deaths). The problem is that, with raw counts, Everest's plot would be made up of nice thick squares while the plots for the other 4 peaks would be made up of thin (barely visible) squares. To address this issue, I decided to log-transform the number of deaths across all 5 peaks and years. I also normalized the non-zero values (after log-transforming) to be in the interval [0.5, 3] (because this value is meant to be used as the line thickness when plotting).

After log-transforming and normalizing I had something like this:

    year peakid  no_deaths  is_good_seas  high_deathrate  high_succrate  idx
37  1958   AMAD   0.000000          True           False          False   37
38  1959   AMAD   1.099531         False            True          False   38
40  1961   AMAD   0.000000          True           False           True   40
57  1978   AMAD   0.000000          True           False          False   57
58  1979   AMAD   0.500000         False           False          False   58

The no_deaths column values are in the interval [0.5, 3] or are equal to 0.
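The transformation can be sketched as a natural log followed by min-max scaling of the non-zero values. The helper scale_deaths is a hypothetical name, and the choice of min-max scaling is an assumption, although it does reproduce the values shown above (for example, 2 deaths maps to roughly 1.0995 when the largest count is 18).

```python
import numpy as np
import pandas as pd


def scale_deaths(no_deaths, lo=0.5, hi=3.0):
    """Log-transform non-zero counts and rescale them to [lo, hi]; zeros stay 0."""
    deaths = pd.Series(no_deaths, dtype=float)
    scaled = pd.Series(0.0, index=deaths.index)
    nonzero = deaths > 0
    logged = np.log(deaths[nonzero])
    span = logged.max() - logged.min()
    if span == 0:
        # All non-zero counts are identical: use the thinnest line
        scaled.loc[nonzero] = lo
    else:
        scaled.loc[nonzero] = lo + (hi - lo) * (logged - logged.min()) / span
    return scaled


# Raw death counts, including the smallest (1) and largest (18) non-zero values
widths = scale_deaths([0, 1, 2, 4, 7, 18])
```

The scaling is applied across all 5 peaks at once, so line thickness stays comparable between plots.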

Add peak name and split into 5 CSV files

The Himalayan Database has a table for mapping peakid to the peak name which I merged into the DataFrame:

   year peakid  no_deaths  is_good_seas  high_deathrate  high_succrate  idx      pkname
0  1958   AMAD   0.000000          True           False          False   37  Ama Dablam
1  1959   AMAD   1.099531         False            True          False   38  Ama Dablam
2  1961   AMAD   0.000000          True           False           True   40  Ama Dablam
3  1978   AMAD   0.000000          True           False          False   57  Ama Dablam
4  1979   AMAD   0.500000         False           False          False   58  Ama Dablam

I then split the data into 5 CSV files: ama_dablam.csv, cho_oyu.csv, everest.csv, lhotse.csv, and manaslu.csv, one for each peak:

# ama_dablam.csv (49 rows)
   year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx      pkname
0  1958   AMAD   0.000000       1.0          True           False          False   37  Ama Dablam
1  1959   AMAD   1.099531       1.0         False            True          False   38  Ama Dablam
2  1961   AMAD   0.000000       1.0          True           False           True   40  Ama Dablam
3  1978   AMAD   0.000000       1.0          True           False          False   57  Ama Dablam
4  1979   AMAD   0.500000       4.0         False           False          False   58  Ama Dablam

# cho_oyu.csv (53 rows)
    year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx   pkname
49  1951   CHOY   0.000000       1.0          True           False          False   30  Cho Oyu
50  1952   CHOY   0.000000       1.0          True           False          False   31  Cho Oyu
51  1954   CHOY   0.000000       2.0          True           False          False   33  Cho Oyu
52  1958   CHOY   0.500000       1.0         False           False          False   37  Cho Oyu
53  1959   CHOY   1.699062       1.0         False            True          False   38  Cho Oyu

# everest.csv (76 rows)
     year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx   pkname
102  1921   EVER   1.099531       1.0         False           False          False    0  Everest
103  1922   EVER   2.183097       1.0         False            True          False    1  Everest
104  1924   EVER   1.699062       1.0         False           False          False    3  Everest
105  1933   EVER   0.000000       1.0          True           False          False   12  Everest
106  1934   EVER   0.500000       1.0         False            True          False   13  Everest

# lhotse.csv (52 rows)
     year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx  pkname
178  1955   LHOT        0.0       1.0          True           False          False   34  Lhotse
179  1956   LHOT        0.0       1.0          True           False          False   35  Lhotse
180  1972   LHOT        0.0       1.0          True           False          False   51  Lhotse
181  1973   LHOT        0.0       1.0          True           False          False   52  Lhotse
182  1974   LHOT        0.5       2.0         False           False          False   53  Lhotse

# manaslu.csv (60 rows)
     year peakid  no_deaths  no_exped  is_good_seas  high_deathrate  high_succrate  idx   pkname
230  1950   MANA        0.0       1.0          True           False          False   29  Manaslu
231  1952   MANA        0.0       1.0          True           False          False   31  Manaslu
232  1953   MANA        0.0       1.0          True           False          False   32  Manaslu
233  1954   MANA        0.0       1.0          True           False          False   33  Manaslu
234  1955   MANA        0.0       1.0          True           False          False   34  Manaslu
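Splitting by peak is a short loop over a groupby. In this sketch the helper split_by_peak is a hypothetical name, and deriving the filename from pkname (lowercased, spaces replaced by underscores) is an assumption that happens to match the filenames listed above.

```python
from pathlib import Path

import pandas as pd


def split_by_peak(df, out_dir="."):
    """Write one CSV per peak and return the paths that were written."""
    paths = []
    for pkname, group in df.groupby("pkname"):
        # "Ama Dablam" -> "ama_dablam.csv"
        filename = pkname.lower().replace(" ", "_") + ".csv"
        path = Path(out_dir) / filename
        group.to_csv(path, index=False)
        paths.append(path)
    return paths


# Three rows taken from the tables above
demo = pd.DataFrame({
    "year": [1958, 1959, 1951],
    "peakid": ["AMAD", "AMAD", "CHOY"],
    "no_deaths": [0.0, 1.099531, 0.0],
    "idx": [37, 38, 30],
    "pkname": ["Ama Dablam", "Ama Dablam", "Cho Oyu"],
})
```

Calling split_by_peak(demo, "some_folder") would write ama_dablam.csv and cho_oyu.csv into that (existing) folder. Note that to_csv writes booleans as True/False, which is why the JavaScript later compares against the string "True".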

6. Creating An SVG With D3

I decided to create a plot for each peak separately. To make things specific, let's assume I'm creating the plot for Manaslu (the code will be reused for the other peaks by simply changing the path to the CSV file with the data).

Basic setup

I started by creating a bare bones HTML file called index.html (in the same folder as the CSV files created above) that includes the D3 library:




    
    
    Squares With D3

    
    




Next, I created an SVG container (you can think of this as the canvas on which the visual elements will be drawn) and started a script tag to hold the D3 code.

If you open index.html in a web browser, you should see a blank page.

Adding a background color

Adding a background color is easy: simply draw a rectangle with the desired color and make sure it covers the entire SVG. Specifically (omitting everything outside the script tag):



    
    

    

If you open index.html in a web browser, you should see this.

Now we're ready to start adding some data to our SVG.

Adding peak name

Let's add the name of the peak towards the top left of the SVG. First, define some constants:

  • x0 and y0: used to specify where to place the peak name.
  • blackColor: used to specify text color.

Then, load the CSV file, store the peak name as a variable called peakName, and add it to the SVG:



    
    

    

The SVG should now look like this:

IMPORTANT: If your plot is suddenly blank, the browser may be blocking manaslu.csv from loading because of its CORS policy. If this happens, open up a terminal and start a simple HTTP server (you can do this by typing python3 -m http.server, which serves the directory it is started from), then open localhost:8000/ in your browser, navigate to the index.html file, and open it.

Next, we'll draw the squares.

Logic for drawing squares

I wanted each square to be a closed path. The path would be composed of two vertical lines and two diagonal lines. Starting from the year 1921, I would decide whether or not to draw a square for that year (idx = 0) using the following logic:

  • Draw a red square if there were deaths that year. The line thickness should be determined by the no_deaths column.
  • Draw a black square with line thickness 0.25 if there were expeditions but no deaths (black squares represent "good" seasons) that year.
  • Don't draw anything for years with no expeditions (these years were removed from the DataFrame so enforcing this requirement is simple).
  • Then, take a step to the right and move on to the next year (idx = 1).
  • Repeat.

Drawing the squares

I started by defining a few additional constants to specify line lengths, the angle of the diagonal lines, step size when moving to the right, and the specific red color I wanted to use (omitting everything outside the script tag).

Next, iterate through every row in data (remember that rows are already sorted ascending by year/idx) and draw a square using the logic from the previous section. This can be done by adding the following code:

// Add this after the code for adding peak name
// and still inside d3.csv("manaslu.csv").then(data => {})

// Iterate through each row in the data for squares
data.forEach(row => {

    // Extract is_good_seas from CSV and convert to boolean
    const isGoodSeason = row.is_good_seas === "True";

    // Calculate key values for square coordinates
    const x = x0 + row.idx * translationStep;
    const x2 = x + diag_len * Math.cos((-angle) * (Math.PI / 180));
    const y2 = y0 + vert_len + diag_len * Math.sin(angle * (Math.PI / 180));

    // Draw square
    svg.append("path")
        .attr("d", d3.line().curve(d3.curveLinearClosed)([
            [x, y0],
            [x, y0 + vert_len],
            [x2, y2],
            [x2, y2 - vert_len],
        ]))
        .style("stroke", seasonColors[isGoodSeason])
        .style("stroke-width", isGoodSeason ? 0.25 : row.no_deaths)
        .style("fill", backgroundColor);
});
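If you want to sanity-check the geometry outside the browser, the same coordinate arithmetic can be mirrored in Python. This is only a sketch: square_corners is a hypothetical helper, and the constant values in the example call are made up (they are not the ones I used in the plot).

```python
import math


def square_corners(idx, x0, y0, vert_len, diag_len, angle_deg, step):
    """Return the four corners of one "square", in the order the path visits them."""
    x = x0 + idx * step                      # left edge moves right by one step per year
    theta = math.radians(angle_deg)
    x2 = x + diag_len * math.cos(theta)      # cos is even, so cos(-angle) == cos(angle)
    y2 = y0 + vert_len + diag_len * math.sin(theta)
    return [
        (x, y0),                # top of the left vertical line
        (x, y0 + vert_len),     # bottom of the left vertical line
        (x2, y2),               # bottom of the right vertical line
        (x2, y2 - vert_len),    # top of the right vertical line
    ]


# Made-up constants; with a 0-degree angle the shape is an axis-aligned rectangle
corners = square_corners(idx=2, x0=50, y0=100, vert_len=20, diag_len=10, angle_deg=0, step=5)
```

Both vertical sides always have length vert_len and both slanted sides have length diag_len, so each "square" is really a parallelogram whose slant is controlled by the angle.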

This is the SVG we have at this point:

Adding red/black dots for flagging death rate and success rate

We've finished drawing the red/black squares. However, I wanted to add a small black dot below each square whenever that year had a success rate greater than 70%, and a small red dot whenever that year had a death rate greater than 10%. Doing this is straightforward. Simply add this code after drawing the squares:

// Add this after the code for drawing the squares
// and still inside data.forEach(row => {})

// Check if "high_deathrate" is True, then add a red dot below the square
if (row.high_deathrate === "True") {
    svg.append("circle")
        .attr("cx", x2)
        .attr("cy", y2 + 10)
        .attr("r", 2.5)
        .style("stroke", redColor)
        .style("fill", backgroundColor);
}

// Check if "high_succrate" is True, then add a black dot below the square
const secondCircleOffset = row.high_deathrate === "False" ? 10 : 20;
if (row.high_succrate === "True") {
    svg.append("circle")
        .attr("cx", x2)
        .attr("cy", y2 + secondCircleOffset)
        .attr("r", 2.5)
        .style("fill", blackColor);
}

Note that I added some logic to take care of cases where both high_succrate == True and high_deathrate == True. Specifically, this line:

row.high_deathrate === "False" ? 10 : 20;

moves the black dot down whenever a red dot has already been drawn (it turns out this case never occurred, so I didn't get to see it in action).

This is what the final SVG looks like:

At this point we've finished our work with D3. We're now ready to save our SVG and start working with it in Illustrator.


7. Saving The SVG & Importing It Into Illustrator

Before we're able to work with the SVG in Illustrator we need to save it.

Saving the SVG

If you're using Chrome, you can right click on your SVG and click on "Inspect" to open Chrome developer tools:

Then, find the SVG element in the "Elements" tab of the developer tools, right click on it, and select Copy > Copy element:

Next, open a text editor and paste the contents. Save the file and make sure to use .svg as the file extension:

manaslu.svg

What if I'm not using Chrome?

Other browsers have similar functionality. However, if this doesn't work (for whatever reason) another option is to add a button to your HTML file that allows you to download the SVG when the button is clicked.

Opening the SVG in Illustrator

If you open manaslu.svg in Illustrator you might see something like this:

Honestly, I'm not sure why the background is black, but changing the color back to what it should be is easy (just three clicks):


8. Working With The SVG In Illustrator

Adobe Illustrator is a powerful vector graphics editor that allows users to create and manipulate digital artwork. Unlike presentation software such as PowerPoint, Illustrator is specialized for graphic design and illustration. Think of Illustrator as a digital canvas where you can create intricate designs, logos, icons, and illustrations with precision.

I won't go over the entire Illustrator process I followed but there are a few key things you can do in Illustrator that I want to highlight (to give you a sense of what's possible if you've never used Illustrator before).

Locking objects

I like to lock the background so that it can't be moved or modified. Simply select the background and go to Object > Lock > Selection. This is a great feature when there are elements that you absolutely don't want to be messing with.

Grouping objects

Like in PowerPoint, you can group objects in Illustrator. This is very useful because it helps you avoid accidentally moving squares independently and thus "fudging the data". Essentially, it helps prevent doing things like this:

Without grouping the squares and dots it's very easy to accidentally do something like this. Technically this is still possible to do even if you group the objects, so grouping helps but doesn't prevent accidentally messing up the data. It's important to keep this in mind and be careful when editing SVGs in Illustrator.

Selecting similar objects

Suppose I want to change the opacity of the fill of all squares to 20%, but I don't want to affect the opacity of the outline. Illustrator makes it very easy to do this. One way to achieve this effect is to select one of the squares, then go to Select > Same > Fill Color. This will select everything with the same fill color. Then you can edit the opacity of the fill color from the Appearance panel:


9. Final Touches

I'm a sucker for textures so I decided to open the Illustrator file and add a paper texture. The basic steps are as follows:

  1. Open the Illustrator file.
  2. Download the image of a texture (Unsplash has lots of free options).
  3. Convert the texture to black and white and adjust brightness and contrast to isolate the texture.
  4. Drag the texture on top of your image in Illustrator.
  5. Change the "transparency" mode to achieve the desired effect.

(Left) Original texture by Kiwihug on Unsplash. (Right) Desaturating texture and isolating the desired texture.

Exporting

Because I'm sharing the final image online and the image contains colors with opacity, I decided to export the image as a PNG. I chose the "Type Optimized" Anti-aliasing setting to help maintain sharpness in the text.

This is what the design looks like straight out of Illustrator:


The final images

Here's what the final image for Everest looks like:

If you're interested, the 5 final images are available on my website.


10. Lessons Learned

Not colorblind friendly

I shared the final visualizations with a friend and was quickly reminded that they have color vision deficiency (CVD)! This is what the visualizations probably looked like to them (depending on the type of CVD):

CVD type: Protanopia.
CVD type: Deuteranopia.

In hindsight, I should've picked a different color palette. Adobe Color provides excellent tools for constructing color palettes that are accessible to people with CVD:

The Adobe Color website shows you what a color palette looks like to people with different types of CVD and highlights potential issues.

Editing in a low brightness interface

In the past I've learned the hard way about the value of:

  1. A well-calibrated monitor.
  2. Being able to precisely control brightness (for consistency).

Unfortunately, I made the mistake of not having a look at the final graphic in Illustrator with a lighter interface background. This would've shown me that the image was a bit dark prior to exporting.

Things often look brighter when plotted against a dark background.

Final Comments

  • Creating the initial SVG with D3 made things a lot simpler than trying to create this kind of plot in Python directly.
  • The video Cleaning up a Python data visualization in Adobe Illustrator (pandas to ai2html) by Jonathan Soma seems to cover a lot of important ideas if you're interested in getting started with editing data-based graphics in Illustrator.
  • The entire code is available in this GitHub repo (there may be small differences from what was presented here).
