Time Series for Climate Change: Forecasting Large Ocean Waves

This is Part 3 of the series Time Series for Climate Change. List of articles:
- Part 1: Forecasting Wind Power
- Part 2: Solar Irradiance Forecasting
Ocean Wave Power

Ocean waves are a promising source of renewable energy.
Why ocean waves?
When renewable energy comes to mind, people usually think about solar or wind power. These are the most popular renewable energy sources. Yet, ocean waves have great potential due to their consistency.
We could harvest energy from ocean waves around 90% of the time, compared with about 20%-30% for solar or wind (see reference [1] for details). Solar technology, for example, only works during the daytime.
Besides its consistency, wave energy is also more predictable than the above two alternatives. The main limitation of ocean wave energy is the production costs. At present, these are higher relative to solar or wind.
From Wave to Electricity
Wave power is converted into electricity by devices called Wave Energy Converters (WECs). These devices are buoys that are placed on the surface of the ocean.
The energy output of ocean waves depends on their height, which varies over time. So, forecasting wave height is a key task for efficient energy production from ocean waves.
The height of waves is quantified by what is called the significant wave height. This quantity is defined as the average wave height, from trough to crest, of the highest one-third of the waves.
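As an illustration of this definition, here is a minimal sketch that computes the significant wave height from a list of individual wave heights. The function name `significant_wave_height` and the sample values are made up for this example; the buoy data used later already reports this quantity directly.

```python
def significant_wave_height(wave_heights):
    """Mean trough-to-crest height of the highest one-third of the waves."""
    if not wave_heights:
        raise ValueError("wave_heights must not be empty")
    # sort individual wave heights from highest to lowest
    ordered = sorted(wave_heights, reverse=True)
    # number of waves in the top third (at least one wave)
    n_top = max(1, len(ordered) // 3)
    top_third = ordered[:n_top]
    return sum(top_third) / n_top


# hypothetical individual wave heights, in meters
heights = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
hs = significant_wave_height(heights)
print(hs)  # the top third is [3.0, 2.5], so the mean is 2.75
```

Because only the highest third of the waves is averaged, this quantity is always at least as large as the plain mean wave height.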

When Waves Are Too Big – A second motivation
Forecasting the height of waves is important for other factors besides energy production.
Predicting impending large waves helps manage the safety of maritime operations. Accurate forecasts can help prevent coastal disasters and protect wave energy converters, which might need to be shut down to avoid damage from large waves.
The safety concerns are also related to the passage of vessels. A vessel requires a minimum depth of water for its movement. Large waves reduce the depth of water in their troughs, so this minimum may not be met. Forecasting therefore improves the efficiency of vessel movement, which reduces costs and improves the reliability of ports.
In summary, forecasting the height of ocean waves is important for several reasons:
- estimating energy production for managing the grid;
- managing maritime operations, including the passage of vessels.
Tutorial: Forecasting Large Waves
In the rest of this article, we'll develop a model to forecast impending large waves. You'll learn how to build a probabilistic forecasting model that estimates the likelihood of an event.
The full code is available on GitHub.
Data set
We'll use a real-world data set that is collected from a smart buoy that is placed on the coast of Ireland. Among other things, the data collected includes the significant wave height, the variable we want to forecast. This data is available in the source in reference [2].
After downloading the data, you can read it using the following code:
import pandas as pd
file = 'path_to_data/IrishSmartBuoy.csv'
# reading the data set
# skipping the second row of the file with skiprows
# parsing time column to datetime and setting it as index
data = pd.read_csv(file, skiprows=[1], parse_dates=['time'], index_col='time')
# defining the series and converting cm to meters
series = data['SignificantWaveHeight'] / 100
# resampling to hourly and taking the mean
# ('h' replaces the alias 'H', which is deprecated since pandas 2.2)
series = series.resample('h').mean()
Here's what the data looks like:

There are a few periods with missing data. Perhaps the buoy was under maintenance. Besides, there's a noticeable seasonal component. The height of waves is typically higher in the wintertime.
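To quantify the missing periods, one option is to measure the share of missing observations and the length of each gap. The sketch below uses a small synthetic hourly series in place of the buoy data; the values and variable names are assumptions for illustration only.

```python
import pandas as pd

# synthetic hourly series standing in for the buoy data (heights in meters)
index = pd.date_range("2022-01-01", periods=8, freq=pd.Timedelta(hours=1))
series = pd.Series([1.2, 1.4, None, None, None, 2.1, None, 1.9], index=index)

# overall share of missing observations
missing_share = series.isna().mean()

# label each run of consecutive equal values of the missing flag,
# then count the length of the runs that are missing
is_missing = series.isna()
run_id = (is_missing != is_missing.shift()).cumsum()
gap_lengths = is_missing[is_missing].groupby(run_id[is_missing]).size()

print(f"missing share: {missing_share:.2f}")
print(f"number of gaps: {len(gap_lengths)}")
print(f"longest gap: {gap_lengths.max()} hours")
```

Long gaps matter for the modelling step later on, since any window of lagged values that overlaps a gap is incomplete.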

The data distribution is right-skewed. The heavy tail represents large waves, which are important to predict.
There's no clear consensus on how to define a large wave. We define it as a wave at least 5 meters high. This threshold is shown in a yellow vertical line in the plot above.
So, the goal is to predict if future waves will exceed 5 meters. This problem can be framed as an exceedance probability forecasting task.
Primer on exceedance probability forecasting
This task matters in domains where extreme values have serious consequences. For example, in environmental sciences, models are used to predict the likelihood of natural disasters such as hurricanes or floods. Another example is engineering, where professionals use models to predict the chance of equipment failure.
From a machine learning standpoint, the main challenge is the lack of information about exceedance events. By definition, exceedance events are rare. Models need to cope with an imbalanced class distribution, where the vast majority of observations are non-event cases.
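The imbalance can be made concrete by turning the series into a binary exceedance indicator and computing the share of events. This is a minimal sketch on made-up wave heights; only the 5-meter threshold comes from this article.

```python
import pandas as pd

THRESHOLD = 5  # meters, the large-wave threshold used in this article

# hypothetical significant wave heights, in meters
heights = pd.Series([1.2, 2.8, 3.5, 5.4, 4.9, 1.1, 0.9, 6.2, 2.2, 1.7])

# binary indicator: 1 when the height exceeds the threshold, 0 otherwise
exceeds = (heights > THRESHOLD).astype(int)

# empirical exceedance frequency (share of event cases)
event_rate = exceeds.mean()

print(exceeds.value_counts().to_dict())
print(f"event rate: {event_rate:.0%}")
```

In this toy example 2 out of 10 observations are events. With a rate this low or lower, a model that always predicts "no event" would already be right most of the time, which is why plain accuracy is a poor yardstick for this task.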
Why probabilities?
The occurrence of large waves is difficult to predict. So, presenting forecasts as probabilities is useful to convey the uncertainty and limitations behind these predictions.
In general, probabilistic forecasting helps decision-making by improving the assessment of the risks associated with each possible action.
Building a model
We can use auto-regression to tackle this task.
The goal is to use recent past values of the height of ocean waves as explanatory variables. The target is a binary variable that represents whether a large wave will occur soon.
from sklearn.model_selection import train_test_split
# https://github.com/vcerqueira/tsa4climate/tree/main/src
from src.tde import time_delay_embedding
# using past 24 observations as explanatory variables
N_LAGS = 24
# using the next 12 hours as the forecasting horizon
HORIZON = 12
# forecasting the probability of waves above 5 meters
THRESHOLD = 5
# leaving last 20% of observations for testing
train, test = train_test_split(series, test_size=0.2, shuffle=False)
# transforming time series into a tabular format for supervised learning
X_train, Y_train = time_delay_embedding(train, n_lags=N_LAGS, horizon=HORIZON, return_Xy=True)
X_test, Y_test = time_delay_embedding(test, n_lags=N_LAGS, horizon=HORIZON, return_Xy=True)
# binary target: 1 if any value within the horizon exceeds the threshold
y_train = Y_train.apply(lambda x: (x > THRESHOLD).any(), axis=1).astype(int)
y_test = Y_test.apply(lambda x: (x > THRESHOLD).any(), axis=1).astype(int)
In this case, we set the forecasting horizon to 12. So, at each instant, we aim to predict the likelihood of a large wave within the next 12 hours.
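To show what the tabular transformation does, here is a minimal sketch of a time delay embedding built with pandas `shift`. The function `embed_sketch` is a hypothetical stand-in written for this example, not the `time_delay_embedding` code from the repository, and its column names are assumptions.

```python
import pandas as pd


def embed_sketch(series, n_lags, horizon):
    """Hypothetical stand-in for time_delay_embedding (not the repository code)."""
    columns = {}
    # lagged values: t-(n_lags-1), ..., t-1, t-0 (the current value)
    for lag in range(n_lags - 1, -1, -1):
        columns[f"t-{lag}"] = series.shift(lag)
    # future values: t+1, ..., t+horizon
    for step in range(1, horizon + 1):
        columns[f"t+{step}"] = series.shift(-step)
    # rows with incomplete windows (or missing data) are dropped
    frame = pd.DataFrame(columns).dropna()
    X = frame.iloc[:, :n_lags]
    Y = frame.iloc[:, n_lags:]
    return X, Y


# toy series with a single large wave (6 meters)
toy = pd.Series([1.0, 2.0, 3.0, 6.0, 2.0, 1.0, 1.0, 1.0])
X, Y = embed_sketch(toy, n_lags=2, horizon=2)

# 1 if any value within the horizon exceeds 5 meters
y = (Y > 5).any(axis=1).astype(int)
print(pd.concat([X, Y, y.rename("target")], axis=1))
```

Each row pairs a window of past values with the values that follow it. The target is 1 only for the two rows whose future window contains the 6-meter wave.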
The target variable has an imbalanced distribution, where the majority of observations refer to normal waves:

Since the target variable is binary, we want to build a binary probabilistic classification model. For this case study, we use a random forest.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
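# random forest with shallow trees (max_depth=5)
model = RandomForestClassifier(max_depth=5)
model.fit(X_train, y_train)
# probability of the positive class (a large wave within the horizon)
probs = model.predict_proba(X_test)[:, 1]
# area under the ROC curve
print(roc_auc_score(y_test, probs))
# points of the ROC curve, one per decision threshold
fpr, tpr, thresholds = roc_curve(y_test, probs)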
After training the model, we can get probabilistic predictions using the .predict_proba method.
Binary probabilistic forecasts can be evaluated using the ROC (Receiver Operating Characteristic) curve. The idea is to plot the false positive rate (x-axis) against the true positive rate (y-axis) for different decision thresholds.
This leads to a curve like below:

The closer the curve gets to the top-left side the better. The diagonal dashed line is what should be expected from a random model.
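Each point on the curve corresponds to one decision threshold. The sketch below computes a few such points by hand on made-up labels and probabilities; the helper `roc_point` is written for this example and assumes that an event is predicted when the probability is at least the threshold.

```python
def roc_point(y_true, probs, threshold):
    """Return (false positive rate, true positive rate) for one threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, probs):
        predicted_event = prob >= threshold
        if predicted_event and label == 1:
            tp += 1
        elif predicted_event and label == 0:
            fp += 1
        elif not predicted_event and label == 0:
            tn += 1
        else:
            fn += 1
    return fp / (fp + tn), tp / (tp + fn)


# hypothetical labels (1 = large wave) and predicted probabilities
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.05, 0.10, 0.20, 0.40, 0.35, 0.60, 0.70, 0.90]

for threshold in (0.3, 0.5, 0.8):
    fpr, tpr = roc_point(y_true, probs, threshold)
    print(f"threshold={threshold}: FPR={fpr:.2f}, TPR={tpr:.2f}")
```

Lowering the threshold catches more events (higher true positive rate) at the cost of more false alarms (higher false positive rate). This is the trade-off the curve traces.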
The ROC curve is often summarised by the area under it (AUC). This metric quantifies how well the model distinguishes between the two classes: a value of 0.5 corresponds to a random model and 1 to a perfect one. The AUC for our model is 0.94, which is a decent score.
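One way to interpret the AUC is as the probability that a randomly chosen event receives a higher predicted probability than a randomly chosen non-event, with ties counted as one half. The sketch below computes it that way on made-up data; `pairwise_auc` is a helper written for this illustration and is only practical for small samples.

```python
def pairwise_auc(y_true, probs):
    """AUC as the share of (event, non-event) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# hypothetical labels (1 = large wave) and predicted probabilities
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.05, 0.10, 0.20, 0.40, 0.35, 0.60, 0.70, 0.90]

auc = pairwise_auc(y_true, probs)
print(f"AUC: {auc:.3f}")  # 13 of the 15 pairs are ranked correctly
```

Because it depends only on how the predictions are ranked, the AUC is not distorted by the class imbalance in the way plain accuracy is.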
Key Takeaways
- Ocean waves are a promising source of renewable energy;
- While this source is more consistent than solar or wind, production costs hinder its adoption;
- Large ocean waves are a concern for the safety of maritime operations, from coastal disasters to the passage of vessels. Large waves can also damage wave energy converters;
- Forecasting impending large waves is a useful task. Probabilistic forecasts are desirable because they convey the uncertainty of each prediction, which is important for decision-making;
- Advancements in forecasting models can help accelerate the adoption of energy production from ocean waves.
Thank you for reading, and see you in the next story!
References
[1] Drew, Benjamin, Andrew R. Plummer, and M. Necip Sahinkaya. "A review of wave energy converter technology." (2009): 887–902.
[2] Irish Wave Buoys (Licence: Creative Commons Attribution 4.0)