Decoding Time: Unraveling the Power of LSTM vs. N-BEATS for Accurate Time Series Forecasting

Comparing how two deep learning models perform at short-term and long-term time series forecasting

Time series forecasting plays a pivotal role across many domains by enabling predictions of future trends. This article examines two prominent deep learning models, delving into their respective strengths and weaknesses.
LSTM (Long Short-Term Memory) is a specialised variant of the RNN, adept at capturing long-term dependencies in sequential data. It improves on traditional RNNs by mitigating the vanishing gradient problem, which allows extended dependencies to be modelled. LSTM achieves this by selectively retaining or forgetting information over time through memory cells and gating mechanisms: input, output, and forget gates. A practical limitation of the setup used below is that the model predicts one step ahead from a fixed-length input window, so the forecasting horizon is tied to how the input sequences are constructed during training, and longer horizons require feeding predictions back in recursively.
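For reference, the standard LSTM cell update can be written as follows, where σ is the sigmoid function, ⊙ denotes element-wise multiplication, x_t is the input and h_t the hidden state:
f_t = σ(W_f x_t + U_f h_{t−1} + b_f)    (forget gate)
i_t = σ(W_i x_t + U_i h_{t−1} + b_i)    (input gate)
o_t = σ(W_o x_t + U_o h_{t−1} + b_o)    (output gate)
c̃_t = tanh(W_c x_t + U_c h_{t−1} + b_c)  (candidate cell state)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t    (selective forgetting and retention)
h_t = o_t ⊙ tanh(c_t)    (hidden state)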

N-BEATS (Neural Basis Expansion Analysis for Time Series) is a non-recurrent architecture renowned for its ability to accurately forecast multiple time series. It is constructed from distinct building blocks arranged in a hierarchical structure. Each block produces two outputs: a "backcast", which reconstructs the historical input window, and a "forecast", which predicts future periods, and the two horizons may differ in size. Each block's backcast is subtracted from its input, so downstream blocks only have to model what earlier blocks could not explain.
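A minimal sketch of this doubly residual stacking, assuming each element of blocks is a callable returning a (backcast, forecast) pair (the names nbeats_stack, blocks and h are illustrative, not the neuralforecast API):
import numpy as np

def nbeats_stack(x, blocks, h):
    # x: input window (backcast horizon); h: forecast horizon length
    residual = np.asarray(x, dtype=float).copy()
    forecast = np.zeros(h)
    for block in blocks:
        backcast, block_forecast = block(residual)
        residual = residual - backcast        # pass on only the unexplained signal
        forecast = forecast + block_forecast  # accumulate the partial forecasts
    return forecast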



N-BEATS excels at modelling seasonality by decomposing the series into trend and seasonality components, akin to the approach taken by STL (Seasonal-Trend decomposition using LOESS). This decomposition allows the model to capture short-term fluctuations (seasonality) and long-term trends separately. In its interpretable configuration, it models seasonality with a Fourier basis expansion, i.e. sums of sine and cosine waveforms, a pivotal component in time series analysis. N-BEATS is designed to be robust to different types of seasonality patterns, both regular and irregular.
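Concretely, paraphrasing the seasonality model from the original N-BEATS paper, a seasonality block constrains its forecast to a Fourier basis, with H the forecast horizon and the coefficients a_i, b_i produced by the block's fully connected layers:
ŷ_t = Σ_{i=0}^{⌊H/2⌋−1} [ a_i cos(2πit/H) + b_i sin(2πit/H) ],   t = 0, …, H−1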
Practical example
To better understand the theoretical concepts discussed above, let's apply both models in a practical example. First, we'll generate synthetic data at a daily frequency, featuring an ascending trend and a weekly seasonality pattern.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

def white_noise(length, noise_level=1, seed=None):
    rnd = np.random.RandomState(seed)
    return rnd.randn(length) * noise_level

# Generate synthetic time series data
def generate_synthetic_data():
    # Create a time range with daily frequency
    start_time = pd.Timestamp("1990-01-01 00:00:00")
    end_time = pd.Timestamp("2023-12-31 23:59:00")
    time = pd.date_range(start=start_time, end=end_time, freq='D')
    # Weekly seasonality (7-day period) and ascending trend
    weekly_seasonality = 10 * np.sin(2 * np.pi * np.arange(len(time)) / 7)
    ascending_trend = 0.01 * np.arange(len(time))
    # Combine the components to generate synthetic data
    uplift = 100
    x = weekly_seasonality + ascending_trend + uplift
    noise_level = 5
    x += white_noise(len(time), noise_level, seed=42)
    return time, x
time_series_data = generate_synthetic_data()
data = pd.DataFrame({'ds': time_series_data[0], 'y': time_series_data[1]})
data = data.set_index("ds")
# Plot the generated synthetic data
plt.plot(data["y"])
plt.title("Synthetic Time Series Data")
plt.show()
The simulated data appears as follows:

The dataset exhibits seasonality, an ascending trend, and noise, which makes accurate forecasting challenging: many models specialise in identifying either long-term or short-term patterns, but not both. Predicting this data accurately therefore requires a model capable of capturing both types of patterns simultaneously. To facilitate this, the data is scaled to the [0, 1] range below, which helps the LSTM model train effectively.
from sklearn.preprocessing import MinMaxScaler

data = data.reset_index()
train_data = data[data.ds <= '2022-01-01'].copy()
test_data = data[data.ds > '2022-01-01'].copy()
# Normalize the data to the [0, 1] range
scaler = MinMaxScaler()
train_data['y'] = scaler.fit_transform(train_data[['y']])
test_data['y'] = scaler.transform(test_data[['y']])
# Create sliding-window sequences for the LSTM model
sequence_length = 10
train_sequences = []
test_sequences = []
for i in range(len(train_data) - sequence_length):
    train_sequences.append(train_data['y'].iloc[i:i+sequence_length].values)
for i in range(len(test_data) - sequence_length):
    test_sequences.append(test_data['y'].iloc[i:i+sequence_length].values)
train_sequences = np.array(train_sequences)
test_sequences = np.array(test_sequences)
# Prepare train and test targets (the value that follows each window)
train_targets = train_data['y'].iloc[sequence_length:].values
test_targets = test_data['y'].iloc[sequence_length:].values
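For intuition, here is what these window/target pairs look like on a toy series (illustrative only):
# Toy example: series [1, 2, 3, 4, 5] with sequence_length = 3 yields
#   window [1, 2, 3] -> target 4
#   window [2, 3, 4] -> target 5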
import time
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

start_time = time.time()
# Create and train an LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(sequence_length, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(train_sequences.reshape(-1, sequence_length, 1), train_targets,
          epochs=5, batch_size=32)
# Make predictions
test_predictions = model.predict(
    test_sequences.reshape(-1, sequence_length, 1))
print(time.time() - start_time)
# Inverse transform the predictions and targets to the original scale
test_predictions = scaler.inverse_transform(test_predictions).flatten()
test_targets = scaler.inverse_transform(test_targets.reshape(-1, 1))
The model for this dataset was trained in 23 seconds. The test set and predictions cover the period after January 1st, 2022. Here are the forecasted values:
# Plot the original data and LSTM predictions
plt.figure(figsize=(10, 6))
plt.plot(test_data['ds'].iloc[sequence_length:], test_targets,
         label="Actual Data", linestyle='-')
plt.plot(test_data['ds'].iloc[sequence_length:], test_predictions,
         label="LSTM Predictions", linestyle='--')
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.grid(False)
plt.show()


The forecasts capture the ascending trend and seasonality effectively, which is promising. However, the predictions carry a noticeable amount of noise, and as the forecast horizon extends further, deviations from the test data become more pronounced.
Now, let's assess the Weighted Absolute Percentage Error (WAPE), defined as:
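WAPE = Σₜ |yₜ − ŷₜ| / Σₜ |yₜ|
Since this series is strictly positive, the denominator reduces to the plain sum of the actuals, which is what the snippet below computes.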
test_predictions_df = pd.DataFrame(test_predictions, columns=["LSTM"])
test_targets_df = pd.DataFrame(test_targets, columns=["actuals"])
predictions = pd.concat([test_predictions_df, test_targets_df], axis=1)
wape = ((predictions['actuals'] - predictions['LSTM']).abs().sum()
        / predictions['actuals'].sum())
print(wape * 100)
WAPE = 1.89%
The results are indeed promising.
Next, let's explore fitting an N-BEATS model to the same dataset and examine its performance.
# Rebuild the unscaled series from the generator defined earlier
time_index, values = generate_synthetic_data()
data = pd.DataFrame({'ds': time_index, 'y': values})
# Create the unique_id column expected by neuralforecast (a single series here)
data['unique_id'] = 1.0
train_data = data[data.ds<='2022-01-01']
test_data = data[data.ds>'2022-01-01']
from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATS
import time

start_time = time.time()
horizon = len(test_data)
models = [NBEATS(input_size=2 * horizon, h=horizon, max_steps=50)]
nf = NeuralForecast(models=models, freq='D')
nf.fit(df=train_data)  # the default training loss is MAE
test_predictions = nf.predict().reset_index()
print(time.time() - start_time)
The N-BEATS model was trained in 83 seconds, which, although fast, is slower than the LSTM model.
Now, let's examine the predictions generated by the N-BEATS model.
# Plot predictions
predictions = test_data.merge(test_predictions, how='left',
                              on=['unique_id', 'ds'])
plt.figure(figsize=(10, 6))
plt.plot(predictions['ds'], predictions['y'], label="Actual Data",
         linestyle='-')
plt.plot(predictions['ds'], predictions['NBEATS'],
         label="NBEATS Predictions", linestyle='--')
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.grid(False)
plt.show()


wape = ((predictions['y'] - predictions['NBEATS']).abs().sum()
        / predictions['y'].sum())
print(wape * 100)
WAPE = 1.80%
Both models demonstrate high accuracy according to the chosen metric. Interestingly, the forecasts generated by N-BEATS do not seem to deteriorate as we approach the end of the forecasting horizon. This suggests that N-BEATS could be particularly suitable for projects requiring long-term forecasts.
Summary
N-BEATS and LSTM bring unique strengths to the forecasting domain. N-BEATS excels at capturing diverse patterns over longer horizons while remaining highly interpretable thanks to its trend and seasonality decomposition. LSTM, on the other hand, trains quickly and stably and achieves accurate predictions, particularly over shorter horizons.
References
*Unless otherwise noted, all images have been generated by the author
[1] Van Houdt, G., Mosquera, C., and Nápoles, G. (2020). A Review on the Long Short-Term Memory Model. Artificial Intelligence Review, 53. doi:10.1007/s10462-020-09838-1.
[2] Binte Habib, A. (2022). A Detailed Explanation of the Workflow of N-BEATS Architecture. doi:10.13140/RG.2.2.36379.34083.