Confidence Interval vs. Prediction Interval

Author: Murphy  |  2025-03-22 19:31:26

In many Data Science-related tasks, we want to know how certain we are about the result. Knowing how much we can trust a result helps us to make better decisions.

Once we have quantified the level of uncertainty that comes with a result we can use it for:

  • scenario planning to evaluate a best-case and worst-case scenario
  • risk assessment to evaluate the impact on decisions
  • model evaluation to compare different models and model performance
  • communication with decision-makers about how much they should trust the results

Uncertainty Quantification and Why You Should Care

Where does the uncertainty come from?

Let's look at a simple example. We want to estimate the mean price of a 300-square-meter house in Germany. Collecting the data for all 300-square-meter houses is not viable. Instead, we will calculate the mean price based on a representative subset.

And that's where the uncertainty comes from: the sampling process. We only have information about a subset or sample of a population. Unfortunately, a sample is never a perfect representation of the entire population. Thus, the true population parameter will differ from our sample estimate. This is also known as the sampling error. Moreover, depending on how we sample, the results will be different. Comparing two samples, we will get a different mean price for a 300-square-meter house.

If we want to predict the mean price, we have the same problem. We cannot collect all the population data that we would need. Instead, we must build our model based on a population subset. This results in a sampling uncertainty as we do not know the exact relationship between the mean price, i.e., the dependent variable, and the square meter, i.e., the independent variable.

Hence, we always have some uncertainty due to the sampling process. And this uncertainty we should quantify. We can do this by giving a range in which we expect the true value to lie. The narrower the range or interval, the more certain we are. (Assuming that the interval guarantees coverage.)

To quantify uncertainty, two concepts are often used interchangeably: the Confidence Interval and the Prediction Interval.

You will hear them often, as they are essential concepts in Statistics and thus in the field of data science. On a high level, both provide a probabilistic upper and lower bound around an estimate of a target variable. These bounds create an interval that quantifies the uncertainty.

However, from a more detailed point of view, they refer to two different things. So, we should not use them interchangeably. Interpreting a Confidence Interval as a Prediction Interval gives a wrong sense of the uncertainty. As a result, we could make wrong decisions.

This article will help you avoid this trap. I will show you what a Confidence Interval and a Prediction Interval measure. Based on that I will show you their differences and when to use which interval.

So, let's get started with the more famous and more widely used of the two.


Confidence Interval

A Confidence Interval quantifies the sampling uncertainty when estimating population parameters, such as the mean, from a sample. Hence, the Confidence Interval shows the uncertainty in the mean response, i.e., in our estimate of the population parameter.

But what does it mean?

Let's take the house price example. We want to estimate the mean price of a 300-square-meter house in Germany. Our population is all houses in this category. However, we cannot gather the data about all houses. Instead, we collect data for a few houses, i.e., our sample.

Then, we determine the Confidence Interval of our choice for the sample mean by

CI = x̄ ± z · s / √n

in which x̄ is the sample mean, z the z-score corresponding to the chosen confidence level (1.96 for 95 % and 2.576 for 99 %), s the sample standard deviation, and n the sample size.
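As a quick numerical sketch, the interval for the sample mean can be computed directly. The house prices below are made-up illustrative numbers, not real market data:

```python
import numpy as np

def confidence_interval(sample, z=1.96):
    """Return (lower, upper) bounds of the confidence interval
    for the sample mean: x_bar ± z * s / sqrt(n)."""
    x_bar = np.mean(sample)
    s = np.std(sample, ddof=1)        # sample standard deviation
    n = len(sample)
    margin = z * s / np.sqrt(n)
    return x_bar - margin, x_bar + margin

# illustrative prices (in €) for ten 300-square-meter houses
prices = np.array([450_000, 520_000, 610_000, 480_000, 700_000,
                   550_000, 630_000, 490_000, 580_000, 660_000])
lower, upper = confidence_interval(prices)
```

Note the `ddof=1`: we divide by n − 1 to get the unbiased sample standard deviation, since the population standard deviation is unknown.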

We can repeat this process for different samples of the population.

Okay, but how do we interpret the Confidence Interval?

A confidence level of 95 % means that if we repeat the sampling process many times, 95% of the intervals would contain the true population parameter. The confidence level refers to the long-run performance of the interval generation process. The confidence level does not apply to a specific interval. It does not mean there is a 95% chance that the true value lies in the interval of a single sample. This is also known as the frequentist approach.

Drawing different samples from a normal distribution and determining the 90 % Confidence Interval for the mean. Some Confidence Intervals do not contain the population mean (red columns). (Image by the author)

It is a very subtle but important difference. The 95% confidence level applies to the process of interval generation, not a specific interval.
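The long-run interpretation can be checked with a small simulation: draw many samples, build an interval for each, and count how often the true mean is captured. The population parameters and seed below are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
true_mean, true_std = 600_000, 150_000   # assumed population parameters
n, n_repeats, z = 30, 1_000, 1.96        # z = 1.96 for a 95% confidence level

covered = 0
for _ in range(n_repeats):
    sample = rng.normal(true_mean, true_std, size=n)
    margin = z * sample.std(ddof=1) / np.sqrt(n)
    if sample.mean() - margin <= true_mean <= sample.mean() + margin:
        covered += 1

coverage = covered / n_repeats   # approaches 0.95 in the long run
```

The coverage fraction refers to the process, not to any single interval, which is exactly the frequentist reading described above.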

Let's assume we have a 95% confidence interval of 400,000 € to 1,000,000 € for a 300-square-meter house in Germany.

We can expect that 95% of the samples we draw will contain the true mean value in their Confidence Interval. This statement emphasizes the long-run probability of capturing the true mean if you repeat the sampling and interval calculation process many times.

Yet, you often hear "We are 95% confident that the true population mean lies between 400,000 € and 1,000,000 €." This is technically incorrect and implies more certainty about a specific interval. But it gives us a general intuition as it is easier to interpret. The statement reflects that 95% of similarly calculated intervals would capture the true parameter.

What factors influence the width of the Confidence Interval?

Looking at the equation above, we can identify two factors: The population variance and the sample size.

The higher the population variance, the more our samples will vary. Hence, the sample standard deviation is larger, resulting in wider Confidence Intervals. This makes sense. Due to the higher variation, we can be less certain that the sampled parameter is close to the population parameter.

A larger sample size balances the effect of a few outliers, and the resulting estimates are more stable. Hence, we can be more certain and thus have a narrower Confidence Interval. This is also reflected in the above equation: with an increasing sample size, the denominator becomes larger, resulting in a narrower interval. In contrast, a small sample size results in wider Confidence Intervals. Fewer draws provide less information, and the estimates vary more, increasing the likelihood of a large sampling error.
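The 1/√n effect is easy to see numerically: quadrupling the sample size roughly halves the interval width. The population values below are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
z = 1.96  # 95% confidence level

# interval width is 2 * z * s / sqrt(n), so it shrinks as n grows
widths = {}
for n in (25, 100, 400):
    sample = rng.normal(600_000, 150_000, size=n)
    widths[n] = 2 * z * sample.std(ddof=1) / np.sqrt(n)
```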


Prediction Interval

A Prediction Interval quantifies the uncertainty of a future individual observation, given specific values of the independent variables and previously observed data. Hence, the Prediction Interval must account for both the uncertainty of estimating the expected value and the random variation of individual values.

For example, say we have a 95% Prediction Interval stating a price range of 400,000 € to 1,000,000 € for a 300-square-meter house in Germany. This means the price of a new, individual 300-square-meter house will fall within this range with a 95% chance.

What factors influence the width of the Prediction Interval?

Two factors influence the width of a Prediction Interval: the variance of the model's estimation and the variance of the target.

Similarly to the Confidence Interval, the Prediction Interval must account for the variability in the model. The greater the variance of the estimation, the higher the uncertainty and the wider the interval.

Moreover, the Prediction Interval also depends on the variance of the target variable. The greater the variance of the target variable, the wider the Prediction Interval will be.

After we have covered the fundamentals, let's move on to the differences.


Differences between Confidence Interval and Prediction Interval

Confidence Interval

  • shows the uncertainty of a population parameter, such as the mean or a regression coefficient. ("We are 95% confident that the population mean falls within this range." (although this is technically not correct as I described above))
  • focuses on past or current events

Prediction Interval

  • shows the uncertainty of a specific value. ("We are 95% confident that the next observation will fall within this range.")
  • focuses on future events

To make things a bit clearer, let's take a regression problem that looks like:

y = β0 + β1 · x + ε, with E[y|x] = β0 + β1 · x

Here, y is the target value, E[y|x] the expected mean response, x the feature value, β0 the intercept coefficient, β1 the slope coefficient, and ε a noise term.

The Confidence Interval shows the sampling uncertainty associated with estimating the expected value E[y|x]. In contrast, the Prediction Interval shows the uncertainty in the whole range of y, not only its expectation.

The difference between a Confidence Interval and a Prediction Interval. The Confidence Interval shows the uncertainty for the mean of y given x, i.e., the expectation E[y|x]. The Prediction Interval shows the uncertainty for an individual y given x. (Image by the author)

Let's assume we have a linear regression model predicting house prices based on square meters. A 95% Confidence Interval for a 300-square-meter house might be (250,000 €, 270,000 €). A 95% Prediction Interval for the same house might be (220,000 €, 300,000 €).

We can see that the Confidence Interval is narrower than the Prediction Interval. This is natural. The Prediction Interval must account for the additional uncertainty of a single observation compared to the mean. The Prediction Interval shows the uncertainty of an individual 300-square-meter house's price. In contrast, the Confidence Interval shows the uncertainty of the average price for a 300-square-meter house.
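Both intervals can be sketched for simple linear regression with the standard closed-form standard errors. The synthetic data, coefficients, and noise level below are assumptions for illustration, not the article's numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic data: price grows with square meters, plus noise (illustrative)
sqm = rng.uniform(50, 400, size=100)
price = 100_000 + 1_500 * sqm + rng.normal(0, 40_000, size=100)

# fit simple linear regression by least squares
n = len(sqm)
x_bar, y_bar = sqm.mean(), price.mean()
sxx = np.sum((sqm - x_bar) ** 2)
beta1 = np.sum((sqm - x_bar) * (price - y_bar)) / sxx   # slope
beta0 = y_bar - beta1 * x_bar                           # intercept

residuals = price - (beta0 + beta1 * sqm)
s = np.sqrt(np.sum(residuals ** 2) / (n - 2))           # residual std error

x0 = 300            # a 300-square-meter house
t = 1.984           # ~95% t-quantile for df = 98
y_hat = beta0 + beta1 * x0

se_mean = s * np.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)      # CI for E[y|x]
se_pred = s * np.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)  # PI for a new y

ci = (y_hat - t * se_mean, y_hat + t * se_mean)
pi = (y_hat - t * se_pred, y_hat + t * se_pred)
```

The extra "1 +" inside `se_pred` is the irreducible noise of an individual observation, which is why the Prediction Interval always ends up wider than the Confidence Interval at the same x.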

Hence, using a Confidence Interval to show the uncertainty of single, future observations might lead to a wrong sense of forecast accuracy.


Conclusion

In this article, I have shown you two basic but very important concepts that are used to quantify uncertainty. Although they are often used interchangeably, they should not be.

If you stayed until here, you should now…

  • know what a Confidence Interval and a Prediction Interval are and what they measure
  • most importantly, know the difference between them and when to use which interval

If you want to dive deeper into the underlying mathematics, check out this post. Otherwise, leave a comment and/or see you in my next article.

Tags: Data Science Getting Started Machine Learning Statistics Uncertainty