Need for Speed: Streamlit vs functools Caching


Streamlit is my default framework for building both proof-of-concept demos and analytical dashboards. The simplicity of the framework allows quick development and easy maintenance. However, the dark side of simplicity is that it comes with built-in design assumptions that make it difficult to use as a top-grade production tool. We will cover these in detail later, but the net effect of these assumptions is that Streamlit can be slow when processing and rendering your app.

In this post, I want to show you 2 methods to increase the speed of your Streamlit apps: the built-in Streamlit caching functions and the built-in functools caching functions. Both methods are anchored in the idea of caching: if something has already been computed before, the output is saved and re-used later.

Before getting into the results, I feel it is important to understand 3 basic pieces of theory: how Streamlit, Streamlit caching, and functools caching work under the hood.

PS: All images are authored by me, unless otherwise specified.

Streamlit re-executes everything. Every. Single. Time.

As mentioned in the introduction, Streamlit is easy to use, but simplicity comes at a cost. Streamlit operates on a unique principle that sets it apart from many other web frameworks: every time a user interacts with the app, the entire script is re-executed from top to bottom. The whoooooole thing. This behaviour might seem strange, but it is the key to Streamlit's simplicity and power.

One reason why developers designed Streamlit to re-execute every time was to make it "stateless" by default. Since the script is re-executed entirely each time, there is no need to manage the state between different parts of the app explicitly. Each run of the script starts with a clean slate, and everything is recalculated based on the current inputs.

Being "stateless" is nice, but imagine that you have a function to read data. Unless we do something about it, Streamlit will, every-single-time, re-run the read data function. Script re-execution is where we hit speed performance issues. However, there are easy ways to fix this.

Understanding Streamlit caching

What is Streamlit caching?

Streamlit's caching mechanism allows you to store the results of expensive computations so that they can be reused in subsequent script executions. With caching, if Streamlit knows that a function or object has been previously called, it will skip the execution and return the cached result "instantly", significantly speeding up your app. Basically, you break Streamlit's stateless execution model: caching allows certain parts of the app to behave in a stateful manner, so that results persist across script reruns.

Streamlit provides two primary caching decorators:

  1. @st.cache_data: This decorator is ideal for caching data-related operations, such as loading datasets or querying a database. This "is your go-to command for all functions that return data – whether DataFrames, NumPy arrays, str, int, float, or other serializable types." (directly quoted from [3])
  2. @st.cache_resource: This decorator is used for caching resources, such as a machine learning model, where the resource needs to be initialized once and reused multiple times.
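As a quick illustration of both decorators, here is a minimal sketch (the function names and the sqlite example are mine, not from the original app):

import sqlite3
import pandas as pd
import streamlit as st

@st.cache_data
def load_dataset(path: str) -> pd.DataFrame:
    # returns serializable data (a DataFrame) -> st.cache_data
    return pd.read_csv(path)

@st.cache_resource
def get_db_connection(db_path: str):
    # a shared resource, initialized once and reused across reruns
    return sqlite3.connect(db_path, check_same_thread=False)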

How to invoke Streamlit caching

It is as simple as decorating your function with @st.cache_data:


import pandas as pd
import streamlit as st

@st.cache_data()
def filtering_pandas(df: pd.DataFrame,
                     dates_filter=None,
                     device_filter=None,
                     ROI_filter=None,
                     market_filter=None
                     ) -> pd.DataFrame:

    # filtering operations...

    return df

⚠️ Careful though! ⚠️

In the function above, we have 5 inputs. Streamlit will treat any combination of these 5 inputs as a new object to cache. For example, device_filter="Desktop" and device_filter="Mobile" will each get their own cached dataframe output. You can imagine how this could explode in terms of memory size.

Set constraints to control your caching memory usage

Here are 2 constraints recommended by Streamlit:

  1. ttl = Time To Live. The idea is to force Streamlit to use caching limited to a period of time. "If that time is up and you call the function again, the app will discard any old, cached values, and the function will be rerun." (directly quoted from [3])
  2. max_entries . "Sets the maximum number of entries in the cache. The oldest entry will be removed when a new entry is added to a full cache." (directly quoted from [3])
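Both constraints are plain keyword arguments on the decorator. For example, reusing the hypothetical load_dataset function from above:

import pandas as pd
import streamlit as st

@st.cache_data(ttl=3600, max_entries=100)  # keep entries for 1 hour, at most 100 of them
def load_dataset(path: str) -> pd.DataFrame:
    return pd.read_csv(path)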

As you can see, Streamlit caching is very easy to use. But, it is important to understand how Streamlit looks at each decorated function to understand if you need constraints for your app. Let's look now at another way of caching.

Understanding functools caching

I learnt about functools.lru_cache from Fabian Bosler's post Every Python Programmer Should Know LRU_cache From the Standard Library [1]. But that covered only a super simple example showing it was lightning fast. Then I also discovered a post by Marcin Kozak comparing functools to Streamlit caching [2], but again, it only covered the 'data read' part of an ETL. I wanted to see whether I could make it work with a more complicated ETL.

What is functools caching?

functools.lru_cache is a built-in Python decorator and works similarly to Streamlit caching in the sense that it stores the results of function calls in a cache. If the function is called again with the same arguments (emphasis here on the arguments), Python will return the result from the cache instead of recomputing it.

The "LRU" stands for Least Recently Used, which is a caching strategy that discards the least recently accessed items when the cache reaches its maximum size. In other words, it tries to keep only the most frequently accessed or recent items in memory. That's pretty cool because you kind of ‘forget' about the constraints control you would have had to implement in Streamlit (although you can definitely still control the cache size)

How to invoke functools caching

Using lru_cache is as simple as decorating your function with @lru_cache.

import functools

@functools.lru_cache(maxsize=128)
def filtering_pandas(dates_filter=None,
                     device_filter=None,
                     ROI_filter=None,
                     market_filter=None
                     ):

    # filtering operations...
    # note: df is produced inside the function (e.g. by a cached read),
    # not passed in as an argument -- see why below

    return df

⚠️ Careful though! ⚠️

Have you noticed that the functools.lru_cache-decorated filtering_pandas() function doesn't take the df dataframe as an input? Compare this function to the one used in the st.cache_data() example and you will see the difference. The reason is hashable objects.

Hashable objects. The pain you will suffer if using functools caching.

The functools.lru_cache decorator requires that all the arguments passed to the cached function be hashable. A dataframe is NOT hashable, because it is mutable. This is one of the pain points of writing functions decorated with functools.lru_cache.
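You can reproduce the pain in a few lines:

import functools
import pandas as pd

@functools.lru_cache
def count_rows(df):
    return len(df)

count_rows(pd.DataFrame({"a": [1, 2]}))
# TypeError: unhashable type: 'DataFrame'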

I will cover how I approach this problem later.

Benchmarking design exercise

Having introduced both caching methods, it is time to compare the performance of both. To do this I have created the following benchmarking exercise:

  1. I have created a set of synthetic dataframes, ranging from 1,000 to 10,000,000 rows.
  2. I have created a typical ETL, where we load the data, filter it, join it with another dataframe and aggregate it based on a segment.
  3. These ETL functions have been coded up in (1) Pandas (2) Polars (3) cached pandas functions with Streamlit cache (4) cached pandas and cached polars functions with functools cache.
  4. I wrap these into a Streamlit app, where I capture the execution times of the first run of a given function and when it is re-run again.
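For the timing itself, I use a simple wrapper along these lines (an illustrative helper, not the exact benchmark code):

import time

def timed(fn, *args, **kwargs):
    # time a single call and return both the result and the elapsed seconds
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# first call (cold) vs. second call (cache hit)
# df, cold = timed(pandas_etl_streamlit_cached, folder_path)
# df, warm = timed(pandas_etl_streamlit_cached, folder_path)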

Before moving to the results section, I want to show you some specific examples using Streamlit and functools caching.

Streamlit cache pandas example

Below you can see 2 simple ETL functions performing the same steps. The first is completely uncached; the second is decorated with @st.cache_data() and invokes the cached version of each step. For example:

  • pandas_etl() invokes read_and_combine_csv_files_pandas()
  • But pandas_etl_streamlit_cached() invokes read_and_combine_csv_files_pandas_cached()
  • In reality, you don't need to create 2 different functions. I have done this to run the benchmark exercise.

def pandas_etl(folder_path, secondary_df=None,
               dates_filter=None, device_filter=None, market_filter=None, ROI_filter=None,
               list_of_grp_by_fields=None,
               ):

    df = read_and_combine_csv_files_pandas(folder_path)
    df = filtering_pandas(df=df, dates_filter=dates_filter, device_filter=device_filter, market_filter=market_filter, ROI_filter=ROI_filter)
    df = join_pandas(df, secondary_df)
    df = aggregating_pandas(df=df, list_of_grp_by_fields=list_of_grp_by_fields)

    return df

@st.cache_data()
def pandas_etl_streamlit_cached(folder_path, secondary_df=None,
                                dates_filter=None, device_filter=None, market_filter=None, ROI_filter=None,
                                list_of_grp_by_fields=None,
                                ):

    df = read_and_combine_csv_files_pandas_cached(folder_path)
    df = filtering_pandas_cached(df=df, dates_filter=dates_filter, device_filter=device_filter, market_filter=market_filter, ROI_filter=ROI_filter)
    df = join_pandas_cached(df, secondary_df)
    df = aggregating_pandas_cached(df=df, list_of_grp_by_fields=list_of_grp_by_fields)

    return df

Looking at the filtering functions, this is how I coded it up:

  1. Build the basic pandas / python filtering function. Because I am building a function for Streamlit, I have all those optional parameters which will be what the user inputs (or defaults to None if no input).
  2. Decorate a function that invokes the non-cached function. As I said, you don't need this in your final app, but I would recommend benchmarking your cache decorator this way.

def filtering_pandas(df: pd.DataFrame,
                     dates_filter=None,
                     device_filter=None,
                     ROI_filter=None,
                     market_filter=None
                     ) -> pd.DataFrame:

    if dates_filter:
        # Ensure the filter dates are datetime objects
        df['Date'] = pd.to_datetime(df['Date'])
        start_date = pd.to_datetime(dates_filter[0])
        end_date = pd.to_datetime(dates_filter[1])
        df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]

    if device_filter:
        df = df[df['Device'].isin(device_filter)]

    if market_filter:
        df = df[df['Market'].isin(market_filter)]

    if ROI_filter:
        df = df[(df['ROI'] >= ROI_filter[0]) & (df['ROI'] <= ROI_filter[1])]

    return df

@st.cache_data()
def filtering_pandas_cached(df: pd.DataFrame, dates_filter, device_filter, ROI_filter, market_filter) -> pd.DataFrame:
    return filtering_pandas(df, dates_filter, device_filter, ROI_filter, market_filter)

functools.lru_cache pandas or polars example

Remember the hashable objects? Here is how I deal with functions using functools.lru_cache .

  1. If you can cache on immutable inputs, do it. For example, reading a dataset from a CSV only takes the file path, which is an immutable string.
  2. If your functions try to do something with a dataframe, you have 2 options:

    • Option A is to hash the dataframe and pass the hash as an input parameter. This might lower your speed, as you are hashing the dataframe, but you can save the object in a Streamlit session_state variable to re-use later (see the sketch after this list).
    • Option B is to simply invoke the cached function where your data is being read from and run the whole ETL. This adheres less to programming standards, but if the job is small enough, I don't see a problem with this.
  3. If your functions require other inputs, make these inputs immutable. This is the same idea as option A if you work with dataframes. For example, lists are mutable, so functools will not like them. But if you transform a list into a tuple, it will be treated as an immutable object (you could also hash it, but that's overkill).
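Here is a minimal sketch of option A (the helper names are mine; I build the key with pd.util.hash_pandas_object and park the dataframe in session_state):

import functools
import pandas as pd
import streamlit as st

def df_key(df: pd.DataFrame) -> str:
    # hash_pandas_object hashes row-wise; summing collapses it to a single key
    return str(pd.util.hash_pandas_object(df).sum())

@functools.lru_cache(maxsize=32)
def filter_cached(key: str, device_filter=None) -> pd.DataFrame:
    df = st.session_state[key]  # look the dataframe back up by its key
    if device_filter:
        df = df[df['Device'].isin(device_filter)]
    return df

# Usage: store the dataframe under its key once, then call the cached function
# key = df_key(df); st.session_state[key] = df
# filtered = filter_cached(key, device_filter=('Desktop',))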

Check the option B example below:

@functools.lru_cache
def read_and_combine_csv_files_pandas_cached_functools(folder_path):
    # folder_path is a hashable string, so lru_cache is happy here
    return pd.read_csv(folder_path)

@functools.lru_cache
def pandas_functools_etl(folder_path,
                         dates_filter=None, device_filter=None, market_filter=None, ROI_filter=None,
                         list_of_grp_by_fields=None):

  # This is the equivalent of option B, where I call the read function.
  # The read function is also cached, so effectively, this line will 
  # be faster after the first run.
  df = read_and_combine_csv_files_pandas_cached_functools(folder_path)

  # All the filters and aggregation fields below look like lists, right?
  # See how we use pandas_functools_etl() at the end
  if dates_filter:
      # Ensure the filter dates are datetime objects
      df['Date'] = pd.to_datetime(df['Date'])
      start_date = pd.to_datetime(dates_filter[0])
      end_date = pd.to_datetime(dates_filter[1])
      df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]

  if device_filter:
      df = df[df['Device'].isin(device_filter)]

  if market_filter:
      df = df[df['Market'].isin(market_filter)]

  if ROI_filter:
      df = df[(df['ROI'] >= ROI_filter[0]) & (df['ROI'] <= ROI_filter[1])]

  markets_pandas_df = pd.read_csv('synthetic_data/data_csv/dataset_markets/markets.csv')
  df = pd.merge(df, markets_pandas_df, on='Market', how='inner')

  if list_of_grp_by_fields:
      # sum_fields and mean_fields are column lists defined elsewhere in the app
      df = (df
            .groupby(list(list_of_grp_by_fields))
            .agg({**{field: 'sum' for field in sum_fields},
                  **{field: 'mean' for field in mean_fields}}
                 )
            )

      # Rename columns to clarify which operation was performed
      df.columns = [f'{col}_{"Sum" if col in sum_fields else "Avg"}' for col in df.columns]

      df = df.reset_index()

  return df

# We create immutable objects from the lists by turning them into tuples
immutable_device_filter = tuple(device_filter) if device_filter else None
immutable_market_filter = tuple(market_filter) if market_filter else None
immutable_ROI_filter = tuple(ROI_filter) if ROI_filter else None
immutable_list_of_grp_by_fields = tuple(list_of_grp_by_fields) if list_of_grp_by_fields else None

pandas_functools_cached_df = pandas_functools_etl(
  folder_path=folder_path,
  dates_filter=dates_filter,
  device_filter=immutable_device_filter,
  market_filter=immutable_market_filter,
  ROI_filter=immutable_ROI_filter,
  list_of_grp_by_fields=immutable_list_of_grp_by_fields,
)

This is how the code above works (although you can follow through the comments):

  1. The read data function is cached. Whenever any function calls this read method, caching kicks in and no re-reading is required (after the first call).
  2. The example above follows option B: instead of hashing a dataframe and adding it as an input parameter, I have gone for directly calling the read method.
  3. Finally, even though the inputs to the function, and the way pandas consumes parameters such as market_filter, make them look like lists, what we are actually passing is a tuple. This way we "trick" the caching operation into treating the input as an immutable object.

Benchmarking results

And here is the moment of truth…
