How to Evaluate Unreported Epidemic Infections with Iterated Filtering

Author:Murphy | View: 23753 | Time: 2025-03-23 19:49:24

Understanding the dynamics of an epidemic is essential for public health decision-makers planning prevention measures. However, inference for epidemic models can be difficult because disease spread is, in most cases, only partially observed: not every compartment of the population can be measured directly.

An example is the spread of COVID-19, for which only a fraction of positive cases was reported, for several possible reasons: some infected people had no severe symptoms and assumed they were not carrying the virus, some tests returned false negatives, some people did not want to be tested, and so on. As a result, the number of daily reported cases is smaller than the number of actual infections.

In this blog post, I briefly introduce the iterated filtering algorithm, which is designed for inference in this kind of partially observed Markov process (POMP), and apply the corresponding TensorFlow Probability (TFP) API to a specific example of a partially observed epidemic. I would like to highlight this implementation because the usage of the TFP API is barely documented.

Inference of POMP

Also known as hidden state space models or stochastic dynamical systems, a partially observed Markov process (POMP) usually has two components: an unobserved Markov process {X(t; θ) : t ≥ 0}, either discrete or continuous in time, and an observation model that describes how the data Y(t), collected at discrete time points, relate to the unobserved states X(t).
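To make the two components concrete, here is a minimal sketch of a toy POMP in plain Python. The model (a Gaussian random walk with drift theta, observed with additive Gaussian noise) and the name simulate_pomp are assumptions chosen for illustration; they are not the epidemic model used later in this post.

```python
import random

def simulate_pomp(num_steps, theta, obs_noise, seed=0):
    """Simulate a toy POMP: latent random walk with drift, noisy observations."""
    rng = random.Random(seed)
    x = 0.0  # latent state X(0)
    states, observations = [], []
    for _ in range(num_steps):
        # Markov transition: the next state depends only on the current one.
        x = x + theta + rng.gauss(0.0, 1.0)
        # Observation model: Y(t) depends only on the current latent X(t).
        y = x + rng.gauss(0.0, obs_noise)
        states.append(x)
        observations.append(y)
    return states, observations

states, observations = simulate_pomp(num_steps=50, theta=0.5, obs_noise=2.0)
```

In an inference setting only observations would be available; states and theta are what we try to recover.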

In general, inference for a POMP starts by formulating a Markov process and connecting the observed data to that process through an observation model; the model parameters are then estimated by maximizing the likelihood of the resulting model.

But this is never an easy task.

The volume of research on inference for POMPs indicates both the importance and the difficulty of the problem. The methods can be categorized by three criteria: whether they are plug-and-play, whether they are full-information or feature-based, and whether they are frequentist or Bayesian. You can find a list of available algorithms here.

Iterated filtering

But why talk about iterated filtering today? Because, among all the available methods, iterated filtering methods are currently the only full-information, plug-and-play, frequentist methods for POMP models, and they have succeeded in solving likelihood-based inference problems, especially epidemiological ones, that are computationally intractable for available Bayesian methodology.

We can simply understand iterated filtering as the name suggests: filtering by iteration. Here, the word "filter" can be understood to mean an "estimator" that extracts information about a quantity of interest from noisy data, according to Simon Haykin's Adaptive Filter Theory.

We briefly introduce the IF2 algorithm of Ionides et al. (2015); and yes, there was an IF1 in 2006, which is not the focus today. The IF2 algorithm takes as input (i) the prior of the initial state, (ii) the transition model of the Markov process, (iii) the observation model relating the state to the observation, and (iv) the observed data. The goal is to estimate the model parameters by iteration:

Each iteration consists of a particle filter in which the parameter vector of each particle performs a random walk.
At the end of the time series, the collection of parameter vectors is recycled as starting parameters for the next iteration.
The random-walk variance decreases at each iteration.
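The three steps above can be sketched in plain Python on a toy problem. This is a simplified illustration, not the TFP implementation and not the full pseudocode of Ionides et al.: the model (a random walk with unknown drift observed in Gaussian noise), the Uniform(-2, 2) prior, the noise scales, the geometric cooling factor, and the name if2_drift are all assumptions made for the example.

```python
import math
import random

def if2_drift(observations, num_particles=200, num_iterations=20,
              initial_scale=0.1, cooling=0.9, state_noise=0.5,
              obs_noise=1.0, seed=0):
    """Toy IF2 for a random walk with unknown drift, observed in Gaussian noise."""
    rng = random.Random(seed)
    # Initial parameter swarm drawn from an assumed Uniform(-2, 2) prior.
    thetas = [rng.uniform(-2.0, 2.0) for _ in range(num_particles)]
    for m in range(num_iterations):
        scale = initial_scale * cooling ** m  # random-walk scale shrinks each iteration
        states = [0.0] * num_particles        # latent state restarts at X(0) = 0
        for y in observations:
            # Perturb every particle's parameter, then propagate its state.
            thetas = [th + rng.gauss(0.0, scale) for th in thetas]
            states = [x + th + rng.gauss(0.0, state_noise)
                      for x, th in zip(states, thetas)]
            # Weight by the Gaussian observation density (log scale for stability).
            log_w = [-0.5 * ((y - x) / obs_noise) ** 2 for x in states]
            top = max(log_w)
            weights = [math.exp(lw - top) for lw in log_w]
            # Resample parameter/state pairs jointly.
            idx = rng.choices(range(num_particles), weights=weights, k=num_particles)
            thetas = [thetas[i] for i in idx]
            states = [states[i] for i in idx]
        # The surviving thetas are recycled as the start of the next iteration.
    return thetas

# Synthetic data from the same toy model with a known drift of 0.5.
data_rng = random.Random(1)
true_theta, x, observations = 0.5, 0.0, []
for _ in range(50):
    x += true_theta + data_rng.gauss(0.0, 0.5)
    observations.append(x + data_rng.gauss(0.0, 1.0))

final_thetas = if2_drift(observations)
theta_hat = sum(final_thetas) / len(final_thetas)  # point estimate of the drift
```

Because the perturbation scale shrinks geometrically, the parameter swarm explores widely in early iterations and concentrates around a high-likelihood value in later ones.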

I will not go into the implementation details, but I highly recommend that readers look up the IF2 algorithm pseudocode.

An example of epidemics with partially documented infections

Now let us consider the following example of partially observed dynamics: an epidemic described by an SIR model, which divides the total population into three compartments: susceptible, infectious, and recovered. The flow between compartments is modeled by a system of ordinary differential equations with two essential parameters: the infection rate and the recovery rate. Let us further suppose that only a portion of infections is documented, and call that portion the reported rate.
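For reference, the SIR equations in their usual textbook form are given below. The symbols are assumptions introduced here: S, I, and R are the compartment sizes, N = S + I + R is the total population, β is the infection rate, and γ is the recovery rate. The notebook may use a different but equivalent parameterization.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
```

The three right-hand sides sum to zero, so the total population N stays constant over time.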

To understand the progress of the disease, we have to learn the three parameters mentioned above (the infection rate, the recovery rate, and the reported rate) from the observed daily reported cases, which, I emphasize again, are smaller than the actual infections.

Let us learn the parameters of the SIR model on a synthetic dataset of daily reported infections simulated by the author. The figure below plots the daily reported cases over 100 days.

Image by author: reported infections in 100 days

Inference with TFP

The good news is that the IF2 algorithm has been implemented in TFP as tfp.experimental.sequential.IteratedFilter, so we can use it directly. The bad news is that its usage is barely documented. In the following section, I explain how to apply it to the dataset above. Let us take a look at the API first:

tfp.experimental.sequential.IteratedFilter(
    parameter_prior,
    parameterized_initial_state_prior_fn,
    parameterized_transition_fn,
    parameterized_observation_fn,
    parameterized_initial_state_proposal_fn=None,
    parameterized_proposal_fn=None,
    parameter_constraining_bijector=None,
    name=None
)

To initialize the method, it is necessary to define four arguments: parameter_prior, parameterized_initial_state_prior_fn, parameterized_transition_fn, and parameterized_observation_fn.

In our example, we define parameter_prior as the prior over the three rates we are interested in, using uniform distributions expressed as tfd.Distribution objects.

For parameterized_initial_state_prior_fn, we define a function that maps given parameters to the prior over the initial state of the SIR compartments.

Further, we define parameterized_transition_fn as a function describing how all compartments of the model evolve over one time step. Note that this function is nothing but a discretized version of the original SIR model.
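As a library-free illustration of such a discretization, the sketch below advances the compartments by one forward-Euler step. The function name sir_step, the step size, and the rate values are assumptions made for the example. It shows only the deterministic mean dynamics; a transition function passed to TFP would have to wrap this in a probability distribution over the next state.

```python
def sir_step(state, infection_rate, recovery_rate, dt=1.0):
    """Advance (S, I, R) by one forward-Euler step of the SIR equations."""
    s, i, r = state
    n = s + i + r
    # Flows are capped so that no compartment can become negative.
    new_infections = min(infection_rate * s * i / n * dt, s)
    new_recoveries = min(recovery_rate * i * dt, i)
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)

# Roll the step forward for 100 days from one initial infection (made-up values).
state = (999.0, 1.0, 0.0)
trajectory = [state]
for _ in range(100):
    state = sir_step(state, infection_rate=0.3, recovery_rate=0.1)
    trajectory.append(state)
```

Each step only moves people between compartments, so the total population stays constant along the trajectory.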

Finally, we define parameterized_observation_fn as a function connecting the SIR compartments to the observed reported infections. At every time step, the reported infections should be the product of the new infections (according to the model, the drop in the susceptible count between two consecutive time steps) and the reported rate, that is, reported_case_t = (susceptible_case_{t-1} - susceptible_case_t) * reported_rate.
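The arithmetic of this relation can be checked on a small example. The susceptible counts and the reported rate below are made-up values, and expected_reported_cases is a hypothetical helper; it computes only the expected number of reported cases, without the observation noise that a full observation model would add.

```python
def expected_reported_cases(susceptible, reported_rate):
    """Expected reported cases per step from a series of susceptible counts."""
    return [(prev - curr) * reported_rate
            for prev, curr in zip(susceptible, susceptible[1:])]

susceptible = [1000.0, 990.0, 970.0, 940.0]  # made-up susceptible counts
reported = expected_reported_cases(susceptible, reported_rate=0.75)
# reported == [7.5, 15.0, 22.5]
```

Here 10, 20, and 30 new infections occur over the three steps, and three quarters of each are reported.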

For the detailed code, please refer to the notebook. Once everything is declared, we can call iterated_filter.estimate_parameters directly to learn the parameters with the IF2 algorithm. The plot below visualizes the posteriors of the reported rate. We can see that in this example, only ~76% of total infections were reported.

Image by author: boxplot of reported_rate posteriors
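A boxplot like this one is driven by the quartiles of the final parameter particles. The sketch below computes them with the standard library. The particle values are placeholders drawn from an arbitrary distribution, not the output of the notebook, and the variable names are hypothetical.

```python
import random
import statistics

rng = random.Random(0)
# Placeholder particles standing in for the reported_rate values returned by
# the filter; values are clipped to [0, 1] because a rate is a proportion.
reported_rate_particles = [min(max(rng.gauss(0.7, 0.05), 0.0), 1.0)
                           for _ in range(1000)]

# Quartiles are the numbers a boxplot draws as the box and its center line.
q1, median, q3 = statistics.quantiles(reported_rate_particles, n=4)
summary = {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}
```

With real output, the median would serve as the point estimate and the interquartile range as a rough measure of its uncertainty.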

Conclusion

In this blog post, I give a brief introduction to iterated filtering and highlight its importance in the analysis of POMPs, especially in the study of epidemic spread. I apply the IteratedFilter API of TFP to an example of partially reported epidemic infections. All questions are welcome.

Tags: Bayesian Inference Iterated Filtering Markov Process Physics Informed Tensorflow Probability