Simulating the Spread of COVID-19 in a Small Community Using Simpy

Nathan Zorndorf
Feb 15, 2021

MarbleScience’s simulations of COVID-19 spread in a community

A few days ago, I came across this YouTube video (from MarbleScience’s YouTube channel) describing a simulation where members of a community go to work and come home, thereby spreading a disease (like COVID-19) to other members of the community. It’s a fun video and I highly recommend you check it out.

Simulating COVID-19 in a community with marbles!

COVID-19 Prevalence Estimation — a Monte Carlo approach

I was also inspired to try my hand at simulation after reading this research paper that looks at how to estimate the prevalence of a disease (like COVID-19) as accurately as possible using the fewest tests, under various assumptions of test sensitivity, specificity, and prevalence.

Despite knowing next to nothing about statistical simulations, I went ahead and created a simulation of a small community of people experiencing a viral epidemic, similar to, albeit vastly simplified in comparison with, the one done by MarbleScience.

You can find my code, which I’ll explain a little bit about, here:

https://github.com/NathanZorndorf/simpy-covid-simulation

COVID-19 Simulation using Simpy

So what did I do exactly?

I used the Simpy Python package to simulate a simple scenario where a given number of people randomly either go to a grocery store or wait at home (in quarantine, essentially). While at the store, they have a chance of contracting a disease, and, once they’ve contracted it, they have a chance of dying. If they live, they gain immunity from the disease from then on. When a shopper visits the store, they spend a random amount of money there, and the total amount of money that the shop makes each day serves as a proxy for the economic impact of the disease spreading throughout the community. It’s an extremely simple model and doesn’t include multiple stores (like MarbleScience’s does), although the code could be modified and extended to simulate that scenario.

This was also my first time using Simpy, and it proved rather difficult to learn about modeling while also learning a modeling package. However, after much banging of my head on the kitchen wall, I have something that appears to work.

At the time of writing, Simpy didn’t include any examples of simulations where all of the agents had to persist throughout the whole simulation. So I hope that my code can serve as a template for someone who does want persisting agents, such as in a simulation of a community experiencing an epidemic!

Let’s start at the end — my results.

x-axis is simulation time (in days)

The framework that I made allows you to vary some high-level parameters like the probability of infection, the risk of death, time spent in quarantine, time spent at home vs. shopping, etc. The output is the income generated at the store, as well as the number of cases, deaths, and recoveries.

Since the model is so simple, nothing very surprising happens. There is an initial spike in cases (since that is when the most people are vulnerable to the disease and the fewest are self-isolating), which levels off and then declines as the number of people with immunity increases. Income (the economic impact) takes a hit proportional to the number of infected cases, so there is an initial dip which then levels off. Woo.

If I were to continue developing this model, I would incorporate some kind of social distancing measure or vaccines into the simulation, and measure the effect such a measure or vaccine has on deaths, income, or the rate of change in cases (the “R” value we’ve all been hearing about) under various scenarios (availability of vaccines, time to vaccine rollout). One could also incorporate age as a factor and look at the effects of various vaccine rollout strategies. The opportunities are endless!

The Code:

For a quick rundown on how the code works, I’ll include a couple of snippets and briefly discuss them.

For a full explanation of how the Simpy module works, see the documentation here.

At a glance

Inside the code that actually gets run, main(), all I’m doing is creating a Simpy Environment, setting up my agents or “processes” (i.e. people), and then kicking off the simulation with env.run() before finally saving the results to CSV.

Details…

After establishing the environment with simpy.Environment(), I generate a list containing instances of the Person class — one for each individual agent I intend to track and have as a part of the “community”.

Then, I set up a “process”, which is a technical term used by Simpy to refer to Python generators — functions that yield instead of return. The use of generators allows processes to pause, when events like Timeouts are yielded from the generator, and to resume when the Timeout is exhausted. This allows the simulation of actions like waiting in line, getting one’s car washed, or, in our case, quarantining at home.

Each person’s live() method, as well as the collect_metrics() function, is defined as a generator and registered with the environment via env.process(). This is how the people and collect_metrics() become part of the simulation.

After that, env.run() kicks off the simulation, which runs until SIM_TIME is reached.

At the end of the simulation (when SIM_TIME is reached), we save the outputs to a .csv file, which can then be visualized via the analysis.ipynb notebook.

One tricky aspect of Simpy that I’m not sure I have a handle on yet is understanding when process generators are run. Initially, I thought that processes worked like players in a board game, with each one having a fixed “turn” relative to the other players (processes). So if I called env.process(collect_metrics()) before env.process(person.live()), I expected collect_metrics() to run before person.live() each time the environment’s clock incremented. My experiments showed that this is not how it works, which is unfortunate, because that model would be more intuitive (to me). I’m still not exactly sure how Simpy works under the hood, but this example seems to work. Honestly, a diagram from Simpy explaining environment and process control flow would be very helpful (nudge nudge).

Details inside of details (detail inception)…

Each “person” is an instance of the Person class that I define, which includes code describing the routines of staying at home and going to the grocery store, as well as probabilities for contracting the disease and dying — essentially, everything that happens in the simulation is defined in the Person class.

I also include a collect_metrics() function that routinely (once a “day”) reads the state variables held inside each person instance and appends the totals to lists (one list for each metric).

You’ll notice that the Person class and the collect_metrics() function both contain infinite loops inside of them. This is the key that keeps the agents in the simulation — without it, once the person hits the end of their defined block of code, they would disappear and effectively be removed from the simulation.

Also note that when env.process(generator_function()) is called, none of the code inside the generator function runs at that point. Calling generator_function() only creates a generator object, and env.process() only registers it with the environment. The body of the generator actually runs inside env.run(). That confused me for a bit, until I figured out that env.process() didn’t actually run any of my functions!

Thanks for reading — I hope this was helpful, and gets you started running your own simulations using Simpy :)


Nathan Zorndorf

Data Scientist interested in Health, History, Sustainability. Swing dancer. Borderline eccentric.