Dr. Frank Bauer
Phone: +1 617 495-67132

Original publication

Frank Bauer and Joseph T. Lizier
Identifying influential spreaders and efficiently estimating infection numbers in epidemic models: a walk-counting approach

Complex Systems · Computer Science · Mathematics

The pathways of epidemics

A new computer model rapidly and accurately estimates who spreads an infection particularly extensively, thereby facilitating countermeasures

October 30, 2012

Epidemics could be contained more effectively in the future. A new computer-aided method developed by researchers at the Max Planck Institute for Mathematics in the Sciences in Leipzig identifies the people in a population who propagate an infection most strongly. The method estimates the actual number of people who are directly or indirectly infected by a specific person, and it does so with significantly less computational effort than other methods of comparable precision. Other fast methods provide only a qualitative ranking of carriers and cannot say how many more people one contagious person infects than another, less effective carrier. This information is especially important if a vaccine is in short supply. In that event, physicians need to know whom they should vaccinate first to prevent a pandemic most effectively.
A network of potential contagion pathways: the number of pathways having a pre-defined number of indirect contacts within a social network, and leading from one person to all other individuals, provides information about who spreads an infection especially effectively. Counting pathways makes it possible to determine more rapidly and precisely than before which individuals must be preferentially vaccinated in order to prevent an epidemic.

It is difficult to predict who transmits an infection most actively. Contagious persons who have contact with many other people do not always infect the most people. It is often said that the efficiency with which an individual propagates the pathogens of a disease depends on how directly interconnected the person is. While this is largely correct, it applies only under certain conditions. “There are people as well who are less directly interconnected and yet propagate an infection quite extensively,” says Joseph Lizier, from the Max Planck Institute for Mathematics in the Sciences, who investigated the spread of epidemics and is now a Postdoctoral Fellow at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Sydney. Since it is not easy to identify which properties of social networks are pivotal for the spread of an infection, Lizier, a computer scientist, and mathematician Frank Bauer investigated these characteristics more closely. The population of a region, a country, or even the world can be regarded as a social network.

Of course, there were already computer programmes that simulated the spread of epidemics prior to the work of the two researchers at the Max Planck Institute for Mathematics in the Sciences. “However, these are either fast but imprecise, or they provide precise results but require enormous computational effort,” says Joseph Lizier. Moreover, the researchers wanted to be able to gauge the actual number of contagious persons and not just establish the rank order of carriers by spreading efficiency. “With the absolute numbers of infections, we can differentiate whether the disease will likely run its course without harm, or whether it becomes rampant and strikes distant parts of the population,” says Bauer, who is now a Postdoctoral Fellow at the Faculty of Mathematics of Harvard University in Cambridge, US.

The programme tests how effectively each person propagates an infection

The researchers' computer algorithm calculates how many persons have been directly or indirectly infected by an initially contagious person selected at random (patient zero) after a specific period of time. Since this calculation is carried out for every person within a social network, the most effective propagators of the disease can be identified. In other words, the programme tests all the individuals for their propensity to be transmitters of the infection.

That sounds computationally intensive. However, the method developed by Bauer and Lizier operates very efficiently, as tests with data from several social networks demonstrated. As a model for their study, the researchers chose a network of relationships in a research community, which had previously been constructed from data in an internet archive of scientific publications. The network comprises more than 27,000 individuals and more than 100,000 connections between them. Joseph Lizier and Frank Bauer then simulated how an infection would propagate in this network of relationships, assuming that the participating individuals do not just publish jointly but also meet one another in person. “We had the result in about an hour,” reports Lizier. The conventional method takes about 2000 times as long, almost three months, to produce similarly precise results.
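To illustrate why simulation-based estimates are costly, here is a minimal sketch of a repeated random simulation of an SIR-type outbreak. The article does not say which conventional method the researchers used for comparison, so the Monte Carlo approach, the one-step infectious period, the function names and the toy network `adj` are all assumptions made for illustration. The cost comes from the many repeated runs needed for every possible patient zero.

```python
import random

# Toy contact network (hypothetical): a triangle 0-1-2 with person 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def simulate_sir_once(adj, patient_zero, beta, rng):
    """Run one random outbreak and return how many others were infected.

    Assumption of this sketch: a person is infectious for exactly one
    time step, infects each susceptible neighbour with probability beta,
    and is then removed (recovered with immunity, or dead).
    """
    infected = {patient_zero}
    recovered = set()
    while infected:
        newly_infected = set()
        for person in infected:
            for neighbour in adj[person]:
                susceptible = neighbour not in infected and neighbour not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(neighbour)
        recovered |= infected
        infected = newly_infected
    return len(recovered) - 1  # exclude patient zero

def mean_infections(adj, patient_zero, beta, runs, seed=0):
    """Average the outbreak size over many independent runs."""
    rng = random.Random(seed)
    total = sum(simulate_sir_once(adj, patient_zero, beta, rng) for _ in range(runs))
    return total / runs
```

A call such as `mean_infections(adj, 0, 0.5, runs=10000)` estimates the outbreak size for one patient zero only. Ranking every person in a network of tens of thousands of individuals multiplies this effort accordingly.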

The risk of contagion rises with the number of possible infection pathways

The Bauer-Lizier process counts all of the possible pathways that an infection can follow from patient zero to another person (person x) within a prescribed time. The time required for this is a function of the number of persons situated between patient zero and person x. During the calculation, the method takes into account only pathways up to a pre-defined maximum number of indirectly infected persons. Moreover, for diseases in which patients either recover with full immunity or die after infection (studied with the so-called Susceptible-Infectious-Recovered, or SIR, model), the method leaves out pathways that traverse a person who was already contagious, since that person is effectively removed from the disease-spreading network. The larger the number of possible pathways, the higher the probability that person x will infect a large portion of the population. Since all the possible transmission pathways from patient zero to all other individuals in the network are counted, the result is an estimated number of persons that will be infected by patient zero via a realistic number of intermediaries. The method calculates this contagion potential for every person within a network.
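The idea of counting pathways can be sketched in a few lines of code. The following is a simplified illustration, not the researchers' published algorithm: it enumerates all pathways from patient zero that never visit the same person twice, up to a maximum length `max_len`, and turns the counts into an estimated number of infections. It assumes a single transmission probability `beta` per contact and treats different pathways as independent of one another. Both are assumptions of this sketch, as are the function names and the toy network.

```python
from collections import defaultdict

# Toy contact network (hypothetical): a triangle 0-1-2 with person 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def count_pathways(adj, patient_zero, max_len):
    """Return counts[x][k]: number of pathways with k contacts from
    patient zero to person x that visit no person twice (k <= max_len)."""
    counts = defaultdict(lambda: defaultdict(int))

    def extend(person, visited, length):
        if length == max_len:
            return
        for neighbour in adj[person]:
            if neighbour in visited:
                continue  # SIR: a person already infected is out of the network
            counts[neighbour][length + 1] += 1
            visited.add(neighbour)
            extend(neighbour, visited, length + 1)
            visited.remove(neighbour)

    extend(patient_zero, {patient_zero}, 0)
    return counts

def estimated_infections(adj, patient_zero, beta, max_len):
    """Estimate how many people patient zero infects directly or indirectly.

    Assumption: a pathway with k contacts transmits with probability
    beta**k, and pathways are treated as independent of one another.
    """
    total = 0.0
    for by_length in count_pathways(adj, patient_zero, max_len).values():
        p_escape = 1.0  # probability that no pathway reaches this person
        for k, n_paths in by_length.items():
            p_escape *= (1.0 - beta ** k) ** n_paths
        total += 1.0 - p_escape
    return total

# Rank every person by estimated spreading power.
ranking = sorted(adj, key=lambda p: estimated_infections(adj, p, 0.5, 3), reverse=True)
```

In this toy network, person 2 comes out on top and person 3 at the bottom. Unlike a pure ranking, the estimate also says by how much they differ.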

The approach of counting possible transmission pathways provides more information about the course of epidemics and their most important multipliers than previous methods, explains Bauer, using a typical structure within such networks as an example. “Clusters play an important role,” says the mathematician. By cluster, he means smaller or larger groups within a network whose members are more strongly interconnected with one another than with the rest of the network. For example, the inhabitants of a village have more contact with one another than with people from outside the village. On the one hand, such clusters can be an obstacle to long-range spreading, since their connection to the rest of the network is relatively weak. On the other hand, they encourage spreading within their interior, since any two members of the cluster have more acquaintances in common through whom they can infect one another indirectly.

Parallel calculations also enable long transmission pathways to be taken into account

The role of clusters is thus important, but it is not clear whether they promote or discourage spreading. Since the Bauer-Lizier process counts the transmission pathways traversing the cluster, it accounts for this without the necessity of having to analyse the role of the cluster on an abstract level. “This is because the clusters affect the number of pathways,” says Bauer. Their structure is thus implicitly contained in the number of pathways. Above and beyond this, the new procedure even enables general statements to be made about the role of clusters.

But the new process has its limits as well. It works most effectively for pathways that do not extend beyond four individuals. The speed of the method drops sharply for longer pathways. “Nevertheless, the computational time is still orders of magnitude lower than with other methods that attain the same precision,” says Lizier. The investigation of longer infection pathways would certainly be interesting. If a highly infectious disease is involved, which the researchers can simulate by means of a higher probability of transmission between two neighbouring individuals in a network, it is not only the properties of the network in the immediate vicinity of patient zero that play a role. Rather, the characteristics of the network as a whole become increasingly important, such as the average number of connections from one person to others.

Although the Bauer-Lizier method becomes slower as the transmission pathways grow longer, Lizier sees a way to simulate these rapidly as well. This is because the new method lends itself to parallelisation, says the researcher. That is, the simulation of the spread of disease can be split into several subtasks and distributed among many parallel processors in a large computer. “So in principle, one should be able to simulate the spread of infections within networks of millions of individuals in a reasonable amount of computational time,” says Lizier.
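The reason such a calculation parallelises well is that the estimate for one patient zero does not depend on the estimate for any other. A minimal sketch of this decomposition follows. The simplified pathway-based estimate, the function names and the toy network are assumptions for illustration, not the researchers' implementation. A thread pool is used here only to show how the work is divided. In standard Python, a real speed-up for this kind of computation would need separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Toy contact network (hypothetical): a triangle 0-1-2 with person 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def spreading_estimate(adj, beta, max_len, patient_zero):
    """Simplified pathway-based estimate for one patient zero.

    Assumptions of this sketch: each contact transmits with probability
    beta, pathways never revisit a person, and pathways are independent.
    """
    escape = {}  # person -> probability that no counted pathway reaches them
    stack = [(patient_zero, (patient_zero,))]
    while stack:
        person, path = stack.pop()
        for neighbour in adj[person]:
            if neighbour in path:
                continue
            k = len(path)  # number of contacts on the pathway to neighbour
            escape[neighbour] = escape.get(neighbour, 1.0) * (1.0 - beta ** k)
            if k < max_len:
                stack.append((neighbour, path + (neighbour,)))
    return sum(1.0 - p for p in escape.values())

def rank_spreaders(adj, beta, max_len, workers=4):
    """Compute one independent subtask per person and rank the results."""
    people = list(adj)
    task = partial(spreading_estimate, adj, beta, max_len)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(task, people))  # results keep the input order
    return sorted(zip(people, scores), key=lambda item: item[1], reverse=True)
```

Calling `rank_spreaders(adj, 0.5, 3)` returns the people in the network ordered by their estimated number of infections, with each person's estimate computed as a separate task.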

