A little corona math

Of infections and series of numbers

The discussion seems to be polarized into two camps. For some it cannot go fast enough to establish restrictive measures to curb the spread of the virus. Others consider the whole scenario to be mass hysteria, if not a deliberate conspiracy of certain capitalist or control-hungry circles. The argument is so heated that it leads, for example, to the following quote in the taz:

"The Christian Drostens of the Republic warned early on that the increase in corona infections is exponential and not linear: If each infected person only infects two people, the chain goes like this: 1, 2, 4, 8, 16, 32, 64, 128, 256 - and so on. "

Felix von Leitner has already dissected the statement in his blog. Thereupon readers wrote to him that the series of numbers only meant the new infections, not the absolute number of cases. That's why the numbers would be right. That is what gave the impetus to write this article.

Let's do some quick math. I promise it will be very easy. You can follow along in Excel or another spreadsheet program. In the first cell, A1, we write the value 1, from which everything starts. In cell B2 we write = A1 * 2. These are the new infections caused by the case in cell A1. To the left of it, in cell A2, we write = A1 + B2. That is the new total of all cases. Then it looks like this:

And now we can use Excel's nice tool to transfer formulas to other cells. To do this, we select the two cells A2, B2 and use the mouse to drag the small green square at the bottom right down:

So the table looks like this:

Then the sequence of numbers in the taz would definitely not be correct. Correct?

No. I fell into the same trap at first.

The calculations are based on the so-called basic reproduction number R0. R0 indicates the average number of people infected by one infected person, without taking into account influences that could dampen the spread, such as immunity (natural or through vaccination), medication or social measures.

R0 indicates how many people an infected person infects on average, and that is in total. R0 is not an "interest factor" that gives rise to compound interest. The newly infected I_n of the current generation infect R0 * I_n people, and these have to be added to the stock of the current generation. So we have to write a new formula in cell B3 of our table, namely = B2 * 2. This results in the following table:

Column A shows the total infections of the respective generation and column B shows the newly added infections.
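The two spreadsheet variants can also be written as a short Python sketch. The function names are made up for this illustration, and patient zero is treated as the single "new" case of generation 0, which is an assumption (the spreadsheet leaves B1 empty).

```python
def compound_scheme(generations, r0=2):
    """First spreadsheet: every case in the running total infects r0 more."""
    totals = [1]
    for _ in range(generations):
        new_cases = r0 * totals[-1]            # B(n) = A(n-1) * r0
        totals.append(totals[-1] + new_cases)  # A(n) = A(n-1) + B(n)
    return totals


def generation_scheme(generations, r0=2):
    """Corrected spreadsheet: only the newest generation infects r0 each."""
    totals, new_cases = [1], [1]
    for _ in range(generations):
        new_cases.append(r0 * new_cases[-1])       # B(n) = B(n-1) * r0
        totals.append(totals[-1] + new_cases[-1])  # A(n) = A(n-1) + B(n)
    return totals, new_cases


print(compound_scheme(4))    # totals triple in every step
print(generation_scheme(4))  # new cases double, totals follow behind
```

In the second variant the list of new cases is exactly the series quoted by the taz, while the totals run 1, 3, 7, 15, 31.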

The exponential function

The exponential function is always used in relation to Corona, so it is time to explain what it is. An exponential function is characterized by the following formula:

$y = b^{x}$

or, to represent it in Excel notation:

y = b ^ x

where b is called the "base".

Incidentally, the series of numbers shown by the taz corresponds to the function

y = 2 ^ x

for all natural numbers, i.e. 0,1,2 ...

One can also say: With each generation, y increases by a factor of b.

Exponential functions as straight lines

Exponential functions are often displayed on a logarithmic scale. This has the advantage that they then appear as straight lines. This is easy to show. The logarithm is the inverse of the exponential function, so for the base-10 logarithm we can say:

$\log(10^{x}) = x$

Furthermore, $b^{x}$ can also be written as $10^{\log(b) \cdot x}$.

Applying the formula $\log(10^{x}) = x$ to $y = b^{x}$, we can therefore say:

log (y) = log (b) * x

Perhaps you still remember the straight-line equations from school. The formula above describes a straight line with a slope of log (b). That is quite elegant. Whenever calculating with exponential functions becomes unwieldy, we can take the logarithm of both sides of an equation and then work with simple straight-line equations. This will be helpful in what follows.
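If you would rather check the straight-line claim numerically than trust the algebra, a small Python sketch will do. The function name log_slope is an assumption made for this example; it measures the slope between two points of y = b ^ x after taking the base-10 logarithm.

```python
import math


def log_slope(base, x0, x1):
    """Slope of log10(base ** x) between x0 and x1."""
    y0, y1 = base ** x0, base ** x1
    return (math.log10(y1) - math.log10(y0)) / (x1 - x0)


# Whichever two points we pick, the slope should come out as log10(base).
print(log_slope(2, 0, 10), log_slope(2, 3, 7), math.log10(2))
```

The slope does not depend on where on the curve you measure it, which is exactly what makes it a straight line.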

Total number of infections

You can actually skip this chapter (continue with "The practical use"), since epidemiology only works with new infections. It is of interest only to readers who want to understand how a doubling time can be calculated for values of R < 1. To do this, you have to use the total number of infections.

From the previous text and a few further thoughts that I leave out here, it follows that the total infections at the time of a generation with unhindered spread are calculated as follows:

$I_n = I_{n-1} + R_0^{n}$

Where n is the generation. Can this be reduced to a simpler exponential formula?

The answer is: only approximately. For R0 = 2 there is a special case:

$I_n = R_0^{n+1} - 1$
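As a worked step, here is where the special case comes from. This is a sketch under the assumption that the total is the running sum of the new infections of all generations so far, starting from a single case, as in the corrected spreadsheet above. That sum is a geometric series:

```latex
\begin{align*}
I_n &= \sum_{k=0}^{n} R_0^{k} = \frac{R_0^{n+1} - 1}{R_0 - 1} \qquad (R_0 \neq 1) \\
R_0 = 2:\quad I_n &= 2^{n+1} - 1 \\
R_0 > 1,\ n \text{ large}:\quad I_n &\approx \frac{R_0}{R_0 - 1}\, R_0^{n}
\end{align*}
```

Because of the subtracted 1, the closed form is not a pure exponential function. For large n, however, the total approaches the new infections multiplied by the constant factor R0 / (R0 - 1), which on a logarithmic scale is a parallel straight line.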

Let's look at the difference between the two functions, the total infections $I_n$ and the new infections $R_0^{n}$, in a diagram:

Note: The diagram uses a logarithmic scale, so a straight line is actually an exponential function. What I call n in the text is t in the graphic; the two symbols mean the same thing.

The blue curve runs largely parallel to the brown curve. In reality it hugs a straight line parallel to our exponential function without ever reaching it.

If you wanted to specify by how much the blue curve lies above the brown curve, i.e. as a factor, you would find that this factor is not constant, and its function over time is no less complex than the sum $I_{n-1} + R_0^{n}$ itself.

The practical use

We have shown that an exponential course arises if we assume that a person infects R other people on average, where R is the reproduction number or rate. R = R0 when there are no factors inhibiting the spread. And we now know how to calculate the number of infections in a particular generation of virus replication. For example, we can insert the reproduction number empirically determined in China, between 2.4 and 3.3, into our formula. This gives us the following table for R = 2.4:

Generation  Total infections
0           1
1           3
2           9
3           23
4           56
5           136
6           327
7           786
8           1886
9           4528
10          10868
11          26085
12          62606
13          150254
14          360612
15          865469
16          2077126
17          4985104
18          11964251
19          28714204
20          68914092

The values are rounded to whole numbers. The table tells us that 20 generations are just about enough to infect the population of the Federal Republic.
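If you prefer code to a spreadsheet, the table can be reproduced with a few lines of Python. This is a minimal sketch: the function name total_infections is an assumption for this example, and it starts from one initial case and rounds only for display, as the table does.

```python
def total_infections(r, generations):
    """Rounded running totals for generations 0..generations."""
    totals = []
    total, new_cases = 0.0, 1.0   # generation 0 consists of the first case
    for _ in range(generations + 1):
        total += new_cases        # add this generation's new infections
        totals.append(round(total))
        new_cases *= r            # each new case infects r others
    return totals


for generation, total in enumerate(total_infections(2.4, 20)):
    print(generation, total)
```

Calling total_infections with another value of R, for example 2.85, gives the corresponding table for that reproduction number.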

How long does it take?

Now the question is, how long does the virus take for a generation? Here you can work with different values. The figures from China show that the gap between two generations is about 6 days.

There are two terms here: generation time and series length. When a bacteriologist creates a culture, the bacteria multiply through cell division. Here R = 2. The time it takes to create a new generation is therefore identical to the doubling time. (Here you can see the thinking of virologists or bacteriologists: there are only "new infections", since the old bacteria are replaced by the new ones. In this world the series of numbers quoted by the taz is completely correct.)

Since the value R first has to be determined in an epidemic, we need a different method to estimate the generation time. Because only if we know this can we determine R from the reports of new infections. It works like this:

Suppose Hugo infects Hans. If Hans also develops symptoms, this typically happens 5-6 days after Hugo had his first symptoms. This period of time is called the series length (the standard English term is serial interval). For epidemics, one can simply say: the generation time corresponds to the series length. On this website, a series length of 5.2 days is used unless you change the value. But let's continue with the more optimistic 6 days. Update: The Robert Koch Institute calculates with values between 6 and 10 days, and describes these numbers as deliberately moderate.

For our table for R = 2.4, this means that after 120 days of unchecked spread, the entire population of the Federal Republic is infected.

Let's take another value for R, namely the mean between 2.4 and 3.3. That is 2.85. Then we get the following course:

Generation  Total infections
0           1
1           4
2           12
3           35
4           101
5           289
6           825
7           2352
8           6705
9           19110
10          54465
11          155226
12          442396
13          1260829
14          3593363
15          10241086
16          29187096
17          83183224

Here it takes only 102 days to reach full "contamination" of the population.

Now we have to supply something that may have been on the mind of one reader or another. How do you calculate the daily infection values? They result from the formula

$I_t = R^{t/T}$

where T is the series length and t is the number of days for which we calculate the number of infections.

This is actually quite simple when you consider that log (R) is the slope of a straight line on the logarithmic scale. If you divide the slope by T, the curve is stretched in such a way that the value for generation n only appears after n * T days:

log (I_t) = log (R) * t / T
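The daily formula is a one-liner in code. The following Python sketch uses a made-up function name, infections_on_day, and assumes the series length of 6 days used above; it is only meant to show that the daily values pass through the generation values every T days.

```python
def infections_on_day(t, r, series_length):
    """New infections on day t for reproduction number r (unchecked spread)."""
    return r ** (t / series_length)


# Days 0, 6 and 12 are generations 0, 1 and 2; day 3 lies in between.
for day in (0, 3, 6, 12):
    print(day, infections_on_day(day, 2.4, 6))
```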

We're only 83 million ...

However, the curve does not run so steeply up to infinity.

First of all, we have a limited number of people to infect. For this we can assume the population of the Federal Republic of about 83 million inhabitants. At the moment there is no longer any major intermingling with other countries, so we can say that growth will stop at the latest when the population is reached. The calculation model that is used for this is the so-called SI model.

This in turn is based on a different calculation model, namely the logistic differential equation. Don't let these terms scare you. There is a video here that describes the underlying math very clearly.

Ultimately, the logistic differential equation reflects the fact that we have a limited resource that is used up along an initially exponential course. We rapidly approach the limit and are then slowed down. The formula for this is something like:

= N * 1 / (1 + e^(-k * N * t / T) * (N-1))

where N is the population, i.e. 83 million, and t and T are our known values for the elapsed days and the generation time. There is also a constant k that arises when solving the differential equation. This constant has to be determined for the particular application of the equation. In our case it is

k = ln ((N-1) / (N / R-1)) / N

I leave the formula unedited; you can use it directly in Excel as it is. It could possibly be simplified, but for our purposes it will do.
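For readers who would rather try this outside Excel, here is a direct Python transcription of the two formulas. The function names growth_constant and si_infected are assumptions made for this sketch, not part of any library. A useful sanity check is that the curve starts at 1 on day 0 and reaches R after exactly one generation time.

```python
import math


def growth_constant(n, r):
    """The constant k from the text, for population n and reproduction number r."""
    return math.log((n - 1) / (n / r - 1)) / n


def si_infected(t, n, r, generation_time):
    """Number of infected after t days according to the logistic formula."""
    k = growth_constant(n, r)
    return n / (1 + math.exp(-k * n * t / generation_time) * (n - 1))


N = 83_000_000
for day in (0, 6, 60, 120, 180):
    print(day, round(si_infected(day, N, 2.4, 6)))
```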

Update April 12th, 2021: In practice, nobody uses these formulas because the simple differential equations from which this complicated formula emerges can be calculated directly in computer programs. That's what I did in my article on the SIR model, and that's where the following diagram comes from.

Update 04/22/2020: As a by-product of my article about the SIR model, the SI model came out as a special case, so I want to show this new diagram here:

The brown curve is the number of infections, the blue curve the number of those not yet infected.

You can see that the course rises exponentially until it nears the total population and then levels off at N. But is the process really exponential at the beginning? And for how long? Again, you can see this by looking at the numbers on a logarithmic scale. As long as the course is exponential, it should appear as a straight line on the logarithmic scale:

Now we can see more clearly: There is an unchecked course of the spreading beyond the 10 million mark, which we reach after almost 100 days.

Consideration of immunization

However, this formula is not the last word in wisdom. We have not yet considered immunization.

To do this, we have to realize that R does not stay constant at R0 over time, but becomes smaller the more people have already been infected. Everyone infects an average of R0 people only as long as the probability of meeting an immune person is practically zero. In relation to our generations, we can now state this probability as

W = I_n / N

where I_n is the total number of people infected up to generation n and N is the population. Suffice it to say: this leads to the SIR model, i.e. an extension of the SI model that includes immunity. The SIR model can no longer be represented with a single formula. But it is relatively easy to solve numerically. This fantastic video shows the way there. Update April 22nd: I now have an implementation in C# that you can experiment with.
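To give an idea of what "solving numerically" means here, this is a minimal sketch of the SIR model in Python using simple Euler steps. It is not the C# implementation mentioned above. The function name, the step size and the parameter infectious_days (how long a person stays infectious, which sets the recovery rate) are assumptions made for this example. The damping described in the text appears as the factor s / n: the fewer susceptible people are left, the fewer new infections each infected person causes.

```python
def simulate_sir(n, r0, infectious_days, days, steps_per_day=10):
    """Euler integration of the SIR model; returns one (day, S, I, R) row per day."""
    gamma = 1.0 / infectious_days   # recovery rate per day (assumed parameter)
    beta = r0 * gamma               # infection rate per day, so beta / gamma = r0
    dt = 1.0 / steps_per_day
    s, i, r = n - 1.0, 1.0, 0.0     # start with a single infected person
    history = [(0.0, s, i, r)]
    for step in range(1, days * steps_per_day + 1):
        new_infections = beta * s * i / n * dt   # damped by the share s / n
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        if step % steps_per_day == 0:
            history.append((step * dt, s, i, r))
    return history


history = simulate_sir(83_000_000, 2.4, 6.0, 365)
peak_day, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"peak of {peak_infected:,.0f} simultaneously infected on day {peak_day:.0f}")
```

The numbers this produces depend heavily on the assumed parameters and will not match the pandemic calculator, which models additional stages such as incubation and hospitalization.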

But a scientist named Gabriel Goh has already done the work and made a pandemic calculator available, with which you can view the course of the curves with the change in the individual parameters:

http://gabgoh.github.io/COVID/index.html

You can then see which quantities are included in the calculation and you can also see the course up to the maximum and the branch that swings down afterwards, which is created by the immunization:

This calculation shows that a maximum of just over 20 million infections is reached after a good 140 days. This corresponds to over 7 million cases that have to be treated in hospital (light blue curve).

Update: Only now did someone draw my attention to this publication by the Robert Koch Institute. Here you can see which parameters were included in the calculations. This can be compared very well with the parameters of the pandemic calculator. The publication calculates with R = 2 and a generation time of 10 days. This is a very moderate interpretation, as the author himself says (p. 10). End of update.

In this statement by the German Society for Epidemiology, parameters of the model calculation are also mentioned on page 3.

A study using the similarity of the curve to a Gaussian curve can be found here. In his model calculation, the author tries to calculate the end of the first wave of infections.

Update 05/20: At this point I went into an ominous 27% line in David Kriesel's evaluations. Since David removed this line from his evaluations some time ago, the paragraph has been deleted. End of update.

From a purely mathematical point of view, the following insight results from what has been said so far:

  • The infection numbers show an exponential course with very high values ​​after a short time.
  • There is only one variable that we can influence in order to keep the consequences of the corona pandemic within reasonable limits, and that is R.
  • Small changes in the flat course of the curve have very large effects after a few days.

Is it really that easy?

Now the question is justified: is it even possible to calculate something as complex as a pandemic with a few mathematical formulas?

Yes, at least with the corona virus *, which we are currently dealing with. No, if we wanted to use it to calculate the spread of measles or influenza.

Why is that? Because some circumstances that would make the calculation more complex simply do not apply to Corona:

  • We currently have hardly any immunization, so the rate R is unchecked
  • The reproduction rate is well above 1, so we have continuous growth
  • The spread happens before symptoms appear, before the infected retire to bed sick

So there are very few factors that affect the reproduction rate R: the average frequency of physical encounters and the immunization, which is currently in a very low range.

And because it is so easy to calculate, the figures from China can also be used to estimate how high the lethality (deaths relative to the number of cases) is and how much the health systems are burdened by acute infections.

Since lethality depends on the state of the health system, one of the most urgent tasks is to avoid overloading the health system. The number of cases that require intensive care should be as far below the number of available beds as possible. You can read more detailed considerations here, among other things.

The conclusion from this article is clear, there is no discussion. And governments around the world are guided by such or similar calculations when they order restrictions on the freedom of movement of the population.

It was interesting to see how British Prime Minister Boris Johnson recently favored the concept of controlled immunization ("contamination") of the population and has now completely switched to the restrictions on freedom of movement that have already been introduced in other countries. I guess someone once calculated for him the relationship between the number of infections and the intensive care places required.

The exponential character

We now know how to recognize an exponential course in measurement data. The exponential character clings to sets of numbers in an almost sticky way. Once your source data is exponential, the exponential character is reflected in all the data derived from it. And that can be helpful if you want to estimate the risk from the available figures.

For example, there are people who say that half of all positive tests are false positives because the tests in use also respond to other types of corona virus. The Charité's chief virologist, Christian Drosten, has already commented on this allegation and refuted it. The only other RNA traces to which the test responds come from corona viruses that no longer exist today. So, fortunately, the tests respond to SARS-CoV-2.

Of course, I cannot judge whether this is really the case. But based on our knowledge of exponential functions, we can say the following: If we divide R by two - assuming that only half of the positive tests are actually SARS-CoV-2 positive - we still get an exponential function. The problem remains the same, you only have a little more time. At T = 5.2 and R = 2.4 we reach the 83 million limit after ~ 78 days, while at R = 1.2 we are that far after ~ 120 days.

Another aspect is the deaths from COVID-19. We can say the following about this: There is a connection between the actual cases, the number of which we do not know, and the actually identified deaths. We know: an average of three weeks after infection, a certain percentage of the actual cases will show up as deaths. So the course of actual cases will have the same character as the course of deaths. If the latter is exponential, the actual infections will also be exponential.

Now there are people who say that the deaths are actually other cases of acute respiratory diseases that "happen" to also have the corona virus *, which is harmless in itself. Let's take a look at the course of acute respiratory diseases (without COVID-19) (source: Robert Koch Institute):

The dark line is this year's course. We have seen a strong downward trend in the number of reported cases in the last 5-6 weeks. And now let's look at the deaths with Corona involvement:

We plotted the data on a logarithmic scale because we know that an exponential course on a logarithmic scale results in a straight line. And you can't imagine a better straight line than the one in this diagram. The curve jitters a bit in the lower range because we have very few cases there. Later the course becomes smoother. The course of deaths does not correspond at all to the course of respiratory diseases. What we are looking for is a source that is exponential, and the rate of respiratory diseases is not.

Now the question is: Which theory is one more inclined to believe: that the deaths have to do with the various acute respiratory diseases, or that they follow a presumed exponential course of the actual cases of COVID-19?

Finally, let's look at an article that puts forward a very interesting theory: that the exponential increase in confirmed infections comes from the number of tests increasing exponentially, and is therefore an artifact of the measurement. The author of the article, Paul Schreyer, therefore asked the Robert Koch Institute what number of tests their case numbers are based on.

For week 11 and 12 the numbers actually appeared. And Mr. Schreyer interprets this in such a way that the number of reported infections shows a barely noticeable increase compared to the number of measurements and therefore the number of actual infections hardly increases. The author divides the number of new infections by the number of measurements and thus obtains a rate of 5.9 or 6.8%. Or rather, he didn't have to do it himself, because the RKI published it like this (p. 6):

Now we can apply a fine method of mathematics. We take Mr. Schreyer's thesis as a prerequisite and see where that leads us. If we get into contradictions as a result of the thesis, we can reject it.

The thesis is roughly: No matter how many tests we carry out, we always arrive at roughly the same rate of 6-7% positive cases. The reported number of infections is almost exclusively dependent on the number of tests. And if this increased exponentially, the number of positive tests would also increase exponentially.

It follows from this approach that we have had an almost constant level of infections for some time, around 6%, with a slight increase in week 12 from 5.9 to 6.8%, which we can neglect here. We continue to reckon with 6%.

For this calculation to work out, we have to assume an average of 6% infections in the population; with 83 million inhabitants that would be a whopping 4,980,000 infections. Let's assume that the lethality of COVID-19 is 0.2%, as is the case with the flu. 0.2% is a factor of 0.002, which would result in 9,960 deaths; the 0.6% that Mr. Schreyer reckons with would result in 29,880 deaths. The worldwide average of around 4% would give 199,200. When this article was written, we had 557 deaths in Germany. (Update: In a first version of the article, my figures were wrong by a factor of 100. Sorry again.)
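The arithmetic of this paragraph is easy to check yourself. The following Python sketch only restates the numbers from the text; the function and constant names are made up for this example.

```python
POPULATION = 83_000_000   # inhabitants of the Federal Republic
POSITIVE_RATE = 0.06      # the thesis: about 6% of the population infected


def implied_deaths(lethality):
    """Deaths to expect if 6% of the population were infected."""
    infections = POPULATION * POSITIVE_RATE
    return round(infections * lethality)


print(round(POPULATION * POSITIVE_RATE))   # implied number of infections
for lethality in (0.002, 0.006, 0.04):
    print(lethality, implied_deaths(lethality))
```

Each of the three results is many times larger than the 557 deaths reported when the article was written.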

You can tell from the size that something is wrong here. And that is now an important point: a thesis must harmonize with the existing numbers. Perhaps one can reconcile them with the actual numbers through an adventurous theory, but that is just not very effective. This is where I use Occam's razor.

But the real argument is as follows: We are seeing an exponential increase in deaths from Corona * infection. This can be a coincidence, but it can also follow an exponential course of the actual infections. Which is more likely? My experience says that an exponential course never arises randomly. It always comes from an exponentially growing source.

And we mustn't forget one thing in all considerations. We are not alone in the world in Germany. Italy shows a massive increase in intensive care cases, all with the same symptoms, all corona-positive *. What we don't see is a corresponding increase in other respiratory diseases in Italy. What we are seeing, however, are 10,000 deaths in the past few weeks that are corona positive. What we also see are the exponential trends in reported infections and corona-positive deaths worldwide.

So my mathematical feeling tells me: We have to be on our guard, we need more tests to be able to check the actual status of the infections. And we have to do everything we can to push the value R down.

We find the following sentence in the article by Mr. Schreyer:

"According to the latest data from the RKI (March 27), the proportion of those who died compared to those who tested positive is 0.6%. According to the RKI boss Wieler, their average age (!) is 81 years. An extreme risk for the entire population can hardly be deduced from this."

You can say that if you don't finish calculating your own numbers. We don't even want to talk about the attitude towards the older part of our society. But you could also see it differently:

One possible interpretation of the numbers

We are at the beginning of an exponential spread of a disease that knows only very mild or very severe cases. The cases that have occurred so far, and the short period within which the corona virus * has been spreading, have so far led to only a few deaths in Germany, among older citizens with pre-existing conditions. But once the disease really spreads across the population, the deaths of younger, healthier people will also increase. The probability is lower, but as the number of infections increases, a low probability still turns into a high number.

The situation is similar with lethality (= deaths / reported infections). It may decrease slightly over time because those who have little to set against the virus are carried off first. On the other hand, the number of infected people is increasing exponentially and will very soon reach very large numbers. Multiply those numbers by a lower lethality and the result is still terrifying.

This is our interpretation of the numbers, and it stands for us until someone comes up with a better one.

Conclusion

With the help of a few parameters, the theoretical course of the corona pandemic can be calculated relatively easily. We have described the use of the exponential function for this purpose. The most important parameter for the course of the disease is the reproduction rate R. It can be reduced by measures that lower the frequency of physical closeness between people. Why this is so and how long it has to be maintained is explained quite well in this article. We hope that knowledge of the exponential function will help separate the wheat from the chaff in the many contributions to the discussion.


* Every mention of the corona virus in this article means the type SARS-CoV-2, unless explicitly stated otherwise.