Since 1986, when it began as an initiative of President Ronald Reagan, Mathematics Awareness Week has aimed to increase public understanding of and appreciation for mathematics. It has grown over the years to last a month, and the inclusion of statistics is due in part to the rapid growth of statistics jobs. Driven by the US-based organisation This is Statistics, Mathematics & Statistics Awareness Month is relatively unknown Down Under, but it is a good opportunity to talk about the role of mathematics and statistics in our current environment.
The role of mathematics and statistics in the COVID-19 pandemic
Strange times we live in. The information stream is endless and, as we are all hibernating in our homes, every morsel of news coming in, from reputable websites to social media, is taken in and analysed. Should we wear masks? Is it safe to go for a run? How long is this situation going to last?
Whilst a lot of questions don’t quite have answers yet, it has become clear that data and visualisations of those data are at the forefront in the media. This is the time when statisticians come out of the woodwork (rather than staying in the background, their usual habitat) to help steer future directions and influence new policies (see what the ABS is doing here for example). It is also a time in which scientists have a duty to help interpret key messages and reinforce the ones that are evidence-based while countering urban myths. So in the spirit of Mathematics & Statistics Awareness Month, it is useful to look at a few of these examples and think more critically about the role statistics plays.
Flatten the curve
It’s the talk of the street. We need to flatten the curve. But do we really understand what we mean by that? And which curve are we talking about?
ABC FactCheck did a good piece explaining the different formats of the curves that have been published in the media and why the numbers are not always what they are made out to be.
The same data graphed in a different format can lead to diverging interpretations if not investigated carefully. Mark Sanderson, Irene Hudson and Mark Osborn from RMIT University published a tutorial on The Bar Necessities in The Conversation. Republished by ABC News, this hopefully reached an audience beyond the science-minded crowd.
Scott Morrison has announced that the curve is flattening, based on the trend seen in the daily count of new infections. At the same time, the government released the scientific modelling on which it based its decisions to curb the COVID-19 infection rate. Rachael Brown from ANU provides a more in-depth analysis of what scientific modelling is exactly (The Conversation). Thus far, the released modelling has been a theoretical exercise, and we are looking forward to the more in-depth modelling based on the actual Australian data to draw inference about whether we are indeed flattening the curve to the extent that we think we are.
Irrespective of the challenges that COVID-19 is throwing at us from a public health and economic perspective, it has been an exceptional example for increasing public awareness and understanding of data visualisation.
Thinking on the log scale
Another mathematical and statistical concept that is now very much in the public eye is the log scale. Basically, a log transform is applied to the COVID-19 case count data to derive a growth rate. This practice is standard in time series analysis. In statistics, this transform is useful because an exponentially growing series becomes a straight line on the log scale, so it can be analysed within linear models whilst maintaining an intuitive interpretation: the slope of the line is the growth rate.
Popular media have been using log-transformed numbers to discuss the rate at which cases double. The log operator is often seen as an abstract concept, but it is visualisations like this that attach practical meaning.
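To make the link between the log scale and doubling times concrete, here is a minimal sketch in Python. It assumes the counts grow roughly exponentially, fits a straight line to the log of the counts by ordinary least squares, and converts the slope into a doubling time by dividing log(2) by it. The function names and the case series are made up for illustration; they are not real COVID-19 data and not the method used by any particular media outlet.

```python
import math

def growth_rate(counts):
    """Least-squares slope of log(count) against day index (per-day growth rate)."""
    days = range(len(counts))
    logs = [math.log(c) for c in counts]  # requires strictly positive counts
    n = len(counts)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    sxx = sum((x - mean_x) ** 2 for x in days)
    return sxy / sxx

def doubling_time(counts):
    """Days needed for the count to double, given the fitted growth rate."""
    return math.log(2) / growth_rate(counts)

# Synthetic (invented) cumulative case counts that double every 3 days.
cases = [100 * 2 ** (day / 3) for day in range(10)]

print(f"growth rate per day: {growth_rate(cases):.3f}")
print(f"doubling time in days: {doubling_time(cases):.2f}")
```

On real data the points will not sit exactly on a line, and a slope that decreases over time is one way of seeing that the doubling time is getting longer.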
No two proportions are the same
Most stats that are published around COVID-19 are proportions or conditional counts, even if that is not always clear. For example, the number of cases that are reported is actually a count conditional on the number of tests performed. Interestingly though, the criteria for testing vary widely between countries. So when we get lured into comparing confirmed cases between countries, we tend to forget that there is a denominator there that is not taken into account.
For example, the fatality rate is affected by the number of tests, but test protocols differ between countries. In some countries in Europe, tests are only performed on those with severe symptoms requiring hospitalisation, whilst in Australia, which is being touted for its low death rate, tests are performed more broadly and include those who are displaying only mild symptoms. On the other hand, in China, asymptomatic cases which have tested positive for COVID-19 are not being counted in the COVID-19 case tally. So when we are comparing fatality rates between those countries, we are really comparing apples and oranges. Even worse, in New York COVID-19 related deaths are only counted if they occur in a hospital. This means that the reported death count is already an underestimate, and we do not have a clear idea of the number of actual cases in the city either.
When you see a proportion or a number, always ask yourself the question: Is this statistic conditional on something? If it’s not, should it be? Does it make sense to compare absolute counts when we know that underlying variables like population size and number of tests vary? It is all too easy to take numbers for granted as they are objective and quantitative. But the field of statistics is built around putting uncertainty around those numbers, whether it is through measuring variance or through thinking through the conditional probabilities.
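The effect of the denominator can be illustrated with a small Python sketch. The two countries and all of their figures below are entirely hypothetical, chosen only to show how different testing criteria change the picture; the class and method names are likewise just for illustration.

```python
from dataclasses import dataclass

@dataclass
class Report:
    """Headline figures for one (hypothetical) country."""
    name: str
    population: int
    tests: int
    confirmed: int
    deaths: int

    def positivity(self) -> float:
        # Confirmed cases conditional on the number of tests performed.
        return self.confirmed / self.tests

    def cases_per_100k(self) -> float:
        # Confirmed cases scaled by population size.
        return 100_000 * self.confirmed / self.population

    def naive_fatality_rate(self) -> float:
        # Deaths divided by confirmed cases only; undetected cases are ignored.
        return self.deaths / self.confirmed

# Invented numbers: A only tests severe cases, B also tests mild cases.
country_a = Report("A (tests severe cases only)", 10_000_000, 20_000, 5_000, 250)
country_b = Report("B (tests mild cases too)", 10_000_000, 200_000, 10_000, 100)

for r in (country_a, country_b):
    print(
        f"{r.name}: {r.confirmed} confirmed, "
        f"{r.positivity():.0%} of tests positive, "
        f"{r.cases_per_100k():.0f} cases per 100k, "
        f"naive fatality rate {r.naive_fatality_rate():.1%}"
    )
```

In this made-up example, country B reports twice as many confirmed cases as country A, yet it has the lower share of positive tests and the lower naive fatality rate, simply because it tests more widely. None of the three numbers on its own tells us how many people are actually infected.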
Miracle cures, misinformation and fake news
From celebrity chef (note chef, not scientist) Pete Evans’ $15,000 miracle machine to Harry Potter author J.K. Rowling’s dangerous breathing practices, you don’t have to look far to find a cure for COVID-19. One caveat: evidence for these so-called cures is typically based on an n=1 design. And whilst there is a place for case studies in science, one might hope that anyone with a bit of a critical mind will classify these particular “cures” as a sham.
It becomes a little more complicated though when the touted cure is backed by peer-reviewed science. Enter hydroxychloroquine, which has been the subject of lots of controversy within the scientific community but also in popular media after being publicised by the higher echelons. The claims are based on a French study which was published online after peer review. As stated in the title, this was a non-randomised, non-blinded study and, whilst the results could easily be interpreted as hydroxychloroquine indeed being an effective drug, the study has been hammered for its scientific flaws. There is uncertainty around discrepancies between the trial registration and reported data, mismatches between pre-published versions and the final paper, missing patients, etc. If you have time to read through the comments on the critique, you will soon see that all science goes out of the window and it becomes a personal and ideological argument.
Another paper from China shows less convincing results with regard to the drug, but here too there are issues in that the reported methods and results do not correspond with the trial registration. And while it was a randomised study, it still wasn’t blinded. In the meantime, the hype around hydroxychloroquine has resulted in bans from countries like India after people died taking it unsupervised, providing us with a sad example of what can happen when scientific rigour is not adhered to.
The role of statistics in the pandemic
It would be easy to dwell on the examples in which statistics were not applied properly or were interpreted with bias, but the key message to take away here is that statistics is playing a crucial role in helping us come out on the better end of the pandemic. Government decisions and policies are being based on modelling and science, and the general public is getting educated on how to interpret data in a critical way. For researchers, it is an opportune moment to reserve some of the time that would normally be spent in the lab to upskill in data analysis. Maybe you can find the time to produce that cool graph you always wanted to create. Or to learn to code in R and set up some routines to process your data for when you get out to collect again. Or to join in on the conversation around how research is critical in handling the pandemic and how understanding data is crucial to having an informed opinion.
Further reading: A statistician’s guide to coronavirus numbers (Royal Statistical Society)
Saul Newman uses large-scale human data to predict fitness-linked demographic traits at the newly-formed Biological Data Science Institute. Saul is also appointed to the ANU Research School of Biology, applying machine learning models to multi-species large scale experiment data using satellites and ground sensors.
Alice was appointed as Director of the Statistical Consulting Unit in October 2019. She is passionate about maintaining girls’ interest in maths and stats, and she enjoys collaborating on research projects in every part of the University.
Marijke joined the Statistical Consulting Unit in May 2019. She is passionate about explaining statistics, especially to those who deem themselves not statistically gifted.