Risk Post Statistical Methods Explanation

The BuDS Weekly Risk Assessment posts have been an integral part of our response to Covid-19 for over a year. To remain as transparent as possible, the Covid-19 statistics team have prepared the following summary of how we form the figures and conclusions described in the posts. We hope this will give a better understanding of how our numbers are reached.

BuDS’ Covid-19 research team is made up of qualified professional people including a statistician and a clinical epidemiologist. We take very great care to make sure that we only provide you with fact-checked, accurate information. The end of this article has links to all the sources that we use to get this data. If you have any further questions, please contact us using the information at the end of the article and we will be happy to explain more.


CASE NUMBERS AND INFECTIONS

BuDS uses 4 separate sets of official data to make sure we can give you the best and most balanced picture of the risk from Covid-19 in Bucks. These are: positive tests, average R rate data, maximum R rate data, and the ONS Infection Survey. All case and rate data is downloaded from the Government Covid-19 data download page, which can be accessed as part of the PHE data dashboard. The ONS Infection Survey data is downloaded directly from the Infection Survey page, on the ONS website. A full set of links to these pages can be found at the end of this article.

FIRST SET OF DATA: POSITIVE TESTS

As explained in our Risk Assessment posts, Public Health England count the number of people who test positive for Covid-19 using either a laboratory PCR test or a lateral flow test kit later confirmed by a PCR test. These figures are frequently and heavily revised after publication, often because of delays in reporting or because positive lateral flow test results are overturned by a negative PCR test. BuDS counts cases by specimen date, which is the date on which the test was taken. As test results are often not reported until a few days after the test was taken, the published figures lag behind the “true” numbers for a given day or period.

BuDS tries to give an accurate estimate of the number of people who are infectious in the community, rather than just those who have tested positive in the last week. Because people typically stay infectious for around 14 days, we count positive tests over a rolling 2 week period. However, this total still underestimates the true number of infected people in the community, because lots of people who have Covid-19 don’t get tested or don’t report a positive test. So, BuDS increases the number of people testing positive to take account of the people who are positive but don’t get tested or report a positive test.

To do this, we take the reported cumulative number of people testing positive in the most recent week, by specimen date. We download our data on a Monday afternoon, so it contains case figures up to the end of the day before (Sunday). However, because not all cases by specimen date have been reported by then, this data will always show fewer positive cases for that week than were actually recorded. To correct this, we plot the cumulative case numbers for each Sunday on a graph, then fit a trend line to it with an accuracy of more than 95%. The coefficient of the trend line is then used in a formula to create a multiplier. This multiplier is applied to the number of cases on the preceding Sunday, to work out how many cases we should have had on the most recent Sunday (the day before we downloaded the data). This number is then added on to the cumulative total for that week, to produce a more accurate estimate of the number of positive tests in Bucks for that period.
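
As a rough illustration of the trend-line step, the sketch below assumes an exponential trend fitted to the Sunday figures, and assumes the multiplier is the week-on-week growth factor implied by that trend. Neither assumption comes from this article, which does not state the formula used. The function names and sample figures are hypothetical.

```python
import math

def weekly_multiplier(sunday_counts):
    """Week-on-week growth factor from a log-linear least-squares fit.

    ASSUMPTION: the trend is exponential, y = a * exp(b * week), and the
    multiplier is exp(b). The real trend line and formula may differ.
    """
    weeks = list(range(len(sunday_counts)))
    logs = [math.log(count) for count in sunday_counts]
    n = len(weeks)
    mean_week = sum(weeks) / n
    mean_log = sum(logs) / n
    slope = (
        sum((w - mean_week) * (l - mean_log) for w, l in zip(weeks, logs))
        / sum((w - mean_week) ** 2 for w in weeks)
    )
    return math.exp(slope)

def estimate_latest_sunday(complete_sunday_counts):
    """Apply the multiplier to the preceding (fully reported) Sunday."""
    return weekly_multiplier(complete_sunday_counts) * complete_sunday_counts[-1]

def corrected_week_total(reported_week_total, complete_sunday_counts):
    """Add the estimated Sunday figure to the week's reported total."""
    return reported_week_total + estimate_latest_sunday(complete_sunday_counts)

# Hypothetical fully reported Sunday figures, oldest first
sundays = [100, 200, 400, 800]
print(round(estimate_latest_sunday(sundays)))
print(round(corrected_week_total(5000, sundays)))
```

A log-linear fit is used here only because it turns a single coefficient into a multiplier directly; a different trend line would need a different formula.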

However, we know from independent analysis that many infected people are asymptomatic, and so do not get a test because they don’t feel the need to do so. To account for this, we increase the cumulative total for the most recent week by 80%, and the cumulative total for the preceding week by 20%, thereby producing 2 increased totals. This is based on the WHO estimate that 80% of people are asymptomatic in the first week of infection, and 20% are asymptomatic in the second week of infection. These two increased weekly totals are then combined, to produce a rolling 2 week estimate of the number of infected people in Bucks. It is this number that we then use to calculate the rest of the figures in the post. It is important to note that this rolling 2 week total is always lower than the true number when viewed retrospectively. This is because, while BuDS can account for the missing Sunday data, we cannot estimate how many cases by specimen date will be added later, or any exponential rise in the data that could affect the trend. Hence while the number we give is the best estimate we can produce, it is always a low estimate.
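
The asymptomatic adjustment described above can be sketched as follows. The 80% and 20% figures are the ones given in the text; the function name and the sample weekly totals are hypothetical.

```python
def two_week_infected_estimate(latest_week_total, previous_week_total):
    """Rolling 2 week estimate including the asymptomatic uplift."""
    latest_uplifted = latest_week_total * (1 + 0.80)      # add 80% to the most recent week
    previous_uplifted = previous_week_total * (1 + 0.20)  # add 20% to the preceding week
    return latest_uplifted + previous_uplifted

# Hypothetical weekly totals of positive tests (already corrected for the missing Sunday)
print(round(two_week_infected_estimate(1500, 1200)))
```

With these sample figures, the most recent week becomes 2700 and the preceding week becomes 1440 before the two are added together.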

SECOND AND THIRD SETS OF DATA: R RATE DATA

Public Health England use a range of surveys and test data every week to calculate the ‘R Rate’ for Covid-19. The R rate measures how quickly the number of people infected with Covid-19 is growing or shrinking. The figure we use covers the South East as a whole, not Bucks alone. The R rate can also be viewed as a multiplier: if the R number is 1, every infected person infects (on average) one other person. As it rises above 1, the number of people infected by one person increases (so for an R rate of 1.4, every infected person would infect, on average, 1.4 others). Hence we can apply the R rate to the estimated PHE test numbers (as calculated above) for the previous week, to estimate what the cases would be in the most recent week if they were increasing at the same rate as in the rest of the South East. For example, with an estimated weekly case total (in the previous week) of 1500 and an R rate of 1.5, the R rate estimated cases for the most recent week would be 1500 × 1.5 = 2250.

We then take this calculated number, and add it to the calculated number for the previous week (found using the same method) to again produce a rolling 2 week total. This is increased to account for asymptomatic cases in the same way as the PHE test data is, to produce a single total number of infected people as before. BuDS does this calculation twice, using both the average and maximum R rate for the South East, to produce 2 estimates. These estimates are usually close in value to the estimate from the PHE testing data, but they can both be above it, both be below it, or fall either side of it. It is important to remember that the numbers used for these estimates apply to the whole South East, and not just Bucks. This means that if infection rates in Bucks are higher or lower than across the whole region, the estimates will be affected. However, we do not believe that this effect is large enough to significantly affect the validity or accuracy of our calculations.
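
A minimal sketch of the R rate calculation follows. It assumes that each week's estimate is the previous week's estimated cases multiplied by the R rate in force for that week, and that the 80% and 20% uplifts are applied to the most recent and preceding weeks respectively. The function names and sample figures (other than the 1500 and 1.5 example) are hypothetical.

```python
def r_rate_week_estimate(previous_week_cases, r_rate):
    """Estimated cases for a week, using the R rate as a multiplier."""
    return previous_week_cases * r_rate

def r_rate_two_week_total(cases_two_weeks_before, cases_one_week_before,
                          r_rate_previous_week, r_rate_latest_week):
    """Rolling 2 week total from R rate estimates, with asymptomatic uplift."""
    previous_estimate = r_rate_week_estimate(cases_two_weeks_before, r_rate_previous_week)
    latest_estimate = r_rate_week_estimate(cases_one_week_before, r_rate_latest_week)
    # Same uplift as for the test data: +80% latest week, +20% preceding week
    return latest_estimate * 1.80 + previous_estimate * 1.20

# Example from the text: 1500 cases and an R rate of 1.5
print(r_rate_week_estimate(1500, 1.5))

# The calculation is run twice, once with the average and once with the maximum R rate
average_total = r_rate_two_week_total(1200, 1500, 1.0, 1.2)
maximum_total = r_rate_two_week_total(1200, 1500, 1.0, 1.5)
print(round(average_total), round(maximum_total))
```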

FOURTH SET OF DATA: ONS INFECTION SURVEY

The Office for National Statistics test a random sample of 50,000 people across the UK every week, whether they are ill or not, to see how many have Covid-19. This Infection Survey is the most accurate way of estimating how many people have Covid-19, because it doesn’t rely on people choosing to get a test or report a positive result. The Infection Survey consistently shows that there are more infected people in the community than are shown by the PHE test data, even when asymptomatic people are added in. Unfortunately, the survey results are published in arrears, usually 2 weeks behind the current date. This means that we cannot use it as a measure in our calculations, but we can use it to verify the accuracy of the calculations made before.

The Infection Survey gives an estimate of how many people are infected in the South East (and other regions) as a percentage of the total population. For a sample week, the survey might report that 0.1% of the population of the South East were infected. BuDS multiplies this percentage by the population of Bucks, which we are taking to be 543,973 (according to 2018 survey data). This produces an estimate for how many people would be infected in Bucks if infections had the same prevalence as they did across the South East. So far, our calculations – even with the asymptomatic cases added in – have always been below the totals estimated using the Infection Survey. Hence, our weekly figures are always low estimates (as stated above). It is important to remember that, as with the R rate data, the numbers used for these estimates apply to the whole South East and not just Bucks. This means that if infection rates in Bucks are higher or lower than across the whole region, the estimates will be affected. However, we again do not believe that this effect is large enough to significantly affect the validity or accuracy of our calculations.
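
The Infection Survey check amounts to a single multiplication, sketched below using the 0.1% sample figure and the Bucks population given above. The function name is hypothetical.

```python
BUCKS_POPULATION = 543_973  # population figure quoted in the text

def ons_estimate_for_bucks(regional_prevalence_percent, population=BUCKS_POPULATION):
    """People who would be infected in Bucks at the regional prevalence."""
    return population * regional_prevalence_percent / 100

# Sample week from the text: 0.1% of the South East population infected
print(round(ons_estimate_for_bucks(0.1)))
```

This figure is then compared against the rolling 2 week estimates as a check, rather than being used in the calculations themselves.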

POPULATION ESTIMATES

In the Risk Assessment posts, we produce a list of how many infected people we believe could be in a specific geographical location or situation. These estimates are based solely on the population of the area, and not on how many people are actually infected in those places. This is because BuDS does not have access to the data to know how many infected people are in a specific place. However, we believe that this population-based estimate is accurate enough to give a general picture of risk.

BuDS uses the “worst case” scenario approach when calculating these figures. This means we take the highest of the 3 totals described above (PHE test data, average, and max R rate data) and assume this to be the number of infected people in Bucks. We also assume that nobody is isolating or in hospital, and that all people are out in the community and potentially able to infect others. This was a conscious decision, as we believe it is better to slightly overestimate the risk but keep people safe, than underestimate it and put people at risk.

Once we have decided which set of figures we should use, we then divide this number by the population of Bucks (assumed to be 543,973, as above). This gives us a figure for the number of infections per person, which is usually somewhere in the region of 0.01-0.05 (depending on tests). For example, 1500 cases total would equate to a figure of 0.00276 (3sf). This number is then multiplied by the population of each area to give the estimated number of infected people in that area, which is always rounded to the nearest whole number for simplicity. For example, Aylesbury (population c.60,000 from survey data) would have about 165.45 infected people using the 1500 cases assumption above, which we would then report as 165.
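
The “worst case” choice and the area calculation can be sketched together as below. The population figure and the Aylesbury example come from the text; the function names and the two R rate totals in the example are hypothetical.

```python
BUCKS_POPULATION = 543_973  # population figure quoted in the text

def worst_case_total(phe_total, average_r_total, max_r_total):
    """Take the highest of the three estimates of infected people in Bucks."""
    return max(phe_total, average_r_total, max_r_total)

def area_estimate(infected_total, area_population, county_population=BUCKS_POPULATION):
    """Estimated infected people in an area, rounded to a whole number."""
    infections_per_person = infected_total / county_population
    # Note: Python's round() sends exact .5 ties to the nearest even number
    return round(infections_per_person * area_population)

# Hypothetical totals where the PHE test figure happens to be the highest
total = worst_case_total(1500, 1400, 1450)
print(total)

# Aylesbury example from the text: population of about 60,000
print(area_estimate(total, 60_000))
```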

HOSPITALISATIONS AND DEATH DATA

To calculate the number of people in hospital in Bucks, we again use a custom data download from the Government Covid-19 data download page. This gives us the number of cumulative hospitalisations, along with the number of new admissions, the total number of people in hospital, and the number of people in intensive care beds. These figures are given for each day, and so we use the most recent day available to give the totals in hospital for that week. We also describe the overall trend in each figure, which we judge simply by looking at the data. The data given in the report only applies to the Buckinghamshire Healthcare NHS Trust, although we are also able to access data for other nearby trusts if needed.

To calculate the numbers of people dying in a given week, BuDS downloads the numbers of people who died within 28 days of a positive Covid-19 test, within 60 days of a positive Covid-19 test, and the ONS summary of deaths from Covid-19 from the Government data download page. As these are cumulative figures, we then subtract each day’s figure from the figure a week later, to give the number of deaths in that week. We then note down the number in the week leading to the day of the risk post (any given Monday), which is then used in the post.
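
The weekly figure is the difference between two cumulative totals seven days apart, as in the sketch below. The function name and the sample series are hypothetical.

```python
def weekly_deaths(cumulative_by_day, day_index):
    """Deaths in the 7 days ending on day_index, from a cumulative daily series."""
    if day_index < 7:
        raise ValueError("need at least 7 earlier days of data")
    return cumulative_by_day[day_index] - cumulative_by_day[day_index - 7]

# Hypothetical cumulative deaths, one value per day, oldest first
cumulative = [100, 101, 103, 103, 104, 106, 107, 110]
print(weekly_deaths(cumulative, 7))
```

The same subtraction would be repeated for each of the three death measures listed above.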

VACCINATION DATA

To show what proportions of the Bucks population are vaccinated, and how many of them have had each vaccine, we use the “vaccinationAgeDemographics” data for Bucks from the Government Covid-19 data download page. This gives a spreadsheet split by vaccination date and age group according to the NHS vaccination register. From this we can work out how many people in each age group have been vaccinated using simple arithmetic, split by first and second dose. We then combine the population and vaccinated totals, to give an overall percentage and number vaccinated for under 70s and over 70s. These figures are quoted in the Risk Assessment post, exactly as calculated. The populations are updated each week as well as the numbers of people vaccinated, to reflect the fact that population changes occur each week due to birthdays and deaths.
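
A minimal sketch of the arithmetic is shown below. It assumes each row of the spreadsheet has already been reduced to one age band with a population and cumulative first and second dose totals. The dictionary keys and sample figures are hypothetical and are not the column names used in the download.

```python
def vaccination_summary(rows):
    """Combine age bands into under 70 and 70-and-over totals and percentages."""
    totals = {
        "under_70": {"population": 0, "first_dose": 0, "second_dose": 0},
        "70_and_over": {"population": 0, "first_dose": 0, "second_dose": 0},
    }
    for row in rows:
        group = "under_70" if row["min_age"] < 70 else "70_and_over"
        for key in ("population", "first_dose", "second_dose"):
            totals[group][key] += row[key]
    for group in totals.values():
        group["first_dose_percent"] = 100 * group["first_dose"] / group["population"]
        group["second_dose_percent"] = 100 * group["second_dose"] / group["population"]
    return totals

# Hypothetical age bands (min_age is the lowest age in the band)
rows = [
    {"min_age": 18, "population": 2000, "first_dose": 1000, "second_dose": 400},
    {"min_age": 65, "population": 1000, "first_dose": 900, "second_dose": 800},
    {"min_age": 70, "population": 500, "first_dose": 480, "second_dose": 450},
]
summary = vaccination_summary(rows)
print(summary["under_70"]["first_dose"], round(summary["under_70"]["first_dose_percent"], 1))
print(summary["70_and_over"]["second_dose"], round(summary["70_and_over"]["second_dose_percent"], 1))
```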

SUMMARY

All the figures above are brought together in the Risk Assessment post each week, alongside BuDS’ own analysis. The statistics inform the analysis and assessments made, along with the factually considered opinion of the Covid-19 research team and trustees. The reports are intended to keep people safe, and as such may often differ in conclusion from some media outlets and Government policy. We hope this article gives you a better understanding of how we get the numbers used in the Risk Assessment posts.

LINKS

To access the Government Covid-19 data download page, use this link: https://coronavirus.data.gov.uk/details/download

To view the ONS Infection Surveys, use this link: https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/coronaviruscovid19infectionsurveypilot/previousReleases

To view our weekly risk assessment posts, use this link: https://buds.org.uk/category/our-work/iag-covid-19/risk-assessments/

FINALLY

Please share this article on social media, but always credit BuDS.

If you’d like to know more about this topic, please contact us at info@buds.org.uk – we will be happy to explain more about how we calculate these numbers.

If you need advice on how to keep yourself safe, or any other form of help or support, or you’re anxious about Covid-19, BuDS is here for you. Please e-mail buds-support@buds.org.uk, call 01494 211179 (voicemail) or message us and we’ll do all we can to help.