Time series for Australian states, originally via covid19data.com.au

pappubahry/AU_COVID19

AU_COVID19

(2022-08-20: I will no longer update these files. You can go to covid19data.com.au or covidlive.com.au for daily updates.)

Time series of confirmed COVID-19 cases for Australian states, originally from and still cross-checked against covid19data.com.au, compiled primarily by Juliette O'Brien. When case numbers are reported at differing times of day, there may be differences between my data and that site. I am trying to use my judgement to make the time series as consistent as possible, but the data is inherently messy and you shouldn't necessarily trust every daily percentage change for every state.

(Note added 2022-01-08: Victoria has started reporting probable cases from rapid tests. The sources of infection CSV file provided by Victoria only considers cases identified through PCR tests, and the federal government statistics also only consider PCR tests. To make my life easier, at least for now I am separating out the rapid-test cases into their own file, time_series_new_cases_rat.csv. This should also mean somewhat more continuity in the (PCR-)test positivity rate, which would otherwise be inflated by the rapid-test cases, negative results for which are not reported.)
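To illustrate why the rapid-test cases are kept in their own file, here is a minimal sketch of the positivity calculation. The figures are invented for illustration only, and the function name `positivity` is hypothetical; it is not part of this repository.

```python
def positivity(new_cases, new_tests):
    """Return positive results as a percentage of tests performed."""
    if new_tests <= 0:
        raise ValueError("new_tests must be positive")
    return 100.0 * new_cases / new_tests


# Invented figures for a single day (not real data).
pcr_cases = 500      # positives from PCR tests
pcr_tests = 10_000   # PCR tests performed, positive and negative
rat_cases = 300      # probable cases from rapid tests; negatives are not reported

# Numerator and denominator both come from PCR testing.
pcr_only = positivity(pcr_cases, pcr_tests)

# Rapid-test positives added to the numerator with no matching negatives
# in the denominator, which pushes the rate up.
mixed = positivity(pcr_cases + rat_cases, pcr_tests)

print(f"PCR-only positivity: {pcr_only:.1f}%")
print(f"With rapid-test cases in the numerator: {mixed:.1f}%")
```

Because the rapid-test negatives never enter the denominator, the second figure overstates the rate; keeping the two case series apart avoids that.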

Notes on NSW

  • On 3 July, NSW reported 189 historical cases from the Ruby Princess to the federal government. These now appear in NSW's totals published by the federal Department of Health, but not in those published by NSW Health (the cases – all crew members – were diagnosed and managed on the ship). My numbers follow the NSW Health website, so they exclude these cases.

  • The NSW sources of infection are from Data.NSW's CSV file, which now contains the full case dataset, up to several days ago ("Publication of some data in this dataset is being delayed because the risk of gaining information about an individual in the dataset increases as the number of cases decreases"). The following bullet points are no longer relevant unless you're trying to piece together the history of time_series_nsw_sources.csv. Update 4 June: On 3 June, the dates in the CSV file for all or most cases were changed. I believe that the date was previously the date the test sample was taken. I don't know what it is now, but it might be the date of test analysis or something related.

  • The NSW sources figures prior to 9 March are extracted from the epidemiological curve graph at the NSW Health statistics page. My extraction code isn't necessarily precise (the axis ticks may be two pixels tall, for example), and there can be small discrepancies between my totals for each source of infection and the totals reported by NSW Health. On 3 April, the published graph stopped distinguishing between 'locally acquired from unknown source' and 'locally acquired from a known contact/cluster', and I have used the graph published on 2 April for these early dates, which will probably cause minor inconsistencies as this early data is still occasionally revised. I have arbitrarily placed two cases (1 March and 7 March) in the amalgamated 'local' category into 'local unknown'.

  • Numbers since 9 March are counted from the CSV file at Data.NSW. There are much larger discrepancies between the number of cases as shown in the epidemiological curve and the number of confirmed cases in time_series_cases.csv. I think this is because the date reported in this file (and graph) is the date that the sample was taken, and there can be quite a lag between the sample being taken and it being analysed to a positive result. Expect the numbers for the last few days to be substantially revised upwards as the backlog of samples is tested; data from earlier days is also subject to revision.

  • On 11 April, many cases previously classified as 'Locally acquired - contact not identified' were reclassified into the new category 'Overseas or interstate'. On 14 April, a separate 'Interstate' category was introduced into the source CSV; this standalone category doesn't exist in the graphs, so my data does not show any interstate infections prior to 9 March, even though some may be recorded as such internally at NSW Health.

  • On 21 March, NSW changed from reporting case numbers as of 11am to case numbers as of 8pm the previous evening.

Thanks to @tetrakazi for her scraper of the Victorian data, which I have adapted for the sources of infections for that state. I don't know why our time series don't agree.

Prior to 21 April, a bug in my parsing scripts meant that there were occasional errors in time_series_vic_sources.csv and time_series_act_sources.csv. From 21 April, the total numbers of reported cases should tally correctly with time_series_cases.csv.

time_series_wa_sources.csv presents cumulative totals with the date supposedly being "optimal date of onset", but sometimes the numbers go down, which I don't understand. Some cases recorded as local contact are, according to media releases, of people in hotel quarantine.
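A quick way to locate the days on which a cumulative total goes down is sketched below. It works on plain lists rather than reading the CSV, because the column layout of the file is not described here; the function name `find_decreases` and the sample values are hypothetical.

```python
def find_decreases(dates, cumulative):
    """Return (date, previous_total, new_total) for each day the total fell."""
    drops = []
    for i in range(1, len(cumulative)):
        if cumulative[i] < cumulative[i - 1]:
            drops.append((dates[i], cumulative[i - 1], cumulative[i]))
    return drops


# Invented example series (not real WA data).
dates = ["day 1", "day 2", "day 3", "day 4"]
totals = [10, 12, 11, 15]

drops = find_decreases(dates, totals)
for day, before, after in drops:
    print(f"{day}: total fell from {before} to {after}")
```

Anyone deriving daily counts by differencing such a series would get a negative value on each flagged day, so it is worth checking for these before further analysis.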

WA counted historical cases identified through serology testing in its case count until 1 August. My numbers follow the WA dashboard, so 26 of these historical cases were removed from the tally on that date.

As of 2021-05-21, the ACT dashboard has five cases without a source of infection; based on the federal dashboard numbers, these appear to be interstate-acquired, and are now classified in time_series_act_sources.csv as such.

On 2021-08-21, the ACT changed its reporting time cutoff from 9am to 8pm; the case total for this day reflects an 11-hour period.

In time_series_tests.csv:

  • WA's figures are persons tested until 30 April; from 1 May they are tests performed.
  • NSW's figures are persons tested until 25 May; from 26 May they are tests performed.
  • Other states are (I believe) all tests performed.
  • On 6 June, Victoria's number fell by about 12,000 after removing duplicated data.
  • Qld added about 38,500 tests from a private provider on 22 June.
  • SA added tests from a private provider on 29 July.
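The breaks listed above matter when differencing the testing series into daily counts. The sketch below blanks the daily figure on each break date instead of reporting a misleading jump. It rests on several assumptions: that the series is cumulative, that the dates fall in 2020 (no year is given for them here), and that the state labels and the names `BREAKS` and `daily_tests` are illustrative rather than taken from the file.

```python
from datetime import date

YEAR = 2020  # assumption: no year is stated for these dates

# Dates on which a state's series changes definition or has a one-off step.
BREAKS = {
    "WA": [date(YEAR, 5, 1)],    # persons tested -> tests performed
    "NSW": [date(YEAR, 5, 26)],  # persons tested -> tests performed
    "Vic": [date(YEAR, 6, 6)],   # about 12,000 duplicates removed
    "Qld": [date(YEAR, 6, 22)],  # about 38,500 private-provider tests added
    "SA": [date(YEAR, 7, 29)],   # private-provider tests added
}


def daily_tests(state, dates, cumulative):
    """Difference a cumulative series, returning None on break dates."""
    breaks = set(BREAKS.get(state, []))
    out = []
    for i in range(1, len(cumulative)):
        if dates[i] in breaks:
            out.append((dates[i], None))  # not comparable with the day before
        else:
            out.append((dates[i], cumulative[i] - cumulative[i - 1]))
    return out


# Invented cumulative totals around Victoria's 6 June correction (not real data).
vic_dates = [date(YEAR, 6, 4), date(YEAR, 6, 5), date(YEAR, 6, 6), date(YEAR, 6, 7)]
vic_totals = [500_000, 510_000, 508_000, 518_000]

result = daily_tests("Vic", vic_dates, vic_totals)
for day, count in result:
    print(day.isoformat(), "n/a" if count is None else count)
```

Returning `None` rather than a number keeps the break visible to downstream code, which can then decide whether to skip the day or interpolate.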

As of 23 April, relevant state health department links:

NSW: COVID-19 page, Source of infections CSV

Vic: Dashboard

Qld: Statistics

SA: Dashboard

WA: Dashboard

Tas: Statistics; my daily updates used to follow the (discontinued as of 12 June) evening case announcements, usually tweeted by Monte Bovill (ABC), Emily Jarvie (Advocate/Examiner), and others.

ACT: Dashboard

NT: COVID-19 page

The federal government's statistics page has some testing statistics not always released by the state health departments.
