Linkage findings
Scope of the data
Version 1 of the COVID-19 Register had around 250,000 records linked to a range of administrative data sets. Version 2 expanded coverage to more than 6 million linked records, with more recent case data (New South Wales), more jurisdictions (Victoria and Queensland) and more data sets. Version 2.5 had more than 7 million linked records, with more case data for Tasmania, Northern Territory, Victoria and the Australian Capital Territory to 31 December 2022 and updated hospitals data for the Australian Capital Territory, Victoria, New South Wales, Queensland and Tasmania (to June 2022). Version 2.6 includes over 9 million linked records, with updated hospitals data for New South Wales, South Australia and Queensland up to 31 December 2022. Table 1 provides an overview of the data sets and coverage between the versions.
Please refer to the data variables list for the temporal scope of each of the data sets and how it differs between versions.
Table 1: Data sets and coverage across versions
Data set | Version 1 (released in December 2022); number of linked records = 250,821 | Version 2 (released in November 2023); number of linked records = 6,494,308 | Version 2.5 (released in February 2024); number of linked records = 7,256,727 | Version 2.6 (released in June 2024); number of linked records = 9,455,255 |
---|---|---|---|---|
State/territory notifiable disease data on COVID-19 cases | ACT, NSW, NT, SA, Tas | ACT, NSW (updated data), NT, SA, Tas, Vic (new), Qld (new) | ACT (12/03/20 – 31/12/22); NSW (25/01/20 – 30/09/22); NT (21/02/20 – 31/12/22); SA (30/01/20 – 11/02/22); Tas (30/03/20 – 31/12/22); Vic (25/01/20 – 31/12/22); Qld (28/01/20 – 20/08/22) | ACT (12/03/20 – 31/12/22); NSW (25/01/20 – 31/12/22); NT (21/02/20 – 31/12/22); SA (30/01/20 – 31/12/22); Tas (30/03/20 – 31/12/22); Vic (25/01/20 – 31/12/22); Qld (28/01/20 – 31/12/22) |
Medicare Consumer Directory (MCD) | Yes Whole of population | Yes Whole of population | Yes Whole of population | Yes Whole of population |
National Death Index (NDI) | Yes Whole of population | Yes Whole of population | Yes Whole of population | Yes Whole of population |
Medicare Benefits Schedule (MBS) | Yes Cases only | Yes Whole of population | Yes Whole of population | Yes Whole of population |
Pharmaceutical Benefits Scheme (PBS, including Repatriation Schedule of Pharmaceutical Benefits (RPBS) information) | Yes Cases only | Yes Whole of population | Yes Whole of population | Yes Whole of population |
Australian Immunisation Register (AIR) | Yes Whole of population | Not available | Yes Whole of population | Yes Whole of population |
National Notifiable Disease Surveillance System (NNDSS) | Yes Cases only | Yes Cases only | Yes Cases only | Yes Cases only (WA cases are included but are not linked to other data sets in the COVID-19 Register) |
National Hospitals Morbidity Database (NHMD) | Yes Cases only | Yes Whole of population | Yes Whole of population | Yes Whole of population |
National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD) | Yes Cases only | Yes Whole of population | Yes Whole of population | Yes Whole of population |
National Aged Care Data Clearinghouse (NACDC) | Yes Cases only | Yes Whole of population | Yes Whole of population | Yes Whole of population |
National Disability Insurance Scheme (NDIS) | Not available | Not available | Yes Whole of population | Yes Whole of population |
Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database (APD) | Not available | Not available | Yes Whole of population | Yes Whole of population |
Australian and New Zealand Paediatric Intensive Care Registry (ANZPICR) | Not available | Not available | Yes Whole of population | Yes Whole of population |
Linkage rates by jurisdiction
Generally, linkage results depend on the accuracy and completeness of the linkage variables provided to the AIHW: more accurate and complete data result in better linkage rates. For more information on how the data are linked, please refer to the above section on Data and methods.
Figure 2 shows, by state and territory, the number of records that were linked and the number that could not be linked. For most jurisdictions, linkage rates have generally remained the same or improved slightly, with over 90% of records supplied for the project linked in Versions 1, 2, 2.5 and 2.6.
The exceptions are Tasmania, where the proportion of linked cases fell from 99% in Version 2 to 89% in Versions 2.5 and 2.6, and South Australia, where the proportion fell from 95% in Version 2.5 to 91% in Version 2.6. The Tasmanian fall is due to the notable increase in Tasmanian COVID-19 cases from Version 2 (243) to Version 2.5 (282,277) and the high proportion of cases with missing address information, particularly for probable cases; for example, about 67% of cases were missing the ‘city’ variable. Similarly, South Australian COVID-19 cases increased notably from Version 2.5 (116,668) to Version 2.6 (784,635), with a higher proportion of cases missing address information. See the Data and methods section for more information about the linkage method used.
There were also increases in the number of records supplied from New South Wales and Queensland between Version 2.5 and Version 2.6. The linkage rate fell slightly, from 98% to 96% for New South Wales and from 96% to 95% for Queensland.
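The linkage rates quoted above are the share of supplied records that were linked, and changes between versions are expressed in percentage points. The following is a minimal Python sketch of that arithmetic. The function names and the counts passed to `linkage_rate` are hypothetical and are not taken from the Register; only the Tasmanian rates (99% and 89%) come from the text.

```python
def linkage_rate(linked: int, supplied: int) -> float:
    """Return the percentage of supplied records that were linked."""
    if supplied <= 0:
        raise ValueError("supplied must be a positive count")
    if not 0 <= linked <= supplied:
        raise ValueError("linked must be between 0 and supplied")
    return 100.0 * linked / supplied


def rate_change(previous_rate: float, current_rate: float) -> float:
    """Return the change between two versions in percentage points."""
    return current_rate - previous_rate


# Hypothetical counts, for illustration only.
print(f"{linkage_rate(linked=890, supplied=1_000):.1f}%")  # 89.0%

# Rates reported for Tasmania: 99% in Version 2, 89% in Version 2.5.
print(rate_change(99.0, 89.0))  # -10.0
```

Expressing the change in percentage points rather than as a relative change keeps it directly comparable with the rates reported for each version.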
The COVID-19 Register will not be updated beyond Version 2.6. The next step will be the integration of the COVID-19 Register with the National Health Data Hub (NHDH). For more details refer to Future developments.
Figure 2: Number of records and percentage linked by jurisdictions across versions
The segmented horizontal bar chart compares the linkage rates for participating jurisdictions across versions. For most jurisdictions, over 90% of records supplied for the project were linked in Versions 1, 2, 2.5 and 2.6.
Linkage rates by population groups
Tables 2a and 2b describe the linkage rates by sex/gender and age group. Table 2a shows that the linkage rate largely improved for Version 2.5 compared to Version 1, with the linkage rate for all groups well over 90%, except the ‘Other’ sex/gender category. This has remained similar for Version 2.6. Sex is one of the key variables used to link records; therefore, where sex is not reported consistently, or is reported as neither male nor female (‘Other’ in Table 2a below), linkage rates are lower. The linkage rate for ‘Other’ improved considerably from 3% in Version 1 to about 57% in Version 2.6, though it remains lower than for males or females. The linkage rate was well over 90% across the age groups, except for those aged 16 to 29 in Version 2.6.
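The cells in Tables 2a and 2b pair a count with a percentage. As a rough illustration of how the two relate, the sketch below back-calculates an approximate number of supplied records. It assumes the count is the number of linked records and the bracketed figure is the linkage rate in per cent; that reading is an interpretation, not something the tables state. The function name is hypothetical, and the results are approximate because the published rates are rounded to one decimal place.

```python
def supplied_from_rate(linked: int, rate_percent: float) -> int:
    """Approximate the supplied record count from a linked count and a rounded rate."""
    if not 0 < rate_percent <= 100:
        raise ValueError("rate_percent must be in (0, 100]")
    return round(linked / (rate_percent / 100))


# Version 1 cells from Table 2a, read as (linked records, linkage rate %).
# Treating the first number as the linked count is an assumption.
version_1 = {
    "Male": (125_673, 96.4),
    "Female": (125_075, 97.2),
    "Other": (73, 3.0),
}

for group, (linked, rate) in version_1.items():
    supplied = supplied_from_rate(linked, rate)
    unlinked = supplied - linked
    print(f"{group}: about {supplied:,} supplied, about {unlinked:,} not linked")
```

For small groups such as ‘Other’, rounding in the published rate has a large effect on the estimate, so the output is indicative only.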
Table 2a: Linkage rates by sex/gender, by version
Sex/gender [2] | Version 1 [1] | Version 2 | Version 2.5 | Version 2.6 |
---|---|---|---|---|
Male | 125,673 (96.4) | 3,061,544 (97.6) | 3,403,063 (96.8) | 4,426,843 (91.5) |
Female | 125,075 (97.2) | 3,419,774 (97.8) | 3,836,445 (97.0) | 5,084,690 (92.6) |
Other [3] | 73 (3.0) | 13,801 (77.4) | 18,433 (77.5) | 23,504 (57.5) |
Table 2b: Linkage rates by age group, by version
Age group [4] | Version 1 | Version 2 | Version 2.5 | Version 2.6 |
---|---|---|---|---|
0-15 | 47,241 (96.6) | 1,159,686 (96.8) | 1,281,183 (95.8) | 1,671,214 (94.0) |
16-29 | 73,074 (95.1) | 1,483,651 (96.9) | 1,638,564 (96.0) | 2,121,296 (86.2) |
30-49 | 79,326 (95.9) | 2,128,489 (98.2) | 2,373,946 (97.4) | 3,121,457 (92.2) |
50-69 | 39,433 (96.9) | 1,266,328 (98.6) | 1,430,191 (97.9) | 1,896,536 (95.8) |
70+ | 11,747 (95.9) | 456,920 (98.0) | 534,007 (96.8) | 724,329 (94.1) |
Notes:
1. Results for Version 1 (released on 16 December 2022) are based on those participating states and territories as detailed in Figure 2 and will not be directly comparable to the figures in the previously released web report ‘Establishing a COVID-19 linked dataset’, which also includes Victoria.
2. As reported by the state and territory.
3. Other includes records where sex or gender is not reported, or sex is reported as neither male nor female.
4. Age group is based on age as at 31 December 2022. Records with missing information on birth date are excluded. Person IDs with more than one year of birth and/or sex were restricted to the most recent notification date (only a small number of records were affected). Where the notification dates were equal, a random record was used.
Additional data on linkage rates by population groups are available in the supplementary tables.
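The age group rule in the table notes (age as at 31 December 2022, records with a missing birth date excluded, and the most recent notification used where a person ID has conflicting details, with ties broken at random) can be sketched as follows. This is an illustrative Python sketch only: the `Notification` record layout, the function names and the sample records are hypothetical, and the order in which the exclusion and the tie-break are applied is an assumption.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

REFERENCE_DATE = date(2022, 12, 31)  # age is taken as at this date


@dataclass
class Notification:  # hypothetical record layout
    person_id: str
    birth_date: Optional[date]
    notification_date: date


def age_at(birth_date: date, reference: date = REFERENCE_DATE) -> int:
    """Completed years of age on the reference date."""
    years = reference.year - birth_date.year
    if (reference.month, reference.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_group(age: int) -> str:
    """Map an age in years to the groups used in Table 2b."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= 15:
        return "0-15"
    if age <= 29:
        return "16-29"
    if age <= 49:
        return "30-49"
    if age <= 69:
        return "50-69"
    return "70+"


def select_record(records: List[Notification], rng: random.Random) -> Notification:
    """Keep the most recent notification; break ties at random."""
    latest = max(r.notification_date for r in records)
    candidates = [r for r in records if r.notification_date == latest]
    return rng.choice(candidates)


# Hypothetical person ID with two conflicting years of birth.
records = [
    Notification("p1", date(1990, 5, 1), date(2022, 1, 10)),
    Notification("p1", date(1991, 5, 1), date(2022, 7, 3)),
]
chosen = select_record(records, random.Random(0))
if chosen.birth_date is not None:  # records with no birth date are excluded
    print(age_group(age_at(chosen.birth_date)))  # 30-49
```

Passing a seeded `random.Random` instance makes the random tie-break reproducible, which is useful when the same extract needs to be regenerated.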