Linkage findings

Scope of the data

Version 1 of the COVID-19 Register had around 250,000 records linked to a range of administrative data sets. Version 2 expanded coverage to more than 6 million linked records, with more recent case data (New South Wales), more jurisdictions (Victoria and Queensland) and more data sets. Version 2.5 had more than 7 million linked records, with more case data for Tasmania, Northern Territory, Victoria and the Australian Capital Territory to 31 December 2022 and updated hospitals data for the Australian Capital Territory, Victoria, New South Wales, Queensland and Tasmania (to June 2022). Version 2.6 includes over 9 million linked records, with updated hospitals data for New South Wales, South Australia and Queensland up to 31 December 2022. Table 1 provides an overview of the data sets and coverage between the versions.

Please refer to the data variables list for the temporal scope of each data set and how it differs between versions.

Table 1: List of data sets included between versions

| Data set | Version 1 (released in December 2022); number of linked records = 250,821 | Version 2 (released in November 2023); number of linked records = 6,494,308 | Version 2.5 (released in February 2024); number of linked records = 7,256,727 | Version 2.6 (released in June 2024); number of linked records = 9,455,255 |
| --- | --- | --- | --- | --- |
| State/territory notifiable disease data on COVID-19 cases | ACT; NSW; NT; SA; Tas | ACT; NSW (updated data); NT; SA; Tas; Vic (new); Qld (new) | ACT (12/03/20 – 31/12/22); NSW (25/01/20 – 30/09/22); NT (21/02/20 – 31/12/22); SA (30/01/20 – 11/02/22); Tas (30/03/20 – 31/12/22); Vic (25/01/20 – 31/12/22); Qld (28/01/20 – 20/08/22) | ACT (12/03/20 – 31/12/22); NSW (25/01/20 – 31/12/22); NT (21/02/20 – 31/12/22); SA (30/01/20 – 31/12/22); Tas (30/03/20 – 31/12/22); Vic (25/01/20 – 31/12/22); Qld (28/01/20 – 31/12/22) |
| Medicare Consumer Directory (MCD) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| National Death Index (NDI) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| Medicare Benefits Schedule (MBS) | Yes (cases only) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| Pharmaceutical Benefits Scheme (PBS, including Repatriation Schedule of Pharmaceutical Benefits (RPBS) information) | Yes (cases only) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| Australian Immunisation Register (AIR) | Yes (whole of population) | Not available | Yes (whole of population) | Yes (whole of population) |
| National Notifiable Disease Surveillance System (NNDSS) | Yes (cases only) | Yes (cases only) | Yes (cases only) | Yes (cases only). WA: cases included; however, cases are not linked to other data sets in the COVID-19 Register |
| National Hospitals Morbidity Database (NHMD) | Yes (cases only) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD) | Yes (cases only) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| National Aged Care Data Clearinghouse (NACDC) | Yes (cases only) | Yes (whole of population) | Yes (whole of population) | Yes (whole of population) |
| National Disability Insurance Scheme (NDIS) | Not available | Not available | Yes (whole of population) | Yes (whole of population) |
| Australian New Zealand Intensive Care Survey (ANZICS) Adult Patient Database (APD) | Not available | Not available | Yes (whole of population) | Yes (whole of population) |
| Australian and New Zealand Paediatric Intensive Care Registry (ANZPICR) | Not available | Not available | Yes (whole of population) | Yes (whole of population) |

Linkage rates by jurisdiction

Generally, linkage results depend on the accuracy and completeness of the linkage variables provided to the AIHW: more accurate and complete data result in better linkage rates. For more information on how the data are linked, please refer to the above section on Data and methods.
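As an illustration of how the completeness of linkage variables might be checked before linkage, the following is a minimal sketch only. The field names and the sample records are invented for the example and are not the variables or data used by the AIHW; the function simply reports the percentage of records with a missing value for each field.

```python
from typing import Dict, Iterable, Mapping, Optional, Sequence

# Hypothetical field names, chosen for illustration only.
LINKAGE_FIELDS = ("surname", "date_of_birth", "sex", "city")


def missing_share(
    records: Iterable[Mapping[str, Optional[str]]],
    fields: Sequence[str] = LINKAGE_FIELDS,
) -> Dict[str, float]:
    """Return, for each field, the percentage of records where it is missing."""
    rows = list(records)
    if not rows:
        return {field: 0.0 for field in fields}
    shares = {}
    for field in fields:
        # Treat absent keys, None and empty strings as missing.
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        shares[field] = 100.0 * missing / len(rows)
    return shares


# Invented sample records (not real data).
sample = [
    {"surname": "A", "date_of_birth": "1980-02-01", "sex": "F", "city": "Hobart"},
    {"surname": "B", "date_of_birth": "1975-07-19", "sex": "M", "city": ""},
    {"surname": "C", "date_of_birth": None, "sex": "F", "city": None},
    {"surname": "D", "date_of_birth": "2001-11-30", "sex": "M", "city": "Darwin"},
]

for field, share in missing_share(sample).items():
    print(f"{field}: {share:.1f}% missing")
```

Fields with a high share of missing values would be expected to contribute less to linkage, which is consistent with the general point above.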

Figure 2 shows the number of records that were linked and those that were unable to be linked by state and territory. For most jurisdictions, linkage rates have remained the same or improved slightly, with over 90% of records supplied for the project linked in each of Versions 1, 2, 2.5 and 2.6.

The exceptions are Tasmania, where the proportion of linked cases fell from 99% in Version 2 to 89% in Versions 2.5 and 2.6, and South Australia, where the proportion of linked cases fell from 95% in Version 2.5 to 91% in Version 2.6. For Tasmania, this is due to the notable increase in COVID-19 cases from Version 2 (243) to Version 2.5 (282,277) and the high proportion of cases with missing address information, particularly for probable cases; for example, about 67% of cases were missing the ‘city’ variable. Similarly, South Australian COVID-19 cases increased notably from Version 2.5 (116,668) to Version 2.6 (784,635), with a higher proportion of cases missing address information. See the Data and methods section for more information about the linkage method used.

There were also increases in the number of records supplied from New South Wales and Queensland between Version 2.5 and Version 2.6. The linkage rate fell slightly, from 98% to 96% for New South Wales and from 96% to 95% for Queensland.
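The linkage rates discussed in this section are percentages of supplied records that were linked. A minimal sketch of that arithmetic follows; the record counts in the example are illustrative placeholders, not figures from the COVID-19 Register, while the two percentages used for the percentage-point change are the South Australian rates quoted above.

```python
def linkage_rate(linked: int, not_linked: int) -> float:
    """Percentage of supplied records that were linked."""
    supplied = linked + not_linked
    if supplied == 0:
        raise ValueError("no records supplied")
    return 100.0 * linked / supplied


def point_change(rate_before: float, rate_after: float) -> float:
    """Change in linkage rate, in percentage points."""
    return rate_after - rate_before


# Illustrative counts only (not Register data): 950 linked, 50 not linked.
print(f"{linkage_rate(950, 50):.1f}% linked")

# South Australia, Version 2.5 to Version 2.6, using the rates quoted in the text.
print(f"{point_change(95, 91):+.0f} percentage points")
```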

The COVID-19 Register will not be updated beyond Version 2.6. The next step will be the integration of the COVID-19 Register with the National Health Data Hub (NHDH). For more details refer to Future developments.

Figure 2: Number of records and percentage linked by jurisdictions across versions

The segmented horizontal bar chart compares the linkage rates for participating jurisdictions across versions of the COVID-19 Register.

Linkage rates by population groups

Tables 2a and 2b describe the linkage rates by sex/gender and age group. Table 2a shows that the linkage rate largely improved for Version 2.5 compared with Version 1, with the linkage rate for all groups well over 90%, except the ‘Other’ sex/gender category. This has remained similar for Version 2.6. Sex is one of the key variables used to link records; therefore, where sex is not reported consistently, or is reported as neither male nor female (‘Other’ in Table 2a below), linkage rates are lower. The linkage rate for ‘Other’ improved considerably from 3% in Version 1 to about 57% in Version 2.6, though it remains lower than for males or females. The linkage rate was well over 90% across the age groups, except for those aged 16 to 29 in Version 2.6 (86.2%).

Table 2a: Number of records and percentage linked by sex/gender across versions of the COVID-19 Register

| Sex/gender (note 2) | Version 1 (note 1) | Version 2 | Version 2.5 | Version 2.6 |
| --- | --- | --- | --- | --- |
| Male | 125,673 (96.4) | 3,061,544 (97.6) | 3,403,063 (96.8) | 4,426,843 (91.5) |
| Female | 125,075 (97.2) | 3,419,774 (97.8) | 3,836,445 (97.0) | 5,084,690 (92.6) |
| Other (note 3) | 73 (3.0) | 13,801 (77.4) | 18,433 (77.5) | 23,504 (57.5) |

Table 2b: Number of records and percentage linked by age groups across versions of the COVID-19 Register

| Age group (note 4) | Version 1 | Version 2 | Version 2.5 | Version 2.6 |
| --- | --- | --- | --- | --- |
| 0-15 | 47,241 (96.6) | 1,159,686 (96.8) | 1,281,183 (95.8) | 1,671,214 (94.0) |
| 16-29 | 73,074 (95.1) | 1,483,651 (96.9) | 1,638,564 (96.0) | 2,121,296 (86.2) |
| 30-49 | 79,326 (95.9) | 2,128,489 (98.2) | 2,373,946 (97.4) | 3,121,457 (92.2) |
| 50-69 | 39,433 (96.9) | 1,266,328 (98.6) | 1,430,191 (97.9) | 1,896,536 (95.8) |
| 70+ | 11,747 (95.9) | 456,920 (98.0) | 534,007 (96.8) | 724,329 (94.1) |

Notes:

  1. Results for Version 1 (released on 16 December 2022) are based on those participating states and territories as detailed in Figure 2 and will not be directly comparable to the figures in the previously released web report ‘Establishing a COVID-19 linked dataset’ which also includes Victoria.
  2. As reported by the state and territory.
  3. Other includes records where sex or gender is not reported, or sex is reported as neither male nor female.
  4. Age group is based on age as at 31 December 2022. Records with missing information on birth date are excluded. Person IDs with more than one year of birth and/or sex were restricted to the most recent notification date (only a small number of records were affected). Where the notification dates were equal, a random record was used.
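The rules in note 4 can be sketched as follows. This is an illustration under stated assumptions, not the AIHW's implementation: the function names and the `notification_date` field name are hypothetical, a full birth date is assumed to be available, and the age bands are those shown in Table 2b.

```python
import random
from datetime import date
from typing import Optional, Sequence

REFERENCE_DATE = date(2022, 12, 31)  # age is taken as at this date (note 4)


def age_at(birth_date: date, on: date = REFERENCE_DATE) -> int:
    """Age in completed years on the reference date."""
    had_birthday = (on.month, on.day) >= (birth_date.month, birth_date.day)
    return on.year - birth_date.year - (0 if had_birthday else 1)


def age_group(birth_date: Optional[date]) -> Optional[str]:
    """Return the Table 2b age band, or None when the birth date is missing."""
    if birth_date is None:
        return None  # records with missing birth date are excluded
    age = age_at(birth_date)
    if age <= 15:
        return "0-15"
    if age <= 29:
        return "16-29"
    if age <= 49:
        return "30-49"
    if age <= 69:
        return "50-69"
    return "70+"


def pick_record(records: Sequence[dict], rng: Optional[random.Random] = None) -> dict:
    """Keep the record with the most recent notification date.

    Ties on notification date are broken at random. The field name
    'notification_date' is a hypothetical placeholder.
    """
    rng = rng or random.Random()
    latest = max(r["notification_date"] for r in records)
    candidates = [r for r in records if r["notification_date"] == latest]
    return rng.choice(candidates)


print(age_group(date(2006, 12, 31)))  # aged 16 on the reference date
```

Passing a seeded `random.Random` instance to `pick_record` makes the tie-break reproducible, which may be useful when results need to be regenerated.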

Additional data on linkage rates by population groups are available in the supplementary tables.