Data and methods

Ethics approvals

The project has obtained ethics approval from the AIHW Ethics Committee, with additional approvals from the Human Research Ethics Committee of the Northern Territory Department of Health and Menzies School of Health Research, and from the New South Wales Population and Health Services Research Ethics Committee (NSW PHSREC). A National Mutual Acceptance Scheme led by NSW PHSREC is in place for the Australian Capital Territory, South Australia, Tasmania, and Victoria. A data disclosure agreement with Queensland was established for the purpose of this project.

In addition to the ethics approvals outlined above, approval has been received from the data custodian of each state/territory or national dataset.

How were the data linked?

As an Accredited Data Service Provider, the AIHW is accredited to provide complex integration, de-identification and secure access to linked data. 

COVID-19 case linkage variables (names, addresses, dates of birth and sex) provided by jurisdictions to the AIHW were probabilistically linked to the AIHW National Linkage Spine (NLS). The AIHW NLS combines linkage variables from the Medicare Consumer Directory (MCD), the National Death Index (NDI) and the Australian Immunisation Register (AIR), and uniquely covers almost all of the population of Australia. Probabilistic record linkage is a data linkage method that makes explicit use of probabilities to determine whether a pair of records belongs to the same person. Records are matched by name, sex, address and date of birth. The resulting COVID-19 Register does not, however, contain any identifying information.
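
As an illustration only, the general idea of probabilistic record linkage can be sketched in Python using Fellegi-Sunter style agreement weights. This is not the AIHW's linkage software or its parameters: the field names, the m and u probabilities, the thresholds and the function names below are all hypothetical values chosen for the example, and real linkage systems also use approximate string comparison and blocking, which are omitted here.

```python
import math

# Hypothetical m/u probabilities per field (illustrative values only, not AIHW's).
# m: probability the field agrees when two records belong to the same person
# u: probability the field agrees when two records belong to different people
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "sex": (0.99, 0.50),
    "address": (0.80, 0.001),
    "date_of_birth": (0.97, 0.003),
}


def match_weight(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum the log2 likelihood ratios over all fields for one record pair."""
    total = 0.0
    for field, (m, u) in probs.items():
        if rec_a.get(field) == rec_b.get(field):
            # Agreement adds evidence that the pair is the same person.
            total += math.log2(m / u)
        else:
            # Disagreement subtracts evidence.
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Assign a pair to one of three classes using assumed thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


case = {"name": "alex smith", "sex": "F",
        "address": "1 example st", "date_of_birth": "1980-01-01"}
spine = dict(case)                      # agrees on every field
moved = dict(case, sex="M", address="9 other rd")  # agrees on name and date of birth

print(classify(match_weight(case, spine)))
print(classify(match_weight(case, moved)))
```

In this sketch a pair that agrees on every field scores well above the upper threshold, while a pair that agrees on only some fields falls into the middle band that a linkage process would typically send for further review.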

Analytical information on COVID-19 cases from states and territories and the Commonwealth Department of Health and Aged Care National Notifiable Diseases Surveillance System (NNDSS) was combined with information from the NDI, the Medicare Benefits Schedule (MBS), the Pharmaceutical Benefits Scheme (PBS), the National Hospital Morbidity Database (NHMD), the National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD), the National Aged Care Data Clearinghouse (NACDC), the Australian and New Zealand Intensive Care Society (ANZICS), the AIR and the National Disability Insurance Scheme (NDIS) to create a de-identified linked research data set. Figure 1 outlines the linkage processes for the current version of the project (Version 2.5).

After both the initial linkage and any re-linkages, date of death and cause of death information from the NDI is released to the states and territories that provide the original notifiable disease data, for incorporation into their local notifiable disease systems. The aim is to improve NNDSS data completeness and utility in a nationally consistent way, and to add to the research potential of both the state and territory collections and the NNDSS.

The AIHW data linkage protocols are based on the Five Safes framework, which reinforces management of the privacy and confidentiality of data. These protocols prescribe strict separation of identifiers and analytical data. This means that, for the duration of the project, AIHW linkage staff do not have access to personal identifiers and analytical data at the same time.
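
The separation of identifiers from analytical data can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not the AIHW's actual procedure: the field names, the `record_key` field and the `split_record` function are hypothetical, and in practice some demographic items may be carried in the analytical file in a non-identifying form.

```python
import secrets

# Hypothetical list of identifying fields used only for linkage.
IDENTIFIER_FIELDS = ("name", "address", "date_of_birth", "sex")


def split_record(record, identifier_fields=IDENTIFIER_FIELDS):
    """Split one record into an identifier part and an analytical part.

    The two parts share only a random key, so each can be stored and
    accessed separately by different staff.
    """
    key = secrets.token_hex(8)  # random key with no meaning outside the project
    identifiers = {f: record[f] for f in identifier_fields if f in record}
    analytical = {f: v for f, v in record.items() if f not in identifier_fields}
    identifiers["record_key"] = key
    analytical["record_key"] = key
    return identifiers, analytical


notification = {
    "name": "alex smith",
    "address": "1 example st",
    "date_of_birth": "1980-01-01",
    "sex": "F",
    "diagnosis_date": "2022-01-15",
}

identifiers, analytical = split_record(notification)
print(sorted(identifiers))
print(sorted(analytical))
```

Under this arrangement, linkage staff would work only with the identifier part, and analysts only with the analytical part, with the shared key used to bring linked analytical records together.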

Figure 1: COVID-19 linked data flow

The flowchart shows linkage of COVID-19 case information and different datasets to create the de-identified COVID-19 Register.

A longitudinal resource for COVID-19 cases

The COVID-19 Register will be updated to include case data up to December 2022 for all jurisdictions. This date corresponds with the relaxation of testing and reporting requirements, which limits the completeness of data after this time. For deaths data, Australian Bureau of Statistics coded cause of death information will be incorporated as it becomes available. Updating the content provides a growing longitudinal resource for COVID-19 cases and allows research into patients’ health journeys over time. Providing data back to the states and territories will enhance the completeness of their notifiable disease systems.