Data and methods

Ethics approvals

The project has obtained ethics approval from the AIHW Ethics Committee, with additional approvals from the Human Research Ethics Committee of the Northern Territory Department of Health and Menzies School of Health Research, and from the New South Wales Population and Health Services Research Ethics Committee (NSW PHSREC). A National Mutual Acceptance Scheme led by NSW PHSREC is in place for the Australian Capital Territory, South Australia, Tasmania, and Victoria. A data disclosure agreement with Queensland was established for the purpose of this project.

In addition to the ethics approvals outlined above, approval has also been sought from the data custodian of each state/territory or national dataset.

How was the data linked?

As a Commonwealth Accredited Data Service Provider, the AIHW has the expertise and infrastructure to undertake complex national data linkage.

COVID-19 case linkage variables (names, addresses, dates of birth and sex) provided by jurisdictions to the AIHW were probabilistically linked to the AIHW National Linkage Spine (NLS). The NLS combines linkage variables from the Medicare Consumer Directory (MCD), the National Death Index (NDI) and the Australian Immunisation Register (AIR), and covers almost all of the population of Australia. Probabilistic record linkage is a data linkage method that makes explicit use of probabilities to determine whether a pair of records belongs to the same person. Records are matched on name, sex, address and date of birth.
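The idea behind probabilistic record linkage can be illustrated with a minimal sketch in the style of the Fellegi-Sunter model. This is not the AIHW's linkage implementation: the m- and u-probabilities, the thresholds, the exact-equality comparison and the function names below are all hypothetical values and choices made for illustration only. Each linkage field contributes a positive weight when two records agree on it and a negative weight when they disagree, and the summed weight decides whether the pair is treated as a match.

```python
import math

# Hypothetical probabilities, for illustration only (not AIHW parameters).
# m = P(field agrees | the two records belong to the same person)
# u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "sex": (0.98, 0.50),
    "address": (0.80, 0.02),
    "date_of_birth": (0.97, 0.003),
}


def match_weight(record_a, record_b, field_probs=FIELD_PROBS):
    """Sum the log2 agreement or disagreement weight of every linkage field."""
    total = 0.0
    for field, (m, u) in field_probs.items():
        if record_a[field] == record_b[field]:
            # Agreement is evidence for a match: weight log2(m / u) > 0.
            total += math.log2(m / u)
        else:
            # Disagreement is evidence against: weight log2((1-m) / (1-u)) < 0.
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Assign a record pair to a class using two assumed thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"  # would typically go to clerical review


# Made-up records used only to exercise the functions.
case = {"name": "jane citizen", "sex": "F",
        "address": "1 example st", "date_of_birth": "1980-01-01"}
spine_same = dict(case)
spine_dob_differs = dict(case, date_of_birth="1980-01-10")
spine_other = {"name": "john sample", "sex": "M",
               "address": "9 other rd", "date_of_birth": "1975-06-30"}

for candidate in (spine_same, spine_dob_differs, spine_other):
    print(classify(match_weight(case, candidate)))
```

In practice, linkage systems also standardise names and addresses and allow partial agreement (for example, similar spellings) rather than the exact equality used here.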

Analytical information on COVID-19 cases from states and territories and the Commonwealth Department of Health and Aged Care National Notifiable Diseases Surveillance System (NNDSS) has been combined with information from the NDI, the Medicare Benefits Schedule (MBS) and the Pharmaceutical Benefits Scheme (PBS, including Repatriation Schedule of Pharmaceutical Benefits (RPBS) information), the National Hospitals Morbidity Database (NHMD), the National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD), the National Aged Care Data Clearinghouse (NACDC), the Australian and New Zealand Intensive Care Society (ANZICS) and the AIR to create a de-identified linked research data set. Figure 1 outlines the linkage processes for the current version of the project (Version 2).

After the initial linkage and after each re-linkage, date of death and cause of death information from the NDI is released to the states and territories that provide the original notifiable disease data, for incorporation into their local notifiable disease systems. The aim is to improve NNDSS data completeness and utility in a nationally consistent way, and to add to the research potential of both the state and territory collections and the NNDSS.

The AIHW data linkage protocols prescribe strict separation of identifiers and analytical data within the AIHW linkage team. Where staff have access to both personal identifiers and analytical data for study participants, they will not have access to both at the same time for the duration of the project.
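The separation of identifiers from analytical data can be pictured with a small sketch. It is an illustration under stated assumptions, not the AIHW's actual protocol or tooling: the field names, the `split_record` helper and the random `link_key` are all hypothetical. Each incoming record is split into an identifier part (used only for linkage) and a content part (used only for analysis), and the two parts share nothing except an arbitrary key.

```python
import secrets

# Hypothetical set of identifying fields; real collections will differ.
IDENTIFIER_FIELDS = {"name", "address", "date_of_birth", "sex"}


def split_record(record, identifier_fields=IDENTIFIER_FIELDS):
    """Split one record into an identifier part and a content part.

    The two parts share only a random key, so someone holding the content
    part cannot see names or addresses, and someone holding the identifier
    part cannot see analytical content.
    """
    key = secrets.token_hex(8)  # random and meaningless on its own
    identifiers = {"link_key": key}
    content = {"link_key": key}
    for field, value in record.items():
        target = identifiers if field in identifier_fields else content
        target[field] = value
    return identifiers, content


# Made-up record used only to exercise the function.
record = {"name": "jane citizen", "address": "1 example st",
          "date_of_birth": "1980-01-01", "sex": "F",
          "notification_date": "2021-08-15", "hospitalised": True}

identifiers, content = split_record(record)
print(sorted(identifiers))  # identifier fields plus link_key
print(sorted(content))      # analytical fields plus link_key
```

The two outputs would then be stored and accessed separately, with the key used only to reassemble linked analytical content once identifiers have been removed.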

Figure 1: COVID-19 linked data flow

The figure shows the flow of data to be linked in the project. Two boxes, one for the states and territories and the other for the Department of Health and Aged Care, point to the AIHW, showing the flow of COVID-19 case information and of content information from the NNDSS, respectively. Subsequent boxes show how each of the datasets is added to create a de-identified linked data set, stored in a secure access environment. A feedback loop shows linked deaths data being returned to the jurisdictions after linkage.

A longitudinal resource for COVID-19 cases

The project aims to re-link information periodically to include more recent case data, more jurisdictions, and more data sources where available. For deaths data, Australian Bureau of Statistics coded cause of death information will be incorporated as it becomes available. Updating the content regularly provides a growing longitudinal resource for COVID-19 cases and allows research into patients’ health journeys over time. Data will also be fed back regularly to the states and territories that provided the original notifiable disease data, to enhance the completeness of their notifiable disease systems.