Comparisons of cancer survival rates over time may be affected by changes in the age composition of those diagnosed. For example, if more older people are diagnosed with cancer over time and older people have lower survival rates, improvements in survival over time may be offset by the increasingly older age of people diagnosed with cancer.
In order to calculate age-adjusted survival we first choose a fixed period, called the base period, and take note of the age composition of the people who were diagnosed with the cancer of interest during that period. We calculate age-adjusted survival for other periods by assuming that the age composition of patients in the other period is the same as that of the base period. Thus the age-adjusted survival is effectively the survival that would have occurred had there been no change in age composition from the base period.
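The idea can be sketched in a few lines of Python. This is a minimal illustration only: it assumes age-adjusted survival is a weighted average of age-group-specific survival, with weights equal to each age group's share of diagnoses in the base period. The function name, age groups and numbers are hypothetical and are not taken from the report or from the AIHW's own programs.

```python
def age_adjusted_survival(survival_by_age, base_cases_by_age):
    """Weight each age group's survival by the base period's age composition.

    Assumption: the adjustment is a simple weighted average (not confirmed
    by the source text).
    """
    total = sum(base_cases_by_age.values())
    return sum(
        survival_by_age[age] * base_cases_by_age[age] / total
        for age in base_cases_by_age
    )


# Hypothetical numbers, for illustration only.
base_cases = {"0-49": 200, "50-69": 500, "70+": 300}          # diagnoses in the base period
survival_later = {"0-49": 0.90, "50-69": 0.80, "70+": 0.60}   # survival in a later period

adjusted = age_adjusted_survival(survival_later, base_cases)
print(round(adjusted, 2))
```

Because the weights come from the base period only, a change in the adjusted figure between periods reflects changes in age-specific survival rather than changes in who is being diagnosed.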
Age-adjusted survival is different to age-standardised survival. An age-standardised rate uses the same standard population for all cancers and sexes (including persons). Using a standard population allows meaningful comparisons between different cancers, sexes and across time. In contrast, age-adjusted survival rates use a population relevant to the specific cancer (or cancer group) and sex to allow meaningful comparisons across time. Age-adjusted survival rates are only intended to enhance the understanding of how survival has changed over time for the specific cancer and sex and are not directly comparable with other cancers or sexes.
CdiA does not currently report on age-standardised survival but future releases are expected to contain age-standardised survival rates.
Age-standardised rates (ASR)
A crude rate provides information on the number of, for example, new cases of cancer or deaths from cancer by the population at risk in a specified period. No age adjustments are made when calculating a crude rate. Since the risk of cancer heavily depends on age, crude cancer incidence and mortality rates are not as suitable for looking at changes over time or making comparisons between different population groups if there are differences in those populations’ age structures.
More meaningful comparisons can be made using ASRs, with such rates adjusted for age in order to facilitate comparisons between populations that have different age structures – for example, between Indigenous Australians and other Australians. This standardisation process effectively removes the influence of age structure on the summary rate.
There are two methods commonly used to adjust for age: direct and indirect standardisation. In this report, the direct standardisation approach presented by Jensen and colleagues (1991) is used. To age-standardise using the direct method, the first step is to obtain population numbers and numbers of cases (or deaths) in age ranges – typically 5-year age ranges. The next step is to multiply the age-specific population numbers for the standard population (in this case, the Australian population as at 30 June 2001) by the age-specific incidence rates (or death rates) for the population of interest. The next step is to sum across the age groups and divide this sum by the total of the standard population to give an ASR for the population of interest. Finally, this is expressed per 100,000 population in this report.
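The direct method described above can be sketched as follows. The function name and the three-group example are illustrative assumptions; a real calculation would use 5-year age groups and the 30 June 2001 Australian standard population.

```python
def direct_asr(cases, population, standard_population, per=100_000):
    """Directly age-standardised rate, expressed per `per` population.

    All three lists are ordered by age group and have the same length.
    """
    # Age-specific rates for the population of interest (as proportions).
    rates = [c / p for c, p in zip(cases, population)]
    # Cases expected if the standard population experienced those rates.
    expected = sum(r * s for r, s in zip(rates, standard_population))
    return per * expected / sum(standard_population)


# Hypothetical data for three broad age groups (illustration only).
cases = [10, 50, 200]
population = [100_000, 50_000, 20_000]
standard = [40_000, 35_000, 25_000]

asr = direct_asr(cases, population, standard)
crude = 100_000 * sum(cases) / sum(population)
print(round(asr, 1), round(crude, 1))
```

In this made-up example the crude rate and the ASR differ because the population of interest is younger than the standard population; the ASR removes that difference in age structure from the summary rate.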
Age-specific rates provide information on the incidence of a particular event in an age group relative to the total number of people at risk of that event in the same age group. They are calculated by dividing the number of events occurring in each specified age group by the corresponding ‘at-risk’ population in the same age group and then multiplying the result by a constant (for example, 100,000) to derive the rate. Age-specific rates are often expressed per 100,000 population.
Australian Cancer Database
All forms of cancer, except basal and squamous cell carcinomas of the skin, are notifiable diseases in each Australian state and territory. This means there is legislation in each jurisdiction that requires hospitals, pathology laboratories and various other institutions to report all cases of cancer to their central cancer registry. An agreed subset of the data collected by these cancer registries is supplied annually to the AIHW, where it is compiled into the ACD. The ACD currently contains data on all cases of cancer diagnosed from 1982 to 2018 for all states and territories, with the exception of New South Wales death-certificate-only cases for 2018 and late registrations; death-certificate-only cases and late registrations are estimated in the ACD.
Cancer reporting and registration is a dynamic process, and records in the state and territory cancer registries may be modified if new information is received. As a result, the number of cancer cases reported by the AIHW for any particular year may change slightly over time and may not always align with state and territory reporting for that same year.
For more information on the ACD please see the ACD 2018 Data Quality Statement.
Estimating death-certificate-only cases for NSW for 2018
If a person’s death certificate states that they had cancer, in most cases the cancer registry already has other evidence of the cancer. However, in about 1.5% of cases, despite the registry’s subsequent enquiries with relevant institutions, the registry is unable to find any other evidence of the cancer. Such cases are called death-certificate-only (DCO) cases.
The New South Wales Cancer Registry was unable to submit its DCO cases for 2018 in time to be included in the 2018 ACD. The AIHW estimated the number of DCO cases for NSW for 2018 by assuming they would be the same as they were in NSW in 2017, stratified by sex, diagnosis age group, topography, histology and behaviour.
Estimating late registrations of cancer for 2018
Late registrations are cases of cancer that have not been registered by the cancer registry by the time the registry needs to submit its data to the AIHW. Almost all late registrations have a diagnosis year equal to that of the most recent year of the ACD, in this case 2018. Experience has shown that late registrations account for about 1% of cases in that year. For example, it is expected that about 1% of cases for diagnosis year 2018 are not part of the 2018 ACD; they will appear for the first time in the 2019 ACD (with a diagnosis year of 2018). The AIHW has made estimates of these cases based on the late registrations for 2017 that appeared for the first time in the 2018 ACD.
The estimated number of late registrations for 2018 is 2,247 cases overall.
International Classification of Diseases for Oncology (ICD-O)
Cancers were originally classified solely under the ICD classification system, based on topographic site and behaviour. However, during the creation of the Ninth Revision of the ICD in the late 1960s, working parties suggested creating a separate classification for cancers that included improved morphological information. The first edition of the ICD-O was subsequently released in 1976 and, in this classification, cancers were coded by both morphology (histology type and behaviour) and topography (site).
Since the First Edition of the ICD-O, a number of revisions have been made, mainly in the area of lymphoma and leukaemia. The current edition, the Third Edition (ICD-O-3), was released in 2000 and is used by most state and territory cancer registries in Australia, as well as by the AIHW in regard to the ACD.
National Mortality Database
The AIHW National Mortality Database (NMD) contains information provided by the Registries of Births, Deaths and Marriages and the National Coronial Information System – and coded by the ABS – for deaths from 1964 to 2019. Registration of deaths is the responsibility of each state and territory Registry of Births, Deaths and Marriages. These data are then collated and coded by the ABS and are maintained at the AIHW in the NMD.
In the NMD, both the year in which the death occurred and the year in which it was registered are provided. For the purposes of this report, actual mortality data are shown based on the year the death occurred, except for the most recent year (namely 2020) where the number of people whose death was registered is used. Previous investigation has shown that the year of death and its registration coincide for the most part. However, in some instances, deaths at the end of each calendar year may not be registered until the following year. Thus, year of death information for the latest available year is generally an underestimate of the actual number of deaths that occurred in that year.
In this report, deaths registered in 2017 and earlier are based on the final version of cause of death data; deaths registered in 2018, 2019 and 2020 are based on revised and preliminary versions, respectively, and are subject to further revision by the ABS.
The data quality statements underpinning the AIHW NMD can be found on the following ABS internet pages:
For more information on the AIHW NMD see Deaths data at AIHW.
Throughout this report, population data were used to derive rates of, for example, cancer incidence and mortality. The population data were sourced from the ABS using the most up-to-date estimates available at the time of analysis.
To derive its estimates of the resident populations, the ABS uses the 5-yearly Census of Population and Housing data and adjusts it as described here:
- All respondents in the Census are placed in their state or territory, Statistical Local Area and postcode of usual residence; overseas visitors are excluded.
- An adjustment is made for persons missed in the Census.
- Australians temporarily overseas on Census night are added to the usual residence Census count.
Estimated resident populations are then updated each year from the Census data, using indicators of population change, such as births, deaths and net migration. More information is available from the ABS website.
The 2022 population estimates were sourced from the Centre of Population December 2021 update of the National age and sex structure, 2020–21 to 2031–32 (NOM upside scenario); this source was also used to inform the long-term prostate projections in Cancer data commentary 9.
Limited-duration prevalence is expressed as N-year prevalence throughout this report. N-year prevalence on a given index date – where N is any number 1, 2, 3 and so on – is defined as the number of people alive at the end of that day who had been diagnosed with cancer in the past N years. For example:
- 1-year prevalence is the number of living people who were diagnosed in the past year to 31 December 2017
- 5-year prevalence is the number of living people who were diagnosed in the past 5 years to 31 December 2017. This includes the people defined by 1-year prevalence.
Note that prevalence is measured by the number of people diagnosed with cancer, not the number of cancer cases. An individual who was diagnosed with two separate cancers will contribute separately to the prevalence of each cancer. However, this individual will contribute only once to prevalence of all cancers combined. For this reason, the sum of prevalence for individual cancers will not equal the prevalence of all cancers combined.
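The definition can be illustrated with a short sketch that counts people rather than cases. The record layout (`person_id`, `cancer`, `diagnosed`, `died`) and the data are hypothetical; the sketch also assumes the index date is not 29 February.

```python
from datetime import date


def n_year_prevalence(records, index_date, n_years, cancer=None):
    """Number of distinct people alive at the end of index_date who were
    diagnosed in the N years up to and including index_date.

    cancer=None counts all cancers combined (each person once).
    """
    window_start = date(index_date.year - n_years, index_date.month, index_date.day)
    people = set()
    for r in records:
        if cancer is not None and r["cancer"] != cancer:
            continue
        diagnosed_in_window = window_start < r["diagnosed"] <= index_date
        alive = r["died"] is None or r["died"] > index_date
        if diagnosed_in_window and alive:
            people.add(r["person_id"])  # a set, so each person counts once
    return len(people)


# Hypothetical records (illustration only).
records = [
    {"person_id": 1, "cancer": "lung", "diagnosed": date(2017, 3, 1), "died": None},
    {"person_id": 1, "cancer": "melanoma", "diagnosed": date(2014, 6, 15), "died": None},
    {"person_id": 2, "cancer": "lung", "diagnosed": date(2015, 5, 10), "died": date(2017, 11, 1)},
    {"person_id": 3, "cancer": "melanoma", "diagnosed": date(2013, 2, 1), "died": None},
    {"person_id": 4, "cancer": "lung", "diagnosed": date(2010, 1, 1), "died": None},
]
index = date(2017, 12, 31)

one_year_all = n_year_prevalence(records, index, 1)
five_year_all = n_year_prevalence(records, index, 5)
five_year_lung = n_year_prevalence(records, index, 5, cancer="lung")
five_year_melanoma = n_year_prevalence(records, index, 5, cancer="melanoma")
```

In this example person 1 contributes to both the lung and melanoma counts but only once to all cancers combined, so the cancer-specific counts sum to more than the combined count.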
Projections - Estimating the incidence of cancer
Please note that no adjustments have been made to the projections to account for the potential impact of COVID.
National incidence for 2019–2022 was estimated by projecting the sex- and age-specific incidence rates observed in Australia during 2009–2018. The time series were stratified by the following variables:
- 5-year age group (0–4, …, 85–89, 90+)
- 4-character ICD-O-3 topography code (C00.0, …, C80.9)
- 4-digit ICD-O-3.1 histology code (8000, …, 9992).
For each time series, the process was as described below:
- least squares linear regression was used to find the straight line of best fit through the time series
- if the slope was positive, the straight line of best fit was extrapolated to obtain the estimate of the 2019 rate
- if the slope was negative, the time series floor was set to 0
- the estimated incidence rates for 2019 were then multiplied by the Estimated Resident Populations for 2019 to obtain the estimated incidence numbers.
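The steps above can be sketched for a single time series as follows. This is an interpretation, not the AIHW's code: it assumes that "the time series floor was set to 0" means a declining line is still extrapolated but the projected rate is never allowed to fall below zero. Function names and data are hypothetical.

```python
def fit_line(years, rates):
    """Ordinary least squares straight line; returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, rates))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_rate(years, rates, target_year):
    """Extrapolate the fitted line to target_year, floored at 0 (assumption)."""
    slope, intercept = fit_line(years, rates)
    return max(0.0, intercept + slope * target_year)


def project_count(years, rates, target_year, population, per=100_000):
    """Projected rate multiplied by the estimated resident population."""
    return project_rate(years, rates, target_year) * population / per


# Hypothetical rates per 100,000 for one age/topography/histology stratum.
years = list(range(2009, 2019))
rising = [10 + 0.5 * (y - 2009) for y in years]
falling = [0.9 - 0.1 * (y - 2009) for y in years]

rising_2019 = project_rate(years, rising, 2019)
falling_2019 = project_rate(years, falling, 2019)   # line goes negative, so floored
cases_2019 = project_count(years, rising, 2019, population=50_000)
```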
Note the following:
Projections - Estimating the mortality of cancer
Please note that no adjustments have been made to the projections to account for the potential impact of COVID.
This method is the same as that used for the incidence projections, with the exception that:
- the 10-year baseline for incidence is 2009–2018 while the baseline for mortality from the NMD is 2011–2020 and the baseline for mortality from the ACD is 2008–2017.
Relative survival is a measure of the survival of people with cancer compared with that of the general population. It is the standard approach used by cancer registries to produce population-level survival statistics and is commonly used as it does not require information on cause of death. Relative survival reflects the net survival (or excess mortality) associated with cancer by adjusting the survival experience of those with cancer for the underlying mortality that they would have experienced in the general population.
Relative survival is calculated by dividing observed survival by expected survival, where the numerator and denominator have been matched for age, sex and calendar year.
Observed survival refers to the proportion of people alive for a given amount of time after a diagnosis of cancer; it is calculated from population-based cancer data. Expected survival refers to the proportion of people in the general population alive for a given amount of time and is calculated from life tables of the entire Australian population. (Ideally these life tables should be restricted to the population of Australians who do not have cancer but such life tables are unavailable. It is standard practice around the world to use life tables for the entire population.)
A simplified example of how relative survival is interpreted is shown in Figure M1. Given that 6 in 10 people with cancer are alive 5 years after their diagnosis (observed survival of 0.6) and that 9 in 10 people from the general population are alive after the same 5 years (expected survival of 0.9), the relative survival of people with cancer would be calculated as 0.6 divided by 0.9, which is 0.67. This means that individuals with cancer are 67% as likely to be alive for at least 5 years after their diagnosis as are their counterparts in the general population.
Figure M1: Simplified example of how relative survival is calculated
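The arithmetic of the simplified example can be written out as a short sketch. The function name is illustrative; the report's actual estimates come from the period method with Ederer II expected survival, which this sketch does not attempt to reproduce.

```python
def relative_survival(observed, expected):
    """Relative survival = observed survival / expected survival."""
    if not 0 < expected <= 1:
        raise ValueError("expected survival must be in (0, 1]")
    return observed / expected


# 6 in 10 people with cancer alive at 5 years; 9 in 10 in the matched general population.
rs = relative_survival(0.6, 0.9)
print(round(rs, 2))
```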
The survival statistics in this report were produced using a modified version of a SAS program written by Dickman (2004) and employed the period method (Brenner and Gefeller 1996) with 1-year intervals. Observed survival was calculated from data in the ACD. Expected survival was calculated using the Ederer II method whereby matched people in the general population are considered to be at risk of death until the corresponding cancer patient dies or is censored (Ederer and Heise 1959).
Calculation of conditional relative survival
Conditional survival is the probability of surviving j more days, given that an individual has already survived i days. It was calculated using the formula:
$$S(j \mid i) = \frac{S(i + j)}{S(i)}$$
where:
S(j|i) is the probability of surviving at least j more days given that the person has already survived at least i days
S(i + j) is the probability of surviving at least i + j days
S(i) is the probability of surviving at least i days.
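Confidence intervals for conditional survival were calculated using a variation of Greenwood's (1926) formula for variance (Skuladottir & Olsen 2003):
$$\operatorname{Var}[S(j \mid i)] = [S(j \mid i)]^2 \sum_{k=i+1}^{i+j} \frac{d_k}{r_k (r_k - d_k)}$$
where:
d_k is the number of deaths during the kth interval
r_k is the number at risk during the kth interval.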
The 95% confidence intervals were constructed assuming that conditional survival estimates follow a normal distribution.
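A minimal sketch of the conditional survival calculation and its normal-approximation confidence interval follows. It assumes the Greenwood-type variance takes the form [S(j|i)]² × Σ d_k / (r_k (r_k − d_k)), summed over intervals i + 1 to i + j, and it works in whole intervals rather than days. The life-table data and function names are hypothetical.

```python
import math


def life_table_survival(deaths, at_risk):
    """Cumulative survival at the end of each interval; surv[0] = 1."""
    surv = [1.0]
    for d, r in zip(deaths, at_risk):
        surv.append(surv[-1] * (1 - d / r))
    return surv


def conditional_survival_ci(deaths, at_risk, i, j, z=1.96):
    """S(j|i) = S(i + j) / S(i) with a normal-approximation interval.

    Assumed variance: S(j|i)^2 * sum over k = i+1..i+j of d_k / (r_k (r_k - d_k)).
    Interval k is stored at list index k - 1.
    """
    surv = life_table_survival(deaths, at_risk)
    s = surv[i + j] / surv[i]
    total = sum(
        deaths[k - 1] / (at_risk[k - 1] * (at_risk[k - 1] - deaths[k - 1]))
        for k in range(i + 1, i + j + 1)
    )
    se = s * math.sqrt(total)
    return s, max(0.0, s - z * se), min(1.0, s + z * se)


# Hypothetical cohort with five 1-year intervals and no censoring.
at_risk = [100, 80, 70, 65, 62]
deaths = [20, 10, 5, 3, 2]

# Probability of surviving 4 more intervals, given survival of the first.
cs, lower, upper = conditional_survival_ci(deaths, at_risk, i=1, j=4)
```

With no censoring, 60 of the 80 people alive after the first interval survive to the end, so the conditional survival is 0.75 and the standard error reduces to the familiar binomial form.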
We use 19 age groups, numbered 1 to 19. Age group i (i = 1 to 18) is 5 years wide and comprises all ages in the interval [5i − 5, 5i), that is, from age 5i − 5 up to but not including age 5i. Age group 19 comprises all ages 90 and above. The cancer under consideration is referred to as “the cancer”. This could be a specific cancer, a group of related cancers or all cancers combined. There are two different measures of risk, one adjusted for competing mortality and one not adjusted. For brevity, these are called the adjusted risk (AR) and unadjusted risk (UR). The full notation is as follows, where D is for diagnosis and M is for mortality.
ARD(5i) = adjusted risk of being diagnosed with the cancer before age 5i (i = 1 to 18),
ARD(∞) = adjusted lifetime risk of being diagnosed with the cancer,
ARM(5i) = adjusted risk of dying from the cancer before age 5i (i = 1 to 18),
ARM(∞) = adjusted lifetime risk of dying from the cancer,
and similarly for URD and URM.
For each age group i, the following three rates are used in the risk formulas.
Di = rate of first ever diagnosis of the cancer (the first in one's life, not the first in age group i),
Mi = rate of death from the cancer,
Ai = rate of death from all causes (including the cancer).
Note that the denominator of Di is the general population, not the population of people who have never been diagnosed with the cancer.
Risk not adjusted for competing mortality
As this measure of risk is not adjusted for competing mortality, the formulas are relatively simple and do not involve Ai. The formulas come from Day (1987).
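$$\mathrm{URD}(5i) = 1 - \exp\left(-5 \sum_{k=1}^{i} D_k\right), \quad i = 1, 2, \ldots, 18$$
URD(∞) = 1.
$$\mathrm{URM}(5i) = 1 - \exp\left(-5 \sum_{k=1}^{i} M_k\right), \quad i = 1, 2, \ldots, 18$$
URM(∞) = 1.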
Note that the lifetime risks are necessarily 1. Not adjusting for competing mortality is equivalent to the scenario where it is impossible to die of any cause other than the cancer. Hence every person must eventually be diagnosed with the cancer and eventually die from it. This is why it is not informative to report unadjusted lifetime risks.
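As an illustration, the unadjusted risk can be computed with the standard cumulative-risk formula, 1 − exp(−5 × sum of the age-specific rates up to age group i), which is assumed here to be the form given by Day (1987). Rates must be per person per year (not per 100,000). The function name and rates are hypothetical.

```python
import math


def unadjusted_risk(rates, i):
    """Risk of the event before age 5*i, not adjusted for competing mortality.

    rates[k - 1] is the annual rate (per person) in 5-year age group k.
    Assumed formula: 1 - exp(-5 * sum of rates for age groups 1..i).
    """
    return 1 - math.exp(-5 * sum(rates[:i]))


# Hypothetical age-specific diagnosis rates for the 19 age groups (per person per year).
D = [0.0] * 8 + [0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.003,
                 0.004, 0.005, 0.006, 0.007, 0.008]

risk_before_75 = unadjusted_risk(D, 15)   # age groups 1 to 15 cover ages 0 to 74
risk_before_85 = unadjusted_risk(D, 17)
print(round(risk_before_75, 4), round(risk_before_85, 4))
```

The risk increases with each additional age group included, which is consistent with the unadjusted lifetime risk tending to 1 when the final, open-ended age group is treated as lasting indefinitely.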
Risk adjusted for competing mortality
The formulas in this section come from Fay et al. (2003). The risk of diagnosis is as follows.
The formula for risk of death is the same as above except that Mi replaces Di throughout.
Use of a proxy to calculate risk of diagnosis
In order to calculate the risk of diagnosis we need the age-specific rates, Di, at which people are being diagnosed with the cancer for the first time in their lives. This requires knowledge of each person’s cancer history from birth. As the Australian Cancer Database (ACD) starts from the beginning of 1982, this is impossible for most age groups and will remain impossible for many decades to come. In order to estimate the risk of diagnosis we need a satisfactory proxy for Di.
The best available estimate of Di is obtained by using the entire history of the ACD. That is, instead of counting first ever diagnoses (which is impossible) we count “first from 1/1/1982” diagnoses. However, using such an estimate would mean that we couldn’t produce a consistent time series of risks. This is because each estimate in the time series would be based on a different amount of “lookback time” for previous diagnoses. The estimate in 1982 would be based on at most one year of lookback time, the estimate in 1983 would be based on up to two years of lookback time, and so on.
In order to enable the production of a time series of risks, the AIHW has chosen to use a lookback time of up to one calendar year for both the adjusted and unadjusted risks of diagnosis. That is, for the year for which the risks are being calculated, lookback goes back to the 1st of January of that year. Using this method we are in fact counting the number of people (not cancers) diagnosed in the year under consideration, irrespective of whether they have been diagnosed with the same cancer in a previous year. AIHW analysis has shown that this method provides a satisfactory estimate of Di, except for the group “all cancers combined”. No suitable period of lookback time was identified for this group. As such, AIHW does not produce a time series of risk of diagnosis for all cancers combined. However, the best available estimate for the latest year of data available is produced. This estimate is based on lookback to the beginning of 1982. Based on the analysis referred to above, this estimate is likely to be a few percentage points higher than the true value.