Appendix C: Statistical methods for country/region of birth report

Incidence rates

Age standardised incidence rates – WHO and Australian standard populations

Age-standardisation is a mathematical process that removes the influence of a population’s age structure on the crude rate, producing rates that can be compared fairly. To carry out age-standardisation, a standard population must be chosen and used consistently. The population used in this report is the World Health Organisation (WHO) World Standard Population (estimates using the 2001 Australian Standard Population are also available from the accompanying interactive data visualisation and Excel worksheet in the Data section). First, the age-specific rates calculated from the study population are applied to the corresponding age groups in the standard population. This yields the number of cases (or deaths) in each age group that would have occurred in the standard population if it were subject to the same rates as those experienced by the study population. Next, those cases (or deaths) are summed across the age groups to give the total number of cases (or deaths) that would have occurred in the standard population. Finally, that total is divided by the total size of the standard population. The result is the age-standardised incidence (or mortality) rate of cancer in the study population.
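The steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used to produce the report: the function name, the three age groups and all counts are hypothetical, and the real calculation uses the age groups and standard populations listed in Appendix D.

```python
def age_standardised_rate(cases, population, standard, per=100_000):
    """Directly age-standardised rate per `per` people.

    cases, population: study-population counts by age group
    standard: standard-population counts for the same age groups, same order
    """
    if not (len(cases) == len(population) == len(standard)):
        raise ValueError("all inputs must cover the same age groups")
    # Step 1: age-specific rates in the study population
    rates = [c / p for c, p in zip(cases, population)]
    # Step 2: cases expected in the standard population at those rates
    expected = [r * s for r, s in zip(rates, standard)]
    # Step 3: total expected cases divided by the total standard population
    return sum(expected) / sum(standard) * per


# Hypothetical three-age-group example (not real data)
cases = [10, 50, 200]
population = [100_000, 100_000, 50_000]
standard = [50_000, 30_000, 20_000]

asr = age_standardised_rate(cases, population, standard)
print(f"{asr:.1f} cases per 100,000 people")
```

With these made-up inputs the expected cases in the standard population are 5, 15 and 80, so the age-standardised rate is 100 cases per 100,000 people.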

This report uses WHO world age-standardised incidence rates to enable comparisons with international published rates. As noted above, Australian age-standardised incidence rates (standardised to the 2001 Australian Standard Population) are available in this report’s accompanying data visualisation and Excel worksheet in the Data section.

Appendix D contains the standard populations used for this report and accompanying data downloads.

The WHO World Standard Population is younger than the 2001 Australian Standard Population (the mean age group is 25 to 29 years for the WHO World Standard Population and 35 to 39 years for the 2001 Australian Standard Population). Cancer incidence rates are higher at older ages, and because the WHO standard population is weighted more heavily towards younger ages, the rates it produces are generally lower than those produced using the 2001 Australian Standard Population.
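The effect of the standard population's age profile can be illustrated with a small sketch. The age groups, rates and weights below are invented for illustration and are not the WHO or Australian standard populations (see Appendix D for those); the only point is that the same age-specific rates give a lower standardised rate under a younger standard.

```python
# Hypothetical age-specific rates (cases per person), rising with age
rates = {"0-39": 0.0002, "40-64": 0.003, "65+": 0.015}

# Two made-up standard populations, expressed as weights that sum to 1
younger_standard = {"0-39": 0.65, "40-64": 0.27, "65+": 0.08}
older_standard = {"0-39": 0.55, "40-64": 0.32, "65+": 0.13}


def standardise(rates, weights, per=100_000):
    """Weighted average of age-specific rates, expressed per `per` people."""
    return sum(rates[group] * weights[group] for group in rates) * per


asr_younger = standardise(rates, younger_standard)
asr_older = standardise(rates, older_standard)
print(f"younger standard: {asr_younger:.0f}, older standard: {asr_older:.0f}")
```

Here the younger standard gives about 214 cases per 100,000 people and the older standard about 302, even though the underlying age-specific rates are identical.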

Using comparisons of the Australia-born and China-born populations, Table C.1 shows that the choice of standard population affects the rates themselves but has very little effect on general comparisons between COBs. Irrespective of which standard population is chosen, incidence rates for those born in Australia are higher for most of the selected cancers. If the lung cancer rates were treated as precise, the comparison could be interpreted differently depending on which standard population is used. Given the uncertainties in the data, however, the incidence rates should not be considered precise. In more general terms, Australia-born and China-born lung cancer incidence rates are quite similar.

Table C.1: Age-standardised incidence rate comparisons for selected cancers, people born in China or Australia, 2006–2020
Cancer                           WHO World Standard Population [1]   Australia 2001 standard population [2]
                                 (cases per 100,000 people)          (cases per 100,000 people)
                                 China        Australia              China        Australia
Breast cancer (females)          64.4         100.9                  80.6         127.7
Prostate cancer (males)          50.0         132.4                  72.5         181.6
Colorectal cancer (persons)      30.2         44.2                   43.0         62.0
Lung cancer (persons)            29.2         30.8                   42.3         44.0
Thyroid cancer (persons)         12.8         9.0                    14.9         10.4
Multiple myeloma (persons)       2.7          4.9                    3.9          7.0
All cancers combined (persons)   236.2        401.4                  321.2        538.2

Notes:

  1. Rates are age-standardised and expressed as cases per 100,000 population (that is, per 100,000 females for females, per 100,000 males for males and per 100,000 people for persons).

Source: Australian Cancer Database 2020

Crude incidence rates

The crude incidence rate in a given year is defined to be the number of diagnoses of cancer in that year divided by the total population on 30 June of that year (it is standard to use the mid-year population as a best guess at the average daily population across the whole year).

For a range of years, as used in this report, the crude incidence rate is defined as the total number of diagnoses of cancer in those years divided by the sum of the mid-year populations across those years.

For cancer, these rates are typically quite small and difficult to conceptualise. To simplify communication, the convention is to express cancer incidence and mortality rates per 100,000 males, females or people, as the case may be. For example, instead of saying that the incidence rate was 0.000456 cases per person we say that it was 45.6 cases per 100,000 people.

As the incidence rate of cancer depends heavily on age, crude rates are not suitable for looking at trends or making comparisons between different populations. This is because a population whose average age is relatively young will generally have a lower crude rate than a population whose average age is relatively old simply by virtue of the age difference. More meaningful comparisons can be made by using the age-standardised rate.

Extreme caution is recommended when interpreting crude incidence rates. The COB crude cancer incidence rates are strongly affected by the very different age profiles across the COBs. For example, the Australia-born crude cancer incidence rate for 2016–2020 was 573 cases per 100,000 people, while for the Greece-born population it was 1,657 cases per 100,000 people. Yet, according to the age-standardised incidence rate comparison, the Greece-born cancer incidence rate was lower than the Australia-born rate (361 compared with 407 cases per 100,000 people). The Greece-born crude rate is higher because that population is much older than the Australia-born population and cancer occurs more commonly at older ages: the median age in 2016 was 33.5 years for the Australia-born population compared with 71.0 years for the Greece-born population (ABS 2023).

While not published, crude rates can be derived from the population and cancer case data in the cancer incidence by country of birth Excel data workbooks (Box C1).

Box C1: Crude incidence rate calculation

The following formula can be used to derive the crude incidence rates:

Rate per 100,000 = (total cases in period) / (average population for period * number of years) * 100,000
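As a worked illustration of the Box C1 formula, the sketch below computes a crude rate for a five-year period. The function name and all figures are hypothetical. Note that the average population multiplied by the number of years is simply the sum of the yearly populations, which matches the multi-year definition given earlier in this section.

```python
def crude_rate(total_cases, yearly_populations, per=100_000):
    """Crude incidence rate per `per` people over a multi-year period."""
    n_years = len(yearly_populations)
    average_population = sum(yearly_populations) / n_years
    # average population * number of years = summed yearly populations
    return total_cases / (average_population * n_years) * per


# Hypothetical mid-year populations for a five-year period (not real data)
populations = [200_000, 202_000, 204_000, 206_000, 208_000]
rate = crude_rate(5_100, populations)
print(f"{rate:.1f} cases per 100,000 people")
```

With these made-up inputs, 5,100 cases over a summed population of 1,020,000 gives a crude rate of 500 cases per 100,000 people.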

Confidence intervals

Confidence intervals and significance testing are not generally used as the basis for discussion within this report. This report often groups COBs/ROBs into those with relatively higher or lower incidence for various cancers. Each COB/ROB within these groupings has a different confidence interval for its age-standardised incidence rate and, accordingly, the list of COBs/ROBs from which it differs significantly will often vary. For simplicity, the discussion focuses on comparisons of age-standardised incidence rates alone. Confidence intervals (95%) for age-standardised rates are, however, available in the accompanying data visualisation and Excel tables.

Imputation of country/region of birth

This report uses incidence rates which include the imputation of COB for records where COB is unknown. Raw cancer incidence rates (that is, those excluding imputed records) are available within the data visualisation and accompanying Excel workbooks. This section provides a general outline of the imputation method used where country of birth was unknown or only the region of birth was provided.

Imputation method

The imputation method distinguishes 2 types of unknown country of birth, and the approach taken depends on the type.

Type 1: 'Unknown' country of birth may be any country of birth

Records with a status of 'Not stated' or 'Inadequately described' may potentially belong to any country of birth. Accordingly, the imputation method considers the distribution of records by country of birth, 5-year age group, sex, period and cancer type, and distributes the records with unknown country of birth in the same proportions as those where the country of birth is known. For example, if 10% of colorectal cancer records for persons aged 40–44 for the 2016–2020 period were of people born in Greece, then 10% of the Type 1 unknown country of birth colorectal cancer records for persons aged 40–44 for the 2016–2020 period will be imputed with a country of birth of Greece.

Type 2: 'Unknown' country of birth may be one of a select number of countries

Some records are coded to a minor region such as 'Southern Europe, nfd' (nfd = not further defined) and can only belong to the countries within that region. The imputation method is the same as for Type 1, except that the imputation of unknown records is based on the distribution of records within the minor region.
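The proportional allocation described for both types can be sketched as follows for a single stratum (one combination of 5-year age group, sex, period and cancer type). This is an illustrative assumption about how the allocation might be carried out, not the report's actual implementation: the function name and counts are hypothetical, and the sketch allocates fractional counts, whereas the actual method may assign whole records.

```python
def impute_unknown(known_counts, unknown_count, candidates=None):
    """Distribute unknown-COB records within one stratum pro rata.

    known_counts: {country: number of records with a known country of birth}
    unknown_count: number of unknown records to distribute
    candidates: countries the unknown records may belong to
                (None = any country, as in Type 1;
                 the countries of a minor region, as in Type 2)
    Returns {country: known + imputed count}; imputed shares may be fractional.
    """
    if candidates is None:
        candidates = set(known_counts)
    base = {c: n for c, n in known_counts.items() if c in candidates}
    total = sum(base.values())
    if total == 0:
        raise ValueError("no known records to base the distribution on")
    result = dict(known_counts)
    for country, n in base.items():
        # Each candidate country receives its share of the unknown records
        result[country] += unknown_count * n / total
    return result


# Hypothetical stratum (not real data)
known = {"Australia": 700, "Greece": 100, "Italy": 150, "China": 50}

# Type 1: 20 'Not stated' records may belong to any country
type1 = impute_unknown(known, 20)

# Type 2: 10 'Southern Europe, nfd' records may belong only to that region
type2 = impute_unknown(known, 10, candidates={"Greece", "Italy"})
```

In the Type 1 call, Greece holds 10% of the known records and so receives 10% of the 20 unknown records (2 records). In the Type 2 call, the 10 unknown records are shared only between Greece and Italy, in proportion to their known counts (4 and 6 records), and the counts for other countries are unchanged.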

Both methods of imputation assume that the distribution of country of birth in the records with missing data is the same as in the complete records. Of course, this may not always be the case. The demographics of people migrating to Australia, such as age, gender and socioeconomic status, vary with both time and country of origin.