The natural log of hourly wages (ln_wages) was the dependent variable in the decompositions. The HRC dataset contains annualised salary, which was converted to the log of hourly wages as:

ln_wages = ln(annual salary / 2088)
The salary information comes from the employing department, and is the employee's annual base salary on the payroll system. This means that "above base" earnings that can be expected to change from one pay period to another, such as overtime and "at risk" pay, are omitted from the annual salary figures for all employees. The denominator (2088) is based on a 40-hour working week multiplied by 52.2 weeks (the average number of weeks in a year). The same denominator has been applied to all observations because the decompositions examine full-time employees only.
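The conversion described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the study: the function name `ln_hourly_wage` and the example salary are hypothetical, and only the 2088-hour denominator comes from the text.

```python
import math

# Full-time hours per year as stated in the text:
# a 40-hour week multiplied by 52.2 weeks.
HOURS_PER_YEAR = 2088


def ln_hourly_wage(annual_salary: float) -> float:
    """Return the natural log of the hourly wage implied by an annual base salary.

    Assumes a full-time employee, so the same denominator applies to everyone.
    """
    if annual_salary <= 0:
        raise ValueError("annual_salary must be positive")
    return math.log(annual_salary / HOURS_PER_YEAR)


# Hypothetical salary chosen so that the hourly wage is exactly 20.
example_ln_wage = ln_hourly_wage(41760)
```

For the hypothetical salary of 41,760, the hourly wage is 20 and ln_wages is approximately 2.996.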
The datasets for the 2000 and 2001 collections contained 191 and 194 predictor variables respectively. The difference was due to a slightly different specification of the occupational indicator variables (131 in 2001, compared with 129 in 2000) and the addition of the Serious Fraud Office as an indicator variable in the employer category. The same reference categories were used in both decompositions.
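The two totals can be checked against the per-category counts given in the list of predictor categories in this section. The tally below is a simple arithmetic check; the dictionary names are hypothetical, and the 2001 employer count of 37 is inferred from the addition of one employer indicator rather than stated directly in the text.

```python
# Number of predictor variables per category in the 2000 decomposition,
# taken from the category descriptions in this section.
COUNTS_2000 = {
    "age": 2,            # linear and quadratic terms
    "ethnicity": 5,
    "occupation": 129,
    "tenure": 1,
    "region": 16,
    "employer": 36,
    "term_of_employment": 1,
    "agreement_type": 1,
}

# 2001 differs by two extra occupation indicators and one extra
# employer indicator (inferred: 36 + 1 = 37).
COUNTS_2001 = dict(COUNTS_2000, occupation=131, employer=37)

total_2000 = sum(COUNTS_2000.values())
total_2001 = sum(COUNTS_2001.values())
```

Under these counts the totals are 191 for 2000 and 194 for 2001, matching the figures reported above.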
Eight categories of predictor variable were included in the decompositions:
1. Age was included as two continuous variables, encompassing a linear form and a quadratic (age2) form. Refer to Appendix C for a detailed analysis of the effect of age on ln_wages, using the 2000 dataset.
2. Ethnicity was included as five indicator variables, where ethnicity was defined using a priority classification system 17. The indicator variables were: NZ Maori, Pacific peoples, Asian, Non-New Zealand European, and Other. The reference category is New Zealand European/Pakeha.
3. Occupation was included as 129 (2000 decomposition) or 131 (2001 decomposition) indicator variables 18. Although the analysis dataset contained more occupations than this at the 5-digit NZSCO level, the decompositions omit the indicator variables for occupations containing only males or only females (refer to Appendix B for more information). The reference category is "Careers, Transition and Employment Adviser" (NZSCO code = 33511), an occupation that is not limited to the New Zealand Public Service, represents 1.5% of the dataset, and has a female proportion of 52.2% (2000 dataset analysis).
4. Tenure was based on tenure with current employer, and was included as a continuous variable. Tenure was moderately correlated with age (0.45) and age2 (0.44), p < .001 for both correlations (2000 dataset analysis).
5. Region was included as 16 indicator variables: Northland, Auckland, Waikato, Bay of Plenty, Gisborne, Hawkes Bay, Taranaki, Manawatu-Wanganui, West Coast, Canterbury, Otago, Southland, Tasman, Nelson, Marlborough, and Overseas. The reference category is Wellington.
6. Employer was included as 36 indicator variables. The reference category is the Ministry of Pacific Island Affairs, which has a female proportion of 48%, an average salary close to the Public Service mean, and a limited number of occupation groups (2000 dataset analysis). To protect the identity of departments, department names have been anonymised into three size categories (Small, Medium and Large) based on employee numbers. For example, the variable "Small 1" indicates the first small department in the model.
7. Term of employment was included as one indicator variable for fixed-term employment 19. The reference category is open-term employment.
8. Type of employment agreement was included as one indicator variable for individual agreement. 20 The reference category is collective agreement, and includes employees on expired collective agreements.
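Each categorical predictor above enters the model as a set of 0/1 indicator variables, with the reference category omitted so that its effect is absorbed into the intercept. The sketch below illustrates that coding for the region category. It is a minimal illustration only: the helper `indicator_columns` and the five sample observations are hypothetical and are not drawn from the HRC dataset.

```python
def indicator_columns(values, reference):
    """Build 0/1 indicator columns for every level except the reference level.

    Returns a dict mapping each non-reference level to a list with one
    entry per observation.
    """
    levels = sorted(set(values) - {reference})
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
    }


# Hypothetical observations; Wellington is the reference category for region.
regions = ["Wellington", "Auckland", "Otago", "Wellington", "Auckland"]
region_dummies = indicator_columns(regions, reference="Wellington")
```

An employee in Wellington has zeros in every region column, so each region coefficient is read as a difference from Wellington.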
17 The HRC dataset contains up to two ethnicities per employee. The standard ethnicity prioritisation was used so that the ethnicity indicator variables in the model consisted of mutually exclusive groups. The priority system, in descending priority order, is: NZ Maori, Pacific peoples, Asian, Non-NZ European, Other, NZ European. The only alteration that has been made to the normal use of the system is that a NZ European/Other European combination has been coded to NZ European instead of Non-NZ European.
18 A reviewer suggested that occupation and earnings are simultaneously determined within the labour market. In the case of simultaneous determination (i.e., occupation is endogenous to wage determination), occupation should be excluded from the decompositions. I decomposed the 2000 dataset, excluding the occupation indicator variables, to test this hypothesis. The reduced versions of the male and female models produced lower adjusted R² values (52.3% and 37.9% respectively, compared with 71.1% and 65.7% in the full models), and the explained gender pay gap fell to 50.9%, compared with 71.4% in the full decomposition. These results are the basis for retaining occupation in the decompositions. A discussion of indirect versus direct discrimination in occupation is beyond the scope of this study.
19 A fixed-term contract has a specified end date to the employment, whereas an open-term contract does not.
20 An individual agreement is made between an employee and the employer. A collective agreement has multiple employees and one employer as parties.
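The ethnicity prioritisation described in footnote 17 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure actually run on the HRC dataset: the function name `prioritised_ethnicity` is hypothetical, and the sketch assumes that the "Other European" response in the footnote corresponds to the Non-NZ European group.

```python
# Priority order from footnote 17, highest priority first.
PRIORITY = [
    "NZ Maori",
    "Pacific peoples",
    "Asian",
    "Non-NZ European",
    "Other",
    "NZ European",
]


def prioritised_ethnicity(recorded):
    """Collapse up to two recorded ethnicities into one mutually exclusive group."""
    groups = set(recorded)
    if not groups or len(groups) > 2:
        raise ValueError("expected one or two recorded ethnicities")
    unknown = groups - set(PRIORITY)
    if unknown:
        raise ValueError(f"unrecognised ethnicity: {sorted(unknown)}")
    # Stated exception: NZ European combined with another European group
    # is coded to NZ European rather than Non-NZ European.
    if groups == {"NZ European", "Non-NZ European"}:
        return "NZ European"
    # Otherwise return the highest-priority group that was recorded.
    for group in PRIORITY:
        if group in groups:
            return group
```

For example, an employee recorded as both NZ European and Asian is assigned to Asian, while an employee recorded as NZ European and Non-NZ European is assigned to NZ European, the reference category.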