
Chapter 2. Data and Methods

2.5 Methods of analysis

2.5.9 Changes in the life course trajectories around the move

Another important aspect of this research is the investigation of the changes which might have occurred in the private and professional trajectories around the time of a move to the peri-urban area. Furthermore, it is of interest to find out if the peri-urban area attracts people with different characteristics compared to other spatial types. The former question is addressed by analysing the event trajectories instead of state trajectories and the latter question with Event History Analysis (discrete-time (multinomial) logit) and choice models (multinomial logit).

The transformation of the trajectories into events, where changes from one state to another are registered instead of one state per year (Gabadinho et al. 2011a), allows the analysis of the changes that take place in the private and professional trajectories. For the residential trajectories the period spent outside the peri-urban area is regrouped as non peri-urban in order to observe the event of moving to the peri-urban area. With the trajectories in event form the change to the peri-urban area in the residential trajectories can be compared with changes in the private and professional trajectories.
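The transformation from states to events can be sketched as follows. This is a minimal illustration in Python, not the routine used in the thesis; the function names, the state labels and the example trajectory are hypothetical.

```python
def regroup_residence(states, target="peri-urban"):
    """Collapse every residential type other than the target into one state."""
    return [s if s == target else "non peri-urban" for s in states]


def to_events(states):
    """Return (position, from_state, to_state) for every change of state."""
    return [
        (t, prev, cur)
        for t, (prev, cur) in enumerate(zip(states, states[1:]), start=1)
        if prev != cur
    ]


# Hypothetical yearly residential trajectory of one person.
residence = ["centre", "centre", "suburban", "peri-urban", "peri-urban"]

# Without regrouping, the move from centre to suburban is also an event.
all_events = to_events(residence)

# After regrouping, only the move into the peri-urban area remains.
residence_events = to_events(regroup_residence(residence))
```

Regrouping first means that moves between non peri-urban types no longer register as events, so only the arrival in the peri-urban area is left to compare with the private and professional events.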

Furthermore, it is possible to investigate when these changes take place relative to the change in residence. To this end, we observe whether a change in those dimensions of the life course occurs at the same time as a move to the peri-urban area, or slightly before or after it (within a 5-year window). This is based on the hypothesis that within this timespan people adapt to a change that has occurred in their life (for example a marriage, a change in employment or the arrival of a child) or anticipate changes to come, such as the arrival of children (see for example Speare and Goldscheider 1987; Kulu 2008; Courgeau 1985; Courgeau 1989).

For each specific life course change, the occurrences around all the moves observed in the dataset (i.e. moves in general) are added together and divided by the total number of people in the respective dataset, which gives the percentage of people who experience that event within the window (the whole dataset is used as denominator because all the individuals are at risk of experiencing a move).

31 For these calculations the ‘nnet’ package in R is used (Venables and Ripley 2002).

The changes that occur around a move towards the peri-urban area are also added together, but divided by the total number of individuals exposed to this specific type of move, since only these individuals are at risk of experiencing a change around a move to the peri-urban area. The comparison between changes occurring before or after moves in general and moves to the peri-urban area provides an insight into whether specific life course events are related to a move to the peri-urban area or not. To assess whether the shares of changes before and after the move differ significantly, confidence intervals have been added (Sison and Glaz 1995).
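The two percentages differ only in which moves are counted and in the denominator. The sketch below illustrates this in Python under several assumptions: the record layout, the `exposed` flag, and the reading of the window as five years either side of the move are all hypothetical, and the Sison and Glaz confidence intervals are not computed here.

```python
def changes_around_moves(people, window, move_filter=lambda dest: True):
    """Count life course changes that fall within `window` years of a qualifying move."""
    count = 0
    for person in people:
        move_years = [year for year, dest in person["moves"] if move_filter(dest)]
        count += sum(
            1
            for change_year in person["changes"]
            if any(abs(change_year - m) <= window for m in move_years)
        )
    return count


# Hypothetical records: moves as (year, destination type), changes as years.
people = [
    {"moves": [(2003, "peri-urban")], "changes": [2002], "exposed": True},
    {"moves": [(2005, "centre")], "changes": [2005], "exposed": False},
    {"moves": [], "changes": [2001], "exposed": True},
    {"moves": [(2004, "peri-urban")], "changes": [1990], "exposed": True},
]
WINDOW = 5  # assumption: years either side of the move

# Moves in general: everyone in the dataset is at risk.
pct_general = 100 * changes_around_moves(people, WINDOW) / len(people)

# Moves to the peri-urban area: only exposed individuals form the denominator.
n_exposed = sum(1 for p in people if p["exposed"])
pct_periurban = (
    100
    * changes_around_moves(people, WINDOW, lambda dest: dest == "peri-urban")
    / n_exposed
)
```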

In addition to the event data, graphs show the changes in the five years either side of the move. For these graphs the private and professional trajectories are aligned on the year of the move to the peri-urban area. People who do not show a change are omitted from the graphs in order to focus on the changes observed.
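Aligning trajectories on the move can be illustrated with a short Python sketch. It assumes that each trajectory is a list of yearly states and that the position of the move is known; the function names and the padding with `None` for years outside the observation period are hypothetical choices.

```python
def align_on_move(states, move_index, window=5):
    """Return the states from `window` years before to `window` years after the move."""
    return [
        states[i] if 0 <= i < len(states) else None
        for i in range(move_index - window, move_index + window + 1)
    ]


def shows_change(aligned):
    """True when more than one distinct state is observed in the aligned window."""
    observed = {s for s in aligned if s is not None}
    return len(observed) > 1


# Hypothetical family trajectory; the move happens in the fourth year (index 3).
family = ["couple", "couple", "couple", "nuclear", "nuclear", "nuclear", "nuclear"]
aligned = align_on_move(family, move_index=3, window=2)

# Trajectories without any change in the window would be left out of the graph.
keep_for_graph = shows_change(aligned)
```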

Discrete time logit

Whether the peri-urban area attracts a specific type of population compared with other territorial types, and hence whether this type of space is perceived differently, is investigated in three steps.

First, it is of interest to look at what makes people move between territorial types in general, in order to understand who the Swiss internal migrants are. Second, it is of interest to discover what makes people move specifically to the peri-urban area. Lastly, it is of interest to examine, among the people who move, what makes them choose the peri-urban area over another type.

The first two questions are investigated with Event History Analysis (EHA), also known as survival analysis. This entails the “measurement of the entrance into some initial state until the attainment of some final state” (Blossfeld et al. 1989:30), for example from alive to dead (life expectancy), or from not having a child to having a child. In the second case, for instance, the dataset includes a starting time (e.g. 15th birthday, or union formation) and the birthdates of the children, which allows computing the waiting time until the event. Additionally, if we want to know what causes these events, explanatory variables might be included, either as constant (sex) or time-varying covariates (income) (Allison 1982). In this research we want to measure the risk of moving from outside to inside the peri-urban area and the characteristics associated with this event.

Within EHA many different models exist (Figure 2-7), all of which make more or less strong assumptions about the shape of the risk (or hazard rate) over time. As we do not want to (and cannot) make such assumptions for this research, at first glance the semi-parametric proportional hazard model (Cox) appears suited, as it allows the investigation of the “influence of the covariates without additional assumptions about the time dependency” (Blossfeld et al. 1989:142). This model works very well if the exact point in time at which the events occur is known (i.e. the time is “continuous”), which is often not the case for panel data.

The SHP and SHP Pilot datasets record information only once per year, and many people may thus appear to experience an event at the same time. The number of so-called ties (events occurring at the same time) is thus high with panel data, and hence continuous-time methods such as Cox’s should not be used (Blossfeld et al. 1989). In general, continuous-time methods run into problems when the time units are large, such as months, years or decades (Allison 1982).

In the case of panel data, with yearly intervals, discrete-time models are, in consequence, more appropriate to assess the occurrence of the final state, i.e. the move to the peri-urban area (Blossfeld et al. 1989). The advantage of this model is also that time varying explanatory variables can be easily included (Allison 1982). These models treat the state of the individuals as a dichotomous variable indicating at each moment in time whether the person has experienced the event or not. Such variables are usually analysed with logistic (logit) models, which are part of the Generalised Linear Model (GLM) spectrum (Carter Hill et al. 2008:425).
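For reference, the discrete-time logit model is commonly written as below, following the standard presentation (e.g. Allison 1982). The notation is not taken from the thesis and is an assumption made for illustration: $h_{it}$ is the probability that individual $i$ experiences the event in year $t$ given that he or she is still at risk, $\alpha_t$ captures the time dependency, and $\mathbf{x}_{it}$ holds the constant and time-varying covariates.

```latex
\[
h_{it} = \Pr(T_i = t \mid T_i \ge t,\ \mathbf{x}_{it}),
\qquad
\log\frac{h_{it}}{1 - h_{it}} = \alpha_t + \boldsymbol{\beta}^{\top}\mathbf{x}_{it}
\]
```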

Discrete-time logit models can be used as EHA by reorganising the dataset into a longitudinal form known as a person-period file. This type of model can thus be conceived as both a GLM and an EHA model, and is situated at the overlap between the GLM and EHA families (Figure 2-7). In its longitudinal configuration, the logit model is diverted from its initial application to dichotomous outcomes thanks to a reorganisation of the data structure. Indeed, in order to conduct this type of analysis all trajectories are transformed into person-period datasets: for every moment in all trajectories of all individuals we know where the individual lives and what his or her socio-demographic characteristics are at that point in time.
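A person-period file can be built along the following lines. This Python sketch is only an illustration under stated assumptions: the function name and the row layout are hypothetical, age stands in for the full set of covariates, and a person is treated as at risk in a given year only if he or she lived outside the target area in the previous year.

```python
def person_period(person_id, first_year, residence, birth_year, target="peri-urban"):
    """Build one row per year at risk, with move = 1 in the year of arrival."""
    rows = []
    for i in range(1, len(residence)):
        if residence[i - 1] == target:
            continue  # already in the target area: not at risk of moving into it
        year = first_year + i
        rows.append(
            {
                "id": person_id,
                "year": year,
                "age": year - birth_year,  # time-varying covariate
                "move": int(residence[i] == target),  # dichotomous outcome
            }
        )
    return rows


# Hypothetical trajectory observed yearly from 1999 to 2003.
residence = ["centre", "centre", "peri-urban", "peri-urban", "suburban"]
rows = person_period(1, 1999, residence, birth_year=1970)
```

Each row is then one observation for the logit model, which is how a model for dichotomous outcomes comes to describe the timing of an event.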

For these discrete-time logit models [32] the focus is placed on the SHP dataset instead of the SHP Pilot. Within the former dataset it is possible to select a larger group of individuals than has been used until now for the calculations with this dataset (N=2391). A larger sample yields more observations per variable category, which in turn produces more robust results.

The individuals have been selected from the SHP dataset using a few criteria. First, all individuals with more than two entries for all the indicated variables (see section 2.2.1) have been selected. This has resulted in a group of just over 7700 individuals. Their trajectories have been transformed into person-period trajectories.

Second, within this person-period file all individuals who were 18 years of age or older in 1999 have been retained (in order to omit the majority of individuals who still live at the parental home). Last, all the empty person-periods (the years for which no information exists in the SHP for the individuals) have been removed from the data file (N=2894).

32 For these calculations the ‘stats’ and ‘mlogit’ packages in R are used (Croissant 2013; R Core Team 2013).

Figure 2-7: A schematic representation of the Generalised Linear Models and Event History Analysis [33]

This person-period dataset is then subjected to a discrete-time logit model to examine the first question: the characteristics which make people move in general (1 = move / 0 = no move (ref)). Question two is examined in the same manner, with the same dataset. The people who do not move to the peri-urban area (move = 0 (ref)) are compared with those who move to the peri-urban area (move = 1) (see appendix E for the number of people per category).

Question two is also examined for Saint-Cergue and Palézieux (N=522). However, this time we use a discrete-time multinomial logit (Steele et al. 2004). In order for the estimates to be comparable (Mood 2010), the two localities have been placed in one model. In this model we distinguish no move (move = 0 (ref)), a move to Saint-Cergue (move = 1) and a move to Palézieux (move = 2). For the SHP Pilot these analyses have not been carried out due to the low number of people moving to the peri-urban area (N=85).

For the investigation of the Saint-Cergue and Palézieux trajectories, a small change has been made to these trajectories. The aim is to investigate the characteristics upon arrival in the peri-urban area, so we need to observe an arrival in the peri-urban area in the residential trajectory. As we cannot be sure that the first residence in the trajectory is indeed the first year of residence in the peri-urban area, we omit this residence from the residential trajectory (see appendix E for the number of people per category).

33 This figure shows a selection of methods within these two large families of statistical methods and is therefore not exhaustive.

Multinomial logit

For question three the same person-period dataset is used, but the analytical approach is slightly different. In order to compare the moves towards the peri-urban area with those to other areas, a multinomial logit is used [34]. Here only the moves are used and not the entire trajectories as with the discrete-time (multinomial) logit. This method allows the peri-urban area to be the reference category to which the other territorial types are compared, and thus gives a clear comparison between people who choose to move to the peri-urban area and those who choose other territorial types [35].
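In its standard form, a multinomial logit with a reference category can be written as below. The notation is an assumption introduced for illustration and does not come from the thesis: $y_i$ is the territorial type chosen in move $i$, $r$ is the reference category (here the peri-urban area) with $\boldsymbol{\beta}_r = \mathbf{0}$, and $\mathbf{x}_i$ holds the characteristics of the mover.

```latex
\[
\Pr(y_i = k \mid \mathbf{x}_i) =
\frac{\exp(\boldsymbol{\beta}_k^{\top}\mathbf{x}_i)}
     {1 + \sum_{j \neq r} \exp(\boldsymbol{\beta}_j^{\top}\mathbf{x}_i)},
\quad k \neq r,
\qquad
\log\frac{\Pr(y_i = k \mid \mathbf{x}_i)}{\Pr(y_i = r \mid \mathbf{x}_i)}
= \boldsymbol{\beta}_k^{\top}\mathbf{x}_i
\]
```

Each coefficient vector therefore describes how a characteristic shifts the log-odds of choosing type $k$ rather than the peri-urban area.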

This analysis is only conducted on the SHP dataset, as only here does a reasonable number of moves exist for the separation into moves to the different territorial types. Still, the total number of moves is relatively small (N=716) and the results might be slightly affected by this.

Due to this relatively small number of moves, a few categories have been grouped together for the model to converge. The 50-59 and 60+ year age groups have been merged. In addition, the categories ‘0’ and ‘1-4’ of the youngest child in the household have been merged. For the employment variable three categories remain: full-time, part-time and other (i.e. inactive, study and other). For the family variable three categories remain as well: couple, nuclear and other (i.e. alone, complex and single parent).
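The grouping of categories can be expressed as a simple recoding table. In the Python sketch below the variable names and the labels of the merged categories (`50+`, `0-4`) are hypothetical; only the groupings themselves follow the description above.

```python
# Categories to merge, per variable; unlisted categories are kept as they are.
MERGES = {
    "age": {"50-59": "50+", "60+": "50+"},
    "youngest_child": {"0": "0-4", "1-4": "0-4"},
    "employment": {"inactive": "other", "study": "other"},
    "family": {"alone": "other", "complex": "other", "single parent": "other"},
}


def recode(record):
    """Apply the merges to one move record (a dict of variable -> category)."""
    return {
        variable: MERGES.get(variable, {}).get(category, category)
        for variable, category in record.items()
    }


# Hypothetical move record.
move = {"age": "60+", "youngest_child": "1-4", "employment": "study", "family": "couple"}
recoded = recode(move)
```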

Six residential types have been examined: the centres (23 per cent of the moves), the suburban area (25 per cent of the moves), the peri-urban area (ref; 16 per cent of the moves), the rural-commuter area (11 per cent of the moves), the rich area (5 per cent of the moves) and the remaining areas outside the agglomeration (20 per cent of the moves). It is recognised that the rich area does not contain a very large number of moves; however, it has been included as a separate group because it shows results quite similar to those of the peri-urban area (see appendix E for the number of people per category).

Variables used in the methods

For the calculations based on the SHP, Saint-Cergue and Palézieux datasets, a common group of characteristics has been examined, with a focus on the private and professional characteristics. All the models include sex, age, family type and occupation type. For the multinomial logit with the SHP data some categories have been grouped together due to low numbers of observations: single-parent with complex (private variable) and inactive with other occupation (professional variable).

In these calculations the age of the individuals is used instead of the year of birth, in order to focus on the age people have upon arrival in the peri-urban area. Five age groups, which could be considered to relate to different life course stages, are distinguished. Very similar age groups are used in research by Levy et al. (2013). The family and professional variables in the models are the same as those used throughout the research.

For the SHP, advantage is taken of the larger size of the dataset to test the effect of a few more variables. Besides the aforementioned variables, income, education and the youngest child in the household are considered in order to gain a broader view of the socio-economic characteristics of the migrants (Table 2-2).

34 For these calculations the ‘mlogit’ package in R is used (Croissant 2013).

35 It is important to stress that this is not an EHA model anymore since only the movers are considered and the other person-periods are removed from the dataset.

The highest attained education is included to find out whether more highly educated people are more likely to move to the peri-urban area, as they might have a higher income which allows them to move to this area. The education levels have been aggregated into three groups: those who have only finished compulsory education (until the age of 15), those who have continued with non-university post-compulsory schooling (e.g. ‘CFC’, ‘ECG’) and those who have continued their post-compulsory schooling at university level [36].

As the presence of children in the household might complicate a move, its effect on moving is examined. New babies and children below school age have been selected as one category, as have those in primary and secondary education. Households that do not have any children, or only children over the age of 18, are included in the category ‘none’. Although household type is already included in the model, this variable has been added because the two do not capture exactly the same aspect. It makes clear how the families are composed, with (very) young children, older children or no children, and allows the household to be situated in its life cycle. From the literature it is known that moving with school-age children is more difficult.

The socio-economic characteristic of income is added to the calculations in order to find out whether a higher income has an impact on the move, as a high demand for housing in these areas might drive up prices. The income presented here corresponds to yearly net household income standardised by the number of household members with the SKOS method (Swiss Conference for Social Welfare) (Swiss Household Panel 2012b). With this method every member of the household is assigned a value in proportion to his or her needs, often based on the number of household members and their age. The total household income is then divided by the sum of the values attributed to all members to obtain a standardised income (OECD 2014; Guggisberg et al. 2013). In the analysis three income groups are used. An annual income of less than 50’000, which is the median value, is considered the lower income group. Those who earn between 50’000 and 100’000 are considered the middle income group and those who earn more than 100’000 the high income group.
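The standardisation and grouping of income can be sketched in Python as follows. The SKOS weights are not given in the text, so this sketch swaps in the OECD-modified equivalence scale (1.0 for the first adult, 0.5 for each further person aged 14 or over, 0.3 for each child under 14) purely as a stand-in. The function names are hypothetical, at least one household member aged 14 or over is assumed, and the treatment of incomes exactly at the group boundaries is an assumption.

```python
def equivalised_income(household_income, ages):
    """Divide household income by the sum of the members' weights.

    Stand-in weights: OECD-modified scale, NOT the SKOS scale used in the thesis.
    """
    adults = sum(1 for age in ages if age >= 14)
    children = len(ages) - adults
    weight = 1.0 + 0.5 * max(adults - 1, 0) + 0.3 * children
    return household_income / weight


def income_group(income):
    """Assign the three income groups used in the analysis."""
    if income < 50_000:
        return "low"
    if income <= 100_000:
        return "middle"
    return "high"


# Hypothetical household: two adults and two children, 126'000 net per year.
standardised = equivalised_income(126_000, ages=[40, 38, 10, 5])
group = income_group(standardised)
```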

The income variable has a considerable number of missing values. As this is quite an important variable, an attempt has been made to reduce the number of missing values using imputation by interpolation. It has been decided to estimate values only in the middle of sequences and not at the beginning or the end. In the middle we have surrounding values on which the estimation can be based, which is not the case at the beginning and end.

First, it is determined for every individual whether there are missing values. If so, it is determined whether they are located at the beginning or end of the sequence; if they are, no attempt is made to impute them. Second, the missing values which occur in the middle of the sequences are replaced with an interpolated value.

36 The ISCED code that exists for Switzerland has not been applied here as it creates groups that are too small; therefore only three education groups are used.

The manner in which these values are interpolated is illustrated with an example in Figure 2-8. This individual has no residential information at times 10 and 11 (the pink lines), so these two periods are not included in the person-period file or in the interpolation of the missing income values. For times 2 and 3 there is no information on income (blue lines). All the observed income points are plotted and a line is fitted to them with a spline smoothing method [37]. This black line crosses the blue lines at times 2 and 3, and the replacement values for the missing values are read off at these intersections (the purple crosses).
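The rule of filling only interior gaps can be illustrated with the Python sketch below. It deliberately replaces the spline smoothing used in the thesis with plain linear interpolation between the nearest observed values, so the imputed numbers would differ from those in Figure 2-8; the function name and the example income sequence are hypothetical.

```python
def fill_interior_gaps(values):
    """Interpolate None values that have observed values on both sides.

    Linear interpolation is a stand-in for the spline smoothing in the thesis.
    Leading and trailing gaps are left untouched.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical income sequence: gaps at the start, in the middle and at the end.
income = [None, 40_000, None, None, 70_000, None]
imputed = fill_interior_gaps(income)
```

Only the two middle gaps receive a value; the first and last positions stay missing because no observation exists on one side of them.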

The imputation of the income data has reduced the number of missing values, especially in the model for moves in general. Within the total person-period dataset the number of missing values decreased by 25 per cent.

Figure 2-8: An example of the interpolation method for the missing values in the income variable

Source: Swiss Household Panel (SHP), calculations by author