Migration and the Labour Market: Data and Intro to STATA
Prof. Dr. Herbert Brücker
Otto-Friedrich-University of Bamberg
Project seminar, meetings on May 27 and June 9, 2010
Herbert Brücker Project seminar
Contents of today’s meeting
1 Review of last meeting
Review of the Borjas (QJE 2003) and Ottaviano/Peri (NBER 2006) structural model
A model with imperfect labour markets
2 Outline of the tasks ahead
3 An introduction to the data set
4 STATA: a primer
Review of last meeting
Borjas' (QJE 2003) national level regression model
Borjas' (QJE 2003) production function approach
Ottaviano/Peri (NBER 2006) extension
A model with imperfect labour markets (Brücker/Jahn, SJE 2010)
What do we have to do?
1 Review of fundamental contributions in the literature (done?)
2 Getting familiar with the data and data handling with STATA
3 Providing descriptive statistics and making graphs
4 Running the Borjas (QJE 2003) national regression model with STATA
5 Getting familiar with the production function approach
6 Estimation of the parameters of the nested CES production function following Borjas (QJE 2003) and Ottaviano/Peri (NBER 2006)
7 Simulation of migration effects using the estimated parameters
8 Presenting the findings
9 Writing the paper
Presentation of the data set | Descriptive results
Part III
Sketch of dataset
A nice data set ...
IAB Employment Sample (IABS)
2% sample of all employees and unemployed, derived from social security records
Precise information on wages and unemployment spells
We use the 1980 - 2004 period (25 time series observations)
We restrict the sample to Western Germany (without Berlin)
... with a lot of problems
Foreigners are identified only by citizenship, not as immigrants by country of birth
Many immigrants are not covered; distortions arise through (i) naturalizations and (ii) 2nd and 3rd generation migrants
Wages are censored at the contribution threshold of the pension system
Only daily wage information (problem: part-time workers)
Incomplete education information (17% of our cases have no information)
How did we address the problems?
Identification of foreigners
We treat all individuals as foreigners if they are reported as foreign nationals at least once (to control for naturalizations)
Identification of ethnic Germans ("Spätaussiedler") by programme participation (e.g. special language classes)
We do not consider immigration from Eastern Germany (treated as German nationals)
Censored wage information: 5,800 Euro income ceiling (3% of labour force)
Imputation of wages above the ceiling using the Büttner/Rässler (2008) heteroscedastic single imputation approach
Imputation of missing education information (17 per cent) using the Fitzenberger et al. (2005) approach
We exclude all part-time employees due to missing hourly wage information
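The "once reported as foreign national" rule above can be sketched in STATA as follows. This is only an illustration on a tiny made-up data set; the variable names persid, year, foreign and everforeign are hypothetical and not taken from the IABS.

```stata
* Illustration on made-up data; persid, year and foreign are hypothetical names
clear
input persid year foreign
1 1990 1
1 1991 0
1 1992 0
2 1990 0
2 1991 0
3 1990 0
3 1991 1
end

* Flag a person in ALL years if he or she is reported as foreign at least once
bysort persid: egen everforeign = max(foreign)

list persid year foreign everforeign, sepby(persid)
```

Persons 1 and 3 are flagged in every year, including the years in which they appear as German nationals; person 2 is never flagged.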
Share of foreigners in labour force and employment
Figure: Share of foreign labor force and workers
Foreigner shares by education group
education                                 1980   1990   2000   2004
no vocational degree                      0.240  0.279  0.394  0.388
vocational degree                         0.051  0.064  0.106  0.113
high school (Abitur) + vocational degree  0.065  0.058  0.078  0.082
university degree                         0.071  0.061  0.063  0.071
Basics | Data management | Data description | Intro to regressions
Part IV
STATA: A Primer
Basics
STATA consists of a main menu and different editors and viewers
Three types of files:
Do files: Do files can be used to run all commands and save them (very useful, you can always repeat what you have done in the last session)
Data files: e.g. wagecurve.dta
Log files: Report all commands and results of your session (not necessary, but perhaps useful)
Do Files
To open a Do file or create a new one: Syntax: doedit
Choose the path where you work and where your data are stored: Syntax: cd c:\mig
Type your commands into the Do file
Run selected lines of the Do file (Run command in the Do file menu)
Run the entire Do file (Do command in the Do file menu)
Save your Do file at the end of the session
Hint: It is helpful to describe what you have done. You can do this by placing words or sentences between stars, e.g. *** Creating Dummy Variables ***
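A minimal Do file skeleton along these lines might look as follows. It assumes that the folder c:\mig exists on your machine; the log file name session1.log is a hypothetical choice.

```stata
*** Session setup ***
* Assumption: c:\mig is your working folder (adjust the path if necessary)
cd c:\mig

* Optional: write all commands and results to a log file (hypothetical file name)
log using session1.log, replace text

*** Creating Dummy Variables ***
* your commands go here

*** End of session ***
log close
```

Lines starting with a star are treated as comments, so the headings between stars do not interfere with the commands.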
Data management (I)
Getting started: open the data editor. Syntax: edit
A simple way to load data from EXCEL tables: copy all variables (incl. labels) with Ctrl+C and paste them into the data editor with Ctrl+V (more sophisticated ways exist)
Preserve and close the data editor
Load existing data sets. Syntax: use c:\mig\wagecurve.dta, clear
use command: loads the dataset
clear option: replaces the old data in memory
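To try the loading step without the seminar data, the following sketch uses auto.dta, an example data set that ships with STATA. The commented line repeats the pattern for the seminar file and has to be adjusted to your own path.

```stata
* Load an example data set shipped with STATA; clear drops any data in memory
sysuse auto, clear

* Show variable names, storage types and labels of the loaded data
describe

* Same pattern for the seminar data (path as on the slide, adjust to your machine):
* use c:\mig\wagecurve.dta, clear
```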
Data management (II)
Creating new variables
Syntax: gen lnwage = log(wage) (generates the natural log of a variable)
Syntax: gen V1V2 = V1*V2 (generates the product of two variables)
Creating dummy variables
Syntax: gen Ded1 = 0
Generates a variable where all values are zero
Syntax: replace Ded1 = 1 if ed == 1
Replaces the zero with 1 wherever the ed index equals 1
The if qualifier (restricts a command to the observations that satisfy a condition)
Syntax: command if x1 == 1, or command if x2 < 5
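The commands above can be tried on the example data set auto.dta that ships with STATA. The new variable names (lnprice, wl, Dforeign, Drep) are chosen for illustration only and have nothing to do with the seminar data.

```stata
sysuse auto, clear

* Natural log of a variable
gen lnprice = log(price)

* Product of two variables
gen wl = weight*length

* Dummy variable in two steps: all zeros first, then ones where the condition holds
gen Dforeign = 0
replace Dforeign = 1 if foreign == 1

* The if qualifier restricts any command to a subset of observations
sum price if foreign == 1
sum price if mpg < 20

* Shortcut: one dummy per category of rep78 (Drep1, Drep2, ...)
tabulate rep78, gen(Drep)
```

Note the double equal sign (==) for comparisons inside if, and the single equal sign (=) for assignments in gen and replace.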
Data management (III)
Other useful commands
Syntax: rename oldvarname newvarname (renames a variable)
Syntax: drop var (deletes a variable from the dataset)
Syntax: sort year (sorts the dataset by a variable, e.g. a time index)
Numeric and string variables: all variables are either numeric (e.g. 512) or string (e.g. "alpha")
How to deal with string variables?
Syntax: replace Ded1 = 1 if ed == "novocational"
I.e. use "novocational" (in double quotes) instead of a numeric value
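A small sketch of these commands on made-up data: the three observations and the names wage, dailywage and ed_num are hypothetical. The encode command at the end is one possible way to turn a string variable into a numeric one; it is not mentioned on the slide.

```stata
* Made-up data with a string education variable
clear
input str15 ed wage
"novocational" 60
"vocational"   85
"university"   140
end

* Dummy based on a string condition: the value goes in double quotes
gen Ded1 = 0
replace Ded1 = 1 if ed == "novocational"

* Rename, sort and drop
rename wage dailywage
sort dailywage
drop Ded1

* Optional: numeric codes (with value labels) for the string variable
encode ed, gen(ed_num)
list
```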
Data description
Browse your data
Syntax: list year var1 var2 var3 ... varN (variable names are separated by blanks, not commas)
Produce summary statistics: observations, mean, standard deviation, minimum, maximum
Syntax: sum year var1 var2 var3 ... varN
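For illustration, the same commands on the shipped example data set auto.dta. The in 1/5 range and the detail option are additions that are not on the slide.

```stata
sysuse auto, clear

* List selected variables, here only for the first five observations
list make price mpg in 1/5

* Observations, mean, standard deviation, minimum and maximum
summarize price mpg weight

* More detail (percentiles, skewness, kurtosis) for a single variable
summarize price, detail
```

sum is simply the abbreviation of summarize.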
Graphics
Line Graphs
Syntax: graph twoway line var1 var2 year
Bars
Syntax: graph twoway bar var1 year
Scatter Plots
Syntax: graph twoway scatter var1 var2 year
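The three graph types can be tried on uslifeexp.dta, a small time series data set that ships with STATA (US life expectancy by year). The name() option is an addition to keep all three graphs open at once; the graph names g1 to g3 are arbitrary.

```stata
sysuse uslifeexp, clear

* Line graph: two series plotted against the year
graph twoway line le_male le_female year, name(g1, replace)

* Bar graph: one series against the year
graph twoway bar le year, name(g2, replace)

* Scatter plot: two series against the year
graph twoway scatter le_male le_female year, name(g3, replace)
```

In all twoway graphs the last variable in the list is put on the horizontal axis.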
Simple regressions
Simple ordinary least-squares estimation:
Syntax: regress var1 var2 var3 ... varX (var1 is the dependent variable)
Least-squares dummy variable (LSDV) or fixed effects estimation
Syntax: regress var1 var2 var3 ... varX D1 D2 D3 ... DN
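A sketch of both regressions on the shipped example data set auto.dta. The dummies Drep1 to Drep5 are created here for illustration; the last command assumes STATA 11 or later, where factor variables (i.) are available.

```stata
sysuse auto, clear

* Simple OLS: price is the dependent variable
regress price mpg weight

* LSDV: create one dummy per category of rep78 and leave one out as reference
tabulate rep78, gen(Drep)
regress price mpg weight Drep2 Drep3 Drep4 Drep5

* Same model with factor-variable notation (assumes STATA 11 or later)
regress price mpg weight i.rep78
```

One dummy has to be left out of the regression, otherwise the dummies are perfectly collinear with the constant.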
Pooled or panel regressions
Organise your data set as a panel data set:
Syntax: tsset group timevar
Group variable: Index for each group in data set
Time variable: Index for each time period in data set (e.g. year)
Fixed-effects regression:
Syntax: xtreg var1 var2 var3 ... varX, fe
Random-effects regression (default):
Syntax: xtreg var1 var2 var3 ... varX, re
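The panel commands can be sketched on a simulated data set with 10 groups and 20 years. All variable names and the data generating process (true slope of 2, group effect alpha correlated with x) are assumptions made for this illustration.

```stata
* Simulated panel: 10 groups observed over 20 years (made-up data)
clear
set seed 2010
set obs 200
gen group = ceil(_n/20)
bysort group: gen year = 1984 + _n

* Group-specific effect that is correlated with the regressor x
gen alpha = 0.5*group
gen x = alpha + rnormal()
gen y = 1 + 2*x + alpha + rnormal()

* Declare the panel structure: group index first, time index second
tsset group year

* Fixed-effects and random-effects estimates of the slope on x
xtreg y x, fe
xtreg y x, re
```

Because alpha is correlated with x in this simulation, the fixed-effects estimate should be close to 2, while the random-effects estimate can deviate from it.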
Recall the Borjas (QJE 2003) regression
Definition of labour supply shock:
$$m_{ijt} = \frac{M_{ijt}}{M_{ijt} + N_{ijt}} \qquad (1)$$
Estimation equation:
$$y_{ijt} = \theta m_{ijt} + \delta_i + x_j + \pi_t + (\delta_i \times x_j) + (\delta_i \times \pi_t) + (x_j \times \pi_t) + e_{ijt} \qquad (2)$$
$y_{ijt}$ = wage or unemployment rate in logarithms
$M_{ijt}$ = immigrants; $N_{ijt}$ = natives
$\delta_i$ = vector of fixed effects for each education group
$x_j$ = vector of fixed effects for each experience group
$\pi_t$ = vector of fixed time effects
$e_{ijt}$ = error term
$i = 1 \ldots 4$ = index of education group; $j = 1 \ldots 8$ = index of experience group
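Equations (1) and (2) could be set up in STATA roughly as follows. This is a sketch on simulated cell-level data (4 education groups, 8 experience groups, 25 years), not on the IABS. All variable names and the true value of -0.5 for the coefficient on m are assumptions; the factor-variable notation (i. and #) assumes STATA 11 or later.

```stata
* Simulated education x experience x year cells (made-up data, NOT the IABS)
clear
set seed 12345
set obs 800
gen educ  = mod(_n - 1, 4) + 1
gen exper = mod(floor((_n - 1)/4), 8) + 1
gen year  = 1980 + floor((_n - 1)/32)

* Immigrants M and natives N per cell, labour supply shock m as in equation (1)
gen M = 100 + 900*runiform()
gen N = 5000 + 5000*runiform()
gen m = M/(M + N)

* Simulated log wage with an assumed effect of -0.5 of m
gen y = 4 - 0.5*m + 0.1*educ + 0.02*exper + rnormal(0, 0.05)

* Equation (2): fixed effects for education, experience and time
* plus all two-way interactions
regress y m i.educ i.exper i.year i.educ#i.exper i.educ#i.year i.exper#i.year
```

The coefficient reported for m is the estimate of θ. STATA drops redundant dummies automatically, so the long list of omitted categories in the output is expected.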