
The Determinants of Educational Attainment: Modeling and Estimating the Human Capital Model and...

Author(s): Kathryn Wilson. Source: Southern Economic Journal, Vol. 67, No. 3 (Jan., 2001), pp. 518-551. Published by: Southern Economic Association. Stable URL: http://www.jstor.org/stable/1061450

Southern Economic Journal 2001, 67(3), 518-551

The Determinants of Educational Attainment: Modeling and Estimating the Human Capital Model and Education Production Functions Kathryn Wilson*

This paper develops and estimates a theoretical model of an individual's high school graduation choice. The model incorporates the idea of a utility-maximizing youth responding to the economic incentives associated with incremental education, as posited by the human capital literature. However, it also allows for family, neighborhood, and school characteristics to affect the process of being educated, as posited by the education production function literature. Estimation of the model, using the Panel Study of Income Dynamics (PSID) supplemented with neighborhood and school data, indicates that students do respond to economic incentives in making education choices; however, most of the effects of background characteristics operate through the education process rather than by affecting returns to schooling.

1. Introduction

Despite the important role of education in our society, there are still many unknowns about the process of educational attainment and an individual's decision regarding how much education to obtain. This study develops and estimates a theoretical model of an individual's high school graduation decision within the framework of utility maximization. The model posits that an individual makes this education decision in response to the expected economic returns associated with the incremental human capital accumulation and allows background characteristics and circumstances to affect both the returns to education and the schooling process. In addition, the study broadens the scope of past research on educational attainment by examining both neighborhood and school characteristics in the same study.

The model and its estimation address three questions that are informative and also have policy implications: Do youths respond to economic incentives?[1] Is youth behavior better explained by the human capital model or the education production function model? Does school spending affect either the education level of students directly or the future income students will receive? Answering these questions not only gives us insights into the decision-making process but also suggests what policy instruments may be most effective at increasing both educational attainment and the income returns to this education.

* Department of Economics, Kent State University, Kent, OH 44242, USA; E-mail [email protected]. I would like to thank Robert Haveman, Barbara Wolfe, and James Walker for their guidance on this paper. In addition, I would like to thank John Pepper, Stacy Dickert-Conlin, David Figlio, Jeremy Arkes, Elaine Peterson, Donald Williams, CSWEP CCOFFE participants, numerous seminar participants, and two anonymous referees for their useful comments and suggestions. Finally, I am grateful to the National Center for Education Statistics for their assistance in obtaining the school data.

Received October 1998; accepted April 2000.

1 Haveman, Wolfe, and Wilson (1998) also address the question of students' responses to economic incentives and find that students do respond to the incentives. However, their study is not able to address issues of the effect of school quality.

Historically, there have been three rather different ways of conceptualizing educational attainment in economics. The human capital literature models individuals as choosing to acquire human capital in response to the expected returns to education. In this view, education is an investment good, and the individual chooses the level of education consistent with these returns. Studies that focus on the empirical effect of school characteristics on educational attainment model education production functions in which the output (education) is a function of the inputs (school and family characteristics); changing an input will result in a change in the output. Therefore, educational attainment in these models is determined by the inputs and the technology rather than being a choice of the individual. The final method, used by most empirical studies of family and neighborhood characteristics, is to estimate reduced-form equations of educational attainment. These equations make no attempt to identify the mechanism through which the independent variables affect educational attainment; they simply examine the relationships that are present.

This study combines both the idea of family, school, and neighborhood characteristics being inputs in the production of educational attainment and that of an individual who makes decisions in response to the expected returns to high school graduation. The structural model recognizes that youths make decisions about educational attainment on the basis of their perceptions of the expected returns to education. By explicitly modeling the expected returns to high school graduation for each individual, I am able to separate the nonincome effects of family, neighborhood, and school characteristics on educational attainment (the education production process) from the linkages through which those characteristics influence the youth's expected returns to education and hence attainment.[2]

Section 2 of this paper provides a more thorough discussion of the previous literature on educational attainment. Section 3 contains the structural model where utility-maximizing youths choose an education level on the basis of expected future income and the nonpecuniary benefits and costs of education. Following a description of the data in section 4, section 5 contains a discussion of methodology and issues involved in estimating the structural model, and the results of the estimation are reported in section 6. Section 7 discusses the policy implications of the paper and the conclusions.

2. Literature Review of Educational Attainment

A review of the primary findings of previous literature provides a context for understanding the contributions of this model and data. Given the vastness of the literature on educational attainment and returns to education, the review focuses on providing a general overview of the strategies employed, the findings, and a discussion of the shortcomings of these studies. The literature review is divided into those studies that use the human capital theory, the education production function, and reduced-form models.

2 Throughout the paper, I use the term "income effect" to refer to an effect working through returns to education and the term "production effect" for an effect working through the education production process. This use of the term "income effect" is different from its common meaning in the consumer demand literature referring to normal and inferior goods.

Human Capital Literature

Becker's (1964) famous work Human Capital develops a model of the individual's investment in education. An individual is modeled as choosing between current work or forgoing this work to acquire human capital that will have higher rates of return in the future. The key factor in the model is the notion that education is an investment of time and forgone earnings for future pay. This was the first work to treat education as an investment good. An empirical body of literature has focused on the investment model of the decision to pursue education by using time-series data.[3] These types of studies, summarized in Freeman (1986), generally find that the estimated elasticity of supply of higher-educated persons to changes in salaries falls in the range of one to two.

Willis and Rosen (1979) is the first study to examine education as an investment good at an individual level. They present a structural model of the demand for postsecondary schooling in which individuals make their education decision in response to the incomes associated with the two education levels. They estimate that expected lifetime earnings gains associated with incremental schooling influence the decision to attend college; they find an estimated elasticity of enrollment to earnings of about two.
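As a rough illustration of what an elasticity of about two implies, the following sketch converts a percentage change in the expected earnings gain into a predicted percentage change in enrollment. The function names and the 5 percent input are hypothetical and are not taken from Willis and Rosen (1979); the constant-elasticity form is an assumption made only for the arithmetic.

```python
def point_response(elasticity: float, pct_change: float) -> float:
    """Linear approximation: % change in enrollment for a small % change in earnings."""
    return elasticity * pct_change


def constant_elasticity_response(elasticity: float, pct_change: float) -> float:
    """% change in enrollment if enrollment is proportional to earnings ** elasticity."""
    return ((1.0 + pct_change / 100.0) ** elasticity - 1.0) * 100.0


if __name__ == "__main__":
    # Illustrative inputs only: elasticity of 2 and a 5% rise in the earnings gain.
    print(point_response(2.0, 5.0))
    print(round(constant_elasticity_response(2.0, 5.0), 2))
```

Under these assumptions, a 5 percent rise in the earnings gain from college corresponds to an increase in enrollment of roughly 10 percent.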

These studies assume that returns to human capital affect the amount of education an individual will choose to receive, but they ignore other factors that may directly affect the education process. For example, if an individual is attending a very poor quality school, this may affect her expected returns to education and thus how much education she receives. However, if attending a poor-quality school reduces the utility she receives from the process of going to school, perhaps because the school is unsafe or other students are unruly, this direct utility from school is not allowed to influence her decision conditional on returns to schooling. The importance of this shortcoming will be more evident when contrasted with the discussion of the education production function literature, which takes the opposite position that her schooling environment will directly affect how much education she receives but which does not consider the returns to education.

A related vein of literature on the returns to education, particularly relevant for this study, examines the relationship between education, earnings, and measures of school quality. Card and Krueger (1992) use state-level data to estimate the effect of school quality on the rate of return to education for men born between 1920 and 1949. They find that men educated in states with lower student/teacher ratios, longer average term length, and higher-paid teachers have higher rates of return to education but that men educated in states where parents are more educated or have higher income do not have statistically significantly different rates of return to education. Altonji and Dunn (1995) use individual-level data in estimating the effect of parental education and school quality on returns to education. They find mixed evidence regarding the effect of parental education, but in most of the specifications, including their preferred specification, having a more educated parent is associated with a higher rate of return to education. School expenditures per student, however, while increasing the level of income at all education levels, do not have a positive effect on the rate of return to education.

3 Freeman (1975) uses time-series data for the United States to examine the relationship between market conditions and the supply of new entrants to physics. Matilla (1982) also uses U.S. time-series data, covering the years 1956 to 1979, in studying the hypothesis that the expected rate of return to school is an important variable for explaining variation in school enrollment rates.

Education Production Function Literature

Studies that focus on the effects of school characteristics on educational attainment have done so primarily within the context of an education production function. School quality and family characteristics are viewed as inputs in the production of children's educational attainment. The measurement of the school inputs and the outputs of interest vary from study to study, but most measure inputs as expenditures per student, student/teacher ratios, teacher salaries, and class size. For about two-thirds of the studies, the output of educational attainment is measured by test scores. The other one-third focus on quantity of schooling achieved, such as high school graduation, college attendance, or years of education.

The best-known and most cited article in the school quality literature is Hanushek's (1986) article "The Economics of Schooling: Production and Efficiency in Public Schools."[4] In this article, Hanushek reviews 147 regressions, taken from 33 separate published articles, of the effects of school characteristics on educational attainment. He compares the sign and significance levels of the estimated effects of school inputs. On the basis of a lack of consistent findings, he concludes, "there appears to be no strong or systematic relationship between school expenditures and student performance" (p. 1162).

Hedges, Greenwald, and Laine (1994) reexamine the same studies reviewed by Hanushek (1986) and reach a much different conclusion: "Yet the data upon which this conclusion (that money does not matter) is based support exactly the opposite conclusion and demonstrate that expenditures are positively related to school outcomes" (p. 5). Whereas Hanushek used a "vote counting" method of examining the previous studies, in which the result of each regression receives one vote and the only important information is sign and significance level, Hedges, Greenwald, and Laine (1994) use a more sophisticated synthesis method that accounts for the size of the estimate, the expected correlation in the error terms from regressions estimated over the same sample but with slightly different specifications, and the potential influence of outliers. In addition, they find that the magnitudes of the effects are large enough to be of "practical importance" economically.
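The difference between vote counting and a synthesis that uses effect sizes can be sketched with a small example. The estimates and standard errors below are hypothetical and are not drawn from any of the reviewed studies. The pooling step is a standard fixed-effect inverse-variance average that assumes independent estimates; it stands in for, and is not claimed to be, the procedure used by Hedges, Greenwald, and Laine (1994), who also adjust for correlated errors and outliers.

```python
from math import sqrt

# Hypothetical (estimate, standard error) pairs for the effect of spending.
# These numbers are invented for illustration only.
results = [(0.10, 0.08), (0.12, 0.09), (0.08, 0.07), (0.15, 0.10), (-0.02, 0.09)]


def vote_count(results, z_crit=1.96):
    """Tally regressions by sign and significance, ignoring magnitudes."""
    tally = {"pos_sig": 0, "neg_sig": 0, "insignificant": 0}
    for estimate, se in results:
        z = estimate / se
        if z > z_crit:
            tally["pos_sig"] += 1
        elif z < -z_crit:
            tally["neg_sig"] += 1
        else:
            tally["insignificant"] += 1
    return tally


def inverse_variance_pool(results):
    """Fixed-effect pooled estimate and its standard error (independence assumed)."""
    weights = [1.0 / se ** 2 for _, se in results]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, results)) / total
    return pooled, sqrt(1.0 / total)


if __name__ == "__main__":
    print(vote_count(results))
    pooled, pooled_se = inverse_variance_pool(results)
    print(round(pooled, 3), round(pooled_se, 3), round(pooled / pooled_se, 2))
```

In this constructed example no single regression is significant, so vote counting reports no effect, while the pooled estimate is positive and more than 1.96 standard errors from zero. The example shows only how the two methods can disagree on the same inputs.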

The production function framework employed by the studies discussed in Hanushek (1986) and Hedges, Greenwald, and Laine (1994) identifies broadly how the school characteristics affect educational attainment: They are inputs in the process of being schooled. These studies view the relationship between school (and family) characteristics and educational attainment as deterministic. If an input changes, then the educational attainment of the individual will change by an amount determined by the technology and the level of the other inputs. The individual is not viewed as a decision maker who is choosing level of education, nor do the returns to education play an explicit role within this framework. However, the human capital accumulation literature gives reason to believe that the individual is a decision maker choosing level of schooling and that the returns to education are an important factor in that decision.

Reduced-Form Estimation Literature

The majority of studies that examine the effect of family and neighborhood characteristics on educational attainment estimate reduced-form models with some measure of educational attainment as the dependent variable and a variety of family and neighborhood characteristics as the independent variables. Most of these studies do not have any information about the school the individual attends. The causality of the family and neighborhood effects on educational attainment is based on the fact that the characteristics are temporally prior to the measured outcome; however, the exact causal mechanism is not identified.

4 See also Hanushek (1991, 1994).

Haveman and Wolfe (1995) and Jencks and Mayer (1990) provide summaries of the results of these studies. The strong effect of parental education on children's educational success, both statistically and economically, stands out in this research (Hill and Duncan 1987; Haveman, Wolfe, and Spaulding 1991). The economic status of the family, measured by total family income or income to needs, also tends to be positively associated with educational attainment, but this result is more robust for years of education than for high school graduation, and the magnitudes of the effects are not very large (Hill and Duncan 1987; Behrman et al. 1992). Finally, growing up in an intact family is associated with substantially higher educational attainment (Astone and McLanahan 1991; Manski et al. 1992).

Neighborhood characteristics have received much less attention than family characteristics until recently, and there is less consensus on the effects of neighborhood and peer characteristics. Most studies focus on the socioeconomic status indicators of neighborhood quality, such as median income and percentage of people in high-status occupations; others focus on the composition of the neighborhood (e.g., the percentage white).[5] In these studies, the magnitude of the neighborhood effect tends to be small relative to the effect of parental education, but the measured effect of neighborhood is often statistically significant, even when controlling for an extensive number of family characteristics.

One problem with studies that examine the effect of school quality and neighborhood on educational attainment is the potential endogeneity of the school and neighborhood variables. Because parents choose where to live, unobserved parental attributes that are correlated with the school or neighborhood choice may lead to a biased estimate of the effect of neighborhood and school. A number of studies have attempted to address this problem, yet no clear consensus exists as to its severity.[6] The current study is subject to the same criticism; however, the extent of bias may be lessened in this paper because of the inclusion of a variety of family controls that are generally not available in empirical studies.
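The endogeneity concern can be illustrated with a small simulation. Everything in it is an assumption chosen for illustration (the data-generating process, the coefficient values, and the function names); it is not the paper's data or estimator. An unobserved parental attribute raises both neighborhood quality and attainment, so a regression of attainment on neighborhood quality alone overstates the neighborhood effect.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_naive_slope(n=20000, true_effect=0.5, seed=0):
    """Simulate attainment with an omitted parental attribute; return the naive slope."""
    rng = random.Random(seed)
    neighborhood, attainment = [], []
    for _ in range(n):
        parent = rng.gauss(0.0, 1.0)            # unobserved parental attribute
        quality = parent + rng.gauss(0.0, 1.0)  # parents choose the neighborhood
        outcome = true_effect * quality + 1.0 * parent + rng.gauss(0.0, 1.0)
        neighborhood.append(quality)
        attainment.append(outcome)
    return ols_slope(neighborhood, attainment)


if __name__ == "__main__":
    # The assumed true neighborhood effect is 0.5; the naive slope is about twice that.
    print(round(simulate_naive_slope(), 2))
```

In this setup the naive slope converges to 0.5 + cov(parent, quality) / var(quality) = 0.5 + 1/2 = 1.0. Adding family controls that proxy for the parental attribute would shrink the gap, which is the logic behind the claim that rich family controls may lessen the bias.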

3. The Model

This section presents a structural model where family background, neighborhood, and school quality are allowed to affect both the returns to education that an individual can expect to receive and the utility costs and benefits of acquiring education. In this framework, individuals make their education decisions on the basis of these expected returns and costs. Individuals are taken to be utility maximizers who decide to graduate from high school if the expected utility from graduating is greater than the expected utility from not graduating, depending on their perceptions of the expected income returns to, and the utility value of, alternative levels of schooling. They form their expectations of the returns to schooling by observing older individuals with similar background and family characteristics who have already made their education decisions and are now working. The younger individuals assume that they will receive the same returns to education that their older counterparts received. This model identifies the effect of family, neighborhood, and school quality measures on educational attainment via the returns to education as well as directly through the education production process.

5 Datcher (1982) was the first of the recent wave of studies of the effects of neighborhood characteristics on children's attainments. Using ZIP code-level data from the Panel Study of Income Dynamics (PSID), she finds that neighborhood racial composition is not significantly associated with educational attainment. Corcoran et al. (1992) expand on Datcher's analysis with the PSID by examining four different ZIP code characteristics; they find that living in a neighborhood with more mother-only families and more people on public assistance reduces educational attainment but that median income and male unemployment do not have significant effects. Finally, Brooks-Gunn et al. (1993) examine the role of high- and low-income families in the neighborhood. They find that the fraction of families with high and low incomes, as well as the percentage in managerial or professional occupations, affects educational attainment.

6 Evans, Oates, and Schwab (1992) find that peer group effects are not present when simultaneous equations are estimated that control for endogeneity. Plotnick and Hoffman (1999) and Aaronson (1998) each use fixed-effects models that rely on neighborhood variation between siblings, with Plotnick and Hoffman finding no effect once parental unobservables are controlled but Aaronson finding that there is an effect.

In order to better characterize the economic decisions that the individual is making, I first present the model as continuous in education. However, I estimate the discrete model, where an individual chooses between graduating from high school or dropping out, which is described next.

Continuous Model

The Individual's Utility Function

Consider an individual utility maximizer who has the following utility function, which is separable in an education consumption good and all other consumption:

$$U_i = E_i + \beta_c \ln C_i + \varepsilon_{ei}, \qquad (1)$$

where $U_i$ = utility of individual $i$, $C_i$ = lifetime discounted stream of consumption, $E_i$ = utility from an education consumption good, $\beta_c$ = weight of consumption in utility, and $\varepsilon_{ei}$ = random utility term (conditional on education).

The first element in the utility function, the education consumption good $E_i$, is the utility an individual receives from schooling. The individual enjoys some aspects of schooling, such as perhaps the social contacts and the excitement of learning. These are utility benefits the individual receives from schooling. Other aspects of schooling, however, the individual does not like. These aspects may include the authoritarian structure of the classroom or the amount of time spent doing homework to meet course requirements; these are the utility costs of schooling. The consumption good $E_i$ is the net effect of these utility benefits and utility costs of schooling.

The consumption good Ei, the utility cost and benefits of schooling, is produced by the following production function:

$$E_i = g(e_i, x_i), \qquad (2)$$

where $g(\cdot)$ = the production function, $e_i$ = schooling of individual $i$, and $x_i$ = variables that affect the utility benefits and costs of being schooled. The factors $x_i$ are inputs in the process of being schooled. The consumption good $E_i$, the net utility derived from schooling $e_i$, is the outcome that is being produced. The technology that transforms these inputs $x_i$ into the consumption good $E_i$ is the function $g(\cdot)$.

The generalization of $g(\cdot)$ to be a function of both schooling and background characteristics allows for individuals to derive different amounts of utility from being schooled, depending on their family background, neighborhood, and school quality. Variables included in $x_i$ are family, neighborhood, and school characteristics that may influence the disutility of being schooled or the nonincome utility benefit the individual experiences as a result of the additional schooling.[7]

The production function $g(e_i, x_i)$ is consistent with the idea of the traditional education production function literature. It recognizes that family, neighborhood, and school characteristics all combine to influence the amount of education an individual will receive. However, contrary to the traditional education production function literature, the educational attainment of the individual is not deterministic in this model. Rather, these factors influence the consumption good $E_i$, but the individual chooses the level of schooling so as to maximize utility given the production function $g(e_i, x_i)$.

The second element in the individual's utility function, $C_i$, includes all other consumption goods. The coefficient $\beta_c$ indicates the relative weight that consumption receives in the individual's utility function. The weight of the education consumption good $E_i$ has been normalized to one.

The Budget Constraints

The individual maximizes utility subject to the following budget constraints, showing the relationships between schooling, income, and consumption:

$$Y_i = \alpha(Q_i)\, e_i + \lambda_i \qquad (3)$$

$$C_i \le Y_i, \qquad (4)$$

where $Y_i$ = lifetime discounted income stream, $\alpha(Q_i)$ = returns to schooling, $Q_i$ = variables that affect returns to schooling, and $\lambda_i$ = random component of income. Equation 3 gives the relationship between schooling and future income. The level of income for an individual is a function of the amount of schooling the individual receives, $e_i$, and his or her background characteristics, $Q_i$. The vector $Q_i$ consists of any factors that may influence the amount of income an individual can expect to receive. It includes family characteristics such as parental education, family income, and family structure during childhood as well as neighborhood and school factors that may affect the income an individual will receive.

The return to schooling an individual receives is the change in income associated with a marginal change in schooling.[8] Allowing the return to schooling ($\alpha$) to vary with $Q_i$ allows family, neighborhood, and school quality to affect not only the level of income but also the returns to schooling. This simply means that the amount of income derived from an additional year of schooling will not be the same for everyone. The vector $Q_i$ affects not only the intercept of the education/income relationship but also the slope. This is consistent with the findings of Card and Krueger (1992) that individuals who attend a higher-quality school have higher returns to education and of Altonji and Dunn (1995), who find that parental education affects the return to education.

7 For example, an individual from a high-income family may have higher educational utility $E_i$ from any given schooling level $e_i$ because of the availability of a computer and other resource materials at home that make it easier to acquire $e_i$ years of schooling. Conversely, if an individual attends school in a crowded, unsafe environment (where school environment is an element in $x_i$), then the utility ($E_i$) derived from schooling ($e_i$) will be lower than if the individual were in a more pleasant schooling environment.

8 It is assumed that returns to schooling are concave: The first derivative of income with respect to schooling is positive, but the second derivative is negative. While income is higher with more education, for each additional year of schooling, the individual is forgoing income that could be made if the individual were not in school. When the forgone income of a year of schooling is greater than the increased future earnings from schooling, the returns to schooling are negative.

The second budget constraint, Equation 4, states that the lifetime discounted stream of consumption cannot exceed the individual's lifetime discounted income stream. Perfect credit markets are assumed with an interest rate equal to the individual discount rate. Schooling does not directly enter into the budget constraint of consumption because it is assumed that the price of schooling for the individual equals zero.[9] This assumption clearly is much more accurate for elementary and secondary schooling than for postsecondary schooling in the United States; however, since the educational outcome of interest in this study is high school graduation, for ease of notation I will maintain the assumption of zero price for all levels of schooling. The implicit assumption is also being made that each individual is so small relative to the local school system that individuals assume that local education costs, and thus local tax rates, are exogenous to their decision of whether to attend school. The price of other consumption goods is normalized to one.
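To see how the pieces of the continuous model fit together, the following is a sketch of the first-order condition implied by Equations 1 through 4. It rests on assumptions that go beyond the text: the consumption constraint binds ($C_i = Y_i$), $g$ is differentiable in $e_i$, $\alpha(Q_i)$ is treated as locally constant in $e_i$, and the random utility term is held fixed. The symbol $\lambda_i$ for the random component of income is a notational assumption.

```latex
\begin{align*}
\max_{e_i}\; U_i &= g(e_i, x_i) + \beta_c \ln\bigl(\alpha(Q_i)\, e_i + \lambda_i\bigr) + \varepsilon_{ei}, \\
\frac{\partial U_i}{\partial e_i} &= \frac{\partial g(e_i, x_i)}{\partial e_i}
  + \beta_c \, \frac{\alpha(Q_i)}{\alpha(Q_i)\, e_i + \lambda_i} = 0 .
\end{align*}
```

Read this way, schooling is chosen where the marginal net utility of being schooled offsets the weighted proportional income gain from more schooling. The characteristics in $x_i$ shift the first term (the production effect), while those in $Q_i$ shift the second (the income effect).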

Expectations Formation

Because youths do not know future income with certainty, they form expectations about that income. There is no clearly superior or consensus way to model expectations.[10] Freeman (1971) assumes myopic expectation formation where individuals form their expectations by examining the outcomes realized by an older cohort. Willis and Rosen (1979) use data on the individual's actual outcome and model individuals as having rational expectations. This assumes that the individual knows the actual income-generating process and forecasts what his or her actual income will be. Manski and Wise (1983) take a different approach and assume that an individual does not know either the outcome measures attained by an earlier cohort or the income-generating process. Rather, individuals assume that their outcome will be a function of the difference between their test scores and the average test scores in the college.

In this model, I adopt the framework of Manski (1993a, b) and Freeman (1971) that individuals form their expectations by observing the incomes realized by individuals in an older cohort who are similar to the individual in many observable traits.11 I assume that they form these expectations by observing the incomes of siblings and friends, who are in their late teens to early 30s, and who have similar backgrounds to their own. These youths assume that they can expect the same returns to schooling that those older individuals have experienced.12

9 While schooling does not enter the budget constraint, the opportunity cost of schooling is included in the model. As discussed previously, the forgone earnings associated with an additional year of schooling are captured in the net present value of income.

10 Manski (1993a) provides an excellent discussion of the assumptions made implicitly and explicitly about expectation formation as well as the problems with misspecifying expectations. He is fairly critical of the current empirical literature for lack of attention to the expectation formation process. He states, "The recent literature shows little concern with expectations formation. The prevailing sentiment seems to be complacency. Either researchers are confident that their expectations assumptions are correct, or they believe that misspecifying expectations is innocuous" (p. 44).

11 In the sociology literature, this process of forming expectations is often defined as "role models" where individuals look to role models in the society in forming their expectations. Indeed, most of the literature in sociology assumes this expectation formation process. Manski summarizes it this way: "The central social psychological idea is that expectation formation is a social phenomenon, each person learning about his prospects by observing the experiences of others" (Manski 1993a, p. 47).

12 Manski (1993a) shows the parallel between this method and how econometricians forecast: "I instead assume that youth form their expectations in the manner of practicing econometricians: youth observe the incomes realized by members of the preceding generation who chose schooling, and they make inferences from these observations" (p. 49).


This expectations process is captured by the following equation:

expectations: $E[Y_i \mid Q_j,\ e_i = e_j] = \alpha(Q_j)\,e_j$ for $Q_i = Q_j$, (5)

which states that if individual i chooses to have the same amount of schooling as individual j (who is like individual i in terms of characteristics that affect income, $Q$), then i's expected income will be what individual j's income is, given that schooling is transformed into income by $\alpha(Q_j)$. The vector of variables $Q$ is the same vector defined in Equation 3. It includes those family, neighborhood, and school variables that affect the individual-specific return to education.
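The expectations rule can be illustrated with a small sketch. It is not the paper's estimator: it replaces the parametric return function with a simple cell mean over comparable older individuals, and the record layout (`q`, `schooling`, `income`) and the sample values are hypothetical.

```python
from statistics import mean

def expected_income(reference, q, schooling):
    """Mean realized income among older-cohort members who share the
    observable traits q and chose the given schooling level.
    Assumption: exact matching on q stands in for 'similar' individuals."""
    matches = [r["income"] for r in reference
               if r["q"] == q and r["schooling"] == schooling]
    if not matches:
        raise ValueError("no comparable reference individual")
    return mean(matches)

# Hypothetical older cohort observed by the youth.
reference = [
    {"q": "A", "schooling": 12, "income": 30000.0},
    {"q": "A", "schooling": 12, "income": 34000.0},
    {"q": "A", "schooling": 10, "income": 22000.0},
    {"q": "B", "schooling": 12, "income": 40000.0},
]

print(expected_income(reference, "A", 12))
```

A youth with traits "A" would compare the expectation at 12 years of schooling with the expectation at 10 years when deciding how much schooling to obtain.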

Maximization Solution

Substituting Equations 2 through 5 into Equation 1 gives the following Lagrangian for maximizing expected utility:

Lagrangian: $g(e_i, x_i) + E[\ln Y_i]B_c - \lambda[\alpha(Q_i)e_i - Y_i]$. (6)

The individual seeks the level of schooling that will maximize utility, given the constraint that consumption can be no greater than income and that income is a defined function of schooling and individual characteristics.

The solution to this maximization problem is

$\alpha(Q_i) = -\dfrac{U_E\,\partial g(e_i, x_i)/\partial e_i}{U_C}$. (7)

The left-hand side of the equation is the marginal rate of transformation of schooling to income. As schooling changes, the individual's expected income changes by $\alpha(Q_i)$. The right-hand side is the marginal rate of substitution of consumption and the consumption good E for the individual, the relative marginal utility of schooling and marginal utility of consumption. Therefore, Equation 7 indicates that the individual chooses schooling such that the marginal rate of transformation of education for income (returns to schooling) equals the marginal rate of substitution of education for consumption. Interpreted another way, the individual continues to get schooling until the marginal utility benefits equal the marginal utility costs.

Comparative Statics

The role of family, neighborhood, and school variables in educational attainment can be seen by examining the following comparative statics:

$\dfrac{\partial e_i}{\partial \alpha(Q_i)} > 0, \qquad \dfrac{\partial e_i}{\partial\{-[U_E\,\partial g(e_i, x_i)/\partial e_i]\}} > 0$. (8)

Schooling is increasing in $\alpha(Q_i)$ and $-[U_E\,\partial g(e_i, x_i)/\partial e_i]$. Variables in $Q_i$ that increase $\alpha$ will increase educational attainment by increasing the marginal rate of transformation. In other words, variables that increase the returns to education will increase educational attainment. This is defined as the income effect of the variable.

Variables $x_i$ that increase the nonincome benefits of education (increase $-U_E\,\partial g(e_i, x_i)/\partial e_i$) will increase educational attainment by increasing the marginal rate of substitution. For example, variables that make schooling less costly in utility or increase the utility benefits of schooling will increase the amount of schooling an individual receives. This is defined as the production effect of the variable. This is consistent with the production function literature where variables that enter positively into the production function increase schooling and variables that enter negatively reduce schooling.

The Discrete Case

The previous model assumes that schooling is a continuous variable. Now consider the discrete case where the individual chooses either to graduate from high school or to drop out. The individual will graduate if the expected utility of graduating is greater than the expected utility of not graduating, depicted in Equation 9. The utility function is defined exactly as it is in the continuous model, and the production function has been substituted in for the consumption good Ei. (For notational convenience, the subscript i is dropped. Subscript h refers to high school graduate, and d refers to high school nongraduate.)

$E[U_h] = g(e_h, x) + E[\ln C_h]B_c + \varepsilon_h > g(e_d, x) + E[\ln C_d]B_c + \varepsilon_d = E[U_d]$. (9)

To obtain a closed-form solution, I define $g(e, x) \equiv B_e x$ and $\alpha(Q)e \equiv \gamma_e Q$. These are general functional forms that allow for differential effects of x and Q for each level of schooling. While the functional forms are linear in x and Q, the effects of x and Q are not constrained to be linear in schooling (x and Q can have increasing, decreasing, or constant marginal effects as schooling increases).

Substituting the budget constraint into the utility function, the individual will graduate if expected utility from graduating is greater than expected utility from not graduating:

$B_h x + E[\ln Y_h]B_c + \varepsilon_h > B_d x + E[\ln Y_d]B_c + \varepsilon_d$, (10)

where $E[\ln Y_h] = \ln(\gamma_h Q)$ and $E[\ln Y_d] = \ln(\gamma_d Q)$. Rearranging the terms, the individual will graduate if

$(B_h - B_d)x + \{E[\ln Y_h] - E[\ln Y_d]\}B_c > \varepsilon_d - \varepsilon_h$. (11)

The probability of the individual choosing to graduate is the probability that utility is higher with graduating:

$\Pr_h = \Pr[\varepsilon < Bx + \{E[\ln Y_h] - E[\ln Y_d]\}B_c]$, (12)

where $\varepsilon = \varepsilon_d - \varepsilon_h$ and $B = B_h - B_d$. Given the dichotomous choice to either graduate or drop out, individuals choose the schooling choice that maximizes utility on the basis of expected returns to education formed by observing the returns to education experienced by others and on the utility costs and benefits of education.
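As a minimal sketch of Equation 12, the graduation probability can be computed once a distribution for the error difference is chosen. The standard normal assumed here matches the probit used later in the paper; the function names and the numeric inputs are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def graduation_probability(b_x, b_c, income_grad, income_drop):
    """Probability that the error difference falls below the index
    B*x + B_c * (ln Y_h - ln Y_d).
    Assumption: the error difference is standard normal (probit).
    b_x is the already-computed scalar B*x for this individual."""
    index = b_x + b_c * (math.log(income_grad) - math.log(income_drop))
    return normal_cdf(index)

# With no nonincome advantage and equal expected incomes, the youth
# is indifferent on average.
print(graduation_probability(0.0, 1.0, 100.0, 100.0))
```

Raising the expected income from graduating relative to dropping out raises the index and therefore the probability, provided the coefficient on the income term is positive.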

The unknown parameters of the model to be estimated are $\gamma_h$, $\gamma_d$, $B_c$, and $B$ ($B_h - B_d$). The $\gamma$s indicate how background characteristics affect income for each level of education. The parameter $B_c$ indicates the importance of the expected returns to education in the individual's education decision. The combination of the $\gamma$s and $B_c$ can be used to recover the income effect of how family, neighborhood, and school quality affect the individual's education decision through changing the returns to education. The parameter $B$ recovers the production effect of family, neighborhood, and school quality in the process of being schooled.

In summary, the continuous model developed in this section is based on the standard human capital model in which a utility-maximizing youth chooses the amount of schooling to receive on the basis of the expected returns to schooling. The individual does not know these returns to schooling with certainty and forms expectations of what they will be by observing the incomes received by other youths, of an older cohort, with characteristics similar to those of the individual. In addition, it is recognized that there are many factors that increase or decrease the utility benefits and costs of additional schooling for a youth. These factors do not determine the amount of schooling an individual will receive, such as in a traditional production function; rather, the individual makes the utility-maximizing choice of how much education to receive given these utility benefits and costs of acquiring schooling and the expected income returns to additional schooling. There are many important potential roles of family, neighborhood, and school quality characteristics in the model. They may change the expected returns to schooling an individual receives or act as an input in the production process by which schooling is transformed into the consumption good education. Both of these factors will affect the amount of schooling the individual chooses to attain. Yet ultimately, the schooling attained is a utility-maximizing choice of the individual.

4. The Data

In this study, I merge data from three sources to provide a more comprehensive picture of the environment an individual experiences during childhood. The three data sets are the Michigan Panel Study of Income Dynamics (PSID), the 1970 and 1980 U.S. Census, and the Common Core of Data; each is described next. The resulting data set provides an extensive set of family, individual, neighborhood, and school information for each youth.

The primary sample for the empirical estimation is 1772 individuals in the PSID who were between the ages of 0 and 6 in 1968. For these individuals, age-indexed family and individual characteristics have been averaged for the ages 6 to 15. The ability to average background information over most of the individual's childhood years, rather than at one particular point in time, reduces measurement bias and presents a more accurate picture of the individual's entire childhood (see Haveman et al. 1996). The outcome variable, a dummy variable indicating whether the individual has graduated from high school, is assigned a value of one if the individual has completed 12 years of schooling or received a general equivalency diploma (GED). In addition to this primary sample, the expectations of future earnings are based on a reference group of older individuals (ages 8 to 12 in 1968) for which background characteristics (including neighborhood and school quality) are observed for the ages 12 to 15 and income is observed through age 32.

The neighborhood data come from the 1970 and 1980 census and are matched to the PSID sample by a special Geocode data set provided by the PSID. For every year, the location where the individual lived is identified, and neighborhood characteristics for that location are assigned to the individual.13 A neighborhood is defined as a census tract, which typically contains between 4000 and 6000 individuals and has been designed on the recommendation of local community leaders to approximate a true neighborhood. The neighborhood variables have been chosen to capture the race and family composition as well as economic status of the neighborhood.

13 While the neighborhood data are merged to where the individual lives each year, the characteristics of the neighborhood are taken from the 1970 and 1980 census. For years prior to 1970 and after 1980, the characteristics of the neighborhood in 1970 and 1980, respectively, are used. For years 1970 through 1980, a linear combination of the neighborhood's 1970 and 1980 characteristics is used.
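The assignment rule in footnote 13 can be sketched as follows. The footnote says only that a linear combination of the 1970 and 1980 values is used for the years in between; weighting by the number of years elapsed since 1970 is an assumption made here, and the function name is hypothetical.

```python
def tract_value(year, value_1970, value_1980):
    """Neighborhood characteristic assigned to a given calendar year.
    Before 1970 the 1970 value is used; after 1980 the 1980 value.
    Assumption: between the two censuses the weights move linearly
    with the year."""
    if year <= 1970:
        return value_1970
    if year >= 1980:
        return value_1980
    weight = (year - 1970) / 10.0
    return (1.0 - weight) * value_1970 + weight * value_1980

# A tract that is 10% mother-only families in 1970 and 20% in 1980.
print(tract_value(1975, 10.0, 20.0))
```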


The education data are taken from two surveys of the Common Core of Data (CCD) compiled by the National Center for Education Statistics. The CCD is an annual, national statistical database of all public elementary and secondary schools and school agencies or districts in the United States. The first survey I use contains data on student/teacher ratios at the school level for over 90,000 schools for the years 1978, 1980, and 1982. In order to reduce measurement error, I assign each school the mean of the three years of data. The second survey, at the school agency or district level, contains detailed revenue and expenditure data for 1980.14 I merge the school and school district data to the PSID sample by the five-digit ZIP code contained in the Geocode file. The algorithm used for the merge appears in Appendix A.15 More than 75% of the sample of 1772 youth have school data matched to them using ZIP code information, while the rest are assigned the mean value of the high schools in the school district(s) corresponding to the individual's ZIP code.

Table 1 contains the unweighted and weighted means and standard deviations for the variables.16 All dollar values are in 1993 dollars. The variables labeled "percentage years" are the percentage of years during ages 6 to 15 in which a given event occurred. For example, if an individual lived in a single-parent family for the ages of 6, 7, and 12, then the variable percentage years in a single-parent family would equal 0.3 (3 of 10 possible years). The neighborhood variables are also averaged over the ages 6 to 15; the school variables are averaged over the ages 14 to 16, which should roughly correspond to an individual's high school years.
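The construction of a "percentage years" variable can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that all ten years in the age window are observed for the individual.

```python
def percentage_years(event_ages, first_age=6, last_age=15):
    """Share of years in the age window during which the event occurred.
    Assumption: every year from first_age to last_age is observed."""
    window = range(first_age, last_age + 1)
    hits = sum(1 for age in window if age in event_ages)
    return hits / len(window)

# Single-parent family at ages 6, 7, and 12: 3 of 10 possible years.
print(percentage_years({6, 7, 12}))
```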

Most of the sample, 86%, has graduated from high school; the weighted sample mean, corresponding to the national average, is 89%.17 Table 2 illustrates the means and standard deviations of the sample by high school graduation status. The high school dropout subsample contains more African-American and fewer female individuals than the graduate subsample. The families of the dropout sample have lower-educated parents and a lower income/needs ratio and spend considerably more time in poverty and in single-parent families. The individuals who drop out of high school live in neighborhoods with lower socioeconomic status and attend schools with higher student/teacher ratios and lower expenditures per student.

14 In 1980, my primary sample was between the ages of 12 and 18, which closely corresponds to the time the individuals would have been in high school. While it would be ideal to use data from a variety of years because of the age difference within my sample, the financial data are available only for 1980.

15 The data in the CCD are for public schools. Therefore, if an individual attended a private school, I will misassign the school quality variables of the public school rather than the school actually attended. If individuals attend private schools that have higher expenditures and lower student/teacher ratios than the public school alternative, then the misassignment results in a negative bias on the coefficient estimates of school quality.

16 Unweighted observations are used in the estimations, but weighted observations are used for all simulations.

17 An individual is considered as having graduated from high school if observed as having graduated in any year during the sample period.

Table 1. Means and Standard Deviations

Variable                                         Weighted    Standard    Unweighted  Standard
                                                 Mean        Deviation   Mean        Deviation
High school graduate = 1                         0.890       0.313       0.859       0.348
African-American = 1                             0.153       0.360       0.461       0.499
Female = 1                                       0.515       0.500       0.512       0.500
African-American*female = 1                      0.083       0.276       0.248       0.432
Mother high school graduate = 1                  0.415       0.493       0.345       0.476
Mother attended college = 1                      0.154       0.361       0.106       0.307
Father high school graduate = 1                  0.243       0.429       0.198       0.399
Father attended college = 1                      0.286       0.452       0.173       0.378
Both parents' education not available = 1        0.080       0.271       0.159       0.366
Average income/needs ratio                       3.062       1.944       2.391       1.738
Percentage years income/needs < 1                0.111       0.230       0.223       0.318
Average number of siblings                       1.983       1.346       2.384       1.555
First born = 1                                   0.287       0.452       0.237       0.425
Percentage years in SMSA                         0.671       0.439       0.717       0.424
Percentage years head disabled                   0.116       0.228       0.167       0.277
Percentage years single-parent family            0.184       0.326       0.297       0.404
Percentage years moved location                  0.158       0.183       0.164       0.180
Percentage years mother worked                   0.596       0.352       0.580       0.367
Percentage neighborhood white                    85.144      24.197      64.243      35.384
Percentage neighborhood mother-only families     13.713      8.615       20.065      13.362
Neighborhood median income                       45,349      16,091      39,768      14,326
Neighborhood percentage high-status occupations  23.883      10.649      20.238      9.760
School student/teacher ratio                     19.628      4.083       19.805      4.240
District expenditure/student                     3193        889         3135        800
No. of observations = 1772

5. Estimation of the Structural Model: Methodology and Issues

The human capital literature models schooling as an investment choice in response to economic incentives. The production function literature considers schooling an output that is produced by the combination of a variety of inputs. Estimating the structural model presented in section 3 uncovers both the income mechanism by which the independent variables affect high school graduation by affecting returns to graduation and the production mechanism by which the independent variables affect educational attainment by affecting the production process associated with additional schooling.

Table 2. Means by High School Graduation Status

                                                 Graduates               Nongraduates
Variable                                         Weighted    Unweighted  Weighted    Unweighted
                                                 Mean        Mean        Mean        Mean
African-American = 1                             0.142       0.440       0.244       0.593
Female = 1                                       0.520       0.520       0.465       0.461
African-American*female = 1                      0.080       0.244       0.109       0.274
Mother high school graduate = 1                  0.440       0.375       0.193       0.158
Mother attended college = 1                      0.170       0.120       0.026       0.021
Father high school graduate = 1                  0.253       0.214       0.157       0.095
Father attended college = 1                      0.316       0.195       0.047       0.036
Both parents' education not available = 1        0.072       0.144       0.153       0.257
Average income/needs ratio                       3.189       2.519       1.948       1.579
Percentage years income/needs < 1                0.093       0.194       0.274       0.410
Average number of siblings                       1.922       2.290       2.516       2.981
First born = 1                                   0.298       0.252       0.187       0.141
Percentage years in SMSA                         0.680       0.715       0.596       0.729
Percentage years head disabled                   0.100       0.146       0.254       0.300
Percentage years single-parent family            0.165       0.272       0.351       0.459
Percentage years moved location                  0.145       0.151       0.272       0.252
Percentage years mother worked                   0.606       0.601       0.515       0.450
Percentage neighborhood white                    85.9        65.9        78.5        53.5
Percentage neighborhood mother-only families     13.3        19.3        17.1        25.0
Neighborhood median income                       46,101      40,447      39,248      35,612
Neighborhood percentage high-status occupations  24.4        20.8        19.2        16.6
School student/teacher ratio                     19.6        19.7        20.0        20.2
District expenditure/student                     3230        3154        2898        3024
No. of observations                              1531                    241

Estimation Methodology

In order to estimate the model, expected income streams conditional on educational attainment (E[Yh] and E[Yd]) must be estimated for each individual in the primary sample (those ages 0 to 6 in 1968). This is done using a sample of older individuals, referred to as the reference group, for whom I observe both annual personal income over ages 19 to 32 as well as background characteristics over the ages 12 to 15. The estimation of the structural model is accomplished in three stages. First, since the incomes of the reference group are only observed conditional on the education level they chose, a sample selection equation is estimated for the reference group. Second, predicted income streams conditional on educational attainment are calculated for each individual in the primary sample on the basis of a series of income equations estimated over the reference group. This is done by splitting the reference group into those who have and those who have not graduated and, for each of ages 19 to 32, fitting a separate tobit model over the sample of graduates and nongraduates. Finally, a high school graduation equation that includes the predicted income terms is estimated for the primary sample. Each of these steps is described in detail next.

Selection Equation

The first step is to estimate a probit equation over the reference group to obtain a sample selection correction term (lambda).18 This selection term captures those characteristics that are unobservable to me (but assumed observable to individuals in the primary sample) that reflect the fact that those who drop out of high school may have lower earnings even if they were to graduate and similarly that those who graduate may have abilities that would cause them to earn more even if they dropped out of high school.19 Since these are characteristics that are more difficult to measure quantitatively than to assess through observation or interaction, it is reasonable to assume that individuals in the primary sample recognize these characteristics when observing the incomes of these individuals.20 The selection equation estimated is

(prob. high school graduation = 1) = $\theta + \Delta V_i + \mu_i$, (13)

where variables included in V are parental education; race, gender, and a race-gender interaction term;21 family position (first born and average number of siblings); family background variables measured over the ages 12 to 15 (years in a Standard Metropolitan Statistical Area [SMSA], years head disabled, years in single parent family, years in poverty, years the individual's mother worked, and three region dummies); and school and neighborhood variables (neighborhood percentage white, percentage mother only, median income, and percentage high status occupations; school district expenditures per student; and student/teacher ratio). Also included and used as identifiers are percentage of years (over the ages 12 to 15) the individual moved location and family income/needs ratio. From this, I calculate a lambda variable for each person in the reference group.

18 Heckman (1979) provides the derivation of this technique.

19 An alternative hypothesis, comparative advantage, suggests that qualities that increase income in highly educated occupations may be different than qualities that increase income for those who have dropped out. For example, an individual with the analytic and articulation skills necessary to be a well-paid lawyer may not earn much in low-education crafts that require manual dexterity.
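The lambda variable is not written out in the text. A minimal sketch, assuming the standard two-regime inverse Mills ratio from a probit index (the usual Heckman-style construction, not confirmed as the exact form used here), is shown below; the function names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def selection_lambda(index, graduated):
    """Selection correction term from a fitted probit index.
    Assumption: graduates receive phi/Phi and nongraduates receive
    -phi/(1 - Phi), the textbook inverse Mills ratios."""
    if graduated:
        return normal_pdf(index) / normal_cdf(index)
    return -normal_pdf(index) / (1.0 - normal_cdf(index))

# At an index of zero the two terms are equal in size, opposite in sign.
print(selection_lambda(0.0, True), selection_lambda(0.0, False))
```

The term is then added as a regressor in the income equations estimated separately over graduates and nongraduates.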

Income Equations

The second step in the estimation process is to estimate the individuals' predicted income for the ages 19 to 32 conditional on whether the individual graduates from high school. While income streams over the ages 19 to 32 may not perfectly reflect the lifetime earnings differences associated with graduating from high school, a trade-off between the number of outcome years and the number of childhood years for which data are observed is imposed by the constrained length of the longitudinal data.

First the reference group is divided into the subsample that did graduate and the subsample that did not graduate from high school. I estimate 14 tobits (one for each year from ages 19 to 32) for each subsample (with and without graduation), with personal income as the dependent variable.22 Many studies in the human capital literature estimate the log of income. However, a BM test of the log-linear functional form rejected the log-linear form for about half the income equations, so the estimation in this paper is based on actual income.23 The income equation is

$Y_i = \gamma_e Q_i + \lambda_i + \nu_i$. (14)

The independent variables included in $Q_i$ are an intercept, race, gender, and a race-gender interaction term; family position (first born and number of siblings); parental education; family variables measured over the ages 12 to 15 (years in poverty, years mother worked, female-years mother worked interaction term, years in an SMSA, years in a single parent family, and years head disabled); and neighborhood and school quality variables (percentage of mother-only households, percentage white, median income, and percentage in high status occupations in the neighborhood; expenditures per student in the school district, and student/teacher ratio of the school). Also included and used as identifiers are the natural log of average total family income and number of years in each of three regions.

20 It is also reasonable to suppose that the individuals in the primary sample know what their own "unobservable" characteristics are like and consider this in forming their income expectations. However, since I cannot observe this, I will make unconditional predictions of income, and the unobserved characteristics will be captured in the error term.

21 Throughout the paper, race is defined by a dummy variable that equals 1 if the individual is African-American and 0 otherwise. While it would be interesting to also include a dummy variable indicating Hispanic origin, particularly given the high dropout rates within the Hispanic community, the sample does not contain enough Hispanic individuals to allow this. Although the race dummy variable is for African-Americans, all individuals remain in the sample.

22 Personal income includes all taxable and transfer income. The tobit specification is used because of the mass point at zero income.

23 See Maddala (1992) for a discussion of the BM test.
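The tobit specification used for the income equations handles the mass point at zero income by treating zeros as censored observations. The sketch below evaluates the standard log-likelihood of a tobit censored from below at zero; it is an illustration of the likelihood, not the estimation routine used in the paper, and the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, xb, sigma):
    """Log-likelihood of a tobit censored from below at zero.
    y: observed incomes; xb: fitted linear index for each observation;
    sigma: standard deviation of the latent error."""
    total = 0.0
    for yi, mi in zip(y, xb):
        if yi > 0:
            # Uncensored observation: normal density of the latent income.
            z = (yi - mi) / sigma
            total += -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
        else:
            # Censored observation: probability that latent income <= 0.
            total += math.log(normal_cdf(-mi / sigma))
    return total

# One zero-income year and one positive-income year (hypothetical values).
print(tobit_loglik([0.0, 1.0], [0.0, 1.0], 1.0))
```

Maximizing this function over the coefficients and sigma, separately for each age and each graduation status, would give the 28 sets of estimates described in the text.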

The predicted income streams for the primary sample, conditional on educational attainment, are created using the coefficient estimates from the regressions of Equation 14. The predicted income stream for the primary sample if they were to graduate from high school is calculated on the basis of the regressions estimated over the sample of individuals in the reference group who graduated. Similarly, the predicted income stream if the primary sample were to not graduate is calculated on the basis of the regressions estimated over the sample of individuals in the reference group who did not graduate. The net present value of these income streams, discounted to age 16, is calculated using a 3% discount rate. This process results in two present value of income terms for each individual in the primary sample: predicted income if the individual were to graduate from high school and predicted income if the individual were to drop out of high school.
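The discounting step can be sketched as follows. The 3% rate and the age-16 base come from the text; annual compounding by whole years of age is an assumption, and the function name and income figures are hypothetical.

```python
def present_value_at_16(incomes_by_age, rate=0.03, base_age=16):
    """Net present value of a predicted income stream, discounted to
    base_age. Assumption: income at age a is discounted by
    (1 + rate) ** (a - base_age)."""
    return sum(income / (1.0 + rate) ** (age - base_age)
               for age, income in incomes_by_age.items())

# Hypothetical predicted streams over ages 19 to 32 for one youth.
if_graduate = {age: 25000.0 for age in range(19, 33)}
if_dropout = {age: 18000.0 for age in range(19, 33)}

pv_h = present_value_at_16(if_graduate)
pv_d = present_value_at_16(if_dropout)
print(pv_h > pv_d)
```

The two present values play the role of the graduate and nongraduate income terms that enter the high school graduation equation.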

The coefficients $\gamma_e$ indicate how the independent variables affect income given education level e. If a variable increases income regardless of level of education, then $\gamma_h > 0$ and $\gamma_d > 0$. The effect of a variable on the rate of return to education depends on the relative magnitude of the effects. If a variable increases income more if the individual graduates than if the individual drops out of high school, then it increases the rate of return to education. However, if the positive effect is larger if the individual drops out, a variable may increase income in both states of education but not increase the rate of return to education.

High School Graduation Equation

The final stage in estimating the structural model is to include these predicted income streams in an estimate of the probability that an individual graduates high school. A probit model of Equation 15 is estimated:

$e_i = Bx + (\ln Y_h - \ln Y_d)B_c - \varepsilon$, (15)

where $e_i$ equals 1 if the individual graduates from high school and 0 otherwise. The coefficients in vector $B$ represent the effects of family, neighborhood, and school on education directly through their impact on the nonincome utility benefits and costs of education.24 The coefficient $B_c$ represents the effect of returns to education on the education choice. When taken in conjunction with $\gamma$, the vector of coefficients describing the determinants of income, $B_c$ can indicate how family, neighborhood, and school variables have an income effect on the educational choice.
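One way to read "taken in conjunction" is as a chain rule on the income term of the probit index. Assuming, as in the discrete model, that expected log income at education level $e$ is $\ln(\gamma_e Q)$, and writing $\gamma_{e,k}$ for the coefficient on the $k$th element of $Q$ (a subscript introduced here for illustration), the contribution of $Q_k$ through income is:

```latex
\frac{\partial}{\partial Q_k}\Big\{ B_c\,[\ln(\gamma_h Q) - \ln(\gamma_d Q)] \Big\}
  = B_c \left[ \frac{\gamma_{h,k}}{\gamma_h Q} - \frac{\gamma_{d,k}}{\gamma_d Q} \right]
```

This is the effect on the index, not on the graduation probability itself; it is positive when $B_c > 0$ and the variable raises graduate income proportionally more than nongraduate income.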

Identification of the Model

The structural model is identified through both exclusion restrictions and functional form assumptions of the utility function. Consider, first, a model that is not identified:

$e_i = BX + [f(\gamma_h Q) - f(\gamma_d Q)]B_c$, (16)

where the vectors $Q$ and $X$ are identical and $f(\gamma_e Q) \equiv \gamma_e Q$. Because the second term $(\gamma_h Q - \gamma_d Q)$ is a linear combination of elements in $X$, the model is not identified: the coefficient estimate $B_c$ cannot be recovered because of perfect multicollinearity. The identification of the structural model estimated in this paper is based on two differences from Equation 16.

First, the model is identified because of nonlinearity of $\gamma_e Q$. Since $f(\gamma_e Q) \equiv \ln(\gamma_e Q)$, the second term $(\ln \gamma_h Q - \ln \gamma_d Q)$ is not a linear function of $X$. The functional form assumptions of the utility function provide the basis for this source of identification.

24 While the variables used in this study are more extensive than in most studies, variables directly measuring ability and ambition are not included; this could result in the potential of omitted variables bias.
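The role of the log transformation can be checked numerically. In the sketch below, $Q$ contains only an intercept and one variable $q$, and the coefficient values are hypothetical. A zero second difference means the income term is an exact linear function of $q$, hence collinear with an $X$ that contains the same intercept and variable; a nonzero second difference means it is not.

```python
import math

# Hypothetical income coefficients (intercept, slope) for each state.
GAMMA_H = (2.0, 1.0)
GAMMA_D = (3.0, 0.5)

def linear_gap(q, gamma_h=GAMMA_H, gamma_d=GAMMA_D):
    """Income term of the unidentified model: gamma_h*Q - gamma_d*Q."""
    return (gamma_h[0] + gamma_h[1] * q) - (gamma_d[0] + gamma_d[1] * q)

def log_gap(q, gamma_h=GAMMA_H, gamma_d=GAMMA_D):
    """Income term with logs: ln(gamma_h*Q) - ln(gamma_d*Q)."""
    return (math.log(gamma_h[0] + gamma_h[1] * q)
            - math.log(gamma_d[0] + gamma_d[1] * q))

def second_difference(f, q, step=1.0):
    """Discrete curvature of f at q; zero for any linear function."""
    return f(q + step) - 2.0 * f(q) + f(q - step)

print(second_difference(linear_gap, 2.0))  # linear: no curvature
print(second_difference(log_gap, 2.0))     # logs: curvature remains
```

The curvature is what lets the coefficient on the income term be separated from the coefficients on the same variables entering directly, although identification from functional form alone is typically weak, which is one reason the exclusion restrictions matter.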

Second, the model is identified because of exclusion restrictions; the vector Q includes elements that are not included in X. Divide the vector Q into two subvectors, q and z, where q includes variables that are not elements of X and z is a subset of X:

$e_i = BX + (\ln \gamma_h q + \ln \gamma_h z - \ln \gamma_d q - \ln \gamma_d z)B_c$. (17)

Since q is not an empty vector, the income term contains information not included in the X vector. In other words, there are variables that affect the income an individual can expect to receive but that have no direct effect on the person's choice of educational attainment.

I model the educational choice decision relying on exclusion restrictions that are far less extreme than those in prior research in this area. Willis and Rosen (1979), for example, rely on the very stringent exclusion restriction that ability, measured by test scores, affects income but has no direct effect on educational attainment.25 There are four elements of Q that are not in X: three region dummies and total family income. First, the region variables capture the regional differences in income, recognizing that there is not much regional variation in the probability of graduating high school.

Second, I argue that the individual's income will be affected by family total income (not adjusting for family needs) since this reflects the family's labor market connections. However, the education decision will be affected by the economic well being of the family, reflected in the income/needs ratio of the family. For example, consider two individuals whose parents have the same jobs earning $30,000. These individuals have both been exposed to the same types of labor markets and may have the same labor market connections; however, if one individual is the only child and the other individual has 10 siblings, the standard of living for the two individuals will be substantially different. I have modeled that the $30,000 will affect future incomes but that the standard of living will affect the graduation choice.

The presence of more than one identifying variable allows testing of overidentifying restrictions. This involves regressing the residuals from the high school graduation model on the instruments. The test statistic is the number of observations times the uncentered $R^2$, which is distributed chi-squared. The test statistic is 0.99, which fails to reject the null that the instruments in the income equation are uncorrelated with the residual from the high school graduation equation.26
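The mechanics of this statistic can be sketched in a few lines. The example below is illustrative only: it assumes exactly two instruments and no intercept so that the least-squares problem can be solved by hand, and the residuals and instruments are made-up numbers rather than values from the estimation. The function name `overid_statistic` is hypothetical.

```python
def overid_statistic(residuals, z1, z2):
    """n times the uncentered R^2 from regressing residuals on two instruments."""
    n = len(residuals)
    # Normal equations (Z'Z) b = Z'u for the instrument matrix Z = [z1, z2].
    s11 = sum(a * a for a in z1)
    s22 = sum(b * b for b in z2)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s1u = sum(a * u for a, u in zip(z1, residuals))
    s2u = sum(b * u for b, u in zip(z2, residuals))
    det = s11 * s22 - s12 * s12
    b1 = (s1u * s22 - s2u * s12) / det
    b2 = (s2u * s11 - s1u * s12) / det
    fitted = [b1 * a + b2 * b for a, b in zip(z1, z2)]
    # Uncentered R^2: explained sum of squares over the raw sum of squares.
    r2_uncentered = sum(f * f for f in fitted) / sum(u * u for u in residuals)
    return n * r2_uncentered

# Made-up instruments and residuals (assumption, for illustration only).
z1 = [1.0, -1.0, 1.0, -1.0]
z2 = [1.0, 1.0, -1.0, -1.0]
orthogonal_residuals = [1.0, -1.0, -1.0, 1.0]  # uncorrelated with both instruments
correlated_residuals = [1.0, -1.0, 1.0, -1.0]  # identical to z1

stat_orthogonal = overid_statistic(orthogonal_residuals, z1, z2)
stat_correlated = overid_statistic(correlated_residuals, z1, z2)
print(stat_orthogonal, stat_correlated)
```

A statistic near zero, as in the first case, is what the null of valid instruments predicts. The statistic is compared with a chi-squared critical value whose degrees of freedom equal the number of overidentifying restrictions.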

Identification is also necessary in the sample selection equation, the first stage of the model. In order for the selection term lambda to be identified, there must be elements in the high school graduation selection Equation 13 that have an effect on the high school graduation choice of the individual but that do not affect the later incomes of the individuals (Eqn. 14). I use two

25 In order to estimate their model, they assume (i) that ability can affect educational attainment only through expected future incomes and (ii) that family background can have no direct effect on incomes but can have a direct effect on educational attainment. However, Ginther (1995) uses nonparametric analysis to examine the exclusion restrictions of Willis and Rosen and rejects the assumption that family background does not affect income.

26 See Johnston and DiNardo (1997, pp. 336-8). This is asymptotically equivalent to a Basmann test.


identifying variables: percentage of years during the ages 12 to 15 that the family moved location and the family income/needs ratio over these years. The argument for the inclusion of family income in the personal income equation but not the high school graduation equation was discussed previously. It posits that total family income represents the family's ties to the labor market, while the family's income/needs ratio represents the family's standard of living.

Location moves reflect the fact that if an individual's family changes location many times, the individual will have to make adjustments to new schools, new teachers, and new friends. In addition, the graduation requirements vary by school district, so individuals who change locations may face different graduation requirements than those for which they were being prepared in the prior schools. Therefore, it is expected that an individual who changes location more often will be less likely to graduate from high school but that these moves do not affect income streams as an adult.

Tests of overidentifying restrictions in the income equations support the use of these instruments. In all 28 regressions on residuals (14 ages for each education outcome), the null that the instruments are uncorrelated with the error term is not rejected at the 5% level, and it is not rejected at the 10% level for 26 of the 28 regressions.
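For readers unfamiliar with the selection term, the sketch below shows how a lambda term is conventionally built from a first-stage probit index in a Heckman-style correction. This is the textbook construction and is an assumption here, since this section does not spell out the exact formula used. The function name `inverse_mills` and the example index values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

def inverse_mills(index, graduated):
    """Selection term from a probit index (textbook Heckman-style form).

    index     -- fitted probit index from the graduation equation
    graduated -- True for the graduate subsample, False for dropouts
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if graduated:
        return pdf / cdf
    return -pdf / (1.0 - cdf)

# Hypothetical index values (assumption): higher index = more likely to graduate.
for z in (-1.0, 0.0, 1.0):
    print(z, round(inverse_mills(z, True), 4), round(inverse_mills(z, False), 4))
```

The term is then added as a regressor to the income equation for the corresponding subsample. It can be separately identified only if the selection equation contains variables, such as the location moves above, that are excluded from the income equation.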

6. Estimation of the Structural Model: Results

Predicted Income Results

The initial selection Equation 13, a probit of high school graduation estimated for the reference group, appears in Appendix B. The two variables used as identifiers, average income/needs ratio of the family and percentage of years during ages 12 to 15 the family moved location, are statistically significant and of the expected signs.

The results of the 14 tobits of the income Equation 14 for the subsample of the reference group who graduated and for the 14 tobits on the subsample that dropped out are summarized next.27 The effects on income of school quality, family, and neighborhood are recovered from these equations.

School expenditures are an important determinant of personal income, both statistically and economically. Higher school spending is associated with a higher income throughout the individual's 20s and early 30s. School student/teacher ratio, however, does not appear to bear any relationship to income.

For those who graduate from high school, being female or African-American is negatively and significantly associated with income. The magnitudes of the effects are very large. However, the coefficient on the African-American female interaction term is positive and economically as well as statistically significant.

Higher parental education is negatively related to income for late teens and early 20s but positively related at older ages. This pattern would be consistent with children of higher-educated parents being more likely to attend college and thus having lower initial earnings during these college years but having higher earnings as they begin to experience the returns to college education.

Having a mother who works and being poor are both associated with lower, but not statistically significantly lower, income in the late teens and early 20s, but neither affects subsequent income. Surprisingly, the coefficient on female interacted with mother working is negative, implying that having a mother who works lowers incomes for females relative to males, but the effect is very small.

27 These results are available from the author on request.

Having more siblings is associated with higher income in the initial years but lower income at later years, while the effect of being the oldest child follows no strong pattern. Spending more time in a single-parent family or living in an SMSA also has no strong relationship with income, while having a disabled parent is associated with higher income in the late 20s and early 30s.

The effects of neighborhood variables on the incomes of those who graduate from high school are relatively small in magnitude. Having more mother-only families or fewer white families in the neighborhood is associated with lower income, but the effects are not statistically significant and diminish over time. Living in a neighborhood with higher median income but fewer individuals in a high-status occupation is generally positively related to income. However, the unexpected sign on the high-status-occupation variable may be a result of only observing incomes through age 32, and the result is not statistically significant.

The variables used as identifiers, natural log of family income and region dummies, were generally significantly associated with income. Family income is positive and increasing in magnitude at older ages. The effect of growing up in a certain region, on the other hand, decreased in magnitude over time. Individuals who grew up in the north-central region, the omitted category, have lower incomes than the other three regions. Bound, Jaeger, and Baker (1995) demonstrate the importance of having identifying variables that have an economically significant effect. The importance of the exclusion restrictions in the income equation can be seen by comparing the $R^2$ of an ordinary-least-squares (OLS) estimation of the income equations where the independent variables are just the exclusion restrictions with the $R^2$ for the full set of independent variables. The exclusion restrictions alone result in an $R^2$ approximately half the size of that of the full regression in the income equations for graduates and approximately one-quarter for the incomes of dropouts. Thus, the identifying variables are important factors in explaining income.28

The results of the income tobits for those who did not graduate from high school have some important differences relative to those who graduated. Whereas parental education had a positive effect on income at later ages for those who graduate, it is generally negatively associated with income for those who drop out. Family size and birth order also have different effects: Having more siblings increases income at later ages, while being first born reduces income. Finally, among the subsample who do not graduate from high school, females whose mothers work and individuals living in an SMSA have higher income.

The effects of neighborhood and school variables on income for those who do not graduate follow patterns similar to those found for those who do. However, the identifying variables have different relationships in each case. Family income is not related to the income of those who do not graduate, and while growing up in the north-central region is associated with lower incomes in the early years, by the late 20s it is associated with higher earnings than the West or the Northeast. For both the graduates and nongraduates, the coefficient estimate on the lambda term is negative and sometimes significant, but it decreases over time.

28 An alternative way of measuring this is to compare the $R^2$ when the independent variables are everything except the identifying variables to the $R^2$ with the full set of independent variables. This gives the additional explanatory power of the identification variables, controlling for all the other independent variables. When the identifying variables are added to the other independent variables, the $R^2$ increases an average of 5% for the graduation equations and 4% for the dropout equations.

Predicted Incomes

Table 3 presents the predicted incomes of the primary sample conditional on educational attainment. These predicted incomes were generated by applying the coefficient estimates from the income tobits on the reference group to the primary sample to obtain a fitted value for income at every age, conditional on education.

The first panel shows the results for the entire primary sample. The pattern is as expected. Initially, the predicted income if an individual does not graduate is higher than if the individual does graduate, but the growth rate of income with a high school degree is much steeper.

The second and third panels of Table 3 divide the primary sample into those who actually do and do not graduate. The difference in expected income streams is greater for those who do indeed graduate ($73,666) than for those who do not graduate ($55,679).

High School Graduation Results

The probability that an individual graduates from high school is now estimated as a function of the expected returns to education and the variables that affect the nonincome benefits of education (Eqn. 15). The measure of expected returns to education is the difference of the natural logs of income streams. The results are presented in Table 4.29
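As a rough illustration of this measure, the sketch below computes the level and log differences from the mean net present values reported in Table 3. It is only indicative: the regressor in the model is computed individual by individual, so the log difference of group means shown here is an assumption-laden summary, not the value that enters the probit.

```python
import math

# Mean predicted net present values from Table 3 (1993 dollars):
# (income if graduate, income if dropout).
npv = {
    "entire sample": (165_490, 94_352),
    "actual graduates": (167_480, 93_814),
    "actual dropouts": (153_320, 97_641),
}

def level_difference(grad, drop):
    """Difference in expected income streams, in dollars."""
    return grad - drop

def log_difference(grad, drop):
    """Difference of natural logs, the form of the returns measure."""
    return math.log(grad) - math.log(drop)

for group, (grad, drop) in npv.items():
    print(group, level_difference(grad, drop), round(log_difference(grad, drop), 3))
```

The level differences for the two subsamples reproduce the $73,666 and $55,679 figures quoted in the text for actual graduates and actual dropouts.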

The first column of Table 4 corresponds to a model in which I assume that there are no nonincome benefits of education and that the only thing that affects the decision to graduate from high school is the expected returns to education. In this specification, the greater the returns to education, the greater the likelihood an individual will graduate from high school.

In columns 2 and 3, more background variables reflecting nonincome benefits are allowed to enter the model directly.30 The basic relationship between the expected returns to education and the probability of graduation remains. The likelihood that an individual will graduate from high school increases as the expected returns to education increase. This indicates that, as is posited by the human capital literature, individuals are making a decision to invest in schooling on the basis of the future income returns to this investment.

However, the expected returns to schooling are not the only factor that affects an individual's high school graduation decision. Family, neighborhood, and school factors affect the utility benefits and costs of schooling and thereby affect the likelihood of graduating. Having a higher-educated mother or one that works, being an African-American female, and attending a school with higher expenditures per student or lower student/teacher ratios increase the likelihood of graduating. Conversely, having a disabled household head, moving geographic location

29 Murphy and Topel (1985) demonstrate that when a regressor is imputed, standard errors should be corrected. Unfortunately, the imputed regressor of income in this model is a nonlinear combination of coefficient estimates from 28 regressions (the income tobits) estimated over a separate sample (the reference sample) such that it is not feasible to correct the standard error in the normal fashion. A bootstrapping technique is used to obtain a confidence interval for the predicted income term in the full model. The resulting confidence interval from the bootstrapping is virtually identical to that given by the regular standard error from the probit estimation. Because of the computational requirements of the bootstrapping and the fact that the confidence interval does not change using it, uncorrected standard errors are reported for the estimation results.

30 The full-model estimation correctly predicts the high school graduation choice for 87% of the sample.


Table 3. Predicted Incomes with and without Graduating High School^a

                        Predicted Income        Predicted Income
                          if Graduate              if Dropout
                            Mean   Std. Dev.        Mean   Std. Dev.
Entire sample (n = 1772)
  Age 19                    6129        2167        6254        3092
  Age 20                    7908        2083        6274        3158
  Age 21                    9180        2131        5893        3602
  Age 22                  10,976        3470        8753        4218
  Age 23                  14,053        4241        7802        5809
  Age 24                  15,300        4598        9584        6096
  Age 25                  16,508        5461      10,977        6722
  Age 26                  17,710        6484        9013        5518
  Age 27                  19,231        6507      10,421        5658
  Age 28                  20,434        7216      11,341        6645
  Age 29                  21,465        7916      10,379        6023
  Age 30                  21,874        8394        9187        5796
  Age 31                  22,875        9077      10,890        7009
  Age 32                  23,152        9343        9625        6874
  Net present value      165,490      52,233      94,352      49,591

Individuals who actually graduate (n = 1523)
  Age 19                    6049        2154        6103        3059
  Age 20                    7845        2085        6209        3153
  Age 21                    9161        2141        5661        3566
  Age 22                  11,080        3453        8875        4322
  Age 23                  14,147        4230        7523        5786
  Age 24                  15,427        4629        9584        6218
  Age 25                  16,653        5495      11,138        6890
  Age 26                  18,003        6535        8890        5551
  Age 27                  19,504        6592      10,400        5716
  Age 28                  20,712        7318      11,394        6754
  Age 29                  21,814        7994      10,394        6118
  Age 30                  22,283        8545        8996        5742
  Age 31                  23,386        9271      10,984        7146
  Age 32                  23,654        9553        9598        6942
  Net present value      167,480      52,643      93,814      50,211

Individuals who actually do not graduate (n = 249)
  Age 19                    6617        2187        7175        3137
  Age 20                    8289        2035        6671        3164
  Age 21                    9302        2072        7307        3505
  Age 22                  10,336        3510        8004        3428
  Age 23                  13,481        4272        9508        5668
  Age 24                  14,522        4332        9585        5297
  Age 25                  15,619        5170        9988        5488
  Age 26                  15,919        5864        9770        5623
  Age 27                  17,560        5691      10,547        5298
  Age 28                  18,731        6305      11,020        5938
  Age 29                  19,327        7071      10,287        5416
  Age 30                  19,376        6908      10,353        5995
  Age 31                  19,745        7030      10,315        6086
  Age 32                  20,078        7235        9791        6458
  Net present value      153,320      47,982      97,641      45,578

^a Figures expressed as 1993 dollars.


as a child, and living in a neighborhood with many mother-only families negatively affect the likelihood of graduating. The family income/needs ratio positively affects educational attainment, but when many family controls are included in the model, the effect is no longer statistically significant.

A comparison of column 3 (full model) and column 4 (production function) indicates some significant differences when education is modeled as a choice dependent on expected returns to schooling rather than the traditional production function. The most important difference is father's education. In the production function specification, father's education is a very important determinant of graduation, while in the full model it is not significant. This indicates that the effect of father's education is to increase the returns to schooling and thus increase the likelihood of graduation because of the student's response to this increased return to schooling. A second difference is race and gender. Many other studies find that, controlling for background characteristics, being African-American and being female are associated with higher education (see Haveman and Wolfe 1995). The production function specification also finds positive results, but in the full model they are negative. If the wage differential for high school graduates is higher for African-Americans and females, which it is for this sample, this suggests that it is these higher returns to schooling, rather than some other aspect of race or gender, that make African-Americans and females more likely to graduate, controlling for other family characteristics. However, the effects for race and gender are not statistically significant, so while the direction of the coefficient estimate suggests this, the null that there is no effect cannot be rejected.

Simulation Results

In order to better understand the mechanism through which predicted income and the background variables are affecting educational attainment, Table 5 presents simulations in which the effect of changing a variable is divided into production, income, and total effects. The production effect is the impact of the variable on the probability of graduating high school, operating through nonincome benefits of high school graduation (the marginal rate of substitution between education and consumption). The income effect is how the variable affects education by changing the expected returns to education (the marginal rate of transformation of education into consumption). The total effect is the combination of the production and income effect.31 In addition to the effects on educational attainment, the simulated weighted mean predicted incomes conditional on educational attainment are presented in the table.

A simulation of increasing the expected income stream if a person graduates by 10% is associated with an increase of more than three-quarters of a percentage point in the probability that an individual graduates from high school.32 All of this effect is an income effect, the student responding to the income returns to schooling. For most of the background variables examined, the income effect is much smaller than the production effect. For example, in simulating all parents as high school graduates, the production effect is an increase of 5.0 percentage points in the probability that an individual will graduate high school, while the income effect is an increase of only 1.4 percentage points. However, it is interesting that the production effect is

31 Because of the nonlinear nature of the probit, the total effect is not the exact sum of the production and the income effects.

32 Another way of interpreting this estimate is that a 10% increase in income associated with graduating reduces the probability an individual will drop out of high school from 0.1099 to 0.1018, a decrease of 7%.


Table 4. Structural Model of High School Graduation

                                                         Human        Base        Full  Production
                                                       Capital      Family       Model    Function
Intercept                                             0.654***     0.305**     1.197**    1.834***
                                                       (0.073)     (0.121)     (0.557)     (0.469)
                                                       [0.140]     [0.056]     [0.193]     [0.297]
Income returns to education
  ln(income if graduated) -                           0.715***    0.423***     0.588**
    ln(income if dropped out)                          (0.111)     (0.159)     (0.281)
                                                       [0.153]     [0.078]     [0.095]
Family/individual characteristics
  African-American = 1                                              -0.078      -0.070       0.232
                                                                   (0.135)     (0.229)     (0.178)
                                                                  [-0.014]    [-0.011]     [0.038]
  Female = 1                                                        -0.091      -0.171       0.073
                                                                   (0.132)     (0.170)     (0.124)
                                                                  [-0.017]    [-0.028]     [0.012]
  African-American*female = 1                                       0.317*     0.446**       0.252
                                                                   (0.166)     (0.192)     (0.168)
                                                                   [0.059]     [0.072]     [0.041]
  Mother high school graduate = 1                                 0.537***    0.525***    0.482***
                                                                   (0.104)     (0.114)     (0.112)
                                                                   [0.100]     [0.085]     [0.078]
  Mother attended college = 1                                     0.748***    0.751***    0.677***
                                                                   (0.225)     (0.241)     (0.238)
                                                                   [0.139]     [0.121]     [0.110]
  Father high school graduate = 1                                    0.177       0.090     0.277**
                                                                   (0.134)     (0.162)     (0.135)
                                                                   [0.033]     [0.014]     [0.045]
  Father attended college = 1                                        0.260       0.059      0.348*
                                                                   (0.192)     (0.244)     (0.201)
                                                                   [0.048]     [0.010]     [0.056]
  Both parents' education not available = 1                                      0.073       0.080
                                                                               (0.125)     (0.124)
                                                                               [0.012]     [0.013]
  Average income/needs ratio                                      0.134***       0.059      0.098*
                                                                   (0.041)     (0.057)     (0.054)
                                                                   [0.025]     [0.010]     [0.016]
  Percentage years income/needs < 1                                              0.013      -0.134
                                                                               (0.205)     (0.194)
                                                                               [0.002]    [-0.022]
  Average number of siblings                                                    -0.048     -0.059*
                                                                               (0.031)     (0.031)
                                                                              [-0.008]    [-0.010]
  First born = 1                                                                 0.203     0.294**
                                                                               (0.124)     (0.117)
                                                                               [0.033]     [0.048]
  Percentage years in SMSA                                                       0.233       0.085
                                                                               (0.147)     (0.130)
                                                                               [0.038]     [0.014]
  Percentage years head disabled                                             -0.460***   -0.493***
                                                                               (0.145)     (0.145)
                                                                              [-0.074]    [-0.071]
  Percentage years single-parent family                                         -0.227      -0.203
                                                                               (0.141)     (0.140)
                                                                              [-0.037]    [-0.033]
  Percentage years moved location                                            -1.408***   -1.433***
                                                                               (0.224)     (0.224)
                                                                              [-0.228]    [-0.232]
  Percentage years mother worked                                              0.397***     0.296**
                                                                               (0.137)     (0.128)
                                                                               [0.064]     [0.048]
Neighborhood characteristics
  Percentage neighborhood white                                                 -0.000      -0.003
                                                                               (0.003)     (0.003)
                                                                               [0.000]     [0.000]
  Percentage neighborhood mother-only families                                -0.015**   -0.022***
                                                                               (0.007)     (0.007)
                                                                              [-0.002]    [-0.004]
  Neighborhood median income (*1000)                                         -0.037***   -0.034***
                                                                               (0.008)     (0.008)
                                                                              [-0.006]    [-0.006]
  Neighborhood percentage high status occupations                             0.039***    0.035***
                                                                               (0.009)     (0.009)
                                                                               [0.006]     [0.006]
School characteristics
  School student/teacher ratio                                                -0.020**     -0.018*
                                                                               (0.010)     (0.010)
                                                                              [-0.003]    [-0.003]
  District expenditure/student (*1000)                                        0.241***     0.177**
                                                                               (0.083)     (0.076)
                                                                               [0.039]     [0.029]
Log likelihood                                         -696.60     -640.85     -577.88     -580.09

Standard errors in parentheses. Marginal effects evaluated at the sample mean in brackets.

* Statistically significant at the 10% level. ** Statistically significant at the 5% level.

*** Statistically significant at the 1% level.
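The bracketed marginal effects in Table 4 can be related to the coefficients with a short calculation. In a probit, the marginal effect of a regressor evaluated at a given index value is the standard normal density at that index times the coefficient. The sketch below divides reported marginal effects by their coefficients for several full-model rows to recover the implied density. The index value 1.343 is backed out from those ratios for illustration; it is an assumption and is not reported in the table.

```python
from statistics import NormalDist

def probit_marginal_effect(coef, index_at_mean):
    """Probit marginal effect: standard normal density at the index times coef."""
    return NormalDist().pdf(index_at_mean) * coef

# (coefficient, reported marginal effect) pairs from the Full Model column of Table 4.
full_model = {
    "income returns": (0.588, 0.095),
    "mother high school graduate": (0.525, 0.085),
    "percentage years moved location": (-1.408, -0.228),
    "district expenditure/student": (0.241, 0.039),
}

# Each ratio should equal the same density value, up to rounding in the table.
implied_density = {name: me / b for name, (b, me) in full_model.items()}
for name, density in implied_density.items():
    print(f"{name}: {density:.4f}")

# Assumed index at the sample mean (backed out above, not reported in the paper).
assumed_index = 1.343
print(round(probit_marginal_effect(0.588, assumed_index), 3))
```

The ratios agree to about three decimal places, which suggests that a single density scale factor was applied to every coefficient in the column, dummy variables included.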

coming almost entirely from mother's education, while the income effect is from father's education.

The increase in educational attainment from school spending is entirely the production effect. School expenditures increase income in both educational states but have a greater impact if the individual does not graduate from high school, thus the negative income effect. This school expenditure result is similar to that of Altonji and Dunn (1995), who find that school expenditures affect the level of income but have a negative and nonsignificant effect on the returns to education; this result is in contrast to Card and Krueger (1992), who find that school quality increases the rate of return to education for men. However, the effects of school expenditures on income in both educational states are quite large. This suggests that policy interventions directed toward school spending would increase not only the educational attainment of the current generation but also their income streams well into adulthood.


Table 5. Estimated Probabilities from Structural Model

                                                                       Probability
                                               High School High School     of High
                                                  Graduate     Dropout      School  Production      Income
                                                    Income      Income  Graduation      Effect      Effect
Baseline                                           183,440     105,040      0.8901
Increase income with high school graduation
  by 10%                                           201,784     105,040      0.8982
Both parents at least high school graduates        181,120      93,231      0.9485      0.0496      0.0143
Mother at least high school graduate               183,110     107,580      0.9309      0.0433     -0.0042
Father at least high school graduate               181,460      90,971      0.9165      0.0092      0.0186
Half as many single-parent families                184,240     103,700      0.9120      0.0204      0.0018
Increase family income by 10%                      185,170     103,710      0.8943      0.0020      0.0022
No children living in poverty (total sample)       184,920     104,940      0.8918      0.0000      0.0018
No children living in poverty (those in poverty)
  Baseline                                         159,780     103,880      0.7976
  Simulated effect                                 164,830     103,600      0.8039
Increase school expenditures per student 10%       186,620     110,890      0.8973      0.0102     -0.0033
Reduce student teacher ratio by 10%                182,660     105,390      0.8951      0.0057     -0.0007

The simulated production effect is calculated by holding the predicted return to education at its original level and simulating a change in the variable in the predicted value of the probability of high school graduation from Equation 15. The simulated income effect is calculated by allowing the simulated variable to change the predicted returns to education in the income tobits (eqn. 14) but constraining the production effect of the variable at its original level. The total simulated effect is calculated by using the simulated value to calculate the returns to education and also the predicted value of the probability of high school graduation. The income and production effects do not exactly sum to the total effect because of the nonlinearity of the probit model.
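The decomposition described in this note, and the reason the parts need not sum to the total, can be sketched with a toy probit. All coefficients and variable values below are hypothetical and chosen only for illustration; they are not the estimates from Table 4.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the probit link

def decompose(a, b_c, b_x, r0, r1, x0, x1):
    """Production, income, and total effects of moving a variable from x0 to x1.

    a        -- intercept
    b_c      -- coefficient on the expected returns to education
    b_x      -- direct (production) coefficient on the variable
    r0, r1   -- expected returns before and after the change
    """
    base = PHI(a + b_c * r0 + b_x * x0)
    production = PHI(a + b_c * r0 + b_x * x1) - base  # returns held at original level
    income = PHI(a + b_c * r1 + b_x * x0) - base      # direct effect held at original level
    total = PHI(a + b_c * r1 + b_x * x1) - base       # both channels change
    return production, income, total

# Hypothetical inputs (assumption, for illustration only).
production, income, total = decompose(a=0.5, b_c=0.6, b_x=0.5,
                                      r0=0.6, r1=0.7, x0=0.0, x1=1.0)
print(round(production, 4), round(income, 4), round(total, 4))
```

Because the normal CDF flattens as the index rises, the two partial effects add up to more than the total in this example. The same pattern appears in Table 5: for the simulation with both parents at least high school graduates, the production and income effects sum to 0.0639, while the graduation probability rises by 0.0584 (from 0.8901 to 0.9485).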

Sensitivity of the Model

A variety of assumptions have been made in estimating the model. Table 6 shows the results of a sensitivity analysis to these assumptions. The baseline specification is from Table 4.

The measure of income used in the baseline case is personal income, including nonearned income. The rationale for including other forms of income is that transfer income may be an important source of income, particularly for those who do not graduate from high school. Ignoring this income may overstate the returns to education. However, most of the human capital literature focuses on labor income. When the model is estimated using labor income, the results are very consistent with the baseline case, but the marginal effect is smaller and the coefficient estimate significant only at the 10% level in the full model. It is also possible that many individuals will have zero labor income because they choose to not enter the labor force. If this is true, then including these zero earning observations may capture an inaccurate picture of the returns to schooling. Specification (b) presents the results when members of the reference sample are included in the earnings estimation only if they have positive earnings at that age. This is equivalent to assuming that individuals look only at working older adults in making their earnings estimation. Once again, the results are similar with marginal effects that are very similar to the baseline case.

The second set of sensitivity results pertains to modeling assumptions.33 The baseline is

33 Because these model assumptions could potentially affect much more than just the income term, the full regression results are contained in Appendix C.


Table 6. Sensitivity of the Model

                                                     Human Capital           Full Model
Baseline                                         0.715***  [0.153]     0.588**  [0.095]
                                                  (0.111)              (0.281)
Measure of income
  (a) Labor income                               0.380***  [0.082]      0.299*  [0.048]
                                                  (0.068)              (0.180)
  (b) Nonzero labor income                       0.413***  [0.091]    0.622***  [0.101]
                                                  (0.107)              (0.211)
Modeling
  (c) No selection equation                       0.328**  [0.073]     0.700**  [0.113]
                                                  (0.129)              (0.327)
  (d) Rational expectations                      0.822***  [0.165]     0.881**  [0.076]
                                                  (0.133)              (0.355)
  (e) Net present value                          0.655***  [0.140]     0.642**  [0.103]
                                                  (0.104)              (0.258)
Identification of the model
  (f) Include income identification variables                            0.689  [0.111]
                                                                       (0.483)
  (g) Region measured at each age during         0.636***  [0.138]     0.664**  [0.107]
      working years                               (0.117)              (0.280)
  (h) Ratio income HS/income drop                0.354***  [0.075]     0.264**  [0.042]
                                                  (0.057)              (0.116)
Age of reference group
  (i) Limit sample to youngest/oldest            0.684***  [0.154]       0.433  [0.072]
                                                  (0.129)              (0.342)
  (j) Include income only to age 28              0.704***  [0.152]     0.659**  [0.106]
                                                  (0.117)              (0.306)
Discount rate
  (k) 1% discount rate                           0.700***  [0.150]     0.571**  [0.092]
                                                  (0.109)              (0.275)
  (l) 10% discount rate                          0.755***  [0.161]     0.641**  [0.104]
                                                  (0.116)              (0.298)
GED recipients
  (m) No GED recipients                          0.767***  [0.171]     0.658**  [0.110]
                                                  (0.116)              (0.306)

The table presents coefficient estimates for the predicted income term. Standard errors are in parentheses and marginal effects are in brackets. The human capital specification includes only an intercept and the income term. The full-model specification includes all the independent variables in Table 4.

* Statistically significant at the 10% level. ** Statistically significant at the 5% level.

*** Statistically significant at the 1% level.

modeled with sample selection assuming that individuals in the primary sample observe the unobservable characteristics that affect the reference sample's selection into their education level. Specification (c) estimates the model without the first-stage sample selection equation. A second modeling issue is the assumption that individuals form their income expectations by observing others. An alternative, which is what Willis and Rosen (1979) used, is that individuals


have rational expectations. To estimate the model under rational expectations (specification [d]), the whole model is estimated only for the older sample, which had been the reference sample.34 The coefficient estimates from the tobit estimation of this older sample are used to create predicted income in both education states for each individual in the older sample. These predicted income terms are then included in the final-stage high school graduation probit, which is estimated over this older sample. The final model assumption is that background factors affect income differently at each age, and thus there are 14 income equations estimated for each education outcome. If the coefficients do not vary by age, then this will add unnecessary variance to the predicted income stream and result in a noisy measure of predicted income. To estimate specification (e), the net present value of income (using a 3% discount rate) is calculated for each individual in the reference sample. There are only two income regressions: net present value of income for the sample that graduates and for the sample that does not graduate. The log of the difference of the fitted values of these two income streams is then included in the high school graduation estimation for the primary sample. There are some differences in marginal effects, but the results are very similar across all these model specifications. In fact, changing these model assumptions results in a higher t-statistic for the predicted income term.
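As a rough illustration of the net present value step in specification (e), the sketch below discounts a predicted income stream over ages 19 to 32 at 3% and forms the income term as a difference in logs, following the variable label used in Appendix C. The function names, the choice of the first age as the discounting base, and the income figures are all assumptions made for the example; they are not taken from the paper's data or code.

```python
import math


def net_present_value(income_by_age, rate=0.03):
    """Discount an income stream back to its first age.

    Using the first observed age as the base year is an assumption of
    this sketch; the paper does not state the base age.
    """
    ages = sorted(income_by_age)
    base_age = ages[0]
    return sum(
        income_by_age[age] / (1.0 + rate) ** (age - base_age) for age in ages
    )


def log_income_gap(npv_graduate, npv_dropout):
    """Difference in logs of the two predicted net present values."""
    return math.log(npv_graduate) - math.log(npv_dropout)


# Hypothetical predicted incomes for the 14 ages from 19 to 32.
income_if_graduate = {age: 20000.0 + 1000.0 * (age - 19) for age in range(19, 33)}
income_if_dropout = {age: 15000.0 + 500.0 * (age - 19) for age in range(19, 33)}

gap = log_income_gap(
    net_present_value(income_if_graduate),
    net_present_value(income_if_dropout),
)
```

Rerunning the same calculation with `rate=0.01` or `rate=0.10` corresponds to the discount rate checks reported in the sensitivity table.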

Exclusion restrictions and functional form identify the model, and the justification for these assumptions is discussed earlier. The first identification sensitivity analysis is to include in the high school graduation equation the exclusion variables from the income equations. If these exclusion restrictions are what are identifying the model, then, when they are included, it is expected that the effect of the income term will not be precisely estimated because of high multicollinearity. Panel (f) shows that this is precisely what happens. Without the exclusion restrictions, the predicted income term is no longer statistically significant. The second identification sensitivity analysis is to include in the income regression the region where the reference sample lives each year during the ages 19 to 32 rather than the region where the individual lived during the ages 12 to 15. Region measured at each age is expected to be more closely related to future income, and thus to be a better exclusion restriction, but in forming the predicted income term for the primary sample, this specification implies that the individuals in the primary sample assume that they will not move to another region. The predicted income results are stronger when region is measured at each age but robust to either specification. The final identification assumption to be examined is that the difference of the log of income is what is important in the education decision. Since part of the model identification is based on the functional form assumption, and there is no clear a priori indication of what the best form should be, it is important that this functional form is not what is driving the results. Rather than taking the difference in the log, specification (h) uses the ratio of predicted income with graduation divided by predicted income without graduation. The smaller marginal effect is a result of different units of measurement; a 10% increase in expected income with graduating increases the likelihood of graduating by 0.007 when the ratio of incomes is used compared to 0.008 for the log of the difference. The model is robust to either functional form assumption.
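The point about units can be made explicit with a short derivation. The symbols below ($Y_g$, $Y_d$, $\Delta$, $R$, $p$) are introduced here for illustration and are not the paper's notation; this is a sketch of why the two marginal effects are not directly comparable, not a reproduction of the author's calculation.

```latex
% Y_g, Y_d: predicted income with and without graduation (illustrative symbols)
\Delta = \ln Y_g - \ln Y_d, \qquad R = \frac{Y_g}{Y_d} = e^{\Delta}

% A proportional increase p in income with graduation:
Y_g \to (1+p)\,Y_g
\;\Longrightarrow\;
\Delta \to \Delta + \ln(1+p), \qquad R \to R + pR

% Approximate change in the graduation probability under each form,
% where ME denotes the marginal effect of the respective income term:
\text{log difference: } \mathrm{ME}_{\Delta}\,\ln(1+p), \qquad
\text{ratio: } \mathrm{ME}_{R}\; pR
```

With $p = 0.10$, the log-difference regressor moves by $\ln(1.1) \approx 0.095$ whatever the income levels, while the ratio regressor moves by $0.1R$, which depends on the level of the ratio. The two marginal effects therefore have to be rescaled before they are compared.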

The model is estimated under the assumption that individuals form their expectations by observing an older set of individuals who share similar background characteristics. Unfortunately, as was pointed out in the estimation methodology and issues section, because the PSID has data for only a limited number of years, a trade-off must be made between the number of years for which childhood background characteristics are observed and the latest age at which

34 The primary sample cannot be used because I do not observe the younger individual's actual income at later ages.


income is observed. In order to accommodate the trade-off that must be made between observable childhood background characteristics of the reference group and observed income at later ages, the reference sample is relatively close in age to the primary sample. Ideally, we would be able to use a much older reference sample yet still be able to observe their childhood experience. Specification (i) limits the primary sample to only the youngest (ages 0 to 3 in 1968) and the reference sample to only the oldest (ages 10 to 12 in 1968). This severely reduces the sample size. Specification (j) uses the entire sample but includes the income stream only to age 28, eliminating the later ages that the primary sample may not actually observe. The results are very similar to the baseline, but when the sample size is limited, the income term is not statistically significant in the full-model specification. A final indication that the specification is robust to the age of the reference group is that when the model is estimated using rational expectations (specification [d]), for which there is not a reference sample, the results hold.

The final two panels of the sensitivity analysis are for the discount rate and the treatment of individuals who have completed their GED. The baseline discount rate is assumed to be 3%, and specifications (k) and (l) indicate that the results are consistent with a discount rate of 1% or 10%. In the baseline sample, someone who receives a GED is classified as having graduated from high school. However, there is conflicting evidence on whether GED recipients should be classified as graduates or nongraduates (see Cameron and Heckman 1993; Murnane, Willett, and Boudett 1995). Therefore, in the last panel (specification [m]) the model is estimated without GED recipients.

7. Conclusion

This paper develops a framework for examining educational attainment that incorporates utility-maximizing individuals choosing schooling level in response to both the economic returns to schooling and the utility derived from schooling. The model allows family, neighborhood, and school quality to independently affect educational attainment both through changing an individual's expected income conditional on education and through directly affecting the utility an individual receives from schooling. However, the amount of education in which an individual invests is a utility-maximizing choice of the individual given the conditional expected incomes and expected utility from schooling.

The human capital model implies that students respond to economic incentives. However, empirically there has been very little evidence of whether this is true. The estimation of the model in this paper finds evidence that students do make rational economic choices in their schooling. Even in the basic education choice of high school graduation, the greater the returns to graduating, the greater the likelihood an individual will graduate. In addition to this confirmation of the human capital model, the finding that youths respond rationally to economic incentives in education choices provides more general empirical support for one of the most basic economic assumptions: that individuals are rational decision makers.

The paper also finds, though, that economic incentives are not the only things that matter. There are aspects of both the human capital framework and the education production function framework that are important. Individuals do make their schooling choice in response to the economic returns to schooling. However, most of the effect of family, neighborhood, and school characteristics comes not through changing the returns to schooling but rather because these factors affect the utility associated with being schooled.

Finally, the paper finds that school factors are important determinants of educational attainment. Increasing school expenditures not only directly increases the educational attainment of the student as posited by studies using the education production function but also increases the expected future income whether the individual graduates or not. These findings for income confirm those of Altonji and Dunn (1995) that school spending increases the level of income but does not affect the returns to education. From a policy perspective, increasing school expenditures would increase not only the educational attainment of students but also their future incomes.

Appendix A: Algorithm for Matching Individuals to Schools

The school data from the Common Core of Data are merged to the individual-level data from the Michigan Panel Study of Income Dynamics (PSID) for each of ages 14 to 16 based on the ZIP code where the individual lived at each age. The criteria for a match are outlined below. At each age, the merge uses the earliest criterion that the individual meets. The number in parentheses is the percentage of the sample that was matched on the basis of that criterion.

Criteria

1. If there is one high school in the individual's ZIP code, then the individual is assigned the characteristics of that school (47.7%).

2. If there are multiple high schools in the individual's ZIP code and all high schools are in the same district, then the individual is assigned the mean characteristics of the high schools in the ZIP code. If there are multiple high schools in the individual's ZIP code that do not belong to the same district, attempts were made to identify which high school is correct on the basis of census place (city), and the individual is assigned the characteristics of this school; if this is not possible, the individual is assigned the mean characteristics of the high schools in the ZIP code (31.2%).

3. If there is one elementary or middle school in the individual's ZIP code and the school is a member of a district that contains at least one high school, then the individual is assigned the mean characteristics of the high school(s) in the district associated with the elementary school (18.2%).

4. If there are multiple elementary or middle schools in the individual's ZIP code and these schools belong to different school districts, attempts were made to identify which school district is correct on the basis of census place (city), and the individual is assigned the mean characteristics of the high schools in the district associated with the school. If this is not possible, the individual is assigned the mean of the high schools associated with the school districts (2.0%).

5. The individual is matched to the school districts in the county of residence (0.9%).
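The cascade above can be sketched as an ordered series of checks that returns as soon as one criterion is met. The `School` record, its field names, the use of district expenditure as the single matched characteristic, and the reading of criterion 5 as a county-wide mean over high schools are all assumptions made for this illustration; the paper does not describe its data layout or code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class School:
    # Hypothetical record layout; field names are illustrative only.
    zip_code: str
    district: str
    county: str
    place: str          # census place (city)
    level: str          # "high", "middle", or "elementary"
    expenditure: float  # stand-in for any school characteristic


def mean_expenditure(schools):
    return mean(s.expenditure for s in schools)


def match_school(zip_code, county, place, schools):
    """Return (criterion number, matched characteristic), or None if unmatched."""
    highs = [s for s in schools if s.level == "high"]
    highs_in_zip = [s for s in highs if s.zip_code == zip_code]

    # Criterion 1: exactly one high school in the ZIP code.
    if len(highs_in_zip) == 1:
        return 1, highs_in_zip[0].expenditure

    # Criterion 2: several high schools in the ZIP code.
    if len(highs_in_zip) > 1:
        if len({s.district for s in highs_in_zip}) > 1:
            # Different districts: try to pick one school by census place.
            same_place = [s for s in highs_in_zip if s.place == place]
            if len(same_place) == 1:
                return 2, same_place[0].expenditure
        return 2, mean_expenditure(highs_in_zip)

    lower = [s for s in schools if s.level != "high" and s.zip_code == zip_code]
    districts = {s.district for s in lower}

    def district_highs(chosen):
        return [s for s in highs if s.district in chosen]

    # Criterion 3: one elementary or middle school whose district has a high school.
    if len(lower) == 1 and district_highs(districts):
        return 3, mean_expenditure(district_highs(districts))

    # Criterion 4: several lower schools; use census place to pick a district,
    # otherwise average over the high schools of all candidate districts.
    if len(lower) > 1:
        placed = {s.district for s in lower if s.place == place}
        chosen = placed if len(placed) == 1 else districts
        if district_highs(chosen):
            return 4, mean_expenditure(district_highs(chosen))

    # Criterion 5: fall back to the county of residence.
    highs_in_county = [s for s in highs if s.county == county]
    if highs_in_county:
        return 5, mean_expenditure(highs_in_county)
    return None
```

Returning the criterion number alongside the matched value makes it straightforward to tabulate the share of the sample matched at each step, as reported in the parentheses above.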


Appendix B: Selection Equation

                                                      Coefficient  Standard
                                                      Estimate     Error

Intercept                                             0.746        (0.555)
African-American = 1                                  0.463**      (0.192)
Female = 1                                            0.462***     (0.143)
African-American*female = 1                           -0.183       (0.186)
Mother high school graduate = 1                       0.524***     (0.124)
Mother attended college = 1                           4.478        (54.01)
Mother college graduate = 1                           3.980        (53.58)
Father high school graduate = 1                       0.294*       (0.157)
Father attended college = 1                           0.852*       (0.436)
Father college graduate = 1                           0.706*       (0.416)
Both parent's education not available = 1             0.414**      (0.162)
Average income/needs ratio                            0.180**      (0.071)
Percentage years income/needs < 1                     -0.218       (0.181)
Average number of siblings                            -0.023       (0.025)
First born = 1                                        0.190        (0.138)
Percentage years in SMSA                              -0.036       (0.136)
Percentage years head disabled                        -0.109       (0.128)
Percentage years single-parent family                 0.172        (0.162)
Percentage years moved location                       -0.838***    (0.222)
Percentage years mother worked                        -0.052       (0.122)
Percentage years in West                              0.050        (0.175)
Percentage years in Northeast                         0.218        (0.176)
Percentage years in South                             0.067        (0.140)
Percentage neighborhood white                         -0.003       (0.003)
Percentage neighborhood mother-only families          -0.022***    (0.008)
Neighborhood median income (*1000)                    -0.005       (0.010)
Neighborhood percentage high-status occupations       0.010        (0.009)
School student/teacher ratio                          -0.009       (0.012)
District expenditure/student (*1000)                  0.098        (0.097)
Log likelihood                                        -484.08
No. of observations                                   1540

* Statistically significant at the 10% level.
** Statistically significant at the 5% level.
*** Statistically significant at the 1% level.


Appendix C: Alternative Specifications

                                                      Full Model   No Selection Rational     NPV Income
                                                                                Expectations

Intercept                                             1.197**      1.090*       0.147        1.018*
                                                      (0.557)      (0.582)      (0.585)      (0.571)
                                                      [0.193]      [0.176]      [0.013]      [0.163]

Income returns to education
ln(income if graduated) - ln(income if dropped out)   0.588**      0.700**      0.881**      0.642**
                                                      (0.281)      (0.327)      (0.355)      (0.258)
                                                      [0.095]      [0.113]      [0.076]      [0.103]

Family/individual characteristics
African-American = 1                                  -0.070       -0.020       0.064        -0.009
                                                      (0.229)      (0.214)      (0.252)      (0.221)
                                                      [0.011]      [-0.003]     [0.006]      [-0.015]
Female = 1                                            -0.171       -0.135       0.107        -0.227
                                                      (0.170)      (0.158)      (0.208)      (0.173)
                                                      [-0.028]     [-0.022]     [0.009]      [-0.036]
African-American*female = 1                           0.446**      0.423**      0.077        0.486**
                                                      (0.192)      (0.187)      (0.221)      (0.194)
                                                      [0.072]      [0.068]      [0.007]      [0.078]
Mother high school graduate = 1                       0.525***     0.603***     0.570***     0.510***
                                                      (0.114)      (0.126)      (0.128)      (0.113)
                                                      [0.085]      [0.097]      [0.049]      [0.082]
Mother attended college = 1                           0.751***     0.823***     4.402        0.752***
                                                      (0.241)      (0.248)      (39.160)     (0.241)
                                                      [0.121]      [0.133]      [0.378]      [0.121]
Father high school graduate = 1                       0.090        0.133        0.043        0.002
                                                      (0.162)      (0.151)      (0.189)      (0.171)
                                                      [0.014]      [0.022]      [0.004]      [0.002]
Father attended college = 1                           0.059        0.074        0.418        0.004
                                                      (0.244)      (0.239)      (0.346)      (0.237)
                                                      [0.010]      [0.012]      [0.036]      [0.006]
Both parent's education not available = 1             0.073        0.070        0.157***     0.007
                                                      (0.125)      (0.125)      (0.161)      (0.125)
                                                      [0.012]      [0.011]      [0.044]      [0.012]
Average income/needs ratio                            0.059        0.069        0.105        0.005
                                                      (0.057)      (0.056)      (0.082)      (0.006)
                                                      [0.010]      [0.011]      [0.009]      [0.009]
Percentage years income/needs < 1                     0.013        -0.004       -0.107       0.006
                                                      (0.205)      (0.202)      (0.218)      (0.208)
                                                      [0.002]      [-0.001]     [-0.009]     [0.009]
Average number of siblings                            -0.048       -0.048       0.000        -0.005*
                                                      (0.031)      (0.031)      (0.027)      (0.003)
                                                      [-0.008]     [-0.008]     [0.000]      [0.009]
First born = 1                                        0.203        0.231"*      0.020        0.161
                                                      (0.124)      (0.120)      (0.150)      (0.128)
                                                      [0.033]      [0.037]      [0.002]      [0.026]
Percentage years in SMSA                              0.233        0.253*       0.234        0.234*
                                                      (0.147)      (0.151)      (0.172)      (0.142)
                                                      [0.038]      [0.041]      [0.020]      [0.038]


Appendix C (Continued)

                                                      Full Model   No Selection Rational     NPV Income
                                                                                Expectations

Percentage years head disabled                        -0.460***    -0.501***    -0.016       -0.470***
                                                      (0.145)      (0.148)      (0.143)      (0.146)
                                                      [-0.074]     [-0.081]     [-0.001]     [-0.075]
Percentage years single-parent family                 -0.227       -0.194       0.028        -0.240*
                                                      (0.141)      (0.140)      (0.305)      (0.141)
                                                      [-0.037]     [-0.031]     [0.002]      [-0.039]
Percentage years moved location                       -1.408***    -1.406***    -1.185***    -1.412***
                                                      (0.224)      (0.224)      (0.280)      (0.224)
                                                      [-0.228]     [-0.227]     [-0.102]     [-0.227]
Percentage years mother worked                        0.397***     0.406***     0.160        0.388***
                                                      (0.137)      (0.138)      (0.145)      (0.134)
                                                      [0.064]      [0.066]      [0.014]      [0.062]

Neighborhood characteristics
Percentage neighborhood white                         -0.000       0.000        0.000        0.000
                                                      (0.003)      (0.003)      (0.003)      (0.003)
                                                      [0.000]      [0.000]      [0.000]      [0.000]
Percentage neighborhood mother-only families          -0.015**     -0.017**     -0.017**     -0.016**
                                                      (0.007)      (0.007)      (0.008)      (0.007)
                                                      [-0.002]     [-0.003]     [-0.001]     [-0.003]
Neighborhood median income (*1000)                    -0.037***    -0.039***    -0.015       -0.036***
                                                      (0.008)      (0.008)      (0.010)      (0.007)
                                                      [-0.006]     [-0.006]     [-0.13]      [-0.006]
Neighborhood percentage high-status occupations       0.039***     0.043***     0.019**      0.040***
                                                      (0.009)      (0.010)      (0.010)      (0.009)
                                                      [0.006]      [0.007]      [0.002]      [0.006]

School characteristics
School student/teacher ratio                          -0.020**     -0.021**     -0.014       -0.016*
                                                      (0.010)      (0.010)      (0.011)      (0.010)
                                                      [-0.003]     [-0.003]     [-0.001]     [-0.003]
District expenditure/student (*1000)                  0.241***     0.278***     0.208**      0.278***
                                                      (0.083)      (0.090)      (0.098)      (0.087)
                                                      [0.039]      [0.045]      [0.018]      [0.045]

Log likelihood                                        -577.88      -577.79      -480.75      -576.95

Standard errors in parentheses. Marginal effects evaluated at the sample mean in brackets.

* Statistically significant at the 10% level.
** Statistically significant at the 5% level.
*** Statistically significant at the 1% level.
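For readers less familiar with probit output, the bracketed marginal effects can be related to the coefficients with a small calculation. In a probit, the marginal effect of a continuous regressor at the sample mean is the coefficient multiplied by the standard normal density evaluated at the mean index. The sketch below uses that textbook relationship together with three coefficient and marginal-effect pairs copied from the Full Model column; the function names are illustrative, and treating every regressor (including the indicator variables) this way is an assumption of the sketch, not a statement of how the author computed the table.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(coefficient, index_at_mean):
    """Marginal effect at the mean: density at the mean index times the coefficient."""
    return normal_pdf(index_at_mean) * coefficient


# (coefficient, reported marginal effect) pairs from the Full Model column.
full_model = {
    "log income difference": (0.588, 0.095),
    "mother high school graduate": (0.525, 0.085),
    "district expenditure/student": (0.241, 0.039),
}

# If one density scales every coefficient, these ratios should be nearly equal.
implied_density = {
    name: effect / coefficient
    for name, (coefficient, effect) in full_model.items()
}
```

For these three rows the ratios all come out close to 0.16, which is what a single scaling factor applied across the column would produce.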

References

Aaronson, D. 1998. Using sibling data to estimate the impact of neighborhoods on children's educational outcomes. Journal of Human Resources 33(4):915-46.

Altonji, Joseph, and Thomas Dunn. 1995. The effects of school and family characteristics on the returns to education. NBER Working Paper No. 5072.

Astone, Nan Marie, and Sara McLanahan. 1991. Family structure and high school completion: The role of parental practices. American Sociological Review 56:309-20.

Becker, Gary. 1964. Human capital: A theoretical and empirical analysis with special reference to education. New York: Columbia University Press.


Behrman, Jere, Lori G. Kletzer, Michael S. McPherson, and Morton Owen Shapiro. 1992. The college decision: Direct and indirect effects of family background on choice of postsecondary enrollment and quality. Unpublished manuscript.

Bound, John, David Jaeger, and Regina Baker. 1995. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association 90(430):443-50.

Brooks-Gunn, Jeanne, Greg J. Duncan, Pamela Kato Klebanov, and Naomi Sealand. 1993. Do neighborhoods influence child and adolescent development? American Journal of Sociology 99(2):353-95.

Cameron, Stephen V., and James J. Heckman. 1993. The nonequivalence of high school equivalents. Journal of Labor Economics 11(1):1-47.

Card, David, and Alan Krueger. 1992. Does school quality matter? Returns to education and the characteristics of public schools in the United States. Journal of Political Economy 100(1):1-40.

Corcoran, Mary, Roger Gordon, Debra Laren, and Gary Solon. 1992. The association between men's economic status and their family and community origins. Journal of Human Resources 27(4):53-79.

Datcher, Linda. 1982. Effects of community and family background on achievement. Review of Economics and Statistics 64:32-41.

Evans, William, Wallace Oates, and Robert Schwab. 1992. Measuring peer group effects: A study of teenage behavior. Journal of Political Economy 100(2):966-91.

Freeman, Richard. 1971. The market for college-trained manpower. Cambridge, MA: Harvard University Press.

Freeman, Richard. 1975. Supply and salary adjustments to the changing science manpower market. American Economic Review 65:27-39.

Freeman, Richard. 1986. Demand for education. In Handbook of Labor Economics 1, edited by O. Ashenfelter and R. Layard. Amsterdam: North-Holland Press.

Ginther, Donna. 1995. Examining the U.S. white male full-time worker earnings distribution and the effect of schooling on earnings: Applications of nonparametric estimation. Ph.D. diss., University of Wisconsin, Madison.

Hanushek, Eric A. 1986. The economics of schooling: Production and efficiency in public schools. Journal of Economic Literature 24:1141-77.

Hanushek, Eric A. 1991. When school finance reform may not be a good policy. Harvard Journal on Legislation 28:423-56.

Hanushek, Eric A. 1994. Making schools work: Improving performance and controlling costs. Washington, DC: Brookings Institution.

Haveman, Robert, and Barbara Wolfe. 1995. The determinants of children's attainments: A review of methods and findings. Journal of Economic Literature 33:1829-78.

Haveman, Robert, Barbara Wolfe, and James Spaulding. 1991. Childhood events and circumstances influencing high school completion. Demography 28:133-57.

Haveman, Robert, Barbara Wolfe, and Kathryn Wilson. 1998. The role of expectations in youth schooling choices: Do youths respond to economic incentives. Unpublished paper, University of Wisconsin, Madison.

Heckman, James. 1979. Sample selection bias as a specification error. Econometrica 47:153-61.

Hedges, Larry V., Rob Greenwald, and Richard Laine. 1994. Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher 23(3):5-14.

Hill, Martha, and Greg J. Duncan. 1987. Parental family income and the socioeconomic attainment of children. Social Science Research 16:39-73.

Jencks, Christopher, and Susan Mayer. 1990. The social consequences of growing up in a poor neighborhood. In Inner-city poverty in the United States, edited by Lawrence Lynn, Jr., and Mary McGeary. Washington, DC: National Academy Press, pp. 111-86.

Johnston, Jack, and John DiNardo. 1997. Econometric methods. 4th ed. New York: McGraw-Hill.

Madalla, G. S. 1992. Introduction to econometrics. 2nd ed. Englewood Cliffs, NJ: Prentice Hall.

Manski, Charles F. 1993a. Adolescent econometricians: How do youth infer the returns to schooling? In Studies of supply and demand in higher education, edited by Charles Clotfelter and Michael Rothschild. Chicago: University of Chicago Press.

Manski, Charles F. 1993b. Dynamic choice in social settings. Journal of Econometrics 58:121-36.

Manski, Charles F., Gary Sandefur, Sara McLanahan, and Daniel Powers. 1992. Alternative estimates of the effect of family structure during adolescence on high school graduation. Journal of the American Statistical Association 87:25-37.

Manski, Charles F., and David Wise. 1983. College choice in America. Cambridge, MA: Harvard University Press.

Matilla, J. P. 1982. Determinants of male school enrollments: A time series analysis. Review of Economics and Statistics 64:242-51.

Murnane, Richard J., John B. Willett, and Katherine P. Boudett. 1995. Do high school dropouts benefit from obtaining a GED? Educational Evaluation and Policy Analysis 17:133-47.


Murphy, Kevin, and Robert Topel. 1985. Estimation and inference in two-step econometric models. Journal of Business and Economic Statistics 3(4):370-9.

Plotnick, Robert, and Saul Hoffman. 1999. The effect of neighborhood characteristics on young adult outcomes: Alternative estimates. Social Science Quarterly 80(1):1-18.

Willis, Robert, and Sherwin Rosen. 1979. Education and self-selection. Journal of Political Economy 87(5):S7-S35.

Wolfe, Barbara, Robert Haveman, Donna Ginther, and Chong Bum An. 1996. The 'window problem' in studies of children's attainments: A methodological exploration. Journal of the American Statistical Association 91:970-82.
