Date posted: 27-Jan-2015
SOMDEEP SEN; Business Analyst: Trimax Analytics
(e) [email protected]; (p): 09748229123
LinkedIn: http://linkd.in/1ifqs3x
The Data set contains:
Performance of 400 elementary schools from the California Department of Education
Factors like class-size, parent education, student performance, etc.
Objectives:
To find the factors having major influence on the academic performance
To predict the academic performance of a school using those factors
Note: Factors have been chosen based on statistical significance
Factors                                        Impact
English language learners (ELL)                Negative
Percentage first year in school (Mobility)     Negative
Parent grad school (grad_sch)                  Positive
Percentage full credential (Full)              Positive
Average class size 4-6 (ACS_46)                Positive
Variable    Label                        Parameter
Intercept   Intercept                    459.71
ell         english language learners     -2.90
mobility    pct 1st year in school        -3.11
acs_46      avg class size 4-6             3.69
grad_sch    parent grad school             3.38
full        pct full credential            2.33
Regression Equation
API00 = 459.71 - 2.90*ell - 3.11*mobility + 3.69*acs_46 + 3.38*grad_sch + 2.33*full
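As a quick illustration of how the regression equation turns a school's factor values into a predicted score, here is a minimal Python sketch (Python is used here as a stand-in; it is not the author's SAS code). The coefficients are copied from the parameter table; the function name `predict_api00` and the example school's values are hypothetical.

```python
# Coefficients copied from the parameter table in the slides.
COEFFICIENTS = {
    "intercept": 459.71,
    "ell": -2.90,
    "mobility": -3.11,
    "acs_46": 3.69,
    "grad_sch": 3.38,
    "full": 2.33,
}


def predict_api00(ell, mobility, acs_46, grad_sch, full):
    """Return the predicted API00 score for one school (hypothetical helper)."""
    values = {
        "ell": ell,
        "mobility": mobility,
        "acs_46": acs_46,
        "grad_sch": grad_sch,
        "full": full,
    }
    # Intercept plus the sum of coefficient * factor value.
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in values.items()
    )


# Invented factor values for illustration only; not taken from the data set.
example = predict_api00(ell=20, mobility=15, acs_46=30, grad_sch=10, full=90)
print(round(example, 2))
```

Because every term is linear, raising ell by one point lowers the prediction by 2.90 regardless of the other values.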
To view the detailed SAS code, please visit the following link: http://bit.ly/1c08pGE
• ELLs are one of the fastest growing populations in the public schools
• The number of ELLs in CA is large due to its geographic location & economic significance
• ELL students come from different backgrounds & face multiple challenges
• But the main challenge continues to be communication
Recommendations:
• Provide special coaching to ELLs to ensure that they master English
• Special coaching should be done before they get tested in English in core content areas
• Ensure that all ELL students receive the full range of services
• Improve teacher training opportunities so teachers can understand the needs of ELLs
• Mobility refers to students making non-promotional school changes
• California students, like students in the rest of the U.S., are highly mobile
• Mobility happens for the following reasons:
– Families changing their residences
– School changes initiated by students especially in California
– School changes initiated by schools especially in California
Recommendations:
Families should:
• Attempt to resolve problems at school before initiating transfer
• Make changes between semesters or at the end of the school year
Schools should:
• Counsel students to remain in the school if at all possible
• Prepare in advance for incoming transfers
• Assess the past enrollment history of incoming students
• Assess the number of previous school changes
• Facilitate the transition of new students as soon as they arrive
• Research shows US students spend less than 15% of their time in school
• Therefore parent involvement is as important as the time spent in school
• Checking homework & attending school meetings influence student performance
• Educated parents find it easier to get involved than others
Recommendations:
• Aim to keep the share of parents with school graduation between 65% and 70%
• Conduct parent interviews during the admission of students
• Also take initiatives to increase parent engagement
• But schools shouldn’t limit a parent’s involvement based on socio-economic status
• Experienced teachers are more effective at raising student performance
• Experienced teachers are also more likely to be fully credentialed
• Hence teacher retention could be instrumental in performance improvement
• Shortage of fully credentialed teachers is a prime reason for low performance & mobility
• Many assume that financial incentive is the silver bullet; but that is only partially true
Recommendations:
• Financial incentives can make schools more attractive to more qualified teachers
• Money is necessary, but clearly not sufficient
• Teachers often leave due to poor working conditions & lack of administrative support
• Schools should recruit & develop administrators who can draw on the expertise of teachers
• Improvement in avg. class size in grades 4-6 (ACS_46) tends to improve performance
• ACS_46 can be improved when:
– Mobility is low
– Promotion of students from one grade to another is high
• So, it can be said that ACS_46 is an indicator of the overall academic performance
Recommendation:
• Focus should be on all the recommendations mentioned previously to improve ACS_46
• Outliers were identified using proc univariate & treated accordingly
[Figures: before the treatment; after the treatment]
• The overall F test is done to check the overall significance of the model
• H0: none of the independent variables, collectively or individually, influences the dependent variable
• H1: at least one independent variable influences the dependent variable
• If P value > α: H0 can’t be rejected & hence the model has no explanatory value
• If P value < α: H0 is rejected & hence at least one independent variable influences the dependent variable
• In this case P value < α & hence at least one independent variable influences the dependent variable
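The statistic behind this overall significance check can be sketched in a few lines of Python (used as a stand-in for the SAS output). The formula F = (R²/k) / ((1 - R²)/(n - k - 1)) is the standard one for a regression with an intercept. The 400 schools and 5 predictors come from the slides, but the R-square of 0.80 is an invented value, and the sample size after outlier treatment may differ. Converting F to a P value needs the F distribution, which is left out here.

```python
def overall_f_statistic(r_squared, n_obs, n_predictors):
    """F statistic for H0: all slope coefficients are zero (hypothetical helper)."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


# n_obs and n_predictors follow the slides; r_squared=0.80 is an assumption
# made purely for illustration.
f_value = overall_f_statistic(r_squared=0.80, n_obs=400, n_predictors=5)
print(round(f_value, 1))
```

A large F value corresponds to a small P value, which is the case in which H0 is rejected.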
• Multicollinearity arises when the independent variables are highly interdependent
• Hence their individual impact on the dependent variable can’t be correctly estimated
• The extent of multicollinearity is captured by the variance inflation factor (VIF)
• The final model must have only those variables having VIF at or below the 1.5 to 2 range
• To control multicollinearity, variables with high VIF values are removed
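To make the VIF concrete: for each independent variable, regress it on the remaining independent variables and compute VIF = 1 / (1 - R²). The sketch below does this in plain Python with a small least-squares solver. It is an illustration only, not the SAS VIF output, and the helper names and toy columns are invented.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R-square from regressing y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """VIF for each column: 1 / (1 - R-square of that column on the others)."""
    result = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1.0 / (1.0 - r_squared(target, others)))
    return result


# Toy columns invented for illustration; x1 and x2 are deliberately correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
x3 = [3, 1, 2, 3, 1, 2]
vifs = vif([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

A VIF of 1 means a variable is uncorrelated with the others; the more a variable can be predicted from the rest, the larger its VIF.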
• For the remaining variables, the significance of the corresponding population parameter is tested
• The P values of the variables are checked for significance
• Variables having P value > α are not important for the model
• The final model must have only variables having P value < α & VIF at or below the 1.5 to 2 range
• Heteroscedasticity occurs when the variance of the random error component is not constant
• White’s test is used to check for heteroscedasticity
• Null hypothesis: the model is homoscedastic
• If P value > α: H0 can’t be rejected & hence the model is homoscedastic, & vice versa
• The SPEC option (used alongside VIF in the model statement) runs this check
• Once the model has only the significant variables, the output file is created
• The output file contains the predicted values & the residuals
• The residuals saved in the output file are checked for normality
• This is done using proc univariate with the normal option
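For readers without SAS, a residual normality check can be sketched in plain Python with the Jarque-Bera test. This is a different test from those reported by proc univariate's normal option, so it is only an illustration of the idea. Jarque-Bera compares the skewness and kurtosis of the residuals with those of a normal distribution, and its P value is asymptotic, so it is rough for small samples. The residuals below are invented.

```python
import math


def jarque_bera(residuals):
    """Return (statistic, asymptotic P value) of the Jarque-Bera test."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals.
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Invented residuals for illustration only.
statistic, p_value = jarque_bera([-2, -1, 0, 1, 2])
print(round(statistic, 4), round(p_value, 4))
```

As with the other tests in this deck, P value > α means normality of the residuals can't be rejected.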
• Mean absolute percentage error (MAPE) captures the overall % error of the model
• Ideally MAPE should be within 10%
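MAPE is the average of |actual - predicted| / |actual|, expressed as a percentage. A minimal Python sketch follows; the scores are invented and the function name `mape` is hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Invented actual and predicted API00 scores for three schools.
actual = [600, 700, 800]
predicted = [630, 665, 800]
model_mape = mape(actual, predicted)
print(round(model_mape, 2))
```

Here the three schools are off by 5%, 5% and 0%, so the MAPE is about 3.33%, inside the 10% guideline.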
• R-square captures the proportion of variation in the dependent variable that can be explained by the linear regression
• The higher the value of R-square, the better the explanatory power
• It acts as a measure of goodness of fit of the model
• The R-square value should be at least 65% or 0.65
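R-square can be computed from actual and predicted values as 1 - SS_residual / SS_total. The Python sketch below uses invented scores; the function name `r_square` is hypothetical.

```python
def r_square(actual, predicted):
    """Proportion of variation in actual explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented actual and predicted API00 scores for three schools.
actual = [600, 700, 800]
predicted = [630, 665, 800]
fit = r_square(actual, predicted)
print(round(fit, 5))
```

With these numbers SS_residual is 2125 and SS_total is 20000, giving an R-square of 0.89375, above the 0.65 guideline.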
API00 (cell E3) = C3 + C4*D4 + C5*D5 + C6*D6 + C7*D7 + C8*D8
OR
API00 = 459.71 - 2.90*ell - 3.11*mobility + 3.69*acs_46 + 3.38*grad_sch + 2.33*full