Confidence Interval: Definition
DefinitionFor 95% of samples, the confidence interval will contain the population average
For a particular sampleThere is an 95% chance that the confidence interval contains the population average
Confidence Intervals Answer Questions Like…
What does my sample of mile-run times tell me about the average 2nd grader in Seattle?
Is the average 2nd grader in Seattle likely to run a faster mile than me?
Hypothesis Testing: Test Statistic
A value calculated from your data
This value always follows the same distribution, regardless of the distribution of your data*
Hypothesis Testing: Procedure
1. State the hypothesis and the null hypothesis
2. Choose an appropriate test statisticThis usually follows a well-known distribution
3. Choose a threshold probabilityUsually small, we will use 0.005 (0.5%)
Hypothesis Testing: Procedure (continued)
4. Calculate the p-valueThe probability under the null hypothesis of sampling a test statisticat least as extreme as what we observed.
5. Accept or reject the hypothesisAccept: p < 0.005
Reject: p > 0.005
Student’s t-test
Test statistic follows Student’s t-distribution
We will use a two-sample location test
Tests the null hypothesis that the means of two populations are equal
Hypothesis Testing Can Answer Questions Like…
Are CrossFit Games athletes stronger on average in 2018 than they were in 2007? (t-test)
Are observations of two groups independent of one another? (Chi-squared test)
Is my sample drawn from a normally distributed population? (Shapiro-Wilk test)
Trend Lines: OLS Questions
1. Do I suspect there is a relationship between two variables? What do I suspect that relationship is?
2. Do the residuals have mean = 0? Do they appear unrelated to the independent variable?
3. Are the residuals are unlikely to be correlated with one another?
4. Does the spread of the residuals look roughly the same with changes in the independent variable?
Trend Lines Answer Questions Like…
What is the relationship between profit and CEO compensation?
When wind speed changes, how does windmill power output change?
Does compensation change in a meaningful way when age changes?
Forecasting: Model Quality
We will consider only Mean Absolute Scaled Error (MASE)
MASE compares the error of your model with the error of the naïve forecast
MASE is typically between 0 (good) and 1 (bad)
Forecasting: Naïve Forecast
Forecast values copied from the last observed value.
For seasonal forecasts, values are copied from the last observed season.
Forecasting: Unexpected and Poor Forecast
1. Does it look like there is a structural break in my data?
2. Is there a lot of short-scale variation at the current date level?
Forecasts Answer Questions Like…
How many visitors to my page can I expect in the future, given data on past visits?
Based on past data, what will my inventory be in the future?
How is the value of my collection likely to change in the future?