Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or...

Introduction to Statistics for the Social Sciences

SBS200, COMM200, GEOG200, PA200, POL200, or SOC200

Lecture Section 001, Fall 2015

Room 150 Harvill Building

10:00 - 10:50 Mondays, Wednesdays & Fridays

http://courses.eller.arizona.edu/mgmt/delaney/d15s_database_weekone_screenshot.xlsx

Everyone will want to be enrolled in one of the lab sessions

Labs continue next week

Please re-register your clicker: http://student.turningtechnologies.com/

By the end of lecture today (10/9/15)

Law of Large Numbers

Central Limit Theorem

Before next exam (October 16th)

Please read chapters 1 - 8 in OpenStax textbook

Please read Chapters 10, 11, 12 and 14 in Plous

Chapter 10: The Representativeness Heuristic

Chapter 11: The Availability Heuristic

Chapter 12: Probability and Risk

Chapter 14: The Perception of Randomness

Schedule of readings

On class website: Please print and complete homework worksheet #11. Due Monday, October 12th

Dan Gilbert Reading and Law of Large Numbers

Homework

Review of Homework Worksheet

just in case of questions

Homework review

Based on a priori probability: all options equally likely; not based on previous experience or data

Based on expert opinion: we don't have previous data for these two companies merging together

2/5 = .40

Based on frequency data (percent of rockets that successfully launched)

Homework review

Based on a priori probability: all options equally likely; not based on previous experience or data

Based on frequency data (percent of pages that are "fake")

30/100 = .30

Based on frequency data (percent of times at bat that resulted in hits)

Homework review

5/50 = .10

Based on frequency data (percent of students who chose to be Economics majors)
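The frequency-based answers above all use the same arithmetic: the number of observed successes divided by the number of trials. A minimal sketch follows, reading the three fractions on these slides as 2/5, 30/100 and 5/50; the function name `relative_frequency` is made up for illustration.

```python
def relative_frequency(successes: int, trials: int) -> float:
    """Empirical probability: observed successes divided by total trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


# The three fractions as read from the homework review slides.
for successes, trials in [(2, 5), (30, 100), (5, 50)]:
    p = relative_frequency(successes, trials)
    print(f"{successes}/{trials} = {p:.2f}")
```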

Area between 44 and 55

z = (44 - 50) / 4 = -1.5

z = (55 - 50) / 4 = +1.25

z of 1.5 = area of .4332

z of 1.25 = area of .3944

.4332 + .3944 = .8276

Area above 55

z = (55 - 50) / 4 = +1.25

z of 1.25 = area of .3944

.5000 - .3944 = .1056

Area between 52 and 55

z = (52 - 50) / 4 = +.5

z = (55 - 50) / 4 = +1.25

z of .5 = area of .1915

z of 1.25 = area of .3944

.3944 - .1915 = .2029
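As a cross-check on the table lookups, the same three areas can be computed directly from the normal curve. This is only a sketch: it takes the mean of 50 and standard deviation of 4 from the z-score arithmetic above and uses Python's standard-library `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

# Mean and standard deviation taken from the z-score arithmetic above.
scores = NormalDist(mu=50, sigma=4)


def z_score(x, mu=50.0, sigma=4.0):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


# cdf(x) is the area under the curve to the left of x.
p_between_44_55 = scores.cdf(55) - scores.cdf(44)
p_above_55 = 1 - scores.cdf(55)
p_between_52_55 = scores.cdf(55) - scores.cdf(52)

print(f"z(44) = {z_score(44):+.2f}, z(55) = {z_score(55):+.2f}")
print(f"P(44 < X < 55) = {p_between_44_55:.4f}")
print(f"P(X > 55)      = {p_above_55:.4f}")
print(f"P(52 < X < 55) = {p_between_52_55:.4f}")
```

The results agree with the table-based answers to about three decimal places; small differences come from rounding in the printed table.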

Homework review

Area above 3,000

z = (3000 - 2708) / 650 = 0.45

z of 0.45 = area of .1736

.5000 - .1736 = .3264

Area between 3,000 and 3,500

z = (3000 - 2708) / 650 = 0.45

z = (3500 - 2708) / 650 = 1.22

z of 0.45 = area of .1736

z of 1.22 = area of .3888

.3888 - .1736 = .2152

Area between 2,500 and 3,500

z = (2500 - 2708) / 650 = -.32

z = (3500 - 2708) / 650 = 1.22

z of -0.32 = area of .1255

z of 1.22 = area of .3888

.3888 + .1255 = .5143
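The table-lookup procedure used in these problems can be sketched in code: round z to two decimals, look up the area between the mean and z, then subtract areas on the same side of the mean or add areas on opposite sides. The helper names `z_rounded` and `area_mean_to_z` are hypothetical, and the "table" is emulated with `statistics.NormalDist` rather than a printed z table.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_rounded(x, mu, sigma):
    """z-score rounded to two decimals, as one would before using a z table."""
    return round((x - mu) / sigma, 2)


def area_mean_to_z(z):
    """Area between the mean and z (four decimals), like a z-table entry."""
    return round(STANDARD_NORMAL.cdf(abs(z)) - 0.5, 4)


mu, sigma = 2708, 650  # values taken from the arithmetic above
z_2500 = z_rounded(2500, mu, sigma)
z_3000 = z_rounded(3000, mu, sigma)
z_3500 = z_rounded(3500, mu, sigma)

# Tail beyond a score: half the curve minus the mean-to-z area.
above_3000 = 0.5 - area_mean_to_z(z_3000)
# Both scores above the mean: subtract the smaller area from the larger.
between_3000_3500 = area_mean_to_z(z_3500) - area_mean_to_z(z_3000)
# Scores on opposite sides of the mean: add the two areas.
between_2500_3500 = area_mean_to_z(z_3500) + area_mean_to_z(z_2500)

print(z_2500, z_3000, z_3500)
print(round(above_3000, 4), round(between_3000_3500, 4), round(between_2500_3500, 4))
```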

Homework review

Area above 20

z = (20 - 15) / 3.5 = 1.43

z of 1.43 = area of .4236

.5000 - .4236 = .0764

Area between 10 and 12

z = (10 - 15) / 3.5 = -1.43

z = (12 - 15) / 3.5 = -0.86

z of -1.43 = area of .4236

z of -.86 = area of .3051

.4236 - .3051 = .1185

Area below 20

z = (20 - 15) / 3.5 = 1.43

z of 1.43 = area of .4236

.5000 + .4236 = .9236

Comments on Dan Gilbert Reading

Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability.

As the number of observations (n) increases, or the number of times the experiment is performed increases, the estimate will become more accurate.

Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true signal (e.g. the mean).

As the number of observations (n) increases, or the number of times the experiment is performed increases, the signal will become clearer (static cancels out).

http://www.youtube.com/watch?v=ne6tB2KiZuk

With only a few people, any little error is noticed (it becomes exaggerated when we look at the whole group).

With many people, any little error is corrected (it becomes minimized when we look at the whole group).
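The law of large numbers can be illustrated with a short simulation. The sketch below assumes a fair coin (true probability of heads = .5) and a fixed random seed so the run is repeatable; neither comes from the lecture itself.

```python
import random


def running_proportion_of_heads(n_flips, seed=0):
    """Flip a simulated fair coin n_flips times.

    Returns a list whose i-th entry is the proportion of heads
    observed in the first i + 1 flips.
    """
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        if rng.random() < 0.5:  # assumed true probability of heads
            heads += 1
        proportions.append(heads / i)
    return proportions


props = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    estimate = props[n - 1]
    print(f"n = {n:>7}: proportion of heads = {estimate:.4f}, "
          f"distance from .5 = {abs(estimate - 0.5):.4f}")
```

With only a few flips the proportion can sit far from .5; with many flips the chance errors largely cancel out. Any single run can still wobble, so the distance from .5 shrinks on the whole rather than at every step.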

Sampling distributions of sample means versus frequency distributions of individual scores

[Figure: frequency histogram of individual scores; two of the plotted scores are labeled Melvin and Eugene]

Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population

Frequency distributions of individual scores
• derived empirically
• we are plotting raw data
• this is a single sample

[Figure: population → take a single score → repeat over and over; each x plotted is one raw score (e.g. Preston)]

Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population

Sampling distributions of sample means
• theoretical distribution
• we are plotting means of samples

[Figure: population → take a sample and get its mean → repeat over and over, building a distribution of means of samples; the first point plotted is the mean for the 1st sample. Important note: "fixed n"]
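For a very small population, the sampling distribution of the mean can be listed out completely rather than imagined. The sketch below uses a made-up population of four scores and enumerates every possible sample of fixed n = 2, drawn with replacement; the population values and n are illustrative assumptions, not data from the course.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

population = [2, 4, 6, 8]  # hypothetical population of raw scores
n = 2                      # fixed sample size

# Every possible ordered sample of size n (sampling with replacement),
# and the mean of each one.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

# The sampling distribution: how often each possible sample mean occurs.
sampling_distribution = Counter(sample_means)
for value in sorted(sampling_distribution):
    print(f"mean = {value}: {'X' * sampling_distribution[value]}")

print("population mean:         ", mean(population))
print("mean of sample means:    ", mean(sample_means))
print("population variance:     ", pvariance(population))
print("variance of sample means:", pvariance(sample_means))
```

Even though the raw scores are spread evenly, the sample means pile up in the middle, and their mean matches the population mean.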

[Figure: the sampling distribution of sample means (theoretical; each plotted point is the mean of one fixed-n sample, e.g. the 2nd sample and the 23rd sample) shown beside the frequency distribution of individual scores (derived empirically; raw data from a single sample, e.g. Melvin and Eugene)]

Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
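A simulation can make the theorem concrete. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1, chosen only for illustration) and measures how lopsided the distribution of sample means is for several sample sizes; a skewness near 0 indicates a symmetric, more normal-looking shape.

```python
import random


def skewness(values):
    """Average cubed z-score: about 0 for a symmetric distribution."""
    count = len(values)
    m = sum(values) / count
    sd = (sum((v - m) ** 2 for v in values) / count) ** 0.5
    return sum(((v - m) / sd) ** 3 for v in values) / count


def simulate_sample_means(n, n_samples=5000, seed=1):
    """Draw n_samples random samples of fixed size n; return their means."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n  # assumed population
        for _ in range(n_samples)
    ]


results = {}
for n in (1, 4, 25, 100):
    means = simulate_sample_means(n)
    results[n] = (sum(means) / len(means), skewness(means))
    print(f"n = {n:>3}: mean of sample means = {results[n][0]:.3f}, "
          f"skewness = {results[n][1]:.2f}")
```

With n = 1 the "sample means" are just raw scores and keep the population's skew; as n grows the skewness shrinks toward 0 while the mean of the sample means stays close to the population mean of 1.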

Sampling distribution for continuous distributions

[Figure: Distribution of Raw Scores (individual scores such as Melvin and Eugene) shown beside the Sampling Distribution of Sample Means (means such as those of the 2nd sample and the 23rd sample)]

An example of a sampling distribution of sample means

Distribution of raw scores (individual scores, e.g. Melvin and Eugene): Mean = 100, Standard Deviation = 3 (µ = 100, σ = 3)

Sampling distribution of sample means (e.g. the 2nd sample and the 23rd sample): Mean = 100 (µ = 100), Standard Error of the Mean = 1

Notice: SEM is smaller than SD, especially as n increases
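The relationship between SD and SEM in this example follows from the standard formula SEM = σ / √n. The slide does not state n; a standard deviation of 3 with a standard error of 1 is consistent with n = 9, which is an inference here rather than something given in the lecture. A minimal sketch:

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


sigma = 3  # population standard deviation from the example
for n in (1, 4, 9, 36, 100):
    print(f"n = {n:>3}: SEM = {standard_error(sigma, n):.2f}")
```

With n = 1 the SEM equals the SD; every larger n gives a smaller SEM.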

Central Limit Theorem

Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population.

Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population.


Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size (SEM = σ / √n). As n increases, SEM decreases.

As n ↑, x̄ will approach µ

As n ↑, curve will approach normal shape

As n ↑, curve variability gets smaller
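Propositions 1 and 3 can be checked empirically. The sketch below assumes a normal population with µ = 100 and σ = 3 (the values from the earlier example), draws many samples at two fixed sample sizes, and compares the spread of the sample means with σ / √n. The sample sizes, number of samples and seed are arbitrary choices for illustration.

```python
import math
import random
from statistics import mean, pstdev

MU, SIGMA = 100, 3  # assumed population parameters


def simulate_means(n, n_samples=4000, seed=42):
    """Means of n_samples random samples, each of fixed size n."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(MU, SIGMA) for _ in range(n)) / n
        for _ in range(n_samples)
    ]


summary = {}
for n in (9, 36):
    means = simulate_means(n)
    summary[n] = {
        "mean_of_means": mean(means),      # Proposition 1: close to mu
        "observed_sem": pstdev(means),     # spread of the sample means
        "predicted_sem": SIGMA / math.sqrt(n),  # Proposition 3
    }
    print(n, {k: round(v, 3) for k, v in summary[n].items()})
```

Quadrupling n from 9 to 36 should roughly halve the observed spread of the sample means, in line with the square-root rule.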

Date post:	13-Dec-2015
Category:	Documents
Upload:	charlene-miller