
STA 291 Fall 2009

Date post: 22-Feb-2016
Page 1: STA 291 Fall 2009

STA 291 Fall 2009

Lecture 1
Dustin Lueker

Page 2: STA 291 Fall 2009

Topics

• Statistical terminology
• Descriptive methods
• Probability and distribution functions
• Estimation (confidence intervals)
• Hypothesis testing
• Inferential methods for two samples
• Simple linear regression and correlation

STA 291 Fall 2009 Lecture 1

Page 3: STA 291 Fall 2009

Why study Statistics?

• Research in all fields is becoming more quantitative
  ◦ Look at research journals
  ◦ Most graduates will need to be familiar with basic statistical methodology and terminology
• Newspapers, advertising, surveys, etc.
  ◦ Many statements contain statistical arguments
• Computers make complex statistical methods easier to use


Page 4: STA 291 Fall 2009

Lies, Damn Lies, and Statistics

• Many times statistics are used in an incorrect and misleading manner
• Purposely misused
  ◦ Companies/people wanting to further their agenda
    • Cooking the data
    • Completely making up data
    • Massaging the numbers
• Incidentally misused
  ◦ Using inappropriate methods
    • Vital to understand a method before using it


Page 5: STA 291 Fall 2009

What is Statistics?

• Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data
• Applicable to a wide variety of academic disciplines
  ◦ Physical sciences
  ◦ Social sciences
  ◦ Humanities
• Statistics are used for making informed decisions
  ◦ Business
  ◦ Government


Page 6: STA 291 Fall 2009

General Statistical Methodology

Design
• Planning research studies
• How to best obtain the required data
• Ensuring that our data is representative of the entire population

Description
• Summarizing data
• Exploring patterns in the data
• Extract/condense information

Inference
• Make predictions based on the data
• ‘Infer’ from sample to population
• Summarize results


Page 7: STA 291 Fall 2009

Basic Terminology

• Population
  ◦ Total set of all subjects of interest
    • Entire group of people, animals, products, etc. about which we want information
• Elementary Unit
  ◦ Any individual member of the population
• Sample
  ◦ Subset of the population from which the study actually collects information
  ◦ Used to draw conclusions about the whole population


Page 8: STA 291 Fall 2009

Basic Terminology

• Variable
  ◦ A characteristic of a unit that can vary among subjects in the population/sample
    • Ex: gender, nationality, age, income, hair color, height, disease status, state of residence, grade in STA 291
• Parameter
  ◦ Numerical characteristic of the population
    • Calculated using the whole population
• Statistic
  ◦ Numerical characteristic of the sample
    • Calculated using the sample
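
The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population below is simulated, and the mean of 67 inches, the spread of 4 inches, the population size of 10,000 and the sample size of 200 are all assumptions chosen for the example.

```python
import random
import statistics

# Hypothetical population: heights (in inches) of 10,000 units.
# The values are simulated for illustration; they are not real data.
rng = random.Random(291)  # fixed seed so the run is repeatable
population = [rng.gauss(67.0, 4.0) for _ in range(10_000)]

# Parameter: numerical characteristic calculated from the whole population.
population_mean = statistics.mean(population)

# Statistic: numerical characteristic calculated from the sample only.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not identical. In practice only the statistic is available, because the whole population is rarely measured.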


Page 9: STA 291 Fall 2009

Data Collection and Sampling Theory

• Why take a sample? Why not take a census? Why not measure all of the units in the population?
  ◦ Accuracy
    • May not be able to find every unit in the population
  ◦ Time
    • Speed of response from units
  ◦ Money
  ◦ Infinite Population
  ◦ Destructive Sampling or Testing


Page 10: STA 291 Fall 2009

Example

• University Health Services at UK conducts a survey about alcohol abuse among students
  ◦ 200 of the students are sampled and asked to complete a questionnaire
  ◦ One question is “have you regretted something you did while drinking?”
• What is the population? Sample?


Page 11: STA 291 Fall 2009

‘Flavors’ of Statistics

• Descriptive Statistics
  ◦ Summarizing the information in a collection of data
• Inferential Statistics
  ◦ Using information from a sample to make conclusions/predictions about the population


Page 12: STA 291 Fall 2009

Example

• The Current Population Survey of about 60,000 households in the United States in 2002 distinguishes three types of families: Married-couple (MC), Female householder and no husband (FH), Male householder and no wife (MH)
• It indicated that 5.3% of “MC”, 26.5% of “FH”, and 12.1% of “MH” families have annual income below the poverty level
  ◦ Are these numbers statistics or parameters?
• The report says that the percentage of all “FH” families in the USA with income below the poverty level is at least 25.5% but no greater than 27.5%
  ◦ Is this an example of descriptive or inferential statistics?
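
One way to relate the two “FH” figures is to look at the arithmetic of the reported range. The sketch below uses only the three percentages quoted on the slide; the variable names are made up for the example, and it does not show how the report actually derived its range.

```python
# Figures quoted on the slide for "FH" families (percent below poverty level).
sample_percent = 26.5      # value computed from the surveyed households
lower, upper = 25.5, 27.5  # range reported for all "FH" families in the USA

# Midpoint and half-width of the reported range.
midpoint = (lower + upper) / 2
half_width = (upper - lower) / 2

print(f"midpoint   = {midpoint}%")    # midpoint   = 26.5%
print(f"half-width = {half_width}%")  # half-width = 1.0%
print(midpoint == sample_percent)     # True
```

The range is the survey figure plus or minus one percentage point, so it is a statement about all “FH” families built from the surveyed ones.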


Page 13: STA 291 Fall 2009

Univariate vs. Multivariate

• Univariate data
  ◦ Consists of observations on a single attribute
• Multivariate data
  ◦ Consists of observations on several attributes
    • Special case: bivariate data, which consists of observations on two attributes


Page 14: STA 291 Fall 2009

Scales of Measurement

• Quantitative or Numerical
  ◦ Variables with numerical values associated with them
• Qualitative or Categorical
  ◦ Variables without numerical values associated with them


Page 15: STA 291 Fall 2009

Qualitative Variables

• Nominal
  ◦ Gender, nationality, hair color, state of residence
    • Nominal variables have a scale of unordered categories
    • It does not make sense to say, for example, that green hair is greater/higher/better than orange hair
• Ordinal
  ◦ Disease status, company rating, grade in STA 291
    • Ordinal variables have a scale of ordered categories; they are often treated in a quantitative manner (A = 4.0, B = 3.0, etc.)
    • One unit can have more of a certain property than does another unit
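
The nominal/ordinal distinction can be sketched as follows. The point values for A and B come from the slide; the values for C and D continue that pattern and are an assumption, as are the sample grades and hair colors.

```python
from statistics import mean

# A = 4.0 and B = 3.0 are from the slide; C and D are assumed here.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}

grades = ["B", "A", "C", "A"]  # hypothetical grades for one student

# Ordering is meaningful for ordinal data: sort from best to worst.
ranked = sorted(grades, key=GRADE_POINTS.get, reverse=True)
print(ranked)  # ['A', 'A', 'B', 'C']

# Treating the ordered categories as numbers gives a grade point average.
gpa = mean(GRADE_POINTS[g] for g in grades)
print(gpa)  # 3.25

# Nominal data can be counted, but there is no order to sort or average by.
hair = ["green", "orange", "green"]
counts = {color: hair.count(color) for color in set(hair)}
print(counts["green"], counts["orange"])  # 2 1
```

Averaging grade points assumes the gap between A and B equals the gap between B and C, which the ordinal scale itself does not guarantee.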


Page 16: STA 291 Fall 2009

Quantitative Variables

• Quantitative
  ◦ Age, income, height
    • Quantitative variables are measured numerically, that is, for each subject a number is observed
    • The scale for quantitative variables is called the interval scale


Page 17: STA 291 Fall 2009

Example

• A study about oral hygiene and periodontal conditions among institutionalized elderly measured the following
  ◦ Nominal (Qualitative): Requires assistance from staff?
    • Yes
    • No
  ◦ Ordinal (Qualitative): Plaque score
    • No visible plaque
    • Small amounts of plaque
    • Moderate amounts of plaque
    • Abundant plaque
  ◦ Interval (Quantitative): Number of teeth


Page 18: STA 291 Fall 2009

Example

• A birth registry database collects the following information on newborns
  ◦ Birthweight: in grams
  ◦ Infant’s Condition:
    • Excellent
    • Good
    • Fair
    • Poor
  ◦ Number of prenatal visits
  ◦ Ethnic background:
    • African-American
    • Caucasian
    • Hispanic
    • Native American
    • Other
• What are the appropriate scales?
  ◦ Quantitative (Interval)
  ◦ Qualitative (Ordinal, Nominal)


Page 19: STA 291 Fall 2009

Importance of Different Types of Data

• Statistical methods vary for quantitative and qualitative variables
• Methods for quantitative data cannot be used to analyze qualitative data
• Quantitative variables can be treated in a less quantitative manner
  ◦ Height: measured in cm/in
    • Interval (Quantitative)
    • Can be treated as Qualitative
      ◦ Ordinal: Short; Average; Tall
      ◦ Nominal: <60in or >72in; 60in-72in
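
Recoding height from the interval scale to the two qualitative scales might look like the sketch below. The function names are hypothetical. The slide gives cut points (60in and 72in) only for the nominal grouping; reusing them as the Short/Average/Tall boundaries is an assumption made for the example.

```python
def height_ordinal(height_in: float) -> str:
    """Ordered categories: Short < Average < Tall (cut points assumed)."""
    if height_in < 60:
        return "Short"
    if height_in <= 72:
        return "Average"
    return "Tall"


def height_nominal(height_in: float) -> str:
    """Two unordered groups, labeled as on the slide."""
    if 60 <= height_in <= 72:
        return "60in-72in"
    return "<60in or >72in"


heights = [58.5, 64.0, 70.2, 74.0]  # hypothetical measurements in inches
for h in heights:
    print(h, height_ordinal(h), height_nominal(h))
```

Each recoding discards information: the nominal grouping can no longer tell a 58.5in subject from a 74in subject, and neither category can be converted back to the measured height.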


Page 20: STA 291 Fall 2009

Other Notes on Variable Types

• Try to measure variables in as much detail as possible
  ◦ Quantitative
    • More detailed data can be analyzed in further depth
  ◦ Caution: Sometimes ordinal variables are treated as quantitative (ex: GPA)


Page 21: STA 291 Fall 2009

Discrete Variables

• A variable is discrete if it can take on a finite number of values
  ◦ Gender
  ◦ Nationality
  ◦ Hair color
  ◦ Disease status
  ◦ Grade in STA 291
  ◦ Favorite MLB team
• Qualitative variables are discrete


Page 22: STA 291 Fall 2009

Continuous Variables

• Continuous variables can take an infinite continuum of possible real number values
  ◦ Time spent studying for STA 291 per day
    • 43 minutes
    • 2 minutes
    • 27.487 minutes
    • 27.48682 minutes
    • Can be subdivided into more accurate values
    • Therefore continuous


Page 23: STA 291 Fall 2009

Examples

• Number of children in a family
• Distance a car travels on a tank of gas
• % grade on an exam


Page 24: STA 291 Fall 2009

Discrete or Continuous

• Quantitative variables can be discrete or continuous
• Age, income, height?
  ◦ Depends on the scale
    • Age is potentially continuous, but usually measured in years (discrete)
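
The point about age can be sketched as below. The exact age is a made-up value, and rounding down to completed years or months is just one assumed way of recording it.

```python
import math

age_in_years = 20.637  # hypothetical exact age; could be refined further

# Recording only completed years turns the measurement into a whole number.
recorded_age = math.floor(age_in_years)
print(recorded_age)  # 20

# A finer scale keeps more of the underlying continuous value.
age_in_months = math.floor(age_in_years * 12)
print(age_in_months)  # 247
```

The underlying quantity is the same in both cases; only the measurement scale decides whether the recorded variable behaves as discrete or as (nearly) continuous.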


