Res701 research methodology lecture 7 8-devaprakasam

DEVAPRAKASAM DEIVASAGAYAM
Professor of Mechanical Engineering

Room: 11, LW, 2nd Floor
School of Mechanical and Building Sciences

Email: devaprakasam.d@vit.ac.in, dr.devaprakasam@gmail.com

RES701: RESEARCH METHODOLOGY (3:0:0:3)

Devaprakasam D, Email: devaprakasam.d@vit.ac.in, Ph: +91 9786553933

Design in the Research Process

Small Samples Can Enlighten

“The proof of the pudding is in the eating. By a small sample we may judge of the whole piece.”

Miguel de Cervantes Saavedra, author

The Nature of Sampling

• Population
• Population Element
• Census
• Sample
• Sampling frame

Sampling Terminology

• Sample
– A subset, or some part, of a larger population.
• Population (universe)
– Any complete group of entities that share some common set of characteristics.
• Population Element
– An individual member of a population.
• Census
– An investigation of all the individual elements that make up a population.

Why Sample?

Sampling provides:
• Greater accuracy
• Availability of elements
• Greater speed
• Lower cost

Why Sample? (cont’d)
• Pragmatic Reasons
– Budget and time constraints.
– Limited access to total population.
• Accurate and Reliable Results
– Samples can yield reasonably accurate information.
– Strong similarities in population elements make sampling possible.
– Sampling may be more accurate than a census.
• Destruction of Test Units
– Sampling reduces the costs of research in finite populations.

A Photographic Example of How Sampling Works

Steps in Sampling Design

1. What is the target population?
2. What are the parameters of interest?
3. What is the sampling frame?
4. What is the appropriate sampling method?
5. What size sample is needed?

When to Use a Larger Sample?

• Desired precision
• Number of subgroups
• Confidence level
• Population variance
• Small error range
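As a rough illustration of how these factors interact, the sketch below uses the common textbook formula n = (z·σ / E)² for estimating a mean. It assumes simple random sampling and a known population standard deviation; the function name and the numbers are hypothetical and do not come from the slides.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Sample size needed to estimate a mean within +/- margin_of_error.

    Assumes simple random sampling and a known population standard
    deviation (sigma); uses n = (z * sigma / E) ** 2, rounded up.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical inputs: sigma = 10, allowed error = 2 units.
print(required_sample_size(sigma=10, margin_of_error=2))                    # 95% confidence
print(required_sample_size(sigma=10, margin_of_error=2, confidence=0.99))   # higher confidence
print(required_sample_size(sigma=10, margin_of_error=1))                    # smaller error range
```

A larger variance, a higher confidence level, or a smaller error range each pushes the required n upward, which matches the list above.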

Simple Random

Advantages
• Easy to implement with random dialing

Disadvantages
• Requires a list of population elements
• Time consuming
• Larger sample needed
• Produces larger errors
• High cost
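A minimal sketch of simple random sampling, assuming the sampling frame is available as a Python list. The frame contents and the helper name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)       # a seed makes the draw reproducible
    return rng.sample(frame, n)     # sampling without replacement


# Hypothetical sampling frame of 100 population elements.
frame = [f"element_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

The need for the complete `frame` list up front is exactly the first disadvantage noted above.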

Systematic

Advantages
• Simple to design
• Easier than simple random
• Easy to determine sampling distribution of mean or proportion

Disadvantages
• Periodicity within population may skew sample and results
• Trends in list may bias results
• Moderate cost
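Systematic sampling can be sketched as: pick a random starting point, then take every k-th element of the frame. The version below is one simple variant, assuming the frame is a list; the helper name and the frame are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Take every k-th element after a random start, where k = len(frame) // n."""
    k = len(frame) // n                      # skip interval
    if k < 1:
        raise ValueError("sample size exceeds the size of the sampling frame")
    start = random.Random(seed).randrange(k) # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame: the integers 1..100 standing in for listed elements.
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=1)
print(picked)
```

If the list has a cycle whose length matches k (for example, every tenth entry is a corner house), all selected elements share that trait, which is the periodicity problem listed above.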

Statistical estimation

• Population: described by parameters
• Random sample: described by statistics
– Every member of the population has the same chance of being selected in the sample
• Estimation: sample statistics are used to estimate population parameters

Statistical inference. Role of chance.

Reason and intuition Empirical observation

Scientific knowledge

Formulate hypotheses

Collect data to test hypotheses

Statistical inference. Role of chance.

Formulate hypotheses

Collect data to test hypotheses

Accept hypothesis Reject hypothesis

CHANCE

Random error (chance) can be controlled by statistical significance or by a confidence interval

Systematic error

Making Data Usable

• To make data usable, this information must be organized and summarized.

• Methods for doing this include:

– frequency distributions
– proportions
– measures of central tendency and dispersion

Population Mean

Making Data Usable (cont’d)
• Proportion
– The percentage of elements that meet some criterion
• Measures of Central Tendency
– Mean: the arithmetic average.

– Median: the midpoint; the value below which half the values in a distribution fall.

– Mode: the value that occurs most often.

Sample Mean
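The formulas on the Population Mean and Sample Mean slides did not survive in this transcript. In the usual textbook notation (the symbols are assumed here, not taken from the slides), with N elements in the population and n in the sample:

```latex
\mu = \frac{\sum_{i=1}^{N} X_i}{N}
\qquad\qquad
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
```

The population mean \mu is a parameter; the sample mean \bar{X} is the statistic used to estimate it.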

Statistics and Research Design

• Statistics: Theory and method of analyzing quantitative data from samples of observations … to help make decisions about hypothesized relations.
– Tools used in research design

• Research Design: Plan and structure of the investigation so as to answer the research questions (or hypotheses)

Frequency

• Frequency Distributions

– In tables, the frequency distribution is constructed by summarizing data in terms of the number or frequency of observations in each category, score, or score interval

– In graphs, the data can be concisely summarized into bar graphs, histograms, or frequency polygons
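A small sketch of a frequency distribution, using the scores from the mode example later in these slides. The helper name `frequency_table` is hypothetical, and the text bars stand in for a histogram.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency, proportion) rows, ordered by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], counts[v] / total) for v in sorted(counts)]


scores = [3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6]

print("score  frequency  proportion")
for value, freq, prop in frequency_table(scores):
    # The row of '#' characters is a crude text histogram.
    print(f"{value:>5}  {freq:>9}  {prop:>10.2f}  {'#' * freq}")
```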

Measures of Dispersion
• The Range
– The distance between the smallest and the largest values of a frequency distribution.

Descriptive Statistics
• Measures of Central Tendency
– Mode
  • The most frequently occurring score
  • 3 3 3 4 4 4 5 5 5 6 6 6 6: Mode is 6
  • 3 3 3 4 4 4 5 5 6 6 7 7 8: Modes are 3 and 4
– Median
  • The score that divides a group of scores in half, with 50% falling above and 50% falling below the median.
  • 3 3 3 5 8 8 8: The median is 5
  • 3 3 5 6: The median is 4 (average of the two middle numbers)
– Mean
  • Preferred whenever possible and is the only measure of central tendency that is used in advanced statistical calculations:
    – More reliable and accurate
    – Better suited to arithmetic calculations
  • Basically, an average of all scores. Add up all scores and divide by the total number of scores.
  • 2 3 4 6 10: Mean is 5 (25/5)
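The worked examples on this slide can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
from statistics import mean, median, multimode

# Mode: multimode returns every value tied for the highest frequency.
modes_a = multimode([3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6])
modes_b = multimode([3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 7, 8])

# Median: middle value, or the average of the two middle values.
median_odd = median([3, 3, 3, 5, 8, 8, 8])
median_even = median([3, 3, 5, 6])

# Mean: sum of the scores divided by the number of scores.
average = mean([2, 3, 4, 6, 10])

print(modes_a, modes_b)          # [6] [3, 4]
print(median_odd, median_even)   # 5 4.0
print(average)                   # 5
```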

Measure of Dispersion

• Measures of Variability (Dispersion)
– Range
  • Calculated by subtracting the lowest score from the highest score.
  • Used only for ordinal, interval, and ratio scales, as the data must be ordered
    – Example: 2 3 4 6 8 11 24 (Range is 22)
– Variance
  • The extent to which individual scores in a distribution of scores differ from one another
– Standard Deviation
  • The square root of the variance
  • Most widely used measure to describe the dispersion among a set of observations in a distribution.
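The three measures can be computed for the range example above (2 3 4 6 8 11 24). The sketch assumes the scores are a sample, so it uses the n - 1 divisor via `statistics.variance` and `statistics.stdev`; for a whole population, `pvariance` and `pstdev` would be the counterparts.

```python
from statistics import stdev, variance

scores = [2, 3, 4, 6, 8, 11, 24]

data_range = max(scores) - min(scores)   # highest score minus lowest score
sample_variance = variance(scores)       # sum of squared deviations / (n - 1)
sample_sd = stdev(scores)                # square root of the sample variance

print(f"range    = {data_range}")
print(f"variance = {sample_variance:.2f}")
print(f"std dev  = {sample_sd:.2f}")
```

The single extreme score (24) dominates the range and inflates the variance, which is one reason the standard deviation is reported in the original units rather than squared units.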

Low Dispersion versus High Dispersion

Descriptive Statistics

– Normal Curve – Bimodal Curve

Descriptive Statistics

– Positively Skewed – Negatively Skewed

Measures of Dispersion (cont’d)

• Why Use the Standard Deviation?
– Variance
  • A measure of variability or dispersion.
  • Its square root is the standard deviation.
– Standard deviation
  • A quantitative index of a distribution’s spread, or variability; the square root of the variance for a distribution.
  • Roughly, the typical amount by which values in the distribution deviate from the mean.
  • Used to calculate the likelihood (probability) of an event occurring.

Calculating Deviation

Standard Deviation =
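The right-hand side of this formula is missing from the transcript. A standard form, assuming the slide referred to the sample standard deviation with the n - 1 divisor (an assumption; the original notation is not recoverable), is:

```latex
S = \sqrt{\frac{\sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2}{n - 1}}
```

Here X_i is an individual observation, \bar{X} is the sample mean, and n is the sample size. The population version replaces \bar{X} with \mu and divides by N.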

Calculating a Standard Deviation: Number of Sales Calls per Day for Eight Salespeople

Population Distribution, Sample Distribution, and Sampling Distribution
• Population Distribution
– A frequency distribution of the elements of a population.
• Sample Distribution
– A frequency distribution of a sample.
• Sampling Distribution
– A theoretical probability distribution of sample means for all possible samples of a certain size drawn from a particular population.
• Standard Error of the Mean
– The standard deviation of the sampling distribution.
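In practice the standard error of the mean is usually estimated from a single sample as s / √n, where s is the sample standard deviation. The sketch below assumes that estimator; the function name and the data are hypothetical.

```python
import math
from statistics import stdev


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))


observations = [1, 2, 3, 4, 5]   # hypothetical sample
se = standard_error(observations)
print(f"standard error of the mean = {se:.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.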

EXHIBIT 17.13 Fundamental Types of Distributions

Three Important Distributions

Central-limit Theorem

• Central-limit Theorem
– The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution.

The Mean Distribution of Any Distribution Approaches Normal as n Increases
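A small simulation can illustrate the theorem. The sketch below draws samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen here only for illustration) and shows that the skewness of the sample means shrinks toward 0, the value for a normal curve, as n grows. The helper names are hypothetical.

```python
import random


def sample_means(n, draws=5000, seed=0):
    """Means of `draws` random samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(draws)]


def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric distribution."""
    m = sum(values) / len(values)
    s = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


for n in (1, 5, 30):
    means = sample_means(n)
    centre = sum(means) / len(means)
    print(f"n={n:>2}  mean of sample means={centre:.3f}  skewness={skewness(means):.2f}")
```

The centre of the sampling distribution stays near the population mean while its shape becomes more symmetric and its spread narrows.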

The Normal Distribution
• Normal Distribution
– A symmetrical, bell-shaped distribution (normal curve) that describes the expected probability distribution of many chance occurrences.
– About 99.7% of its values are within ± 3 standard deviations of its mean.
• Standardized Normal Distribution
– A purely theoretical probability distribution that reflects a specific normal curve for the standardized value, z.

EXHIBIT 17.8 Normal Distribution: Distribution of Intelligence Quotient (IQ) Scores

The Normal Distribution (cont’d)
• Characteristics of a Standardized Normal Distribution
1. It is symmetrical about its mean; the tails on both sides are equal.
2. The mean identifies the normal curve’s highest point (the mode) and the vertical line about which this normal curve is symmetrical.
3. The normal curve has an infinite number of cases (it is a continuous distribution), and the total area under the curve, which represents total probability, equals 1.0.
4. The standardized normal distribution has a mean of 0 and a standard deviation of 1.

Standardized Normal Distribution

The Normal Distribution (cont’d)
• Standardized Values, Z

– Used to compare an individual value to the population mean in units of the standard deviation

– The standardized normal distribution can be used to translate/transform any normal variable, X, into the standardized value, Z.

– Researchers can evaluate the probability of the occurrence of many events without any difficulty.
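The transformation is Z = (X − μ) / σ. The sketch below applies it to an IQ-style score, assuming the conventional scaling of mean 100 and standard deviation 15 (these parameters are an assumption, not values stated in the slides), and then reads probabilities from the standardized normal distribution.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Distance of x from the mean, in units of the standard deviation."""
    return (x - mu) / sigma


standard_normal = NormalDist()   # mean 0, standard deviation 1

z = z_score(130, mu=100, sigma=15)
p_below = standard_normal.cdf(z)                                  # P(Z <= z)
within_3 = standard_normal.cdf(3) - standard_normal.cdf(-3)       # P(-3 <= Z <= 3)

print(f"z = {z:.1f}")
print(f"share of values below z: {p_below:.4f}")
print(f"share within +/- 3 standard deviations: {within_3:.4f}")
```

Once a value is expressed as z, the same standardized table or function serves any normal variable, whatever its original units.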
