Probability and Statistics in Engineering Philip Bedient, Ph.D.

Probability: Basic Ideas

Terminology:

Trial: each repetition of an experiment.

Outcome: the result of an experiment.

Random experiment: an experiment with random outcomes (they cannot be predicted exactly).

Relative frequency: the number of times a specific outcome occurs divided by the total number of trials in the experiment.

Statistics: Basic Ideas

Statistics is the area of science that deals with the collection, organization, analysis, and interpretation of data.

It also deals with methods and techniques that can be used to draw conclusions about the characteristics of a large number of data points, commonly called a population, by using a smaller subset of the entire data.

For Example…

You work in a cell phone factory and are asked to remove cell phones at random from the assembly line and turn each one on and off.

Each time you remove a cell phone and turn it on and off, you are conducting a random experiment.

Each time you pick up a phone is a trial, and the result is called an outcome.

If you check 200 phones and find 5 bad phones, then

relative frequency of failure = 5/200 = 0.025
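The cell phone arithmetic can be written as a small helper. This is only a sketch: the function name `relative_frequency` and the variable names are illustrative and do not come from the slides.

```python
def relative_frequency(occurrences: int, trials: int) -> float:
    """Return the fraction of trials in which the outcome of interest occurred."""
    if trials <= 0:
        raise ValueError("trials must be a positive integer")
    return occurrences / trials


# Numbers from the cell phone example: 5 bad phones out of 200 checked.
bad_phones = 5
phones_checked = 200
failure_rate = relative_frequency(bad_phones, phones_checked)
print(failure_rate)  # 0.025
```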

Statistics in Engineering

Engineers apply physical and chemical laws and mathematics to design, develop, test, and supervise various products and services.

Engineers perform tests to learn how things behave under stress, and at what point they might fail.

As engineers perform experiments, they collect data that can be used to explain relationships better and to reveal information about the quality of the products and services they provide.

Frequency Distribution

Scores for an engineering class are as follows: 58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62, 83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79

To better assess the success of the class, we make a frequency chart:

Now the information can be better analyzed.

For example, 3 students did poorly, and 3 did exceptionally well. We know that 9 students were in the average range of 70-79. We can also show this data in a frequency histogram (PDF).

To convert the counts to relative frequencies, divide each number by 26.
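One way to build the frequency chart is to tally the scores into ranges of width 10. The sketch below assumes ranges of 50-59, 60-69, and so on (consistent with the 70-79 range mentioned above); the helper `bin_label` is illustrative. The score list as transcribed here holds 25 values while the slides normalize by 26, so the code divides by the length of the list rather than by a fixed number.

```python
from collections import Counter

# Scores exactly as listed in the text.
scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]


def bin_label(score: int, width: int = 10) -> str:
    """Return the label of the range a score falls in, e.g. 75 -> '70-79'."""
    low = (score // width) * width
    return f"{low}-{low + width - 1}"


counts = Counter(bin_label(s) for s in scores)
total = len(scores)

# Print the range, the count, and the relative frequency (count / total).
for label in sorted(counts):
    print(f"{label}: {counts[label]:2d}  {counts[label] / total:.3f}")
```

The tallies for 50-59, 70-79, and 90-99 agree with the 3, 9, and 3 students quoted above.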

Cumulative Frequency

The data can be further organized by calculating the cumulative frequency (CDF).

The cumulative frequency shows the cumulative number of students with scores up to and including those in the given range. Usually we normalize the data by dividing by the total number of scores, 26.
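A minimal sketch of the cumulative count, assuming the same ranges of width 10 and using the scores as transcribed in the text (25 values, so the normalization divides by the list length):

```python
# Scores exactly as listed in the text.
scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]

# Upper bound of each range: 50-59, 60-69, 70-79, 80-89, 90-99 (assumed ranges).
upper_bounds = [59, 69, 79, 89, 99]

# Cumulative frequency: number of scores up to and including each range.
cumulative = [sum(1 for s in scores if s <= upper) for upper in upper_bounds]

# Normalized cumulative frequency runs from 0 to 1.
normalized = [c / len(scores) for c in cumulative]

for upper, c, frac in zip(upper_bounds, cumulative, normalized):
    print(f"<= {upper}: {c:2d}  {frac:.2f}")
```

The last normalized value is always 1, since every score lies at or below the top range.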

Measures of Central Tendency & Variation

Systematic errors, also called fixed errors, are errors associated with using an inaccurate instrument. These errors can be detected and avoided by properly calibrating instruments.

Random errors are generated by a number of unpredictable variations in a given measurement situation. Mechanical vibrations of instruments, or variations in line voltage, friction, or humidity, could lead to random fluctuations in observations.

When analyzing data, the mean alone cannot signal possible mistakes. There are a number of ways to define the dispersion or spread of data.

You can compute how much each number deviates from the mean, add up all the deviations, and then take their average as shown in the table below.

As exemplified in Table 19.4, the sum of deviations from the mean for any given sample is always zero. This can be verified by considering the following:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$d_i = (x_i - \bar{x})$$

where $x_i$ represents the data points, $\bar{x}$ is the average, $n$ is the number of data points, and $d_i$ represents the deviation from the average. Summing the deviations gives

$$\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x}$$

$$\sum_{i=1}^{n} d_i = n\bar{x} - n\bar{x} = 0$$

Therefore the average of the deviations from the mean cannot be used to measure the spread of a given data set.

Instead we calculate the average of the absolute values of the deviations. (This is shown in the third column of Table 19.4 in your textbook.)

For Group A the mean deviation is 290, and for Group B it is 820. We can conclude that Group B is more scattered than Group A.
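The zero-sum property and the mean absolute deviation can be checked numerically. Table 19.4 is not reproduced in these slides, so this sketch uses the class scores listed earlier instead; the variable names are illustrative.

```python
# Class scores as listed earlier in the text (Table 19.4 data is not available here).
scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]

n = len(scores)
mean = sum(scores) / n

# Deviation of each data point from the mean: d_i = x_i - mean.
deviations = [x - mean for x in scores]

# The signed deviations cancel, so their sum is zero up to rounding error.
sum_of_deviations = sum(deviations)

# Averaging the absolute values gives a usable measure of spread.
mean_absolute_deviation = sum(abs(d) for d in deviations) / n

print(f"mean = {mean:.2f}")
print(f"sum of deviations = {sum_of_deviations:.2e}")
print(f"mean absolute deviation = {mean_absolute_deviation:.2f}")
```

In floating-point arithmetic the sum may print as a tiny nonzero number rather than exactly 0.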

Variance

Another way of measuring the spread of the data is by calculating the variance. Instead of taking the absolute value of each deviation, you can square each deviation and find the mean of the squares.

$$v = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Dividing by (n-1) rather than n makes the estimate unbiased.

Taking the square root of the variance results in the standard deviation.

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

The standard deviation can also provide information about the relative spread of a data set.
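The sample variance and standard deviation formulas translate directly into code. The sketch below applies them to the class scores from earlier in the text and cross-checks against Python's standard `statistics` module, whose `variance` and `stdev` functions also divide by n - 1. The function names `sample_variance` and `sample_std` are illustrative.

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std(data):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(data))


# Class scores as listed earlier in the text.
scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]

print(f"variance = {sample_variance(scores):.2f}")
print(f"std dev  = {sample_std(scores):.2f}")

# Cross-check against the standard library (both use the n - 1 divisor).
print(math.isclose(sample_variance(scores), statistics.variance(scores)))
print(math.isclose(sample_std(scores), statistics.stdev(scores)))
```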

The mean for a grouped distribution is calculated from:

$$\bar{x} = \frac{\sum (x f)}{n}$$

where

x = midpoint of a given range

f = frequency of occurrence of data in the range

n = Σf = total number of data points

The standard deviation for a grouped distribution is calculated from:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$$
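As a sketch of the grouped formulas, the example below uses ranges of width 10 for the class scores. The midpoints (54.5 for 50-59, and so on) and the frequencies are assumptions tallied from the score list as transcribed in this text, not values taken from the textbook table.

```python
import math

# (midpoint x, frequency f) for ranges 50-59, 60-69, 70-79, 80-89, 90-99.
# Frequencies are tallied from the transcribed score list (an assumption).
groups = [(54.5, 3), (64.5, 5), (74.5, 9), (84.5, 5), (94.5, 3)]

# n is the sum of the frequencies.
n = sum(f for _, f in groups)

# Grouped mean: sum of (midpoint * frequency) divided by n.
grouped_mean = sum(x * f for x, f in groups) / n

# Grouped standard deviation: frequency-weighted squared deviations over n - 1.
grouped_std = math.sqrt(
    sum((x - grouped_mean) ** 2 * f for x, f in groups) / (n - 1)
)

print(f"n = {n}")
print(f"grouped mean = {grouped_mean:.2f}")
print(f"grouped std  = {grouped_std:.2f}")
```

Because each score is replaced by the midpoint of its range, the grouped results only approximate the values computed from the raw scores.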

Normal Distribution

We could use the probability distribution from the figures below to predict what might happen in the future (e.g., next year's students' performance).

A normal distribution is a probability distribution with a symmetric, bell-shaped curve.

The detailed shape of a normal distribution curve is determined by its mean and standard deviation values.

THE NORMAL CURVE

Using Table 19.11, approximately 68% of the data will fall within one standard deviation of the mean, in the interval from -s to s.

About 95% of the data falls between -2s and 2s, and approximately all of the data points lie between -3s and 3s.

For a standard normal distribution, 68% of the data fall in the interval from z = -1 to z = 1, where

$$z_i = \frac{x_i - \bar{x}}{s}$$
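The standardization step can be illustrated on the class scores from earlier in the text. This is a sketch only: the small class sample is not guaranteed to follow a normal distribution, so the fraction it reports within one standard deviation need not match 68%.

```python
import statistics

# Class scores as listed earlier in the text.
scores = [58, 95, 80, 75, 68, 97, 60, 85, 75, 88, 90, 78, 62,
          83, 73, 70, 70, 85, 65, 75, 53, 62, 56, 72, 79]

mean = statistics.mean(scores)
s = statistics.stdev(scores)  # sample standard deviation (n - 1 divisor)

# z-score of each data point: z_i = (x_i - mean) / s.
z_scores = [(x - mean) / s for x in scores]

# Fraction of scores within one standard deviation of the mean.
within_one = sum(1 for z in z_scores if -1 <= z <= 1) / len(z_scores)

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"fraction with -1 <= z <= 1: {within_one:.2f}")
```

By construction the z-scores have mean 0 and sample standard deviation 1, which is what allows different data sets to be compared against the same standard normal table.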

AREAS UNDER THE NORMAL CURVE

The region between the mean and z = 2 and the region between z = -2 and the mean (two standard deviations above and below the mean) each represent 0.4772 of the total area under the curve.

99.7%, or almost all, of the data points lie between -3s and 3s.
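The tabulated areas can be reproduced from the standard normal cumulative distribution function, which can be written in terms of the error function `math.erf` from Python's standard library. The helper names `standard_normal_cdf` and `area_between` are illustrative.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(z_low: float, z_high: float) -> float:
    """Area under the standard normal curve between z_low and z_high."""
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)


print(f"{area_between(0, 2):.4f}")   # 0.4772, mean to z = 2
print(f"{area_between(-1, 1):.4f}")  # 0.6827, within one standard deviation
print(f"{area_between(-2, 2):.4f}")  # 0.9545, within two standard deviations
print(f"{area_between(-3, 3):.4f}")  # 0.9973, within three standard deviations
```

The area within two standard deviations is twice 0.4772, which is where the figure of roughly 95% comes from.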

Analysis of Two Histograms

Graph A is the class distribution of numbers 1-10.

Graph B is the class distribution of semester credits.

Data for A = 5.64 +/- 2.6 (much greater spread than B)

Data for B = 15.7 +/- 1.96 (smaller spread)

Skew of A = -0.16 and skew of B = 0.146

CV of A = 0.461 and CV of B = 0.125 (CV = SD/Mean)
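The coefficient of variation (CV) values follow directly from the means and standard deviations quoted above. A short sketch, with an illustrative function name:

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """CV = standard deviation divided by the mean (a unitless spread measure)."""
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return std_dev / mean


# Mean and standard deviation of each data set, as quoted in the text.
cv_a = coefficient_of_variation(2.6, 5.64)
cv_b = coefficient_of_variation(1.96, 15.7)

print(f"CV of A = {cv_a:.3f}")  # 0.461
print(f"CV of B = {cv_b:.3f}")  # 0.125
```

Because CV is scaled by the mean, it allows the spread of A and B to be compared even though the two data sets are measured on different scales.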

[Figure: two histograms. "Frequency A" with horizontal axis values 2 to 10, and "Frequency B" with horizontal axis values 12 to 20.]
