Date post: | 13-Jan-2016 |
Upload: | gerald-barker |
Jerry Post, Copyright © 2003
Database Management Systems: Data Mining
Statistics Review
DATABASE
Probability
Relative frequency approach: the number of times that an event occurs out of the total population of events.
  Example: You have 3 red balls and 7 white balls in a bag. The probability of drawing a white ball on the first try is 7/10 = 70%.
  Example: Your customers are distributed across five cities: 35% in City A, 25% in City B, 20% in City C, 15% in City D, 5% in City E.
Subjective probability: a belief in the likelihood of an outcome.
  Often subjective because of a lack of full information.
  Generally modified over time as new information is acquired.
  It is important (but difficult) to separate belief from preference, and subjective probabilities must remain consistent with each other.
  Example: There is a 65% chance that the Federal Reserve Board will reduce interest rates at the next meeting.
Probability: Frequency
Need a complete count of events.
  Permutations: order does count.
  Combinations: order does not count.
Basic multiplication rule: if a single action can be performed in k ways, and the action is performed n times, the total number of possible outcomes is k*k*k*…*k = k^n.
  Example: Flip a coin five times (n = 5). A single flip has two outcomes (k = 2), so there are 2^5 = 32 possible outcomes.
\[ \Pr = \frac{\#\ \text{of ways for success}}{\#\ \text{of total possibilities}} \]
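The multiplication rule and the relative-frequency formula can be checked by brute-force enumeration. The sketch below lists every outcome of five coin flips and then counts one event; the event "exactly three heads" is an illustrative choice, not something taken from the slides, and the helper name `outcomes` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def outcomes(options, n):
    """List every sequence obtained by repeating an action with the given options n times."""
    return list(product(options, repeat=n))


flips = outcomes("HT", 5)   # k = 2 options, n = 5 repetitions
total = len(flips)          # should equal k**n

# Illustrative event (an assumption, not from the slides): exactly three heads.
successes = [seq for seq in flips if seq.count("H") == 3]
pr_three_heads = Fraction(len(successes), total)

print(total)            # 32
print(pr_three_heads)   # 5/16
```

Enumeration only works for small sample spaces, which is why the counting formulas for permutations and combinations matter for larger ones.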
Counting: Permutation
How many ways can objects (or actions) be rearranged?
  You have four cards: A, K, Q, J. How many ways can they be arranged?
  Four items (n) arranged one card at a time (r): 4 * 3 * 2 * 1 = 24
[Tree diagram: starting from A, the remaining cards K, Q, J branch into every possible order. Choices available at each level: 4, 3, 2, 1.]
Permutation: General
Ways to rearrange n items taken r at a time: n(n-1)(n-2)…(n-r+1)
\[ {}_{n}P_{r} = \frac{n!}{(n-r)!} \]
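As a quick check of the permutation formula, the following sketch compares n!/(n-r)! against an explicit listing of orderings for the four-card example. The function name `n_p_r` is a hypothetical helper introduced here for illustration.

```python
import math
from itertools import permutations

cards = ["A", "K", "Q", "J"]


def n_p_r(n, r):
    """Permutations of n items taken r at a time: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)


all_orderings = list(permutations(cards))          # all four cards, r = n = 4
two_card_orderings = list(permutations(cards, 2))  # only two positions, r = 2

print(len(all_orderings), n_p_r(4, 4))        # 24 24
print(len(two_card_orderings), n_p_r(4, 2))   # 12 12
```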
Combinations
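Number of ways of selecting items when order does not count.
There are fewer combinations than permutations: divide the number of permutations by the number of ways of arranging the r objects (r!).
\[ {}_{n}C_{r} = \binom{n}{r} = \frac{n!}{(n-r)!\,r!} \]
Example: Elect three people from a group of ten. n = 10, r = 3
\[ \frac{10!}{(10-3)!\,3!} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)(3 \cdot 2 \cdot 1)} = \frac{720}{6} = 120 \]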
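The election example can be reproduced in a few lines. This is a minimal sketch: `n_c_r` is a hypothetical helper, and the ten people are represented simply as the integers 0 through 9.

```python
import math
from itertools import combinations


def n_c_r(n, r):
    """Combinations of n items taken r at a time: n! / ((n - r)! r!)."""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))


people = range(10)                           # stand-ins for the ten candidates
committees = list(combinations(people, 3))   # unordered groups of three

ordered_selections = math.factorial(10) // math.factorial(7)   # 10P3 = 720
print(len(committees))                            # 120
print(n_c_r(10, 3))                               # 120
print(ordered_selections // math.factorial(3))    # 720 / 3! = 120
```

The last line shows the relationship stated on the slide: dividing the permutation count by r! removes the orderings that differ only in arrangement.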
Probability Rules: Complement
Complement (opposite): P(E) + P(E’) = 1
The probability of an event happening or not happening is one.
Probability Rules: Mutually Exclusive
Mutually exclusive: only one event of a group can happen. The probability of both occurring is zero: P(A ∩ B) = 0
Then the probability of one or the other of the events occurring is the sum of the probabilities: P(A ∪ B) = P(A) + P(B)
Example: pool balls numbered 1 through 10
  Event A: Draw a ball number <= 3
  Event B: Draw a ball number >= 6
  P(A or B) = 3/10 + 5/10 = 8/10
  Can also be found as the complement: 1 - 2/10 = 8/10
In general, P(E1 ∪ E2 ∪ … ∪ En) = Σ P(Ei)
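The pool ball example maps directly onto set operations. The sketch below assumes one draw from ten equally likely balls and uses exact fractions; the helper `pr` is a hypothetical name.

```python
from fractions import Fraction

balls = set(range(1, 11))                    # balls numbered 1 through 10
event_a = {b for b in balls if b <= 3}       # {1, 2, 3}
event_b = {b for b in balls if b >= 6}       # {6, 7, 8, 9, 10}


def pr(event):
    """Probability of an event when every ball is equally likely."""
    return Fraction(len(event), len(balls))


mutually_exclusive = not (event_a & event_b)      # empty intersection
p_union = pr(event_a | event_b)
p_sum = pr(event_a) + pr(event_b)
p_via_complement = 1 - pr(balls - (event_a | event_b))

print(mutually_exclusive)                  # True
print(p_union, p_sum, p_via_complement)    # 4/5 4/5 4/5
```

The three results agree only because the intersection is empty; for overlapping events the simple sum would count the shared outcomes twice.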
Probability Rules: Independence
Events are independent (pairwise) if they have no influence on each other.
If events are independent, the probability of both events occurring is found by multiplying their individual probabilities: P(A ∩ B) = P(A) P(B)
Example: An urn has 3 red balls and 7 white ones. Draw a ball and then flip a coin. What is the probability you draw a white ball and flip heads?
  P(A ∩ B) = 0.7 * 0.5 = 0.35
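One way to see the multiplication rule for independent events is to enumerate the combined sample space of the urn and the coin. This is a sketch that assumes every ball and both coin faces are equally likely; `pr` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

urn = ["red"] * 3 + ["white"] * 7
coin = ["heads", "tails"]
sample_space = list(product(urn, coin))   # 20 equally likely (ball, face) pairs


def pr(predicate):
    """Share of outcomes in the sample space that satisfy the predicate."""
    hits = sum(1 for outcome in sample_space if predicate(outcome))
    return Fraction(hits, len(sample_space))


p_white = pr(lambda o: o[0] == "white")           # 7/10
p_heads = pr(lambda o: o[1] == "heads")           # 1/2
p_both = pr(lambda o: o == ("white", "heads"))    # 7/20

print(p_both == p_white * p_heads)   # True
print(float(p_both))                 # 0.35
```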
Conditional Probability
The probability that event A will occur given that event B has already happened: P(A | B)
Example 1: An urn has 3 red balls and 7 white ones. On the first draw you pull out a white ball (event B). If you do not replace that ball in the urn, what is the probability of drawing a red ball next (event A)?
  Answer: 3/9. Note that these events are not independent.
In general, the probability of two events occurring: P(A ∩ B) = P(A) P(B | A)
Example 2: Draw 2 cards from a 52-card deck without replacement. What is the probability that both are kings?
  P(King1) = 4/52
  P(King2 | King1) = 3/51
  P(King2 ∩ King1) = 4/52 * 3/51 = 1/221
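The two-kings calculation can be verified by comparing the chain rule with a direct count over all ordered pairs of cards. In this sketch the deck encoding (ranks 1 to 13, with 13 standing for a king, and four suit letters) is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Chain rule: P(King1 and King2) = P(King1) * P(King2 | King1)
p_king1 = Fraction(4, 52)
p_king2_given_king1 = Fraction(3, 51)
p_both_kings = p_king1 * p_king2_given_king1

# Brute force: every ordered pair of distinct cards, i.e. drawing without replacement.
KING = 13
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]
pairs = list(permutations(deck, 2))   # 52 * 51 ordered pairs
king_pairs = [p for p in pairs if p[0][0] == KING and p[1][0] == KING]
p_enumerated = Fraction(len(king_pairs), len(pairs))

print(p_both_kings, p_enumerated)   # 1/221 1/221
```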
Probability: Joint and Conditional Table
              Female   Male   Total
Married        .42     .18     .60
Not Married    .28     .12     .40
Total          .70     .30    1.00

P(Female) = .70
P(Married ∩ Female) = .42
P(Married | Female) = P(M ∩ F)/P(F) = .42/.70 = .60
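The marginal and conditional values follow mechanically from the four joint cells. The sketch below stores the joint table in a dictionary with exact fractions; the key layout and the helper name `marginal` are assumptions for illustration.

```python
from fractions import Fraction

# Joint probabilities keyed by (marital status, sex), taken from the table.
joint = {
    ("married", "female"): Fraction(42, 100),
    ("married", "male"): Fraction(18, 100),
    ("not married", "female"): Fraction(28, 100),
    ("not married", "male"): Fraction(12, 100),
}


def marginal(index, value):
    """Sum the joint cells whose key matches value at the given position."""
    return sum(p for key, p in joint.items() if key[index] == value)


p_female = marginal(1, "female")
p_married = marginal(0, "married")
p_married_given_female = joint[("married", "female")] / p_female

print(p_female, p_married)        # 7/10 3/5
print(p_married_given_female)     # 3/5
```

In this table P(Married | Female) equals P(Married), so marital status and sex happen to be independent here.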
Joint Probability: Tree Diagram
Manufacturing:
  Group A: 4 machines, 5% defect rate
  Group B: 6 machines, 10% defect rate
Choose a machine, then a product: what is the probability that it is defective?
[Tree diagram]
  P(A) = .4
    P(D | A) = .05    ->  P(A ∩ D) = .02
    P(D’ | A) = .95   ->  P(A ∩ D’) = .38
  P(B) = .6
    P(D | B) = .10    ->  P(B ∩ D) = .06
    P(D’ | B) = .90   ->  P(B ∩ D’) = .54
  Total: 1.00
Joint Probabilities: Table
Probability   Defective (D)   Non-defective (D’)
P(A) = 0.4    0.05            0.95
P(B) = 0.6    0.10            0.90

Production    Defective (D)   Non-defective (D’)
A             0.02            0.38
B             0.06            0.54
Total         0.08            0.92

P(A ∩ D) = P(A)*P(D|A) = 0.4(0.05) = 0.02
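The joint table is just each group's prior multiplied by its conditional defect rate. A minimal sketch, with the dictionary names chosen here for illustration:

```python
# Priors and conditional defect rates from the manufacturing example.
prior = {"A": 0.4, "B": 0.6}
p_defect_given = {"A": 0.05, "B": 0.10}

# Joint probability of (group, product state): P(group) * P(state | group).
joint = {}
for group, p_group in prior.items():
    joint[(group, "D")] = p_group * p_defect_given[group]
    joint[(group, "D'")] = p_group * (1 - p_defect_given[group])

# Column totals give the overall probability of each product state.
p_defective = sum(p for (_, state), p in joint.items() if state == "D")
p_good = sum(p for (_, state), p in joint.items() if state == "D'")

for key, p in joint.items():
    print(key, round(p, 2))
print("P(D) =", round(p_defective, 2), " P(D') =", round(p_good, 2))
```

Rounding is applied only for display because the products are computed in binary floating point.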
Bayes’ Theorem
Now, in a sense, work backwards.
We sample a part at random and it is defective.
What is the probability that it came from machine A? Machine B?
\[ P(A \mid D) = \frac{P(A \cap D)}{P(D)} \]
P(A | D) = 0.02/0.08 = 1/4
P(B | D) = 0.06/0.08 = 3/4
In this example, the machine is the state of nature we wish to identify, and defective or not is the information.
Bayes’ Theorem in General
We know:
  (1) There are n states of nature S1, S2, …, Sn
  (2) An initial (a priori) probability for each state
  (3) Some type of information I
  (4) The conditional probabilities: P(I | Si)
We can compute the posterior probabilities, given the new information:
\[ P(S_i \mid I) = \frac{P(S_i \cap I)}{P(I)} = \frac{P(S_i \cap I)}{P(S_1 \cap I) + P(S_2 \cap I) + \dots + P(S_n \cap I)} \]
\[ P(S_i \mid I) = \frac{P(S_i)\,P(I \mid S_i)}{P(S_1)\,P(I \mid S_1) + P(S_2)\,P(I \mid S_2) + \dots + P(S_n)\,P(I \mid S_n)} \]
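The general formula translates into a short function. The sketch below is one possible implementation, not a library API: `posterior` is a hypothetical name, and states are represented as dictionary keys. It is checked against the defective-part example from the earlier slides.

```python
def posterior(priors, likelihoods):
    """Return P(S_i | I) for every state.

    priors:      P(S_i) keyed by state
    likelihoods: P(I | S_i) keyed by the same states
    """
    joint = {s: priors[s] * likelihoods[s] for s in priors}   # P(S_i and I)
    p_info = sum(joint.values())                              # P(I)
    return {s: joint[s] / p_info for s in joint}


# Machine groups as states of nature, "part is defective" as the information.
machine_posterior = posterior({"A": 0.4, "B": 0.6}, {"A": 0.05, "B": 0.10})
print({s: round(p, 2) for s, p in machine_posterior.items()})   # {'A': 0.25, 'B': 0.75}
```

The denominator is the same for every state, so it is computed once and reused.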
Bayes’ Theorem Example
Chao: Statistics for Management/2e
States of economy: S1: recession, S2: stable, S3: prosperity
P(S1) = .25, P(S2) = .5, P(S3) = .25 (in general/a priori)
We have forecasts as information. The forecasts are either optimistic (I) or pessimistic (I’).
The results of the forecasts in the past are as follows:

Prior Probability   State of Economy   Optimistic (I)   Pessimistic (I’)
P(S1) = .25         S1                 0.1              0.9
P(S2) = .50         S2                 0.5              0.5
P(S3) = .25         S3                 0.8              0.2
Example: Joint Probability
Prior Probability   State of Economy   Optimistic (I): P(I | Si)   Pessimistic (I’): P(I’ | Si)
P(S1) = .25         S1                 0.1                         0.9
P(S2) = .50         S2                 0.5                         0.5
P(S3) = .25         S3                 0.8                         0.2

State   Optimistic (I)       Pessimistic (I’)
S1      P(S1 ∩ I) = 0.025    0.225
S2      P(S2 ∩ I) = 0.250    0.250
S3      P(S3 ∩ I) = 0.200    0.050
Total   P(I) = 0.475         P(I’) = 0.525
Bayes’ Example
State   Optimistic (I)       Pessimistic (I’)
S1      P(S1 ∩ I) = 0.025    0.225
S2      P(S2 ∩ I) = 0.250    0.250
S3      P(S3 ∩ I) = 0.200    0.050
Total   P(I) = 0.475         P(I’) = 0.525

Probability that next year is prosperous (S3) if the forecast is optimistic (I):
P(S3 | I) = P(S3 ∩ I)/P(I) = 0.200/0.475 = .421
\[ P(S_3 \mid I) = \frac{0.25(0.8)}{0.25(0.1) + 0.5(0.5) + 0.25(0.8)} = \frac{0.2}{0.475} = 0.421 \]
Bayes: Prior and Posterior Probabilities
Probability estimates at the start (a priori) are naïve:
P(S1) = 0.25
P(S2) = 0.50
P(S3) = 0.25
Probabilities after the forecast (posterior) reflect the new information:
P(S1 | I) = 0.053 P(S1 | I’) = 0.429
P(S2 | I) = 0.526 P(S2 | I’) = 0.476
P(S3 | I) = 0.421 P(S3 | I’) = 0.095
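The posterior values for both forecasts can be reproduced from the priors and the forecast track record. In this sketch the function `posterior` and the dictionary names are hypothetical; the pessimistic likelihoods are derived as 1 - P(I | Si), which assumes a forecast is always either optimistic or pessimistic, as the example states.

```python
priors = {"S1": 0.25, "S2": 0.50, "S3": 0.25}
p_optimistic_given = {"S1": 0.1, "S2": 0.5, "S3": 0.8}
# Assumption: exactly two forecast types, so P(I' | Si) = 1 - P(I | Si).
p_pessimistic_given = {s: 1 - p for s, p in p_optimistic_given.items()}


def posterior(priors, likelihoods):
    """Bayes' theorem: P(S_i | info) from P(S_i) and P(info | S_i)."""
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    p_info = sum(joint.values())
    return {s: joint[s] / p_info for s in joint}


after_optimistic = posterior(priors, p_optimistic_given)
after_pessimistic = posterior(priors, p_pessimistic_given)

for state in priors:
    print(state, round(after_optimistic[state], 3), round(after_pessimistic[state], 3))
# S1 0.053 0.429
# S2 0.526 0.476
# S3 0.421 0.095
```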
Mean and Standard Deviation
[Figure: Normal Distribution. Density curves labeled N(0,1), N(3,1), and N(0,5); x-axis from -6 to 6, y-axis from 0 to 0.45. Annotations: Mean = 0; standard deviations: 1, 2, 3.]
Cumulative Normal
[Figure: Cumulative Normal Probability. x-axis from -4 to 4, y-axis from 0 to 1.]
P(X <= 0) = 0.5000
P(X <= 1) = 0.8413
P(X <= 2) = 0.9772
P(X <= 3) = 0.9987
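Cumulative probabilities like these can be computed rather than read from a chart. The sketch below assumes the curve is the standard normal N(0,1) and uses `statistics.NormalDist` from the Python standard library (available from Python 3.8).

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)   # assumption: the chart shows N(0,1)

# Cumulative probability P(X <= x) at the points marked on the chart.
cumulative = {x: standard_normal.cdf(x) for x in (0, 1, 2, 3)}
for x, p in cumulative.items():
    print(f"P(X <= {x}) = {p:.4f}")

# The inverse direction: the x value below which 95% of the distribution lies.
x_95 = standard_normal.inv_cdf(0.95)
print(f"95th percentile: {x_95:.3f}")
```

The inverse lookup is the operation used to find a critical value for a chosen significance level.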
Hypothesis Testing
[Figure: two charts, each showing a distribution curve over x from -6 to 5.9 with the y-axis from 0 to 0.25; the critical value is marked.]