Lecture 2: Chapter 2. Basic Concepts of Probability Theory
ELEC206 Probability and Random Processes, Fall 2014
Gil-Jin Jang
gjang@knu.ac.kr
School of EE, KNU
page 1 / 45 — Chapter 2. Basic Concepts of Probability Theory
Overview
2.1 Specifying Random Experiments
2.2 The Axioms of Probability
2.3 Computing Probabilities using Counting Methods
2.4 Conditional Probability
2.5 Independence of Events
2.6 Sequential Experiments
2.1 Specifying Random Experiments
The outcome varies in an unpredictable fashion when repeated under the same conditions.
A random experiment is a process characterized by the following:
It is performed according to some set of rules.
It can be repeated arbitrarily often.
The result of each performance depends on chance and cannot be predicted uniquely.
Example: Tossing of a coin
Sequential random experiments
Performing a sequence of simple random sub-experiments
Example: First toss a coin, then throw a die.
Sometimes, the second sub-experiment depends on the outcome of the first; example: toss a coin first, and if it is a head, then throw a die.
2.1.1 The Sample Space
Sample space S of random experiment
Defined as the set of all possible outcomes.
Outcomes are mutually exclusive in the sense that they cannot occur simultaneously.
A sample space can be finite, countably infinite, or uncountably infinite.
Example 2.2, Figure 2.1
Discrete sample space: S is countably finite or infinite. S1 ∼ S5: countably finite; S6: countably infinite.
Continuous sample space: S is not countable. S7 ∼ S13: continuous sample spaces.
Multi-dimensional sample space: one or more observations from an experiment. S2 ∼ S12, S11 ∼ S13.
Multi-dimensional sample spaces can be written as Cartesian products of other sets: S11 = R × R.
2.1.2 Events
An event is a set of possible outcomes of an experiment, so an event is a subset of the sample space S.
Certain event, S: always occurs.
Null event, ∅: never occurs.
Elementary event: an event that contains a single outcome.
The whole sample space is an event and is called the certain (or sure) event.
Example: tossing a die
2.1.3 Review of Set Theory
A set is a collection of objects and will be denoted by capital letters S, A, B, . . ..
U: the universal set, consisting of all possible objects of interest in a given setting or application. For example, the universal set in Experiment E6 is U6 = {1, 2, . . .}.
A set A is a collection of objects from U; these objects are called the elements or points of the set A, and are usually denoted by lowercase letters. The notations x ∈ A and x ∉ A indicate that “x is an element of A” or “x is not an element of A,” respectively.
Set operations
Union, A ∪B: Set of outcomes either in A, or in B, or in both
Intersection, A ∩B : Set of outcomes that are in both A and B
Complement, Ac: Set of outcomes that are not in A
Difference, A−B: Set of outcomes that are in A but not in B
See Figure 2.2 and Example 2.5
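These set operations map directly onto Python's built-in `set` type. A minimal sketch (the universal set and events below are illustrative, not from the text):

```python
# Universal set: outcomes of a six-sided die (illustrative example)
U = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}          # "even number"
B = {4, 5, 6}          # "greater than 3"

union        = A | B   # A ∪ B: outcomes in A, in B, or in both
intersection = A & B   # A ∩ B: outcomes in both A and B
complement_A = U - A   # A^c: outcomes not in A
difference   = A - B   # A − B: outcomes in A but not in B

# DeMorgan's rule from the next slide: (A ∩ B)^c = A^c ∪ B^c
assert U - (A & B) == (U - A) | (U - B)
```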
Set Theory
Mutually exclusive ≡ Disjoint: if A ∩B = ∅
Disjoint events cannot occur simultaneously
Implication: If an event A is a subset of an event B, then A implies B
1. A ⊂ B
2. A ∩B ⊂ A and A ∩ B ⊂ B
3. A ⊂ B ∪A and B ⊂ B ∪A
Equal: A and B are equal if and only if A ⊂ B and B ⊂ A
Properties
Commutativity: A ∪B = B ∪A and A ∩B = B ∩A
Associativity: A ∪ (B ∪ C) = (A ∪B) ∪ C and
A ∩ (B ∩ C) = (A ∩B) ∩ C
Distributivity: A ∪ (B ∩ C) = (A ∪B) ∩ (A ∪ C) and
A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C)
DeMorgan’s rule: (A ∩B)c = Ac ∪Bc and (A ∪B)c = Ac ∩Bc
See Example 2.7
2.2 The Axioms of Probability
Let E be a random experiment with sample space S and event class
F . The probability of A, P [A] satisfies the following axioms:
Axiom I: 0 ≤ P [A]
Axiom II: P [S] = 1
Axiom III: If A ∩ B = ∅ (A and B are mutually exclusive events), then P[A ∪ B] = P[A] + P[B]
Axiom III′: If A1, A2, . . . is a sequence of events such that Ai ∩ Aj = ∅ for all i ≠ j, then

P[⋃_{k=1}^{∞} Ak] = Σ_{k=1}^{∞} P[Ak]

Axioms I∼III are enough for finite sample spaces; Axiom III′ is needed for infinite sample spaces.
Prove the Following
Corollary 1: P [Ac] = 1− P [A]
Corollary 2: P [A] ≤ 1
Corollary 3: P [∅] = 0
Corollary 4: If A1, A2, . . . , An are pairwise mutually exclusive, then

P[⋃_{k=1}^{n} Ak] = Σ_{k=1}^{n} P[Ak] for n ≥ 2
Corollary 5: P [A ∪B] = P [A] + P [B]− P [A ∩B]
Corollary 6:

P[⋃_{k=1}^{n} Ak] = Σ_{j=1}^{n} P[Aj] − Σ_{j<k} P[Aj ∩ Ak] + · · · + (−1)^{n+1} P[A1 ∩ · · · ∩ An]

Corollary 7: If A ⊂ B, then P[A] ≤ P[B].
2.2.1 Discrete sample spaces
Consider a finite sample space S = {a1, a2, . . . , an}, where all distinct elementary events are mutually exclusive. By Corollary 4, the probability of any event B = {a′1, a′2, . . . , a′m} is given by

P[B] = P[{a′1, a′2, . . . , a′m}] = P[{a′1}] + P[{a′2}] + · · · + P[{a′m}]

In the case of countably infinite S, the probability of any event B = {a′1, a′2, . . .} is given by

P[B] = P[{a′1, a′2, . . .}] = P[{a′1}] + P[{a′2}] + · · ·

Equally likely outcomes:

P[{a1}] = P[{a2}] = · · · = P[{an}] = 1/n

P[B] = P[{a′1}] + P[{a′2}] + · · · + P[{a′m}] = m/n
Example 2.9
An urn contains 10 identical balls numbered 0, 1, 2, . . . , 9. Find the probability of the following events:
A: the number of the ball selected is odd
B: the number of the ball selected is a multiple of 3
C: the number of the ball selected is less than 5
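Since the ten outcomes are equally likely, each probability is just the event size divided by 10. A quick sketch of Example 2.9 (taking "multiple of 3" to mean 3, 6, 9):

```python
from fractions import Fraction

S = set(range(10))                            # balls numbered 0..9, equally likely
A = {n for n in S if n % 2 == 1}              # odd number
B = {n for n in S if n > 0 and n % 3 == 0}    # multiple of 3 (i.e., 3, 6, 9)
C = {n for n in S if n < 5}                   # less than 5

def P(event):
    """Equally likely outcomes: P[B] = m/n."""
    return Fraction(len(event), len(S))
```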
2.2.2 Continuous Sample Spaces
In the case of a single observation, the outcomes are points of the real line, and events consist of intervals of the real line.
Probability laws in experiments with such sample spaces specify a rule for assigning numbers to intervals of the real line; note that there are infinitely many events. Countable additivity (Axiom III′) still applies:

P[⋃_{k=1}^{∞} Ak] = Σ_{k=1}^{∞} P[Ak]

The Borel field B contains all open and closed intervals of the real line, as well as all events that can be obtained as countable unions, intersections, and complements:

B ∋ (a, b), [a, b], (a, b], [a, b), [a, ∞), (a, ∞), (−∞, b), {b}

In the case of two combined observations, the probability law assigns numbers to rectangular regions of the plane.
Example 2.12
Consider “pick a number x at random between 0 and 1.” The sample space S is the unit interval [0, 1], which is uncountably infinite.
Suppose that all the outcomes in S are equally likely; then the probability that the outcome is in the interval [0, 1/2] is the same as the probability that the outcome is in [1/2, 1].
The probability that the outcome falls in a subinterval of S is equal to the length of the subinterval, that is,

P[x ∈ [a, b]] = b − a for 0 ≤ a ≤ b ≤ 1

Axioms I and II are satisfied: b ≥ a ≥ 0, and S = [a, b] with a = 0, b = 1.

P[[0, 1/2]] = P[[1/2, 1]] = 0.5, P[{1/2}] = 0.5 − 0.5 = 0

The probability that the outcome is exactly 1/2 is zero, since there are uncountably many equally likely outcomes.
Union of intervals: “the outcome is at least 0.3 away from the center of the unit interval”:

A = [0, 0.2] ∪ [0.8, 1], P[A] = P[[0, 0.2]] + P[[0.8, 1]] = 0.4
Zero Probability Paradox
In a continuous space, the probability of a specific value is zero ⇒ does that mean it cannot occur?
A specific value in a continuous space is modeled by an interval whose width approaches zero; the probability approaches zero as well, since it is proportional to the interval width:

lim_{a→b} P[[a, b]] = lim_{a→b} (b − a) = 0

If there are an infinite number of trials, the relative frequency of a single outcome becomes almost zero.
This does not mean that the outcome cannot occur, but that it occurs very rarely.
Probabilities are therefore assigned to discrete elementary events, to intervals of the real line, or to regions of the plane.
Example 2.14: pick two numbers x and y at random in [0, 1].
The probability of an event is its area. See Figure 2.7.
A = {x > 0.5}, B = {y > 0.5}, and C = {x > y}.
What are P[A], P[B], and P[C]?
Example 2.13
Lifetime of a computer memory chip: “The proportion of chips whose lifetime exceeds t decreases exponentially at a rate of α.” Find an appropriate probability law.
Let the sample space be S = (0, ∞). Assign the probabilities to events of lifetime:

P[(t, ∞)] = e^{−αt} for t > 0, α > 0

Axiom I is satisfied (why?); Axiom II is satisfied since

P[S] = P[(0, ∞)] = e^{−α·0} = 1

How to compute P[(r, s]]? (Hint: decompose into exclusive subintervals.)

(r, s] ∪ (s, ∞) = (r, ∞) ⇔ P[(r, ∞)] = P[(r, s]] + P[(s, ∞)]
⇔ P[(r, s]] = P[(r, ∞)] − P[(s, ∞)] = e^{−αr} − e^{−αs}
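The decomposition above can be checked numerically. A sketch with illustrative values of α, r, and s (not from the text):

```python
import math

alpha = 0.5                                  # illustrative failure rate
P_tail = lambda t: math.exp(-alpha * t)      # P[(t, ∞)] = e^{-αt}

def P_interval(r, s):
    """P[(r, s]] = P[(r, ∞)] − P[(s, ∞)] = e^{-αr} − e^{-αs}."""
    return P_tail(r) - P_tail(s)

r, s = 1.0, 3.0   # illustrative interval endpoints
```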
2.3 Probabilities using Counting
In many experiments with finite sample spaces, the outcomes can be assumed to be equiprobable.
Using relative frequency, the probability of an event is the ratio of the number of outcomes in the event to the total number of outcomes.
The calculation of probabilities then reduces to counting the number of outcomes in an event.
In general, the number of distinct ordered k-tuples (x1, . . . , xk) with components xi from a set with ni distinct elements is n1 n2 · · · nk.
See Figure 2.8 for k = 2.
2.3.1 with Repl. and with Ordering
Condition 1 (ordering): choosing k objects from a set A of n distinct objects. Let A be the “population.” The order of the selections is recorded.
Condition 2 (replacement): after selecting an object, it is placed back in the set for the next choice.
The experiment produces an ordered k-tuple (x1, . . . , xk), where xi ∈ A and i = 1, . . . , k.
With replacement, n1 = n2 = · · · = nk = n, so the number of distinct ordered k-tuples becomes n^k.
Example 2.15 An urn contains five balls. Suppose we select two balls with replacement.
1. How many distinct ordered pairs are possible?
2. What is the probability that the two draws yield the same number?
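Example 2.15 can be answered by brute-force enumeration of the n^k = 5^2 ordered pairs (a sketch; the balls are labeled 1 to 5 here for concreteness):

```python
from itertools import product

balls = range(1, 6)                        # five balls, labeled 1..5
pairs = list(product(balls, repeat=2))     # ordered pairs, with replacement

n_pairs = len(pairs)                                 # n^k = 5^2 = 25
p_same  = sum(a == b for a, b in pairs) / n_pairs    # 5 matching pairs out of 25
```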
2.3.2 without Repl. and with Ordering
(No replacement) Choosing k objects in succession without replacement from a population A of n distinct objects. Clearly, there is an implicit condition k ≤ n.
The number of possible outcomes decreases by 1 with every selection, so the number of distinct ordered k-tuples is

n(n − 1) · · · (n − (k − 1)) = ∏_{i=n−k+1}^{n} i = n! / (n − k)! ≜ P^n_k

P^n_k is often read as “n permutation k.”
Example 2.16 An urn contains five balls. Suppose we select two balls without replacement.
1. How many distinct ordered pairs are possible?
2. What is the probability that the first ball has a larger number thanthe second ball?
Example 2.17 Suppose we select 3 balls with replacement. What is the probability that the three balls are different?
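Examples 2.16 and 2.17 can be enumerated the same way (again labeling the balls 1 to 5):

```python
from itertools import permutations, product

balls = range(1, 6)

# Example 2.16: ordered pairs without replacement
pairs = list(permutations(balls, 2))       # P(5,2) = 5·4 = 20 ordered pairs
p_first_larger = sum(a > b for a, b in pairs) / len(pairs)   # by symmetry, 1/2

# Example 2.17: three draws with replacement, all three different
triples = list(product(balls, repeat=3))   # 5^3 = 125 ordered triples
p_all_diff = sum(len(set(t)) == 3 for t in triples) / len(triples)  # 5·4·3 / 125
```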
2.3.3 Permutations of n Distinct Objects
Choosing k objects without replacement with k = n: drawing objects from an urn until the urn is empty. The number of distinct ordered n-tuples is

n(n − 1) · · · (2)(1) ≜ n!
Example 2.19 Suppose that 12 balls are placed at random into 12 cells, where more than one ball is allowed to occupy a cell. What is the probability that all cells are occupied?

12! / 12^12 = (12/12)(11/12) · · · (1/12)
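A one-line numerical check of Example 2.19:

```python
import math

n = 12
# All cells occupied: 12! favorable orderings out of 12^12 equally likely placements
p_all_occupied = math.factorial(n) / n**n   # a very small probability
```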
2.3.4 without Repl. and without Ordering
Combination: choosing k objects without replacement, regardless of the order.
In 2.3.2, the number of ways of choosing k objects out of n distinct objects without replacement while keeping the selection order was P^n_k = n!/(n − k)!.
The number of combinations is obtained by dividing P^n_k by the number of orderings of a k-tuple:

C^n_k = P^n_k / k! = n! / ((n − k)! k!)

C^n_k is often read as “n combination k” or “n choose k.”
Example 2.22 A batch of 50 items contains 10 defective items. Suppose 10 items are selected at random and tested. What is the probability that exactly 5 of the items tested are defective?
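Example 2.22 counts combinations: choose 5 of the 10 defective items and 5 of the 40 good ones, out of C^50_10 equally likely samples. A sketch using Python's `math.comb`:

```python
from math import comb

N, D = 50, 10    # batch size, number of defectives
n, k = 10, 5     # sample size, defectives found in the sample

# C(D, k) ways to pick the defectives, C(N-D, n-k) ways to pick the good items
p = comb(D, k) * comb(N - D, n - k) / comb(N, n)
```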
2.4 Conditional Probability
Definition: the probability of an event A occurring when it is known that some event B has occurred:

P[A|B] = P[A ∩ B] / P[B] for P[B] > 0

P[A|B] does not exist if P[B] = 0.
In the view of relative frequency,

n_{A∩B} / n_B = (n_{A∩B}/n) / (n_B/n) → P[A ∩ B] / P[B]

If we multiply both sides of the definition of P[A|B] by P[B],

P[A ∩ B] = P[A|B] P[B]

Similarly, we also obtain

P[B ∩ A] = P[B|A] P[A]
Example 2.24
2.24 An urn contains two black balls, numbered 1 and 2, and two white balls, numbered 3 and 4. The sample space is {(1, b), (2, b), (3, w), (4, w)}. Assuming that the four outcomes are equally likely, find P[A|B] and P[A|C], where A is “black ball selected,” B is “even-numbered ball selected,” and C is “number of ball selected is greater than 2” (the same events as in Example 2.31).
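Example 2.24 can be worked by direct counting over the four equally likely outcomes. The event definitions below (A: black, B: even-numbered, C: number greater than 2) are the ones used later in Example 2.31:

```python
from fractions import Fraction

S = {(1, "b"), (2, "b"), (3, "w"), (4, "w")}    # equally likely outcomes
A = {(1, "b"), (2, "b")}                        # black ball selected
B = {(2, "b"), (4, "w")}                        # even-numbered ball selected
C = {(3, "w"), (4, "w")}                        # number greater than 2

def P(E):
    return Fraction(len(E), len(S))

def P_cond(E, F):
    """P[E|F] = P[E ∩ F] / P[F]."""
    return P(E & F) / P(F)
```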
Example 2.26 Binary Communication
The user inputs a 0 or a 1 into the system, and a corresponding signal is transmitted. The receiver makes a decision based on the received signal.
Suppose that the user sends 0s with probability 1 − p and 1s with probability p, and that the receiver makes random decision errors with probability ε. For i = 0, 1, let Ai be the event “input was i,” and let Bi be the event “receiver decision was i.”
Find the probabilities P[Ai ∩ Bj] for i = 0, 1 and j = 0, 1.
(See Figure 2.11 on page 64.)
Partitioning
Partition: let B1, B2, . . . , Bn be mutually exclusive events whose union equals the sample space S, as shown in Figure 2.12. We refer to these sets as a partition of S.
Any event A can be represented as the union of mutually exclusive events in the following way:

A = A ∩ S = A ∩ (B1 ∪ B2 ∪ · · · ∪ Bn)
  = (A ∩ B1) ∪ (A ∩ B2) ∪ · · · ∪ (A ∩ Bn)

Theorem of total probability: by Corollary 4, the probability of A is

P[A] = P[A ∩ B1] + P[A ∩ B2] + · · · + P[A ∩ Bn]
     = P[A|B1] P[B1] + P[A|B2] P[B2] + · · · + P[A|Bn] P[Bn]
Example 2.28 Manufacturing
A manufacturing process produces a mix of “good” and “bad” memory chips. The lifetime of good chips follows the exponential law of Example 2.13, with a rate of failure α. The lifetime of bad chips also follows the exponential law, but with a rate of failure of 1000α.
Suppose that the fraction of good chips is 1 − p and that of bad chips is p. Find the probability that a randomly selected chip is still functioning after t seconds.
Solution: let
C be the event “chip still functioning after t seconds,”
G be the event “chip is good,”
B be the event “chip is bad.”

P[C] = P[C|G] P[G] + P[C|B] P[B]
     = P[C|G](1 − p) + P[C|B] p
     = (1 − p) e^{−αt} + p e^{−1000αt}
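The total-probability computation above is easy to sanity-check numerically (the values of α, p, and t below are illustrative, not from the text):

```python
import math

alpha, p, t = 0.01, 0.1, 50.0      # illustrative: failure rate, bad fraction, time

P_C_given_G = math.exp(-alpha * t)           # good chip survives t seconds
P_C_given_B = math.exp(-1000 * alpha * t)    # bad chip survives t seconds

# Theorem of total probability over the partition {G, B}
P_C = P_C_given_G * (1 - p) + P_C_given_B * p
```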
2.4.1 Bayes’ Rule
Two events: for P[A] > 0 and P[B] > 0,

P[B|A] = P[A ∩ B] / P[A] = P[A|B] P[B] / P[A]

Multiple events: let B1, B2, . . . , Bn be a partition of a sample space S. Suppose that event A occurs; what is the probability of event Bj?

P[Bj|A] = P[A ∩ Bj] / P[A] = P[A|Bj] P[Bj] / Σ_{k=1}^{n} P[A|Bk] P[Bk]

P[Bj]: the “a priori probabilities” of the events Bj.
P[Bj|A]: the “a posteriori probabilities,” given the additional information that A occurred.
Ex 2.29 Binary Communication System
In Example 2.26, find which input is more probable given that the receiver has output a 1. Assume that, a priori, the input is equally likely to be 0 or 1.
1. “The receiver has output a 1” is the given “fact” → B1.
2. “Which input is more probable given B1?” → P[A0|B1] ≷ P[A1|B1]?
3. “Equally likely” → P[A0] = P[A1] = 1/2.

P[A0|B1] = P[A0 ∩ B1] / P[B1] = P[B1|A0] P[A0] / P[B1]
P[A1|B1] = P[B1|A1] P[A1] / P[B1]   (Bayes’ rule)

By the theorem of total probability (A0, A1 partition the input),

P[B1] = P[(B1 ∩ A0) ∪ (B1 ∩ A1)] = P[B1 ∩ A0] + P[B1 ∩ A1]
      = P[B1|A0] P[A0] + P[B1|A1] P[A1] = ε · (1/2) + (1 − ε) · (1/2) = 1/2

⇒ P[A0|B1] = ε(1/2) / (1/2) = ε
   P[A1|B1] = (1 − ε)(1/2) / (1/2) = 1 − ε
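The posterior computation of Example 2.29 generalizes to any prior p. A sketch:

```python
def posterior_input_given_output1(p, eps):
    """Return (P[A0|B1], P[A1|B1]) for input prior P[A1] = p and error prob eps."""
    # Likelihoods of observing output 1
    P_B1_given_A0 = eps          # a 0 was sent but flipped
    P_B1_given_A1 = 1 - eps      # a 1 was sent and received correctly
    # Theorem of total probability
    P_B1 = P_B1_given_A0 * (1 - p) + P_B1_given_A1 * p
    # Bayes' rule
    return P_B1_given_A0 * (1 - p) / P_B1, P_B1_given_A1 * p / P_B1
```

With p = 1/2 the posteriors reduce to ε and 1 − ε, as derived above.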
Example 2.30 Quality Control
Every chip is tested for t seconds prior to leaving the factory. Find the value of t for which 99% of the chips sent out to customers are good.
1. Define an event C as “tested for t seconds with a good result” (still functioning).
2. Define events G and B as “chip is good” and “chip is bad,” without any given facts (a priori probabilities).
3. “99% of the chips with a good test result are really good ones”:
P[good chip | good test result] ≥ 0.99 ⇒ P[G|C] = 0.99

P[G|C] = P[C|G] P[G] / P[C]
       = P[C|G] P[G] / (P[C|G] P[G] + P[C|B] P[B])
       = e^{−αt}(1 − p) / (e^{−αt}(1 − p) + e^{−1000αt} p) = 0.99

t = (1 / (999α)) ln(99p / (1 − p))
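The closed-form t can be verified by substituting it back into P[G|C] (α and p below are illustrative values):

```python
import math

alpha, p = 0.01, 0.10      # illustrative: failure rate, a priori bad fraction

# Derived test duration: t = ln(99p / (1-p)) / (999α)
t = math.log(99 * p / (1 - p)) / (999 * alpha)

# Substitute back: P[G|C] should come out to exactly 0.99
num = math.exp(-alpha * t) * (1 - p)
P_G_given_C = num / (num + math.exp(-1000 * alpha * t) * p)
```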
2.5 Independence of Events
An event A is independent of an event B if the probability that A occurs is not influenced by whether B has or has not occurred. In terms of probabilities,

P[A] = P[A|B] = P[A ∩ B] / P[B] for P[B] ≠ 0
⇔ P[A ∩ B] = P[A] P[B] = P[A|B] P[B],

which implies P[A] = P[A|B] and P[B] = P[B|A].
In general, if two events have nonzero probability and are mutually exclusive, they cannot be independent, since

0 = P[A ∩ B] = P[A] P[B]

implies that at least one of A or B must have zero probability.
Example 2.31
A ball is selected from an urn containing two black balls, numbered 1 and 2, and two white balls, numbered 3 and 4. Let the events A, B, and C be defined as
A = {(1, b), (2, b)}, “black ball selected”;
B = {(2, b), (4, w)}, “even-numbered ball selected”;
C = {(3, w), (4, w)}, “number of ball selected is greater than 2.”
Are events A and B independent? Are events A and C independent?
Example 2.32
Two numbers x and y are selected at random from [0, 1]. Let the events A,
B, C be defined as
A = {x > 0.5}, B = {y > 0.5}, C = {x > y}.
Are events A and B independent? Are events A and C independent? (See Figure 2.13.)
Independence of 3 Events
Conditions for independence of events A, B, C.
1. Pairwise independence
P [A∩B] = P [A]P [B], P [A∩C] = P [A]P [C], P [B∩C] = P [B]P [C].
2. The joint occurrence of any two events should not affect the probability of the third:

P[C|A ∩ B] = P[A ∩ B ∩ C] / P[A ∩ B] = P[C]

3. This in turn implies that we must have

P[A ∩ B ∩ C] = P[A ∩ B] P[C] = P[A] P[B] P[C].

Pairwise independence does not necessarily imply that P[A ∩ B ∩ C] = P[A] P[B] P[C].
Example 2.33
Two numbers x and y are selected at random from the unit interval. Let the events B, D, and F be defined as

B = {y > 1/2}, D = {x < 1/2},
F = {x < 1/2 ∩ y < 1/2} ∪ {x > 1/2 ∩ y > 1/2}.

Show that the three events are not independent (see Figure 2.14).
2.6 Sequential Experiments
2.6.1 Sequence of Independent Experiments
Consider a random experiment consisting of performing n sub-experiments E1, E2, . . . , En.
The outcome is an n-tuple s = (s1, s2, . . . , sn), where sk is the outcome of sub-experiment Ek.
Sample space: the Cartesian product space S = S1 × S2 × · · · × Sn.
If A1, A2, . . . , An are events such that Ak concerns only the sub-experiment Ek, and if the sub-experiments are independent, then

P[A1 ∩ A2 ∩ · · · ∩ An] = P[A1] P[A2] · · · P[An]
Example 2.36
Ten numbers are selected at random from [0, 1]. Let x1, x2, . . . , x10 be the sequence of 10 numbers, and define

Ak = {xk < 1/4} for k = 1, . . . , 5
Ak = {xk > 1/2} for k = 6, . . . , 10

If we assume independent experiments, then

P[A1 ∩ A2 ∩ · · · ∩ A10] = P[A1] P[A2] · · · P[A10] = (1/4)^5 (1/2)^5
2.6.2 The Binomial Probability Law
A Bernoulli trial involves performing an experiment once and noting whether a particular event A occurs or not, i.e., a binary outcome.
Example 2.37 Coin tossing three times: assume the tosses are independent and the probability of heads is p; compute the probability of each of the possible sequences.
Binomial Probability Theorem
Let k be the number of successes in n independent Bernoulli trials; then the probabilities of k are given by

p_n(k) = C^n_k p^k (1 − p)^{n−k} for k = 0, . . . , n,

where p_n(k) is the probability of k successes in n trials, and

C^n_k = n! / (k!(n − k)!).
Example 2.38 Verify the above theorem with Example 2.37 by computing p3(0), p3(1), p3(2), p3(3).
Binomial Theorem
In general, the binomial theorem states that

(a + b)^n = Σ_{k=0}^{n} C^n_k a^k b^{n−k}.

If we let a = b = 1, then

2^n = Σ_{k=0}^{n} C^n_k = Σ_{k=0}^{n} N_n(k),

which implies that there are 2^n distinct possible sequences of successes and failures in n trials.
If we let a = p and b = 1 − p, then

1 = Σ_{k=0}^{n} C^n_k p^k (1 − p)^{n−k} = Σ_{k=0}^{n} p_n(k),

which conforms to probability Axiom II.
n! grows extremely quickly, so the following recursive formula provides numerical efficiency:

p_n(k + 1) = ((n − k) p / ((k + 1)(1 − p))) p_n(k).
Example 2.39
Let k be the number of active speakers in a group of 8 noninteracting (independent) speakers. A speaker is active with probability 1/3.
Find the probability that more than six speakers are active.
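Example 2.39 is a direct application of the binomial law; the sketch below also uses the recursive formula from the previous slide as a cross-check:

```python
from math import comb

def p_n(n, k, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 8, 1 / 3
# "More than six active" means k = 7 or k = 8
p_more_than_6 = p_n(n, 7, p) + p_n(n, 8, p)

# Cross-check with the recursion p_n(k+1) = (n-k)p / ((k+1)(1-p)) * p_n(k)
pk = p_n(n, 0, p)
for k in range(n):
    pk = (n - k) * p / ((k + 1) * (1 - p)) * pk   # pk is now p_n(n, k+1, p)
```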
2.6.3 Multinomial Probability Law
Generalization of the binomial probability law.
Let B1, B2, . . . , BM be a partition of the sample space S with P[Bj] = pj. The events are mutually exclusive, so Σ_{j=1}^{M} pj = 1.
Suppose that n independent, repeated experiments are performed. Let kj be the number of times that event Bj occurs; then the probability of the vector (k1, k2, . . . , kM) satisfies the multinomial probability law:

P[(k1, k2, . . . , kM)] = (n! / (k1! k2! · · · kM!)) p1^{k1} p2^{k2} · · · pM^{kM},

where Σ_{j=1}^{M} kj = n.
Examples
2.41 A dart is thrown nine times at a target consisting of three areas. Each throw has a probability of 0.2, 0.3, and 0.5 of landing in areas 1, 2, and 3, respectively. Find the probability that the dart lands exactly three times in each of the areas.
2.42 We pick 10 telephone numbers at random from a telephone book and note the last digit of each of the numbers. What is the probability that we obtain each of the integers from 0 to 9 exactly once?

P[(k1, . . . , kM)] = (n! / (k1! · · · kM!)) p1^{k1} · · · pM^{kM}, with Σ_{j=1}^{M} kj = n and Σ_{j=1}^{M} pj = 1
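Both examples are direct applications of the multinomial formula. A sketch:

```python
from math import factorial, prod

def multinomial(ks, ps):
    """P[(k1,...,kM)] = n!/(k1!...kM!) * p1^k1 * ... * pM^kM."""
    n = sum(ks)
    coeff = factorial(n) // prod(factorial(k) for k in ks)
    return coeff * prod(p**k for p, k in zip(ps, ks))

p_241 = multinomial([3, 3, 3], [0.2, 0.3, 0.5])   # dart lands 3 times in each area
p_242 = multinomial([1] * 10, [0.1] * 10)         # each last digit appears exactly once
```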
2.6.4 Geometric Probability Law
Repeat independent Bernoulli trials until the occurrence of the first success, and let m be the number of trials required.
Ai: the event “success in trial i”.
m ∈ {1, 2, . . .}: the number of trials until the stop; the sample space is the set of positive integers.
p(m): the probability that m trials are required.
Geometric probability law: the first m − 1 trials are failures, followed by a success in the mth trial:

p(m) = P[A1^c A2^c · · · A_{m−1}^c Am] = (1 − p)^{m−1} p
Properties
The probabilities sum to 1 (with q = 1 − p):

Σ_{m=1}^{∞} p(m) = p Σ_{m=1}^{∞} q^{m−1} = p · 1/(1 − q) = 1

The probability that more than K trials are required before a success:

P[{m > K}] = p Σ_{m=K+1}^{∞} q^{m−1} = p q^K Σ_{j=0}^{∞} q^j = p q^K · 1/(1 − q) = q^K
Example 2.43 Error control by retransmission: computer A sends a message to computer B. If B detects an error, it requests A to retransmit. The probability of a single transmission error is q = 0.1. What is the probability that a message needs to be transmitted more than two times?

⇒ P[m > 2] = q^2 = 10^{−2}
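The retransmission example follows from the tail formula P[m > K] = q^K. A sketch:

```python
def p_geometric(m, p):
    """First success on trial m: (1-p)^(m-1) * p."""
    return (1 - p)**(m - 1) * p

def p_more_than(K, p):
    """Tail probability P[m > K] = (1-p)^K."""
    return (1 - p)**K

q = 0.1          # single-transmission error probability
p = 1 - q        # success probability per transmission
answer = p_more_than(2, p)   # more than two transmissions needed: q^2
```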
2.6.5 Dependent Experiments
A sequence or “chain” of sub-experiments in which the outcome of a given sub-experiment determines which sub-experiment is performed next.
Example 2.44:
1. Urn 0: one ball with label “1”, two with label “0”.
2. Urn 1: one ball with label “0”, five with label “1”.
3. Experiment 1: flip a coin; if heads, use urn 0, if tails, use urn 1.
4. Experiment 2: pick a ball from the urn chosen in Experiment 1 and note its label.
5. Experiment 3: if the outcome of Experiment 2 is “0”, use urn 0 next; otherwise, use urn 1.
See Figure 2.15
Markov Chains
To compute the probability of a particular sequence of outcomes s0, s1, s2, let A = {s2} and B = {s0} ∩ {s1}. Using the multiplication rule for conditional probability,

P[s0, s1, s2] = P[s2 | s0, s1] P[s0, s1]
             = P[s2 | s0, s1] P[s1 | s0] P[s0]

Note that P[sn | s0, . . . , sn−1] depends only on sn−1, since the most recent outcome determines which sub-experiment is performed:

P[sn | s0, . . . , sn−1] = P[sn | sn−1]

Therefore,

P[s0, s1, s2] = P[s2 | s1] P[s1 | s0] P[s0]

In general, Markov chains satisfy

P[s0, . . . , sn] = P[sn | sn−1] P[sn−1 | sn−2] · · · P[s1 | s0] P[s0]
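The chain rule above can be applied to the urn setup of Example 2.44, where each drawn label selects the urn for the next draw. A sketch (ball counts taken from the previous slide):

```python
from fractions import Fraction

# Ball-label probabilities for each urn:
# urn 0 holds one "1" and two "0"s; urn 1 holds five "1"s and one "0".
P_label = {0: {0: Fraction(2, 3), 1: Fraction(1, 3)},
           1: {0: Fraction(1, 6), 1: Fraction(5, 6)}}

# A fair coin picks the first urn, so the first outcome s0 mixes both urns
P_s0 = {x: Fraction(1, 2) * (P_label[0][x] + P_label[1][x]) for x in (0, 1)}

def p_sequence(seq):
    """P[s0,...,sn] = P[s0] * prod_k P[s_k | s_(k-1)] (Markov property).
    Each outcome's label selects the urn used for the next draw."""
    prob = P_s0[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= P_label[prev][cur]
    return prob
```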