Date post: 13-Oct-2020
BioE 439/539: Applied Statistics for Biotechnology and Bioengineering

Lecture 1: Matlab Refresher and Probability

Instructor: Dave Zhang

http://nablab.rice.edu/bioe439


Course Grading Structure:

Lectures: 0%. Attendance not mandatory, come only if you want to. (But bring a Matlab-installed laptop if you do come.)

Problem Sets: 0% or 20%. 10 problem sets in total; they either all count or they all don’t.

Midterm: 40% or 32%. Take-home exam. Will include Matlab programming questions.

Final: 60% or 48%. Take-home exam. Will include Matlab programming questions.

Exams should take roughly 3 hours if you’ve mastered the materials, but you can take as long as you like. Open resources (e.g. Internet), but not open people.

BioE 539: Additional problems on both problem sets and exams.


Course Details:

TA: Guanyi Xie ([email protected])

If you added the class late, please email Dave and the TA to make sure you’re on the class email list.

Dave’s email address: [email protected]

Email the TA first with questions about course material; email Dave only if Guanyi can’t answer them.


Statistics... Why should you care?

“There are three kinds of lies: lies, damned lies, and statistics.” - Samuel Clemens (a.k.a. Mark Twain)

Interpretation 1: Use statistics to lie without legal consequences.

Interpretation 2: Understand statistics to see through mistruths.

Statistic = a numerical summary of a pile of data


Matlab Refresher

Vectors (a.k.a. lists) - Creating them

• a = [1 50 100]
• b = 1:100
• c = 1:5:100
• d = 20 * ones(1,100)
• e = linspace(1,100, 12)
• f = [1:20, 40:60]
• g = [e, f]


Matlab Refresher

Vectors (a.k.a. lists) - Creating them

Problem 1: a = [4 7 10 13 16 19 21 22 23 24 25 26]

A1: a = [4:3:19, 21:26]

Problem 2: a = [20 18 16 14 12 100 80 60 40 20 0]

A2: a = [20:-2:12, 100:-20:0]
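One way to check answers like A1 and A2 is to build the target vector by hand and compare it with `isequal`. This is a minimal sketch; the variable names (`target1`, `ans1`, and so on) are chosen here for illustration and do not come from the slides.

```matlab
% Targets copied from Problems 1 and 2
target1 = [4 7 10 13 16 19 21 22 23 24 25 26];
target2 = [20 18 16 14 12 100 80 60 40 20 0];

% Answers A1 and A2: concatenate colon ranges (start:step:stop)
ans1 = [4:3:19, 21:26];
ans2 = [20:-2:12, 100:-20:0];

% isequal is true only if sizes and all elements match
ok1 = isequal(ans1, target1);
ok2 = isequal(ans2, target2);
disp([ok1 ok2])
```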


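Matlab Refresher

Vectors (a.k.a. lists) - Manipulating them

a2 = 10*(1:10)   % [10 20 30 40 50 60 70 80 90 100]

a2(7)   % 7th element (indices start at 1): 70

a2(10) = 0; a2   % overwrite the last element, then display: [10 20 ... 90 0]

a2 = a2 / 50   % divide every element: [0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 0]

a2 = floor(a2)   % round down: [0 0 0 0 1 1 1 1 1 0]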


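Matlab Refresher

Vectors (a.k.a. lists) - Manipulating them

a3 = [1 2 3] + [1 50 100]   % elementwise sum: [2 52 103]

b3 = [1 2 3].*[1 50 100]   % elementwise product: [1 100 300]

c3 = (a3 > 50)   % logical mask: [0 1 1]

d3 = a3(a3 > 50)   % keep elements greater than 50: [52 103]

e3 = a3(a3 ~= 50)   % keep elements not equal to 50: [2 52 103]

f3 = max(a3, b3)   % elementwise maximum: [2 100 300]

g3 = fliplr(a3)   % reverse left to right: [103 52 2]

h3 = sort([1 20 10], 'descend')   % [20 10 1]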


Matlab Refresher

Vectors (a.k.a. lists) - Manipulating them

Problem 3: a = [2 6 12 20 30 42 56 72 90 110]

A3: a = [1:10] .* [2:11]

Problem 4: ref = [3 1 4 1 5 9 2 6 5 3 5 8 9 7]; a = [3 4 5 9 6 5 3 5 8 9 7] (1’s and 2’s removed)

A4: a = ref(ref > 2)

Problem 5: ref = [3 1 4 1 5 9 2 6 5 3 5 8 9 7]; a = [3 0 4 0 5 9 0 6 5 3 5 8 9 7] (1’s and 2’s replaced by 0’s)

A5: a = (ref > 2).*ref
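The answers to Problems 3 to 5 can be checked the same way, by running them and comparing against the targets. The sketch below also shows an alternative for Problem 5 that assigns through a logical mask; that alternative is not from the slides, and the variable names are illustrative.

```matlab
ref = [3 1 4 1 5 9 2 6 5 3 5 8 9 7];

% Problem 3: products of consecutive integers, n*(n+1) for n = 1..10
a3 = (1:10) .* (2:11);

% Problem 4: logical indexing keeps only elements where the mask is true
a4 = ref(ref > 2);

% Problem 5: multiplying by the 0/1 mask zeroes out the 1's and 2's
a5 = (ref > 2) .* ref;

% Alternative for Problem 5 (not from the slides): assign through a mask
a5_alt = ref;
a5_alt(a5_alt <= 2) = 0;

disp(isequal(a5, a5_alt))
```

The mask-assignment form modifies a copy in place, which can read more naturally when the replacement value is something other than 0.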


Probability

Probability describes the uncertainty of an outcome

Rolling a die, there are 6 possible results.

Taking a card from a poker deck, 52 possible outcomes.

The probabilities of complex outcomes can be broken down into probabilities of simple outcomes

Pr(rolled die is odd) = Pr(X=1 OR X=3 OR X=5)
= Pr(X=1) + Pr(X=3) + Pr(X=5)

Assuming it’s a fair die, Pr(X=1) = 1/6

Pr(2 dice sum to 7) = Pr(X=1,Y=6) + Pr(X=2,Y=5) + Pr(X=3,Y=4)
+ Pr(X=4,Y=3) + Pr(X=5,Y=2) + Pr(X=6,Y=1)
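The breakdown into simple outcomes can be reproduced by brute-force enumeration. The sketch below assumes fair dice, so all 36 (X, Y) pairs are equally likely; the variable names are illustrative.

```matlab
% Enumerate all 36 equally likely (X, Y) outcomes for two fair dice
[X, Y] = meshgrid(1:6, 1:6);

% Pr(2 dice sum to 7): favorable outcomes divided by total outcomes
pSum7 = sum(X(:) + Y(:) == 7) / numel(X);

% Pr(rolled die is odd) for a single fair die
faces = 1:6;
pOdd = sum(mod(faces, 2) == 1) / numel(faces);

fprintf('Pr(sum = 7) = %.4f, Pr(odd) = %.4f\n', pSum7, pOdd);
```

Six of the 36 pairs sum to 7, matching the six terms in the sum above, and three of the six faces are odd.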

Pr(become rich and famous) = Pr(become rich) * Pr(become famous) (assuming they’re independent)


Probability... comes in 3 flavors

Flavor 1: Discrete

Rolling a (fair) 6-sided die once and getting a 1.

Flavor 2: Continuous

Probability that a random potato chip weighs between 0.2 and 0.4 ounces.

Flavor 3: Pseudo-continuous

Grabbing a random person in the world, and having his/her height be between 5’7” and 5’9”.

Statistics generally deals with pseudo-continuous probability


Intersections and Unions

Throwing Darts

[Figure: dartboard with regions labeled Event 1 and Event 2]

Event 1 = win a teddy bear

Event 2 = win another chance to play

(1 ∩ 2): Intersection, get both!

(1 ∪ 2): Union, got something.

More complex dartboards

[Figure: dartboard with regions labeled Event 1, Event 2, and (1 ∩ 2)]

Intersections and unions are calculated the same way.


Principle of Inclusion-Exclusion

Pr(A ∪ B) = Pr(A) + Pr(B) - Pr(A ∩ B)

Example 1: Taking a card from a poker deck and getting a King or a Spade

Pr(K ∪ S) = Pr(K) + Pr(S) - Pr(K ∩ S) = 4/52 + 13/52 - 1/52 = 16/52

Example 2: Calling on a random student and getting a junior or a girl

Pr(junior ∪ female) = Pr(junior) + Pr(female) - Pr(junior ∩ female)
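Example 1 can be checked by listing all 52 cards and counting the union directly. This is a sketch: the numeric codes for ranks and suits (13 = King, 1 = Spades) are arbitrary labels chosen here, not anything from the slides.

```matlab
% Label the 52 cards by rank (1..13, 13 = King) and suit (1..4, 1 = Spades).
% These numeric codes are arbitrary labels for this sketch.
[cardRank, cardSuit] = meshgrid(1:13, 1:4);
cardRank = cardRank(:);
cardSuit = cardSuit(:);

isKing  = (cardRank == 13);
isSpade = (cardSuit == 1);

pK  = mean(isKing);             % 4/52
pS  = mean(isSpade);            % 13/52
pKS = mean(isKing & isSpade);   % 1/52 (the King of Spades)

pUnionDirect  = mean(isKing | isSpade);   % count the union directly
pUnionFormula = pK + pS - pKS;            % inclusion-exclusion

fprintf('direct = %.4f, formula = %.4f\n', pUnionDirect, pUnionFormula);
```

Both routes give 16/52: subtracting Pr(K ∩ S) removes the King of Spades, which would otherwise be counted twice.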

Events can be mutually exclusive or independent

Event 1: Rolling die #1 and getting a 1

Event 2a: Rolling die #1 and getting a 2

Event 2b: Rolling die #2 and getting a 2


The probability of two mutually exclusive events both occurring (∩) is zero.

P(rolling one die and getting a 5 AND a 6) = P( (D1=5) ∩ (D1=6) ) = 0

The probability of two independent events both occurring (∩) is their product.

P(rolling two fair dice and getting 6, 6) = P( (D1=6) ∩ (D2=6) ) = P(D1=6) * P(D2=6) = 1/36

[Figure: two diagrams of Event 1 and Event 2, one labeled "mutually exclusive" and one labeled "independent"]
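The two cases can be compared by enumerating the 36 outcomes of two fair dice. A minimal sketch, assuming fair dice and using illustrative variable names:

```matlab
% All 36 equally likely outcomes for die #1 (D1) and die #2 (D2)
[D1, D2] = meshgrid(1:6, 1:6);
D1 = D1(:);
D2 = D2(:);

% Mutually exclusive: one die cannot show both a 5 and a 6
pExclusive = mean((D1 == 5) & (D1 == 6));

% Independent: die #1 shows 6 and die #2 shows 6
pBoth    = mean((D1 == 6) & (D2 == 6));    % counted directly
pProduct = mean(D1 == 6) * mean(D2 == 6);  % product of the two marginals

fprintf('exclusive = %.4f, both = %.4f, product = %.4f\n', ...
        pExclusive, pBoth, pProduct);
```

The direct count and the product agree at 1/36, while the mutually exclusive intersection is empty.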


A conditional probability P(A | B) is the probability that event A occurs, given that event B occurs.

P(picking a girl | picking a random student in BioE 439)

P(A | B) = P (A ∩ B) / P (B)

A man shows you two coins, one is normal and the other has two heads. He shuffles the coins out of sight, picks one and flips it, and a head shows. What’s the probability the other side of the coin is also a head?
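The slide leaves the coin puzzle open. One way to work it out is to apply P(A | B) = P(A ∩ B) / P(B) with A = "the two-headed coin was picked" and B = "a head shows". The sketch below assumes each coin is picked with probability 1/2 and that the normal coin is fair; neither assumption is stated on the slide, and the variable names are illustrative.

```matlab
% Assumptions (not stated on the slide): each coin is picked with
% probability 1/2, and the normal coin lands heads with probability 1/2.
pPickTwoHeaded = 0.5;
pPickNormal    = 0.5;

pHeadGivenTwoHeaded = 1.0;   % both sides are heads
pHeadGivenNormal    = 0.5;

% P(B): a head shows, summed over which coin was picked
pHead = pPickTwoHeaded * pHeadGivenTwoHeaded + pPickNormal * pHeadGivenNormal;

% P(A and B): the two-headed coin was picked and a head shows
pTwoHeadedAndHead = pPickTwoHeaded * pHeadGivenTwoHeaded;

% P(A | B): the other side is also a head only if the coin is two-headed
pOtherSideHead = pTwoHeadedAndHead / pHead;

fprintf('P(other side is a head | head shows) = %.4f\n', pOtherSideHead);
```

Under these assumptions the answer is 2/3 rather than the intuitive 1/2, because two of the three equally likely head faces belong to the two-headed coin.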

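[Figure: dartboard with regions labeled Event 1 and Event 2]

You get a prize for hitting Event 1.

... but you’re cheating. You stuck a super-strong magnet behind Event 2, guaranteeing that you hit Event 2. Your chance of winning the prize is then P(1 | 2) = P(1 ∩ 2) / P(2).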


Probability Practice Problems:

One finds that in a population of 100,000 females, 75% can expect to live to age 60, 63% can expect to live to age 80, and 28% can expect to live to age 100.

1. Given that a woman is currently 60, what is the probability that she lives to age 80?

A: Pr (age ≥ 80 | age ≥ 60) = 0.63 / 0.75 = 0.84

2. Given that a woman died before age 100, what is the probability that she died before age 60?

A: Pr (age < 60 | age < 100) = (1-0.75) / (1-0.28) = 0.347

3. A pair of female twins are currently 80 years old. What is the probability that exactly one of them survives to 100?

A: For each twin, Pr (age ≥ 100 | age ≥ 80) = 0.28 / 0.63 = 0.444. Assuming the twins’ survival is independent:

Pr(A ≥ 100 ∩ B < 100 | A ≥ 80 ∩ B ≥ 80) + Pr(A < 100 ∩ B ≥ 100 | A ≥ 80 ∩ B ≥ 80) = .444 * .556 + .556 * .444 = 0.494
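The three answers follow from the survival fractions and the definition of conditional probability. The sketch below recomputes them; the variable names are illustrative, and the twin calculation assumes the two survival outcomes are independent.

```matlab
% Fraction of the population surviving to each age (from the problem statement)
s60  = 0.75;
s80  = 0.63;
s100 = 0.28;

% 1. Pr(age >= 80 | age >= 60): reaching 80 implies reaching 60
p1 = s80 / s60;

% 2. Pr(age < 60 | age < 100): dying before 60 implies dying before 100
p2 = (1 - s60) / (1 - s100);

% 3. Exactly one of two 80-year-old twins reaches 100
%    (assumes the twins' survival outcomes are independent)
q  = s100 / s80;          % Pr(age >= 100 | age >= 80) for one twin
p3 = 2 * q * (1 - q);     % twin A only, plus twin B only

fprintf('p1 = %.3f, p2 = %.3f, p3 = %.3f\n', p1, p2, p3);
```

In each case the intersection in the numerator of P(A ∩ B) / P(B) collapses to a single survival fraction because one event implies the other.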

