Page 1

cs6501: PoKER

Class 3: Probabilistic Reasoning

Spring 2010
University of Virginia
David Evans

Plan

• One-line Proof of Bayes’ Theorem

• Inductive Learning

Home Game this Thursday, 7pm! (Game start: 7:15pm)

This is not an official course activity.

Email me by Wednesday afternoon if you are coming.

Bayes’ Theorem

Bayes’ Theorem Proof
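A standard rendering of the one-line proof named in the plan, starting from the definition of conditional probability:

$$P(A \mid B)\,P(B) \;=\; P(A \wedge B) \;=\; P(B \mid A)\,P(A) \quad\Longrightarrow\quad P(A \mid B) \;=\; \frac{P(B \mid A)\,P(A)}{P(B)}$$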

Page 2

Machine Learning

Inductive Learning: “Learning from Examples”

Learner

Input: training examples (x, f(x))

Output: hypothesis function h approximating f

Limits of Induction

It was the summer of 1919 that I began to feel more and more dissatisfied with these three theories—the Marxist theory of history, psycho-analysis, and individual psychology; and I began to feel dubious about their claims to scientific status. My problem perhaps first took the simple form, “What is wrong with Marxism, psycho-analysis, and individual psychology? Why are they so different from physical theories, from Newton's theory, and especially from the theory of relativity?”

To make this contrast clear I should explain that few of us at the time would have said that we believed in the truth of Einstein's theory of gravitation. This shows that it was not my doubting the truth of those three other theories which bothered me, but something else.

One can sum up all this by saying that the criterion of the scientific status of a theory is its falsifiability, or refutability, or testability.

Karl Popper, Science as Falsification, 1963.

Deciding on h

• Many hypotheses fit the training data
• Depends on the hypothesis space: what types of functions are allowed
• Pick between a simple hypothesis function (that may not fit exactly) and a complex one
• How many functions are there for X: n bits, Y: 1 bit? (worked out below)
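One way to answer the last bullet: X ranges over 2^n possible inputs, and a function assigns one of 2 output values to each input independently, so

$$\#\{\,f : \{0,1\}^n \to \{0,1\}\,\} \;=\; 2^{2^n}$$

Already for n = 5 that is 2^32, about four billion candidate functions.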

Page 3

Forms of Inductive Learning

Supervised Learning
Given: example inputs with their outputs (x, f(x))
Output: hypothesis function

Unsupervised Learning (no explicit outputs)
Given: example inputs only
Output: clustering

Reinforcement Learning
Given:
Output:
Feedback: no feedback for individual decisions (outputs), but overall feedback

First Reinforcement Learner (?)

Arthur Samuel. Some Studies in Machine Learning Using the Game of Checkers. IBM Journal, 1959.

Earlier inductive learning paper:

R. J. Solomonoff. An Inductive Inference Machine, 1956.
(and neural networks were studied even earlier)

Page 4

Spam Filtering

Supervised Learning: Spam Filter

[Diagram: labeled training messages (Message 1: Not Spam; Message 2: Spam; ...; Message N: Not Spam) are fed to the Learner.]

X-Sender-IP: 78.128.95.196
To: [email protected]
From: Nicole Cox <[email protected]>
Subject: Job Offer
Date: Thu, 22 Apr 2010 10:10:45 +0300 (EEST)

Dear David,

Do you want to participate in the greatest Mystery Shopping quests nationwide? Have you ever wondered how Mystery Shoppers are recruited and how prosperous companies keep up doing business in the highly competitive business world? The answer is that many companies are recruiting young, creative, observant, and responsible individuals like you to give their feedback on various products and customer services and thus improve their quality.

As a Mystery Shopper you have only one responsibility: Act as a real customer while evaluating the place you are sent to mystery shop and enjoy all the benefits that go along with your job. Remember that you have nothing to lose, because you are awarded generously for your efforts:

-You get paid between $10 and $40 per hour for each mystery shopping assignment;
-You keep all things that you have purchased for free;
-You watch movies, eat in restaurants, and visit amusement parks for free;
-You are turning your most enjoyable hobby into a well-paying activity.

Be aware that as a Mystery Shopper you can earn on average $100 to $300 per week. The

Feature Extraction

Features:

F1 = From: <someone in address book>
F2 = Subject: *FREE*
F3 = Body: *enlargement*
F4 = Date: <today>
F5 = Body: <contains URL>
F6 = To: [email protected]

Note: this assumes we already know what the features are! Need to learn them.
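To make the feature set concrete, here is a minimal sketch (not from the slides) of computing the boolean features above; the message is assumed to be a dict with 'sender', 'subject', 'body', and 'date' fields, a hypothetical format:

```python
import re
from datetime import date

def extract_features(msg, address_book):
    """Boolean features F1-F5 from the slide, for one message."""
    return {
        "F1_sender_in_address_book": msg["sender"] in address_book,
        "F2_subject_free": "FREE" in msg["subject"],
        "F3_body_enlargement": "enlargement" in msg["body"].lower(),
        "F4_dated_today": msg["date"] == date.today(),
        "F5_body_contains_url": re.search(r"https?://", msg["body"]) is not None,
    }
```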

Page 5

Bayesian Filtering

Feature                               Number of Spam   Number of Ham
F1: From: <someone in address book>                2            4052
F2: Subject: *FREE*                             3058               2
F3: Body: *enlargement*                          253               1
F4: Date: <today>                                304            5423
F5: Body: <contains URL>                        3630             263
…                                                  …               …

Total messages: 4000 Spam / 6000 Ham
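As a worked example (a sketch, using only the counts above), applying Bayes’ theorem to the F2 row gives the probability that a message whose subject matches *FREE* is spam:

```python
# Counts from the table above
spam_total, ham_total = 4000, 6000
f2_spam, f2_ham = 3058, 2          # messages matching Subject: *FREE*

p_spam = spam_total / (spam_total + ham_total)   # prior: 0.4
p_ham = 1 - p_spam
p_f2_given_spam = f2_spam / spam_total           # ~0.7645
p_f2_given_ham = f2_ham / ham_total              # ~0.00033

# Bayes' theorem: P(Spam | F2) = P(F2 | Spam) P(Spam) / P(F2)
p_f2 = p_f2_given_spam * p_spam + p_f2_given_ham * p_ham
print(p_f2_given_spam * p_spam / p_f2)           # ~0.999
```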



Page 6

Combining Probabilities

Naïve Bayesian Model: assume all features are independent.
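Under that independence assumption (conditional on the class), the evidence from the features multiplies:

$$P(\text{Spam} \mid F_1, \dots, F_n) \;\propto\; P(\text{Spam}) \prod_{i=1}^{n} P(F_i \mid \text{Spam})$$

with the analogous product for Ham; normalizing the two so they sum to 1 gives the posterior. In practice the products are computed as sums of logarithms to avoid floating-point underflow.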

Learning the Features

Learner

Feature                Spam Likelihood
F1: Subject: *Poker*              0.03
F2: Subject: *FREE*              0.999
…                                    …

Make every <context, token> pair a feature. Which ones should we keep?
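One simple estimator for such a table (a sketch, not from the slides; it ignores class priors and assumes the corpus is a list of (tokens, is_spam) pairs):

```python
from collections import Counter

def learn_spam_likelihoods(corpus):
    """Estimate P(Spam | token appears) for every token in the corpus."""
    spam_counts, ham_counts = Counter(), Counter()
    for tokens, is_spam in corpus:
        (spam_counts if is_spam else ham_counts).update(set(tokens))
    likelihoods = {}
    for tok in set(spam_counts) | set(ham_counts):
        s, h = spam_counts[tok], ham_counts[tok]
        # add-one smoothing keeps rare tokens away from exactly 0 or 1
        likelihoods[tok] = (s + 1) / (s + h + 2)
    return likelihoods
```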

Bayesian Spam Filtering

Patrick Pantel and Dekang Lin. SpamCop: A Spam Classification & Organization Program. AAAI-98 Workshop on Text Classification, 1998.

Paul Graham. A Plan for Spam (2002); Better Bayesian Filtering (2003).

SpamAssassin Bayesian Filter

Testing Learners

K-Fold Cross Validation

Randomly partition the training data into folds (usually 10): Fold 1, Fold 2, Fold 3, Fold 4, Fold 5.

Use k-1 folds for training, test on the unused fold (repeat for each fold left unused), and combine the per-fold results into a single score.
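A minimal sketch of the procedure; the train and evaluate callables are hypothetical placeholders, not from the slides:

```python
import random

def k_fold_score(data, train, evaluate, k=10):
    """Randomly partition data into k folds; train on k-1 folds,
    evaluate on the held-out fold, and average the k scores."""
    data = list(data)        # copy so we don't shuffle the caller's list
    random.shuffle(data)
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        held_out = folds[i]
        training = [x for j in range(k) if j != i for x in folds[j]]
        model = train(training)
        scores.append(evaluate(model, held_out))
    return sum(scores) / k
```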

Page 7

Concerns

• Limits of Naïve Bayesian Model

• How many features

• Expressiveness of learned features

• What if

Adversarial Spam

How Many Ways Can You Spell V1@gra?

600,426,974,379,824,381,952
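That count is an instance of the multiplication principle: if the i-th character position admits $v_i$ interchangeable glyphs (the specific variant sets behind the number above are not reproduced here), the number of distinct spellings is

$$\prod_i v_i$$

so even a modest number of substitutions per character multiplies out to astronomically many variants, defeating any filter that matches exact strings.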

Really Adversarial Spam

• Player 1: Spammer
  – Goal: Create a spam that tricks the filter, or make the filter reject ham
• Player 2: Filter
  – Not be tricked (but not reject ham messages)

Does this game have a Nash equilibrium?

Page 8

Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar, and Kai Xia. Exploiting Machine Learning to Subvert Your Spam Filter. USENIX LEET 2008.

Focused Attack (try to get a particular non-spam message filtered)

Hidden Markov Models

Hidden Markov Model

Finite State Machine
+ probabilities on transitions
+ hide the state
+ add observations and output probabilities

[Diagram: from Start, the machine moves to state A, K, or Q with probability 1/3 each; the states emit observations Bet or Check with output probabilities such as 1, 1/3, and 2/3.]

Viterbi Path

Given a sequence of observations, what is the most likely sequence of states?

Page 9

Viterbi Algorithm (1967)

Andrew Viterbi

[Diagram: hidden states x(t-2), x(t-1), x(t) emitting observations y(t-2), y(t-1), y(t).]

Key assumption: the most likely sequence for time t depends only on (1) the most likely sequence at time t-1 and (2) y(t).

This is true for first-order HMMs: transition probabilities depend only on the current state.
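In symbols (writing V_t(s) for the probability of the best state sequence ending in state s at time t, A(s', s) for the transition probability, and B(s, y) for the output probability; the notation is introduced here for exposition, not taken from the slides):

$$V_t(s) \;=\; B(s, y_t)\,\max_{s'}\big[\,V_{t-1}(s')\,A(s', s)\,\big]$$

Each column of the trellis therefore depends only on the previous one, which is exactly the key assumption above.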


Viterbi Algorithm

Running time: O(n·k²) for n observations over k states.
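A compact sketch of the algorithm; the dictionary-of-dictionaries probability tables are an assumed representation, not from the slides:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence given the observations.
    Runs in O(len(obs) * len(states)**2) time."""
    # V[t][s]: probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states)
            V[t][s] = prob
            back[t][s] = prev
    # Trace the best path backwards from the most likely final state
    best = max(V[-1], key=V[-1].get)
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))
```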

Applications of HMMs

• Noisy Transmission (Convolutional Codes)

– Sequence of states: message to transmit

– Observations: received signal

• Speech Recognition

– Sequence of states: utterance

– Observations: recorded sound

• Bioinformatics

• Cryptanalysis

• etc.

Page 10

Charge

• So far we have assumed the state transition and output probabilities are all known!

• Thursday’s Class: learning the HMM

If you are coming to the home game Thursday 7pm, remember to email me by 5pm Wednesday. Include whether you need a ride or can drive other people.

