
19-mlintro


Transcript
  • Slide 1/29

    Machine Learning

    Mausam

    (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)

  • Slide 2/29

    What Is Machine Learning?

    A computer program is said to learn from experience E with respect to some class of tasks T and a performance measure P if its performance on T (according to P) improves with more E. For example, a spam filter learns from a corpus of labeled emails (E) to classify new emails as spam or not (T), as measured by classification accuracy (P).

  • Slide 3/29

    Traditional Programming:  Data + Program -> Computer -> Output

    Machine Learning:  Data + Output -> Computer -> Program
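
To make the contrast concrete, here is a minimal Python sketch (my own illustration with made-up data, not from the slides): the traditional branch hard-codes the rule, while the learning branch infers an equivalent rule, a threshold, from (data, output) pairs.

```python
# Traditional programming: the "program" (rule) is written by hand.
def classify_by_hand(x):
    return "spam" if x > 5 else "ham"        # threshold chosen by the programmer

# Machine learning: the "program" is inferred from (data, output) pairs.
def learn_threshold(examples):
    """Return the threshold that agrees with the most labeled examples."""
    def agreement(t):
        return sum((x > t) == (label == "spam") for x, label in examples)
    return max((x for x, _ in examples), key=agreement)

data = [(1, "ham"), (2, "ham"), (7, "spam"), (9, "spam")]   # hypothetical examples
learned_t = learn_threshold(data)            # here: 2, i.e. the learned rule is "x > 2"
print(classify_by_hand(8), learned_t)
```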

  • Slide 4/29

    Why Bother with Machine Learning?

    Btw, Machine Learning ~ Data Mining

    Necessary for AI
    Learn concepts that people don't have time for (drowning in data, starved for knowledge)
    Mass customization (adapt software to each user)
    Super-human learning/discovery

  • Slide 5/29

    Quotes

    "A breakthrough in machine learning would be worth ten Microsofts" (Bill Gates)
    "Machine learning is the next Internet" (Tony Tether, Former Director, DARPA)
    "Machine learning is the hot new thing" (John Hennessy, President, Stanford)
    "Web rankings today are mostly a matter of machine learning" (Prabhakar Raghavan, Dir. Research, Yahoo)
    "Machine learning is going to result in a real revolution" (Greg Papadopoulos, CTO, Sun)

  • Slide 6/29

    Inductive Learning

    Given examples of a function: (X, F(X)) pairs
    Predict F(X) for new examples X

    Discrete F(X): Classification
    Continuous F(X): Regression
    F(X) = Probability(X): Probability estimation
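
The classification/regression split can be made concrete with a short scikit-learn sketch (the library choice and the toy (X, F(X)) pairs are my assumptions, not part of the slides):

```python
# Inductive learning: fit models to (X, F(X)) pairs, then predict F for new X.
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

X = [[0.0], [1.0], [2.0], [3.0]]              # training inputs

# Discrete F(X): classification
y_class = ["neg", "neg", "pos", "pos"]
clf = DecisionTreeClassifier().fit(X, y_class)
print(clf.predict([[2.5]]))                   # predicted label for a new X

# Continuous F(X): regression
y_reg = [0.1, 0.9, 2.1, 2.9]
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[2.5]]))                   # predicted value for a new X
```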

  • Slide 7/29

    Training Data Versus Test

    Terms: data, examples, and instances are used interchangeably

    Training data: data where the labels are given
    Test data: data where the labels are known but not given
    Which do you use to measure performance?
    Cross-validation
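
A minimal scikit-learn sketch of the train/test split and of cross-validation (the dataset and classifier here are illustrative choices of mine):

```python
# Hold-out evaluation vs. cross-validation on a small standard dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Train where labels are given; measure performance on held-out test data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# Cross-validation: repeat the train/test split several times and average.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("5-fold CV accuracy:", scores.mean())
```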

  • Slide 8/29

    Basic Setup

    Input:
    Labeled training examples
    Hypothesis space H

    Output: hypothesis h in H that is consistent with the training data and (hopefully) correctly classifies test data.
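
A toy sketch of this setup (my own illustration): H is a handful of threshold rules, and we output any h that is consistent with the labeled training examples.

```python
# Basic setup in miniature: find h in H consistent with the training data.
train = [(1.0, False), (2.0, False), (4.0, True), (5.0, True)]   # hypothetical labels

# Hypothesis space H: rules of the form "x > t" for a few thresholds t.
H = [lambda x, t=t: x > t for t in (0.5, 1.5, 3.0, 4.5)]

def consistent(h, data):
    """h is consistent if it agrees with the label on every training example."""
    return all(h(x) == label for x, label in data)

h = next(h for h in H if consistent(h, train))
print(h(3.5))        # hopefully also correct on unseen (test) data -> True
```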

  • Slide 9/29

    The new Machine Learning

    Old                                   New
    Small data sets (100s of examples)    Massive (10^6 to 10^10)
    Hand-labeled data                     Automatically labeled; semi-supervised; labeled by crowds
    Hand-coded algorithms                 WEKA package downloaded over 1,000,000 times

  • Slide 10/29

    ML in a Nutshell

    10^5 machine learning algorithms
    Hundreds new every year

    Every algorithm has three components:
    Hypothesis space: possible outputs
    Search strategy: strategy for exploring the space
    Evaluation

  • Slide 11/29

    Hypothesis Space (Representation)

    Decision trees
    Sets of rules / Logic programs
    Instances
    Graphical models (Bayes/Markov nets)
    Neural networks
    Support vector machines
    Model ensembles
    Etc.

  • Slide 12/29

    Metrics for Evaluation

    Accuracy
    Precision and recall
    Squared error
    Likelihood
    Posterior probability
    Cost / Utility
    Margin
    Etc.

    Based on data
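
A few of these metrics computed by hand on made-up predictions (only accuracy, precision/recall, and squared error are shown; everything here is illustrative):

```python
# Evaluation metrics computed directly from hypothetical predictions.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(accuracy, precision, recall)

# Squared error, more natural for continuous (regression) predictions.
f_true = [0.0, 1.0, 2.0]
f_pred = [0.1, 0.8, 2.3]
mse = sum((a - b) ** 2 for a, b in zip(f_true, f_pred)) / len(f_true)
print(mse)
```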

  • Slide 13/29

    Search Strategy

    Greedy (depth-first, best-first, hill climbing)

    Exhaustive

    Optimize an objective function

    More
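
A small sketch (mine, with a made-up objective function) contrasting exhaustive search with greedy hill climbing over a one-parameter hypothesis space:

```python
# Searching a one-parameter hypothesis space, scored by an objective function.
thresholds = [i / 10 for i in range(51)]        # hypothesis space: t in [0.0, 5.0]

def objective(t):
    return -(t - 3.2) ** 2                      # made up; higher is better, peak at 3.2

# Exhaustive search: evaluate every hypothesis.
best_exhaustive = max(thresholds, key=objective)

# Greedy hill climbing: move to a better neighbour until none improves.
i = 0
while True:
    neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(thresholds)]
    best = max(neighbours, key=lambda j: objective(thresholds[j]))
    if objective(thresholds[best]) <= objective(thresholds[i]):
        break
    i = best

print(best_exhaustive, thresholds[i])           # both end up at 3.2
```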

  • Slide 14/29

    Types of Learning

    Supervised (inductive) learning: training data includes desired outputs

    Unsupervised learning: training data does not include desired outputs

    Semi-supervised learning: training data includes a few desired outputs

    Reinforcement learning: rewards from a sequence of actions
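
A rough illustration of the supervised/unsupervised distinction on the same toy inputs (scikit-learn usage and the data are my assumptions):

```python
# Supervised vs. unsupervised learning on the same inputs.
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = [[1.0], [1.2], [0.9], [5.0], [5.2], [4.8]]

# Supervised: the desired outputs (labels) are part of the training data.
y = ["low", "low", "low", "high", "high", "high"]
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[4.9]]))                 # -> ['high']

# Unsupervised: no labels; the algorithm finds structure (here, 2 clusters).
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)                           # cluster assignment for each input
```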

  • Slides 15-16/29 (figures only; no text to transcribe)

  • Slide 17/29

    Why Learning?

    Learning is essential for unknown environments
    e.g., when the designer lacks omniscience

    Learning is necessary in dynamic environments
    The agent can adapt to changes in the environment not foreseen at design time

    Learning is useful as a system construction method
    Expose the agent to reality rather than trying to approximate it through equations, etc.

    Learning modifies the agent's decision mechanisms to improve performance

  • Slides 18-22/29 (figures only; no text to transcribe)

  • Slide 23/29

    Inductive Bias

    Need to make assumptions
    Experience alone doesn't allow us to draw conclusions about unseen data instances

    Two types of bias:
    Restriction: limit the hypothesis space (e.g., naïve Bayes)
    Preference: impose an ordering on the hypothesis space (e.g., decision tree)

  • Slide 24/29

    Inductive learning example

    Construct h to agree with f on the training set
    h is consistent if it agrees with f on all training examples

    E.g., curve fitting (regression):

    x = input data point (training example)

  • Slide 25/29

    Inductive learning example

    h = Straight line?

  • Slide 26/29

    Inductive learning example

    What about a quadratic function?
    What about this little fella?

  • Slide 27/29

    Inductive learning example

    Finally, a function that satisfies all!

  • Slide 28/29

    Inductive learning example

    But so does this one
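
The curve-fitting story in slides 24-28 can be reproduced with a short NumPy sketch (the data points are invented; the slides' actual points are not recoverable from this transcript):

```python
# Fit hypotheses of increasing complexity to the same (x, f(x)) training points.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.2, 0.9, 1.7, 3.4, 3.9])      # invented training examples

for degree in (1, 2, len(x) - 1):            # straight line, quadratic, "wiggly" fit
    coeffs = np.polyfit(x, y, degree)
    max_err = np.max(np.abs(np.polyval(coeffs, x) - y))
    print(f"degree {degree}: max training error = {max_err:.4f}")

# The degree-4 polynomial passes through every point (error ~ 0), but so would
# many other, even wigglier, functions; hence the need for a preference.
```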

  • Slide 29/29

    Ockham's Razor Principle

    Ockham's razor: prefer the simplest hypothesis consistent with the data
    Related to the KISS principle (keep it simple, stupid)
    The smooth blue function is preferable to the wiggly yellow one
    If noise is known to exist in this data, even a linear fit might be better (the lowest x might be due to noise)
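
A small sketch of the razor in action (again with invented, noisy data): the high-degree polynomial matches its training points exactly, but the simple line usually generalizes better to held-out points.

```python
# Ockham's razor: on noisy data, simpler hypotheses often generalize better.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 12)
y = 0.8 * x + rng.normal(scale=0.4, size=x.size)    # roughly linear + noise (invented)

x_train, y_train = x[::2], y[::2]                   # even indices: training points
x_test, y_test = x[1::2], y[1::2]                   # odd indices: held out

for degree in (1, 5):                               # simple line vs. wiggly polynomial
    coeffs = np.polyfit(x_train, y_train, degree)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: held-out MSE = {test_mse:.3f}")

# The degree-5 fit has (near-)zero error on its 6 training points,
# yet typically does worse than the straight line on the held-out points.
```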

