Implicit learning of common sense for reasoning Brendan Juba Harvard University
Page 1: Implicit learning of common sense for reasoning Brendan Juba Harvard University.

Implicit learning of common sense for reasoning

Brendan Juba, Harvard University

Page 2

A convenient example

“Thomson visited Cooper’s grave in 1765. At that date, he had been traveling [resp.: dead] for five years.”

“Who had been traveling [resp.: dead]?”
(The Winograd Schema Challenge [Levesque, Davis, and Morgenstern, 2012])

Our approach: learn sufficient knowledge to answer such queries from examples.

Page 3

The task

In_grave(x)  Alive(x)  Traveling(x)
     1          0           0
     *          0           0
     0          1           *
     0          1           1
     1          1           0
     0          1           1
     0          1           *
     1          0           0
     0          0           0
     1          0           0
     0          1           0
     0          1           1
     *          1           *
     *          0           0

• The examples may be incomplete (a * in the table)

• Given In_grave(Cooper), we wish to infer ¬Traveling(Cooper)

• Follows from In_grave(x) ⇒ ¬Alive(x) and Traveling(x) ⇒ Alive(x)

• These two rules can be learned from this data

• Challenge: how can we tell which rules to learn?
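As a sanity check, the two rules above can be read off the table by counting observed violations. A minimal sketch in Python (data transcribed from the table, with None standing in for *):

```python
# Partial examples from the table: (In_grave, Alive, Traveling); None = *.
rows = [
    (1, 0, 0), (None, 0, 0), (0, 1, None), (0, 1, 1), (1, 1, 0),
    (0, 1, 1), (0, 1, None), (1, 0, 0), (0, 0, 0), (1, 0, 0),
    (0, 1, 0), (0, 1, 1), (None, 1, None), (None, 0, 0),
]

# In_grave(x) => ¬Alive(x): violated only when In_grave=1 and Alive=1
# are both actually observed in the partial example.
viol_grave = sum(1 for g, a, t in rows if g == 1 and a == 1)

# Traveling(x) => Alive(x): violated only when Traveling=1 and Alive=0.
viol_travel = sum(1 for g, a, t in rows if t == 1 and a == 0)

print(viol_grave, viol_travel)  # one observed counterexample vs. none
```

On this data the first rule has a single observed counterexample (the “1 1 0” row) and the second has none, which is why both are still usable as rules of thumb.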

Page 4

This work

Given: examples, KB, and a query…
• Proposes a criterion for learnability of rules in reasoning: “witnessed evaluation”
• Presents a simple algorithm for efficiently considering all such rules for reasoning in any “natural” (tractable) fragment
– “Natural” defined previously by Beame, Kautz, Sabharwal (JAIR 2004)
– Tolerant to counterexamples as appropriate for application to “common sense” reasoning

Page 5

This work

• Only concerns learned “common sense”
– Cf. Spelke’s “core knowledge:” naïve theories, etc.
– But: use of logical representations provides a potential “hook” into traditional KR
• Focuses on confirming or refuting query formulas on a domain (distribution)
– As opposed to: predicting missing attributes in a given example (cf. past work on PAC-Semantics)

Page 6

Why not use…

Bayes nets/Markov Logic/etc.?
– Learning is the Achilles’ heel of these approaches: even if the distributions are described by a simple network, how do we find the dependencies?

Page 7

Outline

1. PAC-Semantics: model for learned knowledge
– Suitable for capturing learned common sense
2. Witnessed evaluation: a learnability criterion under partial information
3. “Natural” fragments of proof systems
4. The algorithm and its guarantee

Page 8

PAC Semantics (for propositional logic) [Valiant, AIJ 2000]

• Recall: propositional logic consists of formulas built from variables x1,…,xn, and connectives, e.g., ∧(AND), ∨(OR), ¬(NOT)

• Defined with respect to a background probability distribution D over {0,1}^n (Boolean assignments to x1,…,xn)

☞ Definition. A formula φ(x1,…,xn) is (1-ε)-valid under D if Pr_D[φ(x1,…,xn) = 1] ≥ 1-ε.

A RULE OF THUMB…
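To make the definition concrete, here is a toy Monte Carlo check of (1-ε)-validity. The distribution D below is invented for illustration (it makes “buried alive” rare but possible); it is not from the talk.

```python
import random

# A toy distribution D over (In_grave, Alive, Traveling) assignments.
random.seed(0)

def sample_d():
    alive = random.random() < 0.7
    in_grave = random.random() < (0.05 if alive else 0.5)  # buried alive is rare
    traveling = alive and random.random() < 0.4
    return in_grave, alive, traveling

# The rule of thumb In_grave(x) => ¬Alive(x).
phi = lambda g, a, t: (not g) or (not a)

# Estimate Pr_D[phi = 1] by sampling.
validity = sum(phi(*sample_d()) for _ in range(10_000)) / 10_000
assert 0.9 < validity < 1.0  # a good rule of thumb, but not 1-valid
```

The point of the sketch: under such a D, the rule is roughly 0.96-valid, so it is useful as a premise even though rare counterexamples exist.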

Page 9

Examples

In_grave(x)  Alive(x)  Traveling(x)  |  In_grave(x) ⇒ ¬Alive(x)
     1          0           0        |            1
     *          0           0        |            1
     0          1           *        |            1
     0          1           1        |            1
     1          1           0        |            0   ← Buried Alive!! (Grave-digger)
     0          1           1        |            1
     0          1           *        |            1
     1          0           0        |            1
     0          0           0        |            1
     1          0           0        |            1
     0          1           0        |            1
     0          1           1        |            1
     *          1           *        |            *
     *          0           0        |            1

APPEARS TO BE ≈86%-VALID…

Page 10

Examples

In_grave(x)  Alive(x)  Traveling(x)  |  Traveling(x) ⇒ Alive(x)
     1          0           0        |            1
     *          0           0        |            1
     0          1           *        |            1
     0          1           1        |            1
     1          1           0        |            1
     0          1           1        |            1
     0          1           *        |            1
     1          0           0        |            1
     0          0           0        |            1
     1          0           0        |            1
     0          1           0        |            1
     0          1           1        |            1
     *          1           *        |            1
     *          0           0        |            1

Note: Agreeing with all observed examples does not imply 1-validity; rare counterexamples may exist. We only get (1-ε)-validity with probability 1-δ.

Page 11

The theorem, informally

Theorem. For every natural tractable proof system, there is an algorithm that efficiently simulates access during proof search to all rules that can be verified (1-ε)-valid on examples.

• Can’t afford to explicitly consider all rules!
• Won’t even be able to identify the rules simulated
• Thus: rules are “learned implicitly”

Page 12

Outline

1. PAC-Semantics: model for learned knowledge
2. Witnessed evaluation: a learnability criterion under partial information
3. “Natural” fragments of proof systems
4. The algorithm and its guarantee

Page 13

Masking processes [Michael, AIJ 2010]

• A masking function m : {0,1}^n → {0,1,*}^n takes an example (x1,…,xn) to a partial example by replacing some values with *
• A masking process M is a masking-function-valued random variable
– NOTE: the choice of attributes to hide may depend on the example!
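A masking process can be sketched in a few lines; the adaptive hiding rule below is an invented example, chosen to show that which attribute gets hidden may depend on the drawn example itself.

```python
import random

# A sketch of a masking function: maps an example in {0,1}^n to a partial
# example in {0,1,*}^n, with None playing the role of *.
random.seed(1)

def mask(example):
    g, a, t = example  # (In_grave, Alive, Traveling)
    # Adaptively hide Traveling for the dead -- the hidden attribute
    # depends on the example that was drawn.
    if a == 0 and random.random() < 0.5:
        t = None
    return (g, a, t)

partial = mask((1, 0, 0))
assert partial[:2] == (1, 0)  # observed values pass through unchanged
```

Running `mask` over samples from D yields the distribution over partial examples M(D) used in the next slides.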

Page 14

Restricting formulas

Given a formula φ and masked example ρ, the restriction of φ under ρ, φ|ρ, is obtained by “plugging in” the values of ρi for xi whenever ρi ≠ * and recursively simplifying (using game-tree evaluation). I.e., φ|ρ is a formula in the unknown values.

(Figure: game-tree evaluation of a restriction — under ρ: x=0, y=0, the literal ¬x evaluates to 1 and y to 0, so the clause containing ¬x simplifies to 1 and the restricted formula ¬z∨z remains, which game-tree evaluation simplifies to 1.)
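The restriction operation for CNFs can be sketched directly. The encoding below (clauses as lists of (variable, sign) literals, partial examples as dicts) is my own choice for illustration.

```python
def restrict_clause(clause, rho):
    # clause: list of (var, sign) literals; rho: dict var -> 0/1 for observed vars.
    out = []
    for var, sign in clause:
        if var in rho:
            if rho[var] == sign:
                return True  # a literal is satisfied: the clause simplifies away
            # a falsified literal is simply dropped
        else:
            out.append((var, sign))  # unknown value: the literal survives
    return out or False  # every literal falsified -> the empty clause

def restrict_cnf(cnf, rho):
    restricted = []
    for clause in cnf:
        c = restrict_clause(clause, rho)
        if c is False:
            return False  # formula falsified by the partial example
        if c is not True:
            restricted.append(c)
    return restricted or True

# (¬x ∨ y) ∧ (¬z ∨ z) under ρ: x=0, y=0 -- the first clause simplifies to 1.
cnf = [[('x', 0), ('y', 1)], [('z', 0), ('z', 1)]]
print(restrict_cnf(cnf, {'x': 0, 'y': 0}))
```

The result is a formula in the unknown values only, exactly as in the definition above.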

Page 15

Witnessed formulas

We will learn rules that can be observed to hold under the given partial information:
• Definition. ψ is (1-ε)-witnessed under a distribution over partial examples M(D) if Pr_{ρ∈M(D)}[ψ|ρ = 1] ≥ 1-ε
• We will aim to succeed whenever there exists a (1-ε)-witnessed formula that completes a simple proof of the query formula…

Remark: witnessing is equal to “ψ is a tautology given ρ” in standard cases where this is tractable, e.g., CNFs, intersections of halfspaces; it remains tractable in cases where this is not, e.g., 3-DNFs

Page 16

Outline

1. PAC-Semantics: model for learned knowledge
2. Witnessed evaluation: a learnability criterion under partial information
3. “Natural” fragments of proof systems
4. The algorithm and its guarantee

Page 17

Example: Resolution (“RES”)

• A proof system for refuting CNFs (ANDs of ORs)
– Equivalently, for proving DNFs (ORs of ANDs)
• Operates on clauses—given a set of clauses {C1,…,Ck}, may derive
– (“weakening”) Ci ∨ l from any Ci (where l is any literal: a variable or its negation)
– (“cut”) C′i ∨ C′j from Ci = C′i ∨ x and Cj = C′j ∨ ¬x
• Refute a CNF by deriving the empty clause from it

Page 18

Tractable fragments of RES

• Bounded-width
• Treelike, bounded clause space

xi   ¬xi
¬xi∨xj   ¬xi∨¬xj   …

SPACE-2 ≡ “UNIT PROPAGATION,” SIMULATES CHAINING
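Unit propagation, which the slide identifies with clause-space-2 treelike resolution, can be sketched as follows (the integer literal encoding, ±var, is my own convention):

```python
def unit_propagate(cnf):
    # cnf: list of clauses, each a set of nonzero ints (+v / -v literals).
    # Repeatedly cut with a unit clause; refuted if the empty clause appears.
    cnf = [set(c) for c in cnf]
    while True:
        if any(not c for c in cnf):
            return True  # empty clause derived: refutation found
        units = [next(iter(c)) for c in cnf if len(c) == 1]
        if not units:
            return False  # no unit clause left: no refutation this way
        lit = units[0]
        # drop satisfied clauses, cut the complementary literal elsewhere
        cnf = [c - {-lit} for c in cnf if lit not in c]

# In_grave ∧ (¬In_grave∨¬Alive) ∧ (¬Traveling∨Alive) ∧ Traveling
# (vars: 1 = In_grave, 2 = Alive, 3 = Traveling)
print(unit_propagate([{1}, {-1, -2}, {-3, 2}, {3}]))  # True: refuted
```

This is exactly the chaining behavior used in the refutation example a few slides later.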

Page 19

Tractable fragments of RES

• Bounded-width
• Treelike, bounded clause space
☞ Applying a restriction to every step of proofs of these forms yields proofs of the same form (from a refutation of φ, we obtain a refutation of φ|ρ of the same syntactic form)
• Def’n (BKS’04): such fragments are “natural”

Page 20

Other “natural” fragments…

• Bounded-width k-DNF resolution
• L1-bounded, sparse cutting planes
• Degree-bounded polynomial calculus
• (more?)

REQUIRES THAT RESTRICTIONS PRESERVE THE SPECIAL SYNTACTIC FORM

Page 21

Outline

1. PAC-Semantics: model for learned knowledge
2. Witnessed evaluation: a learnability criterion under partial information
3. “Natural” fragments of proof systems
4. The algorithm and its guarantee

Page 22

The basic algorithm

• Given query DNF φ and masked examples {ρ1,…,ρk}
– For each ρi, search for a refutation of ¬φ|ρi
• If the fraction of successful refutations is greater than (1-ε), accept φ; otherwise reject.

CAN INCORPORATE A KB CNF Φ: REFUTE [Φ∧¬φ]|ρi
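Putting the pieces together, here is a self-contained sketch of this decision procedure. The clause encoding, the toy data, and the use of unit propagation as the proof search are all my own illustrative choices, not the paper's implementation.

```python
def restrict(cnf, rho):
    # cnf: list of clauses of ±var ints; rho: dict var -> 0/1 for observed vars.
    out = []
    for clause in cnf:
        kept, sat = set(), False
        for lit in clause:
            v = abs(lit)
            if v in rho:
                if (lit > 0) == bool(rho[v]):
                    sat = True  # clause satisfied: drop it entirely
                    break
            else:
                kept.add(lit)  # unknown value: literal survives
        if not sat:
            out.append(kept)
    return out

def refutes(cnf):
    # unit propagation (clause-space-2 treelike RES) to the empty clause
    cnf = [set(c) for c in cnf]
    while True:
        if any(not c for c in cnf):
            return True
        units = [next(iter(c)) for c in cnf if len(c) == 1]
        if not units:
            return False
        lit = units[0]
        cnf = [c - {-lit} for c in cnf if lit not in c]

def accept(kb, neg_query, partials, eps):
    # accept when at least a (1-ε) fraction of restrictions are refuted
    good = sum(refutes(restrict(kb + neg_query, rho)) for rho in partials)
    return good >= (1 - eps) * len(partials)

# Refute ¬φ = Traveling ∧ In_grave (vars: 1=In_grave, 2=Alive, 3=Traveling)
neg_query = [[3], [1]]
partials = [{1: 0, 2: 1},   # ρ1: In_grave = 0, Alive = 1
            {3: 0, 2: 0}]   # ρ2: Traveling = 0, Alive = 0
print(accept([], neg_query, partials, eps=0.2))  # True: both refute trivially
```

The two partial examples mirror the trivial refutations on the next slides: each falsifies one clause of ¬φ outright, yielding the empty clause with no proof search at all.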

Page 23

Example space-2 treelike RES refutation

Refute: Traveling. Given: In_grave. Supporting “common sense” premises: ¬Traveling∨Alive, ¬In_grave∨¬Alive.

(Refutation tree:)
In_grave (given) and ¬In_grave∨¬Alive (premise) cut to give ¬Alive
¬Alive and ¬Traveling∨Alive (premise) cut to give ¬Traveling
¬Traveling and Traveling (to refute) cut to give the empty clause ∅

Page 24

Example [Traveling∧In_grave]|ρ1, for ρ1: In_grave = 0, Alive = 1

Under ρ1, the premises ¬Traveling∨Alive and ¬In_grave∨¬Alive both restrict to T, while the given clause In_grave restricts to the empty clause ∅: a trivial refutation.

Page 25

Example [Traveling∧In_grave]|ρ2, for ρ2: Traveling = 0, Alive = 0

Under ρ2, the premises ¬Traveling∨Alive and ¬In_grave∨¬Alive both restrict to T, while the clause Traveling restricts to the empty clause ∅: again a trivial refutation (the clause In_grave remains but is not needed).

Page 26

The theorem, formally

The algorithm uses O((1/γ²) log(1/δ)) partial examples to distinguish the following cases w.p. 1-δ:
• The query φ is not (1-ε-γ)-valid
• There exists a (1-ε+γ)-witnessed formula ψ for which there exists a proof of the query φ from ψ

LEARN ANY ψ THAT HELPS VALIDATE THE QUERY φ. (N.B.: ψ MAY NOT BE 1-VALID.)
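The sample bound here, on the order of (1/γ²) log(1/δ) partial examples, is the usual Hoeffding estimate; the helper below (my own, for illustration) evaluates it for sample parameters.

```python
import math

# Number of partial examples needed to estimate the refutation rate to
# within gamma with failure probability delta (Hoeffding-style bound,
# up to constant factors).
def num_examples(gamma, delta):
    return math.ceil(math.log(1 / delta) / gamma ** 2)

print(num_examples(0.05, 0.01))  # on the order of a couple thousand examples
```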

Page 27

Analysis

• Note that resolution is sound…
– So, whenever a proof of φ|ρi exists, φ was satisfied by the example from D
⇒ If φ is not (1-ε-γ)-valid, tail bounds imply that it is unlikely that a (1-ε) fraction satisfied φ
• On the other hand, consider the proof of φ from the (1-ε+γ)-witnessed CNF ψ…
– With probability (1-ε+γ), all of the clauses of ψ simplify to 1 (ψ is “implicitly learned”)
⇒ The restricted proof does not require clauses of ψ

Page 28

Recap: this work…

• Proposed a criterion for learnability of common sense rules in reasoning: “witnessed evaluation”
• Presented a simple algorithm for efficiently considering all such rules as premises for reasoning in any “natural” (tractable) fragment
– “Natural,” defined by Beame, Kautz, Sabharwal (JAIR 2004), means: “closed under plugging in partial info.”
– Tolerant to counterexamples as appropriate for application to “common sense” reasoning

Page 29

Prior work: Learning to Reason

• Khardon & Roth (JACM 1997) showed that O(log n)-CNF queries could be efficiently answered using complete examples
– No mention of theorem-proving whatsoever!
– Could only handle low-width queries under incomplete information (Mach. Learn. 1999)
• Noise-tolerant learning captures (some kinds of) common sense (Roth, IJCAI’95)

Page 30

Work in progress

• Further integration of learning and reasoning
– Deciding general RES for limited learning problems in quasipoly-time: arXiv:1304.4633
– Limits of this approach: ECCC TR13-094
• Integration with “fancier” semantics (e.g., negation-as-failure)
– The point: want to consider proofs using such “implicitly learned” facts & rules

Page 31

Future work

• Empirical validation
– Good domain?
• Explicit learning of premises
– Not hard for our fragments under “bounded concealment” (Michael, AIJ 2010)
– But: this won’t tolerate counterexamples!

