
An Approach to Automated Learning of Conceptual Graphs from Text

Many document collections are private and accessible only to selected people. Especially in business settings, such collections need to be managed, and an external taxonomic or ontological resource would be very useful. Unfortunately, domain-specific resources are often not available, so techniques that do not rely on external resources become essential. Automated learning of conceptual graphs from restricted collections needs to be robust with respect to missing or partial knowledge, which prevents the extraction of a full conceptual graph and yields only sparse fragments of it. This work proposes a way to deal with these problems by applying relational clustering and generalization methods. While clustering collects similar concepts, generalization provides additional nodes that can bridge separate pieces of the graph while expressing it at a higher level of abstraction. In this process, considering relational information allows a broader perspective in the similarity assessment for clustering, and ensures more flexible and understandable descriptions of the generalized concepts. The final conceptual graph can be used to better analyze and understand the collection, and to perform some kind of reasoning on it.

Università degli studi di Bari “Aldo Moro”
Dipartimento di Informatica

The 26th International Conference On Industrial, Engineering & Other Applications of Applied Intelligent Systems - IEA/AIE 2013
Amsterdam, The Netherlands, June 17-21, 2013

L.A.C.A.M. http://lacam.di.uniba.it

An Approach to Automated Learning of Conceptual Graphs from Text

Fulvio Rotella, Stefano Ferilli, Fabio Leuzzi
{fulvio.rotella, stefano.ferilli, fabio.leuzzi}@uniba.it


Overview


● Introduction

● Our Framework

● Goals

● Proposal

● Conceptual Graph Construction

● Knowledge Representation Formalism

● Approaching missing/partial knowledge

● Probabilistic Reasoning by Association

● Qualitative Evaluations

● Conclusions & Future Work


Introduction

The spread of electronic documents and document repositories has generated the need for automatic techniques to
● understand
● handle
the documents' content, in order to help users satisfy their information needs.

Full text understanding is not trivial, due to:
● the intrinsic ambiguity of natural language
● the huge amount of common-sense and conceptual background knowledge required

Lexical and/or conceptual taxonomies are useful for tackling these problems, even though building them manually is very costly and error-prone.


Our framework*


1. Builds a conceptual network
   ● Syntactic analysis by Stanford Parser [1] and Stanford Dependencies [2]
   ● Handles positive/negative and active/passive sentence forms
   ● Extracts relationships between subject and (direct/indirect) object

2. Performs generalizations to tackle data sparseness and thus enrich the graph

3. Performs reasoning ‘by association’ to look for relationships between concepts

(*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7


Our framework: Limits

● Anaphora not handled

● Concept clustering using flat/vectorial representations

● Concept generalization based on external resources (e.g. WordNet [3])

● Focused mainly on the definitional portion of the network


Goals and Proposal

1. Automated learning of conceptual graphs from restricted collections

2. Exploiting probabilistic reasoning ‘by association’ on extracted knowledge

● To exploit an anaphora resolution strategy

● To address missing/partial knowledge by applying relational clustering

● To avoid the use of external resources when generalizing similar concepts


Conceptual Graph Construction

STEP 1: Pre-processing
input texts → JavaRAP [4] → texts w/o anaphoras

STEP 2: Sentence elaboration
texts w/o anaphoras → Stanford Parser → Stanford Dependencies

The final output is a typed syntactic structure of each sentence.


Knowledge representation formalism

Only subject, verb and complement have been considered: subjects and complements represent concepts, while verbs express relations between them.

Indirect complements are treated as direct ones by embedding the corresponding preposition into the verb.

The frequency of each arc in positive and negative sentences has been taken into account.

[Figure: sentence elements (subject, verb, complement) mapped onto graph nodes (subject, complement) and the arc between them (verb)]
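The representation described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the input tuple layout, the function name `build_graph`, the `verb_preposition` naming of embedded prepositions and the toy sentences are all hypothetical.

```python
from collections import defaultdict

def build_graph(triples):
    """Accumulate <subject, verb, complement> triples into weighted arcs.

    Each input item is (subject, verb, preposition, complement, positive):
    `preposition` is None for direct complements and `positive` is False
    for negated sentences. This tuple layout is an assumption of the sketch.
    """
    arcs = defaultdict(lambda: {"pos": 0, "neg": 0})
    for subject, verb, preposition, complement, positive in triples:
        # Indirect complements: embed the preposition into the verb label.
        relation = verb if preposition is None else f"{verb}_{preposition}"
        # Count how often the arc occurs in positive and negative sentences.
        arcs[(subject, relation, complement)]["pos" if positive else "neg"] += 1
    return dict(arcs)

# Toy input, made up for illustration.
triples = [
    ("dog", "eat", None, "bone", True),
    ("dog", "eat", None, "bone", True),
    ("dog", "eat", None, "mouse", False),
    ("kid", "come", "from", "school", True),
]
graph = build_graph(triples)
for arc, counts in graph.items():
    print(arc, counts)
```

Keeping the positive and negative counts on the same arc makes it possible to derive ratios later without re-reading the text.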


Approaching missing/partial knowledge

The quality of the results of reasoning on the network depends on the processed texts and on the noise they carry: e.g. if two nodes belong to disjoint graph regions, reasoning cannot succeed.

New Relational Generalization Approach

Concept description + concept clustering + generalization operator


Relational Concept Description

1. The weakly connected components of the graph are extracted with JUNG [5]
   ● a weak component is a maximal sub-graph in which at least one path exists between each pair of vertices

2. For each concept, its k-neighborhood is extracted: the sub-graph induced by the set of concepts that are k or fewer hops away from it

3. The conceptual graph is translated into a set of Horn clauses:
   ● <subj, verb_{pos,neg}, compl> → {pos, neg}_verb(subj, compl)
   ● e.g. dog eats bone → pos_eat(dog, bone)
   ● concept(X) :- rela(X,Y), relb(Z,X), relc(Y,T)
   ● e.g. concept(dog) :- pos_eat(dog,bone), pos_spit(cat,bone), neg_eat(dog,mouse)
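The neighborhood extraction and the translation into a Horn clause can be illustrated with a short Python sketch. It is only a plausible reading of the slide: the arc layout `(polarity, verb, subject, complement)`, the function names and the choice to ignore arc direction when counting hops are assumptions, and the `chase` arc is made up to show a concept falling outside the neighborhood.

```python
from collections import deque

def k_neighborhood(arcs, concept, k):
    """Return the arcs of the sub-graph induced by concepts at most k hops away.

    `arcs` is a list of (polarity, verb, subject, complement) tuples. Arc
    direction is ignored while counting hops (an assumption of this sketch).
    """
    adjacency = {}
    for _, _, subj, compl in arcs:
        adjacency.setdefault(subj, set()).add(compl)
        adjacency.setdefault(compl, set()).add(subj)
    # Plain breadth-first search that stops expanding at depth k.
    distance = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    # Induced sub-graph: keep arcs whose endpoints were both reached.
    return [a for a in arcs if a[2] in distance and a[3] in distance]

def to_horn_clause(concept, arcs):
    """Render the description of `concept` as concept(c):-lit1,lit2,..."""
    body = ",".join(f"{pol}_{verb}({subj},{compl})" for pol, verb, subj, compl in arcs)
    return f"concept({concept}):-{body}"

arcs = [
    ("pos", "eat", "dog", "bone"),
    ("pos", "spit", "cat", "bone"),
    ("neg", "eat", "dog", "mouse"),
    ("pos", "chase", "cat", "bird"),  # bird is 3 hops from dog, so it is left out for k = 2
]
clause = to_horn_clause("dog", k_neighborhood(arcs, "dog", 2))
print(clause)
```

With k = 2 the sketch reproduces the clause given as an example on the slide.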


Relational Pairwise clustering

Exploits the relational representation of concepts. The similarity measure fs (formulae similitudo) [6] provides a relational similarity evaluation between them.

concept(X) :- rela(X,Y), relb(Z,X), relc(Y,T).

concept(K) :- relb(K,Y), reld(Z,K), relf(Y,T), rela(Z,T).

fs(C', C'')
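How pairwise scores turn into clusters is not spelled out on the slide. The sketch below shows one possible reading, in which any two concepts whose similarity exceeds a threshold end up in the same cluster (transitive merging with a union-find structure). The grouping rule, the function names and the toy scores are assumptions; the measure fs itself is defined in [6] and is passed in as a parameter rather than reproduced.

```python
from itertools import combinations

def pairwise_clusters(concepts, similarity, threshold):
    """Group concepts so that any pair scoring above `threshold` shares a cluster."""
    parent = {c: c for c in concepts}

    def find(c):
        # Follow parent links to the cluster representative (with path halving).
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(concepts, 2):
        if similarity(a, b) > threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for c in concepts:
        clusters.setdefault(find(c), []).append(c)
    # Singletons are not interesting for generalization, so drop them.
    return [members for members in clusters.values() if len(members) > 1]

# Toy scores standing in for fs(C', C''); the values are made up for illustration.
toy_scores = {
    frozenset({"visible", "textual"}): 2.4,
    frozenset({"kid", "guru"}): 2.1,
    frozenset({"guru", "limit"}): 2.2,
}

def toy_fs(a, b):
    return toy_scores.get(frozenset({a, b}), 0.5)

clusters = pairwise_clusters(["visible", "textual", "kid", "guru", "limit"], toy_fs, 2.0)
print(clusters)
```

Passing the similarity as a function keeps the grouping logic independent of how descriptions are compared.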


Generalization of cluster

Previous approach: generalization taking advantage of an external resource

Problem: such a resource is often not available for specific domains!

Solution: generalize each cluster using the maximum set of common descriptors of its concepts


Generalization of cluster

1. Performing the logical generalization operator in [7]
   ● a least general generalization (lgg) under θOI-subsumption of two clauses is a generalization which is not more general than any other such generalization, that is, it is either more specific than or not comparable to any other such generalization

2. Exploitable for:
   ● retrieval of documents of interest
   ● introducing new taxonomical relationships
   ● shifting of the representation when needed (abstraction)
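The intuition behind generalizing two concept descriptions can be conveyed with a brute-force Python sketch. It is not the operator of [7]: it simply searches for the largest set of literals shared by two ground descriptions when their constants are paired one-to-one (distinct terms stay distinct, in the spirit of object identity), then replaces the paired constants with variables. The function name and the exhaustive search are assumptions, and the literals are a small subset adapted from the cluster #20 example shown later in the deck.

```python
from itertools import count, permutations

def common_generalization(concept1, literals1, concept2, literals2):
    """Largest set of literals shared by two descriptions under a one-to-one term pairing.

    Literals are (predicate, arg1, arg2) tuples over constants. The two described
    concepts are paired with each other; the remaining constants are paired by
    exhaustive search, which is feasible only for tiny descriptions.
    """
    others1 = sorted({t for _, a, b in literals1 for t in (a, b)} - {concept1})
    others2 = sorted({t for _, a, b in literals2 for t in (a, b)} - {concept2})
    padded = others2 + [None] * len(others1)  # None means "constant left unpaired"
    best = []
    for image in permutations(padded, len(others1)):
        pairing = {concept1: concept2, **dict(zip(others1, image))}
        shared = [
            (p, a, b)
            for p, a, b in sorted(literals1)
            if pairing[a] is not None
            and pairing[b] is not None
            and (p, pairing[a], pairing[b]) in literals2
        ]
        if len(shared) > len(best):
            best = shared
    # Variabilize: X for the concept itself, then Y, Y1, Y2, ... for other terms.
    names = ("Y" if i == 0 else f"Y{i}" for i in count())
    variable = {concept1: "X"}
    body = []
    for p, a, b in best:
        for term in (a, b):
            if term not in variable:
                variable[term] = next(names)
        body.append(f"{p}({variable[a]},{variable[b]})")
    return "concept(X) :- " + ", ".join(body) + "."

guru = {("protect", "parent", "guru"), ("become", "parent", "guru"), ("become", "teenager", "guru")}
limit = {("protect", "parent", "limit"), ("become", "parent", "limit"), ("be", "ability", "limit")}
generalization = common_generalization("guru", guru, "limit", limit)
print(generalization)
```

Literals that cannot be matched under the chosen pairing, such as become(teenager, guru), correspond to what the evaluation slides call the uncovered portion of a concept.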


Probabilistic reasoning ‘by association’

Reasoning ‘by association’ means finding a path of pairwise related concepts that establishes an indirect interaction between two concepts c′ and c′′.

● Real-world data is noisy and uncertain
● Logical reasoning is conclusive, hence the need for a probabilistic approach
● Exploit soft relationships among concepts

Two strategies, (B) and (D):
● (B) works in breadth and aims at obtaining the minimal path between concepts, together with all involved relations
● (D) works in depth and exploits ProbLog [8] in order to allow probabilistic queries on the conceptual graph


Probabilistic reasoning ‘by association’

Breadth-First Search (B)

Given two nodes (concepts):

1. a Breadth-First Search starts from both nodes

2. the former searches the latter's frontier and vice versa

3. until the two frontiers meet at common nodes

Then the path is restored going backward to the roots in both directions.
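The bidirectional search in steps 1-3 can be sketched as follows. This is a generic textbook formulation, not the authors' code: the adjacency-dictionary input, the function names and the toy graph are assumptions, arcs are treated as undirected, and the relations along the path (which strategy (B) also reports) are omitted for brevity.

```python
from collections import deque

def _restore(parents, meeting):
    """Rebuild the path by walking back from the meeting node to both roots."""
    path, node = [], meeting
    while node is not None:
        path.append(node)
        node = parents[0][node]
    path.reverse()
    node = parents[1][meeting]
    while node is not None:
        path.append(node)
        node = parents[1][node]
    return path

def bidirectional_bfs(adjacency, start, goal):
    """Return a shortest path between start and goal in an undirected graph, or None."""
    if start == goal:
        return [start]
    parents = ({start: None}, {goal: None})
    frontiers = (deque([start]), deque([goal]))
    side = 0
    while frontiers[0] and frontiers[1]:
        # Expand one full level of the current side, then switch sides.
        for _ in range(len(frontiers[side])):
            node = frontiers[side].popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour in parents[side]:
                    continue
                parents[side][neighbour] = node
                if neighbour in parents[1 - side]:
                    return _restore(parents, neighbour)  # the two searches met
                frontiers[side].append(neighbour)
        side = 1 - side
    return None

# Toy undirected graph, made up for illustration.
adjacency = {
    "kid": ["school"],
    "school": ["kid", "parent"],
    "parent": ["school", "guru"],
    "guru": ["parent", "teenager"],
    "teenager": ["guru"],
}
path = bidirectional_bfs(adjacency, "kid", "teenager")
print(path)
```

Searching from both ends keeps each search shallow, which matters when the graph is large and the two concepts are far apart.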


Probabilistic reasoning ‘by association’

Breadth-First Search (B)

We also provide:

● the number of positive/negative instances

● the corresponding ratios over the total

Different gradations of actions between two concepts:

● permitted

● prohibited

● typical

● rare



Probabilistic reasoning ‘by association’

ProbLog Inference Engine (D)

A formalism based on the ProbLog language has been defined: f :: p

● f is a ground atom: link(subject, verb, object)

● p is the ratio between the sum of all ground atoms for which f holds and the sum of all possible links between subject and complement
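One way to read the definition of p is as a relative frequency: the number of observed occurrences of link(subject, verb, object) divided by the number of all observed links between the same subject and complement. The sketch below computes it under that assumption and prints the facts with the probability first, as ProbLog programs write them (`p::f.`). The function name and the toy observations are hypothetical.

```python
from collections import Counter

def problog_facts(observations):
    """Turn observed (subject, verb, complement) occurrences into probabilistic facts."""
    link_counts = Counter(observations)
    pair_counts = Counter((s, o) for s, _, o in observations)
    facts = []
    for (s, v, o), n in sorted(link_counts.items()):
        # Share of this verb among all links observed between s and o.
        p = n / pair_counts[(s, o)]
        facts.append(f"{p:.2f}::link({s},{v},{o}).")
    return facts

# Toy observations, made up for illustration.
observations = [
    ("dog", "eat", "bone"),
    ("dog", "eat", "bone"),
    ("dog", "eat", "bone"),
    ("dog", "bury", "bone"),
    ("cat", "chase", "mouse"),
]
facts = problog_facts(observations)
print("\n".join(facts))
```

The resulting lines can be saved to a file and loaded as the fact base of a ProbLog program, on top of which path queries between concepts would be defined.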


Probabilistic reasoning ‘by association’ *

(B)

(D)

(*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7


Preliminary Evaluation: Experimental setting

Goal: a qualitative examination of the obtained clusters and their generalizations.

The dataset consists of 18 documents about social networks. Its size was deliberately kept small in order to have poor knowledge.

The similarity function returns values in ]0,4[. The similarity function threshold has been set in [2.0, 2.3] with hops equal to 0.5.

The graph built from text included:
● 695 concepts
● 727 relations


Preliminary Evaluation: Qualitative examination

Clusters obtained by processing concepts described using one level of their neighbourhood, with a similarity threshold equal to 2.0.


Preliminary Evaluation: Qualitative examination

Apply (lgg) under θOI to cluster #35.

concept(X) :- impact(Y,X), signal(Y,X), signal_as(Y,X), do_with(Y,X), consider(Y,X), offer(Y,X), offer_to(Y,X), average(Y,X), average_about(Y,X), experience(Y,X), flee_in(Y,X), be(Y,X).

θ = < {internet/Y, visible/X}, {internet/Y, textual/X} >


Preliminary Evaluation

Apply (lgg) under θOI to cluster #20.

concept(X) :- protect(Y,X), protect_by(Y,X), become(Y,X), use(Y,X), have(Y,X), have_to(Y,X), have_in(Y,X), have_on(Y,X), find(Y,X), go(Y,X), look(Y,X), begin(Y,X), begin_with(Y,X), begin_about(Y,X), suspect_in(Y,X), suspect_for(Y,X).

θ = < {parent/Y, kid/X}, {parent/Y, guru/X}, {parent/Y, limit/X} >

uncovered portion of kid: teach(kid,school), launch_about(foundation,kid), teach(kid,contrast), come_from(kid,contrast), launch(foundation,kid), finish_in(kid,school), invite_from(school,parent), possess_to(school,parent), invite(school,parent), finish_in(kid,side), come_from(kid,school), find_in(school,parent), produce(school,parent), come_from(kid,side), find_from(school,parent), finish_in(kid,contrast), invite_about(school,parent), come_before(school,parent), release(foundation,kid), invite_to(school,parent), teach(kid,side), release_from(foundation,kid).

uncovered portion of guru: become(teenager,guru).

uncovered portion of limit: be(ability,limit), limit(ability,limit).


Preliminary Evaluation: Remarks

The improvement can be appreciated by noting the novelty in the way concept descriptions are constructed.

● Exploiting the Hamming distance, we obtained a first-level, relation-centric description (i.e. the concept with its direct relations)

● Exploiting our method, we obtained a concept-centric description (i.e. direct and indirect relations between the first-level concepts)

The results suggest that the procedure is reliable in recognizing similar concepts on the basis of their structural position in the graph.


Conclusions

This work proposes an approach to automatically learn conceptual graphs from text, avoiding the support of external resources.

It works by mixing different techniques.
● It exploits an anaphora resolution technique
● It applies relational clustering to group similar concepts
● It generalizes each cluster to obtain new concepts
● Such concepts can be used to:
   ● build taxonomic relations
   ● bridge disjoint portions of the graph

Preliminary experiments show that this approach can be viable, although extensions and refinements are needed.


Future work

1. Enrich the conceptual graph with more information
   ● Collocation extraction
   ● Identification of compound concepts (e.g. House of Representatives)
   ● Identification of concept attributes (e.g. adjectives) and properties (e.g. can(John,eat))

2. Perform more extensive experiments, adopting datasets available online, in order to study the behaviour of the system and its limits

3. Automatically set suitable thresholds for searching generalizations

4. Exploit more than one level of concept description, which may yield interesting results


Thanks for your attention

Questions?


References

[1] D. Klein and C. D. Manning. Fast exact inference with a factored model for natural language parsing. In Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003.
[2] M. C. de Marneffe, B. MacCartney, and C. D. Manning. Generating typed dependency parses from phrase structure trees. In LREC, 2006.
[3] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA, 1998.
[4] L. Qiu, M. Y. Kan, and T. S. Chua. A public reference implementation of the RAP anaphora resolution algorithm. In Proceedings of LREC 2004, pages 291–294, 2004.
[5] J. O’Madadhain, D. Fisher, S. White, and Y. Boey. The JUNG (Java Universal Network/Graph) Framework. Technical report, UCI-ICS, October 2003.
[6] S. Ferilli, T. M. A. Basile, M. Biba, N. Di Mauro, and F. Esposito. A general similarity framework for Horn clause logic. Fundam. Inf., 90(1-2):43–66, January 2009.
[7] G. Semeraro, F. Esposito, D. Malerba, N. Fanizzi, and S. Ferilli. A logic framework for the incremental inductive synthesis of Datalog theories. In Norbert E. Fuchs, editor, LOPSTR, volume 1463 of LNCS, pages 300–321. Springer, 1997.
[8] L. De Raedt, A. Kimmig, and H. Toivonen. ProbLog: a probabilistic Prolog and its application in link discovery. In Proc. of 20th IJCAI, pages 2468–2473. AAAI Press, 2007.

