Knowledge Base Diagnostics
Richard Fikes (Stanford KSL)
Adam Pease (Teknowledge)
Mala Mehrotra (Pragati Synergetic Research Inc.)
Yolanda Gil (USC ISI)
Deborah McGuinness (Stanford KSL)
10/18/01
Knowledge Systems Laboratory, Stanford University
Knowledge Evolution Tools
  KB development requires knowledge evolution
    Debugging, refining, structuring, modularizing, …
  Power tools are needed to support KB evolution
    KB diagnosis
      > Bugs, omissions, heuristic warnings, architectural advice
    KB partitioning
      > To enable effective reasoning
      > To produce reusable KB building blocks
    KB merging
      > To enable interoperation of KBs with overlapping content
  KSL is developing knowledge evolution tools
Chimaera
A Knowledge Evolution Tool Environment
  Tools for KB diagnosis and merging
  Available as a Web service or an OKBC client
    www.ksl.stanford.edu/software/chimaera
    Usable from a Web browser
    Online user manual, tutorial, and demonstration movie
  Performs KB diagnostics in batch mode
    Uploads and analyzes user’s KB
    Accepts KBs in OKBC, KIF, MELD, RDF, DAML, …
    Provides results as HTML pages linked to frames and axioms
    Provides user selectable set of diagnostic tests
  Analyzes both the structure and content of a KB
    Uses reasoners to analyze content
Classification of Diagnostic Results
  Errors
    Logical inconsistencies
      E.g., contradictory type constraints
    Content structure errors
      E.g., terms used but not defined
  Anomalies
    Missing information
      E.g., type constraints
    Redundancies
      E.g., redundant superclass and type links
    Extraneous structure or content
      E.g., terms defined but not used
  Summaries
    E.g., counts of term references
  Suggestions
    E.g., use consistent naming conventions
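As a rough illustration of how findings might be grouped under this classification, the sketch below runs three structural checks on a toy KB. The `ToyKB` class, the `diagnose` function, and the example terms are all hypothetical and are not Chimaera's data model or API.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyKB:
    """Hypothetical stand-in for a KB: defined terms plus the terms each axiom references."""
    defined: set[str] = field(default_factory=set)
    axioms: dict[str, list[str]] = field(default_factory=dict)  # axiom id -> terms it mentions


def diagnose(kb: ToyKB) -> dict[str, object]:
    """Group a few structural findings by result class."""
    refs = Counter(term for terms in kb.axioms.values() for term in terms)
    used = set(refs)
    return {
        "errors": sorted(used - kb.defined),     # terms used but not defined
        "anomalies": sorted(kb.defined - used),  # terms defined but not used
        "summaries": dict(refs),                 # counts of term references
    }


kb = ToyKB(
    defined={"Pump", "Valve", "Gasket"},
    axioms={"ax1": ["Pump", "Valve"], "ax2": ["Pump", "Impeller"]},
)
report = diagnose(kb)
print(report)
```

Here "Impeller" is reported as an error (used but not defined) and "Gasket" as an anomaly (defined but not used).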
“Background” Reasoning Analysis
  Reasoning diagnostics that may take substantial time
    Performed in background
    Results incrementally posted on Web page
    Completion notification sent to user via e-mail
  Example reasoning diagnostics
    Redundant axioms that are inferred by the KB (anomaly)
    Inconsistent axioms whose negations are inferred by the KB (error)
    Determine which relations in KB are primitive and non-primitive (summary)
      > Show relations on which each non-primitive relation depends
    Determine classes that are disjoint (suggest adding results to KB)
    Derive subclass and instance links (suggest adding links to KB)
      > I.e., classification and recognition
    Suggest reordering of an implication’s antecedents based on number of inferable instances of each antecedent (suggestion)
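The antecedent-reordering suggestion can be sketched as a sort on instance counts. The slide does not say which direction the ordering should go; the sketch assumes the antecedent with the fewest inferable instances should be tried first. The function name and the example counts are hypothetical.

```python
from __future__ import annotations


def suggest_antecedent_order(antecedents: list[str],
                             instance_counts: dict[str, int]) -> list[str] | None:
    """Return a suggested order (fewest inferable instances first), or None if no change is needed.

    Antecedents with no known count are treated as unboundedly large and placed last.
    """
    ordered = sorted(antecedents, key=lambda a: instance_counts.get(a, float("inf")))
    return ordered if ordered != antecedents else None


# Hypothetical rule body and instance counts, as a reasoner might report them.
rule_body = ["(Person ?x)", "(Owns ?x ?y)", "(Submarine ?y)"]
counts = {"(Person ?x)": 10000, "(Owns ?x ?y)": 500, "(Submarine ?y)": 3}
suggestion = suggest_antecedent_order(rule_body, counts)
print(suggestion)
```

Because Python's sort is stable, antecedents with equal counts keep their original relative order, so the suggestion changes the rule no more than it has to.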
Integration Into SHAKEN
  Chimaera is a KB diagnostics tool in the SHAKEN system
    Used to diagnose both pump priming and SME KBs
  OKBC was used to do the integration
    Chimaera is an OKBC client
      > Interacts with any OKBC server using the OKBC API
      > The Chimaera Web service uses Ontolingua as its OKBC server
    SRI added an OKBC wrapper to the KM system
      > Enabled KM to be an OKBC server usable by OKBC clients
      > Enabled Chimaera’s diagnostics to run directly on KM KBs
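The integration pattern described above (a diagnostics client written against one common interface, and a wrapper that exposes another system through that interface) can be sketched as follows. This is only an illustration of the pattern: the `FrameServer` interface and its method names are hypothetical and are not the OKBC operations, and `LegacySystem` merely stands in for a system with its own native interface.

```python
from __future__ import annotations

from typing import Protocol


class FrameServer(Protocol):
    """Hypothetical common interface; these are NOT the OKBC operation names."""
    def classes(self) -> list[str]: ...
    def direct_superclasses(self, cls: str) -> list[str]: ...


class NativeServer:
    """A server that already stores class -> direct superclasses."""
    def __init__(self, table: dict[str, list[str]]) -> None:
        self._table = table

    def classes(self) -> list[str]:
        return list(self._table)

    def direct_superclasses(self, cls: str) -> list[str]:
        return list(self._table.get(cls, []))


class LegacySystem:
    """A system with a different native interface: class names plus (sub, super) pairs."""
    def __init__(self, names: list[str], isa_pairs: list[tuple[str, str]]) -> None:
        self.names = names
        self.isa_pairs = isa_pairs


class LegacyWrapper:
    """Wrapper that makes LegacySystem usable through the common interface."""
    def __init__(self, system: LegacySystem) -> None:
        self._system = system

    def classes(self) -> list[str]:
        return list(self._system.names)

    def direct_superclasses(self, cls: str) -> list[str]:
        return [sup for sub, sup in self._system.isa_pairs if sub == cls]


def undefined_superclasses(server: FrameServer) -> list[tuple[str, str]]:
    """One diagnostic, written once against the interface: superclass links to unknown classes."""
    known = set(server.classes())
    return sorted((c, s) for c in known for s in server.direct_superclasses(c) if s not in known)


native = NativeServer({"Pump": ["Device"], "Device": []})
wrapped = LegacyWrapper(LegacySystem(["Pump", "Device", "Valve"],
                                     [("Pump", "Device"), ("Valve", "Fitting")]))
print(undefined_superclasses(native), undefined_superclasses(wrapped))
```

The diagnostic itself never changes; only the wrapper knows about the second system's storage format.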
Chimaera Useful To SRI Team
“Overall, we found that Chimaera was quite useful. It found 2 concepts (Indole and Imidazole) that were corrupted, several occurrences of redundant superclasses, and several incorrect domain and range constraints (due to our poor representation of "Information").
…
We're currently fixing the bugs it revealed. It would be helpful if we could run Chimera on the component library frequently.”
– Bruce Porter
Next Steps: SME-Oriented Support
  Provide interactive repair-oriented follow-up to diagnostics
    Identify KB content on which diagnosis result is based
    Suggest repairs or repair strategies
    Guide user through repair procedure
  Examples
    Class is a direct subclass of “THING”
      > Provide direct subclasses of THING as candidate superclasses
      > Step down through the class hierarchy
    Class has redundant superclass links
      > Suggest removal of link(s) to most general classes
    Type, cardinality, or bounds conflict
      > Suggest changing local conflicting constraint(s)
    Missing information
      > Initiate acquisition dialogues for missing information
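Two of the repair examples above can be sketched on a toy class hierarchy: finding superclass links that are already implied through another direct superclass (and so are candidates for removal), and listing the direct subclasses of THING as candidate superclasses. The dictionary representation and function names are assumptions made for illustration only.

```python
from __future__ import annotations


def ancestors(cls: str, parents: dict[str, list[str]]) -> set[str]:
    """All strict ancestors of cls reachable through direct superclass links."""
    seen: set[str] = set()
    stack = list(parents.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def redundant_superclass_links(parents: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Links (cls, sup) where sup is already an ancestor of another direct superclass of cls."""
    suggestions = []
    for cls, direct in parents.items():
        for sup in direct:
            if any(sup in ancestors(other, parents) for other in direct if other != sup):
                suggestions.append((cls, sup))  # the more general link can be removed
    return suggestions


def candidate_superclasses(cls: str, parents: dict[str, list[str]], root: str = "THING") -> list[str]:
    """Direct subclasses of the root, offered as a first step down the hierarchy."""
    return sorted(c for c, direct in parents.items() if root in direct and c != cls)


hierarchy = {
    "THING": [],
    "Device": ["THING"],
    "Machine": ["Device"],
    "Pump": ["Device", "Machine"],  # the link to Device is implied by Machine
    "Widget": ["THING"],            # sits directly under THING
}
print(redundant_superclass_links(hierarchy))
print(candidate_superclasses("Widget", hierarchy))
```

An interactive repair dialogue would repeat `candidate_superclasses` with the chosen class as the new root to step down through the hierarchy.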
Next Steps: Architectural Analysis
  Summarize architectural features of a KB
    Percentage of
      > Relations that are functions
      > Axioms that are propositional, first order, higher order
      > Axioms that are not Horn clauses
    Distribution of
      > Axioms by type (using the HPKB, RKF types)
      > Axiom lengths by number of literals
      > Functions by number of arguments
      > Relations by number of arguments
      > Direct subclasses per class
      > Direct subproperties per property
      > Restrictions per object
      > Property values per object
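A few of these summaries can be sketched with simple counting. The sketch assumes axioms are already in clause form, each a list of `(is_positive, atom)` literals, and uses the standard definition of a Horn clause (at most one positive literal). The `summarize` function and the input shapes are hypothetical.

```python
from __future__ import annotations

from collections import Counter

Clause = list[tuple[bool, str]]  # (is_positive, atom) literals; an assumed representation


def is_horn(clause: Clause) -> bool:
    """A Horn clause has at most one positive literal."""
    return sum(1 for positive, _ in clause if positive) <= 1


def summarize(clauses: list[Clause],
              relation_arity: dict[str, int],
              direct_subclasses: dict[str, list[str]]) -> dict[str, object]:
    """Compute one percentage and three distributions from the list above."""
    total = len(clauses)
    non_horn = sum(1 for c in clauses if not is_horn(c))
    return {
        "pct_non_horn": (100.0 * non_horn / total) if total else 0.0,
        "axiom_lengths": Counter(len(c) for c in clauses),
        "relations_by_arity": Counter(relation_arity.values()),
        "subclasses_per_class": Counter(len(s) for s in direct_subclasses.values()),
    }


clauses = [
    [(False, "A"), (True, "B")],                # A => B
    [(True, "A"), (True, "B")],                 # A or B (not Horn)
    [(False, "A"), (False, "B"), (True, "C")],  # A and B => C
    [(True, "D")],                              # fact
]
summary = summarize(
    clauses,
    relation_arity={"parent": 2, "age": 2, "between": 3},
    direct_subclasses={"Device": ["Pump", "Valve"], "Pump": [], "Valve": []},
)
print(summary)
```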
Next Steps: Partitioning and Beyond
  Integration of KB partitioning tools into Chimaera
    Provide automatic KB partitioning to enhance usability
  Automatic running of test cases
    E.g., queries and expected answers
    Support regression testing of evolving KB
    Provide result summaries from failed tests
  Help with typographical errors
    Spelling correction for undefined names
      > E.g., classes, slots, relations, functions, constants
    Spelling correction for anomalously occurring variables
      > Suggest that it is the same as another variable in the sentence
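The two spelling-correction ideas can be sketched with Python's standard `difflib` module. The similarity cutoff of 0.8, the function names, and the rule that a variable occurring exactly once is "anomalous" are assumptions chosen for illustration; the slides do not specify how matches would be found.

```python
from __future__ import annotations

import difflib
from collections import Counter


def suggest_names(undefined: list[str], defined: list[str], cutoff: float = 0.8) -> dict[str, list[str]]:
    """For each undefined name, list defined names that are close in spelling."""
    return {name: difflib.get_close_matches(name, defined, n=3, cutoff=cutoff)
            for name in undefined}


def suggest_variable_fixes(variables: list[str], cutoff: float = 0.8) -> dict[str, str]:
    """Map each variable that occurs once to a similarly spelled variable in the same sentence."""
    counts = Counter(variables)
    fixes = {}
    for var, n in counts.items():
        if n != 1:
            continue  # only singleton occurrences are treated as suspicious here
        others = [other for other in counts if other != var]
        match = difflib.get_close_matches(var, others, n=1, cutoff=cutoff)
        if match:
            fixes[var] = match[0]
    return fixes


name_fixes = suggest_names(["Submarien", "Aircraft"], ["Submarine", "Surface-Ship", "Torpedo"])
var_fixes = suggest_variable_fixes(["?person", "?person", "?perosn", "?y", "?y"])
print(name_fixes, var_fixes)
```

A name with no close match (such as "Aircraft" here) gets an empty suggestion list, which would leave it reported as a plain undefined term.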
Summary
  KSL is developing Chimaera to support KB evolution
  Chimaera was integrated into the SHAKEN Y1 system
    Using OKBC(!)
  Incrementally adding diagnostics
    E.g., “background” diagnostics that use sophisticated reasoning
  Next steps
    KB partitioning tools
    Repair dialogues for SMEs
    KB architectural analysis
    Regression testing
Role of Diagnostics in Systems
  KE support
  SME support
  Increase productivity (“lightly trained”)
  Step in managing KB development
  Focus attention (e.g., redundant links)
  Evaluation support
    Diagnose KBs produced during evaluation
  Batch mode
    Foreground
    Background
  Changes in “patterns” in the KB between versions
Sharing Diagnostics Information
  Diagnostic specifications
    Logical specifications
    English specifications
    Test cases
  Diagnostic classifications
  Learnings
  Tricks of the trade
  Sharing facilitators:
    Working group
    Mailing list
  Findings data
    Author, group, or team specific
  Repair strategies
  Alignments during collaborative development
Developer Needs and Desires
  Reasoner-specific diagnostics
  Highly informative diagnostic results
  Reporting architectural bias in a KB
    Binary versus higher arity relations
    First order versus higher order axioms
      > Weakly versus strongly higher order
    Disjunctions or conjunctions
    Existential versus universal quantifiers
    Frames to axioms ratios
    Horn clauses
    Axiom lengths
    Functions
  Confusion of existential and universal quantifiers
  Type restrictions too general
  Misspelling of variables
Developer Needs and Desires
  Domain-specific tests
  Semantic tests
  Maintainability measures
  Recognizing typographical errors
    Spell check undefined or unused terms
  Redefining (e.g., breaking up) a predicate
  Large scale modification techniques
  Prioritizing diagnostics
Integration Issues
  Architecture
    Use hosted services (like KSL)
    Integrate special code
    Take specifications from library
  API
  Interaction Mode - Batch versus Interactive/Repair
  Translation issues
    One major use of diagnostics is also in testing translators
    Certain translations need to be done to do better analysis
  Output integration
Evaluation
  Record types and numbers of errors
    Comparing KBs produced by SMEs versus KEs
  Record use of repair strategies
  Evaluate during testing
  Feedback from SMEs about diagnostics
Classification of Diagnostic Results
  Errors
    Logical inconsistencies
    Content structure errors
    (See Randy Davis thesis)
  Anomalies
    Missing information
      > Missing portions of descriptions
    Redundancies
    Extraneous structure or content
  Summaries
    Architectural biases
  Suggestions
    Stylistic suggestions
  Static versus operational tests
  Use of expertise about KR paradigms
Diagnostic Issues/Goals
  Role of Diagnostics in Systems
    KE support, SME support
    Evaluators of KBs
  How to Share Diagnostics
    Working Group?
    Logical specification, English descriptions, tests, …
  Know the Main Contributors
  Possible Diagnostics
    What do users want?
    What can tool builders provide?
  Integration Issues
  Developer Needs/Desires
  Evaluation
The Role of KB Diagnostics
  KE support
  SME support
  Increase productivity (“lightly trained”)
  Management of KB
  Inference-dependent quality improvement
  Focus attention (e.g., redundant links)
  Evaluation support
  Abstract patterns: average fanout of specialization, statistics of number of uses of a predicate (big picture view)
  Version comparison
  Regression testing
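Version comparison and regression testing could look roughly like the following: a set of queries with expected answers is re-run against each KB version, and failures are summarized by what is missing or unexpected. The `QueryCase` record, the `ask` callback, the query syntax, and the dictionaries standing in for two KB versions are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class QueryCase:
    """A test case: a query and the set of answers it is expected to return."""
    query: str
    expected: frozenset[str]


def run_regression(ask: Callable[[str], Iterable[str]], cases: list[QueryCase]) -> list[dict]:
    """Run every case through `ask` and return one summary per failed case."""
    failures = []
    for case in cases:
        got = frozenset(ask(case.query))
        if got != case.expected:
            failures.append({
                "query": case.query,
                "missing": sorted(case.expected - got),
                "unexpected": sorted(got - case.expected),
            })
    return failures


# Two toy "versions" of a KB, represented as query -> answers lookups.
kb_v1 = {"(subclass-of ?x Device)": ["Pump", "Valve"]}
kb_v2 = {"(subclass-of ?x Device)": ["Pump"]}  # Valve was lost in an edit
cases = [QueryCase("(subclass-of ?x Device)", frozenset({"Pump", "Valve"}))]

failures_v1 = run_regression(lambda q: kb_v1.get(q, []), cases)
failures_v2 = run_regression(lambda q: kb_v2.get(q, []), cases)
print(failures_v1, failures_v2)
```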
Diagnostic Sharing
  Diagnostic specifications
    Logical specifications
    English specifications
    Test cases
  Diagnostic classifications
    Taxonomy of errors - bottlenecks, quantification
  Alignments across systems - inconsistencies among SMEs
  Repair strategies
  How informative a system is (core dump vs. useful explanation)
  Learnings
  Tricks of the trade
  Sharing facilitators:
    Working group
    Mailing list
Sharing facilities
  Working group
  Mailing list
  Posting of papers
  Utilize Teknowledge
Biases
  Binary vs. higher arity
  First order vs. higher order
    Weakly vs. strongly higher order
  Universal over existential
  Disjunction vs. conjunction
  Frame-ism
  Horn clauses
  Lisp style
  Relations -> functions
  Depth vs. breadth in hierarchy
  …
  Maybe report in summarizations
  At least document biases
Organizations/People
  Cycorp - many special purpose - Kahlert
  ISI - Why Not? - Chalupsky
      - KANAL - Gil
      - EXPECT - Gil
  Pragati - Clustering - Mehrotra
  Stanford FRG/KSL - Partitioning - McCarthy, Amir, McIlraith
  Stanford KSL - Chimaera - Fikes, McGuinness
Diagnostics
  Errors - provable logical inconsistencies
  Anomalies - redundancies, cycles, …
  Summaries - word counts, …
  Suggestions - naming conventions
  Incompletenesses - explicit salient assertions or statistics
  Stylistics - length of rule, … bad factoring
  Randy Davis - errors - incompleteness, inconsistent
  Get this - top ten list of things people do wrong in Cyc - Goolsbey
  Perspectives/units:
    Frame-like content vs. axioms vs. problem solving technology vs. learning to correct components
Style
  Static Reasoner
  Simulation / execution
  Using examples
  Summarization/improvements/critiquer
Integration Issues
  Architecture
    Use hosted services (like KSL)
    Integrate special code
    Take specifications from library
  API
  Interaction Mode - Batch vs. Interactive/Repair
  Translation issues
    One major use of diagnostics is also in testing translators
    Certain translations need to be done to do better analysis
    Background ontologies - MELD starter ontology
  Output integration
Developer Needs/Desires
  Missing existentials
  Too high a type specification
  Variable name mismatch
  Semantic requests:
    Wrong semantic paradigm?
  Typos
  Spell check
  Large scale modification tools and their integration
    Example: removal/fixing top level
  Prioritizing
  Diagnostics to minimize cost, ease maintenance
Evaluation
  Record types of errors
    Fine granularity
  KB differences across SME vs. KE developed ontologies across team
  Record use of repair strategies …
  Evaluate during testing …
  Feedback from SMEs on features, usefulness, etc.
  Attempt to keep extremely complete audit trails for future analysis
  Important to be careful with diagnostic reporting
Action Items
  Working Group
  Diagnostics repository
  Web site
  Follow up briefing
  Mailing list
Chimaera
A Knowledge Evolution Environment
  Tools for KB diagnosis and merging
  Available as a Web service
    www-ksl-svc.stanford.edu
    www.ksl.stanford.edu/software/chimaera
    Usable from a Web browser
    Online user manual, tutorial, and demonstration movie
    Provides user selectable set of diagnostic tests
  Performs KB diagnostics in batch mode
    Uploads and analyzes user’s KB
    Accepts KBs in MELD, KIF, OKBC, DAML, RDF, XML, …
    Provides results as HTML pages linked to frames and axioms
  Analyzes both the structure and content of a KB
    Uses hybrid reasoners to analyze content
    Currently runs 28 diagnostic tests
Collection/Specification
  Logical specification of diagnostic
  English specification
  Example KB that triggers diagnostic output
Classification of Diagnostic Results II
  Axiom Analysis
    Axiom Syntax Problems
      E.g., no consequent to an implication
    Axiom Redundancy
      E.g., 1. A => B  2. A => C  3. C => B means 1 is redundant
    Axiom Variable Usage
      E.g., variable used in antecedent but not in consequent
    Axiom Consistency
      E.g., A => not A
    Axiom Tautology
      E.g., consequent repeats (portion of) antecedent
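Three of the axiom checks above (redundancy, consistency, tautology) can be sketched for the propositional case. The sketch assumes each axiom is an implication with a set of antecedent atoms and a single consequent, writes negation as a "not " prefix, and tests redundancy by forward chaining over the remaining rules. This representation and the function names are illustrative assumptions, not Chimaera's method.

```python
from __future__ import annotations

Rule = tuple[frozenset[str], str]  # (antecedent atoms, consequent); an assumed representation


def closure(facts: frozenset[str], rules: list[Rule]) -> set[str]:
    """Forward-chain: everything derivable from facts using rules."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= derived and consequent not in derived:
                derived.add(consequent)
                changed = True
    return derived


def redundant_rules(rules: list[Rule]) -> list[int]:
    """Indices of rules whose consequent follows from their antecedents via the other rules."""
    found = []
    for i, (antecedents, consequent) in enumerate(rules):
        others = rules[:i] + rules[i + 1:]
        if consequent in closure(antecedents, others):
            found.append(i)
    return found


def negate(literal: str) -> str:
    return literal[4:] if literal.startswith("not ") else "not " + literal


def is_self_contradictory(rule: Rule) -> bool:
    """E.g., A => not A."""
    antecedents, consequent = rule
    return negate(consequent) in antecedents


def is_tautology(rule: Rule) -> bool:
    """Consequent repeats one of the antecedents."""
    antecedents, consequent = rule
    return consequent in antecedents


rules = [
    (frozenset({"A"}), "B"),  # 1. A => B
    (frozenset({"A"}), "C"),  # 2. A => C
    (frozenset({"C"}), "B"),  # 3. C => B
]
print(redundant_rules(rules))
```

Each rule is tested one at a time against all the others, so two rules that make each other redundant would both be flagged even though only one of them can safely be removed. A tautological rule is also flagged as redundant, since its consequent is already among its antecedents.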