Text Analytics and Text Mining
Best of Text and Data

Tom Reamy
Chief Knowledge Architect

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

2

Agenda

Text Analytics Capabilities
Text Analytics Applications
Text Mining and Text Analytics
– Data and Unstructured Content
Case Study – Text Mining for Taxonomy Development
Conclusion

3

KAPS Group: General

Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 8-10
Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
Consulting, Strategy, Knowledge architecture audit
Services:
– Text Analytics evaluation, development, consulting, customization
– Knowledge Representation – taxonomy, ontology, Prototype
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories

4

Introduction to Text Analytics
Text Analytics Features

Noun Phrase Extraction
– Catalogs with variants, rule based dynamic
– Multiple types, custom classes – entities, concepts, events
– Feeds facets
Summarization
– Customizable rules, map to different content
Fact Extraction
– Relationships of entities – people-organizations-activities
– Ontologies – triples, RDF, etc.
Sentiment Analysis
– Statistical, rules – full categorization set of operators
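
As a rough illustration of catalog-based extraction with variants, the Python sketch below matches a small hand-made catalog against a sentence. The catalog entries, the function name `extract_entities`, and the sample text are assumptions made up for illustration; they do not describe how any product named in this deck works.

```python
import re

# Hypothetical catalog: canonical entity -> (entity type, known variants).
CATALOG = {
    "International Business Machines": ("organization", ["IBM", "I.B.M."]),
    "SAS Institute": ("organization", ["SAS"]),
}

def extract_entities(text, catalog=CATALOG):
    """Return sorted (canonical_name, entity_type, matched_variant) tuples."""
    hits = set()
    for canonical, (etype, variants) in catalog.items():
        for surface in [canonical, *variants]:
            # The lookarounds keep "SAS" from matching inside "SASSY".
            pattern = r"(?<!\w)" + re.escape(surface) + r"(?!\w)"
            if re.search(pattern, text):
                hits.add((canonical, etype, surface))
    return sorted(hits)

if __name__ == "__main__":
    for hit in extract_entities("IBM and SAS sell text analytics software."):
        print(hit)
```

Counting the returned tuples by entity type is one simple way such output could feed a facet.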

5

Introduction to Text Analytics
Text Analytics Features

Auto-categorization
– Training sets – Bayesian, Vector space
– Terms – literal strings, stemming, dictionary of related terms
– Rules – simple – position in text (Title, body, url)
– Semantic Network – Predefined relationships, sets of rules
– Boolean
– Full search syntax – AND, OR, NOT
– Advanced – NEAR (#), PARAGRAPH, SENTENCE
This is the most difficult to develop
Build on a Taxonomy
Combine with Extraction, Sentiment
Foundation for best text analytics & combination
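
To make the Boolean and NEAR operators concrete, here is a minimal Python sketch of one categorization rule. The rule itself, the distance of 5 tokens, and the helper names `tokens`, `near`, and `is_vaccine_reaction` are hypothetical; commercial rule languages have their own syntax and semantics.

```python
import re

def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(toks, a, b, distance):
    """True if terms a and b occur within `distance` tokens of each other."""
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def is_vaccine_reaction(text):
    """Hypothetical rule: (vaccine NEAR/5 reaction) AND NOT veterinary."""
    toks = tokens(text)
    return near(toks, "vaccine", "reaction", 5) and "veterinary" not in toks

if __name__ == "__main__":
    print(is_vaccine_reaction("The patient had a reaction after the vaccine."))
```

A real rule set would usually add stemming and synonym dictionaries on top of literal terms, as the bullets above suggest.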

12

Varieties of Taxonomy / Text Analytics Software

Taxonomy Management
– Synaptica, SchemaLogic
Full Platform
– SAS-Teragram, SAP-Inxight, Smart Logic, Data Harmony, Concept Searching, Expert System, IBM, GATE
Content Management – embedded
Embedded – Search
– FAST, Autonomy, Endeca, Exalead, etc.
Specialty
– Sentiment Analysis, VOC – Lexalytics, Attensity / Reports
– Ontology – extraction, plus ontology

13

Text Analytics Applications
Platform for Multiple Applications

Content Aggregation, Duplicate Documents – save millions!
Business intelligence, Customer Intelligence
Social Media – sentiment analysis, Voice of the Customer
Social – Hybrid folksonomy / taxonomy / auto-metadata
Social – expertise, categorize tweets and blogs, reputation
Ontology – travel assistant, semantic web, etc.
eDiscovery, Reputation management, Customer Experience
Expertise Location, Crowd sourcing
Technical support

14

Text Analytics Applications:
Enterprise Search – Elements

Text Analytics can “solve” enterprise search
Multiple Knowledge Structures
– Facet – orthogonal dimension of metadata
– Taxonomy – Subject matter / aboutness
Software – Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
People – tagging, evaluating tags, fine tune rules and taxonomy
Rich Search Results – context and conversation
Platform for search based applications

16

Text Analytics and Text Mining
Data and Unstructured Content

80% of content is unstructured – adding to semantic web is major
Text Analytics – content into data
– Big Data meets Big Content
Real integration of text and ontology
– Beyond “hasDescription”
– Improve accuracy of extracted entities, facts – disambiguation
• Pipeline – oil & gas OR research / Ford
– Add Concepts, not just “Things” – 68% want this
Semantic Web + Text Analytics = real world value
Linked Data + Text Analytics – best of both worlds
Build superior foundation elements – taxonomies, categorization
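
The "Pipeline" disambiguation bullet on this slide can be illustrated with a very small context-overlap sketch in Python. The two sense vocabularies, the sense labels, and the function name `disambiguate` are assumptions chosen for the example; a production system would rely on a taxonomy or ontology rather than two hand-written word sets.

```python
import re

# Hypothetical context vocabularies for two senses of "pipeline".
SENSES = {
    "oil_and_gas": {"oil", "gas", "crude", "drilling", "refinery"},
    "research": {"drug", "trial", "candidate", "research", "clinical"},
}

def disambiguate(text, senses=SENSES):
    """Pick the sense whose vocabulary overlaps most with the text.

    Returns None when no sense matches or when the top score is tied.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(words & vocab) for sense, vocab in senses.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None

if __name__ == "__main__":
    print(disambiguate("The pipeline carries crude oil to the refinery."))
```

Returning None on a tie or on no overlap leaves the ambiguous mention for a human editor or a richer rule instead of guessing.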

17

Text Analytics and Text Mining and Data Mining
Vaccine Adverse Reaction

Combine with Data Mining
New sources of information
News stories, medical records
Blogs, social
Find new connections, sources of knowledge
Vaccine Adverse Effects – disease, symptoms, variables
Unstructured text into a data source
Some preliminary analysis, content structure
Find unknown adverse effects and prevalence
Drug Discovery + search / research – 5 year story

19

Text Analytics Applications
Example – Vaccine Adverse Effects

Text Analytics and Text Mining
Case Study – Taxonomy Development

Problem – 200,000 new uncategorized documents
Old taxonomy – need one that reflects change in corpus
Text mining, entity extraction, categorization
Bottom Up – terms in documents – frequency, date
Clustering – suggested categories
Clustering – chunking for editors
Time savings – only feasible way to scan documents
Quality – important terms, co-occurring terms

22

Text Analytics and Text Mining
Case Study – Taxonomy Development

Text into Data: Article, Abstract, Title, Subtitle – fields & source of terms
Add Data: PubDate, journalTitle, Taxonomy Node
Terms – Map to frequency, date, date ranges, Taxonomy Node
– New Terms, Trends
Relevance – frequency, Abstract, Title, human judgment
Entity Extraction – Authors, Organizations, Products
Categorization – build on clusters & taxonomy
Combination – reports, visualizations, interactive explorations
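
Mapping terms to frequency and date, as this slide describes, can be sketched in a few lines of Python. The sample records, the date format, and the function names `term_counts_by_year` and `new_terms` are invented for illustration; only the field names (Title, Abstract, PubDate) come from the slide. Stopword removal and stemming are left out to keep the sketch short.

```python
from collections import Counter, defaultdict
import re

# Hypothetical records; field names follow the slide.
DOCS = [
    {"Title": "Vaccine safety", "Abstract": "Vaccine adverse effects review",
     "PubDate": "2009-03-01"},
    {"Title": "Ontology design", "Abstract": "Taxonomy and ontology methods",
     "PubDate": "2010-06-15"},
    {"Title": "Taxonomy trends", "Abstract": "Taxonomy development with text mining",
     "PubDate": "2010-09-30"},
]

def term_counts_by_year(docs, fields=("Title", "Abstract")):
    """Count term occurrences per publication year across the given fields."""
    counts = defaultdict(Counter)
    for doc in docs:
        year = int(doc["PubDate"][:4])
        for field in fields:
            counts[year].update(re.findall(r"[a-z]+", doc[field].lower()))
    return counts

def new_terms(counts, year):
    """Terms seen in `year` but in no earlier year: candidate new terms."""
    earlier = set()
    for other_year, counter in list(counts.items()):
        if other_year < year:
            earlier.update(counter)
    return sorted(set(counts[year]) - earlier)

if __name__ == "__main__":
    counts = term_counts_by_year(DOCS)
    print(counts[2010].most_common(3))
    print(new_terms(counts, 2010))
```

Terms that first appear in a later date range are the kind of candidates an editor might review when updating an old taxonomy.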

23

Case Study – Taxonomy Development

28

Conclusion

Text Analytics impact is huge – solve information overload
Enterprise Search and Search Based Applications: Save millions and enhance productivity
Combination of Text Analytics & Text Mining – unlimited range of applications
Mutual Enrichment – more data, add structure to unstructured
Add Ontology = Richer Text Analytics – smarter, more useful
Text Analytics + Text Mining + Semantic Web
– Move from theory to new practical applications

The best is yet to come!

29

Questions?

Tom Reamy
[email protected]

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

