Text Analytics And Text Mining Best of Text and Data Tom Reamy Chief Knowledge Architect KAPS Group...

Date post: 24-Dec-2015
Text Analytics And Text Mining Best of Text and Data Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com
Transcript
Text Analytics And Text Mining
Best of Text and Data

Tom Reamy
Chief Knowledge Architect

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

Agenda

Text Analytics Capabilities
Text Analytics Applications
Text Mining and Text Analytics
– Data and Unstructured Content
Case Study – Text Mining for Taxonomy Development
Conclusion

KAPS Group: General

Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 8-10
Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
Consulting, Strategy, Knowledge architecture audit
Services:
– Text Analytics evaluation, development, consulting, customization
– Knowledge Representation – taxonomy, ontology, Prototype
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories

Introduction to Text Analytics
Text Analytics Features

Noun Phrase Extraction
– Catalogs with variants, rule based dynamic
– Multiple types, custom classes – entities, concepts, events
– Feeds facets
Summarization
– Customizable rules, map to different content
Fact Extraction
– Relationships of entities – people-organizations-activities
– Ontologies – triples, RDF, etc.
Sentiment Analysis
– Statistical, rules – full categorization set of operators
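
The "catalogs with variants" idea behind entity extraction can be sketched in a few lines of Python. This is a minimal illustration and not any vendor's implementation: the catalog contents, the entity class label, and the function names `build_patterns` and `extract_entities` are all assumptions made for the example. It maps each variant back to a canonical name and counts mentions, which is the kind of output that could feed a facet.

```python
import re
from collections import defaultdict

# Hypothetical catalog: canonical entity -> (entity class, known variants).
CATALOG = {
    "International Business Machines": ("organization", ["IBM", "I.B.M."]),
    "Ford Motor Company": ("organization", ["Ford", "Ford Motor"]),
}

def build_patterns(catalog):
    """Compile one regex per canonical entity covering all of its variants."""
    patterns = []
    for canonical, (entity_class, variants) in catalog.items():
        # Longest names first so "Ford Motor" wins over "Ford".
        names = sorted([canonical, *variants], key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in names)
        # Lookarounds keep matches from starting or ending inside a word.
        regex = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
        patterns.append((canonical, entity_class, regex))
    return patterns

def extract_entities(text, patterns):
    """Return {(canonical name, entity class): mention count} for the text."""
    found = defaultdict(int)
    for canonical, entity_class, regex in patterns:
        for _ in regex.finditer(text):
            found[(canonical, entity_class)] += 1
    return dict(found)
```

A real catalog would also need the rule-based, dynamic matching the slide mentions (for example, recognizing unseen names by pattern); this sketch covers only the fixed-variant case.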

Introduction to Text Analytics
Text Analytics Features

Auto-categorization
– Training sets – Bayesian, Vector space
– Terms – literal strings, stemming, dictionary of related terms
– Rules – simple – position in text (Title, body, url)
– Semantic Network – Predefined relationships, sets of rules
– Boolean – Full search syntax – AND, OR, NOT
– Advanced – NEAR (#), PARAGRAPH, SENTENCE
This is the most difficult to develop
Build on a Taxonomy
Combine with Extraction, Sentiment
Foundation for best text analytics & combination
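
To make the Boolean and NEAR operators concrete, here is a small Python sketch of a single categorization rule. It is an illustration under stated assumptions: the rule itself, the category name, and the reading of NEAR as "within N tokens in either direction" are hypothetical, and commercial tools define these operators in their own ways.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(tokens, term_a, term_b, distance):
    """True if term_a and term_b occur within `distance` tokens of each other."""
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return any(abs(a - b) <= distance for a in positions_a for b in positions_b)

def matches_vaccine_safety(text):
    """Hypothetical rule: (vaccine NEAR/5 reaction) AND NOT veterinary."""
    tokens = tokenize(text)
    return near(tokens, "vaccine", "reaction", 5) and "veterinary" not in tokens
```

Rules like this are usually attached to a taxonomy node, so a document that satisfies the rule is assigned to that node's category.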

Varieties of Taxonomy / Text Analytics Software

Taxonomy Management
– Synaptica, SchemaLogic
Full Platform
– SAS-Teragram, SAP-Inxight, Smart Logic, Data Harmony, Concept Searching, Expert System, IBM, GATE
Content Management – embedded
Embedded – Search
– FAST, Autonomy, Endeca, Exalead, etc.
Specialty
– Sentiment Analysis, VOC – Lexalytics, Attensity / Reports
– Ontology – extraction, plus ontology

Text Analytics Applications
Platform for Multiple Applications

Content Aggregation, Duplicate Documents – save millions!
Business intelligence, Customer Intelligence
Social Media - sentiment analysis, Voice of the Customer
Social – Hybrid folksonomy / taxonomy / auto-metadata
Social – expertise, categorize tweets and blogs, reputation
Ontology – travel assistant, semantic web, etc.
eDiscovery, Reputation management, Customer Experience
Expertise Location, Crowd sourcing
Technical support

Text Analytics Applications: Enterprise Search - Elements

Text Analytics can “solve” enterprise search
Multiple Knowledge Structures
– Facet – orthogonal dimension of metadata
– Taxonomy - Subject matter / aboutness
Software - Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
People – tagging, evaluating tags, fine tune rules and taxonomy
Rich Search Results – context and conversation
Platform for search based applications

Text Analytics and Text Mining
Data and Unstructured Content

80% of content is unstructured – adding to semantic web is major
Text Analytics – content into data
– Big Data meets Big Content
Real integration of text and ontology
– Beyond “hasDescription”
– Improve accuracy of extracted entities, facts – disambiguation
• Pipeline – oil & gas OR research / Ford
– Add Concepts, not just “Things” – 68% want this
Semantic Web + Text Analytics = real world value
Linked Data + Text Analytics – best of both worlds
Build superior foundation elements – taxonomies, categorization
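
The "pipeline" disambiguation example can be sketched as a simple vote over context terms. This is only an illustration: the two sense labels, the context vocabularies, and the function name `disambiguate` are assumptions, and a production system would typically rely on categorization rules or trained models instead of a bare word overlap.

```python
import re

# Hypothetical context vocabularies for two senses of "pipeline".
SENSE_CONTEXT = {
    "oil_and_gas": {"oil", "gas", "crude", "drilling", "barrels", "refinery"},
    "research": {"drug", "clinical", "trial", "compound", "candidates", "phase"},
}

def disambiguate(text, sense_context=SENSE_CONTEXT):
    """Pick the sense whose context vocabulary overlaps the text the most.

    Returns None when no context term is present, so the caller can leave
    the mention unresolved instead of guessing.
    """
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(tokens & vocab) for sense, vocab in sense_context.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Returning None for the no-evidence case reflects the slide's point about accuracy: an unresolved entity is often safer than a wrongly resolved one.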

Text Analytics and Text Mining and Data Mining
Vaccine Adverse Reaction

Combine with Data Mining
New sources of information
– News stories, medical records
– Blogs, social
Find new connections, sources of knowledge
Vaccine Adverse Effects – disease, symptoms, variables
– Unstructured text into a data source
– Some preliminary analysis, content structure
– Find unknown adverse effects and prevalence
Drug Discovery + search / research – 5 year story

Text Analytics Applications
Example – Vaccine Adverse Effects

Text Analytics and Text Mining
Case Study – Taxonomy Development

Problem – 200,000 new uncategorized documents
Old taxonomy – need one that reflects change in corpus
Text mining, entity extraction, categorization
Bottom Up – terms in documents – frequency, date
Clustering – suggested categories
Clustering – chunking for editors
Time savings – only feasible way to scan documents
Quality – important terms, co-occurring terms
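
The bottom-up step (frequent terms and co-occurring terms as raw material for new taxonomy nodes) can be sketched with the Python standard library. This is a minimal illustration and not the tooling used in the case study: the stopword list and the function names are assumptions, and clustering is left out entirely.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on", "with"}

def terms(doc):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", doc.lower()) if t not in STOPWORDS]

def document_frequency(docs):
    """Count how many documents each term appears in."""
    df = Counter()
    for doc in docs:
        df.update(set(terms(doc)))
    return df

def cooccurrence(docs):
    """Count how many documents each alphabetically ordered term pair shares."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(terms(doc))), 2))
    return pairs
```

Calling `document_frequency(docs).most_common(50)` and `cooccurrence(docs).most_common(50)` gives editors a ranked list of candidate terms and term pairs to review, which is far quicker than reading the documents themselves.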

Text Analytics and Text Mining
Case Study – Taxonomy Development

Text into Data: Article, Abstract, Title, Subtitle – fields & source of terms
Add Data: PubDate, journalTitle, Taxonomy Node
Terms – Map to frequency, date, date ranges, Taxonomy Node
– New Terms, Trends
Relevance – frequency, Abstract, Title, human judgment
Entity Extraction – Authors, Organizations, Products
Categorization – build on clusters & taxonomy
Combination – reports, visualizations, interactive explorations
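
Treating each article as a record makes the "new terms, trends" analysis straightforward. The sketch below is illustrative only: the `Article` class and its field names loosely mirror the fields listed above (Title, Abstract, PubDate, journalTitle), while the helper functions and the choice of yearly buckets are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass
class Article:
    """One document turned into a data record (hypothetical field names)."""
    title: str
    abstract: str
    pub_date: date
    journal_title: str

def terms_by_year(articles):
    """Map publication year -> Counter of terms from title and abstract."""
    counts = defaultdict(Counter)
    for art in articles:
        words = re.findall(r"[a-z]+", f"{art.title} {art.abstract}".lower())
        counts[art.pub_date.year].update(words)
    return counts

def new_terms(counts, year):
    """Terms seen in `year` that never appeared in any earlier year."""
    earlier = set()
    for other_year, counter in counts.items():
        if other_year < year:
            earlier.update(counter)
    return set(counts.get(year, Counter())) - earlier
```

The same per-year counters could be joined against taxonomy nodes or journal titles to show where new vocabulary is entering the corpus.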

Conclusion

Text Analytics impact is huge – solve information overload
Enterprise Search and Search Based Applications: Save millions and enhance productivity
Combination of Text Analytics & Text Mining – unlimited range of applications
Mutual Enrichment – more data, add structure to unstructured
Add Ontology = Richer Text Analytics – smarter, more useful
Text Analytics + Text Mining + Semantic Web
– Move from theory to new practical applications

The best is yet to come!

Questions?

Tom Reamy

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

