
Introduction to Text and Web Mining. I. Text Mining is part of our lives.

Date post: 12-Jan-2016
Upload: todd-morgan
Introduction to Text and Web Mining
Transcript
Page 1: Introduction to Text and Web Mining. I. Text Mining is part of our lives.

Introduction to Text and Web Mining


I. Text Mining is part of our lives


Google trends


Google correlate


Social Metrics Insight


Related words on “bigdata”


Sentiment analysis on “bigdata”


summly


summly

In March 2011, D’Aloisio created Trimit, an app that summarizes e-mails, blog posts, and more into 1,000-, 500-, or 140-character summaries that can be shared via SMS, email, Facebook, or Twitter in .txt form in just a few clicks or shakes of your iPhone. In July of the same year, Apple named Trimit a noteworthy app on the App Store.


II. What is text mining?


Text Mining

• Text mining

The application of data mining to unstructured or less structured text files. It entails generating meaningful numerical indices from the unstructured text and then processing these indices with various data mining algorithms.
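
One simple kind of "numerical index" is a document-term count matrix. The sketch below is only an illustration under stated assumptions: the function names `tokenize` and `document_term_matrix` and the two sample documents are hypothetical, not taken from the slides.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

# Hypothetical sample documents.
docs = ["Big data needs text mining", "Text mining turns text into data"]
vocab, dtm = document_term_matrix(docs)
for row in dtm:
    print(row)
```

Each row is now a numeric vector that a standard data mining algorithm (clustering, classification) could take as input.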


Text Mining

• Text mining helps organizations:

– Find the “hidden” content of documents, including additional useful relationships

– Relate documents across previously unnoticed divisions

– Group documents by common themes


Text Mining

• Applications of text mining

– Automatic detection of e-mail spam or phishing through analysis of the document content

– Automatic processing of messages or e-mails to route a message to the most appropriate party to process that message

– Analysis of warranty claims, help desk calls/reports, and so on to identify the most common problems and relevant responses


Text Mining

• Applications of text mining

– Analysis of related scientific publications in journals to create an automated summary view of a particular discipline

– Creation of a “relationship view” of a document collection

– Qualitative analysis of documents to detect deception


Text Mining

• How to mine text

1. Eliminate commonly used words (stop-words)

2. Replace words with their stems or roots (stemming algorithms)

3. Consider synonyms and phrases

4. Calculate the weights of the remaining terms
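
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the slides: the stop-word list is a tiny made-up sample, `crude_stem` is a hypothetical suffix stripper standing in for a real stemming algorithm, step 3 (synonyms and phrases) is omitted, and TF-IDF is assumed as the weighting scheme because the slides do not name one.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "are", "in", "on"}

def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Steps 1 and 2: drop stop-words, then reduce each word to a crude stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def tf_idf(documents):
    """Step 4: return one {term: weight} dict per document, tf * log(N / df)."""
    token_lists = [preprocess(d) for d in documents]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    weights = []
    for tokens in token_lists:
        tf = Counter(tokens)
        weights.append(
            {t: (tf[t] / len(tokens)) * math.log(n_docs / df[t]) for t in tf}
        )
    return weights

# Hypothetical two-document collection.
weights = tf_idf(["text mining", "web mining"])
print(weights)
```

A term that occurs in every document gets weight 0 under this scheme, which is why the shared stem of "mining" contributes nothing to telling the two documents apart.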


Web Mining

• Web mining

The discovery and analysis of interesting and useful information from the Web, about the Web, and usually through Web-based tools


Data Mining Project Processes


Web Mining

• Web content mining

The extraction of useful information from Web pages

• Web structure mining

The development of useful information from the links included in the Web documents

• Web usage mining

The extraction of useful information from the data generated through webpage visits, transactions, etc.
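
The three kinds of Web mining can be illustrated with the Python standard library alone. The sketch below is a toy under stated assumptions: the `PageParser` class, the two HTML pages, and the visit log are all hypothetical. It pulls text nodes from a page (content), counts incoming links (structure), and counts page views (usage).

```python
from collections import Counter
from html.parser import HTMLParser

class PageParser(HTMLParser):
    """Collect text nodes (content) and outgoing links (structure) from HTML."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())

def parse_page(html):
    """Return (text, links) for one HTML string."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links

# Hypothetical site: URL -> HTML source.
pages = {
    "/home": '<p>Welcome</p><a href="/products">Products</a><a href="/about">About</a>',
    "/about": '<p>About us</p><a href="/products">Products</a>',
}

# Web structure mining: how many links point at each page.
in_links = Counter(link for html in pages.values() for link in parse_page(html)[1])

# Web usage mining: page views from a hypothetical (visitor, page) log.
visits = [("u1", "/home"), ("u1", "/products"), ("u2", "/home")]
page_views = Counter(page for _, page in visits)

print(parse_page(pages["/home"])[0])
print(in_links.most_common())
print(page_views.most_common())
```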


Web Mining

• Uses for Web mining:

– Determine the lifetime value of clients

– Design cross-marketing strategies across products

– Evaluate promotional campaigns

– Target electronic ads and coupons at user groups

– Predict user behavior

– Present dynamic information to users


Sentiment analysis (Opinion Mining)

• Sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic, or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation, affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication.


A basic task in sentiment analysis is classifying the polarity (+, -) of a given text at the document, sentence, or feature/aspect level: whether the opinion expressed in a document, a sentence, or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as "angry," "sad," and "happy."
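
A very rough way to see polarity classification in action is a word-list approach. The sketch below assumes a tiny hand-made lexicon (`POSITIVE`, `NEGATIVE`) and hypothetical function names; it ignores negation, sarcasm, and context, so it should be read as an illustration of the task rather than a usable classifier.

```python
import re

# Tiny hypothetical sentiment lexicon; real lexicons contain thousands of entries.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def polarity(text):
    """Classify text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def sentence_polarities(document):
    """Sentence-level polarity: split on ., !, ? and classify each sentence."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    return [(s, polarity(s)) for s in sentences]

print(polarity("I love this great phone"))
print(sentence_polarities("Great screen. Poor battery!"))
```

Calling `polarity` on a whole review gives the document-level label, while `sentence_polarities` shows how the same review can mix positive and negative sentences.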


Applications

Detecting the polarity of product reviews and movie reviews.

Extending the classification of a movie review as either positive or negative to predicting star ratings on a 3- or 4-star scale.

Analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale).


Social network analysis

Social network analysis (SNA) is the use of network theory to analyse social networks. Social network analysis views social relationships in terms of network theory, consisting of nodes, representing individual actors within the network, and ties which represent relationships between the individuals, such as friendship, kinship, organizations and sexual relationships. These networks are often depicted in a social network diagram, where nodes are represented as points and ties are represented as lines. (NodeXL)
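
The node-and-tie view can be made concrete with a small adjacency map. The sketch below is an assumption-laden example: the actors and ties are invented, and degree centrality is just one common measure chosen for illustration; the slides do not say which measures they have in mind.

```python
from collections import defaultdict

def build_network(ties):
    """Build an undirected adjacency map from (actor, actor) ties."""
    neighbors = defaultdict(set)
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def degree_centrality(neighbors):
    """Fraction of the other nodes that each node is directly tied to."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical friendship ties between four actors.
ties = [("Ann", "Bob"), ("Ann", "Cho"), ("Ann", "Dev"), ("Bob", "Cho")]
network = build_network(ties)
centrality = degree_centrality(network)
print(max(centrality, key=centrality.get))
```

In this toy network Ann is tied to every other actor, so she has the highest degree centrality.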


Human SNS Graph


III. Text Mining Cases

• Cases on text mining


IV. Text Mining Techniques

• R

• Python

• Open API

