
Knowledge discovery in social media mining for market analysis

Date post: 18-Feb-2017
Upload: senuri-wijenayake
Knowledge Discovery in Social Media Mining for Market Analysis By: Senuri Wijenayake
Transcript
Page 1: Knowledge discovery in social media mining for market analysis

Knowledge Discovery in Social Media Mining for Market Analysis

By: Senuri Wijenayake


Introduction

Problem Addressed: Three research areas in Social Media Mining: Predictive Power, Community Detection, and Influence Propagation

Focus: Analyze the existing literature and find applications of knowledge discovery in social media for market analysis


Background

Fact 1: Facebook had over 1.55 billion active users as of November 2015 (Statistics Portal, November 2015)

Fact 2: All adults spend at least 2 hours a day on some form of social media network


Focus of Research

A rich source of data with human sentiment and behavior

Developed online relationships and groups

Online interactions where people voice their ideas

Understand customer satisfaction and changing customer requirements

Focused marketing campaigns for better results

Influencing consumer behavior effectively via influential users


Using Social Media to make Predictions

Progress So Far:

Human Intuition: can’t be duplicated

Data Based Models: inadequate data to represent the human cognitive process

SOLUTION: Use data available on social media for predictive analysis.


Using Social Media to make Predictions

Progress So Far:

Yahoo Finance Message Board: stock market variability (Antweiler & Frank 2004)

Google Search Queries: track disease outbreaks (Ginsberg et al. 2009)

Amazon Reviews: predicting product sales (Ghose & Ipeirotis 2011)


General Framework for SMM for Predictions

Stage 1: Preprocessing
Social media data are unstructured
Convert them into high quality structured data, suitable for data mining
Quality (Strong et al. 1997): Objectivity, Completeness, Sufficiency

Stage 2: Predictive Analysis
Develop a model to make accurate predictions on a new set of data (Harold 2013)
Methodologies: Market Models, Survey Models, Statistical Models


Data Preprocessing

Data Cleaning
  Problem: Missing values, noise, outliers
  Solution: Substitution, regression

Data Integration
  Problem: Entity identification, redundancy
  Solution: Schema based entity identification, duplicate detection

Data Transformation
  Problem: Data can’t be used straight away for mining
  Solution: Generalize, attribute construction

Data Reduction
  Problem: Large amounts of data require significant processing power
  Solution: Data cube aggregation, attribute selection
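
Two of the steps above, duplicate detection and substitution of missing values, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the cited studies: the record fields (`user`, `text`, `rating`), the choice of mean substitution, and the exact-match duplicate key are all hypothetical.

```python
from statistics import mean


def clean_records(records):
    """Drop duplicate posts, then fill missing ratings with the mean rating."""
    seen = set()
    unique = []
    for record in records:
        # Duplicate detection (assumed rule): same user and same text after
        # lowercasing and collapsing whitespace.
        key = (record["user"].strip().lower(),
               " ".join(record["text"].lower().split()))
        if key in seen:
            continue
        seen.add(key)
        unique.append({"user": key[0], "text": key[1],
                       "rating": record["rating"]})

    # Substitution (assumed rule): replace a missing rating with the mean of
    # the ratings that were observed.
    observed = [r["rating"] for r in unique if r["rating"] is not None]
    fill = mean(observed) if observed else None
    for r in unique:
        if r["rating"] is None:
            r["rating"] = fill
    return unique


# Hypothetical input records, invented for this example.
posts = [
    {"user": "Ann", "text": "Great  movie!", "rating": 4.0},
    {"user": "ann", "text": "great movie!", "rating": 4.0},  # duplicate
    {"user": "Bob", "text": "Too long", "rating": None},     # missing value
    {"user": "Cy", "text": "Loved it", "rating": 5.0},
]
cleaned_posts = clean_records(posts)
```

Duplicates are removed before the mean is computed so that a repeated post does not bias the substituted value.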


Application of Predictions in Market Analysis

Objective: How can the available knowledge be used to make predictions for market analysis, and how successful is it?

Microblogging (Twitter) is most popular

Focus: Twitter data for predicting box office performance of movies


Application of Predictions in Market Analysis

Literature:

Asur & Huberman (2010) used correlation and regression based models on Twitter data

Leskovec (2011) rectified imperfections which could arise due to incomplete data

Vasu Jain (2013) used sentiment analysis for predictions

Gaikar & Marakarkandy (2015) introduced a framework for using Twitter data for sentiment analysis and making predictions


Application of Predictions in Market Analysis

Gaikar & Marakarkandy (2015)

Predict box office performance of a Bollywood movie as a hit, a flop, or average

Predict the opening weekend revenue collection


Twitter for Predictions: Methodology

Module 1: Data Extraction

The most trending hashtag on Twitter and related hashtags are extracted (HashTags.org)

Twitter4j API used to connect and extract tweets from Twitter servers

Stored in a MySQL database

Movie star ratings taken from Times Celebex

A complete set of the most relevant data has been extracted


Twitter for Predictions: Methodology

Module 2: Sentiment Analysis


Twitter for Predictions: Methodology

Module 3: Predictive Analysis

Predicting movie performance

Input: Sentiment score + Movie Star Rating

Process: Fuzzy Inference based model is created

Output: Box office movie performance as Hit, Flop or Average
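
The slide does not give the membership functions or rule base of the fuzzy inference model, so the following is only a sketch of how such a model could map the two inputs to the three outcomes. Everything specific here is an assumption: both inputs are taken to be normalized to [0, 1], the triangular membership functions and the nine rules are hypothetical, and the outcome is chosen by strongest rule activation rather than by a full defuzzification step.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzify(x):
    """Degrees to which a value in [0, 1] is low, medium, or high (assumed shapes)."""
    return {
        "low": tri(x, -0.5, 0.0, 0.5),
        "medium": tri(x, 0.0, 0.5, 1.0),
        "high": tri(x, 0.5, 1.0, 1.5),
    }


# Hypothetical rule base: (sentiment level, star rating level) -> outcome.
RULES = [
    ("high", "high", "Hit"), ("high", "medium", "Hit"), ("medium", "high", "Hit"),
    ("medium", "medium", "Average"), ("high", "low", "Average"),
    ("low", "high", "Average"),
    ("low", "low", "Flop"), ("low", "medium", "Flop"), ("medium", "low", "Flop"),
]


def predict(sentiment, star_rating):
    """Return the outcome whose rules fire most strongly."""
    s, r = fuzzify(sentiment), fuzzify(star_rating)
    strength = {"Hit": 0.0, "Average": 0.0, "Flop": 0.0}
    for s_level, r_level, outcome in RULES:
        fire = min(s[s_level], r[r_level])                # fuzzy AND as minimum
        strength[outcome] = max(strength[outcome], fire)  # fuzzy OR as maximum
    return max(strength, key=strength.get)
```

For example, `predict(0.9, 0.8)` returns `"Hit"` and `predict(0.1, 0.2)` returns `"Flop"` under these assumed rules.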


Twitter for Predictions: Methodology

Module 3: Predictive Analysis

Predicting weekend collection

Input: Hype factor, Shows per day on all screens, average full house collection

Process:

Output: Estimated opening weekend collection


Twitter for Predictions: Findings & Evaluation

10,269 tweets for 14 movies released over a period of six months (relevant, complete, sufficient) were considered

Actor ratings in the month of release were considered

Predictions compared against the real ratings extracted from IMDB (near perfect predictions)

Mean Square Error used to evaluate the effectiveness of the predictive model (<7% error rate)
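
For reference, the mean square error is the average of the squared differences between actual and predicted values. A minimal sketch follows; the sample numbers are invented for illustration and are not the study's data.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only (hypothetical ratings).
actual_ratings = [7.0, 5.5, 8.0]
predicted_ratings = [6.5, 6.0, 8.0]
mse = mean_squared_error(actual_ratings, predicted_ratings)
```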


Twitter for Predictions: Findings & Evaluation


Twitter for Predictions: Applications

If the predicted revenue < budgeted revenue, increase marketing and publicity efforts

Can determine the maximum allowable promotional budget

Limitations:

Only two predictor variables used to predict box office performance (sentiment score + actor rating)

Use more variables


Using Social Media for Community Detection

60% of the American population chose social media as their first choice for information seeking (Scot et al. 2014)

Social relationships transferred to the internet

Online communities based on similar interests and opinions have been created

Opinion based community detection can be used to identify such online communities


Using Social Media for Community Detection

Literature:

Park & Cho (2012) identified online communities as an information source for apparel shopping

Dev (2014) proposed an algorithm for community detection in social media based on different interaction methods (no opinion mining)

Kavoura (2014) identified the impact of online communities for communication

Dinsoreanu & Potolea introduced a framework for opinion based community detection in social media


Community Detection: Methodology

Data Preparation:

Extracted user comments from blog posts and forums

A classification model for opinion mining was created from a set of labelled documents and 5 grammar rules introduced by Turney (2002)

Extracted tokens (after filtering) are classified into positive and negative opinions using SVM and Naive Bayes (NB). A sentiment score is assigned to each token.

Tokens are stored in a structured format (includes the id, holder, opinion keyword, polarity score etc.)
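
To make the NB classification step concrete, here is a small multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. It is a sketch rather than the classifier from the cited framework: the whitespace tokenizer, the training sentences, and the class names are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled documents, invented for this example.
train_docs = [
    "great product love it",
    "excellent quality very happy",
    "terrible service very bad",
    "poor quality hate it",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
```

With this toy training set, `model.predict("love the excellent quality")` returns `"positive"` and `model.predict("terrible bad service")` returns `"negative"`.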


Community Detection: Methodology

Opinion based Community Detection:

Identifying communities based on similar interests in multiple targets

Aggregate functions to represent the similarity of opinions in multiple targets

Similarity graphs based on Euclidean distance were drawn
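
A similarity graph of this kind can be sketched as follows. Each user is represented by a vector of polarity scores, one per target, and two users are linked when the Euclidean distance between their vectors is small. The distance threshold, the sample scores, and the use of connected components as the communities are assumptions for illustration; the slides do not state which grouping procedure was applied to the graph.

```python
import math
from itertools import combinations


def similarity_graph(opinions, max_distance):
    """Link users whose opinion vectors lie within max_distance of each other."""
    graph = {user: set() for user in opinions}
    for u, v in combinations(opinions, 2):
        if math.dist(opinions[u], opinions[v]) <= max_distance:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def communities(graph):
    """Return connected components of the graph as a list of user sets."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical polarity scores for two targets per user.
opinions = {
    "u1": (0.9, -0.8),
    "u2": (0.8, -0.7),
    "u3": (-0.9, 0.6),
    "u4": (-0.8, 0.7),
}
groups = communities(similarity_graph(opinions, max_distance=0.5))
```

Here `u1` and `u2` end up in one community and `u3` and `u4` in another, because each pair holds similar opinions on both targets.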


Community Detection: Methodology

Opinion based Community Detection

Similarity Functions:


Community Detection: Findings & Evaluation

1000 labelled documents used as the training set for NB and SVM

Near perfect classification of opinions can be obtained

A user generated data set was used to apply community detection algorithms

Findings:

Linear functions perform poorly when the number of targets increases

Exponential functions with cutoff perform best with increasing opinions


Community Detection: Limitations

A practical application of community detection was not conducted

Suggestion: The proposed framework can be applied in the pharmaceutical industry for online community detection

Background Literature: “CyberRx” by Radar & Subhan (2013)


Community Detection: Potential Application

Data Collection
  CyberRx: Forums and blogs using Google Alerts
  New Approach: Additional sources such as bulletin boards

Keywords Used
  CyberRx: Formal names and language
  New Approach: More popular brand names and consumer driven language

Opinion Mining
  CyberRx: Manual
  New Approach: Automated (SVM classifier)

Community Detection
  CyberRx: Manual
  New Approach: Aggregated functions and similarity graphs

Findings
  CyberRx: Two main communities (side effects and medications; changing medication)
  New Approach: More specific communities can be identified


Community Detection: Potential Application

Knowledge such as:

Most prevalent diseases classified by geography and demography

Most popularly used brands of drugs

Competing alternatives for a given drug

Information on specifications, variations, duration and personal experience of side effects (both normal and abnormal)


Using Social Media for Influence Propagation

People influence each other via online interactions and communications

Purchase decisions are heavily influenced by electronic word of mouth (eWoM) in social media networks

34% of Twitter users post product related opinions at least once a week (ROI Research Institute)

Objective: Target the most influential users on social media to activate a chain of influence driven by eWoM


Using Social Media for Influence Propagation

Literature:

Khobzi (2014) conducted a basic content based analysis on Facebook posts, to identify the connection between the sentiment and the popularity of the post

Kaiser et al. (2012) analyzed opinion formation and influential users based on data collected on iPhone reviews

Okazaki et al. (2014) explored the different types of customer engagement in social media networks and their impact on influence propagation


Influence Propagation: Methodology

Focus group: IKEA customers

Training set included 300 preprocessed tweets

Classified manually based on customer emotional status and content

Emotional Status: Satisfied, Dissatisfied, Neutral

Content: Information, Sharing, Opinion, Question, Reply

Trained NB, KNN, and SVM classifiers

NB performed best


Influence Propagation: Application

New data set: 4000 tweets

Users were seen as nodes and tweets as their relationships

Google’s PageRank algorithm to determine the relative importance of each user

Findings:

One satisfied user sharing information (positive eWoM)

Three dissatisfied users spreading negative opinions
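
The ranking step can be illustrated with a compact power-iteration version of PageRank. This is a sketch under stated assumptions rather than the study's implementation: a directed edge is assumed to run from the user who tweets to the user who is mentioned or retweeted, the damping factor of 0.85 is the conventional default, and the user names and edges are invented.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Rank nodes of a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical edges: (tweeting user, mentioned or retweeted user).
edges = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "alice"),
    ("alice", "bob"),
]
ranks = pagerank(edges)
top_user = max(ranks, key=ranks.get)
```

In this toy graph `alice` receives the highest rank because three other users point to her; the ranks sum to 1, so they can be read as relative importance.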


Influence Propagation: Suggestions

Conclusion:

Influential users can be identified

Different customer satisfaction levels are crucial

Suggestions:

Use celebrities and convert their followers into influence makers

Additional incentives could be provided to encourage engagement in discussions

Closely monitor for dissatisfied customers online and occasionally mediate in retweets, suggesting feasible solutions and demonstrating commitment


Knowledge Discovery in SMM: Conclusions

Consolidates the potential knowledge areas that could be exploited for market analysis via the predictive power of, community detection in, and influence propagation in social media.

Properly preprocessed social media data of acceptable quality, when applied to robust statistical models, could predict future market trends with considerable accuracy.

Social media have taken social relationships to the digital platform and have created opinion based communities online. These can be used to identify genuine consumer requirements.


Knowledge Discovery in SMM: Conclusions

People express their genuine consumer experiences on social media networks, which clearly influence the purchasing decisions of other potential consumers.

An efficient framework can identify influential users online and trigger a chain of positive eWoM promoting viral marketing.


Questions & Answers

