
http://www.iaeme.com/IJMET/index.asp 274 [email protected]

International Journal of Mechanical Engineering and Technology (IJMET)

Volume 8, Issue 12, December 2017, pp. 274–289, Article ID: IJMET_08_12_027

Available online at http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12

ISSN Print: 0976-6340 and ISSN Online: 0976-6359

© IAEME Publication Scopus Indexed

OPINION MINING ANALYSIS IN BANKING

SYSTEM USING ROUGH FEATURE SELECTION

TECHNIQUE FROM SOCIAL MEDIA TEXT

N.Sumathi

Research Scholar,(Part-time Ph.D Category-B), Department of Computer Science,

Research & Development Centre, Bharathiar University, Coimbatore, India

Dr.T.Sheela

Research Supervisor, Dean Networking, HOD/IT,

Sri Sai Ram Engineering College, Chennai, India

ABSTRACT

Information technology and social media have become so embedded in daily life that it is hard to imagine life without them. Social media is one of the strongest channels through which people share feelings, thoughts and opinions about various sectors. Part of this shared data is opinionated and can be analyzed to support decision making in organizations. With roughly three billion people using social media worldwide, banks ought to consider seriously how to improve customer service and manage banking risk by organizing the information gathered from social media. Banks have used social media awareness to build pricing models for loans and other banking products, and they can combine traditional scoring models with information available in the public social domain, such as Facebook, Twitter and other social networking sites. Sentiment analysis is the process of extracting subjective information from various types of data, and opinion mining is one such technique; here it is used to recognize subjective opinions about banks. It helps where manual analysis breaks down. This research draws on a growing body of text from daily social media discussions; because the text sources and topics are increasingly complex, the typical sentiment feature extraction approaches need to be re-examined. This paper proposes a sentiment index whose features are extracted with the rough set method. It helps to build an information system for analyzing bank loan prices and risk through social media and for creating strategies to improve wherever the bank falls short.

Keywords: Sentiment analysis, Rough set, Feature selection, classification, Bank

dataset, data preprocessing


Cite this Article: N.Sumathi and Dr.T.Sheela, Opinion Mining Analysis in Banking

System Using Rough Feature Selection Technique from Social Media Text,

International Journal of Mechanical Engineering and Technology 8(12), 2017, pp.

274–289.

http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12

1. INTRODUCTION

Opinion mining, also called sentiment analysis, is a fast-growing area of computational linguistics that explores human opinions. It is a method of analyzing what people think about products and services, and it belongs to natural language processing. Retrieving opinions from various media has become much easier because people share their views on numerous topics across social networks such as Facebook and Twitter. Opinion mining studies the opinions and reviews that people post about products on social media. For example, in banking, bankers may want to know how satisfied customers are with loan prices, credit card prices and the interest on a particular scheme.

In recent years a growing number of individuals have been connected through the internet, so the flow of information keeps increasing. People all over the world post huge volumes of opinionated information in various formats to various media services on the WWW. Many organizations have an immense interest in what customers express and think about their products. Automating this process helps them understand customers' thoughts and satisfaction, which motivates this research in sentiment analysis.

Users increasingly rely on banks to use social media to deliver faster and more effective customer service and financial advice. By analyzing the large amount of data available on social media such as Twitter, banks can extract key features that enable them to improve products and services, marketing, customer service, business performance and risk management. Since social media is all about customer experience and feelings, banks need to build their social media strategies around the customer in order to drive loyalty, revenue and profitability.

Social media increases competition in the market, so banks need to be aware of customer opinion on social media to know their brand's position and reputation. People's opinions are also an essential input to marketing products. When sentiment is negative, a bank should be able to act quickly, but for that to happen opinion detection techniques are needed. One way to achieve this is to build a sentiment analysis system for the bank.

Social media can be embedded throughout a bank's network because it influences several areas, such as CRM (customer relationship management), product and service design, and risk management. Banks can review their services and product processes once they collect feedback from their customers, and make changes based on the suggestions found in social data. Furthermore, banks will need to establish key performance indicators to measure their success as they progress in their social media journey. Figure 1 shows the social media key performance indicators for a bank.

In this paper, we present a new method of opinion mining for analyzing opinions about risk in textual disclosures by banks. We find an appropriate data set on the web, extract the appropriate keywords for risk and loan price opinion analysis, and analyze risk and loan price sentiment year by year. The resulting sentiment index scores quantify uncertainty, positivity and negativity. Uncertainty relates to "uncertainties resulting in adverse variations of profitability or in losses" (Bessis, 2002, p. 11). Negative opinion reflects current or future problems, and positive opinion might indicate overconfidence. We find that the sentiment index scores reflect financial events rather than other major economic crises within banks over the last decade. Additionally, we test for correlations between the quantitative risk indicators and the sentiment index scores.

Figure 1 Key Performance Indicators

The rest of the paper is organized into four more sections. Section 2 describes different techniques applied in opinion mining; the literature review includes a discussion of lexicon-based approaches. Section 3 presents the models and the proposed approach, including rough set feature selection for sentiment analysis and the feature selection methods. Section 4 describes the data set used for training and testing and presents the experiments and results achieved in this work. The conclusion and future research are discussed in the last section.

2. LITERATURE REVIEW

Extensive research has been carried out on different types of social media networks, such as dynamic networks of individuals (Twitter, etc.) and online virtual communities [2, 4, 16, 34, 41–43]. A pioneering work by Asavathiratham at MIT in 1996 [2] developed a model for the tractable representation of networked Markov chains. Concerning the network dynamics of online virtual societies and communities, [16] proposed a method for various interesting computations on a social network. Classifying a text or keyword by its sentiment polarity has recently emerged as an attractive frontier in web mining. To see how it works, one can submit a query keyword, such as the name of a new location, to an online search engine such as Google. It would be useful to know what fraction of the matches Google returns recommend the keyword as a travel destination [18]. Incorporating sentiment analysis into search engines and text retrieval mechanisms enables a more effective and functional service for network users [45]. Sentiment analysis has been used in various applications such as news tracking, online forums, chat rooms and blogging. YouTube introduced sentiment classification techniques to sort all its comments into "Poor" or "Good" [44]. As a promising research area, text sentiment analysis has been widely studied [1,3,26–28,33,35], where sentiment analysis is used for text classification tasks [8,13,14,40]. Prevailing sentiment analysis approaches divide into two types: machine learning [3,33] and semantic-oriented approaches [1,26–28,33,35]. Languages that have been studied include English [3,13,26–28], Chinese [33,35] and Arabic [1].

In 2015, Øye et al. [45] proposed sentiment analysis of Norwegian Twitter messages. The aim of the work was to carry out typical sentiment analysis on three distinct datasets obtained from Twitter: general Norwegian tweets, tweets about the Norwegian prime minister, Erna Solberg, and tweets about Rosenborg, a Norwegian football team. Øye based the analysis on the two-step approach described by Pang and Lee (2004) and reported a precision of up to 80% on polarity classification, and up to 76% when combining subjectivity detection with polarity classification.

Wei and Gulla (2010) [46] presented an approach to sentiment learning aided by sentiment ontology tree (SOT) structures. The SOT consists of various features and sub-features, with a root entity, organized hierarchically. A hierarchical learning algorithm was applied to threshold and weight vectors to analyze the problem. This paper shows how to extract features from social media, but the extraction is organized in a tree structure and lacks some identification of semantic meaning, so more efficient techniques are needed to retrieve features.

Vidya et al. (2015) [47] used Twitter data to develop a sentiment analysis system for various mobile phone providers in Indonesia. The motivation of this work was to analyze brand reputation by extracting sentiments with regard to five products: internet services, voice calls, Short Message Service (SMS), 3G and 4G. A metric, net brand reputation (NBR), was used to compute customer satisfaction per service. The NBR metric was compared to the Net Promoter Score (NPS), which is commonly used to measure customer satisfaction (Satmetrix, 2017). This paper motivated us to look for a metric that determines brand reputation. Following it, we derive several metrics for the bank and determine the reputation of its products.

According to David Hazarika [11], the social media domain is developing quickly. Banks and financial institutions that rapidly align their business processes with their social media strategies will be the quickest to respond to customer needs and to offer customers the best experience. Social media can be seen as a double-edged sword: on one side it can raise the possibility of security and privacy threats to banks and their customers, while on the other side it will most certainly generate massive value. We conclude that social media strategies have become imperative for the banking industry, and that banks will need to embark on their own social journey to protect their place in the future.


3. MODELS AND METHODOLOGY

Figure 2 Block diagram for proposed method

Our proposed system consists primarily of the following steps: data collection and cleaning, assigning sentiment index values, generating the information system using rough sets, and analyzing the sentiment. Figure 2 illustrates the proposed method. Three modules are defined to integrate opinion mining: data collection, rough set feature extraction, and attribute-based result analysis. The first module covers content acquisition and data cleaning for text sentiment analysis. The second module is the lexicon-based approach, which identifies and analyzes positive, negative and uncertain text posted on Twitter about bank products. The third module extracts features from the posts and computes the sentiment index value for each. The proposed system assigns an integer value to each feature, with the sign showing its emotional polarity and the absolute value its emotional intensity.

3.1 Data Preprocessing

Unstructured text data from Twitter is noisy, so data cleaning is required to obtain good output. Pre-processing consists of the following steps:

3.1.1. Removal of text not relevant to bank products

Empty posts, or posts with information irrelevant to the particular products, only add noise to the classification and extraction problem. Removing them is therefore a necessary step for sentiment analysis.


3.1.2. Conversion to lower case

This step eliminates inconsistency in the use of lower and upper case. All posted text is converted to lower case, which makes it easier to use lexicons for classification.

3.1.3. Removal of stopwords

Stopwords are function words that connect parts of a sentence, such as prepositions and conjunctions in English. Examples include a, an, the, into, if, and, or. Removing them reduces the size of the dataset and leaves only the text needed for the subsequent steps.

3.1.4. Special character removal

This step removes excess whitespace, punctuation characters, special symbols and numbers.

3.1.5. Stemming

Stemming is the process of deleting suffixes and prefixes from the posted data, so that after this step the dataset contains only root (stem) words. The premise is that words sharing a stem carry similar meaning in text. For example, develop, develops, developed, developing and developers have the common stem develop.
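The five preprocessing steps of Section 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stopword list, the bank term list and the suffix list are hypothetical and deliberately tiny, the stemmer is a naive suffix stripper rather than a full stemming algorithm, and the relevance filter is applied last for convenience although the text lists it first.

```python
import re

# Hypothetical, deliberately small resources (assumptions, not from the paper).
STOPWORDS = {"a", "an", "the", "into", "if", "and", "or", "is", "of", "on", "my"}
BANK_TERMS = {"loan", "bank", "interest", "credit", "card"}
SUFFIXES = ("ing", "ers", "ed", "er", "s")  # naive suffix stripping only


def stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(post):
    """Return the cleaned tokens of a post, or [] if it is off-topic."""
    text = post.lower()                                        # 3.1.2 lower case
    text = re.sub(r"[^a-z\s]", " ", text)                      # 3.1.4 symbols, digits
    tokens = [t for t in text.split() if t not in STOPWORDS]   # 3.1.3 stopwords
    tokens = [stem(t) for t in tokens]                         # 3.1.5 stemming
    if not BANK_TERMS.intersection(tokens):                    # 3.1.1 relevance
        return []
    return tokens
```

With these assumed lists, `stem` maps develop, develops, developed, developing and developers to the same stem, and a post that mentions no bank term is dropped entirely.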

3.2 Lexicon-based Approach

3.2.1 Sentiment Classifier

This work performs sentiment analysis at the document level; in other words, each text is categorized as positive, negative or uncertainty. The uncertainty group means that the text posted on Twitter by a customer is neither good nor bad about the bank product. We assume that each posted text refers to a single bank product, such as a loan or a credit card. The algorithm classifies the user text as follows:

1. Every text in the dataset is classified as positive, negative or uncertainty.

2. If the total number of positive lexicon words about a product is greater than the total number of negative words, the text is categorized as positive.

3. If the total number of positive lexicon words about a product is less than the total number of negative words, the text is categorized as negative.

4. If the text is categorized as neither positive nor negative, it is categorized as uncertainty.


Table 1 Examples for opinion words

Class         Example opinion words
Positive      Good, Excellent, Easy, Secure transaction
Negative      Connection problem, Less response, Poor
Uncertainty   Average, Somewhat ok, Not good not bad

Table 1 lists some sentiment words related to bank money transfer risk. The customer-level sentiment index weights are computed for three sentiment lexicon classes, namely the uncertainty, positivity and negativity lexicons.

Let $L_{i,j}$ be the local weight of text $i$ in file $j$. The weight of text $i$ in file $j$ is

$$w_{i,j} = L_{i,j} \cdot \log\left(\frac{N}{d_i}\right)$$

where the global weight $\log(N/d_i)$ is the inverse document frequency, $N$ denotes the number of documents and $d_i$ is the number of customers who used text $i$ at least once. The weights of each file are normalized by

$$\sqrt{\sum_{i} w_{i,j}^{2}}$$
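The weighting scheme can be made concrete with a short sketch. The formulas in the source are only partly legible, so the code below rests on assumptions: the local weight is taken to be the raw count of a term in a file, the global weight is taken to be log(N / d_i), and the normalization is taken to be the square root of the sum of squared weights in the file. The function name `sentiment_weights` is hypothetical.

```python
import math
from collections import Counter


def sentiment_weights(documents):
    """Return one {term: normalized weight} dict per tokenized document."""
    n_docs = len(documents)
    # d_i: number of documents that use term i at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weighted = []
    for doc in documents:
        local = Counter(doc)  # assumed local weight: raw count of the term
        raw = {t: c * math.log(n_docs / doc_freq[t]) for t, c in local.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # Guard against an all-zero vector (every term occurs in every document).
        weighted.append({t: (w / norm if norm else 0.0) for t, w in raw.items()})
    return weighted
```

Under these assumptions a term that appears in every document receives weight zero, which is the usual behaviour of an inverse document frequency factor.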

Algorithm 1 Text Sentiment Classification

procedure ClassifySentiment(Data)
    Positives = 0
    Negatives = 0
    for every word T in Data do
        if Lexicon(T) is positive then
            Positives = Positives + 1
        else if Lexicon(T) is negative then
            Negatives = Negatives + 1
        end if
    end for
    if Positives > Negatives then
        return "positive about products"
    else if Positives < Negatives then
        return "negative about products"
    else
        return "Uncertainty (Average Products)"
    end if
end procedure
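Algorithm 1 translates almost line for line into Python. The sketch below assumes single-word lexicons loosely based on Table 1; the sets `POSITIVE` and `NEGATIVE` are illustrative only, and multi-word entries such as "connection problem" are reduced to single words for simplicity.

```python
# Illustrative single-word lexicons (an assumption, loosely based on Table 1).
POSITIVE = {"good", "excellent", "easy", "secure"}
NEGATIVE = {"poor", "problem", "less"}


def classify_sentiment(tokens):
    """Count lexicon hits and return the majority polarity of one post."""
    positives = sum(1 for t in tokens if t in POSITIVE)
    negatives = sum(1 for t in tokens if t in NEGATIVE)
    if positives > negatives:
        return "positive"
    if positives < negatives:
        return "negative"
    return "uncertainty"  # ties, including posts with no lexicon words
```

Note that a tie covers two different situations, a post with equal positive and negative counts and a post with no lexicon words at all; the algorithm as stated does not distinguish them.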


After computing the sentiment index weights, the data posted by customers are filtered and categorized to prepare them for the assessments. In particular, the data are separated by product and grouped by year, separately for each bank.

3.3 Rough set information system and feature extraction

3.3.1 Information System

Rough set theory supports approximate reasoning in decision making: it selects objects or attributes using knowledge that expresses relative membership. Knowledge is represented as a table, called an information system, in which rows denote objects and columns denote attributes. An information system $S = (U, A)$ is a pair consisting of a non-empty finite set of objects $U$, the universe, and a non-empty finite set of attributes $A$. Let $X \subseteq U$ and $B \subseteq A$, and let $[x]_B$ denote the equivalence class of $x$ under the attributes in $B$. The set $X$ can be approximated using only the information in $B$ by constructing the lower and upper approximations of $X$, denoted $\underline{B}X$ and $\overline{B}X$. The lower approximation is the set of members that certainly belong to a given class: $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$. The upper approximation is the set of members that possibly belong to the class: $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$ [37].

Figure 3 Rough set Theory

Figure 3 illustrates these sets: the dark-gray spots denote the lower approximation $\underline{B}X$, while the dark-gray and light-gray spots together represent the upper approximation $\overline{B}X$.

The roughness of the set $X$ with respect to $B$ can be described numerically as $R_\alpha = 1 - |\underline{B}X| / |\overline{B}X|$. $X$ is crisp with respect to $B$ if $R_\alpha = 0$, and rough if $R_\alpha > 0$ [38]. As an illustration, let the universe $U$ be the set of pixels of an image. $U$ can be partitioned into a set of non-overlapping windows (of size m × n, say), each of which is considered a granule $G$. A granule is a group of pixels in $U$ that are indiscernible from one another. Granulation thus decomposes the whole set into parts [37, 41, 38].
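The lower approximation, upper approximation and roughness can be computed directly from their set definitions. The following sketch assumes the standard roughness measure, one minus the ratio of the sizes of the lower and upper approximations, and represents the table as a dictionary from object to attribute values. The function names and the toy objects used below are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Partition objects by their values on the chosen attribute subset B."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the B-lower and B-upper approximations of the set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside X
            lower |= block
        if block & target:    # block overlaps X
            upper |= block
    return lower, upper


def roughness(lower, upper):
    """1 - |lower| / |upper|; 0 means the set is crisp with respect to B."""
    return 1 - len(lower) / len(upper) if upper else 0.0
```

For a toy table in which `u1` and `u2` are indiscernible and the target set contains only `u1` and `u3`, the lower approximation is `{u3}`, the upper approximation is `{u1, u2, u3}`, and the roughness is 2/3.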

In rough set theory, a data set is represented as a table in which each row represents a text posted by a customer and each column represents a sentiment lexicon class: positive, negative or uncertainty [39, 40]. This table is called an information system, and the set of all its elements is known as the universe.


Consider a universe $U$ of elements. Formally, an information system $I$ is a quadruple

$I = (U, A, V, \rho)$, where

$A$ is a non-empty finite set of attributes,

$V = \bigcup_{a \in A} V_a$ is the set of values of all attributes, where $V_a$ is the set of possible values of attribute $a$,

$\rho: U \times A \to V$ is an information function such that, for every element $x \in U$, $\rho(x, a) \in V_a$ is the value of attribute $a$ for element $x$.

The information system can also be viewed as an information table, where each element $x \in U$ corresponds to a row and each attribute $a \in A$ corresponds to a column.

3.3.2 Decision system

A decision system is used to minimize the set of attributes needed to retrieve the text; these are called condition attributes. The attribute that the reduced feature set is used to determine is called the decision attribute.

A very simple information system is shown in Table 2. It has seven objects (terms) and two attributes.

Table 2 An Example Information System.

Term        Bank mission   Sentiment feature lexicon
policy      37             +1
Financial   25             +1
Bank        36              0
That         8              0
Loan        85             +1
Wont        23             -1
Wanted       8             -1

Any one posted data file contains only a subset of all individual terms. Each row corresponds to a term that was used, and the second column gives the number of times customers used it. The third column gives the feature lexicon value for sentiment analysis: positive is represented as +1, negative as -1 and uncertainty as 0.
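A table of this shape can be built from a tokenized file with a frequency count and a lexicon lookup. The sketch below is illustrative: the polarity values are copied from the third column of Table 2, terms missing from the lexicon default to 0 (uncertainty), and the function name `term_table` is hypothetical.

```python
from collections import Counter

# Polarity values copied from Table 2; any other term defaults to 0.
LEXICON = {"policy": 1, "financial": 1, "loan": 1, "wont": -1, "wanted": -1}


def term_table(tokens):
    """Return {term: (frequency, polarity)} for one tokenized file."""
    counts = Counter(tokens)
    return {term: (n, LEXICON.get(term, 0)) for term, n in counts.items()}
```

For example, `term_table(["loan", "loan", "bank", "wont"])` records that "loan" occurs twice with polarity +1, "bank" once with polarity 0, and "wont" once with polarity -1.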

This a posteriori knowledge is expressed by one distinguished attribute called the decision attribute, and the procedure is known as supervised learning. Information systems of this kind are called decision systems.

Customers post text about bank products on social media, discussing one, two or many features of a product. Classifying the individual features from a massive database of reviews is a difficult job. The information system for this work is built from the benchmark dataset. The unstructured dataset is preprocessed in module one to remove unwanted text; after this step, the dataset consists only of product features, sub-features and the relevant opinion words. In the resulting information system, the objects are the social media customers who post text about the bank, the product features are the condition attributes, and the class label is the decision attribute. Table 3 shows the information system for the bank loan interest product and financial risk.


Table 3 Information table of Bank Text

Object   Interest   Pre pay   Installment   Services   Response   Service charge   Document   Sentiment
O1       1          1         0             0          0          1                1          P
O2       1          1         0             0          0          1                1          P
O3       0          0         0             1          0          0                0          N
O4       0          1         1             1          0          0                0          P
O5       1          1         0             0          0          1                1          P
O6       0          0         0             0          1          1                0          N
O7       0          1         1             1          0          0                0          P
O8       1          0         1             0          1          0                0          N
O9       1          1         0             0          0          1                1          P
O10      0          0         1             1          0          1                1          P
O11      0          1         1             1          0          0                0          P
O12      0          1         1             1          0          0                0          P
O13      0          0         0             1          0          0                0          N
O14      1          1         0             0          0          1                1          P
O15      0          0         1             1          0          0                0          N

The bank manages market risk by hedging against foreign exchange and interest rate risk in order to shield its earnings and protect the economic value of its assets and liabilities. Foreign exchange risk is almost fully hedged. Interest rate risk arising from differences between lending and funding is kept at a modest level. Table 4 is an information table on market risk.

Table 4 Information Table of Market risk

Object   Foreign exchange   Cross-currency   Interest rate   Credit spread   Sentiment
X1       1                  1                0               1               P
X2       1                  1                0               0               N
X3       0                  1                1               1               P
X4       1                  1                0               0               N
X5       1                  1                1               1               P
X6       1                  0                0               1               N
X7       0                  1                1               1               P
X8       1                  1                0               1               P
X9       1                  1                0               1               P

In the information tables, the value '1' represents the presence and '0' the absence of the feature in the posted text. The characters 'P' and 'N' are the values of the opinion orientation decision attribute. An important use of rough set theory is to detect redundancies and dependencies among the features. Lower and upper approximations are used to find the decision boundary based on the equivalence classes.


Table 5 Equivalence Class

Equivalence class (Loan)   Equivalence class (Risk)
{O1,O2,O5,O14}             {X1,X8,X9}
{O3,O13}                   {X2,X4}
{O4,O7,O11,O12}            {X3,X7}
{O5,O9}                    {X5}
{O6}                       {X6}
{O7}
{O9}

In the information systems, the loan analysis has eight attributes and the risk management analysis has five, counting the sentiment decision attribute in each case. Based on the equivalence classes, these are reduced to four attributes for deciding whether the sentiment is positive or negative.
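To illustrate how equivalence classes and attribute reduction interact, the sketch below partitions the market risk objects of Table 4 and then searches exhaustively for the smallest attribute subsets that still determine the sentiment label. The exhaustive search is a stand-in chosen for clarity, not necessarily the reduction procedure used in this work, and the names `partition`, `is_consistent` and `minimal_reducts` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ATTRIBUTES = ["foreign_exchange", "cross_currency", "interest_rate", "credit_spread"]
# Rows of Table 4: condition values in ATTRIBUTES order, then the sentiment class.
TABLE_4 = {
    "X1": (1, 1, 0, 1, "P"), "X2": (1, 1, 0, 0, "N"), "X3": (0, 1, 1, 1, "P"),
    "X4": (1, 1, 0, 0, "N"), "X5": (1, 1, 1, 1, "P"), "X6": (1, 0, 0, 1, "N"),
    "X7": (0, 1, 1, 1, "P"), "X8": (1, 1, 0, 1, "P"), "X9": (1, 1, 0, 1, "P"),
}


def partition(table, columns):
    """Group objects that agree on every selected column index."""
    blocks = defaultdict(list)
    for obj, row in table.items():
        blocks[tuple(row[c] for c in columns)].append(obj)
    return list(blocks.values())


def is_consistent(table, columns):
    """True if every block of the partition carries a single sentiment label."""
    return all(len({table[o][-1] for o in block}) == 1
               for block in partition(table, columns))


def minimal_reducts(table, n_attributes):
    """Smallest attribute subsets that still determine the sentiment label."""
    for size in range(1, n_attributes + 1):
        found = [c for c in combinations(range(n_attributes), size)
                 if is_consistent(table, c)]
        if found:
            return found
    return []


classes = partition(TABLE_4, range(len(ATTRIBUTES)))
reducts = [[ATTRIBUTES[i] for i in r]
           for r in minimal_reducts(TABLE_4, len(ATTRIBUTES))]
print(classes)
print(reducts)
```

On the rows of Table 4 the partition over all four attributes reproduces the Risk column of Table 5. Under this consistency criterion the search returns the single pair cross-currency and credit spread, which may differ from a reduction obtained with other criteria or with additional data.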

4. RESULTS AND EVALUATION

4.1. Dataset

The sentiment analysis technique implemented in this system requires a large amount of text data. The purpose of the implementation is to build a model that accurately classifies any bank-related text as negative (-1), uncertainty (0) or positive (1). To produce an effective model for this problem, all the data are reviews from the bank product domain. All reviews were written by bank customers, and all posted text is in English.

The data set was collected from Facebook and Twitter. Facebook pages occasionally contain an empty review field, since individuals can leave the review blank; such posts are treated as uncertainty sentiment. All Twitter posts were annotated manually because posted text carries no rating.

Figure 3 Dataset classification

Of the 653 posted texts in the dataset, 49% were positive, 42% were negative and 9% were uncertainty. Figure 3 shows the average occurrence of each text polarity class. Negative texts evidently contain more long, unwanted text; compared with the other sentiment classes, the negative class requires more data cleaning.

[Figure 3: bar chart titled "Average of Dataset Sentiment Class". Categories: Negative, Uncertainty, Positive. Vertical axis: 0 to 14. Legend: Unknown, Preposition, Conjunction, Special Symbol, White spaces, Repeated word, Stemming.]


Feature selection is the procedure of choosing the minimal set of feature attributes needed to retrieve the sentiment text for analyzing a bank product. Its aim is to identify the subset of feature attributes in the dataset that still produces accurate results. The feature attributes were grouped into the classes shown in Table 5, and the bank products are analyzed according to these classes.

4.2 Results and Evaluation

Precision measures the proportion of relevant bank texts among all texts retrieved from the dataset. Recall measures the proportion of the relevant bank product texts in the dataset that are actually retrieved.
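For a three-class problem, precision and recall are usually computed per class. The sketch below uses the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), on a small made-up set of labels; the label lists and the function name are illustrative and are not data from the paper.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Return (precision, recall) for one sentiment class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positive = sum(1 for t, p in pairs if t == target and p == target)
    predicted = sum(1 for _, p in pairs if p == target)  # TP + FP
    actual = sum(1 for t, _ in pairs if t == target)     # TP + FN
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall


# Made-up labels: -1 negative, 0 uncertainty, 1 positive.
true_labels = [1, 1, -1, 0, -1, 1]
predicted_labels = [1, -1, -1, 0, 1, 1]

for target in (-1, 0, 1):
    precision, recall = precision_recall(true_labels, predicted_labels, target)
    print(target, round(precision, 2), round(recall, 2))
```

Returning 0.0 when a class is never predicted or never present is a convention chosen here to avoid division by zero.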

Figure 4 shows the classification accuracy of feature selection for the bank products loan prices and market risk. The algorithm was run on the equivalence classes of Table 5, using the 15 and 9 feature attribute sets, and the highest accuracy values obtained are shown in Table 6.

Table 6 Accuracy Measurement

Equivalence class (Loan)   Accuracy   Equivalence class (Risk)   Accuracy
{O1,O2,O5,O14}             0.75       {X1,X8,X9}                 0.79
{O3,O13}                   0.6        {X2,X4}                    0.72
{O4,O7,O11,O12}            0.74       {X3,X7}                    0.65
{O5,O9}                    0.4        {X5}                       0.4
{O6}                       0.2        {X6}                       0.3
{07}                       0.2
{09}                       0.1

Figure 4 Accuracy Measurements

[Figure 4: bar chart of accuracy for feature sets F1 to F7. Vertical axis: 0 to 0.9. Series: Loan prices, Market risk.]


Figure 5 Sentiment analysis of Loan product

Figure 6 Sentiment analysis of Market Risk

Figure 5 and Figure 6 show the measurements for the loan and market risk products based on rough set feature selection. According to Figures 4, 5, and 6, the feature attributes Interest, Pre pay, Service charge, and Document give the highest accuracy in sentiment classification. The features Pre pay Services and Installment give the next-highest accuracy in the loan prices analysis. The features Foreign exchange, Cross-currency, and Credit spread give the highest accuracy in the market risk analysis.

5. CONCLUSION

In this paper, a rough set based feature selection approach is used for sentiment analysis in banking. It is capable of reducing the decision attributes needed to analyze the positive, uncertainty, and negative text posted by bank customers on social media. By selecting a minimal set of feature attributes with rough set theory, we obtained results for loan pricing and market risk, with accuracies of up to 0.75 and 0.79 respectively (Table 6).

In future work, we will apply various classification methods to sentiment lexicons in banking together with the rough set feature selection technique, and compare them with other techniques to determine which gives the most accurate results.

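[Figure 5: bar chart "Evaluation of Loan". Horizontal axis: Negative, Uncertainty, Positive. Vertical axis: 0 to 1. Series: Precision, Recall.]

[Figure 6: bar chart "Measurement of Market Risk". Horizontal axis: Negative, Uncertainty, Positive. Vertical axis: 0 to 1. Series: Precision, Recall.]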


REFERENCES

[1] K. Ahmad, Y. Almas, Visualising sentiments in financial texts? Proceedings of the Ninth

International Conference on Information Visualisation (2005) 363–368.

[2] C. Asavathiratham, The Influence Model: A Tractable Representation for the Dynamics of

Networked Markov Chains, Dept. of EECS. 2000, MIT, Cambridge, 2000, p. 188.

[3] P. Chaovalit, L. Zhou, Movie review mining: a comparison between supervised and

unsupervised classification approaches, Proceedings of the 38th Hawaii International

Conference on System Sciences, 2005.

[4] K.W. Cheung, J.T. Kwok, M.H. Law, K.C. Tsui, Mining customer product ratings for

personalized marketing, Decision Support Systems 35 (2) (2003) 231–243.

[5] J. Coble, D. Cook, R. Rathi, L. Holder, Iterative structure discovery in graph-based data,

International Journal of Artificial Intelligence Techniques 1–2 (14) (2005) 101–124.

[6] M. Dash, H. Liu, Feature selection for classification, Intelligent Data Analysis 1 (3)

(1997) 131–156.

[7] C.C. Freifeld, K.D. Mandl, B.Y. Reis, J.S. Brownstein, HealthMap: global infectious disease monitoring through automated classification and visualization of internet media reports, Journal of the American Medical Informatics Association 15 (2008) 150–157.

[8] J. Gaurav, A. Ginwala, Y.A. Aslandogan, An approach to text classification using

dimensionality reduction and combination of classifiers, Proceedings of the 2004 IEEE

International Conference on Information Reuse and Integration (2004) 564–569.

[9] Goswami, R.M. Jin, G. Agrawal, Fast and exact out-of-core k-means clustering, Fourth IEEE International Conference on Data Mining (2004) 83–90.

[10] V. Guralnik, G. Karypis, A scalable algorithm for clustering protein sequences, Proc. Workshop Data Mining in Bioinformatics (BIOKDD), 2001, pp. 73–80.

[10] K.F. Han, D. Baker, Recurring local sequence motifs in proteins, Journal of Molecular

Biology 251 (1) (1995) 176–187.

[11] David Hazarika, How Banks Can Use Social Media Analytics To Drive Business Advantage, article, 2010.

[12] V. Hatzivassiloglou, K.R. McKeown, Predicting the semantic orientation of adjectives,

Proceedings of the 35th Annual Meeting of the ACL and the 8th Conference of the

European Chapter of the ACL, New Brunswick, NJ, 1997, pp. 174–181.

[13] R.Q. Huang, J.H.L. Hansen, Dialect classification on printed text using perplexity

measure and conditional random fields, IEEE International Conference on Acoustics,

Speech and Signal Processing (2007) 993–996.

[14] T. Joachims, Text categorization with SVM: learning with many relevant features,

Proceedings of ECML, 10th European Conference on Machine Learning, 1998.

[15] J.I. Khan, S. Shaikh, Relationship algebra for computing in social networks and social

network based applications, 2006 IEEE/WIC/ACM International Conference on Web

Intelligence, 2006, pp. 113–116.

[16] N. Li, X. Liang, X. Li, C. Wang, D. Wu, Network environment and financial risk using

machine learning and sentiment analysis, Human and Ecological Risk Assessment (2009)

227–252.

[17] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up? Sentiment classification using machine

learning techniques, Proceedings of the Conference on Empirical Methods in Natural

Language Processing (EMNLP), 2002, pp. 79–86.

[18] T. Saegusa, T. Maruyama, Real-time segmentation of color images based on the K-means

Clustering on FPGA, International Conference on Field-Programmable Technology, 2007,

pp. 329–332.


[19] S. Schauland, A. Kummert, P. Su-Birm, I. Uri, Y. Zhang, Vision-based pedestrian

detection - improvement and verification of feature extraction methods and SVM-based

classification, IEEE Intelligent Transportation Systems Conference (2006) 97–102.

[20] Z.H. Sun, Y.X. Sun, Fuzzy support vector machine for regression estimation, IEEE

International Conference on Systems, Man and Cybernetics, vol. 4, 2003, pp. 3336–3341.

[21] S. Tan, J. Zhang, An empirical study of sentiment analysis for Chinese documents, Expert Systems with Applications 34 (4) (2008) 2622–2629.

[23] D. Thanh-Nghi, J.D. Fekete, Large scale classification with support vector machine algorithms, ICMLA 2007, Sixth International Conference on Machine Learning and Applications, 2007, pp. 7–12.

[22] Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis lectures on human

language technologies, 5(1), 1-167.

[23] Medhat, W., Hassan, A., & Korashy, H. (2014). Sentiment analysis algorithms and

applications: A survey. Ain Shams Engineering Journal, 5(4), 1093-1113.

[24] Pak, A., & Paroubek, P. (2010, May). Twitter as a Corpus for Sentiment Analysis and

Opinion Mining. In LREC (Vol. 10, No. 2010).

[25] Gokulakrishnan, B., Priyanthan, P., Ragavan, T., Prasath, N., & Perera, A. (2012,

December). Opinion mining and sentiment analysis on a twitter data stream. In Advances

in ICT for emerging regions (ICTer), 2012 International Conference on (pp. 182-188).

IEEE.

[26] Hallsmar, F., & Palm, J. (2016). Multi-class sentiment classification on twitter using an

emoji training heuristic.

[27] Salas-Zárate, M. D. P., Medina-Moreira, J., Lagos-Ortiz, K., Luna-Aveiga, H., Rodríguez-

García, M. Á., & Valencia-García, R. (2017). Sentiment Analysis on Tweets about

Diabetes: An Aspect-Level Approach. Computational and mathematical methods in

medicine, 2017.

[28] Chiavetta, F., Bosco, G. L., & Pilato, G. (2016). A Lexicon-based Approach for Sentiment

Classification of Amazon Books Reviews in Italian Language.

[29] Hailong, Z., Wenyan, G., & Bo, J. (2014, September). Machine learning and lexicon

based methods for sentiment classification: A survey. In Web Information System and

Application Conference (WISA), 2014 11th (pp. 262-265). IEEE.

[30] Dong, Z., Dong, Q., & Hao, C. (2010, August). Hownet and its computation of meaning.

In Proceedings of the 23rd International Conference on Computational Linguistics:

Demonstrations (pp. 53-56). Association for Computational Linguistics.

[31] Musto, C., Semeraro, G., & Polignano, M. (2014). A comparison of lexicon-based

approaches for sentiment analysis of microblog posts. Information Filtering and Retrieval,

59.

[32] Park, S., & Kim, Y. (2016, June). Building thesaurus lexicon using dictionary-based

approach for sentiment classification. In Software Engineering Research, Management

and Applications (SERA), 2016 IEEE 14th International Conference on (pp. 39-44).

IEEE.

[33] Ding, X., Liu, B., & Yu, P. S. (2008, February). A holistic lexicon-based approach to

opinion mining. In Proceedings of the 2008 international conference on web search and

data mining (pp. 231-240). ACM.

[34] Thakkar, H., & Patel, D. (2015). Approaches for sentiment analysis on twitter: A state-of-

art study. arXiv preprint arXiv:1512.01043.

[35] Zagibalov, T., & Carroll, J. (2008, August). Automatic seed word selection for

unsupervised sentiment classification of Chinese text. In Proceedings of the 22nd

International Conference on Computational Linguistics-Volume 1 (pp. 1073-1080).

Association for Computational Linguistics.


[36] Tang, B., Kay, S., & He, H. (2016). Toward optimal feature selection in naive Bayes for

text categorization. IEEE Transactions on Knowledge and Data Engineering, 28(9), 2508-

2521.

[37] Yiyuan Cheng, Ruiling Zhang, Xiufeng Wang, Qiushuang Chen. Text Feature Extraction

Based on Rough Set in Fifth International Conference on Fuzzy Systems and Knowledge

Discovery, 2008 IEEE, pp.310-314.

[38] Hsun-Hui Huang, Yau-Hwang Kuo and Horng-Chang Yang. Fuzzy-Rough Set Aided

Sentence Extraction Summarization in Proceedings of the First International Conference

on Innovative Computing, Information and Control (ICICIC'06).

[39] Qiang Li, Jian-Hua Li, Gong-Shen Liu, Sheng-Hong Li. A Rough Set based Hybrid

Feature Selection Method For Topic Specific Text Filtering in Proceedings of the Third

International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August

2004, pp.1464-1468.

[40] Richard Jensen and Qiang Shen. Semantics-Preserving Dimensionality Reduction: Rough

and Fuzzy-Rough-Based Approaches in IEEE Transactions on Knowledge and Data

Engineering, Vol. 16, No. 12, December 2004,pp.1457-1471.

[41] Zdzisław Pawlak. Rough set theory and its applications in journal of Telecommunication

and Information Technology 3/2002, pp. 7-10.

[42] Chikersal, P., Poria, S., & Cambria, E. (2015, June). SeNTU: sentiment analysis of tweets

by combining a rule-based classifier with supervised learning. In Proceedings of the

International Workshop on Semantic Evaluation, SemEval (pp. 647-651).

[43] Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval

(Vol. 1, No. 1, p. 496). Cambridge: Cambridge university press.

[44] Severyn, A., & Moschitti, A. (2015, August). Twitter sentiment analysis with deep

convolutional neural networks. In Proceedings of the 38th International ACM SIGIR

Conference on Research and Development in Information Retrieval (pp. 959-962). ACM.

[45] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed

representations of words and phrases and their compositionality. In Advances in neural

information processing systems (pp. 3111-3119).

[46] Kim, Y. (2014). Convolutional neural networks for sentence classification. arXiv preprint

arXiv:1408.5882.

[47] Øye, J. A. (2015). Sentiment Analysis of Norwegian Twitter Messages. Master's thesis.

[48] Wei, W. and Gulla, J. A. (2010). Sentiment Learning on Product Reviews via Sentiment

Ontology Tree. In Proceedings of the 48th Annual Meeting of the Association for

Computational Linguistics, pages 404–413. Association for Computational Linguistics.

[49] Vidya, N. A., Fanany, M. I., and Budi, I. (2015). Twitter Sentiment to Analyze Net Brand

Reputation of Mobile Phone Providers. 72:519–526.

[50] Myneni Madhu Bala, M. Srinivasa Rao and M Ramesh Babu, Sentiment Trends on

Natural Disasters using Location Based Twitter Opinion Mining, International Journal of

Civil Engineering and Technology (IJCIET) Volume 8, Issue 8, August 2017, pp. 9-19

[51] Rashid Ali, Pro-Mining: Product Recommendation Using Web-Based Opinion Mining,

International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue

6, November - December (2013), pp. 299-313

[52] Sandip S. Patil and Asha P. Chaudhari, Classification of Emotions from Text Using Svm

Based Opinion Mining, International Journal of Computer Engineering & Technology

(IJCET), Volume 3, Issue 1, January- June (2012), pp. 330-338

[53] Dr. Jamshed Siddiqui, An Overview of Opinion Mining Techniques, International Journal

of Advanced Research in Engineering and Technology (IJARET), Volume 4, Issue 7,

November - December 2013, pp. 176-182

