An Enhanced Stock Prediction Model using Sentiment Analysis Based on
Multiple Regression
Saravanan.Ramalingam1* and Sujatha.Putholla2
1 Research Scholar, Department of Computer Science and Engineering, Pondicherry University, Pondicherry, India
2 Assistant Professor, Department of Computer Science and Engineering, Pondicherry University, Pondicherry, India
ABSTRACT
Big data analytics is the process of extracting knowledge from large data sets, helping data
scientists gain valuable insights. Predictive analytics is a key area of big data analytics that uses
historical and current data sets to predict future values. It relies on statistical methods both to
generate predictions and to assess the predictive power of algorithms. Sentiment analysis is another
type of big data analytics; it computes, identifies and categorizes public opinion expressed in text
form to find public sentiment on a particular topic. The stock market is a place where people trade
shares and stocks with the basic motive of financial gain. Predicting the stock market is challenging
because stock price movement is influenced by a large number of factors, such as global events,
political events, general economic conditions, and traders’ expectations across the globe. To predict
stock price movement, we use predictive analytics to arrive at a predicted index value. To capture the
influence of public factors, we apply sentiment analysis to data collected from social media and form
a sentiment index. To improve prediction accuracy, we then combine the two indices into an overall
stock index that predicts stock price movements.
KEYWORDS: Big Data, Big Data Analytics, Sentiment Analysis, Stock Prediction, Predictive
Analytics.
*Corresponding author
Saravanan.Ramalingam
Research Scholar,
Department of Computer Science and Engineering,
Pondicherry University, Pondicherry, India
Email: [email protected], Mob.No - 9894447191
JASC: Journal of Applied Science and Computations
Volume VI, Issue IV, April/2019
ISSN NO: 1076-5131
Page No:1458
INTRODUCTION
Big Data is a phrase used to mean a volume of both structured and unstructured data so massive
that it is difficult to process using traditional database and software techniques. In most enterprise
scenarios the volume of data is too big, it moves too fast, or it exceeds current processing capacity.
Big Data has the potential to help companies improve operations and make faster, more intelligent
decisions. This data, when captured, formatted, manipulated, stored, and analyzed, can help a company
gain useful insight to increase revenues, win or retain customers, and improve operations. Big data
has increased the demand for information management specialists so much that Software AG, Oracle
Corporation, IBM, Microsoft, SAP, EMC, HP and Dell have spent more than $15 billion on software firms
specializing in data management and analytics. In 2010, this industry was worth more than $100 billion
and was growing at almost 10 percent a year, about twice as fast as the software business as a whole.
Big Data poses various challenges in day-to-day practice. Some of these challenges are listed
below.
Understanding and Utilizing Big Data
New, Complex, and Continuously Emerging Technologies
Cloud Based Solutions
Privacy, Security, and Regulatory Considerations
Archiving and Disposal of Big Data
The Need for IT, Data Analyst, and Management Resources
BIG DATA ANALYTICS
Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown
correlations, market trends, customer preferences and other useful business information. The primary goal
of big data analytics is to help companies make more informed business decisions by enabling data
scientists, predictive modelers and other analytics professionals to analyze large volumes of transaction
data, as well as other forms of data that may be untapped by conventional business intelligence programs.
Many analytic techniques, such as regression analysis, simulation, and machine learning, have
been available for many years. Big data analytics allows data scientists and various other users to evaluate
large volumes of transaction data and other data sources that traditional business systems would be unable
to tackle. Traditional systems may fall short because they're unable to analyze as many data sources.
Sophisticated software programs are used for big data analytics, but the unstructured data used in big data
analytics may not be well suited to conventional data warehouses. Big data's high processing requirements
may also make traditional data warehousing a poor fit.
Descriptive Analytics
Descriptive analytics, such as reporting/OLAP, dashboards/scorecards, and data visualization,
have been widely used for some time, and are the core applications of traditional BI.
Predictive Analytics
Predictive analytics suggests what will occur in the future. Its methods and algorithms include
regression analysis, machine learning, and neural networks.
Prescriptive Analytics
Prescriptive analytics refers to the process of analyzing an abstraction of the exact data related
to a particular field in order to enhance the classification result.
Big data analytics helps organizations harness their data and use it to identify new opportunities.
That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier
customers. Some of the major benefits of big data analytics are:
Reducing the Cost
Faster and Better Decision Making
Need for New Products and Services
Sentiment analysis is another form of big data analytics: the computational study of opinions,
sentiments, evaluations, attitudes, appraisals, affects, views, emotions, subjectivity, etc., expressed in
text. The text may include reviews, feedback, comments, discussions, news, status updates and tweets.
Sentiment analysis can be done at different levels: document level, sentence level or aspect/feature level.
Document Level Classification
In this process, sentiment is extracted from the entire review, and a whole opinion is classified
based on the overall sentiment of the opinion holder. The goal is to classify a review as positive, negative,
or neutral. Document level classification works best when the document is written by a single person and
expresses an opinion/sentiment on a single entity.
Sentence Level Classification
This process usually involves two steps:
1. Subjectivity classification of a sentence into one of two classes: objective and subjective.
2. Sentiment classification of subjective sentences into two classes: positive and negative.
OBJECTIVE
In the existing system, only two sentiment factors, positive and negative, are considered. To
overcome this drawback, the proposed model uses the NRC emotion lexicon, which classifies the given
input file into 10 different factors. Another objective is to perform an enhanced stock prediction. The
existing system considered only price movement, that is, whether the stock price moves up or down. In
the proposed model, the stock prices themselves are predicted using the multiple regression technique.
RELATED WORKS
The stock market is a place where people buy and sell shares and stocks with the basic motive of
financial gain. Investing in the stock market may seem an easy task, but in most cases it is not, and
investing in a particular stock carries a high risk. Stock prediction is the technique used to identify
whether the price of a particular stock will increase or decrease. Analysts' interest in predicting stock
prices has grown rapidly, since accurate prediction helps to avoid risk and to improve financial returns
considerably. The following section reviews papers related to stock market prediction.
Thien Hai Nguyen et al. [1] developed a model to predict stock price movement using the
sentiments of specific topics. A new feature called “topic-sentiment” was incorporated for better stock
market prediction. Historical data were extracted from Yahoo Finance for 18 companies over a period of
about one year. The messages were split into sentences, and Stanford CoreNLP was then used for POS
tagging and lemmatization of each word in each sentence. For each transaction date, the sentiment value
of each topic was calculated, and the importance of each topic was also considered for prediction.
Bollen et al. [2] examined whether public sentiment expressed in daily tweets can predict the
stock market. Two tools, OpinionFinder and GPOMS, were used to measure variation in the public mood
from tweets, and the results were correlated with the Dow Jones Industrial Average (DJIA). Fagner
Andrade et al. [3] predicted the close price of a stock (PETR4) using an artificial neural network
model. Prediction involved three stages: dataset collection, data cleaning and normalization, and
prediction using an MLP feed-forward network model. Both these techniques are correlated to find the
accuracy of stock prediction.
Girija V Attigeri et al. [5] predicted stock performance by applying the concepts of machine
learning and fundamental analysis. Data were gathered and prepared for sentiment analysis. After
analysis, the sentiments were aggregated and visualized in the form of a graph, and a machine learning
model to predict new data was developed using linear regression. Rishabh Soni et al. [7] proposed a
hybrid approach that first uses unsupervised learning to cluster the tweets and then applies supervised
learning methods. Feature extraction was performed after obtaining the data set, and clusters of tweets
were formed. Various decision-tree-based algorithms such as Random Forest were implemented and their
performance was evaluated.
Bhakti G. Deshmukh et al. [9] investigated whether Twitter sentiments and commodity prices help
in predicting actual stock prices for the top 50 companies listed on the NIFTY at NSE, India. They used
NLP, sentiment analysis and ML techniques for prediction. Tweets were collected, processed for sentiment
analysis, and correlated with stock prices for prediction. Kibum Kim et al. [11] predicted stock prices
in the bio industry using opinion mining and machine learning. Their stock price prediction system
consists of a data collector, vocabulary analyzer, sentiment dictionary, sentiment analyzer, and stock
price predictor. Based on previously stored data, the stock price predictor predicts stock prices from
new information using a machine learning engine.
Jigar Patel et al. [8] predicted stock movement using the Trend Deterministic Data Preparation
technique, which exploits the inherent opinion of each technical indicator about stock price movement.
Technical indicators had previously been used directly for prediction, whereas this study first extracts
trend-related information from each indicator and then uses it for prediction, resulting in a
significant improvement in accuracy. Anthony R. Calingo et al. [12] investigated the effect of public
sentiment or mood, drawn from a large collection of Twitter data, on the movement of a stock market
index. The data were collected from Twitter based on geolocation, and the Granger causality test was
used to find whether public sentiment affected the movement of the local stock index and to what
degree.
Alexander Porshnev et al. [13] used a dictionary-based approach for sentiment analysis that
distinguishes eight basic emotions in users' tweets. The model applied SVM to the collected datasets to
find the dependency of stock prices on the sentiments.
From the study of various sentiment analysis and prediction algorithms related to stock movement
prediction models, it can be seen that limiting the number of sentiment factors extracted from textual
data decreases the accuracy of stock prediction. Using a larger number of sentiments to predict the
stock results in more accurate and precise prediction values. Further, by combining historical price
prediction techniques with sentiment-based price prediction techniques, better prediction accuracy can
be achieved.
PROPOSED SYSTEM
The aim is to design and implement a novel technique for stock prediction that uses sentiment
analysis of real-time social media information together with historical data of various stocks, and
that predicts the price itself rather than only the direction of stock movement, using machine learning
techniques. In the existing system, only two scores, positive and negative, were obtained via sentiment
analysis. Also, its prediction results were less accurate than those of other models, and no real-time
data retrieval was done. These drawbacks were taken into account, and an enhanced model was designed
and implemented.
STOCK PREDICTION USING MULTIPLE REGRESSION
In this model, stock prediction is done using the multiple linear regression technique together
with sentiment analysis based on the NRC emotion lexicon. First, the tweets are collected using the
Twitter API for the particular dates to be considered for sentiment analysis. The collected data are
cleansed using techniques such as tokenizing, POS tagging and stemming. The factors extracted with the
help of the NRC emotion lexicon are the two sentiments, positive and negative, and eight emotions:
anger, anticipation, disgust, fear, joy, sadness, surprise and trust. After sentiment analysis, the
close prices for the various dates and the emotion values obtained from sentiment analysis are
tabulated. Finally, the performance factors are calculated. The architecture of the system is given in
Figure 1.
Figure.1 Stock Prediction Flow Diagram
The workflow of the system is listed below:
1. The tweets are gathered and saved in a text file.
2. The file is then subjected to text analysis, where stop words and punctuation are removed and
the file is converted from UTF-8 to ASCII format for NRC sentiment analysis.
3. The sentiment scores and close prices are fed as input to two regression operations:
a. First, using the two sentiment factors (positive and negative scores) alone.
b. Next, using the two sentiment factors and the eight emotion factors.
4. Finally, the performance factors are evaluated after obtaining the prediction values.
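Steps 1 and 2 of the workflow, followed by lexicon-based scoring, can be sketched roughly as follows. This is a minimal Python illustration, not the R code used in this work: the tiny `TOY_LEXICON`, the `STOP_WORDS` set and the function names `clean_tweet` and `emotion_scores` are hypothetical stand-ins, and the real NRC emotion lexicon is far larger.

```python
import string
from collections import Counter

# Hypothetical miniature lexicon for illustration only; the real NRC
# emotion lexicon maps thousands of words to these ten categories.
TOY_LEXICON = {
    "gain": {"positive", "joy", "anticipation"},
    "growth": {"positive", "trust"},
    "loss": {"negative", "sadness"},
    "crash": {"negative", "fear", "surprise"},
    "fraud": {"negative", "anger", "disgust"},
}

# Small illustrative stop-word list (an assumption, not the list used in the paper).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "for", "on"}

CATEGORIES = [
    "anger", "anticipation", "disgust", "fear", "joy",
    "negative", "positive", "sadness", "surprise", "trust",
]


def clean_tweet(text):
    """Convert to ASCII, lowercase, strip punctuation and stop words."""
    ascii_text = text.encode("ascii", "ignore").decode("ascii")
    no_punct = ascii_text.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return [w for w in no_punct.split() if w not in STOP_WORDS]


def emotion_scores(tweets):
    """Count lexicon hits per category over all tweets of one day."""
    counts = Counter({c: 0 for c in CATEGORIES})
    for tweet in tweets:
        for word in clean_tweet(tweet):
            for category in TOY_LEXICON.get(word, ()):
                counts[category] += 1
    return dict(counts)


sample = ["IBM\u00ae posts a gain, growth is strong!", "Fear of a crash in tech"]
scores = emotion_scores(sample)
```

Each day's tweets would yield one such dictionary of ten counts, which forms one row of the regression input alongside that day's close price.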
EXPERIMENTATION
In the proposed model, the real-time datasets were collected from Twitter using the Twitter API
in RStudio, and the historical dataset for IBM stock was collected from Yahoo Finance. Tweets relating
to IBM were fetched for processing over about 25 days, with 5000 English-language tweets obtained each
day. The historical data for IBM, comprising the Open, High, Low, Close and Adjusted Close prices for
each transaction date, were fetched for the same interval, and the values were plotted as a time series
to view the close prices. After the sentiment analysis, multiple regression was carried out with the
close price as the dependent variable and the emotion values as independent variables.
SENTIMENT ANALYSIS EXPERIMENTATION
In this part, an application was first created in Twitter. This application has four major keys,
the API key, API secret key, consumer token and consumer token secret, which are referred to as the
credentials and are shown in Figure 2.
Figure.2 Credential Details
The tweets related to IBM stock are retrieved using the keyword IBM and stored in a text file of
about 3.1 MB. This UTF-8 text file is converted to ASCII format and then subjected to NRC sentiment
analysis.
FINANCIAL DATA EXPERIMENTATION
In the financial data experimentation, the historical prices are fetched from the Yahoo Finance
website. In RStudio, the URL for fetching the close price values is included and then run. The URL
contains the date interval, from the starting date to the ending date, for which the financial data are
to be fetched. The IBM close price values fetched from Yahoo Finance are shown in Figure 3.
Figure.3 Sample IBM Close Price Values
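Once the historical prices have been downloaded, extracting the close price per transaction date is a simple parsing step. The Python sketch below assumes the data arrive as CSV text with Date, Open, High, Low, Close and Adj Close columns; the rows in `SAMPLE_CSV` are made-up placeholders rather than actual IBM quotes, and `read_close_prices` is a hypothetical helper name.

```python
import csv
import io

# Made-up rows in an assumed CSV layout for historical price data;
# the prices below are placeholders, not actual IBM quotes.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2019-01-02,100.0,102.0,99.5,101.5,101.5,1000
2019-01-03,101.0,101.8,98.0,99.0,99.0,1200
"""


def read_close_prices(csv_text):
    """Return a list of (date, close) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Date"], float(row["Close"])) for row in reader]


closes = read_close_prices(SAMPLE_CSV)
```

The resulting date and close pairs can then be matched by date with the daily sentiment scores.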
PREDICTION EXPERIMENTATION
In the prediction experimentation, the input consists of the close price values and the scores
obtained from the NRC emotion lexicon for the various emotion and sentiment factors. The input is read
from the CSV file in which the close price values and all emotion and sentiment factors are stored. Once
the input is read, prediction is done by regressing the close price on the factors obtained from the NRC
emotion lexicon. Two types of prediction were carried out.
First, the close price was regressed on the sentiment factors alone, that is, the positive and negative
scores, and the prediction results were obtained. Next, the close price was regressed on all ten
factors: anger, anticipation, disgust, fear, joy, negative, positive, sadness, surprise and trust. The
prediction results were then obtained for both the confidence and the prediction intervals.
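The two-factor regression described above can be illustrated with a small least-squares sketch. It is written in plain Python and solves the normal equations directly; it is not the R routine used in this work, the data are synthetic, and the helper names `solve_linear_system`, `fit_multiple_regression` and `predict` are hypothetical. Confidence and prediction intervals are not computed here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Evaluate the fitted equation for one observation."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic (positive, negative) daily scores and close prices generated from
# close = 100 + 0.5 * positive - 2 * negative, purely for illustration.
features = [(10, 1), (12, 3), (8, 2), (15, 5), (11, 4)]
close = [100 + 0.5 * p - 2 * n for p, n in features]

coefs = fit_multiple_regression(features, close)
fitted = [predict(coefs, row) for row in features]
```

The ten-factor case works the same way with ten values per row, provided there are more observations than coefficients to estimate.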
Performance Factors
Performance factors are the metrics whose evaluation drives the results and conclusions of any
work. The factors used in this work are given below.
Opinion Values
The opinion value aggregates all the opinion words, which are divided into two score values: the
positive score (Ps) and the negative score (Ns).
Opinion Values (Oj) = (Ps-Ns)/(Ps+Ns) …….... (1)
where,
Ps is Positive Scores and
Ns is Negative Scores
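As a small worked illustration of equation (1), the Python sketch below evaluates the opinion value for the one-day positive and negative scores listed in Table 1. The function name `opinion_value` and the zero-total check are additions for illustration, not part of the original R workflow.

```python
def opinion_value(positive_score, negative_score):
    """Equation (1): (Ps - Ns) / (Ps + Ns), ranging from -1 to +1."""
    total = positive_score + negative_score
    if total == 0:
        raise ValueError("opinion value is undefined when Ps + Ns is zero")
    return (positive_score - negative_score) / total


# One-day scores from Table 1 of this paper: Positive = 15069, Negative = 51.
one_day = opinion_value(15069, 51)
```

A value close to +1 indicates that positive words dominate; a value close to -1 indicates that negative words dominate.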
MAPE
Mean Absolute Percentage Error (MAPE) is the average absolute percent error; it measures the size
of the error in percentage terms. It is given in (2).
MAPE = \frac{1}{N} \sum_{k=1}^{N} \left| \frac{F_k - A_k}{A_k} \right| …… (2)
where,
A_k is the actual value,
F_k is the forecast value and
N is the number of observations
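The MAPE definition can be checked with a few lines of Python. This is a sketch with illustrative numbers only; the function name `mape` is hypothetical, and the value is returned as a fraction rather than multiplied by 100.

```python
def mape(forecast, actual):
    """Equation (2): mean of |F_k - A_k| / |A_k| over all N observations."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and equal length")
    return sum(abs(f - a) / abs(a) for f, a in zip(forecast, actual)) / len(actual)


# Illustrative numbers only (not results from the paper).
error = mape([110.0, 95.0], [100.0, 100.0])
```

Here the two forecasts are off by 10 percent and 5 percent, so the mean absolute percentage error is their average.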
RESULT ANALYSIS
The results obtained from the sentiment analysis and from prediction with two factors and with
ten factors are discussed in this section.
SENTIMENT ANALYSIS RESULTS
The tweets are given as input in the form of a text file. The file is converted from UTF-8 to
ASCII format, and NRC sentiment analysis is then performed on 5000 tweets for each of 25 different days.
Sample values from the data extracted and processed in R are tabulated in Table 1. The sample output
plot is shown in Figure 4.
Table 1. Tabulation of emotions analyzed from Twitter for one day
Emotions        Score
Anger              19
Anticipation       50
Disgust            29
Fear                9
Joy                11
Negative           51
Positive        15069
Sadness            10
Surprise            8
Trust           10056
Figure.4 Sentiment Analysis
PREDICTION RESULTS
Prediction is done for both the sentiment and the emotion factors. First, the close price is
regressed on the positive and negative scores obtained from NRC sentiment analysis, and various graphs
are plotted and saved. The prediction values obtained using positive and negative scores alone, for both
the confidence and the prediction intervals, are shown in Figure 5.
Figure.5 Linear Regression Output
Then, multiple regression is done using the sentiment factors positive and negative together with
the emotion factors anger, anticipation, disgust, fear, joy, sadness, surprise and trust. When the number
of independent factors increases, the prediction result decreases. The multiple regression prediction
result is shown in Figure 6.
Figure.6 Multiple Regression Output
PLOTS OF REGRESSION TECHNIQUES
Four different diagnostic plots are obtained for the regression techniques, along with a separate
plot of the residuals for the given input. First, Figure 7(a) shows the residuals for two factors and
Figure 7(b) shows the residuals for all ten factors. There are 25 residuals in both cases. The residuals
are the part of the dependent variable that the model could not explain, and they are the best available
estimate of the error term of the regression model.
Figure 7(a) Residuals for Two Factors
Figure 7(b) Residuals for Ten Factors
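The residuals described above are simply the differences between the observed and fitted close prices. The short Python sketch below uses illustrative numbers, not data from this work, and the name `residuals` is a hypothetical helper.

```python
def residuals(actual, fitted):
    """Residual for each observation: actual value minus fitted value."""
    return [a - f for a, f in zip(actual, fitted)]


# Illustrative close prices and fitted values (not data from the paper).
actual_close = [103.0, 100.0, 100.0, 97.5]
fitted_close = [102.0, 101.0, 99.5, 98.0]

res = residuals(actual_close, fitted_close)
# Pairs of (fitted, residual) are the points drawn in a residuals-vs-fitted plot.
points = list(zip(fitted_close, res))
```

With 25 days of data, this yields 25 residuals, one per observation, as noted in the text.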
Next is the plot of residuals against fitted values. The plot for two factors is shown in
Figure 8(a) and the plot for all factors is shown in Figure 8(b). The regression line denotes the best
prediction of the dependent variable from the independent variables.
Figure 8(a) Residuals vs Fitted for Two Factors
Figure 8(b) Residuals vs Fitted for Ten Factors
The next plot is the normal Q-Q plot, which helps to assess whether a set of data plausibly came
from some theoretical distribution such as a normal or exponential distribution. The normal Q-Q plot for
two factors is shown in Figure 9(a) and the normal Q-Q plot for all factors is shown in Figure 9(b).
Figure 9(a) Normal Q-Q Plot for Two Factors
Figure 9(b) Normal Q-Q Plot for Ten Factors
In the scale-location plot, the square root of the standardized residuals is plotted against the
fitted values. The prediction curve depicts the best matching value of the fitted values. The
scale-location plot for two factors is shown in Figure 10(a) and the plot for all factors is shown in
Figure 10(b).
Figure 10(a) Scale-Location Plot for Two Factors
Figure 10(b) Scale-Location Plot for Ten Factors
The next plot shows residuals against leverage. Normally, four cases can be read from this plot:
a well-behaved point (low residual, low leverage), a high residual with low leverage, a high residual
with high leverage, and a low residual with high leverage. The residuals vs leverage plot for two
factors is shown in Figure 11(a) and the plot for all factors is shown in Figure 11(b).
Figure 11(a) Residuals vs Leverage for Two Factors
Figure 11(b) Residuals vs Leverage for Ten Factors
The performance factors, opinion value and MAPE, are evaluated for the proposed model.
The opinion value obtained from the sentiment scores is 0.9903486.
The MAPE value obtained from the prediction results is 0.1082715.
The sentiment analysis was done by considering a greater number of factors than the existing
system. The NRC emotion lexicon plays a major role in classifying sentiments and emotions.
Financial prices are obtained from Yahoo Finance and processed to produce a time series
representation of every close price in the given interval. The obtained data and the sentiment factors
are stored in a separate CSV file, which is given as input to the multiple regression technique.
Prediction was done in two ways: by regressing the close price on the positive and negative scores
alone, and by regressing the close price on all the emotion and sentiment scores. The performance
factors opinion value and MAPE are computed. The opinion value depends on the sentiment scores, and
MAPE is a measure of accuracy. Finally, the fit, lower and upper values are obtained as the prediction
results for both cases.
CONCLUSION
Various big data analytics techniques have been studied in order to choose a suitable predictive
analytics technique for the proposed model. Multiple regression analysis appears suitable, as it can
include the various factors that affect stock prices. In this work, sentiment analysis was performed
with the NRC emotion lexicon on tweets obtained from Twitter in real time using the Twitter Stream API.
Financial data for IBM, giving the close prices, were obtained from Yahoo Finance and plotted with the
help of a time series package. The sentiment scores and emotion factor scores obtained were related to
the historical stock price data to predict the stock price. The prediction technique used in this work
was multiple linear regression. The accuracy performance factors were evaluated for the prediction
results, and the opinion value was found from the sentiment scores obtained from the NRC sentiment
analysis. In this work, tweets and financial data for only one company (IBM) were considered. In future
work, tweets and historical data covering more than six months can be fetched and fed as input to the
multiple regression technique. Tweets and historical prices for more than one company can also be
considered for more accurate prediction.
REFERENCES
1. Thien Hai Nguyen, Kiyoaki Shirai, Julien Velcin, “Sentiment analysis on social media for stock
movement prediction”, Expert Systems with Applications, 2015.
2. Bollen, J., Mao, H., & Zeng, X, “Twitter mood predicts the stock market”, Journal of
Computational Science, Vol 2, Issue 1, 2011.
3. Fagner Andrade, Luis Enrique, Christiane Naire, Azevedo Reis, “The Use of Artificial Neural
Networks in the Analysis and Prediction of Stock Prices”, Systems, Man, and Cybernetics (SMC),
IEEE International Conference, 2013.
4. Paul C. Zikopoulos, Chris Eaton, Dirk deRoos, “Understanding Big Data Analytics for Enterprise
Class Hadoop and Streaming Data”, McGraw-Hill companies, 2012.
5. Girija V Attigeri, Manohara Pai M M, Radhika M Pai, Aparna Nayak, “Stock Market Prediction:
A Big Data Approach”, TENCON – IEEE conference, 2015.
6. Chun-Wei Tsai, Chin-Feng Lai, Han-Chieh Chao and Athanasios V. Vasilakos, “Big data
analytics: A survey”, Journal of Big Data, Springer international publishing, 2015.
7. Rishabh Soni, K. James Mathai, “Improved Twitter Sentiment Prediction through ‘Cluster-then-
Predict Model’ ”, International Journal of Computer Science and Network, Vol 4, Issue 4, 2015.
8. Jigar Patel, Sahil Shah, Priyank Thakkar , K Kotecha, “Predicting stock and stock price index
movement using Trend Deterministic Data Preparation and machine learning techniques”, Expert
Systems with Applications, 2015.
9. Bhakti G. Deshmukh, Premkumar S. Jain, Dr. M. S. Patwardhan, Viraj Kulkarni, “Spin-offs in
Indian Stock Market owing to Twitter Sentiments, Commodity Prices and Analyst
Recommendations”, All India Council for Technical Education, 2016.
10. K.K. Suresh Kumar, N.M. Elango, “Performance Analysis of Stock Price prediction using
Artificial Neural Network”, Global Journal of Computer Science and Technology, Vol 12, Issue
1, 2012.
11. Kibum Kim, Seungmin Yang, Dongyoung Kim, Jeawon Park, Jaehyun Choi, “A Stock Prediction
System Based on News and Twitter”, International Journal of Software Engineering and Its
Applications, Vol 10, Issue 6, 2016.
12. Anthony R Calingo, Ariel M Sison, Bartolome T Tanguilig, “Prediction Model of the Stock Market
Index Using Twitter Sentiment Analysis”, International Journal of Information Technology and
Computer Science, Volume 8, Issue 2, 2016.
13. Alexander Porshnev, Ilya Redkin, Alexey Shevchenko, “Improving Prediction of Stock Market
Indices by Analyzing the Psychological States of Twitter Users”, Financial Economics, Volume
22, 2013.
14. Borko Furht, Flavio Villanustre, “Big Data Technologies and Applications”, Comparison between
the Frameworks/Platforms of the Big Data, page number - 31, 32.
15. Walaa Medhat, Ahmed Hassan, Hoda Korashy, “Sentiment analysis algorithms and applications
– a survey”, Ain Shams Engineering Journal, Volume 5, Issue 4, 2014.
16. Saif M. Mohammad, Peter D. Turney, “Crowdsourcing a Word–Emotion Association Lexicon”,
National Research Council Canada, Computational Intelligence, Volume 29, Issue 3, 2013.
17. Ashish Katrekar, “An Introduction to Sentiment Analysis”, AVP, Big Data Analytics, Page
numbers – 2 and 3.
18. Orlando, Fla, “Gartner Says Big Data Creates Big Jobs: 4.4 Million IT Jobs Globally to Support
Big Data By 2015”, 2012.
19. Leona S. Aiken, Stephen G. West, Steven C. Pitts, “Multiple Linear Regression”, Part four – Data
Analysis method, 2003.
20. Reddy D. Maheswara, “Trends and Opportunities in Big Data Analytics-An Overview”, Indian
Journal of Science, Volume 23, Issue 80, 2016.