Web and Social Computing - Presentation Week8

Post on 11-Jan-2017

Articles Overview

Maneshka Paiva and Matt Courtney

Article 1: Mining Dynamic Social Networks From Public News Articles For Company Value Prediction

Overview of Article

● Developed algorithms and a system to infer large-scale evolutionary company networks from public news during 1981-2009

● Prediction of company profits and revenue growth using the network changes over time

● Proposal of a feature extraction and selection algorithm for longitudinal networks

● Measures how networks affect company performance and what network features are important

Article’s Research Questions

1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?

2. How can we infer evolutionary company networks?

3. What features of a longitudinal network are useful for a company and how can they be generated?

Three company value prediction categories

1. Use the company’s financial statements

2. Use historical trends to identify price patterns and likely company activity

3. Social Network Analysis (SNA) to examine relational and structural embeddedness of companies on intercompany networks

→ by combining both historical and financial information!

Data

Extracted Dataset:

● New York Times (1981-2009)
○ Fortune 500 companies (minimum of 3 years)

Metrics:

● Co-occurrence of company names at document-level & sentence-level.
○ Generate ‘impact score’ from two aggregated values.
○ Heuristically weight each factor to emulate natural relationship.
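The co-occurrence metric above can be illustrated with a minimal sketch. The substring matching of company names, the naive split on full stops, and the weights `w_doc` and `w_sent` are assumptions made for illustration; the slides do not state the article's actual weighting or text processing.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents, companies):
    """Count company-pair co-occurrences at document and sentence level."""
    doc_counts, sent_counts = Counter(), Counter()
    for doc in documents:
        # Document level: every pair of companies named anywhere in the text.
        in_doc = sorted(c for c in companies if c in doc)
        doc_counts.update(combinations(in_doc, 2))
        # Sentence level: naive split on full stops (an assumption).
        for sentence in doc.split("."):
            in_sent = sorted(c for c in companies if c in sentence)
            sent_counts.update(combinations(in_sent, 2))
    return doc_counts, sent_counts

def impact_scores(doc_counts, sent_counts, w_doc=1.0, w_sent=2.0):
    """Combine the two aggregated counts with hypothetical heuristic weights."""
    pairs = set(doc_counts) | set(sent_counts)
    return {p: w_doc * doc_counts[p] + w_sent * sent_counts[p] for p in pairs}

documents = [
    "IBM sued Intel. Microsoft watched.",
    "IBM and Microsoft partnered.",
]
companies = ["IBM", "Intel", "Microsoft"]
doc_counts, sent_counts = cooccurrence_counts(documents, companies)
scores = impact_scores(doc_counts, sent_counts)
print(scores[("IBM", "Microsoft")])  # 4.0
```

Weighting sentence-level co-occurrence more heavily reflects the intuition that two companies named in the same sentence are more closely related than two that merely share an article.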

Network

“Longitudinal intercompany impact networks from public news (i.e. New York Times)”

[Figure: Microsoft’s network in 2003 and in 2009]

Data Processing

Dataset Generation

Given a set of companies (V), a time period (T), and a data source (D)

Extract inter-company networks at each given period GT = {Gt1, Gt2, …, Gtk}

Feature Vectors

Pick a company from the set V (x ∈ V)

Generate a feature vector from its embeddedness in the inter-company networks GT (FTx)
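As a rough sketch of the step above, the snippet below builds a feature vector for one company from a sequence of per-period networks. The network representation (a dict of weighted edges per period) and the two features used (connection count and total edge weight) are assumptions for illustration; the article's 72 features are not listed in the slides.

```python
def feature_vector(company, networks):
    """Flatten per-period embeddedness features for one company.

    networks: {period: {(company_a, company_b): weight}}
    Returns [count_t1, weight_t1, count_t2, weight_t2, ...] ordered by period.
    """
    features = []
    for period in sorted(networks):
        edges = networks[period]
        # Weights of all edges that touch the chosen company in this period.
        weights = [w for (a, b), w in edges.items() if company in (a, b)]
        features.extend([len(weights), sum(weights)])
    return features

networks = {
    2003: {("IBM", "Microsoft"): 4.0, ("Intel", "Microsoft"): 1.0},
    2004: {("IBM", "Intel"): 3.0},
}
print(feature_vector("Microsoft", networks))  # [2, 5.0, 0, 0]
```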

Mining Network - Predicting Value

72-dimensional temporal network effects generated

● Some of those features don’t have a positive effect on company valuations.
○ Feature selection can be beneficial to both accuracy and efficiency.

● Three methods of feature selection considered:
○ Individual feature selection
○ Network feature variance
○ Feature set selection

Mining Network - Individual Feature Selection

Use Spearman’s correlation to find important features (high correlation).
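A minimal sketch of ranking features by Spearman’s correlation follows. It computes the coefficient as the Pearson correlation of average ranks, using only the standard library. The feature names and values are hypothetical, and the code assumes every column has some variation (a constant column would cause a division by zero).

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

def rank_features(features, target):
    """Order feature names by absolute Spearman correlation with the target."""
    scores = {name: spearman(column, target) for name, column in features.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

# Hypothetical feature columns and profit values for four observations.
features = {"degree": [1, 2, 3, 4], "noise": [2, 1, 4, 3]}
profits = [10, 20, 30, 50]
print(rank_features(features, profits))  # ['degree', 'noise']
```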

“If there is an increase in the ratio of the number of connections that a company has with the number of connections that its neighbors have, then the value of its profits will increase.”

Mining Network - Network Feature Variance

Tune a network with a threshold variance:

● Some features will depend very sensitively on the existence of an edge.
○ Measure feature variance in connected structure with different thresholds.

● If a feature has high associated variance, it is because the feature varies greatly when an edge with a highly connected neighbor is removed.
○ This indicates how important a feature is for the network.

Mining Network - Network Feature Variance

● Maximise the sum of importance scores of individual features.

● Minimise the sum of similarity scores between features.
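The two objectives above pull against each other, and one simple way to trade them off is a greedy heuristic. The sketch below is not the method of Geng et al. used in the article; it is a stand-in for illustration, and the `penalty` parameter, the feature names, and all score values are hypothetical.

```python
def greedy_select(importance, similarity, k, penalty=1.0):
    """Greedily pick k features with high importance and low mutual similarity.

    importance: {feature: score}
    similarity: {frozenset({feature_a, feature_b}): score}
    """
    selected = []
    candidates = set(importance)
    while candidates and len(selected) < k:
        def gain(feature):
            # Importance minus similarity to everything already chosen.
            overlap = sum(
                similarity.get(frozenset((feature, chosen)), 0.0)
                for chosen in selected
            )
            return importance[feature] - penalty * overlap
        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected

importance = {"degree": 0.9, "strength": 0.85, "betweenness": 0.5}
similarity = {
    frozenset(("degree", "strength")): 0.8,
    frozenset(("degree", "betweenness")): 0.1,
    frozenset(("strength", "betweenness")): 0.1,
}
print(greedy_select(importance, similarity, k=2))  # ['degree', 'betweenness']
```

In this toy example `strength` is nearly as important as `degree` but is skipped because the two are highly similar, so the less redundant `betweenness` is chosen instead.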

Mining Network - Feature Set Selection

A more optimal outcome can be found if more than one feature is used.

Feature selection models from Geng et al used.

After generating and selecting feature sets, they are used to predict company value.

Evaluation of prediction results

STEPS:

1. Predict company values (profits and revenues) for 20 of the ‘Fortune 500’ companies

2. Evaluate the effectiveness of network features and parameters on profits and revenues

Performance Evaluation measures

Squared Correlation Coefficient (r²) - To quantify correlation

Mean Squared Error (MSE) - Error between the predicted valuations and the true valuation
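Both measures are standard and can be computed directly. The sketch below uses the usual textbook definitions with made-up numbers; it assumes neither series is constant, since r² is undefined in that case.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between true and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def squared_correlation(actual, predicted):
    """Square of the Pearson correlation between true and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)

actual = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
print(squared_correlation(actual, predicted))  # 1.0
print(mean_squared_error(actual, predicted))
```

The example shows why the two measures are reported together: the predictions are perfectly correlated with the true values (r² = 1) yet are still off in magnitude, which only the MSE reveals.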

Step 1: Company value prediction - (1)

● Select 20 large companies from different industries from the ‘Fortune 500’ list

● Companies Selected: IBM, Intel, Microsoft, GM, HP, Honda, Nissan, AT&T, Walmart, Yahoo, Nike, Dell, Starbucks, JP Morgan, Pepsi, Cisco Systems, FedEx, The Gap, American Electric Power, Sun Microsystems.

● These made it to the list continuously for many years and have information regarding company valuation and network.

Step 1: Company value prediction - (2)

● Learn a profit model from each 5-year window of networks and predict the next year’s profits

● Then compare the actual profit earned in that year with the predicted profit value
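The rolling evaluation described above can be sketched as follows. The article learns its model from network features; here a trivial placeholder predictor (`last_value_baseline`, a hypothetical name) stands in for the learned model, and the profit figures are invented, so only the windowing logic is illustrated.

```python
def rolling_predictions(profits_by_year, window, fit_predict):
    """For each year with enough history, predict from the preceding window.

    Returns {year: (predicted, actual)}.
    """
    years = sorted(profits_by_year)
    results = {}
    for i in range(window, len(years)):
        history = [profits_by_year[y] for y in years[i - window:i]]
        target_year = years[i]
        results[target_year] = (fit_predict(history), profits_by_year[target_year])
    return results

def last_value_baseline(history):
    """Placeholder model: predict that next year repeats the latest value."""
    return history[-1]

# Invented profit figures, for illustration only.
profits = {1990: 1.0, 1991: 2.0, 1992: 3.0, 1993: 4.0,
           1994: 5.0, 1995: 6.0, 1996: 7.0}
results = rolling_predictions(profits, window=5, fit_predict=last_value_baseline)
print(results)  # {1995: (5.0, 6.0), 1996: (6.0, 7.0)}
```

Changing `window` to 10 gives the longer training period with the same loop.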

Step 1: Company value prediction - (3)

● Year 1995 - the ONLY year in which the prediction was much lower than the real value

Step 1: Company value prediction - (4)

● Learn from 10 years to predict the following year’s profit

Step 2: Effectiveness of network features - (1)

● Evaluate effectiveness of network features

● Use different feature sets for predicting companies’ mean profits over the years and take average to compare prediction performance by each feature set.

Step 2: Effectiveness of network features - (2)

● Feature set notations:
➔ s - Structural Features (network features)
➔ t - Temporal Features
➔ p - Financial Features
➔ d - Delta change in temporal features

● Combine notation:
➔ sp - network features + financial features
➔ stdp - combination of all features
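The combined notation amounts to concatenating the named feature groups. A small sketch, with made-up feature values and a hypothetical helper name `combine`:

```python
def combine(feature_sets, notation):
    """Concatenate feature groups in the order given by the notation string."""
    vector = []
    for key in notation:
        vector.extend(feature_sets[key])
    return vector

# Invented values: s = structural, t = temporal, d = delta, p = financial.
feature_sets = {"s": [1, 2], "t": [3], "d": [4], "p": [5, 6]}
print(combine(feature_sets, "sp"))    # [1, 2, 5, 6]
print(combine(feature_sets, "stdp"))  # [1, 2, 3, 4, 5, 6]
```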

Step 2: Effectiveness of network features - (3)

● Using ‘p’ (financial profile) only has better performance than s, t, d individually

● Combined feature sets improve the profit prediction performance

● Using the ‘stdp’ (combination of all) feature set outperforms the network (‘s’) only and financial (‘p’) only feature sets by 150% and 34% respectively

Step 2: Effectiveness of network features - (4)

● Network features (‘s’, ‘t’, ‘d’) do not seem to contribute to revenue prediction

● In the graphs, any combination that includes the financial feature set ‘p’ shows a significant impact on the revenue prediction

Step 2: Effectiveness of network features - (5)

Learning outcomes of using network features:

● For profit prediction, the financial feature set contributes more than the network feature sets.

● A combination of all features further improves the profit prediction.

● Network features do not seem to contribute to revenue prediction; only the financial feature set does.

Step 2: Effectiveness of parameters - (1)

● Tune parameters window size and delta size

● Compare companies’ networks that existed 1 and 3 years prior

● Take the average of r² across the different years

Step 2: Effectiveness of parameters - (2)

● Both window size and delta size showed similar results

● Better results when using networks from the previous year (window, delta = 1) rather than 3 years prior

Step 2: Effectiveness of parameters - (3)

Learning outcomes of using parameters:

● Either window or delta alone is adequate for profit prediction

● A window or delta of 1 previous year gives a more accurate prediction than networks from 3 years prior

● Networks display a 1-year lagged impact on changes in a company’s value

Article’s Research Questions - Learning Outcomes

1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?

Yes, we have seen this by using dynamic social networks of companies.

2. How can we infer evolutionary company networks?

Developed an algorithm to infer longitudinal company networks.

3. What features of a longitudinal network are useful for a company and how can they be generated?

Use of network features, financial features and combinations of these features for company value prediction. Deduced that network features contribute towards profit prediction but do not seem to help predict revenues.

Article 2: Network Science, Web Science and Internet Science

Definitions

Network Science - Understanding the emergence of networks, developing models to foresee how networks evolve and optimising networks.

Web Science - Study of large-scale socio-technical systems such as the WWW. This considers the relationship between people and technology, the ways in which they complement each other, and the impact this has on society.

Internet Science - A discipline that looks into the evolution of internet networks and society. The internet provides an infrastructure on which human activity is fast becoming heavily reliant.

Thanks for listening.

Questions?