Juan Mateos-Garcia, Nesta P&R. NEMODE PDW, BAM Conference, 9-11 September 2014

The profile of the management (data) scientist: Potential scenarios and skills for B/SMD-based management research

Juan Mateos-Garcia, Nesta P&R

NEMODE PDW, BAM Conference, 9-11 September 2014


Organisational + personal context

• Nesta: The UK’s innovation foundation, with a mission to help people and organisations bring great ideas to life.
• Doing research on data skills for the BIS data capability strategy, in partnership with RSS and Creative Skillset.
• Doing some ‘big’ data work myself.
• I used to do management research (CENTRIM).

Draw on all this to reflect on the implications of big data for management research, focusing on skills.


1. Definitions

More online activity, digital processes, better hardware.

• Larger volumes of data
• Generated at faster velocities
• More varieties of data

New applications: Data-driven (automated, personalised) products, processes and services. New formats for data communication.


More complexity


New opportunities for researchers

• Coverage: Large samples.
• Revelation: Make the invisible visible, reveal preferences, run experiments.
• Granularity: High level of resolution (temporal + dimensional).
• Cheap! £££


3. MOR examples

I looked at the abstracts of 103 papers in the last three issues of [1] AOMJ, [2] BJM and [3] Management Science. No ‘big data’ papers in [1] and [2]; 11 in [3] (8 of them in a ‘Business Analytics’ special issue).

• Aral + Walker. Data source: Facebook (proprietary). Topic: Use RCTs to study social influence. Large samples and high levels of granularity allow them to consider how social influence interacts with tie embeddedness and tie strength.

• Bao + Datta. Data source: SEC (open). Topic: Use unsupervised learning to identify and quantify risk types in ~14,000 annual reports, benchmark the approach against other classification methods, and develop an interactive platform to explore the findings.

• Ghose + Han. Data source: App Store + Google Play (open). Topic: Scrape App Store and Google Play data to create a sales panel, which they use to estimate consumer demand and how it is affected by app features, including the pricing model.

• Tambe. Data source: LinkedIn (proprietary). Topic: Quantify businesses' big data capabilities and measure inter-company recruitment networks to estimate spillovers from inter-company skill investment.


Technical skills required, or the profile of the management data scientist

Data pipeline:

Access data
• Get data: Web scraping/API programming skills
• Run experiments: Experimental designs
• Manage and process the data: Database management
• Clean the data: ‘wrangling’ (and patience)

Model data
• Initial visualisation: Exploratory data analysis
• Dimension reduction: Cluster analysis, PCA
• Model selection, estimation, evaluation: Econometrics/statistics/machine learning

Present findings
• Display findings visually + interactively: Data visualisation
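To make the ‘clean the data’ step concrete, here is a minimal sketch in standard-library Python. It assumes a handful of hypothetical scraped app records; the field names, values and the cleaning rules (drop records with missing downloads, drop repeated apps) are invented for illustration and are not taken from any of the studies mentioned in this talk.

```python
import statistics

# Hypothetical raw records, as they might come back from a scraper or API.
# All field names and values are invented for illustration.
raw_records = [
    {"app": "Alpha", "price": "£0.99", "downloads": "1,200"},
    {"app": "beta ", "price": "free", "downloads": "15,000"},
    {"app": "Gamma", "price": "£2.49", "downloads": None},     # missing value
    {"app": "Alpha", "price": "£0.99", "downloads": "1,200"},  # duplicate
]


def clean(record):
    """Return a tidy copy of one record, or None if it is unusable."""
    if record["downloads"] is None:
        return None
    price_text = record["price"].strip().lower()
    price = 0.0 if price_text == "free" else float(price_text.lstrip("£"))
    return {
        "app": record["app"].strip().title(),
        "price": price,
        "downloads": int(record["downloads"].replace(",", "")),
    }


def wrangle(records):
    """Clean every record, dropping unusable rows and repeated apps."""
    seen = set()
    tidy = []
    for record in records:
        cleaned = clean(record)
        if cleaned is None or cleaned["app"] in seen:
            continue
        seen.add(cleaned["app"])
        tidy.append(cleaned)
    return tidy


tidy_records = wrangle(raw_records)
# A first exploratory summary once the data is tidy.
mean_downloads = statistics.mean(r["downloads"] for r in tidy_records)
```

Even in this toy case, half of the raw rows are discarded before any modelling starts, which is why ‘patience’ belongs on the skills list.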


Challenges (not all technical)

Data pipeline (requires theory and domain knowledge):

Access data
• Obtain proprietary data.
• Manage anonymity and ethical issues (including experimental research, cf. Facebook's infamous RCT).

Model data
• Ask the right questions: “The best dimension reduction tool that there is.”
• Be careful with biases: N = All? Rarely. It is important to understand the (administrative and organisational) processes that generated the data.

Present findings
• Deal with the false positives that are bound to happen with large samples and multiple tests.
• Encourage consilience through reproducibility and by relating findings to wider bodies of knowledge.
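The false-positive point can be illustrated with a small standard-library Python sketch. It computes the chance of at least one false positive across many independent tests, then applies the Benjamini-Hochberg false discovery rate procedure. That procedure is one common correction chosen here for illustration; the talk does not name a specific method. The p-values are made up.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Rejects the k smallest p-values, where k is the largest rank such
    that p_(k) <= (k / m) * q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values from ten hypothetical tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(p < 0.05 for p in p_values)         # uncorrected threshold
corrected_hits = sum(benjamini_hochberg(p_values))   # after BH correction
risk_20_tests = familywise_error_rate(0.05, 20)
```

With these invented numbers, five results pass the uncorrected 0.05 threshold but only two survive the correction, and running 20 independent tests at the 0.05 level gives roughly a 64% chance of at least one spurious finding.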


Institutional solutions

• People with technical skills and domain knowledge are rare -> Unicorns.
• Supply push + demand pull to increase MOR big data capabilities.
• Internal dialogue within the discipline and with other disciplines (Computer Science, Information Systems).
• Acknowledge big data limitations for looking at important issues (power, perceptions, structural change).


THANK YOU

@JMateosGarcia
