Statistical Programming with JavaScript

Date post: 13-Apr-2017
Upload: david-simons
STATISTICAL PROGRAMMING IN JAVASCRIPT David Simons @SwamWithTurtles
Transcript
Page 1: Statistical Programming with JavaScript

STATISTICAL PROGRAMMING IN JAVASCRIPT

David Simons @SwamWithTurtles

Page 2: Statistical Programming with JavaScript

slides:

www.tinyurl.com/stats-js

Page 3: Statistical Programming with JavaScript

demos: swamwithturtles.github.io/js-statistics

code: github.com/SwamWithTurtles/js-statistics

Page 4: Statistical Programming with JavaScript

WHO AM I?

Freelance Software Developer

@SwamWithTurtles

Java and JavaScript

Afraid of goats?

Page 5: Statistical Programming with JavaScript

WHO AM I?

DATA NERD

Page 6: Statistical Programming with JavaScript

CONTENTS

WHAT IS DATA?

GAINING INSIGHTS

RANDOMNESS

SIMULATION

LEARNING THROUGH: THEORY, CASE STUDIES, JAVASCRIPT APPLICATION

Reward: What shape is the internet?

Page 7: Statistical Programming with JavaScript

Data

Page 8: Statistical Programming with JavaScript
Page 9: Statistical Programming with JavaScript

BEHIND THE HOOD

API

DB

ADMIN INTERFACE

SCHEDULED TASKS

3RD PARTY APIS

Page 10: Statistical Programming with JavaScript

SO… WHAT DATA WAS THERE?

Page 11: Statistical Programming with JavaScript

WHAT DATA WAS THERE?

• Counts of lists (e.g. brands, products etc.)

• Stock levels and prices of products

• Days an item has been out of stock

Page 12: Statistical Programming with JavaScript

WHAT DATA WAS THERE?

• Non-functional data

• Numbers of users

• Performance for users

• Performance of third party APIs

• Robustness of system (Uptime, status codes, frequency of errors)

Page 13: Statistical Programming with JavaScript

THE LESSON? THERE IS DATA EVERYWHERE

Page 14: Statistical Programming with JavaScript

What is data?

Page 15: Statistical Programming with JavaScript

What is good data?

Page 16: Statistical Programming with JavaScript

WHAT DATA SHOULD I CARE ABOUT?

• Data you get repeatedly

• Data you can extract ‘information’ from

• Normally this means numerical data, though NLP is getting big!

• Data that answers valuable questions

Page 17: Statistical Programming with JavaScript

Gaining Insights

Page 18: Statistical Programming with JavaScript

A data set:

Identification WIND CEILING TEMP DEWPT RHX
USAF NCDC Date HrMn I Type QCP Dir Q I Spd Q Hgt Q I I Temp Q Dewpt Q RHx
865300,99999,19860401,0000,4,FM-12, ,110,1,N, 7.2,1,22000,1,C,N, 21.6,1, 19.2,1, 86,
865300,99999,19860401,0300,4,FM-12, ,110,1,N, 5.1,1,22000,1,C,N, 19.4,1, 18.5,1, 95,
865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,
865300,99999,19860401,0900,4,FM-12, ,070,1,N, 6.2,1,00120,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860401,1200,4,FM-12, ,070,1,N, 7.7,1,03600,1,C,N, 21.6,1, 18.3,1, 82,
865300,99999,19860401,1500,4,FM-12, ,040,1,N, 9.8,1,03600,1,C,N, 23.0,1, 18.8,1, 77,
865300,99999,19860401,1800,4,FM-12, ,030,1,N, 6.2,1,03600,1,C,N, 19.6,1, 19.0,1, 96,
865300,99999,19860401,2100,4,FM-12, ,050,1,N, 6.7,1,03600,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,0000,4,FM-12, ,340,1,N, 7.2,1,03600,1,C,N, 20.0,1, 19.4,1, 96,
865300,99999,19860402,0300,4,FM-12, ,360,1,N, 4.1,1,03600,1,C,N, 19.4,1, 19.1,1, 98,
865300,99999,19860402,0600,4,FM-12, ,999,1,C, 0.0,1,03600,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860402,0900,4,FM-12, ,999,1,C, 0.0,1,00210,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,1200,4,FM-12, ,200,1,N, 2.6,1,00210,1,C,N, 20.4,1, 20.1,1, 98,
865300,99999,19860402,1500,4,FM-12, ,210,1,N, 5.1,1,00750,1,C,N, 23.2,1, 19.3,1, 79,
865300,99999,19860402,1800,4,FM-12, ,200,1,N, 3.1,1,00750,1,C,N, 26.4,1, 18.4,1, 62,
865300,99999,19860402,2100,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 26.2,1, 17.1,1, 57,
865300,99999,19860403,0000,4,FM-12, ,140,1,N, 4.1,1,22000,1,C,N, 19.2,1, 17.0,1, 87,
865300,99999,19860403,0300,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.8,1, 15.2,1, 96,
865300,99999,19860403,0600,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.4,1, 14.0,1, 91,
865300,99999,19860403,1200,4,FM-12, ,060,1,N, 5.1,1,22000,1,C,N, 21.0,1, 19.8,1, 93,
865300,99999,19860403,1500,4,FM-12, ,060,1,N, 4.1,1,00900,1,C,N, 24.8,1, 21.3,1, 81,
865300,99999,19860403,1800,4,FM-12, ,050,1,N, 7.7,1,09000,1,C,N, 28.0,1, 21.4,1, 67,
865300,99999,19860403,2100,4,FM-12, ,040,1,N, 5.1,1,09000,1,C,N, 25.4,1, 21.4,1, 79,
865300,99999,19860404,0000,4,FM-12, ,060,1,N, 6.2,1,03600,1,C,N, 22.2,1, 21.3,1, 95,
865300,99999,19860404,0300,4,FM-12, ,050,1,N, 5.1,1,09000,1,C,N, 21.0,1, 20.7,1, 98,
865300,99999,19860404,0600,4,FM-12, ,060,1,N, 6.2,1,22000,1,C,N, 20.2,1, 19.9,1, 98,
865300,99999,19860404,1200,4,FM-12, ,040,1,N, 5.1,1,00120,1,C,N, 20.4,1, 19.5,1, 95,
865300,99999,19860404,1500,4,FM-12, ,020,1,N, 7.7,1,00420,1,C,N, 24.2,1, 20.4,1, 79,
865300,99999,19860404,1800,4,FM-12, ,250,1,N, 4.1,1,00750,1,C,N, 25.6,1, 20.7,1, 74,
865300,99999,19860404,2100,4,FM-12, ,250,1,N, 5.1,1,00750,1,C,N, 23.6,1, 20.4,1, 82,
865300,99999,19860405,0000,4,FM-12, ,180,1,N, 6.2,1,00420,1,C,N, 20.2,1, 19.6,1, 96,
865300,99999,19860405,0300,4,FM-12, ,160,1,N, 5.1,1,00120,1,C,N, 18.6,1, 18.0,1, 96,
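
Before any statistic can be computed, rows like these have to be turned into numbers. The following is only a sketch: it assumes the column order given in the header row, and it assumes that a reading of 999.9 is a "missing value" placeholder rather than a real measurement. The names `COLUMNS`, `MISSING` and `parseRow` are invented for illustration.

```javascript
// Column positions follow the header row of the data set above.
const COLUMNS = { date: 2, time: 3, windSpeed: 10, temp: 16, dewPoint: 18 };

// Assumption: 999.9 marks a missing reading, not a real temperature.
const MISSING = 999.9;

function parseRow(line) {
  const fields = line.split(',').map(field => field.trim());
  const reading = name => {
    const value = Number(fields[COLUMNS[name]]);
    return value === MISSING ? null : value;
  };
  return {
    date: fields[COLUMNS.date],
    time: fields[COLUMNS.time],
    windSpeed: reading('windSpeed'),
    temp: reading('temp'),
    dewPoint: reading('dewPoint'),
  };
}

// Third row of the data set: its dew point is recorded as 999.9.
const sampleRow =
  '865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,';
console.log(parseRow(sampleRow));
```

Returning `null` for the placeholder keeps it from being silently averaged in with genuine readings later on.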

Page 19: Statistical Programming with JavaScript

summary statistics

Page 20: Statistical Programming with JavaScript

SUMMARY STATISTICS

• A statistic is a function of the data we have input

• It aims to capture information about values to make it more understandable

Page 21: Statistical Programming with JavaScript

THE FAMOUS ONE:

• Mean (‘average’)

• Sum all of the data and divide by the number of items

• Gives a sense of ‘size’
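
The recipe in the bullets above translates almost word for word into code. This is a minimal sketch; the function name `mean` and the sample values (temperatures from the first four rows of the data set shown earlier) are chosen purely for illustration.

```javascript
// Mean: sum all of the data and divide by the number of items.
function mean(values) {
  if (values.length === 0) {
    throw new Error('mean of an empty data set is undefined');
  }
  const total = values.reduce((sum, x) => sum + x, 0);
  return total / values.length;
}

// Temperatures from the first four rows of the data set.
const temps = [21.6, 19.4, 19.2, 19.2];
console.log(mean(temps));
```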

Page 22: Statistical Programming with JavaScript

Group 1:

Group 2:

Page 23: Statistical Programming with JavaScript

OTHER STATISTICS

• “Location”

• Mean, Mode, Median

• “Spread”

• Standard Deviation

• “Shape”

• Skew, Kurtosis
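
Each of these statistics is a small function over an array of numbers. The sketch below uses the population (divide by n) definitions, which is one convention among several; libraries may use sample (divide by n - 1) versions instead. All function names are illustrative.

```javascript
const average = xs => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Location: the middle value once the data is sorted.
function median(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Location: the most frequent value (first one found wins a tie).
function mode(xs) {
  const counts = new Map();
  let best = xs[0];
  for (const x of xs) {
    const count = (counts.get(x) || 0) + 1;
    counts.set(x, count);
    if (count > counts.get(best)) best = x;
  }
  return best;
}

// k-th central moment: the average of (x - mean)^k.
function centralMoment(xs, k) {
  const m = average(xs);
  return average(xs.map(x => (x - m) ** k));
}

// Spread.
const standardDeviation = xs => Math.sqrt(centralMoment(xs, 2));

// Shape: asymmetry, and heaviness of tails relative to a normal distribution.
const skewness = xs => centralMoment(xs, 3) / standardDeviation(xs) ** 3;
const excessKurtosis = xs => centralMoment(xs, 4) / centralMoment(xs, 2) ** 2 - 3;

const sample = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(median(sample), mode(sample), standardDeviation(sample));
console.log(skewness(sample), excessKurtosis(sample));
```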

Page 24: Statistical Programming with JavaScript

DEMO

Page 25: Statistical Programming with JavaScript

Distributions

Page 26: Statistical Programming with JavaScript

What is a random variable?

Page 27: Statistical Programming with JavaScript

Discrete Variables

Can be any of a list of values, each with its own probability

HEADS  0.5
TAILS  0.5

2   1/36
3   2/36
4   3/36
5   4/36
6   5/36
7   6/36
8   5/36
9   4/36
10  3/36
11  2/36
12  1/36
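
The second table on this slide matches the total shown by two six-sided dice. A short sketch that rebuilds it by brute force, listing all 36 equally likely rolls and counting how many give each total (the function name is illustrative):

```javascript
// Enumerate all 36 equally likely outcomes of rolling two six-sided dice
// and count how many of them produce each total.
function twoDiceDistribution() {
  const counts = {};
  for (let first = 1; first <= 6; first++) {
    for (let second = 1; second <= 6; second++) {
      const total = first + second;
      counts[total] = (counts[total] || 0) + 1;
    }
  }
  return counts;
}

const diceCounts = twoDiceDistribution();
for (const total of Object.keys(diceCounts)) {
  console.log(`${total}: ${diceCounts[total]}/36`);
}
```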

Page 28: Statistical Programming with JavaScript

This makes sense:

X = Result of a coin flip

HEADS  0.5
TAILS  0.5

But: X won’t always have the same value

Page 29: Statistical Programming with JavaScript

RANDOM VARIABLES

X = Result of a coin flip

HEADS  0.5
TAILS  0.5

X is a Random Variable

This is its distribution
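
In code, a random variable can be thought of as a function that returns a fresh 'realisation' on every call, and its distribution as the frequencies those calls settle towards. A minimal sketch, with illustrative names and an optional `rng` parameter added only so the behaviour can be pinned down in tests:

```javascript
// One realisation of X: the result of a fair coin flip.
function flipCoin(rng = Math.random) {
  return rng() < 0.5 ? 'HEADS' : 'TAILS';
}

// Draw n realisations and report how often each value was seen.
function observedFrequencies(sampler, n) {
  const counts = {};
  for (let i = 0; i < n; i++) {
    const value = sampler();
    counts[value] = (counts[value] || 0) + 1;
  }
  const frequencies = {};
  for (const value of Object.keys(counts)) {
    frequencies[value] = counts[value] / n;
  }
  return frequencies;
}

// X won't always have the same value...
console.log(flipCoin(), flipCoin(), flipCoin());
// ...but over many flips the frequencies approach the distribution.
console.log(observedFrequencies(flipCoin, 10000));
```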

Page 30: Statistical Programming with JavaScript

DEMO…

Page 31: Statistical Programming with JavaScript

Continuous

A numerical variable that can be any number (sometimes within a range)

height

weight

Math.random()

Page 32: Statistical Programming with JavaScript

HOW DO WE DEFINE THE DISTRIBUTION?

Math.random()

height
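
A continuous variable has no finite table of values, so one common way to see its distribution is to draw many samples and bin them into a histogram. The sketch below does that for `Math.random()`; the helper name `histogram` and its parameters are illustrative.

```javascript
// Approximate a distribution's shape: split [min, max) into equal-width
// bins and record the share of samples that lands in each bin.
function histogram(samples, min, max, binCount) {
  const bins = new Array(binCount).fill(0);
  const width = (max - min) / binCount;
  for (const x of samples) {
    if (x < min || x >= max) continue; // ignore out-of-range samples
    const index = Math.min(binCount - 1, Math.floor((x - min) / width));
    bins[index] += 1;
  }
  return bins.map(count => count / samples.length);
}

const uniformSamples = Array.from({ length: 100000 }, () => Math.random());
// Every bin should hold roughly the same share: a flat, 'uniform' shape.
console.log(histogram(uniformSamples, 0, 1, 10));
```

Running the same helper over measured heights would be expected to give a bell-like shape rather than a flat one.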

Page 33: Statistical Programming with JavaScript

DEMO

Page 34: Statistical Programming with JavaScript

ERRR… SO WHAT?

Page 35: Statistical Programming with JavaScript

• When we do data analysis, we’re really looking at the range of values a random variable can be…

• … and asking questions about its distribution.

Page 36: Statistical Programming with JavaScript

IMAGINE… YOU’RE AN AUDITOR

Page 37: Statistical Programming with JavaScript

AUDITING A LEDGER

• Make a list of all incoming and outgoing transactions

• These are random variables.

• What is their distribution? Does it deviate from what we expect?

Page 38: Statistical Programming with JavaScript

BENFORD’S LAW

http://www.journalofaccountancy.com/Issues/1999/May/nigrini
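
Benford's law says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with a 1. A sketch of how an auditor might compare a ledger against that expectation follows; the ledger values are made up for illustration and the function names are not from the talk.

```javascript
// Expected share of leading digit d under Benford's law.
function benfordExpected(d) {
  return Math.log10(1 + 1 / d);
}

// First significant digit of a non-zero amount, e.g. 0.042 -> 4, -1250 -> 1.
function leadingDigit(amount) {
  const digits = Math.abs(amount).toExponential(); // e.g. "1.25e+3"
  return Number(digits[0]);
}

// Observed share of each leading digit; index 0 is unused.
function leadingDigitShares(amounts) {
  const counts = new Array(10).fill(0);
  const nonZero = amounts.filter(amount => amount !== 0);
  for (const amount of nonZero) counts[leadingDigit(amount)] += 1;
  return counts.map(count => count / nonZero.length);
}

// Hypothetical ledger entries, for illustration only.
const ledger = [1250.0, -183.4, 92.1, 1040, 310.75, 17.6, 2400, -129.99];
const observedShares = leadingDigitShares(ledger);
for (let d = 1; d <= 9; d++) {
  console.log(d, observedShares[d].toFixed(3), benfordExpected(d).toFixed(3));
}
```

A real check would need far more transactions than this, plus a proper goodness-of-fit test, before a deviation means anything.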

Page 39: Statistical Programming with JavaScript

DESIGNING INTUITIVE USER INPUTS

Page 40: Statistical Programming with JavaScript

OUR TASK…

• Designing a system that tries to understand what happens under financial system “shocks”

• So: a user would input a shock, its impacts would propagate, and we would see the effect on our bottom line.

Page 41: Statistical Programming with JavaScript

OUR FIRST ATTEMPT

• Shock ‘sliders’ that scaled linearly

0%

25% BOOM

90% BUST

Page 42: Statistical Programming with JavaScript

DISTRIBUTION OF FINANCIAL CHANGES

Page 43: Statistical Programming with JavaScript

SO…

• Shock ‘sliders’ that scaled linearly

0%

8% BOOM

105% BUST

Change that happens with 75% chance

Change that happens with 10% chance
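
One way to build a slider like this is to let its position stand for how likely a change is, and look the corresponding size up in the distribution of past changes. The sketch below is an assumption about how that could work, not the system described in the talk: it uses an empirical quantile over made-up historical data, and the function names are invented.

```javascript
// Empirical quantile, interpolating linearly between sorted observations.
function quantile(values, q) {
  const sorted = [...values].sort((a, b) => a - b);
  const position = q * (sorted.length - 1);
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower);
}

// Size of change that is met or exceeded with the given probability.
function shockForChance(historicalChanges, chance) {
  return quantile(historicalChanges, 1 - chance);
}

// Hypothetical absolute changes in percent, for illustration only.
const pastChanges = [2, 5, 8, 9, 12, 15, 20, 30, 60, 110];
console.log(shockForChance(pastChanges, 0.75)); // a commonplace change
console.log(shockForChance(pastChanges, 0.10)); // a rare, severe change
```

Moving the slider evenly through probabilities then sweeps slowly through everyday changes and quickly through the extreme tail.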

Page 44: Statistical Programming with JavaScript

Randomness

Page 45: Statistical Programming with JavaScript

MAKING RANDOM VARIABLES

Page 46: Statistical Programming with JavaScript

SOME WARNINGS

• Exactly what randomness means is a fuzzy question.

• The numbers Math.random() produces are not ‘cryptographically’ random.

Page 47: Statistical Programming with JavaScript

JAVASCRIPT’S ENTRY TO RANDOMNESS: Math.random()

• Different runtimes can implement it differently.

• V8 implemented Multiply-With-Carry until late 2015, when it switched to xorshift128+. Multiply-With-Carry works as follows:

• Take a sequence of ‘seed’ values

• Iteratively perform modular arithmetic-based operations

• Extend the initial seed values to a longer sequence.
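
The multiply-with-carry idea can be sketched in a few lines. This is a toy generator in that style, not V8's actual source: the seeds and the multipliers 18030 and 30903 are illustrative choices, and `createMwc` is an invented name.

```javascript
// Toy multiply-with-carry generator built from two 16-bit streams.
// Seeds and multipliers are illustrative assumptions.
function createMwc(seed0 = 1, seed1 = 2) {
  let state0 = seed0 >>> 0;
  let state1 = seed1 >>> 0;
  return function next() {
    // Multiply the low 16 bits and add the carry kept in the high 16 bits.
    state0 = (18030 * (state0 & 0xffff) + (state0 >>> 16)) >>> 0;
    state1 = (30903 * (state1 & 0xffff) + (state1 >>> 16)) >>> 0;
    // Combine the two streams into one unsigned 32-bit value...
    const combined = (((state0 << 16) >>> 0) + (state1 & 0xffff)) >>> 0;
    // ...and scale it into [0, 1), the same range as Math.random().
    return combined / 4294967296;
  };
}

const random = createMwc(1, 2);
console.log(random(), random(), random());
```

Because the whole sequence is determined by the seeds, the same seeds always replay the same 'random' numbers, which is exactly why such generators are called pseudo-random.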

Page 48: Statistical Programming with JavaScript

BUT… WHAT ABOUT OTHER DISTRIBUTIONS?

Page 49: Statistical Programming with JavaScript

THE SHORT ANSWER

Math.random()= f( )

Page 50: Statistical Programming with JavaScript

THE SHORT ANSWER

=

HEADS  0.5
TAILS  0.5

=
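
A standard way to relate `Math.random()` to another distribution through a function is inverse transform sampling: feed the uniform draw through the inverse of the target's cumulative distribution. Whether this is exactly what the slides pictured is an assumption; the sketch below shows the technique for a discrete table and for an exponential distribution, with illustrative function names.

```javascript
// Discrete case: walk the cumulative probabilities until they pass u.
function sampleDiscrete(distribution, u = Math.random()) {
  let cumulative = 0;
  for (const [value, probability] of distribution) {
    cumulative += probability;
    if (u < cumulative) return value;
  }
  // Guard against rounding leaving the running total a hair below 1.
  return distribution[distribution.length - 1][0];
}

// Continuous case: apply the inverse cumulative distribution function.
// For an exponential distribution with the given rate it is -ln(1 - u) / rate.
function sampleExponential(rate, u = Math.random()) {
  return -Math.log(1 - u) / rate;
}

const coin = [['HEADS', 0.5], ['TAILS', 0.5]];
console.log(sampleDiscrete(coin));
console.log(sampleExponential(2));
```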

Page 51: Statistical Programming with JavaScript

WHAT’S THE FUNCTION?

jStat

beta, centralF, cauchy, chi-squared, exponential, gamma, inverse gamma, kumaraswamy, lognormal, normal, pareto, student t, uniform, weibull, binomial, negative binomial, hypergeometric, poisson, triangular

OR
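
jStat exposes samplers for these distributions, for example `jStat.normal.sample(mean, std)`. For comparison, here is a library-free sketch of one of them, using the Box-Muller transform to turn uniform draws into normally distributed ones. The function name and the height figures are illustrative.

```javascript
// Box-Muller transform: turns two independent uniform draws into one
// draw from a normal distribution with the given mean and standard deviation.
function sampleNormal(mean = 0, sd = 1, rng = Math.random) {
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never sees 0
  const u2 = rng();
  const standard = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + sd * standard;
}

// Hypothetical heights in centimetres, for illustration only.
const heights = Array.from({ length: 10000 }, () => sampleNormal(170, 10));
const heightMean = heights.reduce((sum, x) => sum + x, 0) / heights.length;
console.log(heightMean); // close to 170
```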

Page 52: Statistical Programming with JavaScript

USING RANDOMNESS

Page 53: Statistical Programming with JavaScript

Why would I want to use RANDOMNESS?

Page 54: Statistical Programming with JavaScript

STUBBED TEST DATA

• Avoid coupling yourself to specific test implementations

• Spin-up life-like environments for load testing
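
A sketch of what randomly generated stub data could look like. The record fields (`brand`, `stockLevel`, `price`) and brand names are invented, and the seeded generator is a simple linear congruential generator added so that a failing test run can be replayed with identical data.

```javascript
// Small seeded generator (a linear congruential generator) using the
// widely quoted "Numerical Recipes" constants.
function seededRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(1664525, state) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Build `count` hypothetical product records from a source of randomness.
function makeProducts(count, rng = Math.random) {
  const brands = ['Acme', 'Globex', 'Initech'];
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    brand: brands[Math.floor(rng() * brands.length)],
    stockLevel: Math.floor(rng() * 100),
    price: Math.round((1 + rng() * 99) * 100) / 100,
  }));
}

console.log(makeProducts(3, seededRng(42)));
```

For load testing, the same function can be called with a large `count` and the default `Math.random`.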

Page 55: Statistical Programming with JavaScript

NON-DETERMINISTIC ALGORITHMS

• Modelling underlying or random data

• Solving a problem that is expensive or impossible to solve perfectly
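
A classic small example of a randomised algorithm is Monte Carlo estimation: rather than computing an answer exactly, sample at random and count. The sketch below estimates pi this way; it is an illustration of the idea rather than an example from the talk.

```javascript
// Estimate pi by throwing random points at the unit square and counting
// how many land inside the quarter circle of radius 1.
function estimatePi(samples, rng = Math.random) {
  let inside = 0;
  for (let i = 0; i < samples; i++) {
    const x = rng();
    const y = rng();
    if (x * x + y * y <= 1) inside += 1;
  }
  return (4 * inside) / samples;
}

// Close to Math.PI, and slightly different on every run.
console.log(estimatePi(1000000));
```

More samples buy more accuracy, so the caller can trade running time against precision.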

Page 56: Statistical Programming with JavaScript

PITFALLS

Page 57: Statistical Programming with JavaScript

CHOOSING THE DISTRIBUTION

• What if a ‘uniform’ distribution isn’t enough?

• What if we want random data that isn’t just numbers?

Page 58: Statistical Programming with JavaScript

EXAMPLE: SOCIAL NETWORK

Page 59: Statistical Programming with JavaScript

EXAMPLE: SOCIAL NETWORK

11 Traversals

Page 60: Statistical Programming with JavaScript

DEMO

Page 61: Statistical Programming with JavaScript

Barabasi-Albert Random Model

Page 62: Statistical Programming with JavaScript

BARABASI-ALBERT RANDOM MODEL

• Start with two linked objects

• Add one new object at a time

• Link that object to one existing object, with already ‘popular’ objects more likely to be chosen.
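
The three steps above can be sketched as follows, assuming 'popular' means that an object is chosen with probability proportional to the number of links it already has (the usual reading of preferential attachment). Function names and the optional `rng` parameter are illustrative.

```javascript
// Grow a network by preferential attachment; returns a list of links,
// each a pair of node numbers.
function barabasiAlbert(nodeCount, rng = Math.random) {
  if (nodeCount < 2) throw new Error('need at least two nodes');
  const links = [[0, 1]]; // start with two linked objects
  // Every link adds both of its endpoints here, so a node appears once per
  // link it has; a uniform pick from this list favours popular nodes.
  const endpoints = [0, 1];
  for (let node = 2; node < nodeCount; node++) {
    const target = endpoints[Math.floor(rng() * endpoints.length)];
    links.push([node, target]);
    endpoints.push(node, target);
  }
  return links;
}

// Number of links attached to each node.
function degrees(links, nodeCount) {
  const degree = new Array(nodeCount).fill(0);
  for (const [a, b] of links) {
    degree[a] += 1;
    degree[b] += 1;
  }
  return degree;
}

const network = barabasiAlbert(1000);
const degreeList = degrees(network, 1000);
console.log(Math.max(...degreeList)); // a few hubs collect many links
```

Most nodes end up with a single link while a handful become hubs, which is the 'rich get richer' pattern the model is known for.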

Page 63: Statistical Programming with JavaScript

THIS MODELS…

• Academic Citations

• Actor filmographies

• Spread of infectious diseases

• Social Networks

Page 64: Statistical Programming with JavaScript

CONTENTS

WHAT IS DATA?

GAINING INSIGHTS

RANDOMNESS

SIMULATION

LEARNING THROUGH: THEORY, CASE STUDIES, JAVASCRIPT APPLICATION

Reward: What shape is the internet?

Page 65: Statistical Programming with JavaScript

We’re OUT of TIME

Page 66: Statistical Programming with JavaScript

• Data is any information we collect. Not all data is valuable.

• Seeing trends in lots of numbers is hard. Summary statistics and charts help us unpick their meaning.

• Data can be treated as random ‘realisations’ from a backing distribution.

• Making random variables is easy, and can be done in different shapes for different purposes.

WHAT IS DATA?

GAINING INSIGHTS RANDOMNESS SIMULATION

Page 67: Statistical Programming with JavaScript

LIBRARIES WE USED

GENERAL LIBRARIES: Knockout.js, RequireJS, Bootstrap

DATA MANIPULATION: Lodash, jStat

DATA IMPORT: Papa Parse

CHARTING: D3, Chart.js

Page 68: Statistical Programming with JavaScript

THANK YOU

David Simons @SwamWithTurtles

