Chapter II: Basics from Linear Algebra, Probability Theory, and Statistics

Information Retrieval & Data Mining
Universität des Saarlandes, Saarbrücken
Wintersemester 2013/14
Chapter II

II.1 Linear Algebra: Vectors, Matrices, Eigenvalues, Eigenvectors, Singular Value Decomposition

II.2 Probability Theory: Events, Probabilities, Random Variables, Distributions, Bounds, Limit Theorems

II.3 Statistical Inference: Parameter Estimation, Confidence Intervals, Hypothesis Testing


II.3 Statistical Inference

1. Parameter Estimation

2. Confidence Intervals

3. Hypothesis Testing

Based on LW Chapters 6, 7, 9, 10


Statistical Model

• A statistical model M is a set of distributions (or regression functions), e.g., all unimodal smooth distributions

• M is called a parametric model if it can be completely described by a finite number of parameters, e.g., the family of Normal distributions with parameters µ and σ:

$M = \left\{ f_X(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \;\middle|\; \mu \in \mathbb{R},\ \sigma > 0 \right\}$


Statistical Inference

• Given a parametric model M and a sample X1, …, Xm, how do we infer (learn) the parameters of M?

• For multivariate models with observed variable X and response variable Y, this is called prediction or regression; for a discrete outcome variable it is also called classification



Idea of Sampling

• Example: Suppose we want to estimate the average salary of employees in German companies

• Sample 1: Suppose we look at n = 200 top-paid CEOs of major banks

• Sample 2: Suppose we look at n = 1,000 employees across all sectors

• Sample 1 is heavily biased toward high salaries, whereas Sample 2 is far more representative of the population


[Diagram: the distribution X (the population of interest) generates samples X1, …, Xm (e.g., people); statistical inference asks what we can say about X based on X1, …, Xm]


Basic Types of Statistical Inference

• Given independent and identically distributed (iid.) samples X1, …, Xn ~ X of an unknown distribution X

• e.g.: n single-coin-toss experiments X1, …, Xn ~ Bernoulli(p)

• Parameter estimation

• e.g.: what is the parameter p of Bernoulli(p)? what is E[X], the cdf FX of X, the pdf fX of X, etc.?

• Confidence intervals

• e.g.: determine an interval C = [a, b] such that P[p ∈ C] ≥ 0.95, with interval boundaries a and b derived from samples X1, …, Xn

• Hypothesis testing

• e.g.: H0 : p = 1/2 (i.e., coin is fair) vs. H1 : p ≠ 1/2



1. Parameter Estimation

• A point estimator for a parameter θ of a probability distribution X is a random variable $\hat{\theta}_n$ derived from an iid. sample X1, …, Xn

• Examples:

• Sample mean: $\bar{X} := \frac{1}{n} \sum_{i=1}^{n} X_i$

• Sample variance: $S_X^2 := \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

• An estimator $\hat{\theta}_n$ for parameter θ is unbiased if $E[\hat{\theta}_n] = \theta$; otherwise the estimator has bias $E[\hat{\theta}_n] - \theta$

• An estimator on sample size n is consistent if $\lim_{n \to \infty} P[|\hat{\theta}_n - \theta| < \epsilon] = 1$ for any $\epsilon > 0$
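As a minimal Python sketch (not from the slides; the Normal(5, 2²) demo population is an arbitrary choice), the two estimators translate directly:

```python
import random

def sample_mean(xs):
    # X_bar := (1/n) * sum_i X_i
    return sum(xs) / len(xs)

def sample_variance(xs):
    # S_X^2 := 1/(n-1) * sum_i (X_i - X_bar)^2; dividing by n-1 makes it unbiased
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Demo with an arbitrary Normal(5, 2^2) population
random.seed(42)
xs = [random.gauss(5.0, 2.0) for _ in range(10_000)]
print(sample_mean(xs), sample_variance(xs))  # close to 5 and 4 for large n
```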


Estimation Error

• Let $\hat{\theta}_n$ be an estimator for parameter θ over iid. samples X1, …, Xn

• The distribution of $\hat{\theta}_n$ is called the sampling distribution

• The standard error for $\hat{\theta}_n$ is: $se(\hat{\theta}_n) = \sqrt{Var(\hat{\theta}_n)}$

• The mean squared error (MSE) for $\hat{\theta}_n$ is: $MSE(\hat{\theta}_n) = E[(\hat{\theta}_n - \theta)^2] = bias^2(\hat{\theta}_n) + Var(\hat{\theta}_n)$

• The estimator $\hat{\theta}_n$ is asymptotically Normal if $(\hat{\theta}_n - \theta)/se$ converges in distribution to N(0,1)


Types of Estimation

• Non-Parametric Estimation

• no assumptions about the model M or the parameters θ of the underlying distribution X

• e.g.: “plug-in estimators” (e.g., histograms) to approximate X

• Parametric Estimation

• requires assumptions about the model M and the parameters θ of the underlying distribution X

• analytical or numerical methods for estimating θ

• Method of Moments

• Maximum Likelihood

• Expectation Maximization (EM)



Empirical Distribution Function

• The empirical distribution function $\hat{F}_n$ is the cdf that puts probability mass 1/n at each data point Xi:

$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)$

with indicator function

$I(X_i \le x) = \begin{cases} 1 & : X_i \le x \\ 0 & : X_i > x \end{cases}$

• A statistical functional ("statistic") T(F) is any function of F, e.g., mean, variance, skewness, median, quantiles, correlation

• The plug-in estimator of θ = T(F) is $\hat{\theta}_n = T(\hat{F}_n)$
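A small Python sketch of the empirical distribution function and a plug-in estimator; the toy data set is made up for illustration:

```python
def ecdf(xs):
    """Empirical distribution function: F_n(x) = (1/n) * #{i : X_i <= x}."""
    n = len(xs)
    return lambda x: sum(1 for xi in xs if xi <= x) / n

def plugin_mean(xs):
    # Plug-in estimator of the mean: each data point carries mass 1/n
    return sum(xs) / len(xs)

xs = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
F_n = ecdf(xs)
print(F_n(2))           # 0.5 -- half of the sample is <= 2
print(plugin_mean(xs))  # 2.3
```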


Histograms as Density Estimators

• Instead of the full empirical distribution, often compact synopses can be used, such as histograms where X1, …, Xn are grouped into m cells (buckets) c1, …, cm with bucket boundaries lb(ci) and ub(ci)

• Example: X1 = X2 = 1; X3 = X4 = X5 = 2; X6 = … = X10 = 3; X11 = … = X14 = 4; X15 = … = X17 = 5; X18 = X19 = 6; X20 = 7

$lb(c_1) = -\infty$, $ub(c_m) = \infty$, $ub(c_{i-1}) = lb(c_i)$ for $1 < i \le m$, and

$freq_f(c_i) = \hat{f}_n(x) = \frac{1}{n} \sum_{j=1}^{n} I(lb(c_i) < X_j \le ub(c_i))$

$freq_F(c_i) = \hat{F}_n(x) = \frac{1}{n} \sum_{j=1}^{n} I(X_j \le ub(c_i))$

[Histogram of the example data over x = 1, …, 7 with relative frequencies 2/20, 3/20, 5/20, 4/20, 3/20, 2/20, 1/20]

$\hat{\mu}_n = 1 \times \tfrac{2}{20} + 2 \times \tfrac{3}{20} + \ldots + 7 \times \tfrac{1}{20} = 3.65$
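The bucket frequencies and the histogram-based mean estimate can be reproduced with a short sketch over the 20 example data points (Python, standard library only):

```python
from collections import Counter

# The 20 data points from the example above
data = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7]
n = len(data)

# Unit-width buckets c_1..c_7; freq_f(c_i) is the relative frequency per bucket
freq = {v: c / n for v, c in sorted(Counter(data).items())}
print(freq)  # {1: 0.1, 2: 0.15, 3: 0.25, 4: 0.2, 5: 0.15, 6: 0.1, 7: 0.05}

# Mean estimated from the histogram synopsis alone
mu_hat = sum(v * f for v, f in freq.items())
print(mu_hat)  # 3.65, matching the slide
```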


Method of Moments

• Suppose parameter θ = (θ1, …, θk) has k components

• Compute the j-th moment for 1 ≤ j ≤ k: $\alpha_j = \alpha_j(\theta) = E_\theta[X^j] = \int_{-\infty}^{+\infty} x^j f_X(x)\, dx$

• Compute the j-th sample moment for 1 ≤ j ≤ k: $\hat{\alpha}_j = \frac{1}{n} \sum_{i=1}^{n} X_i^j$

• The method-of-moments estimate of θ is obtained by solving the system of k equations in k unknowns: $\alpha_1(\hat{\theta}_n) = \hat{\alpha}_1, \; \ldots, \; \alpha_k(\hat{\theta}_n) = \hat{\alpha}_k$


Method of Moments (Example)

• Let X1, …, Xn ~ Normal(µ, σ²) with

$\alpha_1 = E_\theta[X] = \mu$

$\alpha_2 = E_\theta[X^2] = Var(X) + (E[X])^2 = \sigma^2 + \mu^2$

• Equating moments with sample moments yields the system

$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i \qquad \hat{\sigma}^2 + \hat{\mu}^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2$

• By solving this system of 2 equations in 2 unknowns we obtain as solutions

$\hat{\mu} = \bar{X}_n \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$
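A minimal sketch of these method-of-moments estimators in Python; the Normal(10, 3²) demo population is an arbitrary assumption, not from the slides:

```python
import random

def mom_normal(xs):
    """Method-of-moments estimates for Normal(mu, sigma^2):
    mu_hat = alpha_1_hat, sigma2_hat = alpha_2_hat - mu_hat^2."""
    n = len(xs)
    a1 = sum(xs) / n                  # first sample moment
    a2 = sum(x * x for x in xs) / n   # second sample moment
    return a1, a2 - a1 * a1

random.seed(0)
xs = [random.gauss(10.0, 3.0) for _ in range(100_000)]
mu_hat, sigma2_hat = mom_normal(xs)
print(mu_hat, sigma2_hat)  # close to 10 and 9
```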


Maximum Likelihood

• Let X1, …, Xn be iid. with pdf f(x;θ)

• Estimate parameter θ of a postulated distribution f(x;θ) such that the likelihood that the sample values x1, …, xn are generated by this distribution is maximized

• Maximize L(x1, …, xn, θ) ≈ P[x1, …, xn originate from f(x;θ)]

• Usually formulated as: $\hat{\theta} = \arg\max_\theta L_n[\theta] = \arg\max_\theta \prod_{i=1}^{n} f(X_i; \theta)$

• The value $\hat{\theta}$ that maximizes Ln[θ] is called the maximum-likelihood estimate (MLE) of θ

• If analytically intractable, the MLE can be determined using numerical iteration methods (see the sketch after the Normal-distribution derivation below)


Maximum Likelihood (Example)

• Let X1, …, Xn ~ Bernoulli(p) (corresponding to n coin tosses)

• Assume that we observed heads h times and tails (n − h) times

• Maximum-likelihood estimation of parameter p:

$L[h, n, p] = \prod_{i=1}^{n} f(X_i; p) = \prod_{i=1}^{n} p^{X_i} (1-p)^{1-X_i} = p^h \, (1-p)^{(n-h)}$

• Maximize the log-likelihood function:

$\log L[h, n, p] = h \log(p) + (n-h) \log(1-p)$

$\frac{\partial \log L}{\partial p} = \frac{h}{p} - \frac{n-h}{1-p} = 0 \;\Rightarrow\; \hat{p} = \frac{h}{n}$
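A minimal sketch of the resulting estimator; the true parameter p = 0.3 used in the demo is an arbitrary assumption:

```python
import random

def bernoulli_mle(tosses):
    """MLE p_hat = h / n, where h is the number of heads (1s)."""
    return sum(tosses) / len(tosses)

random.seed(7)
p_true = 0.3
tosses = [1 if random.random() < p_true else 0 for _ in range(10_000)]
print(bernoulli_mle(tosses))  # close to 0.3
```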


Maximum Likelihood for Normal Distributions

!17

L(x1, . . . , xn, µ,�2) =

✓1p2⇡�

◆n nY

i=1

e

� (xi

�µ)2

2 �

2

@L

=�1

2�2

nX

i=1

2 (xi � �) = 0

@L

@�

2= � n

2�2+

1

2�4

nX

i=1

(xi � µ)2 = 0

) µ =1

n

nX

i=1

xi �

2 =1

n

nX

i=1

(xi � µ)2
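When no closed form exists, the MLE can be determined numerically, as noted earlier. A sketch that minimizes the negative log-likelihood of the Normal model, assuming SciPy is available; for this model the numerical optimum should agree with the closed-form solution just derived:

```python
import math
import random
from scipy.optimize import minimize

def neg_log_likelihood(params, xs):
    """-log L(x_1..x_n, mu, sigma) for the Normal model."""
    mu, sigma = params
    if sigma <= 0:
        return math.inf
    return (len(xs) * math.log(math.sqrt(2 * math.pi) * sigma)
            + sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))

random.seed(1)
xs = [random.gauss(2.0, 1.5) for _ in range(5_000)]

res = minimize(neg_log_likelihood, x0=[0.0, 1.0], args=(xs,), method="Nelder-Mead")
mu_hat, sigma_hat = res.x

# Closed-form MLE for comparison
mu_cf = sum(xs) / len(xs)
sigma_cf = math.sqrt(sum((x - mu_cf) ** 2 for x in xs) / len(xs))
print(mu_hat, sigma_hat, "vs", mu_cf, sigma_cf)
```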


2. Confidence Intervals

• Determine an interval estimator T for parameter θ such that [T−a, T+a] is the confidence interval and 1−α the confidence level:

$P[T - a \le \theta \le T + a] = 1 - \alpha$

• For the distribution of a random variable X, a value xγ (0 < γ < 1) with P[X ≤ xγ] ≥ γ and P[X ≥ xγ] ≥ 1−γ is called the γ-quantile

• the 0.5-quantile is known as the median

• for the standard Normal distribution N(0,1), the γ-quantile is denoted Φγ

• For a given a or α, find a value z of N(0,1) that determines the [T−a, T+a] confidence interval or a corresponding γ-quantile for 1−α


Confidence Intervals for Expectations (I)

• Let X1, …, Xn be a sample from a distribution X with unknown expectation µ and known variance σ²

• For sufficiently large n, the sample mean $\bar{X}$ is N(µ, σ²/n) distributed and

$P\left[-z \le \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} \le z\right] = \Phi(z) - \Phi(-z) = \Phi(z) - (1 - \Phi(z)) = 2\Phi(z) - 1 = P\left[\bar{X} - \frac{z\,\sigma}{\sqrt{n}} \le \mu \le \bar{X} + \frac{z\,\sigma}{\sqrt{n}}\right]$

$\Rightarrow \; P\left[\bar{X} - \frac{\Phi_{1-\alpha/2}\,\sigma}{\sqrt{n}} \le \mu \le \bar{X} + \frac{\Phi_{1-\alpha/2}\,\sigma}{\sqrt{n}}\right] = 1 - \alpha$


Confidence Intervals for Expectations (I) (cont’d)

$P\left[\bar{X} - \frac{\Phi_{1-\alpha/2}\,\sigma}{\sqrt{n}} \le \mu \le \bar{X} + \frac{\Phi_{1-\alpha/2}\,\sigma}{\sqrt{n}}\right] = 1 - \alpha$

• For a given confidence interval $[\bar{X} - a, \bar{X} + a]$, compute $z = \frac{a\sqrt{n}}{\sigma}$ and look up Φ(z) to determine 1−α

• For a given confidence level 1−α, set $z = \Phi_{1-\alpha/2}$ (i.e., z as the (1−α/2)-quantile of N(0,1)); then $a = \frac{z\,\sigma}{\sqrt{n}}$ determines the confidence interval


Confidence Intervals for Expectations (I) (Example)

• Based on a random sample of n = 100 queries, we observe an average response time of $\bar{X} = 64$. We further know that the standard deviation is σ = 4

• Q: What is the confidence of the interval 64 ± 0.5? A: 78.87%

With a = 0.5: $z = \frac{0.5 \sqrt{100}}{4} = 1.25$, $\Phi(1.25) = 0.89435$, so $1 - \alpha/2 = 0.89435$ and $1 - \alpha = 0.7887$

• Q: What is the 99% confidence interval? A: 64 ± 1.032

With 1 − α = 0.99 and α = 0.01: $a = \frac{\Phi_{0.995} \times 4}{\sqrt{100}} \approx \frac{2.58 \times 4}{10} = 1.032$


Confidence Intervals for Expectations (II)

• Let X1, …, Xn be an iid. sample from a distribution X with unknown expectation µ, unknown variance σ2, but known sample variance S2

• For sufficiently large n, the random variable $T = \frac{(\bar{X} - \mu)\sqrt{n}}{S}$ has a Student's t distribution with (n−1) degrees of freedom:

$f_{T,n}(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right) \sqrt{n\pi}} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}$

with the Gamma function $\Gamma(x) = \int_{0}^{\infty} e^{-t}\, t^{x-1}\, dt$ for $x > 0$


Confidence Intervals for Expectations (II) (cont’d)

$P\left[\bar{X} - \frac{t_{n-1,1-\alpha/2}\,S}{\sqrt{n}} \le \mu \le \bar{X} + \frac{t_{n-1,1-\alpha/2}\,S}{\sqrt{n}}\right] = 1 - \alpha$

• For a given confidence interval $[\bar{X} - a, \bar{X} + a]$, compute $t = \frac{a\sqrt{n}}{S}$ and look up $f_{T(n-1)}(t)$ to determine 1−α

• For a given confidence level 1−α, set $t = t_{n-1,1-\alpha/2}$ (i.e., t as the (1−α/2)-quantile of T(n−1)); then $a = \frac{t\,S}{\sqrt{n}}$ determines the confidence interval
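A sketch of the t-based interval, assuming SciPy is available for the t quantile; the sample numbers reuse the earlier query-response-time example and are otherwise arbitrary:

```python
from math import sqrt
from scipy import stats

def t_confidence_interval(x_bar, s, n, alpha):
    """[X_bar - a, X_bar + a] with a = t_{n-1,1-alpha/2} * S / sqrt(n)."""
    t_q = stats.t.ppf(1 - alpha / 2, df=n - 1)  # (1-alpha/2)-quantile of T(n-1)
    a = t_q * s / sqrt(n)
    return x_bar - a, x_bar + a

# Sample mean 64, sample std dev 4, n = 100, 99% confidence level
print(t_confidence_interval(64.0, 4.0, 100, 0.01))  # slightly wider than the z interval
```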


3. Hypothesis Testing

• Suppose we throw a coin n times and want to know whether the coin is fair, i.e., P(H) = P(T)

• Let X1, …, Xn ~ Bernoulli(p) be the iid. coin flips, so that the coin is fair if p = 0.5

• Let the null hypothesis H0 be “the coin is fair”

• The alternative hypothesis H1 is then “the coin is not fair”

• Intuitively, if $|\bar{X} - 0.5|$ is large, we should reject H0


Hypothesis Testing Terminology

• θ = θ0 is called a simple hypothesis

• θ > θ0 or θ < θ0 is called a compound hypothesis

• H0 : θ = θ0 vs. H1 : θ ≠ θ0 is called a two-sided test

• H0 : θ ≤ θ0 vs. H1 : θ > θ0 and H0 : θ ≥ θ0 vs. H1 : θ < θ0 are called one-sided tests

• Rejection region R: if X ∈ R, reject H0; otherwise retain H0

• The rejection region is typically defined using a test statistic T and a critical value c: $R = \{X : T(X) > c\}$


p-Values

• The p-value is the probability that, if H0 holds, we observe a value of the test statistic at least as extreme as the one actually observed

• It is not the probability that H0 holds

• The smaller the p-value, the stronger is the evidence against H0, i.e., if we observe a small enough p-value, we can reject H0

• How small the p-value needs to be depends on the application

• Typical p-value scale:

• < 0.01 very strong evidence against H0

• 0.01 – 0.05 strong evidence against H0

• 0.05 – 0.10 weak evidence against H0

• > 0.1 little or no evidence against H0



Types of Errors & Statistical Significance

              Retain H0        Reject H0
H0 true       OK               Type I Error
H1 true       Type II Error    OK

• Hypothesis tests are often performed at a level of significance α

• means that H0 is rejected if the p-value is less than α

• reported as "result is statistically significant at the α level"

• specifying p-values is more informative

• Don’t confuse statistical significance with practical significance

• e.g.: “blue hyperlinks increase click rate by 0.0001% over black ones” “fuel consumption is reduced by 0.0001 l/km by new part” …

"27

Retain H0 Reject H0

H0 true OK Type I Error

H1 true Type II Error OK


The Wald Test

• Two-sided test for H0 : θ = θ0 vs. H1 : θ ≠ θ0

• Test statistic $W = \frac{|\hat{\theta} - \theta_0|}{\hat{se}}$ with sample estimate $\hat{\theta}$ and $\hat{se} = se(\hat{\theta}) = \sqrt{Var(\hat{\theta})}$

• Under H0, W converges in distribution to N(0, 1)

• If w is the observed value of the Wald statistic, the p-value is 2Φ(−|w|)


The Wald Test (Example)

• We can use the Wald test to test if our coin is fair

• Suppose the observed sample mean is 0.6 and the observed standard error is 0.049

• We obtain as a test statistic value w = (0.6 - 0.5) / 0.049 ≈ 2.04

• The p-value is therefore 2Φ(-|2.04|) ≈ 0.042 (i.e., a fair coin would lead to such an extreme value w only with probability 0.042), which gives us strong evidence to reject the null hypothesis H0

"29

IRDM WS 2007 2-51

Normal Distribution Table

2 * (1 - 0.97882) ≈ 0.042
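The example computation as a small Python sketch (standard library only):

```python
from statistics import NormalDist

def wald_test(theta_hat, theta_0, se):
    """Returns (w, p-value) for H0: theta = theta_0, with p-value = 2 * Phi(-|w|)."""
    w = abs(theta_hat - theta_0) / se
    return w, 2 * NormalDist().cdf(-w)

# Coin example from the slide: sample mean 0.6, standard error 0.049, H0: p = 0.5
w, p = wald_test(0.6, 0.5, 0.049)
print(w, p)  # w ~ 2.04, p ~ 0.04 -> strong evidence against H0
```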


Pearson’s 𝜒2 Test for Multinomial Data

• Let (X1, …, Xk) ~ Multinomial(n, p); the MLE of p is (X1/n, X2/n, …, Xk/n)

• Let p0 = (p01, p02, …, p0k) and test H0 : p = p0 vs. H1 : p ≠ p0

• Pearson's 𝜒2 statistic is $T = \sum_{j=1}^{k} \frac{(X_j - n\,p_{0j})^2}{n\,p_{0j}} = \sum_{j=1}^{k} \frac{(X_j - E_j)^2}{E_j}$ with expected value $E_j = E[X_j] = n\,p_{0j}$ of Xj under H0

• The p-value is $P(\chi^2_{k-1} > t)$, where t is the observed value of the test statistic and there are (k−1) degrees of freedom


Pearson’s 𝜒2 Test for Multinomial Data (Example)

• We can use Pearson’s 𝜒2 test to test whether a dice is fair

• Suppose after 1,000 throws of the dice, we observed ① x 173, ② x 167, ③ x 167, ④ x 176, ⑤ x 167, ⑥ x 150=> p = (0.173, 0.167, 0.167, 0.176, 0.167, 0.150) (based on MLE)

• p0 = (0.167, 0.167, 0.167, 0.167, 0.167, 0.167)

• T = 2.43 => p-value is 0.80 giving us no evidence to reject H0
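A sketch reproducing the die example, assuming SciPy is available for the χ² tail probability:

```python
from scipy import stats

observed = [173, 167, 167, 176, 167, 150]   # counts from the 1,000 throws
n = sum(observed)
expected = [n / 6] * 6                      # E_j = n * p_0j under H0 (fair die)

T = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = stats.chi2.sf(T, df=len(observed) - 1)  # P(chi2_{k-1} > t)
print(T, p_value)                                 # ~2.43 and ~0.79

# Equivalently, via SciPy's built-in test:
print(stats.chisquare(observed, expected))
```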

"31

IRDM WS 2007 2-63

Chi-Square Distribution Table


Pearson’s 𝜒2 Test of Independence

• Pearson’s 𝜒2 test can also be used to test if two random variables X and Y are independent

• Let X1, …, Xn and Y1, …, Yn be the two samples

• Divide outcomes into r (for X) and c (for Y) disjoint intervals

• Populate an r-by-c table O with frequencies, so that Olk tells how many (Xi, Yi) pairs have values in the l-th and k-th interval, respectively

• Assuming independence (H0), the expected value of Olk is

$E_{lk} = \frac{\left(\sum_{i=1}^{c} O_{li}\right) \left(\sum_{j=1}^{r} O_{jk}\right)}{\sum_{j=1}^{r} \sum_{i=1}^{c} O_{ji}}$


Pearson’s 𝜒2 Test of Independence (cont’d)

• The value of the test statistic is

$\chi^2 = \sum_{i=1}^{c} \sum_{j=1}^{r} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

• There are (r−1)(c−1) degrees of freedom
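A sketch of the independence test, assuming SciPy is available; the 2×3 contingency table below is hypothetical:

```python
from scipy import stats

# Hypothetical 2x3 table O: rows = intervals of X, columns = intervals of Y
O = [[30, 20, 10],
     [20, 30, 40]]

chi2, p_value, dof, E = stats.chi2_contingency(O)
print(chi2, p_value, dof)  # dof = (r-1)(c-1) = 2
print(E)                   # E_lk = row sum * column sum / grand total
```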


Summary of II.3

• Statistical inference based on a sample from a population

• Empirical distribution function and histograms as non-parametric estimation methods

• Method of moments and maximum likelihood as parametric estimation methods

• Confidence intervals

• Wald test and Pearson’s 𝜒2 test for hypothesis testing

"34


Normal Distribution Table

"35


𝜒2 Distribution Table

"36

IRDM WS 2007 2-63

Chi-Square Distribution Table


Student’s t Distribution Table

"37

