Maximum Likelihood Estimates and the EM Algorithms I Henry Horng-Shing Lu Institute of Statistics...

Post on 04-Jan-2016


Maximum Likelihood Estimates and the EM Algorithms I

Henry Horng-Shing Lu
Institute of Statistics
National Chiao Tung University
hslu@stat.nctu.edu.tw

1

Part 1: Computation Tools

2

Computation Tools
R (http://www.r-project.org/): good for statistical computing
C/C++: good for fast computation and large data sets
More: http://www.stat.nctu.edu.tw/subhtml/source/teachers/hslu/course/statcomp/links.htm

3

The R Project
R is a free software environment for statistical computing and graphics.
It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
It is similar to the commercial software S-PLUS.
C/C++, Fortran and other code can be linked and called at run time.
More: http://www.r-project.org/

4

Download R from http://www.r-project.org/

5

Choose one Mirror Site of R

6

Choose the OS System

7

Select the Base of R

8

Download the Setup Program

9

Install R

Double click R-icon to install R

10

Execute R

11

Interactive command window

Download Add-on Packages

12

Choose a Mirror Site

Choose a mirror site close to you

13

Select One Package to Download

Choose one package to download, like “rgl” or “adimpro”.

14

Load Packages
There are two methods to load packages:

15

Method 1:

Click from the menu bar

Method 2:

Type “library(rgl)” in the command window

Help in R (1)
What is the loaded library?

help(rgl)

16

Help in R (2)
How to search functions by key words?

help.search("key words")
It will show all functions whose documentation matches the key words.

help.search("3D plot")

17

Help in R (3)
How to find the documentation of a function?

?function_name
It will show the usage, arguments, author, references, related functions, and examples.

?plot3d

18

R Operators (1)
Mathematical operators:
+, -, *, /, ^
Mod: %%
sqrt, exp, log, log10, sin, cos, tan, …

19

R Operators (2)
Other operators:
:                sequence operator
%*%              matrix algebra
<, >, <=, >=     inequality
==, !=           comparison
&, &&, |, ||     and, or
~                formulas
<-, =            assignment

20

Algebra, Operators and Functions
> 1+2
[1] 3
> 1>2
[1] FALSE
> 1>2 | 2>1
[1] TRUE
> A = 1:3
> A
[1] 1 2 3
> A*6
[1]  6 12 18
> A/10
[1] 0.1 0.2 0.3
> A%%2
[1] 1 0 1
> B = 4:6
> A*B
[1]  4 10 18
> t(A)%*%B
     [,1]
[1,]   32
> A%*%t(B)
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18
> sqrt(A)
[1] 1.0000 1.4142 1.7321
> log(A)
[1] 0.0000 0.6931 1.0986
> round(sqrt(A), 2)
[1] 1.00 1.41 1.73
> ceiling(sqrt(A))
[1] 1 2 2
> floor(sqrt(A))
[1] 1 1 1
> eigen(A%*%t(B))
$values
[1]  3.20e+01  8.44e-16 -4.09e-16

$vectors
        [,1]    [,2]    [,3]
[1,] -0.2673  0.3112 -0.2353
[2,] -0.5345 -0.8218 -0.6637
[3,] -0.8018  0.4773  0.7100

21

Variable Types
Item            Descriptions
Vector          X=c(10.4,5.6,3.1,6.4) or Z=array(data_vector, dim_vector)
Matrices        X=matrix(1:8,2,4) or Z=matrix(rnorm(30),5,6)
Factors         Statef=factor(state)
Lists           pts = list(x=cars[,1], y=cars[,2])
Data Frames     data.frame(cbind(x=1, y=1:10), fac=sample(LETTERS[1:3], 10, repl=TRUE))
Functions       name=function(arg_1,arg_2,…) expression
Missing Values  NA or NaN

22

Define Your Own Function (1)
Use "fix(myfunction)"   # a window will show up
function(parameter){
  statements;
  return (object);   # if you want to return some values
}
Save the document
Use "myfunction(parameter)" in R

23

Define Your Own Function (2)
Example: Find all the factors of an integer
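The code for this example did not survive in the transcript. A minimal sketch of such a function is given here, assuming that "factors" means all positive divisors; the name `factors` and the brute-force approach are illustrative choices, not necessarily what the original slide showed.

```r
# Return all positive factors (divisors) of a positive integer n.
# Hypothetical helper for illustration; brute force over 1..n.
factors <- function(n) {
  stopifnot(length(n) == 1, n >= 1, n == floor(n))
  candidates <- seq_len(n)            # 1, 2, ..., n
  candidates[n %% candidates == 0]    # keep those that divide n exactly
}

factors(12)
factors(13)
```

Trying every candidate up to n is slow for large n but keeps the logic easy to read, which suits a first user-defined function.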

24

Define Your Own Function (3)

25

When you quit R, remember to save the workspace for the next use; otherwise the functions you defined will be lost when R closes.

Read and Write Files
Write Data to a TXT File
Write Data to a CSV File
Read TXT and CSV Files
Demo

26

Write Data to a TXT File
Usage: write(x, file, …)
> X = matrix(1:6, 2, 3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write(t(X), file = "d:/out1.txt", ncolumns = 3)
> write(X, file = "d:/out2.txt", ncolumns = 3)

d:/out1.txt
1 3 5
2 4 6

d:/out2.txt
1 2 3
4 5 6

27

Write Data to a CSV File
Usage: write.table(x, file = "foo.csv", …)
> X = matrix(1:6, 2, 3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write.table(t(X), file = "d:/out1.csv", sep = ",", col.names = FALSE, row.names = FALSE)
> write.table(X, file = "d:/out2.csv", sep = ",", col.names = FALSE, row.names = FALSE)

d:/out1.csv
1,2
3,4
5,6

d:/out2.csv
1,3,5
2,4,6

28

Read TXT and CSV Files
Usage: read.table(file, ...)
> X = read.table(file = "d:/out1.txt")
> X
  V1 V2 V3
1  1  3  5
2  2  4  6
> Y = read.table(file = "d:/out1.csv", sep = ",", header = FALSE)
> Y
  V1 V2
1  1  2
2  3  4
3  5  6
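The write and read steps can be combined into a round trip. The sketch below is an illustration under assumptions: it writes to temporary files created with `tempfile()` instead of the `d:/` paths used in the slides, so it should run on any platform, and the variable names are hypothetical.

```r
# Round trip: write a matrix to TXT and CSV files, read both back, compare.
X <- matrix(1:6, 2, 3)

txt_file <- tempfile(fileext = ".txt")   # temporary paths (assumption)
csv_file <- tempfile(fileext = ".csv")

# write() stores values in column-major order, so transpose to keep rows as rows
write(t(X), file = txt_file, ncolumns = 3)
write.table(X, file = csv_file, sep = ",", col.names = FALSE, row.names = FALSE)

# read.table() returns a data frame; convert back to a plain matrix
X_txt <- as.matrix(read.table(txt_file))
X_csv <- as.matrix(read.table(csv_file, sep = ",", header = FALSE))
dimnames(X_txt) <- NULL
dimnames(X_csv) <- NULL

all(X_txt == X)   # TRUE if the TXT round trip preserved the matrix
all(X_csv == X)   # TRUE if the CSV round trip preserved the matrix

unlink(c(txt_file, csv_file))            # remove the temporary files
```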

29

Demo (1)
Practice for reading a file and basic analysis
> Data = read.table(file = "d:/01.csv", header = TRUE, sep = ",")
> Data
             Y        X1       X2
[1,]  2.651680 13.808990 26.75896
[2,]  1.875039 17.734520 37.89857
[3,]  1.523964 19.891030 26.03624
[4,]  2.984314 15.574260 30.21754
[5,] 10.423090  9.293612 28.91459
[6,]  0.840065  8.830160 30.38578
[7,]  8.126936  9.615875 32.69579

30

01.csv

Demo (2)
Practice for reading a file and basic analysis
> mean(Data$Y)
[1] 4.060727
> boxplot(Data$Y)
> boxplot(Data)

31

Part 2: Motivation Examples

32

Example 1 in Genetics (1)
Two linked loci with alleles A and a, and B and b
A, B: dominant
a, b: recessive
A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab

33

[Figure: diagram of the two linked loci (alleles A/a and B/b) in the double heterozygote, with branches labeled 1/2]

Example 1 in Genetics (2)
Probabilities for genotypes in gametes

          No Recombination   Recombination
Male      1-r                r
Female    1-r'               r'

          AB          ab          aB      Ab
Male      (1-r)/2     (1-r)/2     r/2     r/2
Female    (1-r')/2    (1-r')/2    r'/2    r'/2

[Figure: diagram of the two linked loci (alleles A/a and B/b) in the double heterozygote, with branches labeled 1/2]

34

Example 1 in Genetics (3)
Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92.
More:
http://en.wikipedia.org/wiki/Genetics
http://www2.isye.gatech.edu/~brani/isyebayes/bank/handout12.pdf

35

Example 1 in Genetics (4)

FEMALE \ MALE   AB (1-r)/2           ab (1-r)/2           aB r/2            Ab r/2
AB (1-r')/2     AABB (1-r)(1-r')/4   aABb (1-r)(1-r')/4   aABB r(1-r')/4    AABb r(1-r')/4
ab (1-r')/2     AaBb (1-r)(1-r')/4   aabb (1-r)(1-r')/4   aaBb r(1-r')/4    Aabb r(1-r')/4
aB r'/2         AaBB (1-r)r'/4       aabB (1-r)r'/4       aaBB rr'/4        AabB rr'/4
Ab r'/2         AABb (1-r)r'/4       aAbb (1-r)r'/4       aABb rr'/4        AAbb rr'/4

36

Example 1 in Genetics (5)
Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
A*: the dominant phenotype from (Aa, AA, aA).
a*: the recessive phenotype from aa.
B*: the dominant phenotype from (Bb, BB, bB).
b*: the recessive phenotype from bb.
A*B*: 9 gametic combinations.
A*b*: 3 gametic combinations.
a*B*: 3 gametic combinations.
a*b*: 1 gametic combination.
Total: 16 combinations.

37

Example 1 in Genetics (6)
Let $\varphi = (1-r)(1-r')$, then

$$P(A^*B^*) = \frac{2+\varphi}{4}, \qquad P(A^*b^*) = P(a^*B^*) = \frac{1-\varphi}{4}, \qquad P(a^*b^*) = \frac{\varphi}{4}.$$

38
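The phenotype probabilities can be checked numerically by summing the 16 cells of the gamete table. The sketch below assumes illustrative values r = 0.2 and r' = 0.3 (any values in [0, 1/2] would do); the variable names are hypothetical and not from the slides.

```r
# Sum the 16 genotype cells by phenotype and compare with the closed forms.
r  <- 0.2   # male recombination fraction (illustrative value)
rp <- 0.3   # female recombination fraction r' (illustrative value)

gametes <- c("AB", "ab", "aB", "Ab")
male    <- c((1 - r) / 2,  (1 - r) / 2,  r / 2,  r / 2)
female  <- c((1 - rp) / 2, (1 - rp) / 2, rp / 2, rp / 2)

joint <- outer(female, male)   # 4 x 4 table: rows female, columns male

# A phenotype is dominant if at least one of the two gametes carries
# the dominant allele at that locus.
hasA <- substr(gametes, 1, 1) == "A"
hasB <- substr(gametes, 2, 2) == "B"
domA <- outer(hasA, hasA, "|")
domB <- outer(hasB, hasB, "|")

probs <- c(AB = sum(joint[domA & domB]),
           Ab = sum(joint[domA & !domB]),
           aB = sum(joint[!domA & domB]),
           ab = sum(joint[!domA & !domB]))

phi      <- (1 - r) * (1 - rp)
expected <- c((2 + phi) / 4, (1 - phi) / 4, (1 - phi) / 4, phi / 4)

probs
all.equal(unname(probs), expected)   # TRUE when the table sums match the formulas
```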

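Example 1 in Genetics (7)
Hence, the random sample of n from the offspring of selfed heterozygotes will follow a multinomial distribution:

$$\text{Multinomial}\left(n;\ \frac{2+\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{\varphi}{4}\right)$$

We know that $\varphi = (1-r)(1-r')$, $0 \le r \le 1/2$, and $0 \le r' \le 1/2$.
So $1/4 \le \varphi \le 1$.

39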

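Example 1 in Genetics (8)
Suppose that we observe the data of

$$y = (y_1, y_2, y_3, y_4) = (125, 18, 20, 24),$$

which is a random sample from

$$\text{Multinomial}\left(n;\ \frac{2+\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{\varphi}{4}\right).$$

Then the probability mass function is

$$g(y, \varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}$$

40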
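The probability mass function can be evaluated in R for a few trial values of the parameter. The sketch below uses the base function `dmultinom()` and compares it with the formula written out by hand on the log scale; the helper names `cell_probs`, `g` and `g_manual` are assumptions made for illustration.

```r
# Evaluate the multinomial pmf g(y, phi) at the observed counts.
y <- c(125, 18, 20, 24)
n <- sum(y)

# Cell probabilities as a function of phi
cell_probs <- function(phi) c((2 + phi) / 4, (1 - phi) / 4, (1 - phi) / 4, phi / 4)

# pmf via the built-in multinomial density
g <- function(phi) dmultinom(y, size = n, prob = cell_probs(phi))

# The same pmf from the formula, computed with logs to avoid overflow in n!
g_manual <- function(phi) {
  exp(lfactorial(n) - sum(lfactorial(y)) + sum(y * log(cell_probs(phi))))
}

phis <- c(0.3, 0.5, 0.58, 0.7)
cbind(phi = phis, g = sapply(phis, g), g_manual = sapply(phis, g_manual))
```

Among these trial values the pmf is largest near 0.58, which hints at where the maximum likelihood estimate will fall.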

Estimation Methods
Frequentist Approaches: http://en.wikipedia.org/wiki/Frequency_probability
  Method of Moments Estimate (MME): http://en.wikipedia.org/wiki/Method_of_moments_%28statistics%29
  Maximum Likelihood Estimate (MLE): http://en.wikipedia.org/wiki/Maximum_likelihood
Bayesian Approaches: http://en.wikipedia.org/wiki/Bayesian_probability

41

Method of Moments Estimate (MME)
Solve the equations when population moments are equal to sample moments: $\mu'_k = m'_k$ for k = 1, 2, …, t, where t is the number of parameters to be estimated.
MME is simple.
Under regular conditions, the MME is consistent!
More: http://en.wikipedia.org/wiki/Method_of_moments_%28statistics%29

42

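MME for Example 1

$$\begin{aligned}
E(Y_1) &= n\,\frac{2+\varphi}{4} = y_1 &&\Rightarrow\ \hat\varphi_1 = 4\left(\frac{y_1}{n} - \frac{1}{2}\right) \\
E(Y_2) &= n\,\frac{1-\varphi}{4} = y_2 &&\Rightarrow\ \hat\varphi_2 = 1 - \frac{4y_2}{n} \\
E(Y_3) &= n\,\frac{1-\varphi}{4} = y_3 &&\Rightarrow\ \hat\varphi_3 = 1 - \frac{4y_3}{n} \\
E(Y_4) &= n\,\frac{\varphi}{4} = y_4 &&\Rightarrow\ \hat\varphi_4 = \frac{4y_4}{n}
\end{aligned}$$

$$\hat\varphi_{MME} = \frac{\hat\varphi_1 + \hat\varphi_2 + \hat\varphi_3 + \hat\varphi_4}{4}$$

Note: MME can’t assure $\hat\varphi_{MME} \in [1/4, 1]$!

43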

MME by R
> MME <- function(y1, y2, y3, y4){
+   n = y1+y2+y3+y4;
+   phi1 = 4.0*(y1/n-0.5);
+   phi2 = 1-4*y2/n;
+   phi3 = 1-4*y3/n;
+   phi4 = 4.0*y4/n;
+   phi = (phi1+phi2+phi3+phi4)/4.0;
+   print("By MME method");
+   return(phi); # print(phi);
+ }
> MME(125, 18, 20, 24)
[1] "By MME method"
[1] 0.5935829

44

MME by C/C++

45

Maximum Likelihood Estimate (MLE)
Likelihood:
Maximize likelihood: Solve the score equations, which set the first derivatives of the log-likelihood to zero.
Under regular conditions, the MLE is consistent, asymptotically efficient and asymptotically normal!
More: http://en.wikipedia.org/wiki/Maximum_likelihood

46

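Example 2 (1)
We toss an unfair coin 3 times and the random variable is

$$X_i = \begin{cases} 1, & \text{if the } i\text{th trial is head;} \\ 0, & \text{if the } i\text{th trial is tail.} \end{cases}$$

If p is the probability of tossing head, then

$$X_i = \begin{cases} 1 & \text{with probability } p; \\ 0 & \text{with probability } 1-p. \end{cases}$$

47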

Example 2 (2)
The distribution of “# of tossing head”:

# of tossing head   (x1, x2, x3)                probability
0                   (0,0,0)                     (1-p)^3
1                   (1,0,0) (0,1,0) (0,0,1)     3p(1-p)^2
2                   (0,1,1) (1,0,1) (1,1,0)     3p^2(1-p)
3                   (1,1,1)                     p^3

48

Example 2 (3)
Suppose we observe 1 head and 2 tails. The likelihood function becomes

$$L(p \mid x_1, x_2, x_3) = \binom{3}{2}\, p\,(1-p)^2, \quad \text{where } 0 \le p \le 1.$$

One way to maximize this likelihood function is by solving the score equation, which sets the first derivative to zero:

$$\frac{\partial}{\partial p}\binom{3}{2}\, p\,(1-p)^2 = 3(1-p)^2 - 6p(1-p) = 9p^2 - 12p + 3 = 0$$

49

Example 2 (4)
The solutions of the score equation are p = 1/3 and p = 1.

One can check that p=1/3 is the maximum point. (How?)

Hence, the MLE of p is 1/3 for this example.
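One way to answer the "(How?)" is to compare the likelihood at the two roots and look at the sign of the second derivative. The sketch below does this in R; the function names `L` and `d2L` are illustrative, and `d2L` is simply the derivative of the score polynomial 9p^2 - 12p + 3.

```r
# Likelihood for 1 head and 2 tails in 3 tosses
L <- function(p) choose(3, 2) * p * (1 - p)^2

# Roots of the score equation 3 - 12 p + 9 p^2 = 0
# (polyroot takes coefficients in increasing order of power)
roots <- sort(Re(polyroot(c(3, -12, 9))))
roots                # 1/3 and 1

# Compare the likelihood at the two roots
L(roots)             # L(1/3) = 4/9, L(1) = 0

# Second derivative of L: negative at a maximum, positive at a minimum
d2L <- function(p) 18 * p - 12
d2L(roots)           # -6 at p = 1/3, 6 at p = 1

# Numerical maximization over [0, 1] as a cross-check
p_hat <- optimize(L, interval = c(0, 1), maximum = TRUE)$maximum
p_hat
```

Both checks point to p = 1/3: the likelihood is larger there and the second derivative is negative, while p = 1 is a local minimum with likelihood zero.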

50

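MLE for Example 1 (1)
Likelihood

$$L(\varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}$$

$$\log L(\varphi) = \log\left(\frac{n!}{y_1!\,y_2!\,y_3!\,y_4!}\right) + y_1 \log\left(\frac{2+\varphi}{4}\right) + (y_2+y_3) \log\left(\frac{1-\varphi}{4}\right) + y_4 \log\left(\frac{\varphi}{4}\right)$$

MLE:

$$L(\hat\varphi_{MLE}) = \max_{\varphi} L(\varphi) \iff \log L(\hat\varphi_{MLE}) = \max_{\varphi} \log L(\varphi)$$

51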

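MLE for Example 1 (2)

$$\frac{d}{d\varphi}\, l(\varphi) = \frac{d}{d\varphi} \log L(\varphi) = \frac{y_1}{2+\varphi} - \frac{y_2+y_3}{1-\varphi} + \frac{y_4}{\varphi} = 0$$

$$\Rightarrow\ \underbrace{(y_1+y_2+y_3+y_4)}_{A}\,\varphi^2 + \underbrace{\left[-(y_1 - 2y_2 - 2y_3 - y_4)\right]}_{B}\,\varphi + \underbrace{(-2y_4)}_{C} = 0$$

$$\Rightarrow\ \hat\varphi_{MLE} = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$$

52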
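For the observed counts (125, 18, 20, 24), the quadratic score equation can be solved directly. The sketch below assumes the sign convention A φ² + B φ + C = 0 with A = y1+y2+y3+y4, B = -(y1 - 2 y2 - 2 y3 - y4) and C = -2 y4, which follows from clearing denominators in the score equation; the variable names are illustrative.

```r
# Closed-form MLE from the quadratic score equation A*phi^2 + B*phi + C = 0
y1 <- 125; y2 <- 18; y3 <- 20; y4 <- 24

A <- y1 + y2 + y3 + y4                 # 187
B <- -(y1 - 2 * y2 - 2 * y3 - y4)      # -25
C <- -2 * y4                           # -48

disc  <- B^2 - 4 * A * C               # discriminant
roots <- (-B + c(-1, 1) * sqrt(disc)) / (2 * A)
roots                                  # one negative root, one positive root

# Keep the root inside the admissible range [1/4, 1]
phi_mle <- roots[roots >= 1/4 & roots <= 1]
phi_mle                                # about 0.578

# The score should vanish at the MLE
score <- function(phi) y1 / (2 + phi) - (y2 + y3) / (1 - phi) + y4 / phi
score(phi_mle)
```

Only the positive root lies in [1/4, 1], so it is the candidate for the MLE; the negative root is not a valid parameter value.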

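MLE for Example 1 (3)
Checking:
1. $\dfrac{d^2}{d\varphi^2} \log L(\varphi)\Big|_{\varphi = \hat\varphi_{MLE}} < 0$?
2. $1/4 \le \hat\varphi_{MLE} \le 1$?
3. Compare $\log L(\hat\varphi_{MLE})$?

53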
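The three checks can be carried out numerically. The sketch below assumes the observed counts (125, 18, 20, 24), drops the constant multinomial coefficient from the log-likelihood, and for the third check compares against the MME value 0.5935829 reported earlier; the function names `loglik` and `d2` are illustrative.

```r
y <- c(125, 18, 20, 24)

# Log-likelihood without the constant term log(n! / (y1! y2! y3! y4!))
loglik <- function(phi) {
  y[1] * log((2 + phi) / 4) + (y[2] + y[3]) * log((1 - phi) / 4) + y[4] * log(phi / 4)
}

# Second derivative of the log-likelihood with respect to phi
d2 <- function(phi) {
  -y[1] / (2 + phi)^2 - (y[2] + y[3]) / (1 - phi)^2 - y[4] / phi^2
}

phi_hat <- optimize(loglik, interval = c(1/4, 1), maximum = TRUE)$maximum

# Check 1: negative second derivative at the estimate
d2(phi_hat) < 0

# Check 2: the estimate lies in the admissible range
phi_hat >= 1/4 && phi_hat <= 1

# Check 3: the log-likelihood at the MLE is at least as large as at the MME
phi_mme <- 0.5935829
c(MLE = loglik(phi_hat), MME = loglik(phi_mme))
```

Every term of the second derivative is negative for φ in (0, 1), so the log-likelihood is concave there and the stationary point is the global maximum on that interval.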

Use R to find MLE (1)
> #MLE
> y1 = 125; y2 = 18; y3 = 20; y4 = 24
> f <- function(phi){
+   ((2.0+phi)/4.0)^y1 * ((1.0-phi)/4.0)^(y2+y3) * (phi/4.0)^y4
+ }
> plot(f, 1/4, 1, xlab = expression(varphi), ylab = "likelihood function up to a constant")
> optimize(f, interval = c(1/4, 1), maximum = TRUE)
$maximum
[1] 0.5778734

$objective
[1] 7.46944e-82

54

Use R to find MLE (2)

55

Use C/C++ to find MLE (1)

56

Use C/C++ to find MLE (2)

57

Exercises
Write your own programs for those examples presented in this talk.
Write programs for those examples mentioned at the following web page: http://en.wikipedia.org/wiki/Maximum_likelihood
Write programs for the other examples that you know.

58

More Exercises (1)
Example 3 in genetics: The observed data are

$$(n_O, n_A, n_B, n_{AB}) = (176, 182, 60, 17) \sim \text{Multinomial}\left(r^2,\ p^2 + 2pr,\ q^2 + 2qr,\ 2pq\right),$$

where $p$, $q$, and $r$ fall in $[0,1]$ such that $p + q + r = 1$.
Find the likelihood function and score equations for $p$, $q$, and $r$.

59

More Exercises (2)
Example 4 in the positron emission tomography (PET): The observed data are

$$n^*(d) \sim \text{Poisson}\left(\lambda^*(d)\right), \quad d = 1, 2, \ldots, D,$$

and

$$\lambda^*(d) = \sum_{b=1}^{B} p(b,d)\,\lambda(b).$$

The values of $p(b,d)$ are known and the unknown parameters are $\lambda(b)$, $b = 1, 2, \ldots, B$.
Find the likelihood function and score equations for $\lambda(b)$, $b = 1, 2, \ldots, B$.

60

More Exercises (3)
Example 5 in the normal mixture: The observed data $X_i$, $i = 1, 2, \ldots, n$, are random samples from the following probability density function:

$$f(x_i) \sim \sum_{k=1}^{K} \alpha_k\, \text{Normal}(\mu_k, \sigma_k^2), \quad \sum_{k=1}^{K} \alpha_k = 1, \ \text{and } 0 \le \alpha_k \le 1 \text{ for all } k.$$

Find the likelihood function and score equations for the following parameters:

$$(\alpha_1, \ldots, \alpha_K,\ \mu_1, \ldots, \mu_K,\ \sigma_1, \ldots, \sigma_K).$$

61