
Sophia (Xueyao) Liang, CPSC 503 Final Project

Date post: 17-Jan-2018
Upload: kevin-goodwin
Page 1: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.

Sophia (Xueyao) Liang
CPSC 503 Final Project

Page 2:

K=3

Unsupervised

$P(z_1 \mid d)$, $P(z_2 \mid d)$, $P(z_3 \mid d)$

Olympic, vancouver

Snow, cold

Moon light, spider man

Page 3:

      W1   W2   W3   W4   …
D1    1    0    1    1
D2    …    …    …    …
D3    …    …    …    …
…     …    …    …    …
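A minimal sketch of how a binary document-word matrix like the one on this slide could be built. The toy vocabulary and documents are hypothetical (the words are borrowed from the topic examples on page 2), and `build_binary_matrix` is an illustrative helper, not the project's extraction code.

```python
def build_binary_matrix(docs, vocab):
    """Return one row per document; entry j is 1 if the document contains vocab[j]."""
    index = {word: j for j, word in enumerate(vocab)}
    matrix = []
    for tokens in docs:
        row = [0] * len(vocab)
        for token in tokens:
            j = index.get(token)
            if j is not None:  # ignore tokens outside the vocabulary
                row[j] = 1
        matrix.append(row)
    return matrix


# Hypothetical toy corpus, not taken from the project data.
vocab = ["olympic", "vancouver", "snow", "cold"]
docs = [["olympic", "snow", "cold"], ["vancouver", "olympic"]]
matrix = build_binary_matrix(docs, vocab)
```

Here the first row comes out as 1 0 1 1, the same pattern as row D1 on the slide.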

Page 4:

      W1   W2   W3   W4   …
D1    1    0    1    1
D2    …    …    …    …
D3    …    …    …    …
…     …    …    …    …

$$z_k \in \{z_1, z_2, \ldots, z_N\}$$

$$L = \sum_i \sum_j \ln p(d_i, w_j)$$

$$p(d_i, w_j) = \sum_k p(z_k)\, p(d_i \mid z_k)\, p(w_j \mid z_k)$$
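The mixture on this slide can be evaluated directly. The sketch below is illustrative only: it assumes the parameters are stored as nested lists (`p_d_z[k][i]` for p(d_i | z_k), `p_w_z[k][j]` for p(w_j | z_k)) and that the log-likelihood sum runs over the observed (document, word) pairs, i.e. the nonzero cells of the matrix. Both are assumptions, since the slide does not say.

```python
import math


def joint(p_z, p_d_z, p_w_z, i, j):
    """p(d_i, w_j) = sum over k of p(z_k) * p(d_i | z_k) * p(w_j | z_k)."""
    return sum(p_z[k] * p_d_z[k][i] * p_w_z[k][j] for k in range(len(p_z)))


def log_likelihood(pairs, p_z, p_d_z, p_w_z):
    """Sum of ln p(d_i, w_j) over the observed (i, j) pairs (assumed convention)."""
    return sum(math.log(joint(p_z, p_d_z, p_w_z, i, j)) for i, j in pairs)
```

With two topics, two documents, two words and every probability set to 0.5, each joint probability is 2 × 0.5 × 0.5 × 0.5 = 0.25.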

Page 5:

Expectation:

$$p(z_k \mid d_i, w_j) = \frac{p(z_k)\, p(d_i \mid z_k)\, p(w_j \mid z_k)}{\sum_{k'} p(z_{k'})\, p(d_i \mid z_{k'})\, p(w_j \mid z_{k'})}$$

Maximization:

$$p(w_j \mid z_k) = \frac{\sum_i p(z_k \mid d_i, w_j)}{\sum_{i', j'} p(z_k \mid d_{i'}, w_{j'})}$$

$$p(d_i \mid z_k) = \frac{\sum_j p(z_k \mid d_i, w_j)}{\sum_{i', j'} p(z_k \mid d_{i'}, w_{j'})}$$

$$p(z_k) = \frac{\sum_{i, j} p(z_k \mid d_i, w_j)}{\sum_{i, j, k'} p(z_{k'} \mid d_i, w_j)}$$
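One EM iteration for these updates might look like the following. This is a sketch under assumptions rather than the project's implementation: `pairs` is taken to be the list of distinct observed (document, word) index pairs, all initial probabilities are assumed strictly positive, and `em_step` is a hypothetical name.

```python
def em_step(pairs, p_z, p_d_z, p_w_z, n_docs, n_words):
    """Run one PLSA E-step and M-step over the observed (i, j) pairs."""
    n_topics = len(p_z)

    # E-step: posterior p(z_k | d_i, w_j) for every observed pair.
    posterior = {}
    for i, j in pairs:
        scores = [p_z[k] * p_d_z[k][i] * p_w_z[k][j] for k in range(n_topics)]
        total = sum(scores)
        posterior[(i, j)] = [s / total for s in scores]

    # M-step: each parameter is a normalised sum of posteriors.
    grand_total = sum(sum(post) for post in posterior.values())
    new_p_z, new_p_d_z, new_p_w_z = [], [], []
    for k in range(n_topics):
        doc_mass = [0.0] * n_docs
        word_mass = [0.0] * n_words
        for (i, j), post in posterior.items():
            doc_mass[i] += post[k]
            word_mass[j] += post[k]
        topic_mass = sum(doc_mass)  # sum over all pairs of p(z_k | d_i, w_j)
        new_p_d_z.append([m / topic_mass for m in doc_mass])
        new_p_w_z.append([m / topic_mass for m in word_mass])
        new_p_z.append(topic_mass / grand_total)
    return new_p_z, new_p_d_z, new_p_w_z
```

Calling `em_step` repeatedly until the log-likelihood stops improving gives the usual PLSA training loop; the stopping rule is left to the caller.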

Page 6:

      D1   D2   D3   D4   …
D1    1    0    1    1
D2    …    …    …    …
D3    …    …    …    …
…     …    …    …    …

$p(d_i, c_j)$ $\qquad$ $p(d_i, w_j)$
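The document-by-document matrix and the symbol p(d_i, c_j) suggest that citations take the place that words had in the document-word matrix. Assuming that reading, and assuming citations are available as (citing, cited) index pairs (the Cora description later in the deck mentions a citations file of ID-cited ID records), a hypothetical helper for building such a matrix could be:

```python
def build_citation_matrix(citations, n_docs):
    """Return an n_docs x n_docs 0/1 matrix; entry [i][c] is 1 if document i cites c."""
    matrix = [[0] * n_docs for _ in range(n_docs)]
    for citing, cited in citations:
        matrix[citing][cited] = 1
    return matrix


# Hypothetical toy citation list, not taken from Cora.
citation_matrix = build_citation_matrix([(0, 1), (0, 2), (2, 1)], 3)
```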

Page 7:

      W1   W2   W3   W4   …
D1    1    0    1    1
D2    …    …    …    …
D3    …    …    …    …
…     …    …    …    …

Page 8:
Page 9:

$w(d_i, d_{i'})$:

1. $w(d_i, d_{i'}) = 1 \ (i \to i'), \qquad w(d_i, d_{i'}) = 0 \ (i \nrightarrow i')$

2. $w(d_i, d_{i'}) = \dfrac{C}{|I(d_i)|\,|I(d_{i'})|} \sum_{m=1}^{|I(d_i)|} \sum_{n=1}^{|I(d_{i'})|} w\big(I_m(d_i), I_n(d_{i'})\big)$
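The second weighting has the recursive shape of SimRank. The sketch below iterates it to a fixed number of rounds under assumptions the slide does not state: I(d) is read as the list of in-neighbours of d (documents that cite d), a document's similarity to itself is fixed at 1, pairs where either document has no in-neighbours get 0, and the decay C = 0.8 and the iteration count are placeholder values.

```python
def simrank(in_neighbors, C=0.8, iterations=5):
    """Iteratively compute SimRank-style weights w(a, b) for all node pairs.

    in_neighbors maps each node to the list of nodes linking to it (assumed I(d)).
    """
    nodes = list(in_neighbors)
    w = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_w = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_w[(a, b)] = 1.0
                elif not in_neighbors[a] or not in_neighbors[b]:
                    new_w[(a, b)] = 0.0
                else:
                    total = sum(
                        w[(m, n)] for m in in_neighbors[a] for n in in_neighbors[b]
                    )
                    size = len(in_neighbors[a]) * len(in_neighbors[b])
                    new_w[(a, b)] = C * total / size
        w = new_w
    return w


# Hypothetical graph: document 0 links to documents 1 and 2.
weights = simrank({0: [], 1: [0], 2: [0]})
```

In this toy graph documents 1 and 2 share their only in-neighbour, so their weight equals C, while document 0 (no in-neighbours) has weight 0 with both.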

Page 10:

$$O = -(1 - \lambda) \cdot L + \lambda \cdot R$$

$$R = \sum_{i, i', k} w(d_i, d_{i'}) \big( p(z_k \mid d_i) - p(z_k \mid d_{i'}) \big)^2 \quad (i \neq i')$$

$$p(z_k \mid d_i) \propto p(z_k) \cdot p(d_i \mid z_k)$$
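As an illustration of the regularizer R, the sketch below first recovers p(z_k | d_i) by normalising p(z_k) p(d_i | z_k) over the topics, then sums the weighted squared differences between pairs of documents. The data layout (`weights` as a dictionary keyed by document index pairs, `p_d_z[k][i]` for p(d_i | z_k)) and the function names are assumptions made for the example.

```python
def topic_given_doc(p_z, p_d_z, i):
    """p(z_k | d_i), obtained by normalising p(z_k) * p(d_i | z_k) over k."""
    scores = [p_z[k] * p_d_z[k][i] for k in range(len(p_z))]
    total = sum(scores)
    return [s / total for s in scores]


def regularizer(weights, p_z, p_d_z, n_docs):
    """R: weighted squared distance between topic distributions of document pairs."""
    theta = [topic_given_doc(p_z, p_d_z, i) for i in range(n_docs)]
    r = 0.0
    for (i, i2), w in weights.items():
        if i != i2:  # a document compared with itself contributes nothing
            r += w * sum((a - b) ** 2 for a, b in zip(theta[i], theta[i2]))
    return r
```

R is zero when every pair of linked documents has identical topic distributions and grows as linked documents disagree.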

Page 11:

Efficient Algorithm:
Expectation (PLSA)
Maximization (PLSA)
The result of the previous steps may not yield a better value of O

Parameter Inference: No closed-form solution for the maximization step

$$p(d_i \mid z_k) = (1 - \gamma)\, p(d_i \mid z_k) + \gamma\, \frac{\sum_{i'} w(d_i, d_{i'})\, p(d_{i'} \mid z_k)}{\sum_{i'} w(d_i, d_{i'})}$$
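The update on this slide appears to blend each p(d_i | z_k) with the weighted average of its neighbours' values. A rough sketch of one such smoothing pass follows. The mixing weight is called `gamma` here as an assumed name, `weights[i][i2]` is assumed to hold w(d_i, d_i'), documents with no neighbours are left unchanged, and the final renormalisation over documents is an addition of this sketch that the slide does not show.

```python
def smooth_doc_given_topic(p_d_z, weights, gamma):
    """Blend p(d_i | z_k) with the weighted average over linked documents."""
    smoothed = []
    for row in p_d_z:  # one row per topic z_k
        new_row = []
        for i, p in enumerate(row):
            w_sum = sum(weights[i])
            if w_sum == 0:
                new_row.append(p)  # isolated document: nothing to average over
            else:
                avg = sum(weights[i][i2] * row[i2] for i2 in range(len(row))) / w_sum
                new_row.append((1 - gamma) * p + gamma * avg)
        total = sum(new_row)
        # Assumed extra step: renormalise so the row is again a distribution.
        smoothed.append([x / total for x in new_row])
    return smoothed
```

With `gamma = 0` the parameters are returned unchanged; larger values pull linked documents toward each other.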

Page 12:

Potential Problems of the Model

Parameter inference
Higher time complexity and slower convergence

$$O = -(1 - \lambda) \cdot L + \lambda \cdot R$$

-10000

100

Page 13:

Cora Data version 1.0

Cited papers not in the corpus
No abstract for some PostScript files
Too many categories
Duplicated or isolated papers

30000 scientific papers, with citation information
Important files:
papers (ID-name, link, author…)
citations (ID-cited ID)
classifications (link-category)
Directory: extractions (PostScript form of the papers)

Page 14:

Cora Data version 1.0
Papers in category Machine Learning
About 2700 papers
1400 frequent words (stop words removed, stemmed)

Category              Papers
Theory                315
Reinforcement         217
Genetic Algorithms    418
Neural Networks       818
Probabilistic         426
Case based            298
Rule Learning         180

Page 15:

$$\arg\max_k \; p(z_k \mid d)$$
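Assigning each document to its most probable topic can be sketched as below. Because p(z_k | d_i) is proportional to p(z_k) p(d_i | z_k), the normalising constant does not affect the arg max and is skipped. The parameter layout (`p_d_z[k][i]` for p(d_i | z_k)) and the name `assign_topic` are assumptions of the example.

```python
def assign_topic(p_z, p_d_z, i):
    """Return the index k that maximises p(z_k | d_i) for document i."""
    scores = [p_z[k] * p_d_z[k][i] for k in range(len(p_z))]
    return max(range(len(scores)), key=scores.__getitem__)
```

Ties are resolved in favour of the lowest topic index, which is simply how `max` behaves here.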

Page 16:

(A) Accuracy
(B) Recall
Accuracy and recall for each category

                    PHITS   PLSA    NetPLSA
Overall accuracy    0.470   0.501   0.562
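For reference, overall accuracy and per-category recall can be computed from true and predicted labels as in this sketch. It assumes each predicted topic has already been mapped to a category name, a step the slides do not describe, and the function names are illustrative.

```python
from collections import Counter


def overall_accuracy(true_labels, predicted):
    """Fraction of documents whose predicted category matches the true one."""
    correct = sum(t == p for t, p in zip(true_labels, predicted))
    return correct / len(true_labels)


def recall_per_category(true_labels, predicted):
    """For each true category, the fraction of its documents predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted) if t == p)
    return {category: hits[category] / totals[category] for category in totals}
```

The labels passed in would be category names such as those listed for the Cora subset; any example inputs are made up and unrelated to the reported results.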

Page 17:

Supported the claim that adding network structure to the model can improve topic modeling results

Modeled the network at the level of individual articles

Inherent problems exist in the chosen framework

The results are still far from satisfactory

Page 18:

How to model the network structure of blog articles, especially when modeling them at the level of individual articles

Bag-of-words matrix extraction
Better integrated model, maybe LDA-based
Efficiency of the algorithm
Recommendation based on topic community discovery

