
1 Authoritative Sources in a Hyperlinked Environment Jon M. Kleinberg Presented by Yongqiang Li...

Date post: 19-Dec-2015
Page 1

Authoritative Sources in a Hyperlinked Environment

Jon M. Kleinberg

Presented by

Yongqiang Li

Adapted from http://4.bp.blogspot.com/_ZoVPDQT8m6o/SbbYNkPJCnI/AAAAAAAAAEM/c2ueV-36llY/s320/relevance2.JPG

Page 2

This presentation explains how to find the authoritative sources for a broad search query on the WWW.

Page 3

Why authorities are important in broad search queries.

Is a given web page an authority?

Page 4

Why we need to analyze the link structure to know whether a web page is an authority.

A content-based query fails to find TOYOTA, an authority among automobile manufacturers.

http://www.kolberg.co.uk/img/hubs_and_authorities.gif

Link-based model for the conferral of authority

Page 5

A focused subgraph of the WWW is where we look for authorities and hubs.

1). Relatively small
2). Rich in relevant pages
3). Contains most of the strongest authorities

The process to obtain a focused subgraph of the WWW

Link Types
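The construction of the focused subgraph can be sketched as follows. This is a minimal sketch under assumptions, not the presenter's code: the root set is taken as given (for example, the top results of a text search), `out_links` is a hypothetical in-memory map from each URL to the URLs it links to, the in-link cap `d` and its default of 50 are illustrative, and "same host name" is used as a stand-in for "same domain" when separating intrinsic links from transverse ones.

```python
from urllib.parse import urlsplit


def invert_links(out_links):
    """Build the reverse map: page -> set of pages that link to it."""
    in_links = {}
    for src, targets in out_links.items():
        for dst in targets:
            in_links.setdefault(dst, set()).add(src)
    return in_links


def build_base_set(root, out_links, in_links, d=50):
    """Grow the root set into a base set.

    Every page a root page points to is added, plus at most d pages
    that point to it (the first d in sorted order, chosen here only
    to keep the sketch deterministic).
    """
    base = set(root)
    for page in root:
        base.update(out_links.get(page, ()))
        base.update(sorted(in_links.get(page, ()))[:d])
    return base


def host(url):
    """Host name of a URL, used as a rough proxy for its domain."""
    return urlsplit(url).hostname or ""


def transverse_edges(base, out_links):
    """Keep only links inside the base set that cross host names.

    Links within one host (intrinsic links) are dropped because they
    are mostly navigational and confer little authority.
    """
    return {
        (p, q)
        for p in base
        for q in out_links.get(p, ())
        if q in base and host(p) != host(q)
    }
```

The resulting edge set is the link structure on which the later scoring steps operate. Comparing host names is a simplification: it misclassifies pages of one organization spread over several hosts.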

Page 6

In-degree alone cannot identify authorities in a focused subgraph.

Query: Java

Definition of in-degree

Top 3 pages by in-degree

How do we filter out pages that are merely popular?
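To make the weakness of in-degree concrete, here is a small sketch of in-degree ranking. The edge list and page names in the usage note are made up for illustration; the point is only that every incoming link counts the same, so a universally popular page can outrank pages that are actually about the query.

```python
from collections import Counter


def rank_by_in_degree(edges, k=3):
    """Return the k pages with the most incoming links.

    edges is an iterable of (source, target) pairs. Ties are broken
    alphabetically so the result is deterministic.
    """
    in_degree = Counter(target for _, target in edges)
    ranked = sorted(in_degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

For example, with three unrelated pages linking to a hypothetical `popular` page and two topical hubs linking to `a`, the call `rank_by_in_degree(edges)` puts `popular` first even though nothing ties it to the query topic.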

Page 7

The iteration flow to compute weights or scores.

The mutually reinforcing hubs-and-authorities approach can find authorities in a focused subgraph.

Heuristics:
1). A good authority is pointed to by good hubs.
2). A good hub points to good authorities.

Broad-topic
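The two heuristics translate directly into an iterative update. The following is a minimal pure-Python sketch, assuming the focused subgraph is given as a list of distinct (source, target) link pairs; the fixed iteration count and the all-ones starting weights are illustrative choices.

```python
import math


def normalize(scores):
    """Scale the scores so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return scores
    return {page: v / norm for page, v in scores.items()}


def hits(edges, iterations=20):
    """Compute authority and hub scores by mutual reinforcement."""
    nodes = sorted({page for edge in edges for page in edge})
    auth = {page: 1.0 for page in nodes}
    hub = {page: 1.0 for page in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages pointing in.
        new_auth = {page: 0.0 for page in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the fresh authority scores of the pages pointed to.
        new_hub = {page: 0.0 for page in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub
```

Pages with the largest authority scores are reported as authorities and pages with the largest hub scores as hubs. Normalizing after each round keeps the numbers bounded without changing the ranking.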

Page 8

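Analysis of the authority and hub model shows that the method converges and is eigenvector-based.

Theorem 3.1 The sequences $x_1, x_2, x_3, \ldots$ and $y_1, y_2, y_3, \ldots$ converge (to limits $x^*$ and $y^*$ respectively).

$A$ is the adjacency matrix of the focused subgraph. We have the authority score vector $x \leftarrow A^T y$ and the hub score vector $y \leftarrow A x$.

Eigenvalue Assumption (#): for a symmetric matrix $M$ with eigenvalues $\lambda_i(M)$ ordered by absolute value and corresponding eigenvectors $\omega_i(M)$, $|\lambda_1(M)| > |\lambda_2(M)|$. Then we refer to $\omega_1(M)$ as the principal eigenvector, and to all other $\omega_i(M)$ as non-principal eigenvectors.

Theorem 3.2 (Subject to Assumption (#).) $x^*$ is the principal eigenvector of $A^T A$, and $y^*$ is the principal eigenvector of $A A^T$.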
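Why the limits are eigenvectors can be seen by composing the two update rules. The derivation below is a sketch with normalization omitted, where $A$ is the adjacency matrix, $x_k$ and $y_k$ are the authority and hub vectors after $k$ rounds, and $z$ denotes the starting hub vector $y_0$ (assumed here to be the all-ones vector, which the slide does not state).

```latex
\begin{align*}
x_k &\propto A^{T} y_{k-1}, \qquad y_k \propto A\, x_k \\
\Rightarrow\quad x_k &\propto (A^{T}A)\, x_{k-1} \propto (A^{T}A)^{k-1} A^{T} z, \\
y_k &\propto (A A^{T})\, y_{k-1} \propto (A A^{T})^{k} z .
\end{align*}
```

Each sequence is therefore a power iteration on a symmetric matrix, which converges to the principal eigenvector whenever the starting vector is not orthogonal to it.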

Page 9

Experimental results of the hubs-and-authorities scoring approach are compelling.

Top pages are highly relevant to the query.

Page 10

The authority and hub scoring approach is also compelling for finding similar pages.

1). Expand a highly referenced page into a focused subgraph.

2). Use the authority and hub approach to find similar pages.

3). Top pages are highly relevant to the query page.
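The three steps above can be sketched end to end. This is an illustrative sketch under assumptions rather than the presenter's implementation: the root set is taken to be the pages that link to the query page, the caps `t` and `d` and their defaults are illustrative, and the query page itself is left out of the returned list (a presentation choice).

```python
import math


def similar_pages(page, edges, t=200, d=50, iterations=30, k=3):
    """Return up to k pages ranked as authorities around `page`."""
    out_links, in_links = {}, {}
    for src, dst in edges:
        out_links.setdefault(src, set()).add(dst)
        in_links.setdefault(dst, set()).add(src)

    # Step 1: root set = up to t pages pointing to the query page,
    # grown by their out-links and by at most d in-links each.
    root = sorted(in_links.get(page, ()))[:t]
    base = set(root)
    for r in root:
        base |= out_links.get(r, set())
        base |= set(sorted(in_links.get(r, ()))[:d])
    sub = {(p, q) for p, q in edges if p in base and q in base and p != q}

    # Step 2: mutually reinforcing authority and hub scores.
    hub = {n: 1.0 for n in base}
    auth = {n: 0.0 for n in base}
    for _ in range(iterations):
        auth = {n: 0.0 for n in base}
        for p, q in sub:
            auth[q] += hub[p]
        hub = {n: 0.0 for n in base}
        for p, q in sub:
            hub[p] += auth[q]
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm

    # Step 3: the top authorities other than the query page itself.
    ranked = sorted(base - {page}, key=lambda n: (-auth[n], n))
    return ranked[:k]
```

With a toy link list in which several hub pages link to a hypothetical `honda` page alongside other car makers, the makers that share the most hubs with `honda` come out on top.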

Page 11

Other academic fields have their own link-based ranking approaches.

Similar concepts: ranking, scoring, standing, impact, and influence.

How to measure them?

Page 12

How to obtain clustered sets of hubs and authorities. Why do multiple sets exist for a broad-topic query?

1. Multiple meanings, e.g. “jaguar”
2. Multiple academic communities, e.g. “randomized algorithms”

Proposition 6.1 $A A^T$ and $A^T A$ have the same multiset of eigenvalues, and their eigenvectors can be chosen so that $\omega_i(A A^T) = A\,\omega_i(A^T A)$.

This proposition shows that the authority ranking x* and the hub ranking y* reinforce each other within each pair of eigenvectors.

Each eigenvector corresponds to a set of authorities.
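How several eigenvectors yield several sets of authorities can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: the eigenvectors of $A^T A$ are obtained by power iteration with deflation (a standard technique swapped in for brevity, not necessarily what the paper's experiments used), and pages are ranked by the absolute value of their coordinate, which is a simplification of treating the most positive and most negative coordinates as separate sets.

```python
import math


def authority_matrix(edges):
    """Return (pages, M) with M = A^T A.

    M[i][j] counts the pages that link to both page i and page j.
    """
    pages = sorted({page for edge in edges for page in edge})
    index = {page: i for i, page in enumerate(pages)}
    targets_of = {}
    for src, dst in edges:
        targets_of.setdefault(src, set()).add(index[dst])
    n = len(pages)
    M = [[0.0] * n for _ in range(n)]
    for targets in targets_of.values():
        for i in targets:
            for j in targets:
                M[i][j] += 1.0
    return pages, M


def power_iteration(M, iterations=200):
    """Approximate the dominant eigenpair of a symmetric PSD matrix."""
    n = len(M)
    v = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


def top_eigenvectors(M, k=2):
    """Return the k leading eigenpairs, removing each one once found."""
    n = len(M)
    M = [row[:] for row in M]
    pairs = []
    for _ in range(k):
        lam, v = power_iteration(M)
        pairs.append((lam, v))
        # Deflation: subtract the component just found.
        for i in range(n):
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]
    return pairs


def authority_sets(edges, k=2, size=2):
    """One set of authorities per leading eigenvector of A^T A."""
    pages, M = authority_matrix(edges)
    sets = []
    for _, v in top_eigenvectors(M, k):
        order = sorted(range(len(pages)), key=lambda i: (-abs(v[i]), pages[i]))
        sets.append([pages[i] for i in order[:size]])
    return sets
```

On a toy graph with two separate communities (say, pages about the car and pages about the animal for the query "jaguar"), the first eigenvector concentrates on the more densely linked community and the second on the other one.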

Page 13

Results of clustering sets of hubs and authorities.

Broad topic: Jaguar

Page 14

Diffusion in the hubs-and-authorities ranking approach.

When the focused subgraph for a query does not contain a dense enough set of relevant pages, pages on a broader topic are returned by the principal eigenvector instead.

Page 15

It’s challenging to evaluate the query results.

Because the quality of the query results relies on human judgment, there is no quantitative measurement.

The authority and hub ranking approach is implemented in the CLEVER project. For further information, refer to the separate presentations by Hira Bahir and Ray Yamada.

Page 16

Pros

• This paper uses the AUTHORITY concept to retrieve pages highly relevant to a broad topic, effectively and efficiently, by analyzing the link structure in a focused subgraph of the WWW.
• The AUTHORITY and HUB ranking approach can be used not only for WWW ranking but also in other academic fields, such as social networks and scientific citations.
• Only the link structure of a focused subgraph is maintained, not the link structure of the entire WWW, so storage and running time are much lower.
• HUB is a compelling concept paired with AUTHORITY. HUBS play a critical role in finding AUTHORITIES in the WWW. The HUB concept is innovative since it is not found in earlier bibliometric work.
• Although the eigenvector-based method is not first presented in this paper, it is very effective compared with the in-degree ranking approach. The eigenvector-based method also has a natural advantage in grouping relevant pages into clusters.
• The AUTHORITY and HUB ranking approach can be used to find pages similar to a page of interest as well as for broad-topic search.

Page 17

Cons

• The AUTHORITY and HUB ranking approach is based on heuristics, i.e. a good HUB points to many good AUTHORITIES, and a good AUTHORITY is pointed to by many good HUBS. But the real WWW is far more complex than these idealized assumptions.
• Telling INTRINSIC links from TRANSVERSE links by checking whether two pages share a domain is not straightforward. Some pages belong to the same business body even though their domains differ, and some pages belong to different business bodies even though they are under the same domain.
• The base set of the focused subgraph for a broad-topic query may not contain a dense enough set of relevant pages. This leads to diffusion and generalization.
• The assumption that the first eigenvalue of the matrix is strictly larger in absolute value than the others, so that a principal eigenvector exists, may not hold in some cases.

Page 18

Q & A

