“The Rise and Rise of Citation Analysis ” Tanmoy Chakraborty CNeRG, IIT-Kgp, India.

Date post: 17-Dec-2015
Page 1: “The Rise and Rise of Citation Analysis ” Tanmoy Chakraborty CNeRG, IIT-Kgp, India.

“The Rise and Rise of Citation Analysis”

Tanmoy Chakraborty, CNeRG, IIT-Kgp, India

Page 2

Twofold Research Interests

• Analyzing communities/clusters in complex networks

• Studying different aspects of citation network

Page 3

“The Rise and Rise of Citation Analysis”

In collaboration with Suhansanu Kumar, Pawan Gowel, Animesh Mukherjee, Niloy Ganguly

Page 4

Mixed Sentiment

• Sense and non-sense about citation analysis (*6860) --- T. Opthof, Cardiovascular Research, 97
• The rise and rise of citation analysis (*1399) --- Lokman I. Meho, Phy. Res., 07
• Does citation pay? (*887) --- Fowler & Aksnes, Scientometrics, 07
• Think beyond citation analysis (*1009) --- Sarli et al., NIPS, 10

Page 5

Raw Citations Count

To assess
• Quality of a paper
• Prominence of a researcher
• Success of a collaboration/group
• Quality of a conference/journal
• Quality of an institute
• Impact of a research area

Only Citation Count

Sooner or later, you will definitely be subjected to such an analysis

Page 6

Bibliometrics: Raw Citation Count

• Journal Impact Factor
• Immediacy factor
• Eigenfactor
• Altmetric
• 5-year Impact Factor

Citation

Common assumption

Page 7

Publication Universe

• Crawled the entire Microsoft Academic Search
• Papers only in the Computer Science domain
• Basic preprocessing

Basic statistics of papers from 1960-2010

Statistic                           Value
Number of valid entries             3,473,171
Number of authors                   1,186,412
Number of unique venues             6,143
Avg. number of papers per author    5.18
Avg. number of authors per paper    2.49

Page 8

Publication Universe

Available Metadata for each paper

Title

Unique ID

Named-entity-disambiguated author names

Year of publication

Named-entity-disambiguated publication venue

Related research field(s)

References

Keywords

Abstract

Citation context

Page 9

Citation Profile

An exhaustive analysis of the citation profiles
• Consider papers with at least 10 years of citation history, and use at most 20 years of it
• Scale the entries of each citation profile to the range 0-1
• Use peak-detection heuristics
» Each peak should be at least 75% of the maximum peak
» Two consecutive peaks should be separated by at least 3 years
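The scaling step and the two peak heuristics above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the slide does not say how the profile is scaled or how competing peaks are resolved, so dividing by the maximum, the local-maximum test, and the "keep the tallest peak first" rule are all assumptions, and the function names are hypothetical.

```python
def scale_profile(citations):
    """Scale yearly citation counts to [0, 1] by dividing by the maximum (assumed scaling)."""
    top = max(citations, default=0)
    if top == 0:
        return [0.0 for _ in citations]
    return [c / top for c in citations]


def detect_peaks(citations, min_rel_height=0.75, min_gap_years=3):
    """Return the year offsets (0-based) of the peaks in a citation profile.

    A year is a candidate peak if it is a local maximum of the scaled profile
    and reaches at least `min_rel_height` of the tallest peak (which is 1.0
    after scaling). Candidates closer than `min_gap_years` to an already kept,
    taller peak are dropped (assumed tie-breaking rule).
    """
    profile = scale_profile(citations)
    n = len(profile)
    candidates = []
    for i, value in enumerate(profile):
        left = profile[i - 1] if i > 0 else float("-inf")
        right = profile[i + 1] if i < n - 1 else float("-inf")
        # Local maximum: strictly above the left neighbour, not below the right one.
        if value > left and value >= right and value >= min_rel_height:
            candidates.append(i)

    kept = []
    # Visit the tallest candidates first so that a lower nearby peak is the one dropped.
    for i in sorted(candidates, key=lambda idx: (-profile[idx], idx)):
        if all(abs(i - j) >= min_gap_years for j in kept):
            kept.append(i)
    return sorted(kept)


# Example profile: citations per year after publication (made-up numbers).
example_peaks = detect_peaks([1, 4, 10, 3, 2, 2, 9, 8, 1, 0])
```

On the made-up profile, years 2 and 6 both exceed 75% of the maximum and lie four years apart, so both are kept. With `[0, 10, 2, 9, 0]` the second peak is only two years from the first and is dropped.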

Page 10

Five Universal Citation Profiles

Q1 and Q3 represent the first and third quartiles of the data points respectively.

Avg. behavior

Another category: ‘Oth’, for papers with fewer than one citation per year on average

Page 11

Five Universal Citation Profiles: A deeper look

Page 12

Immediate Questions

• Is the Journal Impact factor (JIF) formula correct?

• JIF at year 2000, following Eugene Garfield (1975):

JIF(2000) = (# of citations received in 2000 by papers published in that journal in 1998 and 1999) / (# of papers published in the journal in 1998 and 1999)
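The ratio above can be written as a short Python function. This is a sketch under stated assumptions: the input layout (a dictionary of papers per publication year and a list of `(citing_year, cited_publication_year)` pairs) and the function name are hypothetical, chosen only to make the two-year window explicit.

```python
def journal_impact_factor(year, papers_per_year, citation_events):
    """Compute the two-year JIF of one journal for `year`.

    papers_per_year: {publication_year: number of papers the journal published}
    citation_events: iterable of (citing_year, cited_publication_year) pairs,
                     one pair per citation received by the journal's papers.
    Both structures are illustrative assumptions.
    """
    window = (year - 2, year - 1)
    published = sum(papers_per_year.get(y, 0) for y in window)
    if published == 0:
        raise ValueError("journal published no papers in the two preceding years")
    # Only citations made in `year` to papers from the two preceding years count.
    cites = sum(
        1
        for citing_year, pub_year in citation_events
        if citing_year == year and pub_year in window
    )
    return cites / published


# Made-up numbers: 100 papers in 1998-1999, cited 150 times in 2000.
papers = {1998: 40, 1999: 60}
events = (
    [(2000, 1998)] * 90
    + [(2000, 1999)] * 60
    + [(2000, 1995)] * 500   # older papers: ignored by the formula
    + [(1999, 1998)] * 30    # citations made before 2000: ignored
)
jif_2000 = journal_impact_factor(2000, papers, events)  # 150 / 100 = 1.5
```

The 500 citations to the 1995 papers do not change the result, which is the property the following slides question.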

Page 13

Immediate Questions

What does JIF really imply?

Importance of the recent papers in current time period?

Relevance of the journal itself in current time period

Why only the last 2 years? Why not all the citations received at the current time?

Page 14

Immediate Questions

[Figure: citation profiles with regions labelled "Over-consider" and "Under-consider"]

JIF overlooks the importance of Peak_Late and MonIncr

Page 15

More on the Categories

1. Are they biased on the year of publication? (Aging factor)

Category     Mean year   Year deviation
Peak_Int     1994        5.19
Peak_Mul     1992        6.68
Peak_Late    1992        6.54
MonDec       1994        5.43
MonIncr      1993        5.36

Same age. Ans: No

Page 16

More on the Categories

2. Are they biased on Journals/conferences?

                          Peak_Int   Peak_Mul   Peak_Late   MonDec   MonIncr
% of conference papers    65         39.03      39.89       60.73    25.26
% of journal papers       35         60.97      60.11       39.27    74.74

Page 17

More on the Categories

3. Are they affected by self-citation?

             Peak_Int   Peak_Mul   Peak_Late   MonDec   MonIncr   Oth
Peak_Int     0.72       0.10       0.03        0.01     0         0.15
Peak_Mul     0.02       0.81       0.04        0        0.01      0.11
Peak_Late    0.01       0.06       0.86        0        0.01      0.06
MonDec       0.05       0.14       0           0.41     0         0.35
MonIncr      0          0.02       0.01        0        0.88      0.09

Transition matrix showing how papers move between categories after self-citations are removed (rows: original category; columns: category after removal)

Most affected by self-citations: MonDec (only 0.41 stay in the category)

Least affected by self-citations: MonIncr (0.88 stay in the category)
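One way to read "most" and "least" affected off the matrix is to compare its diagonal, that is, the fraction of each category that keeps its label once self-citations are removed. The Python sketch below assumes that reading, and also assumes rows are the original category and columns the category after removal; the slide states neither explicitly. The variable and function names are hypothetical.

```python
CATEGORIES = ["Peak_Int", "Peak_Mul", "Peak_Late", "MonDec", "MonIncr"]
TARGETS = CATEGORIES + ["Oth"]

# Values copied from the slide. Assumed layout: row = category before removing
# self-citations, column = category afterwards (order given by TARGETS).
TRANSITIONS = {
    "Peak_Int":  [0.72, 0.10, 0.03, 0.01, 0.00, 0.15],
    "Peak_Mul":  [0.02, 0.81, 0.04, 0.00, 0.01, 0.11],
    "Peak_Late": [0.01, 0.06, 0.86, 0.00, 0.01, 0.06],
    "MonDec":    [0.05, 0.14, 0.00, 0.41, 0.00, 0.35],
    "MonIncr":   [0.00, 0.02, 0.01, 0.00, 0.88, 0.09],
}


def retention(category):
    """Fraction of papers that stay in `category` (the diagonal entry)."""
    return TRANSITIONS[category][TARGETS.index(category)]


# Categories ordered from lowest to highest retention.
ranking = sorted(CATEGORIES, key=retention)
most_affected = ranking[0]    # lowest retention
least_affected = ranking[-1]  # highest retention
```

Under this reading the ordering is MonDec (0.41), Peak_Int (0.72), Peak_Mul (0.81), Peak_Late (0.86), MonIncr (0.88).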

Page 18

More on the Categories

4. What about Peak_Mul?

Peak_Mul might be an intermediary between Peak_Int and Peak_Late

[Figure: Avg Peak Height vs. Time (years after publication) for Peak_Int, Peak_Mul and Peak_Late. Values shown on the plot: 2.5, 5.3, 5.1, 3.1, 4.2, 5.3, 12.1, 10.8]

3.1 + 2.5 = 5.6

(12.1 - 5.3) = 6.8 ~ (10.8 - 4.2)

Page 19

Where does this classification help?

• To improve bibliometrics in scientific research
• Various prediction systems
» Future citation prediction system
» Predicting emerging fields/topics
» Predicting future star/seminal papers
• Paper search and recommender systems

Page 20

On Predicting Future Citation Count at the Time of Publication

Page 21

Problem Definition

Page 22

Traditional Framework

Yan et al., JCDL, 2012

Page 23

Problems in Traditional Framework

• Considers statistics from the first few years after publication (proved to be very effective)

• Lacks a time dimension in prediction

• Suffers a lot from outlier points during regression

Page 24

Problems in Traditional Framework: how to tackle

• Considers statistics from the first few years after publication (proved to be very effective)
» Try to predict citations as early as possible (maybe at the time of publication)
• Lacks a time dimension in prediction
» Should consider the time dimension
• Suffers a lot from outlier points during regression
» Reduce outlier points as much as possible

Page 25

Our Framework: 2-stage Model

Page 26

Features

Author-centric

Productivity (Max/Avg)

H-index (Max/Avg)

Versatility (Max/Avg)

Sociality (Max/Avg)
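As an illustration of the "(Max/Avg)" pattern, the sketch below computes the standard h-index for each co-author and then takes the maximum and the average over a paper's author list. It is a minimal Python example, not the authors' feature extractor: the slide does not say which citation counts feed the h-index (for instance, only those available at publication time), and the function names and input layout are hypothetical.

```python
def h_index(citation_counts):
    """Largest h such that the author has h papers with at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def max_avg(values):
    """Return (maximum, average) of a non-empty list of per-author values."""
    return max(values), sum(values) / len(values)


def h_index_features(coauthor_citations):
    """coauthor_citations: one list of per-paper citation counts per co-author (assumed layout)."""
    return max_avg([h_index(counts) for counts in coauthor_citations])


# Made-up example: three co-authors with h-indices 4, 1 and 0.
h_max, h_avg = h_index_features([[10, 8, 5, 4, 3], [1, 1], [0]])
```

The same `max_avg` helper would apply to the other author-centric features once a per-author value is defined for each of them.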

Page 27

Performance Evaluation

(i) Coefficient of determination (R²): the higher, the better

(ii) Mean squared error (θ): the lower, the better

(iii) Pearson correlation coefficient (ρ): the higher, the better
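For reference, the three measures can be computed from actual and predicted citation counts as in the Python sketch below. It uses the textbook definitions and plain lists; the function names are hypothetical, and the slides do not state which implementation the authors used.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def mean_squared_error(actual, predicted):
    """theta: average squared difference between actual and predicted values."""
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)])


def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot. Undefined if all actual values are equal."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson(actual, predicted):
    """rho: covariance divided by the product of the standard deviations.

    Undefined if either sequence is constant.
    """
    avg_a, avg_p = mean(actual), mean(predicted)
    cov = sum((a - avg_a) * (p - avg_p) for a, p in zip(actual, predicted))
    var_a = sum((a - avg_a) ** 2 for a in actual)
    var_p = sum((p - avg_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


# Made-up citation counts and predictions.
actual = [1, 2, 3, 4, 5]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
theta = mean_squared_error(actual, predicted)  # about 0.02
r2 = r_squared(actual, predicted)              # about 0.99
rho = pearson(actual, predicted)
```

A perfect predictor gives θ = 0 and R² = 1. R² can become negative when the predictions are worse than always predicting the mean, whereas ρ only measures linear association and ignores systematic offsets.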

Page 28

Performance of SVM

Confusion Matrix

Page 29

Performance of Regression Model

Page 30

Feature Analysis

Page 31

Conclusion

• Five universal citation profiles
• Different analysis on these categories
• Can help to reframe existing bibliometrics
• Can be a generic way in machine learning
• Can enhance the performance of the existing systems

Page 32

Thank You

