CiteSearch: Multi-faceted Fusion Approach to Citation Analysis
Kiduk Yang and Lokman Meho
Web Information Discovery Integrated Tool Laboratory
School of Library and Information Science, Indiana University
January 18, 2007

CiteSearch: What, Why, & How
Goal
• Quality Assessment of Scholarly Publications
Motivation
• Lack of comprehensive citation database
• Limitations of conventional citation analysis
  - One-dimensional assessment
  - Misleading evaluation
Approach
• Multi-faceted, Fusion-based Citation Analysis
  - Combine data from multiple citation databases
  - Assess quality using various quality evaluation measures

CiteSearch Study: Overview
Objectives
• Investigate current citation analysis environment
• Test the viability of CiteSearch system
Method
• Search citation databases and compare the results
Setup
• Study sample
  - Publications of 15 SLIS faculty members (approx. 1,100 publications)
• Databases used
  - Google Scholar, Scopus, Web of Science
• Citation sources
  - Journals and conference papers in 1996-2005

Citation Databases

                    | Web of Science               | Scopus                                      | Google Scholar
--------------------|------------------------------|---------------------------------------------|--------------------------
Breadth of coverage | 36M records                  | 28M records                                 | 500M records
                    | 8,700 titles                 | 15,000 titles                               | Unknown number of titles
                    | Journals (240 open access)   | Journals (500 open access)                  | 30+ document types
                    | & conference papers          | & conference papers                         |
Coverage years      | A&HCI: 1975-                 | 1996-present (with cited references)        | Unknown
                    | SCI: 1900-                   | 1966-present (without cited references)     |
                    | SSCI: 1956-                  |                                             |
Subject area        | All                          | All                                         | All

• Data collection
  - WoS: 100 hours
  - Scopus: 200 hours
  - GS: over 3,000 hours

Scopus and WoS: Citation Count
Scopus vs. WoS
• 14.0% (278) more citations by Scopus
  - More comprehensive coverage by Scopus (15,000 vs. 8,700 periodicals)
Scopus + WoS
• Scopus increases WoS citations by 35% (710)
• WoS increases Scopus citations by 19.0% (432)
• Relatively low overlap (58%) and high uniqueness (42%)
[Venn diagram] Scopus: 2,301; Web of Science: 2,023; overlap: 58% (1,591); Scopus only: 26% (710); WoS only: 16% (432); Scopus ∪ WoS: 2,733
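
The overlap and uniqueness figures on this slide follow from simple set arithmetic. Below is a minimal sketch, assuming each citation has already been reduced to a comparable identifier. The function name `overlap_stats` and the synthetic integer IDs are illustrative only, chosen so that the set sizes match the counts reported on the slide; they are not part of the study's data or tooling.

```python
def overlap_stats(a, b):
    """Compare two collections of citation identifiers (hypothetical helper)."""
    a, b = set(a), set(b)
    union, both = a | b, a & b
    only_a, only_b = a - b, b - a

    def pct(part, whole):
        # Whole-number percentage; returns 0 when the denominator is empty.
        return round(100 * part / whole) if whole else 0

    return {
        "union": len(union),
        "overlap": len(both),
        "only_a": len(only_a),
        "only_b": len(only_b),
        "overlap_pct": pct(len(both), len(union)),
        "only_a_pct": pct(len(only_a), len(union)),
        "only_b_pct": pct(len(only_b), len(union)),
        # How much source A adds on top of source B, and vice versa.
        "a_gain_over_b_pct": pct(len(only_a), len(b)),
        "b_gain_over_a_pct": pct(len(only_b), len(a)),
    }


# Synthetic IDs sized to match the slide: 2,301 Scopus and 2,023 WoS citations,
# of which 1,591 are shared.
scopus = range(0, 2301)
wos = range(710, 2733)
stats = overlap_stats(scopus, wos)
print(stats)
```

With these inputs the sketch reproduces the 58% / 26% / 16% split of the union and the 35% and 19% increases quoted above.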

Impact of Scopus By Research Area
• Varies significantly between research areas

Impact of Scopus on Faculty Members' Relative Ranking
• Scopus significantly alters the relative ranking of those faculty members who appear in the middle of the rankings

Scopus + WoS: Citation Count By Document Type
[Venn diagram, conference papers only] Scopus: 359; WoS: 229; overlap: 18% (92); Scopus only: 54% (267); WoS only: 28% (137); Scopus ∪ WoS: 496

Scopus + WoS: Summary of Results
Coverage
• Varies greatly between research areas
  - Increase in citations ranges from 5% to 99% by combining results from both databases
• Scopus has a much better coverage of conference proceedings
  - Overlap: 18%; Scopus only: 54%; WoS only: 28%
Ranking by citation count
• Relative ranking of faculty members changes significantly for those in the middle

Google Scholar Citations By Document Type

Citations By Language

Impact of GS By Research Area

Impact of GS on Faculty Members' Relative Ranking
• GS does not significantly alter the rankings of faculty members

GS vs. Scopus ∪ WoS
• GS increases Scopus ∪ WoS citations by 93% (2,552)
• Scopus ∪ WoS increases GS citations by 26% (1,104)
• GS identifies 53% (or 1,448) more citations than Scopus ∪ WoS
• GS has much better coverage of conference proceedings
  - 1,849 by GS vs. 496 by Scopus ∪ WoS
• GS has over twice as many unique citations as Scopus ∪ WoS
  - 2,552 vs. 1,104, respectively
[Venn diagram] Google Scholar: 4,181; Scopus ∪ WoS: 2,733; overlap: 31% (1,629); GS only: 48% (2,552); Scopus ∪ WoS only: 21% (1,104); GS ∪ Scopus ∪ WoS: 5,285

GS + Scopus ∪ WoS: Summary of Results
Coverage
• Varies greatly between research areas
  - 23% to 144% increase by combining GS & Scopus ∪ WoS
  - 5% to 98% increase by combining Scopus & WoS
• GS has strong coverage in CS & IS
  - HCI, IR, computational linguistics, social informatics
• Scopus ∪ WoS has stronger coverage in LS
  - Bibliometrics, collection development, information policy
• GS provides significantly better coverage of non-English materials
  - GS (7%); Scopus (1%); WoS (1%)
Ranking
• No significant changes in relative ranking of faculty members

Findings
• Scopus, WoS, and GS complement rather than replace each other
• GS can be useful in showing evidence of broader international impact than could possibly be done through Scopus and WoS
• GS can be very useful for citation searching purposes; however, it is not conducive to large-scale comparative citation analyses
• Scopus significantly alters the relative citation ranking of scholars as measured by Web of Science. GS does not.

Conclusions
Multiple sources of citations should be used to generate accurate citation counts and rankings
• Citation databases complement one another
• Small overlap between sources may significantly influence relative ranking
Multi-faceted citation analysis is needed
• Citation coverage varies by research area, document type, language
CiteSearch can greatly facilitate citation analysis
• Enormous effort is required to
  - Refine search strategy
  - Parse search results
  - Eliminate noise (duplicate citations)
  - Extract & normalize citation metadata

CiteSearch System: Overview
A Web-based citation search and analysis tool
Work-in-progress prototype system
1. Search multiple citation sources
   - Google Scholar, Web of Science, Scopus, EBSCO, ProQuest, etc.
2. Extract and compile citation metadata
   - Parse & normalize the search results
3. Compute various citation-based quality evaluation measures
   - Document-based measures
     • Weighted citation counts, CiteRank
   - Author-based measures
     • Weighted publication counts, H-Index, Mentor-Index
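
Of the author-based measures listed, the H-Index has a standard definition: the largest h such that the author has h publications with at least h citations each. The sketch below shows that standard definition only; it is not taken from the CiteSearch code, and the function name `h_index` is illustrative. The other measures (CiteRank, Mentor-Index, weighted counts) are not defined on the slides and are not attempted here.

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so no larger h exists
    return h


# Made-up citation counts for one author's five publications.
print(h_index([10, 8, 5, 4, 3]))
```

For the made-up counts above, four publications have at least four citations each, so the result is 4.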

CiteSearch System: Architecture

End

CiteSearch System: Work-in-Progress
Federated Citation Search
• To compile comprehensive & usable citation data
1. Query multiple citation databases
2. Filter out noise
   • e.g., invalid, duplicate citations
3. Extract & normalize metadata
   • bibliographical metadata (e.g., title, author, year, source)
   • citation metadata (e.g., doctype, subject, language)
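
Steps 2 and 3 above can be illustrated with a small normalize-then-merge routine. This is a sketch under stated assumptions, not the CiteSearch implementation: the record fields (`title`, `authors`, `year`, `source`), the matching key (normalized title, year, first author's surname), and all function names are hypothetical, and the sample records are invented.

```python
import re


def normalize_title(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def citation_key(record):
    """Assumed matching key: normalized title, year, first author's surname."""
    surname = record["authors"][0].split(",")[0].strip().lower()
    return (normalize_title(record["title"]), record["year"], surname)


def merge_citations(records):
    """Collapse duplicates and remember which databases reported each one."""
    merged = {}
    for rec in records:
        entry = merged.setdefault(citation_key(rec), {**rec, "sources": set()})
        entry["sources"].add(rec["source"])
    return list(merged.values())


# Invented records: the first two describe the same citing paper.
records = [
    {"title": "Citation Analysis: A Review.", "authors": ["Smith, J."],
     "year": 2004, "source": "Scopus"},
    {"title": "citation analysis - a review", "authors": ["Smith, John"],
     "year": 2004, "source": "WoS"},
    {"title": "Fusion in Retrieval", "authors": ["Lee, A."],
     "year": 2001, "source": "GS"},
]
unique = merge_citations(records)
print(len(unique))
```

Exact-key matching of this kind misses duplicates whose titles differ by more than case and punctuation; a real system would likely need fuzzier comparison.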
Multi-faceted Citation Analysis
• To produce multi-faceted quality/impact assessment measures that
  - account for variance in citation quality (e.g., Weighted citation counts, CiteRank)
  - consider various facets of evaluation metric (e.g., Document type, language)
  - accommodate different aspects of quality assessment (e.g., H-Index, Mentor-Index)
1. Compute citation-based quality scores (CQS) for each publication
2. Compute CQS for authors, schools, publishers using publication CQS
3. Compute CQS for each publication weighted by author/school/publisher scores
4. Compute CQS for authors, schools, publishers using weighted publication CQS
5. Repeat steps 3 and 4 until convergence
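
The five steps above describe a fixed-point iteration, but the slides do not give the scoring formulas. The sketch below is therefore only one possible reading, restricted to authors. It assumes that a publication's base CQS is its raw citation count, that an author's CQS is the mean of their publications' scores, and that the weighting in step 3 blends the base score with the authors' scores scaled by the top author. The `damping` parameter and all function names are hypothetical.

```python
from statistics import mean


def author_scores(pub_cqs, authorship):
    """Steps 2 and 4 (assumed): an author's CQS is the mean CQS of their publications."""
    by_author = {}
    for pub, authors in authorship.items():
        for author in authors:
            by_author.setdefault(author, []).append(pub_cqs[pub])
    return {author: mean(scores) for author, scores in by_author.items()}


def iterate_cqs(citations, authorship, damping=0.5, tol=1e-9, max_iter=100):
    """Alternate steps 3 and 4 until author scores stop changing (step 5)."""
    pub_cqs = {pub: float(count) for pub, count in citations.items()}  # step 1
    author_cqs = author_scores(pub_cqs, authorship)                    # step 2
    for _ in range(max_iter):
        top = max(author_cqs.values()) or 1.0  # avoid dividing by zero
        # Step 3 (assumed form): scale each base score by its authors' standing.
        new_pub = {
            pub: base * ((1 - damping)
                         + damping * mean(author_cqs[a] / top
                                          for a in authorship[pub]))
            for pub, base in citations.items()
        }
        new_author = author_scores(new_pub, authorship)                # step 4
        delta = max(abs(new_author[a] - author_cqs[a]) for a in author_cqs)
        pub_cqs, author_cqs = new_pub, new_author
        if delta < tol:
            break
    return pub_cqs, author_cqs


# Invented data: three publications, two authors.
citations = {"p1": 10, "p2": 10, "p3": 2}
authorship = {"p1": ["A"], "p2": ["B"], "p3": ["B"]}
pub_cqs, author_cqs = iterate_cqs(citations, authorship)
print(pub_cqs, author_cqs)
```

In this toy run, p1 and p2 start with the same citation count, but p2 ends with a lower score because its author's other publication is weakly cited. Extending the same loop to schools and publishers would need additional mappings that the slides do not specify.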

CiteSearch Study: Citation Databases
Web of Science
• 3 Institute for Scientific Information (ISI) databases
• Standard tool for citation studies worldwide
• 35 million records from 9,000 publishers
Scopus
• Produced by Elsevier
• 27 million records from 15,000 publishers
Google Scholar
• 500 million records
  - UBC (http://weblogs.elearning.ubc.ca/googlescholar/archives/025964.html)
• Unknowns
  - Coverage (subject, publisher, time-span)
  - Document type and refereed status of records

Google Scholar Citations by Year

Sources of Unique Citations
CiteSearch Study: GS + Scopus + WoS
[Venn diagram] Google Scholar: 4,203; Scopus: 2,308; WoS: 2,025; GS ∪ Scopus ∪ WoS: 5,307
Region shares: 4.3% (230); 18.3% (970); 48.3% (2,561); 11.7% (617); 8.2% (435); 3.8% (204); 5.3% (282)