SAINT Toolkit for Applied Scientometrics
Edwin Horlings
August 2012
Structure
• Applied scientometrics
• The SAINT Toolkit
• Examples of applications
• Collaboration
Edwin Horlings | 2 / 28 | Patenting in the Netherlands 1945-2011
APPLIED SCIENTOMETRICS
Applied scientometrics
• We are living in the age of Big Data
− continuous increase in the amounts of data on science, technology and innovation
− Web of Science c. 45 million scientific publications, PATSTAT c. 67 million patent applications, with detailed information on every record
− enormous expansion of web data (e.g. Twitter, blogs)
− we now have the computer power to mine and analyse those data
• Increasing call for evidence-based policy
− support policy and politics with reliable information
− applied scientometrics can help understand the effects of policy
Applied scientometrics
Applying advanced quantitative methods to large heterogeneous datasets in order to extract patterns that show the structure and development of science, technology and innovation
What sort of patterns do we look for?
• Networks
− co-authors of scientific papers
− inventors and assignees of patents
− members of the same associations
− researchers working on or talking about the same topics
• Clustering of similar items
− publications about the same topic or in the same specialisation
− similar patents by different organisations
− clusters in collaboration networks
• Statistical analysis of patterns (e.g. Social Network Analysis)
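The first pattern type above, a co-author network, can be illustrated with a minimal sketch. It assumes each publication is already available as a list of author names; the records, the names in them, and the function names `coauthor_edges` and `degree` are hypothetical and are not part of the SAINT Toolkit.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each publication is a list of author names.
publications = [
    ["Zhang, Y", "Li, H", "Jansen, P"],
    ["Zhang, Y", "Li, H"],
    ["Jansen, P", "De Vries, A"],
]


def coauthor_edges(pubs):
    """Count how often each unordered pair of authors shares a paper."""
    edges = Counter()
    for authors in pubs:
        # sorted(set(...)) drops duplicate names and gives every pair
        # one canonical order, so (a, b) and (b, a) are counted together.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct co-authors per author."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {author: len(others) for author, others in neighbours.items()}


edges = coauthor_edges(publications)
degrees = degree(edges)
```

The edge weights (number of joint papers) and the degrees are the kind of raw material on which clustering and Social Network Analysis statistics would then be computed.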
SAINT TOOLKIT
SAINT Toolkit
• the Science Assessment Integrated Network Toolkit
• Main components at the moment
− ISI Parser: converts raw Web of Science data into a relational database
− Word Splitter: cuts full text into words, eliminating stop words and shortening words to their stem using different algorithms
− Network Tools: identifies clusters in networks using one of the best clustering algorithms (Blondel et al. 2008)
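The three steps attributed to the Word Splitter (cutting text into words, eliminating stop words, reducing words to a stem) can be sketched as follows. This is only an illustration under stated assumptions: the stop-word list is a tiny made-up sample, and the suffix stripping in `naive_stem` is a deliberately crude stand-in for the real stemming algorithms the toolkit is said to offer. None of the names below come from the toolkit itself.

```python
import re

# Assumption: a tiny illustrative stop-word list, not the toolkit's own.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

# Assumption: naive suffix stripping, standing in for a real stemmer.
# Longer suffixes come first so that they are tried before shorter ones.
SUFFIXES = ("ations", "ation", "ings", "ing", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def split_words(text):
    """Lower-case the text, split it into words, drop stop words, stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(token) for token in tokens if token not in STOP_WORDS]


stems = split_words("Clustering of networks in the graphene publications")
```

A crude stemmer like this over-stems some words ("publications" becomes "public"), which is one reason to offer several stemming algorithms and to move from words to extracted terms.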
SAINT Toolkit
• Under development
− Integrating all tools into a Work Flow Manager
− Word splitting using Natural Language Processing to extract terms rather than words
− Adding alternative clustering algorithms for network analysis
− Improving the Parser for new data sources, such as Scopus and online data
− Developing tools for disambiguation of authors and addresses
APPLICATIONS
Author disambiguation
• Thousands of researchers with identical names (e.g. Y. Zhang): how to tell the difference?
• Important for evaluation and for research
• Developed an algorithm with 95-100% accuracy
• Now developing a software tool with University of Paris Est (ESIEE)
Portfolio of individual researchers
Edwin Horlings & Thomas Gurney | 12 / 28 | Search strategies along the academic lifecycle
• How do individual researchers develop their scientific portfolio?
− different stages in their career
− different problem areas, often simultaneous
− coherence of their portfolio
− author position
• Developed a scientometric method to visualise and statistically analyse
Ronald Plasterk, former Minister of Education, Culture and Science
Barend van der Meulen | 13 / 28 | SAINT Toolkit for applied scientometrics
Edwin Horlings | 14 / 22 | Science policy for the bio-economy
Bio-energy worldwide: 8,414 publications, 2008-2009
[Figure legend: primary strength of China; secondary strength of China; primary strength of Netherlands; secondary strength of Netherlands]
Advantage of having a large facility
• Does a large-scale facility provide home advantage to local research groups
− accumulating reputation
− opening up new avenues of research
− developing social networks
− producing scientific
• Examine for high-field magnet laboratories, such as in Hefei and in Nijmegen
Collaboration networks in graphene
Collaboration between institutes in graphene research worldwide, 1990-2011 (17,968 publications)
[Figure: collaboration network; each node is an institution (e.g. university)]
Collaboration networks in graphene
Clusters of institutes that collaborate more with each other than with other institutes
[Figure: collaboration network with clusters marked]
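The idea of institutes that "collaborate more with each other than with other institutes" is commonly quantified with modularity, the quantity that the Blondel et al. (2008) method searches to maximise. The sketch below does not implement that algorithm; it only scores a given partition of a small unweighted network, using the standard formula Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside cluster c, and d_c the summed degree of its nodes. The toy network and all names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each edge listed once, no self-loops.
    community: dict mapping every node to its cluster label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same cluster
    degree_sum = defaultdict(int)  # summed node degree per cluster
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy network: two triangles of institutes joined by one link.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]

two_clusters = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
one_cluster = {node: 1 for node in "ABCDEF"}

q_split = modularity(edges, two_clusters)
q_single = modularity(edges, one_cluster)
```

Splitting the toy network into its two triangles scores higher than treating it as a single cluster, which is the sense in which a clustering algorithm would prefer that partition.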
Collaboration networks in graphene
Networks of scientific collaboration in graphene are highly regionally clustered
[Figure: collaboration network with regional clusters labelled EU, North America, and South East Asia]
Collaboration networks in graphene
All Chinese institutions in the network highlighted in black
Collaboration in graphene research in China
How Dutch universities work on scientific topics
Celiac Disease Consortium, 2000-2003
[Figure: network with two node types, topic and university]
How Dutch universities work on scientific topics
• Denser network
• More institutions involved
• More coherent: more universities work on the same small set of topics
Celiac Disease Consortium, 2007-2010