OSSLM-2016
Open Source Software (OSS) for Bibliometrics and Scientometrics
WORKSHOP MANUAL
Prepared by
Dr. S. K. Jalal
Deputy Librarian, Central Library, IIT Kharagpur
National Workshop on Open Source Software for Library
Management
(OSSLM 2016) – June 13-18, 2016
Jointly Organized by
Central Library
Indian Institute of Technology Kharagpur, Kharagpur – 721302, West Bengal, India
&
National Digital Library (NDL) NMEICT Project, MHRD, Govt. of India
Abstract
This manual covers three important open source software tools: Publish or Perish
(POP), SciMAT and Bibexcel. These tools help in analysing the publications of an
author, an institute or a specific subject.
1. Introduction
Many open source software tools are available on the Web for bibliometric and
scientometric analysis. Bibliometrics mainly deals with the application of statistical and
mathematical methods to printed documents, whereas Scientometrics applies these
techniques specifically to scientific documents and to the analysis and modeling of science.
A scientific document usually means an article, letter, review or proceedings paper.
Citation analysis is one of the most popular methods in Bibliometrics. It is the examination,
visualization and representation of the frequency patterns of citations to published
documents. Authorship pattern, publication analysis, bibliographic coupling and co-citation
analysis are some of the important measures in citation analysis.
Some common features of Scientometric software are as follows:
Preparation of a list of authors;
Preparation of a list of titles;
Preparation of a list of Journals;
Finding out authorship pattern;
Mapping and visualization of a discipline (e.g. Physics, Chemistry etc.);
Analyzing data imported from other data sources (e.g. Scopus, WoS);
Metrics-based evaluation (h-index, number of citations);
Creation of maps and generation of networks.
2. Open Source software for Bibliometrics Analysis
Some open source software predominantly used in Bibliometric and Scientometric analysis
are given below:
S.N.  Name of OSS               Available at
1.    Publish or Perish (POP)   http://www.harzing.com/resources/publish-or-perish
2.    SciMAT                    http://sci2s.ugr.es/scimat/
3.    VOSViewer                 http://www.vosviewer.com/
4.    BibExcel                  http://homepage.univie.ac.at/juan.gorraiz/bibexcel/
5.    CiteSpace II              http://cluster.cis.drexel.edu/~cchen/citespace/
6.    Gephi                     https://gephi.org/
7.    HistCite                  http://histcite.software.informer.com/12.3/
8.    NodeXL                    http://nodexl.codeplex.com/
9.    Pajek                     http://vlado.fmf.uni-lj.si/pub/networks/pajek/
10.   Ucinet                    https://sites.google.com/site/ucinetsoftware/home
Of these, only Publish or Perish, SciMAT and BibExcel are discussed here.
3. POP – Publish or Perish
3.1 Introduction
Publish or Perish (version 4.25.1) is a software program that retrieves and analyses
academic citations. It uses Google Scholar and Microsoft Academic Search to
obtain the raw citation data, analyses them and presents a number of metrics.
3.2 Important Metrics
Total number of papers and total number of citations: POP reports the number of
papers and the total number of citations for a set of publications.
Average citations per paper, citations per author, papers per author, and
citations per year: these averages are calculated from the retrieved data.
Hirsch's h-index and related parameters: A scientist has index h if h of his/her Np
papers have at least h citations each, and the other (Np-h) papers have no more
than h citations each.
Egghe's g-index: Given a set of articles ranked in decreasing order of the number of
citations that they received, the g-index is the (unique) largest number such that
the top g articles received (together) at least g² citations.
The contemporary h-index: It adds an age-related weighting to each cited
article, giving less weight to older articles. This means that for an article
published during the current year, its citations count four times. For an article
published 4 years ago, its citations count only once (4/4). For an article published
6 years ago, its citations count 4/6 times, and so on.
The individual h-index (hI): It divides the standard h-index by the average
number of authors of the articles that contribute to the h-index, in order to reduce
the effects of co-authorship; the resulting index is called hI.
The age-weighted citation rate (AWCR) and AW-index: The AWCR is an age-weighted
citation rate, where the number of citations to a given paper is divided by the age
of that paper. The AW-index is defined as the square root of the AWCR. Jin
defines the AR-index as the square root of the sum of all age-weighted citation
counts over all papers that contribute to the h-index.
The results are displayed on screen and can be copied to Microsoft Word or Excel.
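The definitions above can be checked on a small data set. The following Python sketch is only an illustration of those definitions, not the code used by Publish or Perish; the function names and the sample citation counts are hypothetical, the g-index is assumed to be capped at the number of papers, and the age of a paper is assumed to be at least one year.

```python
from math import sqrt


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations.

    Assumption: g is capped at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def individual_h_index(papers):
    """hI: h-index divided by the mean number of authors of the h-core papers.

    papers is a list of (citations, number_of_authors) tuples.
    """
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    h = h_index([cites for cites, _ in ranked])
    if h == 0:
        return 0.0
    mean_authors = sum(authors for _, authors in ranked[:h]) / h
    return h / mean_authors


def awcr(papers_with_age):
    """Age-weighted citation rate: sum of citations / age over all papers.

    papers_with_age is a list of (citations, age_in_years) tuples, age >= 1.
    """
    return sum(cites / age for cites, age in papers_with_age)


def aw_index(papers_with_age):
    """AW-index: square root of the AWCR."""
    return sqrt(awcr(papers_with_age))


# Hypothetical citation counts for five papers.
sample_citations = [10, 8, 5, 4, 3]
print(h_index(sample_citations), g_index(sample_citations))
```

For real data, the citation counts would come from the list of results that POP shows on screen.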
3.3 Installation
3.3.1 Download and Install POP (Windows)
The Publish or Perish software is a Microsoft Windows application that can also be installed
on Linux computers with the aid of a suitable emulator such as Wine.
1. Download the Publish or Perish software installer from Harzing.com:
http://www.harzing.com/download/PoPSetup.exe
Publish or Perish installer for Windows (949 KB), version 4.25.1 (17 January 2016).
2. Start the PoPSetup.exe installer by double-clicking on the file that you just
downloaded.
3. On most systems, a security warning dialog box will appear. Click Run, Continue
or Yes after you have verified that the publisher's name is Tarma Software Research Ltd.
4. The installer will now start. Follow the instructions on the screen to confirm your
acceptance of the license agreement and to install the Publish or Perish software on
your computer.
3.3.2 Download and Install POP (LINUX)
Wine is an Open Source implementation of the Windows API on top of UNIX. Wine
provides the programs and libraries that allow you to run many Windows applications
(including Publish or Perish) unchanged on your Linux system and other supported Unix-
like operating systems.
3.3.3 Installing Publish or Perish using Wine
Once you have Wine installed, you can install Publish or Perish using its normal Windows
installer according to the following procedure.
a) Download the Publish or Perish software installer from the Harzing.com web site:
Publish or Perish installer for Windows (949 KB), version 4.25.1 (17 January 2016).
b) Open a File Browser window and go to the PoPSetup.exe file that you just
downloaded. Right-click on the file and choose Properties from the popup menu.
c) In the PoPSetup.exe Properties window that appears, check the "Allow executing file
as program" box.
d) Click Close to dismiss the Properties window, then double-click the PoPSetup.exe file
to start the Publish or Perish installer.
e) Follow the instructions on the screen to install Publish or Perish.
After successful installation, the main POP interface appears when the program is started.
3.4 Search Tips
1. Always use "quotes" around the author’s name, e.g. "A Harzing"
2. PoP is not case sensitive: "A HARZING" gives the same result as "a harzing".
3. The order of search terms does not matter. "A Harzing" will give the same result as
"Harzing A".
4. Use an author’s initials rather than their full given name as not all journals publish
author names in full.
5. If an author has consistently published with only one initial, you can exclude
namesakes using 2nd and 3rd initials by using wildcards in the "exclude these
names" field, e.g. when searching for "G Sewell", you can exclude "G* Sewell"
"G** Sewell".
6. If an author has published under two different names (e.g. maiden name and married
name) use OR between search terms for a combined search.
7. If an author has mostly published with two initials, but has incidental publications
with one initial, a combined search with initials and full given name (e.g. "CT Kulik"
OR "Carol Kulik") will usually capture all of their publications.
8. Lookup: Looks up the current query, using the internal cache if possible. This means
that if you have run the query before, the results will come from the cache and the
search is not submitted to Google Scholar again.
9. Lookup Directly: Sends the query directly to Google Scholar, without using the
cache, and shows the results on screen.
3.5 Internal Search
3.5.1 Author Search
One can look up an author's publications indexed in Google Scholar, along with their
citations, using a query such as “Mike Thelwall”.
3.5.2 Journal Search
One can look up publications in a particular journal indexed in Google Scholar, along
with their citations.
3.5.3 General Citation Search
The General Citation Search makes it possible to combine several search criteria in one
query. For example, one can search for articles on nanoparticles from the “Journal of
Nanotechnology and Nanoscience” published during 2013-2015.
3.6 External Data
The main attraction of the software is that you can work with external data.
3.6.1 Scopus
The data can be exported (downloaded) from Scopus in either .csv or .ris format, using
the “Citation information only” export setting. The *.csv format is advisable because,
with the *.ris format, the citation information sometimes failed to import.
Click on the drop-down field next to export. You will get a pop-up menu. On that menu,
select CSV export and save the file with a meaningful name. Do not change anything under
"Choose the information to be exported". Doing so will make the file unreadable for
Publish or Perish. This provides you with a *.csv file that you can import into Publish or
Perish, simply by clicking on the New Import icon [See multi-query center] or by clicking
File/Import.
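Before importing, it can be useful to check that the exported file really contains citation counts. The sketch below reads a Scopus-style CSV with Python's standard csv module. The column names ("Authors", "Title", "Year", "Cited by") are an assumption about the export layout, and the two sample rows are hypothetical, so the header of your own file should be checked first.

```python
import csv
import io


def load_scopus_csv(handle):
    """Read a Scopus-style CSV export into a list of dictionaries.

    Assumed column names: "Authors", "Title", "Year", "Cited by".
    An empty "Cited by" cell is treated as zero citations.
    """
    papers = []
    for row in csv.DictReader(handle):
        cited = (row.get("Cited by") or "").strip()
        papers.append({
            "authors": row.get("Authors", ""),
            "title": row.get("Title", ""),
            "year": row.get("Year", ""),
            "citations": int(cited) if cited.isdigit() else 0,
        })
    return papers


# Hypothetical two-row export used only to demonstrate the function.
sample = io.StringIO(
    "Authors,Title,Year,Source title,Cited by\n"
    '"Author A., Author B.",Hypothetical title one,2015,Hypothetical Journal,12\n'
    '"Author C.",Hypothetical title two,2015,Hypothetical Journal,\n'
)
papers = load_scopus_csv(sample)
total_citations = sum(paper["citations"] for paper in papers)
print(len(papers), "papers,", total_citations, "citations")
```

For a file on disk, the same function can be called on open(path, newline="", encoding="utf-8-sig"), where path is the name you gave the exported file.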
3.6.2 Web of Science
The data can also be downloaded from the Web of Science database. The data should be
saved in plain text format.
You will then see the following screenshot. Click on OK and Publish or Perish will import
the Scopus data into the multi-query center. The results will appear in the folder that you are
in when you import the data.
The result is a neat list of publications that can then be sorted in any way you want. Statistics
and results can be exported for further analyses just like the results of Google Scholar
searches.
3.7 Export Results
Results can be exported in CSV format via File>Save as CSV.
Copy>Statistics with Excel header: the detailed statistics will be copied.
Copy>Results with Excel header: the detailed results will be copied.
3.8 Results/Descriptive Statistics
a) Result of the authorship analysis of the IIT Kharagpur data for 2015. The data
were downloaded from Scopus and imported into POP for analysis.
60 paper(s) with 1 author(s)
530 paper(s) with 2 author(s)
495 paper(s) with 3 author(s)
330 paper(s) with 4 author(s)
170 paper(s) with 5 author(s)
96 paper(s) with 6 author(s)
52 paper(s) with 7 author(s)
44 paper(s) with 8 author(s)
16 paper(s) with 9 author(s)
9 paper(s) with 10 author(s)
12 paper(s) with 11 author(s)
1 paper(s) with 12 author(s)
4 paper(s) with 13 author(s)
1 paper(s) with 15 author(s)
2 paper(s) with 16 author(s)
1 paper(s) with 17 author(s)
2 paper(s) with 23 author(s)
1 paper(s) with 33 author(s)
1 paper(s) with 48 author(s)
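The authorship list above can be summarised further outside POP. The following Python sketch parses the lines exactly as they are printed above and derives the total number of papers, the mean number of authors per paper and the share of multi-authored papers. The variable names are illustrative only.

```python
import re

# Authorship distribution as copied from the POP output above.
POP_OUTPUT = """\
60 paper(s) with 1 author(s)
530 paper(s) with 2 author(s)
495 paper(s) with 3 author(s)
330 paper(s) with 4 author(s)
170 paper(s) with 5 author(s)
96 paper(s) with 6 author(s)
52 paper(s) with 7 author(s)
44 paper(s) with 8 author(s)
16 paper(s) with 9 author(s)
9 paper(s) with 10 author(s)
12 paper(s) with 11 author(s)
1 paper(s) with 12 author(s)
4 paper(s) with 13 author(s)
1 paper(s) with 15 author(s)
2 paper(s) with 16 author(s)
1 paper(s) with 17 author(s)
2 paper(s) with 23 author(s)
1 paper(s) with 33 author(s)
1 paper(s) with 48 author(s)
"""

LINE_RE = re.compile(r"(\d+) paper\(s\) with (\d+) author\(s\)")


def parse_distribution(text):
    """Return {number_of_authors: number_of_papers} from the POP text."""
    return {int(authors): int(papers) for papers, authors in LINE_RE.findall(text)}


distribution = parse_distribution(POP_OUTPUT)
total_papers = sum(distribution.values())
total_authorships = sum(authors * papers for authors, papers in distribution.items())
mean_authors_per_paper = total_authorships / total_papers
# Share of papers written by more than one author.
collaborative_share = 1 - distribution.get(1, 0) / total_papers

print(f"Total papers: {total_papers}")
print(f"Mean authors per paper: {mean_authors_per_paper:.2f}")
print(f"Multi-authored share: {collaborative_share:.1%}")
```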
3.9 Limitations of Publish or Perish (POP)
POP has some limitations, mainly in the file formats and the number of results it supports:
a) Publish or Perish supports all file formats except BibTeX.
b) Google Scholar: CSV, EndNote and RIS.
c) Scopus: comma-separated file (CSV) and RIS.
d) Web of Science: the supported formats are Tab-delimited (Win), Tab-delimited
(Win, UTF-8) and Plain Text.
e) Web of Science: Mac-based formats are not supported.
f) With Google Scholar, POP supports only 1000 papers.
g) With Microsoft Academic Search, POP supports 1888 papers.
Note: If you are using the Publish or Perish software in one of your research articles or
otherwise want to refer to it, please use the following format: Harzing, A.W.
(2007) Publish or Perish, available from http://www.harzing.com/pop.htm
4. SciMAT: Science Mapping Analysis Tool
4.1 Why SciMAT?
SciMAT (Science Mapping Analysis Tool) is an open source software written in the Java
programming language and compatible with Linux, Windows and other operating systems.
SciMAT generates a knowledge base with key parameters such as author, title, publisher,
keywords, journal name and references. The knowledge base has sixteen entities, such as
author, document and affiliation.
SciMAT has three modules:
1. A module dedicated to the management of the knowledge base;
2. A module responsible for carrying out the science mapping analysis;
3. A module for visualization of the generated results and maps.
4.2 SciMAT-1.1.03: Installation
Step-1: Download the software from http://sci2s.ugr.es/scimat/download.html
Step-2: Install Java version 6 or later [provided on a CD].
Step-3: To run SciMAT v1.1.03, unpack the zip file and execute the SciMAT jar file.
Step-4: The user guide can be downloaded from:
http://sci2s.ugr.es/scimat/software/v1.01/SciMAT-v1.0-userGuide.pdf
Step-5: Install an SQLite viewer [free to download; this step is optional].
4.3 Main Components and sub-components in SciMAT
Fig-1: Main window of SciMAT
4.4 SciMAT has the following components:
4.4.1 File
New Project: You need to create a project for each data set and specify the path where
the new project will be saved. The first time, there is no need to open the
project because it is already active.
Open Project: A project created earlier must be opened before further
processing and analysis.
Close Project: Once the work is over, you need to close the project.
Add files: Files can be added for analysis. It is better to download the data from
Scopus in *.ris format and add them here.
Export > Groups
The Export option works after the group work is completed. The exported file is an
XML file and can be opened in a text editor.
Import> Groups
Exit: Exit from the project
4.4.2 Edit
Global Replace
Undo
Redo
4.4.3 Knowledgebase and its Managers
Knowledge Base
SciMAT generates a knowledge base from a set of scientific documents, where the relations
of the different entities related with each document (authors, keywords, journal, references,
etc.) are stored. The knowledge base is composed of sixteen entities.
Knowledge Base Manager
The module to manage the knowledge base is responsible for building it, importing the data
from different bibliographical sources, and cleaning and fixing the possible errors in the
entities.
Step-1: The first step in this module is to build a new project or load an existing one. This
can be done through the File menu or using the buttons of the toolbar. If a new project is
selected, a new window will appear asking for the path where the knowledge base file will
be stored and the name of the file. Any extension can be given to the file.
Step-2: Once you create a new project or open an existing one, the New Project and Open
Project options become inactive, and the Add files, Knowledge Base Manager, Group
Manager, Export and Import options are activated.
Step-3: The Add files option allows the user to add bibliographical information, exported
from bibliographical databases, to the knowledge base. In particular, SciMAT is able to read
bibliographical information exported in ISI Web of Knowledge format (ISI-CE) or RIS
(Scopus) format. To add Scopus files, choose Add Files> In RIS (May 2004
format). SciMAT then asks whether you want to import the data with references; doing so
slows down the import. You can answer Yes.
Step-4: The manager also allows us to add a new Document manually (filling in each
attribute) by clicking the Add button.
Supported File Format
Scopus>RIS format
Web of Science>Text format
4.4.3.1 Author
Author>Author Manager: Gives the name of each author along with the number of documents
Author>Author Affiliation Manager: Gives department-wise output, i.e. the
author's affiliation
Author>Author Group Manager: It will be activated after the group work
4.4.3.2 Documents
Document>Document Manager: Gives Article, author, year, citation
4.4.3.3 Journal
Journal>Journal Manager: Gives source along with no. of documents
4.4.3.4 References: A reference represents the intellectual base of the document. A
document has a set of references associated with it, and each reference can be
associated with different documents. References may be author-references
or source-references.
Reference Manager: Gives the cited documents and the citing documents
4.4.3.5 Periods
Period>Periods Manager: Gives group of documents
Period>Period Manager>Add>2000-2005 >click add
Period will be added> go to right panel>Add> Add document
4.4.3.6 Publish Dates
Publish date>Publish date Manager: It provides year-wise record of documents
published under the query.
4.4.3.7 Subject Categories
A document can have one journal (or conference) and one publish date associated with it,
whereas each journal or publish date can have a set of documents associated with it.
These entities can also have an associated subject category, which represents a
global category. A journal can be associated with many subject categories.
Subject Category Manager> Click on Add on left panel> type the name of subject
category. More than one subject category can be added
4.4.3.8 Words
Words Manager: Gives the keywords with the number of documents. The entity Word
represents a descriptive term of a document; for example, ISI WoS adds a set of keywords
called ISI Keywords Plus to each document. A set of Words can appear in
different Documents and each Document can have a set of Words.
Words may be provided by the authors (author's words), provided by the database (source's
words), or added in the pre-processing step (extracted words).
…………………………………………………………………………………………..
Working with Group
Working under Group- ‘Move To’
The move or join capability allows us to join a set of entities under another one. It is
especially useful when we are working with groups. Once we have selected the set of entities
that we want to join, clicking on ‘Move to’ opens a new dialog. In this dialog box, one has to
select the record under which the remaining documents (26 in this example) will be joined.
These documents keep an association with the master entry.
Step-1: Go to Document Manager.
Step-2: Filter with a keyword such as ‘nanostructure’ and click on Filter. The result
shows that only six of the 1298 documents match.
Step-3: As soon as you select these six documents, the ‘Move to’ button is activated.
Step-4: Click on the ‘Move to’ button; a new window will open.
Step-5: Select one main entry among the six documents, under which the
remaining five documents will be joined. The main target entry maintains its association
with the other entries.
Working with Group Set
Like the entity managers, the manual group set managers share a common structure: the
left side shows a list of the defined groups, and the right side shows the entities associated
with the selected group (header-table) and the entities without groups (foot-table). The
manual group set manager allows us to add a new group, delete a set of groups, join a set of
groups under another one, and edit them.
4.4.4 Group Set
A group is a set of items that represent the same entity (e.g. E. Garfield and
Eugene Garfield).
A group can be marked as a stop group, in which case it does not take part in the
science mapping analysis.
4.4.5 Author
Author Group Manual Set> Here one can create a new author group by
clicking the Add button. Then select the author group in the left panel and
add authors of your choice from the right panel, using the up and
down arrows in the middle.
Find Similar Authors by Distances
4.4.6 Author-References
Author-References Group Manual Set
Find Similar Author Reference by Distances
4.4.7 References
Reference Group Manual Set
Find Similar References by Distances
4.4.8 Source References
Source References Manual Set
Find Similar Source-References by Distances
4.4.9 Words
Word Group Manual Set
Find similar words by plurals
Find similar words by distance
…………………………………………………………………………………….
4.5 Statistics [based on Group Set]
Author Groups Statistics
References Groups Statistics
Words Groups Statistics> provides period wise documents with mean, median,
standardization and variance.
4.6 Analysis [based on Group Set]
Make Analysis
Load Analysis
4.6.1 Science mapping analysis wizard
Step-1: In the first step the user has to select the periods that he/she wants to analyze. Each
period will produce a map. These periods will be used in the longitudinal or temporal
analysis in order to study the structural evolution of the field.
Step-2: The second step is the selection of the unit of analysis. As the unit of analysis the
user can select any of the five groups existing in the knowledge base: Author Group,
Author-Reference Group, Source-Reference Group, Reference Group, or Word Group. Only
one of them can be selected. If the Word Group has been selected, the role of the word with
which the user wants to perform the analysis has to be chosen.
Step-3: The third step is the data reduction. SciMAT allows the data to be filtered using a
minimum frequency threshold. For each selected period, a threshold must also be selected.
That is, only the items that appear in at least n documents in a given period will be taken
into account.
Step-4: The fourth step is the selection of the way in which the network will be built:
co-occurrence or coupling. Using co-occurrence, co-author, co-word, co-citation (using the
references), author co-citation (using the author-references), and journal co-citation (using
the source-references) networks can be built. The available options are:
a) Co-occurrence
b) Basic Coupling
c) Aggregated Coupling Based on Author
d) Aggregated Coupling Based on Journal
Step-5: The fifth step is the network reduction. SciMAT allows the network to be filtered
using a minimum edge value threshold. For each selected period, a threshold value must be
set. That is, only the edges with a value greater than or equal to n in a given period will be
taken into account.
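The data reduction, network building and network reduction steps can be illustrated with a small Python sketch. It is a simplified illustration of the idea, not SciMAT's own code; the function name, the parameter names (min_frequency, min_edge) and the sample keyword sets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(documents, min_frequency=1, min_edge=1):
    """Build a co-occurrence network from documents given as sets of items.

    Items appearing in fewer than min_frequency documents are dropped
    (data reduction); edges with a co-occurrence count below min_edge
    are dropped afterwards (network reduction).
    """
    # Number of documents in which each item appears.
    frequency = Counter(item for doc in documents for item in set(doc))
    kept = {item for item, count in frequency.items() if count >= min_frequency}

    # Count how often each pair of kept items appears in the same document.
    edges = Counter()
    for doc in documents:
        items = sorted(set(doc) & kept)
        for first, second in combinations(items, 2):
            edges[(first, second)] += 1

    filtered_edges = {pair: weight for pair, weight in edges.items() if weight >= min_edge}
    nodes = {item: frequency[item] for item in kept}
    return nodes, filtered_edges


# Hypothetical keyword sets of four documents in one period.
documents = [
    {"nanoparticles", "plasma", "membrane"},
    {"nanoparticles", "plasma"},
    {"nanoparticles", "membrane"},
    {"citation analysis"},
]
nodes, edges = build_cooccurrence(documents, min_frequency=2, min_edge=2)
print(nodes)
print(edges)
```

The same function works for co-author networks if each document is given as the set of its authors instead of its keywords.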
Step-6: The sixth step is the selection of the similarity measure used to normalize the
network. SciMAT allows the user to choose among the similarity measures commonly used in
the literature to normalize networks: Association Strength, Equivalence Index, Inclusion
Index, Jaccard’s Index and Salton’s Cosine.
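The five measures can be written down compactly. The sketch below uses the definitions commonly given in the science mapping literature, where c_ij is the number of documents in which items i and j appear together and c_i, c_j are the numbers of documents in which each item appears. It is an assumption that SciMAT uses exactly these forms, so the results should be treated as illustrative.

```python
from math import sqrt


def association_strength(c_ij, c_i, c_j):
    """Co-occurrence divided by the product of the two frequencies."""
    return c_ij / (c_i * c_j)


def equivalence_index(c_ij, c_i, c_j):
    """Squared co-occurrence divided by the product of the two frequencies."""
    return c_ij ** 2 / (c_i * c_j)


def inclusion_index(c_ij, c_i, c_j):
    """Co-occurrence divided by the smaller of the two frequencies."""
    return c_ij / min(c_i, c_j)


def jaccard_index(c_ij, c_i, c_j):
    """Co-occurrence divided by the number of documents containing either item."""
    return c_ij / (c_i + c_j - c_ij)


def salton_cosine(c_ij, c_i, c_j):
    """Co-occurrence divided by the geometric mean of the two frequencies."""
    return c_ij / sqrt(c_i * c_j)


# Hypothetical counts: two items appear in 3 and 2 documents, and together in 2.
for measure in (association_strength, equivalence_index, inclusion_index,
                jaccard_index, salton_cosine):
    print(measure.__name__, round(measure(2, 3, 2), 3))
```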
Step-7: The seventh step is the selection of the clustering algorithm used to get the map and
its associated clusters or subnetworks.
Clustering Analysis
Cluster analysis (clustering) is a collection of statistical methods for grouping a set of
objects in such a way that objects in the same group (called a cluster) are more similar, by
some similarity criterion, to each other than to those in other groups. The objective is to
find an optimum tree or set of clusters. Common approaches include the hierarchical
clustering method and the k-means method.
The agglomerative algorithm for hierarchical clustering starts by placing each of the
objects in the data set in an individual cluster and then gradually merges those individual
clusters.
The divisive algorithm, however, starts with the whole data set as a single cluster and then
breaks it down into smaller clusters. Single Link and Complete Link are two hierarchical
agglomerative clustering procedures:
– Single Link Clustering Algorithm
– Complete Link Clustering Algorithm
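The difference between the two procedures lies only in how the distance between two clusters is measured: Single Link uses the closest pair of members, Complete Link the farthest pair. The following Python sketch of agglomerative clustering makes this explicit. It is a minimal illustration working on a distance matrix, not the algorithm as implemented in SciMAT, and the sample positions are hypothetical.

```python
def agglomerative(dist, n_clusters, linkage="single"):
    """Merge clusters step by step until n_clusters remain.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    linkage="single" uses the minimum distance between members of two
    clusters; linkage="complete" uses the maximum distance.
    """
    pick = min if linkage == "single" else max
    n_clusters = max(1, n_clusters)
    # Start with every object in its own cluster.
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = pick(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        # Merge the two closest clusters.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical objects placed on a line; distance = absolute difference.
positions = [0.0, 1.0, 2.5, 6.0, 7.0]
dist = [[abs(p - q) for q in positions] for p in positions]
print(agglomerative(dist, 2, linkage="single"))
print(agglomerative(dist, 2, linkage="complete"))
```

In a science mapping context, a distance can be derived from a similarity measure, for example as one minus the normalized co-occurrence value.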
Step-8: The eighth step is the selection of the documents mapper used in the performance
analysis. SciMAT incorporates five different document mappers for co-occurrence
networks:
Step-9: The ninth step is the selection of the performance and quality bibliometric
measures. SciMAT adds by default the number of documents as performance measure.
Moreover, the citations of a set of documents are used in order to assess the quality and
impact of the clusters. In this sense, basic measures such as the sum, minimum, maximum
and average citations, or complex measures such as the h-index, g-index, hg-index or q2-
index can be selected.
Step-10: The tenth step is the selection of the similarity measure used to build the evolution
map and the overlapping map. SciMAT allows us to choose between: Association Strength,
Equivalence Index, Inclusion Index, Jaccard’s Index and Salton’s Cosine.
Step-11: Finally, the eleventh step performs the science mapping analysis.
This process can be cancelled at any time. At the end, the analysis has to be saved (a
save window opens when the process ends), and then the results are shown in the
visualization module.
Results
BibExcel Manual
Introduction
Bibexcel is a tool for citation analysis, designed by Olle Persson of
Umeå University, Sweden, to assist users in analyzing bibliographic data.
Bibexcel: Features
• Co-citation, bibliographic coupling, mapping and clustering analysis;
• Interaction with other software such as Pajek, Excel and SPSS;
• Import of many different types of data besides Web of Science;
• Flexible data management and analysis.
Menu
• File
• Edit DOC file
• Edit OUT file
• Add data
• Classify
• Analyze
• Misc
• Mapping
• Help
How does Bibexcel work?
• Download bibliographic data from either Scopus or Web of Science;
• Prepare the data so that they are suitable for use in Bibexcel;
• Analyze the data;
• Prepare reports.
What can Bibexcel do?
• Step-1: Download data from Web of Science in Plain text format OR from Scopus
in RIS format.
• Step-2: Restructuring of downloaded data
• Step-3: Creating an OUT-file.
• Step-4: Analysis of data;
• Step-5: Export files to Pajek for visualizations
How to do the Re-structuring of data?
Step-1: Restructuring the data involves two operations:
– Insert carriage returns in the file
– Convert the bibliographic records to DIALOG format
Step-2: To insert carriage returns:
– Go to the Bibexcel menu: Edit doc file>Replace line feed with carriage return.
<A *.tx2 file will be created>
Step-3: To convert the bibliographic records to DIALOG format:
– Select the *.tx2 file and then choose one of the following options:
– Misc>Convert to Dialog format>Convert from Web of Science, OR
– Misc>Convert to Dialog format>Convert from Scopus RIS format.
<Result: a *.doc file will be created>
WoS - Record Looks Like!
• PT- Journal|
• AU- Brown S; Blackmon K|
• TI- Aligning manufacturing strategy and business-level competitive strategy in new
competitive environments: The case for strategic resonance|
• SO- JOURNAL OF MANAGEMENT STUDIES|
• NR- 190|
• CD- 1998, IND WEEK 1207, P22, V247; 1998, IND WEEK 1207, P24, V247;
ADLER PS, 1990, P55, CALIFORNIA MANAG SPR; ANDERSON J, 1991, V1,
P86, INT J PRODUCTION OPE; ZAJAC EJ, 2000, V21, P429, STRATEGIC
MANAGE J; ZAJAC EJ, 1989, V10, P413, STRATEGIC MANAGE J|
• J9- J MANAGE STUD-OXFORD|
• JN- JOURNAL OF MANAGEMENT STUDIES, 2005, V42, N4, P793-815|
• UT- ISI:000229369000004|
• ER ||
Scopus - Record Looks Like! [*.doc]
• TY- JOUR|
• TI- Surface modification of polyacrylonitrile co-polymer membranes using pulsed
direct current nitrogen plasma|
• T2- Thin Solid Films|
• VL- 597|
• SP- 171|
• EP- 182|
• PY- 2015|
• DO- 10.1016/j.tsf.2015.11.050|
• AU- Pal, D.; Neogi, S.; De, S.|
• N1- Export Date: 17 April 2016|
• M3- Article|
• DB- Scopus|
• UR- http://www.scopus.com/inward/record.url?eid=2-s2.0-84959475022&partnerID=40&md5=7f25e70a7940e813bd6cfcd7c1a3ac78|
• ER- ||
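The structure of these records (a two-character tag, a hyphen, the value, "|" at the end of each field and "||" at the end of each record) can be read with a few lines of Python. The sketch below is based only on the sample records shown above; the function name is hypothetical, and treating ";" as the separator inside the AU and KW fields is an assumption drawn from those samples.

```python
import re

# A field looks like "AU- Pal, D.; Neogi, S.|": tag, hyphen, value.
FIELD_RE = re.compile(r"^([A-Z][A-Z0-9])-\s*(.*)$", re.DOTALL)
MULTI_VALUE_TAGS = {"AU", "KW"}  # assumed to hold ";"-separated values


def parse_records(text):
    """Parse Dialog-style text into a list of {tag: [values]} dictionaries."""
    records = []
    for chunk in text.split("||"):          # "||" closes a record
        record = {}
        for field in chunk.split("|"):      # "|" closes a field
            field = " ".join(field.split())  # re-join wrapped lines
            match = FIELD_RE.match(field)
            if not match:
                continue
            tag, value = match.groups()
            if tag == "ER":                  # end-of-record marker
                continue
            if tag in MULTI_VALUE_TAGS:
                values = [part.strip() for part in value.split(";") if part.strip()]
            else:
                values = [value]
            record.setdefault(tag, []).extend(values)
        if record:
            records.append(record)
    return records


sample = """TY- JOUR|
TI- Surface modification of polyacrylonitrile co-polymer membranes using pulsed
direct current nitrogen plasma|
T2- Thin Solid Films|
PY- 2015|
AU- Pal, D.; Neogi, S.; De, S.|
ER- ||
"""
records = parse_records(sample)
print(records[0]["AU"])
```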
How do you analyze the file in Bibexcel?
How do you create an OUT file?
An OUT file is a tab-delimited text file that can be imported into Excel. It can be created
as follows:
Step-1: Select the *.doc file.
Step-2: Enter the field tag (e.g. AU for Author, TI for Title) in the box marked “Old Tag”
under the Frequency Distribution panel;
Step-3: In the “Select field to be analyzed” box, choose the “Any ; separated field”
option from the drop-down;
Step-4: Choose ‘Whole String’ under the Frequency Distribution panel;
Step-5: Click on “Prep”;
Step-6: After you answer the questions, the OUT file will be generated <with extension *.OUT>.
Creating Various file types in Bibexcel
Step-1: *.tx2 file [carriage return]
Step-2: *.doc file
Step-3: *.OUT file
Step-4: *.CIT file [output file under frequency dist.]
Step-5: *.OUX file
Step-6: *.COC file
Step-7: *.jn1 or *.jn2 etc file
Result-1: Author/title/journal with Document Identification No.
How to create Author *.OUT file?
Step-1: Select the *.doc file.
Step-2: Select the field delimiter as Any ; separated field [from the Select field to be
analyzed box].
Step-3: Type AU in the Old Tag box.
Step-4: Press the Prep button.
Step-5: Click OK, OK and Yes.
Step-6: The result will be generated in the [*.OUT file].
How to save the *.OUT file?
Step-7: The list of authors or titles can be saved in a new file as follows:
– Click on view whole file
– Type a file name in the box ‘Type New file name here’
– Click on Start under select documents
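Conceptually, the Prep step splits a multi-valued field into one line per value and attaches the document identification number to each line. The Python sketch below imitates this for the AU field. The function names are hypothetical, and the exact column order of a real Bibexcel OUT file is an assumption here.

```python
def make_out_rows(records, tag="AU", separator=";"):
    """Split the chosen field of each record into (document_id, value) rows.

    records is a list of dictionaries mapping a field tag to its raw text;
    documents are numbered from 1 in the order in which they appear.
    """
    rows = []
    for doc_id, record in enumerate(records, start=1):
        for value in record.get(tag, "").split(separator):
            value = value.strip()
            if value:
                rows.append((doc_id, value))
    return rows


def to_tab_delimited(rows):
    """Format the rows as tab-delimited text, one row per line."""
    return "\n".join(f"{doc_id}\t{value}" for doc_id, value in rows)


# Two records with the author fields from the sample records in this manual.
records = [
    {"AU": "Pal, D.; Neogi, S.; De, S."},
    {"AU": "Thelwall, M."},
]
rows = make_out_rows(records)
print(to_tab_delimited(rows))
```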
Result-2: Author/title/journal with no. of articles
Step-1: Make an OUT file for the AU tag and select the *.OUT file.
Step-2: Select the field delimiter as Any ; separated field [from the Select field to be
analyzed box].
Step-3: Type AU in the Old Tag box.
Step-4: Choose Whole String [from Frequency distribution],
click on sort descending and press the “Start” button.
Step-5: Click OK, OK and Yes.
Step-6: The result will be generated in the [*.CIT file].
How to save the *.CIT file?
Step-7: The list of authors or titles can be saved in a new file as follows:
– Click on view whole file
– Type a file name in the box ‘Type New file name here’
– Click on Start under select documents
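The frequency distribution produced in this way is a count of how many documents each author (or title, or journal) appears in, sorted in descending order. A minimal Python sketch of the same idea is given below; the function name and the sample rows are hypothetical, and the layout of a real *.CIT file may differ.

```python
from collections import Counter


def frequency_table(out_rows):
    """Count how often each value occurs in (document_id, value) rows.

    The result is sorted by descending count, then alphabetically.
    """
    counts = Counter(value for _, value in out_rows)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical OUT rows: (document number, author).
out_rows = [
    (1, "Pal, D."),
    (1, "Neogi, S."),
    (2, "Pal, D."),
    (3, "Pal, D."),
    (3, "De, S."),
]
for author, n_documents in frequency_table(out_rows):
    print(n_documents, author)
```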
Result-3: List of Journals with Citations
Step-1: Select the *.doc file.
Step-2: Create the *.OUT file by selecting “Cited Journals with whole string” and Any ;
separated field, and click on “Prep”. The result is an *.out file with document ID and
journal name.
Step-3: Select the *.out file, enter the tag “TC” in the ‘Old Tag’ box, click “Add Field to
unit” and then say “No”. The result is a *.jn1 file: the list of journals with document ID
and citations.
Note: You can use the *.jn1 file to create a journal map via Mapping>Create Pajek map file.
Remove Duplicates
1. Select the *.doc file and view the whole file
2. Put AU in the Old Tag box
3. Check duplicates
4. Frequency Distribution- Whole String
5. Select Field Any comma separated
6. Click on “Prep” to create the *.out file
7. Select the *.out file and click on ‘Start’ to remove duplicate authors. The result is
the *.cit file
Sort by Number of Documents
1. Select the *.out file and view the whole file
2. Put AU in the Old Tag box
3. Check sort descending
4. Frequency Distribution- Whole String
5. Select Field Any comma separated
6. Click on “Start”
Bibexcel: Frequency Distribution
• First of all, choose the OUT file.
• Fractionalize: If a document is written by two authors, each author is credited with
half an article. Checking the Fractionalize box selects this fractional counting method.
• Whole Counts: If the Fractionalize box is unchecked, the Whole Counts method is used,
in which each author is credited with one full article.
• Whole String: If we select Whole string from the drop-down under ‘Select type of
unit’, Bibexcel will count the whole author name.
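The difference between the two counting methods is easy to see on a small example. The Python sketch below credits each author either with one full article per document (whole counts) or with an equal share of each document (fractional counts). It illustrates the counting principle only; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def count_authors(documents, fractionalize=False):
    """Credit authors for their documents.

    documents is a list of author lists. With fractionalize=False every
    author of a document gets 1 (whole counts); with fractionalize=True
    the document's single credit is shared equally among its authors.
    """
    credit = defaultdict(float)
    for authors in documents:
        unique = list(dict.fromkeys(authors))  # drop repeated names, keep order
        if not unique:
            continue
        share = 1 / len(unique) if fractionalize else 1.0
        for author in unique:
            credit[author] += share
    return dict(credit)


# Hypothetical data: one two-author paper and one single-author paper.
documents = [["Pal, D.", "Neogi, S."], ["Pal, D."]]
print(count_authors(documents))                      # whole counts
print(count_authors(documents, fractionalize=True))  # fractional counts
```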
Co-occurrence analysis [co-word analysis/ Co-author analysis]
Creating Pajek Files in Bibexcel
Step-1: Download data from Scopus in *.ris format.
Step-2: Go to the Bibexcel menu>Edit doc file>Replace Line feed with Carriage return. This
produces a *.tx2 file.
Step-3: Go to the Bibexcel menu>Misc>Convert to Dialog format>Convert from Scopus RIS
Format. This produces a *.doc file.
……………………………………………………….
Sample Record
TY- INPR|
TI- Interpreting correlations between citation counts and other indicators|
T2- Scientometrics|
J2- Scientometrics|
SP- 1|
EP- 11|
PY- 2016|
DO- 10.1007/s11192-016-1973-7|
SN- 01389130 (ISSN)|
AU- Thelwall, M.|
KW- Altmetrics; Citation analysis; Correlation; Discretised lognormal; Indicators; Simulation|
PB- Springer Netherlands|
N1- Export Date: 28 May 2016|
M3- Article in Press|
DB- Scopus|
N1- Article in Press|
LA- English|
RP- Thelwall, M.; Statistical Cybermetrics Research Group, University of Wolverhampton,
Wulfruna Street, United Kingdom; email: [email protected]|
UR- https://www.scopus.com/inward/record.uri?eid=2-s2.0-84966667404&partnerID=40&md5=3fce45bd36c86dc44b08849c8be47787|
ER- ||
Step-4: Select the *.doc file and go to the “Frequency distribution” box. Choose “Whole
string” from the drop-down menu, check the checkbox labeled “Make new out-file” and type
AU (Author) in the Old tag field. Click Start. This will create a new .oux file.
Step-5: Choose the .oux file and open it in The List by clicking on “View file”. Then, in
the “Select field to be analysed…” box, choose “Any; separated field” from the drop-down
menu and press the Prep button. This will create a new .out file, in which all the
authors are listed by cases.
Step-6: In The List, mark the *.out file and choose Analyze>Co-occurrence>Make pairs
via listbox. Answer No to the first question and OK to the second. This will result in a
.coc file.
Step-7: Use the .coc file to create a network file. Go to Mapping and choose “Create a .net
file for Pajek”. Answer No and Yes; the result will be an undirected graph. If we answer
Yes and Yes, the result will be a directed graph.
Step-8: Use the *.cit file to create a VEC file: go to Mapping>Create VEC file. This will
result in a *.vec file.
Step-9: To partition the co-citation matrix, use the *.coc file. Go to Analyze>Co-
occurrence>Cluster pair. This creates the files *.pe2, *.pe3, *.pe4 and *.pe5.
Step-10: Use the *.pe2 file and go to Mapping>Create clu file.
Step-11: Import the files into Pajek: *.net under Network, *.vec under Vectors, and *.clu
under Partition.
Step-12: After we have opened these files in Pajek, we choose the following option from
the Pajek menu: Draw>Draw-Partition-Vector
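What the pair-making and network-file steps produce can be sketched in Python: co-author pairs are counted per document and then written in the Pajek network layout (a "*Vertices" section listing numbered labels, followed by an "*Edges" section listing vertex pairs with weights). This is an illustration of the idea under those assumptions, not a replacement for Bibexcel's own output; the function names and sample rows are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coauthor_pairs(out_rows):
    """Count how often two authors appear on the same document.

    out_rows is a list of (document_id, author) tuples.
    """
    by_document = {}
    for doc_id, author in out_rows:
        by_document.setdefault(doc_id, set()).add(author)
    pairs = Counter()
    for authors in by_document.values():
        for first, second in combinations(sorted(authors), 2):
            pairs[(first, second)] += 1
    return pairs


def to_pajek_net(pairs):
    """Write the pairs as an undirected network in Pajek .net text form."""
    labels = sorted({name for pair in pairs for name in pair})
    index = {name: number for number, name in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[name]} "{name}"' for name in labels]
    lines.append("*Edges")
    for (first, second), weight in sorted(pairs.items()):
        lines.append(f"{index[first]} {index[second]} {weight}")
    return "\n".join(lines) + "\n"


# Hypothetical OUT rows: (document number, author).
out_rows = [
    (1, "Pal, D."), (1, "Neogi, S."), (1, "De, S."),
    (2, "Pal, D."), (2, "Neogi, S."),
]
pairs = coauthor_pairs(out_rows)
net_text = to_pajek_net(pairs)
print(net_text)
```

Authors who never share a document with anyone do not appear as vertices in this sketch, because only pairs are written out.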
Conclusion
Publish or Perish (POP), SciMAT and Bibexcel, covered in this manual, are some of the tools
for bibliometric and scientometric analysis based on the basic principles of Bibliometrics
and Scientometrics. For visualization of results, Pajek is a suitable and easy-to-use tool.
Acknowledgements
We acknowledge all the developers of the software and the organizations that are directly or
indirectly involved in the software development and upgradation process. The manuals of
each software were also very helpful in preparing this working manual for the workshop.