Semantic Analytics on Social Networks: Experiences in Addressing the Problem of
Conflict of Interest Detection
Boanerges Aleman-Meza, Meenakshi Nagarajan, Cartic Ramakrishnan, Amit P. Sheth, I. Budak Arpinar
LSDIS Lab, Dept. of Computer Science, University of Georgia, Athens, (boanerg, bala, cartic, amit, budak)@cs.uga.edu
Li Ding, Pranam Kolari, Anupam Joshi, Tim Finin
Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250, (dingli1, kolari1, joshi, finin)@cs.umbc.edu
WWW 2006
Conflict of Interest (COI) Detection Problem
The NIH (National Institutes of Health) defines COI in the context of the grant review process as: “A Conflict Of Interest (COI) in scientific peer review exists when a reviewer has an interest in a grant or cooperative agreement application or an R&D contract proposal that is likely to bias his or her evaluation of it. A reviewer who has a real conflict of interest with an application or proposal may not participate in its review.”
Abstract
A Semantic Web application that detects Conflict of Interest (COI) relationships among potential reviewers and authors of scientific papers.
It discovers various ‘semantic associations’ between the reviewers and authors.
It integrates entities and relationships from two social networks:
  “knows”: FOAF (Friend-of-a-Friend) social network
  “co-author”: DBLP bibliography
Introduction
Social networks on the Web
  Friendship or personal ties: LinkedIn.com, MySpace.com, Friendster, Hi5
  College students: Facebook.com, Club Nexus (Stanford students)
  Social network applications: Yahoo! 360°, Dodgeball.com (by Google)
Introduction
COI detection systems
  EDAS: edas.info/doc
  Microsoft Research CMT tools: msrcmt.research.microsoft.com/cmt/
  Confious: www.confious.com
Introduction
Open resources
  Real-world examples
  Addressing the problem of integrating different social networks
Two open resources for evaluation
  “co-author” relationship: DBLP bibliography (dblp.uni-trier.de)
  “knows” relationship: FOAF (Friend-of-a-Friend) social network (Swoogle)
Motivation and Background
Obtaining high quality data
Semantic Association
Reviewer vs. Author
Integration of Two Social Networks
FOAF
  The dataset includes 207,000 person entities from 49,750 FOAF documents collected during the first three months of 2005.
DBLP
  It is one of the best formatted and organized bibliography datasets.
  DBLP covers approximately 400,000 researchers who have publications in major Computer Science publication venues.
1. Metadata Extraction
2. Cleaning FOAF and DBLP Datasets – 1/2
DBLP-SW (Semantic Web): 38,027 person entities
2. Cleaning FOAF and DBLP Datasets – 2/2
FOAF-EDU: 21,308 person entities
3. Entity Disambiguation - Algorithm
Name-reconciliation algorithm: Dong, X., Halevy, A. and Madhavan, J., Reference Reconciliation in Complex Information Spaces. In ACM SIGMOD Conference, (Baltimore, Maryland, 2005).
  Atomic attributes: similarity of their names and affiliations …
  Association attributes: common co-author relationship …
Weights are manually assigned
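A minimal Python sketch of how manually weighted attribute similarities might be combined into a single match score. It is not the Dong et al. algorithm itself: the weight values, the threshold, the record fields, and the function names are all hypothetical, and string similarity is approximated with the standard library's `difflib`.

```python
from difflib import SequenceMatcher

# Hypothetical, manually assigned weights; the slides do not give the real values.
WEIGHTS = {"name": 0.5, "affiliation": 0.2, "coauthors": 0.3}
THRESHOLD = 0.7  # hypothetical cut-off above which two records are merged


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 0.0 if either string is empty."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def coauthor_overlap(a: set, b: set) -> float:
    """Jaccard overlap of two co-author name sets (an association attribute)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_entity_score(p: dict, q: dict) -> float:
    """Weighted sum of atomic (name, affiliation) and association (co-author) evidence."""
    return (
        WEIGHTS["name"] * string_similarity(p["name"], q["name"])
        + WEIGHTS["affiliation"]
        * string_similarity(p.get("affiliation", ""), q.get("affiliation", ""))
        + WEIGHTS["coauthors"]
        * coauthor_overlap(p.get("coauthors", set()), q.get("coauthors", set()))
    )


def is_same_entity(p: dict, q: dict) -> bool:
    """Decide whether a FOAF record and a DBLP record refer to the same person."""
    return same_entity_score(p, q) >= THRESHOLD
```

With these made-up weights, a name match alone can contribute at most 0.5, so two records are merged only when affiliation or co-author evidence also agrees.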
3. Entity Disambiguation - Results
Entity Disambiguation Results
  6 random samples, each having 50 entity pairs
  1 false positive, 16 false negatives
3. Entity Disambiguation - Analysis
Semantic Analysis for COI Detection
Levels of Conflict of Interest
An algorithm for COI detection based on:
  quantity and strength of relationships
  ‘distance’ between a reviewer and an author
Weighting Relationships for COI Detection
foaf:knows from A to B
  Indicates potential positive bias from A to B.
  Does not necessarily imply a reciprocal relationship from B to A.
  We assigned a weight of 0.5 to all 34,824 foaf:knows relationships in the FOAF-EDU dataset.
co-author relationship
  It is a good indicator of collaboration and/or social interactions among authors.
Weighting Relationships for COI Detection
For any two co-authors, a and b, let $\text{co-author}(a, b)$ represent the set of relationships where a co-authors a publication with b.
We define the weight of the co-authorship relationship from a to b as follows:
$$\mathrm{weight}(a, b) = \frac{|\text{co-author}(a, b)|}{|P_a|}$$
where $P_a$ represents the set of papers published by a.
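A minimal Python sketch of a directed co-authorship weight, assuming the weight from a to b is the share of a's papers that were co-authored with b (a ratio consistent with the definitions of the co-authorship set and Pa above). The function name and the input layout, one list of author names per publication, are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def coauthor_weights(papers):
    """Compute directed co-authorship weights.

    papers: iterable of author-name lists, one list per publication.
    Returns {(a, b): weight}, where weight is the number of papers that
    a and b wrote together divided by the total number of papers by a.
    """
    papers_by_author = defaultdict(int)  # |Pa| for every author a
    joint_papers = defaultdict(int)      # papers shared by the ordered pair (a, b)
    for authors in papers:
        unique = set(authors)            # guard against duplicated names on one paper
        for a in unique:
            papers_by_author[a] += 1
        for a, b in permutations(unique, 2):
            joint_papers[(a, b)] += 1
    return {
        (a, b): count / papers_by_author[a]
        for (a, b), count in joint_papers.items()
    }


# Hypothetical data: "alice" has four papers, one of them with "bob";
# "bob" has only that one paper.
weights = coauthor_weights([["alice", "bob"], ["alice"], ["alice"], ["alice"]])
print(weights[("alice", "bob")])  # 0.25
print(weights[("bob", "alice")])  # 1.0
```

The weight is not symmetric: the single joint paper is a quarter of the prolific author's output but all of the other author's, so the same collaboration counts for more when seen from the less prolific side.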
Detection of Conflict of Interest – 1/5
Anyanwu, K. and Sheth, A.P., ρ-Queries: Enabling Querying for Semantic Associations on the Semantic Web. In Twelfth International World Wide Web Conference, (Budapest, Hungary, 2003), 690-699.
Detection of Conflict of Interest – 2/5
The algorithm for COI detection works as follows:
  First, it finds all semantic associations between two entities.
  Second, each semantic association found is analyzed by looking at the weights of its individual relationships.
Thresholds were required to decide what weight values are indicative of strong and weak collaborations.
The following cases are considered:
  Reviewer and author are directly related.
  Reviewer and author are not directly related, but they are directly related to (at least) one common person.
  Reviewer and author are indirectly related.
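The first step, finding semantic associations between two entities, can be approximated as a bounded path search. The sketch below is a simplification and not the ρ-query machinery cited earlier: it assumes an association is a simple path of at most three relationships, follows edges only in their stored direction, and uses a hypothetical adjacency-dictionary graph layout.

```python
def find_associations(graph, source, target, max_len=3):
    """Return all simple paths from source to target with at most max_len edges.

    graph: {node: {neighbor: weight}}, a hypothetical adjacency layout.
    Each path is returned as a list of nodes, starting at source.
    """
    results = []

    def extend(path):
        node = path[-1]
        if node == target:
            results.append(list(path))
            return
        if len(path) - 1 == max_len:     # path already uses max_len edges
            return
        for neighbor in graph.get(node, {}):
            if neighbor not in path:     # keep paths simple (no repeated nodes)
                path.append(neighbor)
                extend(path)
                path.pop()

    extend([source])
    return results


def path_weights(graph, path):
    """Weights of the individual relationships along one association."""
    return [graph[a][b] for a, b in zip(path, path[1:])]
```

A path with one edge corresponds to the directly related case, two edges to the common-person case, and three edges to the indirectly related case; `path_weights` supplies the per-relationship weights that the second step inspects.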
Detection of Conflict of Interest – 3/5
(i) Reviewer and author are directly related
  Through foaf:knows and/or co-author
  The assessment is “high”: at least one relationship has weight in the medium-to-high range (i.e., weight ≥ 0.3)
  The assessment is “medium”: at least one relationship has weight in the low-to-medium range (i.e., 0.1 ≤ weight < 0.3)
  The assessment is “low”: at least one relationship has low weight (i.e., weight < 0.1)
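The thresholds for the directly related case can be sketched in Python as follows. The function name is hypothetical, and letting the strongest relationship decide the level is an assumption about how overlapping cases are resolved; the slides only list the three ranges.

```python
def assess_direct(weights):
    """Assess COI for a directly related reviewer and author.

    weights: weights of all direct relationships (foaf:knows and/or
    co-author) between the two people.
    Returns "high", "medium", "low", or None when there is no direct link.
    """
    if not weights:
        return None
    strongest = max(weights)  # assumption: the strongest relationship decides
    if strongest >= 0.3:
        return "high"
    if strongest >= 0.1:
        return "medium"
    return "low"


print(assess_direct([0.5]))        # high (a foaf:knows link has weight 0.5)
print(assess_direct([0.05, 0.2]))  # medium
print(assess_direct([0.05]))       # low
```

Because every foaf:knows relationship carries a weight of 0.5, any direct foaf:knows link yields “high” under these thresholds.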
Detection of Conflict of Interest – 4/5
(ii) Reviewer and author are not directly related but they are directly related to (at least) one common person.
  The common person is an intermediary.
  The assessment is “medium” in either of two cases:
    Case 1: 10 intermediaries in common.
    Case 2: The relationships connecting to the intermediary (i.e., one from the reviewer and another from the author) have weight in the medium-to-high range (i.e., weight ≥ 0.3).
  If neither of these two cases holds, then the assessment is “low.”
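A minimal Python sketch of the intermediary case. Two readings are assumptions rather than statements from the slides: Case 1 is taken to mean at least 10 common intermediaries, and Case 2 is taken to require that both relationships to a single intermediary reach the 0.3 threshold. The function name and input layout are hypothetical.

```python
def assess_via_intermediaries(links, min_common=10, strong=0.3):
    """Assess COI when reviewer and author share one or more intermediaries.

    links: {intermediary: (reviewer_side_weight, author_side_weight)}
    Returns "medium", "low", or None when there is no common intermediary.
    """
    if not links:
        return None
    # Case 1 (assumed reading): many intermediaries in common.
    if len(links) >= min_common:
        return "medium"
    # Case 2 (assumed reading): both relationships to one intermediary are strong.
    if any(w_rev >= strong and w_auth >= strong for w_rev, w_auth in links.values()):
        return "medium"
    return "low"
```

For example, a single shared intermediary with weights (0.5, 0.4) gives “medium”, while (0.5, 0.1) gives “low” because the author-side relationship is weak.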
Detection of Conflict of Interest – 5/5
(iii) Reviewer and author are indirectly related
  Through a semantic association containing three relationships.
  In this case, the assessment is a “low” level of potential COI.
  The assessment is “medium”: weight in the low-to-medium range (i.e., 0.1 ≤ weight < 0.3)
Experimental Results
Conclusion
Conflict of Interest detection fits in a multi-step process of a class of Semantic Web applications.
Identified some major stumbling blocks:
  Metadata extraction
  Data integration algorithms and techniques
  Entity disambiguation
  Metadata and ontology representation
COI detection is based on semantic technology techniques.
Integrated social network from the FOAF social network and the DBLP co-authorship network.
Conclusion
A demo of the application is available (lsdis.cs.uga.edu/projects/semdis/coi/).