
Event-Based Infrastructure for Reconciling Distributed Annotation Records

Page 1: Event-Based Infrastructure for Reconciling Distributed Annotation Records

Event-Based Infrastructure for Reconciling Distributed Annotation Records

Ahmet Fatih Mustacoglu

Advisor: Prof. Geoffrey C. Fox

Page 2

Outline
- Introduction
- Motivations and research issues
- Architecture
- Event-Based Infrastructure
- Measurements and Analysis
- Conclusions
- Contributions and Future Works

04/19/23 Ahmet Fatih Mustacoglu

Page 3

Online Collaboration
- Rapid development of annotation tools and services
- Aimed at fostering online collaboration and sharing between users and communities:
  - Bookmarking tools that support annotation using keywords (called tags) and sharing, e.g. del.icio.us
  - Tools for annotation and sharing of scholarly publications: Connotea, CiteULike, BibSonomy
  - Social networking tools, e.g. MySpace and Facebook
  - Video sharing and annotation, e.g. YouTube

Page 4

Motivations
- Various annotation tools, each with different and limited metadata storage
  - Multiple instances of metadata about the same document
  - No time-stamp information for updated records
  - Causing inconsistencies
- Lack of interoperability between annotation sites
  - Applying a service-based architecture to annotation systems
- Unification and federation of major annotation tools, to use them with added capabilities for scientific research
  - Management of metadata coming from different sources
  - Adding missing services: uploading metadata to a repository and extracting metadata from it

Page 5

Research Issues I
- Need an infrastructure to manage metadata
  - Dealing with metadata coming from several sources
- Issues with using annotation tools and their services with added capabilities
  - Extracting data from tools and uploading data to them
  - More metadata support for documents
  - Providing communication between annotation tools
- Issues with document tracking and access to previous versions of documents
- Consistency enforcement
  - Issues with maintaining consistency between copies of a record stored at various annotation tools

Page 6

Research Issues II
- Unification
  - How to combine different annotation tools under the same umbrella?
- Federation
  - How to federate major annotation tools?
- Scalability
  - System behavior as the message rate (messages per second) increases
- Flexibility and extensibility
  - Interoperable with other clients
  - Ease of integrating an annotation tool

Page 7

Event-based Infrastructure and Consistency Enforcement Architecture

Page 8

KEY CONCEPTS
- Distributed Annotation Record (DAR): a collection of metadata stored at an annotation tool.
- Digital Entity (DE): a digital collection of metadata for a citation, stored in the system database, that forms the primary copy of a DAR.
- Event: a time-stamped action on a digital entity
  - Major events: insertion or deletion of a digital entity
  - Minor events: modifications to an existing digital entity
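The event vocabulary above can be sketched as a small data model. This is only an illustration, not the IDIOM implementation: the class names, field names, and the `classify` helper are hypothetical. The one rule taken from the slide is that insertions and deletions are major events and modifications are minor events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    INSERT = "insert"
    DELETE = "delete"
    MODIFY = "modify"


@dataclass
class Event:
    """A time-stamped action on a digital entity (field names are assumed)."""
    entity_id: str
    action: Action
    changes: dict = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def classify(event: Event) -> str:
    """Insertion or deletion is a major event; a modification is a minor event."""
    return "major" if event.action in (Action.INSERT, Action.DELETE) else "minor"


created = Event("de-1", Action.INSERT, {"title": "Web 2.0 for Grids and e-Science"})
retagged = Event("de-1", Action.MODIFY, {"tags": ["grid", "web2.0"]})
print(classify(created), classify(retagged))
```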

Page 9

Communication Manager
- Responsible for providing communication between the annotation tools and the Update Manager and Digital Entity Manager via gateways, e.g. the Connotea gateway
- Uses a gateway for each annotation tool, plus a parser
  - Retrieves records in XML format
  - Parses the records and passes them to the Update Manager
  - Posts updates coming from the Digital Entity Manager to the annotation tools
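The retrieve-and-parse step can be sketched as follows. The XML layout, element names, and the `parse_records` function are assumptions made for illustration; each real annotation tool returns its own schema, and the slides do not show one.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout; a real tool's XML response will differ.
SAMPLE_XML = """
<records>
  <record id="r1">
    <title>Web 2.0 for Grids and e-Science</title>
    <tag>grid</tag>
    <tag>web2.0</tag>
  </record>
  <record id="r2">
    <title>Hybrid Consistency Framework</title>
  </record>
</records>
"""


def parse_records(xml_text: str) -> list:
    """Turn a tool's XML response into plain dictionaries for the update manager."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for node in root.findall("record"):
        records.append({
            "id": node.get("id"),
            "title": node.findtext("title", default=""),
            "tags": [tag.text for tag in node.findall("tag")],
        })
    return records


records = parse_records(SAMPLE_XML)
print([record["id"] for record in records])
```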

Page 10

Communication Manager

[Figure: Communication Manager]

Page 11

Gateway
- Interface between the Event-based Infrastructure (EBI) and each annotation tool
- Provides extensibility
- A gateway needs to be deployed for each annotation tool that is to be integrated into the system

[Figure labels: Gateways, EBI Modules, EBI, Annotation Tools]
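One way to picture the extensibility claim is a common gateway interface with one implementation per tool. The sketch below is an assumption about how such an interface might look: `Gateway`, `InMemoryGateway`, and `broadcast` are hypothetical names, and the in-memory class stands in for a gateway that would call a tool's web API.

```python
from abc import ABC, abstractmethod


class Gateway(ABC):
    """Hypothetical adapter between the infrastructure and one annotation tool."""

    @abstractmethod
    def fetch_records(self) -> list:
        """Pull the tool's current records."""

    @abstractmethod
    def post_update(self, record: dict) -> None:
        """Push an updated record back to the tool."""


class InMemoryGateway(Gateway):
    """Stand-in for a real gateway; keeps records in a dict instead of calling a tool."""

    def __init__(self):
        self.store = {}

    def fetch_records(self) -> list:
        return list(self.store.values())

    def post_update(self, record: dict) -> None:
        self.store[record["id"]] = dict(record)


# Integrating another tool means registering one more gateway here.
gateways = {"connotea": InMemoryGateway(), "citeulike": InMemoryGateway()}


def broadcast(record: dict) -> None:
    """Send one update to every registered annotation tool."""
    for gateway in gateways.values():
        gateway.post_update(record)


broadcast({"id": "r1", "title": "Web 2.0 for Grids and e-Science"})
print({name: len(gateway.fetch_records()) for name, gateway in gateways.items()})
```

Under this design the rest of the system only talks to the `Gateway` interface, so adding a tool does not touch the code that produces or consumes updates.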

Page 12

Annotation Tools Update Manager

Responsible for:
- Retrieving the records from the annotation tools periodically (time-based consistency approach, by pulling records)
- Finding out the updates
- Passing the updates to the Digital Entity Manager so that they can be applied to the primary copy of each record
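The "finding out the updates" step can be sketched as a comparison between freshly pulled records and the primary copies. The slides do not describe the actual comparison, so `find_updates` and its record-by-id dictionaries are assumptions; a scheduler would call it once per pull interval.

```python
def find_updates(primary: dict, pulled: dict) -> list:
    """Compare records pulled from a tool with the primary copies, keyed by record id.

    Returns (action, record_id, payload) tuples. Only fields present in the pulled
    record are compared, so fields removed at the tool are not detected here.
    """
    updates = []
    for record_id, record in pulled.items():
        if record_id not in primary:
            updates.append(("insert", record_id, record))
        else:
            changed = {key: value for key, value in record.items()
                       if primary[record_id].get(key) != value}
            if changed:
                updates.append(("modify", record_id, changed))
    for record_id in primary:
        if record_id not in pulled:
            updates.append(("delete", record_id, {}))
    return updates


primary = {"r1": {"title": "A", "tags": ["grid"]}, "r2": {"title": "B", "tags": []}}
pulled = {"r1": {"title": "A", "tags": ["grid", "web2.0"]}, "r3": {"title": "C", "tags": []}}
for update in find_updates(primary, pulled):
    print(update)
```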

Page 13

Digital Entity Manager

Responsible for:
- Events and dataset creation
- Event processing
  - Manages updates made on the primary copy of a digital entity
  - Updates the primary copy located in the system database
  - Passes updates to the Communication Manager (strict consistency, by pushing updates immediately)
- Handling periodic update management
- History and rollback management of a digital entity
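A toy version of these responsibilities, assuming a dictionary in place of the system database and a callback in place of the Communication Manager. `DigitalEntityStore` and its methods are hypothetical; the sketch only shows the order of operations: record history, update the primary copy, push immediately.

```python
import copy


class DigitalEntityStore:
    """Toy primary-copy store that keeps every earlier version for rollback."""

    def __init__(self, push):
        self.entities = {}   # entity id -> current primary copy
        self.history = {}    # entity id -> earlier versions (None = did not exist)
        self.push = push     # callback standing in for the Communication Manager

    def apply(self, action, entity_id, changes=None):
        current = self.entities.get(entity_id)
        self.history.setdefault(entity_id, []).append(copy.deepcopy(current))
        if action == "delete":
            self.entities.pop(entity_id, None)
        else:  # "insert" or "modify"
            updated = dict(current or {})
            updated.update(changes or {})
            self.entities[entity_id] = updated
        # Strict consistency: propagate right after the primary copy changes.
        self.push(entity_id, self.entities.get(entity_id))

    def rollback(self, entity_id):
        """Restore the version that existed before the most recent event."""
        previous = self.history[entity_id].pop()
        if previous is None:
            self.entities.pop(entity_id, None)
        else:
            self.entities[entity_id] = previous
        self.push(entity_id, self.entities.get(entity_id))


pushed = []
store = DigitalEntityStore(push=lambda eid, rec: pushed.append((eid, copy.deepcopy(rec))))
store.apply("insert", "de-1", {"title": "A", "tags": []})
store.apply("modify", "de-1", {"tags": ["grid"]})
store.rollback("de-1")
print(store.entities["de-1"])
```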

Page 14

Key Design Features
- Representation of the metadata of documents coming from various sources as events
  - Major and minor events
  - More metadata support than the major current annotation tools
  - Ability to access and roll back to previous versions of documents
- Unification and federation of the Connotea, Delicious, and CiteULike tools, and support for web-based academic search tools for scientific research
  - Using the annotation tools' existing services with added capabilities
  - Supporting major online search tools to collect metadata
  - Providing communication among annotation tools
- Leveraging interoperability via a service-enabled architecture
- Keeping records located at the annotation tools and in the system database consistent with each other
  - Adopting time-based and strict consistency approaches

Page 15

Use Cases
- Collaborative tagging
  - Updating or assigning keywords to records
- Collecting and managing citation metadata
  - Obtaining metadata about a publication through online scholarly search tools or annotation tools
- Unification and federation of the Connotea, CiteULike, and Delicious annotation tools
  - Providing a schema and communication among them
- Tracking updates to documents
  - Rolling back to previous states
- Building versions of documents based on users, groups, or all events
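Building a version "based on users, groups, or all events" can be pictured as replaying a filtered subset of the event log. The slides do not say how versions are actually built, so the replay approach, the event dictionaries, and `build_version` are all assumptions.

```python
def build_version(events, include=lambda event: True):
    """Replay events in time order, keeping only those accepted by `include`."""
    state = None
    for event in sorted(events, key=lambda e: e["time"]):
        if not include(event):
            continue
        if event["action"] == "delete":
            state = None
        else:  # "insert" or "modify"
            state = {**(state or {}), **event["changes"]}
    return state


events = [
    {"time": 1, "user": "alice", "group": "lab", "action": "insert",
     "changes": {"title": "A", "tags": []}},
    {"time": 2, "user": "bob", "group": "lab", "action": "modify",
     "changes": {"tags": ["grid"]}},
    {"time": 3, "user": "carol", "group": "guests", "action": "modify",
     "changes": {"title": "A (draft)"}},
]

everyone = build_version(events)
lab_only = build_version(events, lambda e: e["group"] == "lab")
alice_only = build_version(events, lambda e: e["user"] == "alice")
print(everyone, lab_only, alice_only, sep="\n")
```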

Page 16

Benchmarks and Environments
- Message rate scalability investigation
  - MoreInfo operation
    - With DB access
    - With memory utilization
  - Update DE operation
- We have used:
  - Java 2 Standard Edition compiler, version 1.5.0_12, with the maximum heap size of the Java Virtual Machine (JVM) set to 1024 MB
  - Apache Tomcat server, version 5.0.28
  - Apache Axis, version 1.2


Page 18

Message rate scalability investigation result (DB Usage) - I

[Figure: more info message rate. x-axis: message rate (messages per second), ticks from 200 to 1000; y-axis: average round trip time (msec), ticks from 2.5 to 7]

Page 19

Message rate scalability investigation result (Memory Utilization) - II

[Figure: more info message rate. x-axis: message rate (messages per second), ticks from 200 to 1600; y-axis: average round trip time (msec), ticks from 1.5 to 4]

Page 20

Message rate scalability investigation result (Update DE) - III

[Figure: update message rate. x-axis: message rate (messages per second), ticks from 150 to 650; y-axis: average round trip time (msec), ticks from 2 to 7]

Page 21

Overheads for updating Memory and DB

Message Rate    Overhead Time (DB)    STDev for DB    Overhead Time (Memory)    STDev for Memory
(message/sec)   (msec)                                (msec)
266             6.88                  0.85            0.93                      0.37
432             6.79                  0.75            0.98                      0.34
593             6.85                  0.74            0.96                      0.35
715             6.75                  0.74            0.96                      0.34
803             6.82                  0.75            0.96                      0.35
877             6.88                  0.71            0.96                      0.36
963             6.89                  0.79            0.98                      0.35
1030            6.75                  0.74            0.97                      0.34
1088            6.86                  0.72            0.97                      0.35
1115            6.74                  0.72            0.96                      0.35
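A quick way to read the table is to summarize each overhead column. The short script below uses only the values listed above; the variable names are illustrative. It shows that both overheads stay nearly flat across the measured message rates, at roughly 6.8 msec for the DB and just under 1 msec for memory.

```python
from statistics import mean

# (message rate, DB overhead in msec, memory overhead in msec), copied from the table
rows = [
    (266, 6.88, 0.93), (432, 6.79, 0.98), (593, 6.85, 0.96), (715, 6.75, 0.96),
    (803, 6.82, 0.96), (877, 6.88, 0.96), (963, 6.89, 0.98), (1030, 6.75, 0.97),
    (1088, 6.86, 0.97), (1115, 6.74, 0.96),
]

db = [row[1] for row in rows]
mem = [row[2] for row in rows]
db_mean, mem_mean = mean(db), mean(mem)

print(f"DB mean {db_mean:.2f} msec, range {min(db)} to {max(db)}")
print(f"memory mean {mem_mean:.2f} msec, range {min(mem)} to {max(mem)}")
print(f"DB / memory ratio {db_mean / mem_mean:.1f}")
```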

Page 22

Contributions
- System research
  - Event-based Infrastructure
  - Unification, federation, and interoperability of the Connotea, Delicious, and CiteULike annotation tools
  - Strategies for increasing performance and scalability via a top-to-bottom approach and memory utilization
  - Handling various types of metadata coming from several sources
  - Flexibility to access previous versions of a document
  - Adopting consistency enforcement approaches to maintain consistency
  - Comprehensive benchmarks to evaluate the scalability of the prototype system
- System software
  - An implementation of the Event-based Infrastructure of the Internet Documentation and Integration of Metadata (IDIOM) system
  - An implementation of a consistency maintenance mechanism for the IDIOM system

Page 23

Future Works
- Applying the Event-based Infrastructure to a broader range of application use cases
  - Supporting video collaboration tools (e.g. YouTube)
  - Social networking (e.g. Facebook)
- Unification and federation of other academic collaboration and publication tools into EBI, e.g. BibSonomy
- Moving from a single metadata store to distributed storage

Page 24

Publications

Book Chapters
1. Web 2.0 for Grids and e-Science; Geoffrey C. Fox, Rajarshi Guha, Donald F. McMullen, Ahmet Fatih Mustacoglu, Marlon E. Pierce, Ahmet E. Topcu, David J. Wild. Chapter in Grid Enabled Remote Instrumentation, published by Springer, 2007.

Publications
1. Hybrid Consistency Framework for Distributed Annotation Records in a Collaborative Environment; Ahmet Fatih Mustacoglu and Geoffrey Fox
2. Web 2.0 for E-Science Environments, Keynote Presentation; Geoffrey C. Fox, Marlon E. Pierce, Ahmet Fatih Mustacoglu, Ahmet E. Topcu
3. Integration of Collaborative Information Systems in Web 2.0; Ahmet E. Topcu, Ahmet Fatih Mustacoglu, Geoffrey Fox, Aurel Cami
4. SRG: A Digital Document-Enhanced Service Oriented Research Grid; Geoffrey Fox, Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami
5. AJAX Integration Approach for Collaborative Calendar-Server Web Services; Ahmet Fatih Mustacoglu, Geoffrey Fox
6. A Novel Event-Based Consistency Model for Supporting Collaborative Cyberinfrastructure Based Scientific Research; Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami, Geoffrey Fox
7. iCalendar (RFC2445) Compatible Collaborative Calendar-Server Services; Ahmet Fatih Mustacoglu, Wenjun Wu, Geoffrey Fox

Page 25

Tools for Annotation and Sharing Publications

They are used for:
- Collecting data and metadata
- Annotating data
- Sharing papers

Limitations of these tools:
- Different and limited metadata storage
- The same entry has to be entered into each tool
- No timing information for updated records
- Lack of ability to transfer data between tools
- Lack of services to extract data and import it into a repository
- Lack of services to upload data from a repository
