Merck's Information Landscape Knowledgebase - Eugenio, Clark

Post on 29-Aug-2014

description

Merck’s Informatics IT team has developed the Information Landscape Knowledgebase, a semantically rich, intuitively accessible model of information types and sources across Merck. In this presentation, we discuss how the solution takes advantage of underlying semantic models and novel visualizations to address the needs of both scientists and data stewards while providing valuable insights for IT resourcing. Pharmas of all sizes possess a wealth of information across hundreds of data sources at all levels of the organization, from basic and clinical research to drug safety and commercial competitive intelligence. Without a cohesive understanding of which data sources include which types of information, how they are used, and who uses them, the potential value of this information is far from realized. IT organizations have used traditional enterprise data management solutions to track data sources as best they can. However, the complexity of these tools does not allow data stewards to provide first-hand insight about their data sources, or scientists to explore and discover new sources. The results are under-utilization and duplication of data, leading to undiscovered knowledge and inefficient IT spending. By leveraging semantic models and visualizations, our Information Landscape Knowledgebase overcomes these traditional challenges to weave together an easily understood, discoverable, and analyzable view into a huge diversity of information sources across Merck.

transcript

Merck’s Information Landscape Knowledgebase
Kelly Clark & Charisma Eugenio
June 10, 2014

Modeling Merck’s Information Landscape (MRL = Merck Research Laboratories)

2011 – “Retina Diagram” - Concept

[Figure labels: Metabolomics; Gene expression; Sequencing; Pharmacogenetics; Imaging; Safety/PV; Real world data; Clinical trials; Global EMRs & registries; Social networks / patient communities; Monitoring; Behavior / adherence / compliance; Vaccine production; RNAi; HTS; Pathways; Biological lit. / Targets; Chemistry / Chemical lit.; Disease; Competitive intelligence; Pharmacology; Epidemiology; Target Class; Mechanism of Action; Toxicity; Outcomes; EHR; Stage gate decisions; External partners; Indication; Targeted institutions; Engagement programs; Formulary; Market access / channels; Batch records; Labeling; Quality Assurance; Manufacturing & process design]

Information Types/Taxonomies from Merck’s Information Landscape Knowledgebase (ILK)

2014 – “Retina Diagram” - Real

[Figure labels: Marketing; Regulatory; Clinical; Molecules / Compounds; Pharmacology; Toxicology; PK/PD; Studies; Gene; Competitive Intelligence; Animals; Biological Specimens; Assays; Real World Evidence; Targets; Protein; Experiments; Laboratory; Programs; Biomarkers; Modeling & Simulation]

This graphic created using Gephi: Bastian M., Heymann S., Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.

R&D decisions rely on high quality information to steer programs and the pipeline

• Knowledge Artifacts: “Target validation plan”
• Business Groups: “Early Development team”
• People: “John Smith”
• Information Types: “Clinical Trial Name”
• Organization Units: “Analytical Chemistry”
• Sources: “Electronic Lab Notebook”
• Business Processes: “Integrative assessment of liver toxicity”
• Activities: “Refine model”
• Roles: “Statistician”
• Decisions / Gateways: “Determine Patient Stratification Biomarkers”
• Capabilities: “Biomarker Validation”
• Feedback: Surveys, VoC

R&D Information Landscape: >27,000 entities and 70,000 relationships defined

The volume and variety of internal information and external scientific information continue to grow at an accelerating rate.

The ability to readily find, access, and use information is absolutely critical.

Improving R&D Decision Making

[Figure: information management maturity roadmap. Axis labels: Decision Making Quality; Information Management Maturity; Decision Quality; Adoption, Maturity. Timeline: Today; Next 2-3 Years; Beyond. Stages: Culture of Single Use; “Find & Access”; Organization-Wide Information Re-Use. Other labels: Fragmented tools, processes; Systematic categorization of data; IM Challenges Characterized; Information Flows Modeled; Vocabulary Management; Embedded Stewardship; Effective Search; Integrated Information Architecture]

As knowledge workers understand and embrace improved information management practices, better decision making can be enabled by better access to information.

Better Information Management → Better Decision Making: better analysis, more transparency and collaboration, better workflow management, faster decisions

Information Flows Modeled

Clinical

Development

Consumer Care

Research

Formulation

Safety

Regulatory

Manufacturing

Enterprise Business Analysis and IT Resource planning tends to focus on organizational domains separately

“Every system is perfectly designed to get exactly the results it gets.” --

“Google Street View” for information flows…

Merck

Analysts need a way to collaborate on mapping information flows from different domains without explicit coordination

http://www.dwalls.com/Nature/Nature-World-Travel/Aerial+View+of+Downtown+Boston

Semantic Information Flow Modeling (sIFM)…

…is a method of documenting and modeling the flow of information through an enterprise that allows both targeted and holistic analysis across the information continuum.

[Figure labels: Sales & Marketing; MCC; Regulatory; R&D; Manufacturing; Merck]

Discovering the Information Flow Modeling Ontology

Mind Map for Merck’s “Patients Like Me” collaboration, hand-drawn by Jyoti Shah. Used with permission from PatientsLikeMe ®

[Figure labels: Data Sources; Organizations; Business Processes; Decisions; KM Artifacts; Initiatives / Projects; Use Cases; People; Roles; Information Types; Capabilities; Business Groups; Activities]

We determined the types of things (entities) and the types of relationships that we were trying to understand, and modeled them using a common semantic framework
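A minimal sketch of what such a common framework could look like in code, assuming every fact is a (subject, predicate, object) statement checked against a small schema. The entity type names are taken from the slide; the predicate signatures (for example, that `uses` links a Role to a Data Source) are illustrative assumptions, not Merck's actual ontology.

```python
from dataclasses import dataclass

# Entity types named on the slide (a subset).
ENTITY_TYPES = {"Person", "Role", "BusinessGroup", "InformationType", "DataSource"}

# Hypothetical schema: predicate -> (allowed subject type, allowed object type).
SCHEMA = {
    "has-role": ("Person", "Role"),
    "member-of": ("Role", "BusinessGroup"),
    "uses": ("Role", "DataSource"),
    "flow": ("InformationType", "DataSource"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str

    def __post_init__(self):
        # Reject entities whose type is not part of the shared vocabulary.
        if self.type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.type}")


def is_valid(subject: Entity, predicate: str, obj: Entity) -> bool:
    """Return True if the statement conforms to the sketch schema."""
    return SCHEMA.get(predicate) == (subject.type, obj.type)


chemist = Entity("Medicinal Chemist", "Role")
eln = Entity("ELN", "DataSource")
structure = Entity("Compound Structure", "InformationType")

print(is_valid(chemist, "uses", eln))     # a Role may use a DataSource
print(is_valid(structure, "uses", eln))   # an InformationType may not
```

Constraining both the entity types and the relationship signatures is what lets statements from different analysts be combined later without reconciling vocabularies by hand.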

Collaboration without Coordination

The use of an information modeling ontology allows multiple informatics and business analysts to collaborate on the same model without explicit coordination

Analyst 1; Analyst 2; Analyst 3

Compound structure ELN; Medicinal Chemist uses ChemCart; Pharm Sci uses ELN

Program Biologist uses ELN; Compound Structure ChemCart; Active Pharmaceutical Ingredient ELN

Toxicologist uses ELN; Medicinal Chemist member-of Lead optimization team

Compound Structure ELN
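The idea can be sketched with plain sets of statements. This is an illustration only: the assignment of statements to analysts is a guess at the slide's layout, and `flow` is an assumed predicate for the pairs shown without a visible label.

```python
# Each analyst records statements as (subject, predicate, object) tuples
# using the shared vocabulary. Which analyst holds which statement is an
# assumption made for this example.
analyst_1 = {
    ("Compound Structure", "flow", "ELN"),
    ("Program Biologist", "uses", "ELN"),
    ("Toxicologist", "uses", "ELN"),
}
analyst_2 = {
    ("Medicinal Chemist", "uses", "ChemCart"),
    ("Compound Structure", "flow", "ChemCart"),
    ("Medicinal Chemist", "member-of", "Lead optimization team"),
}
analyst_3 = {
    ("Pharm Sci", "uses", "ELN"),
    ("Active Pharmaceutical Ingredient", "flow", "ELN"),
    ("Compound Structure", "flow", "ELN"),  # same fact analyst 1 recorded
}


def merge(*models):
    """Union the models; identical statements collapse into one."""
    merged = set()
    for model in models:
        merged |= model
    return merged


def neighbors(model, entity):
    """Statements in which the entity appears as subject or object."""
    return sorted(t for t in model if entity in (t[0], t[2]))


landscape = merge(analyst_1, analyst_2, analyst_3)
print(len(landscape))
for statement in neighbors(landscape, "ELN"):
    print(statement)
```

Because all three analysts name the same entity "ELN", their independently captured statements join on it automatically, and the duplicated fact is stored once.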

Leveraging the Information Landscape Knowledgebase to enable Information Management and Search

By encoding this knowledge in a searchable semantic knowledgebase, we can discover, on the fly, details about Merck’s information landscape that were previously difficult to uncover.

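[Figure: graph pattern with nodes Project, InformationTypes, DataSources, and KMArtifacts and edges labeled “includes”, “flow”, and “flow”; instantiated with the project Translational PK/PD Modeling and the unknowns ?InformationTypes, ?KMArtifacts, and ?DataSources]

What are the types of information and data sources associated with Translational PK/PD Modeling?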
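A question like this amounts to matching a small graph pattern against the stored statements. The sketch below assumes the pattern is project -includes-> information type -flow-> target; the edge directions and every statement in the toy data are made up for illustration, and it does not distinguish data sources from KM artifacts as the figure does.

```python
# Toy knowledgebase; all statements are hypothetical.
TRIPLES = [
    ("Translational PK/PD Modeling", "includes", "PK parameters"),
    ("Translational PK/PD Modeling", "includes", "Biomarker data"),
    ("PK parameters", "flow", "ELN"),
    ("Biomarker data", "flow", "Assay database"),
    ("PK parameters", "flow", "PK/PD modeling report"),
    ("Other project", "includes", "Sales data"),
    ("Sales data", "flow", "CRM"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is stored."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def landscape_for(project):
    """Follow project -includes-> information type -flow-> target."""
    result = {}
    for info_type in objects(project, "includes"):
        result[info_type] = objects(info_type, "flow")
    return result


answer = landscape_for("Translational PK/PD Modeling")
for info_type, targets in answer.items():
    print(info_type, "->", targets)
```

In a semantic knowledgebase the same pattern would typically be expressed as a query with variables in place of the unknowns, which is what the “?” prefixes in the figure suggest.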

Semantic Information Flow Modeling – Tools

[Figure: example diagram built with the Information Modeling Framework. Entity types shown: Role (with specializations); Person; Team; Committee; Governance Body; Organization Unit; Business Division; Enterprise; External Organization; Business Process; Activity; Decision; Start Event; End Event; Initiative; Capability; Use Case; Pain Point / Problem; Information Class; Information Type; Document; View; Dataset; Database Table; Database Field; Internal Data Source; External Data Source; Unstructured Document Repository. Relationship types shown: part-of; member-of; participates-in; flow; decides; includes; enables; includes-role; provides; has-role; performs; stewards; uses; hasContact; identified-by; owns]

sIFM starts with the work business analysts do to assess the current state, elicit requirements, pain points, and informatics opportunities

• Stakeholder analysis
• Interviews / Req. Elicitation
• Surveys / Diagnostics
• Review of existing artifacts
• Brainstorming / Concept mapping

Information-related entities and concepts are graphically represented using an ontological model with the help of the EA business and information modeling software

Anzo software provides semantic web structuring of the IFM model, and an intuitive interface to allow ad hoc querying, visualization and analytics
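To make “semantic web structuring” concrete, here is a minimal sketch that writes model statements as RDF triples in N-Triples syntax. It does not reflect how the model is actually loaded into Anzo, which this slide does not describe; the `http://example.org/ilk/` namespace and the sample statements are hypothetical.

```python
from urllib.parse import quote

BASE = "http://example.org/ilk/"  # hypothetical namespace, not Merck's


def iri(name: str) -> str:
    """Turn a model label into an IRI under the hypothetical namespace."""
    return f"<{BASE}{quote(name, safe='')}>"


def to_ntriples(statements):
    """Serialize (subject, predicate, object) labels as N-Triples lines."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in statements)


model = [
    ("Medicinal Chemist", "uses", "ELN"),
    ("Medicinal Chemist", "member-of", "Lead optimization team"),
]
print(to_ntriples(model))
```

Giving every entity and relationship a stable identifier is what allows ad hoc querying across statements that were captured in separate modeling sessions.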

Information Landscape Knowledgebase

Information Flow – Knowledge Gathering

Information Flow Modeling

Information Landscape Knowledgebase (ILK)

Information Landscape Knowledgebase – Entity Explorer for Ad Hoc Analysis

Joe Smith

Jane Doe

Information Landscape Knowledgebase – Data Sources

Drill down by faceting on any entity type
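Faceted drill-down can be sketched in a few lines: count records per facet value, then filter on a selected value and count again. The records and field names below are hypothetical and only illustrate the interaction; they are not taken from the ILK.

```python
from collections import Counter

# Hypothetical data source records; field names and values are illustrative.
SOURCES = [
    {"name": "ELN", "info_type": "Compound Structure", "org_unit": "Chemistry"},
    {"name": "ChemCart", "info_type": "Compound Structure", "org_unit": "Chemistry"},
    {"name": "Assay DB", "info_type": "Assays", "org_unit": "Pharmacology"},
    {"name": "Trial registry", "info_type": "Studies", "org_unit": "Clinical"},
]


def facet_counts(records, facet):
    """Count records per value of one facet, as a facet sidebar would show."""
    return Counter(r[facet] for r in records)


def drill_down(records, **selected):
    """Keep only records matching every selected facet value."""
    return [r for r in records if all(r[k] == v for k, v in selected.items())]


print(facet_counts(SOURCES, "org_unit"))
chem = drill_down(SOURCES, org_unit="Chemistry")
print([r["name"] for r in chem])
print(facet_counts(chem, "info_type"))
```

Each selection narrows the record set, and the remaining facets are recounted over what is left, so any entity type can serve as the next filter.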

Feedback/Pain Point Analysis

MRL Search Survey 2013

Example Pain Point Categories:
• Access provisioning
• Business Process/Workflow management
• Data Quality and Completeness
• Data Security
• Data Visualization/Analysis Capabilities
• Federated Search
• Knowledge Management and organization
• Platform / Database
• Tags / Keywords
• Training / Awareness

Summary – Merck’s Information Landscape Knowledgebase

• The volume and complexity of scientific information are growing at an accelerating rate

• It is critical that scientists can seamlessly find and access that information

• Traditional analysis makes it difficult to understand how information is actually stewarded across the organization, and perpetuates the establishment of siloed solutions

• Semantic Information Flow Modeling and the Information Landscape Knowledgebase allow analysts and architects to collaborate on a common representation of information flow across the organization to enable:
– A linked-data approach to analysis
– Identification of high-impact scientific information management solutions
– Stewardship of Merck’s knowledge assets

Acknowledgements

• Merck Research Laboratories – Information Technology
– Karen Conrad
– Charisma Eugenio
– John Koch
– Ellie Norris
– Jyoti Shah
– Kim Wilson

• Cambridge Semantics Inc.
– Ben Szekely
– Lee Feigenbaum

Merck’s Information Landscape Knowledgebase
Charisma Eugenio
June 10, 2014

Information Landscape Knowledgebase (ILK)

[Figure: Information Modeling Framework diagram, repeated from the “Semantic Information Flow Modeling – Tools” slide]

Anzo knowledgebase for Merck’s information landscape, including information-related concepts and entities captured via the Enterprise Semantic Information Flow Modeling (sIFM) initiative

Anzo Express is a complete spreadsheet data management solution. It enables users to link data from multiple Excel spreadsheets and relational databases together in real-time for data collection, collaboration, and reporting. Anzo Express also includes a state-of-the-art Web dashboard tool so that you can easily share your integrated Excel data with your colleagues.

• Anzo Ontology Editor
• Anzo Connect
• Anzo for Excel
• Anzo on the Web

Anzo Express

Anzo Ontology Editor

Information Flow Model (IFM)

IFM Oracle Database Schema

Anzo Connect

Information Landscape Knowledgebase (ILK)

ILK – Entity Explorer

ILK – Systems