Post on 05-Jan-2016


Ontology of imaging datasets as a prerequisite for ontologies of imaging biomarkers

Bernard Gibaud

MediCIS, LTSI, U1099 Inserm, Faculté de médecine, Rennes

bernard.gibaud@univ-rennes1.fr

Ontology and Imaging Informatics Workshop, 23-25 June 2014, Amherst (NY)


Acknowledgements

• Former partners of the NeuroLOG project (supported by ANR)

• CrEDIBLE project (CNRS initiative for Big Data in science), and my colleagues from this project

• Former colleagues of the DICOM WG6 and WG23, especially David Clunie and Larry Tarbox

Gilles Kassel (Amiens), Michel Dojat (Grenoble), Bénédicte Batrancourt (Paris), Lynda Temal (Paris), Johan Montagnat and Alban Gaignard (Sophia-Antipolis)


Overview

• Introduction (scope and motivations)

• Part 1. Modeling datasets

• Part 2. Modeling dataset-related actions

• Part 3. Modeling imaging biomarkers

• Conclusion


Introduction

Scope and motivations


Imaging biomarkers

• Definition of biomarkers (Atkinson 2001)*
– « characteristics that are objectively measured and evaluated as indicators of
• normal biological processes,
• pathological processes,
• pharmaceutical responses to a therapeutic intervention »
• Definition of imaging biomarkers
– Derived from medical images

* Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther. 2001 Mar;69(3):89-95.


Imaging biomarkers

• Of critical importance in research
– Focused clinical research (e.g. controlled clinical trials)
– Translational research
• Link/correlate results obtained in various domains
• Need to share them at a broad scale, hence federated imaging biobanks (incl. imaging biomarkers)
• Of critical importance in (future) care delivery
– Involved in decision criteria (with other biomarkers)
• Diagnosis
• Therapy (prognosis)
– Key aspect of a structured EHR / tasks planning


General framework

Reality (Human subject, Animal subject, Specimen, etc.)
→ Acquisition →
Images (MR image, CT image, PET image, etc.)
→ Processing →
Imaging biomarkers (Volume of anatomical structure, Fractal dimension, Mean reg. blood volume, Lesion load (MS), etc.)
→ Decision →
Facts, Plans, etc. (Diagnosis of AD, Diagnosis of MS, Resp. to treatment, etc.)


Importance of context

Scientific question to be answered, or Clinical question
Set of required imaging biomarkers
Decision
Detailed imaging protocol
Processing, Acquisition
Detailed subject/specimen preparation


Need for standards

• Standard vocabulary used as metadata to consistently refer to:
– Imaged objects and phenomena (in reality)
– Image acquisition and image processing artifacts (devices, software)
– Images and any relevant datasets resulting from image processing
– Imaging biomarkers
– Context and motivation of acquisition / processing of images
• Standard formats for images
– e.g. DICOM, NIfTI, TIFF, GIF, JPEG


What kind of vocabulary?

• What metadata structure?
– Simple hierarchy of related terms: example *
– Complex hierarchies of data items: example DICOM
• DICOM Part 3: Information Object Definition / Module / Data Element (1260 pages)
• DICOM Part 16: Terminology + Structured Report templates (1034 pages)
– Formal vocabulary (ontology)
• Set of related complementary markers: Ontology versus data model?
• What method of development?
– Defined as a standard (by a standards development body, e.g. DICOM)
– or freely extendable by users? Example *

* Plant et al. New concepts for building vocabulary for cell image ontologies. BMC Bioinformatics 2011, 12:487.


Part 1. Modeling of datasets

Experience from the NeuroBase and NeuroLOG projects


Goals of the NeuroLOG project (mid-2007 to end-2010)

• To set up a federated system, allowing the sharing and re-use of:
− Neuroimaging data (images and related technical, demographic and medical metadata)
− Processing tools published by cooperating partners
• Ontology modeled according to OntoSpec methodology *
– Based on DOLCE
– OntoSpec semi-formal document

* Kassel 2005


Example of OntoSpec representation


OntoNeuroLOG ontology

• Three modules in OWL (available on BioPortal since 09/2013)
– Dataset processing (ONL-DP)
• Includes an ontology of Datasets
– MR dataset acquisition (ONL-MR-DA)
– Mental state assessment (ONL-MSA)
• OntoSpec documents online as well


Ontology of datasets: scope

• Primarily focusing on (but not limited to) clinical imaging modalities (CT, MRI, etc.)

• Also considers « processed images », e.g. results of segmentation, registration, diffusion tensors, fiber tracks, etc.


Ontology of datasets: approach

• Based on DOLCE and a number of core ontologies
– Language, IEC (Inscription, Expression & Conceptualization), D&M (Discourse & Message), etc.
• Considers Dataset as a Proposition (i.e. document content)
– is expressed by an Expression (i.e. representation in a particular format)
– is physically realized by an Inscription (e.g. one or more files)
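The three levels above can be illustrated with a minimal Python sketch. It is not the OntoNeuroLOG model itself: the class and attribute names (`expressed_by`, `realized_by`, `files_of`) and the file paths are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Inscription:
    """Physical realization, e.g. one file (the path is hypothetical)."""
    path: str


@dataclass(frozen=True)
class Expression:
    """Representation of the content in a particular format."""
    format_name: str
    realized_by: Tuple[Inscription, ...]


@dataclass(frozen=True)
class Dataset:
    """Proposition level: the document content, independent of any format."""
    label: str
    expressed_by: Tuple[Expression, ...]


def files_of(dataset: Dataset) -> List[str]:
    """Collect every file that physically realizes some expression of the dataset."""
    return [ins.path for expr in dataset.expressed_by for ins in expr.realized_by]


# One dataset (content), two expressions (formats), three inscriptions (files).
t1 = Dataset(
    label="T1-weighted MR dataset",
    expressed_by=(
        Expression("DICOM", (Inscription("t1/0001.dcm"), Inscription("t1/0002.dcm"))),
        Expression("NIfTI", (Inscription("t1.nii"),)),
    ),
)
print(files_of(t1))
```

The point of the separation is that converting a file from one format to another adds an Expression and its Inscriptions without creating a new Dataset.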


Ontology of datasets: major choices

• Fundamentally composed of two parts *:
– a Dataset set of values part (aggregate of atomic values)
– a Dataset metadata part
• Datasets may be categorized along five classification axes, based on
– Imaging modality (e.g. CT dataset, MR dataset, PET dataset)
– Some processing that generates them (e.g. Segmentation, Registration)
– What is being explored (e.g. Anatomical dataset, Functional dataset, Metabolic dataset)
– Number of subjects that it characterizes (e.g. Single subject dataset, Multiple subjects dataset)
– Reconstructed dataset or Non-reconstructed dataset (i.e. raw data)
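As a rough illustration of the five axes, the sketch below tags one dataset along each of them and lists the categories it would fall under. The Python names and the derived label "Segmentation dataset" are assumptions for this example, not class names taken from the ontology.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Modality(Enum):
    CT = "CT dataset"
    MR = "MR dataset"
    PET = "PET dataset"


class Explored(Enum):
    ANATOMICAL = "Anatomical dataset"
    FUNCTIONAL = "Functional dataset"
    METABOLIC = "Metabolic dataset"


class SubjectCount(Enum):
    SINGLE = "Single subject dataset"
    MULTIPLE = "Multiple subjects dataset"


@dataclass(frozen=True)
class DatasetCategory:
    modality: Optional[Modality]          # axis 1
    generating_processing: Optional[str]  # axis 2, e.g. "Segmentation"
    explored: Optional[Explored]          # axis 3
    subject_count: SubjectCount           # axis 4
    reconstructed: bool                   # axis 5 (False means raw data)

    def labels(self) -> List[str]:
        """Return one category label per axis that applies to this dataset."""
        out = []
        if self.modality is not None:
            out.append(self.modality.value)
        if self.generating_processing is not None:
            out.append(f"{self.generating_processing} dataset")  # assumed naming
        if self.explored is not None:
            out.append(self.explored.value)
        out.append(self.subject_count.value)
        out.append("Reconstructed dataset" if self.reconstructed
                   else "Non-reconstructed dataset")
        return out


segmented_mr = DatasetCategory(Modality.MR, "Segmentation", Explored.ANATOMICAL,
                               SubjectCount.SINGLE, True)
print(segmented_mr.labels())
```

Because the axes are independent, a dataset belongs to several categories at once, which is what multiple classification in the ontology expresses.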

* Temal et al., JBI 2008

Examples

Taxonomy of MR dataset

Taxonomy of Parameter quantification dataset


Dataset: unity criterion

• What determines the content of a dataset?
Example 1: images, i.e. sets of elementary data values (pixels / voxels)
– ‘Basic 2D image’ (‘frame’ in DICOM jargon)
– or ‘Set of 2D images’ (‘stack’ in DICOM jargon)
– or ‘All images acquired in a single acquisition’ (‘multi-frame image’ in DICOM jargon)


Dataset: unity criterion

• What determines the content of a dataset?
Example 2: tractographic data
– A ‘particular tract’ connecting 2 voxels
– or ‘All tracts extracted by a tractographic algorithm’


Dataset: unity criterion


• Our (pragmatic) choice
– Data values obtained during a single acquisition / processing
– And pertaining to a single subject or group of subjects
• Why?
– Strongly related to provenance
– Facilitates datasets’ reuse
• Subject-oriented retrieval
• Single entry of image processing tools
• But …
– May require further description of dataset structure
– e.g. MR segmentation image with multiple ROIs

Datasets: identity criterion

• How to distinguish two dataset instances?
• Our (pragmatic) choice
– Based on creation context (rather than actual data values)
– Consequences
• Two acquisitions always result in distinct datasets
• Datasets are immutable: any dataset processing results in a new dataset
• In practice, how to identify, re-identify datasets?
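The two consequences can be made concrete with a small sketch. It assumes, purely for illustration, that the creation context is summarized by a random identifier attached to each acquisition or processing action; the names `acquire`, `rescale` and `same_dataset_as` are hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)  # frozen: a dataset cannot be modified once created
class Dataset:
    values: Tuple[float, ...]
    creation_context: str  # stands for the acquisition / processing action

    def same_dataset_as(self, other: "Dataset") -> bool:
        """Identity depends on the creation context, not on the data values."""
        return self.creation_context == other.creation_context


def acquire(values: Iterable[float]) -> Dataset:
    """Each acquisition gets a fresh context, hence yields a distinct dataset."""
    return Dataset(tuple(values), f"acquisition:{uuid.uuid4()}")


def rescale(source: Dataset, factor: float) -> Dataset:
    """Processing never edits its input; it returns a new dataset."""
    scaled = tuple(v * factor for v in source.values)
    return Dataset(scaled, f"processing:{uuid.uuid4()}")


first = acquire([1.0, 2.0])
second = acquire([1.0, 2.0])   # identical values, different acquisition
derived = rescale(first, 2.0)

print(first.values == second.values, first.same_dataset_as(second))
```

This does not answer the open question on the slide: a random identifier works inside one system, but re-identifying the same dataset across federated repositories needs an agreed identification scheme.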


Ontology of datasets: still open issues

How do datasets’ values relate to each other?

– Case 1: Multidimensional matrix: a 2D or 3D map of a given parameter (spatio-temporal map)

Modeling with the notion of field *
• The same measurement is made at every point of a sampling grid
• Not limited to scalars, can also concern vectors and tensors

First try made in the context of DICOM WG23 **
Needs to be modeled / revisited as a real ontology
• Measured qualities, e.g. MR intensity signal, proton density, density in HU
• Scales of measurement
• Sampling grids
• Measurement of time-dependent phenomena

* W. Kuhn, Core concepts of spatial information for transdisciplinary research, IJGIS 2012
** Abstract multidimensional image model, DICOM Part 19 (WG23), 2009


Ontology of datasets: still open issues

How do datasets’ values relate to each other?

– Case 2: Network datasets: e.g. tractography, vascular tree

Needs to model the semantics of nodes and links

– Case 3: Combination of 4 x 4 matrices: e.g. result of image registration

– Case 4: Meshes (e.g. surface of objects)
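To make Case 3 concrete: a registration result is commonly stored as one or more 4 x 4 matrices in homogeneous coordinates, and "combination" then means matrix product. The sketch below assumes the column-vector convention (the rightmost matrix is applied first); the helper names and the numeric values are illustrative only.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul4(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 4 x 4 matrices; the result applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx: float, ty: float, tz: float) -> Matrix:
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def scaling(sx: float, sy: float, sz: float) -> Matrix:
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(m: Matrix, point: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Map a 3D point through a homogeneous transform."""
    h = [point[0], point[1], point[2], 1.0]
    out = [sum(m[i][k] * h[k] for k in range(4)) for i in range(4)]
    return (out[0], out[1], out[2])


# Scale by 2 along each axis, then shift by 10 along x.
combined = matmul4(translation(10.0, 0.0, 0.0), scaling(2.0, 2.0, 2.0))
print(apply(combined, (1.0, 1.0, 1.0)))
```

The numbers alone do not say which coordinate systems the matrix maps between; that is precisely the semantics a dataset ontology would need to capture.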


Ontology of datasets: still open issues

Regions of interest *
– ROI (basic taxonomy based on shape)
• To denote a subset of the pixels / voxels within a Dataset
– ROI annotations
• To relate a ROI to an object in reality
• To associate a measurement referring to this object (via references to a quality and a quale)
– ROI annotation collection
• Several annotations made by the same agent in the same action

* Temal et al., JBI 2008
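The ROI notions above can be sketched as plain data structures. This is a simplified reading, not the published model: the quale is reduced to a number with a unit, and all names, identifiers and values are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Voxel = Tuple[int, int, int]


@dataclass(frozen=True)
class ROI:
    """A subset of the voxels of one dataset."""
    dataset_id: str
    voxels: FrozenSet[Voxel]


@dataclass(frozen=True)
class ROIAnnotation:
    """Relates a ROI to an object in reality, optionally with a measurement."""
    roi: ROI
    refers_to: str  # object in reality, e.g. an anatomical structure
    quality: str    # measured quality, e.g. "volume"
    quale: float    # simplified here to a numeric value
    unit: str


@dataclass(frozen=True)
class ROIAnnotationCollection:
    """Several annotations made by the same agent in the same action."""
    agent: str
    action: str
    annotations: Tuple[ROIAnnotation, ...]


def volume_annotation(roi: ROI, structure: str, voxel_volume_mm3: float) -> ROIAnnotation:
    """Derive a volume measurement from the voxel count of the ROI."""
    return ROIAnnotation(roi, structure, "volume", len(roi.voxels) * voxel_volume_mm3, "mm3")


roi = ROI("dataset-001", frozenset({(10, 12, 7), (10, 13, 7), (11, 12, 7)}))
collection = ROIAnnotationCollection(
    agent="rater-A",
    action="manual-segmentation-42",
    annotations=(volume_annotation(roi, "hippocampus", 2.0),),
)
print(collection.annotations[0].quale, collection.annotations[0].unit)
```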


Part 2. Modeling of dataset-related actions

Experience from the NeuroLOG project


Ontology of dataset-related actions: scope

• Dataset acquisition
– Action involving a subject who physically participates in the action (as affected)
• Dataset processing
– Image processing actions that apply to Datasets and produce Datasets or Imaging biomarkers


Ontology of dataset acquisition: approach

• Based on DOLCE and a number of core ontologies
– Action, Participant role, etc.
• Specifies participating entities and output, i.e.
– has for instrument a Planned acquisition protocol
– has for instrument a Dataset acquisition equipment
– has for result a Dataset (i.e. produced as output dataset)
– Example MR dataset acquisition


Ontology of dataset processing: approach

• Based on DOLCE and a number of core ontologies
– Action, Participant role, etc.
• Specifies constraints on input / output
– has for data a Dataset (i.e. used as input dataset)
– has for result a Dataset (i.e. produced as output dataset)
– Example Diffusion tensor calculation
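The input / output constraints can be pictured as a check on each processing action. In this sketch the attribute names mirror the relations on the slide (`has_for_data`, `has_for_result`), while the constraint table and the dataset category names are assumptions invented for the example, not axioms copied from the ontology.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Dataset:
    identifier: str
    category: str


@dataclass(frozen=True)
class ProcessingAction:
    name: str
    has_for_data: Tuple[Dataset, ...]    # input datasets
    has_for_result: Tuple[Dataset, ...]  # output datasets


# Hypothetical table: action class -> (required input category, output category).
CONSTRAINTS = {
    "Diffusion tensor calculation": ("Diffusion-weighted MR dataset",
                                     "Diffusion tensor dataset"),
}


def violations(action: ProcessingAction) -> List[str]:
    """List every way the action breaks its declared input / output constraints."""
    if action.name not in CONSTRAINTS:
        return [f"no constraint declared for {action.name!r}"]
    required_in, required_out = CONSTRAINTS[action.name]
    problems = []
    if not action.has_for_data:
        problems.append("missing input dataset")
    if not action.has_for_result:
        problems.append("missing output dataset")
    for d in action.has_for_data:
        if d.category != required_in:
            problems.append(f"input {d.identifier}: {d.category}, expected {required_in}")
    for d in action.has_for_result:
        if d.category != required_out:
            problems.append(f"output {d.identifier}: {d.category}, expected {required_out}")
    return problems


dwi = Dataset("ds-1", "Diffusion-weighted MR dataset")
tensor = Dataset("ds-2", "Diffusion tensor dataset")
t1 = Dataset("ds-3", "T1-weighted MR dataset")

ok = ProcessingAction("Diffusion tensor calculation", (dwi,), (tensor,))
bad = ProcessingAction("Diffusion tensor calculation", (t1,), (tensor,))
print(violations(ok), violations(bad))
```

In an OWL setting the same constraints would be stated as class restrictions and checked by a reasoner rather than by hand-written code.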


Ontology of dataset processing: major choices

• Major classes of Dataset processing, modeled as conceptual actions
– Dataset arithmetical operation
– Dataset transformation (e.g. Fourier, wavelet)
– Filtering (e.g. convolution, mathematical morphology filtering)
– Registration
– Reconstruction
– Resampling
– Quantitative parameter estimation
– Segmentation
– Restoration
– Mesh generation
– Statistical analysis


Examples

Taxonomy of Quantitative parameter estimation

Taxonomy of Registration


Part 3. Modeling imaging biomarkers


Modeling imaging biomarkers

• Note: It is important to distinguish…
– Imaging biomarker as result of a measurement
– from its role in some medical decision (e.g. diagnosis, prognosis)
– In this talk, we focus on the first only
• Main aspects to address
– Measure
– Relation to reality
– Provenance
– Context


Imaging biomarkers: measure

• Is the result of some measurement process (manual or implemented in image processing software)

• Indirectly involves a physical object under study, and / or a process under study (dynamic process or longitudinal process) in which this object participates
– Note: This object is usually part of the image’s field of view
• Concerns a specific quality of this object, or of the process under study
– Note: This quality may be a complex human construct (e.g. model-based: fractal dimension, gyrification index)
• Values chosen from a predefined scale of measurement
– interval, ratio, ordinal, nominal (categorical)
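Why the scale of measurement matters can be shown with a short sketch: each scale permits a different set of comparisons, following the classical nominal / ordinal / interval / ratio hierarchy. The function names and the example values are assumptions for illustration.

```python
from enum import Enum
from typing import List


class Scale(Enum):
    NOMINAL = 1   # categories only
    ORDINAL = 2   # categories with an order
    INTERVAL = 3  # differences are meaningful
    RATIO = 4     # a true zero exists, so ratios are meaningful


# Each level adds one kind of meaningful operation to the previous ones.
_OPERATIONS = ["equality", "ordering", "difference", "ratio"]


def meaningful_operations(scale: Scale) -> List[str]:
    return _OPERATIONS[: scale.value]


def ratio(a: float, b: float, scale: Scale) -> float:
    """Compute a / b, refusing scales on which a ratio has no meaning."""
    if scale is not Scale.RATIO:
        raise ValueError(f"ratios are not meaningful on a {scale.name.lower()} scale")
    return a / b


# A volume is on a ratio scale: "twice as large" makes sense.
print(ratio(3.0, 1.5, Scale.RATIO))
# A grade is ordinal: it can be ranked, but not subtracted or divided.
print(meaningful_operations(Scale.ORDINAL))
```

Recording the scale alongside each biomarker value lets downstream software reject meaningless statistics, such as averaging nominal codes.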


Imaging biomarkers: relation to reality

• A Measurement of a quality borne by an object

• Or a Measurement of a temporal quality of the process under study

• (Simple) Examples
– Volume of hippocampus (in cm³)
– Speed of brain atrophy process – neuronal loss (in cm³/year)
– Mean Fractional Anisotropy over uncinate fasciculus


Imaging biomarkers: provenance

• Execution of a program implementing some conceptual action

• Resources used for this execution (user, date, platform)

• Input data (datasets, ROIs, imaging biomarkers)
• Input parameters (if any)

• Some open issues
– Complexity of image processing pipelines: need for description at several granularity levels
– W3C PROV-O, but which upper-level ontology?
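As an illustration of what such a provenance record could look like with W3C PROV-O, the sketch below builds plain subject / predicate / object triples for one program execution. The `prov:` terms are real PROV-O vocabulary; the `ex:` identifiers, the function name and the choice to model both the user and the program as associated agents are assumptions of this example.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]


def provenance_triples(execution: str, program: str, user: str, started_at: str,
                       inputs: Sequence[str], output: str) -> List[Triple]:
    """Describe one execution: who ran what, when, on which inputs, producing what."""
    triples: List[Triple] = [
        (execution, "rdf:type", "prov:Activity"),
        (execution, "prov:wasAssociatedWith", user),
        (execution, "prov:wasAssociatedWith", program),
        (execution, "prov:startedAtTime", started_at),
        (output, "rdf:type", "prov:Entity"),
        (output, "prov:wasGeneratedBy", execution),
    ]
    for item in inputs:
        triples.append((execution, "prov:used", item))
        triples.append((output, "prov:wasDerivedFrom", item))
    return triples


record = provenance_triples(
    execution="ex:run-17",
    program="ex:volume-estimation-tool",
    user="ex:rater-A",
    started_at="2014-06-23T10:00:00Z",
    inputs=["ex:t1-dataset", "ex:hippocampus-roi"],
    output="ex:hippocampus-volume",
)
for triple in record:
    print(triple)
```

This flat record covers a single step; describing a whole pipeline at several granularity levels, as raised in the open issues, would need activities nested or linked to one another.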


Imaging biomarkers: context

• Case 1: Relation to a research question
– Measurement process is part of the execution of a research protocol
– Context is provided by the research goal and protocol
• Case 2: Relation to a clinical question
– Measurement process is part of the actions performed to answer the clinical question (possibly detailed via a protocol, and/or a report template)
– Context is provided by the clinical question and associated clinical information


Conclusion: how to progress…

• Toward defining suitable ontologies …
– Select / improve / complete relevant domain ontologies, e.g. OBI / IAO / OCRe / PATO / FMA / RadLex / QIBO / OntoNeuroLOG
– Especially w.r.t. observation & measurement
– Collaborate with DICOM
– as well as the editors of important image processing software (FreeSurfer, FSL, SPM, 3D Slicer, etc.)


Conclusion: how to progress…

• Towards deployment and federation of imaging biobanks
1. Set up semantic resources that complement (rather than replace) existing image repositories
• Start with basic (image) dataset categories
• Continue with image processing actions and imaging biomarkers
2. Progressively evolve the image repositories to more closely follow the ontology (entities, relationships)
3. Equip image processing pipelines to natively produce semantic annotations


Thank you for your attention


Extra slides


Ontology: 3-level structure

• Application ontology (called OntoNeuroLOG)
• One Foundational ontology (DOLCE)
• Several Core ontologies
• Several Domain ontologies


Ontology: 3 representations

1. OntoSpec representation (Kassel, 2005)
– Semi-formal notation (rich semantics)
– Numerous axioms
2. OWL-Lite
– Edited with PROTÉGÉ
– Tailored to perform inferences with CORESE (search engine)
3. Federated relational schema
– Entities and relations are closely linked to concepts and relations of the ontology
