Bioimage database architecture and infrastructure
2005, Bio-ITR, UCSB
Overview
• Current system
  – Status of collection
  – Capabilities
  – Architecture
• Joint system under development
  – Capabilities
  – Architecture
• Future
  – Layered databases
  – Distributed databases
Current collection
• Retinal
  – Confocal microscope
  – EM (electron micrograph)
• Microtubule
  – Light
  – Atomic Force Microscopy
  – DIC/Nomarski

Type                  Current   Backlog   Rate/yr   Expected 4 yrs   Total size
Retinal EM                600    19,000       500           20,000        20 GB
Retinal confocal P      3,000       500     2,400           10,000        10 GB
Retinal confocal Z          0    14,000    12,000           10,000        65 GB
Microtubule light       3,000     2,500     2,500           13,000        12 GB
Microtubule AFM           200         0     1,200            5,000        15 GB
Microtubule DIC             0         0      2.7M              10M        10 TB
Current capabilities
• Import process
• Image and metadata storage
• Web access and browsing
• Limited access by content
Screenshots (browsing)
Screenshots (search)
Screenshot (metadata edit)
Screenshot (retina meta)
Current architecture
• Metadata
• Database implementation
• Front end implementation
• Image import API
• Software and hardware infrastructure
Metadata
• Standard (image types, parameters)
  – File, size, type, TIFF data, channel info, etc.
• Retinal
  – Visible cells
  – Antibody labeling
  – Experimental conditions
  – Researcher
• Microtubule
  – Track (hand captured)
• AFM
  – Machine parameters
• Metadata sources
  – Researcher
  – Annotated Excel files
  – Proprietary image formats
Database implementation
• MySQL
• First-generation schema
  – Image parameters
    • File, size, type, TIFF data, etc.
  – Metadata
    • Experimenter, condition, antibodies, tissue, notes, etc.
(Figure: imagemodel.svg)
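A first-generation schema along these lines could be sketched as follows. This is an illustration only: the table and column names are assumptions derived from the bullet items above, not the actual schema shown in imagemodel.svg, and SQLite (from Python's standard library) stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# Hypothetical two-table layout: one table for image parameters,
# one for experiment metadata that refers back to the image.
SCHEMA = """
CREATE TABLE image (
    id        INTEGER PRIMARY KEY,
    file      TEXT NOT NULL,   -- file name
    size      INTEGER,         -- size in bytes
    type      TEXT,            -- e.g. 'confocal', 'EM'
    tiff_data TEXT             -- selected TIFF tag values
);
CREATE TABLE metadata (
    image_id     INTEGER NOT NULL REFERENCES image(id),
    experimenter TEXT,
    condition    TEXT,
    antibodies   TEXT,
    tissue       TEXT,
    notes        TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up image record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO image (id, file, size, type) VALUES (?, ?, ?, ?)",
        (1, "r001.tif", 1048576, "confocal"),
    )
    conn.execute(
        "INSERT INTO metadata (image_id, experimenter, condition, tissue) "
        "VALUES (?, ?, ?, ?)",
        (1, "researcher A", "control", "retina"),
    )
    return conn


def search_by_metadata(conn, tissue):
    """Return the file names of all images whose metadata names this tissue."""
    rows = conn.execute(
        "SELECT image.file FROM image "
        "JOIN metadata ON metadata.image_id = image.id "
        "WHERE metadata.tissue = ? ORDER BY image.file",
        (tissue,),
    )
    return [row[0] for row in rows]
```

Keeping image parameters and experiment metadata in separate tables mirrors the two groups of fields on the slide; a search by metadata then becomes a join between them.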
Front end
• Apache, PHP, JavaScript
• Import proprietary image types
• Browse images
• Search by metadata
• Search by similarity
• Multi-user and release protection
Image and metadata import
• Excel parser for metadata
• Image import library
  – An image format API and a C/C++ library were developed for the database and client applications.
  – Currently supported proprietary image formats:
    • Metamorph Stack
    • Fluoview TIFF, BioRad PIC
    • PSIA TIFF, Nanoscope
    • plus common formats: JPEG, TIFF, BMP, PNG…
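As a rough illustration of the first step an import library has to take, the sketch below identifies the common formats from their leading "magic" bytes. It is not the project's C/C++ library: the function name sniff_format is hypothetical, and the proprietary formats are not handled (TIFF-derived variants such as Fluoview TIFF or PSIA TIFF would only be reported as plain TIFF here and would need further inspection).

```python
# Leading bytes ("magic numbers") of the common formats listed above.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
]


def sniff_format(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return "unknown"
```

A caller would read the first few bytes of a file and pass them in, for example `sniff_format(open(path, "rb").read(8))`.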
Hardware and software infrastructure
• Hardware
  – Dell server with dual Intel Xeon CPUs at 2.4 GHz
  – 140 GB SCSI hard drive set up as RAID 1
  – Gigabit network switch
• Software
  – Linux (Fedora 2)
  – Apache web server with PHP, Perl and graphical modules
  – MySQL database server
Overview
• Current system
  – Status of collection
  – Capabilities
  – Architecture
• Joint system under development
  – Capabilities
  – Architecture
• Future
  – Layered databases
  – Distributed databases
Motivation
• Common schema between UCSB and CMU
• Support greater functionality
  – Analysis and interpretation tools
  – Ground truth
  – Semantics
  – Uncertainty
  – Complex features and distance metrics
    • MPEG-7 features
    • Other features
  – Querying and relevance feedback
Capabilities
• Image and metadata storage
• Web access and browsing
• Access and search by content
• Import/export
  – Streamlined XML import/export for external tools
• Schema extensions
  – Image5d, semantic, uncertainty, analysis
• Image processing modules and tools
Infrastructure – Interchange XML
A unified XML interchange format is being developed for loading data into and extracting data from the database, for interaction with external client applications, and for communication between databases.
(Diagram labels: DB, XML, external clients, image library, external DB, interchange, import/export, remote access, ground truth tools, image processing tools)
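To make the idea of an interchange document concrete, here is a minimal round-trip sketch. The element and attribute names (image, file, metadata, tag, name) are invented for illustration; the slides do not show the actual format under development.

```python
import xml.etree.ElementTree as ET


def image_to_xml(image_id, filename, metadata):
    """Serialize one image record to a small XML document (hypothetical layout)."""
    root = ET.Element("image", {"id": str(image_id)})
    ET.SubElement(root, "file").text = filename
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(metadata.items()):
        ET.SubElement(meta, "tag", {"name": key}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_image(text):
    """Parse a document produced by image_to_xml back into Python values."""
    root = ET.fromstring(text)
    metadata = {tag.get("name"): tag.text for tag in root.find("metadata")}
    return int(root.get("id")), root.findtext("file"), metadata
```

An external tool that can read and write such documents needs no direct database connection, which is the point of routing import, export and remote access through one format.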
Ground truth acquisition tools
The image processing and infrastructure teams are developing universal “ground truth” collection tools that retrieve data from the database and feed user-defined information back into it. The main communication vehicle is the XML interchange format.
At the current stage, stand-alone tools are being developed and tested; later they will be combined into a universal application that communicates directly with the database.
Image processing API
The API supports fast development of image processing tools by letting developers concentrate on problem solving. It provides simple access to multi-channel image and mask information, and it supports progress output, acquisition of user-defined parameters, and an automatically created filter preview.
Example of API usage: noise removal for Fluoview images
(Figure panels: input, noise, result)
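The shape of such a filter can be sketched as below. This is not the project's API: the function names are hypothetical, images are plain nested lists, and a 3x3 median filter is used as a stand-in because the slides do not say which noise-removal method was applied. It does show the three ingredients named above: per-channel access, an optional mask, and progress output.

```python
from statistics import median


def median_filter(channel, mask=None, progress=None):
    """3x3 median filter over one channel (a list of rows of numbers).

    Pixels where mask is False are copied unchanged; progress, if given,
    is called with the fraction of rows completed.
    """
    height, width = len(channel), len(channel[0])
    out = [row[:] for row in channel]
    for y in range(height):
        for x in range(width):
            if mask is not None and not mask[y][x]:
                continue
            # Neighborhood is clipped at the image border.
            window = [channel[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            out[y][x] = median(window)
        if progress is not None:
            progress((y + 1) / height)
    return out


def denoise(image, mask=None, progress=None):
    """Apply the filter to every channel of a multi-channel image."""
    return [median_filter(channel, mask, progress) for channel in image]
```

The filter author writes only the per-pixel logic; masking and progress reporting are passed in by the caller, which is the kind of division of labor the API description suggests.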
Semantic data modules
• Integration of current research in automatic image analysis:
  – Cell identification
  – Layer detection
  – Cell counting
  – Microtubule detection and tracking
  – Microtubule dynamicity and global characterization
Modeling uncertainty
• Uncertain identification/analysis
  – Simple probability (e.g., 0.8)
  – “Is this a rod bipolar cell?”
• Imprecise location/extent/count
  – 90% accuracy in cell count
  – Line segment (single or sequence), polygon
    • Identified by a sequence of points
    • Each point Gaussian
    • Store mean x, mean y, and standard deviation
  – Circle
    • Center Gaussian point, as above
    • Radius mean r and standard deviation
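The representation described above can be written down as a few small data types. This is a sketch under assumptions: the class and field names are invented, and a single standard deviation is shared by x and y because the slide lists only one per point.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class GaussianPoint:
    mean_x: float
    mean_y: float
    sigma: float  # standard deviation, shared by x and y (assumption)

    def within(self, x, y, k=2.0):
        """True if (x, y) lies within k standard deviations of the mean."""
        return hypot(x - self.mean_x, y - self.mean_y) <= k * self.sigma


@dataclass
class GaussianValue:
    mean: float
    sigma: float


@dataclass
class UncertainCircle:
    center: GaussianPoint
    radius: GaussianValue


@dataclass
class UncertainPolyline:
    """A line segment, a sequence of segments, or a polygon outline."""
    points: List[GaussianPoint]


@dataclass
class Identification:
    label: str          # e.g. "rod bipolar cell"
    probability: float  # simple probability, e.g. 0.8
```

Storing the mean and standard deviation instead of a single coordinate lets a query ask whether two annotations plausibly refer to the same location, as the within helper illustrates.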
Schema
• Image5d
• Analysis and interpretation tools
  – Quantitative data generation
  – Semantic labeling
    • Experimental description
    • Shape and geometry
    • Domain knowledge
  – Ground truth
  – Semantic objects
• Uncertainty
• Features and distance metrics
  – MPEG-7 features
  – Other features
• Querying and relevance feedback
Schema (image5d)
Image: #id, description, slide_id, slide_pos, imageprotocol_id, permission_id
Plane: #id, image_id, imagedata_id, channel_id, timept_id, zlevel, mask_id, parent_id
PlaneT: image_id, timept_id, plane_id
PlaneC: image_id, channel_id, plane_id
PlaneZ: image_id, zlevel, plane_id
PlaneTC: image_id, timept_id, plane_id, channel_id
PlaneCZ: image_id, channel_id, plane_id, zlevel: int
PlaneZT: image_id, zlevel, plane_id, timept_id
PlaneTCZ: image_id, timept_id, plane_id, channel_id, zlevel
(Diagram: one-to-many (1 to *) associations run from id to image_id; further labels in the figure: Microtubule Images, Confocal Images, EM Z series)

5d images
• Image is a set of bit-planes
• Group planes by which dimensions vary
• Permits
  – Multiple formats
  – Caching
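One way to read "group planes by which dimensions vary" is sketched below: look at which of time point, channel and z level take more than one value across an image's planes, and pick the grouping table named after those dimensions. The selection rule is an interpretation of the slide, not a documented algorithm; only the table and column names come from the schema.

```python
# Grouping table for each set of varying dimensions (names from the schema).
GROUP_TABLES = {
    frozenset(): "Plane",
    frozenset("T"): "PlaneT",
    frozenset("C"): "PlaneC",
    frozenset("Z"): "PlaneZ",
    frozenset("TC"): "PlaneTC",
    frozenset("CZ"): "PlaneCZ",
    frozenset("ZT"): "PlaneZT",
    frozenset("TCZ"): "PlaneTCZ",
}

# Column that carries each dimension in the Plane table.
DIMENSIONS = {"T": "timept_id", "C": "channel_id", "Z": "zlevel"}


def grouping_table(planes):
    """planes: list of dicts with keys timept_id, channel_id and zlevel."""
    varying = frozenset(
        dim for dim, column in DIMENSIONS.items()
        if len({plane[column] for plane in planes}) > 1
    )
    return GROUP_TABLES[varying]
```

Under this reading a confocal z-stack with one channel and one time point lands in PlaneZ, while a multi-channel time series lands in PlaneTC.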
Schema (semantic objects)
Target: target_id *, ...
Antibody_labeling: target_id *, cell_part_id *
Cell_and_cell_part: cell_part_id #, name, is_kind_of *, is_part_of *
Semantic_object: sem_obj_id #, cell_part_id *, shape: enum {round, line, polygon}, confidence, source_id *
Cell_part_location: cell_part_id *, layer_id *, expt_cond
Layer_order: layer_id #, layer_order, name
Layer: plane_id #*, layer_id *, confidence, source_id *
Layer_thickness: region_id #*, thickness: gaussian, source_id *
Layer_shape: region_id #*, point_id #, x: gaussian, y: gaussian
Semantic_round_object: sem_obj_id #*, center_x: gaussian, center_y: gaussian, radius: gaussian
Semantic_line_object: sem_obj_id #*, start_x: gaussian, start_y: gaussian, end_x: gaussian, end_y: gaussian
Semantic_polygon_point: sem_obj_id #*, point_id #, x: gaussian, y: gaussian
(Diagram: associations carry multiplicities 1, * and 0..1; is_kind_of and is_part_of are 1-to-* self-references on Cell_and_cell_part; three regions of the figure are labeled "domain". The individual association endpoints are not recoverable from the extracted text.)

• Capture semantics
• Capture uncertainty
  – Type of object: confidence
  – Position of object: Gaussian
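The is_kind_of and is_part_of self-references on Cell_and_cell_part form a small hierarchy that queries can walk. The sketch below follows is_kind_of links upward; the class name, the helper is_a and the example rows are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CellPart:
    cell_part_id: int
    name: str
    is_kind_of: Optional[int] = None  # id of the more general entry
    is_part_of: Optional[int] = None  # id of the containing entry


def is_a(parts: Dict[int, CellPart], part_id: int, ancestor_id: int) -> bool:
    """Follow is_kind_of links upward from part_id looking for ancestor_id."""
    seen = set()
    current = part_id
    while current is not None and current not in seen:
        if current == ancestor_id:
            return True
        seen.add(current)  # guards against accidental cycles in the data
        current = parts[current].is_kind_of
    return False


# Made-up example rows.
PARTS = {
    1: CellPart(1, "cell"),
    2: CellPart(2, "bipolar cell", is_kind_of=1),
    3: CellPart(3, "rod bipolar cell", is_kind_of=2),
    4: CellPart(4, "axon terminal", is_part_of=3),
}
```

With such a hierarchy, a search for "bipolar cell" can also return semantic objects labeled with the more specific "rod bipolar cell".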
Schema (analysis and features)
FeatureDescriptor: id, description, code_ptr, permission_id
Feature_result: id, image_id, result, timestamp
  (result: type (double, vector); name: string)
FeatureInputType: feature_id, table_name
FeatureOutputType: feature_id, table_name
CellCount_result: id, image_id, result: numeric, timestamp
CellIdentifier_result: id, image_id, result: cell_location, timestamp

Example rows
FeatureDescriptor:  1  Cell Counter
FeatureInputType:   1  PlaneC
FeatureOutputType:  1  CellCount_result
CellCount_result (id, image_id, result, time):  1  1  103  1

• Capture provenance
• Support type checking
• Support feature substitution
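Recording each feature's input and output table is what makes type checking and substitution possible. A minimal sketch, assuming one input and one output table per feature; the Feature class, both helper functions and the second counter are hypothetical, while the Cell Counter row mirrors the example on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    feature_id: int
    description: str
    input_table: str   # FeatureInputType.table_name
    output_table: str  # FeatureOutputType.table_name


def accepts(feature: Feature, table_name: str) -> bool:
    """Type check: can this feature run on rows of the given table?"""
    return feature.input_table == table_name


def substitutable(a: Feature, b: Feature) -> bool:
    """Two features are interchangeable if input and output types match."""
    return (a.input_table, a.output_table) == (b.input_table, b.output_table)


CELL_COUNTER = Feature(1, "Cell Counter", "PlaneC", "CellCount_result")
# Made-up second implementation with the same signature.
ALT_COUNTER = Feature(2, "Alternative cell counter", "PlaneC", "CellCount_result")
```

Because every result row also stores the image and a timestamp, the feature that produced a value can be traced, which is the provenance aspect listed above.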
Hardware and software components
• Hardware requirements
  – Same as original system
• Software
  – PostgreSQL back end
  – JSP / JSF front end
    • Migrate current PHP/JavaScript code into components
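Architecture
(Diagram; labels recovered from the figure, grouped as they most plausibly belong together:)
• Web page: HTML, XML
• UI generation: dynamic JSF components (View, Menu, Table)
• Semantic interface: programmable Image API and Model API (Image, Cell)
• DB storage: object-relational (PostgreSQL)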
Overview
• Current system
  – Status of collection
  – Capabilities
  – Architecture
• Joint system under development
  – Capabilities
  – Architecture
• Future
  – Layered databases
  – Integration with other databases
    • BIRN
    • OME metadata and schema exchange
Layered database
• Overlay model (interpretation) on image (raw) data
• Multiple interpretations of data
• URI references between databases
• Pro: logical distinction, multiple interpretations, flexible implementation
(Diagram labels: Raw image; Metadata; Semantic object interpretations 1, 2 and 3; Complex biological models 1 and 2)
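The layering idea can be illustrated with interpretation records that point at raw images by URI instead of embedding them. Everything here is a sketch: the class, the function and the example.org URIs in the usage note are invented, and the slides do not specify a URI scheme.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Interpretation:
    name: str
    image_uri: str  # reference into the raw-image database


def by_image(interpretations):
    """Group interpretation names by the raw image they refer to."""
    grouped = defaultdict(list)
    for interp in interpretations:
        if not urlparse(interp.image_uri).scheme:
            raise ValueError(f"not a URI: {interp.image_uri!r}")
        grouped[interp.image_uri].append(interp.name)
    return dict(grouped)
```

For example, two records that both carry `http://images.example.org/image/42` are returned under the same key, so several interpretations can coexist for one raw image without the raw-image database knowing about any of them.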
BIRN (Biomedical Informatics Research Network)
• Goals:
  – Link multiple databases with different schemas, maintained at different research institutions
    • 19 universities, 26 research groups
• Current collection
  – Three test beds centered around brain imaging of human neurological disorders and associated animal models:
    • Functional BIRN
    • Morphometry BIRN
    • Mouse BIRN
Integration with BIRN
• Databases at UCSB/CMU centers can be integrated into the BIRN federation
• UCSB/CMU infrastructure supports
  – Extensive metadata for images
  – Standard XML interchange format for 5d images
  – Computational tools to refine data
    • Web-based visualization and analysis tools
• We need to:
  – Translate the UCSB/CMU schema to F-logic (knowledge-based mediation)
  – Link the UCSB/CMU dataset to the UMLS (Unified Medical Language System) ontology
  – Reference a common spatial framework
    • Standard atlas coordinate system, e.g., SMART Atlas
OME
• Open Microscopy Environment
  – A set of software that interacts with a database to manage images, image metadata, image analysis and analysis results
• Designed to perform as a local system
• Integration with OME
  – Adapt OME XML image interchange mechanism
  – Adapt the database-oriented modular analysis approach of OME
Conclusion
• Built prototype and collected ~4000 images
  – Being used internally
• Concurrent work on second-generation system
  – Image loading
  – Integration of tools
  – New front end
Intro bio slide
• Retina
Images from webvision.med.utah.edu