
Life Science Database Cross Search and Metadata

Date post: 06-May-2015
Upload: maori-ito
Description:
Life science databases are sometimes difficult to understand due to lack of information. I'd like to add metadata into databases and improve search results.
Transcript
Page 1: Life Science Database Cross Search and Metadata

Life Science Database Cross Search and Metadata

Maori Ito @ NIBIO

Page 2: Life Science Database Cross Search and Metadata

Database integration: collaboration among 4 ministries with NBDC

• Database Catalog

• Life Science Database Cross Search

• Life Science Database Archive

• Database Reconstructive Integration

Page 3: Life Science Database Cross Search and Metadata

Why Cross Search?

• Easy to use

• Familiar to users

• Appropriate for comparing various kinds of databases

Page 4: Life Science Database Cross Search and Metadata

Sagace

• Search for Biomedical Data & Resources in Japan

Page 5: Life Science Database Cross Search and Metadata

Bad, Skeptical Reputation of Search Results…

• Useless…

• Slow…

• What is the advantage?

Page 6: Life Science Database Cross Search and Metadata

What is the most important thing in cross search?

Page 7: Life Science Database Cross Search and Metadata

Simple Answers

• Speed and Accuracy

Page 8: Life Science Database Cross Search and Metadata

Mechanism of Search Engine

1. Crawling

2. Indexing

3. Query Processing

4. Scoring

Page 9: Life Science Database Cross Search and Metadata

Crawling

• Crawl databases and pages with a program

Program
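The crawling step can be illustrated with a minimal sketch. It is not Sagace's crawler: the `crawl` function, the `LinkParser` class and the in-memory `SITE` dictionary are all hypothetical, and `SITE.get` stands in for a real HTTP fetch so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. `fetch(url)` returns HTML text, or None on failure."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical three-page site used instead of real network access.
SITE = {
    "http://example.org/": '<a href="/entry/1">1</a> <a href="/entry/2">2</a>',
    "http://example.org/entry/1": "<p>Bacillus subtilis</p>",
    "http://example.org/entry/2": '<a href="/">home</a>',
}
pages = crawl("http://example.org/", SITE.get)
```

A real crawler would also need politeness delays and per-site access limits, which this sketch leaves out.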

Page 10: Life Science Database Cross Search and Metadata

Indexing

• Split data into conveniently sized pieces and store them on our own server

External Data

Internal Server
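As a rough illustration of the indexing step, the sketch below splits each crawled text into fixed-size word chunks and stores them in an inverted index (term to chunk ids). The function names, the chunk size and the two sample documents are assumptions made for the example, not details of the actual system.

```python
import re
from collections import defaultdict


def split_into_chunks(text, chunk_size=5):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def build_index(documents, chunk_size=5):
    """Return (chunks, index): the stored chunks and a term -> chunk-id map."""
    chunks = {}
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for n, chunk in enumerate(split_into_chunks(text, chunk_size)):
            chunk_id = (doc_id, n)
            chunks[chunk_id] = chunk  # the copy kept on the internal server
            for term in re.findall(r"[a-z0-9]+", chunk.lower()):
                index[term].add(chunk_id)
    return chunks, index


# Hypothetical crawled texts keyed by a made-up document id.
documents = {
    "db-a/1": "Disease Epithelial adenoma Species Mouse Keywords DNA sequence",
    "db-b/7": "Bacillus subtilis DNA sequence",
}
chunks, index = build_index(documents)
```

Looking up `index["dna"]` then returns the ids of every stored chunk that mentions the term, without touching the external databases again.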

Page 11: Life Science Database Cross Search and Metadata

Query Processing and Scoring
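Query processing and scoring can be sketched with plain TF-IDF, a standard textbook weighting. This is an assumption chosen for illustration; the slides do not say which scoring function the search system uses, and the sample documents are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, documents):
    """Rank documents by a simple TF-IDF sum over the distinct query terms."""
    doc_terms = {doc_id: Counter(tokenize(text))
                 for doc_id, text in documents.items()}
    n_docs = len(documents)
    results = []
    for doc_id, counts in doc_terms.items():
        total = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in doc_terms.values() if term in c)
            if df == 0:
                continue  # term appears nowhere in the collection
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
            total += counts[term] * idf
        if total > 0:
            results.append((doc_id, total))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical indexed texts.
documents = {
    "a": "mouse DNA sequence of mouse adenoma",
    "b": "human DNA sequence",
    "c": "protein structure",
}
ranked = score("Mouse DNA", documents)
```

Document "a" ranks first because it matches both query terms and contains the rarer term twice; "c" matches nothing and is dropped.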

Page 12: Life Science Database Cross Search and Metadata

NIBIO

MEDALS

JCGGDB

NBDC / DBCLS

AgriTogo

Collaborate by using P2P architecture

Under Contemplation

In the case of Hyper Estraier (search system)


Page 13: Life Science Database Cross Search and Metadata

Back to the simple answers for improvement

• Speed (Thanks to Johan-san, Mizuguchi-san and many collaborators)

1. Relax limits on access to DBCLS

(Use a little ingenuity in CSS and images)

• Accuracy

NIBIO

NBDC / DBCLS

Page 14: Life Science Database Cross Search and Metadata

How to improve accuracy?

• What is accuracy for life science database cross search?

• What is accuracy for a life science specialist?

Page 15: Life Science Database Cross Search and Metadata

• In general, developers emphasize search algorithms and scoring.

• However, general results and methods for cross search may not be suitable for life science specialists…?

• Data (index files) from life science databases are sometimes difficult to understand immediately.

• It’s hard to write a separate crawler program for each database and maintain it.

• (We have no extra … to make a proper search page like Entrez et al.…)

Page 16: Life Science Database Cross Search and Metadata

To Improve Accuracy

• Manually select databases

• Assign weights to crawled databases to improve the ranking system
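One way to picture database weighting is to multiply each hit's text score by a weight assigned to its source database before sorting. The sketch below assumes exactly that; the `rerank` function, the database names and the weight values are hypothetical.

```python
def rerank(hits, database_weights, default_weight=1.0):
    """Re-rank hits given as (entry_id, database, text_score) tuples.

    Each text score is multiplied by the weight of its source database;
    databases without an assigned weight use `default_weight`.
    """
    weighted = [
        (entry_id, database,
         text_score * database_weights.get(database, default_weight))
        for entry_id, database, text_score in hits
    ]
    return sorted(weighted, key=lambda hit: hit[2], reverse=True)


# Hypothetical hits and manually assigned weights.
hits = [("e1", "db_a", 2.0), ("e2", "db_b", 1.5), ("e3", "db_c", 1.8)]
database_weights = {"db_a": 0.5, "db_b": 2.0}
reranked = rerank(hits, database_weights)
```

Here the hit from the highly weighted `db_b` moves to the top even though its raw text score was the lowest.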

Page 17: Life Science Database Cross Search and Metadata

Metadata!

• One way to solve these problems

Difficult to understand data immediately

Page 18: Life Science Database Cross Search and Metadata

If metadata are added to data…

Disease: Epithelial adenoma
Species: Mouse
Keywords: DNA sequence
Last Modified: 2013-01-19

Metadata

Data

Page 19: Life Science Database Cross Search and Metadata

Easy to understand for users

• It can be a guide to improve user experience.

Image

Page 20: Life Science Database Cross Search and Metadata

Easy to understand for crawlers

Disease: Epithelial adenoma
Species: Mouse
Keywords: DNA sequence
Last Modified: 2013-01-19

Metadata

Page 21: Life Science Database Cross Search and Metadata

How to use it?

• Mark up data with microdata, like tags

Last Modified

Title

Image

ID

http://www.pdbj.org/emnavi/emnavi_detail.php?id=1556&lang=en

Page 22: Life Science Database Cross Search and Metadata

• Google, Yahoo! and Bing decided to use microdata to make search results more valuable.

• Some vocabularies have already been applied to search results.

• E.g.

Is it a practical suggestion?

Page 23: Life Science Database Cross Search and Metadata

Schema.org

• Provides a collection of schemas (HTML tags)

• Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. (quoted from schema.org)

• We proposed schema.org extensions for “BiologicalDatabaseEntry” and “BiologicalDatabase”.

• Schema.org proposals : http://www.w3.org/wiki/WebSchemas/SchemaDotOrgProposals

Page 24: Life Science Database Cross Search and Metadata

Properties for BiologicalDatabaseEntry

entryID additionalType dateCreated

isEntryof description dateModified

taxon image keywords

seeAlso url provider

reference alternativeHeadline

breadcrumb

name inLanguage

Page 26: Life Science Database Cross Search and Metadata

How to mark up?

<div itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
  ID <span itemprop="entryID">1556</span>
  Species <span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
    <span itemprop="name">Bacillus subtilis</span>
  </span>
  Deposition: <span itemprop="dateCreated">2008-09-08</span>
  Last update: <span itemprop="dateModified">2012-10-24</span>
</div>

Declaration

Specify the property and mark up with normal tags
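To show how a crawler might read such markup, here is a minimal sketch built on Python's standard `html.parser`. It is not a full microdata implementation (it ignores `itemref`, `itemid` and most attribute-valued properties), the class and function names are hypothetical, and `BiologicalDatabaseEntry` is the proposed extension from these slides rather than an established schema.org type.

```python
from html.parser import HTMLParser

VOID_TAGS = {"meta", "img", "link", "br", "hr", "input"}


class MicrodataParser(HTMLParser):
    """Collects itemprop values; a nested itemscope becomes a nested dict."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items found in the page
        self._stack = []   # open elements: [tag, prop, item, text_parts]
        self._scopes = []  # currently open items, innermost last

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if tag in VOID_TAGS:
            # Void elements carry their value in an attribute.
            if prop and self._scopes:
                value = attrs.get("content") or attrs.get("src") or attrs.get("href")
                self._scopes[-1][prop] = value
            return
        item = None
        if "itemscope" in attrs:
            item = {"@type": attrs.get("itemtype")}
            if prop and self._scopes:
                self._scopes[-1][prop] = item  # nested item as property value
            else:
                self.items.append(item)
            self._scopes.append(item)
            prop = None
        self._stack.append([tag, prop, item, []])

    def handle_data(self, data):
        for entry in self._stack:
            if entry[1]:
                entry[3].append(data)

    def handle_endtag(self, tag):
        if not any(entry[0] == tag for entry in self._stack):
            return  # stray end tag, nothing to close
        while True:
            open_tag, prop, item, text = self._stack.pop()
            if prop and self._scopes:
                self._scopes[-1][prop] = "".join(text).strip()
            if item is not None:
                self._scopes.pop()
            if open_tag == tag:
                break


def extract_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    return parser.items


ENTRY_TYPE = "http://schema.org/BiologicalDatabaseEntry"  # proposed type
SAMPLE = f"""
<div itemscope itemtype="{ENTRY_TYPE}">
  ID <span itemprop="entryID">1556</span>
  Species <span itemprop="taxon" itemscope itemtype="{ENTRY_TYPE}">
    <span itemprop="name">Bacillus subtilis</span>
  </span>
  Deposition: <span itemprop="dateCreated">2008-09-08</span>
  Last update: <span itemprop="dateModified">2012-10-24</span>
</div>
"""
items = extract_microdata(SAMPLE)
```

The resulting dictionary gives the crawler named fields (`entryID`, `taxon`, `dateCreated`, `dateModified`) instead of an undifferentiated block of page text.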

Page 27: Life Science Database Cross Search and Metadata

And then

• Crawl these microdata

• Reflect them in search results

At Present

Within the fiscal year (Preparation to reflect)

Image

Page 28: Life Science Database Cross Search and Metadata

Ask for your help

• If this approach proves effective, there may be chances for it to be reflected in major search engines.

• Please mark up your own site or database and give me feedback.

• If you have any suggestions or comments, please let me know.

Page 29: Life Science Database Cross Search and Metadata

Future Perspective

• Continue to focus on accuracy

• Microdata

– Discuss with many scientists and finalize the proposal for the schema.org extension

– Increase the number of databases

– Make support tools to mark up microdata

• Add appropriate data from high-quality databases

Page 30: Life Science Database Cross Search and Metadata

Thank you for listening!

