Databases, Part B: Storage and Retrieval

Post on 22-Dec-2015


Databases, Part B

Storage

Retrieval

What are we looking for in a GOOD database?

• Large amount of data: numerous entries, well-defined fields

• Non-redundancy

• Reliable data (periodic updating)

• Informative links to other DBs

• Efficient and user-friendly associated tools (software) necessary for DB access/query, DB information insertion, DB information deletion
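The criteria above (well-defined fields, non-redundancy, and tools for query, insertion and deletion) can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database. The table name, column names and accession values are made up for illustration and do not come from any real sequence database.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " accession TEXT PRIMARY KEY,"  # one row per accession: no duplicate entries
    " organism TEXT NOT NULL,"      # well-defined fields
    " sequence TEXT NOT NULL,"
    " updated TEXT NOT NULL)"       # date of the last update
)

# Insertion (made-up accessions and sequences).
rows = [
    ("X0001", "Homo sapiens", "ATGGCC", "2015-12-01"),
    ("X0002", "Mus musculus", "ATGGCA", "2015-12-10"),
]
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)

# A second entry under an existing accession is rejected by the primary key.
try:
    conn.execute(
        "INSERT INTO entries VALUES (?, ?, ?, ?)",
        ("X0001", "Homo sapiens", "ATGGCC", "2015-12-20"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Query by field.
human = conn.execute(
    "SELECT accession FROM entries WHERE organism = ?", ("Homo sapiens",)
).fetchall()

# Deletion.
conn.execute("DELETE FROM entries WHERE accession = ?", ("X0002",))
remaining = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]

print(duplicate_rejected, human, remaining)
```

Here the primary key only prevents two rows with the same accession. It does not detect the same sequence submitted under two different accessions.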

Curated vs. non-curated DBs

Repository DBs (archives) vs. topic centered

First generation vs. advanced generations

Not curated vs. well curated

Partially annotated vs. fully annotated

Nucleotide & Protein Sequence DBs: ~20 Years of Data Accumulation

More redundant vs. less redundant

Primary Sequence Repositories

A plastered cistern that never loses a drop (highly redundant)

but never processes a drop either (poorly annotated)

First Generation Databases: EMBL/GenBank/DDBJ

A sort of sequence museum, where sequences are preserved for eternity as they were originally determined, interpreted and published by their authors

(primary sequence repository)

The authors have full authority over the content of the entries they submit!

(editorial control of the content belongs to the authors)

Redundancy, insufficient annotation.

Unexpected information you can find in these dbs:

Who is Fidel's friend?

EMBL

For how many years did he keep the cigar?

Advanced generations of nucleotide sequence databases

Non-redundant sequence-centric database: RefSeq

A comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products.

Gene-centric databases: Gene

All the sequence information relevant to a given gene is made accessible at once.

Genome-centric databases: Genome browsers

Information about gene sequence, relative position, strand orientation, biochemical functions…
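The contrast between a redundant primary repository and a non-redundant set can be shown with a toy sketch. It assumes the simplest possible notion of redundancy, exact sequence identity after case normalization, and it is not how RefSeq or any curated resource is actually built. The accessions and sequences are made up.

```python
from collections import defaultdict

# Made-up records in the style of a primary repository: (accession, sequence).
submissions = [
    ("A001", "ATGGCCTTA"),
    ("A002", "atggcctta"),  # same sequence from another submitter, lower case
    ("A003", "ATGGCCTTG"),
    ("A004", "ATGGCCTTA"),
]


def collapse(records):
    """Group accessions by normalized sequence: one group per distinct sequence."""
    groups = defaultdict(list)
    for accession, sequence in records:
        groups[sequence.upper()].append(accession)
    return dict(groups)


non_redundant = collapse(submissions)
for sequence, accessions in non_redundant.items():
    print(sequence, accessions)
```

Four submissions collapse into two distinct sequences, and each group keeps the list of original accessions so that nothing from the archive is lost.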

Different entries

Single entry

1. Think: phrase your scientific question.

2. Choose appropriate database

3. Phrase your query: keywords, fields, Boolean operators, syntax

4. Access additional entries discussing same or similar entities by links to additional databases (DBXref)

5. Think, evaluate. The computer is just a machine. You are (hopefully) a thinking organism.
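How keywords, fields and Boolean operators combine in a query can be sketched on a tiny in-memory collection. Each fielded search returns a set of entry IDs: AND corresponds to set intersection, OR to union, and NOT to difference. The entries and the field names (`organism`, `keywords`) are hypothetical, and the sketch does not reproduce the query syntax of any real database.

```python
# Made-up entries; the field names are hypothetical.
entries = {
    1: {"organism": "Homo sapiens", "keywords": {"kinase", "membrane"}},
    2: {"organism": "Homo sapiens", "keywords": {"kinase", "nuclear"}},
    3: {"organism": "Mus musculus", "keywords": {"kinase", "membrane"}},
    4: {"organism": "Homo sapiens", "keywords": {"transporter", "membrane"}},
}


def by_organism(name):
    """IDs of entries whose organism field equals the given name."""
    return {eid for eid, entry in entries.items() if entry["organism"] == name}


def by_keyword(word):
    """IDs of entries whose keywords field contains the given word."""
    return {eid for eid, entry in entries.items() if word in entry["keywords"]}


human = by_organism("Homo sapiens")
kinase = by_keyword("kinase")
membrane = by_keyword("membrane")

and_hits = human & kinase               # human AND kinase
or_hits = kinase | membrane             # kinase OR membrane
not_hits = (human & kinase) - membrane  # (human AND kinase) NOT membrane

print(sorted(and_hits), sorted(or_hits), sorted(not_hits))
```

AND narrows the result, OR widens it, and NOT removes entries. This is why the choice of operator changes how many related entries are missed and how many unrelated ones are returned.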

Current tutorial

Preview/index

Preview/index, limits

MeSH terms

Previous and current tutorials

History

Evaluating Search Results

Search results compared with the “scientific truth”:

Related, found (+): true positive

Related, not found (-): false negative

Unrelated, found (+): false positive

Unrelated, not found (-): true negative

Easy to detect vs. harder to detect (?)
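The four outcomes can be computed with set operations once the set of truly related entries is known. In practice it usually is not, so the sketch below assumes a small made-up collection in which it is. Sensitivity and precision are standard derived measures added here for illustration. They do not appear in the slides.

```python
def evaluate(all_entries, related, found):
    """Split entries into the four outcome classes of a search."""
    all_entries, related, found = set(all_entries), set(related), set(found)
    return {
        "TP": related & found,                # related and found
        "FN": related - found,                # related but not found
        "FP": found - related,                # found but unrelated
        "TN": all_entries - related - found,  # unrelated and not found
    }


all_entries = range(1, 11)  # ten made-up entry IDs
related = {1, 2, 3, 4}      # the "scientific truth" (assumed known here)
found = {3, 4, 5}           # what the search returned

outcome = evaluate(all_entries, related, found)
counts = {name: len(ids) for name, ids in outcome.items()}

# Fraction of related entries that were found.
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])
# Fraction of found entries that are related.
precision = counts["TP"] / (counts["TP"] + counts["FP"])

print(counts, sensitivity, round(precision, 3))
```

False positives sit in the result list where the user can inspect them. False negatives never appear in the output at all.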