Databases - Part B: Storage, Retrieval. What are we looking for in a...


Post on 22-Dec-2015


TRANSCRIPT

Page 1:

Databases - Part B

Storage

Retrieval

Page 2:

What are we looking for in a GOOD database?

• Large amount of data
• Numerous entries
• Well-defined fields
• Non-redundancy
• Reliable data (periodic updating)
• Informative links to other DBs
• Efficient and user-friendly associated tools (software) necessary for DB access/query, DB information insertion, and DB information deletion

Curated vs. non-curated DBs

Page 3:

Nucleotide & Protein Sequence DBs: ~20 Years of Data Accumulation

Repository DBs (archives) vs. topic-centered
First generation vs. advanced generations
Not curated vs. well curated
Partially annotated vs. fully annotated
More redundant vs. less redundant

Page 4:

First Generation Databases: EMBL/GenBank/DDBJ

Primary Sequence Repositories

A plastered cistern that does not lose a drop (highly redundant),

but that also does not process a drop (poorly annotated)

Page 5:

EMBL/GenBank/DDBJ

A sort of sequence museum, where sequences are preserved for eternity as they were originally determined, interpreted, and published by their authors (primary sequence repository).

The authors have full authority over the content of the entries they submit! (Editorial control of the content belongs to the authors.)

Redundancy, insufficient annotation.

Page 6:

Unexpected information you can find in these DBs:

Who is a friend of Fidel?

EMBL

How many years did he keep the cigar?

Page 7:

Advanced generations of nucleotide sequence databases

Non-redundant sequence-centric database: RefSeq
A comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products.

Gene-centric databases: Gene
All the sequence information relevant to a given gene is made accessible at once.

Genome-centric databases: Genome browsers
Information about gene sequence, relative position, strand orientation, biochemical functions…

Different entries

Single entry
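The move from different entries to a single entry can be pictured with a toy sketch. It assumes that "redundant" simply means identical sequence text, which is far cruder than the curation behind a resource such as RefSeq; the accessions and sequences are hypothetical.

```python
from collections import defaultdict


def collapse_redundant(entries):
    """Group (accession, sequence) pairs that share the same sequence.

    Sequences are compared case-insensitively. Returns a dict mapping each
    distinct sequence to the list of accessions that carry it.
    """
    groups = defaultdict(list)
    for accession, sequence in entries:
        groups[sequence.upper()].append(accession)
    return dict(groups)


# Hypothetical accessions and sequences, for illustration only.
entries = [
    ("ACC001", "ATGGCCATTG"),
    ("ACC002", "atggccattg"),  # same sequence, submitted separately
    ("ACC003", "ATGGCGATTG"),  # differs by one base, so it stays separate
]

nonredundant = collapse_redundant(entries)
for sequence, accessions in nonredundant.items():
    print(sequence, accessions)
```

Three submitted entries collapse into two distinct sequences, each remembering the accessions it came from.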

Page 8:

1. Think: phrase your scientific question.

2. Choose appropriate database.

3. Phrase your query: keywords, Boolean operators, fields, syntax.

4. Access additional entries discussing same or similar entities by links to additional databases (DBXref).

5. Think, evaluate. The computer is just a machine. You are (hopefully) a thinking organism.

Current tutorial

Preview/index

Preview/index, limits

MeSH terms

Previous and current tutorials

History
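Phrasing a query from keywords, fields, and Boolean operators can be sketched as below. This is a minimal illustration, assuming an Entrez-style syntax in which a field tag in square brackets follows the keyword; the helper names `field_term` and `combine` are hypothetical, and the field names should be checked against the database actually chosen.

```python
def field_term(keyword, field=None):
    """Quote a keyword and optionally restrict it to one field."""
    term = f'"{keyword}"'
    return f"{term}[{field}]" if field else term


def combine(operator, *terms):
    """Join terms with a Boolean operator and wrap the result in parentheses."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported Boolean operator: {operator}")
    return "(" + f" {operator} ".join(terms) + ")"


# Example question: insulin sequences from human or mouse.
query = combine(
    "AND",
    field_term("insulin", "Title"),
    combine(
        "OR",
        field_term("Homo sapiens", "Organism"),
        field_term("Mus musculus", "Organism"),
    ),
)
print(query)
```

The explicit parentheses make the grouping visible, so the result does not depend on how a given search engine orders AND, OR, and NOT.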

Page 9:

Evaluating Search Results

Search results vs. "scientific truth":

            Not found (-)     Found (+)
Related     False negative    True positive
Unrelated   True negative     False positive

Easy to detect / Harder to detect (?)
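The four outcomes can be counted with a short sketch, assuming the set of truly related entries is known (in practice it rarely is, which is why false negatives are hard to spot). The entry identifiers and function names are hypothetical.

```python
def evaluate_search(found, related, all_entries):
    """Count the four outcomes of a search against a known set of related entries."""
    found, related, all_entries = set(found), set(related), set(all_entries)
    return {
        "true_positive": len(found & related),       # related and found
        "false_positive": len(found - related),      # unrelated but found
        "false_negative": len(related - found),      # related but not found
        "true_negative": len(all_entries - found - related),
    }


def sensitivity(counts):
    """Fraction of the related entries that the search found."""
    total = counts["true_positive"] + counts["false_negative"]
    return counts["true_positive"] / total if total else 0.0


def precision(counts):
    """Fraction of the found entries that are actually related."""
    total = counts["true_positive"] + counts["false_positive"]
    return counts["true_positive"] / total if total else 0.0


# Hypothetical database of eight entries, for illustration only.
all_entries = {f"e{i}" for i in range(1, 9)}
related = {"e1", "e2", "e3", "e4"}
found = {"e1", "e2", "e3", "e5"}

counts = evaluate_search(found, related, all_entries)
print(counts, sensitivity(counts), precision(counts))
```

A broader query usually raises sensitivity at the cost of precision, and a narrower one does the opposite.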
