Funded by:
Facilitating Standardization and Exchange of Array Design
ADF MAGE-ML Tool
Pierre Marguerite – Friday Seminar
EBI – Microarray Informatics Team
15 October 2004
15/10/2004 - Friday Seminar Pierre Marguerite
Microarray Informatics Team
ADF MAGE-ML Tool
• Application
  – Stand-alone
  – Platform independent
• Supports:
  – Simple/complex microarray layouts
  – Different microarray applications
    • gene_expression
    • snp_detection
    • comparative_genomic_hybridization
    • binding_site_identification
    • Others (minimal)
• Respects good practices
Conversion tool
MAGE-ML (MAGE-OM)
[Diagram: Description, Biosequence, Array, Array Design, DesignElement]
MAGE-ML (next)
ADF (previous)
Array Design File
Parts: adh, adr, adc
Header
• Contacts
• Technical information
Array Design File
Parts: adr, adc
Feature/Reporter
• Reporters
• Features
Array Design File
Composite
• Characteristics
• Map to reporters
ADF version differences
• 3 parts (files) instead of 1
• As workbook or text files
• No Reporter Identifier item
• No Reporter Group [role] item
• New Chromosome item
• New Chromosome_band item
• New Species item
• 2 mandatory steps:
  – Validation
  – Conversion
Validation
• File format validation
• File content validation
  – Validation of controlled vocabulary
    • MGED ontology terms
    • Approved databases (tags, accession numbers)
  – Automatic curation (when possible)
Validation
• Two levels of checking:
  – Relaxed
  – Strict
• Two execution modes:
  – A complete mode
  – A step-by-step mode
• Error log: for correction
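A minimal sketch of how the checking levels, execution modes, and error log could fit together. It assumes (the slides do not say so) that relaxed checking reports minor problems as warnings while strict checking treats every problem as an error, and that step-by-step mode stops at the first failing step so the file can be corrected. `Issue` and `run_checks` are hypothetical names, not part of the tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Issue:
    message: str
    severe: bool  # assumption: non-severe issues are tolerated in relaxed mode


Check = Callable[[], List[Issue]]


def run_checks(steps: List[Tuple[str, Check]],
               strict: bool = False,
               step_by_step: bool = False) -> Tuple[bool, List[str]]:
    """Run the named checks in order and return (ok, error_log)."""
    log: List[str] = []
    ok = True
    for name, check in steps:
        failed = False
        for issue in check():
            # In strict mode every issue counts as an error.
            is_error = strict or issue.severe
            level = "ERROR" if is_error else "WARNING"
            log.append(f"{level} [{name}] {issue.message}")
            failed = failed or is_error
        ok = ok and not failed
        if failed and step_by_step:
            break  # stop here so the user can fix this step first
    return ok, log
```

In this sketch the complete mode simply runs every step and collects all messages in one log, which is useful when many corrections are expected at once.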
Checking lists (header)
• File/data structure checklist:
  1. Header file is a tab-delimited file
  2. Item names are correct or can be identified; if an item is not identified, it is skipped
  3. All mandatory items are present in the header
• Data/file content checklist:
  1. Correct field value format. Possible value types:
     • "Integer"
     • "Free Text"
     • "Controlled vocabulary"
     • "MGED ontology term"
     • "DatabaseEntry"
     • "Sequence"
     • "Species"
  2. Check for single or multiple values
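The structure checks for the header file can be sketched as follows. This is an illustration only: it assumes each header line has the form item name, tab, value(s), and the item names in `KNOWN_ITEMS` and `MANDATORY_ITEMS` are hypothetical placeholders, since the real ADF header vocabulary is not listed on the slide.

```python
import csv
import io

# Hypothetical item names, used only for illustration.
KNOWN_ITEMS = ["Array Design Name", "Version", "Provider", "Technology Type"]
MANDATORY_ITEMS = ["Array Design Name", "Provider"]


def _key(name):
    """Normalise an item name so near-matches can still be identified."""
    return " ".join(name.replace("_", " ").lower().split())


def check_header(text):
    """Return (items, skipped, errors) for a tab-delimited header file."""
    known = {_key(n): n for n in KNOWN_ITEMS}
    items, skipped, errors = {}, [], []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_no, row in enumerate(reader, 1):
        if not any(cell.strip() for cell in row):
            continue  # ignore blank lines
        if len(row) < 2:
            errors.append(f"line {line_no}: expected 'item<TAB>value'")
            continue
        canonical = known.get(_key(row[0]))
        if canonical is None:
            skipped.append(row[0].strip())  # unidentified items are skipped
            continue
        items[canonical] = [v.strip() for v in row[1:] if v.strip()]
    for name in MANDATORY_ITEMS:
        if name not in items:
            errors.append(f"mandatory item missing: {name}")
    return items, skipped, errors
```

Normalising names before lookup is one way to read "can be identified": `array_design_name` and `Array Design Name` resolve to the same item, while a name that matches nothing is skipped rather than reported as an error.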
Checking lists (feature reporter)
• Feature Reporter file
• File/data structure checklist:
  1. Header file is correct (structure and data)
  2. FeatureReporter file is a tab-delimited file
  3. Header item names are correct (unknown items are skipped)
  4. All mandatory items are present; item cardinalities and dependencies are correct
  5. Database tags are approved and database accession numbers are correct
  6. Item order is correct (optional; does not fail the check)
  7. Field dependencies are correct
• Data/file content checklist:
  1. FeatureReporter file structure must be correct
  2. Mandatory fields are present; field cardinalities and field value multiplicities must be correct
  3. Field values are in the required format:
     • Database tags are approved by ArrayExpress and are supplied in lower case and between square brackets
     • Database IDs are correct
     • Ontology terms are correct (MGED ontology)
     • Sequences are correct for the associated polymer type (DNA, RNA, protein)
     • Integer field values are correct
  4. Duplicate features must not exist
  5. Duplicate reporters (equal names) must have the same characteristics
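A few of the content checks above can be sketched in code. The approved tag set, the residue alphabets, and the row layout (a dict with `feature`, `reporter`, and further characteristic fields) are all assumptions made for illustration; the actual ArrayExpress tag list and ADF column names are not given on the slide.

```python
import re

# Assumed values for illustration only.
APPROVED_TAGS = {"embl", "genbank", "refseq"}
ALPHABETS = {
    "DNA": set("ACGTN"),
    "RNA": set("ACGUN"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
}


def check_db_tag(item_name):
    """Return an error message, or None if the trailing [tag] is acceptable."""
    match = re.search(r"\[([^\]]*)\]\s*$", item_name)
    if match is None:
        return "no database tag in square brackets"
    tag = match.group(1)
    if tag != tag.lower():
        return f"tag '{tag}' is not lower case"
    if tag not in APPROVED_TAGS:
        return f"tag '{tag}' is not approved"
    return None


def check_sequence(seq, polymer_type):
    """Return an error message if seq uses symbols outside the alphabet."""
    bad = sorted(set(seq.upper()) - ALPHABETS[polymer_type])
    return f"invalid {polymer_type} symbols: {''.join(bad)}" if bad else None


def check_rows(rows):
    """Flag duplicate features and same-named reporters that disagree."""
    errors = []
    seen_features = set()
    reporters = {}
    for i, row in enumerate(rows, 1):
        if row["feature"] in seen_features:
            errors.append(f"row {i}: duplicate feature {row['feature']}")
        seen_features.add(row["feature"])
        # Everything except the feature position describes the reporter.
        traits = {k: v for k, v in row.items() if k != "feature"}
        first = reporters.setdefault(row["reporter"], traits)
        if first != traits:
            errors.append(
                f"row {i}: reporter {row['reporter']} has conflicting characteristics")
    return errors
```

A reporter may legitimately appear on several features, so the sketch only complains when two rows with the same reporter name carry different characteristics.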
Checking lists (composite)
• CompositeSequence
• File/data structure checklist:
  1. Feature Reporter file must be correct (structure and data)
  2. CompositeSequence file is a tab-delimited file
  3. Header item names are correct (unknown items are skipped)
  4. All mandatory items are present; header item cardinalities and dependencies are correct
  5. Column order is correct (not mandatory)
• Data/file content checklist:
  1. Composite file structure must be correct
  2. All mandatory fields are present; field cardinalities are correct
  3. Field values are in the expected format; field multiplicity is correct (same as Feature/Reporter)
  4. Names in map are reporter or composite sequence names
  5. No duplicate CompositeSequences (same names)
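The last two composite checks (map targets and duplicate names) can be sketched as below. The input shape, a list of `(name, mapped_names)` pairs plus a set of reporter names, is an assumption for illustration and not the tool's actual data model.

```python
def check_composites(composites, reporter_names):
    """Check CompositeSequence names and their map entries.

    composites: list of (name, [mapped names]) pairs (assumed layout).
    reporter_names: set of reporter names from the Feature/Reporter file.
    """
    errors = []
    composite_names = {name for name, _ in composites}
    seen = set()
    for name, mapped in composites:
        if name in seen:
            errors.append(f"duplicate CompositeSequence: {name}")
        seen.add(name)
        for target in mapped:
            # A map entry may point at a reporter or at another composite.
            if target not in reporter_names and target not in composite_names:
                errors.append(
                    f"{name}: '{target}' is not a reporter or composite sequence name")
    return errors
```

This is also why the composite checklist starts by requiring a correct Feature/Reporter file: the reporter names must be known before the map can be resolved.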
Checking lists
• Header item names are correct
• All mandatory items are present
• All mandatory fields are present
• No duplicate features
• Duplicate reporters (equal names) must have the same characteristics
• No duplicate CompositeSequences (same names)
• Names in map are reporter or composite sequence names
MGED Ontology / DAML+OIL
Approved Databases
User modes
Implementation - technical choices
- MAGE-stk
- JAXB
- Configuration (default parameters)
Performance: 4000 features in ~10 minutes
Installer - IzPack http://www.izforge.com/izpack/
http://www.ebi.ac.uk/adf