MAGE: Revised submission against LSR RFP-007 "Gene Expression"
1
MAGE: Revised submission against LSR RFP-007 "Gene Expression"
Ugis Sarkans, EBI
Michael Miller, Rosetta Inpharmatics
2
Overview
• Acknowledgements
• Specification history and structure
• Fundamental Terms
• UML Packages
• Mapping from PIM to XML-PSM
• Schedule
• Resources
3
Acknowledgements
• Doug Bassett (Rosetta)
• Derek Bernhart (Affymetrix)
• Alvis Brazma (EBI)
• Steve Chervitz (Affymetrix)
• Francisco De La Vega (Applied Biosystems)
• Michael Dickson (NetGenics)
• David Frankel (IONA)
• Ken Griffiths (NetGenics)
• Scott Markel (NetGenics)
• Michael Miller (Rosetta)
• Dave Nellesen (Incyte)
• Alan Robinson (EBI)
• Ugis Sarkans (EBI)
• Barry Schwartz (Affymetrix)
• Martin Senger (EBI)
• Paul Spellman (Stanford)
• Jason Stewart (NCGR)
• Charles Troup (Agilent)
• participants of MAGE programming jamboree (hosted by Iobion) in Toronto, September 2001
4
Model-Driven Architecture
• Platform Independent Model (UML)
– most of the effort spent on this
• Platform Specific Model
– XML
• UML (refined from PIM):
– not used (Rational Rose profile for UML not that useful)
• DTD
– generated from PIM
– manual modifications
5
History of the submittal
• lifesci/01-06-02 - an interim draft before the Danvers meeting
– not enough time to work out XML
• lifesci/01-08-01 - not the final submission
– programming jamboree after the Toronto meeting helped a lot, especially in the XML mapping area
• lifesci/01-10-01 - current submission
6
Specification Structure
• Text document with explanations, including all diagrams
– prepared partly by exporting from Rational Rose
• PIM, UML model as a single XMI file
• XMI => DTD translation software (as a formal representation of the mapping rules)
• XML DTD
7
Fundamental Terms
• BioSample - tissue, cell-line, etc. that may be treated
• BioMaterial - generic term for biological-based material
• BioSequence - an abstraction of a biological sequence
• BioAssay
– treatment of an array with a labeled extract, i.e. hybridization
– experimental step in a broader sense
8
Fundamental Terms (2)
• Reporter - the physical representation of biosequence(s) on an array
• Feature - location on an array
• Event - description of an action, e.g. treatment of a BioSample or the act of hybridization
• Transformation - a specific Event, transforming a set of data to another set of data.
9
UML Packages (1)
• BioSequence and BQS
• BioMaterial
• BioEvent
• ArrayDesign and DesignElement
• ArrayManufacture
• BioAssay
• BioAssayData
10
UML Packages (2)
• Experiment
• HigherLevelAnalysis
• Miscellaneous
– Describable
– Measurement
– QuantitationType
– Protocol
– Audit and Security
11
UML Packages (3)
[Package diagram showing: BSANE, BQS, Description, Protocol, Measurement, Audit, Treatment, Transformation, BioEvent, Experiment, ArrayDesign, BioMaterial, BioAssayData, BioAssay, DesignElement, HigherLevelAnalysis, BioSequence, ArrayManufacture, QuantitationType]
12
Package dependencies
13
Important package dependencies
14
Experiment
• Represents the container for a hierarchical grouping of BioAssays
• ExperimentDesign describes and annotates the overall design and purpose of the experiment
• Description of experimental steps can be structured by ExperimentalFactors/FactorValues:
– ExperimentalFactor is a part of ExperimentDesign
– FactorValues can be attached to BioAssays
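
One way to picture the ExperimentalFactor/FactorValue relationship is the minimal Python sketch below. The class names follow the terms on this slide, but the attributes (`name`, `value`, `factor_values`) and the helper `group_by_factor` are assumptions made for illustration, not the MAGE object model itself.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExperimentalFactor:
    name: str  # hypothetical attribute, e.g. "time"


@dataclass(frozen=True)
class FactorValue:
    factor: ExperimentalFactor
    value: str


@dataclass
class BioAssay:
    name: str
    factor_values: list = field(default_factory=list)


def group_by_factor(assays, factor):
    """Group BioAssay names by the value they carry for one factor."""
    groups = defaultdict(list)
    for assay in assays:
        for fv in assay.factor_values:
            if fv.factor == factor:
                groups[fv.value].append(assay.name)
    return dict(groups)


# Hypothetical time-course experiment with three hybridizations.
time_factor = ExperimentalFactor("time")
assays = [
    BioAssay("hyb1", [FactorValue(time_factor, "0h")]),
    BioAssay("hyb2", [FactorValue(time_factor, "2h")]),
    BioAssay("hyb3", [FactorValue(time_factor, "2h")]),
]
groups = group_by_factor(assays, time_factor)
```

Attaching FactorValues to the BioAssays, rather than to the Experiment as a whole, is what lets the assays be grouped by condition after the fact.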
15
Experiment
16
HigherLevelAnalysis
• The results of performing analysis on the BioAssayData from an Experiment
• Clustering allows specifying the results of analysis as a hierarchical tree
• Cluster Nodes can have NodeValues and are associated with *Dimension objects
17
BioAssayData
• The data associated with either a measured BioAssay or a derived BioAssay
• Data is conceptually a 3-D matrix, with dimensions:
– BioAssayDimension
– DesignElementDimension
– QuantitationTypeDimension
• Transformations are used to capture data processing sequence and rules
– *Mapping objects formalize dimension translations
• Two representations for BioDataValues:
– a set of BioDataTuples
– BioDataCube
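
The two representations carry the same information, which a small Python sketch can make concrete. Everything here is an assumption for illustration: the dimension contents, the axis order (BioAssay, DesignElement, QuantitationType) and the function names are hypothetical and are not taken from the specification.

```python
from itertools import product

# Hypothetical dimension contents; a real document would list MAGE objects.
bioassays = ["hyb1", "hyb2"]
design_elements = ["feature_1", "feature_2"]
quantitation_types = ["signal", "background"]

# Cube form: nested lists indexed [bioassay][design_element][quantitation_type].
cube = [
    [[120.0, 10.0], [95.0, 12.0]],
    [[130.0, 11.0], [80.0, 9.0]],
]


def cube_to_tuples(cube, bioassays, design_elements, quantitation_types):
    """Flatten the cube into (bioassay, design_element, quantitation_type, value)."""
    return [
        (b, d, q, cube[i][j][k])
        for (i, b), (j, d), (k, q) in product(
            enumerate(bioassays),
            enumerate(design_elements),
            enumerate(quantitation_types),
        )
    ]


def tuples_to_cube(tuples, bioassays, design_elements, quantitation_types):
    """Rebuild the cube; cells absent from the tuple set stay None."""
    result = [
        [[None for _ in quantitation_types] for _ in design_elements]
        for _ in bioassays
    ]
    for b, d, q, value in tuples:
        i = bioassays.index(b)
        j = design_elements.index(d)
        k = quantitation_types.index(q)
        result[i][j][k] = value
    return result


tuples = cube_to_tuples(cube, bioassays, design_elements, quantitation_types)
```

The tuple form suits sparse data, since missing cells are simply omitted, while the cube form is more compact when every cell is filled.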
18
BioAssayData
19
BioAssayData
[Diagram labels: BioAssay, QuantitationType, DesignElement, Transformation]
20
QuantitationType
• StandardQuantitationTypes and SpecializedQuantitationTypes
• list of SQTs
• can refer to a Channel object
• QuantitationTypeMap - within BioAssayData package
21
BioAssay
• Three types of BioAssays (experimental steps):
– PhysicalBioAssay
• Contains information and annotation on the event of joining an Array with BioMaterial, typically with LabeledExtract(s); also, Treatments
– MeasuredBioAssay
• FeatureExtraction
– DerivedBioAssay
• corresponds to a dry-lab experimental step
22
BioAssay
23
Array
• Manufacturing information about the implementation of an array design
– Defects and deviations from the design can be recorded
• FeatureDefects
• ZoneDefects
– The LIMS biomaterial information for what was put on each feature can be recorded here
– ArrayGroups and Fiducials
24
Array
25
BioMaterial
• Describes how a BioSource is treated to obtain the BioMaterial for Hybridization (typically a LabeledExtract)
• Used by a BioAssayCreation in combination with an Array to produce a PhysicalBioAssay
• A set of treatments is typically linear in time but can form a Directed Acyclic Graph
• Formalization of Treatments with Compounds
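
To illustrate the Directed Acyclic Graph point, the sketch below records which source materials each BioMaterial was derived from and recovers a valid processing order with the standard-library `graphlib` module (Python 3.9+). The material names and the dictionary layout are hypothetical and are only meant to show the shape of the data.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each BioMaterial maps to the source materials its Treatment consumed.
# All names are hypothetical.
sources = {
    "sample_A": {"biosource"},
    "sample_B": {"biosource"},
    "pooled_extract": {"sample_A", "sample_B"},
    "labeled_extract": {"pooled_extract"},
}

# static_order() yields every material after all of its sources.
order = list(TopologicalSorter(sources).static_order())
```

A purely linear treatment history is the special case in which every material has exactly one source; pooling or splitting samples is what turns the chain into a graph.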
26
BioMaterial
27
DesignElement
• DesignElements
– Features are the locations on the array
– Reporters represent some biological sequence (clone, oligo, etc.) that can be placed on one or more features
• immobilized characteristics
– CompositeSequence is a grouping that represents a biological sequence composed of other biological sequences (gene, exon, etc.)
• biological characteristics
• *Maps - for relating Features to Reporters etc.
– MismatchInformation
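
The layering of Features, Reporters and CompositeSequences can be sketched with two lookup tables. This is only an illustration under stated assumptions: each *Map is modelled as a plain Python dictionary, and the identifiers and the function `features_for_composite` are hypothetical.

```python
# Hypothetical identifiers; each *Map is modelled as a plain dictionary.
feature_reporter_map = {
    "R1": ["F1", "F2"],  # Reporter R1 is placed on two Features
    "R2": ["F3"],
}
reporter_composite_map = {
    "geneX": ["R1", "R2"],  # CompositeSequence geneX groups two Reporters
}


def features_for_composite(composite_id):
    """Resolve a CompositeSequence down to the Features carrying its Reporters."""
    features = []
    for reporter_id in reporter_composite_map.get(composite_id, []):
        features.extend(feature_reporter_map.get(reporter_id, []))
    return features
```

For example, `features_for_composite("geneX")` walks both maps and returns the three Features that carry the gene's Reporters.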
28
DesignElement
29
BioSequence
• BioSequence class - abstraction of various biosequences
• DatabaseEntries for characterizing BioSequences
• Simplification of BSANE draft; will need to be compatible with the end result of BSANE
30
ArrayDesign
• ArrayDesign describes a microarray design that can be manufactured
– Zone information
– DesignElementGroups
31
ArrayDesign
32
BioEvent
• Abstraction of various MAGE events:
– physical (e.g., BioMaterial Treatment)
– data manipulation (Transformation)
• Have associated ProtocolApplications (an ordered list)
• Subclasses have some target (the result of the BioEvent)
• Often have sources
• Relevant for BioMaterial, BioAssay, BioAssayData packages
33
Protocol
• Protocol and ProtocolApplication
– Protocol describes a generic laboratory procedure or analysis algorithm
– ProtocolApplication describes the actual application of a protocol
– ProtocolApplication:
• values for the replaceable parameters
• any variation from the Protocol
• Similarly:
– Hardware and HardwareApplication
– Software and SoftwareApplication
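
The split between a generic Protocol and its concrete ProtocolApplication can be sketched as follows. The field names (`default_parameters`, `parameter_values`, `deviations`) and the method `effective_parameters` are assumptions chosen for this example and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Protocol:
    name: str
    default_parameters: dict  # replaceable parameters with default values


@dataclass
class ProtocolApplication:
    protocol: Protocol
    parameter_values: dict = field(default_factory=dict)  # values used this time
    deviations: str = ""  # free-text note on any variation from the Protocol

    def effective_parameters(self):
        """Protocol defaults, overridden by the values of this application."""
        return {**self.protocol.default_parameters, **self.parameter_values}


# Hypothetical hybridization protocol and one run of it.
hyb = Protocol("hybridization", {"temperature_C": 60, "duration_h": 16})
run = ProtocolApplication(hyb, {"temperature_C": 62}, "oven ran warm")
```

The same pattern would carry over to the Hardware/HardwareApplication and Software/SoftwareApplication pairs: the generic description is written once and each use records only what was specific to it.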
34
Protocol
35
Miscellaneous (1)
• Hierarchy of top-level abstract classes
– Extendable - can have properties
– Describable - can also have Descriptions and Security and Audit information
– Identifiable - also has an identifier (unambiguous within some scope) and a name
• AuditAndSecurity package
– Contact/Person/Organization classes
– tracking of changes (audit trail)
– user security (access rights to MAGE objects)
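
The three abstract classes form a simple inheritance chain in which each level adds capabilities to the one above it. A minimal Python sketch follows; the class names are the ones on the slide, while the attribute names (`properties`, `descriptions`, `audit_trail`, `security`) are assumed for illustration.

```python
class Extendable:
    """Can carry arbitrary name/value properties."""

    def __init__(self):
        self.properties = {}


class Describable(Extendable):
    """Adds Descriptions plus Security and Audit information."""

    def __init__(self):
        super().__init__()
        self.descriptions = []
        self.audit_trail = []
        self.security = None


class Identifiable(Describable):
    """Adds an identifier (unambiguous within some scope) and a name."""

    def __init__(self, identifier, name=""):
        super().__init__()
        self.identifier = identifier
        self.name = name


# Hypothetical object: anything Identifiable is also Describable and Extendable.
contact = Identifiable("Contact:1", "Example Lab")
contact.properties["note"] = "hypothetical property"
```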
36
Miscellaneous (2)
• Description package
– Description is a container for
• free text description
• OntologyEntries
• DatabaseEntries
• BibliographicReferences
• BQS package
– BibliographicReference class
• Measurement package
– Measurement is a quantity with a unit
– simple Measurement ontology provided
37
DTD & XML Format
<MAGE-ML>
  <{packageName}_package>
    <{className}_assnlist> <!-- generated container element -->
      <{className}> <!-- independent class elements -->
        <{container}> <!-- one of *_assn, *_assnref, *_assnlist, *_assnreflist -->
          <{className or className_ref}>
            … <!-- alternating {container} and {className or className_ref} -->
          </{className or className_ref}>
        </{container}>
      </{className}>
    </{className}_assnlist>
    ... <!-- more independent classes -->
  </{packageName}_package>
  ... <!-- more packages -->
</MAGE-ML>
* slide borrowed from Angel Pizarro, UPenn
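
The naming pattern on this slide (a `{packageName}_package` element holding a `{className}_assnlist` container of class elements) can be reproduced with Python's standard `xml.etree.ElementTree`. This is a sketch only: the helper `build_package` and the `identifier` attribute are assumptions for the example, and the output is not validated against the DTD.

```python
import xml.etree.ElementTree as ET


def build_package(root, package_name, class_name, identifiers):
    """Append a {package}_package / {class}_assnlist / {class} subtree to root."""
    package = ET.SubElement(root, f"{package_name}_package")
    assnlist = ET.SubElement(package, f"{class_name}_assnlist")
    for identifier in identifiers:
        # The "identifier" attribute is an assumption made for this sketch.
        ET.SubElement(assnlist, class_name, {"identifier": identifier})
    return package


root = ET.Element("MAGE-ML")
build_package(root, "Experiment", "Experiment", ["E1", "E2"])
xml_text = ET.tostring(root, encoding="unicode")
```

Because the element names are derived mechanically from package and class names, the same helper covers any independent class in any package.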
38
XML tree example
MAGE-ML
  AuditAndSecurity_pkg
    Contact_assnlist
      Contact
  Experiment_pkg
    Experiment_assnlist
      Experiment
        Provider_assnref
          Contact_ref
        ExperimentDesign_assn
          ExperimentDesign
* slide borrowed from Angel Pizarro, UPenn
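
The point of the `Contact_ref` element is that an Experiment refers to a Contact defined once elsewhere in the document. A small Python sketch of resolving such a reference follows, using the element names from this slide. The `identifier` and `name` attributes and the sample values are assumptions, and the fragment is illustrative rather than DTD-valid.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; attribute names and values are hypothetical.
document = """
<MAGE-ML>
  <AuditAndSecurity_pkg>
    <Contact_assnlist>
      <Contact identifier="C1" name="Example Lab"/>
    </Contact_assnlist>
  </AuditAndSecurity_pkg>
  <Experiment_pkg>
    <Experiment_assnlist>
      <Experiment identifier="E1">
        <Provider_assnref>
          <Contact_ref identifier="C1"/>
        </Provider_assnref>
      </Experiment>
    </Experiment_assnlist>
  </Experiment_pkg>
</MAGE-ML>
"""

root = ET.fromstring(document)

# Index every Contact by identifier, then follow each Contact_ref to it.
contacts = {c.get("identifier"): c for c in root.iter("Contact")}
providers = {
    exp.get("identifier"): [
        contacts[ref.get("identifier")].get("name")
        for ref in exp.findall("./Provider_assnref/Contact_ref")
    ]
    for exp in root.iter("Experiment")
}
```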
39
Programming APIs
• Mapping of OM to language-specific OMs
• APIs are automatically generated from the OM specifications
– Get/set methods for associations
– Get/set methods for attributes
• XML <=> language-specific OM marshallers/unmarshallers - also automatically generated
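
As a rough idea of what generating get/set methods from a model specification can look like, the Python sketch below builds a class with one accessor pair per attribute name. It is not the generator used for the MAGE APIs: the function `make_class`, the method naming scheme and the attribute list are all assumptions for illustration.

```python
def make_class(class_name, attributes):
    """Create a class with get<Name>/set<Name> methods for each attribute."""

    def make_accessors(attr):
        def getter(self):
            return self._values.get(attr)

        def setter(self, value):
            self._values[attr] = value

        return getter, setter

    def init(self):
        self._values = {}

    namespace = {"__init__": init}
    for attr in attributes:
        getter, setter = make_accessors(attr)
        suffix = attr[0].upper() + attr[1:]
        namespace[f"get{suffix}"] = getter
        namespace[f"set{suffix}"] = setter
    return type(class_name, (object,), namespace)


# Hypothetical slice of an object-model specification.
Experiment = make_class("Experiment", ["identifier", "name"])
exp = Experiment()
exp.setName("time course")
```

Generating the accessors from the model keeps the language bindings in step with the specification whenever the model changes.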
40
Programming APIs (cont.)
• Use standard modules/packages
– Xerces, JDK, etc.
• Implementation in Java, C++, Perl
• Building annotation tools/database access modules on top of these APIs
41
Schedule
• LSR ‘vote to vote’ at Dublin OMG meeting in November
– LSR, AB, DTC votes at Dublin OMG meeting
• Setting up FTF
• open source implementation efforts
– Jamboree II at EBI, December 6-11
• MAGE v.2.0
– current MAGE <=> MAGE v.2.0 mapping rules
42
Web Sites
• MAGE specification - hosted by Rosetta
– links to documents
• presentations
• UML models
– XMI files
– Rose .mdl files
– HTML version
– PNG image files of diagrams
– http://www.geml.org/omg.htm
• MGED programming effort:
– http://sourceforge.net/projects/mged
43
Mailing Lists
• Specification-related
– [email protected]
– to subscribe, send the following to
subscribe lsr-ge <yourEmailAddress>
• MAGE-STK development-related
– https://lists.sourceforge.net/lists/listinfo/mged-mage
44
Questions?