The GUS 3.0 Perl Object Layer. CBIL, Jonathan Schug, June 18, 2002
A Programmer's View of GUS3.0
• The application programmer interacts with GUS via the GusApplication (GA) Perl program.
• The GA is a general framework for connecting to GUS30.
• Specific tasks are performed by individual plugins.
• Plugins use either table-specific classes or SQL access.
• Low-level database access is provided by DBI classes.
[Figure: architecture diagram. Labels: GusApplication; Plugin; Classes for the Core, SRes, DoTS, TESS, and RAD sections; Superclasses; SQL; DBI.]
A GUS3.0 Table
Primary key
Table-specific attributes
GUS overhead attributes
Parents - pointed to by this table
Children - point to this table
A GUS3.0 Perl Object Layer Class
GUS30/DoTS/Clone.pm
package DoTS::Clone;
use strict;
use GUS30::DoTS::gen::Clone_gen;
use vars qw (@ISA);
@ISA = qw (DoTS::Clone_gen);
1;
• Relies on the _gen class for accessor methods.
• This is a stub for hand-edited, domain-specific methods.
The _gen Class - I
GUS30/DoTS/gen/Clone_gen.pm
package DoTS::Clone_gen;
use strict;
use GUS30::dbiperl_utils::RelationalRow;
use vars qw (@ISA);
@ISA = qw (RelationalRow);
sub setDefaultParams { ... }
• Inherits from RelationalRow
• setDefaultParams determines whether the class is versionable and updatable.
The _gen Class - II
GUS30/DoTS/gen/Clone_gen.pm
sub setCloneId { ... }       sub getCloneId { ... }
sub setLibraryId { ... }     sub getLibraryId { ... }
sub setImageId { ... }       sub getImageId { ... }
sub setDbestCloneUid { ... } sub getDbestCloneUid { ... }
sub setWashuName { ... }     sub getWashuName { ... }
sub setGdbId { ... }         sub getGdbId { ... }
sub setMgiId { ... }         sub getMgiId { ... }
sub setDbestLength { ... }   sub getDbestLength { ... }
sub setWashuLength { ... }   sub getWashuLength { ... }
• There is an accessor for each column.
• Note the case change and loss of underscores.
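The naming rule can be sketched in a few lines of Perl. This is only an illustration of the case change and loss of underscores, not the code GUS uses to generate the _gen classes; the helper names column_to_stem and accessor_names are hypothetical, and it assumes column names are lower case with underscores, as in clone_library_id and washu_length.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a column name such as "washu_length"
# into the mixed-case stem used by the accessors ("WashuLength").
sub column_to_stem {
    my ($column) = @_;
    return join '', map { ucfirst(lc($_)) } split /_/, $column;
}

# Build the setter and getter names for one column.
sub accessor_names {
    my ($column) = @_;
    my $stem = column_to_stem($column);
    return ('set' . $stem, 'get' . $stem);
}

for my $column (qw(clone_id washu_length clone_library_id)) {
    my ($setter, $getter) = accessor_names($column);
    printf "%s: %s, %s\n", $column, $setter, $getter;
}
```

Going the other way (accessor name back to column name) is not shown; the sketch only covers the direction the slide describes.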
The _gen Class - III
GUS30/DoTS/gen/Clone_gen.pm
sub setModificationDate { ... }   sub getModificationDate { ... }
sub setUserRead { ... }           sub getUserRead { ... }
sub setUserWrite { ... }          sub getUserWrite { ... }
sub setGroupRead { ... }          sub getGroupRead { ... }
sub setGroupWrite { ... }         sub getGroupWrite { ... }
sub setOtherRead { ... }          sub getOtherRead { ... }
sub setOtherWrite { ... }         sub getOtherWrite { ... }
sub setRowUserId { ... }          sub getRowUserId { ... }
sub setRowGroupId { ... }         sub getRowGroupId { ... }
sub setRowProjectId { ... }       sub getRowProjectId { ... }
sub setRowAlgInvocationId { ... } sub getRowAlgInvocationId { ... }
• There is an accessor for each column.
• Note the case change and loss of underscores.
Hand Edited Methods
• Edit main class file, e.g., GUS30/DoTS/Clone.pm
• Typically placed in GUS30/DoTS/hand_edited/
• Symlink in GUS30/DoTS.
• Mostly used in DoTS section.
DoTS/AAFeature.pm:4
DoTS/AASequence.pm:2
DoTS/Assembly.pm:76
DoTS/AssemblySequence.pm:29
DoTS/Evidence.pm:6
DoTS/GeneFeature.pm:4
DoTS/Gene.pm:9
DoTS/IndexWordSimLink.pm:2
DoTS/NAFeature.pm:5
DoTS/NASequence.pm:4
DoTS/RNAFeature.pm:4
DoTS/RNA.pm:3
DoTS/Similarity.pm:9
DoTS/SimilaritySpan.pm:6
DoTS/SplicedNASequence.pm:1
DoTS/TranslatedAAFeature.pm:3
DoTS/TranslatedAAFeatureSegment.pm:2
DoTS/TranslatedAASequence.pm:1
DoTS/VirtualSequence.pm:1
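The split between a generated accessor class and a hand-edited class can be sketched without GUS installed. Everything below is a stand-in: the Demo:: packages and the getLengthDifference method are hypothetical and only show the inheritance pattern, in which the generated layer holds accessors and the hand-edited layer adds domain-specific methods on top.

```perl
use strict;
use warnings;

# Stand-in for a generated class: accessors only (hypothetical, not GUS code).
package Demo::Clone_gen;

sub new {
    my ($class, $values) = @_;
    my %row = %{ $values || {} };
    return bless \%row, $class;
}
sub getDbestLength { return $_[0]{dbest_length} }
sub setDbestLength { $_[0]{dbest_length} = $_[1]; return $_[0] }
sub getWashuLength { return $_[0]{washu_length} }
sub setWashuLength { $_[0]{washu_length} = $_[1]; return $_[0] }

# Stand-in for the hand-edited class: inherits every accessor and adds
# one domain-specific method.
package Demo::Clone;
our @ISA = ('Demo::Clone_gen');

# Hypothetical hand-edited method: difference between the two lengths.
sub getLengthDifference {
    my ($self) = @_;
    return $self->getWashuLength() - $self->getDbestLength();
}

package main;

my $clone = Demo::Clone->new({ washu_length => 500, dbest_length => 480 });
print $clone->getLengthDifference(), "\n";
```

Because the hand-edited method lives in a separate file from the accessors, the generated file can be rebuilt without overwriting it.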
Creating Objects
# get the class
use GUS30::DoTS::Clone;
…
# create new object
my $clone_gus = DoTS::Clone->new({ washu_length => 5 });
# adjust a column value
$clone_gus->setDbestCloneUid('A123456');
# print some values
print $clone_gus->getDbestCloneUid, "\n";
print $clone_gus->toXML, "\n";
# submit to database
$clone_gus->submit;
Connecting Objects
use GUS30::DoTS::Clone;
use GUS30::DoTS::CloneLibrary;
my $clone_lib_gus = DoTS::CloneLibrary->new({…});
while (<>) {
  chomp;
  my @parts = split /\t/;
  my $clone_gus = DoTS::Clone->new({…});
  # this
  $clone_lib_gus->addChild($clone_gus);
  # or this
  $clone_gus->setParent($clone_lib_gus);
}
$clone_lib_gus->submit;
Retrieving Objects
use GUS30::DoTS::CloneLibrary;
my $clone_lib_gus = DoTS::CloneLibrary->new({ clone_library_id => 12345 });
if ($clone_lib_gus->retrieveFromDB) {
  $clone_lib_gus->set…(…);
  $clone_lib_gus->submit;
  print "found it!\n";
}
else {
  print "did not find any unique row!\n";
}
Traversing Object Relations - I
use GUS30::DoTS::CloneLibrary;
use GUS30::DoTS::Clone;
my $clone_lib_gus = DoTS::CloneLibrary->new({ clone_library_id => 12345 });
if ($clone_lib_gus->retrieveFromDB) {
  my @clones = $clone_lib_gus->getChildren('DoTS.Clone',1);
  foreach (@clones) { … }
}
Traversing Object Relations - II
use GUS30::DoTS::CloneLibrary;
use GUS30::DoTS::Clone;
my $clone_gus = DoTS::Clone->new({ clone_id => 12345 });
if ($clone_gus->retrieveFromDB) {
  my $clone_lib_gus = $clone_gus->getParent('DoTS.CloneLibrary',1);
  . . .
}
Deleting Objects
use GUS30::DoTS::CloneLibrary;
use GUS30::DoTS::Clone;
my $clone_lib_gus = DoTS::CloneLibrary->new({ clone_library_id => 12345 });
if ($clone_lib_gus->retrieveFromDB) {
  $clone_lib_gus->markDeleted;
  my @clones = $clone_lib_gus->getChildren('DoTS.Clone',1);
  foreach (@clones) { $_->markDeleted; }
  $clone_lib_gus->submit;
}
• Recursively deletes children as well.
The Object Cache
• A cache of objects is maintained so that getParents and getChildren always return the same instance of a row.
• Cache is limited in size to avoid large memory requirements.
• Cache is cleared with the undefPointerCache method on an object or plugin.
• Cache size is increased with setMaximumNumberOfObjects method.
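The idea behind the cache can be sketched as an identity map: one stored instance per (table, primary key) pair, with a cap on the number of entries. The Demo::ObjectCache package, its method names, and the choice to die when the cap is reached are all assumptions made for this illustration; the slides do not say how GUS behaves when the limit is hit.

```perl
use strict;
use warnings;

package Demo::ObjectCache;    # hypothetical stand-in, not a GUS class

sub new {
    my ($class, %args) = @_;
    my %self = (max => $args{max} || 10000, objects => {});
    return bless \%self, $class;
}

# Return the cached instance for (table, id), building it on first use.
sub fetch {
    my ($self, $table, $id, $builder) = @_;
    my $key = join ':', $table, $id;
    return $self->{objects}{$key} if exists $self->{objects}{$key};
    # Assumption: refuse to grow past the limit rather than evict.
    die "object cache is full\n"
        if scalar(keys %{ $self->{objects} }) >= $self->{max};
    return $self->{objects}{$key} = $builder->();
}

# Forget every cached object.
sub clear { my ($self) = @_; $self->{objects} = {}; return }

# Change the size limit.
sub set_maximum { my ($self, $max) = @_; $self->{max} = $max; return }

package main;

my $cache = Demo::ObjectCache->new(max => 2);
my $first = $cache->fetch('DoTS.Clone', 12345, sub { return +{ clone_id => 12345 } });
my $again = $cache->fetch('DoTS.Clone', 12345, sub { return +{ clone_id => 12345 } });
print "same instance\n" if $first == $again;
```

A loop that touches many rows would call clear periodically, in the same way the slide describes clearing the real cache.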
dbiperl_utils
Support and base classes for object classes.
• RelationalRow
• DbiRow
• DbiTable
• DbiDatabase
RelationalRow.pm
• Contains 176 methods in these categories:
– Accessors for default overhead values
– Accessors for debugging and verbose modes
– Pointer cache maintenance
– Class information
– Parent/child information
– XML management
– Deletion marking
– Submission management
– Similarity and Evidence management
• Is a DbiRow (inherits from it).
DbiRow.pm
• Contains 43 methods in these categories:
– Get/Set methods to support class-specific accessors
– Accessors for table and class names
– Attribute information
– Tracking attribute value changes
– retrieveFromDB
– IdentityInsert management
– Get DbHandle, MetaHandle, and Database
DbiTable.pm
• 76 methods for
– Various table names
– Attribute information
– Relations information
– Primary keys and ids
– Others
DbiDatabase.pm
• 103 methods covering these areas:
– Database handles
– Login information
– Database and section names
– Transaction management
– Table and view names
– Object cache
– Counters
Overhead Columns
Contain information about:
• History
• Ownership
• Access permissions
• Data provenance
Who manages these columns?
GusApplication (GA)
• Purpose is to standardize how applications access the database.
• Provides:
– Database login
– Default ownership and permissions
– Algorithm and parameter tracking
– Command line access
Algorithms & Stuff
Algorithm
AlgorithmImplementation
AlgorithmInvocation
AlgorithmParamKey
AlgorithmParamKeyType
AlgorithmParam
Tracks what programs implementing what algorithms were run with what parameters.
GA populates these tables.
GA Usage
ga [<mode>] [<plugin_class>] <plugin_class_options>
• <mode> is one of
– +create : creates Algorithm, AlgorithmImplementation, and AlgorithmParamKey
– +update : creates AlgorithmImplementation and AlgorithmParamKey
– +history : lists invocations
– +run : runs the plugin (default)
• <plugin_class>
– From hierarchical namespace, e.g., Utils::UpdateGusFromXML
• <plugin_class_options>
– Defined by plugin plus some generic GA options.
– E.g., --file data.tab --commit --verbose
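The shape of the command line above can be sketched as a small argument splitter. This is not how ga itself parses its arguments; split_ga_args is a hypothetical helper, and treating any first argument that begins with "-" as "no plugin class given" is an assumption made for the sketch.

```perl
use strict;
use warnings;

# The four modes listed on the slide.
my %KNOWN_MODE = map { $_ => 1 } qw(+create +update +history +run);

# Split an argument list into mode, plugin class, and remaining options.
sub split_ga_args {
    my @args = @_;
    my $mode = '+run';                      # default mode per the slide
    if (@args && $KNOWN_MODE{ $args[0] }) {
        $mode = shift @args;
    }
    my $plugin_class;
    # Assumption: a leading "-" means the plugin class was omitted.
    if (@args && $args[0] !~ /^-/) {
        $plugin_class = shift @args;
    }
    my %parsed = (mode => $mode, plugin_class => $plugin_class, options => [@args]);
    return \%parsed;
}

my $parsed = split_ga_args(qw(Utils::UpdateGusFromXML --file data.tab --commit --verbose));
printf "mode=%s plugin=%s options=%s\n",
    $parsed->{mode}, $parsed->{plugin_class}, join(' ', @{ $parsed->{options} });
```

The remaining options would then be handed to the plugin's own option parser along with the generic GA options.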
Plugins
• A plugin is just a package that inherits from GUS30::GA_plugins::Plugin.
package GUS30::GA_plugins::Utils::UpdateGusFromXML;
@ISA = qw(GUS30::GA_plugins::Plugin);
• It must have two methods:
– new - to create and initialize the plugin object
– run - perform actions of plugin
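The two-method contract can be sketched with stand-in packages. Demo::Plugin, Demo::HelloPlugin, and run_plugin are hypothetical names; the check with can() is one way a framework might verify the contract before running a plugin, not a description of what GA does.

```perl
use strict;
use warnings;

package Demo::Plugin;          # hypothetical stand-in for the plugin base class
sub new { my ($class) = @_; return bless {}, $class }

package Demo::HelloPlugin;     # hypothetical plugin providing both methods
our @ISA = ('Demo::Plugin');

sub new {
    my ($class) = @_;
    my $self = bless {}, $class;
    $self->{greeting} = 'hello';
    return $self;
}

sub run {
    my ($self) = @_;
    return $self->{greeting} . ' from ' . ref($self);
}

package main;

# Verify that the class provides new and run, then create and run it.
sub run_plugin {
    my ($plugin_class) = @_;
    for my $method (qw(new run)) {
        die "$plugin_class lacks a '$method' method\n"
            unless $plugin_class->can($method);
    }
    return $plugin_class->new()->run();
}

print run_plugin('Demo::HelloPlugin'), "\n";
```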
The new Method
• Must initialize certain important plugin attributes:
sub new {
  my $Class = shift;
  my $m = bless {}, $Class;
  $m->setUsage('what this algorithm does');
  $m->setVersion('2.0');
  $m->setRequiredDbVersion({ Core => '3', DoTS => '3' });
  $m->setDescription('what is new in implementation');
  $m->setEasyCspOptions(…); # command line options
  return $m;
}
Command Line Options
• A hash describing a parameter:
– h => hint for user
– t => parameter data type (boolean, string, integer, float)
– d => default value
– l => is a list if true
– e => list of legal reg-exps
– r => required if true
– o => command line flag
• E.g.,
  { h => 'start label ordinals with this value',
    t => 'integer',
    d => 0,
    o => 'FirstOrdinal',
  },
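To make the keys concrete, here is a sketch that turns such option hashes into a parsed command line using the core Getopt::Long module. It is not the EasyCsp implementation: parse_options and the type-to-suffix table are assumptions for illustration, the e key (legal reg-exps) is not handled, and the second option description (file) is made up for the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Assumed mapping from the t key to Getopt::Long type suffixes.
my %TYPE_SUFFIX = (boolean => '!', string => '=s', integer => '=i', float => '=f');

# Parse @$argv according to a list of option-description hashes.
sub parse_options {
    my ($descriptions, $argv) = @_;
    my @args = @$argv;                       # leave the caller's array untouched
    my (%values, @spec);
    for my $d (@$descriptions) {
        my $suffix = $TYPE_SUFFIX{ $d->{t} } or die "unknown type '$d->{t}'\n";
        $suffix .= '@' if $d->{l} && $d->{t} ne 'boolean';   # l => list
        push @spec, $d->{o} . $suffix;
    }
    GetOptionsFromArray(\@args, \%values, @spec) or die "bad command line\n";
    for my $d (@$descriptions) {
        my $name = $d->{o};
        # d => default value, applied only when the flag was not given
        $values{$name} = $d->{d} if !exists $values{$name} && exists $d->{d};
        # r => required if true
        die "--$name is required\n" if $d->{r} && !defined $values{$name};
    }
    return \%values;
}

my @descriptions = (
    { h => 'start label ordinals with this value', t => 'integer', d => 0, o => 'FirstOrdinal' },
    { h => 'input file (hypothetical option)',     t => 'string',  r => 1, o => 'file' },
);
my $options = parse_options(\@descriptions, [ '--file', 'data.tab' ]);
printf "FirstOrdinal=%s file=%s\n", $options->{FirstOrdinal}, $options->{file};
```

Keeping the description (h), type (t), and default (d) in one hash lets the same structure drive both parsing and a usage message.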
GA-Supplied Command-line Options
• GA adds these options:
– commit
– verbose
– debug
– user
– group
– project
– comment
– database
– server
– implementation
– algoinvo
• Options shown in pink on the original slide are also read from the config file .gus30.cfg
Example: TESS::LoadMultinomialLabelSet
TESS::MultinomialLabelSet TESS::MultinomialLabel
• Task is to maintain entries in these two tables.
• MultinomialLabelSet stores sets of labels for multinomial observations, e.g., DNA, AA, or dimer gaps.
• Can also be DNA or AA dimers, trimers, etc.
• MultinomialLabel stores individual names.