LOGOS: A Computational Framework for Neuroinformatics Research*

Michael Stiber 1,2,3, Gwen A. Jacobs 1,3, Deborah Swanberg 4

1 Dept. of Molec. & Cell Biol., Univ. of Calif., Berkeley, CA 94720-3200
2 Comp. Sci. Dept., Montana State Univ., Bozeman, MT 59717
3 Center for Comp. Biol., Montana State University, Bozeman, MT 59717-3505
4 Dept. of Comp. Sci. and Eng.-0407, Univ. of Calif. San Diego, La Jolla, CA 92093-0407

[email protected] [email protected] [email protected]

Abstract

Neuroinformatics presents a great challenge to the computer science community. Quantities of data currently range up to multiple-petabyte levels. The data themselves are diverse, including scalar, vector (from 1 to 4 dimensions), volumetric (up to 4-dimensional spatio-temporal), topological, and symbolic, structured knowledge. Spatial scales range from Angstroms to meters, while temporal scales go from microseconds to decades. Base data vary greatly from individual to individual, and results computed can change with improvements in algorithms, data collection techniques, or underlying methods.

We describe a system for managing, sharing, processing, and visualizing such data. Envisioned as a “researcher’s associate”, it will facilitate collaboration, interface between researchers and data, and perform bookkeeping associated with the complete scientific information life cycle, from collection, analysis, and publication to review of previous results and the start of a new cycle.

1. Introduction

Neuroscientists study the various anatomical, physiological, and functional components of nervous systems to better understand how the “low-level” activity of individual cells maps to behavior. In this research process, large amounts of complex data are collected, but technology has not yet provided systems which integrate this data to help scientists analyze, visualize, and understand it [4, 10].

There are several aspects of this which are unusual when compared to most other scientific data processing activities. First of all, the base data gathered from experiments are diverse, including anatomical (2D and 3D images, 3D geometries as shown in Fig. 1), physiological (time series, point processes; see example in Fig. 2), molecular (2D and 3D density distributions), and symbolic (functions, behaviors). Additionally, data are gathered from single cells in individual animals and can vary greatly from one animal to another. However, researchers usually don’t want to ask questions about a particular cell or individual; they want to generalize from the examples they’ve seen to produce an understanding of how cells and systems function.

*This work was supported by NSF grant number BIR-9507314 to G.A.J.

Figure 1. An example rendered neuron. Cell is shown within outline of enclosing ganglion. The cell is represented as a tree (with each branch including diameter data along its length) and an associated “cloud” of output sites (too small to be visible here).

Figure 2. An example physiological recording, in this case of the simultaneous discharge of two nerve cells (X-axis is time; Y-axis voltage). Cell discharges are often recorded as a sequence of voltages over time. Alternatively, they may be assimilated to point processes by noting only the times of occurrence of the large voltage spikes.

We address here three major problems associated with neuroscience data processing, or neuroinformatics:

1. The construction of a basic system to support neuroscience data management: the LOGOS system.

2. Providing researchers with the ability to present queries and receive responses in terms of typical cells, relying on an extension called METALOGOS to map these higher-level constructs to operations on data gathered from individual experiments.

3. The use of METALOGOS as a cells-to-systems interface, whereby the user can manipulate data pertaining to large numbers of cells which collectively contribute to a particular function.

To make 2 and 3 above more concrete, consider the following example [5]. Crickets have elongated sensory organs, called cerci, which project behind them. These cerci are covered with hairs which serve as transducers for air motion. Transduced signals are carried by sensory neurons to an abdominal ganglion: a collection of nerve cells and inter-neuronal connections. The many sensory neurons each transmit their messages to a number of second-level neurons, or interneurons; each interneuron receives input from many sensory cells.

The obvious question to ask here is: what computation does the cercal sensory system as a whole perform? It is not usually feasible, however, to perform an experiment to answer this question directly. Instead, a researcher could perform the following experiments:

1. Stain a single neuron in an individual cricket with a chemical dye, so that its entire structure can be seen easily. Digitize its three-dimensional geometry. As shown in Fig. 1, a cell is represented as a tree, with each branch having particular diameters at each node, and a cloud of small spheres, or varicosities, which correspond to connection points between that cell’s output and other cells’ inputs. This could be done for a large number of sensory cells and interneurons.

2. Insert an electrode into a cricket, and record the physiological responses (similar to that shown in Fig. 2) of sensory cells to air currents having different direction and magnitude. This would allow one to compute a tuning curve for each cell: a mapping from the space of wind velocity to neuron output intensity. For sensory cells, this map is typically a simple function.

3. Make physiological recordings of interneuron responses to air motion.

Experiments 1 and 2 can be used to build a database for the first stages of the cercal system [11]. However, it is likely that the results of experiment 3 would be difficult to interpret initially, since each interneuron receives input from many sensory cells (and thus the map from wind velocity to response is complex and difficult to either summarize meaningfully or provide constancies from one animal to another). Instead, one must first determine what cellular aspects are conserved across different animals: in this case, the directional sensitivity of corresponding hairs and the region of the ganglion to which each hair’s sensory neuron projects.

Based on this, the summary display shown in Fig. 3 can be produced. A number of sensory cells were used to generate this diagram. For each, the distribution of varicosities was used to produce an estimate of what its connection strength to another cell would be at each point in the volume of the ganglion. If an interneuron has an input region in a sub-volume densely filled with sensory cell outputs, we would expect it to receive strong signals from that sensory cell. In contrast, an input in a sparse sub-volume would yield little input.

Figure 3. System-level view computed from a number of individual cells and animals. This display shows sign and magnitude of response to air motion to the right rear encoded by shade of grey (original was in color) and diamond size.

This density function was then used as a weight for multiplication with the sensory cell’s wind velocity tuning function. The total system response for a number of sensory neurons can be computed as the sum of the individual cells’ tuning functions, weighted by their densities at each point within the ganglion. Using this 3D map of net velocity tuning, one can compute the overall system response to a puff of air of a particular speed and direction, which is shown in the figure as grey level (from a color original).
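In symbols, the weighted sum just described might be written as below. This is only a sketch of the idea: the names are notation introduced here rather than taken from the paper, with \rho_i the synaptic density of sensory cell i, T_i its wind velocity tuning function, N the number of sensory cells, and R the net response at point \mathbf{x} of the ganglion to wind velocity \mathbf{v}.

```latex
R(\mathbf{x}, \mathbf{v}) = \sum_{i=1}^{N} \rho_i(\mathbf{x}) \, T_i(\mathbf{v})
```

Fixing \mathbf{v} to a particular puff of air and evaluating R over the ganglion volume gives a map of the kind displayed in Fig. 3.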

We can take this one step further, by showing how this system-level response maps to the input of a particular interneuron. Fig. 4 shows the response in Fig. 3 mapped onto an interneuron’s structure. All branches of the interneuron smaller than some cutoff diameter were assumed to receive input from sensory cells. For each point on the surface of these branches, the net system response computed for Fig. 3 was taken as the input that would be received [11], indicated by shade of grey in Fig. 4 (better displayed in the color original). Once this is accomplished, one could then proceed to compare these inputs with the physiological recordings made of the interneuron output to help determine the computation it performs.

Figure 4. System response in Fig. 3 mapped onto rendering of an interneuron (greyscale rendering of a color original).

There are additional design goals for LOGOS which, while perhaps not as novel as those discussed above, are no less necessary. These include:

• The logical and physical views should make explicit distinctions between raw and (possibly various levels of) processed data [2, 7].

• Human understanding of any field changes over time; data capture and analysis change, too. The schemas which underlie data storage should be evolvable [3].

• Scientific computing occurs within a heterogeneous hardware and software environment; a scientific data management system should accommodate this [8].

• A domain-specific user interface which accommodates differing levels of user sophistication (including those who write their own applications) should be used [2].

• Because this is a scientific data management system, capabilities such as maintenance of audit trails, error tracking, and data security are essential [10].

Figure 5. LOGOS system architecture, showing support for distribution of data management, computation, and user interface tasks. (Diagram labels: Data Manager; two Interactive Clients, each labeled GUI and Workspace; Simulator; Cycle Server.)

2. LOGOS Architecture

The overall LOGOS system architecture is presented in Fig. 5. Its basic distributed nature is dictated by the desire to support existing hardware and software whenever feasible. It is therefore divided into three main modules: a data manager (DM) as a persistent data store and query server (implemented using ObjectStore, an off-the-shelf object-oriented database management system), a workspace (WS) for performing data analysis and other computationally intensive tasks (the results of which might later become part of the data manager’s store), and a user interface (UI).

Many researchers make daily use of Macintosh and Intel-based machines, and it was decided early on to support two user interfaces: one hosted on a high-performance Unix workstation and another implemented for less expensive hardware. However, use of a less expensive machine on one’s desktop might not mean lack of access to more powerful hardware elsewhere. A separate workspace process, which can be hosted on some cycle server, minimizes the penalty paid. If one has access to a graphics workstation, then a higher-performance user interface (currently based on Open Inventor [12]) would be available, and there would be the option of running the workspace on the same workstation or another machine. Current LOGOS development is performed on Silicon Graphics workstations.

There is one additional way that some users might want to access the system. Instead of using the user interface provided with LOGOS, one might want to stick to some existing software, for example a neural simulation system like GENESIS [1] or a mathematical analysis package like MATLAB. Appropriate access methods can be provided so that a user of GENESIS, for example, could connect to the data manager to retrieve experimental data to serve as the basis for simulations.

The neuroinformatics problem is complex, and it is unrealistic to suppose that any initial system design will solve all of the issues of user interface design, visualization, data management, use of domain knowledge, etc. LOGOS was designed to be a framework for research into these issues, and thus its division into these major subsystems maximizes our ability to isolate each of these areas. Object-oriented design and implementation using C++ has been used to increase our ability to encapsulate independent system functions. Of course, there is necessarily a correspondence between WS and UI data objects, since WS data must be viewed by the user. However, this architecture does allow isolation of decisions of how the user views and interacts with the data from the operations the WS performs on them.

Figure 6. User interface modules. (Diagram labels: UI, Visualizers/Manipulators, Scripts and Data, Language Resources, User Interface Controller, Workspace Overview.)

2.1. User Interface

Fig. 6 shows that the user interface has four components: one or more viewers/manipulators, which handle examination and interactive modification of data; a workspace overview (WSO), which provides views of metadata (information about workspace data and function objects); language resources supporting multi-lingual capabilities; and the user interface controller (UIC), which sequences the operations of the UI components and their communication with the workspace and provides a file system interface for command scripts and data import/export. This provides isolation of data visualization to the viewers/manipulators (currently implemented using Open Inventor) and metadata UI issues to the WSO. The UIC provides a “generic” connection to the workspace.

Upon system startup, the UIC requests that the user log in to the system. This allows for security in access to data and logging of who performed what operations. The UIC then connects to the WS and requests login verification. Assuming the login is accepted, the UIC then presents a list of data types that it can accept (constrained by the capabilities of its viewer) and the WS replies with a list of information about data objects it contains and functions that it can provide. The UIC displays a WSO, which shows the system-wide function and data object information and allows the user to select data objects and operations to be performed on them. Commands from the WSO flow to the UIC, which may dispatch them to the WS or create one or more viewers or manipulators.
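The startup exchange described above can be sketched roughly as follows. This is an illustrative Python outline, not the LOGOS implementation (which the paper says is written in C++); the class `Workspace`, the function `connect`, and every field and method name are hypothetical.

```python
class Workspace:
    """Hypothetical stand-in for the WS end of the connection."""

    def __init__(self, users, objects, functions):
        self.users = set(users)            # accounts allowed to log in
        self.objects = dict(objects)       # data object name -> data type
        self.functions = list(functions)   # names of available functions
        self.displayable = set()           # filled in during negotiation

    def verify_login(self, user):
        return user in self.users

    def negotiate(self, displayable_types):
        """Record what the UI can display; reply with data and function lists."""
        self.displayable = set(displayable_types)
        return dict(self.objects), list(self.functions)


def connect(ws, user, displayable_types):
    """UIC side: log in, then exchange capability and content lists."""
    if not ws.verify_login(user):
        raise PermissionError(f"login rejected for {user!r}")
    return ws.negotiate(displayable_types)
```

In this sketch the workspace simply remembers the UI's capability set so that later replies could be restricted to displayable types; how LOGOS actually stores or uses that set is not specified in the text.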

A viewer is a means for rendering data, typically graphically, and allowing a limited set of local and WS operations to be performed on the rendering or data, respectively. A viewer produces no new data items to add to the workspace. A viewer provides a set of local operations which modify the appearance of the rendering. A viewer also allows the user to request WS performance of certain operations on the data corresponding to a particular rendering, which would pass via the UIC to the WS. The viewer extends the WSO command dispatching capability by allowing user specification of some subpart or single element of a data object as the input for a WS function.

A manipulator is a viewer that also offers interactive functions over its data. A manipulator can produce new, derived data, which pass via the UIC to the WS. An example is the interactive alignment of two neurons, which can produce an alignment data object for storage in the WS (see section 3.4 for a discussion of the alignment process).

2.2. Workspace

The workspace shown in Fig. 7 has four major components: a workspace controller (WC) which manages communication with the UI and sequences the activities of the other WS modules, processing plug-ins which represent a set of operations over the data (see section 3.2), a working memory (WM) that provides a temporary data repository, and a query manager (QM) which assembles queries for the data manager and transfers data between the DM and WM.

When a UI requests a connection to the WS, the WC first performs login verification. Assuming login is successful, the WC retains information about the current user for the duration of the session, and uses this to tag newly imported base data and newly computed derived data. The WC then negotiates a connection protocol with the UI. This consists primarily of the UI notifying the WC of the types of data it is capable of displaying (guaranteed to be a subset of the data the WC is capable of sending) and the WC providing a list of data contained in WM and functions provided by the processing plug-ins and QM. Subsequently, the WC is responsible not only for receipt and dispatch of computation requests from the UI, but also for updating the UI on changes in the contents of WM (the result of the computation requests) and converting data in the WM into types displayable by the UI.

Figure 7. Workspace block diagram. (Diagram labels: Workspace, Processing Plug-ins, Working Memory, Query Manager, Workspace Controller.)

3. Data, Functor, and Metadata Models

Data in LOGOS can be considered as existing in a discretized, 2D space, with one axis being physical location within the system (UI, WS, DM) and the other being the category of information. Three categories exist: data, including base (experimental) and derived; functors, or function objects encapsulating WS processing plug-ins, DM operations, and UI manipulator capabilities; and metadata about data and functors.

Data may move from one physical location to another. When they do, they may change form (data structures, methods) and/or content (e.g., pruning down of WS data to visualizable attributes before transmittal to the UI). Creation of data or functor objects implies creation of corresponding metadata, and movement of the former implies the previous movement of the latter. Thus, the metadata can contain information (or handles) necessary for referring to data between subsystems.

The UI must deal with each category of data differently. Its WSO is a tool for metadata viewing and interaction. At its simplest, it uses metadata to display lists of WS data and available functions, applies constraints about functor argument number and type to provide basic error checking, and passes handles to data and functors to the UIC for dispatch to the WS or viewers/manipulators. The viewers and manipulators receive a renderable subset of the contents of WS data objects, their metadata, and a subset of functor metadata, allowing the user to command certain WS operations on the displayed data. The UIC receives metadata and renderable data from the WS and handles for data and functors from the WSO and viewers/manipulators, provides functor metadata for manipulator operations, and routes data to the appropriate destinations.

Within the workspace, all three categories reside in the WM, which is merely a passive, non-persistent store. The WC receives handles to functors and data from the UI, executes functors, receives metadata for any new data created in the WM, and passes this new metadata on to the UI. The WC also, upon UI request, filters WM data to produce visualizable representations (as per the capability set transmitted by the UI at connect time) and transmits them to the UI. When data are created in a UI manipulator, the WC will create metadata for them, store both in WM, and return the new metadata to the UI.

The above use of metadata to describe generic characteristics of both data and functors serves to loosen the coupling among system components, allowing, for example, the addition of functors (and, to some extent, data types) to the WS without modification of the UI.

3.1. Data

All data, whether imported into the system as raw experimental results or derived via (possibly many) operations on other data, are subclassed from an abstract Data class. WM operations are all performed on objects of class Data (or on Info objects, i.e. metadata; see section 3.3). Functors, on the other hand, may be specialized to operate on particular subclasses of Data or certain of their component parts. For brevity’s sake, we present here two examples of the object hierarchies used in LOGOS for storing anatomical data and data derived from anatomical information.

Fig. 8 presents a partial LOGOS class hierarchy for anatomical experimental data gathered from a single animal: a Preparation. Such data usually include both the anatomy of one or more Neurons and additional anatomical features not associated with any of these. These additional features, or Fiducials, are used to register coordinate systems between individuals. Fiducials are “landmarks” (relatively invariant across individuals) used to bring the spatial coordinate systems in which multiple preparations’ anatomical data are captured into alignment.

Figure 8. Living preparation class diagram. Arrows indicate parent/child relationship, with arrow direction pointing to superclass. Filled circles indicate “contains” relationship, with circle at container class. Open circles show “uses” relation, with circle at consumer end. (Classes shown: Preparation, Neuron, Fiducial, CellBranches, VaricositySet, FiducialPath, FiducialPointSet, XYZDAnatomyTree, XYZDAnatomySet, XYZAnatomyTree, XYZAnatomyCluster, XYZDData, XYZData, AnatomyTree, AnatomyCluster, BinaryTree, Set, Data.)

A neuron has a tree-like structure, and thus a generic binary tree class is the ancestor of much of its data. Unfortunately, it is often impractical or impossible to capture the cell’s anatomy in its entirety, as its branch diameter decreases significantly as distance from the central cell body increases. However, the connections between cells, or varicosities, are typically large and important enough that they are digitized separately, and modeled as a set of spheres. In both cases, the basic data type is the 4-tuple of (x, y, z) location and diameter.

Fiducials, on the other hand, usually have only their locations (without any diameter data) recorded, as they are mostly the outlines of large internal structures (saved as paths) or key feature locations (saved as points).
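As a rough illustration of these anatomical types, the following Python sketch models a branch tree and a varicosity set built from location-plus-diameter samples, and a fiducial built from bare locations. The names `XYZD`, `BranchNode`, `Neuron`, `Fiducial`, and `count_nodes` are hypothetical simplifications; they do not reproduce the LOGOS C++ class hierarchy of Fig. 8.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class XYZD:
    """One digitized sample: an (x, y, z) location plus a diameter d."""
    x: float
    y: float
    z: float
    d: float


@dataclass
class BranchNode:
    """Node of the binary tree describing a cell's branches."""
    sample: XYZD
    left: Optional["BranchNode"] = None
    right: Optional["BranchNode"] = None


@dataclass
class Neuron:
    branches: BranchNode                                     # tree of XYZD samples
    varicosities: List[XYZD] = field(default_factory=list)   # set of spheres


@dataclass
class Fiducial:
    """Landmark saved as a path or point set; locations only, no diameters."""
    points: List[Tuple[float, float, float]]


def count_nodes(node):
    """Count the digitized samples in a branch tree."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)
```

A cell body with two child branches and a single varicosity would then be built as `Neuron(BranchNode(XYZD(0, 0, 0, 10), BranchNode(XYZD(1, 0, 0, 4)), BranchNode(XYZD(0, 1, 0, 3))), [XYZD(2, 0, 0, 0.5)])`.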

As was alluded to in the introduction and the previous discussion of aligning coordinate systems, computing general results from the specific examples obtained from particular individuals is challenging. Precise anatomical organization is not identical from one animal to the next. However, there are of course certain organizational aspects which are preserved among all members of a particular species. Therefore, though the precise coordinate systems for any two individuals are not the same, the topology of that space is, and in many cases the coordinates of one may be transformed to that of the other by simple operations like translation, rotation, and scaling along individual axes. This transformation is a 3DAlignment, shown in Fig. 9.
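A transformation of this kind can be sketched in a few lines of Python. The sketch below is only illustrative: the class name `Alignment3D`, its fields, the restriction to rotation about the z axis, and the order of operations (scale, then rotate, then translate) are all assumptions made for brevity, not details given in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Alignment3D:
    """Hypothetical alignment: per-axis scaling, z-axis rotation, translation."""
    scale: tuple = (1.0, 1.0, 1.0)    # scaling along the x, y, z axes
    angle_z: float = 0.0              # rotation about the z axis, in radians
    shift: tuple = (0.0, 0.0, 0.0)    # translation

    def apply(self, point):
        """Map an alignee coordinate into the reference coordinate system."""
        # Scale along the individual axes.
        x, y, z = (c * s for c, s in zip(point, self.scale))
        # Rotate in the x-y plane.
        cos_a, sin_a = math.cos(self.angle_z), math.sin(self.angle_z)
        x, y = x * cos_a - y * sin_a, x * sin_a + y * cos_a
        # Translate.
        return (x + self.shift[0], y + self.shift[1], z + self.shift[2])
```

A full alignment would need rotation about all three axes; a single axis is used here only to keep the example short.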

A 3DAlignment is a DerivedData, and is computed from two Preparations: a reference and an alignee. The process involves locating corresponding fiducials belonging to each preparation, and then using them to determine the appropriate transformation. The computation associated with a 3DAlignment may be performed interactively in some cases (see section 3.4).

Another kind of derived data, this time combining information about several neurons for a higher, system-level overview, is the SynapticDensityField. A SynapticDensityField is computed from the VaricositySets of one or more neurons (for which a single, unifying 3DAlignment must exist). It is a real-valued function of the (x, y, z) space defined by their unifying 3DAlignment, with the value at each point being an estimate of the input strength exerted by the neurons on any other neuron that would receive input at that point. Synaptic density is one way to produce an average neuron from specific ones measured. It can also be extended to a vector-valued function, and the result is a complete map of wind velocity within the physical space defined by the sensory neurons’ varicosities [6]. Section 3.5 describes how LOGOS functors can be combined to produce a SynapticDensityField.

Figure 9. Derived data class diagram. Connectors with arrows and open or filled circles are as in Fig. 8; plain lines indicate the existence of a relation between the two classes. (Classes shown: Preparation, 3DAlignment, DerivedData, SynapticDensityField, 3DField, XYZDataMixin, Field, Data.)

3.2. Functors

The LOGOS functor concept is similar to that provided by the C++ Standard Template Library [9] (STL): it is an object that can be called like a function (an ordinary function, a function pointer, or an object that defines operator()). However, unlike the STL, compile-time type checking is not very useful here; we therefore dispense with the STL functor classes and provide one functor class and mechanisms for run-time checking of argument type, number, and order, as well as “documentation data” displayable for the user’s benefit (printable function name, short description, etc). For each functor, a corresponding metadata object (FunctorInfo) is created with the following information:

Name: String containing function name, for user interface.

Description: Short documentation string describing the function, for user interface.

Arguments: Number of arguments; unsigned.

ArgTypes: One for each argument; an enum that is also used to designate data object type. Note that the actual arguments passed at the eventual function call may include additional information, perhaps taken from associated metadata.

ArgDesignators: Some functors take in multiple arguments of the same type, but treat them differently. The user interface cannot be relied upon to provide argument order information, so the role of each argument cannot be implicit in their specification order. A string that documents an argument’s role can be associated with any or all (see section 3.4 for an example).

ReturnType: An enum specifying the type of data object created by the functor. Note that this does not imply that the functions that are eventually called have this return type; they return structures that contain success or failure information plus pointers to data that can be used by the WS to construct the appropriate data object.

Handle: A reference to the functor itself, used to invoke it.
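The fields above, together with the run-time argument checking they enable, might be sketched as follows. This is a hypothetical Python rendering for illustration only: LOGOS itself is implemented in C++, and the names `DataType`, `FunctorInfo`, and `check`, as well as the error message wording, are assumptions rather than the actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class DataType(Enum):
    """Hypothetical enum designating data object types."""
    PREPARATION = "preparation"
    ALIGNMENT_3D = "3DAlignment"


@dataclass
class FunctorInfo:
    name: str                        # printable function name
    description: str                 # short documentation string
    arg_types: List[DataType]        # one entry per argument
    arg_designators: Dict[int, str]  # argument index -> role, where one is needed
    return_type: DataType            # type of data object created
    handle: Callable                 # reference used to invoke the functor

    @property
    def arguments(self):
        """Number of arguments, derived here from the ArgTypes list."""
        return len(self.arg_types)

    def check(self, selected_types):
        """Return a list of error strings; empty if the selection is valid."""
        if len(selected_types) != self.arguments:
            return [f"expected {self.arguments} arguments, got {len(selected_types)}"]
        return [
            f"arg{i}: expected {want.value}, got {got.value}"
            for i, (want, got) in enumerate(zip(self.arg_types, selected_types))
            if want is not got
        ]
```

Checking against metadata rather than against compiled signatures is what would let a user interface validate a call to a functor it has never seen before.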

3.3. Metadata

There are two types of metadata: that which describes data and that which describes functors. The latter was described in section 3.2; we summarize the former here. Like FunctorInfo, DataInfo serves the purposes of run-time type checking, input validation, and documentation. Unlike functors, however, different data objects of the same class are interchangeable (functors at least have their inherent computational capabilities that define them). We must rely on objects’ DataInfo to allow us to distinguish among them. DataInfo includes:

Name: A string containing a creator-selected name. For experimental data imported into the system, these are typically selected using some system enabling quick identification of the experiment. For derived data generated by some functor, these may be less useful.

Notes: A string containing user-entered notes.

History: Functor and argument handles, along with date, time, and user. For experimental data, this would indicate import date, time, and creator. For derived data, this would point to the functor that returned it, plus the functor arguments. A complete history can be generated by recursively following the history entry of each data object argument’s DataInfo.

Tuning: An optional field, which for preparations indicates the wind direction that elicits a maximal response.

Handle: A reference to the data object itself.
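The recursive history reconstruction mentioned under History could look something like the sketch below. It is a simplified, hypothetical Python model: the `DataInfo` fields shown are a subset, arguments are held as direct references rather than handles, and the function `full_history` and its output format are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataInfo:
    name: str
    user: str
    date: str
    functor: Optional[str] = None    # None for imported experimental data
    arguments: List["DataInfo"] = field(default_factory=list)
    notes: str = ""


def full_history(info, depth=0):
    """Return indented history lines, following each argument's own history."""
    source = info.functor or "import"
    lines = [f"{'  ' * depth}{info.name}: {source} by {info.user} on {info.date}"]
    for arg in info.arguments:
        lines.extend(full_history(arg, depth + 1))
    return lines
```

For a 3DAlignment derived from two imported preparations, this would produce one line for the alignment followed by one indented line per preparation, which is the kind of audit trail listed among the design goals.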

3.4. Interactive Example: The Alignment Process

The process of interactively aligning two preparations’ coordinate systems is a simple example of how data and metadata are used by the UI. The interactive alignment process involves displaying two preparations, containing both neuron structure and fiducial landmark data, in a manipulator that provides tools for translation, rotation, and scaling. One preparation (the reference) remains locked in place while the other (the alignee) is moved, turned, and stretched until the user judges that the two sets of fiducials are matched as closely as practical. The result is a 3DAlignment.

Like all LOGOS operations, this begins in the WSO, which is displaying lists of WS data objects and functors. The interactive alignment functor resides in the UIC, which creates the following FunctorInfo for the WSO:

Name: “Interactive Alignment”

Description: “Use manipulator to manually align two preparations”

Arguments: 2

ArgTypes: arg0: preparation; arg1: preparation

ArgDesignators: arg0: “reference”; arg1: “alignee”

ReturnType: 3DAlignment

Handle: reference to the UIC functor

Suppose the user selects two preparations and the “Interactive Alignment” operation. The WSO then checks the number of data objects selected against the “Arguments” field above and the type of each argument against the “ArgTypes” entries. Next, we note that there are two “ArgDesignators”, indicating that each argument plays a particular role in the operation. The WSO must then prompt the user to assign a role (“Reference” versus “Alignee”) to each of the preparations.

At this point, we are ready to execute the functor, and the WSO passes the functor and argument handles (the argument roles now being implicit in their order) to the UIC. In this case, this is a UIC functor which creates a manipulator, fetches renderable versions of the preparations from the WS, and passes the preparations and their metadata to the manipulator. The user can then interact with the manipulator (or do other operations within LOGOS, as the manipulator is managed as a separate window) until he or she is satisfied with the alignment. At that point, the manipulator returns a new 3DAlignment object, which is passed to the WS along with its newly-created metadata (which includes a history entry showing it was created by an interactive alignment of the two preparations).
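The step in which user-assigned roles become implicit in argument order can be illustrated with a small Python sketch. The function `order_by_role` and its error handling are hypothetical; the paper does not describe how the WSO performs this reordering.

```python
def order_by_role(designators, roles_chosen):
    """Put selected data handles into the positional order the functor expects.

    designators:  role names in positional order, e.g. ["reference", "alignee"]
    roles_chosen: mapping from data handle to the role the user assigned to it
    """
    by_role = {}
    for handle, role in roles_chosen.items():
        if role not in designators:
            raise ValueError(f"unknown role {role!r}")
        if role in by_role:
            raise ValueError(f"role {role!r} assigned twice")
        by_role[role] = handle
    missing = [role for role in designators if role not in by_role]
    if missing:
        raise ValueError(f"unassigned roles: {missing}")
    return [by_role[role] for role in designators]
```

With this sketch, selecting `prep_B` as alignee and `prep_A` as reference yields the handles in the order `prep_A`, `prep_B`, whatever order the user clicked them in.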

3.5. Abstraction: Computation of Response Maps

A more complex operation that involves WS functors is the computation of the response map shown in Fig. 3. To compute this, we must address the problem of determining a “typical” cell’s influence based on our specific experimental data [11]. Ordinarily, this would be accomplished via statistical techniques. However, these are not directly applicable to the tree-like structure of a neuron. Two reasonable assumptions are used to render this problem tractable: that sensory cells influence interneurons only via their varicosities, and that the probability of an interneuron receiving input from a typical sensory cell can be computed from the distance between the interneuron’s branches and the sensory cell’s varicosities and from the varicosities’ surface area. Determination of a typical cell’s influence is then reduced to:

1. computing a function of 3D space within the ganglion based on the distribution of varicosities in each preparation.

2. combining these multiple functions to form a single, overall influence function (by summation, for example).

The metadata associated with the response map computation is:

Name            "Compute Response Map"
Description     "Estimate overall response to a stimulus"
Arguments       2
ArgTypes        arg0: preparationList
                arg1: userReal
ArgDesignators  arg0: none
                arg1: "Direction"
ReturnType      ResponseMap
Handle          reference to the functor

The arguments to this functor are a list of one or more preparations and a number for which the user is prompted (the stimulus wind direction); after the WSO performs its argument checking, the handles are passed to the UIC and then on to the WS. The response map functor is composed of five simpler operations [11]:

1. Find composite 3DAlignments which produce a common space for a set of preparations and apply them, producing a set of aligned preparations.

2. For a single VaricositySet, compute a synaptic density field. This is done by iterating through the varicosities, computing each one’s contribution to the overall field (as a Gaussian function of the sphere’s surface area), and summing the individual contributions. This produces a SynapticDensityField object.

3. For a single Preparation, compute its response to wind blowing in a particular direction.

4. For a single SynapticDensityField, compute a functional transformation by multiplying the field values by a constant, producing a ResponseMap.

5. Sum a set of ResponseMaps, producing a new one.
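Operations 2, 4, and 5 can be illustrated with a short Python sketch that evaluates the fields at a list of sample points. Several details are assumptions made only for this example: each varicosity is given as a center and a sphere radius, the surface area is taken to scale the amplitude of a Gaussian whose width sigma is a hypothetical parameter (the text does not say exactly how the area enters the Gaussian), and the per-preparation response from operation 3 is supplied as a plain number.

```python
import math


def synaptic_density(varicosities, sample_points, sigma=1.0):
    """Operation 2 (sketch): sum one Gaussian contribution per varicosity.

    varicosities: list of ((x, y, z), radius) pairs.
    Assumption: amplitude = sphere surface area, width = sigma.
    """
    field = []
    for point in sample_points:
        value = 0.0
        for center, radius in varicosities:
            area = 4.0 * math.pi * radius ** 2
            dist_sq = sum((a - b) ** 2 for a, b in zip(point, center))
            value += area * math.exp(-dist_sq / (2.0 * sigma ** 2))
        field.append(value)
    return field


def response_map(density_field, response):
    """Operation 4 (sketch): multiply the field values by a constant."""
    return [response * value for value in density_field]


def sum_response_maps(maps):
    """Operation 5 (sketch): pointwise sum of maps sampled at the same points."""
    return [sum(values) for values in zip(*maps)]


# Hypothetical data: one varicosity of radius 1 at the origin, two sample points.
points = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
density = synaptic_density([((0.0, 0.0, 0.0), 1.0)], points)
overall = sum_response_maps([response_map(density, 0.5),
                             response_map(density, 0.25)])
```

Sampling on a fixed set of points keeps the example small; a volumetric grid would serve the same role in practice.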

These five operations are sequenced by the WC. One particular operation that is likely to fail is number 1, if no composite alignment can be found (via a minimum spanning tree algorithm) that brings all preparations into the same space. This failure would be reported to the UIC, and the user would need to do additional alignments before trying again. Assuming success, the WC would return metadata for the final ResponseMap, and perhaps some of the intermediate objects produced (whether the intermediate objects are temporary or not is an implementation issue to be considered for algorithm efficiency).
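The failure case for operation 1 can be sketched with a standard Kruskal-style minimum spanning tree over the existing pairwise alignments. This is an illustration under assumptions, not the LOGOS algorithm: preparations are identified by hypothetical string names, each stored 3DAlignment is represented as a (cost, a, b) tuple, and the cost (for example, a residual fiducial mismatch) is an invented edge weight.

```python
def spanning_alignments(preparations, alignments):
    """Pick pairwise alignments that connect all preparations at least cost.

    alignments: list of (cost, prep_a, prep_b) tuples.
    Returns the chosen (prep_a, prep_b) edges, or raises ValueError when
    the preparations cannot all be brought into one common space.
    """
    parent = {p: p for p in preparations}

    def find(p):
        # Union-find root lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    tree = []
    for cost, a, b in sorted(alignments):
        if a not in parent or b not in parent:
            continue  # alignment involves a preparation outside this set
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b))

    if len(tree) != len(preparations) - 1:
        raise ValueError("no composite alignment connects all preparations")
    return tree
```

Composite alignments would then be formed by chaining the selected pairwise alignments along tree paths to a chosen reference preparation; the ValueError corresponds to the failure that would be reported to the UIC.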

Note that these five simpler operations might be utility functions known only to the WS, or they might be functors. The latter capability, composition of functors into more complex operations (as with the STL functor adaptors), is at the heart of the METALOGOS extension.
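The idea of composing functors into more complex operations can be sketched in a few lines of Python. The compose and map_over helpers below are hypothetical names chosen for this example; they play a role loosely analogous to functor adaptors and are not LOGOS or STL interfaces.

```python
from functools import reduce


def compose(*functors):
    """Chain functors left to right: compose(f, g)(x) == g(f(x))."""
    def composed(value):
        return reduce(lambda acc, functor: functor(acc), functors, value)
    return composed


def map_over(functor):
    """Adaptor lifting a single-object functor to operate on a list of objects."""
    return lambda items: [functor(item) for item in items]


# Hypothetical use: scale each value, then sum the results.
def double(x):
    return 2 * x


scale_then_sum = compose(map_over(double), sum)
```

In the same spirit, a response map functor could be assembled from the simpler operations rather than written as one monolithic routine.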

4. METALOGOS and the Application of Domain Knowledge

A follow-on to the LOGOS project is METALOGOS, a system which adds domain knowledge so that users may pose system-level queries and receive responses at that level. For example, researchers often desire to think not in terms of particular cells in particular individuals, but rather a particular class of cells. We may know that all sensory cells connected to a particular region of the sensory organ have velocity tuning curves with peak sensitivities for high-speed air motion. Those connected to other regions might have much lower peak sensitivities. A convenient categorization might be “fast” versus “slow”, and we might assign a sensory neuron to one category or another based on tuning data, if present or calculable from physiological data, or location of its connection, if only anatomical data are available. Thus, one type of domain knowledge used by METALOGOS is categorical or taxonomic. Categories may be entered manually, and thus attributed to an individual person, or computed from statistical cluster analysis.

Domain knowledge also includes knowledge of the consequences of various algorithms, for instance how computing a synaptic density field establishes a mapping between single cells in individual animals and either “typical” cells of that type or cell systems. Knowledge about data objects or functors would be stored in their associated metadata, and would start with codifying the information currently contained in strings in a machine-usable format.

The basic flow of processing in METALOGOS is:

1. Accept systems-level query.

2. Use domain knowledge to map systems-level query to a dataflow diagram, starting with queries of existing base and derived data and using available functors to produce a result that is a response to the query. If no such dataflow program can be determined, any partial diagrams should be reported along with failure to the user interface, so the experimenter can either modify the query, modify the dataflow diagram, or add additional domain knowledge.

3. Otherwise, pass dataflow program to command and display interface for execution.

4. Return result to user interface.
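Step 2 of this flow can be illustrated with a very small forward-chaining planner over functor type signatures. This is a sketch under strong assumptions and not the METALOGOS design: the only domain knowledge used is each functor's argument and return types, the functor and type names in the example are partly hypothetical, and no attempt is made to prune steps that turn out to be irrelevant to the query.

```python
def plan_dataflow(available_types, functors, goal_type):
    """Build a sequence of functor names that derives goal_type.

    functors: list of (name, input_types, output_type) tuples.
    Returns (plan, reached_goal). When reached_goal is False, plan holds
    the partial dataflow that could be built, for reporting to the user.
    """
    known = set(available_types)
    plan = []
    progress = True
    while goal_type not in known and progress:
        progress = False
        for name, inputs, output in functors:
            if output not in known and all(t in known for t in inputs):
                known.add(output)
                plan.append(name)
                progress = True
    return plan, goal_type in known


# Hypothetical functor signatures loosely based on the response map example.
FUNCTORS = [
    ("align", ["preparation"], "alignedPreparation"),
    ("density", ["alignedPreparation"], "SynapticDensityField"),
    ("respond", ["SynapticDensityField", "userReal"], "ResponseMap"),
]

full_plan = plan_dataflow(["preparation", "userReal"], FUNCTORS, "ResponseMap")
partial_plan = plan_dataflow(["preparation"], FUNCTORS, "ResponseMap")
```

In the second call the wind direction is missing, so the planner returns the partial plan together with a failure flag, mirroring the requirement that partial diagrams be reported so the experimenter can modify the query or add knowledge.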

METALOGOS is also meant to deal more thoroughly with issues such as attribution of sources for data, etc. This is to some extent an extension of tracking error propagation through calculations; in this case, individual users would be able to assign levels of “trustworthiness” to others, and the system would compute a reasonable estimate of the trustworthiness of its results, based on who originated the knowledge it used. Ideally, METALOGOS would include (either as part of its database or via interface or broker programs) links between data and the publications that use them. This would provide both a mechanism for making the data upon which a publication is based public and the means for other investigators to easily examine the data and algorithms used by others to draw their conclusions, perhaps using one’s own data as tests.

5. Discussion

Moving neuroinformatics beyond the stage of merely producing enhanced file systems or electronic anatomical atlases is a difficult task. Successful systems will require architects who are comfortable in both the biological and computer domains, as well as close collaboration with active biology researchers. Though one would not expect to have solutions to the large number of user interface, visualization, database, knowledge representation, and other problems a priori, the systems produced should be useful even in their infancy. This implies a decomposition of the design into components which are as independent from each other as practical. Since any initial data schemas will evolve and grow considerably over the system’s life cycle, its initial design should allow this and, as far as possible, foresee possible areas of change. The system should also have at least an element of “the vision thing”: an ultimate goal that will improve the research process itself in both depth and breadth.

6. Implementation Status

As of the date of this writing, components of the user interface (viewer/manipulator, WSO) and workspace (WC, WM) have been prototyped. These prototypes are currently being modified from stand-alone, test versions. The remaining UI and WS modules are in the detailed design and coding stages; we expect the UI and WS to be fully functional and integrated together shortly. Data manager implementation awaits acquisition of the ObjectStore ODBMS.

References

[1] J. M. Bower and D. Beeman. The Book of GENESIS: Exploring Realistic Neural Models. TELOS, Santa Clara, CA, 1995.

[2] J. C. French, A. K. Jones, and J. L. Pfaltz. Scientific database management. Technical Report 90-21, University of Virginia, Dept. of Computer Science, Aug. 1990.

[3] N. Goodman, S. Rozen, and L. Stein. Building a laboratory information system around a C++-based object-oriented DBMS. In Proc. 20th VLDB Conf., pages 722–9, Santiago, Chile, 1994.

[4] M. Huerta, S. Koslow, and A. Leshner. The human brain project: an international resource. Trends Neurosci., 16(11):436–8, 1993.

[5] G. A. Jacobs. Detection and analysis of air currents by crickets. BioScience, 45(11):776–85, 1995.

[6] G. A. Jacobs and F. E. Theunissen. Functional organization of a neural map in the cricket cercal sensory system. J. Neurosci., 1996.

[7] L. Kerschberg, H. Gomaa, D. Menasce, and J. Yoon. Data and information architectures for large-scale distributed data intensive information systems. In Proc. 8th Int. Conf. Statistical & Scientific Database Management, pages 226–33, Stockholm, June 1996.

[8] E. Mesrobian, R. Muntz, E. Shek, S. Nittel, and M. LaRouche. OASIS: An open architecture scientific information system. In Proc. RIDE ’96, pages 107–16, New Orleans, Feb. 1996.

[9] D. R. Musser and A. Saini. STL Tutorial and Reference Guide. Addison-Wesley, Reading, MA, 1996.

[10] Establishing identified neuron databases. Workshop report, National Science Foundation, Arlington, VA, June 1994.

[11] T. W. Troyer, J. E. Levin, and G. A. Jacobs. Construction and analysis of a database representing a neural map. Microscopy Res. & Tech., 29:329–43, 1994.

[12] J. Wernecke. The Inventor Mentor. Addison-Wesley, Reading, MA, 1994.