An Evaluation of Microarray Visualization Tools for Biological Insight

Presented by Tugrul Ince and Nir Peer, University of Maryland

Purvi Saraiya and Chris North, Dept. of Computer Science, Virginia Polytechnic Institute and State University
Karen Duca, Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University



TRANSCRIPT

Page 1: An Evaluation of Microarray Visualization Tools for Biological Insight

An Evaluation of Microarray Visualization Tools for Biological Insight

Presented by Tugrul Ince and Nir Peer
University of Maryland

Purvi Saraiya, Chris North
Dept. of Computer Science
Virginia Polytechnic Institute and State University

Karen Duca
Virginia Bioinformatics Institute
Virginia Polytechnic Institute and State University

Page 2: An Evaluation of Microarray Visualization Tools for Biological Insight

Goals

- Evaluate five popular visualization tools
  - Cluster/TreeView
  - TimeSearcher
  - Hierarchical Clustering Explorer (HCE)
  - Spotfire
  - GeneSpring
- Do so in the context of bioinformatics data exploration

Page 3: An Evaluation of Microarray Visualization Tools for Biological Insight

Goals

- Research Questions
  - How successful are these tools in stimulating insight?
  - How do various visualization techniques affect the users’ perception of data?
  - How does the users’ background affect tool usage?
  - How do these tools support hypothesis generation?
  - Can insight be measured in a controlled experiment?

Page 4: An Evaluation of Microarray Visualization Tools for Biological Insight

Visualization Evaluations

- Typically, evaluations consist of controlled measurements of user performance and accuracy on predetermined tasks
- We are looking for an evaluation that better simulates a bioinformatics data analysis scenario
  - We use a protocol that focuses on recognition and quantification of insights gained from actual exploratory use of visualizations

Page 5: An Evaluation of Microarray Visualization Tools for Biological Insight

Insights

- It is hard to define what an “insight” is
- We need this term to be quantifiable and reproducible
- Solution
  - Encourage users to think aloud and report any findings they have about the dataset
  - Videotape each session to capture and characterize individual insights as they occur
    - This generally provides more information than subjective measures from post-experiment surveys

Page 6: An Evaluation of Microarray Visualization Tools for Biological Insight

Insights

- Define insight as
  - an individual observation about the data by the participant
  - a unit of discovery
  - essentially, any data observation made during the think-aloud protocol
- Now we can quantify some characteristics of each insight

Page 7: An Evaluation of Microarray Visualization Tools for Biological Insight

Insight Characteristics

- Observation
  - The actual finding about the data
- Time
  - The amount of time taken to reach the insight
- Domain Value
  - The significance of the insight, coded by a domain expert
- Hypotheses
  - Hypothesis and direction of research
- Directed vs. Unexpected
  - Recall: participants are asked to identify questions they want to explore
- Correctness
- Breadth vs. Depth

Page 8: An Evaluation of Microarray Visualization Tools for Biological Insight

Insight Characteristics

- Category
  - Overview – overall distributions of gene expression
  - Patterns – identification or comparison across data attributes
  - Groups – identification or comparison of groups of genes
  - Details – focused information about specific genes
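The insight characteristics and categories listed on the last two slides can be read as the fields of one coded record per insight. The following is a minimal sketch of such a record, not the authors' actual coding sheet: the field names, the integer scale for domain value, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The four insight categories named on the slide."""
    OVERVIEW = "overview"   # overall distributions of gene expression
    PATTERNS = "patterns"   # identification or comparison across data attributes
    GROUPS = "groups"       # identification or comparison of groups of genes
    DETAILS = "details"     # focused information about specific genes


@dataclass
class Insight:
    """One coded insight. Field names and types are hypothetical."""
    observation: str             # the actual finding about the data
    minutes_into_session: float  # time taken to reach the insight
    domain_value: int            # significance, coded by a domain expert (scale assumed)
    led_to_hypothesis: bool      # did the insight lead to a hypothesis?
    directed: bool               # True if it answers a question listed up front
    correct: bool                # correctness of the observation
    is_breadth: bool             # True for breadth, False for depth
    category: Category


# Invented example record, for illustration only.
example = Insight(
    observation="A group of genes rises together over time",
    minutes_into_session=12.5,
    domain_value=3,
    led_to_hypothesis=False,
    directed=True,
    correct=True,
    is_breadth=False,
    category=Category.PATTERNS,
)
```

Keeping every characteristic on one record means each video-coded insight can later be counted, summed, or filtered by any of these fields.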

Page 9: An Evaluation of Microarray Visualization Tools for Biological Insight

Experiment Design

- A 3×5 between-subjects design
  - between-subjects: different subjects for each pair
  - Dataset: 3 treatments
  - Visualization tool: 5 treatments

Page 10: An Evaluation of Microarray Visualization Tools for Biological Insight

Experiment Design

- Participants
  - 2 participants per dataset per tool
  - Have at least a Bachelor’s degree in a biological field
  - Assigned to tools they had never worked with before
    - to prevent advantage
    - to measure learning time
- Categories
  - 10 Domain Experts
    - Senior researchers with extensive experience in microarray experiments and microarray data analysis
  - 11 Domain Novices
    - Lab technicians or graduate student research assistants
  - 9 Software Developers
    - Professionals who implement microarray software tools
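As a quick check on the design arithmetic, the sketch below enumerates the dataset-tool cells and compares the implied number of participants with the background counts given above. The dataset labels are placeholders, since the slides do not list the three datasets at this point; the tool names come from the Goals slide.

```python
from itertools import product

# Placeholder labels: the slides only state that there are 3 datasets.
DATASETS = ["dataset A", "dataset B", "dataset C"]
TOOLS = ["Cluster/TreeView", "TimeSearcher", "HCE", "Spotfire", "GeneSpring"]
PARTICIPANTS_PER_CELL = 2  # "2 participants per dataset per tool"

# Every (dataset, tool) pair is one cell of the 3x5 between-subjects design.
cells = list(product(DATASETS, TOOLS))
total_participants = len(cells) * PARTICIPANTS_PER_CELL

# Background categories as reported on the slide.
background_counts = {
    "domain expert": 10,
    "domain novice": 11,
    "software developer": 9,
}

print(len(cells), "cells,", total_participants, "participants")
print("background total:", sum(background_counts.values()))
```

Both totals come to 30, so the three background categories account for every participant in the 15 cells.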

Page 11: An Evaluation of Microarray Visualization Tools for Biological Insight

Protocol and Measures

- Chose new users with only minimal tool training
  - Success in the initial usage period is critical for the tool’s adoption by biologists
- Participants received an initial training
  - Background description about the dataset
  - 15-minute tool tutorial
- Participants listed some analysis questions
- Instructed to examine the data with the tool as long as needed
- They were allowed to ask for help about the tool
  - Simulates training by colleagues

Page 12: An Evaluation of Microarray Visualization Tools for Biological Insight

Protocol and Measures

- Every 15 minutes, participants estimated the percentage of total potential insight they had obtained so far
- Finally, they assessed their overall experience with the tool during the session
- The entire session was videotaped for later analysis
  - Later, all individual occurrences of insights were identified and codified
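Once insights have been identified and codified from the video, per-tool measures such as insight count, total domain value, and average time to first insight follow from simple aggregation. The sketch below shows one way this could be done. The record layout and every number in it are invented for illustration and are not results from the study.

```python
from collections import defaultdict

# (tool, participant id, minutes into session, domain value): invented data.
coded_insights = [
    ("Spotfire", "p1", 4.0, 3),
    ("Spotfire", "p1", 12.5, 2),
    ("Spotfire", "p2", 6.0, 4),
    ("HCE", "p3", 15.0, 1),
    ("HCE", "p4", 20.0, 2),
]


def summarize_by_tool(records):
    """Aggregate coded insights into per-tool summary measures."""
    per_tool = defaultdict(lambda: {"count": 0, "total_value": 0, "first": {}})
    for tool, participant, minutes, value in records:
        entry = per_tool[tool]
        entry["count"] += 1
        entry["total_value"] += value
        # Track the earliest insight time for each participant.
        first = entry["first"]
        first[participant] = min(minutes, first.get(participant, minutes))

    summary = {}
    for tool, entry in per_tool.items():
        firsts = entry["first"].values()
        summary[tool] = {
            "count": entry["count"],
            "total_value": entry["total_value"],
            "avg_time_to_first": sum(firsts) / len(firsts),
        }
    return summary


summary = summarize_by_tool(coded_insights)
for tool, measures in summary.items():
    print(tool, measures)
```

Time to first insight is averaged over participants rather than over insights, so a participant with many insights does not dominate the measure.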

Page 13: An Evaluation of Microarray Visualization Tools for Biological Insight

Show me pictures

Here are the tools!!!

Page 14: An Evaluation of Microarray Visualization Tools for Biological Insight

Cluster/TreeView = ClusterView

- Cluster
  - to cluster data
- TreeView
  - Visualize the clusters
  - Uses heat-maps

Page 15: An Evaluation of Microarray Visualization Tools for Biological Insight

TimeSearcher 1

- Parallel Coordinate Visualization
- Interactive Filtering
- Line Graphs for each data entity

Page 16: An Evaluation of Microarray Visualization Tools for Biological Insight

HCE

- Clusters data
- Several Visualizations
  - Heat-Maps
  - Parallel Coordinates
  - Scatter Plots
  - Histograms
- Brushing and Linking

Page 17: An Evaluation of Microarray Visualization Tools for Biological Insight

Spotfire

- General Purpose Visualization Tool
- Several Displays
  - Scatter Plots
  - Bar Graphs
  - Histograms
  - Pie/Line Charts
  - Others…
- Dynamic Query Sliders
- Brushing and Linking

Page 18: An Evaluation of Microarray Visualization Tools for Biological Insight

GeneSpring

- Suitable for Microarray data analysis
  - Shows physical positions on genomes
  - Array layouts
  - Pathways
  - Gene-to-gene comparison
- Brushing and Linking
- Clustering capability

Page 19: An Evaluation of Microarray Visualization Tools for Biological Insight

Enough about Tools,
Tell me the Results!!!

Page 20: An Evaluation of Microarray Visualization Tools for Biological Insight

Number of Insights

[Chart: number of insights per tool (ClusterView, TimeSearcher 1, HCE, Spotfire, GeneSpring)]

- Spotfire: highest number of insights
- HCE: poorest

Page 21: An Evaluation of Microarray Visualization Tools for Biological Insight

Total Domain Value

[Chart: total domain value per tool (ClusterView, TimeSearcher 1, HCE, Spotfire, GeneSpring)]

- Spotfire: highest insight value
- HCE, GeneSpring: poorer

Page 22: An Evaluation of Microarray Visualization Tools for Biological Insight

Avg. Final Amount Learned

[Chart: average final amount learned per tool (ClusterView, TimeSearcher 1, HCE, Spotfire, GeneSpring)]

- Spotfire: high value in learning
- ClusterView and HCE are poor

Page 23: An Evaluation of Microarray Visualization Tools for Biological Insight

Avg. Time to First Insight

[Chart: average time to first insight per tool (ClusterView, TimeSearcher 1, HCE, Spotfire, GeneSpring)]

- ClusterView: very short time to first insight
- TimeSearcher 1 and Spotfire are also quick

Page 24: An Evaluation of Microarray Visualization Tools for Biological Insight

Avg. Total Time

[Chart: average total time per tool (ClusterView, TimeSearcher 1, HCE, Spotfire, GeneSpring)]

- Total time users spent using the tool
- Low values: the tool is either efficient or not useful for insight

Page 25: An Evaluation of Microarray Visualization Tools for Biological Insight

Unexpected Insights

- HCE revealed several unexpected results
- ClusterView provided a few
- TimeSearcher 1 for time series data
- Spotfire contributed to 2 unexpected insights

Hypotheses

- A few insights led to hypotheses
  - Spotfire: 3
  - ClusterView: 2
  - TimeSearcher 1: 1
  - HCE: 1

Page 26: An Evaluation of Microarray Visualization Tools for Biological Insight

Tools vs. Datasets

Page 27: An Evaluation of Microarray Visualization Tools for Biological Insight

Insight Categories

- Overall Gene Expression
  - Overview of genes in general
- Expression Patterns
  - Searching patterns is critical
  - Clustering is useful
- Grouping
  - Some users wanted to group genes
  - GeneSpring enables grouping
- Detail Information
  - Users want detailed information about genes that are familiar to them

Page 28: An Evaluation of Microarray Visualization Tools for Biological Insight

Visual Representations and Interactions

- Although some tools have many visualization techniques, users tend to use only a few
  - Spotfire users preferred heat-maps
  - GeneSpring users preferred parallel coordinates
- Lupus dataset: visualized best with heat-maps
- Most users preferred outputs of clustering algorithms
- HCE not useful when a particular column arrangement is useful

Page 29: An Evaluation of Microarray Visualization Tools for Biological Insight

Running out of time, so wrap up

- Use a visualization tool (that’s why we’re here!)
- Spotfire: best general performance
- GeneSpring: hard to use
- Dataset dictates best tool!
  - Time series data: TimeSearcher
  - Others: Spotfire, GeneSpring?
- Interaction is the key
- Grouping and clustering are necessary features

Page 30: An Evaluation of Microarray Visualization Tools for Biological Insight

Critique

- In all fairness, measuring insights is really hard! Here are some possible issues
- Subjectivity
  - Experiment relies on users always thinking aloud
  - Also depends on a domain expert to evaluate insights
  - Results may vary widely based on participants’ expertise (only two per tool-dataset pair)
  - Some insight characteristics are inherently subjective
    - Domain Value
    - Breadth vs. Depth

Page 31: An Evaluation of Microarray Visualization Tools for Biological Insight

Critique

- How does one count insights?
  - Assumes honest reporting by participants
  - Some insights may be of no great value
  - What if a discovery just reaffirms a known fact? Is that an insight?
- Measuring time taken to reach an insight
  - Maybe instead of measuring from the beginning of the session we should measure from the last insight
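To make the last suggestion concrete, the sketch below contrasts the two ways of timing insights: elapsed time since the start of the session versus elapsed time since the previous insight. The function names and the sample timestamps are hypothetical and not taken from the study.

```python
def time_from_start(timestamps):
    """Time of each insight measured from the beginning of the session."""
    return sorted(timestamps)


def time_since_previous(timestamps):
    """Time of each insight measured from the previous insight.

    The first insight is still measured from the start of the session (t = 0).
    """
    ordered = sorted(timestamps)
    previous = [0.0] + ordered[:-1]
    return [t - p for t, p in zip(ordered, previous)]


# Invented session: insights reported at 5, 8 and 20 minutes.
session = [5.0, 8.0, 20.0]
print(time_from_start(session))      # [5.0, 8.0, 20.0]
print(time_since_previous(session))  # [5.0, 3.0, 12.0]
```

Measured from the start, the third insight looks like the slowest by far. Measured from the previous insight, it took 12 minutes of exploration, which may be a fairer reading of how long that particular discovery required.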