![Page 1: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/1.jpg)
The Genomic HyperBrowser
Statistical genome analysis made accessible and reproducible
Sveinung Gundersen
Elixir.no, UiO
![Page 2: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/2.jpg)
Credit
• Based on a presentation held by Assoc. Prof. Geir Kjetil Sandve at a meeting in Oxford, May 7th, 2013
![Page 3: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/3.jpg)
Focus
• Downstream analysis of high-level genome-scale data
• You want to compare your data with existing data collections
• But ...
• how to find the questions they can answer?
• how to go about answering questions at this scale?
![Page 4: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/4.jpg)
Outline
• A bioinformatician’s view on genomics
• Analyzing genomic track data
• Under the hood of the analysis tools
• A quick tour of HyperBrowser features
![Page 5: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/5.jpg)
Outline
• A bioinformatician’s view on genomics
• Analyzing genomic track data
• Under the hood of the analysis tools
• A quick tour of HyperBrowser features
![Page 6: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/6.jpg)
What is a reference genome?
• It’s a bunch of sequence
• The human genome is a collection of ~3 billion nucleotides
• It’s a map!
• Where sequences belong in relation to each other
• Essentially makes up a line
Genome
![Page 7: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/7.jpg)
The whiteboard and the computer file
Genome
Reference genome acts like a coordinate system for genomic data
chr21 10079666 10120808 NM_001187
chr21 13332357 13412442 NR_026916
chr21 13700575 13700652 NR_036164
chr21 13904368 13935777 NM_174981
chr21 14137324 14142556 NR_026755
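The rows on this slide can be read as intervals on the genome line. A minimal sketch of that reading, assuming whitespace- or tab-separated columns in the order chromosome, start, end, name; `Segment` and `parse_segments` are hypothetical names introduced here, not part of the HyperBrowser.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One interval on the reference genome (assumed column order)."""
    chrom: str
    start: int
    end: int
    name: str


def parse_segments(text: str) -> list[Segment]:
    """Parse one segment per non-empty line: chrom, start, end, name."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, name = line.split()[:4]
        segments.append(Segment(chrom, int(start), int(end), name))
    return segments


rows = """\
chr21 10079666 10120808 NM_001187
chr21 13332357 13412442 NR_026916
"""

segments = parse_segments(rows)
for seg in segments:
    # The coordinates alone give position and length; no sequence is needed.
    print(seg.name, seg.chrom, seg.end - seg.start)
```

Once parsed, every later question (overlap, distance, coverage) becomes arithmetic on these coordinates.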
![Page 8: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/8.jpg)
DNA as a line
• This is indeed the dynamic perspective!
• DNA doesn’t change that much from hour to hour, or cell to cell
• But a lot happens along the DNA: binding by TFs, modifications of histones, ...
• Even for gene expression or SNPs we can usually abstract away from the underlying sequence
• Functional genomics typically refers to the genome as a line (map), not as sequence
![Page 9: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/9.jpg)
The UCSC Genome Browser
![Page 10: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/10.jpg)
Public data: ENCODE, FANTOM, GEO, Roadmap Epigenomics, ...
• By now, Big Science provides:
• Chromatin accessibility (DHSs) for ~350 cell samples
• Binding of ~100 TFs in several cell types
• Most histone modifications in several cell types
• Gene expression for thousands of setups
• TSS and active promoters in ~950 cell samples
• DNA methylation, 3D genome structure, ...
![Page 11: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/11.jpg)
Outline
• A bioinformatician’s view on genomics
• Analyzing genomic track data
• Under the hood of the analysis tools
• A quick tour of HyperBrowser features
![Page 12: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/12.jpg)
Exploiting the data
• Data is becoming less of a bottleneck
• With so much public data, some is likely to be relevant
• Producing large amounts of new data is often within reach
• But, asking the right questions is still tricky
• Forming interesting hypotheses is no easier than before
• The large scale complicates analysis
![Page 13: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/13.jpg)
This can’t be it?!
?
![Page 14: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/14.jpg)
Cell types and MS associated regions
• Regions of the genome are not always active
• Activity varies, e.g. between cell types
• Due to e.g. modification of histones
• In which cell types are MS associated regions active?
![Page 15: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/15.jpg)
Cell-type specific activity of MS regions
• MS GWAS SNP locations along genome
• Histone modification-derived chromatin states along the genome, in 9 cell types
• Derived from ENCODE data (Nature, 473, 43–49)
• Are regions around MS GWAS SNPs unexpectedly active in B-cells (GM12878)?
![Page 16: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/16.jpg)
A simple approach
• Do MS regions overlap more than expected with B-cell AP regions?
• But, this is really a bit too simple
![Page 17: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/17.jpg)
A more reasonable approach (and still quite straightforward)
• Do MS regions overlap unexpectedly more with B-cell than e.g. stem cell regions?
• Yes!
• (“Genomic regions associated with multiple sclerosis are active in B cells”, PLoS One. 2012;7(3))
![Page 18: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/18.jpg)
Outline
• A bioinformatician’s view on genomics
• Analyzing genomic track data
• Under the hood of the analysis tools
• A quick tour of HyperBrowser features
![Page 19: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/19.jpg)
Delineating basic types of genomic tracks
Points
Segments
Function
Bins
![Page 20: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/20.jpg)
Track types: 7 basic track types
Genome Partition (GP)
Step Function (SF)
Function (F)
Points (P)
Segments (S)
Valued Points (VP)
Valued Segments (VS)
![Page 21: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/21.jpg)
Track types: 8 advanced track types
Linked Points (LP)
Linked Segments (LS)
Linked Genome Partition (LGP)
Linked Valued Points (LVP)
Linked Valued Segments (LVS)
Linked Step Function (LSF)
Linked Base Pairs (LBP)
Linked Function (LF)
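One way to see why there are 7 basic and 8 linked types is to treat a track type as a non-empty combination of four properties: gaps, lengths, values and links. The sketch below enumerates those combinations. The mapping from property combinations to names is an assumption based on the track-type paper listed under Publications and should be checked against it; `track_type` is a hypothetical helper.

```python
from itertools import product

# Assumed mapping: (gaps, lengths, values) -> basic track type name.
BASE_NAMES = {
    (True, False, False): "Points",
    (True, True, False): "Segments",
    (False, True, False): "Genome Partition",
    (True, False, True): "Valued Points",
    (True, True, True): "Valued Segments",
    (False, True, True): "Step Function",
    (False, False, True): "Function",
}


def track_type(gaps, lengths, values, links):
    """Name the track type for a combination of the four properties."""
    key = (gaps, lengths, values)
    if not any(key):
        # Links only: every base pair is an element; no properties at all: no track.
        return "Linked Base Pairs" if links else None
    name = BASE_NAMES[key]
    return f"Linked {name}" if links else name


all_types = [
    name
    for combo in product([False, True], repeat=4)
    if (name := track_type(*combo)) is not None
]

basic = [t for t in all_types if not t.startswith("Linked")]
linked = [t for t in all_types if t.startswith("Linked")]
print(len(basic), "basic:", basic)
print(len(linked), "linked:", linked)
```

Four yes/no properties give 16 combinations; dropping the empty one leaves the 15 types listed on these two slides.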
![Page 22: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/22.jpg)
![Page 23: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/23.jpg)
S-S Overlap
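Counting overlap between two segment tracks (S-S) can be sketched as a single sweep over both tracks. This is an illustration only, not the HyperBrowser implementation; it assumes each track is a sorted list of non-overlapping, half-open `(start, end)` intervals on one chromosome, and `overlap_bp` is a hypothetical name.

```python
def overlap_bp(a, b):
    """Base pairs covered by both tracks.

    a, b: sorted lists of non-overlapping half-open (start, end) tuples.
    """
    i = j = total = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            total += end - start  # the two current segments intersect
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


track1 = [(10, 20), (30, 50)]
track2 = [(15, 35), (45, 60)]
print(overlap_bp(track1, track2))  # 15
```

The sweep visits each segment once, so the cost grows linearly with the number of segments in the two tracks.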
![Page 24: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/24.jpg)
The troubling random nature
• Counting overlap is straightforward
• But statistical testing requires random data
• “The multitudes of possible genomes that evolution might have produced for our and other species”
• Must find something that is reasonable enough
• Does appropriate randomness match statistical tests?
![Page 25: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/25.jpg)
Tracing assumptions
• Textbook Wilcoxon H0:
• Values (4) independent and symmetric around 0
• But what is assumed about the genomic track data?
![Page 26: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/26.jpg)
A grammar for null models
• Specifying assumptions:
• Which of the tracks should be randomized?
• Which properties should still be preserved?
• How should track elements be randomized?
• Computing p-values according to model
• Exact/asymptotic test if assumptions match
• Monte Carlo with explicit randomization if needed
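The three questions of the null-model grammar can be made concrete with a small Monte Carlo sketch. It is a sketch under stated assumptions, not the HyperBrowser's own procedure: track A is randomized and track B is kept fixed; segment lengths and non-overlap are preserved; positions are redrawn uniformly within a single bin of size `bin_size`. All function names are hypothetical.

```python
import random


def overlap_bp(a, b):
    """Base pairs shared by two sorted, non-overlapping (start, end) tracks."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start < end:
            total += end - start
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def randomize_segments(segments, bin_size, rng):
    """Preserve segment lengths, redraw positions without overlap."""
    lengths = [end - start for start, end in segments]
    rng.shuffle(lengths)
    free = bin_size - sum(lengths)  # base pairs not covered by any segment
    # Sorted cut points in the free space give the gap before each segment.
    cuts = sorted(rng.randint(0, free) for _ in lengths)
    randomized, offset = [], 0
    for cut, length in zip(cuts, lengths):
        start = cut + offset
        randomized.append((start, start + length))
        offset += length
    return randomized


def monte_carlo_p(track_a, track_b, bin_size, n_samples, rng):
    """One-sided p-value: is the observed overlap unexpectedly high?"""
    observed = overlap_bp(track_a, track_b)
    at_least = sum(
        overlap_bp(randomize_segments(track_a, bin_size, rng), track_b) >= observed
        for _ in range(n_samples)
    )
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_samples + 1)


rng = random.Random(0)
track_a = [(100, 200), (300, 400)]
track_b = [(100, 200), (300, 400)]
p_value = monte_carlo_p(track_a, track_b, bin_size=10_000, n_samples=999, rng=rng)
print(p_value)
```

Changing which track is randomized, or which properties `randomize_segments` preserves, changes the null model and therefore the p-value, which is why the assumptions need to be stated explicitly.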
![Page 27: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/27.jpg)
Outline
• A bioinformatician’s view on genomics
• Analyzing genomic track data
• Under the hood of the analysis tools
• A quick tour of HyperBrowser features
![Page 28: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/28.jpg)
[Figure: overview of the HyperBrowser tools in three stages: data preparation, data customization, analysis]
• Data sources: external track collection (UCSC, ENCODE), Galaxy history data, spreadsheet / tabular files, HB track repository (Table 3, Extract track tool)
• Format & convert (Table 3, 2 tools)
• Generate tracks (Table 3, 6 tools)
• Customize tracks (Table 3, 4 tools)
• Intermediate results: basic track representation, tracks suitable for analysis
• Descriptive statistics (Table 1, Analyze genomic tracks): statistics on tracks and relations
• Hypothesis testing (Table 1, Analyze genomic tracks): hypotheses supported by data
• Visualization (Table 2, 5 tools): explorative plots of tracks and relations
• Clustering analysis (Table 2, Cluster tracks): unsupervised subgrouping of tracks
• 3D analysis (Table 2, Analyze spatial co-localization): hypotheses on 3D co-localization supported by data
![Page 29: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/29.jpg)
![Page 30: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/30.jpg)
Current focus
• Main focus:
• Simple fetching of collections of genomic tracks from public sources
• Handling of multi-track collections and analysis
• Better integration of HyperBrowser with NeLS and TSD
![Page 31: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/31.jpg)
Future directions
• Analyzing promoter-enhancer interaction, taking high-resolution chromosome conformation data into account
• Better handling of phenotype information (for pharmacology collaboration projects)
• Several other collaboration projects
![Page 32: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/32.jpg)
Publications
• Core statistical analysis system
• “The Genomic HyperBrowser: inferential genomics at the sequence level” (Genome Biology, 2010)
• “The Genomic HyperBrowser: an analysis web server for genome-scale data” (Nucleic Acids Research, 2013)
• Types of genomic tracks
• “Identifying elemental genomic track types and representing them uniformly” (BMC Bioinformatics, 2011)
![Page 33: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/33.jpg)
Publications
• Google maps of many-to-many analyses
• “The differential disease regulome” (BMC Genomics, 2011)
• 3D genome structure analysis
• “Handling realistic assumptions in hypothesis testing of 3D co-localization of genomic elements” (Nucleic Acids Research, 2013)
![Page 34: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/34.jpg)
The team
Knut Liestøl
Eivind Tøstesen
Sigve Nakken
Halfdan Rydbeck
Geir Kjetil Sandve
Trevor Clancy
Fang Liu
Sveinung Gundersen
Ingrid K.
Lars
Arnoldo Frigessi
Eivind Hovig
Morten Johansen
Marit Holden
Vegard Nygaard
Egil Ferkingstad
2008
2012
![Page 35: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/35.jpg)
Support
![Page 36: The Genomic HyperBrowser · DNA as a line • This is indeed the dynamic perspective! • DNA doesn’t change that much from hour to hour, or cell to cell • But a lot happens along](https://reader034.vdocument.in/reader034/viewer/2022050203/5f5720c0037b04452e4d1ab2/html5/thumbnails/36.jpg)
Conclusion
• If you want to do genome analysis, and don’t want to reinvent the wheel:
• Google “HyperBrowser” and try out the web system
• PubMed “HyperBrowser” and skim the 2013 NAR article