Page 1: How I learned to quit worrying Deanna M. Church Staff Scientist, NCBI @deannachurch Short Course in Medical Genetics 2013 And love multiple coordinate

How I learned to quit worrying and love multiple coordinate systems

Deanna M. Church

Staff Scientist, NCBI

@deannachurch

Short Course in Medical Genetics 2013

Page 2:

Alternate loci/Patch

RefSeqGene/LRG

Transcripts (NM_XXXXXX.X)

Proteins (NP_XXXXXX.X)

* Not drawn to scale

Page 3:

http://www.ncbi.nlm.nih.gov/genome/tools/remap

Page 4:

Software v1.4

Different versions of the same assembly to each other (e.g. NCBI36 <-> GRCh37)

Different assemblies from the same organism to each other (e.g. HuRef <-> GRCh37)

Page 5:

Producing Assembly-Assembly Alignments

First Pass Alignments: Symmetrical best hits; each region has only 0 or 1 alignment to the other assembly.

Second Pass Alignments: attempt to recover regions not in the first pass

Uses assembly structure to guide first pass alignments
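The "symmetrical best hits" idea can be illustrated with a minimal sketch. This is not the Remap implementation: the `Hit` record, the region names, and the scores are hypothetical, and the use of assembly structure to guide alignments is not modeled. A hit is kept only if it is the best hit for its source region and also the best hit for its target region, so every region ends up with 0 or 1 alignment.

```python
from collections import namedtuple

# Hypothetical alignment record: a region on each assembly plus a score.
Hit = namedtuple("Hit", ["source", "target", "score"])


def best_by(hits, key):
    """Return the highest-scoring hit for each value of `key` ('source' or 'target')."""
    best = {}
    for hit in hits:
        k = getattr(hit, key)
        # On a tie, the hit seen first is kept.
        if k not in best or hit.score > best[k].score:
            best[k] = hit
    return best


def symmetric_best_hits(hits):
    """Keep hits that are the best in both directions (reciprocal best hits)."""
    best_for_source = best_by(hits, "source")
    best_for_target = best_by(hits, "target")
    return sorted(
        hit
        for hit in best_for_source.values()
        if best_for_target[hit.target] == hit
    )


# Illustrative data only.
hits = [
    Hit("src_region_1", "tgt_region_1", 90),
    Hit("src_region_1", "tgt_region_2", 50),
    Hit("src_region_2", "tgt_region_1", 80),
    Hit("src_region_3", "tgt_region_3", 70),
]
first_pass = symmetric_best_hits(hits)
```

Here `src_region_2` is left with no first-pass alignment because its best target is already claimed by a better-scoring source region; regions like this are the ones a second pass would try to recover.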

Page 6:

[Figure: NCBI36 and GRCh37 alignment diagrams; length labels: 100 bp, 400 bp]

Remap failure: low coverage (<50%)

Remap failure: expansion (target length/source length >2)
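The two failure criteria above can be written as a small check. This is a sketch under stated assumptions, not Remap's code: the function name and parameters are hypothetical, and "coverage" is assumed to mean the fraction of the source feature's bases covered by the alignment. Only the two thresholds (50% and 2) come from the slide.

```python
def classify_remap(source_len, aligned_len, target_len,
                   min_coverage=0.5, max_expansion=2.0):
    """Return 'ok' or the reason a remapped feature would be rejected.

    source_len:  length of the feature on the source assembly (bp)
    aligned_len: source bases covered by the alignment (assumed definition)
    target_len:  length of the remapped feature on the target assembly (bp)
    """
    if source_len <= 0:
        raise ValueError("source_len must be positive")

    coverage = aligned_len / source_len
    if coverage < min_coverage:
        return "low coverage"

    expansion = target_len / source_len
    if expansion > max_expansion:
        return "expansion"

    return "ok"


# Illustrative numbers only.
print(classify_remap(100, 40, 40))    # under half of the feature aligned
print(classify_remap(100, 100, 400))  # target span is 4x the source length
print(classify_remap(100, 95, 100))   # passes both checks
```

Both comparisons are strict, matching the "<50%" and ">2" on the slide, so a feature with exactly 50% coverage or an expansion ratio of exactly 2 passes.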

Page 7:

Helps rescue features that cross a gap (common for CNVs/Structural Variants)
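The slide does not say how fragments are combined, so the following is only one plausible reading: when a feature crosses a gap it remaps as several pieces, and merging reports a single span from the first piece to the last. The `Fragment` record and the rule that pieces must share a sequence and strand are assumptions made for this sketch.

```python
from collections import namedtuple

# Hypothetical record for one remapped piece of a feature.
Fragment = namedtuple("Fragment", ["seq_id", "start", "end", "strand"])


def merge_fragments(fragments):
    """Merge remapped pieces into one span, or return None if they disagree.

    Assumption: pieces are only merged when they all land on the same
    target sequence and strand.
    """
    if not fragments:
        return None

    seq_ids = {f.seq_id for f in fragments}
    strands = {f.strand for f in fragments}
    if len(seq_ids) != 1 or len(strands) != 1:
        return None

    return Fragment(
        seq_id=seq_ids.pop(),
        start=min(f.start for f in fragments),
        end=max(f.end for f in fragments),
        strand=strands.pop(),
    )


# Illustrative data: a CNV split in two by a gap in the alignment.
pieces = [
    Fragment("chr1", 1000, 1500, "+"),
    Fragment("chr1", 2500, 3000, "+"),
]
merged = merge_fragments(pieces)
```

A merged span includes sequence that was never aligned, so a merged location is worth reviewing before it is trusted.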

Page 8:

Beware: Second Pass alignments and Merge

Page 9:

Remap Output

Summary data: Quick overview of how well your features mapped

Mapping report: Detailed report containing all of your input features and their source location, target location (or reason for failure) and coverage score.

Annotation File: An annotation file of only the features that successfully remapped. Suitable for loading to most browsers.

Genome Workbench file: A file formatted for loading to Genome Workbench (a client-side browser). Includes assembly-assembly alignments for review.

Genome Workbench videos

Page 10:

* LRG soon

When mapping to RefSeqGene:

Genomic location (NG_XXXXXX.X)

Transcript location(s) (NM_XXXXXX.X)

Protein location(s) (NP_XXXXXX.X)

Optional, but checked by default

No second pass alignments, only one ‘best’ alignment

Page 11:

No second pass alignments, only one ‘best’ alignment

Maps features:

From Primary Assembly -> Alternate Loci/Patches (common)

From Alternate Loci/Patches -> Primary Assembly

Page 12:

Take home messages

Tools are available for mapping features from one coordinate system to another.

Assembly <-> Assembly

Assembly <-> RefSeqGene

Primary Assembly <-> Alternate Loci/Patches

Feature remapping is NOT a substitute for de novo annotation.