copy number variation


Upload: milla

Post on 09-Jan-2016


DESCRIPTION

Copy Number Variation. Eleanor Feingold, University of Pittsburgh, March 2012. GCTC ATATATAT TTG. kb - Mb (gene or gene region). What do we mean by “copy number variation”? “Normal”, deletion, duplication of one gene, duplication of several genes, duplication of part of a gene.

TRANSCRIPT

  • Copy Number Variation

    Eleanor Feingold

    University of Pittsburgh

    March 2012

  • What do we mean by copy number variation?

  • Copy number variation in a gene or gene region

  • Classical copy number study types

    Cancer genetics

    What: Find chromosomal segments (usually large ones) that are duplicated and/or deleted in tumor cell lines.

    Why: Learn something about cancer biology, or implications for treatment and prognosis.

    Clinical pediatrics

    What: Detect inherited or de novo deletions in individuals.

    Why: Diagnose birth defects.

  • And now: Genetic association studies for CNVs

    Table columns: 0, 1, 2+

    cases: 65133202

    controls: 1681316
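A case/control table of copy-number classes like the one on this slide is commonly tested for association with a Pearson chi-square test of independence. The sketch below is one minimal way to do that in plain Python. The counts in it are hypothetical and are not the slide's data, the function names are made up for illustration, and the closed-form p-value used here is valid only for 2 degrees of freedom (a 2 x 3 table).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_two_dof(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical counts (NOT the slide's data).
# Rows: cases, controls. Columns: copy number class 0, 1, 2+.
counts = [[10, 40, 150], [5, 20, 175]]
stat, dof = chi_square_independence(counts)
p_value = p_value_two_dof(stat)
```

With small expected counts in any cell, an exact or permutation test would be a safer choice than the chi-square approximation.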

  • How do we assay copy number variation?

  • Generation 1 - Array CGH

    What

    Microarray of clones (e.g. BACs).

    Usually on a glass slide.

    Competitive hybridization of test and reference samples.

    Measure fluorescence ratio clone by clone.

    Limitations

    Large clones.

    Sparse coverage.

    High noise due to spotting process.

  • Generation 2 - SNP chips

    What

    High-throughput SNP genotyping platforms (e.g. Affymetrix, Illumina).

    Advantage

    Hundreds of thousands of points of info.

    Disadvantages

    Technology was never intended for measuring copy number.

    SNPs on chip selected to avoid CNV regions by design.

  • Generation 3 - SNP chips with CNV markers (Affy 6.0, Illumina 1M)

    Advantages

    SNPs in known CNV regions are now included.

    Also have non-polymorphic SNPs (SNs?).

    Affymetrix

    200K probes in 5K known large CNV regions.

    700K probes evenly spaced along the genome.

    Illumina

    1M markers in 10K regions of various types and sizes.

  • Generation 4 - (Illumina 2.5M, 5M)

    Changes

    Got rid of the non-polymorphic markers.

    Special coverage of CNV regions???

    Are these better or worse for CNVs than the previous generation?

  • What data do these technologies give us, and how do we use it?

  • Standard genotyping

    (Figure: two-allele intensity plot with genotype clusters BB, AB, AA.)

    Genotype information is in the angle (relative intensity of the two alleles).

    Copy number information is in the distance from the origin (total intensity).
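The angle and total-intensity idea can be made concrete with a short sketch. It assumes each SNP has two non-negative allele intensities (the names `a_intensity` and `b_intensity` are hypothetical), scales the angle to the range 0 to 1, and takes the sum A + B as the total intensity. The Euclidean distance from the origin would be an equally reasonable choice; the slide does not say which is meant.

```python
import math


def polar_summary(a_intensity, b_intensity):
    """Return (theta, total) for one SNP.

    theta is the angle of the point (A, B), scaled so that a pure-A signal
    gives 0 and a pure-B signal gives 1. total is A + B (an assumption here).
    """
    theta = (2.0 / math.pi) * math.atan2(b_intensity, a_intensity)
    total = a_intensity + b_intensity
    return theta, total


# A balanced heterozygote-like signal sits near the middle of the angle range.
theta_ab, total_ab = polar_summary(1.0, 1.0)
```

Under this view, a sample with an extra copy would be expected to show a larger `total` at the same SNP, while the genotype call depends mostly on `theta`.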

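  • In theory

    (Figure: expected genotype clusters. Three copies: AAA, AAB, ABB, BBB. Two copies: AA, AB, BB. One copy: A, B. Zero copies: null.)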

  • But when you look at the data

    (Figure: trisomic (Down syndrome) vs. disomic samples. Cluster labels: AAA and AA, AAB, AB, ABB, BBB and BB.)

  • All SNPs on chromosome 21

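  • In theory

    (Figure: expected genotype clusters. Three copies: AAA, AAB, ABB, BBB. Two copies: AA, AB, BB. One copy: A, B. Zero copies: null.)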

  • In practice

    (Figure: cluster plot from real data.)

  • So how are copy numbers called?

    Look for runs of SNPs that are high or low in intensity.

    Many available algorithms, e.g. HMM, CBS, change-point.
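To illustrate what "looking for runs" means, here is a deliberately simple threshold-based sketch. It is not an HMM, CBS, or change-point method; it only flags stretches of consecutive SNPs whose intensity values all fall beyond a cutoff. The input is assumed to be a list of normalized intensity values (for example log ratios against a reference) ordered by genomic position, and the cutoffs and minimum run length are hypothetical defaults.

```python
def call_runs(values, low=-0.3, high=0.3, min_length=3):
    """Return (start_index, end_index, label) for runs of consecutive SNPs
    whose values are all <= low ("loss") or all >= high ("gain")."""

    def classify(value):
        if value <= low:
            return "loss"
        if value >= high:
            return "gain"
        return None

    runs = []
    start = 0
    n = len(values)
    while start < n:
        label = classify(values[start])
        end = start
        # Extend the run while the next SNP falls in the same class.
        while end + 1 < n and classify(values[end + 1]) == label:
            end += 1
        if label is not None and end - start + 1 >= min_length:
            runs.append((start, end, label))
        start = end + 1
    return runs


# Hypothetical values ordered along a chromosome.
example = [0.0, 0.1, -0.5, -0.6, -0.4, 0.0, 0.5, 0.6, 0.0, 0.4, 0.5, 0.7, 0.9]
example_runs = call_runs(example)
```

A single noisy SNP breaks a run in this sketch, which is one reason model-based methods are used in practice.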

  • Basic picture

  • Komura et al., Genome Research, 2006

  • More complex examples (cancer genetics)

    Peiffer et al., Genome Research, 2006

  • AAABBB

  • Extra copy of whole chromosome

  • No copy number change, but a region of homozygosity (LOH)

  • Basic picture

    Wang et al., Genome Research, 2007

  • Chromosome 9

  • A few statistical issues to think about

    (there's still a lot to do)

  • Many run-calling algorithms are oriented towards clinical applications.

    Many CNV detection algorithms are very conservative - aim for zero false positive rate.

    Most use normalization methods that assume a large reference population is not available.

    Many use models that make assumptions about what kinds of variation are likely (e.g. cancer).

  • Family data should be modeled together.

    CNV calls will be much more accurate if you use the whole family, but the model you use should depend on whether you are expecting de novo mutations or not.

    For some diseases you'll expect associations with de novo changes. For others you might expect inherited variants.
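As a small illustration of the inherited versus de novo distinction, the sketch below splits a child's CNV calls by whether either parent carries a call in the same region. This is only an after-the-fact comparison of calls made separately per person, not the joint family model the slide recommends. The region identifiers and the function name are hypothetical, and a missed call in a parent would wrongly appear as de novo.

```python
def split_child_calls(child_calls, mother_calls, father_calls):
    """Split a child's CNV region IDs into (inherited, de_novo_candidates)."""
    parental = set(mother_calls) | set(father_calls)
    child = set(child_calls)
    inherited = sorted(region for region in child if region in parental)
    de_novo_candidates = sorted(region for region in child if region not in parental)
    return inherited, de_novo_candidates


# Hypothetical region IDs for one trio.
inherited, de_novo_candidates = split_child_calls(
    child_calls=["region_1", "region_2", "region_3"],
    mother_calls=["region_1"],
    father_calls=["region_3", "region_4"],
)
```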

  • How do we group CNVs for association testing?

  • Separate methods for deletions?

    Deletions are easier to detect than other changes.

    Deletions are likely to have simpler biological effects.

  • The most important one

    The technology is still NOT intended for reliably and comparably measuring total intensity!

    Total intensity numbers are very sensitive to DNA source, sample handling, etc., so extreme measures must be taken to ensure that cases and controls are comparable.
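One crude way to check comparability is to summarize each sample by its mean total intensity and compare the two groups. The sketch below computes a Welch t statistic for that comparison. The per-sample values are hypothetical, and a large absolute value only signals a systematic intensity difference between groups; it cannot say whether the cause is DNA source, handling, or biology.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch t statistic for the difference in means of two samples."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical per-sample mean total intensities.
case_intensity = [1.10, 1.05, 1.20, 1.15]
control_intensity = [0.95, 1.00, 0.90, 1.05]
t_stat = welch_t(case_intensity, control_intensity)
```

Each group needs at least two samples with some spread, otherwise the variance is undefined or the standard error is zero.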

    Note: the copy-neutral region of homozygosity (LOH) shown earlier can be due to mitotic crossing-over followed by differential reproduction of the two daughter cells.