Selecting short DNA fragments in plasma improves detection of circulating tumour DNA

Florent Mouliere 1,2,*, Anna M. Piskorz 1,2,*, Dineika Chandrananda 1,2,*, Elizabeth Moore 1,2,3,*, James Morris 1,2, Christopher G. Smith 1,2, Teodora Goranova 1,2, Katrin Heider 1,2, Richard Mair 1,2, Anna Supernat 1,2,4, Ioannis Gounaris 1,2,3, Susana Ros 1,2, Jonathan C. M. Wan 1,2, Mercedes Jimenez-Linan 2,3, Davina Gale 1,2, Kevin Brindle 1,2,5, Charles E. Massie 1,2, Christine A. Parkinson 1,2,3,6,7, James D. Brenton 1,2,3,6,7,#, Nitzan Rosenfeld 1,2,#

* co-first authors; # corresponding authors: [email protected]; [email protected]

1. Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
2. Cancer Research UK Major Centre – Cambridge, Cancer Research UK Cambridge Institute, Cambridge, UK
3. Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
4. Medical University of Gdansk, Gdansk, Poland
5. Department of Biochemistry, University of Cambridge, Cambridge, UK
6. Department of Oncology, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
7. NIHR Cambridge Biomedical Research Centre, Cambridge, UK

Introductory paragraph:

Non-invasive analysis of cancer genomes using cell-free circulating tumour DNA (ctDNA) is being widely implemented for clinical indications. The sensitivity for detecting the presence of ctDNA and genomic changes in ctDNA is limited by its low concentration compared to cell-free DNA of non-tumour origin. We studied the feasibility for enrichment of ctDNA by size selection, in plasma samples collected before and during chemotherapy treatment in 13 patients with recurrent high-grade serous ovarian cancer. We evaluated the effects using targeted and whole genome sequencing. Selecting DNA fragments between 90-150 bp before analysis yielded enrichment of mutated DNA fraction of up to 11-fold. This allowed identification of adverse copy number alterations, including MYC amplification, otherwise not observed. Size selection allows detection of tumour alterations masked by non-tumour DNA in plasma and could help overcome sensitivity limitations of liquid biopsy for applications in early diagnosis, detection of minimal residual disease, and genomic profiling.

bioRxiv preprint doi: https://doi.org/10.1101/134437; this version posted May 5, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.



  • Text:

    Analysis of circulating tumour DNA (ctDNA) by non-invasive sampling of cell-free DNA from plasma is now becoming an important tool in oncology for molecular stratification, monitoring tumour burden and analysis of genomic evolution during treatment [1–3]. Analysis of ctDNA is technically challenging, as ctDNA is often present in low concentrations in plasma and is mixed with cell-free DNA (cfDNA) of non-cancerous origin, which is generally present at much higher concentrations. In patients with advanced cancers, the median concentration of ctDNA can reach 10% or more of the total cfDNA, but this fraction is much lower in earlier stage cancer, and ctDNA may rapidly decrease following initiation of systemic treatment or surgery [1,2,4,5]. Various strategies have been proposed to improve the sensitivity of ctDNA analysis [2]. Such methods generally focus on a small subset of the genome, such as hot-spot PCR-based assays or ultra-deep sequencing across gene panels [2,6–9]. Recent observations that ctDNA fragments may be shorter than non-tumour cfDNA in plasma have led to suggestions that these differences may be exploited to enrich for the tumour-specific signal in plasma DNA [10–14]. In-silico analysis of ctDNA size differences has been used to enhance the signal for chromosomal changes [13]. However, physical size selection to filter out non-tumour DNA prior to large-scale genomic sequencing has not been demonstrated. Therefore, we tested the hypothesis that selecting DNA fragments of specific sizes could improve the sensitivity of detecting genomic alterations in cfDNA from plasma of cancer patients, enabling the detection of point mutations and copy number alterations that were previously undetectable [11].

    Previous reports using paired-end sequencing reads revealed that cell-free DNA is mostly distributed around a mode at 167 bp, a length that could correspond to the chromatosome (core histones + linker) [15,16]. This size distribution pattern is characteristic of caspase-dependent cleavage; it was therefore hypothesized that apoptosis releases a large fraction of cfDNA into the bloodstream [14,15,17]. Previous studies in non-invasive prenatal testing have explored the potential of size selection for enriching the fetal DNA fraction in maternal plasma with both physical and in-silico methods [14,18–20]. However, this analysis of the size distribution cannot be easily generalised to tumour-derived fragments, as the characterisation of ctDNA-specific patterns requires analysing fragment sizes of DNA with tumour-derived alterations [13]. Plasma samples from xenografted animal models have shown ctDNA to be highly fragmented below 167 bp [10], and this distribution, with a mode at 145 bp, has since been confirmed with PCR, atomic force microscopy and, recently, whole-genome sequencing [12,21]. If the distribution differs between tumour-derived and non-tumour-derived fragments, sieving fragments by their size could reduce the (often overwhelming) fraction of non-tumour cfDNA and improve the signal-to-noise ratio in downstream analysis.

    We first analysed paired-end reads from shallow whole genome sequencing (sWGS) of plasma DNA from animal models xenografted with human ovarian cancer cells, and confirmed that the distribution of ctDNA differed in this model from non-tumour cfDNA (Fig. 1a,b). ctDNA in this model was enriched in the size range between 90 and 150 bp, while non-tumour cfDNA was dominant at sizes greater than 150 bp, and peaked at ~166 bp, similar to previous observations in animal models and in human samples [12,15,21,22]. Based on these observations we used an automated electrophoresis agarose gel selection method to isolate DNA fragments between 90 and 150 bp for downstream analysis. We sequenced size-selected DNA using both sWGS and tagged-amplicon deep sequencing (TAm-Seq [23]), in 26 samples collected from 13 patients with high-grade serous ovarian cancer (HGSOC), and compared the results to those of the same samples without size selection. For each patient, two plasma samples were collected: one pre-treatment, when levels of ctDNA were generally high, and another several weeks after the start of treatment, when levels of ctDNA were often much lower due to treatment [24]. Analysis of the distribution of fragment lengths after in-vitro size selection and sWGS indicated that 96% of resulting reads were in the selected range (Fig. 1c). For two of the patients, the distribution of fragment sizes obtained by sWGS without size selection exhibited a degraded pattern of cfDNA, without a prominent peak at 166 bp or the 10-bp increment peaks, which have been observed previously [12,13,15] and are present in all the other cases in this study (Suppl. Fig. 1).
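    The proportion of reads falling inside the selected size window, such as the 96% quoted above, can be illustrated with a minimal sketch. It assumes fragment lengths have already been inferred from paired-end alignments; the function name, the inclusive treatment of both boundaries and the toy data are illustrative assumptions, not the authors' code.

```python
from typing import Iterable


def fraction_in_range(fragment_lengths: Iterable[int], low: int = 90, high: int = 150) -> float:
    """Return the fraction of fragments whose length lies in [low, high] bp.

    Both boundaries are treated as inclusive, which is an assumption; the
    source only states a selection window "between 90 and 150 bp".
    """
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    selected = sum(1 for n in lengths if low <= n <= high)
    return selected / len(lengths)


# Hypothetical toy data: fragment lengths (bp), not taken from the study.
toy_lengths = [95, 120, 145, 150, 166, 167, 170, 310]
print(fraction_in_range(toy_lengths))  # 4 of the 8 toy fragments fall in 90-150 bp
```

    The same function applied to lengths from an unselected library would give the baseline proportion of short fragments, which is the quantity an in-silico size selection would retain.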

    Analysis of somatic copy number aberrations (SCNAs) was carried out with sWGS on all plasma samples before and after size selection of DNA fragments between 90 and 150 bp (Fig. 2a). One case is exemplified in Fig. 2b and 2c (see further data in Supplementary Fig. 2). Without size selection, a small number of SCNAs were detected with sWGS in DNA isolated from plasma collected 3 weeks after initiation of treatment from patient OV04-83 (Fig. 2b). These included focal amplifications in chromosomes 8p, 14p, 17, and 19q that were observed in this sample at very low levels (Suppl. Table 2). Analysis of the same DNA sample following size selection for short DNA fragments revealed an increase in the level of these detected SCNAs, in addition to multiple other SCNAs that were not observed without size selection (Fig. 2c). The same pattern of SCNAs and focal amplifications was observed in DNA from plasma collected from the same patient before initiation of the treatment, when the fraction of tumour DNA in the plasma was higher (Suppl. Fig. 2).

    The relative copy number signals in plasma DNA, across a list of 29 genes frequently mutated in HGSOC, were compared with and without size selection of DNA from plasma samples collected during treatment across the cohort of 13 patients. This showed that a large number of SCNAs that were not observed without size selection could be detected after size selection for shorter DNA fragments, notably as amplifications in key genes such as NF1, PARP2 and MYC (Fig. 2d and Suppl. Fig. 3). More SCNAs could be detected after size selection in 11/13 patients, and the absolute level of the log2 ratio was significantly increased after size selection (t-test, p = 7.72 × 10^-9). The 2 patients for whom the SCNA signal did not increase exhibited a degraded pattern of cfDNA, which could explain why the size selection had not enriched for ctDNA (Fig. 2d and Suppl. Fig. 2).

    We next assessed the detection of SCNAs and point mutations in plasma samples of the 13 patients collected at baseline (before treatment initiation) and 3 weeks after initiation of chemotherapy treatments, with and without size selection of the plasma DNA (Fig. 3a and Suppl. Table 1). The data from the baseline samples, where ctDNA levels are generally higher [24], were used to identify and confirm genomic changes, which were then studied in the samples after initiation of treatment, with generally lower levels of ctDNA. The amplitudes of the absolute log2 ratio for the SCNAs were higher, and the concordance of the alterations detected between the baseline and post-treatment samples was improved, with size selection of the plasma DNA (Fig. 3b and Suppl. Fig. 3).

    Using sWGS data, we converted the amplitude of the CNAs into a quantitative metric called t-MAD (trimmed Mean Absolute Deviation from copy-number neutrality, see Methods). Size selection of plasma DNA resulted in a median 1.5-fold (n = 26) increase in the t-MAD score (Fig. 3c and 3d). However, in the samples collected after the initiation of treatment, when ctDNA content was low, the genome-wide enrichment was higher (Fig. 3d), with a median increase of the t-MAD score of 2.9-fold (range: 0.6-4.5 fold), except for two samples. For those two samples we observed a decrease in the t-MAD score, and an analysis of the size profile before selection for these samples revealed that the DNA had been heavily fragmented, which could explain why the size selection did not result in enrichment for these cases (Suppl. Fig. 1). In this dataset, we did not identify a differential effect of size selection on the recovery of specific alterations, suggesting that there is a global genome enrichment post size selection. Additionally, size selection notably led to an increased detection of deletions (Suppl. Fig. 2). Analysis with greater sequencing depth, integration of samples from other cancer types and different stages of the disease would help to extend our observations and expand further our understanding of ctDNA biology and fragmentation.
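    The fold changes in t-MAD score discussed above are ratios of the score with size selection to the score without it, summarised by a median across samples. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical and the paired scores are toy values, not data from the study. A fold value below 1 corresponds to a decrease after size selection.

```python
from statistics import median
from typing import Iterable, Tuple


def fold_enrichment(score_without: float, score_with: float) -> float:
    """Ratio of the score with size selection to the score without it."""
    if score_without <= 0:
        raise ValueError("score without size selection must be positive")
    return score_with / score_without


def median_fold_enrichment(pairs: Iterable[Tuple[float, float]]) -> float:
    """Median fold change over (without, with) score pairs, one pair per sample."""
    folds = [fold_enrichment(without, with_sel) for without, with_sel in pairs]
    return median(folds)


# Hypothetical (without, with) t-MAD pairs for three samples.
toy_pairs = [(0.125, 0.5), (0.25, 0.5), (0.5, 0.25)]
print(median_fold_enrichment(toy_pairs))  # folds are 4.0, 2.0 and 0.5, so the median is 2.0
```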

    In order to confirm that enrichment for tumour DNA could be observed irrespective of the sequencing approach, we further analysed the mutant allele fractions in the samples using Tagged-Amplicon Sequencing (TAm-Seq). We detected a relative enrichment in the ctDNA fraction in all samples exhibiting a typical pattern of cfDNA fragmentation, with a mode of fragment distribution at 166 bp, before size selection (Suppl. Fig. 2). The enrichment effect was below 2-fold in samples collected pre-treatment, when the ctDNA fractions were high in plasma (20%-50% allele fractions for TP53 mutations as detected by TAm-Seq) (Fig. 3e and 3f). Enrichment of the tumour fraction by size selection was much greater, between 5-fold and 11-fold for most samples, in samples collected approximately 3 weeks after initiation of treatment, when levels of ctDNA (without size selection) were low (ranging from

  • library preparation for sequencing may enable more effective recovery of short DNA fragments from plasma samples, which have led to new observations on cfDNA size distributions [28,30,32]. Such methods should be further investigated to determine if these could help recover more ctDNA.

    The size selection process we demonstrated here is based on inherent characteristics of ctDNA in comparison to cfDNA and does not require alteration of these fragments. The enrichment we observed is therefore compatible with any downstream genomic analysis, from locus-specific to wider genomic sequencing. This work shows that sWGS (and by extension, whole exome sequencing) can be performed on plasma DNA samples with low ctDNA content, and that this can facilitate the characterisation of mutations present in plasma at lower allele fractions and with lower sequencing depth. The compatibility of the cfDNA fragment size selection with wide-scale and sensitive genomic analysis could unlock the potential of liquid biopsies for the diagnosis of cancer at an earlier stage, and for the detection of minimal residual disease.

    Methods:

    Patients and sample preparation. 13 patients were recruited as part of prospective clinical studies at Addenbrooke’s Hospital, Cambridge, UK, approved by the local research ethics committee (REC reference numbers 07/Q0106/63, 08/H0306/61 and 07/Q0106/63). Written informed consent was obtained from all patients and blood samples were collected before and after initiation of treatment with chemotherapeutic agents. DNA was extracted from 2 mL of plasma using the QIAamp circulating nucleic acid kit (Qiagen) according to the manufacturer’s instructions.

    Size selection. Between 8 and 10 ng of DNA were loaded into a 3% agarose cassette (HTC3010, Sage Bioscience) and size selection was performed on a PippinHT (Sage Bioscience) according to the manufacturer’s protocol.

    TAm-Seq. Tagged-Amplicon Sequencing libraries were prepared as previously described [23], using primers designed to assess single nucleotide variants (SNV) and small indels across selected hotspots and the entire coding regions of TP53. Libraries were sequenced using a HiSeq 4000 (Illumina).

    sWGS. Indexed sequencing libraries were prepared using a commercially available kit (ThruPLEX-PlasmaSeq, Rubicon Genomics). Libraries were pooled in equimolar amounts and sequenced on a HiSeq 4000 (Illumina) generating 150-bp paired-end reads. Sequence data were analysed using an in-house pipeline that consists of the following steps. Paired-end sequence reads were aligned to the human reference genome (GRCh37) using BWA-mem following the removal of contaminating adapter sequences [33]. PCR and optical duplicates were marked using the MarkDuplicates (Picard Tools) feature and these were excluded from downstream analysis along with reads of low mapping quality and supplementary alignments.
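    The read exclusion step described above can be sketched in terms of the standard SAM FLAG bits, where 0x400 marks a PCR or optical duplicate and 0x800 marks a supplementary alignment. This is an illustration only and not the in-house pipeline: the function name is hypothetical, and the mapping quality cut-off of 20 is an assumed value, since the text does not state the threshold used.

```python
# SAM FLAG bits defined by the SAM format specification.
FLAG_DUPLICATE = 0x400      # PCR or optical duplicate
FLAG_SUPPLEMENTARY = 0x800  # supplementary alignment


def keep_read(flag: int, mapq: int, min_mapq: int = 20) -> bool:
    """Return True if an aligned read passes the three exclusion criteria.

    min_mapq=20 is an assumption made for this sketch; the source only
    says that reads of low mapping quality were excluded.
    """
    if flag & FLAG_DUPLICATE:
        return False
    if flag & FLAG_SUPPLEMENTARY:
        return False
    return mapq >= min_mapq


# FLAG 99 = paired, properly paired, mate on reverse strand, first in pair.
print(keep_read(flag=99, mapq=60))                   # kept
print(keep_read(flag=99 | FLAG_DUPLICATE, mapq=60))  # excluded as a marked duplicate
```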

    SCNA analysis. Copy number analysis was performed in R using a modification of the QDNAseq pipeline [34], as follows: sequence reads were allocated into equally sized (50 kbp) non-overlapping bins throughout the length of the genome. Read counts in each bin were corrected to account for sequence GC content and mappability, and bins overlapping ‘blacklisted’ regions (ENCODE project) were excluded from downstream analysis. After median normalisation of the counts, bins were segmented using both Circular Binary Segmentation and Locus-aware Circular Binary Segmentation algorithms, and an averaged log2R value per bin was calculated. The t-MAD score is calculated as the averaged absolute deviation from log2R = 0 after first trimming bin counts greater than 4 standard deviations from the mean count across all genomic regions.
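    The t-MAD definition given above can be written as a short sketch. It is written in Python for illustration, whereas the authors' analysis was performed in R, and it rests on two assumptions that the text leaves open: trimming is two-sided (bins whose count lies more than 4 standard deviations above or below the mean are dropped), and the population standard deviation is used. The function name and the toy data are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def t_mad(bin_counts: Sequence[float], log2_ratios: Sequence[float], n_sd: float = 4.0) -> float:
    """Mean absolute deviation of log2R from 0 after trimming outlier bins.

    Bins are trimmed when their read count deviates from the mean count by
    more than n_sd standard deviations (two-sided; an assumption).
    """
    if len(bin_counts) != len(log2_ratios):
        raise ValueError("bin_counts and log2_ratios must have the same length")
    mu = mean(bin_counts)
    sd = pstdev(bin_counts)
    kept = [abs(r) for c, r in zip(bin_counts, log2_ratios) if abs(c - mu) <= n_sd * sd]
    if not kept:
        raise ValueError("all bins were trimmed")
    return mean(kept)


# Toy data: 29 ordinary bins with log2R of +/-0.5 and one extreme-count bin.
toy_counts = [100] * 29 + [10000]
toy_log2r = [0.5 if i % 2 == 0 else -0.5 for i in range(29)] + [5.0]
print(t_mad(toy_counts, toy_log2r))  # the extreme bin is trimmed, leaving a mean |log2R| of 0.5
```

    A score near 0 indicates a profile close to copy-number neutrality, while larger values reflect stronger deviations across the genome.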

    References:

    1. Siravegna, G., Marsoni, S., Siena, S. & Bardelli, A. Integrating liquid biopsies into the management of cancer. Nat. Rev. Clin. Oncol. (2017). doi:10.1038/nrclinonc.2017.14

    2. Wan, J. C. M. et al. Liquid biopsies come of age: towards implementation of circulating tumour DNA. Nat. Rev. Cancer (2017). doi:10.1038/nrc.2017.7

    3. Murtaza, M. et al. Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497, 108–112 (2013).

    4. Bettegowda, C. et al. Detection of Circulating Tumor DNA in Early- and Late-Stage Human Malignancies. Sci. Transl. Med. 6, 224ra24-224ra24 (2014).

    5. Dawson, S.-J. et al. Analysis of Circulating Tumor DNA to Monitor Metastatic Breast Cancer. N. Engl. J. Med. 368, 1199–1209 (2013).

    6. Diehl, F. et al. Detection and quantification of mutations in the plasma of patients with colorectal tumors. Proc. Natl. Acad. Sci. 102, 16368–16373 (2005).

    7. Taly, V. et al. Multiplex Picodroplet Digital PCR to Detect KRAS Mutations in Circulating DNA from the Plasma of Colorectal Cancer Patients. Clin. Chem. 59, 1722–1731 (2013).

    8. Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat. Biotechnol. 34, 547–555 (2016).

    9. Khodakov, D., Wang, C. & Zhang, D. Y. Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches. Adv. Drug Deliv. Rev. 105, 3–19 (2016).

    10. Mouliere, F. et al. High Fragmentation Characterizes Tumour-Derived Circulating DNA. PLoS One 6, e23418 (2011).

    11. Mouliere, F. & Rosenfeld, N. Circulating tumor-derived DNA is shorter than somatic DNA in plasma. Proc. Natl. Acad. Sci. 112, 3178–3179 (2015).

    12. Underhill, H. R. et al. Fragment Length of Circulating Tumor DNA. PLOS Genet. 12, e1006162 (2016).

    13. Jiang, P. et al. Lengthening and shortening of plasma DNA in hepatocellular carcinoma patients. Proc. Natl. Acad. Sci. 112, E1317–E1325 (2015).

    14. Jiang, P. & Lo, Y. M. D. The Long and Short of Circulating Cell-Free DNA and the Ins and Outs of Molecular Diagnostics. Trends Genet. 32, 360–371 (2016).

    15. Lo, Y. M. D. et al. Maternal Plasma DNA Sequencing Reveals the Genome-Wide Genetic and Mutational Profile of the Fetus. Sci. Transl. Med. 2, 61ra91-61ra91 (2010).

    16. Chandrananda, D. et al. High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA. BMC Med. Genomics 8, 29 (2015).

    17. Jahr, S. et al. DNA fragments in the blood plasma of cancer patients: quantitations and evidence for their origin from apoptotic and necrotic cells. Cancer Res. 61, 1659–65 (2001).

    18. Yu, S. C. Y. et al. Size-based molecular diagnostics using plasma DNA for noninvasive prenatal testing. Proc. Natl. Acad. Sci. U. S. A. 111, 8583–8 (2014).

    19. Lun, F. M. F. et al. Noninvasive prenatal diagnosis of monogenic diseases by digital size selection and relative mutation dosage on DNA in maternal plasma. Proc. Natl. Acad. Sci. U. S. A. 105, 19920–5 (2008).

    20. Minarik, G. et al. Utilization of Benchtop Next Generation Sequencing Platforms Ion Torrent PGM and MiSeq in Noninvasive Prenatal Testing for Chromosome 21 Trisomy and Testing of Impact of In Silico and Physical Size Selection on Its Analytical Performance. PLoS One 10, e0144811 (2015).

    21. Mouliere, F., El Messaoudi, S., Pang, D., Dritschilo, A. & Thierry, A. R. Multi-marker analysis of circulating cell-free DNA toward personalized medicine for colorectal cancer. Mol. Oncol. 8, 927–941 (2014).

    22. Thierry, A. R. et al. Origin and quantification of circulating DNA in mice with human colorectal cancer xenografts. Nucleic Acids Res. 38, 6159–6175 (2010).

    23. Forshew, T. et al. Noninvasive Identification and Monitoring of Cancer Mutations by Targeted Deep Sequencing of Plasma DNA. Sci. Transl. Med. 4, 136ra68-136ra68 (2012).

    24. Parkinson, C. A. et al. Exploratory Analysis of TP53 Mutations in Circulating Tumour DNA as Biomarkers of Treatment Response for Patients with Relapsed High-Grade Serous Ovarian Carcinoma: A Retrospective Study. PLOS Med. 13, e1002198 (2016).

    25. Murtaza, M. et al. Multifocal clonal evolution characterized using circulating tumour DNA in a case of metastatic breast cancer. Nat. Commun. 6, 8760 (2015).

    26. Heitzer, E. et al. Tumor-associated copy number changes in the circulation of patients with prostate cancer identified through whole-genome sequencing. Genome Med. 5, 30 (2013).

    27. Belic, J. et al. Rapid Identification of Plasma DNA Samples with Increased ctDNA Levels by a Modified FAST-SeqS Approach. Clin. Chem. 61, 838–849 (2015).

    28. Snyder, M. W. et al. Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin. Cell 164, 57–68 (2016).

    29. Ulz, P. et al. Inferring expressed genes by whole-genome sequencing of plasma DNA. Nat. Genet. 48, 1273–1278 (2016).

    30. Burnham, P. et al. Single-stranded DNA library preparation uncovers the origin and diversity of ultrashort cell-free DNA in plasma. Sci. Rep. 6, 27859 (2016).

    31. Shema, E. et al. Single-molecule decoding of combinatorially modified nucleosomes. Science 352, (2016).

    32. Vong, J. S. L. et al. Single-Stranded DNA Library Preparation Preferentially Enriches Short Maternal DNA in Maternal Plasma. Clin. Chem. (2017). doi:10.1373/clinchem.2016.268656

    33. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

    34. Scheinin, I. et al. DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly. Genome Res. 24, 2022–2032 (2014).

  • Notes:

    Acknowledgements: The authors would like to thank all members of the Rosenfeld Lab and Brenton Lab for help and constructive discussion, in particular Francesco Marass, Wendy N. Cooper, Keval Patel, Jenny P. Y. Chan, Mareike Thompson, Lise Barlebo Ahlborn and Irena Hudecovà. The authors would like to also thank the Cancer Research UK Cambridge Institute core facilities for their support, in particular the genomics and biorepository facilities. We would like to acknowledge our patients and our caregivers, and the help and support of the research nurses, trial staff and the staff at Addenbrooke’s Hospital. In particular, we would like to acknowledge Charlotte Hodgkin, Heather Biggs and Karen Hosking. We would like also to acknowledge the support of The University of Cambridge, Cancer Research UK (grant number A11906, A20240) and Hutchison Whampoa Limited. The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement n. 337905. This research is also supported by Target Ovarian Cancer and the Medical Research Council through their Joint Clinical Research Training Fellowship for Dr Moore. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Author contributions: FM, AMP, DC, EM, JDB and NR conceptualised and designed the study; FM, AMP, EM, KH, CGS, JCMW, DG, RM, TG, AS, IG, CAP have performed experiments and collected data; DC has conceptualised and designed the t-MAD index and performed sWGS bioinformatics analysis; JM performed TAm-Seq bioinformatics analysis; RM and SR have designed the animal model; MJL performed histopathology revision; FM, AMP, DC, EM, JDB and NR have written the manuscript; all co-authors have critically reviewed the manuscript; FM, AMP, DC, JDB and NR supervised the study.

    Author information: NR, JDB and DG are cofounders, shareholders and officers/consultants of Inivata Ltd, a cancer genomics company that commercialises ctDNA analysis. Inivata Ltd had no role in the conceptualisation, study design, data collection and analysis, decision to publish or preparation of the manuscript. NR and FM are co-inventors of patent applications that describe methods for the analysis of DNA fragments and applications of circulating tumour DNA. Other co-authors have declared no conflict of interests.

  • Figure 1:

    Figure 1: Plasma DNA originating from tumour and non-tumour cells have different sizes, enabling specific enrichment for ctDNA. a. Using an animal model with xenografted cells enabled the discrimination of DNA fragments released by the cancer cells (corresponding to the human DNA) from the DNA fragments released by the wild-type cells (corresponding to the rat DNA). b. Size distribution, assessed by sWGS, of DNA fragments from a plasma sample of a rat xenografted with a human glioblastoma tumour. c. Size distribution of DNA fragments from 26 plasma samples included in this study, assessed by sWGS. In green are the DNA fragments of the samples without size selection, and in orange after size selection. The two dotted vertical lines indicate the size selection range between 90 bp and 150 bp.

    [Figure 1 image. Panel a: xenograft schematic; sequencing and alignment, with human genome = tumour DNA and rat genome = wild-type DNA. Panels b and c: proportion of reads (0.00 to 0.05) against fragment size (100 to 300 bp), with the selected fragment lengths marked as the size range enriched for tumour DNA. Legend in b: genome (human, mouse). Legend in c: status (no size selection, with size selection).]

  • Figure 2:

    Figure 2: Recovery of short cfDNA fragments enriches for the representation of the cancer genome. a. Samples collected from 13 patients with HGSOC were analysed either with or without filtering by size selection. b. SCNA analysis based on the log2 ratios of regions along the genome of DNA extracted from a plasma sample collected during treatment for patient OV04-83. c. The same analysis of the same sample with size selection of fragments between 90 bp and 150 bp. Inferred amplifications are shown in blue and deletions in orange. d. SCNA analysis of the segmental log2 ratio across a list of 29 genes frequently mutated in recurrent ovarian cancer, measured in plasma samples collected during treatment for all 13 patients, without size selection (left) and with size selection (right). The two samples which exhibited a degraded pattern of cfDNA fragmentation were OV04-292 and OV04-300 (both labelled by “#”). e. A comparison of the absolute level of log2 ratio across the 29 genes of interest indicated a significant difference between the same samples without and with size selection (p = 7.72 × 10^-9). The 2 samples with the degraded pattern of cfDNA fragmentation have been excluded from this analysis.

    [Figure 2 image. Panel a: workflow (sequencing, alignment and genomic analysis; no size selection, with size selection). Panels b and c: log2 ratio (-2 to 3) across chromosomes 1 to 22 for OV04-83, with no size selection (b) and with size selection (c). Panel d: heatmaps of log2 ratio (scale -0.5 to 1), with no size selection and with size selection, for patients OV04-77, OV04-83, OV04-122, OV04-141, OV04-143, OV04-180, OV04-211, OV04-226, OV04-264, OV04-292, OV04-295, OV04-297 and OV04-300 across the genes MSH2, MSH6, APLF, PAX8, BARD1, FANCD2, MLH1, MECOM, TERT, ID4, MYC, APTX, PTEN, CHEK1, BRCA2, RB1, PARP2, FANCM, RAD51B, PALB2, TP53, NF1, RAD51D, CDK12, BRCA1, RAD51C, CCNE1, ZMYND8 and CHEK2; two samples are marked "#". Panel e: absolute log2 ratio (log scale, 0.01 to 1.00), no size selection versus with size selection, marked "***".]

  • Figure 3:

    Figure 3: Analysis of the enrichment after size selection in 26 samples sequenced by sWGS and Tagged Amplicon Sequencing (TAm-Seq) revealed relative enrichment in tumour content. a. For each of 13 patients, we compared cfDNA from plasma samples collected before initiation of chemotherapy and 3 weeks or more after initiation of chemotherapy. Each of the 26 plasma samples was analysed with and without size selection. b. Comparison of the absolute value of the segmented log2 ratio of the SCNAs called for the plasma samples of patient OV04-83 collected before and after initiation of the treatment. Data from the samples without size selection are shown in green, and with size selection in orange. c. The t-MAD score determined from the sWGS with size selection (vertical) was higher than without size selection (horizontal) for most samples, including the samples collected at baseline (red circles) and after initiation of treatment (blue triangles). The 2 samples with no observed enrichment are OV04-292 and OV04-300. d. The enrichment factor with size selection, determined by t-MAD, varied per sample but was lower for samples collected at baseline (red circles), which had high initial t-MAD score, compared to samples collected after treatment (blue triangles). e. The mutant allele fraction (MAF) determined by targeted sequencing with size selection (vertical) was higher than without size selection (horizontal) for most samples, including samples collected at baseline (red circles) and after initiation of treatment (blue triangles). The dotted area highlights samples with low MAF (5% and therefore accessible for wide-scale analysis. f. Comparison of the MAF detected by TAm-Seq before treatment and after initiation of treatment, as assessed by targeted sequencing, with size selection (yellow triangles) and without size selection (green circles).

    [Figure 3 image. Panel b: absolute log2 ratio post treatment against absolute log2 ratio at baseline (0.0 to 2.5), by selection (no size selection, with size selection). Panel c: t-MAD with size selection against t-MAD no size selection (log scale, 0.01 to 0.10), by status (baseline, post-treatment). Panel d: t-MAD fold enrichment with size selection (1 to 4) against t-MAD no size selection (0.0 to 0.3), by status. Panel e: MAF with size selection against MAF no size selection (log scale, 0.01 to 1.00), by status. Panel f: MAF (log scale, 0.01 to 1.00) by timepoint (baseline, post-treatment) and selection.]
