RESEARCH ARTICLE SUMMARY

VIRAL IMMUNOLOGY

Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman, Stephen J. Elledge*

INTRODUCTION: The collection of viruses found to infect humans can have profound effects on human health. In addition to directly causing acute or chronic illness, viral infection can alter host immunity in more subtle ways, leaving an indelible footprint on the immune system. This interplay between virome and host immunity has been implicated in the pathogenesis of complex diseases such as type 1 diabetes, inflammatory bowel disease, and asthma. Despite the growing appreciation for the importance of interactions between the virome and host, a comprehensive method to systematically characterize these interactions has yet to be developed.

RATIONALE: Current serological methods to detect viral infections are predominantly limited to testing one pathogen at a time and are therefore used primarily to address specific clinical hypotheses. A method that could simultaneously detect responses to all human viruses would allow hypothesis-free analysis to detect associations between past viral infections and particular diseases or population structures. Humoral responses to infection typically arise within 10 to 14 days of initial exposure and can persist over years or decades, thus providing a rich source of the history of pathogen encounters. In this work, we present VirScan, a high-throughput method that allows comprehensive analysis of antiviral antibodies in human sera. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. The analysis requires less than 1 µl of blood.

RESULTS: We screened sera from 569 human donors across four continents, assaying a total of over 10^8 antibody-peptide interactions for reactivity to 206 human viral species and >1000 strains. We found that VirScan's performance in detecting known infections and distinguishing between exposures to related viruses is comparable to that of classical serum antibody tests for single viruses. We detected antibodies to an average of 10 viral species per person and 84 species in at least two individuals. Our approach maps antibody targets at 56-amino acid resolution, and our results nearly double the number of previously established viral B cell epitopes. Although rates of specific virus exposure varied depending on age, HIV status, and geographic location of the donor, we observed strong similarities in antibody responses across individuals. In particular, we found multiple instances of single peptides that were recurrently recognized by antibodies in the vast majority of donors. We performed tiling mutagenesis and found that these antibody responses targeted substantially conserved "public epitopes" for each virus, suggesting that antibodies with highly similar specificities, and possibly structures, are elicited across individuals.

CONCLUSION: VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and the population scale. VirScan may prove to be an important tool for uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include new viruses as they are discovered, as well as other human pathogens, such as bacteria, fungi, and protozoa.

Figure: Systematic viral epitope scanning (VirScan). This method allows comprehensive analysis of antiviral antibodies in human sera. VirScan combines DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. The color of each cell in the heatmap depicts the relative number of antigenic epitopes detected for a virus (rows) in each sample (columns).

The list of author affiliations is available in the full article online.
*Corresponding author. E-mail: [email protected]
Cite this paper as G. J. Xu et al., Science 348, aaa0698 (2015). DOI: 10.1126/science.aaa0698
Read the full article at http://dx.doi.org/10.1126/science.aaa0698

SCIENCE, sciencemag.org, 5 JUNE 2015, VOL 348, ISSUE 6239, p. 1105



RESEARCH ARTICLE

VIRAL IMMUNOLOGY

Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu,1,2,3,4 Tomasz Kula,3,4,5 Qikai Xu,3,4 Mamie Z. Li,3,4 Suzanne D. Vernon,6 Thumbi Ndung'u,7,8,9,10 Kiat Ruxrungtham,11 Jorge Sanchez,12 Christian Brander,13 Raymond T. Chung,14 Kevin C. O'Connor,15 Bruce Walker,8,9 H. Benjamin Larman,16 Stephen J. Elledge3,4,6†

The human virome plays important roles in health and immunity. However, current methods for detecting viral infections and antiviral responses have limited throughput and coverage. Here, we present VirScan, a high-throughput method to comprehensively analyze antiviral antibodies using immunoprecipitation and massively parallel DNA sequencing of a bacteriophage library displaying proteome-wide peptides from all human viruses. We assayed over 10^8 antibody-peptide interactions in 569 humans across four continents, nearly doubling the number of previously established viral epitopes. We detected antibodies to an average of 10 viral species per person and 84 species in at least two individuals. Although rates of specific virus exposure were heterogeneous across populations, antibody responses targeted strongly conserved "public epitopes" for each virus, suggesting that they may elicit highly similar antibodies. VirScan is a powerful approach for studying interactions between the virome and the immune system.

The collection of viruses found to infect humans (the "human virome") can have profound effects on human health (1). In addition to directly causing acute or chronic illness, viral infection can also alter host immunity in more subtle ways, leaving an indelible footprint on the immune system (2). For example, latent herpesvirus infection has been shown to confer symbiotic protection against bacterial infection in mice through prolonged production of interferon-γ and systemic activation of macrophages (3). This interplay between virome and host immunity has also been implicated in the pathogenesis of complex diseases such as type 1 diabetes, inflammatory bowel disease, and asthma (4). Despite this growing appreciation for the importance of interactions between the virome and host, a comprehensive method to systematically characterize these interactions has yet to be developed (5).

Viral infections can be detected by serological or nucleic acid-based methods (6). However, nucleic acid tests fail in cases where viruses have already been cleared after causing or initiating tissue damage and can miss viruses of low abundance or viruses not normally present in the sampled fluid or surface. In contrast, humoral responses to infection typically arise within 2 weeks of initial exposure and can persist over years or decades (7). Tests detecting antiviral antibodies in peripheral blood can therefore identify ongoing and cleared infections. However, current serological methods are predominantly limited to testing one virus at a time and are therefore only used to address specific clinical hypotheses. Scaling serological analyses to encompass the complete human virome poses substantial technical challenges but would be of great value for better understanding host-virus interactions and would overcome many of the limitations associated with current clinical technologies. In this work, we present VirScan, a programmable, high-throughput method to comprehensively analyze antiviral antibodies using immunoprecipitation and massively parallel DNA sequencing of a bacteriophage library displaying proteome-wide coverage of peptides from all human viruses.

Results

The VirScan platform

VirScan uses the phage immunoprecipitation sequencing (PhIP-seq) technology previously developed in our laboratory (8). Briefly, we used a programmable DNA microarray to synthesize 93,904 200-mer oligonucleotides, encoding 56-residue peptide tiles with 28-residue overlaps, that together span the reference protein sequences (collapsed to 90% identity) of all viruses annotated to have human tropism in the UniProt database (Fig. 1A, a and b) (9). This library includes peptides from 206 species of virus and over 1000 different strains. We cloned the library into a T7 bacteriophage display vector for screening (Fig. 1A, c).

To perform a screen, we incubate the library with a serum sample containing antibodies, recover the antibodies by using a mixture of protein A- and G-coated magnetic beads, and remove unbound phage particles by washing (Fig. 1A, d and e). Last, we perform polymerase chain reaction (PCR) and massively parallel sequencing on the phage DNA to quantify enrichment of each library member resulting from antibody binding (Fig. 1A, f). Each sample is screened in duplicate to ensure reproducibility. VirScan requires only 2 µg of immunoglobulin (<1 µl of serum) per sample and can be automated on a 96-well liquid handling robot (10). PCR product from 96 immunoprecipitations can be individually barcoded and pooled for sequencing, reducing the cost for a comprehensive viral antibody screen to about $25 per sample.

After sequencing, we tally the read count for each peptide before ("input") and after ("output") immunoprecipitation. We then fit a zero-inflated generalized Poisson model to the distribution of output read counts for each input read count and regress the parameters as a function of input read count (fig. S1). With use of this model, we calculate a -log10(P value) for the significance of each peptide's enrichment. Last, we call a peptide significantly enriched if its -log10(P value) is greater than the reproducibility threshold of 2.3 in both replicates (fig. S2).
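Two of the computational steps described in this section, tiling a protein into 56-residue peptides with 28-residue overlaps and calling a peptide enriched only when it passes the threshold in both replicates, can be illustrated with a minimal Python sketch. The function names, the handling of the final tile (anchored to the C-terminus so the whole sequence is covered), and the treatment of proteins shorter than one tile are assumptions made for illustration; the text does not specify them. The -log10(P value) scores are taken as given inputs, and the zero-inflated generalized Poisson fit itself is not reproduced here. The reproducibility threshold is taken as 2.3.

```python
from typing import Dict, List, Set


def tile_protein(seq: str, tile_len: int = 56, overlap: int = 28) -> List[str]:
    """Split a protein into overlapping tiles (hypothetical helper).

    Assumption: a final tile is anchored to the C-terminus when the regular
    stride would leave residues uncovered, and a protein no longer than one
    tile is returned whole.
    """
    step = tile_len - overlap  # 28-residue stride between tile starts
    if len(seq) <= tile_len:
        return [seq]
    starts = list(range(0, len(seq) - tile_len + 1, step))
    last_start = len(seq) - tile_len
    if starts[-1] != last_start:
        starts.append(last_start)
    return [seq[s:s + tile_len] for s in starts]


def call_enriched(
    rep1: Dict[str, float],
    rep2: Dict[str, float],
    threshold: float = 2.3,
) -> Set[str]:
    """Return peptides whose -log10(P value) exceeds the threshold in BOTH replicates."""
    return {
        pep
        for pep, score in rep1.items()
        if score > threshold and rep2.get(pep, float("-inf")) > threshold
    }


if __name__ == "__main__":
    tiles = tile_protein("M" * 100)
    print(len(tiles), [len(t) for t in tiles])
    print(call_enriched({"p1": 5.0, "p2": 3.0}, {"p1": 2.5, "p2": 2.0}))
```

Each 56-residue tile corresponds to 168 coding nucleotides (56 x 3), which fits within the 200-mer oligonucleotides described above. Requiring both replicates to pass means a peptide with a strong score in only one replicate is not called.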

Characterizing VirScan's sensitivity and specificity

Figure 1B shows the antibody profiles of a set of human viruses in sera from a typical group of individuals in a heat map format that illustrates the number of enriched peptides from each virus. We frequently detected antibodies to multiple peptides from common human viruses, such as Epstein-Barr virus (EBV), cytomegalovirus (CMV), and rhinovirus. As expected, we observed more peptides to be enriched from viruses with larger proteomes, such as EBV and CMV, likely because there are more epitopes available for recognition. We noticed fewer enriched peptides in samples from individuals less than 10 years of age compared with their geographically matched controls, in line with an accumulation of viral infections throughout adolescence and adulthood. However, there were occasional samples from young donors with very strong responses to viruses that cause childhood illness, such as parvovirus B19 and herpesvirus 6B, which cause the "fifth disease" and "sixth disease" of the classical infectious childhood rashes, respectively (11). These observations are examined in greater detail in Fig. 2.

We developed a computational method to identify the set of viruses to which an individual has been exposed, based on the number of enriched peptides identified per virus. Briefly, we set a threshold number of significant non-overlapping enriched peptides for each virus. We empirically determined that a threshold of three non-overlapping enriched peptides gave the best performance for detecting herpes simplex virus 1 (HSV1) compared with a commercial serologic test described below (Table 1). For other viruses, we adjusted the threshold to account for the size of the viral proteome (fig. S3). Next, we tally the number of enriched peptides from each virus. Antibodies generated against a specific virus can cross-react with similar peptides from a related virus. This would lead to false positives, because an antibody targeted to an epitope from one virus to which a donor was exposed would also enrich a homologous peptide from a related virus to which the donor may not have been exposed. To address this issue, we adopted a maximum parsimony approach to infer the smallest number of virus exposures that could elicit the observed spectrum of antiviral peptide antibodies. Groups of enriched peptides that share a seven-amino acid subsequence may be recognized by a single specific antibody, so we only count them as one epitope for the virus that has the greatest number of other enriched peptides. If this adjusted peptide count is greater than the threshold for that virus, the sample is considered positive for the virus. For this analysis, we also filtered out peptides that were enriched in only 1 of the 569 samples to avoid spurious hits.

With this analytical framework, we measured the performance of VirScan by using serum samples from individuals known to be infected or not infected with human immunodeficiency virus (HIV) and hepatitis C virus (HCV), based on commercial enzyme-linked immunosorbent assay (ELISA) and Western blot assays. For both viruses, VirScan achieves very high sensitivities and specificities of ~95% or higher (Table 1) over a wide range of viral loads (Fig. 1C). The viral genotype was also known for the HCV-positive samples. Despite the over 70% amino acid sequence conservation among HCV genotypes (12), which poses a problem for all antibody-based detection methods, VirScan correctly reported the HCV genotype in 69% of the samples. We also compared VirScan to a commercially available serology test that is type-specific for the highly related HSV1 and HSV2 (Table 1). These results demonstrate that VirScan performs well in distinguishing between closely related viruses and viruses that range in size from small (HIV and HCV) to very large (HSV1 and HSV2), with high sensitivity and specificity.

Fig. 1. General VirScan analysis of the human virome. (A) Construction of the virome peptide library and VirScan screening procedure. (a) The virome peptide library consists of 93,904 56-amino acid peptides tiling, with 28-amino acid overlap, across the proteomes of all known human viruses. (b) The 200-nt DNA sequences encoding the peptides were printed on a releasable DNA microarray. (c) The released DNA was amplified and cloned into a T7 phage display vector and packaged into virus particles displaying the encoded peptide on its surface. (d) The library is mixed with a sample containing antibodies that bind to their cognate peptide antigen on the phage surface. (e) The antibodies are immobilized, and unbound phage are washed away. (f) Last, amplification of the bound DNA and high-throughput sequencing of the insert DNA from bound phage reveals peptides targeted by sample antibodies. Ab, antibody; IP, immunoprecipitation. (B) Antibody profile of randomly chosen group of donors to show typical assay results. Each row is a virus; each column is a sample. The label above each chart indicates whether the donors are over 10 years of age or at most 10 years of age. The color intensity of each cell indicates the number of peptides from the virus that were significantly enriched by antibodies in the sample. (C) Scatter plot of the number of unique enriched peptides (after applying maximum parsimony filtering) detected in each sample against the viral load in that sample. Data are shown for the HCV-positive and HIV-positive samples for which we were able to obtain viral load data. For the HIV-positive samples, red dots indicate samples from donors currently on highly active antiretroviral therapy (HAART) at the time the sample was taken, whereas blue dots indicate different donors before undergoing therapy. IU, international units. (D) Overlap between enriched peptides detected by VirScan and human B cell epitopes from viruses in IEDB. The entire pink circle represents the 1392 groups of nonredundant IEDB epitopes that are also present in the VirScan library (out of 1559 clusters total). The overlap region represents the number of groups with an epitope that is also contained in an enriched peptide detected by VirScan. The purple-only region represents the number of nonredundant enriched peptides detected by VirScan that do not contain an IEDB epitope. Data are shown for peptides enriched in at least one (left) or at least two (right) samples. (E) Overlap between enriched peptides detected by VirScan and human B cell epitopes in IEDB from common human viruses. The regions represent the same values as in (D), except only epitopes corresponding to the indicated virus are considered, and only peptides from that virus that were enriched in at least two samples were considered. (F) Distribution of number of viruses detected in each sample. The histogram depicts the frequency of samples binned by the number of virus species detected by VirScan. The mean and median of the distribution are both about 10 virus species.

Author affiliations:
1 Program in Biophysics, Harvard University, Cambridge, MA 02115, USA.
2 Harvard-Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology, Cambridge, MA 02139, USA.
3 Division of Genetics, Department of Medicine, Howard Hughes Medical Institute, Brigham and Women's Hospital, Boston, MA 02115, USA.
4 Department of Genetics, Harvard University Medical School, Boston, MA 02115, USA.
5 Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02115, USA.
6 Solve ME/CFS Initiative, Los Angeles, CA 90036, USA.
7 KwaZulu-Natal Research Institute for Tuberculosis and HIV, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa.
8 HIV Pathogenesis Programme, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, Durban, South Africa.
9 Ragon Institute of Massachusetts General Hospital, MIT, and Harvard University, Cambridge, MA 02139, USA.
10 Max Planck Institute for Infection Biology, Chariteplatz, D-10117 Berlin, Germany.
11 Vaccine and Cellular Immunology Laboratory, Department of Medicine, Faculty of Medicine, and Chula-Vaccine Research Center, Chulalongkorn University, Bangkok, Thailand.
12 Asociación Civil IMPACTA Salud y Educación, Lima, Peru.
13 AIDS Research Institute-IrsiCaixa and AIDS Unit, Hospital Germans Trias i Pujol, Universitat Autònoma de Barcelona, Badalona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
14 Division of Gastroenterology, Massachusetts General Hospital, Boston, MA 02114, USA.
15 Department of Neurology, Yale School of Medicine, New Haven, CT 06520, USA.
16 Division of Immunology, Department of Pathology, Johns Hopkins University, Baltimore, MD 21205, USA.
These authors contributed equally to this work. †Corresponding author. E-mail: selledge@genetics.med.harvard.edu
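The peptide-counting rule described in this section (peptides sharing a seven-amino acid subsequence count as one epitope, credited to the virus with the most enriched peptides, followed by a per-virus threshold) can be sketched as below. This is a simplified greedy illustration, not the authors' implementation: the function names, the order in which viruses are processed, the tie-breaking by virus name, and the default threshold of 3 applied to every virus are assumptions. The strict comparison follows the wording "greater than the threshold"; whether the boundary is inclusive is not stated in the text.

```python
from typing import Dict, List, Set

K = 7  # shared-subsequence length treated as a single epitope (from the text)


def kmers(peptide: str, k: int = K) -> Set[str]:
    """All contiguous k-residue subsequences of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def adjusted_counts(enriched: Dict[str, List[str]], k: int = K) -> Dict[str, int]:
    """Count enriched peptides per virus, collapsing peptides that share a k-mer.

    Assumption: viruses are visited from most to fewest enriched peptides, so a
    shared epitope is credited to the virus with the larger number of hits.
    """
    order = sorted(enriched, key=lambda v: (-len(enriched[v]), v))
    seen: Set[str] = set()  # k-mers already explained by a counted epitope
    counts: Dict[str, int] = {}
    for virus in order:
        n = 0
        for pep in enriched[virus]:
            ks = kmers(pep, k)
            if ks.isdisjoint(seen):
                n += 1  # no shared k-mer with anything seen so far: new epitope
            seen |= ks
        counts[virus] = n
    return counts


def positive_viruses(
    enriched: Dict[str, List[str]],
    thresholds: Dict[str, int],
    default_threshold: int = 3,
) -> Set[str]:
    """Viruses whose adjusted peptide count is greater than their threshold."""
    counts = adjusted_counts(enriched)
    return {v for v, n in counts.items() if n > thresholds.get(v, default_threshold)}


if __name__ == "__main__":
    hits = {
        "HSV1": ["AAAAAAAQ", "CCCCCCCD", "EEEEEEEF", "GGGGGGGH"],
        "HSV2": ["AAAAAAAK", "MMMMMMMN"],  # first peptide shares AAAAAAA with HSV1
    }
    print(adjusted_counts(hits))
    print(positive_viruses(hits, thresholds={}))
```

In the toy input, the HSV2 peptide that shares a seven-residue stretch with an HSV1 peptide is not counted toward HSV2, which is the cross-reactivity case the parsimony rule is meant to suppress. Because adjacent library tiles overlap by 28 residues, the same rule also collapses overlapping tiles from one virus into a single count.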

Population-level analysis of viral exposures

After ascertaining the performance of VirScan for a panel of viruses, we undertook a large-scale screening of samples with unknown exposure history. By using our multiplex approach, we assayed over 106 million antibody-peptide interactions with samples from 569 human donors in duplicate. We detected antibody responses to an average of 10 species of virus per sample (Fig. 1F). Each person is likely exposed to multiple distinct strains of some viral species. We detected antibody responses to 62 of the 206 species of virus in our library in at least five individuals and 84 species in at least two individuals. The most frequently detected viruses are generally those known to commonly infect humans (Table 2 and table S1). We occasionally detected what appear to be false positives that may be due to antibodies that cross-react with nonviral peptides. For example, 29% of the samples positive for cowpox virus were right at the threshold of detection and had antibodies against a peptide from the C4L gene that shares an eight-amino acid sequence (SESDSDSD; D, Asp; E, Glu; S, Ser) with the clumping factor B protein from Staphylococcus aureus, against which humans are known to generate antibodies (13). This will become less of an issue when we test more examples of sera from individuals with known infections to determine the set of likely antigenic peptides for a given virus. However, the fact that we do not detect high rates of very rare viruses strengthens our confidence in VirScan's specificity (see supplementary discussion).

We frequently detected antibodies to rhinovirus and respiratory syncytial virus, which are normally found only in the respiratory tract, indicating that VirScan using blood samples is still able to detect viruses that do not cause viremia. We also detected antibodies to influenza, which is normally cleared, and poliovirus, to which most people in modern times generate antibodies through vaccination. Because the original antigen is no longer present, we are likely detecting antibodies secreted by long-lived memory B cells (14).

We detected antibodies to certain viruses less frequently than expected based on previous seroprevalence studies that used optimized serum ELISAs. For example, the frequency at which we detect influenza (53.4%) and poliovirus (33.7%) is lower than expected, given that the majority of the population has been exposed to or vaccinated against these viruses. This may be due to reduced sensitivity because of a gradual narrowing and decrease of the long-lived B cell response in the absence of persistent antigen. We also rarely detected antibody responses to small viruses, such as JC virus (JCV) and torque teno virus, which are frequently detected by using specific tests. We believe that the disparity is due to low titers of antibodies to unmodified linear epitopes from these viruses. For example, serum antibodies against the major capsid protein of JCV are reported to only recognize conformational epitopes (15). Last, the frequency of detecting varicella zoster virus (chicken pox) antibodies is also lower than expected (24.3%), even though the frequency of detecting other latent herpesviruses, such as EBV (87.1%) and CMV (48.5%), is similar to the prevalence reported in epidemiological studies (16-18). This may reflect differences in how frequently these viruses shed antigens that stimulate B cell responses or a more limited humoral response that relies on epitopes that cannot be detected in a 56-residue peptide. It might also be possible to increase the sensitivity of detection of these viral antibodies by stimulating memory B cells in vitro to probe the history of infection more deeply.

To assess differences in viral exposure between populations, we split the samples into different groups based on age, HIV status, and geography. We first compared results from children under the age of 10 to adults within the United States (HIV-positive individuals were excluded from this analysis) (Fig. 2A). Fewer children were positive for most viruses, including EBV, HSV1, HSV2, and influenza virus, which is consistent with our preliminary observations comparing the number of enriched peptides (Fig. 1B). In addition to the fact that children may generate lower antibody titers in general, these younger donors probably have not yet been exposed to certain viruses, for example, HSV2, which is sexually transmitted (19).

When comparing results from HIV-positive to HIV-negative samples, we found more of the HIV-positive samples to also be seropositive for additional viruses, including HSV2, CMV, and Kaposi's sarcoma-associated herpesvirus (KSHV) (false discovery rate q < 0.05, Fig. 2B). These results are consistent with prior studies indicating higher risk of these co-infections in HIV-positive patients (20-22). Patients with HIV may engage in activities that put them at higher risk for exposure to these viruses. Alternatively, these viruses may increase the risk of HIV infection. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response.

Last, we compared evidence of viral exposure among samples taken from adult HIV-negative donors residing in countries (United States, Peru, Thailand, and South Africa) from four different continents. In general, donors outside the United States had higher frequencies of seropositivity (Fig. 2, C to E). For example, CMV antibodies were found in significantly higher frequencies in samples from Peru, Thailand, and South Africa. Other viruses, such as KSHV and HSV1, were detected more frequently in donors from Peru and South Africa but not Thailand. The observed detection frequency of different adenovirus species varies across populations. Adenovirus C seropositivity was found at similar frequencies in all regions, but adenovirus D seropositivity was generally higher outside the United States, whereas adenovirus B seropositivity was higher in Peru and South Africa but not in Thailand. The higher rates of virus exposure outside the United States could be due to differences in population density, cultural practices, sanitation, or genetic susceptibility. Additionally, influenza B seropositivity was more common in the United States compared with other countries, especially Peru and Thailand. The global incidence of influenza B is much lower than influenza A, but the standard influenza vaccination contains both influenza A and B strains, so the elevated frequency of individuals with seroreactivity may be due to higher rates of influenza vaccination in the United States. Other viruses, such as rhinovirus and EBV, were detected at very similar frequencies in all the geographic regions.

Fig. 2. Population stratification of the human virome immune response. The bar graphs depict the differences in exposure to viruses between donors who are (A) less than 10 years of age versus over 10 years of age, (B) HIV-positive versus HIV-negative residing in the United States, (C) residing in Peru versus residing in the United States, (D) residing in South Africa versus residing in the United States, and (E) residing in Thailand versus residing in the United States. Asterisks indicate false discovery rate < 0.05.
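As a check on the scale of the screen reported at the start of this section, the figure of "over 106 million" antibody-peptide interactions is consistent with multiplying the library size by the number of donors and the two replicates. The article does not state how the count was derived, so the breakdown below is one plausible reading rather than the authors' stated calculation; the function name is illustrative.

```python
def interactions_assayed(n_peptides: int, n_donors: int, n_replicates: int) -> int:
    """Number of peptide-by-sample measurements (illustrative helper)."""
    return n_peptides * n_donors * n_replicates


N_PEPTIDES = 93_904   # library size given in the platform description
N_DONORS = 569        # donors screened
N_REPLICATES = 2      # each sample is screened in duplicate

total = interactions_assayed(N_PEPTIDES, N_DONORS, N_REPLICATES)
print(f"{total:,}")   # 106,862,752
```

Under this reading, the total also exceeds 10^8, matching the figure quoted in the abstract.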

Analysis of viral epitope determinants

After analyzing responses on the whole-virus level, we focused our attention on the specific peptides targeted by these antibodies. We detected antibodies to a total of 8425 peptides in at least two samples and 15,052 in at least one sample. Because of the presence of many related peptides in our library and the Immune Epitope Database (IEDB), for the following analysis we consider a peptide unique only if it does not contain a continuous seven-residue subsequence, the estimated size of a linear epitope, in common with another peptide. Analyzed as such, our VirScan database nearly doubles the 1559 unique human B cell epitopes from human viruses in the IEDB (23). The epitopes identified in our unbiased analysis demonstrate a significant overlap with those contained in the IEDB (P < 10^-30, Fisher's exact test, Fig. 1D). The amount of overlap is even greater for epitopes from viruses that commonly cause infection (Fig. 1E). We would likely have detected even more antigenic peptides in common with the IEDB if we had tested more samples from individuals infected with rare viruses. We next analyzed the amino acid composition of recurrently enriched peptides. Enriched peptides tend to have more proline and charged amino acids and fewer hydrophobic amino acids, which is consistent with a previous analysis of B cell epitopes in the IEDB (fig. S4) (24). This trend

aaa0698-4    5 JUNE 2015 • VOL 348 ISSUE 6239    sciencemag.org    SCIENCE

Table 2. Frequently detected viruses. The % column indicates the percentage of samples that were positive for the virus by VirScan. Known HIV- and HCV-positive samples were excluded when performing this analysis.

Virus species                          %
Human herpesvirus 4                    87.1
Rhinovirus B                           71.8
Human adenovirus C                     71.8
Rhinovirus A                           67.3
Human respiratory syncytial virus      65.7
Human herpesvirus 1                    54.4
Influenza A virus                      53.4
Human herpesvirus 6B                   52.8
Human herpesvirus 5                    48.5
Influenza B virus                      40.5
Poliovirus                             33.7
Human herpesvirus 3                    24.3
Human adenovirus F                     20.4
Human adenovirus B                     16.8
Human herpesvirus 2                    15.5
Enterovirus A                          15.2
Enterovirus B                          13.3

Table 1. VirScan's sensitivity and specificity on samples with known viral infections. Sensitivity is the percentage of samples positive for the virus as determined by VirScan out of all n known positives. Specificity is the percentage of samples negative for the virus by VirScan out of all n known negatives.

Virus    Sensitivity (n)    Specificity (n)
HCV      92% (26)           97% (34)
HIV1     95% (61)           100% (33)
HSV1     97% (38)           100% (6)
HSV2     90% (20)           100% (24)

Note on HCV and HIV sensitivity: We found that although the false negative samples did not meet our stringent cutoff for enriching multiple unique peptides, they had detectable antibodies to a recurrent epitope. By modifying the criterion to allow for samples that enrich multiple homologous peptides that share a recurrent epitope, as described in the text, the sensitivity of detecting HCV increases to 100% and the sensitivity for detecting HIV increases to 97%. This modified criterion does not significantly affect specificity (fig. S13).

Note on HCV specificity: The one false positive was from an individual whose HCV-negative status was self-reported but who had antibodies to as many HCV peptides as 23% of the true HCV-positive individuals and is likely to be HCV-positive now or in the past. It is possible that this individual was exposed to HCV but cleared the infection. If true, the observed specificity for HCV is 100%.
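As a worked check of the definitions in Table 1, the short Python sketch below recomputes each percentage from counts. The numbers of correct calls are not reported in the table; the values used here are back-calculated from the rounded percentages and the n values, so they are assumptions for illustration only.

```python
# Correct-call counts are back-calculated from Table 1 (assumed, not reported).
# virus: (VirScan-positive among known positives, known positives,
#         VirScan-negative among known negatives, known negatives)
TABLE1_COUNTS = {
    "HCV": (24, 26, 33, 34),
    "HIV1": (58, 61, 33, 33),
    "HSV1": (37, 38, 6, 6),
    "HSV2": (18, 20, 24, 24),
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def summarize(counts):
    """Return {virus: (sensitivity %, specificity %)} rounded to whole percents."""
    summary = {}
    for virus, (true_pos, n_pos, true_neg, n_neg) in counts.items():
        sensitivity = round(percent(true_pos, n_pos))
        specificity = round(percent(true_neg, n_neg))
        summary[virus] = (sensitivity, specificity)
    return summary


if __name__ == "__main__":
    for virus, (sens, spec) in summarize(TABLE1_COUNTS).items():
        print(f"{virus}: sensitivity {sens}%, specificity {spec}%")
```

Under the same assumption, detecting one additional HIV-positive sample (59 of 61) gives the 97% sensitivity quoted for the modified criterion.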

RESEARCH | RESEARCH ARTICLE

likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide, comprising positions 141 to 196, that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens, we observed interpopulation serological differences. For example, two overlapping peptides from positions 309 to 364 and 337 to 392 of the penton base protein from adenovirus C frequently elicited antibody responses (Fig. 3B). However, donors from the United States and South Africa had much stronger responses to peptide 309-364 (P < 10^-6, t test) relative to donors from Thailand and Peru. We observed that for the EBNA1 protein from EBV, donors from all four countries frequently had strong responses to peptide 393-448 and occasionally to peptide 589-644. However, donors from Thailand and Peru had much stronger responses to peptide 57-112 (P < 10^-6, t test) (Fig. 3C). These differences may reflect variation in the strains endemic in each region. In addition, polymorphism of major histocompatibility complex (MHC) class II alleles, immunoglobulin genes, and other modifiers that shape immune responses in each population likely play a role in defining the relative immunodominance of antigenic peptides.

To determine whether the humoral responses that target an immunodominant peptide are actually targeting precisely the same epitope, we constructed single-, double-, and triple-alanine scanning mutagenesis libraries for eight commonly recognized peptides. These were introduced into the same T7 bacteriophage display vector and subjected to the same immunoprecipitation and sequencing protocol using samples from the United States. Mutants that disrupt the epitope diminish antibody binding affinity and peptide enrichment. We found that for all eight peptides tested, there was a single, largely contiguous subsequence in which mutations disrupted binding for the majority of samples. As expected, the triple mutants abolished antibody binding to a greater extent, and the enrichment patterns were similar among single, double, and triple mutants of the same peptide (Fig. 4 and figs. S5 to S11).

Fig. 3. The human anti-virome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the -log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.

For four of the eight peptides, a 9- to 15-amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation and that multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
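The modified criterion can be sketched in a few lines of Python. This is a simplified illustration, not the published implementation: the function names and the peptide identifiers are hypothetical, and a "strong response" is approximated here as the diagnostic peptide simply being among a sample's enriched hits.

```python
from collections import Counter


def find_diagnostic_peptides(hits_by_known_positive, min_fraction=0.30):
    """Return peptides enriched in at least min_fraction of known positive samples.

    hits_by_known_positive maps a sample name to the set of peptides
    enriched in that sample (hypothetical input layout).
    """
    n_samples = len(hits_by_known_positive)
    if n_samples == 0:
        return set()
    counts = Counter(
        peptide
        for hits in hits_by_known_positive.values()
        for peptide in set(hits)
    )
    return {p for p, c in counts.items() if c / n_samples >= min_fraction}


def is_seropositive(specific_hits, threshold, diagnostic_peptides):
    """Call a sample positive under the standard or the modified criterion.

    Standard: the number of specific, non-overlapping hits meets the
    per-virus threshold. Modified: any hit is a diagnostic peptide.
    """
    if len(specific_hits) >= threshold:
        return True
    return not set(diagnostic_peptides).isdisjoint(specific_hits)


if __name__ == "__main__":
    known_positives = {
        "s1": {"pepA", "pepB"},
        "s2": {"pepA"},
        "s3": {"pepC"},
        "s4": {"pepA", "pepD"},
    }
    diagnostic = find_diagnostic_peptides(known_positives)
    print(diagnostic)
    print(is_seropositive({"pepA"}, threshold=2, diagnostic_peptides=diagnostic))
```

A sample with a single hit on a diagnostic peptide is called positive here even though it falls short of the two-hit minimum, which is the behavior the modified criterion is meant to allow.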

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus-specific antibodies from different individuals (36). It is possible that rather than being an exception for dengue-specific antibodies, this represents a general phenomenon. Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at -80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The clustering algorithm UniProt uses represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56-amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of the peptide sequence is unique to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
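The tiling and scanning mutagenesis designs described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the handling of a final partial tile (anchoring it to the C terminus) is our assumption because the text does not specify it, and codon optimization and restriction-site avoidance are omitted.

```python
ADAPTER_5 = "AGGAATTCCGCTGCGT"  # 5' adapter from the text (16 nt)
ADAPTER_3 = "CAGGGAAGAGCTCGAA"  # 3' adapter from the text (16 nt)


def tile_protein(sequence, length=56, overlap=28):
    """Split a protein into tiles of `length` residues sharing `overlap` residues."""
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    tiles = [sequence[i:i + length]
             for i in range(0, len(sequence) - length + 1, step)]
    # Assumption: if the tiles do not reach the C terminus, add one final
    # tile covering the last `length` residues.
    if (len(sequence) - length) % step:
        tiles.append(sequence[-length:])
    return tiles


def substitute(residue):
    """Alanine scanning rule from the text: non-Ala -> Ala, Ala -> Gly."""
    return "G" if residue == "A" else "A"


def scanning_mutants(peptide, width):
    """Return every mutant with `width` consecutive substituted positions."""
    mutants = []
    for start in range(len(peptide) - width + 1):
        block = "".join(substitute(r) for r in peptide[start:start + width])
        mutants.append(peptide[:start] + block + peptide[start + width:])
    return mutants


def oligo_length(peptide_length=56):
    """Coding nucleotides plus both adapters."""
    return 3 * peptide_length + len(ADAPTER_5) + len(ADAPTER_3)


if __name__ == "__main__":
    protein = ("ACDEFGHIKLMNPQRSTVWY" * 6)[:112]  # toy 112-aa sequence
    print(len(tile_protein(protein)))             # number of 56-aa tiles
    print(scanning_mutants("KAGL", 2))            # consecutive double mutants
    print(oligo_length())                         # total oligonucleotide length
```

For a 56-aa peptide this yields 56 single, 55 double, and 54 triple mutants, and the oligonucleotide length works out to 3 × 56 + 16 + 16 = 200 nt, matching the 200-nt design.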

The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 μg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 μl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 μl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 μl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50-base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).
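The two round-number estimates in this protocol (fold representation of the library and reads per sample) follow from simple division, as the short Python check below illustrates. The function names are hypothetical; the input values are the ones quoted in the text.

```python
def fold_representation(plaque_forming_units, library_clones):
    """Average number of phage particles per library clone in one reaction."""
    return plaque_forming_units / library_clones


def reads_per_sample(reads_per_lane, samples_per_lane=192):
    """Whole reads per sample when a lane is shared equally."""
    return reads_per_lane // samples_per_lane


if __name__ == "__main__":
    print(fold_representation(2 * 10**10, 10**5))  # phage per clone
    for lane_reads in (100_000_000, 200_000_000):
        print(lane_reads, reads_per_sample(lane_reads))
```

With 192 samples per lane, 100 million to 200 million reads correspond to roughly 0.5 million to 1 million reads per sample, in line with the range stated above.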

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

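$$
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\,\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}, & y = 0 \\[6pt]
(1 - \pi)\,\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}, & y > 0
\end{cases}
$$

where π is the zero-inflation probability and θ and λ are the parameters of the generalized Poisson component.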
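For illustration, the mass function above can be evaluated in Python as in the sketch below. The function names are hypothetical, and the P value is taken here to be the upper-tail probability P(Y ≥ observed count), which is our assumption; the text does not state which tail was used.

```python
import math


def zigp_pmf(y, pi, theta, lam):
    """Zero-inflated generalized Poisson probability of observing count y."""
    if y < 0:
        return 0.0
    # log of theta * (theta + y*lam)^(y-1) * exp(-theta - y*lam) / y!
    log_gp = (math.log(theta)
              + (y - 1) * math.log(theta + y * lam)
              - theta - y * lam
              - math.lgamma(y + 1))
    gp = math.exp(log_gp)
    return pi + (1 - pi) * gp if y == 0 else (1 - pi) * gp


def neg_log10_p(observed, pi, theta, lam, max_terms=10000):
    """-log10 of the upper-tail probability P(Y >= observed) (assumed tail)."""
    tail = 0.0
    previous = 0.0
    for k in range(observed, observed + max_terms):
        term = zigp_pmf(k, pi, theta, lam)
        tail += term
        # Stop once the terms are decreasing and negligible.
        if term < previous and term < 1e-18 * tail:
            break
        previous = term
    if tail <= 0.0:
        return math.inf  # probability underflowed to zero
    return -math.log10(tail)


if __name__ == "__main__":
    pi, theta, lam = 0.3, 2.0, 0.2  # arbitrary example parameters
    print(zigp_pmf(0, pi, theta, lam))
    print(neg_log10_p(20, pi, theta, lam))
```

At y = 0 the generalized Poisson term reduces to e^(-θ), so the probability of a zero count is π + (1 - π)e^(-θ); with λ = 0 the model reduces to an ordinary zero-inflated Poisson distribution.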

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the -log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the -log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the -log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold -log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the -log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and would be ignored for that virus. Otherwise, the hit is considered to be specific, and the score for that virus is incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then, we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
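The cross-reactivity filter used above when calling virus exposures can be sketched in Python as follows. This is an illustrative reading of the procedure, not the published code: the function names and toy sequences are hypothetical, and "any hit previously observed" is interpreted as every hit already visited for the sample, whether or not it was counted.

```python
def kmers(sequence, k=7):
    """Return the set of all length-k subsequences of a peptide."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Count specific hits per virus for one sample.

    hits_by_virus maps a virus name to the list of enriched peptide
    sequences from that virus (hypothetical input layout).
    """
    # Visit viruses in descending order of their total number of hits.
    order = sorted(hits_by_virus,
                   key=lambda virus: len(hits_by_virus[virus]),
                   reverse=True)
    seen = set()   # k-mers of every hit visited so far
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if peptide_kmers.isdisjoint(seen):
                score += 1  # shares no 7-aa subsequence: specific hit
            seen |= peptide_kmers
        scores[virus] = score
    return scores


if __name__ == "__main__":
    sample_hits = {
        "virusA": ["ACDEFGHIKLMN", "QRSTVWYACDEF", "MNPQRSTVWYAC"],
        "virusB": ["ACDEFGHXXXXX", "WWWWWWWYYYYY"],
    }
    print(score_viruses(sample_hits))
```

Each score would then be compared with the per-virus threshold. Because hits from the same virus are also checked against one another, overlapping tiles that carry the same epitope contribute only once.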

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (23). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
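One way to form such nonredundant groups is single-linkage clustering on shared 7-aa subsequences, sketched below in Python with a union-find structure. The function name and toy sequences are hypothetical, and treating the grouping as transitive (A groups with C if both overlap B) is our assumption; the text does not state how chains of overlapping epitopes were handled.

```python
from collections import defaultdict


def group_epitopes(epitopes, k=7):
    """Group sequences that are identical or share a length-k subsequence."""
    parent = list(range(len(epitopes)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_owner = {}  # key (k-mer or full sequence) -> first index seen
    for index, sequence in enumerate(epitopes):
        keys = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
        keys.add(sequence)  # lets identical short epitopes group together
        for key in keys:
            if key in first_owner:
                union(first_owner[key], index)
            else:
                first_owner[key] = index

    groups = defaultdict(list)
    for index, sequence in enumerate(epitopes):
        groups[find(index)].append(sequence)
    return list(groups.values())


if __name__ == "__main__":
    toy = ["ACDEFGHIK", "DEFGHIKLM", "EFGHIKLMN", "WWWWWWW", "ACD", "ACD"]
    for group in group_epitopes(toy):
        print(group)
```

The number of groups returned is the count of nonredundant epitopes under this definition.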

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged enrichment of the wild-type peptide with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
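These steps can be sketched in Python as follows. The function names and read counts are hypothetical, and the way the wild-type value is combined with the middle two quartiles (a plain mean over the pooled values) is our assumption; the text does not give the exact averaging rule.

```python
def enrichment(output_counts, input_counts):
    """Fractional abundance after immunoprecipitation divided by that before."""
    output_total = sum(output_counts.values())
    input_total = sum(input_counts.values())
    result = {}
    for peptide, input_reads in input_counts.items():
        if input_reads == 0:
            continue  # enrichment is undefined without input reads
        after = output_counts.get(peptide, 0) / output_total
        before = input_reads / input_total
        result[peptide] = after / before
    return result


def wild_type_estimate(wild_type_enrichment, single_mutant_enrichments):
    """Average the wild-type value with the middle two quartiles of single mutants."""
    ranked = sorted(single_mutant_enrichments)
    quarter = len(ranked) // 4
    middle = ranked[quarter:len(ranked) - quarter]
    values = [wild_type_enrichment] + middle
    return sum(values) / len(values)


def relative_enrichment(mutant_enrichment, wild_type_value):
    """Enrichment of a mutant peptide relative to the wild-type estimate."""
    return mutant_enrichment / wild_type_value


if __name__ == "__main__":
    input_reads = {"wt": 100, "m1": 100, "m2": 100, "m3": 100}
    output_reads = {"wt": 40, "m1": 40, "m2": 10, "m3": 10}
    enriched = enrichment(output_reads, input_reads)
    reference = wild_type_estimate(
        enriched["wt"], [enriched[m] for m in ("m1", "m2", "m3")])
    print({m: relative_enrichment(enriched[m], reference) for m in enriched})
```

Trimming the lowest and highest quartiles keeps the epitope-disrupting mutants (low enrichment) from pulling down the wild-type reference.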

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease? Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015, American Association for the Advancement of Science


RESEARCH ARTICLE

VIRAL IMMUNOLOGY

Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu,1,2,3,4 Tomasz Kula,3,4,5 Qikai Xu,3,4 Mamie Z. Li,3,4 Suzanne D. Vernon,6 Thumbi Ndung'u,7,8,9,10 Kiat Ruxrungtham,11 Jorge Sanchez,12 Christian Brander,13 Raymond T. Chung,14 Kevin C. O'Connor,15 Bruce Walker,8,9 H. Benjamin Larman,16 Stephen J. Elledge3,4,6†

The human virome plays important roles in health and immunity. However, current methods for detecting viral infections and antiviral responses have limited throughput and coverage. Here, we present VirScan, a high-throughput method to comprehensively analyze antiviral antibodies using immunoprecipitation and massively parallel DNA sequencing of a bacteriophage library displaying proteome-wide peptides from all human viruses. We assayed over 10^8 antibody-peptide interactions in 569 humans across four continents, nearly doubling the number of previously established viral epitopes. We detected antibodies to an average of 10 viral species per person and 84 species in at least two individuals. Although rates of specific virus exposure were heterogeneous across populations, antibody responses targeted strongly conserved “public epitopes” for each virus, suggesting that they may elicit highly similar antibodies. VirScan is a powerful approach for studying interactions between the virome and the immune system.

The collection of viruses found to infect humans (the “human virome”) can have profound effects on human health (1). In addition to directly causing acute or chronic illness, viral infection can also alter host immunity in more subtle ways, leaving an indelible footprint on the immune system (2). For example, latent herpesvirus infection has been shown to confer symbiotic protection against bacterial infection in mice through prolonged production of interferon-γ and systemic activation of macrophages (3). This interplay between virome and host immunity has also been implicated in the pathogenesis of complex diseases such as type 1 diabetes, inflammatory bowel disease, and asthma (4). Despite this growing appreciation for the importance of interactions between the virome and host, a comprehensive method to systematically characterize these interactions has yet to be developed (5).

Viral infections can be detected by serological or nucleic acid–based methods (6). However, nucleic acid tests fail in cases where viruses have already been cleared after causing or initiating tissue damage and can miss viruses of low abundance or viruses not normally present in the sampled fluid or surface. In contrast, humoral responses to infection typically arise within 2 weeks of initial exposure and can persist over years or decades (7). Tests detecting antiviral antibodies in peripheral blood can therefore identify ongoing and cleared infections. However, current serological methods are predominantly limited to testing one virus at a time and are therefore only used to address specific clinical hypotheses. Scaling serological analyses to encompass the complete human virome poses substantial technical challenges but would be of great value for better understanding host-virus interactions and would overcome many of the limitations associated with current clinical technologies. In this work, we present VirScan, a programmable, high-throughput method to comprehensively analyze antiviral antibodies using immunoprecipitation and massively parallel DNA sequencing of a bacteriophage library displaying proteome-wide coverage of peptides from all human viruses.

Results

The VirScan platform

VirScan uses the phage immunoprecipitation sequencing (PhIP-seq) technology previously developed in our laboratory (8). Briefly, we used a programmable DNA microarray to synthesize 93,904 200-mer oligonucleotides, encoding 56-residue peptide tiles with 28-residue overlaps, that together span the reference protein sequences (collapsed to 90% identity) of all viruses annotated to have human tropism in the UniProt database (Fig. 1A, a and b) (9). This library includes peptides from 206 species of virus and over 1000 different strains. We cloned the library into a T7 bacteriophage display vector for screening (Fig. 1A, c).

To perform a screen, we incubate the library with a serum sample containing antibodies, recover the antibodies by using a mixture of protein A– and G–coated magnetic beads, and remove unbound phage particles by washing (Fig. 1A, d and e). Last, we perform polymerase chain reaction (PCR) and massively parallel sequencing on the phage DNA to quantify enrichment of each library member resulting from antibody binding (Fig. 1A, f). Each sample is screened in duplicate to ensure reproducibility. VirScan requires only 2 µg of immunoglobulin (<1 µl of serum) per sample and can be automated on a 96-well liquid handling robot (10). PCR product from 96 immunoprecipitations can be individually barcoded and pooled for sequencing, reducing the cost for a comprehensive viral antibody screen to about $25 per sample.

After sequencing, we tally the read count for each peptide before (“input”) and after (“output”) immunoprecipitation. We then fit a zero-inflated generalized Poisson model to the distribution of output read counts for each input read count and regress the parameters as a function of input read count (fig. S1). With use of this model, we calculate a −log10(P value) for the significance of each peptide's enrichment. Last, we call a peptide significantly enriched if its −log10(P value) is greater than the reproducibility threshold of 2.3 in both replicates (fig. S2).
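Two steps in this section lend themselves to a short sketch: tiling a protein into 56-residue peptides with 28-residue overlaps, and calling a peptide enriched only when both replicates exceed the 2.3 threshold. The code below is illustrative and rests on assumptions: the function names are hypothetical, the handling of the final tile (anchored to the C terminus) is a guess, and the −log10(P value) scores are taken as given rather than derived from the zero-inflated generalized Poisson fit, which is not reproduced here.

```python
def tile_protein(seq, tile_len=56, overlap=28):
    """Split a protein sequence into overlapping peptide tiles."""
    step = tile_len - overlap
    if len(seq) <= tile_len:
        return [seq]
    starts = list(range(0, len(seq) - tile_len + 1, step))
    if starts[-1] + tile_len < len(seq):
        # Assumption: the last tile is shifted back to end at the C terminus.
        starts.append(len(seq) - tile_len)
    return [seq[s:s + tile_len] for s in starts]


def call_enriched(scores_rep1, scores_rep2, threshold=2.3):
    """Peptides whose -log10(P value) exceeds the threshold in both replicates.

    scores_rep1 and scores_rep2 map peptide id -> precomputed -log10(P value).
    """
    return {
        pep
        for pep, score in scores_rep1.items()
        if score > threshold and scores_rep2.get(pep, 0.0) > threshold
    }
```

Requiring agreement between replicates trades a little sensitivity for protection against peptides that score highly by chance in a single immunoprecipitation.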

Characterizing VirScan's sensitivity and specificity

Figure 1B shows the antibody profiles of a set of human viruses in sera from a typical group of individuals in a heat map format that illustrates the number of enriched peptides from each virus. We frequently detected antibodies to multiple peptides from common human viruses, such as Epstein-Barr virus (EBV), cytomegalovirus (CMV), and rhinovirus. As expected, we observed more peptides to be enriched from viruses with larger proteomes, such as EBV and CMV, likely because


1 Program in Biophysics, Harvard University, Cambridge, MA 02115, USA.
2 Harvard-Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology, Cambridge, MA 02139, USA.
3 Division of Genetics, Department of Medicine, Howard Hughes Medical Institute, Brigham and Women's Hospital, Boston, MA 02115, USA.
4 Department of Genetics, Harvard University Medical School, Boston, MA 02115, USA.
5 Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02115, USA.
6 Solve ME/CFS Initiative, Los Angeles, CA 90036, USA.
7 KwaZulu-Natal Research Institute for Tuberculosis and HIV, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa.
8 HIV Pathogenesis Programme, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, Durban, South Africa.
9 Ragon Institute of Massachusetts General Hospital, MIT, and Harvard University, Cambridge, MA 02139, USA.
10 Max Planck Institute for Infection Biology, Chariteplatz, D-10117 Berlin, Germany.
11 Vaccine and Cellular Immunology Laboratory, Department of Medicine, Faculty of Medicine, and Chula-Vaccine Research Center, Chulalongkorn University, Bangkok, Thailand.
12 Asociación Civil IMPACTA Salud y Educación, Lima, Peru.
13 AIDS Research Institute-IrsiCaixa and AIDS Unit, Hospital Germans Trias i Pujol, Universitat Autònoma de Barcelona, Badalona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
14 Division of Gastroenterology, Massachusetts General Hospital, Boston, MA 02114, USA.
15 Department of Neurology, Yale School of Medicine, New Haven, CT 06520, USA.
16 Division of Immunology, Department of Pathology, Johns Hopkins University, Baltimore, MD 21205, USA.
These authors contributed equally to this work. †Corresponding author. E-mail: selledge@genetics.med.harvard.edu


there are more epitopes available for recognition. We noticed fewer enriched peptides in samples from individuals less than 10 years of age compared with their geographically matched controls, in line with an accumulation of viral infections throughout adolescence and adulthood. However, there were occasional samples from young donors with very strong responses to viruses that cause childhood illness, such as parvovirus B19 and herpesvirus 6B, which cause the “fifth disease” and “sixth disease” of the classical infectious childhood rashes, respectively (11). These observations are examined in greater detail in Fig. 2.

We developed a computational method to identify the set of viruses to which an individual has been exposed, based on the number of enriched peptides identified per virus. Briefly, we set a threshold number of significant non-overlapping enriched peptides for each virus. We empirically determined that a threshold of three non-overlapping enriched peptides gave the best performance for detecting herpes simplex virus 1 (HSV1) compared with a commercial serologic test described below (Table 1). For other viruses, we adjusted the


Fig. 1. General VirScan analysis of the human virome. (A) Construction of the virome peptide library and VirScan screening procedure. (a) The virome peptide library consists of 93,904 56–amino acid peptides tiling, with 28–amino acid overlap, across the proteomes of all known human viruses. (b) The 200-nt DNA sequences encoding the peptides were printed on a releasable DNA microarray. (c) The released DNA was amplified and cloned into a T7 phage display vector and packaged into virus particles displaying the encoded peptide on its surface. (d) The library is mixed with a sample containing antibodies that bind to their cognate peptide antigen on the phage surface. (e) The antibodies are immobilized, and unbound phage are washed away. (f) Last, amplification of the bound DNA and high-throughput sequencing of the insert DNA from bound phage reveals peptides targeted by sample antibodies. Ab, antibody; IP, immunoprecipitation. (B) Antibody profile of randomly chosen group of donors to show typical assay results. Each row is a virus; each column is a sample. The label above each chart indicates whether the donors are over 10 years of age or at most 10 years of age. The color intensity of each cell indicates the number of peptides from the virus that were significantly enriched by antibodies in the sample. (C) Scatter plot of the number of unique enriched peptides (after applying maximum parsimony filtering) detected in each sample against the viral load in that sample. Data are shown for the HCV-positive and HIV-positive samples for which we were able to obtain viral load data. For the HIV-positive samples, red dots indicate samples from donors currently on highly active antiretroviral therapy (HAART) at the time the sample was taken, whereas blue dots indicate different donors before undergoing therapy. IU, international units. (D) Overlap between enriched peptides detected by VirScan and human B cell epitopes from viruses in IEDB. The entire pink circle represents the 1392 groups of nonredundant IEDB epitopes that are also present in the VirScan library (out of 1559 clusters total). The overlap region represents the number of groups with an epitope that is also contained in an enriched peptide detected by VirScan. The purple-only region represents the number of nonredundant enriched peptides detected by VirScan that do not contain an IEDB epitope. Data are shown for peptides enriched in at least one (left) or at least two (right) samples. (E) Overlap between enriched peptides detected by VirScan and human B cell epitopes in IEDB from common human viruses. The regions represent the same values as in (D), except only epitopes corresponding to the indicated virus are considered, and only peptides from that virus that were enriched in at least two samples were considered. (F) Distribution of number of viruses detected in each sample. The histogram depicts the frequency of samples binned by the number of virus species detected by VirScan. The mean and median of the distribution are both about 10 virus species.


threshold to account for the size of the viral proteome (fig. S3). Next, we tally the number of enriched peptides from each virus. Antibodies generated against a specific virus can cross-react with similar peptides from a related virus. This would lead to false positives, because an antibody targeted to an epitope from one virus to which a donor was exposed would also enrich a homologous peptide from a related virus to which the donor may not have been exposed. In order to address this issue, we adopted a maximum parsimony approach to infer the fewest number of virus exposures that could elicit the observed spectrum of antiviral peptide antibodies. Groups of enriched peptides that share a seven–amino acid subsequence may be recognized by a single specific antibody, so we only count them as one epitope for the virus that has the greatest number of other enriched peptides. If this adjusted peptide count is greater than the threshold for that virus, the sample is considered positive for the virus. For this analysis, we also filtered out peptides that were enriched in only 1 of the 569 samples to avoid spurious hits.

With this analytical framework, we measured the performance of VirScan by using serum samples from individuals known to be infected or not infected with human immunodeficiency virus (HIV) and hepatitis C virus (HCV), based on commercial enzyme-linked immunosorbent assay (ELISA) and Western blot assays. For both viruses, VirScan achieves very high sensitivities and specificities of ~95% or higher (Table 1) over a wide range of viral loads (Fig. 1C). The viral genotype was also known for the HCV-positive samples. Despite the over 70% amino acid sequence conservation among HCV genotypes (12), which poses a problem for all antibody-based detection methods, VirScan correctly reported the HCV genotype in 69% of the samples. We also compared VirScan to a commercially available serology test that is type-specific for the highly related HSV1 and HSV2 (Table 1). These results demonstrate that VirScan performs well in distinguishing between closely related viruses and viruses that range in size from small (HIV and HCV) to very large (HSV1 and HSV2) with high sensitivity and specificity.
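The parsimony-style counting described earlier in this section can be approximated by a greedy pass, sketched below. This is not the authors' implementation: the function names and input format are hypothetical, the greedy ordering (viruses with more enriched peptides claim shared epitopes first) is a simplification of a true maximum parsimony search, and the strict "greater than" comparison follows the wording of the text and may differ from the actual cutoff.

```python
def kmers(seq, k=7):
    """All length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def call_viruses(enriched, thresholds, k=7):
    """Greedy sketch of the adjusted peptide count and per-virus call.

    enriched: dict mapping virus name -> list of enriched peptide sequences.
    thresholds: dict mapping virus name -> threshold peptide count.
    Returns (adjusted counts, set of viruses called positive).
    """
    # Viruses with the most enriched peptides get first claim on shared epitopes.
    order = sorted(enriched, key=lambda v: len(enriched[v]), reverse=True)
    claimed = set()  # k-mers already attributed to some counted peptide
    counts = {}
    for virus in order:
        n = 0
        for pep in enriched[virus]:
            pep_kmers = kmers(pep, k)
            if pep_kmers.isdisjoint(claimed):
                n += 1  # a new epitope not explained by earlier peptides
            claimed |= pep_kmers
        counts[virus] = n
    # Assumption: strict comparison, following "greater than the threshold".
    positives = {v for v, n in counts.items() if n > thresholds[v]}
    return counts, positives
```

Because adjacent tiles from the same protein overlap by 28 residues, this counting also collapses overlapping enriched tiles within a virus, which matches the "non-overlapping enriched peptides" criterion.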

Population-level analysis of viral exposures

After ascertaining the performance of VirScan for a panel of viruses, we undertook a large-scale screening of samples with unknown exposure history. By using our multiplex approach, we assayed over 106 million antibody-peptide interactions with samples from 569 human donors in duplicate. We detected antibody responses to an average of 10 species of virus per sample (Fig. 1F). Each person is likely exposed to multiple distinct strains of some viral species. We detected antibody responses to 62 of the 206 species of virus in our library in at least five individuals and 84 species in at least two individuals. The most frequently detected viruses are generally those known to commonly infect humans (Table 2 and table S1). We occasionally detected what appear to be false positives that may be due to antibodies that cross-react with nonviral peptides. For example, 29% of the samples positive for cowpox virus were right at the threshold of detection and had antibodies against a peptide from the C4L gene that shares an eight–amino acid sequence (SESDSDSD; D, Asp; E, Glu; S, Ser) with the clumping factor B protein from Staphylococcus aureus, against which humans are known to generate antibodies (13). This will become less of an issue when we test more examples of sera from individuals with known infections to determine the set of likely antigenic peptides for a given virus. However, the fact that we do not detect


Fig. 2. Population stratification of the human virome immune response. The bar graphs depict the differences in exposure to viruses between donors who are (A) less than 10 years of age versus over 10 years of age, (B) HIV-positive versus HIV-negative residing in the United States, (C) residing in Peru versus residing in the United States, (D) residing in South Africa versus residing in the United States, and (E) residing in Thailand versus residing in the United States. Asterisks indicate false discovery rate < 0.05.


high rates of very rare viruses strengthens our confidence in VirScan's specificity (see supplementary discussion).

We frequently detected antibodies to rhinovirus and respiratory syncytial virus, which are normally found only in the respiratory tract, indicating that VirScan using blood samples is still able to detect viruses that do not cause viremia. We also detected antibodies to influenza, which is normally cleared, and poliovirus, to which most people in modern times generate antibodies through vaccination. Because the original antigen is no longer present, we are likely detecting antibodies secreted by long-lived memory B cells (14).

We detected antibodies to certain viruses less frequently than expected based on previous seroprevalence studies that used optimized serum ELISAs. For example, the frequency at which we detect influenza (53.4%) and poliovirus (33.7%) is lower than expected, given that the majority of the population has been exposed to or vaccinated against these viruses. This may be due to reduced sensitivity because of a gradual narrowing and decrease of the long-lived B cell response in the absence of persistent antigen. We also rarely detected antibody responses to small viruses, such as JC virus (JCV) and torque teno virus, which are frequently detected by using specific tests. We believe that the disparity is due to low titers of antibodies to unmodified linear epitopes from these viruses. For example, serum antibodies against the major capsid protein of JCV are reported to only recognize conformational epitopes (15). Last, the frequency of detecting varicella zoster virus (chicken pox) antibodies is also lower than expected (24.3%), even though the frequency of detecting other latent herpesviruses, such as EBV (87.1%) and CMV (48.5%), is similar to the prevalence reported in epidemiological studies (16–18). This may reflect differences in how frequently these viruses shed antigens that stimulate B cell responses or a more limited humoral response that relies on epitopes that cannot be detected in a 56-residue peptide. It might also be possible to increase the sensitivity of detection of these viral antibodies by stimulating memory B cells in vitro to probe the history of infection more deeply.

To assess differences in viral exposure between populations, we split the samples into different groups based on age, HIV status, and geography. We first compared results from children under the age of 10 to adults within the United States (HIV-positive individuals were excluded from this analysis) (Fig. 2A). Fewer children were positive for most viruses, including EBV, HSV1, HSV2, and influenza virus, which is consistent with our preliminary observations comparing the number of enriched peptides (Fig. 1B). In addition to the fact that children may generate lower antibody titers in general, these younger donors probably have not yet been exposed to certain viruses, for example, HSV2, which is sexually transmitted (19).

When comparing results from HIV-positive to HIV-negative samples, we found more of the HIV-positive samples to also be seropositive for additional viruses, including HSV2, CMV, and Kaposi's sarcoma–associated herpesvirus (KSHV) (false discovery rate q < 0.05, Fig. 2B). These results are consistent with prior studies indicating higher risk of these co-infections in HIV-positive patients (20–22). Patients with HIV may engage in activities that put them at higher risk for exposure to these viruses. Alternatively, these viruses may increase the risk of HIV infection. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response.

Last, we compared evidence of viral exposure among samples taken from adult HIV-negative donors residing in countries (United States, Peru, Thailand, and South Africa) from four different continents. In general, donors outside the United States had higher frequencies of seropositivity (Fig. 2, C to E). For example, CMV antibodies were found in significantly higher frequencies in samples from Peru, Thailand, and South Africa. Other viruses, such as KSHV and HSV1, were detected more frequently in donors from Peru and South Africa but not Thailand. The observed detection frequency of different adenovirus species varies across populations. Adenovirus C seropositivity was found at similar frequencies in all regions, but adenovirus D seropositivity was generally higher outside the United States, whereas adenovirus B seropositivity was higher in Peru and South Africa but not in Thailand. The higher rates of virus exposure outside the United States could be due to differences in population density, cultural practices, sanitation, or genetic susceptibility. Additionally, influenza B seropositivity was more common in the United States compared with other countries, especially Peru and Thailand. The global incidence of influenza B is much lower than influenza A, but the standard influenza vaccination contains both influenza A and B strains, so the elevated frequency of individuals with seroreactivity may be due to higher rates of influenza vaccination in the United States. Other viruses, such as rhinovirus and EBV, were detected at very similar frequencies in all the geographic regions.

Analysis of viral epitope determinants

After analyzing responses on the whole-virus level, we focused our attention on the specific peptides targeted by these antibodies. We detected antibodies to a total of 8425 peptides in at least two samples and 15,052 in at least one sample. Because of the presence of many related peptides in our library and the Immune Epitope Database (IEDB), for the following analysis we consider a peptide unique only if it does not contain a continuous seven-residue subsequence, the estimated size of a linear epitope, in common with another peptide. Analyzed as such, our VirScan database nearly doubles the 1559 unique human B cell epitopes from human viruses in the IEDB (23). The epitopes identified in our unbiased analysis demonstrate a significant overlap with those contained in the IEDB (P < 10^-30, Fisher's exact test, Fig. 1D). The amount of overlap is even greater for epitopes from viruses that commonly cause infection (Fig. 1E). We would likely have detected even more antigenic peptides in common with the IEDB if we had tested more samples from individuals infected with rare viruses. We next analyzed the amino acid composition of recurrently enriched peptides. Enriched peptides tend to have more proline and charged amino acids and fewer hydrophobic amino acids, which is consistent with a previous analysis of B cell epitopes in the IEDB (fig. S4) (24). This trend

aaa0698-4    5 JUNE 2015 • VOL 348 ISSUE 6239    sciencemag.org    SCIENCE

Table 2. Frequently detected viruses. The % column indicates the percentage of samples that were positive for the virus by VirScan. Known HIV- and HCV-positive samples were excluded when performing this analysis.

Virus species                          %
Human herpesvirus 4                    87.1
Rhinovirus B                           71.8
Human adenovirus C                     71.8
Rhinovirus A                           67.3
Human respiratory syncytial virus      65.7
Human herpesvirus 1                    54.4
Influenza A virus                      53.4
Human herpesvirus 6B                   52.8
Human herpesvirus 5                    48.5
Influenza B virus                      40.5
Poliovirus                             33.7
Human herpesvirus 3                    24.3
Human adenovirus F                     20.4
Human adenovirus B                     16.8
Human herpesvirus 2                    15.5
Enterovirus A                          15.2
Enterovirus B                          13.3

Table 1. VirScan's sensitivity and specificity on samples with known viral infections. Sensitivity is the percentage of samples positive for the virus as determined by VirScan out of all n known positives. Specificity is the percentage of samples negative for the virus by VirScan out of all n known negatives.

Virus    Sensitivity (n)    Specificity (n)
HCV      92% (26)           97% (34)
HIV1     95% (61)           100% (33)
HSV1     97% (38)           100% (6)
HSV2     90% (20)           100% (24)

We found that, although the false negative samples did not meet our stringent cutoff for enriching multiple unique peptides, they had detectable antibodies to a recurrent epitope. By modifying the criterion to allow for samples that enrich multiple homologous peptides that share a recurrent epitope, as described in the text, the sensitivity of detecting HCV increases to 100% and the sensitivity for detecting HIV increases to 97%. This modified criterion does not significantly affect specificity (fig. S13).

The one false positive was from an individual whose HCV-negative status was self-reported but who had antibodies to as many HCV peptides as 23% of the true HCV-positive individuals and is likely to be HCV-positive now or in the past. It is possible that this individual was exposed to HCV but cleared the infection. If true, the observed specificity for HCV is 100%.

likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide, comprising positions 141 to 196, that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens, we observed interpopulation serological differences. For example, two overlapping peptides from positions 309 to 364 and 337 to 392 of the penton base protein from adenovirus C frequently elicited antibody responses (Fig. 3B). However, donors from the United States and South Africa had much stronger responses to peptide 309-364 (P < 10^-6, t test) relative to donors from Thailand and Peru. We observed that for the EBNA1 protein from EBV, donors from all four countries frequently had strong responses to peptide 393-448 and occasionally to peptide 589-644. However, donors from Thailand and Peru had much stronger responses to peptide 57-112 (P < 10^-6, t test) (Fig. 3C). These differences may reflect variation in the strains endemic in each region. In addition, polymorphism of major histocompatibility complex (MHC) class II alleles, immunoglobulin genes, and other modifiers that shape immune responses in each population likely play a role in defining the relative immunodominance of antigenic peptides.

To determine whether the humoral responses that target an immunodominant peptide are actually targeting precisely the same epitope, we constructed single-, double-, and triple-alanine scanning mutagenesis libraries for eight commonly recognized peptides. These were introduced into the same T7 bacteriophage display vector and subjected to the same immunoprecipitation and sequencing protocol using samples from the United States. Mutants that disrupt the epitope diminish antibody binding affinity and peptide enrichment. We found that for all eight peptides tested, there was a single, largely contiguous subsequence in which mutations disrupted binding for the majority of samples. As expected, the triple mutants abolished antibody binding to a greater extent, and the enrichment patterns were similar among single, double, and triple mutants of the same peptide (Fig. 4 and figs. S5 to S11).

Fig. 3. The human anti-virome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the -log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.

For four of the eight peptides, a 9- to 15-amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation and that multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
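As an illustration only, the modified criterion described above can be sketched in Python. The function names, the data layout (one set of enriched peptide identifiers per sample), and the simplification that any enrichment of a diagnostic peptide counts as a "strong response" are assumptions made for this sketch; they are not taken from the VirScan pipeline.

```python
from collections import Counter


def diagnostic_peptides(known_positive_hits, min_fraction=0.30):
    """Return peptides enriched in at least `min_fraction` of known-positive samples.

    known_positive_hits: list of sets, one set of enriched peptide IDs per sample.
    """
    n = len(known_positive_hits)
    counts = Counter(p for hits in known_positive_hits for p in set(hits))
    return {p for p, c in counts.items() if c >= min_fraction * n}


def is_seropositive(sample_hits, specific_hit_count, threshold, diagnostic):
    """Stringent cutoff OR enrichment of any diagnostic peptide (assumed rule form)."""
    if specific_hit_count >= threshold:
        return True
    return bool(set(sample_hits) & diagnostic)


# Toy data invented for illustration; not from the study.
known_positives = [{"p1", "p2"}, {"p1"}, {"p3"}, {"p1", "p4"}]
diag = diagnostic_peptides(known_positives)
print(sorted(diag))                            # ['p1']
print(is_seropositive({"p1"}, 1, 2, diag))     # True: rescued by a diagnostic peptide
print(is_seropositive({"p9"}, 1, 2, diag))     # False: below cutoff, no diagnostic hit
```

In this sketch a sample that falls short of the stringent cutoff is still called positive when it enriches a diagnostic peptide, which is the mechanism by which false negatives would be reduced.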

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild-type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus-specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon. Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at -80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The clustering algorithm that UniProt uses represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56-amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of each peptide sequence are unique to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
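The tiling and scanning mutagenesis steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the example protein is made up, and the handling of a final C-terminal tile (anchoring it at the end of the protein) is an assumption, because the text does not say how proteins whose length is not a multiple of the stride were treated. Codon optimization and restriction-site avoidance are not modeled.

```python
ADAPTER_5 = "AGGAATTCCGCTGCGT"  # 5' adapter quoted in the text
ADAPTER_3 = "CAGGGAAGAGCTCGAA"  # 3' adapter quoted in the text


def tile_protein(seq, length=56, overlap=28):
    """Split a protein into `length`-aa tiles whose neighbours share `overlap` aa.

    Assumption: a final tile is anchored at the C terminus when the regular
    stride does not reach the end of the protein.
    """
    if len(seq) <= length:
        return [(0, seq)]
    step = length - overlap
    starts = list(range(0, len(seq) - length + 1, step))
    if starts[-1] != len(seq) - length:
        starts.append(len(seq) - length)
    return [(s, seq[s:s + length]) for s in starts]


def scanning_mutants(peptide, width):
    """All mutants with `width` consecutive substitutions (non-Ala -> Ala, Ala -> Gly)."""
    mutants = []
    for i in range(len(peptide) - width + 1):
        block = "".join("G" if aa == "A" else "A" for aa in peptide[i:i + width])
        mutants.append((i, peptide[:i] + block + peptide[i + width:]))
    return mutants


# 16-nt adapter + 56 codons + 16-nt adapter gives the 200-nt oligonucleotide.
assert len(ADAPTER_5) + 3 * 56 + len(ADAPTER_3) == 200

protein = "MKTAYIAKQR" * 10                        # made-up 100-aa sequence
tiles = tile_protein(protein)
print([start for start, _ in tiles])               # [0, 28, 44]
print(len(scanning_mutants(tiles[0][1], 3)))       # 54 triple mutants per 56-aa peptide
```

Under these assumptions a 56-aa peptide yields 56 single, 55 double, and 54 triple mutants, which matches the blank trailing positions noted for double and triple mutants in the Fig. 4 legend.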

The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 µg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 µl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 µl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 µl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 µl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50-base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

\[
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y = 0 \\
(1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y > 0
\end{cases}
\]

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the -log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the -log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the -log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold -log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the -log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best-fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and would be ignored for that virus. Otherwise, the hit is considered to be specific and the score for that virus is incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then, we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
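As an illustration of the virus-calling step described earlier in this section (sorting viruses by hit count, discarding hits that share a 7-aa subsequence with an earlier hit, and comparing the resulting score with a per-virus threshold), a minimal Python sketch follows. The function names and toy peptides are hypothetical. Two details are assumptions because the text does not specify them: ties in hit count keep their input order, and a hit that was itself ignored as cross-reactive still counts as "previously observed" for later hits. The power-law scaling of thresholds is not modeled; thresholds are passed in.

```python
def kmers(seq, k=7):
    """Set of all length-k subsequences of a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Count, per virus, the hits that share no k-aa subsequence with earlier hits.

    hits_by_virus: dict mapping virus name -> list of enriched peptide sequences
    for a single sample.
    """
    # Viruses with the most hits are processed first (stable for ties).
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]), reverse=True)
    seen = set()      # k-mers of every hit examined so far in this sample
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if not (peptide_kmers & seen):
                score += 1            # specific hit
            seen |= peptide_kmers     # assumption: ignored hits are remembered too
        scores[virus] = score
    return scores


def call_exposures(scores, thresholds, minimum=2):
    """Positive when the score reaches the virus threshold (never below `minimum`)."""
    return {v: s >= max(thresholds.get(v, minimum), minimum) for v, s in scores.items()}


# Toy sample invented for illustration.
hits = {
    "virus_A": ["ABCDEFGHIJ", "KLMNPQRSTV"],
    "virus_B": ["WWABCDEFGWW"],   # shares ABCDEFG with a virus_A hit
}
scores = score_viruses(hits)
print(scores)                                   # {'virus_A': 2, 'virus_B': 0}
print(call_exposures(scores, {"virus_A": 2}))   # {'virus_A': True, 'virus_B': False}
```

Because overlapping tiles from the same protein also share 7-aa subsequences, this greedy pass counts at most one of two adjacent tiles covering the same epitope, consistent with summing only hits that do not share a linear epitope.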

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (22). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB

epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
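The redundancy grouping used in this analysis can be sketched in Python as a connected-components problem. This is only an illustration: the function names and toy sequences are hypothetical, and treating the grouping as transitive (if A shares a 7-mer with B, and B with C, all three fall in one group) is an assumption, since the text does not state how chains of overlaps were resolved.

```python
def group_epitopes(epitopes, k=7):
    """Group sequences that are identical or share a k-aa subsequence (transitively)."""
    parent = list(range(len(epitopes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}   # key (full sequence or k-mer) -> index of first epitope seen with it
    for idx, ep in enumerate(epitopes):
        keys = {ep} | {ep[i:i + k] for i in range(len(ep) - k + 1)}
        for key in keys:
            if key in first_owner:
                root_a, root_b = find(first_owner[key]), find(idx)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_owner[key] = idx

    groups = {}
    for idx, ep in enumerate(epitopes):
        groups.setdefault(find(idx), []).append(ep)
    return list(groups.values())


def count_detectable(groups, library_peptides):
    """Number of groups with a member that is a subsequence of some library peptide."""
    return sum(
        any(ep in pep for ep in group for pep in library_peptides)
        for group in groups
    )


# Toy data invented for illustration.
toy = ["PEPTIDEAAA", "XXPEPTIDE", "GGGGHHHHKK", "SHORT", "SHORT"]
groups = group_epitopes(toy)
print(len(groups))                                   # 3
print(count_detectable(groups, ["ZZXXPEPTIDEZZ"]))   # 1
```

Epitopes shorter than 7 aa have no 7-mers, so in this sketch they are grouped only with exact duplicates.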

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of enrichment of the single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
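A minimal Python sketch of this calculation is given below. The function names and read counts are invented for illustration. How the "middle two quartiles" were delimited is not stated in the text, so the slice used here (dropping the lowest and highest quarter of the sorted single-mutant enrichments) is an assumption.

```python
from statistics import mean


def enrichment(output_counts, input_counts):
    """Fractional abundance after IP divided by fractional abundance before IP."""
    out_total = sum(output_counts.values())
    in_total = sum(input_counts.values())
    return {
        p: (output_counts.get(p, 0) / out_total) / (input_counts[p] / in_total)
        for p in input_counts
        if input_counts[p] > 0
    }


def wild_type_estimate(wt_enrichment, single_mutant_enrichments):
    """Average the wild-type value with the middle half of single-mutant values.

    Assumption: the middle two quartiles are taken as the sorted values with
    the lowest and highest quarter removed.
    """
    vals = sorted(single_mutant_enrichments)
    n = len(vals)
    middle = vals[n // 4: n - n // 4]
    return mean([wt_enrichment] + middle)


# Toy read counts invented for illustration.
input_counts = {"wt": 100, "m1": 100, "m2": 100, "m3": 100, "m4": 100}
output_counts = {"wt": 40, "m1": 40, "m2": 4, "m3": 40, "m4": 36}

enr = enrichment(output_counts, input_counts)
wt_ref = wild_type_estimate(enr["wt"], [enr[m] for m in ("m1", "m2", "m3", "m4")])
relative = {m: enr[m] / wt_ref for m in ("m1", "m2", "m3", "m4")}
print(round(relative["m2"], 3))   # the disrupted mutant m2 falls far below 1
```

Pooling the wild-type value with the central single-mutant values makes the denominator less sensitive to sampling noise in the single wild-type clone.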

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease? Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman, and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015, American Association for the Advancement of Science



there are more epitopes available for recognition. We noticed fewer enriched peptides in samples from individuals less than 10 years of age compared with their geographically matched controls, in line with an accumulation of viral infections throughout adolescence and adulthood. However, there were occasional samples from young donors with very strong responses to viruses that cause childhood illness, such as parvovirus B19 and herpesvirus 6B, which cause the "fifth disease" and "sixth disease" of the classical infectious childhood rashes, respectively (11). These observations are examined in greater detail in Fig. 2.

Fig. 1. General VirScan analysis of the human virome. (A) Construction of the virome peptide library and VirScan screening procedure. (a) The virome peptide library consists of 93,904 56–amino acid peptides tiling, with 28–amino acid overlap, across the proteomes of all known human viruses. (b) The 200-nt DNA sequences encoding the peptides were printed on a releasable DNA microarray. (c) The released DNA was amplified and cloned into a T7 phage display vector and packaged into virus particles displaying the encoded peptide on its surface. (d) The library is mixed with a sample containing antibodies that bind to their cognate peptide antigen on the phage surface. (e) The antibodies are immobilized, and unbound phage are washed away. (f) Last, amplification of the bound DNA and high-throughput sequencing of the insert DNA from bound phage reveals peptides targeted by sample antibodies. Ab, antibody; IP, immunoprecipitation. (B) Antibody profile of randomly chosen group of donors to show typical assay results. Each row is a virus; each column is a sample. The label above each chart indicates whether the donors are over 10 years of age or at most 10 years of age. The color intensity of each cell indicates the number of peptides from the virus that were significantly enriched by antibodies in the sample. (C) Scatter plot of the number of unique enriched peptides (after applying maximum parsimony filtering) detected in each sample against the viral load in that sample. Data are shown for the HCV-positive and HIV-positive samples for which we were able to obtain viral load data. For the HIV-positive samples, red dots indicate samples from donors currently on highly active antiretroviral therapy (HAART) at the time the sample was taken, whereas blue dots indicate different donors before undergoing therapy. IU, international units. (D) Overlap between enriched peptides detected by VirScan and human B cell epitopes from viruses in IEDB. The entire pink circle represents the 1392 groups of nonredundant IEDB epitopes that are also present in the VirScan library (out of 1559 clusters total). The overlap region represents the number of groups with an epitope that is also contained in an enriched peptide detected by VirScan. The purple-only region represents the number of nonredundant enriched peptides detected by VirScan that do not contain an IEDB epitope. Data are shown for peptides enriched in at least one (left) or at least two (right) samples. (E) Overlap between enriched peptides detected by VirScan and human B cell epitopes in IEDB from common human viruses. The regions represent the same values as in (D), except only epitopes corresponding to the indicated virus are considered, and only peptides from that virus that were enriched in at least two samples were considered. (F) Distribution of number of viruses detected in each sample. The histogram depicts the frequency of samples binned by the number of virus species detected by VirScan. The mean and median of the distribution are both about 10 virus species.

We developed a computational method to identify the set of viruses to which an individual has been exposed, based on the number of enriched peptides identified per virus. Briefly, we set a threshold number of significant non-overlapping enriched peptides for each virus. We empirically determined that a threshold of three non-overlapping enriched peptides gave the best performance for detecting herpes simplex virus 1 (HSV1) compared with a commercial serologic test described below (Table 1). For other viruses, we adjusted the threshold to account for the size of the viral proteome (fig. S3). Next, we tally the number of enriched peptides from each virus. Antibodies generated against a specific virus can cross-react with similar peptides from a related virus. This would lead to false positives, because an antibody targeted to an epitope from one virus to which a donor was exposed would also enrich a homologous peptide from a related virus to which the donor may not have been exposed. In order to address this issue, we adopted a maximum parsimony approach to infer the fewest number of virus exposures that could elicit the observed spectrum of antiviral peptide antibodies. Groups of enriched peptides that share a seven–amino acid subsequence may be recognized by a single specific antibody, so we only count them as one epitope for the virus that has the greatest number of other enriched peptides. If this adjusted peptide count is greater than the threshold for that virus, the sample is considered positive for the virus. For this analysis, we also filtered out peptides that were enriched in only 1 of the 569 samples to avoid spurious hits.

With this analytical framework, we measured the performance of VirScan by using serum samples from individuals known to be infected or not infected with human immunodeficiency virus (HIV) and hepatitis C virus (HCV), based on commercial enzyme-linked immunosorbent assay (ELISA) and Western blot assays. For both viruses, VirScan achieves very high sensitivities and specificities of ~95% or higher (Table 1) over a wide range of viral loads (Fig. 1C). The viral genotype was also known for the HCV-positive samples. Despite the over 70% amino acid sequence conservation among HCV genotypes (12), which poses a problem for all antibody-based detection methods, VirScan correctly reported the HCV genotype in 69% of the samples. We also compared VirScan to a commercially available serology test that is type-specific for the highly related HSV1 and HSV2 (Table 1). These results demonstrate that VirScan performs well in distinguishing between closely related viruses and viruses that range in size from small (HIV and HCV) to very large (HSV1 and HSV2) with high sensitivity and specificity.
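The exposure-calling logic described above (a per-virus threshold applied to a parsimony-adjusted peptide count) can be approximated by a simple greedy sketch. This is an illustrative simplification, not the published implementation: the function names, the greedy ordering by raw peptide count, and the example sequences in the usage note are all assumptions. Peptides that share a seven-residue subsequence with an already counted peptide are not counted again, and viruses with more enriched peptides claim shared epitopes first.

```python
def kmers(seq, k=7):
    """All k-residue subsequences of a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def call_exposures(enriched, thresholds, k=7):
    """enriched: {virus: [enriched peptide sequences]};
    thresholds: {virus: threshold count}.
    Returns ({virus: adjusted count}, set of viruses called positive)."""
    # Viruses with more enriched peptides are credited first (ties by name).
    order = sorted(enriched, key=lambda v: (-len(enriched[v]), v))
    claimed = set()  # k-mers already credited to a counted peptide
    adjusted = {}
    for virus in order:
        count = 0
        for pep in enriched[virus]:
            pep_kmers = kmers(pep, k)
            if pep_kmers & claimed:
                continue  # shares a k-mer with a counted peptide: same epitope
            count += 1
            claimed |= pep_kmers
        adjusted[virus] = count
    # The text calls a sample positive when the count is greater than the threshold.
    positive = {v for v, n in adjusted.items() if n > thresholds[v]}
    return adjusted, positive
```

For example, with made-up sequences, a virus with four distinct enriched peptides keeps a count of four, while a related virus whose only strong hit shares a seven-residue stretch with one of them is credited only for its remaining unique peptide.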

Population-level analysis of viral exposures

After ascertaining the performance of VirScan for a panel of viruses, we undertook a large-scale screening of samples with unknown exposure history. By using our multiplex approach, we assayed over 106 million antibody-peptide interactions with samples from 569 human donors in duplicate. We detected antibody responses to an average of 10 species of virus per sample (Fig. 1F). Each person is likely exposed to multiple distinct strains of some viral species. We detected antibody responses to 62 of the 206 species of virus in our library in at least five individuals and 84 species in at least two individuals. The most frequently detected viruses are generally those known to commonly infect humans (Table 2 and table S1). We occasionally detected what appear to be false positives that may be due to antibodies that cross-react with nonviral peptides. For example, 29% of the samples positive for cowpox virus were right at the threshold of detection and had antibodies against a peptide from the C4L gene that shares an eight–amino acid sequence (SESDSDSD; D, Asp; E, Glu; S, Ser) with the clumping factor B protein from Staphylococcus aureus, against which humans are known to generate antibodies (13). This will become less of an issue when we test more examples of sera from individuals with known infections to determine the set of likely antigenic peptides for a given virus. However, the fact that we do not detect high rates of very rare viruses strengthens our confidence in VirScan's specificity (see supplementary discussion).

We frequently detected antibodies to rhinovirus and respiratory syncytial virus, which are normally found only in the respiratory tract, indicating that VirScan using blood samples is still able to detect viruses that do not cause viremia. We also detected antibodies to influenza, which is normally cleared, and poliovirus, to which most people in modern times generate antibodies through vaccination. Because the original antigen is no longer present, we are likely detecting antibodies secreted by long-lived memory B cells (14).

We detected antibodies to certain viruses less frequently than expected based on previous seroprevalence studies that used optimized serum ELISAs. For example, the frequency at which we detect influenza (53.4%) and poliovirus (33.7%) is lower than expected, given that the majority of the population has been exposed to or vaccinated against these viruses. This may be due to reduced sensitivity because of a gradual narrowing and decrease of the long-lived B cell response in the absence of persistent antigen. We also rarely detected antibody responses to small viruses such as JC virus (JCV) and torque teno virus, which are frequently detected by using specific tests. We believe that the disparity is due to low titers of antibodies to unmodified linear epitopes from these viruses. For example, serum antibodies against the major capsid protein of JCV are reported to only recognize conformational epitopes (15). Last, the frequency of detecting varicella zoster virus (chicken pox) antibodies is also lower than expected (24.3%), even though the frequency of detecting other latent herpesviruses such as EBV (87.1%) and CMV (48.5%) is similar to the prevalence reported in epidemiological studies (16–18). This may reflect differences in how frequently these viruses shed antigens that stimulate B cell responses or a more limited humoral response that relies on epitopes that cannot be detected in a 56-residue peptide. It might also be possible to increase the sensitivity of detection of these viral antibodies by stimulating memory B cells in vitro to probe the history of infection more deeply.

To assess differences in viral exposure between populations, we split the samples into different groups based on age, HIV status, and geography. We first compared results from children under the age of 10 to adults within the United States (HIV-positive individuals were excluded from this analysis) (Fig. 2A). Fewer children were positive for most viruses, including EBV, HSV1, HSV2, and influenza virus, which is consistent with our preliminary observations comparing the number of enriched peptides (Fig. 1B). In addition to the fact that children may generate lower antibody titers in general, these younger donors probably have not yet been exposed to certain viruses, for example, HSV2, which is sexually transmitted (19).

When comparing results from HIV-positive to HIV-negative samples, we found more of the HIV-positive samples to also be seropositive for additional viruses, including HSV2, CMV, and Kaposi's sarcoma–associated herpesvirus (KSHV) (false discovery rate q < 0.05, Fig. 2B). These results are consistent with prior studies indicating higher risk of these co-infections in HIV-positive patients (20–22). Patients with HIV may engage in activities that put them at higher risk for exposure to these viruses. Alternatively, these viruses may increase the risk of HIV infection. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response.

Last, we compared evidence of viral exposure among samples taken from adult HIV-negative donors residing in countries (United States, Peru, Thailand, and South Africa) from four different continents. In general, donors outside the United States had higher frequencies of seropositivity (Fig. 2, C to E). For example, CMV antibodies were found in significantly higher frequencies in samples from Peru, Thailand, and South Africa. Other viruses, such as KSHV and HSV1, were detected more frequently in donors from Peru and South Africa but not Thailand. The observed detection frequency of different adenovirus species varies across populations. Adenovirus C seropositivity was found at similar frequencies in all regions, but adenovirus D seropositivity was generally higher outside the United States, whereas adenovirus B seropositivity was higher in Peru and South Africa but not in Thailand. The higher rates of virus exposure outside the United States could be due to differences in population density, cultural practices, sanitation, or genetic susceptibility. Additionally, influenza B seropositivity was more common in the United States compared with other countries, especially Peru and Thailand. The global incidence of influenza B is much lower than influenza A, but the standard influenza vaccination contains both influenza A and B strains, so the elevated frequency of individuals with seroreactivity may be due to higher rates of influenza vaccination in the United States. Other viruses, such as rhinovirus and EBV, were detected at very similar frequencies in all the geographic regions.

Fig. 2. Population stratification of the human virome immune response. The bar graphs depict the differences in exposure to viruses between donors who are (A) less than 10 years of age versus over 10 years of age, (B) HIV-positive versus HIV-negative residing in the United States, (C) residing in Peru versus residing in the United States, (D) residing in South Africa versus residing in the United States, and (E) residing in Thailand versus residing in the United States. Asterisks indicate false discovery rate < 0.05.

Analysis of viral epitope determinants

After analyzing responses on the whole-virus level, we focused our attention on the specific peptides targeted by these antibodies. We detected antibodies to a total of 8425 peptides in at least two samples and 15,052 in at least one sample. Because of the presence of many related peptides in our library and the Immune Epitope Database (IEDB), for the following analysis we consider a peptide unique only if it does not contain a continuous seven-residue subsequence, the estimated size of a linear epitope, in common with another peptide. Analyzed as such, our VirScan database nearly doubles the 1559 unique human B cell epitopes from human viruses in the IEDB (23). The epitopes identified in our unbiased analysis demonstrate a significant overlap with those contained in the IEDB (P < 10^-30, Fisher's exact test, Fig. 1D). The amount of overlap is even greater for epitopes from viruses that commonly cause infection (Fig. 1E). We would likely have detected even more antigenic peptides in common with the IEDB if we had tested more samples from individuals infected with rare viruses. We next analyzed the amino acid composition of recurrently enriched peptides. Enriched peptides tend to have more proline and charged amino acids and fewer hydrophobic amino acids, which is consistent with a previous analysis of B cell epitopes in the IEDB (fig. S4) (24). This trend likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

Table 2. Frequently detected viruses. The % column indicates the percentage of samples that were positive for the virus by VirScan. Known HIV- and HCV-positive samples were excluded when performing this analysis.

Virus species                        %
Human herpesvirus 4                  87.1
Rhinovirus B                         71.8
Human adenovirus C                   71.8
Rhinovirus A                         67.3
Human respiratory syncytial virus    65.7
Human herpesvirus 1                  54.4
Influenza A virus                    53.4
Human herpesvirus 6B                 52.8
Human herpesvirus 5                  48.5
Influenza B virus                    40.5
Poliovirus                           33.7
Human herpesvirus 3                  24.3
Human adenovirus F                   20.4
Human adenovirus B                   16.8
Human herpesvirus 2                  15.5
Enterovirus A                        15.2
Enterovirus B                        13.3

Table 1. VirScan's sensitivity and specificity on samples with known viral infections. Sensitivity is the percentage of samples positive for the virus as determined by VirScan out of all n known positives. Specificity is the percentage of samples negative for the virus by VirScan out of all n known negatives.

Virus    Sensitivity (n)    Specificity (n)
HCV      92%* (26)          97%† (34)
HIV1     95%* (61)          100% (33)
HSV1     97% (38)           100% (6)
HSV2     90% (20)           100% (24)

*We found that, although the false negative samples did not meet our stringent cutoff for enriching multiple unique peptides, they had detectable antibodies to a recurrent epitope. By modifying the criterion to allow for samples that enrich multiple homologous peptides that share a recurrent epitope, as described in the text, the sensitivity of detecting HCV increases to 100%, and the sensitivity for detecting HIV increases to 97%. This modified criterion does not significantly affect specificity (fig. S13).
†The one false positive was from an individual whose HCV-negative status was self-reported but who had antibodies to as many HCV peptides as 23% of the true HCV-positive individuals and is likely to be HCV-positive now or in the past. It is possible that this individual was exposed to HCV but cleared the infection. If true, the observed specificity for HCV is 100%.

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide, comprising positions 141 to 196, that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens, we observed interpopulation serological differences. For example, two overlapping peptides from positions 309 to 364 and 337 to 392 of the penton base protein from adenovirus C frequently elicited antibody responses (Fig. 3B). However, donors from the United States and South Africa had much stronger responses to peptide 309-364 (P < 10^-6, t test) relative to donors from Thailand and Peru. We observed that for the EBNA1 protein from EBV, donors from all four countries frequently had strong responses to peptide 393-448 and occasionally to peptide 589-644. However, donors from Thailand and Peru had much stronger responses to peptide 57-112 (P < 10^-6, t test) (Fig. 3C). These differences may reflect variation in the strains endemic in each region. In addition, polymorphism of major histocompatibility complex (MHC) class II alleles, immunoglobulin genes, and other modifiers that shape immune responses in each population likely play a role in defining the relative immunodominance of antigenic peptides.

To determine whether the humoral responses that target an immunodominant peptide are actually targeting precisely the same epitope, we constructed single-, double-, and triple-alanine scanning mutagenesis libraries for eight commonly recognized peptides. These were introduced into the same T7 bacteriophage display vector and subjected to the same immunoprecipitation and sequencing protocol using samples from the United States. Mutants that disrupt the epitope diminish antibody binding affinity and peptide enrichment. We found that for all eight peptides tested, there was a single, largely contiguous subsequence in which mutations disrupted binding for the majority of samples. As expected, the triple mutants abolished antibody binding to a greater extent, and the enrichment patterns were similar among single, double, and triple mutants of the same peptide (Fig. 4 and figs. S5 to S11).
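As an illustration of what single-, double-, and triple-alanine scanning libraries contain, the sketch below enumerates the mutant sequences for a peptide. The function name and the handling of residues that are already alanine (such windows are skipped when the mutant equals the wild type) are assumptions; the text does not specify how native alanines were treated.

```python
def alanine_scan(peptide, window):
    """Return {"start-end": mutant sequence} for every run of `window`
    consecutive positions (1-based, inclusive) replaced by alanine.
    window=1, 2, 3 gives single, double, and triple mutants."""
    mutants = {}
    for start in range(len(peptide) - window + 1):
        mutant = peptide[:start] + "A" * window + peptide[start + window:]
        if mutant != peptide:  # skip windows that are already all alanine
            mutants[f"{start + 1}-{start + window}"] = mutant
    return mutants

# Example with a made-up five-residue peptide.
single = alanine_scan("MKTAY", 1)
double = alanine_scan("MKTAY", 2)
triple = alanine_scan("MKTAY", 3)
```

Comparing the enrichment of each mutant with that of the wild-type peptide then localizes the positions where substitution disrupts antibody binding.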

Fig. 3. The human anti-virome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the –log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.

For four of the eight peptides, a 9– to 15–amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation and that multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive above the reported sensitivity of the assay, consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
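The modified criterion can be illustrated with a minimal Python sketch. It assumes that each sample is summarized as a set of enriched peptide identifiers and that "a strong response" simply means the diagnostic peptide is enriched; the function names and the data layout are hypothetical and are not taken from the VirScan pipeline.

```python
from collections import Counter


def find_diagnostic_peptides(known_positive_hits, min_fraction=0.30):
    """Return peptides enriched in at least `min_fraction` of known positives.

    known_positive_hits: list of sets, one set of enriched peptide IDs per
    known positive sample (hypothetical data layout).
    """
    counts = Counter(pep for hits in known_positive_hits for pep in hits)
    n_samples = len(known_positive_hits)
    return {pep for pep, c in counts.items() if c / n_samples >= min_fraction}


def is_seropositive(n_nonoverlapping_hits, enriched, diagnostic, min_hits=2):
    """Standard cutoff (>= min_hits non-overlapping peptides), or, failing
    that, enrichment of at least one diagnostic peptide (assumed reading of
    the modified criterion)."""
    if n_nonoverlapping_hits >= min_hits:
        return True
    return bool(set(enriched) & set(diagnostic))


# Toy example with made-up peptide IDs.
known_positive = [{"p1", "p2"}, {"p1"}, {"p3"}, {"p1", "p3"}]
diagnostic = find_diagnostic_peptides(known_positive)
```

In this toy input, `p1` and `p3` are enriched in 75% and 50% of the known positives and so qualify as diagnostic, whereas `p2` (25%) does not.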

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild-type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus–specific antibodies from different individuals (36). It is possible that rather than being an exception for dengue-specific antibodies, this represents a general phenomenon: Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at −80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The clustering algorithm that UniProt uses represents each group of protein sequences sharing at least 90% sequence identity with a single representative sequence. Then, we created 56–amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of each peptide sequence is unique, to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
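The tiling and scanning-mutagenesis designs described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used to build the library: the handling of proteins shorter than 56 aa and of the final C-terminal tile is not specified in the text and is assumed here, and the function names are hypothetical. Codon optimization and restriction-site avoidance are omitted.

```python
ADAPTER_5 = "AGGAATTCCGCTGCGT"  # 5' adapter given in the text (16 nt)
ADAPTER_3 = "CAGGGAAGAGCTCGAA"  # 3' adapter given in the text (16 nt)


def tile_protein(seq, length=56, overlap=28):
    """Split a protein into `length`-aa tiles that overlap by `overlap` aa.

    Assumptions (not stated in the text): a protein no longer than one tile
    is kept whole, and a final tile anchored at the C terminus is added when
    the regular tiling would otherwise leave the end uncovered.
    """
    if len(seq) <= length:
        return [seq]
    step = length - overlap
    tiles = [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]
    if (len(seq) - length) % step != 0:
        tiles.append(seq[-length:])
    return tiles


def mutate_residue(aa):
    """Alanine scanning rule from the text: non-Ala -> Ala, Ala -> Gly."""
    return "G" if aa == "A" else "A"


def scanning_mutants(peptide, width):
    """All mutants with `width` consecutive mutated positions (1, 2, or 3)."""
    mutants = []
    for start in range(len(peptide) - width + 1):
        window = "".join(mutate_residue(a) for a in peptide[start:start + width])
        mutants.append(peptide[:start] + window + peptide[start + width:])
    return mutants


# Length check: 16-nt adapter + 56 codons + 16-nt adapter = 200 nt.
OLIGO_LENGTH = len(ADAPTER_5) + 3 * 56 + len(ADAPTER_3)
```

For a 56-aa peptide this yields 56 single, 55 double, and 54 triple mutants, which matches the blank final positions noted for double and triple mutants in the Fig. 4 legend.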


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 μg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 μl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 μl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 μl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50–base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).
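The quoted library representation and per-sample read depth follow from simple arithmetic, shown here as a short Python check. The variable and function names are illustrative only, and the calculation assumes that reads are split evenly across the 192 pooled samples.

```python
# Library representation: plaque-forming units per clone in each reaction.
pfu_per_reaction = 2e10   # 2 x 10^10 plaque-forming units (from the text)
library_clones = 1e5      # library of 10^5 clones (from the text)
fold_representation = pfu_per_reaction / library_clones


def reads_per_sample(reads_per_lane, samples_per_lane=192):
    """Average reads per sample, assuming an even split across the lane."""
    return reads_per_lane / samples_per_lane


low = reads_per_sample(100e6)    # about 5.2 x 10^5 reads per sample
high = reads_per_sample(200e6)   # about 1.04 x 10^6 reads per sample
```

Both bounds agree with the stated range of roughly 500,000 to 1,000,000 reads per sample.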

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

$$
P(Y = y) =
\begin{cases}
\pi + (1-\pi)\,\dfrac{\theta(\theta+\lambda y)^{y-1}\,e^{-\theta-\lambda y}}{y!}, & y = 0 \\[6pt]
(1-\pi)\,\dfrac{\theta(\theta+\lambda y)^{y-1}\,e^{-\theta-\lambda y}}{y!}, & y > 0
\end{cases}
$$

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the −log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the −log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the −log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold −log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the −log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between the size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best-fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and was ignored for that virus. Otherwise, the hit was considered to be specific, and the score for that virus was incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then, we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
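As an illustration of the enrichment statistic, the following Python sketch evaluates a zero-inflated generalized Poisson null and converts an observed output count into a −log10(P value). It uses the standard form of the generalized Poisson distribution and assumes an upper-tail P value, P(Y ≥ observed); the parameter values in the example are arbitrary, and the regression of π, θ, and λ on input count is not reproduced. The function names are hypothetical.

```python
import math


def gp_logpmf(y, theta, lam):
    """Log pmf of the generalized Poisson distribution (standard form)."""
    return (math.log(theta) + (y - 1) * math.log(theta + lam * y)
            - theta - lam * y - math.lgamma(y + 1))


def zigp_pmf(y, pi, theta, lam):
    """Zero-inflated generalized Poisson pmf with zero-inflation weight pi."""
    gp = math.exp(gp_logpmf(y, theta, lam))
    return pi + (1 - pi) * gp if y == 0 else (1 - pi) * gp


def neg_log10_p(observed, pi, theta, lam):
    """-log10 of the upper-tail probability P(Y >= observed).

    The upper tail is an assumption; the direct 1 - CDF subtraction also
    loses precision for extremely small P values, so this is only a sketch.
    """
    cdf_below = sum(zigp_pmf(y, pi, theta, lam) for y in range(observed))
    p_value = max(1.0 - cdf_below, 1e-300)
    return -math.log10(p_value)


def is_hit(score_rep1, score_rep2, threshold=2.3):
    """A peptide is a hit if both replicates reach the threshold."""
    return score_rep1 >= threshold and score_rep2 >= threshold
```

With λ = 0 and π = 0 the model reduces to an ordinary Poisson distribution, which gives a convenient sanity check on the implementation.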
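The cross-reactivity filter described above is a greedy procedure, and a minimal Python sketch may make the bookkeeping clearer. It assumes that hits are given as peptide sequences keyed by virus name, and that every hit (including those discarded as cross-reactive) counts as "previously observed" for later comparisons; the text does not settle that second point. All names are illustrative.

```python
def kmers(peptide, k=7):
    """All length-k subsequences (contiguous) of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Count, per virus, the hits that share no k-aa subsequence with any
    hit already seen in the same sample.

    hits_by_virus: dict mapping virus name -> list of enriched peptide
    sequences for one sample (hypothetical data layout).
    """
    # Viruses with the most hits are processed first.
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]),
                   reverse=True)
    seen = set()
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if not (peptide_kmers & seen):
                score += 1           # specific hit
            seen |= peptide_kmers    # assumption: ignored hits are remembered too
        scores[virus] = score
    return scores


def call_exposures(scores, thresholds):
    """Positive if the specific-hit score reaches the per-virus threshold."""
    return {virus: scores[virus] >= thresholds[virus] for virus in scores}


# Toy sample: the virusB peptide shares "KTAYIAK" with a virusA hit.
example = {
    "virusA": ["MKTAYIAKQR", "LLDEEGHWPS"],
    "virusB": ["GGKTAYIAKGG"],
}
example_scores = score_viruses(example)
```

Because overlapping tiles from the same protein share 28 aa, two adjacent tiles that are both enriched contribute only one point under this scheme.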
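The population comparison can be sketched with a self-contained Fisher's exact test and a label-permutation loop. This is an assumption-laden illustration: the two-sided P value sums the probabilities of all tables no more likely than the observed one, "more extreme" is read as "less than or equal to the observed P value", and the permutation rate is computed for one virus at a time. The function names and the boolean data layout are hypothetical.

```python
import math
import random


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def permutation_rate(exposed, labels, n_perm=1000, seed=0):
    """Observed Fisher P value and the fraction of label permutations whose
    P value is at least as extreme (assumed reading of the procedure).

    exposed: list of bools, one per sample (virus detected or not).
    labels:  list of 0/1 population labels, one per sample.
    """
    def p_value(lbls):
        a = sum(1 for e, g in zip(exposed, lbls) if e and g)
        b = sum(1 for e, g in zip(exposed, lbls) if not e and g)
        c = sum(1 for e, g in zip(exposed, lbls) if e and not g)
        d = sum(1 for e, g in zip(exposed, lbls) if not e and not g)
        return fisher_exact_two_sided(a, b, c, d)

    rng = random.Random(seed)
    observed = p_value(labels)
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if p_value(shuffled) <= observed:
            extreme += 1
    return observed, extreme / n_perm
```

A fixed seed keeps the permutation result reproducible; in practice the number of permutations bounds the smallest rate that can be resolved (1/1000 here).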

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (23). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped the peptides that do not contain an IEDB epitope and that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
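The redundancy grouping used in this analysis (sequences that are identical or share a 7-aa subsequence fall into one group) can be sketched with a union-find structure in Python. The sketch assumes that grouping is transitive, so that chains of overlapping sequences merge into a single group; the text does not state this explicitly. The function name and the toy sequences are hypothetical.

```python
def group_by_shared_kmer(seqs, k=7):
    """Group sequences that are identical or share a k-aa subsequence.

    Grouping is transitive (single linkage), which is an assumption.
    Returns a list of groups, each a list of sequences.
    """
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    first_owner = {}
    for idx, seq in enumerate(seqs):
        # Sequences shorter than k are keyed on the whole sequence, so they
        # can only merge with identical copies.
        keys = {seq[i:i + k] for i in range(len(seq) - k + 1)} or {seq}
        for key in keys:
            if key in first_owner:
                union(first_owner[key], idx)
            else:
                first_owner[key] = idx

    groups = {}
    for idx, seq in enumerate(seqs):
        groups.setdefault(find(idx), []).append(seq)
    return list(groups.values())


toy_epitopes = ["AAAAAAAKL", "GAAAAAAA", "CDEFGHIKLM", "CDE", "CDE"]
toy_groups = group_by_shared_kmer(toy_epitopes)
```

Indexing every 7-mer once keeps the grouping roughly linear in total sequence length, which matters little for a few thousand epitopes but avoids an all-against-all comparison.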

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided the enrichment of the mutated peptide by the enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of the single-mutant enrichments to get a better estimate of the wild-type peptide enrichment.
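The enrichment calculation can be written out as a short Python sketch. Two details are assumptions rather than facts from the text: the wild-type value and the middle-quartile single-mutant values are pooled into one mean, and the quartile boundaries are taken by simple index slicing of the sorted values. The function names and the example numbers are illustrative.

```python
from statistics import mean


def enrichment(output_reads, output_total, input_reads, input_total):
    """Fractional abundance after IP divided by fractional abundance before."""
    return (output_reads / output_total) / (input_reads / input_total)


def wildtype_estimate(wt_enrichment, single_mutant_enrichments):
    """Average the wild-type enrichment with the middle two quartiles of the
    single-mutant enrichments (pooled mean; assumed interpretation)."""
    values = sorted(single_mutant_enrichments)
    n = len(values)
    middle = values[n // 4: n - n // 4]
    return mean([wt_enrichment] + middle)


def relative_enrichment(mutant_enrichment, wt_estimate):
    """Mutant enrichment relative to the stabilized wild-type estimate."""
    return mutant_enrichment / wt_estimate


singles = [0.0, 9.0, 10.0, 11.0, 12.0, 9.5, 10.5, 100.0]
wt = wildtype_estimate(10.0, singles)
```

Using the middle two quartiles discards both the disrupted mutants (low values) and any outliers (high values), so the wild-type estimate is driven by mutants that behave like the wild-type peptide.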

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital, Incorporated, that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-9

Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015 American Association for the Advancement of Science

threshold to account for the size of the viral proteome (fig. S3). Next, we tally the number of enriched peptides from each virus. Antibodies generated against a specific virus can cross-react with similar peptides from a related virus. This would lead to false positives, because an antibody targeted to an epitope from one virus to which a donor was exposed would also enrich a homologous peptide from a related virus to which the donor may not have been exposed. In order to address this issue, we adopted a maximum parsimony approach to infer the fewest number of virus exposures that could elicit the observed spectrum of antiviral peptide antibodies. Groups of enriched peptides that share a seven–amino acid subsequence may be recognized by a single specific antibody, so we only count them as one epitope for the virus that has the greatest number of other enriched peptides. If this adjusted peptide count is greater than the threshold for that virus, the sample is considered positive for the virus. For this analysis, we also filtered out peptides that were enriched in only 1 of the 569 samples to avoid spurious hits.

With this analytical framework, we measured the performance of VirScan by using serum samples from individuals known to be infected or not infected with human immunodeficiency virus (HIV) and hepatitis C virus (HCV), based on commercial enzyme-linked immunosorbent assay (ELISA) and Western blot assays. For both viruses, VirScan achieves very high sensitivities and specificities of ~95% or higher (Table 1) over a wide range of viral loads (Fig. 1C). The viral genotype was also known for the HCV-positive samples. Despite the over 70% amino acid sequence conservation among HCV genotypes (12), which poses a problem for all antibody-based detection methods, VirScan correctly reported the HCV genotype in 69% of the samples. We also compared VirScan to a commercially available serology test that is type-specific for the highly related HSV1 and HSV2 (Table 1). These results demonstrate that VirScan performs well in distinguishing between closely related viruses and viruses that range in size from small (HIV and HCV) to very large (HSV1 and HSV2) with high sensitivity and specificity.
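The peptide-counting rule described in this section (collapse enriched peptides that share a seven-residue subsequence, crediting the virus with the most enriched peptides) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the greedy virus ordering, the function names (`kmers`, `adjusted_counts`, `positive_viruses`), the toy sequences, and the thresholds are all assumptions chosen for illustration.

```python
K = 7  # shared-subsequence length treated as one epitope (from the text)

def kmers(peptide, k=K):
    """Return the set of all length-k subsequences of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}

def adjusted_counts(enriched):
    """enriched maps virus name -> list of enriched peptide sequences (one sample).

    Assumption: viruses are visited from most to fewest enriched peptides, and a
    peptide sharing a 7-mer with any already-counted peptide is not counted again.
    """
    order = sorted(enriched, key=lambda v: (-len(enriched[v]), v))
    seen = set()      # 7-mers belonging to peptides that were already counted
    counts = {}
    for virus in order:
        n = 0
        for peptide in enriched[virus]:
            km = kmers(peptide)
            if km & seen:
                continue          # likely the same epitope, already credited
            seen |= km
            n += 1
        counts[virus] = n
    return counts

def positive_viruses(counts, thresholds):
    """A virus is called positive when its adjusted count exceeds its threshold."""
    return {v for v, n in counts.items() if n > thresholds[v]}

# Hypothetical toy data: virusB's only peptide shares "CDEFGHI" with virusA.
sample = {
    "virusA": ["ACDEFGHIKL", "MNPQRSTVWY"],
    "virusB": ["WWCDEFGHIWW"],
}
counts = adjusted_counts(sample)
calls = positive_viruses(counts, {"virusA": 1, "virusB": 1})
```

In this toy case the cross-reactive peptide is credited only to `virusA`, so `virusB` is not called positive. Real thresholds depend on proteome size, as noted above, and would replace the placeholder values used here.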

Population-level analysis of viral exposures

After ascertaining the performance of VirScan for a panel of viruses, we undertook a large-scale screening of samples with unknown exposure history. By using our multiplex approach, we assayed over 106 million antibody-peptide interactions with samples from 569 human donors in duplicate. We detected antibody responses to an average of 10 species of virus per sample (Fig. 1F). Each person is likely exposed to multiple distinct strains of some viral species. We detected antibody responses to 62 of the 206 species of virus in our library in at least five individuals and 84 species in at least two individuals. The most frequently detected viruses are generally those known to commonly infect humans (Table 2 and table S1). We occasionally detected what appear to be false positives that may be due to antibodies that cross-react with nonviral peptides. For example, 29% of the samples positive for cowpox virus were right at the threshold of detection and had antibodies against a peptide from the C4L gene that shares an eight–amino acid sequence (SESDSDSD; D, Asp; E, Glu; S, Ser) with the clumping factor B protein from Staphylococcus aureus, against which humans are known to generate antibodies (13). This will become less of an issue when we test more examples of sera from individuals with known infections to determine the set of likely antigenic peptides for a given virus. However, the fact that we do not detect high rates of very rare viruses strengthens our confidence in VirScan's specificity (see supplementary discussion).

We frequently detected antibodies to rhinovirus and respiratory syncytial virus, which are normally found only in the respiratory tract, indicating that VirScan using blood samples is still able to detect viruses that do not cause viremia. We also detected antibodies to influenza, which is normally cleared, and poliovirus, to which most people in modern times generate antibodies through vaccination. Because the original antigen is no longer present, we are likely detecting antibodies secreted by long-lived memory B cells (14).

We detected antibodies to certain viruses less frequently than expected based on previous seroprevalence studies that used optimized serum ELISAs. For example, the frequency at which we detect influenza (53.4%) and poliovirus (33.7%) is lower than expected, given that the majority of the population has been exposed to or vaccinated against these viruses. This may be due to reduced sensitivity because of a gradual narrowing and decrease of the long-lived B cell response in the absence of persistent antigen. We also rarely detected antibody responses to small viruses such as JC virus (JCV) and torque teno virus, which are frequently detected by using specific tests. We believe that the disparity is due to low titers of antibodies to unmodified, linear epitopes from these viruses. For example, serum antibodies against the major capsid protein of JCV are reported to only recognize conformational epitopes (15). Last, the frequency of detecting varicella zoster virus (chicken pox) antibodies is also lower than expected (24.3%), even though the frequency of detecting other latent herpesviruses such as EBV (87.1%) and CMV (48.5%) is similar to the prevalence reported in epidemiological studies (16–18). This may reflect differences in how frequently these viruses shed antigens that stimulate B cell responses or a more limited humoral response that relies on epitopes that cannot be detected in a 56-residue peptide. It might also be possible to increase the sensitivity of detection of these viral antibodies by stimulating memory B cells in vitro to probe the history of infection more deeply.

To assess differences in viral exposure between populations, we split the samples into different groups based on age, HIV status, and geography. We first compared results from children under the age of 10 to adults within the United States (HIV-positive individuals were excluded from this analysis) (Fig. 2A). Fewer children were positive for most viruses, including EBV, HSV1, HSV2, and influenza virus, which is consistent with our preliminary observations comparing the number of enriched peptides (Fig. 1B). In addition to the fact that children may generate lower antibody titers in general, these younger donors probably have not yet been exposed to certain viruses, for example, HSV2, which is sexually transmitted (19).

When comparing results from HIV-positive to HIV-negative samples, we found more of the HIV-positive samples to also be seropositive for additional viruses, including HSV2, CMV, and Kaposi's sarcoma–associated herpesvirus (KSHV) (false discovery rate q < 0.05, Fig. 2B). These results are consistent with prior studies indicating higher risk of these co-infections in HIV-positive patients (20–22). Patients with HIV may engage in activities that put them at higher risk for exposure to these viruses. Alternatively, these viruses may increase the risk of HIV infection. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response.

Last, we compared evidence of viral exposure among samples taken from adult HIV-negative donors residing in countries (United States, Peru, Thailand, and South Africa) from four different continents. In general, donors outside the United States had higher frequencies of seropositivity (Fig. 2, C to E). For example, CMV antibodies were found in significantly higher frequencies in samples from Peru, Thailand, and South Africa. Other viruses, such as KSHV and HSV1, were detected more frequently in donors from Peru and South Africa but not Thailand. The observed detection frequency of different adenovirus species varies across populations. Adenovirus C seropositivity was found at similar frequencies in all regions, but adenovirus D seropositivity was generally higher outside the United States, whereas adenovirus B seropositivity was higher in Peru and South Africa but not in Thailand. The higher rates of virus exposure outside the United States could be due to differences in population density, cultural practices, sanitation, or genetic susceptibility. Additionally, influenza B seropositivity was more common in the United States compared with other countries, especially Peru and Thailand. The global incidence of influenza B is much lower than influenza A, but the standard influenza vaccination contains both influenza A and B strains, so the elevated frequency of individuals with seroreactivity may be due to higher rates of influenza vaccination in the United States. Other viruses, such as rhinovirus and EBV, were detected at very similar frequencies in all the geographic regions.

Fig. 2. Population stratification of the human virome immune response. The bar graphs depict the differences in exposure to viruses between donors who are (A) less than 10 years of age versus over 10 years of age, (B) HIV-positive versus HIV-negative residing in the United States, (C) residing in Peru versus residing in the United States, (D) residing in South Africa versus residing in the United States, and (E) residing in Thailand versus residing in the United States. Asterisks indicate false discovery rate < 0.05.
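The group comparisons above are reported at a false discovery rate below 0.05. This section does not say which FDR procedure was used, so the sketch below assumes the common Benjamini-Hochberg adjustment. The per-virus p-values are hypothetical placeholders, not values from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values from per-virus comparisons between two donor groups.
p_by_virus = {"virus_1": 0.001, "virus_2": 0.02, "virus_3": 0.03, "virus_4": 0.5}
q_values = benjamini_hochberg(list(p_by_virus.values()))
flagged = [name for name, q in zip(p_by_virus, q_values) if q < 0.05]
```

Testing many viruses at once inflates the number of chance findings, which is why a multiple-testing correction of this kind is applied before differences are marked as significant.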

Analysis of viral epitope determinants

After analyzing responses on the whole-virus level, we focused our attention on the specific peptides targeted by these antibodies. We detected antibodies to a total of 8425 peptides in at least two samples and 15,052 in at least one sample. Because of the presence of many related peptides in our library and the Immune Epitope Database (IEDB), for the following analysis, we consider a peptide unique only if it does not contain a continuous seven-residue subsequence, the estimated size of a linear epitope, in common with another peptide. Analyzed as such, our VirScan database nearly doubles the 1559 unique human B cell epitopes from human viruses in the IEDB (23). The epitopes identified in our unbiased analysis demonstrate a significant overlap with those contained in the IEDB (P < 10^-30, Fisher's exact test, Fig. 1D). The amount of overlap is even greater for epitopes from viruses that commonly cause infection (Fig. 1E). We would likely have detected even more antigenic peptides in common with the IEDB if we had tested more samples from individuals infected with rare viruses. We next analyzed the amino acid composition of recurrently enriched peptides. Enriched peptides tend to have more proline and charged amino acids and fewer hydrophobic amino acids, which is consistent with a previous analysis of B cell epitopes in the IEDB (fig. S4) (24). This trend likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

Table 2. Frequently detected viruses. The column indicates the percentage of samples that were positive for the virus by VirScan. Known HIV- and HCV-positive samples were excluded when performing this analysis.

Virus species                          % positive
Human herpesvirus 4                    87.1
Rhinovirus B                           71.8
Human adenovirus C                     71.8
Rhinovirus A                           67.3
Human respiratory syncytial virus      65.7
Human herpesvirus 1                    54.4
Influenza A virus                      53.4
Human herpesvirus 6B                   52.8
Human herpesvirus 5                    48.5
Influenza B virus                      40.5
Poliovirus                             33.7
Human herpesvirus 3                    24.3
Human adenovirus F                     20.4
Human adenovirus B                     16.8
Human herpesvirus 2                    15.5
Enterovirus A                          15.2
Enterovirus B                          13.3

Table 1. VirScan's sensitivity and specificity on samples with known viral infections. Sensitivity is the percentage of samples positive for the virus as determined by VirScan out of all n known positives. Specificity is the percentage of samples negative for the virus by VirScan out of all n known negatives.

Virus    Sensitivity, % (n)    Specificity, % (n)
HCV      92 (26)               97 (34)
HIV1     95 (61)               100 (33)
HSV1     97 (38)               100 (6)
HSV2     90 (20)               100 (24)

We found that although the false negative samples did not meet our stringent cutoff for enriching multiple unique peptides, they had detectable antibodies to a recurrent epitope. By modifying the criterion to allow for samples that enrich multiple homologous peptides that share a recurrent epitope, as described in the text, the sensitivity of detecting HCV increases to 100% and the sensitivity for detecting HIV increases to 97%. This modified criterion does not significantly affect specificity (fig. S13).

The one false positive was from an individual whose HCV-negative status was self-reported but who had antibodies to as many HCV peptides as 23% of the true HCV-positive individuals and is likely to be HCV-positive now or in the past. It is possible that this individual was exposed to HCV but cleared the infection. If true, the observed specificity for HCV is 100%.
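The sensitivity and specificity definitions given with Table 1 reduce to simple ratios, sketched below. The HCV specificity count follows from the single false positive mentioned in the table note (33 of 34). The count of 24 detected HCV positives is an inference, being the only whole number out of 26 that rounds to the reported 92%, and is not stated in the text.

```python
def sensitivity(true_pos, false_neg):
    """Percentage of known positives that the assay calls positive."""
    return 100.0 * true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Percentage of known negatives that the assay calls negative."""
    return 100.0 * true_neg / (true_neg + false_pos)

# HCV row of Table 1: n = 26 known positives, n = 34 known negatives.
hcv_sensitivity = sensitivity(true_pos=24, false_neg=2)   # 24 is inferred, see above
hcv_specificity = specificity(true_neg=33, false_pos=1)   # one false positive (table note)
```

Small denominators matter here. With only 6 known HSV1 negatives, for instance, a specificity of 100% is compatible with a fairly wide range of true specificities.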

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide comprising positions 141 to 196 that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens, we observed interpopulation serological differences. For example, two overlapping peptides from positions 309 to 364 and 337 to 392 of the penton base protein from adenovirus C frequently elicited antibody responses (Fig. 3B). However, donors from the United States and South Africa had much stronger responses to peptide 309-364 (P < 10^-6, t test) relative to donors from Thailand and Peru. We observed that for the EBNA1 protein from EBV, donors from all four countries frequently had strong responses to peptide 393-448 and occasionally to peptide 589-644. However, donors from Thailand and Peru had much stronger responses to peptide 57-112 (P < 10^-6, t test) (Fig. 3C). These differences may reflect variation in the strains endemic in each region. In addition, polymorphism of major histocompatibility complex (MHC) class II alleles, immunoglobulin genes, and other modifiers that shape immune responses in each population likely play a role in defining the relative immunodominance of antigenic peptides.

To determine whether the humoral responses that target an immunodominant peptide are actually targeting precisely the same epitope, we constructed single-, double-, and triple-alanine scanning mutagenesis libraries for eight commonly recognized peptides. These were introduced into the same T7 bacteriophage display vector and subjected to the same immunoprecipitation and sequencing protocol using samples from the United States. Mutants that disrupt the epitope diminish antibody binding affinity and peptide enrichment. We found that for all eight peptides tested, there was a single, largely contiguous subsequence in which mutations disrupted binding for the majority of samples. As expected, the triple mutants abolished antibody binding to a greater extent, and the enrichment patterns were similar among single, double, and triple mutants of the same peptide (Fig. 4 and figs. S5 to S11).

Fig. 3. The human anti-virome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the –log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate a stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.

For four of the eight peptides, a 9– to 15–amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation and multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
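The modified criterion described above can be sketched in a few lines of Python. The 30% cutoff and the requirement of at least two non-overlapping enriched peptides come from the text. The function names, the peptide identifiers, and the treatment of a "strong response" as simple enrichment are simplifying assumptions for illustration only.

```python
def diagnostic_peptides(known_positive_hits, min_fraction=0.30):
    """Peptides enriched in at least min_fraction of known-positive samples.

    known_positive_hits: list of sets, one set of enriched peptide IDs per sample.
    """
    n = len(known_positive_hits)
    tally = {}
    for hits in known_positive_hits:
        for peptide in hits:
            tally[peptide] = tally.get(peptide, 0) + 1
    return {peptide for peptide, c in tally.items() if c / n >= min_fraction}

def is_seropositive(non_overlapping_count, sample_hits, diagnostic, min_peptides=2):
    """Stringent cutoff OR enrichment of any diagnostic peptide (modified criterion)."""
    meets_cutoff = non_overlapping_count >= min_peptides
    has_diagnostic = bool(sample_hits & diagnostic)
    return meets_cutoff or has_diagnostic

# Hypothetical training set of five known-positive samples.
training = [{"p1", "p2"}, {"p1"}, {"p3"}, {"p1", "p4"}, {"p5"}]
diagnostic = diagnostic_peptides(training)   # only p1 reaches 3/5 = 60%

rescued = is_seropositive(1, {"p1"}, diagnostic)       # one peptide, but diagnostic
still_negative = is_seropositive(1, {"p9"}, diagnostic)
```

The design trades a fixed peptide-count rule for prior knowledge about which peptides recur in true positives, which is why the text notes that performance should improve as more known positives are screened.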

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild-type. For double-mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveals the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Secondly, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection, because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus–specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon. Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.
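The Discussion reports small sets of public peptides that together are recognized by most screened samples (two peptides covering >95%, five covering >99%). The text does not say how these sets were chosen. The sketch below assumes a simple greedy coverage heuristic, which is not guaranteed to find the smallest possible set, and uses hypothetical peptide and sample identifiers.

```python
def greedy_cover(recognized_by, n_samples, max_peptides):
    """Pick up to max_peptides peptides that together cover the most samples.

    recognized_by: dict mapping peptide ID -> set of sample IDs that recognize it.
    Returns (chosen peptide list, fraction of samples covered).
    """
    covered = set()
    chosen = []
    for _ in range(max_peptides):
        # Peptide adding the most not-yet-covered samples; ties broken by name.
        best = max(sorted(recognized_by), key=lambda p: len(recognized_by[p] - covered))
        gain = recognized_by[best] - covered
        if not gain:
            break                     # nothing left to gain
        chosen.append(best)
        covered |= gain
    return chosen, len(covered) / n_samples

# Hypothetical recognition data for 10 samples.
recognized_by = {
    "pepA": {1, 2, 3, 4, 5, 6},
    "pepB": {5, 6, 7, 8, 9},
    "pepC": {1, 2, 10},
}
pair, pair_coverage = greedy_cover(recognized_by, n_samples=10, max_peptides=2)
trio, trio_coverage = greedy_cover(recognized_by, n_samples=10, max_peptides=3)
```

Note that the second peptide is chosen for the samples it adds rather than for how many samples recognize it overall, which is the relevant property when designing a compact diagnostic panel.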

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at –80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The clustering algorithm UniProt uses represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56–amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of the peptide sequence is unique, to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
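The tiling and scanning-mutagenesis designs described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the handling of the final tile (anchoring it to the C-terminus) is an assumption, since the text does not say how protein ends were treated. Codon optimization and restriction-site avoidance are not shown.

```python
ADAPTER_5 = "AGGAATTCCGCTGCGT"   # 5' adapter from the text
ADAPTER_3 = "CAGGGAAGAGCTCGAA"   # 3' adapter from the text


def tile_protein(seq, tile_len=56, overlap=28):
    """Return (start, peptide) tiles of tile_len residues with the given overlap."""
    step = tile_len - overlap
    if len(seq) <= tile_len:
        return [(0, seq)]
    starts = list(range(0, len(seq) - tile_len + 1, step))
    # Assumption: add a final tile flush with the C-terminus so no residue is dropped.
    if starts[-1] != len(seq) - tile_len:
        starts.append(len(seq) - tile_len)
    return [(s, seq[s:s + tile_len]) for s in starts]


def scanning_mutants(peptide, max_run=3):
    """Single, consecutive-double, and consecutive-triple mutants.

    Non-alanine residues become alanine; alanine becomes glycine.
    Returns (start, run_length, mutant_sequence) tuples.
    """
    def swap(residue):
        return "G" if residue == "A" else "A"

    mutants = []
    for run in range(1, max_run + 1):
        for start in range(len(peptide) - run + 1):
            window = "".join(swap(r) for r in peptide[start:start + run])
            mutants.append((start, run, peptide[:start] + window + peptide[start + run:]))
    return mutants


# Length check: 16-nt adapter + 56 codons + 16-nt adapter gives the 200-nt oligo.
OLIGO_LENGTH = len(ADAPTER_5) + 3 * 56 + len(ADAPTER_3)
```

Under these assumptions a 56-aa peptide yields 56 single, 55 double, and 54 triple mutants (165 in total), and the oligonucleotide length works out to the 200 nt stated in the text.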


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 µg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5 fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 µl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 µl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 µl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 µl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50–base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).
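As a quick consistency check on the quantities quoted above, the arithmetic can be written out in a few lines of Python. The per-clone read depth is not stated in the text; it is derived here by assuming the same nominal library size of 10^5 clones and an even split of reads across the 192 pooled samples.

```python
LIBRARY_CLONES = 1e5       # nominal library size quoted in the text
PFU_PER_WELL = 2e10        # plaque-forming units added per well
SAMPLES_PER_LANE = 192     # maximum number of samples pooled per lane


def fold_representation(pfu, clones):
    """Average number of phage particles per library clone in one well."""
    return pfu / clones


def reads_per_sample(reads_per_lane, samples=SAMPLES_PER_LANE):
    """Average reads per sample, assuming an even split across the pool."""
    return reads_per_lane / samples


representation = fold_representation(PFU_PER_WELL, LIBRARY_CLONES)
low, high = reads_per_sample(100e6), reads_per_sample(200e6)
# Derived, not from the text: mean reads per clone at each end of the range.
per_clone_low, per_clone_high = low / LIBRARY_CLONES, high / LIBRARY_CLONES
```

With 192 samples per lane, 100 to 200 million reads correspond to roughly 0.52 to 1.04 million reads per sample, in line with the rounded range given in the text.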

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

$$
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\left[\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}\right], & y = 0 \\[2ex]
(1 - \pi)\left[\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}\right], & y > 0
\end{cases}
$$
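A minimal Python sketch of this probability mass function and of a −log10(P value) derived from it is given below. It uses the standard zero-inflated generalized Poisson form with parameters π, θ, and λ. Treating the P value as the upper-tail probability P(Y ≥ observed) is an assumption, as are the function names; the per-clone regression of the parameters on input count is not shown.

```python
import math


def zigp_log_pmf(y, pi, theta, lam):
    """log P(Y = y) for a zero-inflated generalized Poisson.

    Requires 0 <= pi < 1, theta > 0 and 0 <= lam < 1.
    """
    log_gp = (math.log(theta) + (y - 1) * math.log(theta + y * lam)
              - theta - y * lam - math.lgamma(y + 1))
    if y == 0:
        # At y = 0 the generalized Poisson term reduces to exp(-theta).
        return math.log(pi + (1.0 - pi) * math.exp(log_gp))
    return math.log1p(-pi) + log_gp


def neg_log10_pvalue(observed, pi, theta, lam):
    """-log10 of P(Y >= observed) under the null (assumed upper-tail test)."""
    lower = sum(math.exp(zigp_log_pmf(k, pi, theta, lam)) for k in range(observed))
    # Floor the tail to avoid log10(0) when the lower sum rounds to 1.
    tail = max(1.0 - lower, 1e-300)
    return -math.log10(tail)
```

Computing the tail as one minus a partial sum loses precision for very small P values; a production implementation would sum the upper tail directly in log space.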

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the −log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the −log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the −log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold −log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the −log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and would be ignored for that virus. Otherwise, the hit is considered to be specific, and the score for that virus is incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then, we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
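The cross-reactivity filter described earlier in this section lends itself to a short sketch. The Python below is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, ties in hit count keep their input order, and every examined hit (including hits judged cross-reactive) is assumed to count as "previously observed" for later comparisons.

```python
def kmers(seq, k=7):
    """All length-k subsequences of a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def specific_hit_scores(hits_by_virus, k=7):
    """Count hits per virus that share no k-mer with any earlier hit.

    hits_by_virus maps a virus name to the list of enriched peptide
    sequences for a single sample.
    """
    # Viruses with the most hits are processed first, as in the text.
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]), reverse=True)
    seen = set()     # k-mers from every hit examined so far in this sample
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if peptide_kmers.isdisjoint(seen):
                score += 1           # specific hit
            seen |= peptide_kmers    # assumption: ignored hits are remembered too
        scores[virus] = score
    return scores


def is_exposed(score, threshold):
    """A sample is called positive when the score reaches the virus threshold."""
    return score >= threshold
```

Because adjacent library tiles overlap by 28 residues, two neighboring enriched tiles from one protein share 7-mers and contribute only one point to the score under this scheme.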

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (23). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped peptides not containing an IEDB epitope that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
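One way to form such nonredundant groups is sketched below in Python, using a union-find structure keyed on shared 7-mers. Treating the grouping as transitive (A joins C if both share a 7-mer with B) is an assumption; the text does not specify it. The function name is hypothetical.

```python
from collections import defaultdict


def group_epitopes(epitopes, k=7):
    """Group sequences that are identical or share any length-k subsequence."""
    parent = list(range(len(epitopes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_owner = {}   # key (k-mer or full sequence) -> first index seen with it
    for idx, seq in enumerate(epitopes):
        keys = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        keys.add(seq)  # lets identical epitopes shorter than k still group
        for key in keys:
            if key in first_owner:
                union(first_owner[key], idx)
            else:
                first_owner[key] = idx

    groups = defaultdict(list)
    for idx, seq in enumerate(epitopes):
        groups[find(idx)].append(seq)
    return list(groups.values())
```

The number of nonredundant groups is then `len(group_epitopes(epitopes))`; the same routine could be applied to library peptides when counting new epitope groups.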

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
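The enrichment calculation above can be sketched in Python as follows. The function names are hypothetical. Reading "the middle two quartiles" as the single-mutant values lying between the first and third quartile (inclusive), and pooling them with the wild-type value in one mean, is an assumption; the quartile convention is whatever `statistics.quantiles` uses by default.

```python
import statistics


def enrichment(ip_counts, input_counts):
    """Fractional abundance after IP divided by fractional abundance before IP."""
    ip_total = sum(ip_counts.values())
    input_total = sum(input_counts.values())
    return {
        pep: (ip_counts.get(pep, 0) / ip_total) / (n / input_total)
        for pep, n in input_counts.items()
        if n > 0   # peptides absent from the input have undefined enrichment
    }


def wild_type_estimate(wt_enrichment, single_mutant_enrichments):
    """Mean of the wild-type value and the interquartile single-mutant values."""
    q1, _, q3 = statistics.quantiles(single_mutant_enrichments, n=4)
    middle = [e for e in single_mutant_enrichments if q1 <= e <= q3]
    return statistics.mean([wt_enrichment] + middle)


def relative_enrichment(enrich, wt_id, single_mutant_ids):
    """Enrichment of every peptide divided by the pooled wild-type estimate."""
    wt = wild_type_estimate(enrich[wt_id], [enrich[m] for m in single_mutant_ids])
    return {pep: value / wt for pep, value in enrich.items()}
```

A relative enrichment near 1 indicates that a mutation leaves antibody binding intact, while values near 0 mark positions inside the epitope.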

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423
2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792
3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983
4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242
5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500
6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896
7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846
8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805
9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303
10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938
11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288
12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830
13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942
14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575
15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707
16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132
17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231
18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566
19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627
20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305
21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619
22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106
23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713
24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458
25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968
26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679
27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568
28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615
29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060
30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764
31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183
32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727
33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533
34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936
35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032
36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493
37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015 American Association for the Advancement of Science


high rates of very rare viruses strengthens our confidence in VirScan's specificity (see supplementary discussion).

We frequently detected antibodies to rhinovirus and respiratory syncytial virus, which are normally found only in the respiratory tract, indicating that VirScan using blood samples is still able to detect viruses that do not cause viremia. We also detected antibodies to influenza, which is normally cleared, and poliovirus, to which most people in modern times generate antibodies through vaccination. Because the original antigen is no longer present, we are likely detecting antibodies secreted by long-lived memory B cells (14).

We detected antibodies to certain viruses less frequently than expected based on previous seroprevalence studies that used optimized serum ELISAs. For example, the frequency at which we detect influenza (53.4%) and poliovirus (33.7%) is lower than expected, given that the majority of the population has been exposed to or vaccinated against these viruses. This may be due to reduced sensitivity because of a gradual narrowing and decrease of the long-lived B cell response in the absence of persistent antigen. We also rarely detected antibody responses to small viruses, such as JC virus (JCV) and torque teno virus, which are frequently detected by using specific tests. We believe that the disparity is due to low titers of antibodies to unmodified linear epitopes from these viruses. For example, serum antibodies against the major capsid protein of JCV are reported to only recognize conformational epitopes (15). Last, the frequency of detecting varicella zoster virus (chicken pox) antibodies is also lower than expected (24.3%), even though the frequency of detecting other latent herpesviruses, such as EBV (87.1%) and CMV (48.5%), is similar to the prevalence reported in epidemiological studies (16–18). This may reflect differences in how frequently these viruses shed antigens that stimulate B cell responses or a more limited humoral response that relies on epitopes that cannot be detected in a 56-residue peptide. It might also be possible to increase the sensitivity of detection of these viral antibodies by stimulating memory B cells in vitro to probe the history of infection more deeply.

To assess differences in viral exposure between populations, we split the samples into different groups based on age, HIV status, and geography. We first compared results from children under the age of 10 to adults within the United States (HIV-positive individuals were excluded from this analysis) (Fig. 2A). Fewer children were positive for most viruses, including EBV, HSV1, HSV2, and influenza virus, which is consistent with our preliminary observations comparing the number of enriched peptides (Fig. 1B). In addition to the fact that children may generate lower antibody titers in general, these younger donors probably have not yet been exposed to certain viruses, for example, HSV2, which is sexually transmitted (19).

When comparing results from HIV-positive to HIV-negative samples, we found more of the HIV-positive samples to also be seropositive for additional viruses, including HSV2, CMV, and Kaposi's sarcoma–associated herpesvirus (KSHV) (false discovery rate q < 0.05, Fig. 2B). These results are consistent with prior studies indicating higher risk of these co-infections in HIV-positive patients (20–22). Patients with HIV may engage in activities that put them at higher risk for exposure to these viruses. Alternatively, these viruses may increase the risk of HIV infection. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response.

Last, we compared evidence of viral exposure among samples taken from adult HIV-negative donors residing in countries (United States, Peru, Thailand, and South Africa) from four different continents. In general, donors outside the United States had higher frequencies of seropositivity (Fig. 2, C to E). For example, CMV antibodies were found in significantly higher frequencies in samples from Peru, Thailand, and South Africa. Other viruses, such as KSHV and HSV1, were detected more frequently in donors from Peru and South Africa but not Thailand. The observed detection frequency of different adenovirus species varies across populations. Adenovirus C seropositivity was found at similar frequencies in all regions, but adenovirus D seropositivity was generally higher outside the United States, whereas adenovirus B seropositivity was higher in Peru and South Africa but not in Thailand. The higher rates of virus exposure outside the United States could be due to differences in population density, cultural practices, sanitation, or genetic susceptibility. Additionally, influenza B seropositivity was more common in the United States compared with other countries, especially Peru and Thailand. The global incidence of influenza B is much lower than influenza A, but the standard influenza vaccination contains both influenza A and B strains, so the elevated frequency of individuals with seroreactivity may be due to higher rates of influenza vaccination in the United States. Other viruses, such as rhinovirus and EBV, were detected at very similar frequencies in all the geographic regions.

Analysis of viral epitope determinants

After analyzing responses on the whole-virus level, we focused our attention on the specific peptides targeted by these antibodies. We detected antibodies to a total of 8425 peptides in at least two samples and 15,052 in at least one sample. Because of the presence of many related peptides in our library and the Immune Epitope Database (IEDB), for the following analysis we consider a peptide unique only if it does not contain a continuous seven-residue subsequence, the estimated size of a linear epitope, in common with another peptide. Analyzed as such, our VirScan database nearly doubles the 1559 unique human B cell epitopes from human viruses in the IEDB (23). The epitopes identified in our unbiased analysis demonstrate a significant overlap with those contained in the IEDB (P < 10^−30, Fisher's exact test, Fig. 1D). The amount of overlap is even greater for epitopes from viruses that commonly cause infection (Fig. 1E). We would likely have detected even more antigenic peptides in common with the IEDB if we had tested more samples from individuals infected with rare viruses. We next analyzed the amino acid composition of recurrently enriched peptides. Enriched peptides tend to have more proline and charged amino acids and fewer hydrophobic amino acids, which is consistent with a previous analysis of B cell epitopes in the IEDB (fig. S4) (24). This trend likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

Table 1. VirScan's sensitivity and specificity on samples with known viral infections. Sensitivity is the percentage of samples positive for the virus as determined by VirScan out of all n known positives. Specificity is the percentage of samples negative for the virus by VirScan out of all n known negatives.

Virus    Sensitivity, % (n)    Specificity, % (n)
HCV      92 (26)               97 (34)
HIV1     95 (61)               100 (33)
HSV1     97 (38)               100 (6)
HSV2     90 (20)               100 (24)

We found that although the false negative samples did not meet our stringent cutoff for enriching multiple unique peptides, they had detectable antibodies to a recurrent epitope. By modifying the criterion to allow for samples that enrich multiple homologous peptides that share a recurrent epitope, as described in the text, the sensitivity of detecting HCV increases to 100% and the sensitivity for detecting HIV increases to 97%. This modified criterion does not significantly affect specificity (fig. S13). The one false positive was from an individual whose HCV-negative status was self-reported but who had antibodies to as many HCV peptides as 23% of the true HCV-positive individuals and is likely to be HCV-positive now or in the past. It is possible that this individual was exposed to HCV but cleared the infection. If true, the observed specificity for HCV is 100%.

Table 2. Frequently detected viruses. The % column indicates the percentage of samples that were positive for the virus by VirScan. Known HIV- and HCV-positive samples were excluded when performing this analysis.

Virus species                        %
Human herpesvirus 4                  87.1
Rhinovirus B                         71.8
Human adenovirus C                   71.8
Rhinovirus A                         67.3
Human respiratory syncytial virus    65.7
Human herpesvirus 1                  54.4
Influenza A virus                    53.4
Human herpesvirus 6B                 52.8
Human herpesvirus 5                  48.5
Influenza B virus                    40.5
Poliovirus                           33.7
Human herpesvirus 3                  24.3
Human adenovirus F                   20.4
Human adenovirus B                   16.8
Human herpesvirus 2                  15.5
Enterovirus A                        15.2
Enterovirus B                        13.3

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide, comprising positions 141 to 196, that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens we observed interpopu-lation serological differences For example twooverlapping peptides from positions 309 to 364and 337 to 392 of the penton base protein fromadenovirus C frequently elicited antibody re-sponses (Fig 3B) However donors from theUnited States andSouthAfrica hadmuch strongerresponses to peptide 309-364 (P lt 10minus6 t test)relative to donors from Thailand and Peru Weobserved that for the EBNA1 protein from EBVdonors from all four countries frequently hadstrong responses to peptide 393-448 and occa-sionally to peptide 589-644 However donorsfrom Thailand and Peru had much stronger re-sponses to peptide 57-112 (Plt 10minus6 t test) (Fig 3C)These differences may reflect variation in thestrains endemic in each region In addition poly-morphism of major histocompatibility complex(MHC) class II alleles immunoglobulin genesand other modifiers that shape immune re-sponses in each population likely play a role in

defining the relative immunodominance of anti-genic peptidesTo determine whether the humoral responses

that target an immunodominant peptide are ac-tually targeting precisely the same epitope weconstructed single- double- and triple-alaninescanningmutagenesis libraries for eight common-ly recognized peptides These were introducedinto the same T7 bacteriophage display vectorand subjected to the same immunoprecipitationand sequencing protocol using samples from theUnited States Mutants that disrupt the epitopediminish antibody binding affinity and peptideenrichment We found that for all eight peptidestested there was a single largely contiguous sub-sequence in which mutations disrupted bindingfor the majority of samples As expected the tri-ple mutants abolished antibody binding to agreater extent and the enrichment patternsweresimilar among single double and triple mutantsof the same peptide (Fig 4 and figs S5 to S11)

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-5

Fig. 3. The human antivirome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the -log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate a stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.


For four of the eight peptides, a 9- to 15-amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation, and multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
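The modified criterion can be illustrated with a minimal Python sketch. It assumes that each sample is summarized as a set of enriched peptide identifiers; the function names, the data layout, and the simplified "any diagnostic peptide" rule are illustrative assumptions, not the implementation used in this work.

```python
from collections import Counter
from typing import List, Set


def find_diagnostic_peptides(known_positive_hits: List[Set[str]],
                             min_fraction: float = 0.30) -> Set[str]:
    """Peptides enriched in at least `min_fraction` of known-positive samples."""
    counts = Counter(pep for hits in known_positive_hits for pep in hits)
    n_samples = len(known_positive_hits)
    return {pep for pep, c in counts.items() if c >= min_fraction * n_samples}


def call_seropositive(sample_hits: Set[str], n_specific_hits: int,
                      threshold: int, diagnostic: Set[str]) -> bool:
    """Positive if the stringent hit-count cutoff is met, or (modified
    criterion, simplified here) if any diagnostic peptide is enriched."""
    if n_specific_hits >= threshold:
        return True
    return len(sample_hits & diagnostic) > 0


# Hypothetical peptide identifiers, for illustration only.
positives = [{"p1", "p2"}, {"p1"}, {"p3"}, {"p1", "p4"}]
diagnostic = find_diagnostic_peptides(positives)  # only p1 reaches 30%
print(call_seropositive({"p1"}, n_specific_hits=1, threshold=2,
                        diagnostic=diagnostic))
```

In this toy example, a sample with a single enriched peptide fails the two-hit cutoff but is still called positive because that peptide is recurrently targeted among known positives.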

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely


Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.


to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus-specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon: Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at -80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The UniProt clustering algorithm represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56-amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of the peptide sequence is unique, to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
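The tiling and scanning-mutagenesis designs described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used to build the library: the function names are hypothetical, codon optimization and restriction-site avoidance are omitted, and the handling of the final tile (anchored at the C terminus when the protein length is not a multiple of the step) is an assumption because the text does not specify it.

```python
from typing import List

ADAPTER_5 = "AGGAATTCCGCTGCGT"  # 5' adapter from the text (16 nt)
ADAPTER_3 = "CAGGGAAGAGCTCGAA"  # 3' adapter from the text (16 nt)


def tile_protein(seq: str, length: int = 56, step: int = 28) -> List[str]:
    """56-aa tiles with 28-aa overlap. Assumption: the last tile is anchored
    at the C terminus so that every residue is covered."""
    if len(seq) <= length:
        return [seq]
    starts = list(range(0, len(seq) - length + 1, step))
    if starts[-1] != len(seq) - length:
        starts.append(len(seq) - length)
    return [seq[s:s + length] for s in starts]


def alanine_scan(peptide: str, width: int) -> List[str]:
    """All mutants with `width` consecutive mutated positions
    (non-Ala -> Ala, Ala -> Gly), scanning from the N terminus."""
    def swap(aa: str) -> str:
        return "G" if aa == "A" else "A"

    mutants = []
    for i in range(len(peptide) - width + 1):
        block = "".join(swap(aa) for aa in peptide[i:i + width])
        mutants.append(peptide[:i] + block + peptide[i + width:])
    return mutants


# A 56-aa peptide encodes 168 nt; with both adapters the oligo is 200 nt.
oligo_length = len(ADAPTER_5) + 56 * 3 + len(ADAPTER_3)
print(oligo_length, len(tile_protein("M" * 100)), alanine_scan("KAG", 2))
```

For a 56-aa peptide, this scan yields 56 single, 55 double, and 54 triple mutants, which matches the blank final positions noted in the Fig. 4 caption.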


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 μg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 μl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 μl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 μl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50-base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

\[
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y = 0 \\
(1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y > 0
\end{cases}
\]
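To make the role of the null distribution concrete, the following Python sketch evaluates a zero-inflated generalized Poisson mass function and converts an observed count into a -log10(P value). Several points are assumptions rather than statements about the published analysis: the sketch uses the standard (Consul) generalized Poisson form, θ(θ + yλ)^(y-1) e^(-θ-yλ)/y!, with the count y as the argument; the P value is taken to be the upper-tail probability P(Y ≥ observed); and the function names and parameter values are hypothetical.

```python
import math


def generalized_poisson_pmf(y: int, theta: float, lam: float) -> float:
    """Standard generalized Poisson pmf (assumes theta > 0 and 0 <= lam < 1).
    Evaluated in log space for numerical stability."""
    if y < 0:
        return 0.0
    log_p = (math.log(theta) + (y - 1) * math.log(theta + y * lam)
             - theta - y * lam - math.lgamma(y + 1))
    return math.exp(log_p)


def zigp_pmf(y: int, pi: float, theta: float, lam: float) -> float:
    """Zero-inflated version: extra probability mass `pi` at y = 0."""
    base = (1.0 - pi) * generalized_poisson_pmf(y, theta, lam)
    return pi + base if y == 0 else base


def neg_log10_p(observed: int, pi: float, theta: float, lam: float) -> float:
    """-log10 of the upper-tail probability P(Y >= observed) under the null.
    Computed as 1 - CDF, so precision is limited for extremely small tails."""
    below = sum(zigp_pmf(k, pi, theta, lam) for k in range(observed))
    return -math.log10(max(1.0 - below, 1e-300))


# Hypothetical null parameters for clones at one input count.
print(neg_log10_p(observed=25, pi=0.1, theta=2.0, lam=0.3))
```

In the procedure described in this section, π, θ, and λ would come from the regressions on input count, so each clone is scored against a null distribution specific to its input abundance.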

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the -log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the -log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the -log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold -log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the -log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best-fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and would be ignored for that virus. Otherwise, the hit is considered to be specific and the score for that virus is incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then, we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
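The greedy cross-reactivity filter described earlier in this section can be sketched as follows for a single sample. The function names and the toy peptide sequences are hypothetical. One detail is an assumption: the text says a hit is compared with "any hit previously observed," and the sketch takes this to include hits that were themselves ignored as cross-reactive.

```python
from typing import Dict, List


def shares_kmer(a: str, b: str, k: int = 7) -> bool:
    """True if peptides `a` and `b` share any subsequence of length `k`."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[j:j + k] in kmers for j in range(len(b) - k + 1))


def score_viruses(hits_by_virus: Dict[str, List[str]], k: int = 7) -> Dict[str, int]:
    """Count, per virus, the hits that share no k-mer with any earlier hit.
    Viruses are visited in descending order of total hits."""
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]), reverse=True)
    seen: List[str] = []
    scores: Dict[str, int] = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            if not any(shares_kmer(peptide, prev, k) for prev in seen):
                score += 1  # specific hit
            seen.append(peptide)  # assumption: ignored hits are remembered too
        scores[virus] = score
    return scores


def call_exposures(scores: Dict[str, int], thresholds: Dict[str, int]) -> Dict[str, bool]:
    """Positive if the specific-hit score reaches the per-virus threshold."""
    return {virus: scores[virus] >= thresholds[virus] for virus in scores}


# Toy example: virusB's only hit shares CDEFGHI with a virusA hit.
hits = {
    "virusA": ["AAAAAAAKLMNPQ", "CDEFGHIKLMNPQRS"],
    "virusB": ["XXCDEFGHIKYY"],
}
scores = score_viruses(hits)
print(scores, call_exposures(scores, {"virusA": 2, "virusB": 2}))
```

Visiting viruses with the most hits first means that a shared epitope is credited to the virus with the strongest overall evidence, and the weaker virus must reach its threshold with hits that are unique to it.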

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (22). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB


epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
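The grouping step used in this analysis (sequences that are identical or share a 7-aa subsequence fall into one group) can be sketched with a small union-find structure. This is an illustration only: the function name is hypothetical, and treating the grouping as transitive, so that A and C are merged when each shares a 7-mer with B, is an assumption.

```python
from typing import Dict, List


def group_by_shared_kmer(sequences: List[str], k: int = 7) -> List[List[str]]:
    """Group sequences that are identical or share a k-aa subsequence.
    Assumption: grouping is transitive (connected components)."""
    unique = sorted(set(sequences))  # identical sequences collapse here
    parent = list(range(len(unique)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_owner: Dict[str, int] = {}
    for idx, seq in enumerate(unique):
        for j in range(len(seq) - k + 1):
            kmer = seq[j:j + k]
            if kmer in first_owner:
                parent[find(idx)] = find(first_owner[kmer])  # merge groups
            else:
                first_owner[kmer] = idx

    groups: Dict[int, List[str]] = {}
    for idx, seq in enumerate(unique):
        groups.setdefault(find(idx), []).append(seq)
    return sorted(groups.values())


# Toy sequences: the first two share DEFGHIJ; "QRS" is too short to share a 7-mer.
toy = ["ABCDEFGHIJ", "DEFGHIJKLM", "ABCDEFGHIJ", "KLMNOPQRST", "QRS"]
print(len(group_by_shared_kmer(toy)))
```

Indexing each 7-mer once keeps the grouping close to linear in the total sequence length, which avoids comparing every pair of sequences directly.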

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of enrichment of the single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
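The enrichment arithmetic above can be written out as a short Python sketch. The function names and read counts are hypothetical, and one detail is an assumption: the wild-type value and the middle two quartiles of single-mutant values are pooled into a single unweighted mean, because the text does not say how the average is weighted.

```python
from statistics import mean
from typing import Dict, List


def enrichment(output_counts: Dict[str, int],
               input_counts: Dict[str, int]) -> Dict[str, float]:
    """Fractional abundance after immunoprecipitation divided by the
    fractional abundance before it, for each peptide seen in the input."""
    out_total = sum(output_counts.values())
    in_total = sum(input_counts.values())
    return {
        pep: (output_counts.get(pep, 0) / out_total) / (n_in / in_total)
        for pep, n_in in input_counts.items() if n_in > 0
    }


def wildtype_estimate(wt_enrichment: float,
                      single_mutant_enrichments: List[float]) -> float:
    """Pool the wild-type value with the middle two quartiles of the
    single-mutant values (assumption: simple unweighted mean)."""
    vals = sorted(single_mutant_enrichments)
    n = len(vals)
    middle = vals[n // 4: n - n // 4]
    return mean([wt_enrichment] + middle)


# Hypothetical read counts for a wild-type peptide and one mutant.
enr = enrichment({"wt": 300, "m1": 100}, {"wt": 100, "m1": 100})
relative = enr["m1"] / enr["wt"]  # mutant enrichment relative to wild type
print(enr, relative)
```

Using the middle quartiles discards mutants that disrupt binding as well as unusually high outliers, so the remaining single mutants act as additional measurements of the wild-type level.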

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1 K M Wylie G M Weinstock G A Storch Emerging view ofthe human virome Transl Res 160 283ndash290 (2012)doi 101016jtrsl201203006 pmid 22683423

2 B A Duerkop L V Hooper Resident viruses and theirinteractions with the immune system Nat Immunol 14654ndash659 (2013) doi 101038ni2614 pmid 23778792

3 E S Barton et al Herpesvirus latency confers symbioticprotection from bacterial infection Nature 447 326ndash329(2007) doi 101038nature05762 pmid 17507983

4 E F Foxman A Iwasaki Genome-virome interactionsExamining the role of common viral infections in complexdisease Nat Rev Microbiol 9 254ndash264 (2011) doi 101038nrmicro2541 pmid 21407242

5 M Lecuit M Eloit The human virome New tools andconcepts Trends Microbiol 21 510ndash515 (2013) doi 101016jtim201307001 pmid 23906500

6 I De Vlaminck et al Temporal response of the human virome toimmunosuppression and antiviral therapy Cell 155 1178ndash1187(2013) doi 101016jcell201310034 pmid 24267896

7 E Hammarlund et al Duration of antiviral immunity aftersmallpox vaccination Nat Med 9 1131ndash1137 (2003)doi 101038nm917 pmid 12925846

8 H B Larman et al Autoantigen discovery with a synthetichuman peptidome Nat Biotechnol 29 535ndash541 (2011)doi 101038nbt1856 pmid 21602805

9 UniProt Consortium Activities at the Universal ProteinResource (UniProt) Nucleic Acids Res 42 D191ndashD198 (2014)doi 101093nargkt1140 pmid 24253303

10 H B Larman et al PhIP-Seq characterization ofautoantibodies from patients with multiple sclerosis type

1 diabetes and rheumatoid arthritis J Autoimmun 43 1ndash9(2013) doi 101016jjaut201301013 pmid 23497938

11 C Bialecki H M Feder Jr J M Grant-Kels The six classicchildhood exanthems A review and update J Am AcadDermatol 21 891ndash903 (1989) doi 101016S0190-9622(89)70275-9 pmid 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-9


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman, and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015 American Association for the Advancement of Science


likely reflects enrichment for amino acids that are surface-exposed or can form stronger interactions with antibodies.

B cell responses target highly similar viral epitopes across individuals

We compared the profile of peptides recognized by the antibody response in different individuals. We found that for a given protein, each sample generally only had strong responses against one to three immunodominant peptides (Fig. 3). Unexpectedly, we found that the vast majority of seropositive samples for a given virus recognized the same immunodominant peptides, suggesting that the antiviral B cell response is highly stereotyped across individuals. For example, in glycoprotein G from respiratory syncytial virus, there is only a single immunodominant peptide comprising positions 141 to 196 that is targeted by all samples with detectable antibodies to the protein, regardless of the country of origin (Fig. 3A).

For other antigens, we observed interpopulation serological differences. For example, two overlapping peptides from positions 309 to 364 and 337 to 392 of the penton base protein from adenovirus C frequently elicited antibody responses (Fig. 3B). However, donors from the United States and South Africa had much stronger responses to peptide 309-364 (P < 10^-6, t test) relative to donors from Thailand and Peru. We observed that for the EBNA1 protein from EBV, donors from all four countries frequently had strong responses to peptide 393-448 and occasionally to peptide 589-644. However, donors from Thailand and Peru had much stronger responses to peptide 57-112 (P < 10^-6, t test) (Fig. 3C). These differences may reflect variation in the strains endemic in each region. In addition, polymorphism of major histocompatibility complex (MHC) class II alleles, immunoglobulin genes, and other modifiers that shape immune responses in each population likely play a role in defining the relative immunodominance of antigenic peptides.

To determine whether the humoral responses that target an immunodominant peptide are actually targeting precisely the same epitope, we constructed single-, double-, and triple-alanine scanning mutagenesis libraries for eight commonly recognized peptides. These were introduced into the same T7 bacteriophage display vector and subjected to the same immunoprecipitation and sequencing protocol using samples from the United States. Mutants that disrupt the epitope diminish antibody binding affinity and peptide enrichment. We found that for all eight peptides tested, there was a single, largely contiguous subsequence in which mutations disrupted binding for the majority of samples. As expected, the triple mutants abolished antibody binding to a greater extent, and the enrichment patterns were similar among single, double, and triple mutants of the same peptide (Fig. 4 and figs. S5 to S11).


Fig. 3. The human anti-virome response recognizes a similar spectrum of peptides among infected individuals. In the heat-map charts, each row is a peptide tiling across the indicated protein, and each column is a sample. The colored bar above each column, labeled at the top of the panels, indicates the country of origin for that sample. The samples shown are a subset of individuals with antibodies to at least one peptide from the protein. The color intensity of each cell corresponds to the -log10(P value) measure of significance of enrichment for a peptide in a sample (greater values indicate stronger antibody response). Data are shown for (A) human RSV attachment glycoprotein G (G), (B) human adenovirus C penton protein (L2), and (C) EBV nuclear antigen 1 (EBNA1). Data shown are the mean of two replicates.


For four of the eight peptides, a 9- to 15-amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation, and multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.
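The modified calling rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names, the representation of each sample as a set of peptide identifiers, and the peptide labels in the usage note are all assumptions made for the example. Only the 30% cutoff and the two-peptide stringent rule come from the text.

```python
from collections import Counter
from typing import Iterable, List, Set


def find_diagnostic_peptides(known_positive_hits: List[Set[str]],
                             min_fraction: float = 0.30) -> Set[str]:
    """Return peptides enriched in at least `min_fraction` of known positives.

    `known_positive_hits` holds one set of enriched peptide IDs per
    known-positive sample (a hypothetical input format).
    """
    counts = Counter(p for hits in known_positive_hits for p in hits)
    n_samples = len(known_positive_hits)
    return {p for p, c in counts.items() if c / n_samples >= min_fraction}


def is_seropositive(n_nonoverlapping_hits: int,
                    enriched_peptides: Iterable[str],
                    diagnostic_peptides: Set[str],
                    min_hits: int = 2) -> bool:
    """Stringent rule (>= min_hits non-overlapping peptides) OR any diagnostic hit."""
    if n_nonoverlapping_hits >= min_hits:
        return True
    return any(p in diagnostic_peptides for p in enriched_peptides)
```

For instance, with ten known positives in which a hypothetical peptide "pepA" is enriched in five samples and "pepB" in two, only "pepA" qualifies as diagnostic, and a new sample with a single hit on "pepA" would be called positive while a single hit on "pepB" would not.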

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild-type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection, because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus–specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon: Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at –80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The clustering algorithm UniProt represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56–amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of peptide sequence is unique to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
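The peptide-level design steps described above (56-aa tiles with 28-aa overlap, consecutive alanine-scanning mutants, and the 200-nt oligonucleotide length) can be illustrated with a short Python sketch. The function names are hypothetical, and the handling of the final tile (anchoring it to the C terminus when the protein length is not a multiple of the 28-aa step) is an assumption, since the text does not say how protein ends were treated. Codon optimization and restriction-site avoidance are not modeled.

```python
ADAPTER_5 = "AGGAATTCCGCTGCGT"   # 5' adapter from the text (16 nt)
ADAPTER_3 = "CAGGGAAGAGCTCGAA"   # 3' adapter from the text (16 nt)


def tile_protein(seq, length=56, overlap=28):
    """Return (1-based start, peptide) tiles of `length` aa with `overlap` aa shared.

    Assumption: the last tile is anchored to the C terminus so that every
    residue is covered; proteins no longer than `length` yield a single tile.
    """
    if len(seq) <= length:
        return [(1, seq)]
    step = length - overlap
    tiles = []
    start = 0
    while start + length < len(seq):
        tiles.append((start + 1, seq[start:start + length]))
        start += step
    tiles.append((len(seq) - length + 1, seq[-length:]))
    return tiles


def consecutive_alanine_mutants(peptide, run):
    """All mutants with `run` adjacent positions substituted.

    Non-alanine residues become alanine; alanine becomes glycine.
    """
    mutants = []
    for i in range(len(peptide) - run + 1):
        swapped = "".join("G" if aa == "A" else "A" for aa in peptide[i:i + run])
        mutants.append(peptide[:i] + swapped + peptide[i + run:])
    return mutants


def oligo_length(n_residues=56):
    """Oligonucleotide length: three nt per codon plus both adapters."""
    return 3 * n_residues + len(ADAPTER_5) + len(ADAPTER_3)
```

With these definitions, a 56-aa peptide gives 56 single, 55 double, and 54 triple mutants, which matches the blank trailing positions noted for the double and triple mutants in Fig. 4, and `oligo_length(56)` returns 200, in agreement with the stated oligonucleotide length.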


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 μg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 μl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 μl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 μl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50–base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).
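As a quick arithmetic check of the figures quoted above, the following Python sketch recomputes the per-sample read depth for a fully loaded lane and the fold representation of the phage library in each well. The function names are illustrative only; all input numbers are taken from the text.

```python
def reads_per_sample(reads_per_lane, samples_per_lane=192):
    """Average reads per sample when barcoded samples share one lane equally."""
    return reads_per_lane // samples_per_lane


def fold_representation(plaque_forming_units, library_clones):
    """Average number of phage particles per library clone in one reaction."""
    return plaque_forming_units // library_clones


print(reads_per_sample(100_000_000))           # 520833
print(reads_per_sample(200_000_000))           # 1041666
print(fold_representation(2 * 10**10, 10**5))  # 200000
```

The results, roughly 0.5 to 1 million reads per sample and a 2 × 10^5-fold representation, agree with the ranges stated in the protocol.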

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

$$
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y = 0 \\
(1 - \pi)\left[\theta(\theta + \lambda)^{x-1} e^{-\theta - x\lambda}\right], & y > 0
\end{cases}
$$
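To make the role of this null distribution concrete, here is a minimal Python sketch of a zero-inflated generalized Poisson model and the resulting -log10(P value) for an observed count. It rests on stated assumptions: it uses the standard textbook (Consul) form of the generalized Poisson mass function, θ(θ + yλ)^(y−1) e^(−θ−yλ) / y!, which includes a y! denominator and may differ in detail from the expression as printed above; it takes the P value to be the upper-tail probability P(Y ≥ observed); and the function names are hypothetical. The parameters `pi`, `theta`, and `lam` stand for the zero-inflation probability and the two generalized Poisson parameters.

```python
import math


def gp_log_pmf(y, theta, lam):
    """Log mass of the generalized Poisson distribution (theta > 0, 0 <= lam < 1)."""
    return (math.log(theta) + (y - 1) * math.log(theta + y * lam)
            - theta - y * lam - math.lgamma(y + 1))


def zigp_pmf(y, pi, theta, lam):
    """Zero-inflated generalized Poisson mass: extra probability `pi` at y = 0."""
    base = (1.0 - pi) * math.exp(gp_log_pmf(y, theta, lam))
    return pi + base if y == 0 else base


def neg_log10_p(observed, pi, theta, lam, max_terms=100_000):
    """-log10 of the upper-tail probability P(Y >= observed) under the null."""
    if observed <= 0:
        return 0.0  # the tail probability is exactly 1
    tail = 0.0
    for y in range(observed, observed + max_terms):
        term = zigp_pmf(y, pi, theta, lam)
        tail += term
        # Stop once further terms can no longer change the sum.
        if y > observed and term <= tail * 1e-17:
            break
    tail = min(max(tail, 1e-300), 1.0)  # guard against underflow to zero
    return -math.log10(tail)
```

Summing the tail upward from the observed count, rather than subtracting the lower cumulative sum from 1, keeps precision for strongly enriched clones whose P values are far below machine epsilon.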

We used maximum likelihood estimation toregress the parameters p q and l to fit the dis-tribution of counts after immunoprecipitationfor all clones present at a particular frequencycount in the input We repeated this procedurefor all of the observed input counts and foundthat q and l are well fit by linear regression and pby an exponential regression as a function of in-put count (fig S1) Last for each clonewe used itsinput count and the regression results to deter-mine the null distribution based on the zero-inflated generalized poisson model which weused to calculate the ndashlog10(P value) of obtain-ing the observed countTo call hits we determined the threshold for

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the –log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the –log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold –log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the –log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and was ignored for that virus. Otherwise, the hit was considered to be specific and the score for that virus was incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
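The cross-reactivity filter described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `score_viruses`, the input layout (a dictionary mapping each virus to its list of enriched peptide sequences), and the choice to remember the 7-mers of every observed hit, including hits that were themselves ignored, are assumptions made for illustration.

```python
def kmers(seq, k=7):
    """Return the set of all length-k subsequences of a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Greedy per-sample scoring of viruses by non-cross-reactive hits.

    hits_by_virus: dict mapping virus name -> list of enriched peptide
    sequences for one sample (hypothetical input layout).
    """
    # Viruses with the most hits are processed first.
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]),
                   reverse=True)
    seen = set()    # k-mers of every hit observed so far in this sample
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if not (peptide_kmers & seen):
                score += 1          # specific hit: no shared 7-aa stretch
            # Assumption: ignored hits also count as "previously observed".
            seen |= peptide_kmers
        scores[virus] = score
    return scores


sample_hits = {
    "virus_A": ["MKTLLVAGGSW", "LVAGGSWQRNP", "CDEFHIKLMNP"],
    "virus_B": ["TTCDEFHIKLTT", "YYYYWWWWYYYY"],
}
scores = score_viruses(sample_hits)
print(scores)
```

Each score would then be compared with the size-scaled threshold for that virus. Because overlapping tiles from the same protein share a 28-aa stretch, adjacent tiles that are both enriched contribute only one point under this scheme.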

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (22). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
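One way to form such nonredundant groups is single-linkage clustering on shared 7-mers, sketched below with a small union-find structure. The text does not say whether grouping was transitive, so treating it as transitive is an assumption, as are the function name `group_epitopes` and the toy sequences.

```python
def group_epitopes(epitopes, k=7):
    """Group sequences that are identical or share a length-k subsequence.

    Grouping is transitive (single linkage): if A shares a k-mer with B
    and B with C, all three land in one group. This is an assumption.
    """
    parent = list(range(len(epitopes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    first_owner = {}   # full sequence or k-mer -> index of first epitope seen
    for idx, seq in enumerate(epitopes):
        keys = {seq} | {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for key in keys:
            if key in first_owner:
                root_a, root_b = find(first_owner[key]), find(idx)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_owner[key] = idx

    groups = {}
    for idx, seq in enumerate(epitopes):
        groups.setdefault(find(idx), []).append(seq)
    return list(groups.values())


toy = ["ACDEFGHIK", "DEFGHIKLM", "ACDEFGHIK", "WWWWW", "FGHIKLMNP"]
groups = group_epitopes(toy)
print(len(groups), sorted(len(g) for g in groups))
```

Epitopes shorter than seven residues, such as `"WWWWW"` here, can only join a group through exact identity.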

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided the enrichment of the mutated peptide by the enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
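The arithmetic above can be sketched in a few lines of Python. The function names and read counts are hypothetical, and two details are assumptions: the "middle two quartiles" are taken by rank (dropping the lowest and highest quarter of single-mutant values), and the wild-type value is pooled with them in a plain mean.

```python
from statistics import mean


def enrichment(ip_reads, ip_total, input_reads, input_total):
    """Fractional abundance after IP divided by fractional abundance before."""
    return (ip_reads / ip_total) / (input_reads / input_total)


def wild_type_estimate(wt_enrichment, single_mutant_enrichments):
    """Average the wild-type value with the middle half of single mutants."""
    ranked = sorted(single_mutant_enrichments)
    quarter = len(ranked) // 4
    middle = ranked[quarter:len(ranked) - quarter]   # assumption: rank-based
    return mean([wt_enrichment] + middle)


def relative_enrichment(mutant_enrichment, wt_estimate):
    """Enrichment of a mutant relative to the wild-type estimate."""
    return mutant_enrichment / wt_estimate


wt = enrichment(ip_reads=200, ip_total=1_000_000,
                input_reads=10, input_total=1_000_000)
singles = [1.0, 18.0, 19.0, 21.0, 22.0, 20.0, 2.0, 40.0]
wt_est = wild_type_estimate(wt, singles)
rel = relative_enrichment(1.96, wt_est)
print(wt, wt_est, rel)
```

A relative enrichment near 1 means the mutation left binding intact, and a value near 0 marks a position that is critical for antibody recognition.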

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-9


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698

ARTICLE TOOLS: http://science.sciencemag.org/content/348/6239/aaa0698

SUPPLEMENTARY MATERIALS: http://science.sciencemag.org/content/suppl/2015/06/03/348.6239.aaa0698.DC1

RELATED CONTENT: http://stm.sciencemag.org/content/scitransmed/5/203/203ra126.full
http://stm.sciencemag.org/content/scitransmed/6/242/242ra83.full

REFERENCES: This article cites 37 articles, 3 of which you can access for free.
http://science.sciencemag.org/content/348/6239/aaa0698#BIBL

PERMISSIONS: http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service.

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015, American Association for the Advancement of Science


For four of the eight peptides, a 9– to 15–amino acid region was critical for antibody recognition in >90% of samples (Fig. 4 and figs. S5 to S7). One other peptide had a region of similar size that was critical in about half of the samples (fig. S8). In another peptide, a single region was important for antibody recognition in the majority of the samples, but the extents of the critical region varied slightly for different samples, and occasionally there were donors that recognized a completely separate epitope (fig. S9). The remaining two peptides contained a single triple mutant that abolished binding in the majority of samples, but the critical region also extended further to different extents depending on the sample (figs. S10 and S11). Unexpectedly, in one of these peptides, in addition to the main region surrounding positions 13 and 14 that is critical for binding, a single Gly36→Ala36 (G36A) mutation disrupted binding in almost half of the samples, whereas none of the double- or triple-alanine mutants that also included the adjacent positions [Leu35 (L35) and G37] affected binding (fig. S11). It is possible that G36 plays a role in helping the peptide adopt an antigenic conformation, and multiple mutants containing the adjacent Leu or Gly residues rescue this ability. We occasionally saw other examples of mutations that resulted in patterns of disrupted binding with no simple explanation, illustrating the complexity of antibody-antigen interaction.

The discovery of recurring targeted epitopes led us to ask whether we could apply this knowledge to improve the sensitivity of viral detection with VirScan. We hypothesized that samples showing a strong response to a recurrently targeted "diagnostic" peptide, which we defined as a peptide enriched in at least 30% of known positive samples, are likely to be seropositive even if they do not meet our stringent cutoff requiring at least two non-overlapping enriched peptides. We tested how this modified criterion affected our sensitivity and specificity in detecting HIV and HCV and found that it reduced the number of false negatives without affecting the specificity of the assay (fig. S13). We next turned our attention to respiratory syncytial virus (RSV), a virus for which our detected seroprevalence was lower than reported epidemiological rates, suggesting imperfect sensitivity of our assay. We tested sera from 60 individuals for antibodies to RSV by ELISA and found that 95% were positive, above the reported sensitivity of the assay and consistent with near-universal exposure to this pathogen. Applying the modified criterion to these samples increased our rate of detection by VirScan from 63% to 97% (table S2). These data suggest that assigning more weight to recurrently targeted epitopes can enhance the sensitivity of VirScan and that the performance of the assay can be improved by screening known positives for a particular virus.

Discussion

We have developed VirScan, a technology for identifying viral exposure and B cell epitopes across the entire known human virome in a single, multiplex reaction using less than a drop of blood. VirScan uses DNA microarray synthesis and bacteriophage display to create a uniform, synthetic representation of peptide epitopes comprising the human virome. Immunoprecipitation and high-throughput DNA sequencing reveal the peptides recognized by antibodies in the sample. VirScan is easily automated in 96-well format to enable high-throughput sample processing. Barcoding of samples during PCR enables pooled analysis that can dramatically reduce the per-sample cost. The VirScan approach has several advantages for studying the effect of viruses on the host immune system. By detecting antibody responses, it can identify infectious agents that have been cleared after an effective host response. Current serological methods of antiviral antibody detection typically use the selection of a single optimized antigen in order to achieve high accuracy. In contrast, VirScan's unique approach does not require such optimization in order to obtain similar performance. VirScan achieves sensitive detection by assaying each virus's complete proteome to detect any antibodies directed to epitopes that can be captured in a 56-residue fragment, and specificity by computationally eliminating cross-reactive antibodies. This unbiased approach identifies exposure to less well-studied viruses for which optimal serological antigens are not known and can be rapidly extended to include new viruses as they are discovered (25).

Although sensitive and selective, VirScan has a few limitations. First, it cannot detect epitopes that require post-translational modifications. Second, it cannot detect epitopes that involve discontinuous sequences on protein fragments greater than 56 residues. In principle, the latter can be overcome by using alternative technologies that allow for the display of full-length proteins, such as parallel analysis of translated open reading frames (PLATO) (26). Third, VirScan is likely to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

Fig. 4. Recognition of common epitopes within an antigenic peptide from human adenovirus C penton protein (L2) across individuals. Each row is a sample. Each column denotes the first mutated position for the (A) single-, (B) double-, and (C) triple-alanine mutant peptide, starting with the N terminus on the left. Each double- and triple-alanine mutant contains two or three adjacent mutations, respectively, extending toward the C terminus from the colored cell. The color intensity of each cell indicates the enrichment of the mutant peptide relative to the wild-type. For double mutants, the last position is blank. The same is true for the last two positions for triple mutants. Data shown are the mean of two replicates. Single-letter amino acid abbreviations are as follows: F, Phe; H, His; I, Ile; K, Lys; N, Asn; P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; and Y, Tyr.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody–viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection, because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus–specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon: Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at –80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. The UniProt clustering algorithm represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then, we created 56–amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of the peptide sequence is unique, to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
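The two library designs above, tiling and scanning mutagenesis, can be sketched at the peptide level as follows. This is an illustration only: the function names are hypothetical, codon optimization and restriction-site avoidance are omitted, and the handling of a protein's C-terminal remainder (one extra full-length tile anchored at the C terminus) is an assumption that the text does not specify.

```python
def tile_protein(seq, length=56, overlap=28):
    """Tile a protein into fixed-length peptides with a fixed overlap."""
    step = length - overlap
    if len(seq) <= length:
        return [seq]
    tiles = [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]
    if (len(seq) - length) % step:
        # Assumption: cover the C-terminal remainder with one last full tile.
        tiles.append(seq[-length:])
    return tiles


def scanning_mutants(peptide, width):
    """All mutants with `width` consecutive positions mutated.

    Non-alanine residues become alanine; alanine becomes glycine.
    """
    mutants = []
    for start in range(len(peptide) - width + 1):
        block = "".join("G" if res == "A" else "A"
                        for res in peptide[start:start + width])
        mutants.append(peptide[:start] + block + peptide[start + width:])
    return mutants


protein = "MKTAYIAKQR" * 10                 # toy 100-aa sequence
tiles = tile_protein(protein)
singles = scanning_mutants("MAKT", 1)
doubles = scanning_mutants("MAKT", 2)
triples = scanning_mutants("MAKT", 3)
print(len(tiles), singles, doubles, triples)
```

For a 56-aa peptide this scheme yields 56 single, 55 double, and 54 triple mutants, which matches the blank final columns described for the double and triple mutants in Fig. 4.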


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 µg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 µl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 µl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 µl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 µl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50–base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).

Informatics and statistical analysis

We performed the initial informatics and statistical analysis by using a slightly modified version of the previously published technique (8, 10). We first mapped the sequencing reads to the original library sequences by using Bowtie and counted the frequency of each clone in the "input" and each sample "output" (37). Because the majority of clones are not enriched, we used the observed distribution of output counts as a null distribution. We found that a zero-inflated generalized Poisson distribution fits our output counts well. We used this null distribution to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

$$
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\,\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}, & y = 0 \\[6pt]
(1 - \pi)\,\dfrac{\theta(\theta + y\lambda)^{y-1} e^{-\theta - y\lambda}}{y!}, & y > 0
\end{cases}
$$

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the −log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the −log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the −log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold −log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the −log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and was ignored for that virus. Otherwise, the hit was considered to be specific and the score for that virus was incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
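The greedy cross-reactivity filter described in this section can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`kmers`, `score_viruses`, `call_exposures`) and the dictionary layout are hypothetical, and the sketch assumes that every hit already seen in the sample, whether it was counted or ignored, contributes its 7-aa subsequences to the set used for later comparisons.

```python
def kmers(peptide, k=7):
    """Return the set of all length-k subsequences of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Count specific hits per virus for one sample.

    hits_by_virus maps a virus name to the list of enriched peptide
    sequences (hits) assigned to that virus.
    """
    # Visit viruses in descending order of total hit count.
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]), reverse=True)
    seen = set()   # k-mers of every hit visited so far in this sample
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            # A hit sharing any k-mer with an earlier hit is treated as
            # cross-reactive and does not add to the score.
            if not (peptide_kmers & seen):
                score += 1
            # Assumption: ignored hits are also remembered.
            seen |= peptide_kmers
        scores[virus] = score
    return scores


def call_exposures(scores, thresholds):
    """Compare each virus score with its (virus-specific) threshold."""
    return {virus: scores[virus] >= thresholds[virus] for virus in scores}


# Toy example: virusB's only hit shares "CDEFGHI" with a virusA hit.
hits = {
    "virusA": ["ACDEFGHIKL", "MNPQRSTVWY"],
    "virusB": ["WWCDEFGHIWW"],
}
scores = score_viruses(hits)
calls = call_exposures(scores, {"virusA": 2, "virusB": 2})
```

Because the viruses with the most hits are visited first, a shared epitope is credited to the virus with the stronger overall signal, which is the behavior the descending sort in the text is meant to produce.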

IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (22). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB-epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
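One way to read the redundancy grouping above is as a connected-components problem: two epitopes are linked if they are identical or share a 7-aa subsequence, and each connected set forms one group. The sketch below takes that reading; it is an illustration only. Whether the original grouping was transitive in this way is an assumption, and `group_epitopes` is a hypothetical name.

```python
def group_epitopes(epitopes, k=7):
    """Group epitopes that are identical or share a length-k subsequence.

    Uses a small union-find structure; returns a list of groups, each a
    list of epitope sequences.
    """
    parent = list(range(len(epitopes)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_seen = {}  # key (full sequence or k-mer) -> index of first epitope with it
    for idx, epitope in enumerate(epitopes):
        keys = {epitope} | {epitope[i:i + k] for i in range(len(epitope) - k + 1)}
        for key in keys:
            if key in first_seen:
                union(first_seen[key], idx)
            else:
                first_seen[key] = idx

    groups = {}
    for idx, epitope in enumerate(epitopes):
        groups.setdefault(find(idx), []).append(epitope)
    return list(groups.values())


# Toy example: the first two share "DEFGHIK", the next two are identical,
# and the last is too short to share a 7-mer with anything.
example = ["ACDEFGHIK", "DEFGHIKLM", "QRSTVWY", "QRSTVWY", "MNP"]
example_groups = group_epitopes(example)
```

Epitopes shorter than seven residues can only be grouped by exact identity under this scheme, since they contain no 7-aa subsequence.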

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged enrichment of the wild-type peptide with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
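The enrichment arithmetic above can be sketched in a few lines of Python. The function names and the toy read counts are hypothetical, and the sketch deliberately omits the final refinement step (averaging the wild-type value with the middle two quartiles of the single mutants), because the text does not specify it in enough detail to reproduce.

```python
def fractional_abundance(counts):
    """Reads for each peptide divided by total reads in the sample."""
    total = sum(counts.values())
    return {peptide: n / total for peptide, n in counts.items()}


def enrichment(input_counts, output_counts):
    """Fractional abundance after IP divided by fractional abundance before IP."""
    before = fractional_abundance(input_counts)
    after = fractional_abundance(output_counts)
    # Peptides absent from the input cannot be normalized and are skipped.
    return {p: after.get(p, 0.0) / before[p] for p in before if before[p] > 0}


def relative_enrichment(enrich, wild_type, mutants):
    """Enrichment of each mutant divided by enrichment of the wild-type peptide."""
    return {m: enrich[m] / enrich[wild_type] for m in mutants}


# Toy read counts (hypothetical): one wild-type peptide and two mutants.
input_counts = {"wt": 100, "m1": 100, "m2": 200}
output_counts = {"wt": 300, "m1": 30, "m2": 70}

enrich = enrichment(input_counts, output_counts)
rel = relative_enrichment(enrich, "wt", ["m1", "m2"])
```

In this toy data the wild-type peptide is enriched threefold, while mutant `m1` retains about a tenth of that enrichment, the kind of drop that would mark a mutated position as critical for antibody binding.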

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi:10.1016/j.trsl.2012.03.006; pmid:22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi:10.1038/ni.2614; pmid:23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi:10.1038/nature05762; pmid:17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi:10.1038/nrmicro2541; pmid:21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi:10.1016/j.tim.2013.07.001; pmid:23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi:10.1016/j.cell.2013.10.034; pmid:24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi:10.1038/nm917; pmid:12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi:10.1038/nbt.1856; pmid:21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi:10.1093/nar/gkt1140; pmid:24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi:10.1016/j.jaut.2013.01.013; pmid:23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi:10.1016/S0190-9622(89)70275-9; pmid:2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi:10.1016/S0168-8278(97)80108-0; pmid:9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi:10.1371/journal.pmed.0050017; pmid:18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi:10.1146/annurev.immunol.23.021704.115723; pmid:15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi:10.1016/S0166-0934(98)00180-3; pmid:10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi:10.1086/508173; pmid:17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid:21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi:10.1056/NEJM200008173430707; pmid:10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi:10.1371/journal.pone.0095612; pmid:24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi:10.1097/OLQ.0b013e318239d7fd; pmid:22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid:12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi:10.1086/518791; pmid:17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi:10.1093/nar/gkp1004; pmid:19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi:10.1371/journal.pone.0062216; pmid:23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi:10.1016/j.coviro.2011.12.004; pmid:22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi:10.1038/nbt.2539; pmid:23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid:7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi:10.1002/rmv.655; pmid:20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi:10.1086/511044; pmid:17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi:10.1038/sj.bjc.6690782; pmid:10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi:10.1086/343739; pmid:12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi:10.4049/jimmunol.172.3.1501; pmid:14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi:10.4049/jimmunol.174.9.5358; pmid:15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi:10.1073/pnas.1010586107; pmid:20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi:10.3389/fimmu.2013.00263; pmid:24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi:10.1016/j.chom.2013.05.008; pmid:23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi:10.1186/gb-2009-10-3-r25; pmid:19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698. DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015 American Association for the Advancement of Science


to be less specific compared with certain nucleic acid tests that discern highly related virus strains. However, VirScan demonstrates excellent serological discrimination among similar virus species, such as HSV1 and HSV2, and can even distinguish the genotype of HCV 69% of the time. We envision that VirScan will become an important tool for first-pass, unbiased serologic screening applications. Individual viruses or viral proteins uncovered in this way can subsequently be analyzed in further detail by using more focused assays, as we have demonstrated for a panel of immunodominant epitopes.

We have demonstrated that VirScan is a sensitive and specific assay for detecting exposure to viruses across the human virome. Because it can be performed in high throughput and requires minimal sample and cost, VirScan enables rapid and cost-effective screening of large numbers of samples to identify population-level differences in virus exposure across the human virome. In this work, we analyzed over 106 million antibody-viral peptide interactions in a comprehensive study of pan-virus serology in a large, diverse population. In doing so, we detected 84 different viral species in two or more individuals. This is likely to be an underestimate of the history of viral infection, because only low levels of circulating antibodies may remain from infections that were cleared in the distant past. In addition, an individual could be infected by multiple distinct strains of each viral species. We identified known and novel differences in virus exposure between groups differing in age, HIV status, and geographic location across four different continents. Our results are largely consistent with previous studies, validating the effectiveness of VirScan. For example, CMV antibodies were found in significantly higher frequencies in Peru, Thailand, and South Africa, whereas KSHV and HSV1 antibodies were detected more frequently in Peru and South Africa but not in Thailand (16, 27–31). We also uncovered previously undocumented serological differences, such as an increased rate of antibodies against adenovirus B and RSV in HIV-positive individuals compared with HIV-negative individuals. These differences may provide insight into how HIV co-infection alters the balance between host immunity and resident viruses, as well as help to identify pathogens that may increase susceptibility to HIV and other heterologous infections. HIV infection may reduce the immune system's ability to control reactivation of normally dormant resident viruses or to prevent opportunistic infections from taking hold and triggering a strong adaptive immune response. Beyond the epidemiological applications demonstrated here, VirScan could also be applied to identify viral exposures that correlate with disease or other phenotypes in virome-wide association studies.

Our results identified a large number of novel B cell epitopes, cumulatively nearly doubling the number of all previously identified viral epitopes. We have used our data to identify globally immunodominant and commonly recognized "public" epitopes. For most species of viruses, one or more peptides are individually recognized in over 70 to 95% of samples positive for that species (table S3). We identified a set of two peptides that together are recognized by >95% of all screened samples and a set of five peptides that together are recognized in >99% of screened samples. These public epitopes could be used to improve vaccine design by piggybacking on the existing antibody response against them. Fusing a public B cell epitope to a protein in a vaccine to which we hope to induce an immune response may increase a vaccine's efficacy among a broad population by improving presentation of that protein and aiding affinity maturation. Preexisting B cells recognizing the public epitope can act as antigen-presenting cells to process and present T cell epitopes of the fused vaccine target on MHC class I and II (32). Antibodies secreted by these B cells can also participate in immune complexes with the fused vaccine target, which are critical for follicular dendritic cells to prime class switching and affinity maturation of B cells recognizing other epitopes on the fused antigen (33). Last, we demonstrated that applying more weight to these public epitopes increases the sensitivity of VirScan without significantly affecting specificity, suggesting that this limited subset of peptides can serve as the basis for the next generation of our assay or for other novel diagnostics.

We also found that the precise epitopes recognized by the B cell response are highly similar among individuals across many viral proteins. One possible model for this notable similarity is that these regions possess properties favorable for antigenicity, such as accessibility. Another model is that the same or highly similar B cell receptor sequences that recognize these epitopes are commonly generated. Identical T cell receptor sequences ("public" clonotypes) have been found in multiple individuals and are thought to be the result of biases during the recombination process that favor certain amino acid sequences (34). V(D)J recombination of the immunoglobulin heavy- and light-chain loci is also heavily biased (35). Highly similar or even identical complementarity determining region 3 (CDR3) sequences have been observed in dengue virus-specific antibodies from different individuals (36). It is possible that, rather than being an exception for dengue-specific antibodies, this represents a general phenomenon. Inherent biases in V(D)J recombination generate the same or similar antibodies in multiple individuals that recognize highly similar epitopes. Slight differences in the antibody CDR3 sequence may subtly alter antibody-antigen interaction, leading to the slight variations observed in the extent of critical epitope regions. Sequencing of antigen-specific antibody genes will be required to investigate these possibilities. The same principle may also apply to T cell epitopes and their cognate T cell receptors.

VirScan is a method that enables human virome-wide exploration, at the epitope level, of immune responses in large numbers of individuals. We have demonstrated its effectiveness for determining viral exposure and characterizing viral B cell epitopes in high throughput and at high resolution. Our preliminary studies have revealed intriguing general properties of the human immune system, both at the individual and population scale. VirScan will be an important tool in uncovering the effect of host-virome interactions on human health and disease and could easily be expanded to include other human pathogens, such as bacteria, fungi, and protozoa.

Materials and methods

Human donor samples

Specimens originating from human donors were collected after informed written consent was obtained and under a protocol approved by the local governing human research protection committee. Secondary use of all samples for the purposes of this work was exempted by the Brigham and Women's Hospital Institutional Review Board (protocol number 2013P001337). Samples included donors residing in Thailand (n = 48), Peru (n = 48), South Africa (n = 48), and the United States, including HIV+ donors (n = 61) and HCV+ donors (n = 26). All serum and plasma samples were stored in aliquots at −80°C until use.

Design and cloning of viral peptide and scanning mutagenesis library sequences

For the virome peptide library, we first downloaded all protein sequences in the UniProt database from viruses with human host and collapsed on 90% sequence identity [www.uniprot.org/uniref/?query=uniprot:(host:"Human+[9606]")+identity:0.9]. UniProt's clustering algorithm represents each group of protein sequences sharing at least 90% sequence similarity with a single representative sequence. Then we created 56-amino acid (aa) peptide sequences tiling through all the proteins with 28-aa overlap. We reverse-translated these peptide sequences into DNA codons optimized for expression in Escherichia coli, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nucleotide (nt) oligonucleotide sequences.

For the scanning mutagenesis library, we first took the sequences of the peptides to be mutagenized. For each peptide, we made all single-mutant and consecutive double- and triple-mutant sequences scanning through the whole peptide. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. We reverse-translated these peptide sequences into DNA codons, making synonymous mutations when necessary to avoid restriction sites used in subsequent cloning steps (EcoRI and XhoI). We also made synonymous mutations to ensure that the 50 nt at the 5′ end of each peptide sequence is unique, to allow unambiguous mapping of the sequencing results. Last, we added the adapter sequence AGGAATTCCGCTGCGT to the 5′ end and CAGGGAAGAGCTCGAA to the 3′ end to form the 200-nt oligonucleotide sequences.
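The two library designs above lend themselves to a short illustration. The Python sketch below tiles a protein into 56-aa peptides with 28-aa overlap and enumerates single, consecutive double, and consecutive triple alanine-scanning mutants. It is a hedged sketch rather than the authors' pipeline: the function names are hypothetical, codon optimization and adapter addition are left out, and the handling of a protein's C-terminal remainder (one extra tile anchored at the last residue) is an assumption, since the text does not say how the final tile was chosen.

```python
def tile_protein(seq, length=56, overlap=28):
    """Split a protein into peptides of `length` aa overlapping by `overlap` aa."""
    step = length - overlap
    if len(seq) <= length:
        return [seq]
    tiles = [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]
    # Assumption: if regular tiling stops short of the C terminus,
    # add one final tile that ends at the last residue.
    if (len(seq) - length) % step != 0:
        tiles.append(seq[-length:])
    return tiles


def mutate_residue(residue):
    """Non-alanine residues become alanine; alanine becomes glycine."""
    return "G" if residue == "A" else "A"


def scanning_mutants(peptide, max_run=3):
    """Return (start, run_length, mutant_sequence) for every run of 1..max_run
    consecutive mutated positions, scanning through the whole peptide."""
    mutants = []
    for run in range(1, max_run + 1):
        for start in range(len(peptide) - run + 1):
            mutated = "".join(mutate_residue(r) for r in peptide[start:start + run])
            mutants.append((start, run, peptide[:start] + mutated + peptide[start + run:]))
    return mutants


# Toy example: a 100-aa sequence gives tiles at 0 and 28 plus one end-anchored tile.
protein = "ACDEFGHIKLMNPQRSTVWY" * 5
tiles = tile_protein(protein)
mutants = scanning_mutants("ACDE")
```

A 56-aa peptide corresponds to 168 nt of coding sequence, which together with the two 16-nt adapters gives the 200-nt oligonucleotide length stated in the text.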


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitationand sequencing

We performed phage immunoprecipitation andsequencing by using a slightlymodified version ofpreviously published PhIP-Seq protocols (8 10)First we blocked eachwell of a 96-deep-well platewith 1 ml of 3 bovine serum albumin in TBSTovernight on a rotator at 4degC To each preblockedwell we added sera or plasma containing about2 mg of immunoglobulinG (IgG) [quantified usinga Human IgG ELISA Quantitation Set (BethylLaboratories)] and 1 ml of the bacteriophagelibrary diluted to ~2 times 105 fold representation(2 times 1010 plaque-forming units for a library of105 clones) in phage extraction buffer (20 mMTris-HCl pH 80 100 mMNaCl 6 mMMgSO4)We performed two technical replicates for eachsample We allowed the antibodies to bind thephage overnight on a rotator at 4degC The nextday we added 20 ml each of magnetic protein Aand protein G Dynabeads (Invitrogen) to eachwell and allowed immunoprecipitation to occurfor 4 hours on a rotator at 4degC With a 96-wellmagnetic stand we then washed the beads threetimeswith 400 ml of PhIP-Seqwash buffer (50mMTris-HCl pH 75 150mMNaCl 01 NP-40) Afterthe final wash we resuspended the beads in40 ml of water and lysed the phage at 95degC for10mWe also lysed phage from the library beforeimmunoprecipitation (ldquoinputrdquo) and after immu-noprecipitation with beads aloneWe prepared the DNA for multiplexed Il-

lumina sequencing by using a slightly modifiedversion of a previously published protocol (36)Weperformed two rounds of PCR amplification on thelysedphagematerial usinghot startQ5polymeraseaccording to the manufacturer-suggested protocol(NEB) The first round of PCR used the primersIS7_HsORF5_2 (ACACTCTTTCCCTACACGACTC-CAGTCAGGTGTGATGCTC) and IS8_HsORF3_2(GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCC-GAGCTTATCGTCGTCATCC) The second roundof PCR used 1 ml of the first-round product andthe primers IS4_HsORF5_2 (AATGATACGGCGA-CCACCGAGATCTACACTCTTTCCCTACACGACTC-CAGT) and a different unique indexing primerfor each sample to bemultiplexed for sequencing(CAAGCAGAAGACGGCATACGAGATxxxxxxx-GTGACTGGAGTTCAGACGTGT where ldquoxxxxxxxrdquodenotes a unique 7-nt indexing sequence) Afterthe second round of PCR we determined theDNA concentration of each sample by quan-titative PCR and pooled equimolar amounts ofall samples for gel extraction After gel extractionthe pooled DNA was sequenced by the Harvard

Medical School Biopolymers Facility using a 50ndashbase pair read cycle on an IlluminaHiSeq 2000 or2500We pooled up to 192 samples for sequencingon each lane and generally obtained ~100 mil-lion to 200 million reads per lane (500000 to1000000 reads per sample)

Informatics and statistical analysis

We performed the initial informatics and statis-tical analysis by using a slightly modified versionof the previously published technique (8 10) Wefirstmapped the sequencing reads to the originallibrary sequences by using Bowtie and countedthe frequency of each clone in the ldquoinputrdquo andeach sample ldquooutputrdquo (37) Because the majorityof clones are not enriched we used the observeddistribution of output counts as a null distribu-tion We found that a zero-inflated generalizedpoisson distribution fits our output counts wellWe used this null distribution to calculate a Pvalue for the likelihood of enrichment for eachclone The probabilitymass function for the zero-inflated generalized poisson distribution is

PethY frac14 yTHORNfrac14 pthorn eth1 minus pTHORNfrac12qethqthorn lTHORNxminus1eminusqminusxl y frac14 0

eth1 minus pTHORNfrac12qethqthorn lTHORNxminus1eminusqminusxl y gt 0

We used maximum likelihood estimation to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. We repeated this procedure for all of the observed input counts and found that θ and λ are well fit by linear regression and π by an exponential regression as a function of input count (fig. S1). Last, for each clone, we used its input count and the regression results to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the -log10(P value) of obtaining the observed count.

To call hits, we determined the threshold for reproducibility between technical replicates based on a previously published method (10). Briefly, we made scatter plots of the log10 of the -log10(P values) and used a sliding window of width 0.005 from 0 to 2 across the axis of one replicate. For all the clones that fell within each window, we calculated the median and median absolute deviation of the log10 of the -log10(P values) in the other replicate and plotted it against the window location (fig. S2). We called the threshold for reproducibility the first window in which the median was greater than the median absolute deviation. We found that the distribution of the threshold -log10(P value) was centered around a mean of ~2.3 (fig. S12). So we called a peptide a hit if the -log10(P value) was at least 2.3 in both replicates. We eliminated the 593 hits that came up in at least 3 of the 22 immunoprecipitations with beads alone (negative control for nonspecific binding). We also filtered out any peptides that were not enriched in at least two of the samples.

To call virus exposures, we grouped peptides according to the virus the peptide is derived from. We grouped all peptides from individual viral strains for which we had complete proteomes. The sample was counted as positive for a species if it was positive for any strain from that species. For viral strains that had partial proteomes, we grouped them with other strains from the same species to form a complete set and bioinformatically eliminated homologous peptides (see next paragraph). We set a threshold number of hits per virus based on the size of the virus. We found that there is approximately a power-law relationship between size of the virus and the average number of hits per sample (fig. S3). In comparing results from VirScan to samples with known infection, we empirically determined that a threshold of three hits for HSV1 worked the best. We used this value and the slope of the best-fit line to scale the threshold for other viruses. We also set a minimum threshold of at least two hits in order to avoid false positives from single spurious hits.

To bioinformatically remove cross-reactive antibodies, we first sorted the viruses by total number of hits in descending order. We then iterated through each virus in this order. For each virus, we iterated through each peptide hit. If the hit shared a subsequence of at least 7 aa with any hit previously observed in any of the viruses from that sample, that hit was considered to be from a cross-reactive antibody and was ignored for that virus. Otherwise, the hit was considered to be specific, and the score for that virus was incremented by one. In this way, we summed only the peptide hits that do not share any linear epitopes. We compared the final score for each virus to the threshold for that virus to determine whether the sample is positive for exposure to that virus.

To identify differences between populations, we first used Fisher's exact test to calculate a P value for the significance of association of virus exposure with one population versus another. Then we constructed a null distribution of Fisher's exact P values by randomly permuting the sample labels 1000 times and recalculating the Fisher's exact P value for each virus. With use of this null distribution, we calculated the false discovery rate by dividing the number of permutation P values more extreme than the one observed by the total number of permutations.
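The cross-reactivity filter described above is a greedy pass over the viruses, and can be sketched in Python as follows. The function names, the input layout (a mapping from virus name to its list of hit peptide sequences for one sample), and the toy sequences are hypothetical. The sketch also assumes that ignored hits still count as "previously observed" for later comparisons, which the text does not spell out.

```python
def kmers(seq, k=7):
    """All length-k subsequences (contiguous windows) of a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score_viruses(hits_by_virus, k=7):
    """Count hits per virus that share no k-mer with any earlier hit."""
    # Viruses with the most hits are processed first (descending order).
    order = sorted(hits_by_virus, key=lambda v: len(hits_by_virus[v]),
                   reverse=True)
    seen = set()      # k-mers of every hit observed so far in this sample
    scores = {}
    for virus in order:
        score = 0
        for peptide in hits_by_virus[virus]:
            peptide_kmers = kmers(peptide, k)
            if not (peptide_kmers & seen):
                score += 1            # specific hit
            # Assumption: ignored hits are also remembered.
            seen |= peptide_kmers
        scores[virus] = score
    return scores


def call_exposures(scores, thresholds):
    """A sample is positive for a virus when its score meets the threshold."""
    return {virus: scores[virus] >= thresholds[virus] for virus in scores}


# Toy example: the single virusB hit shares "TAYIAKQ" with a virusA hit.
hits = {
    "virusA": ["MKTAYIAKQRQ", "GGGGSSSSTTTT"],
    "virusB": ["WWTAYIAKQRWW"],
}
scores = score_viruses(hits)
calls = call_exposures(scores, {"virusA": 2, "virusB": 2})
```

Because the pass is order dependent, a shared epitope is credited to whichever virus has more hits overall, which matches the descending sort described in the text.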

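The population comparison can be illustrated with a short, standard-library Python sketch of Fisher's exact test on a 2x2 exposure table, followed by label permutation. This is one reading of the procedure for a single virus: the two-sided form of the test, the function names, and the fixed random seed are assumptions made for illustration.

```python
import math
import random


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = math.comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k exposed samples in row 1.
        return math.comb(row1, k) * math.comb(n - row1, col1 - k) / denom

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def exposure_p(exposed, labels, group):
    """Fisher P value for exposure in `group` versus all other samples."""
    a = sum(1 for e, l in zip(exposed, labels) if l == group and e)
    b = sum(1 for e, l in zip(exposed, labels) if l == group and not e)
    c = sum(1 for e, l in zip(exposed, labels) if l != group and e)
    d = sum(1 for e, l in zip(exposed, labels) if l != group and not e)
    return fisher_exact_two_sided(a, b, c, d)


def permutation_fraction(exposed, labels, group, n_perm=1000, seed=0):
    """Observed P value and the fraction of permutations at least as extreme."""
    rng = random.Random(seed)
    observed = exposure_p(exposed, labels, group)
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if exposure_p(exposed, shuffled, group) <= observed * (1 + 1e-9):
            extreme += 1
    return observed, extreme / n_perm


# Toy data: every sample in population A is exposed, none in population B.
exposed = [1] * 10 + [0] * 10
labels = ["A"] * 10 + ["B"] * 10
p_observed, fraction = permutation_fraction(exposed, labels, "A")
```

Here `fraction` plays the role of the ratio described in the text (permutation P values at least as extreme as the observed one, divided by the number of permutations).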
IEDB epitope overlap analysis

We downloaded data for all continuous human B cell epitopes from IEDB and filtered out all nonviral epitopes (22). To avoid redundancy in these 4549 viral epitopes, we grouped together epitopes that are 100% identical or share a 7-aa subsequence, giving us 1559 nonredundant epitope groups. Of these groups, 1392 contain a member epitope that is also a subsequence of a peptide in the VirScan library. This represents the total number of epitopes we could detect by VirScan. To determine the number of epitopes we detected, we tallied the number of epitope groups with at least one member that is contained in a peptide that was enriched in one or two samples. Last, to determine the number of nonredundant new epitopes we detected, we grouped non-IEDB epitope-containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
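The redundancy grouping can be sketched as a union-find over shared 7-mers. The following Python sketch assumes that grouping is transitive (if A shares a 7-mer with B and B with C, all three fall in one group), which the text does not state explicitly; the function name and the toy sequences are hypothetical.

```python
from collections import defaultdict


def group_by_shared_kmer(sequences, k=7):
    """Group sequences that are identical or share a length-k subsequence."""
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_owner = {}  # key -> index of the first sequence that produced it
    for idx, seq in enumerate(sequences):
        # The full sequence is a key too, so identical epitopes shorter
        # than k residues are still grouped together.
        keys = {("full", seq)}
        keys.update(("kmer", seq[i:i + k]) for i in range(len(seq) - k + 1))
        for key in keys:
            if key in first_owner:
                union(idx, first_owner[key])
            else:
                first_owner[key] = idx

    groups = defaultdict(list)
    for idx, seq in enumerate(sequences):
        groups[find(idx)].append(seq)
    return list(groups.values())


# Toy sequences: the first two share "AAAAAAA"; the last two are identical.
toy = ["AAAAAAAKLM", "QAAAAAAA", "WWWWWWWW", "CDE", "CDE"]
toy_groups = group_by_shared_kmer(toy)
```

Indexing each 7-mer once keeps the pass roughly linear in total sequence length, rather than comparing every pair of epitopes.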

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided enrichment of the mutated peptide by enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the wild-type peptide enrichment with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
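These ratios can be written out in a few lines of Python. The sketch below is illustrative only: the function names and count dictionaries are hypothetical, peptides with zero input reads are skipped as an assumption, and "middle two quartiles" is read as the sorted single-mutant values with the lowest and highest quarter removed.

```python
from statistics import mean


def fractional_abundance(counts):
    """Read count of each peptide divided by the total reads in the sample."""
    total = sum(counts.values())
    return {peptide: n / total for peptide, n in counts.items()}


def enrichment(input_counts, output_counts):
    """Fractional abundance after immunoprecipitation over that before."""
    before = fractional_abundance(input_counts)
    after = fractional_abundance(output_counts)
    # Assumption: peptides absent from the input are skipped (no division by 0).
    return {p: after.get(p, 0.0) / f for p, f in before.items() if f > 0}


def wild_type_estimate(wt_enrichment, mutant_enrichments):
    """Average the wild-type value with the middle half of mutant values."""
    ranked = sorted(mutant_enrichments)
    q = len(ranked) // 4
    middle = ranked[q:len(ranked) - q]
    return mean([wt_enrichment] + middle)


def relative_enrichment(mutant_enrichment, wt_estimate):
    """Mutant enrichment expressed relative to the wild-type estimate."""
    return mutant_enrichment / wt_estimate


# Toy counts for a wild-type peptide and one single mutant.
enr = enrichment({"wt": 100, "m1": 100}, {"wt": 300, "m1": 100})
wt_est = wild_type_estimate(1.0, [0, 1, 1, 1, 1, 1, 1, 10])
```

Trimming the outer quartiles keeps the few mutants that truly disrupt binding (or extreme outliers) from distorting the wild-type baseline.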

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

REFERENCES AND NOTES

1. K. M. Wylie, G. M. Weinstock, G. A. Storch, Emerging view of the human virome. Transl. Res. 160, 283–290 (2012). doi: 10.1016/j.trsl.2012.03.006; pmid: 22683423

2. B. A. Duerkop, L. V. Hooper, Resident viruses and their interactions with the immune system. Nat. Immunol. 14, 654–659 (2013). doi: 10.1038/ni.2614; pmid: 23778792

3. E. S. Barton et al., Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329 (2007). doi: 10.1038/nature05762; pmid: 17507983

4. E. F. Foxman, A. Iwasaki, Genome-virome interactions: Examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264 (2011). doi: 10.1038/nrmicro2541; pmid: 21407242

5. M. Lecuit, M. Eloit, The human virome: New tools and concepts. Trends Microbiol. 21, 510–515 (2013). doi: 10.1016/j.tim.2013.07.001; pmid: 23906500

6. I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013). doi: 10.1016/j.cell.2013.10.034; pmid: 24267896

7. E. Hammarlund et al., Duration of antiviral immunity after smallpox vaccination. Nat. Med. 9, 1131–1137 (2003). doi: 10.1038/nm917; pmid: 12925846

8. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat. Biotechnol. 29, 535–541 (2011). doi: 10.1038/nbt.1856; pmid: 21602805

9. UniProt Consortium, Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014). doi: 10.1093/nar/gkt1140; pmid: 24253303

10. H. B. Larman et al., PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J. Autoimmun. 43, 1–9 (2013). doi: 10.1016/j.jaut.2013.01.013; pmid: 23497938

11. C. Bialecki, H. M. Feder Jr., J. M. Grant-Kels, The six classic childhood exanthems: A review and update. J. Am. Acad. Dermatol. 21, 891–903 (1989). doi: 10.1016/S0190-9622(89)70275-9; pmid: 2681288

12. J. H. Lee, W. K. Roth, S. Zeuzem, Evaluation and comparison of different hepatitis C virus genotyping and serotyping assays. J. Hepatol. 26, 1001–1009 (1997). doi: 10.1016/S0168-8278(97)80108-0; pmid: 9186830

13. H. F. L. Wertheim et al., Key role for clumping factor B in Staphylococcus aureus nasal colonization of humans. PLOS Med. 5, e17 (2008). doi: 10.1371/journal.pmed.0050017; pmid: 18198942

14. R. A. Manz, A. E. Hauser, F. Hiepe, A. Radbruch, Maintenance of serum antibody levels. Annu. Rev. Immunol. 23, 367–386 (2005). doi: 10.1146/annurev.immunol.23.021704.115723; pmid: 15771575

15. M. Wang et al., Human anti-JC virus serum reacts with native but not denatured JC virus major capsid protein VP1. J. Virol. Methods 78, 171–176 (1999). doi: 10.1016/S0166-0934(98)00180-3; pmid: 10204707

16. S. A. S. Staras et al., Seroprevalence of cytomegalovirus infection in the United States, 1988-1994. Clin. Infect. Dis. 43, 1143–1151 (2006). doi: 10.1086/508173; pmid: 17029132

17. M. A. Reynolds, D. Kruszon-Moran, A. Jumaan, D. S. Schmid, G. M. McQuillan, Varicella seroprevalence in the U.S.: Data from the National Health and Nutrition Examination Survey, 1999-2004. Public Health Rep. 125, 860–869 (2010). pmid: 21121231

18. J. I. Cohen, Epstein-Barr virus infection. N. Engl. J. Med. 343, 481–492 (2000). doi: 10.1056/NEJM200008173430707; pmid: 10944566

19. L. Dong et al., A combination of serological assays to detect human antibodies to the avian influenza A H7N9 virus. PLOS ONE 9, e95612 (2014). doi: 10.1371/journal.pone.0095612; pmid: 24755627

20. P. Patel et al., Prevalence and risk factors associated with herpes simplex virus-2 infection in a contemporary cohort of HIV-infected persons in the United States. Sex. Transm. Dis. 39, 154–160 (2012). doi: 10.1097/OLQ.0b013e318239d7fd; pmid: 22249305

21. C. T. Stover et al., Prevalence of and risk factors for viral infections among human immunodeficiency virus (HIV)-infected and high-risk HIV-uninfected women. J. Infect. Dis. 187, 1388–1396 (2003). pmid: 12717619

22. E. A. Engels et al., Risk factors for human herpesvirus 8 infection among adults in the United States and evidence for sexual transmission. J. Infect. Dis. 196, 199–207 (2007). doi: 10.1086/518791; pmid: 17570106

23. R. Vita et al., The immune epitope database 2.0. Nucleic Acids Res. 38, D854–D862 (2010). doi: 10.1093/nar/gkp1004; pmid: 19906713

24. H. Singh, H. R. Ansari, G. P. S. Raghava, Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLOS ONE 8, e62216 (2013). doi: 10.1371/journal.pone.0062216; pmid: 23667458

25. J. L. Mokili, F. Rohwer, B. E. Dutilh, Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012). doi: 10.1016/j.coviro.2011.12.004; pmid: 22440968

26. J. Zhu et al., Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat. Biotechnol. 31, 331–334 (2013). doi: 10.1038/nbt.2539; pmid: 23503679

27. Y. Urwijitaroon, S. Teawpatanataworn, A. Kitjareontarm, Prevalence of cytomegalovirus antibody in Thai-northeastern blood donors. Southeast Asian J. Trop. Med. Public Health 24 (suppl. 1), 180–182 (1993). pmid: 7886568

28. M. J. Cannon, D. S. Schmid, T. B. Hyde, Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev. Med. Virol. 20, 202–213 (2010). doi: 10.1002/rmv.655; pmid: 20564615

29. S. Mohanna et al., Human herpesvirus-8 in Peruvian blood donors: A population with hyperendemic disease. Clin. Infect. Dis. 44, 558–561 (2007). doi: 10.1086/511044; pmid: 17243060

30. D. Ablashi et al., Seroprevalence of human herpesvirus-8 (HHV-8) in countries of Southeast Asia compared to the USA, the Caribbean and Africa. Br. J. Cancer 81, 893–897 (1999). doi: 10.1038/sj.bjc.6690782; pmid: 10555764

31. J. S. Smith, N. J. Robinson, Age-specific prevalence of infection with herpes simplex virus types 2 and 1: A global review. J. Infect. Dis. 186 (suppl. 1), S3–S28 (2002). doi: 10.1086/343739; pmid: 12353183

32. A. Heit et al., CpG-DNA aided cross-priming by cross-presenting B cells. J. Immunol. 172, 1501–1507 (2004). doi: 10.4049/jimmunol.172.3.1501; pmid: 14734727

33. Y. Aydar, S. Sukumar, A. K. Szakal, J. G. Tew, The influence of immune complex-bearing follicular dendritic cells on the IgM response, Ig class switching, and production of high affinity IgG. J. Immunol. 174, 5358–5366 (2005). doi: 10.4049/jimmunol.174.9.5358; pmid: 15843533

34. M. F. Quigley et al., Convergent recombination shapes the clonotypic landscape of the naive T-cell repertoire. Proc. Natl. Acad. Sci. U.S.A. 107, 19414–19419 (2010). doi: 10.1073/pnas.1010586107; pmid: 20974936

35. K. J. L. Jackson, M. J. Kidd, Y. Wang, A. M. Collins, The shape of the lymphocyte receptor repertoire: Lessons from the B cell receptor. Front. Immunol. 4, 263 (2013). doi: 10.3389/fimmu.2013.00263; pmid: 24032032

36. P. Parameswaran et al., Convergent antibody signatures in human dengue. Cell Host Microbe 13, 691–700 (2013). doi: 10.1016/j.chom.2013.05.008; pmid: 23768493

37. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009). doi: 10.1186/gb-2009-10-3-r25; pmid: 19261174

ACKNOWLEDGMENTS

We thank E. Unger and S. Buranapraditkun for providing reagents; K. Wucherpfennig (Harvard) and H. Ploegh (MIT) for critical reading of the manuscript; and TWIST Bioscience for providing access to their advanced oligonucleotide synthesis technology. The cohort in Durban, South Africa, was funded by the NIH (R37AI067073) and the International AIDS Vaccine Initiative (UKZNRSA1001). T.N. received additional funding from the South African Research Chairs Initiative, the Victor Daitz Foundation, and an International Early Career Scientist Award from the Howard Hughes Medical Institute. R.T.C. was funded by grants NIH DA033541 and AI082630. C.B. and J.S. were supported by NIH N01-AI-30024 and N01-Al-15422, NIH–National Institute of Dental and Craniofacial Research R01 DE018925-04, the HIVACAT program, and CUTHIVAC 241904. K.R. is supported by TRF Senior Research Scholar, the Thailand Research Fund; the Chulalongkorn University Research Professor Program, Thailand; and NIH grant N01-A1-30024. G.J.X. and T.K. were supported by the NSF Graduate Research Fellowships Program. S.J.E. and B.W. are Investigators with the Howard Hughes Medical Institute. G.J.X., T.K., H.B.L., and S.J.E. are inventors on a patent application (application no. PCT/US14/70902) filed by Brigham and Women's Hospital Incorporated that covers the use of phage display libraries to detect antiviral antibodies.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-9


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndungu, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

Science 348 (6239), aaa0698.
DOI: 10.1126/science.aaa0698

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015 American Association for the Advancement of Science


The 200-nt oligonucleotide sequences were synthesized on a releasable DNA microarray. We PCR-amplified the DNA with the primers T7-PFA (AATGATACGGCGGGAATTCCGCTGCGT) and T7-PRA (CAAGCAGAAGACTCGAGCTCTTCCCTG), digested the product with EcoRI and XhoI, and cloned the fragment into the EcoRI/SalI site of the T7FNS2 vector (8). The resulting library was packaged into T7 bacteriophage by using the T7 Select Packaging Kit (EMD Millipore) and amplified by using the manufacturer-suggested protocol.

Phage immunoprecipitation and sequencing

We performed phage immunoprecipitation and sequencing by using a slightly modified version of previously published PhIP-Seq protocols (8, 10). First, we blocked each well of a 96-deep-well plate with 1 ml of 3% bovine serum albumin in TBST overnight on a rotator at 4°C. To each preblocked well, we added sera or plasma containing about 2 μg of immunoglobulin G (IgG) [quantified using a Human IgG ELISA Quantitation Set (Bethyl Laboratories)] and 1 ml of the bacteriophage library diluted to ~2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 6 mM MgSO4). We performed two technical replicates for each sample. We allowed the antibodies to bind the phage overnight on a rotator at 4°C. The next day, we added 20 μl each of magnetic protein A and protein G Dynabeads (Invitrogen) to each well and allowed immunoprecipitation to occur for 4 hours on a rotator at 4°C. With a 96-well magnetic stand, we then washed the beads three times with 400 μl of PhIP-Seq wash buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% NP-40). After the final wash, we resuspended the beads in 40 μl of water and lysed the phage at 95°C for 10 min. We also lysed phage from the library before immunoprecipitation ("input") and after immunoprecipitation with beads alone.

We prepared the DNA for multiplexed Illumina sequencing by using a slightly modified version of a previously published protocol (36). We performed two rounds of PCR amplification on the lysed phage material using hot start Q5 polymerase according to the manufacturer-suggested protocol (NEB). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μl of the first-round product and the primers IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where "xxxxxxx" denotes a unique 7-nt indexing sequence). After the second round of PCR, we determined the DNA concentration of each sample by quantitative PCR and pooled equimolar amounts of all samples for gel extraction. After gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50–base pair read cycle on an Illumina HiSeq 2000 or 2500. We pooled up to 192 samples for sequencing on each lane and generally obtained ~100 million to 200 million reads per lane (500,000 to 1,000,000 reads per sample).
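The depth and representation figures quoted above follow from simple division, which the short Python check below reproduces. The function names are hypothetical, and the per-clone figure assumes reads are spread evenly over a library of about 10^5 clones.

```python
def reads_per_sample(reads_per_lane, samples_per_lane):
    """Average reads per sample when a lane is shared equally."""
    return reads_per_lane // samples_per_lane


def fold_representation(plaque_forming_units, library_size):
    """Average number of phage particles per library clone."""
    return plaque_forming_units / library_size


low = reads_per_sample(100_000_000, 192)    # about 0.5 million
high = reads_per_sample(200_000_000, 192)   # about 1 million
coverage = fold_representation(2e10, 1e5)   # 2 x 10^5-fold
# Assumption: even coverage, so roughly 5 to 10 reads per clone per sample.
reads_per_clone = (low / 1e5, high / 1e5)
```

A fully loaded lane of 192 samples therefore sits at the lower end of the quoted per-sample range.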

Informatics and statistical analysis

We performed the initial informatics and statis-tical analysis by using a slightly modified versionof the previously published technique (8 10) Wefirstmapped the sequencing reads to the originallibrary sequences by using Bowtie and countedthe frequency of each clone in the ldquoinputrdquo andeach sample ldquooutputrdquo (37) Because the majorityof clones are not enriched we used the observeddistribution of output counts as a null distribu-tion We found that a zero-inflated generalizedpoisson distribution fits our output counts wellWe used this null distribution to calculate a Pvalue for the likelihood of enrichment for eachclone The probabilitymass function for the zero-inflated generalized poisson distribution is

PethY frac14 yTHORNfrac14 pthorn eth1 minus pTHORNfrac12qethqthorn lTHORNxminus1eminusqminusxl y frac14 0

eth1 minus pTHORNfrac12qethqthorn lTHORNxminus1eminusqminusxl y gt 0

We used maximum likelihood estimation toregress the parameters p q and l to fit the dis-tribution of counts after immunoprecipitationfor all clones present at a particular frequencycount in the input We repeated this procedurefor all of the observed input counts and foundthat q and l are well fit by linear regression and pby an exponential regression as a function of in-put count (fig S1) Last for each clonewe used itsinput count and the regression results to deter-mine the null distribution based on the zero-inflated generalized poisson model which weused to calculate the ndashlog10(P value) of obtain-ing the observed countTo call hits we determined the threshold for

reproducibility between technical replicates basedon a previously published method (10) Brieflywe made scatter plots of the log10 of the ndashlog10 (Pvalues) and used a slidingwindowofwidth 0005from0 to 2 across the axis of one replicate For allthe clones that fell within each window we cal-culated the median and median absolute devia-tion of the log10 of the ndashlog10 (P values) in theother replicate and plotted it against the windowlocation (fig S2) We called the threshold for re-producibility the first window in which the me-dian was greater than the median absolutedeviation We found that the distribution of thethreshold ndashlog10 (P value) was centered around amean of ~23 (fig S12) So we called a peptide ahit if the ndashlog10 (P value) was at least 23 in bothreplicates We eliminated the 593 hits that cameup in at least 3 of the 22 immunoprecipitationswith beads alone (negative control for nonspecificbinding) We also filtered out any peptides thatwere not enriched in at least two of the samplesTo call virus exposures we grouped peptides

according to the virus the peptide is derived

from We grouped all peptides from individual vi-ral strains for which we had complete proteomesThe samplewas counted as positive for a species ifit was positive for any strain from that species Forviral strains thathadpartial proteomeswegroupedthemwith other strains from the same species toform a complete set and bioinformatically elim-inated homologous peptides (see next para-graph) We set a threshold number of hits pervirus based on the size of the virus We foundthat there is approximately a power-law relation-ship between size of the virus and the averagenumber of hits per sample (fig S3) In comparingresults from VirScan to samples with known in-fection we empirically determined that a thresh-old of three hits for HSV1 worked the best Weused this value and the slope of the best fit lineto scale the threshold for other viruses We alsoset a minimum threshold of at least two hits inorder to avoid false positives from single spuri-ous hitsTo bioinformatically remove cross-reactive an-

tibodies we first sorted the viruses by total num-ber of hits in descending order We then iteratedthrough each virus in this order For each viruswe iterated through each peptide hit If the hitshared a subsequence of at least 7 aa with any hitpreviously observed in any of the viruses fromthat sample that hit was considered to be from across-reactive antibody and would be ignored forthat virus Otherwise the hit is considered to bespecific and the score for that virus is incre-mented by one In this way we summed only thepeptide hits that do not share any linear epi-topes We compared the final score for each virusto the threshold for that virus to determinewheth-er the sample is positive for exposure to that virusTo identify differences between populations

we first used Fisherrsquos exact test to calculate a Pvalue for the significance of association of virusexposure with one population versus anotherThenwe constructed anull distribution of Fisherrsquosexact P values by randomly permuting the sam-ple labels 1000 times and recalculating the Fisherrsquosexact P value for each virus With use of this nulldistribution we calculated the false discovery rateby dividing the number of permutation P valuesmore extreme than the one observed by the totalnumber of permutations

IEDB epitope overlap analysis

Wedownloaded data for all continuous humanBcell epitopes from IEDB and filtered out all non-viral epitopes (22) To avoid redundancy in these4549 viral epitopes we grouped together epi-topes that are 100 identical or share a 7-aa sub-sequence giving us 1559 nonredundant epitopegroups Of these groups 1392 contain a memberepitope that is also a subsequence of a peptide inthe VirScan library This represents the totalnumber of epitopes we could detect by VirScanTo determine the number of epitopes we de-tected we tallied the number of epitope groupswith at least one member that is contained in apeptide that was enriched in one or two samplesLast to determine the number of nonredundantnew epitopeswe detected we grouped non-IEDB

aaa0698-8 5 JUNE 2015 bull VOL 348 ISSUE 6239 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon S

eptember 17 2020

httpsciencesciencem

agorgD

ownloaded from

epitopes containing peptides that share a seven-residue subsequence and counted the number ofthese nonredundant peptide groups

Scanning mutagenesis data analysis

First we estimated the fractional abundance ofeach peptide by dividing the number of readsfor that peptide by the total number of reads forthe sample Then we divided the fractional abun-dance of each peptide after immunoprecipitationby the fractional abundance before immunopre-cipitation to get the enrichment To calculaterelative enrichment we divided enrichment ofthe mutated peptide by enrichment of the wild-type peptide Because most of the single-mutantpeptides had wild-type levels of enrichment weaveraged enrichment of the wild-type peptideenrichment with the middle two quartiles of en-richment of single-mutant peptides to get a bet-ter estimate of the wild-type peptide enrichment

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 andHSV2 antibodies by using the HerpeSelect 1 and 2Immunoblot IgG kit (Focus Diagnostics) accord-ing to manufacturerrsquos protocol Serum from 60 do-nors was tested for RSV antibodies by usinganti-RSV IgG Human ELISA Kit (ab108765) ac-cording to manufacturerrsquos protocol

REFERENCES AND NOTES

1 K M Wylie G M Weinstock G A Storch Emerging view ofthe human virome Transl Res 160 283ndash290 (2012)doi 101016jtrsl201203006 pmid 22683423

2 B A Duerkop L V Hooper Resident viruses and theirinteractions with the immune system Nat Immunol 14654ndash659 (2013) doi 101038ni2614 pmid 23778792

3 E S Barton et al Herpesvirus latency confers symbioticprotection from bacterial infection Nature 447 326ndash329(2007) doi 101038nature05762 pmid 17507983

4 E F Foxman A Iwasaki Genome-virome interactionsExamining the role of common viral infections in complexdisease Nat Rev Microbiol 9 254ndash264 (2011) doi 101038nrmicro2541 pmid 21407242

5 M Lecuit M Eloit The human virome New tools andconcepts Trends Microbiol 21 510ndash515 (2013) doi 101016jtim201307001 pmid 23906500

6 I De Vlaminck et al Temporal response of the human virome toimmunosuppression and antiviral therapy Cell 155 1178ndash1187(2013) doi 101016jcell201310034 pmid 24267896

7 E Hammarlund et al Duration of antiviral immunity aftersmallpox vaccination Nat Med 9 1131ndash1137 (2003)doi 101038nm917 pmid 12925846

8 H B Larman et al Autoantigen discovery with a synthetichuman peptidome Nat Biotechnol 29 535ndash541 (2011)doi 101038nbt1856 pmid 21602805

9 UniProt Consortium Activities at the Universal ProteinResource (UniProt) Nucleic Acids Res 42 D191ndashD198 (2014)doi 101093nargkt1140 pmid 24253303

10 H B Larman et al PhIP-Seq characterization ofautoantibodies from patients with multiple sclerosis type

1 diabetes and rheumatoid arthritis J Autoimmun 43 1ndash9(2013) doi 101016jjaut201301013 pmid 23497938

11 C Bialecki H M Feder Jr J M Grant-Kels The six classicchildhood exanthems A review and update J Am AcadDermatol 21 891ndash903 (1989) doi 101016S0190-9622(89)70275-9 pmid 2681288

12 J H Lee W K Roth S Zeuzem Evaluation and comparison ofdifferent hepatitis C virus genotyping and serotyping assaysJ Hepatol 26 1001ndash1009 (1997) doi 101016S0168-8278(97)80108-0 pmid 9186830

13 H F L Wertheim et al Key role for clumping factor B inStaphylococcus aureus nasal colonization of humans PLOSMed 5 e17 (2008) doi 101371journalpmed0050017pmid 18198942

14 R A Manz A E Hauser F Hiepe A Radbruch Maintenance ofserum antibody levels Annu Rev Immunol 23 367ndash386 (2005)doi 101146annurevimmunol23021704115723 pmid 15771575

15 M Wang et al Human anti-JC virus serum reacts with nativebut not denatured JC virus major capsid protein VP1 J VirolMethods 78 171ndash176 (1999) doi 101016S0166-0934(98)00180-3 pmid 10204707

16 S A S Staras et al Seroprevalence of cytomegalovirusinfection in the United States 1988-1994 Clin Infect Dis 431143ndash1151 (2006) doi 101086508173 pmid 17029132

17 M A Reynolds D Kruszon-Moran A Jumaan D S SchmidG M McQuillan Varicella seroprevalence in the US Data fromthe National Health and Nutrition Examination Survey 1999-2004Public Health Rep 125 860ndash869 (2010)pmid 21121231

18 J I Cohen Epstein-Barr virus infection N Engl J Med 343481ndash492 (2000) doi 101056NEJM200008173430707pmid 10944566

19 L Dong et al A combination of serological assays to detecthuman antibodies to the avian influenza A H7N9 virus PLOSONE 9 e95612 (2014) doi 101371journalpone0095612pmid 24755627

20 P Patel et al Prevalence and risk factors associated withherpes simplex virus-2 infection in a contemporary cohort ofHIV-infected persons in the United States Sex Transm Dis39 154ndash160 (2012) doi 101097OLQ0b013e318239d7fdpmid 22249305

21 C T Stover et al Prevalence of and risk factors for viralinfections among human immunodeficiency virus(HIV)-infected and high-risk HIV-uninfected women J InfectDis 187 1388ndash1396 (2003)pmid 12717619

22 E A Engels et al Risk factors for human herpesvirus 8infection among adults in the United States and evidence forsexual transmission J Infect Dis 196 199ndash207 (2007)doi 101086518791 pmid 17570106

23 R Vita et al The immune epitope database 20 Nucleic Acids Res38 D854ndashD862 (2010) doi 101093nargkp1004 pmid 19906713

24 H Singh H R Ansari G P S Raghava Improved method forlinear B-cell epitope prediction using antigenrsquos primarysequence PLOS ONE 8 e62216 (2013) doi 101371journalpone0062216 pmid 23667458

25 J L Mokili F Rohwer B E Dutilh Metagenomics and futureperspectives in virus discovery Curr Opin Virol 2 63ndash77(2012) doi 101016jcoviro201112004 pmid 22440968

26 J Zhu et al Protein interaction discovery using parallelanalysis of translated ORFs (PLATO) Nat Biotechnol 3131ndash334 (2013) doi 101038nbt2539 pmid 23503679

27 Y Urwijitaroon S Teawpatanataworn A KitjareontarmPrevalence of cytomegalovirus antibody in Thai-northeasternblood donors Southeast Asian J Trop Med Public Health 24(suppl 1) 180ndash182 (1993) pmid 7886568

28 M J Cannon D S Schmid T B Hyde Review ofcytomegalovirus seroprevalence and demographiccharacteristics associated with infection Rev Med Virol 20202ndash213 (2010) doi 101002rmv655 pmid 20564615

29 S Mohanna et al Human herpesvirus-8 in Peruvian blooddonors A population with hyperendemic disease Clin Infect Dis44 558ndash561 (2007) doi 101086511044 pmid 17243060

30 D Ablashi et al Seroprevalence of human herpesvirus-8(HHV-8) in countries of Southeast Asia compared to the USAthe Caribbean and Africa Br J Cancer 81 893ndash897 (1999)doi 101038sjbjc6690782 pmid 10555764

31 J S Smith N J Robinson Age-specific prevalence of infectionwith herpes simplex virus types 2 and 1 A global reviewJ Infect Dis 186 (suppl 1) S3ndashS28 (2002) doi 101086343739 pmid 12353183

32 A Heit et al CpG-DNA aided cross-priming by cross-presenting B cells J Immunol 172 1501ndash1507 (2004)doi 104049jimmunol17231501 pmid 14734727

33 Y Aydar S Sukumar A K Szakal J G Tew The influence ofimmune complex-bearing follicular dendritic cells on the IgMresponse Ig class switching and production of high affinityIgG J Immunol 174 5358ndash5366 (2005) doi 104049jimmunol17495358 pmid 15843533

34 M F Quigley et al Convergent recombination shapes theclonotypic landscape of the naive T-cell repertoire Proc NatlAcad Sci USA 107 19414ndash19419 (2010) doi 101073pnas1010586107 pmid 20974936

35 K J L Jackson M J Kidd Y Wang A M Collins The shapeof the lymphocyte receptor repertoire Lessons from the B cellreceptor Front Immunol 4 263 (2013) doi 103389fimmu201300263 pmid 24032032

36 P Parameswaran et al Convergent antibody signatures inhuman dengue Cell Host Microbe 13 691ndash700 (2013)doi 101016jchom201305008 pmid 23768493

37 B Langmead C Trapnell M Pop S L Salzberg Ultrafast andmemory-efficient alignment of short DNA sequences to thehuman genome Genome Biol 10 R25 (2009)doi 101186gb-2009-10-3-r25 pmid 19261174

ACKNOWLEDGMENTS

We thank E Unger and S Buranapraditkun for providing reagentsK Wucherpfennig (Harvard) and H Ploegh (MIT) for criticalreading of the manuscript and TWIST Bioscience for providingaccess to their advanced oligonucleotide synthesis technology Thecohort in Durban South Africa was funded by the NIH(R37AI067073) and the International AIDS Vaccine Initiative(UKZNRSA1001) TN received additional funding from the SouthAfrican Research Chairs Initiative the Victor Daitz Foundation andan International Early Career Scientist Award from the HowardHughes Medical Institute RTC was funded by grants NIHDA033541 and AI082630 CB and JS were supported by NIHN01-AI-30024 and N01-Al-15422 NIHndashNational Institute of Dentaland Craniofacial Research R01 DE018925-04 the HIVACATprogram and CUTHIVAC 241904 KR is supported by TRF SeniorResearch Scholar the Thailand Research Fund the ChulalongkornUniversity Research Professor Program Thailand and NIH grantN01-A1-30024 GJX and TK were supported by the NSFGraduate Research Fellowships Program SJE and BW areInvestigators with the Howard Hughes Medical Institute GJXTK HBL and SJE are inventors on a patent application(application no PCTUS1470902) filed by Brigham and WomenrsquosHospital Incorporated that covers the use of phage displaylibraries to detect antiviral antibodies

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/348/6239/aaa0698/suppl/DC1
Supplementary Text
Figs. S1 to S14
Tables S1 to S3

12 October 2014; accepted 24 April 2015
10.1126/science.aaa0698

SCIENCE sciencemag.org 5 JUNE 2015 • VOL 348 ISSUE 6239 aaa0698-9


Comprehensive serological profiling of human populations using a synthetic human virome

George J. Xu, Tomasz Kula, Qikai Xu, Mamie Z. Li, Suzanne D. Vernon, Thumbi Ndung'u, Kiat Ruxrungtham, Jorge Sanchez, Christian Brander, Raymond T. Chung, Kevin C. O'Connor, Bruce Walker, H. Benjamin Larman and Stephen J. Elledge

DOI: 10.1126/science.aaa0698
Science 348 (6239), aaa0698.

Viral exposure: the complete history

In addition to causing illness, viruses leave indelible footprints behind, because infection permanently alters the immune system. Blood tests that detect antiviral antibodies can provide information about both past and present viral exposures. Typically, such tests measure only one virus at a time. Using a synthetic representation of all human viral peptides, Xu et al. developed a blood test that identifies antibodies against all known human viruses. They studied blood samples from nearly 600 people of differing ages and geographic locations and found that most had been exposed to about 10 viral species over their lifetime. Despite differences in the rates of exposure to specific viruses, the antibody responses in most individuals targeted the same viral epitopes.

Science, this issue 10.1126/science.aaa0698

ARTICLE TOOLS: http://science.sciencemag.org/content/348/6239/aaa0698

SUPPLEMENTARY MATERIALS: http://science.sciencemag.org/content/suppl/2015/06/03/348.6239.aaa0698.DC1

RELATED CONTENT:
http://stm.sciencemag.org/content/scitransmed/5/203/203ra126.full
http://stm.sciencemag.org/content/scitransmed/6/242/242ra83.full

REFERENCES: This article cites 37 articles, 3 of which you can access for free.
http://science.sciencemag.org/content/348/6239/aaa0698#BIBL

PERMISSIONS: http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service.

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2015, American Association for the Advancement of Science


epitopes containing peptides that share a seven-residue subsequence and counted the number of these nonredundant peptide groups.
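The grouping step mentioned above can be illustrated with a minimal sketch. The text here does not say exactly how groups are formed, so this sketch assumes that "share a seven-residue subsequence" means sharing a contiguous 7-mer and that sharing is transitive (A and C fall in one group if each overlaps B). The function names `kmers` and `count_nonredundant_groups` are hypothetical, not from the paper.

```python
def kmers(seq, k=7):
    """Return the set of contiguous k-residue windows in a peptide."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_nonredundant_groups(peptides, k=7):
    """Count groups of peptides linked by a shared contiguous k-mer.

    Assumption: grouping is transitive, implemented with union-find.
    Peptides shorter than k have no k-mers and form their own group.
    """
    parent = list(range(len(peptides)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # k-mer -> index of the first peptide containing it
    for idx, pep in enumerate(peptides):
        for km in kmers(pep, k):
            if km in first_owner:
                # Merge this peptide's group with the earlier peptide's group.
                parent[find(idx)] = find(first_owner[km])
            else:
                first_owner[km] = idx

    return len({find(i) for i in range(len(peptides))})


# The first two peptides share the window "EFGHIKL"; the third shares none.
example = ["ACDEFGHIKL", "EFGHIKLMNP", "QRSTVWYACD"]
n_groups = count_nonredundant_groups(example)
```

A non-transitive rule, such as greedily discarding any peptide that overlaps one already kept, would give different counts on chains of overlapping peptides, so the choice should follow the full methods text.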

Scanning mutagenesis data analysis

First, we estimated the fractional abundance of each peptide by dividing the number of reads for that peptide by the total number of reads for the sample. Then, we divided the fractional abundance of each peptide after immunoprecipitation by the fractional abundance before immunoprecipitation to get the enrichment. To calculate relative enrichment, we divided the enrichment of the mutated peptide by the enrichment of the wild-type peptide. Because most of the single-mutant peptides had wild-type levels of enrichment, we averaged the enrichment of the wild-type peptide with the middle two quartiles of enrichment of single-mutant peptides to get a better estimate of the wild-type peptide enrichment.
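The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the "middle two quartiles" are taken by rank (dropping the lowest and highest quarter of the sorted single-mutant values), and the wild-type baseline is assumed to be the pooled mean of the wild-type value and those retained values.

```python
from statistics import mean


def fractional_abundance(read_counts):
    """Reads for each peptide divided by total reads in the sample."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {pep: n / total for pep, n in read_counts.items()}


def enrichment(input_counts, ip_counts):
    """Fractional abundance after immunoprecipitation over that before."""
    before = fractional_abundance(input_counts)
    after = fractional_abundance(ip_counts)
    # Peptides absent from the input library are skipped (assumption).
    return {p: after.get(p, 0.0) / before[p] for p in before if before[p] > 0}


def middle_quartiles(values):
    """Keep the values ranked between the first and third quartile (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 4: n - n // 4]


def relative_enrichment(enrich, wild_type, mutants):
    """Mutant enrichment divided by a smoothed wild-type baseline."""
    # Pool the wild-type value with the middle half of mutant values (assumption).
    pooled = [enrich[wild_type]] + middle_quartiles([enrich[m] for m in mutants])
    baseline = mean(pooled)
    return {m: enrich[m] / baseline for m in mutants}


# Toy read counts (made up for illustration only).
input_counts = {"WT": 100, "M1": 100, "M2": 100, "M3": 100, "M4": 100}
ip_counts = {"WT": 200, "M1": 200, "M2": 20, "M3": 180, "M4": 400}
enr = enrichment(input_counts, ip_counts)
rel = relative_enrichment(enr, "WT", ["M1", "M2", "M3", "M4"])
```

In this toy data, M2 drops to roughly a tenth of the baseline, the kind of signal that would flag a residue as important for antibody binding.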

RSV and HSV1 and 2 serology

Serum from 44 donors was tested for HSV1 and HSV2 antibodies by using the HerpeSelect 1 and 2 Immunoblot IgG kit (Focus Diagnostics) according to the manufacturer's protocol. Serum from 60 donors was tested for RSV antibodies by using the anti-RSV IgG Human ELISA Kit (ab108765) according to the manufacturer's protocol.

