
Supplementary appendix

This appendix has been provided by the authors to give readers additional information about their work.


Untargeted next-generation sequencing-based first-line diagnosis of infection in immunocompromised adults: a multicentre, blinded, prospective study

Online supplemental materials

Table of Contents

List of tables
List of investigators
Supplementary methods
Supplementary tables
Supplementary figures


List of tables

Supplementary Table 1. Conventional microbiological methods used in the standard procedures
Supplementary Table 2. Concordant identifications by UNGS and standard procedures at inclusion
Supplementary Table 3. Divergent identifications by UNGS and standard procedures at inclusion
Supplementary Table 4. Clinically-relevant viruses and bacteria in samples with concordant results for UNGS and standard procedures
Supplementary Table 5. Comparison of UNGS at inclusion with standard procedures running for 30 days for clinically-relevant viruses, bacteria and fungi


List of investigators

Country | Investigator | Affiliation(s) | Qualification
France | Perrine Parize | Paris Descartes University, Sorbonne Paris Cité, Necker Pasteur Center for Infectious Diseases and Tropical Medicine, Necker Enfants Malades University Hospital, Institut Imagine, Paris | MD
France | Erika Muth | PathoQuest, Paris |
France | Clémence Richaud | Department of Microbiology, European Georges Pompidou Hospital, Assistance Publique-Hôpitaux de Paris, Université Paris Descartes, Paris | MD
France | Marlène Gratigny | PathoQuest, Paris | MSc
France | Benoit Pilmis | Paris Descartes University, Sorbonne Paris Cité, Necker Pasteur Center for Infectious Diseases and Tropical Medicine, Necker Enfants Malades University Hospital, Institut Imagine, Paris | MD
France | Arnaud Lamamy | PathoQuest, Paris | TECH
France | Jean-Luc Mainardi | Department of Microbiology, European Georges Pompidou Hospital, Assistance Publique-Hôpitaux de Paris, Université Paris Descartes, Paris | MD, PhD
France | Justine Cheval | PathoQuest, Paris | MSc
France | Louise de Visser | PathoQuest, Paris | MSc
France | Florence Jagorel | PathoQuest, Paris | TECHN


France | Laura Ben Yahia | PathoQuest, Paris | TECHN
France | Geraldine Bamba | PathoQuest, Paris | TECHN
France | Myriam Dubois | PathoQuest, Paris | B.Eng
France | Olivier Join-Lambert | Paris Descartes University, Sorbonne Paris Cité, Laboratory of Microbiology, Necker Enfants Malades University Hospital, Paris | MD, PhD
France | Marianne Leruez-Ville | Paris Descartes University, Sorbonne Paris Cité, Laboratory of Microbiology, Necker Enfants Malades University Hospital, Paris | MD, PhD
France | Xavier Nassif | Paris Descartes University, Sorbonne Paris Cité, Laboratory of Microbiology, Necker Enfants Malades University Hospital, Paris | MD, PhD
France | Agnes Lefort | University Paris-Diderot, Hospital Beaujon, Clichy | MD, PhD
France | Fanny Lanternier | Paris Descartes University, Sorbonne Paris Cité, Necker Pasteur Center for Infectious Diseases and Tropical Medicine, Necker Enfants Malades University Hospital, Institut Imagine, Paris | MD, PhD
France | Felipe Suarez | Hematology Department, Necker Hospital, Paris Descartes - Sorbonne Paris Cité University, INSERM U1163 CNRS ERL8654, Imagine Institute, Paris | MD, PhD


France | Olivier Lortholary | Paris Descartes University, Sorbonne Paris Cité, Necker Pasteur Center for Infectious Diseases and Tropical Medicine, Necker Enfants Malades University Hospital, Institut Imagine, Paris | MD, PhD
France | Marc Lecuit | Paris Descartes University, Sorbonne Paris Cité, Necker Pasteur Center for Infectious Diseases and Tropical Medicine, Necker Enfants Malades University Hospital, Institut Imagine, Paris, and Institut Pasteur, Biology of Infection Unit, Inserm U1117, Pathogen Discovery Laboratory, Paris | MD, PhD
France | Marc Eloit | PathoQuest, Paris and Institut Pasteur, Biology of Infection Unit, Inserm U1117, Pathogen Discovery Laboratory, Paris | DVM, PhD
France | Emmanuel Guérot | Groupe HEGP | MD
France | Juliette Pavie | Groupe HEGP | MD
France | Georgia Malamut | Groupe HEGP | MD, PhD
France | Bruno Landi | Groupe HEGP | MD
France | Adrien Michon | Groupe HEGP | MD
France | Isabelle Pierre | Groupe HEGP | MD
France | Romain Guillemain | Groupe HEGP | MD
France | Elisabeth Fabre | Groupe HEGP | MD
France | Stéphane Oudard | Groupe HEGP | MD, PhD
France | Alexis Ferré | Groupe HEGP | MD


France | Sarah Roussel | Groupe HEGP | MD
France | Jean Pastre | Groupe HEGP | MD
France | Eric Thervet | Groupe HEGP | MD, PhD
France | Christophe Legendre | Groupe Necker | MD
France | Anne Scemla | Groupe Necker | MD
France | Olivier Hermine | Groupe Necker | MD, PhD
France | David Lebeaux | Groupe Necker | MD
France | Claire Aguilar | Groupe Necker | MD


Supplementary methods

Sample preparation and sequencing

Samples were first spiked with DNA and RNA controls, composed of one RNA bacteriophage and one DNA bacteriophage at final concentrations of 10^5 and 10^6 genomes/ml, respectively, to allow for quality control of the extraction, amplification and sequencing steps. Nucleic acids (DNA and RNA) were extracted from plasma samples and other biological fluids, which had been previously treated with a cocktail of nucleases to reduce contamination by free nucleic acids, using the QIAamp Cador pathogen kit (Qiagen Cat. No. 54104, Hilden, Germany). The extraction of bacterial nucleic acids was performed on whole blood using the MolYsis-5 reagents (Cat. No. D-321-050, Molzym, Bremen, Germany). Extraction of nucleic acids was controlled by quantitative real-time (RT-)PCR based on the TaqMan® technology against target sequences within the bacteriophages introduced as internal controls. First-strand cDNA synthesis was performed using commercially available reagents (SuperScript III First-Strand synthesis system for RT-PCR kit, Life Technologies Cat. No. 18080-051, Marly le Roi, France). Nucleic acids were random-amplified by a multiple displacement amplification (MDA) reaction using the bacteriophage Phi29 DNA polymerase (Qiagen Cat. No. 207043, Hilden, Germany). Amplification of viral and bacterial nucleic acids was controlled by the same (RT-)PCR against internal control sequences (bacteriophages), as described above. High molecular weight DNA resulting from MDA was fragmented with a Covaris M220 ultrasonicator (Woburn, USA) at a power of 50 W and at 200 cycles per burst for 160 seconds. Fragmented DNA was further purified with Agencourt AMPure XP beads (Beckman Coulter, Cat. No. A63880-881, Villepinte, France) and end-repaired (Life Technologies Cat. No. 4471252, Marly le Roi, France). Adapters were ligated and DNA was nick-repaired (Ion Xpress Barcode Adapters kit, Life Technologies Cat. No. 4471250, 4474009, 4474517-21). DNA was then purified with Agencourt AMPure XP beads. This unamplified library was sized to 200 nucleotides using Solid Phase Reversible Immobilization (SPRI) beads (Beckman Coulter, Cat. No. B23317-18). The library was then amplified by PCR using Ion Torrent reagents (Life Technologies, Thermo Fisher, Waltham, USA). After purification of DNA with AMPure XP beads, and quantification and size verification on an Agilent Bioanalyzer 2100 instrument, the library was sequenced on an Ion Proton instrument (Life Technologies, Thermo Fisher, Waltham, USA) with a minimum of 40 million reads per sample, according to the manufacturer’s instructions.

Sequence filtering, mapping, and scoring

Various filtering steps were applied to each sequence. Nucleotides at the extremities were trimmed according to their quality. For the size filter, sequence length had to be greater than 60 nucleotides. Human sequences were then removed by mapping against the human genome (hg19, Burrows–Wheeler transform algorithm modified by PathoQuest SAS). Sequences of Anelloviridae were filtered using a subtractive comparison against a proprietary database of anelloviruses (provided by PathoQuest SAS, version October 2013, Burrows–Wheeler transform algorithm modified by PathoQuest SAS, available on request). The remaining sequences were then mapped against a proprietary database of viruses and bacteria created and maintained by PathoQuest SAS (version mid-2013, database composition available on request, Burrows–Wheeler transform algorithm modified by PathoQuest SAS). This database was built using all available long reference genome sequences downloaded from NCBI at the date of the database (July 2013). We focused only on viral and bacterial species of clinical interest. Species selection was based on the literature (for bacteria) and on the International Committee on Taxonomy of Viruses (ICTV) classification (for viruses). The database, species list and sequences in FASTA format are available on request. Mapping results were deposited in a proprietary, dedicated data warehouse that stores and organises all alignment-related information for each potential target. The bioinformatics pipeline, database schemes and expert rules used for the analysis are proprietary. These tools were developed using industrial-level quality processes and were fully tested (unit tests and acceptance tests).
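The trimming and size-filter steps described above can be illustrated with a minimal sketch. It is not the proprietary PathoQuest pipeline: the function names, the per-base quality threshold (`min_qual=20`) and the end-trimming rule are assumptions made for illustration. Only the requirement that sequence length be greater than 60 nucleotides comes from the text.

```python
def trim_read(seq, quals, min_qual=20):
    """Trim low-quality bases from both extremities of a read.

    min_qual is a hypothetical threshold; the source does not state one.
    """
    start, end = 0, len(seq)
    while start < end and quals[start] < min_qual:
        start += 1
    while end > start and quals[end - 1] < min_qual:
        end -= 1
    return seq[start:end], quals[start:end]


def filter_reads(reads, min_qual=20, min_length=60):
    """Keep reads whose trimmed length is strictly greater than min_length.

    The 60-nucleotide limit is taken from the text; applying it after
    trimming is an assumption.
    """
    kept = []
    for seq, quals in reads:
        trimmed_seq, trimmed_quals = trim_read(seq, quals, min_qual)
        if len(trimmed_seq) > min_length:
            kept.append((trimmed_seq, trimmed_quals))
    return kept
```

Under these assumptions, a read of 80 bases with five low-quality bases at each end is trimmed to 70 bases and kept, whereas a read trimmed to exactly 60 bases is discarded.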

The scoring methodology used various metrics including genome coverage, genome distribution, sequencing depth, and alignment quality. These metrics are tightly associated. Thus, according to the Lander-Waterman theory, a high depth of sequencing of a high-titre pathogen in the sequencing library is necessary to obtain good genome coverage. Our score for each hit (i.e. pathogen) was the cumulative effect of these different metrics (coverage percentage, number of unique blocks/parts of the genome identified [i.e. genome fragmentation], genome distribution, p-value estimation). We sought to normalise the score but this presented a significant challenge: our test is untargeted and we had to deal with the variable nature of different pathogens’ genomes (e.g. circular, DNA, RNA, and segmented) and a heterogeneous amount of background noise (the portion of the sequencing library that targets the host genome rather than pathogens).
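The link between sequencing depth and genome coverage can be made concrete with the standard Lander-Waterman approximation, in which the expected fraction of a genome covered by at least one read is 1 - exp(-c), where c = N × L / G is the redundancy for N reads of length L on a genome of length G. The sketch below is illustrative only; the function and parameter names are hypothetical and do not come from the authors' code.

```python
import math


def expected_coverage_fraction(n_reads, read_length, genome_length):
    """Lander-Waterman estimate of the fraction of the genome covered.

    Assumes reads are placed uniformly at random along the genome.
    """
    redundancy = n_reads * read_length / genome_length  # c = N * L / G
    return 1.0 - math.exp(-redundancy)
```

With a redundancy of 1 (for example, 10 reads of 100 nucleotides on a 1000-nucleotide genome), the expected covered fraction is 1 - 1/e, roughly 63%, which shows why many pathogen reads are needed before coverage approaches completeness.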

As the pathogen capture is random (extraction and amplification), the proportion of a pathogen in the sequencing library is not correlated with its initial titre. The only way to calibrate our score and determine a significance threshold was to perform in silico simulation and biological mock sample generation. Based on our experiments, we determined that a score below 100 should not be considered as a positive hit, but likely as a non-specific alignment with extremely low genome coverage and biased read alignment localisation. Scores between 100 and 1000 are in the grey zone, i.e. the genome was correctly identified but the result could lead to an invalid taxonomic assignment or imprecise taxonomy at the genus level. Scores above 1000 correspond to a well-defined genome (high coverage or high genome coverage distribution) with unambiguous alignments. The aim of the score is therefore to provide a way to discriminate taxonomic assignment quality. It is not possible to use the score to select or emphasise organisms relevant to infections. The score remains a good indicator for background noise detection or nucleic acid contamination identification. Two types of bacteria always have a high score: commensal bacteria located on the skin (e.g. Propionibacterium acnes) and reagent contaminants (e.g. Escherichia coli).
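The three score zones described above can be written as a small classification helper. This is a sketch for illustration only: the function name and labels are hypothetical, and assigning the boundary values 100 and 1000 to the grey zone is an assumption, since the text does not state how exact boundary values are handled.

```python
def classify_score(score):
    """Map a score to the interpretation zones described in the text."""
    if score < 100:
        return "not a positive hit"   # likely non-specific alignment
    if score <= 1000:                 # boundary handling is an assumption
        return "grey zone"            # taxonomic assignment may be imprecise
    return "well-defined genome"      # unambiguous alignments
```

As the text notes, the zone reflects the quality of the taxonomic assignment only and does not indicate clinical relevance.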

The score was generated according to the following formula:

\[
\text{score} = \text{genome coverage} \times \text{genome coverage segment number} \times \text{genome distribution} \times 100
\]


Genome coverage: percentage of nucleotides of the genome that are covered by at least one read or alignment

Genome coverage segment number: number of continuous regions of the genome where all nucleotides are covered by at least one read or alignment

Genome distribution: metric that measures the dispersion of all alignments along the considered genome
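The first two metrics and the score product can be sketched from a list of alignment intervals. This is a hedged illustration, not the authors' code: the function names and the half-open `(start, end)` interval representation are hypothetical, the genome distribution is passed in as a precomputed value, and genome coverage is expressed as a fraction between 0 and 1. The text defines coverage as a percentage but does not state the numeric scale used in the product, so the scale here is an assumption.

```python
def merge_intervals(alignments):
    """Merge overlapping or adjacent half-open (start, end) alignments
    into continuous covered blocks."""
    merged = []
    for start, end in sorted(alignments):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


def coverage_score(alignments, genome_length, genome_distribution):
    """Score = coverage x segment number x genome distribution x 100.

    Coverage is used as a fraction in [0, 1] (assumption, see lead-in).
    """
    blocks = merge_intervals(alignments)
    covered = sum(end - start for start, end in blocks)
    coverage = covered / genome_length
    segment_number = len(blocks)
    return coverage * segment_number * genome_distribution * 100
```

For example, alignments `(0, 100)`, `(50, 150)` and `(300, 400)` on a 1000-nucleotide genome merge into two blocks covering 250 nucleotides, so coverage is 0.25 and the segment number is 2.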

The genome distribution metric depends on the genome coverage segment number and we assumed that the coverage metric follows a discrete uniform distribution. We used this distribution to position expected block positions (attractors) on the targeted genome. We then computed the consistency between observed and expected block positions using a Pearson’s chi-squared test (see Supplementary Figure 1).

Consequently, the genome distribution variable is a good indicator of bias in genome coverage. This metric is directly correlated with the number of blocks, so it cannot be computed if the genome coverage is too low or too high. If the genome coverage comprised fewer than 4 blocks, the value was set to 0.01. If the genome coverage was >50%, the value was set to 1.0. In this case, the binomial distribution may be approximated by a normal distribution. Supplementary Figure 2 presents three genome coverage organisations that generate a low consistency metric, an intermediate score and perfect accordance with the uniform distribution. The score will increase accordingly, reflecting the confidence in the taxonomic assignment for this genome/target.
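One possible reading of the genome distribution metric is sketched below. It is an illustration under stated assumptions and not the authors' scoring code: the genome is divided into as many equal bins as there are covered blocks (one expected block, or attractor, per bin), observed block midpoints are counted per bin, and the p-value of a Pearson chi-squared test is returned as the consistency value. The binning scheme, the use of the p-value as the metric, the order in which the two fixed values are applied, and all function names are hypothetical. Only the fixed values (0.01 for fewer than 4 blocks, 1.0 for coverage above 50%) come from the text.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution (stdlib only)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)            # Q(1, x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))  # Q(1/2, x/2)
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def genome_distribution(blocks, genome_length, coverage_fraction):
    """Consistency of observed block positions with a uniform layout."""
    n_blocks = len(blocks)
    if n_blocks < 4:
        return 0.01   # fixed value from the text
    if coverage_fraction > 0.5:
        return 1.0    # fixed value from the text
    bin_width = genome_length / n_blocks
    observed = [0] * n_blocks
    for start, end in blocks:
        midpoint = (start + end) / 2.0
        observed[min(int(midpoint // bin_width), n_blocks - 1)] += 1
    # Expected count is one block per bin under the uniform assumption.
    statistic = sum((count - 1.0) ** 2 for count in observed)
    return chi2_sf(statistic, n_blocks - 1)
```

With this reading, evenly spread blocks give a value close to 1, whereas blocks clustered in one region of the genome give a value close to 0, which lowers the overall score.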

Python code for the scoring procedures is available on request.

Measures taken to prevent or minimise contamination

Untargeted next-generation sequencing (UNGS) is a highly sensitive technique that can potentially detect minute quantities of nucleic acid. A number of measures were therefore implemented in the workflow to reduce the risk of contamination from each of the following sources:


1. Pre-amplification contamination from the laboratory environment: Cross-contamination is a major issue for UNGS of libraries prepared by random amplification, because each individual nucleic acid molecule has the potential to be amplified to the extent that it will be detected by sequencing. Therefore, the utmost precautions were taken regarding the working environment and sample handling during the process (e.g. nucleic acid decontamination before and after handling the samples).

2. Contamination by nucleic acids present in laboratory reagents: Although traces of nucleic acid contamination may be present in research products such as nucleic acid purification spin columns, the majority comes from enzymes. Every enzyme used in the described procedures (with the exception of nucleases, which degrade nucleic acids) contains traces of the bacterial genome of the host in which it was produced. There is currently no commercially available nucleic acid-free enzyme that could be used in our methodology. Consequently, the enzyme contamination was assessed in order to exclude it from the results. For this reason, Escherichia coli hits were excluded from the report, as it is the preferred host for the expression of recombinant proteins such as enzymes.

3. Sample cross-contamination: The risk of cross-contamination arises during the handling of plates or tubes containing several samples. This is a particular risk for samples tagged with the same barcode for sequencing. During this study, an interval of 2-3 months was therefore observed between uses of the same barcode.
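Two of the measures above lend themselves to simple automated checks: excluding Escherichia coli hits from the report and enforcing a minimum interval between uses of the same barcode. The sketch below is purely illustrative; the names are hypothetical and the 60-day minimum is an assumption standing in for the lower bound of the 2-3 month interval mentioned in the text.

```python
from datetime import date

# Species excluded from the report as reagent contaminants (from the text).
EXCLUDED_SPECIES = {"Escherichia coli"}

# Assumption: 60 days approximates the lower bound of "2-3 months".
MIN_BARCODE_INTERVAL_DAYS = 60


def reportable_hits(hits):
    """Drop hits attributed to known reagent contaminants."""
    return [species for species in hits if species not in EXCLUDED_SPECIES]


def barcode_reusable(last_used, planned_use, min_days=MIN_BARCODE_INTERVAL_DAYS):
    """Return True if enough days have elapsed since the barcode was last used."""
    return (planned_use - last_used).days >= min_days
```

For instance, `barcode_reusable(date(2014, 1, 1), date(2014, 2, 1))` is false under the 60-day assumption, so a different barcode would be chosen for that run.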


Supplementary tables


Supplementary Table 1. Conventional microbiological methods used in the standard procedures

Microbiological method/test

Bacterial diagnosis
    Blood culture and culture of other samples on appropriate media (bacterial identification by MALDI-TOF mass spectrometry)
    Serological diagnosis
    Urine: soluble Legionella and pneumococcal antigen detection
    Respiratory samples: PCR Mycoplasma pneumoniae, Chlamydia pneumoniae, Bordetella pertussis/parapertussis, Legionella pneumophila
    Urogenital samples: PCR Neisseria gonorrhoeae, Mycoplasma spp., Chlamydia trachomatis
    Stool samples: PCR Clostridium difficile
    PCR (any sample): 16S rRNA gene sequencing, PCR Bartonella henselae, Helicobacter pylori, Staphylococcus aureus, Streptococcus pyogenes, Kingella kingae, Listeria monocytogenes, Neisseria meningitidis

Viral diagnosis
    Serological diagnosis
    PCR influenza A/B, rhinovirus, enterovirus, RSV A/B, metapneumovirus, parainfluenza 1, 2, 3, 4, coronavirus, HSV 1/2, VZV, JC virus, BK virus, CMV, EBV, parvovirus B19, norovirus, adenovirus, measles virus, hepatitis virus A/B/C/Delta/E

Fungal diagnosis
    Direct examination
    Culture and fungal identification by MALDI-TOF mass spectrometry
    Galactomannan antigen detection in blood, detection of cryptococcal antigen, 1,3-β-D-glucan detection in blood
    PCR: Aspergillus fumigatus, Pneumocystis jirovecii, Toxoplasma gondii, Microsporidium

Abbreviations: CMV, HHV5, Cytomegalovirus; HSV, HHV1/2, Herpes Simplex Virus 1/2; MALDI-TOF, Matrix Assisted Laser Desorption Ionization – Time-of-Flight; RSV, Respiratory Syncytial Virus; VZV, Varicella Zoster Virus.


Supplementary Table 2. Concordant identifications by UNGS and standard procedures at inclusion

Patient identification number | Immunodeficiency | Clinical presentation and/or site of infection | Pathogen identified | Microbial identification by standard procedures on a sample also tested by UNGS | Microbial identification by standard procedures on a sample not tested by UNGS | Microbial identification by UNGS | Likelihood of infection^a
1 | HIV infection | Fever | Virus HIV | 1 | 0 | 1 | 1
3 | Chemotherapy | Febrile neutropenia | No diagnosis^b | 0 | 0 | 0 | 1
5 | HSCT | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
7 | SOT | Pleurisy | No diagnosis | 0 | 0 | 0 | 0
8 | SOT | Skin lesion | No diagnosis | 0 | 0 | 0 | 0
9 | SOT | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
10 | Other | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
11 | PI | Pneumonia | No diagnosis | 0 | 0 | 0 | 0
12 | HSCT | Fever | No diagnosis | 0 | 0 | 0 | 1
14 | SOT | Pneumonia | No diagnosis | 0 | 0 | 0 | 1


15 | Chemotherapy | Sub-cutaneous abscess | Staphylococcus aureus | 1 | 0 | 1 | 1
20 | SOT | Arthritis | No diagnosis | 0 | 0 | 0 | 0
21 | Other | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
22 | Chemotherapy | Febrile neutropenia + pneumonia | No diagnosis | 0 | 0 | 0 | 1
23 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
28 | SOT | Uveitis | Human Herpes Virus type 5 (CMV) | 1 | 0 | 1 | 1
30 | Chemotherapy | Skin lesion | No diagnosis | 0 | 0 | 0 | 1
31 | Chemotherapy | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1


32 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
34 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
35 | HSCT | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
36 | PI | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
38 | Chemotherapy | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
39 | SOT | Skin infection | Human Herpes Virus type 3 (VZV) | 1 | 0 | 1 | 1
40 | SOT | Skin infection | Pseudomonas sp. | 1 | 0 | 1 | 1
43 | PI | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
44 | Chemotherapy | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
46 | SOT | Fever | No diagnosis | 0 | 0 | 0 | 0


48 | HSCT | Flu like syndrome | Influenza virus A | 0 | 0 | 0 | 1
53 | HSCT | Disseminated infection (lung, muscle, skin) | Nocardia sp. | 0 | 0 | 0 | 1
57 | SOT | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
58 | PI | Fever | No diagnosis | 0 | 0 | 0 | 0
59 | SOT | Pneumonia | Pseudomonas aeruginosa | 1 | 0 | 1 | 1
60 | HSCT | Liver lesion (GVH) | No diagnosis | 0 | 0 | 0 | 0
61 | SOT | Pneumonia | Stenotrophomonas maltophilia | 1 | 0 | 1 | 1


62 | Other | Fever | No diagnosis | 0 | 0 | 0 | 0
63 | HSCT | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
64 | Other | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
65 | Chemotherapy | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
66 | Chemotherapy | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
68 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
69 | Other | Myositis | No diagnosis | 0 | 0 | 0 | 1
70 | PI | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
71 | SOT | Pneumonia | Streptococcus anginosus | 1 | 0 | 1 | 1
72 | Other | Laryngitis | No diagnosis | 0 | 0 | 0 | 1
76 | Other | Pneumonia | No diagnosis | 0 | 0 | 0 | 1


77 | SOT | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
81 | Chemotherapy | Meningitis | No diagnosis | 0 | 0 | 0 | 0
83 | SOT | Liver lesion | No diagnosis | 0 | 0 | 0 | 1
86 | Chemotherapy | Colitis | No diagnosis | 0 | 0 | 0 | 1
88 | Chemotherapy | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
89 | PI | Pneumonia | No diagnosis | 0 | 0 | 0 | 1
90 | Chemotherapy | Fever | No diagnosis | 0 | 0 | 0 | 1
91 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
92 | PI | Febrile adenomegalies | No diagnosis | 0 | 0 | 0 | 0
94 | HSCT | Febrile neutropenia | No diagnosis | 0 | 0 | 0 | 1
96 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1
98 | Chemotherapy | Fever | No diagnosis | 0 | 0 | 0 | 1
101 | Other | Surgical site infection | Finegoldia magna | 1 | 0 | 1 | 1
102 | Other | Skin infection | No diagnosis | 0 | 0 | 0 | 1
103 | Other | Fever | No diagnosis | 0 | 0 | 0 | 1

Abbreviations: CMV, Cytomegalovirus; HSCT, hematopoietic stem cell transplantation; PI, primary immunodeficiency; SOT, solid organ transplantation; UNGS, untargeted next-generation sequencing.

a Evaluated at the end of the trial using all available information.

b No diagnosis: No clinically-relevant viruses or bacteria were identified as being the agents responsible for the patient’s symptoms after consideration of all the available data and documentation by the expert panel of clinicians.


Supplementary Table 3. Divergent identifications by UNGS and standard procedures at inclusion


Pathogen | Microbial identification by standard procedures on a sample also tested by UNGS | Microbial identification by standard procedures on a sample not tested by UNGS | Microbial identification by UNGS | Likelihood of infection^a
Candida albicans | 1 | 0 | 0 | 1
Aspergillus sp. | 0 | 1 | 0 | 1
Aspergillus sp. | 0 | 1 | 0 | 1
Clostridium sp. | 0 | 1 | 0 | 1
Nocardia farcinica | 0 | 1 | 0 | 1
Proteus mirabilis | 0 | 1 | 0 | 1
Human Herpes Virus type 1 (HSV1) | 0 | 1 | 0 | 1
Candida albicans | 0 | 1 | 0 | 1
Rhinovirus A | 0 | 0 | 1 | 1
Porphyromonas gingivalis | 0 | 0 | 1 | 1
Adenovirus B | 0 | 0 | 1 | 1
Pseudomonas sp. | 0 | 0 | 1 | 1
Microsporidium sp. | 0 | 1 | 0 | 1
Microsporidium sp. | 0 | 1 | 0 | 1
Enterobacter aerogenes | 0 | 1 | 0 | 1
Klebsiella pneumoniae | 0 | 1 | 0 | 1
Pseudomonas sp. | 0 | 0 | 1 | 1
Pseudomonas sp. | 0 | 0 | 1 | 1
Pseudomonas sp. | 0 | 0 | 1 | 1


Pseudomonas sp. 0 0 1 1

Aspergillus sp. 1 0 0 1

Pseudomonas sp. 0 0 1 1

Proteus mirabilis 0 1 1 1

Pseudomonas sp. 0 0 1 1

Influenza virus A 0 0 0 1

Campylobacter

jejuni

0 1 0 1

Human rhinovirus 0 1 0 1

Enterococcus

faecium

0 1 0 1

Staphylococcus

haemolyticus

0 1 0 1

Stenotrophomonas

maltophilia

0 1 0 1

Staphylococcus

warneri

0 0 1 1

Pseudomonas sp. 0 0 1 1

Nocardia sp. 0 0 0 1

Escherichia coli 0 1 0 1

Streptococcus

pneumoniae

0 1 0 1

275354

Page 28:  · Web viewSamples were first spiked with DNA and RNA controls composed of one RNA, and one DNA, bacteriophage at a final concentration of 105 and 106 genomes/ml, respectively, to

Pathogen Microbial

identification by

standard

procedures on a

sample also

tested by UNGS

Microbial

identification by

standard

procedures on a

sample not

tested by UNGS

Microbial

identification

by UNGS

Likelihood of

infectiona

Streptococcus

intermedius

0 0 1 1

Pseudomonas sp. 0 0 1 1

Escherichia coli 1 0 0 1

Pseudomonas sp. 0 0 1 1

Streptococcus

pneumoniae

0 0 1 1

Pseudomonas sp. 0 0 1 1

Pseudomonas sp. 0 0 1 1

Enterococcus

faecium

0 1 0 1

Pseudomonas sp. 0 0 1 1

Pseudomonas sp. 0 0 1 1

Pseudomonas sp. 0 0 1 1

Pseudomonas sp. 0 0 1 1

Stenotrophomonas

maltophilia

0 0 1 1

Streptococcus

pneumoniae

0 0 1 1

Acinetobacter

baumannii

0 0 1 1

Pseudomonas sp. 0 0 1 1

285556

Page 29:  · Web viewSamples were first spiked with DNA and RNA controls composed of one RNA, and one DNA, bacteriophage at a final concentration of 105 and 106 genomes/ml, respectively, to

Pathogen Microbial

identification by

standard

procedures on a

sample also

tested by UNGS

Microbial

identification by

standard

procedures on a

sample not

tested by UNGS

Microbial

identification

by UNGS

Likelihood of

infectiona

Acinetobacter

baumannii

0 0 1 1

Streptococcus

pneumoniae

0 0 1 1

Bacteroides fragilis 0 0 1 1

Finegoldia magna 0 0 1 1

Anaerococcus

prevotii

0 0 1 1

Streptococcus

parasanguinis

0 0 1 1

CMV 1 0 0 1

Acinetobacter

baumannii

0 0 1 1

Aspergillus sp. 0 1 0 1

Abbreviations: CMV (HHV5), Cytomegalovirus; UNGS, untargeted next-generation sequencing.

a Evaluated at the end of the trial using all available information.

Supplementary Table 4. Clinically-relevant viruses and bacteria in samples with concordant results for UNGS and standard procedures

Number of diagnoses (N=9)

| Viruses | All | Antimicrobial treatment |
| --- | --- | --- |
| HIV | 1 | |
| CMV | 1 | Ganciclovir IV |
| VZV | 1 | Aciclovir IV |

Bacteria (columns: All, Active antimicrobial therapy, Inactive antimicrobial therapy, No therapy)

Staphylococcus aureus 1 1 0 0

Pseudomonas spp. 2 1 1

Stenotrophomonas maltophilia 1 1

Streptococcus spp. 1 1

Finegoldia magna 1 1

Abbreviations: CMV (HHV5), Cytomegalovirus; HIV, Human Immunodeficiency Virus; UNGS, untargeted next-generation sequencing; VZV, Varicella Zoster Virus.

Supplementary Table 5. Comparison of UNGS at inclusion with standard procedures running for 30 days for clinically-relevant viruses, bacteria and fungi

| | Standard procedures-positive [a] | Standard procedures-negative | Total no. of patients (%) |
| --- | --- | --- | --- |
| UNGS-positive | 13 [b] | 23 | 36 (36) [c] |
| UNGS-negative | 13 | 52 [b] | 65 (64) |
| Total no. of patients (%) | 26 (26) [c] | 75 (74) | 101 |

Abbreviations: CI, confidence interval; CRVB, clinically-relevant viruses and bacteria; SP, standard procedures; UNGS, untargeted next-generation sequencing.

a Positive: patient with a microbiological diagnosis.

b UNGS and SP were concordant for 65 of 101 patients (kappa = 0.17 [95% CI, -0.02 to 0.37]).

c The detection rate of CRVB by UNGS and SP was not significantly different based on the McNemar test: P=0.133.
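
The agreement and detection-rate statistics in footnotes b and c can be re-derived from the four cell counts of Supplementary Table 5. The sketch below is an illustration only, not the authors' analysis code. It assumes an unweighted Cohen's kappa and a two-sided exact (binomial) McNemar test, neither of which is specified in this appendix, and the function names are hypothetical.

```python
from math import comb

# Cell counts from Supplementary Table 5 (rows: UNGS, columns: standard procedures).
both_pos, ungs_only, sp_only, both_neg = 13, 23, 13, 52


def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Unweighted Cohen's kappa for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = (a + d) / n  # proportion of concordant patients
    # Agreement expected by chance from the row and column totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the discordant counts b and c."""
    n = b + c
    # Binomial(n, 0.5) probability of a count at most min(b, c), doubled.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)


kappa = cohen_kappa(both_pos, ungs_only, sp_only, both_neg)
p_value = mcnemar_exact_p(ungs_only, sp_only)
print(f"kappa = {kappa:.2f}, exact McNemar P = {p_value:.4f}")
```

Under these assumptions the point estimate of kappa rounds to 0.17, as in footnote b, and the exact P value is about 0.13, in line with the non-significant result in footnote c. Only the 23 and 13 discordant patients enter the McNemar test; the 65 concordant patients affect kappa alone. The confidence interval for kappa is not reproduced here because it depends on a variance formula the appendix does not state.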

Supplementary figures

Figure S1. Genome distribution metric calculation

Figure S2. Three different genome coverage and distribution models
