Protein identification and quantification by data-independent acquisition and multi-parallel collision-induced dissociation mass spectrometry (MSE) in the chloroplast stroma proteome

Stefan Helm, Dirk Dobritzsch, Anja Rödiger, Birgit Agne, Sacha Baginsky

Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle (Saale), Germany

E-mail address: sacha.baginsky@biochemtech.uni-halle.de (S. Baginsky).

Journal of Proteomics XX (2013) XXX–XXX. JPROT-01649; No of Pages 11. Uncorrected proof. Available online at www.sciencedirect.com (www.elsevier.com/locate/jprot). 1874-3919/$ see front matter © 2013 Published by Elsevier B.V. http://dx.doi.org/10.1016/j.jprot.2013.12.007

Please cite this article as: Helm S, et al, Protein identification and quantification by data-independent acquisition and multi-parallel collision-induced dissociation mass ..., J Prot (2013), http://dx.doi.org/10.1016/j.jprot.2013.12.007

Article history: Received 6 September 2013; Accepted 2 December 2013

Abstract

We report here a systematic evaluation of a multiplex mass spectrometry method coupled with ion mobility separation (HD-MSE) for the identification and quantification of proteins in the chloroplast stroma. We show that this method allows the robust quantification of reference proteins in mixtures, and it detects concentration differences with high sensitivity when three replicas are performed. Applied to the analysis of the chloroplast stroma proteome, HD-MSE identified and quantified many chloroplast proteins that were not previously identified in large-scale proteome analyses, suggesting HD-MSE as a suitable complementary tool for discovery proteomics. We find that HD-MSE tends to underestimate protein abundances at concentrations above 25 fmol, which is likely due to ion transmission loss and detector saturation. This limitation can be circumvented by omitting the ion mobility separation step in the HD-MSE workflow. The robustness of protein quantification is influenced by the selection of peptides and their intensity distribution; therefore, critical scrutiny of quantification results is required. Based on the HD-MSE quantification of chloroplast stroma proteins we performed a meta-analysis and compared published quantitative data with our results, using a parts per million normalization scheme. Important pathways in the chloroplast stroma show quantitative stability against different experimental conditions and quantification strategies.

Biological significance

Our analysis establishes MSE-based Hi3 quantification as a tool for the absolute quantification of proteins in the chloroplast stroma. The meta-analysis performed with a parts per million normalization scheme shows that quantitative proteomics data acquired in different labs and with different quantification strategies yield comparable results for some metabolic pathways, while others show a higher variability. Our data therefore indicate that such meta-analyses allow distinguishing robust from fine-controlled metabolic pathways.

Keywords: MSE; Absolute quantification; Chloroplast stroma proteome; ppm normalization; Proteome meta-analysis



Introduction

Large-scale protein quantification by data independent multi-parallel collision induced dissociation (MSE) mass spectrometry

Quantitative proteomics comes in various flavors, and many methods and approaches were developed that enable


experimentalists to adapt protein quantification to their particular purpose [1–10]. In recent years, new mass spectrometric acquisition techniques were introduced that sample in a data-independent fashion to circumvent problems with the stochastic nature of peptide sampling in data-dependent acquisition. One of these methods is referred to as MSE. In MSE, the mass spectrometer switches between low and high collision energy with quadrupole settings that allow all precursor ions within a chosen molecular mass window to pass through. Thus, all peptides within the mass window will be detected and fragmented. The downside of this approach is the lost connection between precursor and fragment ions, and algorithms are required to re-establish this connection by aligning elution profiles of precursor and product ions in the chromatography. On the other hand, the rapid cycling between low and high collision energy and the fact that all peptides are fragmented preserve more accurate quantitative information compared with the standard data-dependent acquisition modes. Thus MSE is well suited for label-free quantification based on peptide extracted ion chromatograms (XIC) [11]. High-definition (HD)-MSE is based on the same principle as MSE but additionally uses ion mobility separation (IMS) as a further peptide separation step. Because precursor ions are fragmented after the IMS cell, ion mobility is an additional characteristic feature for the alignment of precursor and product ions. This increases its sensitivity compared to MSE alone because more peptides can be assigned with higher confidence [12].
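The re-association of fragment ions with their precursors by elution profile can be illustrated with a toy sketch. The Python snippet below is not the algorithm of any vendor software. It assumes, purely for illustration, that every ion is available as a list of intensities sampled at the same retention-time points (the dictionary format, the function names and the threshold `min_r` are hypothetical), and it assigns each fragment to the precursor whose elution profile it correlates with best.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat trace carries no elution-shape information
    return cov / (sx * sy)


def assign_fragments(precursors, fragments, min_r=0.9):
    """Map each fragment name to the best-correlating precursor name.

    precursors, fragments: dict name -> list of intensities (hypothetical format).
    Fragments whose best correlation stays below min_r remain unassigned (None).
    """
    assignment = {}
    for f_name, f_trace in fragments.items():
        best_name, best_r = None, min_r
        for p_name, p_trace in precursors.items():
            r = pearson(p_trace, f_trace)
            if r >= best_r:
                best_name, best_r = p_name, r
        assignment[f_name] = best_name
    return assignment


# Toy low-energy (precursor) and high-energy (fragment) traces, invented values.
precursors = {"pepA": [0, 5, 20, 5, 0, 0, 0], "pepB": [0, 0, 0, 4, 16, 4, 0]}
fragments = {
    "y5": [0, 1, 4, 1, 0, 0, 0],
    "b3": [0, 0, 0, 2, 8, 2, 0],
    "noise": [1, 1, 1, 1, 1, 1, 1],
}
print(assign_fragments(precursors, fragments))
```

In HD-MSE the drift time would serve as a second coordinate for the same kind of matching; the sketch uses retention time only to stay short.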

MSE allows absolute protein quantification by comparing the intensity read-out of proteotypic sample peptides with those of an internal standard. The correlation between the signal intensity of up to three most intense ions and protein concentration is used to infer the abundance of proteins in the sample. Silva and colleagues have shown that the count of measured signal intensity per amount of protein (the so-called response factor) is constant for all proteins tested [11], provided that the three peptides with the highest XIC read-out can be used for response factor calculation. Recent analyses showed that this quantification strategy is not restricted to MSE, but also works in the data-dependent acquisition mode [13]. The advantage of this approach lies in its applicability to samples of low complexity. Furthermore, it enables the quantification of proteins from non-model crop plant species, because only three proteotypic peptides must be identified for quantification, thus increasing the chances for homology-based peptide detection [11,14].
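The response-factor idea can be written down in a few lines. The following Python sketch is an illustration under stated assumptions, not the implementation used by the acquisition software: the function names and all intensity values are hypothetical, and the three most intense peptides are summed for both the protein of interest and the internal standard.

```python
def top3_sum(intensities):
    """Sum of the three highest peptide intensities (fewer if fewer are given)."""
    return sum(sorted(intensities, reverse=True)[:3])


def hi3_amount(protein_intensities, standard_intensities, standard_fmol):
    """Estimate a protein amount in fmol from top-three intensity sums."""
    # Response factor: signal counts per fmol, derived from the internal standard.
    response_factor = top3_sum(standard_intensities) / standard_fmol
    return top3_sum(protein_intensities) / response_factor


# Hypothetical intensities (arbitrary counts); internal standard spiked at 10 fmol.
standard = [9000, 8000, 7000, 1200]
protein = [5000, 4000, 3000, 800, 100]
print(hi3_amount(protein, standard, standard_fmol=10))
```

The estimate is only as good as the assumption that the response factor is the same for both proteins, which is why proteins with fewer than three usable peptides deserve extra scrutiny.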

Quantification of chloroplast stroma proteins

The chloroplast proteome has been studied intensely and high-quality plastid proteome maps are now available [15,16]. However, quantitative information on the identified proteins is not easily accessible from the published data. Many quantitative chloroplast proteome analyses used either 2-dimensional gel electrophoresis or spectral counting. For example, Bischof and colleagues used normalized spectral counting (nSpC) to compare the proteomes of wildtype plastids with non-photosynthetic plastids of a plastid protein import mutant [17]. Motohashi and colleagues used nSpC to identify protein regulons in different albino/pale-green mutants [18]. Kim and colleagues used normalized spectrum abundance factor (NSAF)


quantification to compare the wildtype chloroplast proteome with that of a clp mutant [19]. All analyses were carried out at the level of the entire cell, and proteins were allocated to the chloroplast based on proteome reference tables a posteriori. Although each of the aforementioned studies addressed a specific hypothesis, proteins in wildtype chloroplasts were quantified in all of them. So far, this quantitative information for enzymes of chloroplast metabolism has not been used for metabolic modeling or other predictive analyses of organellar metabolism. This is because protein quantification was performed in different labs and under different conditions, and because different quantification and normalization schemes were used.

We report here a systematic assessment of HD-MSE-based protein quantification in the chloroplast stroma and compare our data with published information on chloroplast protein quantities. To make the different data sets comparable, we normalized protein abundance with the parts per million normalization scheme that was originally developed for a quantitative metaproteome analysis between Arabidopsis, Drosophila and Caenorhabditis elegans [20,21]. Our analysis shows that the quantification of proteins by HD-MSE yields robust results, despite the wide dynamic range of protein concentrations in the chloroplast stroma. In individual cases, careful examination of peptide quantification characteristics may be necessary. This concerns the question of which peptides were selected for XIC measurements as well as the deviation of XIC among the three most intense proteotypic peptide ions. Both parameters constitute a potential error source for quantification accuracy. The HD-MSE-based quantification showed that the quantitative distribution of proteins in different chloroplast functions is surprisingly robust between different published studies. This also includes the distribution of enzymes within one pathway, suggesting that quantitative meta-analyses as reported here are suitable to reveal basic robustness principles in cellular metabolism.

Materials and methods

Materials

LC-MS grade solvents, including water with 0.01% (w/v) formic acid, water with 0.1% (w/v) trifluoroacetic acid and acetonitrile with 0.1% (w/v) formic acid, were obtained from Carl Roth (Karlsruhe, Germany). Porcine sequencing grade modified trypsin was obtained from Promega (Mannheim, Germany).

Plant material

Arabidopsis thaliana plants (Columbia-0) were used for the preparation of stroma extracts. These were grown either on plates with ½ MS medium containing 0.8% sucrose at 22 °C, 8 h light, 16 h darkness, 150 μmol m−2 s−1 (replicate 1), or on soil in the greenhouse (replicates 2 and 3). All plants were harvested on day 21–28. Cell walls were either enzymatically digested (cellulase 0.015 g/mL, macerozyme 0.0375 g/mL) [17] or disintegrated in a blender as described earlier [22]. Chloroplasts were isolated on Percoll (GE Healthcare, Solingen, Germany) gradients. Purified intact chloroplasts were then lysed in hypotonic medium. The


stromal plastid subfraction was purified on a sucrose gradient by removing membrane components such as thylakoid and envelope membranes as described previously [22].

In-solution digestion

Lysed stroma samples were centrifuged (4 °C, 21,500 × g, 5 min) to remove remaining membrane shreds and protein concentration was determined with Bradford assays [23]. 100 μg stromal proteins were precipitated in 80% acetone at −80 °C overnight and pelleted (4 °C, 21,500 × g, 10 min). Proteins were solubilized in 233.5 μL 25 mM ammonium bicarbonate containing 0.05% (w/v) RapiGest SF (Waters, Eschborn, Germany) for 10 min at 80 °C, and reduced by adding 6.2 μL 10 mM dithiothreitol for a further 10 min at 60 °C. Cysteine residues were alkylated using 6.2 μL 30 mM iodoacetamide for 30 min in the dark. Subsequently, 4 μL trypsin (0.25 mg/mL, final concentration: 1:100 (w/w) corresponding to 1 μg protease per vial) was added. Samples (final volume 250 μL) were digested (37 °C, overnight). To stop the digestion at pH values below 2.0, 2 μL 37% HCl was added. To avoid loading insoluble material onto the C18 column, the peptide solutions were thoroughly centrifuged prior to sample loading (4 °C, 21,500 × g, 30 min).
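The volumes and ratios in this protocol can be checked with a short calculation. The Python sketch below only restates the numbers given above; it assumes complete protein recovery after acetone precipitation and ignores the 2 μL of HCl added afterwards, so the per-microlitre value is an upper estimate.

```python
# Volumes in microlitres, as listed in the digestion protocol.
volumes_ul = {
    "ammonium bicarbonate with RapiGest": 233.5,
    "dithiothreitol": 6.2,
    "iodoacetamide": 6.2,
    "trypsin": 4.0,
}
total_ul = sum(volumes_ul.values())

protein_ug = 100.0  # precipitated stromal protein, assuming full recovery
trypsin_ug = volumes_ul["trypsin"] * 0.25  # 0.25 mg/mL equals 0.25 ug/uL

# Protein content of one microlitre of digest, in nanograms.
protein_per_ul_ng = protein_ug / total_ul * 1000

print(f"total volume: {total_ul:.1f} uL")
print(f"trypsin: {trypsin_ug:.2f} ug, ratio 1:{protein_ug / trypsin_ug:.0f} (w/w)")
print(f"protein per uL of digest: {protein_per_ul_ng:.0f} ng")
```

The result of roughly 400 ng per microlitre agrees with the amount stated for a 1 μL injection in the nano-LC section.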

Nano-LC separation

For analytical runs we injected 1 μL of the in-solution digest, containing around 400 ng protein, into the chromatography system. Peptides were separated on an ACQUITY UPLC System (Waters, Eschborn, Germany) equipped with a 200 mm × 180 μm fused silica trap column packed with 5 μm Symmetry C18 (Waters, Eschborn, Germany) as well as a 250 mm × 75 μm fused silica separation column packed with 1.8 μm HSS T3 C18 (Waters, Eschborn, Germany). After injection of 1 μL, peptides were trapped for 5 min at 5 μL/min at 1% B (A: 0.1% trifluoroacetic acid in water, B: 0.1% formic acid in ACN) and separated at 300 nL/min in a linear gradient of 7–35% B (A: 0.1% formic acid in water, B: 0.1% formic acid in ACN) within 140 min.

MS: data independent acquisition HD-MSE with ion mobility separation

Nano-LC-HD-MSE data were acquired for three biological replicates, and for every biological replicate three technical replicates were performed. Eluting peptides were ionized at 2.1 kV and 80 °C in a Waters nanoESI source using a Pre-cut PicoTip Emitter (360 μm OD × 20 μm ID, 10 μm tip, 2.5′′ long; Waters, Eschborn, Germany) with the cone voltage set to 40 V. Nitrogen was used for the cone, nano flow and purge gases with 10 L/h, 0.4 bar and 450 L/h, respectively. Intact peptide mass spectra and fragmentation spectra were acquired on a SYNAPT G2-S mass spectrometer (Waters, Eschborn, Germany) in resolution mode with positive ionization. For ion mobility separation, ions were accumulated in the trap cell with a release time of 500 μs and afterwards “cooled” for 1000 μs (mobility separation delay after trap release) in the helium cell that is located between trap and ion mobility separation (IMS) cell. Therefore, the helium pressure in this


cell was set to 4.7 mbar. The subsequent ion separation, which is based on the combination of ion size, shape and charge, occurred in the IMS cell, which was filled with nitrogen at a pressure of 2.87 mbar. For optimal ion mobility separation, wave height (38 V) and wave velocity (ramped from 1200 m/s to 400 m/s) were optimized. Other parameters were maintained at the predetermined, instrument-specific settings. Acquisition time was 140 min in a range of 50–2000 Da with 1 s scan time. CID was achieved with argon as collision gas; in function 2 (high energy), the transfer collision energy was ramped from 25 to 55 V. Glu-Fib (Glu-1-Fibrinopeptide B, 250 fmol/μL, 0.3 μL/min) was used as lock mass (m/z = 785.8426, z = 2) and mass correction was applied to the spectra during data processing. For MSE data acquisition the ion mobility separation was disabled, resulting in a helium pressure of 3.13e−4 mbar in the helium cell and a nitrogen pressure of 1.51e−4 mbar in the IMS cell. All other acquisition parameters were retained as described above.

Data processing

Data analysis was carried out with ProteinLynx Global Server (PLGS 3.0, Apex3D algorithm v. 2.128.5.0, 64 bit, Waters, Eschborn, Germany) with automated determination of chromatographic peak width as well as MS TOF resolution. The lock mass value for charge state two was defined as 785.8426 Da/e. The lock mass window was set to 0.25 Da. The low/high energy threshold was set to 200/20 counts, respectively. Elution start time was 5 min, and the intensity threshold was set to 750 counts.

Database searching

The database search (PLGS workflow) was carried out as follows: peptide and fragment tolerance were set to automatic, with two fragment ion matches per peptide, at least five fragment ion matches per protein, and a minimum of two peptides per protein. The maximum protein mass was set to 250 kDa. The primary digest reagent was trypsin with one missed cleavage allowed. According to the digest protocol, fixed (carbamidomethyl on Cys) as well as variable (oxidation on Met) modifications were set. The false discovery rate (FDR) was set to 4% at the protein level. Furthermore, we defined 5 or 10 fmol/injection (as indicated) rabbit glycogen phosphorylase B (P00489) as internal calibration standard. MSE data were searched against the modified A. thaliana database (TAIR10, ftp://ftp.arabidopsis.org) containing common contaminants such as keratin (ftp://ftp.thegpm.org/fasta/cRAP/crap.fasta) as well as a set of 6 proteins used as internal standard for the quantification. Redundant entries as well as splice variants were removed for database searching. XML data generated by ProteinLynx Global SERVER (“PRIDE plugin”) have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the data set identifier PXD000446.

Statistical analysis, reference protein mixtures and data normalization

Three independent biological replicates were analyzed at least three times and data were analyzed by separate t-tests (two-sided, Welch test). For quantitative analyses


with reference protein mixtures, we used the following concentrations of predigested reference proteins that were obtained from Waters (Eschborn, Germany). Mix A: 20 fmol enolase (ENO) (P00924), 40 fmol alcohol dehydrogenase (ADH) (P00330), 10 fmol ClpB (P63284), 30 fmol hemoglobin A/B (HBA) (P01966, P02081); Mix B: 1:2 dilution of Mix A; Mix C: 1:10 dilution of Mix A; Mix I: 2 fmol ENO, 5 fmol ADH, 8 fmol ClpB, 5 fmol HBA; Mix II: 3 fmol ENO, 7 fmol ADH, 10 fmol ClpB, 6 fmol HBA; Mix III: 4 fmol ENO, 10 fmol ADH, 12 fmol ClpB, 8 fmol HBA. Mixes A–C were measured with glycogen phosphorylase B (GPB) (P00489) as internal standard at a concentration of 10 fmol (added after the dilution); Mixes I–III were measured with GPB as internal standard at a concentration of 5 fmol. The lyophilized protein powder was solubilized in 1 mL water and further diluted and mixed as described above (Table 1). To compare information from current as well as published data sets, protein abundances were converted into parts per million (ppm) values [20,21]. As a precondition, the data sets were filtered using a previously published proteome reference table [15] mainly derived from curated information from the SUBA3 database (suba.plantenergy.uwa.edu.au). Quantities (Hi3, number of spectra) per data set were summed and set as one million. Each protein quantity was then expressed as its proportion of the data set total to obtain its ppm value.
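The ppm conversion described above amounts to rescaling every data set to a common total. A minimal Python sketch follows; the function name, the protein labels and all quantities are hypothetical and serve only to show that Hi3 amounts and spectral counts become comparable after scaling.

```python
def to_ppm(quantities, reference=None):
    """Convert a dict protein -> quantity into parts per million of the data set.

    reference: optional set of protein names (e.g. a proteome reference table);
    proteins outside this set are dropped before the total is computed.
    """
    if reference is not None:
        quantities = {p: q for p, q in quantities.items() if p in reference}
    total = sum(quantities.values())
    if total <= 0:
        raise ValueError("data set has no positive quantities")
    return {p: q / total * 1_000_000 for p, q in quantities.items()}


# Hypothetical data sets: Hi3 amounts (fmol) and spectral counts.
stroma_reference = {"protein_1", "protein_2", "protein_3"}
hi3 = {"protein_1": 30.0, "protein_2": 15.0, "protein_3": 5.0, "contaminant": 50.0}
spectra = {"protein_1": 120, "protein_2": 60, "protein_3": 20}

ppm_hi3 = to_ppm(hi3, reference=stroma_reference)
ppm_spectra = to_ppm(spectra, reference=stroma_reference)
print(ppm_hi3)
print(ppm_spectra)
```

Because the total depends on which proteins pass the filter, the same reference table has to be applied to every data set that enters the comparison.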


Results and discussion

Robustness of HD-MSE-based protein quantification in reference protein mixtures

We assessed the robustness of HD-MSE-based protein quantification with standardized samples of reference proteins that were mixed together in different abundance combinations. Rabbit glycogen phosphorylase (P00489) was used as internal standard at a fixed concentration of 5 or 10 fmol as indicated. We restricted our analysis to the detection of proteins in the range between 0.5 and 25 fmol, because this range is most relevant for the biological sample under our experimental settings and because HD-MSE quantification works best in this abundance window (see below and [24]). We find that the HD-MSE abundance measurement correctly reproduces the spiked abundances of the reference proteins with very little variation between the three analytical replicates (Fig. 1A). There is a small but consistent deviation between spiked and measured protein abundance that affects all reference proteins to a similar extent, although several measurements were performed independently. Such a systematic deviation is probably due to minor variations in the GPB concentration between batches. Although this does not pose a problem for comparative measurements, as is obvious from the dilution experiments (see below), the abundance value reported by the software may slightly deviate from the true abundance of the protein in the biological sample. We therefore henceforth refer to the absolute measured abundance of a protein as “fmol equivalents”.

To mimic a biological standard experiment, we asked whether HD-MSE correctly reproduces the abundance difference between two proteins in two different mixes and whether the measured difference is sufficient to be statistically


significant. To this end, we calculated p-values from a two-sided t-test from three analytical replicates for every protein in two different samples. As depicted in Fig. 1B, higher abundance differences between two proteins result in smaller p-values for the rejection of the null hypothesis (proteins have the same concentration), as expected. The minimal difference between two proteins that allowed rejecting the null hypothesis at a p-value < 0.05 was 25% between two enolase preparations (Fig. 1B). We furthermore performed a dilution experiment and compared the true dilution factor with the reported ones. The data are presented in Fig. 1C. We find good agreement in a dilution range of 1:2, 1:5 and 1:10; thus, abundance changes in this range can be measured confidently. It should be noted, however, that this requires protein amounts in the linear range of quantitative HD-MSE, which is the range between 0.5 fmol and at maximum 25 fmol. Above 25 fmol, HD-MSE quantification shows no linear response to changes in protein concentration (see below and [24]).
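The two-sided Welch test used for these comparisons can be sketched as follows. The Python snippet computes the test statistic and the Welch-Satterthwaite degrees of freedom with the standard library only; the replicate values are invented for illustration and are not taken from Fig. 1. Converting the statistic into a p-value needs the t-distribution, for which a statistics package is the practical route (for example, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs Welch's test).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two sample means (sample variance / n).
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical fmol read-outs for one protein in two mixes, three replicates each.
mix_1 = [2.0, 2.1, 1.9]
mix_2 = [2.5, 2.6, 2.4]
t, df = welch_t(mix_1, mix_2)
print(f"t = {t:.2f}, df = {df:.1f}")
```

With only three replicates per group the degrees of freedom are small, so a difference has to be large relative to the replicate scatter before it reaches significance.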

MSE and HD-MSE-based protein quantification of reference proteins in chloroplast stroma preparations

To assess the reliability of HD-MSE-based protein quantification in a complex proteome, we spiked four different proteins in different abundance combinations into replicate measurements of chloroplast stroma preparations at concentrations between 0.5 fmol and 100 fmol. Rabbit glycogen phosphorylase (P00489) was used as internal standard at a fixed concentration of 10 fmol. The concept of the Hi3 quantification method is illustrated in Fig. 2A on four sample proteins. For every protein, the three peptides with the highest intensity are used for the quantification. The sum of the ion intensities is then related to an internal standard of known concentration (see also above). The abundance of a protein is calculated from this intensity ratio. Within one biological replicate, the abundance ranking of the spiked proteins is correctly reproduced by the quantification algorithm, while the ranking shows some deviation between different replicates (Table 1, Suppl. Table 1 and Suppl. Fig. 1). For example, the spiked concentration of ClpB was 75 fmol in replicate 1, 100 fmol in replicate 2 and 25 fmol in replicate 3, while the calculated abundances were 66.19 fmol (replicate 1), 57.18 fmol (replicate 2) and 31.42 fmol (replicate 3). This illustrates the limited dynamic range of quantitative HD-MSE measurements, which makes concentrations above approximately 25 fmol difficult to quantify accurately [24].

There is a deviation between the calculated absolute amount and the spiked amount of some reference proteins that can be attributed to the different ionization properties of the individual peptides. For most reference proteins, the intensity differences between the Hi3 peptides are small (see GPB, ClpB and ADH in Fig. 1) and differences in their ionization are largely averaged out [11]. The largest deviation was found for one HBA concentration where the best ionizing peptide produced a much higher intensity read-out than the second best (Fig. 2B). Large differences in the ionization properties of the three best ionizing peptides produce quantification inaccuracies that become visible by increased standard deviations of the quantification results in replicates (Suppl. Fig. 2). These are smaller for proteins with well-balanced peptide


Table 1 – Documentation of quantification results of reference proteins in the chloroplast stroma preparation. Provided is the mean value from three technical replicates.

Protein (identifier) | Spiked fmol | Mean value fmol | Standard deviation | % deviation | Peptides used for quantification
GPB (P00489) | 10 | | | | VIFLENYR, NLAENISR, LITAIGDVVNHDPVVGDR, TNFDAFPDK, APNDFNLK, INMAHLCIAGSHAVNGVAR

Biological replicate 1
ENO (P00924) | 25 | 17.65 | 1.50 | 8.51 | IGSEVYHNLK, VNQIGTLSESIK, IEEELGDNAVFAGENFHHGDKL
ClpB (P63284) | 75 | 66.12 | 3.03 | 4.58 | VIGQNEAVDAVSNAIR, LPQVEGTGGDVQPSQDLVR, VTDAEIAEVLAR
ADH (P00330) | 5 | 6.58 | 0.27 | 4.11 | IGDYAGIK, EALDFFAR, ANELLINVK, VVGLSTLPEIYEK
HBA (P01966)/HBB (P02081) | 0.5 | 0.43 | 0.19 | 24.46 | VGGHAAEYGAEALER, VGGHAAEYGAEALERMFLSFPTTK, AVEHLDDLPGALSELSDLHAHK, MVLSAADK, FLANVSTVLTSK

Biological replicate 2
ENO | 50 | 22.08 | 2.22 | 10.08 | IGSEVYHNLK, IEEELGDNAVFAGENFHHGDKL, VNQIGTLSESIK
ClpB | 100 | 57.44 | 3.77 | 6.56 | VIGQNEAVDAVSNAIR, LPQVEGTGGDVQPSQDLVR, VTDAEIAEVLAR
ADH | 10 | 6.9 | 0.59 | 8.58 | IGDYAGIK, VVGLSTLPEIYEK, EALDFFAR, VLGIDGGEGKEELFR
HBA/HBB | 1 | – | – | – |

Biological replicate 3
ENO | 1 | 0.57 | – | – | IGSEVYHNLK, VNQIGTLSESIK, IEEELGDNAVFAGENFHHGDKL
ClpB | 25 | 31.42 | 1.27 | 4.05 | VIGQNEAVDAVSNAIR, LPQVEGTGGDVQPSQDLVR, VTDAEIAEVLAR
ADH | 7.5 | 5.04 | 0.21 | 4.23 | IGDYAGIK, EALDFFAR, VLGIDGGEGK, ANELLINVK
HBA/HBB | 2.5 | 1.2 | 1.59 | 50.76 | AVEHLDDLPGALSELSDLHAHK, FLANVSTVLTSK, VGGHAAEYGAEALER, TYFPHFDLSHGSAQVK, MFLSFPTTK


ionization properties. To assess whether this is a major problem for complex samples, we plotted the peptide intensity difference as boxplot for all identified proteins in the nine replicate stroma measurements. The median variance is around 0.3 and only a few proteins produce large intensity differences between the Hi3 peptides (Fig. 2C). We therefore suggest that the assumed inaccuracies because of peptide intensity variance are rather small in a biological sample. Proteins that produce

Please cite this article as: Helm S, et al, Protein identification anparallel collision-induced dissociation mass ..., J Prot (2013), http://

highpeptide intensity variancesmay be either omitted fromtheabundance calculation, or scrutinized by other tools.
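As a rough illustration of the Hi3 logic discussed here, the following sketch sums the three most intense peptide signals of a protein, scales that sum against a reference protein of known amount (the principle stated in the caption of Fig. 2), and computes the coefficient of peptide intensity difference, (I1 − I2)/I1. The function names, the 0.5 cut-off, the 10 fmol reference amount and all intensity values are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def hi3_sum(intensities: Sequence[float]) -> float:
    """Sum of the three most intense peptide signals of one protein."""
    return sum(sorted(intensities, reverse=True)[:3])


def hi3_amount(protein: Sequence[float], reference: Sequence[float],
               reference_fmol: float) -> float:
    """Protein amount in fmol, scaled to a reference protein of known amount."""
    return hi3_sum(protein) / hi3_sum(reference) * reference_fmol


def intensity_difference(intensities: Sequence[float]) -> float:
    """(I1 - I2) / I1 for the two most intense peptides of one protein."""
    first, second = sorted(intensities, reverse=True)[:2]
    return (first - second) / first


# Hypothetical peptide intensities (arbitrary units), not measured data.
reference = [9000.0, 8000.0, 7000.0, 3000.0]  # assumed to be spiked at 10 fmol
protein = [5000.0, 4000.0, 3000.0, 1000.0]

amount = hi3_amount(protein, reference, reference_fmol=10.0)
coefficient = intensity_difference(protein)
flagged = coefficient > 0.5  # assumed cut-off for closer inspection

print(f"{amount:.1f} fmol, coefficient {coefficient:.2f}, flagged: {flagged}")
```

A protein flagged in this way would be a candidate for omission from the abundance calculation or for scrutiny by other tools, as suggested in the text.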

Quantification and identification of chloroplast stroma proteins

To identify and quantify proteins in the chloroplast stroma, we performed an in-solution digest of a stroma preparation in three biological replicates with three technical replicates each and subsequently analyzed 0.4 μg of this digest with HD-MSE mass spectrometry. Note that all subsequent fmol equivalents reported throughout the manuscript refer to protein amount in 0.4 μg stroma protein. Altogether, 1011 proteins were identified in at least two replicates and 962 of these were also quantified (Suppl. Table 2). The most abundant protein is the small subunit of RubisC/O with an average calculated quantity of 67 fmol equivalents and the least abundant chloroplast protein is 2-phosphoglycolate phosphatase 2 (AT5G47760) with 0.043 fmol equivalents. Thus, our HD-MSE analysis covers around 4 orders of magnitude. There is a strong tendency to underestimate the concentration of proteins with abundances above 25 fmol, as indicated by the calculated values for the internal standards (Table 1 and Fig. 3). This has been reported before and a combination of ion transmission loss and detector saturation during ion mobility separation was suggested as explanation for this effect [12,24]. We therefore checked the concentration of stroma proteins with MSE and compared the abundance values with the HD-MSE data. As expected, MSE determined much higher protein concentrations for proteins above approximately 25 fmol and the logarithmic scatter plot shows a typical curvature in the high abundance range of the HD-MSE measurement (Fig. 4) [24]. For example, RubisC/O large subunit abundance in MSE is 3235 fmol equivalents compared to 62 fmol equivalents in HD-MSE. MSE is much less sensitive in the identification of proteins of low abundance and is therefore not a general alternative to HD-MSE in proteomics experiments. The combination of both scan types may be applied to extend the dynamic range of protein quantification [12].

[Figure 1, panels A–C. A: measured protein amount (fmol) against spiked protein amount (fmol) for ADH, ENO, ClpB and HBA/HBB. B: −log10 p-value against log2 fold difference (protein), with the set ratio marked, for ADH, HBA/HBB, ENO and ClpB. C: measured ratio against mix ratio (1:2, 1:10, 1:5).]

Fig. 1 – Benchmarking of quantitative measurements with reference protein mixtures. (A) Scatter plot of spiked and measured reference protein amount. Error bars indicate the deviation between three analytical replicates. (B) We measured the protein Mixes I–III and checked whether the abundance difference between two proteins is correctly reproduced and whether it is statistically significant. The horizontal bar indicates the p-value 0.05, all values above represent p-values below this threshold. (C) We diluted Mix A in a dilution series 1:2, 1:5 and 1:10 (x-axis) and checked the measured ratio for all the different dilutions. Error bars indicate the deviation between three analytical replicates.

[Figure 2, panels A–C. A: bar charts of peptide intensity (× 10^3) for peptides A–F of ENO, ClpB, GPB and ADH. B: deviation (%) against coefficient of peptide intensity difference for ADH, ENO, HBA and ClpB. C: coefficient of peptide intensity difference for the nine measurements BR1_1 to BR3_3.]

Fig. 2 – Characteristics of the Hi3 quantification strategy. (A) Illustration of the Hi3 quantitative measurements of the three best ionizing peptides for four sample proteins. Peptide intensities represented as black bars are added up and set in relation to the same measurement of a reference protein with known concentration. The total peptide intensity correlates with protein abundance in a certain concentration range. (B) Relation of intensity difference between best and second best ionizing peptide and deviation between spiked and calculated protein abundance. The peptide intensity difference is expressed as the difference between the 1st and the 2nd most intense ions divided by the intensity of the 1st most intense ion (x-axis); deviation is expressed as percent (%) value between spiked and measured protein amount. (C) The same calculation as in B was performed for all proteins in the nine stroma measurements. The mean coefficient of peptide intensity difference is around 0.3, and does not vary between the different replicates.

Fig. 3 – HD-MSE quantification of proteins identified in all proteome studies analyzed here (n = 362). For every second protein, we provide the measured protein amount as mean value of nine replicates along with the standard deviation. Chloroplast stroma proteins are sorted by their abundance, the reference proteins are plotted by their calculated abundance. The deviation between calculated abundance and spiked abundance is provided in Table 1 and in Suppl. Table 1.

[Figure 4: scatter plot of log10 protein amount MSE (fmol) against log10 protein amount HD-MSE (fmol).]

Fig. 4 – Comparison of HD-MSE and MSE protein quantification. Quantification of the 262 most abundant stroma proteins was performed either with HD-MSE or MSE. Measured abundance values are plotted in a logarithmic scatter plot. Note the curvature in the high abundance region that is seen with the HD-MSE quantitative values. This has been reported before [24] and is due to the significant underestimation of proteins at concentrations above 25–30 fmol by HD-MSE.
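To make the comparison of spiked and measured amounts easier to inspect, the sketch below computes the ratio of measured to spiked amount for the reference proteins with complete entries in Table 1, and the fold difference between the MSE and HD-MSE values quoted for the RubisC/O large subunit. The numbers are copied from Table 1 and the text; the helper name `recovery` and the layout of the data are illustrative assumptions, not part of the original analysis.

```python
import math

# (protein, spiked fmol, mean measured fmol), copied from Table 1.
table1 = [
    ("ADH", 5.0, 6.58),
    ("ADH", 7.5, 5.04),
    ("ADH", 10.0, 6.9),
    ("ENO", 25.0, 17.65),
    ("ClpB", 25.0, 31.42),
    ("ENO", 50.0, 22.08),
    ("ClpB", 75.0, 66.12),
    ("ClpB", 100.0, 57.44),
]


def recovery(spiked: float, measured: float) -> float:
    """Measured amount as a fraction of the spiked amount."""
    return measured / spiked


for name, spiked, measured in table1:
    ratio = recovery(spiked, measured)
    print(f"{name:5s} spiked {spiked:6.1f} fmol -> recovery {ratio:.2f}")

# RubisC/O large subunit, fmol equivalents quoted in the text.
mse_fmol, hdmse_fmol = 3235.0, 62.0
fold = mse_fmol / hdmse_fmol
print(f"MSE / HD-MSE = {fold:.1f}-fold ({math.log10(fold):.2f} log10 units)")
```

A recovery well below 1 for the highest spiked amounts is the pattern the text describes as underestimation; the fold difference for RubisC/O shows how large the gap between the two scan types can become for a very abundant protein.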

Together, 909 proteins were identified in a minimum of 3 replicates (technical and biological). The SUBA database lists 623 of these as true plastid proteins and 11 as dual targeted to plastids and mitochondria in its consensus prediction [25]. We furthermore identified 48 mitochondrial, 21 peroxisomal, and 139 cytosolic proteins. Surprisingly, some of the putative contaminants are rather abundant proteins, suggesting either a specific enrichment of these proteins during the plastid isolation procedure or multiple targeting within the plant cell (Suppl. Tables 2 and 3). Of the putative cytosolic proteins, 81 are plastid-predicted by at least one prediction algorithm [25]. The 623 undisputed true plastid proteins span an abundance range of 4 orders of magnitude and 17 of the low abundance proteins have not been identified in a chloroplast proteomic analysis before (Suppl. Table 3). Among these are important enzymes of purine biosynthesis such as glutamine phosphoribosyl pyrophosphate amidotransferase 1 and 3 (AT2G16570, AT4G38880); dihydrodipicolinate reductase (AT3G59890) and dihydrodipicolinate synthase (AT3G60880) that are involved in lysine biosynthesis, the root Fd-NADPH reductase (AT1G30510) and a glutathione S-transferase family protein (AT4G19880) (Suppl. Table 3). Based on our data, we have expanded the chloroplast proteome reference table we generated before by 111 proteins [26], which now comprises 1266 proteins (Suppl. Table 3).

There is already a significant amount of information on chloroplast protein abundances available that allowed us to compare the HD-MSE-based protein quantification with the normalized spectral count quantification used in previous studies [15]. We transformed the quantitative data of identified proteins that are listed in the chloroplast proteome reference table to ppm values, and compared our data set with three different data sets that used normalized spectral counting on whole leaf tissue for protein quantification [8,17–19]. Cumulative quantitative functional categorization of the proteins that were identified in all studies (n = 375) revealed high similarity between the different analyses (Fig. 5 and Suppl. Table 4) [27]. Proteins with a function in redox homeostasis, amino acid biosynthesis and “other” are enriched in our preparation (Fig. 5, bins 13, 21 and “other”). Amino acid biosynthesis is carried out by soluble chloroplast enzymes that are enriched by the stroma preparation. The same holds true for proteins involved in chloroplast redox homeostasis. We found significantly higher concentrations of the abundant soluble chloroplast thioredoxins (M- and F-type), and the NADPH- and Fd-dependent thioredoxin reductases. The highest abundance differences (>4 fold) between proteins in the category “other” were observed for two unknown stroma proteins (AT5G28500 and AT5G66090), an aldolase-type TIM barrel protein (AT5G13420), an isopropylmalate isomerase (AT4G13430), a dehydratase family protein (AT3G23940) and a prolyl oligopeptidase (AT2G47390).

We next analyzed the distribution of protein abundances within the Calvin cycle, branched-chain amino acid (BCAA) and aromatic amino acid biosynthesis. Ranked by the median abundance of pathway enzymes, the Calvin cycle is the most prevalent pathway in chloroplasts followed by BCAA biosynthesis and aromatic amino acid biosynthesis, consistent with early predictions on pathway abundance (Fig. 6) [28]. The enzymes of the aromatic amino acid biosynthesis were not fully covered in the proteome studies, providing a further indication for their low abundance in unstressed photosynthetic chloroplasts (Figs. 6 and 7). All three pathways show a distinctive pattern of enzyme abundances with the relative abundance ranking of the enzymes being similar between the different studies (Fig. 6). For example, RubisC/O is the most abundant Calvin cycle enzyme followed by glyceraldehyde-3-phosphate dehydrogenase (GAP-DH), fructose-bisphosphate aldolase and phosphoglycerate kinase (Fig. 6).

The synthesis of Ile and Val proceeds in two parallel pathways that have four enzymes in common. The first two common steps are catalyzed by a low abundance acetolactate synthase and a high abundance ketol acid reductoisomerase (Fig. 7). Threonine deaminase, which catalyzes the first committed step in isoleucine synthesis, is the least abundant enzyme of the entire pathway in all studies compared here. This enzyme is developmentally regulated and accumulates to high levels in fruit tissues, indicating the changing demand on Ile synthesis during development [29]. The synthesis of aromatic amino acids Phe, Tyr and Trp is the least abundant amino acid biosynthetic pathway in the chloroplast under the conditions used for the proteome analyses compared here (Fig. 6). Here we find some differentiation in enzyme abundance between the different analyses (Fig. 7). In all studies, the enzymes of Trp synthesis are more abundant than those of Phe and Tyr synthesis with the exception of prephenate dehydratase in our study. Within Trp synthesis, we find anthranilate synthase, indole-3-glycerol-phosphate synthase and tryptophan synthase as the most abundant enzymes. In the studies of Baerenfaller and Bischof, these are anthranilate phosphoribosyltransferase and tryptophan synthase on the one hand and indole-3-glycerol-phosphate synthase on the other hand [8,17]. EPSP synthase, which is the target of the herbicide glyphosate and functions upstream of chorismate synthesis, is abundant in our study while it was not detected by Bischof, where 3-deoxy-7-phospho-heptulonate synthase, which catalyzes the first step in chorismate synthesis, is prevalent.

[Figure 5: bar chart of protein amount in ppm (× 10,000) per MapMan bin; series: current (mean value), Bischof et al. 2011, Bischof (S) et al. 2011, Baerenfaller et al. 2008, Kim et al. 2009.]

Fig. 5 – MapMan classification of identified chloroplast proteins. All identified proteins were sorted into MapMan bins and chloroplast proteins were identified by comparison with the chloroplast proteome reference table (Supplemental Tables 3 and 4) [26]. Abundance normalization was independently based on all identified chloroplast proteins in every study, and ppm values for every protein in the MapMan bins were added up to give the final quantitative functional binning depicted here. The data from our analysis reported here are designated as “current”, and represented as mean value from nine replicates (technical and biological). All other data were extracted from the publications cited in the legend. Bischof (S) indicates that we used the wildtype data set from Sucrose-grown plants for this comparison.

Fig. 6 – Boxplot depiction of enzyme abundance distribution in three chloroplast metabolic pathways. Chloroplast proteins were identified from the indicated studies by comparison against the chloroplast proteome reference table [26]. Normalized abundance values were calculated on the basis of this dataset independently in every study, and the distribution of enzyme abundances within aromatic (F, Y, W) and branched chain amino acid biosynthesis (V, L, I) and the Calvin cycle are depicted as box plots. The line in the middle of the box represents the median abundance value of the enzymes in the indicated pathway. Abbreviations are: current, mean abundance in our study; Kim, Kim et al., 2009 [19]; Bae, Baerenfaller et al., 2008 [8]; Bh, Bischof et al., 2011; Bh (S), Bischof et al., 2011 [17] (data set wildtype grown on Sucrose).

Fig. 7 – Heat map depiction of enzyme abundance in three chloroplast metabolic pathways. The number reports the ppm value from the normalization based on the chloroplast proteome reference table (as described in the captions to Figs. 5 and 6). The heat map was drawn by color coding of percent (%) values calculated by setting the most abundant enzyme in one pathway to 100%, and expressing all other protein abundances as % values compared to 100%. Abbreviations are: current (curr.), mean abundance in our study; Kim, Kim et al., 2009 [19]; Bae, Baerenfaller et al., 2008 [8]; Bh, Bischof et al., 2011; Bh (S), Bischof et al., 2011 [17] (data set wildtype grown on Sucrose).

| Pathway | EC number | Enzyme | curr. | Kim | Bae | Bh | Bh (S) |
|---|---|---|---|---|---|---|---|
| FYW biosynthesis | 2.5.1.54 | 3-deoxy-7-phospho-heptulonate synthase | 105 | 7 | 157 | 293 | 79 |
| FYW biosynthesis | 4.2.3.4 | 3-dehydroquinate synthase | 406 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 4.2.1.10 | 3-dehydroquinate dehydratase | 243 | 105 | 306 | 428 | 340 |
| FYW biosynthesis | 1.1.1.25 | Shikimate dehydrogenase | 243 | 105 | 306 | 428 | 340 |
| FYW biosynthesis | 2.7.1.71 | Shikimate kinase | 0 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 2.5.1.19 | 3-phosphoshikimate 1-carboxyvinyltransferase | 462 | 9 | 19 | 0 | 0 |
| FYW biosynthesis | 4.2.3.5 | Chorismate synthase | 287 | 27 | 150 | 268 | 207 |
| FYW biosynthesis | 5.4.99.5 | Chorismate mutase | 0 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 2.6.1.78/79 | Asp/Glu-prephenate aminotransferase | 0 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 4.1.3.27 | Anthranilate synthase | 428 | 3 | 84 | 234 | 70 |
| FYW biosynthesis | 2.4.2.18 | Anthranilate phosphoribosyltransferase | 202 | 56 | 493 | 157 | 223 |
| FYW biosynthesis | 4.2.1.51/91 | Arogenate/Prephenate dehydratase | 1121 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 5.3.1.24 | Phosphoribosyl-anthranilate isomerase | 0 | 0 | 0 | 0 | 0 |
| FYW biosynthesis | 1.3.1.78 | Arogenate dehydrogenase | 0 | 0 | 29 | 82 | 43 |
| FYW biosynthesis | 4.1.1.48 | Indole-3-glycerol-phosphate synthase | 670 | 5 | 297 | 549 | 218 |
| FYW biosynthesis | 4.2.1.20 | Tryptophan synthase | 377 | 121 | 533 | 406 | 415 |
| Calvin cycle | 4.1.1.39 | Ribulose-bisphosphate carboxylase | 69685 | 72250 | 70273 | 89359 | 94271 |
| Calvin cycle | 2.7.2.3 | Phosphoglycerate kinase | 15369 | 8360 | 8398 | 7432 | 6733 |
| Calvin cycle | 1.2.1.13 | Glyceraldehyde-3-phosphate dehydrogenase | 27455 | 19411 | 17442 | 14751 | 13175 |
| Calvin cycle | 5.3.1.1 | Triose-phosphate isomerase | 7544 | 2236 | 3446 | 3492 | 4008 |
| Calvin cycle | 4.1.2.13 | Fructose-bisphosphate aldolase | 20696 | 12956 | 11596 | 9190 | 6878 |
| Calvin cycle | 3.1.3.11 | Fructose-bisphosphatase | 5009 | 2702 | 2861 | 4154 | 4070 |
| Calvin cycle | 2.2.1.1 | Transketolase | 13090 | 6291 | 13062 | 7859 | 7193 |
| Calvin cycle | 3.1.3.37 | Sedoheptulose-bisphosphatase | 6819 | 3408 | 5620 | 5996 | 6861 |
| Calvin cycle | 5.1.3.1 | Ribulose-phosphate 3-epimerase | 2894 | 1989 | 5159 | 3028 | 3738 |
| Calvin cycle | 5.3.1.6 | Ribose-5-phosphate isomerase | 5727 | 1617 | 3769 | 4935 | 4648 |
| Calvin cycle | 2.7.1.19 | Phosphoribulokinase | 9475 | 5320 | 6351 | 5960 | 6096 |
| VLI biosynthesis | 4.3.1.19 | Thr ammonia-lyase | 0 | 0 | 16 | 45 | 0 |
| VLI biosynthesis | 2.2.1.6 | Acetolactate synthase | 74 | 3 | 64 | 70 | 81 |
| VLI biosynthesis | 1.1.1.86 | Ketol-acid reductoisomerase | 3700 | 2131 | 2597 | 1715 | 1573 |
| VLI biosynthesis | 4.2.1.9 | Dihydroxy-acid dehydratase | 1488 | 202 | 314 | 536 | 532 |
| VLI biosynthesis | 2.3.3.13 | 2-isopropylmalate synthase | 340 | 33 | 16 | 71 | 0 |
| VLI biosynthesis | 4.2.1.33 | 3-isopropylmalate dehydratase | 2657 | 713 | 623 | 872 | 1030 |
| VLI biosynthesis | 1.1.1.85 | 3-isopropylmalate dehydrogenase | 2867 | 761 | 719 | 968 | 985 |
| VLI biosynthesis | 4.2.1.35 | (R)-2-methylmalate dehydratase | 2657 | 713 | 623 | 872 | 1030 |
| VLI biosynthesis | 2.6.1.42 | Branched-chain-amino-acid transaminase | 1952 | 443 | 623 | 171 | 364 |
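The two scaling steps behind Figs. 5 to 7 can be sketched in a few lines. The first function shows one common way to express abundances as parts per million of the summed abundance of all proteins in a data set; the exact formula used in the study is not spelled out in this section, so this is an assumption. The second function follows the Fig. 7 caption and expresses every enzyme as a percentage of the most abundant enzyme in its pathway. Function names and the toy fmol values are hypothetical; the BCAA ppm values are the "curr." column of Fig. 7.

```python
def to_ppm(amounts):
    """Scale raw abundances so that they sum to one million (assumed scheme)."""
    total = sum(amounts.values())
    return {name: value / total * 1_000_000 for name, value in amounts.items()}


def percent_of_max(values):
    """Express each value as % of the largest entry, which is set to 100 %."""
    top = max(values.values())
    return {name: value / top * 100.0 for name, value in values.items()}


# Hypothetical fmol equivalents for a three-protein data set (not real data).
toy_fmol = {"protein A": 6.0, "protein B": 3.0, "protein C": 1.0}
toy_ppm = to_ppm(toy_fmol)

# ppm values of the "curr." column for BCAA (VLI) biosynthesis, from Fig. 7.
bcaa_ppm = {
    "Thr ammonia-lyase": 0,
    "Acetolactate synthase": 74,
    "Ketol-acid reductoisomerase": 3700,
    "Dihydroxy-acid dehydratase": 1488,
    "2-isopropylmalate synthase": 340,
    "3-isopropylmalate dehydratase": 2657,
    "3-isopropylmalate dehydrogenase": 2867,
    "(R)-2-methylmalate dehydratase": 2657,
    "Branched-chain-amino-acid transaminase": 1952,
}
bcaa_percent = percent_of_max(bcaa_ppm)

for enzyme, percent in bcaa_percent.items():
    print(f"{enzyme:40s} {percent:6.1f} %")
```

Scaling within each pathway makes the heat map colors comparable between pathways whose absolute ppm values differ by orders of magnitude, such as the Calvin cycle and aromatic amino acid biosynthesis.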

The cause of the different abundance relations within the aromatic amino acid biosynthesis in the different studies is unclear and it is entirely possible that measurement inaccuracies are causing this effect. However, given that the diversity between different studies is also highest for the aromatic amino acid biosynthetic pathway, it is very likely that it at least partially reflects the different conditions used for plant growth in the different laboratories. The above comparison revealed that in contrast to aromatic amino acid biosynthesis, both Calvin cycle and BCAA biosynthesis maintain a constant abundance relation between the pathway enzymes. This could indicate a robustness requirement that applies to the setup of these pathways in photosynthetically active chloroplasts. The meta-analysis we performed here is a suitable tool to identify such robustness features and also to identify enzymes whose accumulation is sensitively regulated in response to varying external conditions.
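One way to put a number on a "constant abundance relation" between studies is a rank correlation of enzyme abundances within a pathway. The sketch below computes a Spearman rank correlation for the Calvin cycle ppm values of the current study and of Kim et al. as listed in Fig. 7. The source does not state that the authors used this statistic, so the choice of Spearman correlation, the function names and the no-ties simplification are assumptions made for illustration.

```python
def ranks(values):
    """Rank values from 1 (largest) to n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation of two equally long lists without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Calvin cycle ppm values from Fig. 7, in the enzyme order of the figure
# (ribulose-bisphosphate carboxylase first, phosphoribulokinase last).
curr = [69685, 15369, 27455, 7544, 20696, 5009, 13090, 6819, 2894, 5727, 9475]
kim = [72250, 8360, 19411, 2236, 12956, 2702, 6291, 3408, 1989, 1617, 5320]

rho = spearman(curr, kim)
print(f"Spearman rho, current study vs. Kim et al.: {rho:.2f}")
```

A value close to 1 means the two studies rank the pathway enzymes in nearly the same order. The same function could be applied to the aromatic amino acid pathway, although the many zero entries there would require a tie-aware rank correlation instead of this simplified formula.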

Conclusions

Here we evaluate HD-MSE-based proteome analyses as a tool for large-scale quantitative chloroplast proteomics. In combination with ion mobility separation, HD-MSE analyses reach high sensitivity, allowing the identification and quantification of around 900 stroma proteins without prefractionation. Several of these proteins were identified for the first time in a proteomics experiment, suggesting that HD-MSE and standard DDA analyses generate partly complementary identification results. HD-MSE produces reproducible quantification results but, in contrast to MSE, underestimates high abundance proteins. A meta-proteome analysis between our study and several published chloroplast data sets revealed a surprising quantitative congruence for many chloroplast metabolic functions and individual enzymes. On the technical side, this suggests that both MSE and spectral count quantification, although entirely different strategies, determine similar protein abundances in large-scale experiments, underpinning the suitability of both methods for protein quantification. On the cell metabolism side, our meta-analysis suggests that robustness principles operate at the level of individual metabolic pathways. The analysis we report here may serve as a proof-of-concept that it is possible to distinguish robust from fine-controlled metabolic pathways with quantitative proteomics data, and conveys the message that this type of information is now available for quantitative modeling of chloroplast metabolism.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2013.12.007.

Acknowledgments

This work was supported by the European Regional Development Fund of the European Commission via grant W21004490, “Landesförderschwerpunkt Molekulare Biowissenschaften”, Land Sachsen-Anhalt.

References

[1] Nikolov MC, Schmidt C, Urlaub H. Quantitative mass spectrometry-based proteomics: an overview. Methods Mol Biol 2012;893:85–100.

[2] Baginsky S. Plant proteomics: concepts, applications, and novel strategies for data interpretation. Mass Spectrom Rev 2009;28:93–120.

[3] Thelen JJ, Miernyk JA. The proteomic future: where mass spectrometry should be taking us. Biochem J 2012;444:169–81.

[4] Picotti P, Aebersold R. Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nat Methods 2012;9:555–66.

[5] Bindschedler LV, Cramer R. Quantitative plant proteomics. Proteomics 2011;11:756–75.

[6] Picotti P, et al. A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis. Nature 2013;494:266–70.

[7] Lu P, et al. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol 2007;25:117–24.

[8] Baerenfaller K, et al. Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 2008;320:938–41.

[9] Liu H, Sadygov RG, Yates III JR. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem 2004;76:4193–201.

[10] Neilson KA, et al. Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics 2011;11:535–53.

[11] Silva JC, et al. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol Cell Proteomics 2006;5:144–56.

[12] Bond NJ, et al. Improving qualitative and quantitative performance for MS(E)-based label-free proteomics. J Proteome Res 2013;12:2340–53.

[13] Grossmann J, et al. Implementation and evaluation of relative and absolute quantification in shotgun proteomics with label-free methods. J Proteomics 2010;73:1740–6.

[14] Grossmann J, et al. A workflow to increase the detection rate of proteins from unsequenced organisms in high-throughput proteomics experiments. Proteomics 2007;7:4245–54.

[15] van Wijk KJ, Baginsky S. Plastid proteomics in higher plants: current state and future goals. Plant Physiol 2011;155:1578–88.

[16] Huang M, et al. Construction of plastid reference proteomes for maize and Arabidopsis and evaluation of their orthologous relationships; the concept of orthoproteomics. J Proteome Res 2013;12:491–504.

[17] Bischof S, et al. Plastid proteome assembly without Toc159: photosynthetic protein import and accumulation of N-acetylated plastid precursor proteins. Plant Cell 2011;23:3911–28.

[18] Motohashi R, et al. Common and specific protein accumulation patterns in different albino/pale-green mutants reveals regulon organization at the proteome level. Plant Physiol 2012;160:2189–201.

[19] Kim J, et al. Subunits of the plastid ClpPR protease complex have differential contributions to embryogenesis, plastid biogenesis, and plant development in Arabidopsis. Plant Cell 2009;21:1669–92.

[20] Wang M, et al. PaxDb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics 2012;11:492–500.

[21] Schrimpf SP, et al. Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol 2009;7:616–27.

[22] Ferro M, et al. AT_CHLORO, a comprehensive chloroplast proteome database with subplastidial localization and curated information on envelope proteins. Mol Cell Proteomics 2010;9:1063–84.

[23] Bradford MM. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 1976;72:248–54.

[24] Shliaha PV, Bond NJ, Gatto L, Lilley KS. Effects of traveling wave ion mobility separation on data independent acquisition in proteomics studies. J Proteome Res 2013;12:2323–39.

[25] Tanz SK, et al. SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res 2013;41(D1):D1185–91.

[26] Baginsky S, Gruissem W. The chloroplast kinase network: new insights from large-scale phosphoproteome profiling. Mol Plant 2009;2:1141–53.

[27] Thimm O, et al. MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 2004;37:914–39.

[28] Kleffmann T, et al. The Arabidopsis thaliana chloroplast proteome reveals pathway abundance and novel protein functions. Curr Biol 2004;14:354–62.

[29] Samach A, et al. Biosynthetic threonine deaminase gene of tomato: isolation, structure, and up-regulation in floral organs. Proc Natl Acad Sci U S A 1991;88:2678–82.
