Lattice QCD Computing Project Extension II

Cost and Schedule

Bill Boroski LQCD-ext Contractor Project Manager

[email protected]

CD-1 Review

DOE Office of Science

February 25, 2014


W. Boroski | Cost & Schedule | LQCD-ext II CD-1 Review | 25-Feb-2014 2

Outline

Estimated Total Project Cost

Basis of Estimate

Schedule Overview

Level-1 & Level-2 Schedule Milestones


Estimated Total Project Cost

Estimated Total Project Cost (TPC) Range = $14M to $18M

Based on preliminary guidance from OHEP and ONP

Budget has been set to match planning budget funding profile

Period of performance: FY15 through FY19

Project funds will be used to support the operation of existing and new hardware, and to procure new computing hardware to meet performance requirements and metrics.

Project funding covers:

Operations and maintenance of production systems

Planning, acquisition and deployment of new hardware

Project management

Management Reserve

Not in scope:

Software development

Scientific software support

Preliminary Budget Distribution

LQCD-ext II Total Project Cost Distribution, by Spending Category

Spending Category            $14 million budget    $18 million budget
New Hardware Deployment             39%                   47%
Project Management                   4%                    3%
Operations & Maintenance            54%                   47%
Management Reserve                   3%                    2%


Procedure for Developing the Estimated Total Project Cost (TPC)

1. Obtained planning budget guidance from OHEP and ONP

2. Reviewed the staffing model, which is based on more than eight years of deployment and operating experience from previous project phases.

3. Reviewed the acquisition strategy for the five-year extension project.

4. Reviewed assumptions on the quantity of compute and storage hardware purchases, based on estimated price/performance projections.

5. Reviewed the Level-of-Effort profile for the three host sites.

6. Updated personnel salary costs used in the Basis of Estimate. Used fully-loaded salary data from the three host sites to develop the personnel budget. Salary costs are escalated by 3% per year.

7. Reviewed the travel and M&S budgets, based on past experience.

8. Reviewed the level of mgmt reserve required to cover additional personnel expenses.

9. Subtracted the sum of these items from the budget guidance; balance is the budget available for new hardware procurements (compute and storage hardware).
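The subtraction in step 9 can be sketched as follows. This is an illustrative check, not the project's spreadsheet: the figures are the rounded $K values from the $14 million Effort and Budget Summary slide later in this deck, and the variable names are assumptions made for the example.

```python
# Illustrative sketch of step 9, using rounded $K values from the
# $14 million Effort and Budget Summary slide (FY15..FY19).
years = ["FY15", "FY16", "FY17", "FY18", "FY19"]
guidance = [2000, 3000, 3000, 3000, 3000]      # planning budget guidance
personnel = [1655, 1809, 1566, 1457, 1360]
travel = [17, 17, 17, 17, 17]
m_and_s = [283, 283, 283, 102, 102]
reserve = [45, 84, 77, 75, 73]                 # management reserve

# Hardware budget = guidance minus everything else (compute + storage).
hardware = [
    g - (p + t + m + r)
    for g, p, t, m, r in zip(guidance, personnel, travel, m_and_s, reserve)
]

# Equipment lines from the same slide, for comparison.
slide_compute = [0, 743, 992, 1242, 1333]
slide_storage = [0, 65, 65, 108, 116]

for year, h, c, s in zip(years, hardware, slide_compute, slide_storage):
    print(f"{year}: residual {h:>5} $K, slide compute+storage {c + s:>5} $K")
```

The residual agrees with the slide's equipment lines to within $1K per year; the small differences come from the slide values being rounded to the nearest $K.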


Staffing Model Planning Assumptions

1. Level of Effort to Support New Deployment Planning

Assumptions:

1) Planning activities consume 20% of two people over a six-month period.

Activities include:

preparing RFIs, RFPs, requisitions, etc.

prototype benchmarking

qualifying vendors

bid evaluations and vendor selection

Estimated Level of Effort (FTE-mos)

Number of people          2
Level of effort           0.2
Duration (mos)            6
Total effort (FTE-mos)    2.4
Total effort (FTE-yrs)    0.2

2. Level of Effort to Support New Deployments

Assumptions:

1) Duration from receipt of hardware to production release = 3 months

2) Deployment crew consists of one senior-level manager and three computer professionals.

3) Level of effort decreases over time as system becomes stable

Estimated Level of Effort (FTE-mos)

Person   m1     m2     m3     m4     m5     m6
1        1      0.75   0.5    0.25   0.25   0.25
2        1      0.75   0.5    0.25   0.25   0.25
3        0.75   0.5    0.25   -      -      -
4        0.75   0.5    -      -      -      -
Total    3.5    2.5    1.25   0.5    0.5    0.5

Total FTE-mos 8.75

Total FTE-yrs 0.73

The following are examples of the underlying details and bases of estimate for the assumptions associated with the LQCD-ext II detailed staffing and budget model.

• A similar level of detail exists for determining the number of conventional cluster nodes and GPUs that can be supported by one FTE, and the number of conventional cluster nodes and GPUs that can be purchased with a specified dollar amount.

• Assumption parameters are all based on past operating experience.
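The two level-of-effort estimates on this slide reduce to simple arithmetic, sketched below. The numbers are the slide's own; the variable names are illustrative, and blank cells in the deployment table are treated as zero effort.

```python
# 1. Deployment planning: people x fractional effort x duration (months).
planning_fte_months = 2 * 0.2 * 6
planning_fte_years = planning_fte_months / 12

# 2. Deployment support: effort per person for months m1..m6.
#    Blank cells on the slide are entered as 0.0.
deployment_profile = [
    [1.00, 0.75, 0.50, 0.25, 0.25, 0.25],  # person 1
    [1.00, 0.75, 0.50, 0.25, 0.25, 0.25],  # person 2
    [0.75, 0.50, 0.25, 0.00, 0.00, 0.00],  # person 3
    [0.75, 0.50, 0.00, 0.00, 0.00, 0.00],  # person 4
]

# Column sums give the monthly totals; their sum is the total FTE-months.
monthly_totals = [sum(month) for month in zip(*deployment_profile)]
deployment_fte_months = sum(monthly_totals)
deployment_fte_years = deployment_fte_months / 12

print(f"Planning:   {planning_fte_months:.1f} FTE-mos = {planning_fte_years:.1f} FTE-yrs")
print(f"Deployment: {deployment_fte_months:.2f} FTE-mos = {deployment_fte_years:.2f} FTE-yrs")
```

Both results match the slide: 2.4 FTE-mos (0.2 FTE-yrs) for planning and 8.75 FTE-mos (0.73 FTE-yrs) for deployment.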


Staffing Model Assumptions

Assumptions: Basis:

0.35 fte for overall project management Based on operating experience

0.1 fte to manage BNL site Based on operating experience

0.2 fte to manage cluster site (FNAL, JLab) Based on operating experience

0.2 fte to plan, manage deployment See assumptions tab

0.5 fte to deploy new hardware See assumptions tab

0.0 fte of additional support for GPU deployment No additional incremental effort to deploy new GPU cluster (Jan '14 ->)

0.5 fte/site of base admin support for ops & maintenance, and to maintain expertise Based on operating experience

0.5 Steady-state file server admin support Based on operating experience (same level as budgeted in FY12-14)

900 Number of cluster nodes that can be supported by one FTE Based on operating experience; updated 01/14/14

450 Number of GPUs that can be supported by one FTE (FNAL) GPUs at FNAL require on average 2x more support than cluster nodes

225 Number of GPUs that can be supported by one FTE (JLab) GPUs at JLab require on average 4x more support than cluster nodes

283 Number of cluster nodes purchased with $1M in equipment funds (per year) Based on recent cost data; see assumptions tab

249 Number of GPUs purchased with $1M in equipment funds (per year) Based on recent cost data; see assumptions tab

21% M&S G&A rate at FNAL (% on the first $500K of the purchase)

60% M&S G&A rate at JLab (% on the first $50K of the purchase)

5% Fraction of total equipment budget allocated to storage hardware (FY10-12) Based on approved hardware baseline plan

8% Fraction of total equipment budget allocated to storage hardware (FY13-19) Storage budget increased to reflect growing needs

75% Fraction of storage budget allocated to deployment site (FY13-19)

25% Fraction of storage budget allocated to non-deployment site (FY13-19)

60% Fraction of remaining compute hardware budget allocated to cluster hardware Assumes 60/40 split between conventional and GPU hardware

827 JLab # of conventional nodes - starting point (9q @ 328; 10q @ 224; 12s @ 275 nodes)

660 JLab # of GPUs - starting point (9g @248 GPUs ;10g @ 212 GPUs; 11g @ 32 GPUs;12k @ 168 GPUs)

985 FNAL # of conventional nodes starting point (Ds @ 421 nodes; Bc @ 224 nodes; FY14c @ 340 nodes)

304 FNAL # of GPUs - starting point (Dsg @ 152 GPUs; FY14g @ 152 GPUs)

Calculated remaining compute hardware budget 0 0 0 0 0 0

FY15 FY16 FY17 FY18 FY19 Total

Compute hardware budget ($) - 609,245 827,135 1,138,135 1,306,175 3,880,690

BG/Q compute hardware budget ($) - - - - - - BG/Q deployment site

Cluster compute hardware budget ($) - 560,505 760,964 1,047,084 1,201,681 3,570,235 Cluster deployment site

Storage hardware budget ($) - 36,555 49,628 68,288 78,371 232,841 BG/Q or Cluster deployment site

Storage hardware budget ($) - 12,185 16,543 22,763 26,124 77,614 Non-deployment site

Storage hardware budget ($) - - - - - - Storage increment

Total equipment budget ($) - 609,245 827,135 1,138,135 1,306,175 3,880,690

% of compute budget allocated for IB clusters 100% 60% 60% 60% 60%

Allocation for conv. cluster hardware ($) - 336,303 456,579 628,251 721,009 2,142,141

Direct portion of conv. cluster budget - 306,303 426,579 521,851 721,009 1,975,741

Overhead portion of conv. cluster budget - 30,000 30,000 106,400 - 166,400

Allocation for GPU hardware ($) - 224,202 304,386 418,834 480,672 1,428,094

Direct portion of GPU budget - 194,202 274,386 329,706 480,672 1,278,966

Overhead portion of GPU budget - 30,000 30,000 89,128 - 149,128

Estimated # of cluster nodes purchased - 87 121 147 204 558

Estimated # of GPUs purchased - 48 68 82 120 318

Actual cluster nodes purchased - - - - - -

Actual GPU nodes purchased - - - - - -

Actual GPUs purchased - - - - - -
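As a rough illustration of how the assumption parameters drive the hardware rows above, the sketch below reproduces the FY16 column. It is an approximation under stated assumptions, not the project's spreadsheet: the function name and structure are hypothetical, the overhead amounts are taken directly from the table rather than derived from the G&A rates, and only FY16 is checked.

```python
def allocate_equipment_budget(
    equipment_budget,
    overhead_conv,
    overhead_gpu,
    storage_fraction=0.08,     # FY13-19 storage share of equipment budget
    deploy_site_share=0.75,    # storage share for the deployment site
    cluster_share=0.60,        # 60/40 split, conventional vs. GPU
    nodes_per_million=283,     # cluster nodes per $1M of direct funds
    gpus_per_million=249,      # GPUs per $1M of direct funds
):
    """Hypothetical helper: split one year's equipment budget (in $)."""
    storage = equipment_budget * storage_fraction
    compute = equipment_budget - storage
    conventional = compute * cluster_share
    gpu = compute - conventional
    return {
        "storage_deployment_site": storage * deploy_site_share,
        "storage_non_deployment_site": storage * (1 - deploy_site_share),
        "cluster_compute": compute,
        "conventional": conventional,
        "gpu": gpu,
        # Purchases are estimated from the direct (non-overhead) portion.
        "nodes": (conventional - overhead_conv) * nodes_per_million / 1e6,
        "gpus": (gpu - overhead_gpu) * gpus_per_million / 1e6,
    }


# FY16 inputs as shown in the table: total equipment budget and overheads.
fy16 = allocate_equipment_budget(609_245, overhead_conv=30_000, overhead_gpu=30_000)
for name, value in fy16.items():
    print(f"{name:>28}: {value:12,.0f}")
```

Rounded to whole dollars and whole units, the FY16 results agree with the table (storage 36,555 and 12,185; cluster compute 560,505; conventional 336,303; GPU 224,202; 87 nodes; 48 GPUs). Other years may differ slightly from this simple sketch.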


Basis of Estimate – FNAL Planned Level of Effort Example


• Basis of Estimate derived using staffing model and assumptions

• Takes into account acquisition of new hardware and retirement of systems at end-of-life

• Average count per year assumes new systems brought online in Q3 of each fiscal year.

• Changing parameter values in the Assumptions table automatically updates these tables.

Level-of-Effort: FNAL Hardware Deployments and SS Operations Support (FTE-yrs)

Columns: # nodes and # GPUs are counts; <# nodes> and <# GPUs> are average counts for the year. Effort columns: (a) Site mgmt; (b) Base ops sys admin support; (c) Additional cluster ops support; (d) Additional GPU ops support; (e) File server admin support; (f) Total operations effort.

FNAL              # nodes  <# nodes>  # GPUs  <# GPUs>   (a)   (b)   (c)   (d)   (e)   (f)
FY15  start year      985        985     304       304   0.2   0.5   1.1   0.7   0.5   2.8
      acquired          0                  0
      retired           0                  0
      end year        985                304
FY16  start year      985        880     304       266   0.2   0.5   1.0   0.6   0.5   2.6
      acquired          0                  0
      retired         421                152
      end year        564                152
FY17  start year      564        564     152       152   0.2   0.5   0.6   0.3   0.5   2.0
      acquired          0                  0
      retired           0                  0
      end year        564                152
FY18  start year      564        553     152       138   0.2   0.5   0.6   0.3   0.5   1.9
      acquired        180                 97
      retired         224                152
      end year        520                 97
FY19  start year      520        492      97       130   0.2   0.5   0.5   0.3   0.5   1.8
      acquired        226                133
      retired         340                  0
      end year        406                230
TOTAL                                                    1.0   2.5   3.9   2.2   2.5  11.1
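The FNAL steady-state operations effort can be approximated from the staffing model parameters, as sketched below. Two things here are assumptions rather than statements from the slides: the helper names are hypothetical, and the averaging rule (changes in node and GPU counts weighted by the last quarter of the fiscal year) is inferred because it reproduces the tabulated averages; the actual spreadsheet formula is not shown in the deck.

```python
NODES_PER_FTE = 900        # cluster nodes supported by one FTE
GPUS_PER_FTE_FNAL = 450    # GPUs supported by one FTE at FNAL
BASE_ADMIN_FTE = 0.5       # base admin support per site
FILE_SERVER_FTE = 0.5      # steady-state file server admin support


def average_count(start, acquired, retired, fraction_of_year=0.25):
    """Inferred rule: acquisitions and retirements count for the last quarter."""
    return start + fraction_of_year * (acquired - retired)


def ops_effort(avg_nodes, avg_gpus):
    """Total operations effort in FTE-yrs (site management excluded)."""
    return (BASE_ADMIN_FTE + avg_nodes / NODES_PER_FTE
            + avg_gpus / GPUS_PER_FTE_FNAL + FILE_SERVER_FTE)


# year: (nodes start, acquired, retired, GPUs start, acquired, retired)
fnal = {
    "FY15": (985, 0, 0, 304, 0, 0),
    "FY16": (985, 0, 421, 304, 0, 152),
    "FY17": (564, 0, 0, 152, 0, 0),
    "FY18": (564, 180, 224, 152, 97, 152),
    "FY19": (520, 226, 340, 97, 133, 0),
}

effort = {}
for year, (n0, n_acq, n_ret, g0, g_acq, g_ret) in fnal.items():
    avg_nodes = average_count(n0, n_acq, n_ret)
    avg_gpus = average_count(g0, g_acq, g_ret)
    effort[year] = ops_effort(avg_nodes, avg_gpus)
    print(f"{year}: <nodes>={avg_nodes:6.1f} <GPUs>={avg_gpus:6.1f} "
          f"effort={effort[year]:.2f} FTE-yrs")
```

Under these assumptions the yearly totals round to 2.77, 2.57, 1.96, 1.92 and 1.84 FTE-yrs, which are the Fermilab steady-state operations support values reported in the Level of Effort Summary.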


Level of Effort Summary

The Level of Effort summary was developed based on the staffing model, deployment assumptions, the hardware deployment plan defined in the LQCD-ext II Acquisition Strategy, and over 8 years of operating experience.

Level of effort shown is associated with the $14M budget scenario.

LEVEL OF EFFORT (FTE-yrs)

FY15 FY16 FY17 FY18 FY19 Total

Brookhaven

Site Management - - - - -

Steady-state Operations Support 0.40 0.40 0.40 0.10 0.10

Deployment Planning - - - - -

Deployment Support - - - - -

Project Management - - - - -

Sub-total (BNL) 0.40 0.40 0.40 0.10 0.10 1.40

Fermilab

Site Management 0.20 0.20 0.20 0.20 0.20

Steady-state Operations Support 2.77 2.57 1.96 1.92 1.84

Deployment Planning - - - 0.20 0.20

Deployment Support - - - 0.50 0.50

Project Management 0.35 0.35 0.35 0.35 0.35

Sub-total (FNAL) 3.32 3.12 2.51 3.17 3.09 15.21

Thomas Jefferson National Accelerator Facility

Site Management 0.20 0.20 0.20 0.20 0.20

Steady-state Operations Support 3.14 2.97 2.36 1.94 1.72

Deployment Planning - 0.20 0.20 - -

Deployment Support - 0.50 0.50 - -

Project Management - - - - -

Sub-total (JLab) 3.34 3.87 3.26 2.14 1.92 14.53

Total

Site Management 0.40 0.40 0.40 0.40 0.40

Steady-state Operations Support 6.31 5.94 4.73 3.96 3.66

Deployment Planning - 0.20 0.20 0.20 0.20

Deployment Support - 0.50 0.50 0.50 0.50

Project Management 0.35 0.35 0.35 0.35 0.35

Total 7.06 7.39 6.18 5.41 5.11 31.14


Salary Cost Basis

Hidden cells contain fully-loaded salary costs for the various personnel categories listed on the left.

Salary costs for BNL and JLab have been provided by those Site Managers. FNAL salary costs have been provided by the Contractor Project Manager (located at FNAL).

Lab-specific fringe and overhead rates have been applied to all base salaries.

Salaries are escalated by 3% for inflation.


LQCD-ext II Cost Forecast THIS IS NORMALLY A HIDDEN SHEET AND SHOULD ONLY BE MADE VIEWABLE ON AN AS-NEEDED BASIS.

Cost Basis Worksheet DEFAULT IS TO KEEP THIS SHEET HIDDEN.

Inflation rate: 3.0%

Salary Cost Basis ($)

(Fully-loaded salary costs)

FY15 FY16 FY17 FY18 FY19 Basis of Estimate

BNL

Site Management 0 0 0 0 0 FY13-14: Fully-loaded FY13 average provided by Frank Quarant (see spreadsheet LQCD FY13 Proposed Labor.xlsx). FY14 scaled by inflation.

Operations Support 147,429 151,852 156,408 161,100 165,933 FY13-14: Fully-loaded FY13 average provided by Frank Quarant (see spreadsheet LQCD FY13 Proposed Labor.xlsx). FY14 scaled by inflation.

JLab

Site Management 355,350 366,011 376,991 388,301 345,000 FY14: Fully-loaded FY14 salary cost adjustment per Chip Watson e-mail dated 7/15/2013

Operations Support 208,060 214,302 220,731 227,353 202,000 FY14: Fully-loaded FY14 salary cost adjustment per Chip Watson e-mail dated 7/15/2013

Deployment Support 216,892 223,399 230,100 237,004 244,114 FY14: Fully-loaded FY14 salary cost adjustment per Chip Watson e-mail dated 7/15/2013 did not include a revised deployment support cost, so the FY14 value was left unchanged.

FNAL

Site Management 299,481 308,465 317,719 327,251 337,068 FY13-14: Fully-loaded FY13 salary cost adjustment per current FY12 salary and burden rate information. FY14 scaled by inflation.

Operations Support 253,522 261,128 268,961 277,030 285,341 FY13-14: Fully-loaded FY13 salary cost adjustment per current FY12 salary and burden rate information. FY14 scaled by inflation.

Deployment Planning 299,481 308,465 317,719 327,251 337,068 FY13-14: Fully-loaded FY13 salary cost adjustment per current FY12 salary and burden rate information. FY14 scaled by inflation.

Deployment Support 253,522 261,128 268,961 277,030 285,341 FY13-14: Fully-loaded FY13 salary cost adjustment per current FY12 salary and burden rate information. FY14 scaled by inflation.

Project Management 315,138 324,593 334,330 344,360 354,691 FY13-14: Fully-loaded FY13 salary cost adjustment per current FY12 salary and burden rate information. FY14 scaled by inflation.
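The 3% annual escalation can be illustrated with a short sketch. The function name is hypothetical, and the example uses the BNL Operations Support row, starting from the rounded FY15 value shown above. Because the worksheet presumably escalates unrounded base salaries, results may differ from the table by about a dollar.

```python
def escalate(base_cost, rate=0.03, years=5):
    """Return yearly costs with compound escalation applied from year 0."""
    return [base_cost * (1 + rate) ** n for n in range(years)]


# BNL Operations Support, FY15..FY19, as listed in the worksheet.
table_values = [147_429, 151_852, 156_408, 161_100, 165_933]
bnl_ops = escalate(table_values[0])

for fy, computed, listed in zip(range(15, 20), bnl_ops, table_values):
    print(f"FY{fy}: computed {computed:10,.0f}   listed {listed:10,}")
```

Not every row follows this pattern exactly; the FY19 JLab Site Management and Operations Support entries, for example, are not 3% above their FY18 values.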


Effort and Budget Summary Summary shown for $14 million funding scenario

LEVEL OF EFFORT (FTE-yrs)

FY15 FY16 FY17 FY18 FY19 Total

Site Management 0.40 0.40 0.40 0.40 0.40

Steady-state Operations Support 6.31 5.94 4.73 3.96 3.66

Deployment Planning 0.00 0.20 0.20 0.20 0.20

Deployment Support 0.00 0.50 0.50 0.50 0.50

Project Management 0.35 0.35 0.35 0.35 0.35

Total 7.06 7.39 6.18 5.41 5.11 31.14

BUDGET ($K)

FY15 FY16 FY17 FY18 FY19 Total

Total Project Cost

Personnel 1,655 1,809 1,566 1,457 1,360 7,846

Travel 17 17 17 17 17 84

M&S 283 283 283 102 102 1,053

Equipment (compute) - 743 992 1,242 1,333 4,310

Equipment (storage) - 65 65 108 116 353

Management Reserve 45 84 77 75 73 353

Total 2,000 3,000 3,000 3,000 3,000 14,000

CD-1 Planning Budget Guidance Profile 2,000 3,000 3,000 3,000 3,000 14,000

Additional funding for larger budget - - - - - -

Total CD-1 Planning Budget Profile 2,000 3,000 3,000 3,000 3,000 14,000

Notes:

1) Management reserve set at 20% of unspent deployment personnel budget and 3% of unspent steady-state ops personnel budget


Effort and Budget Summary Summary shown for $18 million funding scenario

LEVEL OF EFFORT (FTE-yrs)

FY15 FY16 FY17 FY18 FY19 Total

Site Management 0.40 0.40 0.40 0.40 0.40

Steady-state Operations Support 6.31 6.16 5.72 5.36 5.22

Deployment Planning 0.00 0.20 0.20 0.20 0.20

Deployment Support 0.00 0.50 0.50 0.50 0.50

Project Management 0.35 0.35 0.35 0.35 0.35

Total 7.06 7.61 7.17 6.81 6.67 35.31

BUDGET ($K)

FY15 FY16 FY17 FY18 FY19 Total

Total Project Cost

Personnel 1,655 1,861 1,803 1,794 1,720 8,832

Travel 17 17 17 17 17 84

M&S 283 283 283 102 102 1,053

Equipment (compute) - 1,614 1,702 1,843 1,912 7,070

Equipment (storage) - 140 111 160 166 578

Management Reserve 45 85 84 85 84 383

Total 2,000 4,000 4,000 4,000 4,000 18,000

CD-1 Planning Budget Guidance Profile 2,000 3,000 3,000 3,000 3,000 14,000

Additional funding for larger budget - 1,000 1,000 1,000 1,000 4,000

Total CD-1 Planning Budget Profile 2,000 4,000 4,000 4,000 4,000 18,000

Notes:

1) Management reserve set at 20% of unspent deployment personnel budget and 3% of unspent steady-state ops personnel budget


Budget Distribution by Expenditure Type

Total Project Cost, by Expenditure Type (budget in $K, with share of total)

Expenditure Type            LQCD-ext ($18.15M)     LQCD-ext II ($18M)     LQCD-ext II ($14M)
                            Budget ($K)   Share    Budget ($K)   Share    Budget ($K)   Share
Personnel                         7,115     39%          8,830     49%          7,845     56%
Travel                               64    0.4%             85    0.5%             85    0.6%
M&S                                 610      3%          1,050      6%          1,050      8%
Compute/storage hardware         10,008     55%          7,650     42%          4,665     33%
Management Reserve                  354      2%            385      2%            355      3%
Total                            18,150                 18,000                 14,000
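The percentage shares by expenditure type follow directly from the budget figures. The sketch below recomputes them as a consistency check; the dictionary layout and function name are illustrative only. Each share is taken relative to the sum of the listed items, which for LQCD-ext comes to 18,151 rather than the stated 18,150 because of rounding.

```python
# Budget by expenditure type, in $K, as listed on the slide.
budgets_k = {
    "LQCD-ext ($18.15M)": {
        "Personnel": 7115, "Travel": 64, "M&S": 610,
        "Compute/storage hardware": 10008, "Management Reserve": 354,
    },
    "LQCD-ext II ($18M)": {
        "Personnel": 8830, "Travel": 85, "M&S": 1050,
        "Compute/storage hardware": 7650, "Management Reserve": 385,
    },
    "LQCD-ext II ($14M)": {
        "Personnel": 7845, "Travel": 85, "M&S": 1050,
        "Compute/storage hardware": 4665, "Management Reserve": 355,
    },
}


def shares(budget):
    """Return each item's share of the summed budget, in percent."""
    total = sum(budget.values())
    return {item: 100.0 * value / total for item, value in budget.items()}


all_shares = {name: shares(budget) for name, budget in budgets_k.items()}

for name, result in all_shares.items():
    print(name)
    for item, pct in result.items():
        print(f"  {item:<26} {pct:5.1f}%")
```

The comparison shows the shift in emphasis: hardware falls from roughly 55% of LQCD-ext to about 42% in the $18M scenario and 33% in the $14M scenario, while the personnel share rises correspondingly.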


Total Project Cost Profile

Profile shows that hardware budget varies by year, based on funding profile and level of personnel staffing required to support production systems.

Personnel cost requirements are based on the staffing model previously discussed. Level of operations support is based on number of nodes and GPUs in production during that year.

Escalation factor of 3% per year applied to personnel cost forecast.

These bottom-up profiles have been adjusted to match the planning budget guidance profiles.


Total Project Cost Profile Comparison $14 million Budget Scenario

The $14 million budget scenario represents a significant reduction in funding from current levels, which had been back-loaded in the funding profile for the current project (LQCD-ext).

Personnel cost requirements are based on staffing model previously discussed. Level of operations support is based on number of nodes and GPUs in production during each year.

Reduced funding level directly affects the amount of compute capacity we can deliver to the science program.

[Chart annotation: indicates 4-yr system lifecycle.]


Total Project Cost Profile Comparison $18 million Budget Scenario

The $18 million budget scenario is comparable to the existing LQCD-ext budget scenario, but the fraction of the budget available for new hardware procurements is smaller. This is due to the number of systems inherited from LQCD-ext that we will be operating, and to the lower funding level planned for FY15, when we will not purchase new systems but will operate existing ones.

The reduced level of funding in FY15 poses the same issue as with the $14 million budget scenario, in that we will only be able to fund operations and maintenance in FY15; there are no funds to procure new hardware in FY15.

[Chart annotation: indicates 4-yr system lifecycle.]


Management Reserve

Used to cover the cost of unanticipated but required labor expenses that arise during the course of deploying new systems or supporting steady-state operations.

Allocated only after it is clear that costs cannot be covered by adjusting priorities or rearranging work.

Set at 20% of the unspent deployment personnel budget and 3% of the unspent steady-state operations personnel budget.

Unspent management reserve funds will be applied toward new hardware procurement in the subsequent year.

Management reserve funds will be controlled by the CPM. Any use of management reserve funds will be reported to the Federal Project Director and Project Monitor during the monthly progress report and noted during the DOE annual progress review.
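The reserve rule stated above can be written as a one-line formula, sketched here. The function name and the input amounts are hypothetical and chosen only to show the arithmetic; they are not taken from the project budget.

```python
def management_reserve(deployment_personnel, ops_personnel,
                       deployment_rate=0.20, ops_rate=0.03):
    """Reserve = 20% of deployment personnel + 3% of steady-state ops personnel.

    Inputs are the unspent personnel budgets, in any consistent unit.
    """
    return deployment_rate * deployment_personnel + ops_rate * ops_personnel


# Hypothetical inputs in $K: 200 for deployment, 1,500 for steady-state ops.
example_reserve = management_reserve(200.0, 1500.0)
print(f"Management reserve: {example_reserve:.0f} $K")  # 40 + 45 = 85
```

The higher rate on deployment personnel suggests that deployment work is treated as carrying more labor-cost uncertainty than steady-state operations.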


Schedule Overview

LQCD-ext II schedule consists of two primary components: steady-state operations and new hardware deployments.

WBS elements capture work on a fiscal year basis.

Steady-state operations activities are shown as constant level-of-effort activities.

Activities associated with new hardware deployments follow a pre-established sequence, developed during the current project, that has proven effective.

1. Update acquisition plans and alternative analysis for review and approval at the annual project progress meeting (typically in May or June), for systems procured in the following year.

2. Deploy new systems by June 30 in any given year (the exception is when procurements span fiscal year boundaries).

3. Summarize and report on delivered Tflop/s-yrs at the end of each fiscal year.

Other recurring activities captured in the schedule include annual progress reviews, annual testing of security controls and review of contingency plans, etc.

Costs are tracked down to Level 4 in the WBS.


Work Breakdown Structure - preliminary


Level-1 Milestones

Level-1 milestones are defined in Table 1 of the PEP and will be tracked in the WBS and project plan.

No. Level 1 Milestone Fiscal Year

1 Computer architecture planning for the FY16 procurement complete & reviewed Q3 2015

2 Procurement and deployment of 0 teraflops (sustained – Conventional Resources) in FY15 (no deployment in FY15 is planned, but this placeholder will account for any change in budget profile) Q3 2015

3 100 Teraflops-years aggregate Conventional Resource computing delivered in FY15 Q4 2015

4 75 Effective Teraflops-years aggregate GPU-accelerated Resource computing delivered in FY15 Q4 2015

5 Computer architecture planning for the FY17 procurement complete & reviewed Q3 2016

6 Procurement and deployment of 16 teraflops (sustained - Conventional Resources) in FY16 Q3 2016

7 Procurement and deployment of 20 effective teraflops (sustained – Accelerated Resources) in FY16 Q3 2016

8 87 Teraflops-years aggregate Conventional Resource computing delivered in FY16 Q4 2016

9 68 Effective Teraflops-years aggregate GPU-accelerated Resource computing delivered in FY16 Q4 2016

10 Computer architecture planning for the FY18 procurement complete & reviewed Q3 2017

11 Procurement and deployment of 32 teraflops (sustained - Conventional Resources) in FY17 Q3 2017

12 Procurement and deployment of 39 effective teraflops (sustained – Accelerated Resources) in FY17 Q3 2017

13 98 Teraflops-years aggregate Conventional Resource computing delivered in FY17 Q4 2017

14 72 Teraflops-years aggregate GPU-accelerated Resource computing delivered in FY17 Q4 2017

15 Computer architecture planning for the FY19 procurement complete & reviewed Q3 2018

16 Procurement and deployment of 63 teraflops (sustained - Conventional Resources) in FY18 Q3 2018

17 Procurement and deployment of 77 teraflops (sustained – Accelerated Resources) in FY18 Q3 2018

18 142 Teraflops-years aggregate Conventional Resource computing delivered in FY18 Q4 2018

19 128 Effective Teraflops-years aggregate GPU-accelerated Resource computing delivered in FY18 Q4 2018

20 Procurement and deployment of 101 teraflops (sustained - Conventional Resources) in FY19 Q3 2019

21 Procurement and deployment of 124 effective teraflops (sustained – Accelerated Resources) in FY19 Q3 2019

22 185 Teraflops-years aggregate Conventional Resource computing delivered in FY19 Q4 2019

23 225 Effective Teraflops-years aggregate GPU-accelerated Resource computing delivered in FY19 Q4 2019


Level-2 Milestones

Level-2 milestones are typical for each planned procurement.

Level-2 milestones are tracked by the Project Office and reported to the Federal Project Director and Project Monitor during the monthly progress calls.


Level 2 Milestones

Preliminary System Design Document completed

Request for Information (RFI) released to vendors

Request for Proposal (RFP) released to vendors

Purchase subcontract awarded

Approval of first rack.

Delivery of remaining equipment

Successful installation and completion of Acceptance Test Plan

Release to “Friendly User” production

Release to full production

Level 2 milestones as shown in Table 2 of the PEP


Summary

Estimated Total Project Cost Range = $14M to $18M.

We have developed preliminary budgets for each of the planning budget scenarios we have been given. Each budget has been aligned with funding profile guidance.

Budgets have been developed using a bottom-up approach that leverages more than 8 years of operating experience.

Project WBS and schedule will contain all steady-state and new deployment activities, along with Level-1 and Level-2 milestones.

All Level-1 milestones will be defined in the Project Execution Plan.

Other performance metrics and key performance indicators are also documented and tracked in the PEP.

Effective cost and schedule control mechanisms are in place and have been successfully used in the existing project; these will continue to be used and refined for the extension.

The $14M budget scenario reflects a substantial reduction in funding from previous levels. A significant restructuring of our operations and service delivery models will be required to minimize the impact and increase the level of new computing capacity that will be delivered to the science program.