
Page 1: Infrastructure and Provisioning at the Fermilab High Density Computing Facility

Steven C. Timm

Fermilab

HEPiX conference

May 24-26, 2004

Page 2: Outline

- Current Fermilab facilities
- Expected need for future Fermilab facilities
- Construction activity at the High Density Computing Facility
- Networking and power infrastructure
- Provisioning and management at a remote location

Page 3: A cast of thousands…

- HDCF design done by Fermilab Facilities Engineering; construction by an outside contractor
- Managed by CD Operations (G. Bellendir et al.)
- Requirements planning by a task force of Computing Division personnel, including system administrators, department heads, networking, and facilities people
- Rocks development work by S. Timm, M. Greaney, and J. Kaiser

Page 4: Current Fermilab facilities

- Feynman Computing Center built in 1988 (to house a large IBM-compatible mainframe)
- ~18,000 square feet of computer rooms
- ~200 tons of cooling
- Maximum input current 1800 A
- Computer rooms backed up with UPS
- Full building backed up with generator
- ~1850 dual-CPU compute servers and ~200 multi-TB IDE RAID servers in FCC right now
- Many other general-purpose servers, file servers, and tape robots

Page 5: Current facilities continued

- Satellite computing facility in the former experimental hall "New Muon Lab"
- Historically for Lattice QCD clusters (208512)
- Now contains >320 other nodes awaiting construction of the new facility

Page 6: The long hot summer

- In summer it takes considerably more energy to run the air conditioning
- Dependent on a shallow pond for cooling water
- In May the building already came within 25 A (out of 1800 A) of having to shut down equipment to shed power load and avoid a brownout
- Current equipment exhausts the cooling capacity of the Feynman Computing Center as well as its electrical capacity
- No way to increase either in the existing building without long service outages

Page 7: Computers just keep getting hotter

- Anticipate that in fall '04 we can buy dual Intel 3.6 GHz "Nocona" chips, ~105 W apiece
- Expect at least 2.5 A of current draw per node, maybe more: 12-13 kVA per rack of 40 nodes (see the check below)
- In FCC we have 32 computers per rack, 8-9 kVA
- Have problems cooling the top nodes even now
- New facility will have 5x more cooling: 270 tons for 2000 square feet
- New facility will have up to 3000 A of electrical current available
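A quick sanity check on those figures, using the 120 VAC circuits described on page 18:

\[
40~\text{nodes} \times 2.5~\text{A} \times 120~\text{V} = 12{,}000~\text{VA} = 12~\text{kVA per rack},
\]

which is consistent with the quoted 12-13 kVA.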

Page 8: We keep needing more computers

- Moore's law doubling time isn't holding true in the commodity market
- Computing needs are growing faster than Moore's law and must be met with more computers
- Five-year projections are based on plans from the experiments

Page 9: Fermi Cycles as a function of time

[Chart: "Fermi cycles vs time," 1998 through 2003; y-axis 0-3000 Fermi Cycles; data series "Fermi Cycles" with fitted curve "Moore's law F=2.02"]

The fit is Y = R · 2^(X/F). Moore's law says F = 1.5 years; the measured doubling time is F = 2.02 years and growing. (1000 Fermi Cycles ≈ Pentium III at 1 GHz.)
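A minimal sketch of how such a doubling time can be extracted: fit log2 of performance linearly against time, so the slope is 1/F. The sample points below are invented for illustration; the real data behind the F = 2.02 fit is not reproduced in the slides.

```python
import numpy as np

# Hypothetical (years since Jan 1998, Fermi Cycles) samples -- illustrative only.
samples = [(0.0, 180), (1.5, 320), (3.0, 520), (4.5, 900), (6.0, 1500)]
x = np.array([t for t, _ in samples])
y = np.array([c for _, c in samples])

# Model: Y = R * 2**(X/F)  =>  log2(Y) = log2(R) + X/F.
# A straight-line fit in log2 space gives slope = 1/F, intercept = log2(R).
slope, intercept = np.polyfit(x, np.log2(y), 1)
F = 1.0 / slope       # doubling time in years
R = 2.0 ** intercept  # performance at X = 0

print(f"doubling time F = {F:.2f} years, R = {R:.0f} Fermi Cycles")
```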

Page 10: Fermi Cycles per ampere as a function of time

[Chart: "Performance per Ampere," September 2000 to January 2004; y-axis 0-2500 Fermi Cycles per ampere]

Page 11: Fermi cycles per dollar as a function of time

[Chart: "Performance/price," 1998 through 2003; y-axis 0.00-3.00 Fermi cycles per US dollar]

Page 12: Strategy

- Feynman Computing Center will be the UPS- and generator-backed facility for important servers
- The new HDCF will have UPS for graceful shutdown but no generator backup; it is designed for high-density compute nodes (plus a few tape robots)
- 10-20 racks of existing 1U nodes will be moved to the new facility and reracked
- Anticipate 10-15 racks of new purchases this fall, also in the new building

Page 13: Location of HDCF

- 1.5 miles away from FCC
- No administrators will be housed there; it will be managed "lights out"

Page 14: Floor plan of HDCF

Room for 72 racks in each of 2 computer rooms.

Page 15: Cabling plan

Network infrastructure: bundles of individual Cat-6 cables will be used.

Page 16: Current status

- Construction began in early May
- Occupancy Nov/Dec 2004 (estimated)
- Phases I+II: space for 56 racks at that time
- Expected cost: US$2.8M

Page 17: Power/console infrastructure

- Cyclades AlterPath series: includes console servers, network-based KVM adapters, and power strips
- AlterPath ACS48 runs PPC Linux
- Supports Kerberos 5 authentication
- Access control can be set per port
- Any number of power strip outlets can be associated with each machine on each console port
- All configurable via the command line or a Java-based GUI

Page 18: Power/console infrastructure (continued)

- PM-10 power strips: 120 VAC, 30 A
- 10 nodes per circuit
- Four units per rack (see the loading check below)
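Using the ~2.5 A per node estimate from page 7, the loading on these strips works out as:

\[
10~\text{nodes} \times 2.5~\text{A} = 25~\text{A per 30 A circuit}, \qquad 4~\text{strips} \times 10~\text{nodes} = 40~\text{nodes per rack},
\]

matching the 40-node racks planned for the new facility.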

Page 19: Installation with NPACI-Rocks

- NPACI (National Partnership for Advanced Computational Infrastructure); the lead institution is the San Diego Supercomputer Center
- Rocks: the "ultimate cluster-in-a-box tool." Combines a Linux distribution, a database, a heavily modified installer, and a large set of parallel computing applications such as PBS, Maui, SGE, MPICH, Atlas, and PVFS
- Rocks 3.0 is based on Red Hat Linux 7.3
- Rocks 3.1 and later are based on the SRPMs of Red Hat Enterprise Linux 3.0

Page 20: Rocks vs. Fermi Linux comparison

Both distributions start from Red Hat 7.3:

- Fermi Linux adds: workgroups, yum, OpenAFS, Fermi Kerberos/OpenSSH
- Rocks 3.0 adds: extended kickstart, HPC applications, MySQL database

Page 21: Rocks architecture vs. Fermi application

Rocks assumptions:
- Expects all compute nodes on a private net behind a firewall
- Reinstall a node if anything changes
- All network services (DHCP, DNS, NIS) supplied by the frontend

Fermi application:
- Nodes are on the public net
- Users won't allow downtime for frequent reinstalls
- Use yum and other Fermi Linux tools for security updates (see the sketch below)
- Configure Rocks to use our external network services
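A minimal sketch of what "yum for security updates" can look like on such a node; the file path and 02:30 schedule are illustrative choices, not taken from the slides, and it assumes yum is already pointed at the Fermi Linux repositories:

```
# /etc/cron.d/yum-update (illustrative): apply package updates nightly
# at 02:30 from the configured Fermi Linux repositories.
30 2 * * * root /usr/bin/yum -y update
```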

Page 22: Fermi extensions to Rocks

- Fermi production farms currently have 752 nodes, all installed with Rocks
- This Rocks cluster has the most CPUs registered of any cluster at rocksclusters.org
- Added extra tables to the database for customizing the kickstart configuration; we have 14 different disk configurations (see the sketch below)
- Added the Fermi Linux comps files so that all Fermi workgroups are available in installs, plus all added Fermi RPMs
- Made slave frontends serve as install servers during mass reinstall phases; during normal operation one install server is enough
- Added logic to recreate Kerberos keytabs
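A hypothetical sketch of the "extra tables" idea: map a per-node disk configuration, as it might be stored in the Rocks MySQL database, to kickstart partitioning directives. The table contents, field layout, and function name are invented for illustration; the slides only say that extra database tables drive 14 disk configurations.

```python
# Invented disk-configuration records: config_id -> (mount, size MB, grow?)
DISK_CONFIGS = {
    1: [("/boot", 100, False), ("swap", 2048, False), ("/", 8192, False),
        ("/local", 1, True)],
    2: [("/boot", 100, False), ("swap", 1024, False), ("/", 1, True)],
}

def kickstart_partitions(config_id: int, drive: str = "hda") -> str:
    """Render one disk configuration as kickstart 'part' lines."""
    lines = []
    for mount, size_mb, grow in DISK_CONFIGS[config_id]:
        opts = f"--size {size_mb}" + (" --grow" if grow else "")
        lines.append(f"part {mount} {opts} --ondisk {drive}")
    return "\n".join(lines)

print(kickstart_partitions(1))
```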

Page 23: S.M.A.R.T. monitoring

- The "smartd" daemon from the smartmontools package gives early warning of disk failures
- Disk failures have been ~70% of all hardware failures in our farms over the last 5 years
- Run a short self-test on all disks every day (see the example smartd.conf line below)
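A minimal smartd.conf sketch of that daily self-test policy; the 02:00 start hour is an illustrative choice:

```
# /etc/smartd.conf (illustrative)
# Monitor all detected disks, track SMART attributes, and schedule a
# short (S) self-test every day of every month at hour 02.
DEVICESCAN -a -s S/../.././02
```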

Page 24: Temperature/power monitoring

- Wrappers for lm_sensors feed NGOP and Ganglia
- Measure the average temperature of each node over a month
- Alarm when a node is 5 °C or 10 °C above its average
- Page when 50% of any group is 10 °C above average
- An automated shutdown script activates when any single node exceeds the emergency temperature
- A building-wide signal will provide notice that we are on UPS power and have 5 minutes to shut down
- Automated OS shutdown and SNMP poweroff scripts (a threshold sketch follows)
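A minimal sketch of the alarm thresholds described above. Reading temperatures from lm_sensors and sending pages are stubbed out, and all names and sample readings are invented; in practice the monthly averages would come from the NGOP/Ganglia history.

```python
WARN_DELTA = 5.0     # degrees C above a node's monthly average -> warn
ALARM_DELTA = 10.0   # degrees C above average -> alarm
PAGE_FRACTION = 0.5  # page if this fraction of a group is at ALARM_DELTA

def classify(temp_c: float, monthly_avg_c: float) -> str:
    """Classify one node's reading against its monthly average."""
    delta = temp_c - monthly_avg_c
    if delta >= ALARM_DELTA:
        return "alarm"
    if delta >= WARN_DELTA:
        return "warn"
    return "ok"

def group_needs_page(temps: dict, averages: dict) -> bool:
    """Page when >= 50% of the group is 10 C or more above its average."""
    alarms = sum(
        1 for node, t in temps.items()
        if classify(t, averages[node]) == "alarm"
    )
    return alarms >= PAGE_FRACTION * len(temps)

# Example with made-up readings: node1 alarms (+11.5 C), node2 warns (+5.5 C).
avgs = {"node1": 35.0, "node2": 36.0}
now = {"node1": 46.5, "node2": 41.5}
print(group_needs_page(now, avgs))  # True: 1 of 2 nodes is >= +10 C
```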

Page 25: Reliability is key

- Remote clusters can only be managed successfully if the hardware is reliable
- All new contracts are written with the vendor providing a 3-year parts-and-labor warranty; they only make money if they build good hardware
- A 30-day acceptance test is critical for identifying hardware problems and fixing them before production begins
- With 750 nodes and 99% reliability, 8 nodes would still be down on any given day (see the arithmetic below)
- Historically reliability has been closer to 96%, but the new Intel Xeon-based nodes are much better
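The arithmetic behind those reliability figures:

\[
750 \times (1 - 0.99) = 7.5 \approx 8~\text{nodes down per day at 99\%}, \qquad 750 \times (1 - 0.96) = 30~\text{nodes at 96\%}.
\]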