TRANSCRIPT
SURA Birds of a Feather, I2 Spring Members Meeting
April 23, 2007
Gary Crane, SURA Director, IT Initiatives
SURA I2 BoF Agenda
• Introductions
• SURA Program Update
• SERON Committee Formation
• SURAgrid Update
• New SURAgrid Corporate Partnerships
– New IBM Linux Cluster offering
– Dell Linux Cluster offering
• Open Discussion
SURA IT Program Update
• See the current on-line IT Program Update for details: http://www.sura.org/programs/it.html
– IT Steering Group activities & summary of interactions with I2 & NLR
• SURA letters to I2 leadership – Attachments 1 & 2
– SURAgrid status & activities
– SURAgrid corporate partnership program
– AtlanticWave update
– AT&T GridFiber update
– SERON background and committee structure – Attachment 3
– SURAgrid Governance Proposal – Attachment 4
– SURAgrid application summaries – Attachments 5-9
• Multiple Genome Alignment – GSU
• Urban Water System Threat Management Simulation – NC State
• Bio-electric Simulator for Whole Body Tissues – ODU
• Storm Surge Modeling with ADCIRC – UNC-CH/RENCI
• Searching Protein Databases – UAB
SERON Committee Activities Background
• SURA SE footprint
– 63 research institutions
– 35% of dues-paying I2 members
– 7 of the 16 NLR memberships
– All but 3 states have R&E network initiatives
• Strong history of connectivity leadership
– SURANet
– Regional Infrastructure Initiative & AT&T Partnership
– Operational RONs: FLR, LEARN, LONI, MATP, MAX, NCREN, OneNet, SLR
Held Two Community Organized and Led Information & Planning Meetings
• July 27, 2006 Atlanta meeting – resulted in creation of 2 working groups:
– Explore ways to leverage SURA region RONs to reduce costs / improve services (working group led by Larry Conrad)
– Leverage the SURA region to influence the I2/NLR relationship (working group led by John Mullin)
• February 20, 2007 Atlanta meeting
– SERON Committee formed
– SURA RON Traffic Analysis group formed
First Meeting of the SERON Committee
April 17, 2007
• Agreed to continue to pursue cooperative efforts by SURA region RONs, leveraging SURA
• Nominated a SERON Chair, Charlie McMahon (LSU), and Co-Chair, Phil Halstead (FLR)
• Will work with SURA to continue to organize SERON community
SERON Products
• Several letters from SURA President and IT Committee Chair stating SURA regional views on I2/NLR developments
• An initial effort to organize a SURA regional traffic analysis
• Improved communications between I2 & NLR leaders and SURA region network leaders
• Active effort to nominate SURA regional networking leaders to I2 Council positions
SURAgrid Update
Mary Fran Yafchak, SURA IT Program Coordinator
SURAgrid Corporate Partnerships
• Existing IBM p5 575 partnership
• New IBM e1350 Linux partnership
• New Dell PowerEdge 1950 partnership
• Significant product discounts
• Owned and operated by SURAgrid participants
• Integrated into SURAgrid, with 20% of capacity available to SURAgrid
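The 20% commitment translates directly into shared capacity. A minimal sketch of that arithmetic, using the peak figures quoted elsewhere in these slides (the function name and structure are illustrative, not part of any SURAgrid tooling):

```python
# Hypothetical sketch: share of a partner cluster's peak capacity that
# becomes available to SURAgrid under the 20% integration commitment.
SURAGRID_SHARE = 0.20

def shared_tflops(peak_tflops, share=SURAGRID_SHARE):
    """Portion of a cluster's peak TFlops pledged to SURAgrid."""
    return peak_tflops * share

# Peak figures as quoted in these slides.
for name, peak in [("3 TFLOP e1350", 3.0), ("6 TFLOP e1350", 6.0), ("2 TFlop Dell", 2.0)]:
    print(f"{name}: {shared_tflops(peak):.1f} TFlop to SURAgrid")
```

So a site deploying the 3 TFLOP e1350 configuration would contribute roughly 0.6 TFlop of peak capacity to the shared grid.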
Existing P5 575 System Solution
• Robust hardware with high-reliability components
• 16-CPU scalability within a node
• Low-latency, high-performance switch technology
• AIX OS & software subsystems
• High compute density packaging
• Ability to scale to very large configurations
0.97 TFlop Solution for SURA
• 8 16-way nodes at 1.9 GHz (128 processors)
• Federation Switch
• 128 GB or 256 GB system memory
• Storage capacity: 2.35 TB
1.7 TFlop Solution for SURA
• 14 16-way nodes at 1.9 GHz (224 processors)
• Federation Switch
• 224 GB or 448 GB system memory
• Storage capacity: 4.11 TB
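The headline TFlop figures for both options follow from processors × clock × flops per cycle. A quick sanity check, assuming the usual peak-rate convention for POWER5 of 4 double-precision flops per cycle per processor (two fused multiply-add units); the helper function is illustrative:

```python
# Sanity check of the quoted peak figures for the two p5 575 options.
# Assumes 4 double-precision flops/cycle per POWER5 processor
# (two fused multiply-add units) -- the standard peak-rate convention.
def peak_tflops(processors, clock_ghz, flops_per_cycle=4):
    return processors * clock_ghz * flops_per_cycle / 1000.0

small = peak_tflops(8 * 16, 1.9)   # 8 nodes x 16-way = 128 processors
large = peak_tflops(14 * 16, 1.9)  # 14 nodes x 16-way = 224 processors
print(f"{small:.2f} / {large:.2f} TFlop")  # 0.97 / 1.70 TFlop
```

Both results match the quoted 0.97 and 1.7 TFlop headlines.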
Two Options
p5 575 Software
• AIX 5.3
• General Parallel File System (GPFS) with WAN support
• LoadLeveler
• Cluster Systems Management (CSM)
• Compilers (XL Fortran, XL C)
• Engineering and Scientific Subroutine Library (ESSL)
• IBM Parallel Environment (PE)
• Simultaneous Multi-Threading (SMT) support
• Virtualization, Micro-Partitioning, DLPAR
SURA Pricing for p5 575 Solutions
• 0.97 TFlop Solution
– 8 nodes: $380,000 to SURA (16 GB/node)*
– 8 nodes: $410,000 to SURA (32 GB/node)*
• 1.70 TFlop Solution
– 14 nodes: $610,000 to SURA (16 GB/node)*
– 14 nodes: $660,000 to SURA (32 GB/node)*
• Price includes 3-year warranty
– Hardware: M-F, 8-5, next-day service
• Pricing available through the end of calendar year 2007
* Net price to add a node with 16 GB memory: $53,000
* Net price to add a node with 32 GB memory: $56,752
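For sites weighing the two options, the quoted prices imply a cost per peak TFlop. A rough comparison using the base 16 GB/node prices above (this calculation is illustrative, not part of the IBM quote):

```python
# Rough price-per-peak-TFlop comparison from the quoted SURA prices
# (base 16 GB/node configurations).
options = {
    "0.97 TFlop, 8 nodes": (380_000, 0.97),
    "1.70 TFlop, 14 nodes": (610_000, 1.70),
}
for name, (price, tflops) in options.items():
    print(f"{name}: ${price / tflops:,.0f} per peak TFlop")
```

As expected for a scale-up system, the larger configuration is somewhat cheaper per peak TFlop.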
New SURA e1350 Linux Cluster
• New IBM BladeCenter-H, new HS21XM blades, and Intel quad-core processors
• 3 TFLOP Configuration
– One-rack solution with GigE interconnect
– 1 GB/core
– Combination management/user node with storage
• 6 TFLOP – Performance-Focused Solution for HPC
– Two-rack solution utilizing DDR InfiniBand
– 2 GB/core
– Combination management/user node with storage
– Optional SAN supporting 4 Gbps storage at 4.6 TB
Announced at last week's SURA BoT meeting
3 TFLOP e1350 Cluster
[Rack diagram: 42U rack elevation showing three BladeCenter H chassis of HS21XM blade servers, an x3650 management node, a Force10 48-port switch, an SMC Tiger 8-port switch, 32-port terminal servers, and a console switch]
• 34 HS21XM blade servers in 3 BladeCenter H chassis
– Dual quad-core 2.67 GHz Clovertown processors
– 1 GB memory per core
– 73 GB SAS disk per blade
– GigE Ethernet to blade with 10 Gbit uplink
– Serial terminal server connection to every blade
– Redundant power/fans
• x3650 2U management/user node
– Dual quad-core 2.67 GHz Clovertown processors
– 1 GB memory per core
– Myricom 10 Gb NIC card
– RAID controller with (6) 300 GB 10K hot-swap SAS drives
– Redundant power/fans
• Force10 48-port GigE switch with 2 10 Gb uplinks
• SMC 8-port 10 Gb Ethernet switch
• (2) 32-port Cyclades terminal servers
• Red Hat ES 4 license and media kit (3 years update support)
• Console manager, pull-out console, keyboard, mouse
• One 42U enterprise rack, all cables, PDUs
• Shipping and installation
• 5 days on-site consulting for configuration, skills transfer
• 3-year on-site warranty
Price: $217,285
• 70 HS21XM blade servers in 5 BladeCenter H chassis
– Dual quad-core 2.67 GHz Clovertown processors
– 2 GB memory per core
– 73 GB SAS disk per blade
– GigE Ethernet to blade
– DDR non-blocking Voltaire InfiniBand low-latency network
– Serial terminal server connection to every blade
– Redundant power/fans
• x3650 2U management/user node
– Dual quad-core 2.67 GHz Clovertown processors
– 1 GB memory per core
– Myricom 10 Gb NIC card
– RAID controller with (6) 300 GB 10K hot-swap SAS drives
– Redundant power/fans
• DDR non-blocking InfiniBand network
• Force10 48-port GigE switch
• (3) 32-port Cyclades terminal servers
• Red Hat ES 4 license and media kit (3 years update support)
• Console manager, pull-out console, keyboard, mouse
• One 42U enterprise rack, all cables, PDUs
• Shipping and installation
• 10 days on-site consulting for configuration, skills transfer
• 3-year on-site warranty
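The 3 TFLOP and 6 TFLOP labels for these configurations can be checked the same way: blades × cores per blade × clock × flops per cycle. This assumes the usual peak rate of 4 double-precision flops per cycle per core for the Core-microarchitecture "Clovertown" Xeons; the helper is illustrative:

```python
# Back-of-the-envelope peak for the two e1350 configurations:
# blades x 8 cores (2 quad-core sockets) x 2.67 GHz x 4 flops/cycle
# (assumed peak rate for Clovertown Xeons).
def e1350_peak_tflops(blades, clock_ghz=2.67, cores_per_blade=8, flops_per_cycle=4):
    return blades * cores_per_blade * clock_ghz * flops_per_cycle / 1000.0

print(f"3 TFLOP config (34 blades): {e1350_peak_tflops(34):.1f} TFlop")  # ~2.9
print(f"6 TFLOP config (70 blades): {e1350_peak_tflops(70):.1f} TFlop")  # ~6.0
```

The 34-blade configuration lands just under 3 TFlop and the 70-blade configuration right at 6 TFlop, consistent with the marketing labels.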
6 TFLOP e1350 Cluster

[Rack diagram: 42U rack elevations showing five BladeCenter H chassis of HS21XM blade servers, InfiniBand edge and core switches, x3650 management and storage nodes, a DS4700 storage subsystem with EXP810 expansion, a GigE switch, 48-port terminal servers, and a console]
Price: $694,309
• x3650 storage node
– Dual quad-core 2.67 GHz Clovertown processors
– 1 GB memory per core
– Myricom 10 Gb NIC card
– (2) 3.5" 73 GB 10K hot-swap SAS drives
– (2) IBM 4 Gbps FC dual-port PCI-E HBAs
– Redundant power/fans
– 3-year 24x7, 4-hour on-site warranty
• DS4700 storage subsystem
– 4 Gbps performance (Fibre Channel)
– EXP810 expansion system
– (32) 4 Gbps FC, 146.8 GB/15K Enhanced Disk Drive Modules (E-DDM)
– Total 4.6 TB storage capacity
6 TFLOP e1350 Cluster Storage Option

[Rack diagram: duplicate of the 6 TFLOP rack elevation, highlighting the DS4700 storage subsystem and EXP810 expansion unit]
Storage option price: $41,037
SURA – Dell Partnership
• Complete Dell PowerEdge 1950 2 TFlop High Performance Computing Cluster
• SURAgrid Special Offer – $112,500
– Master node – Dell PowerEdge 1950 (Qty = 1)
– Compute nodes – Dell PowerEdge 1950 (Qty = 27)
– Gigabit Ethernet interconnect: Dell PowerConnect 6248 (Qty = 2)
– PowerEdge 4210, 42U frame
– Platform Rocks cluster management software with 1-year support agreement
– Complete rack & stack, including cabling, prior to delivery
– Complete software installation: operating system, cluster management software
– 2 days on-site systems engineer
Compute Nodes – Dell PowerEdge 1950 (Qty = 27)
• Dual 2.33 GHz/2x4 MB cache, quad-core Intel® Xeon E5345, 1333 MHz FSB processors
• 12 GB FBD 667 MHz memory
• 80 GB 7.2K RPM SATA hard drive
• Red Hat Enterprise Linux WS v4, 1-year RHN subscription, EM64T
• 24X CD-ROM
• 3 years HPCC next-business-day parts and labor on-site service
Master Node – Dell PowerEdge 1950 (Qty = 1)
• Dual 2.33 GHz/2x4 MB cache, quad-core Intel® Xeon E5345, 1333 MHz FSB processors
• 12 GB FBD 667 MHz memory
• Embedded RAID controller – PERC5
• (2) 146 GB SAS hard drives (RAID 1)
• Dual on-board 10/100/1000 NICs
• 24X CDRW/DVD-ROM
• Dell Remote Assistance Card
• Redundant power supply
• Red Hat Enterprise Linux AS v4, 1-year Red Hat Network subscription, EM64T
• 3 years Premier HPCC support with same-day 4-hour parts and labor on-site service
Gigabit Ethernet Interconnect – Dell PowerConnect 6248 (Qty = 2)
• PowerConnect 6248 managed switch, 48-port 10/100/1000 Mbps
• Four 10 Gigabit Ethernet uplinks
• 3 years support with next-business-day parts service
Other Components
• PowerEdge 4210 frame, doors, side panel, ground, 42U
• (3) 24 A high-density PDUs, 208 V, with IEC-to-IEC cords
• 1U rack console with 15" LCD display, mini-keyboard/mouse combo
• Platform Rocks cluster management software with 1-year support agreement
• Complete rack & stack, including cabling, prior to delivery
• Complete software installation: operating system, cluster management software, etc.
• 2 days on-site systems engineer
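The "2 TFlop" headline for the Dell cluster checks out the same way as the IBM figures, again assuming 4 double-precision flops per cycle per core (the standard peak convention for these Xeons); the calculation is illustrative:

```python
# Peak estimate for the Dell cluster: 27 compute nodes, each with
# two quad-core 2.33 GHz Xeon E5345s, assuming 4 flops/cycle/core.
compute_nodes = 27
cores = compute_nodes * 2 * 4          # 2 sockets x 4 cores per node
peak_tflops = cores * 2.33 * 4 / 1000.0
print(f"{peak_tflops:.1f} TFlop peak")  # ~2.0, matching the 2 TFlop headline
```

At the $112,500 offer price, that also works out to a notably lower price per peak TFlop than the p5 575 options, as expected for a commodity Linux cluster.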
For More Information Regarding IBM and Dell Discount Packages
Contact Gary Crane, 315-597-1459
SURAgrid Governance and Decision-Making Structure Overview
• See Tab 17 (page 28) of Board materials book for a copy of the SURAgrid Governance and Decision Making Structure Proposal
• SURAgrid Project Planning Working Group established at the Sep 2006 in-person meeting to develop governance options for SURAgrid. Participants included:
– Linda Akli, SURA
– Gary Crane, SURA
– Steve Johnson, Texas A&M University
– Sandi Redman, University of Alabama in Huntsville
– Don Riley, University of Maryland & SURA IT Fellow
– Mike Sachon, Old Dominion University
– Srikanth Sastry, Texas A&M University
– Mary Fran Yafchak, SURA
– Art Vandenberg, Georgia State University
SURAgrid Governance Overview
• To date SURAgrid has used consensus-based decision-making, with SURA facilitating the process
• State of maturity & investment >> formal governance needed
• Expected purposes of formal governance
– Ensure those investing have an appropriate role in governance
– Support sustainable growth of active participation to enhance SURAgrid infrastructure
• 3 Classes of Membership Defined
– 1. Contributing Member
• Higher education or related org contributing significant resources to advance SURAgrid regional infrastructure
• SURA is a Contributing Member by definition
– 2. Participating Member
• Higher education or related org participating in SURAgrid activities other than as a Contributing Member
– 3. Partnership Member
• Entity (org, commercial, non-HE…) with a strategic relationship with SURAgrid
SURAgrid Governance Overview
• SURAgrid Contributing Members form primary governing body
• Each SURAgrid Contributing Member will designate one SURAgrid voting member
• SURAgrid Governance Committee elected by SURAgrid Contributing Members
– The Governance Committee will:
• Act on behalf of Contributing Members
• Provide guidance, facilitation, reporting
• Initial SURAgrid Governance Committee will have 9 members: – 8 elected by contributing members– 1 appointed by SURA
Transitioning to the New SURAgrid Governance and Decision-Making Structure
• Each SURAgrid participating organization designates a SURAgrid Lead – Done
• New governance structure approved by SURA IT Steering Group – Done
• New Governance structure approved by vote of SURAgrid participating leads – Done
• Call for nominations for SURAgrid Governance Committee candidates – Done
• Nominations will be accepted through midnight April 28
• Election of SURAgrid Governance Committee members is expected to be completed by May 12
New IBM e1350 Linux Cluster
• BladeCenter-H based chassis
– Redundant power supplies and fan units
– Advanced Management Module
– Dual 10 Gbps backplanes
• Fully integrated, tested, and installed e1350 cluster
• On-site configuration, setup, and skills transfer
• Quad-core Intel processors (8 cores/node)
• Single point of support for the cluster
• Terminal server connection to every node
• IBM 42U enterprise racks
• Pull-out console monitor, keyboard, mouse
• Redundant power and fans on all nodes
• 3-year on-site warranty
– 9x5, next-day on-site on compute nodes
– 24x7, 4-hour on-site on management node, switches, racks (optional storage)