1
Cardiff University Advanced Research Computing
Suppliers Briefing
Dr Hugh Beedie (INSRV and ARCCA CTO)
Dr Chris Dickson (SRIF3 HEC Programme Coordinator)
2
Cardiff University Supplier Briefing
• Purpose
  – Inform suppliers of CU requirement and vision
  – Supplier to outline Technology Roadmap and Service capabilities
3
Morning Session
• 11:00 to 11:40
  – Presentation from CU staff (30-35 minutes)
  – Questions from supplier (5-10 minutes)
• 11:40 to 12:40
  – Presentation from supplier (45-50 minutes)
  – Questions from CU staff (10-15 minutes)
• 12:40 to 13:00
  – Closed session for CU staff (20 minutes)
4
Afternoon Session
• 14:00 to 14:40
  – Presentation from CU staff (30-35 minutes)
  – Questions from supplier (5-10 minutes)
• 14:40 to 15:40
  – Presentation from supplier (45-50 minutes)
  – Questions from CU staff (10-15 minutes)
• 15:40 to 16:00
  – Closed session for CU staff (20 minutes)
7
Cardiff University Mission and Vision
• Mission: To pursue research, learning and teaching of international distinction and impact
• Vision: To be a world class university and to achieve the associated benefits for its students, staff and stakeholders
8
Cardiff University Scale and Scope
• 26,000 Students
  – 21,000 Undergraduates
  – 5,000 Postgraduates
  – 35,000 Applications for 4,500 Places
• 5,500 Staff
  – 2,500 Academic & Research
• 28 Academic Schools
  – 15 Schools RAE 5, 7 Schools RAE 5*
9
Cardiff University Standing
• A member of the Russell Group of leading research universities
• Research Excellence
  – 7th UK University Ranking
• Research Assessment Exercise 2001
  – Most Recent RAE
10
Cardiff University Standing
• 1 Cambridge
• 2 Imperial College
• 3 Oxford
• 7 Cardiff
• 8 Manchester
• 10 Southampton
• 14 Edinburgh
• 15 Bristol
• 16 York
• 21 Birmingham
[Chart: Cardiff's RAE performance since 1992. Position (by weighted average/fte of ret'd staff): 1992, 35th; 1996, 15th; 2001, 7th.]
11
Advanced Research Computing Current Applications
• High Speed Cluster
  – Galaxy Formation
  – Gene Sequencing
  – Mantle Simulation
  – Molecular Simulation
  – Orthodontic Modelling
  – Protein Sequencing
• Gigabit Cluster
  – Gravitational Waves
• Shared Memory
  – Computational Fluid Dynamics
  – Environmental Engineering
  – Star Formation
• Condor
  – Protein Structure Determination
  – Radiotherapy Simulation
12
Advanced Research Computing Current Schools
• Architecture
• Biosciences
• Business
• Chemistry
• Computer Science
• Dentistry
• Earth, Ocean and Planetary Sciences
• English, Communication and Philosophy
• Engineering
• History and Archaeology
• Mathematics
• Medicine
• Optometry and Vision Sciences
• Pharmacy
• Physics and Astronomy
• Psychology
13
Cardiff HEC Strategy
• World class facility for the university
• Adopt a more coordinated approach
• University investment
  – New organisation: Advanced Research Computing @ Cardiff (ARCCA)
  – Staff accommodation and machine room
• SRIF3 investment in equipment and supporting infrastructure
14
Offices and Machine Room
• Machine Room
  – Area 175 sq m
  – Ground Floor
  – False Flooring & Ceiling
    • Power, Cooling, Network
• Power
  – Separate Bus-Bar + Meter
• Cooling
  – Separate Bus-Bar + Meter
  – Rack-based Cooling
  – Room-based Cooling
• Security
  – Key Card Access
  – Closed Circuit TV
• Fire + Water Damage
  – Smoke Detection
  – Water Detection
  – Automated Shutdown
• Office space for >12 staff
15
ARCCA Vision
– To build and sustain a position which takes the university to the forefront of leading research universities in the UK and internationally in this field
– To work very closely with all of the relevant academic schools and administrative directorates
16
ARCCA Objectives
– To transform the University’s approach to advanced research computing
– To bring together the advanced research computing community
– To introduce and encourage the use of new techniques and technologies
– To encourage and enable new user communities and new applications
– To encourage new and interdisciplinary research using advanced research computing techniques
17
ARCCA Organisation
– Director (TBC)
– Chief Technology Officer (Hugh Beedie)
– Application Specialists
– Infrastructure Specialists
– e-Research facilitation
– Administrative Assistant
20
Cardiff’s Journey
• Separate research projects with project-based HEC facilities
• Some collaborations (e.g. Helix)
• Strategic decision to provide University level solution
  – SRIF3 grant
  – ARCCA organisation
  – Select strategic partner
21
Partnership
• State of the art of your technical offerings now and in Autumn 2007?
• Form factor, heat management, power consumption and management?
• HEC landscape over the next 3-7 years?
• How, over 3-7 years, you can help us to achieve our ambitions to offer "best-in-class" research computing capabilities and services on a par with world-leading campus-level centres?
• Innovative and leap-frogging steps you can help us to take whilst at the same time being able to offer a service to today's "classic" HEC users.
22
Why Should You Partner Cardiff?
• Research quality (7th in UK)
• Condor leader (1,500 PCs by end 2006)
• Completing and coordinating the HEC service spectrum
• Current breadth of academic involvement
• WeSC involvement and NGS partner
• Serious ambitions for
  – More interdisciplinary work
  – Wider research applications
  – New techniques
23
The HEC Spectrum
[Diagram: the HEC spectrum, running from HPC (tightly coupled) to HTC (loosely coupled). Systems shown: Supercomputers, NUMA Machines, Large Clusters, SMP, Small Clusters, Campus Pools. Cost labels: £ Million+, £ Million, £ H Thousand, £ Thousand.]
24
Tender Outline
• Lot 1. 2048 Core/512 Node Cluster with 4 Head Nodes, Interconnect Fabrics and Cluster S/W.
• Lot 2. Shared Memory Systems (8, 16 and 32-way SMP machines).
• Lot 3. 50TB+ Disk and 200TB+ Tape Storage Services for ARCCA facilities.
• Lot 4. (Optional). Specialist Hardware (FPGAs, accelerators, etc) with associated software
• Lot 5. (Optional) Other software tools (compilers, applications, application profiling & system management tools, other software, etc)
26
Lot 1 - Head Nodes
• 16 cores / 4 nodes
• 100% of nodes
  – CPU: Best value for money
  – RAM: 4GB per core / 16GB per node
  – HDD: 1TB per node
  – NIC: 10GbE per node with TOE
  – NIC: 4Gb dual-port fibre-channel
• 100% of nodes with
  – High Speed, Low Latency Interconnect
27
Lot 1 - Cluster Nodes
• 2048 cores / 512 nodes
• 100% of nodes
  – CPU: Best value for money
  – RAM: 2GB per core / 8GB per node
  – HDD: 72GB per node
  – NIC: 2xGbE per node
• 50%, 75%, 100% of nodes
  – High Speed, Low Latency Interconnect
• Cluster Management and Job scheduling
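
The per-node figures on this slide imply some aggregate totals that may be useful when comparing offers. The following is a minimal sketch of that arithmetic, not part of the tender itself; the function and constant names are illustrative assumptions, and the three interconnect fractions are simply the 50%, 75% and 100% options listed above.

```python
# Figures copied from the Lot 1 cluster-node slide.
# All names below are illustrative; they do not come from the tender.
TOTAL_CORES = 2048
TOTAL_NODES = 512
RAM_PER_CORE_GB = 2
HDD_PER_NODE_GB = 72


def cluster_totals():
    """Derive per-node and aggregate figures from the slide's numbers."""
    cores_per_node = TOTAL_CORES // TOTAL_NODES
    ram_per_node_gb = cores_per_node * RAM_PER_CORE_GB
    return {
        "cores_per_node": cores_per_node,
        "ram_per_node_gb": ram_per_node_gb,
        "total_ram_gb": ram_per_node_gb * TOTAL_NODES,
        "total_local_disk_gb": HDD_PER_NODE_GB * TOTAL_NODES,
    }


def interconnect_nodes(fraction):
    """Nodes attached to the high-speed interconnect for a coverage fraction."""
    return int(TOTAL_NODES * fraction)


if __name__ == "__main__":
    print(cluster_totals())
    for fraction in (0.50, 0.75, 1.00):
        print(fraction, interconnect_nodes(fraction))
```

The derived RAM per node agrees with the 8GB per node stated on the slide, which is a quick consistency check on the specification.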
28
Lot 2 - Shared Memory Nodes
• 8, 16, and 32 way
• All systems
  – CPU: Best value for money
  – RAM: 2GB per core
  – HDD: 1TB per system
  – NIC: 2xGbE per system
29
Lot 3 - Disk Storage
• 50TB+ Disk Storage
  – Usable storage space
  – RAID 0, 1, 5 ???
  – Hierarchical Storage Management
    • FC disk, SATA Disk, Tape Storage ?
    • Currently using Tivoli
  – Connected to Head Nodes
    • Via 4Gb dual-port fibre-channel
    • Via existing SAN switch
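
Because the 50TB requirement is for usable space, the raw capacity needed depends on which RAID level is chosen. The sketch below illustrates the standard capacity ratios for RAID 0, 1 and 5 under stated assumptions: two-way mirroring for RAID 1, a hypothetical group size of 8 disks for RAID 5, and no allowance for hot spares or filesystem overhead. The function names are illustrative and do not come from the tender.

```python
def usable_fraction(raid_level, disks_per_group=8):
    """Fraction of raw capacity that is usable for a given RAID level.

    Assumptions: RAID 1 is a two-way mirror; RAID 5 loses one disk's
    worth of capacity per group to parity. Spares and filesystem
    overhead are ignored.
    """
    if raid_level == 0:
        return 1.0
    if raid_level == 1:
        return 0.5
    if raid_level == 5:
        if disks_per_group < 3:
            raise ValueError("RAID 5 needs at least 3 disks per group")
        return (disks_per_group - 1) / disks_per_group
    raise ValueError(f"unsupported RAID level: {raid_level}")


def raw_tb_needed(usable_tb, raid_level, disks_per_group=8):
    """Raw capacity (TB) required to deliver the requested usable capacity."""
    return usable_tb / usable_fraction(raid_level, disks_per_group)


if __name__ == "__main__":
    for level in (0, 1, 5):
        print(f"RAID {level}: {raw_tb_needed(50, level):.1f} TB raw for 50 TB usable")
```

RAID 0 gives no redundancy, so in practice the comparison of interest is likely to be between mirrored and parity-based layouts.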
30
Lot 3 - Tape Storage
• 200TB+ Tape Storage
  – Usable storage space
  – Hierarchical Storage Management
    • Disk Storage
    • Currently using Tivoli
  – Connected to Head Nodes
    • Via 4Gb dual-port fibre-channel
    • Via existing SAN switch
31
Lot 4 - Specialised Hardware
R & D - Innovative Computational Methods
• Computational Accelerators
• Field-Programmable Gate Arrays
• Programmable Graphics Processors
• Cell Processors
32
Lot 5 - Other Software Tools
• We invite you to comment on the following
  – Extra Cluster Management Tools
  – Extra Cluster Diagnostic Tools
  – Other Development Tools
    • Optimised Compilers & Libraries
      – Intel, Portland
    • Debuggers
      – Parallel Debuggers
  – Applications And Libraries
    • Profiling tools
33
Support Issues
• We invite you to comment on the following
  – Spares stock vs guaranteed replacement time
  – Development/test nodes x 8
  – Numbers of technical cluster support staff
    • Location
    • Skill sets
  – Single point of contact and responsibility
    • Including direct access to 3rd party tech support
34
Any Questions?
Procurement contacts:
[email protected] Primary contact
[email protected] Secondary contact