Simulation Science at RWTH Aachen University
EuroAD Workshop
Christian Bischof
Institute for Scientific Computing and Center for Computing and Communication
RWTH Aachen University
www.sc.rwth-aachen.de
RWTH Aachen
RWTH Aachen Overview
100 undergraduate and postgraduate courses of study
416 professors
2,000 other academic staff
2,000 non-academic staff
29,600 students (5,600 from outside Germany)
2,154 first degrees
698 doctoral degrees
40 habilitations
373 million EUR budget (without external funding)
142 million EUR external funding
9 Schools
• Mathematics, Computer Sciences and Natural Sciences
• Electrical Engineering and Information Technology
• Civil Engineering
• Mechanical Engineering
• Georesources and Material Technology
• Architecture
• Arts
• Economics
• Medicine
Simulation Science
SimSci as Key Technology
Motto for RWTH Aachen University:
From the idea of today to the product of tomorrow
In the strategic agreement between RWTH and the ministry of science:
"The simulation of complex models lies at the core of innovative product design and basic research at RWTH Aachen."
Genesis of CCES at RWTH
Traditionally uncoordinated activities in various disciplines related to modelling, optimization, and experimental design
Despite high strategic importance, no sustainable structure
Structure needed for:
• Multiplying competences instead of merely adding them up
• Synergy effects in research and education across disciplinary boundaries
Center for Computational Engineering Sciences (CCES)
• Founded in 2004 by 4 schools
• Bundles mathematical, algorithmic, and software competence
• 4 new faculty lines filled, another about to be filled
CCES Activities
New course of study: Computational Engineering Science (CES), instituted in 2002
Fast track from Bachelor to Ph.D. for excellent students:
"Aachen Institute for Advanced Study in Computational Engineering Sciences" (AICES)
New kind of collaboration between federally funded research institutions and a university: German Research School for Simulation Science with FZ Jülich (announced 10 Sept. 2006)
CES Study Program
Early emphasis on methods, including methods from CS:
• Programming and Data Structures
• High Performance Computing
• Data Intensive Computing
• Software Engineering
Deepening of this knowledge in an application-specific context
6 semesters of simulation technology, computer science, mathematics, engineering
4 semesters of specialization in one of 3 application areas
This way, CES graduates do not fall through the cracks between disciplines
Currently 50 students/yr.
Change from Diploma to Bachelor/Master in WS 2007
Graduate School AICES
Awarded within the German "University Excellence" Initiative
1 million EUR/yr for 5 yrs
Objective: Integrated and coordinated education starting from Bachelor to Ph.D.
Fast-track option for extraordinary students
Goal: In total approximately 100 Ph.D. students related to CES
Introduction of an advising team as opposed to a traditional single advisor
Research groups led by independent young researchers
Supported by 14 RWTH institutes
Mathematisch-Technischer Software-Entwickler (MaTSe), i.e. Mathematical-Technical Software Developer
Dual study program starting in WS 07, combining
• Vocational training as MaTSe (IHK examination, training salary)
• FH Bachelor in Scientific Programming at FH Aachen-Jülich
FH study program accredited by AQAS in June 2006
Teaching is provided by the FH, FZ Jülich, and RWTH Aachen; training takes place there and in industrial companies.
55 credits mathematics, 55 credits software engineering, 40 credits electives and practical work
Currently 120 trainees/year at RWTH and FZJ
Infrastructure for CompSci
Current HPC System at RWTH
Diagram components: Storage Area Network (SAN), 16 x E6900 cluster (Fire Link), 4 x E25k cluster (Fire Link), 8 x E2900 cluster, quad-Opteron cluster
4 x E25k systems with 72 UltraSPARC IV, 288 GByte memory
16 x E6900 systems with 24 UltraSPARC IV, 96 GByte memory
8 x E2900 systems with 12 UltraSPARC IV, 48 GByte memory
64 x quad-Opteron systems: 2.2 GHz Opterons, 8 GByte memory
4 x quad-Opteron systems (dual core): 2.2 GHz Opterons, 16 GByte memory
2 x 8-way-Opteron systems (dual core): 2.6 GHz Opterons, 32 GByte memory
1 x 32-way-T2000 system (8 core): 1 GHz UltraSPARC, 16 GByte memory
Total: over 4.6 TFlops, over 3.5 TByte memory
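As a quick cross-check of the quoted memory total, the per-system figures from the list above can simply be summed. The C sketch below does that. The struct, array, and function names are illustrative and not taken from the slides; the counts and memory sizes are copied from the list.

```c
#include <stddef.h>

/* One entry of the system list: how many machines, and memory per machine. */
struct node_group {
    const char *name;
    int count;          /* number of systems of this type */
    int mem_gbyte_each; /* memory per system in GByte */
};

/* Figures copied from the system list on the slide. */
static const struct node_group RWTH_GROUPS[] = {
    {"E25k",                    4, 288},
    {"E6900",                  16,  96},
    {"E2900",                   8,  48},
    {"quad-Opteron",           64,   8},
    {"quad-Opteron dual core",  4,  16},
    {"8-way-Opteron dual core", 2,  32},
    {"T2000",                   1,  16},
};

#define RWTH_GROUP_COUNT (sizeof RWTH_GROUPS / sizeof RWTH_GROUPS[0])

/* Sum the memory of all groups, in GByte. */
int total_memory_gbyte(const struct node_group *groups, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += groups[i].count * groups[i].mem_gbyte_each;
    return total;
}
```

Calling `total_memory_gbyte(RWTH_GROUPS, RWTH_GROUP_COUNT)` yields 3,728 GByte, which is consistent with the slide's "over 3.5 TByte".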
Next HPC System at RWTH
10 million EUR planned (double the last investment)
5 million already committed for 2008
New building (600 m², 850 kW machine room)
Installation in stages from 1Q08 on
20% of cycles for other universities in North Rhine-Westphalia (NRW)
SMP cluster with "thin" and "fat" nodes to accommodate a diverse job mix and allow for "entry-level capability computing"
New Building Infrastructure for HPC
Machine hall
600 m²
850 kW
7.5 million EUR (w/o UPS and chilled water)
High-End Visualization
Emphasis on virtual reality (immersive, real-time, user-centric)
ViSTA FlowLib software: VR-capable post-processing of large-scale simulation results
• Post-processing: hybrid parallelization, multi-resolution techniques, advanced data management
• VR: GPU-based rendering of particles, volume rendering, billboarding, multimodal interaction techniques
• Apps in flow & combustion, rhinosurgery, plastics processing, …
• Builds upon OpenSG and VTK community standards
Innovative CAVE as high-end display device
• 5-sided, 3.60 x 2.70 x 2.70 m, 1600x1200 resolution
• "Move module" allows easy L-shaped reconfiguration
• Driven by commodity PC cluster
Other CompSci Infrastructure
Networks
• 5 Gbit hookup to X-Win, backbone w/ 1-10 Gbit, over 400 WLAN APs
• "Fibre to the desktop" as RWTH goal
• Dark fibre Aachen – FZ Jülich and Aachen-Bonn-Köln-AC-BN-Birlinghoven (sponsored by Netcologne)
Data archive
• Tape system with 1.5 PByte (raw) installed
• Archival mirroring with FZ Jülich
• dCache repository for Tier-II CMS Grid Center in Aachen
Computing Cooperative North Rhine-Westphalia (RV-NRW)
• Federation across NRW universities, run in Aachen
• Self-service for HPC, RV-NRW grid, archive, Windchill portal
Grid projects
• D-Grid: partner in infrastructure project
• VIOLA: remote visualization
The Importance of Parallel Programming
Chip Design Considerations
Power consumption (and heat generated) scales linearly with #transistors but quadratically with clock rate
Extrapolation with a 2.2 GHz Opteron (95 W) as basis:
• Processor w/ 2x clock rate + 2x cache: 2² x 2 x 95 W = 760 W
• Chip with 8 cores, ½x clock rate and 2x #transistors: (¼ x 2) x 95 W = 47.5 W, i.e. about 50 W (theoretically twice as fast!)
There is no Moore's Law for memory latency
• Memory latency halves only every six years.
• Hence bigger caches will be of limited use.
Conclusion: Integration of many cores on a chip
• 700 MHz suffice for Blue Gene to take the No. 1 slot in the TOP500!
• All manufacturers are building multicore (i.e. SMP) chips
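The two power extrapolations on this slide follow from a single proportionality. The derivation below is a sketch under the slide's own assumption that power scales linearly with transistor count and quadratically with clock rate. The symbols P_0, N_0 and f_0 for the 95 W baseline are notation introduced here, and "2x cache" is read as roughly 2x transistors.

```latex
\[
  P \;=\; P_0 \cdot \frac{N}{N_0} \cdot \left(\frac{f}{f_0}\right)^{2},
  \qquad P_0 = 95\ \mathrm{W}
\]
\[
  \text{2x clock rate, 2x transistors:}\quad
  P = 95\ \mathrm{W} \cdot 2 \cdot 2^{2} = 760\ \mathrm{W}
\]
\[
  \text{$\tfrac{1}{2}$x clock rate, 2x transistors:}\quad
  P = 95\ \mathrm{W} \cdot 2 \cdot \left(\tfrac{1}{2}\right)^{2}
    = 47.5\ \mathrm{W} \approx 50\ \mathrm{W}
\]
```

Under this model, doubling the clock rate costs a factor of four in power, while spending the same transistor budget on more, slower cores lowers it.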
Parallel Computing Becomes Standard
Around the corner are "pizza boxes" with
• substantial multithreading
• low latency
• high throughput
• good floating point performance
In a few years, many or all applications will be multi-threaded
Substantially parallel SMP boxes with small footprint will be building blocks of large systems, commodity or custom.
Suitable SW-Design methodologies for hybrid programming? (Shared-memory programming is easier than MPI)
Sparse Matrix Vector Multiplication
[Figure: MFLOPS / MOPS versus # threads (0 to 35) for four series: SF T2000 long long, SF 2900 long long, SF T2000 double, SF 2900 double]
19.6 million nonzeros, matrix dimension 233,334, 225 MB memory footprint
www.rz.rwth-aachen.de/hpc/hw/niagara.php
Top: Sun Fire T2000 (8-core multithreaded Niagara, 1 GHz)
Bottom: Sun Fire 2900 (12 x dual-core UltraSPARC IV, 1.2 GHz)
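The benchmark kernel itself is not shown on the slide. As a minimal sketch of what a threaded sparse matrix-vector product can look like, the C function below assumes compressed sparse row (CSR) storage and a plain OpenMP loop over rows. The function name, the argument layout, and the choice of CSR are assumptions made for illustration. With 8-byte values and 4-byte column indices, CSR needs about 12 bytes per nonzero, which is at least of the same order as the quoted 225 MB for 19.6 million nonzeros.

```c
/* y = A * x for a sparse matrix A in compressed sparse row (CSR) form.
 * row_ptr has n_rows + 1 entries; row i owns the nonzeros with indices
 * row_ptr[i] .. row_ptr[i + 1] - 1 in col_idx and val. */
void spmv_csr(int n_rows, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    /* Rows are independent, so the row loop can be shared among threads.
     * If the compiler is not OpenMP-enabled, the pragma is ignored and
     * the loop runs serially with the same result. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```

Each row performs one multiply-add per nonzero but loads a value, an index, and an entry of `x`, so the kernel tends to be limited by memory access rather than arithmetic. This is one reason it is a common test for multithreaded chips.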
Parallel VR Feature Extraction
Samuel Sarholz & Andreas Gerndt, Center for Computing and Communication, RWTH Aachen University
Complex and accurate fluid dynamics simulations are possible with HPC
Large-scale datasets
Want to extract usable flow features
Expensive algorithms, but want interactivity in VR
Use parallel computing for feature extraction!
Nested Parallelization w/ OpenMP
Three levels: Time, Block, Cell
Few blocks with exceptionally high load, as the search algorithm adaptively refines the search when targeting points of interest: high load imbalance
OpenMP feature: dynamic/guided loop schedules
Optimal thread distribution and thread scheduling?
    // Loop over time levels
    #pragma omp parallel for num_threads(nTimeThreads) schedule(dynamic,1)
    for (int curT = 1; curT <= maxT; ++curT) {
        // Loop over blocks
        #pragma omp parallel for num_threads(nBlockThreads) schedule(dynamic,1)
        for (int curB = 1; curB <= maxB; ++curB) {
            // Loop over cells
            #pragma omp parallel for num_threads(nCellThreads) schedule(guided)
            for (int curC = 1; curC <= maxC; ++curC) {
                FindCriticalPoints(curT, curB, curC); // highly adaptive algorithm (bisectioning)
            }
        }
    }
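Nested parallel regions only create additional threads if the OpenMP runtime allows more than one active level, and many runtimes do not by default. The sketch below shows one way to enable three levels and run the same loop structure on a toy workload. `cell_work` is a hypothetical stand-in for `FindCriticalPoints`, and `run_nested` and its parameters are names introduced here. The reductions exist only so the result can be checked.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for FindCriticalPoints: a halving loop whose
 * length depends on the indices, imitating adaptive bisection. */
static long cell_work(int t, int b, int c)
{
    long steps = 0;
    for (long width = 1L << ((t + b + c) % 12); width > 0; width /= 2)
        ++steps;
    return steps;
}

/* Runs the three nested loops and returns the total number of steps. */
long run_nested(int maxT, int maxB, int maxC, int nT, int nB, int nC)
{
    long total = 0;
#ifdef _OPENMP
    omp_set_max_active_levels(3); /* permit three nested parallel regions */
#endif
    #pragma omp parallel for num_threads(nT) schedule(dynamic,1) reduction(+:total)
    for (int t = 1; t <= maxT; ++t) {
        long block_sum = 0;
        #pragma omp parallel for num_threads(nB) schedule(dynamic,1) reduction(+:block_sum)
        for (int b = 1; b <= maxB; ++b) {
            long cell_sum = 0;
            #pragma omp parallel for num_threads(nC) schedule(guided) reduction(+:cell_sum)
            for (int c = 1; c <= maxC; ++c)
                cell_sum += cell_work(t, b, c);
            block_sum += cell_sum;
        }
        total += block_sum;
    }
    return total;
}
```

A call such as `run_nested(4, 3, 50, 2, 2, 2)` uses up to 2 x 2 x 2 threads. The thread counts per level are the tuning knobs behind the slide's question about optimal thread distribution.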
Parallelization w/ OpenMP
Speed-up on Sun Fire E25K (72 UltraSPARC IV, dual-core @ 1.05 GHz)
• Time 4, Block 4, Cell 32 (dyn/guided): 33.9
• Without load balancing (static): 10.3 (MPI paradigm: data distribution)
• Using guided weight 20 (Sun-specific): 55.3
Fig.: Load distribution of Engine dataset (high load imbalance). Runtime [s] (0 to 60) per time level (1 to 61); series: Basic Load, Total Runtime.
Conclusions
Interdisciplinarity is often talked about as being necessary for the advancement of science
RWTH is one of the first schools in Germany to undertake fundamental structural steps towards establishing a sustainable culture for computational engineering science
• CCES as a home for faculty
• Strong tie-in of computer science (5 CS profs in AICES effort)
• CompSci study programs at all levels (B.S., M.S., Ph.D.), both as consecutive and nonconsecutive programs
• Large-scale investments in HPC + data + visualization + networks and in know-how infrastructure to ensure productive use