Research Computing at Harvard
John Huth
Topics
• Support of computing in science (as opposed to desktop) is becoming more and more of an issue at research universities.
• Crimson Grid
• Initiative in Innovative Computing
• The EGG project
• Another kind of LHC computing challenge:
  – The inverse mapping problem.
The Crimson Grid Initiative
Started in April 2004. A project to engineer a technology fabric in support of interdisciplinary & collaborative computing.
Joy Sircar – Division of Engineering and Applied Science
The Crimson Grid:
• A scalable collaborative computing environment for research at the interface of science and engineering
• A gateway to community/national/global computing infrastructures for interdisciplinary research
• A test bed for faculty & IT-industry affiliates within the framework of a production environment for integrating HPC solutions for higher education & research
• A campus resource for skills & knowledge sharing for advanced systems administration & management of switched architectures
The Campus Grid Vision: Grid of Grids from Local to Global
[Diagram: a grid of grids spanning campus, community, and national levels. Campus users reach the DEAS Condor pool and other campus Condor pools (NNIN, CRC-I, workstations) through the CrimsonGrid gateway; Globus Toolkit gatekeepers (GT-GK) connect the campus grid outward to the community CrimsonGrid-GLOW federation and to national OSG/ATLAS resources.]
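The campus resources in the diagram are Condor pools, so a user's job enters them through a standard Condor submit description. A minimal, generic sketch (the executable and file names here are illustrative, not from the slides):

```
universe   = vanilla
executable = analyze
output     = job.out
error      = job.err
log        = job.log
queue
```

Running `condor_submit` on such a file queues the job; the pool's matchmaker then places it on an available machine, which is what lets a campus gateway fan jobs out across pools.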
Power of Campus Grids
• GLOW: ~1000 processors
• CG: ~750 processors
In just 2 campuses!
Grid use in first 12-months
[Chart: number of jobs per month, February 2005 through June 2006 (June 2006 partial); vertical scale 0 to 15,000 jobs.]
First Use Research Areas in the Crimson Grid
• Nanoscience
• Mesoscopic physics
• Quantum chemistry and quantum chaos
• Condensed matter physics
• Chemistry at Harvard Molecular Mechanics (CHARMM)
• Harvard Biorobotics Lab
• Atmospheric chemistry
• Earth and planetary sciences (ocean modeling)
• Solid and structural mechanics
• Earth sciences and geophysics (earthquake engineering)
• Complex biosystems modeling
• Quantitative social science
Initiative in Innovative Computing
Alyssa Goodman (Director)
Tim Clark (Executive Director)
Filling the “Gap” between Science and Computer Science
Scientific disciplines:
• Increasingly, core problems in science require computational solutions.
• They typically hire or “home-grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need.
Computer Science departments:
• Focused on finding elegant solutions to basic computer science challenges.
• Often see specific, “applied” problems as outside their interests.
Continuum
“Pure” discipline science (e.g. Galileo) … “Computational Science” … “pure” computer science (e.g. Turing).
The middle ground, “computational science,” is missing at most universities.
Filling the “computational science” gap: IIC
• Problem-driven approach: focusing effort on solving the problems that will have the greatest impact & educational value
• Collaborative projects: combining disciplinary knowledge with computer science expertise
• Interdisciplinary effort: ensuring that best practices are shared across fields and that new tools and methodologies will be broadly applicable
• Links with industry: drawing on and learning from experience in applied computation
• Institutional funding: ensuring effort is directed towards key needs and not driven solely by the narrow priorities of funding agencies
Where are the optimal “IIC” problems?
[Diagram: a 2×2 matrix with computer science payoff on one axis (low to high) and domain science payoff on the other (low to high). Low/low problems are labeled “never mind”; high CS payoff with low science payoff belongs to Computer Science departments; high science payoff with low CS payoff belongs to science departments. The optimal IIC problems sit toward the high/high corner, and the open question is: what is the right shape for the boundary?]
IIC Research Branches (projects draw upon more than one)
• Visualization (V): physically meaningful combination of diverse data types.
• Distributed Computing (DC): e-Science aspects of large collaborations; sharing of data, computational resources, and tools in real time.
• Databases/Provenance (DB/P): management, and rapid retrieval, of data; “research reproducibility” (where did the data come from? how?).
• Analysis & Simulations (AS): development of efficient algorithms; cross-disciplinary comparative tools (e.g. statistical).
• Instrumentation (I): improved data acquisition; novel hardware approaches (e.g. GPUs, sensors).
Plus…Educational Programs that bring IIC Science to Harvard students, and to the public at large.
Data Intensive Projects
• ATLAS/LHC computing – Tier 2
• Mileura Wide Field Array (MWA) – microwave examination of ultra-redshifted era – time of recombination.
• Pan-STARRS – optical telescope (Panoramic Survey Telescope And Rapid Response System)
EGG Project
• S. Youssef, J. Huth, D. Parkes, M. Seltzer, J. Shank
• Extension of the Pacman concept to resource allocation and cache management
In the beginning…
• Software environment computing, i.e. creating and manipulating software environments
• Economic mechanism design: bidding systems, provenance & file systems, resource prediction
[Slide: a cloud of grid-computing tools in use at BU and Harvard: Globus, Condor, PBS, LSF, EGEE, Chimera, RLS, VOMS, resource brokers, MonALISA, Ganglia, OSG, Dial, Panda, dCache, SRM, Pacman, GUMS, web services, virtual machines, GridCat.]
But what do these have to do with each other? And how do they fit into the (over-)complicated world of grid computing?
But then, something very unusual happened…
[Slide: still more grid tools: AliEn, LCG, DIRAC, DISUN, ACDC, VDT, VDS, DRM, Clarens, Glue, EDG, ClassAds, NetLogger, Capone, Eowyn, gLite, ADA, iVDGL, PPDG.]
A Pacman “eggshell” is a recipe of commands drawn from caches (various URLs holding eggshell source code), e.g.:
  setenv(Foo,Bar)
  download(foo.tar.gz)
  shell(make install)
  get(E)
Executing an eggshell produces an installation.
[Pacman is used by ATLAS, OSG, VDT, LCG, Globus, TeraGrid, …; >350,000 downloads; ~500–1000 new installations per day in 50 countries around the world; supported on 14 operating systems.]
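The setenv / download / shell commands above can be read as a tiny command language. Below is a minimal sketch of an interpreter for such recipes, assuming each step is a (command, argument) pair; the data layout and function names are illustrative, not Pacman's real API.

```python
import os
import subprocess
import urllib.request

def run_eggshell(steps, workdir="."):
    """Execute eggshell steps, returning the environment they build up."""
    env = dict(os.environ)
    for command, arg in steps:
        if command == "setenv":
            name, value = arg                  # e.g. ("Foo", "Bar")
            env[name] = value
        elif command == "download":
            url, dest = arg                    # fetch a tarball into workdir
            urllib.request.urlretrieve(url, os.path.join(workdir, dest))
        elif command == "shell":
            # run a build/install command under the accumulated environment
            subprocess.run(arg, shell=True, cwd=workdir, env=env, check=True)
        else:
            raise ValueError(f"unknown eggshell command: {command}")
    return env

# e.g. run_eggshell([("setenv", ("Foo", "Bar")),
#                    ("shell", "make install")])
```

Because the recipe is plain data, the same eggshell can be shipped to any cache and replayed there, which is what makes "computation as installation" possible.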
We can let all computations be “installations.” A job is just an eggshell:
  setenv(Foo,Bar)
  download(foo.tar.gz)
  shell(myjob < infile > outfile)
and put(E) sends the eggshell E out to a cache for execution.
But which path should E follow? A function F(cache history, cache contents, eggshell) estimates an approximate opportunity cost for each candidate cache: a fast WAN link, or a dependency that is already installed there (e.g. ATLAS v10.5.0 when we put(job needing ATLAS 10.5.0)), makes a cache cheap. Resolving the put ambiguity is therefore a resource-allocation problem.
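A hedged sketch of that decision: score each candidate cache with F(cache history, cache contents, eggshell) and send E to the cheapest one. The slides name only the three inputs to F; the features, weights, and cache names below are invented for illustration.

```python
def opportunity_cost(history, contents, eggshell):
    """Approximate opportunity cost of running this eggshell at one cache."""
    cost = eggshell.get("cpu_hours", 1.0)
    # a dependency already installed at the cache (e.g. "ATLAS v10.5.0")
    # makes it far cheaper than a cache that must install it first
    for dep in eggshell.get("needs", []):
        if dep not in contents:
            cost += 10.0
    # a fast WAN (low transfer time per GB) lowers the data-movement cost
    rate = history.get("transfer_seconds_per_gb", 60.0)
    cost += rate * eggshell.get("data_gb", 0.0)
    return cost

def choose_cache(caches, eggshell):
    """caches: {name: (history dict, contents set)} -> cheapest cache name."""
    return min(caches, key=lambda name: opportunity_cost(*caches[name], eggshell))

caches = {
    "harvard": ({"transfer_seconds_per_gb": 5.0}, {"ATLAS v10.5.0"}),
    "bu": ({"transfer_seconds_per_gb": 30.0}, set()),
}
job = {"needs": ["ATLAS v10.5.0"], "data_gb": 2.0, "cpu_hours": 4.0}
print(choose_cache(caches, job))  # -> harvard (dependency present, fast WAN)
```

The point of the sketch is only the shape of F: any cost estimate that rewards preinstalled software and cheap data movement turns put() into an allocation decision.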
Matching eggshells to computers: on save(), a bidding process pairs a computer C with an eggshell E, C.put(E) executes it, and the process repeats. Constraints such as “time >= 14 Nov.” or “bidding closes in 7 days” can be attached to a bid.
A cache can be a marketplace:
• Eggshells go where they get the best prices.
• Computers go where there are the most buyers.
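The marketplace above can be sketched as a toy sealed-bid auction: each round, the highest bid wins an eggshell (C.put(E)) and bidding repeats on the rest. All names, prices, and the one-job-per-computer rule are illustrative assumptions, not the EGG project's actual mechanism.

```python
def run_auction(eggshells, bids):
    """bids: {computer: {eggshell: price}} -> list of (computer, eggshell)."""
    matches = []
    remaining = list(eggshells)
    while remaining:
        # collect every (price, computer, eggshell) offer still open
        offers = [(price, c, e)
                  for c, book in bids.items()
                  for e, price in book.items() if e in remaining]
        if not offers:
            break
        price, c, e = max(offers)        # best price wins -> C.put(E)
        matches.append((c, e))
        remaining.remove(e)
        # assume one job per computer per round of matching
        bids = {k: v for k, v in bids.items() if k != c}
    return matches

bids = {"deas-node7": {"atlas-sim": 3.0, "charmm-run": 1.0},
        "glow-node2": {"atlas-sim": 2.0}}
print(run_auction(["atlas-sim", "charmm-run"], bids))
# -> [('deas-node7', 'atlas-sim')]; charmm-run stays unmatched this round
```

Eggshells flow to the highest bidders, and a computer that bids on popular eggshells sees the most work, which is the marketplace behavior the slide describes.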
The LHC Inverse Mapping Problem
• A CPU-intensive problem: mapping observed LHC signatures back to the underlying theory parameters that could have produced them.
• N. Arkani-Hamed, G. Kane