e-infrastructure @ science
E-Infrastructures @ science
Tom Visser, e-Science [email protected]
Today
## Context
Research
Who is SARA
Who am I
What is BiG Grid
Why is BiG Grid
## Funding structures
National
International
Partnerships
## Us @ work
## Problems and challenges
SARA
Since 1971
Supporting research
Providing services
Network
HTC computing
HPC computing
Data-services & mass storage
Support and development: optimization, project support
Being a partner
Huygens National Super: IBM Power 6, 3328 cores, 15.25 TB of memory, 700 TB of disk space, 60 TFlop/s
LISA National Compute Cluster: Dell cluster, 4480 cores, 12 TB of memory, 20 TFlop/s
Grid Resources: 2376 cores, 3408 TB of disk, 2000 TB tape
12 BioInfo sites, Life Science Grid
High Energy Physics, Astronomy, Bio Info
Visualization: Tiled Panel Display, Remote Visualization
Network: SURFnet 6, AMS-IX, NetherLight
Innovative Infrastructures: Cloud, GPU, Hadoop, BeeHub
About me
MA Social Informatics @ UvA
Online scientific collaboration in a European project
2 years IBED
3 years @ SARA working for BiG Grid project
E-science and cloud services
Guardian angel
Community communicator
Account management
(Inter)National scientific communities
BiG Grid project proposal
Both a problem and an opportunity: combine data sets, collaborate on analysis, share the maintenance and curation best practices
We need: reliable archiving, secure and easy access, retrieval facilities (discovery / search), communication about the data (how you access it)
BiG Grid project
NIKHEF, NCF, NBIC
Providing a world class e-science infrastructure
Part of the larger European grid
> 6000 compute cores
> 10 PB disk
Tape storage
Support and development
E-Infrastructure NL
WUR: Life Science Grid, 16 Grid cores, 18 TB disk
RUN: Life Science Grid, 32 Grid cores, 18 TB disk
UMCG: Life Science Grid, 32 Grid cores, 18 TB disk
Keygene: Life Science Grid, 32 Grid cores, 18 TB disk
Erasmus MC: Life Science Grid, 32 Grid cores, 18 TB disk
LUMC: Life Science Grid, 32 Grid cores, 18 TB disk
UU: Life Science Grid, 32 Grid cores, 18 TB disk
SARA: Central Facility, 2400 cores, 3450 TB disk, 4000 TB tape, 128 Cloud cores
Nikhef: Central Facility, 2500 Grid cores, 1350 TB disk
Philips Research: Central Facility, 1648 Grid cores, 20 TB disk
RUG: Central Facility, 294 Grid cores, 34 TB disk
AMC: Life Science Grid, 32 Grid cores, 18 TB disk
NKI: Life Science Grid, 32 Grid cores, 18 TB disk
SARA: Huygens Super, Lisa Cluster, Visualization, Hadoop
TUD: Life Science Grid, 32 Grid cores, 18 TB disk
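The per-site figures on this slide can be added up to get a rough national total. The sketch below is only an illustration: the numbers are transcribed from the slide, while the tuple layout and the function name `totals` are assumptions made for this example. SARA's tape and cloud cores are kept out of the sums.

```python
# Per-site figures transcribed from the "E-Infrastructure NL" slide.
# Each entry: (site, category, cores, disk_tb). The layout is an assumption
# made for this sketch. SARA's 4000 TB of tape and 128 cloud cores are not included.
SITES = [
    ("WUR", "Life Science Grid", 16, 18),
    ("RUN", "Life Science Grid", 32, 18),
    ("UMCG", "Life Science Grid", 32, 18),
    ("Keygene", "Life Science Grid", 32, 18),
    ("Erasmus MC", "Life Science Grid", 32, 18),
    ("LUMC", "Life Science Grid", 32, 18),
    ("UU", "Life Science Grid", 32, 18),
    ("AMC", "Life Science Grid", 32, 18),
    ("NKI", "Life Science Grid", 32, 18),
    ("TUD", "Life Science Grid", 32, 18),
    ("SARA", "Central Facility", 2400, 3450),
    ("Nikhef", "Central Facility", 2500, 1350),
    ("Philips Research", "Central Facility", 1648, 20),
    ("RUG", "Central Facility", 294, 34),
]


def totals(sites):
    """Return (total cores, total disk in TB) over all listed sites."""
    cores = sum(site[2] for site in sites)
    disk_tb = sum(site[3] for site in sites)
    return cores, disk_tb


if __name__ == "__main__":
    cores, disk_tb = totals(SITES)
    print(f"cores: {cores}, disk: {disk_tb} TB")
```

The tally only covers what the slide lists; resources not shown on the map are not counted.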
Data explosion
e-science
Term coined in 1999
computation
collaboration
lots of data
Shift of paradigm
Google paper: The Unreasonable Effectiveness of Data
Funding
It's all about the money?
National
FES
OCW (NWO)
ELI
European, big ESFRI programmes
Companies
Us @ work
Data ingest service (sneakernet)
Hard drives coming from Hong Kong (BGI)
Are you serious?
Fast network; end-to-end
Backpack with drives
Couriers with drives
Set up experimental ingest
March 2012 in production
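Why shipping drives can compete with a network link is easiest to see with a back-of-the-envelope calculation. The sketch below is a minimal illustration only: the shipment size, courier time and link speed are hypothetical values chosen for the example, not figures from the BGI ingest, and the function names are made up here.

```python
def sneakernet_rate_mbit_s(data_tb, transit_hours):
    """Average throughput in Mbit/s when data_tb terabytes arrive after transit_hours."""
    bits = data_tb * 1e12 * 8          # decimal terabytes to bits
    return bits / (transit_hours * 3600) / 1e6


def network_hours(data_tb, link_mbit_s):
    """Hours needed to move data_tb terabytes over a fully used link of link_mbit_s."""
    bits = data_tb * 1e12 * 8
    return bits / (link_mbit_s * 1e6) / 3600


if __name__ == "__main__":
    # Hypothetical example values, not taken from the slides.
    shipment_tb = 20       # e.g. ten 2 TB drives in one parcel
    courier_hours = 72     # door-to-door transit time
    link_mbit_s = 1000     # sustained end-to-end network rate

    print(f"courier: {sneakernet_rate_mbit_s(shipment_tb, courier_hours):.0f} Mbit/s average")
    print(f"network: {network_hours(shipment_tb, link_mbit_s):.1f} hours at {link_mbit_s} Mbit/s")
```

The comparison ignores the time needed to copy data onto and off the drives, which in practice adds to the courier figure.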
From problem to result
Definition of own role & contribution
Realistic objectives
Trustworthy, knowledgeable partner
Adapting to pace and needs of scientific project
Education
Support
Development
Funding dedicated programmers
Experimental technologies
Keeping it all operational
Types of problems
Data-intensive, information-intensive
Memory; IO; data-locality
Easily scalable, complex integrated pipelines
Legacy; you never start from scratch
Licensing / privacy
Local policies
Who decides
e-BioGrid platform
Create a national support basis for e-BioScience for both expert bioinformaticians and expert life scientists.
Exploit the BiG Grid infrastructure in life science R&D
Create functional Problem Solving Environments (PSEs) for the selected technology areas that deal with high demand in computing resources
Connect with the NBIC-BioAssist and BiG Grid programs.
e-BioScience
Life Science Research
Research Support
BioAssist Engineering Team
Genomics
Bio-interpret.
Biobanking
Proteomics
Short cooperative projects
Task force
Tools
TA project
PSEs
NGS
MAS
MAT
NCS
BBC
BiG Grid
BioAssist
Support & Development Team
Operations Team
Cooperative projects
Analysis, design & implementation of software environment
Infrastructure
Installation and running of the compute and storage systems
e-Core
e-BioGrid
Bioinformatics tools oriented
ICT infrastructure oriented
Problems and challenges
Inspire and motivate
Keeping all stakeholders happy
Infrastructure needs money
Funding implies overhead
Can't do everything, making choices
Sane development
Invisibility of infrastructure
Collaboration can bite individual excellence
Loose coupling or tight integration
Keeping up with fast changes
Conclusion
We have a collaborative challenge
NL/EU is very well positioned
Great potential of the network
Governments recognize importance
Keeping up with fast technology changes
Trans-disciplinarity and integration are key
Strong community is key
Knowledge, skills and technology
Will e-science become science again?
References / credits
http://www.biggrid.nl
http://www.e-biogrid.nl
http://www.sara.nl
http://www.nwo.nl
BBMRI image NY times: http://goo.gl/I130Q
http://www.bbmri.nl
http://www.egi.eu
http://www.necen.nl concept drawing by J.J.Bot