ichec - observation systems, technologies and big data
Presentation by Alastair McKinstry
Observation Systems, Technologies and Big Data
Alastair McKinstry EPA Climate Workshop, 19 September 2013
ICHEC overview
• National Technology Centre
• Established in 2005
• Hosted by NUI Galway
• Mandate:
– HPC & Big Data/Data Analytics
– Industry engagement
– Platform Science & Technology
• 25 staff in Dublin & Galway
• Mix of software developers, domain specialists
• 4 in Climate/Environmental area
Old vs New: a x1000 step change
Old:
• Everybody downloads the data
• e.g. data on 50 km grid. Few MB/day.
• 1-3 bands.
New:
• Move the work to the data
– 100+ GB/day
– 20-60 m resolution, 12-15 bands
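The size of this step change can be checked with simple arithmetic. The sketch below uses the figures from the slide; the representative values picked inside each range (50 m for the new resolution, 5 MB/day for "few MB/day") are assumptions for illustration only. With these values the linear resolution ratio is 1000, which is one possible reading of the "x1000" in the slide title.

```python
# Figures from the slide; the specific values picked inside each range are assumptions.
OLD_GRID_M = 50_000        # old products: 50 km grid
NEW_GRID_M = 50            # new products: 20-60 m resolution (50 m assumed)
OLD_MB_PER_DAY = 5         # "few MB/day" (5 MB assumed)
NEW_MB_PER_DAY = 100_000   # "100+ GB/day" (100 GB, decimal units)


def linear_ratio(old_m: float, new_m: float) -> float:
    """How many new pixels fit along one edge of an old grid cell."""
    return old_m / new_m


def volume_ratio(old_mb: float, new_mb: float) -> float:
    """Growth factor in daily data volume."""
    return new_mb / old_mb


resolution_step = linear_ratio(OLD_GRID_M, NEW_GRID_M)
pixels_per_old_cell = resolution_step ** 2
daily_volume_step = volume_ratio(OLD_MB_PER_DAY, NEW_MB_PER_DAY)

print(f"linear resolution: x{resolution_step:,.0f}")
print(f"pixels per old grid cell: x{pixels_per_old_cell:,.0f}")
print(f"daily volume: x{daily_volume_step:,.0f}")
```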
Big Data: Networking
• ICHEC and HEAnet have 10 Gb/s links
– Not affordable on commercial rates
– Used in CMIP5 data project with eINIS
– Point-to-Point with European partners
• Move one copy to Ireland, process it at an “Exploitation portal”
– Share workflows.
– Processing triggered on data arrival
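"Processing triggered on data arrival" could be sketched as a periodic scan of an incoming directory. This is a minimal illustration and not a description of ICHEC's portal: the function name `process_new_arrivals`, the directory layout and the `workflow` callback are all hypothetical.

```python
import os
from typing import Callable, List, Set


def process_new_arrivals(incoming_dir: str, seen: Set[str],
                         workflow: Callable[[str], None]) -> List[str]:
    """Run `workflow` once for each file not handled by an earlier scan.

    `seen` is updated in place, so repeated calls only process new files.
    Returns the names of the files handled by this call.
    """
    handled = []
    for name in sorted(os.listdir(incoming_dir)):
        path = os.path.join(incoming_dir, name)
        if name in seen or not os.path.isfile(path):
            continue
        workflow(path)   # hypothetical step, e.g. atmospheric correction or QA
        seen.add(name)
        handled.append(name)
    return handled
```

Calling this from a scheduler or a simple loop gives the trigger behaviour. A production version would also need to make sure a file has finished transferring before processing it, which this sketch does not attempt.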
Big Data: Compute
• Workflows are no longer a “hobby” task
– Not on a simple PC at 20-50m, but …
• GPGPUs / Intel MIC Accelerators:
– 80 Tflop/s of capability on upcoming ICHEC system
– cf. 40 Tflop/s needed to process EUMETSAT data
• Shared workflows: atmospheric correction, QA
• ICHEC has portal experience: BDI, Bioportal,
• Automated: repeatability.
SFI Review – Royal Irish Academy, Dublin – 21st October 2010
Curation: an unsolved problem
• What to keep?
• Useful to Ireland:
– Products, raw data not archived at primary sites
– Archiving “just Ireland” gives valuable time series
• ICHEC could provide a platform for this:
– Funding needed from beneficiaries or agencies.
– Lack of sustainability a problem (C4I, CMIP5)
– Curation needs human work: data scientists.
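Archiving "just Ireland" amounts to cropping each incoming product to a fixed region before storing it. A minimal sketch, assuming a regular latitude/longitude grid held as nested lists; the bounding box is a rough, assumed box around the island of Ireland, and `subset_grid` is a hypothetical helper.

```python
# Rough bounding box around the island of Ireland (assumed values, in degrees).
IRELAND_BBOX = (51.0, 56.0, -11.0, -5.0)  # lat_min, lat_max, lon_min, lon_max


def subset_grid(values, lats, lons, bbox=IRELAND_BBOX):
    """Crop a regular lat/lon grid to a bounding box.

    values[i][j] is the value at (lats[i], lons[j]).
    Returns the cropped values together with their latitudes and longitudes.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    cols = [j for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    cropped = [[values[i][j] for j in cols] for i in rows]
    return cropped, [lats[i] for i in rows], [lons[j] for j in cols]
```

Applying the same crop to every day's product and keeping the results in date order yields the regional time series the slide refers to.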
Processing in Ireland?
• Some products may not be produced upstream
– E.g. Algal blooms for North Atlantic
• Need rapid processing of raw data
• Critical for aquaculture
• Time critical.
– May pave way for ground station for later satellites
Data Fusion
• Combining Remote Sensing data with other datasets:
– Ground truthing
– Precipitation, soil moisture (SMOS), runoff, river gauges, …
• Needs consistent data, interoperability:
– Technical limitations
– Orgs. to make data available to each other: collaborations
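Ground truthing usually comes down to comparing satellite-derived values with in-situ measurements taken at the same place and time. The sketch below computes two common summary statistics, mean bias and root-mean-square error, for already matched pairs; the function name and the pairing step are assumptions, not something described in the talk.

```python
import math


def validation_stats(satellite, ground):
    """Return (bias, rmse) of satellite estimates against ground measurements.

    Both sequences must be the same length and already matched pair by pair.
    """
    if len(satellite) != len(ground) or not satellite:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [s - g for s, g in zip(satellite, ground)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse
```

A bias near zero with a large RMSE suggests noisy but unbiased retrievals, while a large bias points to a systematic offset worth correcting before fusing the datasets.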
Combining with models
• Experience with weather and climate
• Coupling models and data assimilation key science skills at ICHEC
• “Virtual Ireland”: assimilating observations and model data for
– Pollution control: e.g. ICOS
– Flooding, hydrology
– Policy analysis
Other datasets
• Not just Remote Sensing:
– Make other datasets available: same grids, etc.
• Model data, observations,
– Somewhere for users to upload data:
• Indexed, archived, remapped to new formats
• Data scientists who understand metadata and the science behind the data
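Putting datasets on the "same grids" means remapping one of them. One simple case is aggregating a fine grid onto a coarser one by block averaging, sketched below. This is only one of several remapping methods, it assumes the coarse cell size is an exact multiple of the fine one, and `block_mean` is a hypothetical helper.

```python
def block_mean(fine, factor):
    """Aggregate a fine 2-D grid onto a coarser grid.

    Each output cell is the mean of a factor x factor block of input cells.
    """
    n_rows, n_cols = len(fine), len(fine[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            block = [fine[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse
```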
Citizen Science
• Data to the citizen:
– A portal for making datasets available:
– Making WxS layers available for GIS, Google Earth, …
– Enable “mashups”, analysis apps.
• From the citizen:
– Smart apps for uploading observations, measurements
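As an illustration of how a client might pull one of these layers, the sketch below builds an OGC WMS 1.3.0 `GetMap` request URL. The endpoint and layer name in the usage note are placeholders, not real services. WMS 1.3.0 with `CRS=EPSG:4326` expects the bounding box in latitude/longitude order.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL for a single layer.

    bbox is (lat_min, lon_min, lat_max, lon_max), the axis order that
    WMS 1.3.0 uses for EPSG:4326.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty string requests the default style
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"
```

For example, `wms_getmap_url("https://example.org/wms", "sst", (51.0, -11.0, 56.0, -5.0))` returns a URL that a GIS client or a browser could fetch as a PNG map tile, assuming such a server and layer existed.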
Global opportunities
• Commercial spinoffs: tech. startups looking for testbeds of global opportunities
– Promote tech. sector in Ireland, not just exploitation of data in Ireland, e.g. showcase big databases, fast networks
New value in old data
• The big investment has been made
– Ireland’s contribution to ESA,
– “Random” datasets in public sector, academia
• Applications in:
– Agriculture
– Policy and planning
– Tourism