Optimizing discovery in the big science era · 2015-03-12


Netherlands eScience Center

Optimizing discovery in the big science era

www.esciencecenter.nl

Prof. Dr. Wilco Hazeleger

The world around us

• Science and society are intimately connected

• Science becomes increasingly problem-driven

• Science increasingly inter-, multi-, trans-disciplinary

The Big Science era

Mission

Enabling digitally enhanced research through efficient use of scientific software, data, and e-infrastructure

Application Domains: Life Sciences & eHealth, Environment & Sustainability, Humanities & Social Sciences, Physical World & Beyond

e-Infrastructure: Computing, Networking, Storage & Visualization

Organisation

Founding organisations (since 2011):

• NWO – Netherlands Organisation for Scientific Research (2.7 M€ p.a.)

• SURF – Dutch higher education and research partnership for ICT (2.7 M€ p.a.)

Collaborative projects between NLeSC, academic partners and industry, which include our digital scientists; funded in cash and in kind.

NLeSC research program on generic eScience concepts and tools.

NLeSC priority domains (demand-driven from science)

I. Environment & Sustainability

Climate, ecology, energy, logistics, water management, agriculture & food

II. Life Sciences & eHealth

Next generation sequencing, biobanking, molecules & man

III. Humanities & Social Sciences

SMART cities, text analysis, eBusiness, creative technologies

IV. Physics and beyond

Astronomy, high-energy physics, advanced materials, engineering & manufacturing

NLeSC eScience competences applied in science

• Big data analytics: statistics, machine learning, visualisation, text mining

• Optimized data handling: database optimization, structured & unstructured data, real-time data

• Efficient computing: distributed & accelerated computing, efficient algorithms

eScience Research Engineers = Digital Scientists

– broadly oriented scientists at the interface of research and IT

– collaborating with domain researchers to implement eScience concepts and tools

– mostly PhDs with domain knowledge and IT skills

– involved in projects, funded in cash and in kind

eScience Technology Platform

Core of NLeSC expertise; promotes exchange and re-use of best practices

• Repository

– compute kernels, interfaces, libraries, tools, and scientific workflows

• Knowledge base

– professional coding standards, coding styles, unit and integration testing, and documentation

• Expertise center & meeting place

Collaborative Project Examples

Astronomy

Project Leader: Marco de Vos, ASTRON

Neuroimaging

Project Leader: Paul Tiesinga, Univ. of Nijmegen

eChemistry

Project Leader: Lars Ridder, Univ. of Wageningen

eScience Engineer: Marijn Sanders

Climatology

Project Leader: Henk Dijkstra, Univ. Utrecht

eScience Engineer: Jason Maassen

eEcology

Project Leader: Willem Bouten, Univ. Amsterdam

eFood Research

Project Leader: Wynand Alkema, Univ. Nijmegen

Life Sciences

Project Leader: Jan Willem Boiten, CTMM

Water Management

Project Leader: Prof. Nick van de Giesen, TU Delft

eHumanities

Project Leader: Guus Schreiber, Free Univ. Amsterdam

Green Genetics

Project Leader: Bernard de Geus, TTi Green Genetics

Massive Point Clouds

Project Leader: Peter van Oosterom, Delft University of Technology

Sim-City

Project Leader: Peter Sloot, UVA

SPuDisc

Project Leader: Maarten de Rijke, Univ Amsterdam

Summer in the City

Project Leader: Bert Holtslag, Univ. of Wageningen

eVisualization

Project Leader: Edwin Valentijn, Univ. Groningen

TwiNL

Project Leader: Antal van den Bosch, Univ. of Nijmegen

ODEX4All

Project Leader: Barend Mons, Leiden Univ.

eSiBayes

Project Leader: Willem Bouten, Univ. Amsterdam

AMUSE

Project Leader: Simon Portegies Zwart, Leiden Univ.

Via Appia

Project Leader: Henk Scholten, VU Amsterdam

e-Ecology

NLeSC and UVA (Prof. W. Bouten)

Annotation tool and learning

Acceleration data to behavior

Machine learning

– Labeled training set

– Trained model

Schema from Natural Language Processing with Python, by Steven Bird, Ewan Klein and Edward Loper, Copyright © 2009
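The step from a labeled training set to a trained model that maps acceleration data to behavior can be sketched as below. This is only an illustration under stated assumptions: the slides do not say which features or classifier the project used, so the two features (mean and spread of the acceleration magnitude per window), the nearest-centroid classifier, the behavior labels "resting" and "active", and all numbers are hypothetical.

```python
from math import dist
from statistics import mean, pstdev


def features(window):
    """Summarise one window of acceleration magnitudes as (mean, spread)."""
    return (mean(window), pstdev(window))


def train(labeled_windows):
    """Build a model: one feature centroid per behavior label."""
    by_label = {}
    for window, label in labeled_windows:
        by_label.setdefault(label, []).append(features(window))
    # Average each feature column to get the centroid of every label.
    return {
        label: tuple(mean(column) for column in zip(*feats))
        for label, feats in by_label.items()
    }


def predict(model, window):
    """Assign the label whose centroid is closest to the window's features."""
    f = features(window)
    return min(model, key=lambda label: dist(model[label], f))


# Hypothetical annotated windows (toy values, not project data).
train_set = [
    ([1.0, 1.0, 1.1, 0.9], "resting"),
    ([1.0, 0.9, 1.0, 1.1], "resting"),
    ([0.2, 2.5, 0.4, 2.8], "active"),
    ([0.1, 2.9, 0.3, 2.2], "active"),
]

model = train(train_set)
print(predict(model, [1.0, 1.05, 0.95, 1.0]))  # resting
print(predict(model, [0.3, 2.6, 0.2, 2.7]))    # active
```

The annotation tool mentioned on the slide would be the source of pairs like those in `train_set`; any standard classifier could take the place of the centroid rule.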

e-Food

NLeSC and Prof. Alkema (Radboud Univ Nijmegen) and Dr Tops (VU and WUR)

Literature sources

Taste ontology, e.g. sweet, sour, bitter, umami, salty, ropiness, TASR1

Ingredient ontology, e.g. mannitol, sucrose, sorbitol, alpha-terpineol, 4-methylpentanoic acid, ethyl propionate, flavonoid, caffeine

Workflow: tag, calculate compound ontology profiles, store

Classifying compounds according to taste

Figure labels: ~500 terms; ~40,000 terms; derived from ChEBI; a number of known food proteins; 24 million scientific abstracts
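The tag, calculate, store and classify steps named on this slide can be sketched roughly as follows. The slides do not describe the actual method, so this is a minimal sketch under assumptions: tagging is plain single-word matching, a compound's profile is the count of taste terms that co-occur with it in the same abstract, and the term sets and abstracts are toy stand-ins rather than the real ontologies or literature.

```python
from collections import Counter, defaultdict

# Tiny stand-ins for the taste and ingredient ontologies (assumed subsets).
TASTE_TERMS = {"sweet", "sour", "bitter", "umami", "salty"}
COMPOUND_TERMS = {"sucrose", "caffeine", "mannitol"}


def tag(text, terms):
    """Return the ontology terms that occur as words in the text."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    return words & terms


def build_profiles(abstracts):
    """Count, per compound, how often each taste term co-occurs with it."""
    profiles = defaultdict(Counter)
    for text in abstracts:
        tastes = tag(text, TASTE_TERMS)
        for compound in tag(text, COMPOUND_TERMS):
            profiles[compound].update(tastes)
    return profiles


def classify(profiles, compound):
    """Pick the most frequent co-occurring taste, or None if unseen."""
    counts = profiles.get(compound)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up example sentences, not real abstracts.
abstracts = [
    "Sucrose is perceived as sweet.",
    "Caffeine has a bitter taste.",
    "Sweet sucrose solutions mask bitter caffeine.",
]

profiles = build_profiles(abstracts)
print(classify(profiles, "sucrose"))   # sweet
print(classify(profiles, "caffeine"))  # bitter
print(classify(profiles, "mannitol"))  # None
```

Multi-word terms such as "ethyl propionate" would need phrase matching, which this sketch leaves out.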

Point clouds

NLeSC with TU-Delft

Point clouds

• Set of data points in some coordinate system

• In 3D coordinate system, points defined by X, Y, Z

• Possible to have more attributes. Ex: Color (R, G, B)

Actual Height Model of the Netherlands (AHN2):

• NL surface

• 640 billion points

• 6–10 points per m²

• 12 attributes

• 20 bytes/point

• 60000 files

Massive point clouds for eSciences

LAS 11.64 TB

LAZ 1 TB
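The AHN2 figures on these slides can be cross-checked with a few lines of arithmetic. The sketch below assumes that the LAS size is simply points × bytes per point, and that the slide's "11.64 TB" counts binary terabytes (2^40 bytes); both are assumptions, since the slides do not state how the sizes were measured.

```python
POINTS = 640 * 10**9      # 640 billion points (from the slide)
BYTES_PER_POINT = 20      # from the slide
FILES = 60_000            # from the slide

raw_bytes = POINTS * BYTES_PER_POINT

raw_tb = raw_bytes / 10**12    # decimal terabytes
raw_tib = raw_bytes / 2**40    # binary terabytes (TiB)
points_per_file = POINTS / FILES

print(f"{raw_tb:.1f} TB")                               # 12.8 TB
print(f"{raw_tib:.2f} TiB")                             # 11.64 TiB
print(f"{points_per_file / 1e6:.2f} million per file")  # 10.67 million per file

# Assumption: the LAZ figure of 1 TB uses the same unit as the LAS figure.
compression_ratio = raw_tib / 1.0
print(f"about {compression_ratio:.0f}:1")               # about 12:1
```

Under these assumptions the uncompressed size reproduces the 11.64 figure quoted for LAS, and LAZ compression shrinks the dataset by roughly a factor of twelve.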

Point cloud databases

• Loading

• Organization

• Indexing

• Clustering

• Blocking

• Compressing

• Querying

• Parallel processing

• Level of detail / Data pyramid
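Indexing, clustering and blocking are listed without detail. One common way to combine them is to sort points along a space-filling curve and cut the sorted sequence into fixed-size blocks. The sketch below uses a 2D Morton (Z-order) key for this; it illustrates that general technique and is not taken from the project. The function names, cell size and block size are hypothetical, and coordinates are assumed to be non-negative.

```python
def interleave_bits(x, y, bits=16):
    """Interleave the low bits of two non-negative ints into a Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return code


def morton_key(point, cell_size):
    """Morton code of the grid cell that contains the point (X, Y, Z)."""
    col = int(point[0] // cell_size)
    row = int(point[1] // cell_size)
    return interleave_bits(col, row)


def block_points(points, cell_size, block_size):
    """Sort points along the Morton curve and cut them into blocks."""
    ordered = sorted(points, key=lambda p: morton_key(p, cell_size))
    return [
        ordered[i:i + block_size]
        for i in range(0, len(ordered), block_size)
    ]


# Toy points (X, Y, Z); the values are made up.
points = [
    (0.5, 0.5, 1.0),
    (10.2, 0.1, 2.0),
    (0.7, 0.9, 1.5),
    (10.9, 0.4, 2.2),
]

blocks = block_points(points, cell_size=1.0, block_size=2)
for block in blocks:
    print(block)
```

Points that are close in X and Y tend to end up in the same block, so a spatial query only has to read the blocks whose key range overlaps the query area.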

e-Watercycle

NLeSC and TU-Delft, Utrecht Univ (Prof. N. vd Giesen)

Enabling digitally enhanced research through efficient use of scientific software, data and e-infrastructure

• Deals with data, data, data… and computing

• Domain-overarching solutions need cross-discipline expertise, well-defined interfaces, and standardization

– eScience technology platform: software & expertise

– Application in domain sciences

Thank you
