ipt fwp (task 7) data science initiative · 2016 – 04 – 19 / crtree – romanov. albany, or...

1
Albany, OR Anchorage, AK Morgantown, WV Pittsburgh, PA Sugar Land, TX 2016 – 04 – 19 / CRTREE – Romanov IPT FWP (T ASK 7) DATA SCIENCE INITIATIVE Summary : • NETL data science initiative is a focused effort to support data-intensive research projects that will adapt and develop software tools so that it is possible to curate, archive and analyze large volumes of raw data, with the initial focus on fuel cell materials and advanced alloys, while they are used under extreme environments and power plant cycling conditions. Predictive materials degradation models will be validated against experimental data. Certain analytical models that could be used in simulation and visualization will be designed to account for the features and relationships uncovered using data analytics. The experimental data are gathered as part of the ongoing domain science research at NETL as well as through literature search and collaboration with external organizations. The developed analytical tools and user interfaces will allow the FE experts and technology developers to train the system and to extract hypotheses and promising technological approaches out of simulated and experimental data, in response to their questions and requests, through the integration of guided experimental research with computational sciences and engineering across time and length scales. It will provide the foundations of fundamental scientific understanding for advancing broad areas of science dealing with properties and behaviors of advanced materials and power plant components. Preliminary results of the alloys data analytics pilot [test temperature affecting tensile test outcomes; applied creep stress causing early rupture] are in line with domain knowledge expectations and [skewness of composition vs tensile test data] suggest that cluster analysis may be useful to organize the observed data into meaningful structures. PI: Slava Romanov [email protected] Collaboration with (subcontract award) CASE WESTERN RESERVE UNIVERSITY PILOT PROJECT FUEL CELLS PILOT PROJECT RAPID QUALIFICATION of ALLOYS SCALE BRIDGING MODEL VALIDATION DATA ANALYSIS VISUALIZATION Engagement in MATERIALS GENOME INITIATIVE JOULE Supercomputer (NETL) CWRU / SDLE Center, VUV-Lab, Materials Science & Engineering Department, Roger H. French © 2012 U.S. Department of Energy National Energy Technology Laboratory Pittsburgh, Pennsylvania 15236 24,152 2.6GHz Intel Sandy Bridge cores 72,576 GBs ofTotal RAM Full bisection bandwidth QDR Interconnect 1 PB of primary storage 3 PB of local scratch storage Rmax: 503.2 TFLOPs Rpeak: 413.5 TFLOPs Refreshment is planned The project team develops and utilizes unique capabilities in parallel and distributed computing, not only SQL databases, data ingestion, integration and preprocessing (cleaning, parsing, organizing according to the tidy/structured data principles, managing, maintaining, and validating), machine learning, artificial intelligence, fast and scalable exploratory data analysis, statistical inference, feature discovery, mathematical and semi- supervised generalized higher-order system equation modeling to identify relationships and latent variables, advanced expert systems and knowledge-based technologies. This represents NETL’s emerging expertise in data-intensive, domain-guided statistical design for optimal manufacturing, computational process and materials engineering, with uncertainty quantification to support decision-making, and additional scientific insight into complex, noisy, high-dimensional, and high-volume data sets from experiments and simulations. FY 2016 Milestone: Identify materials data management and processing approaches and UQ methodology for the 9Cr steel family of alloys.

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IPT FWP (TASK 7) DATA SCIENCE INITIATIVE · 2016 – 04 – 19 / CRTREE – Romanov. Albany, OR Anchorage, AK Morgantown, WV Pittsburgh, PA Sugar Land, TX. IPT FWP (T. ASK. 7) DATA

Albany, OR Anchorage, AK Morgantown, WV Pittsburgh, PA Sugar Land, TX2016 – 04 – 19 / CRTREE – Romanov

IPT FWP (TASK 7) DATA SCIENCE INITIATIVE

Summary:

• NETL data science initiative is a focused effortto support data-intensive research projectsthat will adapt and develop software tools sothat it is possible to curate, archive andanalyze large volumes of raw data, with theinitial focus on fuel cell materials andadvanced alloys, while they are used underextreme environments and power plant cyclingconditions. Predictive materials degradationmodels will be validated against experimentaldata. Certain analytical models that could beused in simulation and visualization will bedesigned to account for the features andrelationships uncovered using data analytics.The experimental data are gathered as part ofthe ongoing domain science research at NETLas well as through literature search andcollaboration with external organizations. Thedeveloped analytical tools and user interfaceswill allow the FE experts and technologydevelopers to train the system and to extracthypotheses and promising technologicalapproaches out of simulated and experimentaldata, in response to their questions andrequests, through the integration of guidedexperimental research with computationalsciences and engineering across time andlength scales. It will provide the foundations offundamental scientific understanding foradvancing broad areas of science dealing withproperties and behaviors of advancedmaterials and power plant components.Preliminary results of the alloys data analyticspilot [test temperature affecting tensile testoutcomes; applied creep stress causing earlyrupture] are in line with domain knowledgeexpectations and [skewness of composition vstensile test data] suggest that cluster analysismay be useful to organize the observed datainto meaningful structures.

PI: Slava [email protected]

Collaboration with (subcontract award)CASE WESTERN RESERVE UNIVERSITY

PILOT PROJECTFUEL CELLS

PILOT PROJECTRAPID QUALIFICATIONof ALLOYS

SCALE BRIDGING

MODEL VALIDATION

DATA ANALYSIS VISUALIZATION

Engagement inMATERIALSGENOMEINITIATIVE

JOULE Supercomputer(NETL)

CWRU / SDLE Center, VUV-Lab, Materials Science & Engineering Department, Roger H. French © 2012

U.S. Department of EnergyNational Energy Technology Laboratory

Pittsburgh, Pennsylvania 15236

• 24,152 2.6GHz Intel Sandy Bridge cores• 72,576 GBs of Total RAM• Full bisection bandwidth QDR Interconnect• 1 PB of primary storage• 3 PB of local scratch storage• Rmax: 503.2 TFLOPs• Rpeak: 413.5 TFLOPs• Refreshment is planned

The project team develops and utilizes unique capabilities in parallel and distributed computing, not only SQL databases, data ingestion, integration and preprocessing (cleaning, parsing, organizing according to the tidy/structured data principles, managing, maintaining, and validating), machine learning, artificial intelligence, fast and scalable exploratory data analysis, statistical inference, feature discovery, mathematical and semi-supervised generalized higher-order system equation modeling to identify relationships and latent variables, advanced expert systems and knowledge-based technologies.

This represents NETL’s emerging expertise in data-intensive, domain-guided statistical design for optimal manufacturing, computational process and materials engineering, with uncertainty quantification to support decision-making, and additional scientific insight into complex, noisy, high-dimensional, and high-volume data sets from experiments and simulations.

FY 2016 Milestone: Identify materials data management and processing approaches and UQ methodology for the 9Cr steel family of alloys.