In situ data processing for extreme-scale computing
XLDB 2011, 10/19/2011
Scott A. Klasky ([email protected])


  • Slide 1
  • In situ data processing for extreme-scale computing: "To Exascale and beyond!"
    XLDB 2011, 10/19/2011
    Scott A. Klasky ([email protected])
    Co-authors: H. Abbasi 2, Q. Liu 2, J. Logan 1, M. Parashar 6, N. Podhorszki 1, K. Schwan 4, A. Shoshani 3, M. Wolf 4, S. Ahern 1, I. Altintas 9, W. Bethel 3, L. Chacon 1, C. S. Chang 10, J. Chen 5, H. Childs 3, J. Cummings 13, C. Docan 6, G. Eisenhauer 4, S. Ethier 10, R. Grout 7, R. Kendall 1, J. Kim 3, T. Kordenbrock 5, Z. Lin 10, J. Lofstead 5, X. Ma 15, K. Moreland 5, R. Oldfield 5, V. Pascucci 12, N. Samatova 15, W. Schroeder 8, R. Tchoua 1, Y. Tian 14, R. Vatsavai 1, M. Vouk 15, K. Wu 3, W. Yu 14, F. Zhang 6, F. Zheng 4
    Affiliations: 1 ORNL, 2 U.T. Knoxville, 3 LBNL, 4 Georgia Tech, 5 Sandia Labs, 6 Rutgers, 7 NREL, 8 Kitware, 9 UCSD, 10 PPPL, 11 UC Irvine, 12 U. Utah, 13 Caltech, 14 Auburn University, 15 NCSU
    Thanks to: V. Bhat, B. Bland, S. Canon, T. Critchlow, G. Y. Fu, B. Geveci, G. Gibson, C. Jin, B. Ludaescher, S. Oral, M. Polte, R. Ross, J. Saltz, R. Sankaran, C. Silva, W. Tang, W. Wang, P. Widener
    (Managed by UT-Battelle for the Department of Energy)
  • Slide 2
  • Outline
    - I/O challenges for extreme-scale computing
    - The Adaptable I/O System (ADIOS)
    - SOA
    - In transit approach
    - Future work
    Work supported under: ASCR (CPES, SDM, ASCR SDM Runtime Staging, SciDAC Application Partnership, OLCF, eXact co-design), OFES (GPSC, GSEP), NSF (HECURA, RDAV), NASA (ROSES)
  • Slide 3
  • Extreme scale computing
    - Trends: more FLOPS; a limited number of users at the extreme scale
    - Problems: performance, resiliency, debugging, getting science done
    - These problems will get worse; we need a revolutionary way to store, access, and debug data to get the science done!
    - ASCI Purple (49 TB / 140 GB/s) vs. JaguarPF (300 TB / 200 GB/s); most people get < 5 GB/s at scale
    - From J. Dongarra, "Impact of Architecture and Technology for Extreme Scale on Software and Algorithm Design," Cross-cutting Technologies for Computing at the Exascale, February 2-5, 2010.
  • Slide 4
  • Next-generation I/O and file system challenges
    - At the architecture or node level: use increasingly deep memory hierarchies coupled with new memory properties
    - At the system level: cope with I/O rates and volumes that stress the interconnect, can severely limit application performance, and can consume unsustainable levels of power
    - At the exascale machine level: immense aggregate I/O needs with potentially uneven loads placed on the underlying resources, which can result in data hotspots, interconnect congestion, and similar issues
  • Slide 5
  • Applications are the main drivers
    - Our team works directly with (24% of time at OLCF):
      Fusion: XGC1, GTC, GTS, GTC-P, Pixie3D, M3D, GEM
      Combustion: S3D, Raptor, Boxlib
      Relativity: PAMR
      Nuclear physics
      Astrophysics: Chimera, Flash
      and many more application teams
    - Requirements (in order): fast I/O with low overhead; simplicity in using I/O; self-describing file formats; QoS
    - Extras, if possible: in-transit visualization; coupling multiple codes; performing in-transit workflows
  • Slide 6
  • ADIOS: Adaptable I/O System
    - Provides portable, fast, scalable, easy-to-use, metadata-rich output with a simple API (see the sketch below)
    - Change the I/O method by changing the XML configuration
    - Layered software architecture: allows plug-ins for different I/O implementations and abstracts the API from the method used for I/O
    - Open source: http://www.nccs.gov/user-support/center-projects/adios/
    - Research methods from many groups
    - S3D: 10X performance improvement over previous work; Chimera: 10X improvement; SCEC: 10X improvement
    - XGC1 code: 40 GB/s; SCEC code: 30 GB/s; GTC code: 40 GB/s; GTS code: 35 GB/s
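
As a rough illustration of the "simple API, method chosen in XML" point above, the sketch below uses the ADIOS 1.x C write API. The signatures follow later 1.x releases and are approximate; the group name "restart", the variables "nx" and "temperature", and the file names config.xml / restart.bp are invented for this example, not taken from the talk.

```c
/* Minimal sketch of an ADIOS 1.x style write path (signatures approximate).
 * The I/O method (POSIX, MPI, staging, ...) is selected in the XML file,
 * so this code does not change when the transport does. */
#include <stdint.h>
#include <mpi.h>
#include "adios.h"

void write_restart(MPI_Comm comm, int nx, const double *temperature)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    adios_init("config.xml", comm);          /* parse XML: groups, variables, method */

    int64_t fd;
    uint64_t group_size = sizeof(int) + (uint64_t)nx * sizeof(double);
    uint64_t total_size = 0;

    adios_open(&fd, "restart", "restart.bp", "w", comm);
    adios_group_size(fd, group_size, &total_size); /* lets ADIOS size its buffer */
    adios_write(fd, "nx", &nx);
    adios_write(fd, "temperature", (void *)temperature);
    adios_close(fd);                          /* data is drained per the XML method */

    adios_finalize(rank);
}
```

Switching from, say, a POSIX method to a staging method would only require editing the `<method>` entry in the XML file; the application code above stays the same.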
  • Slide 7
  • We want our system to be so easy, even a chimp can use it ("Even I can use it!")
    - Sustainable, fast, scalable, portable
  • Slide 8
  • Usability of optimizations
    - New technologies are usually constrained by the lack of usability in extracting performance; next-generation I/O frameworks must address this concern
    - Partition the task of optimization from the actual description of the I/O
    - Allow the framework to optimize for both read and write performance on machines with high variability, and to increase the average I/O performance
    [Figure: I/O performance variability; axes: std. dev., time]
  • Slide 9
  • Data formats: Elastic Data Organization
    - File systems have increased in complexity due to high levels of concurrency
    - Must understand how to place data on the file system, and how to decrease the overhead of both writing and reading data to and from these complex file systems
    - File formats should be easy to use, easy to program, high performance, portable, self-describing, and able to work efficiently on both parallel and simple file systems; they should allow for simple queries into the data
    - A major challenge has always been understanding the performance of I/O for both reading and writing, and trying to optimize for both of these cases
  • Slide 10
  • Excellent read performance across read patterns
    - Use a Hilbert curve to place chunks on the Lustre file system with an Elastic Data Organization (see the sketch below)
    [Figures: theoretical concurrency; observed performance]
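
The slide credits the read performance to placing chunks along a Hilbert curve. The sketch below is the standard index-to-coordinate conversion for a 2-D Hilbert curve, not EDO's actual placement code; it only shows why curve order is attractive: consecutive chunk indices land on spatially adjacent chunks, so neighboring regions of a variable can be spread predictably across storage targets.

```c
#include <stdio.h>

/* Standard 2-D Hilbert curve: convert distance d along the curve into (x, y)
 * on an n x n grid (n a power of two). Consecutive d values map to adjacent
 * cells. Illustrative only; not the EDO implementation. */
static void hilbert_d2xy(int n, int d, int *x, int *y)
{
    int rx, ry, t = d;
    *x = *y = 0;
    for (int s = 1; s < n; s *= 2) {
        rx = 1 & (t / 2);
        ry = 1 & (t ^ rx);
        if (ry == 0) {                 /* rotate the quadrant */
            if (rx == 1) {
                *x = s - 1 - *x;
                *y = s - 1 - *y;
            }
            int tmp = *x; *x = *y; *y = tmp;
        }
        *x += s * rx;
        *y += s * ry;
        t /= 4;
    }
}

int main(void)
{
    /* Map 16 chunk indices onto a 4 x 4 chunk grid in Hilbert order. */
    for (int d = 0; d < 16; d++) {
        int x, y;
        hilbert_d2xy(4, d, &x, &y);
        printf("chunk %2d -> (%d, %d)\n", d, x, y);
    }
    return 0;
}
```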
  • Slide 11
  • I/O skeletal code creation (Skel)
    - I/O skeletal applications are automatically generated from the ADIOS XML file
    - No computation or communication; standard build mechanism; standard measurement strategies; extensible collection of applications
    - Skel generates source code, a makefile, and submission scripts
    - Workflow: s3d.xml -> (skel xml) -> s3d_skel.xml -> (skel params) -> s3d_params.xml -> (skel src / skel makefile / skel submit) -> source files, Makefile, submit scripts -> (make) -> executables -> (make deploy) -> Skel_s3d
  • Slide 12
  • SOA philosophy
    - The overarching design philosophy of our framework is based on the Service-Oriented Architecture
    - Used to deal with system/application complexity, rapidly changing requirements, evolving target platforms, and diverse teams
    - Applications are constructed by assembling services, based on a universal view of their functionality, using a well-defined API
    - Service implementations can be changed easily; integrated simulations can be assembled using these services
    - Manage complexity while maintaining performance/scalability:
      complexity from the problem (complex physics);
      complexity from the codes and how they are ...;
      complexity of the underlying disruptive infrastructure;
      complexity from coordination across codes and research teams
  • Slide 13
  • Data staging 101
    - Why staging? Decouple file system performance variations and limitations from application run time
    - Enables optimizations based on a dynamic number of writers
    - High-bandwidth data extraction from the application with RDMA
    - Scalable data movement with shared resources requires us to manage the transfers
    - Asynchronous data movement for low impact on the application; managing interference (illustrated in the sketch below)
    - Use a small staging area for large data output
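
To make the "small staging area, asynchronous drain" idea concrete, here is a toy producer/drain sketch: the application copies output blocks into a small bounded buffer and continues, while a background thread performs the slow writes. Real staging (e.g., DataStager or the ADIOS staging methods) moves data over RDMA to separate staging nodes; everything here, including the buffer sizes and function names, is invented for illustration.

```c
/* Toy model of staging: a bounded in-memory buffer plus a background drain
 * thread decouples the application from file-system speed and variability. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define SLOTS 4              /* "small staging area" */
#define BLOCK (1 << 20)      /* 1 MiB blocks */

static char slots[SLOTS][BLOCK];
static int head = 0, tail = 0, count = 0, done = 0;
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

/* Called by the application: blocks only when the staging buffer is full. */
void stage_write(const void *data, size_t len)
{
    pthread_mutex_lock(&mu);
    while (count == SLOTS) pthread_cond_wait(&not_full, &mu);
    memcpy(slots[tail], data, len < BLOCK ? len : BLOCK);
    tail = (tail + 1) % SLOTS;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&mu);
}

/* Drain thread: the slow file-system writes happen off the critical path. */
static void *drain(void *arg)
{
    static char local[BLOCK];            /* copy out under the lock, write outside it */
    FILE *f = fopen("staged_output.bin", "wb");
    for (;;) {
        pthread_mutex_lock(&mu);
        while (count == 0 && !done) pthread_cond_wait(&not_empty, &mu);
        if (count == 0 && done) { pthread_mutex_unlock(&mu); break; }
        memcpy(local, slots[head], BLOCK);
        head = (head + 1) % SLOTS;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&mu);
        fwrite(local, 1, BLOCK, f);      /* asynchronous with respect to the app */
    }
    fclose(f);
    return arg;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, drain, NULL);

    char block[BLOCK];
    for (int step = 0; step < 16; step++) {
        memset(block, step, BLOCK);      /* stand-in for one timestep of output */
        stage_write(block, BLOCK);       /* returns as soon as a slot is free */
    }

    pthread_mutex_lock(&mu);
    done = 1;
    pthread_cond_broadcast(&not_empty);
    pthread_mutex_unlock(&mu);
    pthread_join(t, NULL);
    return 0;
}
```

The next slide's point about scheduling corresponds to deciding *when* the drain runs: draining continuously can interfere with the application's own communication, while a smarter schedule drains during quiet phases.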
  • Slide 14
  • Our discovery of data scheduling for asynchronous I/O
    - Naive scheduling ("continuous drain") can be slower than synchronous I/O
    - Smart scheduling can be scalable and much better than synchronous I/O
  • Slide 15
  • In situ processing
    - Data movement is a deterrent to effective use of the data
    - Output costs increase the runtime and force the user to reduce the output data
    - Input costs can account for up to 90% of the total running time of scientific workflows
    - Our goal is to create an infrastructure that makes it easy for you to forget the technology and focus on your science!
  • Slide 16
  • Staging Revolution, Phase 1: monolithic staging applications
    - Multiple pipelines are separate staging areas
    - Data movement is between each staging code
    - High level of fault tolerance
    [Diagram: Application -> Staging Process 1 -> Staging Process 2 -> Storage, within a staging area]
  • Slide 17
  • Shared space abstraction in the staging area
    - Semantically specialized virtual shared space abstraction in the staging area: a shared (distributed) data object with a simple put/get/query API (see the sketch below)
    - Supports application-to-application interactions and workflows, e.g., the ADIOS DataSpaces method (HPDC 2010)
    - Constructed on-the-fly on hybrid staging nodes; indexes data for quick access and retrieval
    - Provides asynchronous coordination and interaction support; complements existing interaction/coordination mechanisms
    - Code coupling using DataSpaces: maintain locality for in situ exchange; complex geometry-based queries; in-space (online) data transformation and manipulation
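
To show the shape of the put/get abstraction without reproducing the real DataSpaces API, here is a toy, single-process stand-in with invented names (space_put / space_get). In DataSpaces the space is distributed across staging nodes, indexed by variable name, version, and bounding box, and accessed over RDMA; this sketch only illustrates how a producer and a consumer would interact through such an abstraction.

```c
/* Toy stand-in for a shared-space put/get abstraction (names invented). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OBJS 64

struct obj { char name[32]; int version; size_t len; void *data; };
static struct obj space[MAX_OBJS];
static int nobjs = 0;

/* Publish a named, versioned data object into the space. */
int space_put(const char *name, int version, const void *data, size_t len)
{
    if (nobjs == MAX_OBJS) return -1;
    struct obj *o = &space[nobjs++];
    snprintf(o->name, sizeof o->name, "%s", name);
    o->version = version;
    o->len = len;
    o->data = malloc(len);
    memcpy(o->data, data, len);
    return 0;
}

/* Retrieve a named, versioned object; returns -1 if not yet published. */
int space_get(const char *name, int version, void *out, size_t len)
{
    for (int i = 0; i < nobjs; i++)
        if (space[i].version == version && strcmp(space[i].name, name) == 0) {
            memcpy(out, space[i].data, len < space[i].len ? len : space[i].len);
            return 0;
        }
    return -1;
}

int main(void)
{
    double field[4] = {1.0, 2.0, 3.0, 4.0}, copy[4];

    /* Coupling pattern: code A publishes a field at timestep 7 ... */
    space_put("electric_field", 7, field, sizeof field);

    /* ... and code B (or an analysis plugin) pulls the same object later. */
    if (space_get("electric_field", 7, copy, sizeof copy) == 0)
        printf("got electric_field v7, first value %.1f\n", copy[0]);
    return 0;
}
```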
  • Slide 18
  • Staging Revolution, Phase 2: introducing plugins
    - A disruptive step: staging codes are broken down into multiple plugins
    - The staging area is launched as a framework that launches plugins
    - Data movement occurs between all plugins
    - A scheduler decides resource allocations
    [Diagram: Application feeding plugins A through E inside the staging area]
  • Slide 19
  • Staging Revolution, Phase 3: managing the staging area
    - Resources are co-allocated to plugins
    - Plugins are executed consecutively; co-locating plugins avoids data movement
    - The scheduler attempts to minimize data movement
    [Diagram: Application feeding co-located plugins inside a managed staging area]
  • Slide 20
  • Hybrid staging: the next evolution of staging
    - Move plugins/services back to the application nodes, which have the largest computational capacity
    - Plugins are executed within the application node
    - Integrate with plugins in the remote staging area
    [Diagram: plugins A and B running on the application nodes, integrated with plugins C, D, and E in the managed staging area]
  • Slide 21
  • Third-party plugins in the staging area
    - Data processing plugins in the hybrid staging area: in situ data processing, analytics pipelines
    - Many issues: programming (data and control) models for plugins; deployment mechanisms; robustness, correctness, etc.
    - Multiple approaches: code, binary, scripts, etc.; several implementations in ADIOS (ActiveSpaces, SmartTap, etc.)
    - E.g., ADIOS ActiveSpaces: dynamically deploy custom application data transformations/filters on demand and execute them in the staging area (DataSpaces)
    - Provides the programming support to define custom data kernels that operate on data objects of interest
    - A runtime system dynamically deploys binary code to DataSpaces and executes it on the relevant data objects
    - Define a filter, implement it, and compile it with the application, e.g., min(), max(), sum(), avg(), etc. (see the sketch below)
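
The deployed "data kernels" named above (min, max, sum, avg) are small functions shipped to the staging area and run against resident data objects. The sketch below shows what such a kernel might look like; the signature and struct are invented for illustration, since the real ActiveSpaces kernel interface is not shown in the talk.

```c
/* Illustrative data kernel of the kind ActiveSpaces would deploy into the
 * staging area and run against a resident data object. The reduce_kernel
 * signature is invented for this sketch. */
#include <stdio.h>
#include <stddef.h>

struct reduce_out { double min, max, sum, avg; };

/* Runs where the data lives, so only a few doubles travel back to the client. */
int reduce_kernel(const double *data, size_t n, struct reduce_out *out)
{
    if (n == 0) return -1;
    out->min = out->max = data[0];
    out->sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] < out->min) out->min = data[i];
        if (data[i] > out->max) out->max = data[i];
        out->sum += data[i];
    }
    out->avg = out->sum / (double)n;
    return 0;
}

int main(void)
{
    double chunk[] = {4.0, -1.5, 7.25, 0.5};   /* stand-in for a staged data object */
    struct reduce_out r;
    if (reduce_kernel(chunk, 4, &r) == 0)
        printf("min %.2f max %.2f sum %.2f avg %.2f\n", r.min, r.max, r.sum, r.avg);
    return 0;
}
```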
  • Slide 22
  • Dividing up the pipeline (see the sketch below)
    - On compute nodes (~10^6 cores), minimizing cross-node and cross-timestep communication: local statistics, local features, binning of data, sorting within a bucket, compression
    - On staging nodes (~10^3 cores): global statistics, global features, ordered sorts, indexing of data, compression, spatial correlation, topology mapping
    - In post-processing (~10^3 disks), everything else, especially tasks that require human interaction: interactive visualization, temporal correlations, validation
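
A minimal MPI sketch of the split described above: every rank builds local statistics from its own data with no cross-node communication, and a single reduction stands in for forwarding those partial results to the much smaller pool of staging cores that forms the global view. This is an illustration of the division of labor, not the ADIOS implementation.

```c
/* Local statistics on compute cores, one reduction toward a "staging" root. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Stand-in for this rank's slice of simulation data. */
    double local[4] = {rank + 0.5, rank + 1.0, rank + 1.5, rank + 2.0};

    /* Local statistics: computed entirely on the compute node. */
    double lmin = local[0], lmax = local[0], lsum = 0.0;
    for (int i = 0; i < 4; i++) {
        if (local[i] < lmin) lmin = local[i];
        if (local[i] > lmax) lmax = local[i];
        lsum += local[i];
    }

    /* Global statistics: one reduction toward rank 0 (a stand-in for the
     * staging side), rather than all-to-all traffic between compute nodes. */
    double gmin, gmax, gsum;
    MPI_Reduce(&lmin, &gmin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Reduce(&lmax, &gmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&lsum, &gsum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global min %.2f max %.2f mean %.2f\n",
               gmin, gmax, gsum / (4.0 * size));

    MPI_Finalize();
    return 0;
}
```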
  • Slide 23
  • Data management (data reduction / in situ)
    - Objectives: develop lossy and lossless compression methodologies to reduce otherwise incompressible spatio-temporal double-precision scientific data; ensure high compression rates with ISABELA and ISOBAR while adhering to user-controlled accuracy bounds; provide fast and accurate approximate query processing directly over ISABELA-compressed scientific data
    - Impact: up to 85% reduction in storage on data from petascale simulation applications (S3D, GTS, XGC); guaranteed per-point error bounds, with overall 0.99 correlation and 0.01 NRMSE relative to the original data (metrics sketched below); a communication-free, minimal-overhead technique easily pluggable into any simulation application
    [Figure: ISABELA compression accuracy at 75% reduction in size, analysis of XGC1 temperature data]
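
The accuracy figures quoted above (per-point error bound, 0.99 correlation, 0.01 NRMSE) can be checked directly against any decoded array. The sketch below computes those three metrics for an original/reconstructed pair; it is generic verification code, not part of ISABELA, and the sample arrays are made up.

```c
/* Maximum per-point relative error, Pearson correlation, and range-normalized
 * RMSE between an original array and its lossy reconstruction. */
#include <math.h>
#include <stdio.h>

void accuracy(const double *orig, const double *recon, int n,
              double *max_rel_err, double *correlation, double *nrmse)
{
    double se = 0.0, so = 0.0, sr = 0.0, soo = 0.0, srr = 0.0, sor = 0.0;
    double omin = orig[0], omax = orig[0];
    *max_rel_err = 0.0;

    for (int i = 0; i < n; i++) {
        double d = orig[i] - recon[i];
        double rel = fabs(d) / (fabs(orig[i]) > 1e-12 ? fabs(orig[i]) : 1e-12);
        if (rel > *max_rel_err) *max_rel_err = rel;
        se += d * d;
        so += orig[i];  sr += recon[i];
        soo += orig[i] * orig[i];  srr += recon[i] * recon[i];
        sor += orig[i] * recon[i];
        if (orig[i] < omin) omin = orig[i];
        if (orig[i] > omax) omax = orig[i];
    }
    double cov = sor - so * sr / n;
    double vo  = soo - so * so / n;
    double vr  = srr - sr * sr / n;
    *correlation = cov / sqrt(vo * vr);
    *nrmse = sqrt(se / n) / (omax - omin);   /* RMSE normalized by the data range */
}

int main(void)
{
    double orig[]  = {1.00, 2.00, 3.00, 4.00, 5.00};
    double recon[] = {1.01, 1.98, 3.02, 3.99, 5.01};   /* pretend lossy decode */
    double err, corr, nrmse;
    accuracy(orig, recon, 5, &err, &corr, &nrmse);
    printf("max rel err %.4f  correlation %.4f  NRMSE %.4f\n", err, corr, nrmse);
    return 0;
}
```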
  • Slide 24
  • Data management (multi-resolution)
    - Store the data of each process group in multiple levels for contiguous arrays
    - Make real-time decisions about which levels to stream to the staging resources and which data to store
    - Can reduce overall storage by streaming/storing reduced representations of the data in real time when they are within an error bound (see the sketch below)
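
As a toy version of "stream or store a reduced representation when it is within an error bound": build coarser levels by keeping every 2^k-th sample and pick the coarsest level whose reconstruction error stays within the bound. The strided decimation, the signal, and the bound are all illustrative; this is not the actual ADIOS multi-level layout.

```c
/* Toy multi-resolution selection: level k keeps every (2^k)-th sample; the
 * coarsest level whose nearest-sample reconstruction error is within the
 * user's bound is the one that would be streamed or stored. */
#include <math.h>
#include <stdio.h>

#define N 64

/* Max absolute error if only every `stride`-th sample is kept and the rest
 * are reconstructed from the nearest kept sample to the left. */
static double level_error(const double *x, int n, int stride)
{
    double err = 0.0;
    for (int i = 0; i < n; i++) {
        double approx = x[(i / stride) * stride];
        double d = fabs(x[i] - approx);
        if (d > err) err = d;
    }
    return err;
}

int main(void)
{
    const double PI = 3.14159265358979323846;
    double x[N], bound = 0.05;
    for (int i = 0; i < N; i++)
        x[i] = sin(2.0 * PI * i / N);            /* stand-in for one array chunk */

    int best_stride = 1;
    for (int stride = 1; stride <= 8; stride *= 2) {
        double e = level_error(x, N, stride);
        printf("stride %d: %d samples, max error %.4f\n", stride, N / stride, e);
        if (e <= bound) best_stride = stride;    /* coarsest level within the bound */
    }
    printf("stream 1 of every %d samples (error bound %.2f)\n", best_stride, bound);
    return 0;
}
```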
  • Slide 25
  • Slide 26
  • Conclusions and future directions
    - ADIOS, developed by many institutions, is a proven DOE framework that blends together self-describing file formats; staging methods with visualization and analysis plugins; and advanced data management plugins for data reduction, multi-resolution data reduction, and advanced workflow scheduling
    - Allows multiplexing of output using a novel SOA approach
    - New in situ techniques are critical for application success on extreme-scale architectures
    - We still must use this to prepare data for further exploratory data analytics
    - Hybrid staging integration with cloud resources for next-generation HPC-cloud services
    - New customers and new technologies bring new challenges: AMR, image processing, experimental and sensor data
  • Slide 27
  • Community highlights
    Fusion
    - 2006: Battelle Annual Report, cover image. http://www.battelle.org/annualreports/ar2006/default.htm
    - 2/16/2007: Supercomputing Keys Fusion Research. http://www.hpcwire.com/hpcwire/2007-02-16/supercomputing_keys_fusion_research-1.html
    - 7/14/2008: Researchers Conduct Breakthrough Fusion Simulation. http://www.hpcwire.com/hpcwire/2008-07-14/researchers_conduct_breakthrough_fusion_simulation.html
    - 8/27/2009: Fusion Gets Faster. http://www.hpcwire.com/hpcwire/2009-07-27/fusion_gets_faster.html
    - 4/21/2011: Jaguar Supercomputer Harnesses Heat for Fusion Energy. http://www.hpcwire.com/hpcwire/2011-04-19/jaguar_supercomputer_harnesses_heat_for_fusion_energy.html
    Combustion
    - 9/29/2009: ADIOS Ignites Combustion Simulations. http://www.hpcwire.com/hpcwire/2009-10-29/adios_ignites_combustion_simulations.html
    Computer science
    - 2/12/2009: Breaking the Bottleneck in Computer Data Flow. http://ascr-discovery.science.doe.gov/bigiron/io3.shtml
    - 9/12/2010: Say Hello to the NEW ADIOS. http://www.hpcwire.com/hpcwire/2010-09-12/say_hello_to_the_new_adios.html
    - 2/28/2011: OLCF, Partners Release eSiMon Dashboard Simulation Tool. http://www.hpcwire.com/hpcwire/2011-02-28/olcf_partners_release_esimon_dashboard_simulation_tool.html
    - 7/21/2011: ORNL ADIOS Team Releases Version 1.3 of Adaptive Input/Output System. http://www.hpcwire.com/hpcwire/2011-07-21/ornl_adios_team_releases_version_1.3_of_adaptive_input_output_system.html
  • Slide 28
  • 2008 I/O Pipeline Publications
    SOA
    1. S. Klasky, et al., "In Situ Data Processing for Extreme-Scale Computing," to appear, SciDAC 2011.
    2. C. Docan, J. Cummings, S. Klasky, M. Parashar, "Moving the Code to the Data: Dynamic Code Deployment Using ActiveSpaces," IPDPS 2011.
    3. F. Zhang, C. Docan, M. Parashar, S. Klasky, "Enabling Multi-Physics Coupled Simulations within the PGAS Programming Framework," CCGrid 2011.
    4. C. Docan, S. Klasky, M. Parashar, "DataSpaces: An Interaction and Coordination Framework for Coupled Simulation Workflows," HPDC 2010, ACM, Chicago, IL.
    5. J. Cummings, S. Klasky, et al., "EFFIS: An End-to-End Framework for Fusion Integrated Simulation," PDP 2010, http://www.pdp2010.org/.
    6. C. Docan, J. Cummings, S. Klasky, M. Parashar, N. Podhorszki, F. Zhang, "Experiments with Memory-to-Memory Coupling for End-to-End Fusion Simulation Workflows," CCGrid 2010, IEEE Computer Society Press, 2010.
    7. N. Podhorszki, S. Klasky, et al., "Plasma Fusion Code Coupling Using Scalable I/O Services and Scientific Workflows," SC-WORKS 2009.
    8. C. Docan, M. Parashar, S. Klasky, "DataSpaces: An Interaction and Coordination Framework for Coupled Simulation Workflows," Journal of Cluster Computing, 2011.
    Data Formats
    8. Y. Tian, S. Klasky, H. Abbasi, J. Lofstead, R. Grout, N. Podhorszki, Q. Liu, Y. Wang, W. Yu, "EDO: Improving Read Performance for Scientific Applications Through Elastic Data Organization," to appear, Cluster 2011.
    9. Y. Tian, "SRC: Enabling Petascale Data Analysis for Scientific Applications Through Data Reorganization," ICS 2011, First Place, Student Research Competition.
    10. M. Polte, J. Lofstead, J. Bent, G. Gibson, S. Klasky, "...And Eat It Too: High Read Performance in Write-Optimized HPC I/O Middleware File Formats," in Proceedings of the Petascale Data Storage Workshop 2009 at Supercomputing 2009.
    11. S. Klasky, et al., "Adaptive IO System (ADIOS)," Cray User Group Meeting 2008.
  • Slide 29
  • 2008 I/O Pipeline Publications
    Data Staging
    12. H. Abbasi, G. Eisenhauer, S. Klasky, K. Schwan, M. Wolf, "Just In Time: Adding Value to the IO Pipelines of High Performance Applications with JITStaging," HPDC 2011.
    13. H. Abbasi, M. Wolf, G. Eisenhauer, S. Klasky, K. Schwan, F. Zheng, "DataStager: Scalable Data Staging Services for Petascale Applications," Cluster Computing, Springer, ISSN 1386-7857, Vol. 13, Issue 3, pp. 277-290, 2010.
    14. C. Docan, M. Parashar, S. Klasky, "Enabling High-Speed Asynchronous Data Extraction and Transfer Using DART," Concurrency and Computation: Practice and Experience 22(9): 1181-1204, 2010.
    15. H. Abbasi, M. Wolf, G. Eisenhauer, S. Klasky, K. Schwan, F. Zheng, "DataStager: Scalable Data Staging Services for Petascale Applications," in Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing (HPDC '09), Garching, Germany, June 11-13, 2009, ACM, New York, NY, pp. 39-48.
    16. H. Abbasi, J. Lofstead, F. Zheng, S. Klasky, K. Schwan, M. Wolf, "Extending I/O Through High Performance Data Services," Cluster Computing 2009, New Orleans, LA, August 2009.
    17. C. Docan, M. Parashar, S. Klasky, "Enabling High Speed Asynchronous Data Extraction and Transfer Using DART," Proceedings of the 17th International Symposium on High-Performance Distributed Computing (HPDC), Boston, MA, USA, IEEE Computer Society Press, June 2008.
    18. H. Abbasi, M. Wolf, K. Schwan, S. Klasky, "Managed Streams: Scalable I/O," HPDC 2008.
  • Slide 30
  • 2008 I/O Pipeline Publications
    Usability of Optimizations
    19. J. Lofstead, F. Zheng, Q. Liu, S. Klasky, R. Oldfield, T. Kordenbrock, K. Schwan, M. Wolf, "Managing Variability in the IO Performance of Petascale Storage Systems," in Proceedings of SC10, New Orleans, LA, November 2010.
    20. J. Lofstead, M. Polte, G. Gibson, S. Klasky, K. Schwan, R. Oldfield, M. Wolf, "Six Degrees of Scientific Data: Reading Patterns for Extreme Scale Science IO," HPDC 2011.
    21. Y. Xiao, I. Holod, W. L. Zhang, S. Klasky, Z. H. Lin, "Fluctuation Characteristics and Transport Properties of Collisionless Trapped Electron Mode Turbulence," Physics of Plasmas, 17, 2010.
    22. J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Adaptable, Metadata Rich IO Methods for Portable High Performance IO," IPDPS 2009, IEEE Computer Society Press, 2009.
    23. J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Input/Output APIs and Data Organization for High Performance Scientific Computing," PDSW at SC2008.
    24. J. Lofstead, S. Klasky, K. Schwan, N. Podhorszki, C. Jin, "Flexible IO and Integration for Scientific Codes Through the Adaptable IO System," Challenges of Large Applications in Distributed Environments (CLADE), June 2008.
    25. Y. Tian, S. Klasky, B. Wang, H. Abbasi, N. Podhorszki, R. Grout, M. Wolf, "Chimera: Efficient Multiresolution Data Representation Through Chimeric Organization," submitted to IPDPS 2012.
  • Slide 31
  • 2008 I/O Pipeline Publications
    Data Management
    26. J. Kim, H. Abbasi, C. Docan, S. Klasky, Q. Liu, N. Podhorszki, A. Shoshani, K. Wu, "Parallel In Situ Indexing for Data-Intensive Computing," accepted, LDAV 2011.
    27. T. Critchlow, et al., "Working with Workflows: Highlights from 5 Years Building Scientific Workflows," SciDAC 2011.
    28. S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham, R. Ross, N. F. Samatova, "Compressing the Incompressible with ISABELA: In Situ Reduction of Spatio-Temporal Data," Euro-Par 2011.
    29. S. Lakshminarasimhan, J. Jenkins, I. Arkatkar, Z. Gong, H. Kolla, S. H. Ku, S. Ethier, J. Chen, C. S. Chang, S. Klasky, R. Latham, R. Ross, N. F. Samatova, "ISABELA-QA: Query-Driven Analytics with ISABELA-Compressed Extreme-Scale Scientific Data," to appear, SC 2011.
    30. K. Wu, R. Sinha, C. Jones, S. Ethier, S. Klasky, K. L. Ma, A. Shoshani, M. Winslett, "Finding Regions of Interest on Toroidal Meshes," to appear in Journal of Computational Science and Discovery, 2011.
    31. J. Logan, S. Klasky, H. Abbasi, et al., "Skel: Generative Software for Producing I/O Skeletal Applications," submitted to the Workshop on D3Science, 2011.
    32. E. Schendel, Y. Jin, N. Shah, J. Chen, C. S. Chang, S.-H. Ku, S. Ethier, S. Klasky, R. Latham, R. Ross, N. Samatova, "ISOBAR Preconditioner for Effective and High-Throughput Lossless Data Compression," submitted to ICDE 2012.
    33. I. Arkatkar, J. Jenkins, S. Lakshminarasimhan, N. Shah, E. Schendel, S. Ethier, C. S. Chang, J. Chen, H. Kolla, S. Klasky, R. Ross, N. Samatova, "ALACRITY: Analytics-Driven Lossless Data Compression for Rapid In-Situ Indexing, Storing, and Querying," submitted to ICDE 2012.
    34. Z. Gong, S. Lakshminarasimhan, J. Jenkins, H. Kolla, S. Ethier, J. Chen, R. Ross, S. Klasky, N. Samatova, "Multi-Level Layout Optimizations for Efficient Spatio-Temporal Queries of ISABELA-Compressed Data," submitted to ICDE 2012.
    35. Y. Jin, S. Lakshminarasimhan, N. Shah, Z. Gong, C. Chang, J. Chen, S. Ethier, H. Kolla, S. Ku, S. Klasky, R. Latham, R. Ross, K. Schuchardt, N. Samatova, "S-Preconditioner for Multi-Fold Data Reduction with Guaranteed User-Controlled Accuracy," ICDM 2011.
  • Slide 32
  • 2008 I/O Pipeline Publications
    Simulation Monitoring
    36. R. Tchoua, S. Klasky, N. Podhorszki, B. Grimm, A. Khan, E. Santos, C. Silva, P. Mouallem, M. Vouk, "Collaborative Monitoring and Analysis for Simulation Scientists," in Proceedings of CTS 2010.
    37. P. Mouallem, R. Barreto, S. Klasky, N. Podhorszki, M. Vouk, "Tracking Files in the Kepler Provenance Framework," in Proceedings of the 21st International Conference on Scientific and Statistical Database Management (SSDBM 2009), M. Winslett (Ed.), Springer-Verlag, pp. 273-282, 2009.
    38. E. Santos, J. Tierny, A. Khan, B. Grimm, L. Lins, J. Freire, V. Pascucci, C. Silva, S. Klasky, R. Barreto, N. Podhorszki, "Enabling Advanced Visualization Tools in a Web-Based Simulation Monitoring System," eScience 2009.
    39. R. Barreto, S. Klasky, N. Podhorszki, P. Mouallem, M. Vouk, "Collaboration Portal for Petascale Simulations," International Symposium on Collaborative Technologies and Systems, pp. 384-393, Baltimore, Maryland, May 2009.
    40. R. Barreto, S. Klasky, C. Jin, N. Podhorszki, "Framework for Integrating End to End SDM Technologies and Applications (FIESTA)," CHI '09 workshop, The Changing Face of Digital Science, 2009.
    41. S. Klasky, et al., "Collaborative Visualization Spaces for Petascale Simulations," to appear in the 2008 International Symposium on Collaborative Technologies and Systems (CTS 2008).
    42. R. Tchoua, S. Klasky, et al., "Schema for Collaborative Monitoring and Visualization of Scientific Data," PDAC 2011.
  • Slide 33
  • 2008 I/O Pipeline Publications
    In Situ Processing
    43. A. Shoshani, et al., "The Scientific Data Management Center: Available Technologies and Highlights," SciDAC 2011.
    44. A. Shoshani, S. Klasky, R. Ross, "Scientific Data Management: Challenges and Approaches in the Extreme Scale Era," SciDAC 2010.
    45. F. Zheng, H. Abbasi, C. Docan, J. Lofstead, Q. Liu, S. Klasky, M. Parashar, N. Podhorszki, K. Schwan, M. Wolf, "PreDatA: Preparatory Data Analytics on Peta-Scale Machines," IPDPS 2010, IEEE Computer Society Press, 2010.
    46. S. Klasky, et al., "High Throughput Data Movement," in Scientific Data Management: Challenges, Existing Technologies, and Deployment, A. Shoshani and D. Rotem (Eds.), Chapman and Hall, 2009.
    47. G. Eisenhauer, M. Wolf, H. Abbasi, S. Klasky, K. Schwan, "A Type System for High Performance Communication and Computation," submitted to the Workshop on D3Science, 2011.
    48. F. Zhang, C. Docan, M. Parashar, S. Klasky, N. Podhorszki, H. Abbasi, "Enabling In-Situ Execution of Coupled Scientific Workflow on Multi-Core Platform," submitted to IPDPS 2012.
    49. F. Zheng, H. Abbasi, J. Cao, J. Dayal, K. Schwan, M. Wolf, S. Klasky, N. Podhorszki, "In-Situ I/O Processing: A Case for Location Flexibility," PDSW 2011.
    50. F. Zheng, J. Cao, J. Dayal, G. Eisenhauer, K. Schwan, M. Wolf, H. Abbasi, S. Klasky, N. Podhorszki, "High End Scientific Codes with Computational I/O Pipelines: Improving Their End-to-End Performance," PDAC 2011.
    51. K. Moreland, R. Oldfield, P. Marion, S. Jourdain, N. Podhorszki, C. Docan, M. Parashar, M. Hereld, M. Papka, S. Klasky, "Examples of In Transit Visualization," PDAC 2011.
    Misc.
    52. C. S. Chang, S. Ku, et al., "Whole-Volume Integrated Gyrokinetic Simulation of Plasma Turbulence in Realistic Diverted-Tokamak Geometry," SciDAC 2009.
    53. C. S. Chang, S. Klasky, et al., "Towards a First-Principles Integrated Simulation of Tokamak Edge Plasmas," SciDAC 2008.
    54. Z. Lin, Y. Xiao, W. Deng, I. Holod, C. Kamath, S. Klasky, Z. Wang, H. Zhang, "Size Scaling and Nondiffusive Features of Electron Heat Transport in Multi-Scale Turbulence," IAEA 2010.
    55. Z. Lin, Y. Xiao, I. Holod, W. Zhang, W. Deng, S. Klasky, J. Lofstead, C. Kamath, N. Wichmann, "Advanced Simulation of Electron Heat Transport in Fusion Plasmas," SciDAC 2009.