Leveraging Distributed Research Cloud Infrastructures for Domain Science Research and Experimentation
Anirban Mandal, Cong Wang, Paul Ruth, Komal Thareja (RENCI, UNC – Chapel Hill)
Ewa Deelman, George Papadimitriou (USC/ISI)
Michael Zink, Eric Lyons (UMass Amherst)
Ivan Rodero, J. J. Villalobos (Rutgers University)
Funded by the National Science Foundation
Grant #1826997
Drew Angerer, Getty Images
Open Cloud Workshop, Boston University, March 2020
Distributed Research Clouds for Domain Sciences
• Develop novel algorithms and mechanisms to offer optimized data-flows across different kinds of national CI.
• Dynamic multi-cloud resource provisioning - ExoGENI, Chameleon, XSEDE JetStream, MOC.
• Network-centric platform to bridge the abstraction gap.
• Data-aware scheduling in Workflow Management Systems (Pegasus).
• Deploy solutions for use in observational science communities - CASA and OOI.
The major challenge is integrating data CI into science workflows: data movement across diverse infrastructure, complex workflows, distributed repositories, and elastic application requirements for compute, storage, and network.
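As a rough illustration of what data-aware scheduling means in this setting, the sketch below picks the site where a job's input transfer time plus compute time is smallest. It is not Pegasus's scheduling algorithm; the `Site` fields, the `relative_speed` factor, and both function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one cloud site (not a Pegasus or Mobius type)."""
    name: str
    free_cores: int
    bandwidth_gbps: float   # usable bandwidth from the data repository to this site
    relative_speed: float   # per-core speed relative to a reference core


def estimated_finish_s(site, input_gb, core_seconds, cores_needed):
    """Estimate input transfer time plus compute time for one job at `site`."""
    transfer_s = input_gb * 8 / site.bandwidth_gbps            # GB -> Gbit
    compute_s = core_seconds / (cores_needed * site.relative_speed)
    return transfer_s + compute_s


def pick_site(sites, input_gb, core_seconds, cores_needed):
    """Return the feasible site with the smallest estimated finish time."""
    feasible = [s for s in sites if s.free_cores >= cores_needed]
    if not feasible:
        raise ValueError("no site has enough free cores")
    return min(
        feasible,
        key=lambda s: estimated_finish_s(s, input_gb, core_seconds, cores_needed),
    )


# Illustrative sites: one well connected, one faster but behind a slow link.
SITES = [
    Site("near-data", free_cores=8, bandwidth_gbps=10.0, relative_speed=1.0),
    Site("fast-cpu", free_cores=8, bandwidth_gbps=1.0, relative_speed=2.0),
]
```

With these made-up numbers, a job with a large input favors the well-connected site, while a job with a small input favors the faster one. That trade-off is the point of making the scheduler aware of data location and network capacity.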
Multi-cloud Provisioning and Network Orchestration
• Resource requirements from applications are generated using a Gantt chart abstraction
• Mobius network-centric platform
  • Multi-cloud provisioning of compute and storage resources
  • Layer 2 network provisioning
  • Resource monitoring and control
  • https://github.com/RENCI-NRIG/Mobius
  • Exposes a REST API for Applications and Workflow Management Systems
• Management, Optimization & Prioritization of Data Flows with virtualized SDX
CASA OOI
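One way to picture the Gantt chart abstraction for resource requirements is as a set of bars, each saying how many cores an application needs at a site during a time interval. The sketch below is an assumption made for illustration: the slides do not give Mobius's actual request schema, and `ResourceBar` and `peak_cores_per_site` are hypothetical names. It computes the peak concurrent demand per site, which is the amount a provisioner would need to have available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceBar:
    """One Gantt bar: `cores` cores at `site` during [start_s, end_s). Hypothetical schema."""
    site: str
    start_s: int
    end_s: int
    cores: int


def peak_cores_per_site(bars):
    """Sweep over bar start/end events to find each site's peak concurrent cores."""
    events = {}
    for bar in bars:
        if bar.end_s <= bar.start_s:
            raise ValueError(f"empty or inverted interval: {bar}")
        site_events = events.setdefault(bar.site, [])
        site_events.append((bar.start_s, bar.cores))    # cores acquired
        site_events.append((bar.end_s, -bar.cores))     # cores released
    peaks = {}
    for site, site_events in events.items():
        current = peak = 0
        # Tuple sorting places releases (negative deltas) before acquisitions
        # at the same timestamp, so back-to-back bars do not count as overlap.
        for _, delta in sorted(site_events):
            current += delta
            peak = max(peak, current)
        peaks[site] = peak
    return peaks


# Illustrative request: three bars on one site, one bar on another.
REQUEST = [
    ResourceBar("exogeni", 0, 100, 4),
    ResourceBar("exogeni", 50, 150, 8),
    ResourceBar("exogeni", 150, 200, 10),
    ResourceBar("chameleon", 0, 10, 24),
]
```

Here the first two bars overlap between 50 s and 100 s, so the peak for that site is their sum, while the third bar starts exactly when the second ends and does not add to it.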
CASA: Collaborative Adaptive Sensing of the Atmosphere
• Traditional Next Generation Weather Radars (NEXRAD)
  • High power, long range
  • Limited ability to observe the lower part of the atmosphere because of the earth's curvature
• CASA
  • Network of short range Doppler radars deployed in DFW area
  • Adjustable sensing modes in response to quick weather changes
  • Suitable for near-ground weather events: tornado, hail, high winds
CASA Workflows
>7M people, >100K businesses, >1,500 corporate HQs
SC’19 SCinet Tech Challenge
Distributed Research Cloud Deployment with Mobius for CASA Applications
CASA Operational Workflows on Chameleon and ExoGENI with Layer 2 Provisioned Data Flows
• Distributed testbeds
  • Heterogeneous: ExoGENI and Chameleon testbeds
• ExoGENI
  • 11 workers (VMs)
  • 4 cores and 12 GB RAM
  • NFS storage
• Chameleon
  • 4 workers (bare metal)
  • 24 cores and 192 GB RAM
• Radar data repository
  • UNT via L2 stitching port
• Connected via 10 Gbps network.
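The figures above can be turned into a quick capacity estimate. The sketch assumes the core and RAM numbers are per worker, which the slide does not state explicitly, and it treats the 10 Gbps link as fully usable when estimating transfer time. `WorkerPool`, `totals`, and `transfer_time_s` are names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerPool:
    testbed: str
    workers: int
    cores_per_worker: int       # assumed to be a per-worker figure
    ram_gb_per_worker: int      # assumed to be a per-worker figure


POOLS = [
    WorkerPool("ExoGENI", workers=11, cores_per_worker=4, ram_gb_per_worker=12),
    WorkerPool("Chameleon", workers=4, cores_per_worker=24, ram_gb_per_worker=192),
]


def totals(pools):
    """Return (total cores, total RAM in GB) across all worker pools."""
    cores = sum(p.workers * p.cores_per_worker for p in pools)
    ram_gb = sum(p.workers * p.ram_gb_per_worker for p in pools)
    return cores, ram_gb


def transfer_time_s(size_gb, link_gbps=10.0):
    """Ideal time to move `size_gb` gigabytes over a link, ignoring protocol overhead."""
    return size_gb * 8 / link_gbps


if __name__ == "__main__":
    cores, ram_gb = totals(POOLS)
    print(f"{cores} cores, {ram_gb} GB RAM in total")
    print(f"10 GB of radar data: {transfer_time_s(10):.1f} s at line rate")
```

Under the per-worker assumption, the ExoGENI pool contributes 44 cores and 132 GB of RAM and the Chameleon pool 96 cores and 768 GB, for 140 cores and 900 GB overall.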
CASA Hail Workflow
Multi-workflow display of hail (orange) and wind (red) contours, with GIS boundaries and infrastructure overlaid during a severe weather event.
Pegasus workflow for Hail
CASA radar data
Current Research Thrusts
• Optimization & prioritization of science data flows with virtualized SDX.
• Active monitoring and control for maintaining QoS.
• Supporting wider federation of clouds, including EC2, CloudLab, and OCT.
• Data-aware workflow scheduling.
• Deploying novel CASA workflows (e.g. drone path planning).
• Support for streaming data and on-demand workflows from the Ocean Observatories Initiative (OOI) NSF Large Facility.
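To illustrate what prioritizing science data flows could look like, the sketch below applies a strict-priority policy on a shared link: the most important flows receive their full demand first and lower-priority flows share what remains. This is an illustrative policy only, not the mechanism used by the virtualized SDX; the function name, the tuple layout, and the flow names are all hypothetical.

```python
def allocate_bandwidth(flows, capacity_gbps):
    """Strict-priority allocation of link capacity.

    `flows` is a list of (name, priority, demand_gbps) tuples, where a lower
    priority number means a more important flow. Returns {name: granted_gbps}.
    """
    remaining = capacity_gbps
    allocation = {}
    for name, _priority, demand_gbps in sorted(flows, key=lambda f: f[1]):
        granted = min(demand_gbps, remaining)   # never exceed what is left
        allocation[name] = granted
        remaining -= granted
    return allocation


# Illustrative flows competing for one 10 Gbps link.
FLOWS = [
    ("bulk-archive", 2, 8.0),
    ("hail-nowcast", 0, 4.0),
    ("wind-products", 1, 3.0),
]

if __name__ == "__main__":
    for name, gbps in allocate_bandwidth(FLOWS, 10.0).items():
        print(f"{name}: {gbps} Gbps")
```

Strict priority is the simplest policy to reason about but can starve low-priority flows. A weighted-share policy would be a natural alternative when every flow must make progress.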
Thank you!
Funded by the National Science Foundation
Grant #1826997