Martin Kwok (FNAL)
CCE-PPS meeting
25 Jun, 2021
CCE-PPS Metric overview
06/18/21 Martin Kwok | CCE-PPS Metric overview
Overview & Organization
• Goal of the metric discussion:
  - One of the main deliverables of CCE-PPS
  - Ask the same set of questions to all portability layers to compare/evaluate each technology
• Metric template: link
• Some questions are more detailed than others
  - Be as objective as we can on the subjective topic
• Currently we have 2 documents:
  - SYCL: based on the experience from FastCaloSim, ACTS, (cuRAND?)
  - Kokkos: based on the experience from Patatrack, FastCaloSim, Wirecell
• How do we integrate different projects' input into the metric document?
  - Propose to address comments in future metric discussions
• Last metric discussion:
  - Apr 16: minutes (not much about what we discussed)
Metric progress
• Overview of metric contributions by project
  - Will check the diff every time we have a metric discussion
  - Kindly let me know when you have written something to the document
• To be expanded to more columns
  - What’s the next document to target?
  - Candidates include: HIP, Alpaka …
Projects         SYCL    Kokkos
1 FastCaloSim
2 Patatrack
3 Wirecell
4 ACTS
5 P2R
Legend: Not yet started / In progress / Completed (at least 1 round)
Metric progress
Topics                                           Metric formulation    SYCL    Kokkos
1 Ease of Learning Language
2 Code conversion
3 Extent of modification to existing code
4 Extent of modifications to EDM / Data
5 Extent of modifications to build rules
6 Hardware Mapping
7 Feature Availability
8 Address needs of all workflows
9 Long term sustainability and code stability
10 Compilation time
11 Run time
12 Ease of debugging
13 Aesthetics
14 Flexibility
Legend: Needs input / Some input there / Good amount of content
Metric template
Extent of modifications to EDM / Data
• No sub-bullet points
• How are data handled across different memory spaces?
• Support for custom data types? Any limitations?
Feature Availability
• Reductions, kernel chaining, callbacks, etc.
• Concurrent kernels ➜ very important
• Need a complete list
• Can we expand the list now?
Address needs of all workflows
• Scaling with number of kernels per application (LHC has many, Neutrino has few)
• Scaling with number of collaborators
  - This question is best answered by summarizing the experiences of different projects
  - Each project covers some phase space
• “Are you memory bound?” “Are you CPU bound?”
  - Very application dependent
• To be extended …
  - Can we expand the list now?
Metric template
Run time
• Running same use case example on CPU w/ new design on accelerator on comparable resources
• Does it degrade performance of CPU code (or use significantly more memory)?
• Will need much more detail to answer performance comparisons
• Include references (slides/papers) to detailed performance studies
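The question of whether a portable version degrades CPU performance or uses significantly more memory could be made concrete with a relative-change check like the minimal sketch below. The function names and the 5% tolerance are assumptions made for illustration; the slides do not define a threshold. It applies to quantities where larger is worse (run time, memory), not to throughput.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Fractional change of candidate relative to baseline (0.10 means +10%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline


def degrades(baseline: float, candidate: float, tolerance: float = 0.05) -> bool:
    """True if candidate exceeds baseline by more than the tolerance.

    Intended for 'larger is worse' quantities such as run time or memory use.
    The default tolerance is a hypothetical choice, not taken from the slides.
    """
    return relative_change(baseline, candidate) > tolerance
```

For example, with made-up illustrative numbers, `degrades(100.0, 110.0)` flags a 10% increase, while `degrades(100.0, 102.0)` stays within the assumed tolerance.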
[Figure panels: SYCL, Kokkos]
• Summary table for each project? Or each technology?
Example throughput/runtime table: Patatrack
Backends            SYCL    Kokkos    Native
1 CPU (Serial)
2 CPU (Parallel)
3 Nvidia
4 Intel
5 AMD
Put necessary remarks in the caption or the following paragraph
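A summary table with this layout could be generated from collected measurements so that every project reports in the same format. The sketch below is one possible way to do it; `render_table`, its parameters, and the `-` placeholder for missing entries are assumptions for illustration, and no measurements are filled in.

```python
from typing import Dict, List, Optional, Tuple


def render_table(
    rows: List[str],
    columns: List[str],
    results: Dict[Tuple[str, str], Optional[float]],
    missing: str = "-",
) -> str:
    """Render a plain-text backend-by-column table.

    `results` maps (row, column) to a throughput or run time value;
    absent or None entries are shown with the `missing` placeholder.
    """
    header = ["Backends"] + list(columns)
    body = []
    for row in rows:
        cells = [row]
        for col in columns:
            value = results.get((row, col))
            cells.append(missing if value is None else f"{value:.1f}")
        body.append(cells)
    all_lines = [header] + body
    # Pad each column to the width of its longest cell.
    widths = [max(len(line[i]) for line in all_lines) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(line, widths)).rstrip()
        for line in all_lines
    )


BACKENDS = ["CPU (Serial)", "CPU (Parallel)", "Nvidia", "Intel", "AMD"]
TECHNOLOGIES = ["SYCL", "Kokkos", "Native"]

# No measurements are supplied here, so every cell stays a placeholder.
table = render_table(BACKENDS, TECHNOLOGIES, results={})
print(table)
```

Swapping the column list for project names (FastCaloSim, ACTS, Patatrack) gives the per-technology variant of the same table.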
Example throughput/runtime table: SYCL
Backends            FastCaloSim    ACTS    Patatrack
1 CPU (Serial)
2 CPU (Parallel)
3 Nvidia
4 Intel
5 AMD
Put necessary remarks in the caption or the following paragraph
Open discussion
• Kokkos
  - Discuss comments on Google Doc
• SYCL
  - Any topics?