Performance Analysis of Machine Learning Algorithms for Phylanx: An Asynchronous Array Processing Toolkit
Weile Wei, Rod Tohid, Bibek Wagle, Shahrzad Shirzad, Parsa Amini, Bita Hasheminezhad, Katy Williams, Adrian Serio, Hartmut Kaiser
Louisiana State University
Abstract
Phylanx is an asynchronous array processing toolkit that transforms Python and NumPy operations into code that can be executed in parallel on HPC resources. It does so by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. In this poster, we present early results that compare our implementations of widely used machine learning algorithms to accepted NumPy standards.
Figure 1: Phylanx program flow. [Diagram labels: Python, decorated code, frontend, AST, optimizer, transformation rules, compiler, execution tree, executor, HPX, tasks, result, output.]
Phylanx program flow: The Phylanx frontend generates an AST (PhySL) of the decorated Python code. The AST can be passed directly to the compiler to generate the execution tree or, optionally, fed to the optimizer first and then to the compiler. Once the kernel is invoked, Phylanx triggers the evaluation of the execution tree on HPX. After the evaluation finishes, the result is returned to Python.
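The flow described above (source to AST, optional optimization, compilation, execution) can be illustrated with a minimal sketch that uses only Python's standard `ast` module. This is not Phylanx code: the names `frontend`, `optimizer`, `compile_kernel`, and `FoldConstants` are hypothetical, the tree is a plain Python AST rather than PhySL, and execution happens in the Python interpreter rather than on HPX.

```python
import ast

# Stand-in for a decorated Python function. The source is kept as a string so
# the sketch does not depend on how the surrounding file is loaded.
KERNEL_SOURCE = """
def kernel(x, y):
    return (2 * 3) * x + y
"""


class FoldConstants(ast.NodeTransformer):
    """Hypothetical optimizer rule: fold constant * constant into one constant."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite children first
        if (isinstance(node.op, ast.Mult)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            folded = ast.Constant(value=node.left.value * node.right.value)
            return ast.copy_location(folded, node)
        return node


def frontend(source):
    """Frontend step: turn source code into an AST."""
    return ast.parse(source)


def optimizer(tree):
    """Optional step: apply transformation rules to the AST."""
    return ast.fix_missing_locations(FoldConstants().visit(tree))


def compile_kernel(tree, name):
    """Compiler step: turn the AST into a callable."""
    namespace = {}
    exec(compile(tree, filename="<kernel>", mode="exec"), namespace)
    return namespace[name]


tree = optimizer(frontend(KERNEL_SOURCE))
kernel = compile_kernel(tree, "kernel")
result = kernel(1, 2)  # executor step: evaluate and return the result to Python
print(result)
```

Skipping the `optimizer` call and passing the parsed tree straight to `compile_kernel` gives the same result, which mirrors the optional nature of the optimizer stage.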
Phylanx and its External Libraries
Figure 2: Phylanx toolkit and its interactions with external libraries. [Diagram labels: visualization tools; Phylanx frontend, optimizer, and backend (compiler, executor, performance counters); pybind11; Python; NumPy; Blaze; HPX.]
- Phylanx's data structures rely on the high-performance open-source C++ library Blaze, which supports HPX as a parallelization back-end and maps its data directly to Python data structures.
- To avoid data copies between Python and C++, we take advantage of the Python buffer protocol through the pybind11 library.
- Each Python list is mapped to a C++ vector, and 1-D and 2-D NumPy arrays are mapped to a Blaze vector and a Blaze matrix, respectively.
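The zero-copy idea behind the buffer protocol can be seen from the Python side alone. The sketch below does not involve pybind11 or Blaze; it only shows, under that simplification, that a consumer of the buffer protocol (here `memoryview` and a second NumPy array) sees the same memory as the original array, including its shape and strides.

```python
import numpy as np

# A 2-D NumPy array, the kind of object that would map to a matrix on the C++ side.
a = np.arange(6, dtype=np.float64).reshape(2, 3)

# memoryview requests the array's buffer: a pointer plus shape, strides, and
# element format, with no copy of the data.
view = memoryview(a)
print(view.shape, view.strides, view.format)

# A consumer can wrap the exported buffer in its own array type.
b = np.asarray(view)

# Writing through the consumer is visible in the original array,
# which shows that both objects refer to the same memory.
b[0, 0] = 42.0
print(a[0, 0], np.shares_memory(a, b))
```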
Visualization of AST using Traveler Tool
Figure 3: Visualization of the AST using the Traveler tool.
References
[1] R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, et al. Asynchronous execution of Python code on task based runtime systems. arXiv preprint arXiv:1810.07591, 2018.
Acknowledgments
- This material is based upon work supported by the National Science Foundation under Grant No. 1737785. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
- This work is supported by the Defense Technical Information Center under the contract: DTIC Contract FA8075-14-D-0002/0007.
Performance Results
Figure 4: Comparing the reference implementation of the Logistic Regression algorithm in
NumPy with the corresponding PhySL code. The experiment was performed on a node consisting
of two Intel(R) Xeon(R) E5-2660 v3 CPUs clocked at 2.6 GHz, with 10 cores (20 threads) each,
providing a total of 20 cores and 128 GB of DDR4 memory.
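For orientation, a NumPy logistic regression of the kind used as a baseline might look like the sketch below. It is an assumption-laden illustration, not the benchmark code from the poster: the function name `logistic_regression`, the learning rate, the iteration count, and the synthetic data are all chosen here for the example.

```python
import numpy as np


def logistic_regression(x, y, iterations=500, alpha=0.1):
    """Batch gradient descent on the logistic loss (no bias term)."""
    weights = np.zeros(x.shape[1])
    for _ in range(iterations):
        pred = 1.0 / (1.0 + np.exp(-(x @ weights)))  # sigmoid of the linear model
        gradient = x.T @ (pred - y) / len(y)         # mean gradient of the loss
        weights -= alpha * gradient
    return weights


# Synthetic, linearly separable data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

w = logistic_regression(x, y)
accuracy = np.mean(((x @ w) > 0) == (y == 1.0))
print(w, accuracy)
```

The loop body consists entirely of array operations (matrix-vector products, an element-wise exponential, and a subtraction), which is the kind of code the abstract describes Phylanx mapping into a dependency tree.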
Figure 5: Comparing the reference implementation of the Alternating Least Squares algorithm in NumPy with
the corresponding PhySL code. All experiments listed below were performed on nodes consisting
of two Intel Xeon E5-2450 CPUs clocked at 2.10 GHz, providing a total of 16 cores and 48 GB of
1333 MHz DDR3 memory.
Figure 6: Comparing the reference implementation of the K-Means algorithm in NumPy with the
corresponding PhySL code.
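As with logistic regression, a NumPy K-Means baseline can be sketched in a few lines. The version below is plain Lloyd's algorithm written for illustration; the function name `kmeans`, the random initialization, the fixed iteration count, and the two-blob test data are assumptions made here and are not taken from the poster's benchmark.

```python
import numpy as np


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Pairwise distances between every point and every centroid: shape (n, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels


# Two well-separated synthetic blobs (hypothetical data for illustration).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

centroids, labels = kmeans(points, k=2)
centroids = centroids[np.argsort(centroids[:, 0])]  # order clusters for printing
print(centroids)
```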
Figure 7: Comparing the reference implementation of the Neural Networks algorithm in NumPy with
the corresponding PhySL code.
Conclusion
- Our early results show that the Phylanx implementations of the Alternating Least Squares and Logistic Regression algorithms outperform their NumPy counterparts in a few cases.
- While Neural Networks and K-Means need much improvement, we are confident that with new features and further performance improvements we will be able to match or outperform these NumPy benchmarks.
https://github.com/STEllAR-GROUP/phylanx
wwei9@lsu.edu