locuz.com
Locuz HPC Services
Professional Services
Locuz HPC Services – Overview
In today's global environment, organizations need to be highly competitive to achieve better outcomes. Whether the goal is improved financial performance, shorter product development cycles, a better understanding of molecular-level interactions, or more efficient ways to simulate the behavior of materials at the nanoscale, High Performance Computing (HPC) is one of the latest tools used to solve these complex computing problems.
Locuz offers high value, integrated HPC solutions based on the following strengths:
Over a decade of experience in Grid Computing, MPPs, SMPs, enterprise storage, parallel file systems and virtualization.
A broad portfolio of superior products and technologies covering different hardware and operating platforms from OEM partners to solve complex problems.
HPC Centre of Excellence for application parallelization and GPU.
Fig 1: Locuz HPC Framework
Locuz offers comprehensive solutions for High Performance Computing based on loosely coupled clusters, SMP, accelerator-based systems, high-performance storage, and application parallelization. We have designed domain-specific solutions that address the challenges and business requirements of the respective industry verticals. In addition, our services can help organizations optimize and overcome obstacles to parallelism by adopting revolutionary approaches to High Performance Computing.
HPC Practice - Service Offerings
Unique Offerings (Innovation by Locuz Labs)
Value Proposition
Broad industry expertise
Field-tested methodologies – HPC cluster projects of national importance
Domain focused
Lifecycle approach
Engineered products that are highly customizable
Designed & deployed over 170 clusters
Provide enterprise-grade support with 24x7 remote support operations
Architecture Consulting
Big Data Infrastructure Services
CUDA & MIC Services
CPU & GPU Clusters
Industry domain focused
Workload Management and Portal
Cloud Services – Onboarding & Migration
Implementation & Remote Management
HPC Practice – Skill Chart
HPC Application Services – Sample Offerings
The HPC Application Services from Locuz cover the installation, configuration and support of various HPC applications, both open source and commercially licensed.
Some sample application services from Locuz include:
Application tuning for performance improvement
Application benchmarking for various comparison scenarios
Performance tuning of storage, operating system, IB and MPIs
Converting codes from serial to parallel versions
Converting CPU codes to run on GPGPU using CUDA, PGI OpenACC and OpenCL (full / partial)
Porting open source CPU applications to GPGPU and Intel Xeon Phi accelerators
The services span a range of applications across different fields, such as:
Computational Chemistry: Gaussian, Quantum Espresso, VASP
Molecular Dynamics: NAMD, GROMACS, LAMMPS
Bioinformatics: BioPerf (a benchmark suite with many bioinformatics applications), mpiBLAST, FASTA
Manufacturing - Fluid Dynamics: Ansys Fluent, Ansys CFX, OpenFOAM
Manufacturing - Structural Analysis: Abaqus FEA, MSC Nastran
Climate and weather modelling: WRF, MM5, MOMS
HPC Cluster Deployment - Sample Statement of Work
For an HPC cluster comprising Admin Node, Service Nodes, Rack Leader Nodes & Compute Nodes.
Hardware Configuration
System Discovery, Installation, and Configuration
    Running the cluster configuration tool, i.e. configure-cluster.
    Set NTP on Admin Node.
    Service Node discovery, installation, and configuration
    Rack Leader discovery, installation, and configuration
    Installing software on the Rack Leader Controllers and Service Nodes.
    Discovering Compute Nodes
    InfiniBand configuration
    Setting up an NFS home server on a Service Node (from NAS box or from local disk pool)
    NIS / LDAP server configuration
    Creating user accounts
    IPMI configuration
    Updating document.
Racking, stacking, and labelling of compute rack.
Inter-rack connectivity
Single-rail / dual-rail IB connectivity in a fat-tree topology.
CMC network configuration and cabling.
Ethernet cabling for LAN access and IO nodes.
FC cabling to storage.
Verifying the connectivity and updating document.
Installation, configuration and validation of Admin/controller server for
    RAID controller
    Operating system.
    Verifying Admin node and updating document.
    Creating image of Service Node
    Creating image of Leader Nodes
    Creating image of Compute Nodes.
Software Configuration
System Maintenance, Monitoring, and Debugging
    Monitoring and managing the compute blades, i.e. IP101, IP103, & IP105.
    GigE monitoring and troubleshooting
    IB monitoring and troubleshooting
    Chassis monitoring through LED, SMC console.
    Installing package in image.
    Cloning image
    Modifying image
    Updating image
    Pushing image to nodes.
    Hardware troubleshooting, if required.
    Performance test
    Burn test, stress test, IO test, and benchmarking (HPL, IOR, IOZone)
    P2P (GPU)
Installing compiler(s), scientific & math libraries, MPI(s) and other software on cluster (compute image)
CUDA installation
Install, configure, and/or validate scheduler (PBS Pro, Open Grid Engine, Univa, Torque, Moab Suite)
Install, configure and/or validate performance suite, accelerators, MPI.
Install and configure PFS & shared SANFS (Lustre, Gluster, CXFS, SAM-QFS, StorNextFS, PanFS)
Installation and configuration of various backup software (Symantec, Legato / EMC NetWorker, Dell NetVault, Commvault etc.)
Updating document.
IO Nodes
    Install, configure and validate InfiniStorage.
    Configure RAID and LUNs.
    Map LUNs to MDS/OSS nodes.
    Install and configure Lustre / Gluster
Sample Deployment Scope (for All Applications)
Prepare the dependent environment for the application to be deployed on the master / central node
Prepare the required package / image of the application to deploy on compute nodes (either through a cluster automation tool or by manual installation)
Installation and configuration of applications on Linux-based HPC clusters / MS Windows-based HPC clusters, individual servers, and/or workstations
Integration of applications with the scheduler (e.g. Open Grid Engine, Univa, Torque, PBS Pro, LSF, MS Scheduler)
Integration of the application with a suitable MPI & dependent library environment
Configuration of the scheduler to integrate it with the license server and with the scheduler's license-parsing resource management feature.
Create a separate license complex for open source schedulers, e.g. OGE & Torque
Integration of applications with a job portal (e.g. PBS, LSF, Ganana)
Additional scope for Open Source Applications:
Compilation of application source code according to the OS kernel version, available compiler version and required math libraries, MPIs and OFED (e.g. the best suitable MPI based on application requirements, using tuned IB parameters).
Installation and configuration of the compiled application binary on a Linux-based HPC cluster.
Fine-tuning the application and environment by using optimized parameters.
Reference Case Studies
Govt Meteorological Agency – Central Africa
Customer Need
Implementation services for an IBM Platform HPC infrastructure, with integration of end-user custom codes, enabling them to run the HPC infrastructure seamlessly.
Solution - HPC Infra Services & Remote Support
Implementation of the pHPC (IBM Platform HPC) solution, including the LSF scheduler, PAC (Platform Application Center), and PCM (Platform Cluster Manager).
Integrating end-user applications on pHPC for optimal performance and ease of use.
Onsite end-user training for effective use of the HPC infrastructure.
Remote reactive support based on SLAs.
Benefits
Single point ownership for complete integration of HPC infrastructure.
Reduction in HPC setup build time.
Remote reactive support for HPC troubleshooting, reducing overall support cost and enabling faster resolution.
Oil & Gas - Seismic Processing Service Provider
Customer Need
An extremely reliable, high-performance compute and storage monitoring and management service that could easily scale to support its growing worldwide seismic processing operations.
Solution - 24x7 HPC Remote Monitoring & Management Support
Implemented multiple components, including the Unicenter Framework, custom-built tools and open source tools.
Remote Infrastructure Management services.
Benefits
Integrated view of the entire cluster infrastructure
Single point ownership for complete production support of cluster infrastructure
Reduction in operations cost by 40%
Staff redeployed for strategic in-house initiatives
Scientific Research & Development Lab – India
Customer Need
Build a 300TF hybrid HPC system to run complex CFD workloads.
Test application scalability on hybrid environments and optimize the HPC infrastructure for superior application performance.
Solution - HPC Infra Build, Integration & Remote Support
Locuz HPC Services – implementation, optimization and support services including:
    Achieving 77% HPL efficiency.
    Implementation of middleware tools (pHPC for scheduler, cluster manager and user portal), compilers (PGI and Intel), and CUDA libraries and environment.
    Integrating end-user applications into the hybrid GPGPU environment for optimal performance.
    Implementation of a GPFS-based parallel file system delivering 4 GB/s of write throughput on 200 TB of usable storage.
Benefits
Single point ownership for complete integration of HPC infrastructure
Reduction in total cost of ownership by augmenting the existing datacenter for the new HPC infrastructure
Enabling end users to port their applications to the CUDA (GPU) environment and scale them optimally
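For readers unfamiliar with the "77% HPL efficiency" figure quoted in the 300TF hybrid system case study above: HPL efficiency is conventionally the ratio of the measured Linpack result (Rmax) to the theoretical peak (Rpeak). The Python sketch below only illustrates that ratio. The function names are hypothetical, and treating 300 TF as the theoretical peak is an assumption made for illustration; the brochure does not say whether 300TF is the peak or the sustained figure.

```python
def hpl_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Return HPL efficiency as measured Rmax divided by theoretical Rpeak."""
    if rpeak_tflops <= 0:
        raise ValueError("rpeak_tflops must be positive")
    if rmax_tflops < 0:
        raise ValueError("rmax_tflops must not be negative")
    return rmax_tflops / rpeak_tflops


def sustained_tflops(rpeak_tflops: float, efficiency: float) -> float:
    """Return the sustained HPL performance implied by a peak and an efficiency."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return rpeak_tflops * efficiency


if __name__ == "__main__":
    # Assumption: 300 TF is taken as the theoretical peak, for illustration only.
    rpeak = 300.0
    rmax = sustained_tflops(rpeak, 0.77)
    eff = hpl_efficiency(rmax, rpeak)
    print(f"{rmax:.0f} TFLOPS sustained at {eff:.0%} HPL efficiency")
```

Under that assumption, 77% efficiency would correspond to roughly 231 TFLOPS of sustained Linpack performance. If 300 TF were instead the sustained figure, the implied peak would be higher.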
Global Automobile Manufacturer - India
Customer Need
Provide appropriate management tools to enable ease of use and analytics on the CFD/CAE cluster.
Enable remote use and management of the system.
Solution - HPC Infra Build, Integration & Remote Support
Locuz Ganana Portal and related implementation & integration services.
Benefits
Fully customized and integrated HPC Management Portal
Ease of use along with support for remote management
locuz.com
Locuz Professional Services
Locuz Inc
450, Raritan Center Parkway, Suite B, Edison, NJ - 08837