Cluster Computing Paper Presentation & Seminar

A computer cluster is a group of loosely coupled computers that work together closely so that in many respects they can be viewed as a single system.

1
Classification of Cluster Computers

2
Clusters Classification..1
Based on Focus (in Market)
– High Performance (HP) Clusters
  • Grand Challenge Applications
– High Availability (HA) Clusters
  • Mission-critical applications

3
HA Cluster: Server Cluster with "Heartbeat" Connection
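The "heartbeat" link is a dedicated connection over which each server periodically signals that it is still alive; when heartbeats stop arriving within a timeout, the partner declares the node failed and starts failover. A minimal sketch of that timeout logic (class names and parameters are invented for illustration, not taken from any particular HA product):

```cpp
// heartbeat_sketch.cpp -- minimal illustration of heartbeat-based failure
// detection (hypothetical parameters; real HA clusters use a dedicated link).
#include <chrono>
#include <iostream>
#include <thread>

using Clock = std::chrono::steady_clock;

class HeartbeatMonitor {
public:
    explicit HeartbeatMonitor(std::chrono::milliseconds timeout)
        : timeout_(timeout), last_beat_(Clock::now()) {}

    // Called whenever a heartbeat message arrives from the peer node.
    void beat() { last_beat_ = Clock::now(); }

    // Peer is presumed failed if no heartbeat arrived within the timeout.
    bool peer_alive() const { return Clock::now() - last_beat_ < timeout_; }

private:
    std::chrono::milliseconds timeout_;
    Clock::time_point last_beat_;
};

int main() {
    HeartbeatMonitor monitor(std::chrono::milliseconds(500));

    monitor.beat();                                              // heartbeat received
    std::cout << "alive: " << monitor.peer_alive() << "\n";      // expected: 1

    std::this_thread::sleep_for(std::chrono::milliseconds(600)); // silence from peer
    if (!monitor.peer_alive())
        std::cout << "peer missed heartbeat -- start failover\n";
}
```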

4
Clusters Classification..2
Based on Workstation/PC Ownership
– Dedicated Clusters
– Non-dedicated Clusters
  • Adaptive parallel computing
  • Also called communal multiprocessing

5
Clusters Classification..3
Based on Node Architecture..
– Clusters of PCs (CoPs)
– Clusters of Workstations (COWs)
– Clusters of SMPs (CLUMPs)

6
Building Scalable Systems: Cluster of SMPs (Clumps)
(Chart: performance of SMP systems vs. four-processor servers in a cluster)

7
Clusters Classification..4
Based on Node OS Type..
– Linux Clusters (Beowulf)
– Solaris Clusters (Berkeley NOW)
– NT Clusters (HPVM)
– AIX Clusters (IBM SP2)
– SCO/Compaq Clusters (Unixware)
– Digital VMS Clusters, HP Clusters, ...

8
Clusters Classification..5
Based on node component architecture & configuration (processor arch, node type: PC/workstation, and OS: Linux/NT):
– Homogeneous Clusters
  • All nodes have a similar configuration
– Heterogeneous Clusters
  • Nodes based on different processors and running different OSes

9
Clusters Classification..6a
Dimensions of Scalability & Levels of Clustering
(Figure: dimensions of scalability: (1) platform, from uniprocessor to SMP, cluster and MPP; (2) resources: CPU, I/O, memory, OS; (3) network technology; and levels of clustering from workgroup, department and enterprise up to campus and public metacomputing.)

10
Clusters Classification..6b
Group Clusters (#nodes: 2-99)
– a set of dedicated/non-dedicated computers, mainly connected by a SAN such as Myrinet
Departmental Clusters (#nodes: 99-999)
Organizational Clusters (#nodes: many 100s), using ATM networks
Internet-wide Clusters = Global Clusters (#nodes: 1000s to many millions)
– Metacomputing
– Web-based Computing
– Agent-based Computing
  • Java plays a major role in web- and agent-based computing

11
Cluster Middleware and Single System Image

12
Contents
What is Middleware?
What is Single System Image?
Benefits of Single System Image
SSI Boundaries
SSI Levels
Relationship between Middleware Modules
Strategy for SSI via OS
Solaris MC: an example OS supporting SSI
Cluster Monitoring Software

13
What is Cluster Middleware ?
An interface between user applications and cluster hardware and OS platform.
Middleware packages support each other at the management, programming, and implementation levels.
Middleware Layers:
– SSI Layer
– Availability Layer: enables the cluster services of
  • checkpointing, automatic failover, recovery from failure,
  • fault-tolerant operation among all cluster nodes

14
Middleware Design Goals
Complete Transparency
– Lets the user see a single cluster system
  • Single entry point, ftp, telnet, software loading...
Scalable Performance
– Easy growth of the cluster
  • No change of API & automatic load distribution
Enhanced Availability
– Automatic recovery from failures
  • Employs checkpointing & fault-tolerant technologies
– Handles consistency of data when replicated

15
What is Single System Image (SSI) ?
A single system image is the illusion, created by software or hardware, that a collection of computing elements appear as a single computing resource.
SSI makes the cluster appear like a single machine to the user, to applications, and to the network.
A cluster without an SSI is not a cluster.

16
Benefits of Single System Image
Usage of system resources transparently
Improved reliability and higher availability
Simplified system management
Reduction in the risk of operator errors
User need not be aware of the underlying system architecture to use these machines effectively

17
SSI vs. Scalability (design space of competing architectures)

18
Desired SSI Services
Single Entry Point
– telnet cluster.my_institute.edu
– telnet node1.cluster.my_institute.edu
Single File Hierarchy: xFS, AFS, Solaris MC Proxy
Single Control Point: management from a single GUI
Single Virtual Networking
Single Memory Space: DSM
Single Job Management: Glunix, Codine, LSF
Single User Interface: like a workstation/PC windowing environment (CDE in Solaris/NT); it may use Web technology

19
Availability Support Functions
Single I/O Space (SIO):
– Any node can access any peripheral or disk device without knowledge of its physical location.
Single Process Space (SPS):
– Any process on any node can create processes cluster-wide, and they communicate through signals, pipes, etc. as if they were on a single node.
Checkpointing and Process Migration:
– Checkpointing saves the process state and intermediate results from memory to disk to support rollback recovery when a node fails; process migration supports load balancing (a sketch of the checkpoint/rollback idea follows below).
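As a rough illustration of checkpoint/rollback recovery, the sketch below periodically saves a worker's state to disk and reloads it on restart (the file name and state layout are invented for the example; real systems checkpoint entire process images):

```cpp
// checkpoint_sketch.cpp -- toy checkpoint/rollback: periodically save program
// state to disk so a restarted node can resume instead of starting over.
// (The file name and state layout are illustrative only.)
#include <fstream>
#include <iostream>
#include <string>

struct WorkerState {
    long next_item = 0;   // next unit of work to process
    double partial = 0.0; // partial result accumulated so far
};

static const std::string kCheckpointFile = "worker.ckpt";

void save_checkpoint(const WorkerState& s) {
    std::ofstream out(kCheckpointFile, std::ios::trunc);
    out << s.next_item << ' ' << s.partial << '\n';
}

bool load_checkpoint(WorkerState& s) {
    std::ifstream in(kCheckpointFile);
    return static_cast<bool>(in >> s.next_item >> s.partial);
}

int main() {
    WorkerState state;
    if (load_checkpoint(state))               // rollback recovery after a crash
        std::cout << "resuming at item " << state.next_item << "\n";

    for (long i = state.next_item; i < 1000000; ++i) {
        state.partial += 1.0 / (i + 1);       // the "real" computation
        state.next_item = i + 1;
        if (state.next_item % 100000 == 0)    // checkpoint every 100k items
            save_checkpoint(state);
    }
    std::cout << "result: " << state.partial << "\n";
}
```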

20
SSI Levels
It is a computer-science notion of levels of abstraction (a house is at a higher level of abstraction than its walls, ceilings, and floors).
Application and Subsystem Level
Operating System Kernel Level
Hardware Level

21
Cluster Computing - Research Projects
Beowulf (CalTech and NASA) - USA
CCS (Computing Centre Software) - Paderborn, Germany
Condor - University of Wisconsin, USA
DJM (Distributed Job Manager) - Minnesota Supercomputing Center, USA
DQS (Distributed Queuing System) - Florida State University, USA
EASY - Argonne National Lab, USA
HPVM (High Performance Virtual Machine) - UIUC & now UCSB, USA
far - University of Liverpool, UK
Gardens - Queensland University of Technology, Australia
Generic NQS (Network Queuing System) - University of Sheffield, UK
NOW (Network of Workstations) - Berkeley, USA
NIMROD - Monash University, Australia
PBS (Portable Batch System) - NASA Ames and LLNL, USA
PRM (Prospero Resource Manager) - Uni. of S. California, USA
QBATCH - Vita Services Ltd., USA

22
Cluster Computing - Commercial Software
Codine (Computing in Distributed Network Environment) - GENIAS GmbH, Germany
LoadLeveler - IBM Corp., USA
LSF (Load Sharing Facility) - Platform Computing, Canada
NQE (Network Queuing Environment) - Craysoft Corp., USA
OpenFrame - Centre for Development of Advanced Computing, India
RWPC - Real World Computing Partnership, Japan
Unixware - SCO (Santa Cruz Operation), USA
Solaris MC - Sun Microsystems, USA

23
Representative Cluster Systems
1. Solaris MC
2. Berkeley NOW
3. Their comparison with Beowulf & HPVM

24
Next Generation Distributed Computing:
The Solaris MC Operating System

25
Why new software?
Without software, a cluster is:
– Just a network of machines
– Requires specialized applications
– Hard to administer
With a cluster operating system:
– The cluster becomes a scalable, modular computer
– Users and administrators see a single large machine
– Runs existing applications
– Easy to administer
New software makes the cluster better for the customer

26
Cluster computing and Solaris MC
Goal: use computer clusters for general-purpose computing
Support existing customers and applications
Solution: Solaris MC (Multi Computer) operating system
A distributed operating system (OS) for multi-computers

27
What is the Solaris MC OS ?
Solaris MC extends standard Solaris
Solaris MC makes the cluster look like a single machine
– Global file system
– Global process management
– Global networking
Solaris MC runs existing applications unchanged
– Supports the Solaris ABI (Application Binary Interface)

28
Applications
Ideal for:
– Web and interactive servers
– Databases
– File servers
– Timesharing
Benefits for vendors and customers:
– Preserves investment in existing applications
– Modular servers with a low entry-point price and low cost of ownership
– Easier system administration
– Solaris could become a preferred platform for clustered systems

29
Solaris MC is a running research system
Designed, built and demonstrated a Solaris MC prototype
– Cluster of SPARCstations connected with a Myrinet network
– Runs an unmodified commercial parallel database, a scalable Web server, and parallel make
Next: Solaris MC Phase II
– High availability
– New I/O work to take advantage of clusters
– Performance evaluation

30
Advantages of Solaris MC
Leverages continuing investment in Solaris
– Same applications: binary-compatible
– Same kernel, device drivers, etc.
– As portable as base Solaris: will run on SPARC, x86, PowerPC
State-of-the-art distributed systems techniques
– High availability designed into the system
– Powerful distributed object-oriented framework
Ease of administration and use
– Looks like a familiar multiprocessor server to users, system administrators, and applications

31
Solaris MC details
Solaris MC is a set of C++ loadable modules on top of Solaris
– Very few changes to existing kernel
A private Solaris kernel per node provides reliability
Object-oriented system with well-defined interfaces

32
Solaris MC components
Object and communication support
High availability support
PXFS global distributed file system
Process management
Networking

Solaris MC Architecture
(Figure: applications issue system calls into the Solaris MC modules, file system, processes, network and object framework, written in C++ and layered on the existing Solaris 2.5 kernel; object invocations connect the kernel to other nodes.)

33
Object Orientation
Better software maintenance, change, and evolution
– Well-defined interfaces
– Separate implementation from interface
– Interface inheritance
Solaris MC uses:
– IDL: a better way to define interfaces
– CORBA object model: a better RPC (Remote Procedure Call)
– C++: a better C
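In this style, an object's interface is declared separately (in IDL) from any implementation, and an invocation on an object reference may be forwarded to another node by a generated proxy. A very rough C++ analogy, with plain virtual interfaces standing in for IDL-generated stubs (illustrative only, not actual Solaris MC code):

```cpp
// interface_sketch.cpp -- interface/implementation separation in the spirit of
// IDL + CORBA: callers program against an abstract interface; a proxy could
// forward calls to a remote node. (Illustrative only, not Solaris MC code.)
#include <iostream>
#include <memory>
#include <string>

// The "IDL" part: a pure interface, no implementation details.
class NameServer {
public:
    virtual ~NameServer() = default;
    virtual std::string lookup(const std::string& name) = 0;
};

// One implementation, local to this node.
class LocalNameServer : public NameServer {
public:
    std::string lookup(const std::string& name) override {
        return name == "pxfs" ? "node2:objref#17" : "unknown";
    }
};

// A stand-in for a generated proxy that would marshal the request,
// ship it to another node, and unmarshal the reply.
class RemoteNameServerProxy : public NameServer {
public:
    std::string lookup(const std::string& name) override {
        std::cout << "(marshalling request for '" << name << "')\n";
        return "reply-from-remote-node";
    }
};

int main() {
    std::unique_ptr<NameServer> ns = std::make_unique<LocalNameServer>();
    std::cout << ns->lookup("pxfs") << "\n";   // caller never sees which impl
    ns = std::make_unique<RemoteNameServerProxy>();
    std::cout << ns->lookup("pxfs") << "\n";
}
```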

34
Object and Communication Framework
Mechanism for nodes and modules to communicate
– Inter-node and intra-node interprocess communication
– Optimized protocols for the trusted computing base
– Efficient, low-latency communication primitives
Object communication is independent of the interconnect
– We use Ethernet, Fast Ethernet, FibreChannel, Myrinet
– Allows interconnect hardware to be upgraded
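One way to picture interconnect independence is a common transport interface that hides whether Ethernet or Myrinet is underneath, so the object layer is untouched when the hardware is upgraded. A minimal sketch (class and method names are invented for illustration):

```cpp
// transport_sketch.cpp -- hiding the interconnect behind a common interface so
// the object layer is unchanged when the hardware is upgraded. (Invented names.)
#include <iostream>
#include <memory>
#include <string>
#include <vector>

class Transport {
public:
    virtual ~Transport() = default;
    virtual void send(int node, const std::vector<char>& bytes) = 0;
};

class EthernetTransport : public Transport {
public:
    void send(int node, const std::vector<char>& bytes) override {
        std::cout << "ethernet: " << bytes.size() << " bytes to node " << node << "\n";
    }
};

class MyrinetTransport : public Transport {
public:
    void send(int node, const std::vector<char>& bytes) override {
        std::cout << "myrinet: " << bytes.size() << " bytes to node " << node << "\n";
    }
};

// The object-invocation layer only ever sees the abstract Transport.
void invoke_remote(Transport& t, int node, const std::string& request) {
    t.send(node, std::vector<char>(request.begin(), request.end()));
}

int main() {
    EthernetTransport eth;
    MyrinetTransport myri;
    invoke_remote(eth, 3, "pxfs.read(inode=42)");
    invoke_remote(myri, 3, "pxfs.read(inode=42)");  // same call, faster network
}
```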

35
High Availability Support
Node failure doesn't crash the entire system
– Unaffected nodes continue running
– Better than an SMP
– A requirement for the mission-critical market
Well-defined failure boundaries
– Separate kernel per node: the OS does not use shared memory
Object framework provides support
– Delivers failure notifications to servers and clients (sketched below)
– Group membership protocol detects node failures
Each subsystem is responsible for its own recovery
– File system, process management, networking, applications
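The failure-notification idea can be pictured as each subsystem registering a recovery callback that the membership protocol invokes once a node is declared dead. A sketch with invented names (not the real Solaris MC mechanism):

```cpp
// membership_sketch.cpp -- subsystems register callbacks that run when the
// group-membership protocol decides a node has failed. (Invented names.)
#include <functional>
#include <iostream>
#include <vector>

class Membership {
public:
    using FailureHandler = std::function<void(int node)>;

    void on_node_failure(FailureHandler h) { handlers_.push_back(std::move(h)); }

    // Called by the membership protocol once a node is declared dead.
    void declare_failed(int node) {
        for (auto& h : handlers_) h(node);   // each subsystem recovers itself
    }

private:
    std::vector<FailureHandler> handlers_;
};

int main() {
    Membership membership;
    membership.on_node_failure([](int n) {
        std::cout << "pxfs: re-mastering files that were served by node " << n << "\n";
    });
    membership.on_node_failure([](int n) {
        std::cout << "procmgr: reaping processes that ran on node " << n << "\n";
    });
    membership.declare_failed(2);
}
```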

36
PXFS: Global Filesystem
Single-system image of the file system
Backbone of Solaris MC
Coherent access and caching of files and directories
– Caching provides high performance
Access to I/O devices

37
PXFS: An object-oriented VFS
PXFS builds on existing Solaris file systems
– Uses the vnode/virtual file system (VFS) interface externally
– Uses object communication internally
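The vnode/VFS interface is essentially a table of file operations that every file system implements, which is what lets PXFS layer caching and distribution over existing file systems. A much-simplified caricature of that layering (not the actual Solaris vnode API):

```cpp
// vfs_sketch.cpp -- a much-simplified, vnode-style file interface to suggest
// how PXFS can layer over existing file systems. (Not the real Solaris API.)
#include <cstddef>
#include <iostream>
#include <string>

class Vnode {                       // one open file or directory
public:
    virtual ~Vnode() = default;
    virtual std::string read(std::size_t off, std::size_t len) = 0;
};

class LocalVnode : public Vnode {   // backed by an on-disk file system (e.g. UFS)
public:
    explicit LocalVnode(std::string data) : data_(std::move(data)) {}
    std::string read(std::size_t off, std::size_t len) override {
        return off < data_.size() ? data_.substr(off, len) : "";
    }
private:
    std::string data_;
};

class CachingProxyVnode : public Vnode {  // PXFS-like client-side caching layer
public:
    explicit CachingProxyVnode(Vnode& backing) : backing_(backing) {}
    std::string read(std::size_t off, std::size_t len) override {
        if (cache_.empty()) cache_ = backing_.read(0, 1 << 20);  // fill cache once
        return off < cache_.size() ? cache_.substr(off, len) : "";
    }
private:
    Vnode& backing_;
    std::string cache_;
};

int main() {
    LocalVnode file("hello from the server node");
    CachingProxyVnode cached(file);
    std::cout << cached.read(0, 5) << "\n";   // first read goes to the server
    std::cout << cached.read(6, 4) << "\n";   // second read is served from cache
}
```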

38
Process management
Provides a global view of processes on any node
– Users, administrators, and applications see the global view
– Supports existing applications
Uniform support for local and remote processes
– Process creation/waiting/exiting (including remote execution)
– Global process identifiers, groups, sessions (sketched below)
– Signal handling
– procfs (/proc)
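A global process identifier can be thought of as a (node, local pid) pair, so a signal sent to a global pid is routed to whichever node hosts the process. A toy encoding (the bit layout and function names are invented for illustration):

```cpp
// gpid_sketch.cpp -- toy global process identifiers: pack the node number and
// the local pid into one value so signals can be routed cluster-wide.
// (The bit layout is invented for illustration.)
#include <cstdint>
#include <iostream>

constexpr std::uint64_t make_gpid(std::uint32_t node, std::uint32_t local_pid) {
    return (static_cast<std::uint64_t>(node) << 32) | local_pid;
}
constexpr std::uint32_t node_of(std::uint64_t gpid)  { return gpid >> 32; }
constexpr std::uint32_t local_of(std::uint64_t gpid) { return gpid & 0xffffffffu; }

// A cluster-wide kill(): forward to the owning node, deliver locally there.
void global_kill(std::uint64_t gpid, int sig, std::uint32_t this_node) {
    if (node_of(gpid) == this_node)
        std::cout << "deliver signal " << sig << " to local pid " << local_of(gpid) << "\n";
    else
        std::cout << "forward signal " << sig << " to node " << node_of(gpid) << "\n";
}

int main() {
    std::uint64_t gpid = make_gpid(/*node=*/3, /*local_pid=*/4711);
    global_kill(gpid, 15, /*this_node=*/1);   // not ours: forwarded to node 3
    global_kill(gpid, 15, /*this_node=*/3);   // ours: delivered locally
}
```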

39
Process management benefits
Global process management helps users and administrators
Users see familiar single machine process model
Can run programs on any node
Location of process in the cluster doesn’t matter
Use existing commands and tools: unmodified ps, kill, etc.

40
Networking Goals
Cluster appears externally as a single SMP server
– Familiar to customers
– Access the cluster through a single network address
– Multiple network interfaces supported but not required
Scalable design
– Protocol and network application processing on any node
– Parallelism provides high server performance

41
Networking: Implementation
A programmable "packet filter"
– Packets are routed between the network device and the correct node
– Efficient, scalable, and supports parallelism
– Supports multiple protocols with existing protocol stacks
Parallelism of protocol processing and applications
– Incoming connections are load-balanced across the cluster (see the sketch below)
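A common way to load-balance connections while keeping each connection on one node is to hash the packet's address/port tuple and let the hash pick the node, so all packets of a connection land in the same place. A sketch of that idea (the hash and structure are illustrative, not the actual Solaris MC packet filter):

```cpp
// packet_filter_sketch.cpp -- spreading incoming connections across nodes by
// hashing the connection 4-tuple; every packet of one connection goes to the
// same node. (Illustrative only, not the actual Solaris MC packet filter.)
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>

struct FourTuple {
    std::uint32_t src_ip, dst_ip;
    std::uint16_t src_port, dst_port;
};

// Deterministic node choice: same tuple -> same node.
int pick_node(const FourTuple& t, int num_nodes) {
    std::size_t h = std::hash<std::uint64_t>{}(
        (static_cast<std::uint64_t>(t.src_ip) << 32) ^ t.dst_ip ^
        (static_cast<std::uint64_t>(t.src_port) << 16) ^ t.dst_port);
    return static_cast<int>(h % num_nodes);
}

int main() {
    FourTuple c1{0x0a000001, 0x0a0000ff, 40001, 80};
    FourTuple c2{0x0a000002, 0x0a0000ff, 40002, 80};
    std::cout << "connection 1 -> node " << pick_node(c1, 4) << "\n";
    std::cout << "connection 2 -> node " << pick_node(c2, 4) << "\n";
    std::cout << "connection 1 again -> node " << pick_node(c1, 4) << "\n"; // same node
}
```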

42
Status
4-node, 8-CPU prototype with Myrinet demonstrated
– Object and communication infrastructure
– Global file system (PXFS) with coherency and caching
– Networking: TCP/IP with load balancing
– Global process management (ps, kill, exec, wait, rfork, /proc)
– Monitoring tools
– Cluster membership protocols
Demonstrated applications
– Commercial parallel database
– Scalable Web server
– Parallel make
– Timesharing
The Solaris MC team is working on high availability

43
Summary of Solaris MC
Clusters are likely to be an important market
Solaris MC preserves customer investment in Solaris
– Uses existing Solaris applications
– Familiar to customers
– Looks like a multiprocessor, not a special cluster architecture
– Ease of administration and use
Clusters are ideal for important applications
– Web server, file server, databases, interactive services
State-of-the-art object-oriented distributed implementation
Designed for future growth

44
Berkeley NOW Project

45
NOW @ Berkeley
Design & Implementation of higher-level system
– Global OS (Glunix)
– Parallel File System (xFS)
– Fast Communication (HW for Active Messages)
– Application Support
Overcoming technology shortcomings
– Fault tolerance
– System management
NOW Goal: Faster for Parallel AND Sequential

46
NOW Software Components
(Figure: NOW software stack: large sequential applications and parallel applications run over Sockets, Split-C, MPI, HPF and vSM on top of the Global Layer Unix (Glunix) with its name server and scheduler; each Unix (Solaris) workstation runs the Active Messages lightweight communication protocol (AM L.C.P.) and a VN segment driver over the Myrinet scalable interconnect.)

47
Active Messages: Lightweight Communication Protocol
Key idea: a network process ID is attached to every message, and the hardware checks it upon receipt (sketched below)
– Net PID match: as fast as before
– Net PID mismatch: interrupt and invoke the OS
Can mix LAN messages and MPP messages; invoke the OS & TCP/IP only when the nodes are not cooperating (if everyone uses the same physical-layer format)
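The check described above can be sketched as: every incoming message carries the network process ID of the parallel program it belongs to; a match with the currently scheduled program takes the user-level fast path, a mismatch traps to the OS. A toy version (structures invented for illustration):

```cpp
// netpid_sketch.cpp -- the Active Messages fast-path check: a message whose
// network process ID matches the currently running parallel program is handled
// directly; anything else is punted to the OS. (Invented structures.)
#include <iostream>

struct NetMessage {
    int net_pid;        // which parallel program this message belongs to
    int payload;
};

void fast_path(const NetMessage& m)  { std::cout << "user-level handler: " << m.payload << "\n"; }
void trap_to_os(const NetMessage& m) { std::cout << "interrupt: OS queues msg for pid " << m.net_pid << "\n"; }

void on_receive(const NetMessage& m, int current_net_pid) {
    if (m.net_pid == current_net_pid)
        fast_path(m);        // match: as fast as a dedicated MPP network
    else
        trap_to_os(m);       // mismatch: invoke the OS (e.g. hand to TCP/IP)
}

int main() {
    on_receive({/*net_pid=*/7, /*payload=*/42}, /*current_net_pid=*/7);
    on_receive({/*net_pid=*/9, /*payload=*/13}, /*current_net_pid=*/7);
}
```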

48
MPP Active Messages
Key idea: associate a small user-level handler directly with each message
– The sender injects the message directly into the network
– The handler executes immediately upon arrival
  • Pulls the message out of the network and integrates it into the ongoing computation, or replies
– No buffering (beyond transport), no parsing, no allocation, primitive scheduling
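The mechanism can be sketched as a message whose header names the handler to run on arrival; the receiver pulls the message off the network and calls that handler immediately instead of buffering it. A toy in-process version of the idea (function-pointer dispatch standing in for the real network path):

```cpp
// am_handler_sketch.cpp -- toy Active Message dispatch: the message header
// carries the handler to execute on arrival, so there is no receive-side
// buffering or parsing. (A toy in-process version of the idea.)
#include <iostream>

struct ActiveMessage {
    void (*handler)(int arg);   // user-level handler named in the header
    int arg;                    // small argument carried with the message
};

long g_sum = 0;   // the "ongoing computation" the handler feeds

void add_handler(int arg)   { g_sum += arg; }
void print_handler(int arg) { std::cout << "received value " << arg << "\n"; }

// The receive side: pull the message off the "network" and run its handler now.
void deliver(const ActiveMessage& m) { m.handler(m.arg); }

int main() {
    deliver({add_handler, 5});
    deliver({add_handler, 7});
    deliver({print_handler, 99});
    std::cout << "sum integrated into computation: " << g_sum << "\n";   // 12
}
```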

49
Active Message Model
Every message contains in its header the address of a user-level handler, which is executed immediately at user level
No receive side buffering of messages
Supports protected multiprogramming of a large number of users onto finite physical network resource
Active message operations, communication events and threads are integrated in a simple and cohesive model
Provides naming and protection

50
Active Message Model (Contd..)
(Figure: an active message carries a handler PC and data across the network; on arrival the handler runs immediately and feeds the data into the receiver's primary computation and data structures.)

51
xFS: File System for NOW
Serverless file system: all data lives with the clients
– Uses MP cache coherency to reduce traffic
Files striped for parallel transfer
Large file cache ("cooperative caching": network RAM, sketched below)

              Miss rate   Response time
Client/Server    10%          1.8 ms
xFS               4%          1.0 ms
(42 WS, 32 MB/WS, 512 MB/server, 8 KB/access)
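Cooperative caching ("network RAM") means a local cache miss is first looked up in other clients' memories before going to disk, since fetching from a peer's memory over the network is far cheaper than a disk access. A toy lookup path (structures invented for illustration):

```cpp
// coop_cache_sketch.cpp -- toy cooperative caching: on a local miss, try the
// other clients' caches ("network RAM") before falling back to disk.
// (Invented structure, illustrative only.)
#include <iostream>
#include <map>
#include <string>
#include <vector>

using BlockId = int;
using Cache = std::map<BlockId, std::string>;

std::string read_block(BlockId b, Cache& local, std::vector<Cache*>& peers) {
    if (auto it = local.find(b); it != local.end())
        return it->second;                       // local cache hit
    for (Cache* peer : peers)
        if (auto it = peer->find(b); it != peer->end()) {
            local[b] = it->second;               // fetch from a peer's memory
            return it->second;
        }
    std::string data = "block-" + std::to_string(b) + "-from-disk";
    local[b] = data;                             // last resort: the disk
    return data;
}

int main() {
    Cache mine, peer1{{42, "block-42-cached-by-peer"}};
    std::vector<Cache*> peers{&peer1};
    std::cout << read_block(42, mine, peers) << "\n";  // served from network RAM
    std::cout << read_block(7,  mine, peers) << "\n";  // goes to disk
}
```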

52
Glunix: Gluing Unix
It is built on top of Solaris
It glues together Solaris running on the cluster nodes
Supports transparent remote execution and load balancing; allows existing applications to run
Provides a globalized view of system resources, like Solaris MC
Gang-schedules parallel jobs to be as good as a dedicated MPP for parallel jobs

53
3 Paths for Applications on NOW?
Revolutionary (MPP style): write new programs from scratch using MPP languages, compilers, libraries, ...
Porting: port programs from mainframes, supercomputers, MPPs, ...
Evolutionary: take a sequential program & use
1) Network RAM: first use the memory of many computers to reduce disk accesses; if not fast enough, then:
2) Parallel I/O: use many disks in parallel for accesses not in the file cache; if not fast enough, then:
3) Parallel program: change the program until it uses enough processors that it is fast
=> Large speedup without a fine-grain parallel program

54
Comparison of 4 Cluster Systems

55
Clusters Revisited

56
Summary
We have discussed clusters:
– Enabling Technologies
– Architecture & its Components
– Classifications
– Middleware
– Single System Image
– Representative Systems

57
Conclusions
Clusters are promising..
– Solve the parallel processing paradox
– Offer incremental growth and match funding patterns
– New trends in hardware and software technologies are likely to make clusters more promising, so that cluster-based supercomputers can be seen everywhere!