Beowulf Clusters


Matthew Doney

What is a cluster?

A cluster is a group of several computers connected to work together as a single system

There are several different methods of connecting them:

Distributed

Computers widely separated, connected over the internet

Used by distributed computing projects like SETI@home and GIMPS

Workstation Cluster

A collection of workstations loosely connected by a LAN

Cluster Farm

PCs connected over a LAN that perform work when idle

What is a Beowulf Cluster?

A Beowulf cluster is one class of cluster computer

Uses Commercial Off The Shelf (COTS) hardware

Typically contains both master and slave nodes

Not defined by a specific piece of hardware

Image Source: http://www.cse.mtu.edu/Common/cluster.jpg

What is a Beowulf Cluster?

The origin of the name “Beowulf”

Main character of the Old English epic poem of the same name

Described in the poem – “he has thirty men’s heft of grasp in the gripe of his hand, the bold-in-battle”.

Image Source: http://www.teachingcollegeenglish.com/wp-content/uploads/2011/06/lynd-ward-17-jnanam-dot-net.jpg

Cluster Computer History – 1950’s

SAGE (Semi-Automatic Ground Environment), one of the first cluster computers

Developed by IBM for NORAD

Linked radar stations together to form the first early-warning detection system

Image Source: http://www.ieeeghn.org/wiki/images/3/34/Sage_nomination.jpg

Cluster Computer History – 1970’s

Technological Advancements

VLSI (Very Large Scale Integration)

Ethernet

UNIX Operating System

Cluster Computer History – 1980’s

Increased interest in cluster computing

Example: the NSA connected 160 Apollo workstations in a cluster configuration

First widely used clustering product: VAXcluster

Development of task scheduling software

Condor package developed by UW-Madison

Development of parallel programming software

PVM (Parallel Virtual Machine)

Cluster Computer History – 1990’s

NOW (Network of Workstations) project at UC Berkeley

First cluster on TOP500 list

Development of Myrinet LAN system

Beowulf project started at NASA’s Goddard Space Flight Center

Image Source: http://www.cs.berkeley.edu/~pattrsn/Arch/NOW2.jpg

Cluster Computer History - Beowulf

Developed by Thomas Sterling and Donald Becker

16 individual nodes

100 MHz Intel 80486 processors

16 MB memory, 500 MB hard drive

Two 10 Mbps Ethernet ports

Early version of Linux

Used PVM library

Cluster Computer History – 1990’s

MPI standard developed

Created to be a global standard to replace existing message passing protocols

DOE, NASA, California Institute of Technology collaboration

Developed a Beowulf system with a sustained performance of 1 Gflops

Cost $50,000

Awarded the Gordon Bell Prize for price/performance

28 Clusters were on the TOP500 list by the end of the decade

Beowulf Cluster Advantages

Price/Performance

Using COTS hardware greatly reduces associated costs

Scalability

Because the system is built from individual nodes, more nodes can easily be added with only minor changes to the network

Convergence Architecture

Commodity hardware has standardized around common operating systems, instruction sets, and communication protocols

Code portability has greatly increased

Beowulf Cluster Advantages

Flexibility of Configuration and Upgrades

Large variety of COTS components

Standardization of COTS components allows for easy upgrades

Technology Tracking

Can use new components as soon as they come out

No delay time waiting for manufacturers to integrate components

High Availability

System will continue to run if an individual node fails

Beowulf Cluster Advantages

Level of Control

The system is easily configured to the user's liking

Development Cost and Time

No special hardware needs to be designed

Less time is spent designing the system; parts are simply selected from what is available

Cheaper mass market components

Beowulf Cluster Disadvantages

Programming Difficulty

Programs need to be highly parallelized to take advantage of the hardware design

Distributed Memory

Program data is split over the individual nodes

Network speed can bottleneck performance

Results may need to be combined on a single node (see the sketch below)
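To make that last point concrete, here is a minimal sketch in C using MPI (introduced later in this deck) in which each node computes a partial value from its own slice of the data and the pieces are summed on a single node; the per-node work function is a hypothetical placeholder.

    #include <stdio.h>
    #include <mpi.h>

    /* Hypothetical per-node work: in a real program this would operate on
       the slice of data held in this node's local memory. */
    static double compute_partial(int rank) { return (double)(rank + 1); }

    int main(int argc, char *argv[])
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double partial = compute_partial(rank);

        /* The partial results travel over the network and are summed on
           rank 0; this communication step is where network speed can
           become the bottleneck. */
        double total = 0.0;
        MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("Combined result: %f\n", total);

        MPI_Finalize();
        return 0;
    }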

Beowulf Cluster Architecture

Master-Slave configuration

Master Node

Job scheduling

System monitoring

Resource management

Slave Node

Does assigned work

Communicates with other slave nodes

Sends results to master node
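At the programming level the same split is often mirrored in application code. Below is a minimal sketch in C using MPI (introduced later in this deck) where rank 0 plays the master, handing one work item to each slave and collecting the results; the work function and the data values are purely illustrative.

    #include <stdio.h>
    #include <mpi.h>

    /* Hypothetical work function: each slave just squares its input. */
    static double do_work(double item) { return item * item; }

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* Master: assign one item per slave, then collect the results. */
            for (int i = 1; i < size; i++) {
                double item = (double)i;
                MPI_Send(&item, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
            }
            for (int i = 1; i < size; i++) {
                double result;
                MPI_Recv(&result, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("Result from slave %d: %f\n", i, result);
            }
        } else {
            /* Slave: receive an assignment, do the work, return the result. */
            double item, result;
            MPI_Recv(&item, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            result = do_work(item);
            MPI_Send(&result, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

In practice the master node's job-scheduling and monitoring duties are handled by the resource-management software described on the following slides rather than by application code.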

Node Hardware

Typically desktop PCs

Can also consist of other types of computers, e.g.:

Rack-mount servers

Case-less motherboards

PS3s

Raspberry Pi boards

Node Software

Operating System

Resource Manager

Message Passing Software

Resource Management Software

Condor

Developed by UW-Madison

Allows distributed job submission

PBS (Portable Batch System)

Initially developed by NASA

Developed to schedule jobs on parallel compute clusters

Maui

Adds enhanced scheduling and monitoring on top of an existing job scheduler (e.g., PBS)

Allows administrator to set individual and group job priorities

Sample Condor Submit File

Submits 150 copies of the program foo

Each copy of the program has its own input, output, and error message file

All of the log information from Condor goes to one file
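A minimal sketch of what such a submit description file could look like; the executable name foo comes from the slide, while the vanilla universe and the exact file-naming pattern are assumptions.

    # Sketch of a Condor submit description file: 150 copies of foo, each
    # with its own input, output, and error file, and one shared log.
    universe   = vanilla
    executable = foo
    input      = foo.$(Process).in
    output     = foo.$(Process).out
    error      = foo.$(Process).err
    log        = foo.log
    queue 150

Each queued copy receives its own $(Process) number (0 through 149), which is what gives every copy distinct input, output, and error files while all Condor log entries go to foo.log.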

Sample Maui Configuration File

User yangq will have the highest priority, while users in group ART have the lowest

Members of group CS_SE are limited to 20 jobs which use no more than 100 nodes
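A minimal sketch of the corresponding maui.cfg entries, assuming Maui's USERCFG/GROUPCFG credential parameters; the numeric priority values are arbitrary, only their relative ordering matters.

    # yangq gets the highest priority, group ART the lowest.
    USERCFG[yangq]   PRIORITY=1000
    GROUPCFG[ART]    PRIORITY=-1000

    # Group CS_SE may run at most 20 jobs on at most 100 nodes.
    GROUPCFG[CS_SE]  MAXJOB=20 MAXNODE=100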

Sample PBS Submit File

Submits a job named “my_job_name” that needs 1 hour, 4 CPUs, and 2 GB of memory

Uses file “my_job_name.in” as input

Uses file “my_job_name.log” as output

Uses file “my_job_name.err” as error output
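A minimal sketch of such a PBS submit script; my_program is a hypothetical executable, and the CPU request could equally be written as a nodes/ppn specification depending on the site and PBS version.

    #!/bin/bash
    # Sketch of a PBS submit script matching the description above.
    #PBS -N my_job_name
    #PBS -l walltime=01:00:00
    #PBS -l ncpus=4
    #PBS -l mem=2gb
    #PBS -o my_job_name.log
    #PBS -e my_job_name.err

    cd $PBS_O_WORKDIR
    # Hypothetical executable reading the input file on stdin.
    ./my_program < my_job_name.in

The script would then be handed to the batch system with qsub.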

Message Passing Software

MPI (Message Passing Interface)

Widely used in HPC community

Specification is controlled by MPI-Forum

Available for free

PVM (Parallel Virtual Machine)

First message passing protocol to be widely used

Provided support for fault-tolerant operation

MPI Hello World Example

MPI Hello World Example (cont.)
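A minimal sketch of the kind of program shown on these two slides, using only standard MPI calls; it is typically compiled with mpicc and launched with mpirun.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* start the MPI runtime     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

        printf("Hello world from process %d of %d\n", rank, size);

        MPI_Finalize();                        /* shut down MPI             */
        return 0;
    }

Run with, for example, mpirun -np 4 ./hello; each of the four processes prints one line identifying its rank.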

PVM Hello World Example
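A minimal sketch of a PVM master task, assuming the standard pvm3.h API and a companion slave executable named hello_other (the names are illustrative).

    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int cc, tid;
        char buf[100];

        printf("I am task t%x\n", pvm_mytid());

        /* Spawn one copy of the slave program in the virtual machine. */
        cc = pvm_spawn("hello_other", (char **)0, 0, "", 1, &tid);
        if (cc == 1) {
            cc = pvm_recv(-1, -1);                  /* wait for any message */
            pvm_bufinfo(cc, (int *)0, (int *)0, &tid);
            pvm_upkstr(buf);                        /* unpack the greeting  */
            printf("from t%x: %s\n", tid, buf);
        } else {
            printf("could not start hello_other\n");
        }

        pvm_exit();                                 /* leave the virtual machine */
        return 0;
    }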

PVM Hello World Example (cont.)
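And a matching sketch of the slave task, hello_other, which sends a greeting back to the task that spawned it.

    #include <string.h>
    #include <unistd.h>
    #include "pvm3.h"

    int main(void)
    {
        int ptid;
        char buf[100];

        ptid = pvm_parent();           /* task id of the master that spawned us */

        strcpy(buf, "hello, world from ");
        gethostname(buf + strlen(buf), 64);

        pvm_initsend(PvmDataDefault);  /* start a new send buffer */
        pvm_pkstr(buf);                /* pack the greeting       */
        pvm_send(ptid, 1);             /* send it to the master   */

        pvm_exit();
        return 0;
    }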

Interconnection Hardware

Two main choices: technology and topology

Main Technologies

Ethernet, with speeds up to 10 Gbps

InfiniBand, with speeds up to 300 Gbps

Image Source: http://www.sierra-cables.com/Cables/Images/12X-Infiniband-R.jpg

Interconnection Topology

Torus Network

Bus Network

Flat Neighborhood Network

References

[1] Impagliazzo, J., & Lee, J. A. N. (2004). History of Computing in Education. Norwell: Kluwer Academic Publishers.

[2] Pfeiffer, C. (Photographer). (2006, November 25). Cray-1 Deutsches Museum [Web Photo]. Retrieved from http://en.wikipedia.org/wiki/File:Cray-1-deutsches-museum.jpg

[3] Sterling, T. (2002). Beowulf Cluster Computing with Linux. United States of America: Massachusetts Institute of Technology.

[4] Sterling, T. (2002). Beowulf Cluster Computing with Windows. United States of America: Massachusetts Institute of Technology.

[5] Condor High Throughput Computing. (2013, October 24). Retrieved October 27, 2013, from http://research.cs.wisc.edu/htcondor/

References

[6] Beowulf: A Parallel Workstation For Scientific Computation. (1995). Retrieved October 27, 2013, from http://www.phy.duke.edu/~rgb/brahma/Resources/beowulf/papers/ICPP95/icpp95.html

[7] Development over Time | TOP500 Supercomputer Sites. Retrieved October 27, 2013, from www.top500.org/statistics/overtime/

[8] Jain, A. (2006). Beowulf cluster design and setup. Informally published manuscript, Department of Computer Science, Boise State University. Retrieved October 27, 2013, from http://cs.boisestate.edu/~amit/research/beowulf/beowulf-setup.pdf

[9] Zinner, S. (2012). High Performance Computing Using Beowulf Clusters. Retrieved October 27, 2013, from http://www2.hawaii.edu/~zinner/101/students/MitchelBeowulf/cluster.html

Questions???
