
Page 1: Beowulf Clusters

Beowulf Clusters
Matthew Doney

Page 2: Beowulf Clusters

What is a cluster?

A cluster is a group of computers connected so they can work together

There are several different ways of connecting them:

Distributed

Computers widely separated, connected over the internet

Used by research groups like SETI@home and GIMPS

Workstation Cluster

Collection of workstations loosely connected by a LAN

Cluster Farm

PCs connected over a LAN that perform work when idle

Page 3: Beowulf Clusters

What is a Beowulf Cluster?

A Beowulf cluster is one class of cluster computer

Uses Commercial Off The Shelf (COTS) hardware

Typically contains both master and slave nodes

Not defined by a specific piece of hardware

Image Source: http://www.cse.mtu.edu/Common/cluster.jpg

Page 4: Beowulf Clusters

What is a Beowulf Cluster?

The origin of the name “Beowulf”

Main character of Old English poem

Described in the poem – “he has thirty men’s heft of grasp in the gripe of his hand, the bold-in-battle”.

Image Source: http://www.teachingcollegeenglish.com/wp-content/uploads/2011/06/lynd-ward-17-jnanam-dot-net.jpg

Page 5: Beowulf Clusters

Cluster Computer History – 1950s

SAGE, one of the first cluster computers

Developed by IBM for NORAD

Linked radar stations together for first early warning detection system

Image Source: http://www.ieeeghn.org/wiki/images/3/34/Sage_nomination.jpg

Page 6: Beowulf Clusters

Cluster Computer History – 1970s

Technological Advancements

VLSI (Very Large Scale Integration)

Ethernet

UNIX Operating System

Page 7: Beowulf Clusters

Cluster Computer History – 1980s

Increased interest in cluster computing

Ex: NSA connected 160 Apollo workstations in a cluster configuration

First widely used clustering product: VAXcluster

Development of task scheduling software

Condor package developed by UW-Madison

Development of parallel programming software

PVM (Parallel Virtual Machine)

Page 8: Beowulf Clusters

Cluster Computer History – 1990s

NOW (Network of Workstations) project at UC Berkeley

First cluster on TOP500 list

Development of Myrinet LAN system

Beowulf project started at NASA’s Goddard Space Flight Center

Image Source: http://www.cs.berkeley.edu/~pattrsn/Arch/NOW2.jpg

Page 9: Beowulf Clusters

Cluster Computer History - Beowulf

Developed by Thomas Sterling and Donald Becker

16 individual nodes

100 MHz Intel 80486 processors

16 MB memory, 500 MB hard drive

Two 10 Mbps Ethernet ports per node

Early version of Linux

Used PVM library

Page 10: Beowulf Clusters

Cluster Computer History – 1990s

MPI standard developed

Created to be a global standard to replace existing message passing protocols

DOE, NASA, California Institute of Technology collaboration

Developed a Beowulf system with sustained performance of 1 Gflops

Cost $50,000

Awarded Gordon Bell prize for price/performance

28 clusters were on the TOP500 list by the end of the decade

Page 11: Beowulf Clusters

Beowulf Cluster Advantages

Price/Performance

Using COTS hardware greatly reduces associated costs

Scalability

Because the system is built from individual nodes, more can easily be added with only minor changes to the network

Convergence Architecture

Commodity hardware has converged on standard operating systems, instruction sets, and communication protocols

Code portability has greatly increased

Page 12: Beowulf Clusters

Beowulf Cluster Advantages

Flexibility of Configuration and Upgrades

Large variety of COTS components

Standardization of COTS components allows for easy upgrades

Technology Tracking

Can use new components as soon as they come out

No delay time waiting for manufacturers to integrate components

High Availability

System will continue to run if an individual node fails

Page 13: Beowulf Clusters

Beowulf Cluster Advantages

Level of Control

System is easily configured to the user’s liking

Development Cost and Time

No special hardware needs to be designed

Less time spent designing the system; just pick the parts to be used

Cheaper mass market components

Page 14: Beowulf Clusters

Beowulf Cluster Disadvantages

Programming Difficulty

Programs need to be highly parallelized to take advantage of the hardware design

Distributed Memory

Program data is split over the individual nodes

Network speed can bottleneck performance

Results may need to be collected and combined by a single node (a minimal sketch of this follows)
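
A minimal sketch in C with MPI (not from the original deck) illustrating both points: each node computes on its own slice of the data, and one node then gathers the combined result over the network:

    /* Sketch: distributed partial sums combined on a single node. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each node sees only its own share of the work (distributed memory). */
        double partial = 0.0;
        for (int i = rank; i < 1000000; i += size)
            partial += (double)i;

        /* Combining the partial results on rank 0 crosses the network,
           which is where interconnect speed can become the bottleneck. */
        double total = 0.0;
        MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("total = %.0f\n", total);

        MPI_Finalize();
        return 0;
    }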

Page 15: Beowulf Clusters

Beowulf Cluster Architecture

Master-Slave configuration

Master Node

Job scheduling

System monitoring

Resource management

Slave Node

Does assigned work

Communicates with other slave nodes

Sends results back to the master node (a minimal sketch of this pattern follows)
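
A minimal sketch of this master-slave pattern in C with MPI, assuming rank 0 is the master; the work items and the squaring done by the slaves are purely illustrative:

    /* Sketch: master (rank 0) hands out work, slaves compute and reply. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {                  /* master: schedule and collect */
            for (int i = 1; i < size; i++) {
                int work = i * 10;        /* illustrative work item */
                MPI_Send(&work, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            }
            for (int i = 1; i < size; i++) {
                int result;
                MPI_Recv(&result, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("result from node %d: %d\n", i, result);
            }
        } else {                          /* slave: do assigned work */
            int work, result;
            MPI_Recv(&work, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            result = work * work;
            MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }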

Page 16: Beowulf Clusters

Node Hardware

Typically desktop PCs

Can also consist of other types of computers, e.g.:

Rack-mount servers

Case-less motherboards

PS3 consoles

Raspberry Pi boards

Page 17: Beowulf Clusters

Node Software

Operating System

Resource Manager

Message Passing Software

Page 18: Beowulf Clusters

Resource Management Software

Condor

Developed by UW-Madison

Allows distributed job submission

PBS (Portable Batch System)

Initially developed by NASA

Developed to schedule jobs on parallel compute clusters

Maui

Adds enhanced monitoring to an existing job scheduler (e.g., PBS)

Allows the administrator to set individual and group job priorities

Page 19: Beowulf Clusters

Sample Condor Submit File

Submits 150 copies of the program foo

Each copy of the program has its own input, output, and error message file

All of the log information from Condor goes to one file (a reconstruction is shown below)
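
The original listing was not preserved in this transcript. A minimal submit file matching the description might look like the following; the file names are illustrative, and $(Process) expands to 0 through 149:

    # Condor submit file: 150 copies of foo, each with its own I/O files
    universe   = vanilla
    executable = foo
    input      = foo.$(Process).in
    output     = foo.$(Process).out
    error      = foo.$(Process).err
    log        = foo.log
    queue 150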

Page 20: Beowulf Clusters

Sample Maui Configuration File

User yangq has the highest priority; users of the group ART have the lowest

Members of group CS_SE are limited to 20 jobs using no more than 100 nodes (a reconstruction is shown below)
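
The original configuration listing was not preserved; a maui.cfg fragment matching the description might look like this, with the priority values chosen purely for illustration:

    # Maui credential settings: per-user priority and per-group limits
    USERCFG[yangq]   PRIORITY=1000
    GROUPCFG[ART]    PRIORITY=1
    GROUPCFG[CS_SE]  MAXJOB=20 MAXNODE=100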

Page 21: Beowulf Clusters

Sample PBS Submit File

Submits job “my_job_name” that needs 1 hour, 4 CPUs, and 2 GB of memory

Uses file “my_job_name.in” as input

Uses file “my_job_name.log” as output

Uses file “my_job_name.err” as error output (a reconstruction is shown below)
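
The original script was not preserved. A PBS submit script matching the description might look like the following; the executable name is a placeholder, and resource-list syntax varies slightly between PBS versions:

    #!/bin/bash
    #PBS -N my_job_name
    #PBS -l walltime=01:00:00
    #PBS -l nodes=1:ppn=4
    #PBS -l mem=2gb
    #PBS -o my_job_name.log
    #PBS -e my_job_name.err

    cd $PBS_O_WORKDIR
    ./my_program < my_job_name.in    # my_program is a placeholder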

Page 22: Beowulf Clusters

Message Passing Software

MPI (Message Passing Interface)

Widely used in HPC community

Specification is controlled by the MPI Forum

Available for free

PVM (Parallel Virtual Machine)

First message passing library to be widely used

Provided support for fault-tolerant operation

Page 23: Beowulf Clusters

MPI Hello World Example
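
The slide’s code listing was not preserved in this transcript. A standard MPI “Hello World” in C, equivalent to what such a slide typically shows:

    /* Each process reports its rank and the total number of processes. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total process count */

        printf("Hello from process %d of %d\n", rank, size);

        MPI_Finalize();                        /* shut down cleanly */
        return 0;
    }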

Page 24: Beowulf Clusters

MPI Hello World Example (cont.)
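
The continuation slide’s listing was also lost; typical commands to build and run the example above (the launcher name and flags vary between MPI implementations):

    mpicc hello_mpi.c -o hello_mpi    # hello_mpi.c is the listing above
    mpirun -np 4 ./hello_mpi          # start 4 processes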

Page 25: Beowulf Clusters

PVM Hello World Example
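
The PVM listing was likewise not preserved. A reconstruction along the lines of the classic example from the PVM documentation: the master prints its own task id, spawns one copy of the worker program hello_other, and prints the worker’s reply:

    /* Master half of the PVM "Hello World". */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int cc, tid;
        char buf[100];

        printf("i'm t%x\n", pvm_mytid());  /* print this task's id */

        /* Spawn one instance of the worker on any available host. */
        cc = pvm_spawn("hello_other", (char **)0, 0, "", 1, &tid);

        if (cc == 1) {
            cc = pvm_recv(-1, -1);         /* wait for any message */
            pvm_bufinfo(cc, (int *)0, (int *)0, &tid);
            pvm_upkstr(buf);               /* unpack the greeting */
            printf("from t%x: %s\n", tid, buf);
        } else {
            printf("can't start hello_other\n");
        }

        pvm_exit();                        /* leave the virtual machine */
        return 0;
    }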

Page 26: Beowulf Clusters

PVM Hello World Example (cont.)
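
And the matching worker half (hello_other), which packs a greeting containing its hostname and sends it back to the parent task:

    /* Worker half of the PVM "Hello World". */
    #include <string.h>
    #include <unistd.h>
    #include "pvm3.h"

    int main(void)
    {
        int ptid;
        char buf[100];

        ptid = pvm_parent();                 /* task id of the master */

        strcpy(buf, "hello, world from ");
        gethostname(buf + strlen(buf), 64);  /* append this node's name */

        pvm_initsend(PvmDataDefault);        /* prepare a send buffer */
        pvm_pkstr(buf);                      /* pack the string */
        pvm_send(ptid, 1);                   /* send to master, tag 1 */

        pvm_exit();
        return 0;
    }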

Page 27: Beowulf Clusters

Interconnection Hardware

Two main design choices: technology and topology

Main Technologies

Ethernet, with speeds up to 10 Gbps

InfiniBand, with speeds up to 300 Gbps

Image Source: http://www.sierra-cables.com/Cables/Images/12X-Infiniband-R.jpg

Page 28: Beowulf Clusters

Interconnection Topology

Torus Network

Bus Network

Flat Neighborhood Network

Page 29: Beowulf Clusters

References

[1] Impagliazzo, J., & Lee, J. A. N. (2004). History of Computing in Education. Norwell: Kluwer Academic Publishers.

[2] Pfeiffer, C. (Photographer). (2006, November 25). Cray-1 Deutsches Museum [Web Photo]. Retrieved from http://en.wikipedia.org/wiki/File:Cray-1-deutsches-museum.jpg

[3] Sterling, T. (2002). Beowulf Cluster Computing with Linux. Cambridge, MA: MIT Press.

[4] Sterling, T. (2002). Beowulf Cluster Computing with Windows. Cambridge, MA: MIT Press.

[5] Condor High Throughput Computing. (2013, October 24). Retrieved October 27, 2013, from http://research.cs.wisc.edu/htcondor/

Page 30: Beowulf Clusters

References

[6] Beowulf: A Parallel Workstation For Scientific Computation. (1995). Retrieved October 27, 2013, from http://www.phy.duke.edu/~rgb/brahma/Resources/beowulf/papers/ICPP95/icpp95.html

[7] Development over Time | TOP500 Supercomputer Sites. Retrieved October 27, 2013, from http://www.top500.org/statistics/overtime/

[8] Jain, A. (2006). Beowulf cluster design and setup. Informally published manuscript, Department of Computer Science, Boise State University. Retrieved October 27, 2013, from http://cs.boisestate.edu/~amit/research/beowulf/beowulf-setup.pdf

[9] Zinner, S. (2012). High Performance Computing Using Beowulf Clusters. Retrieved October 27, 2013, from http://www2.hawaii.edu/~zinner/101/students/MitchelBeowulf/cluster.html

Page 31: Beowulf Clusters

Questions???