
Page 1: Lecture 4 Cluster Computing

Cluster Computing

Javier Delgado
Grid Enablement of Scientific Applications

Professor S. Masoud Sadjadi


Page 4: Lecture 4 Cluster Computing

Essence of a Beowulf

Hardware
- One head/master node
- (Several) compute nodes
- Interconnection modality (e.g. Ethernet)

Software
- Parallel programming infrastructure
- Scheduler (optional)
- Monitoring application (optional)

Page 5: Lecture 4 Cluster Computing

Scheduling

- Multiple users fighting over resources leads to contention; don't allow them to claim resources directly
- Computer users are greedy; let the system allocate resources
- Users like to know job status without having to keep an open session

Page 6: Lecture 4 Cluster Computing

Cluster Solutions

- Do-it-yourself (DIY)
- OSCAR
- Rocks
- Pelican HPC (formerly Parallel Knoppix)
- Microsoft Windows CCE
- OpenMosix (project closed March 2008)
- Clustermatic (no activity since 2005)

Page 7: Lecture 4 Cluster Computing

DIY Cluster

Advantages
- Control
- Learning experience

Disadvantages
- Control
- Administration

Page 8: Lecture 4 Cluster Computing

DIY-Cluster How-To Outline

- Hardware requirements
- Head node deployment
  - Core software requirements
  - Cluster-specific software configuration
- Adding compute nodes

Page 9: Lecture 4 Cluster Computing

Hardware Requirements

- Several commodity computers, each with:
  - CPU/motherboard
  - Memory
  - Ethernet card
  - Hard drive (recommended in most cases)
- Network switch
- Cables, etc.

Page 10: Lecture 4 Cluster Computing

Software Requirements – Head Node

- Core system: system logger, core utilities, mail, etc.
- Linux kernel
  - Network File System (NFS) server support
- Additional packages:
  - Secure Shell (SSH) server
  - iptables (firewall)
  - nfs-utils
  - portmap
  - Network Time Protocol (NTP)
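On an RPM-based head node these extra packages can usually be pulled in with the package manager; a rough sketch (package and service names vary by distribution and release, so treat them as placeholders):

    # install SSH server, NFS utilities, portmapper, firewall, and NTP
    yum install openssh-server nfs-utils portmap iptables ntp
    # make the services start at boot
    chkconfig sshd on
    chkconfig nfs on
    chkconfig portmap on
    chkconfig ntpd on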

Page 11: Lecture 4 Cluster Computing

Software Requirements – Head Node

Additional packages (cont.):
- inetd/xinetd – for FTP, Globus, etc.
- Message Passing Interface (MPI) package
- Scheduler – PBS, SGE, Condor, etc.
- Ganglia – simplified cluster "health" logging (dependency: Apache web server)
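To give a feel for what the scheduler layer buys you, here is a minimal PBS batch script; the job name, resource request, and application binary are invented for illustration:

    #!/bin/bash
    #PBS -N sample_job           # job name
    #PBS -l nodes=2:ppn=4        # request 2 nodes, 4 processors per node
    #PBS -l walltime=01:00:00    # 1-hour wall-clock limit
    #PBS -j oe                   # merge stdout and stderr
    cd $PBS_O_WORKDIR            # run from the submission directory
    mpirun -np 8 ./my_mpi_app    # launch the (hypothetical) MPI application

The script is handed to the scheduler with qsub and its status checked later with qstat, so no open session is needed, which is exactly the point made on the Scheduling slide.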

Page 12: Lecture 4 Cluster Computing

Initial Configuration

- Share the /home directory
- Configure firewall rules
- Configure networking
- Configure SSH
- Create the compute node image
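For example, sharing /home over NFS from the head node usually comes down to an export entry plus restarting the NFS service; a sketch, assuming a 192.168.1.0/24 private cluster network:

    # /etc/exports on the head node
    /home 192.168.1.0/24(rw,sync,no_root_squash)

    # re-export and restart the NFS service
    exportfs -ra
    service nfs restart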

Page 13: Lecture 4 Cluster Computing

Building the Cluster

- Install the compute node image on each compute node:
  - Manually
  - PXE boot (pxelinux, etherboot, etc.)
  - Red Hat Kickstart
  - etc.
- Configure host name, NFS, etc. ... for each node!
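The per-node work is small but repetitive; on each compute node it is roughly the following (host names are placeholders, and the hostname should also be set persistently in the distribution's network configuration):

    # give the node its identity for the current session
    hostname compute-0-1
    # mount the /home shared by the head node (add to /etc/fstab to make it permanent)
    mount -t nfs headnode:/home /home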

Page 14: Lecture 4 Cluster Computing

Maintenance

- Software updates on the head node require corresponding updates on the compute nodes
- Failed nodes must be temporarily removed from the head node's configuration files

Page 15: Lecture 4 Cluster Computing

Building the Cluster

But what if my boss wants a 200-node cluster?
- Monster.com, OR
- come up with your own automation scheme, OR
- use OSCAR or Rocks

Page 16: Lecture 4 Cluster Computing

Cluster Solutions

- Do-it-yourself (DIY)
- OSCAR
- Rocks
- Pelican HPC (formerly Parallel Knoppix)
- Microsoft Windows CCE
- OpenMosix (project closed March 2008)
- Clustermatic (no activity since 2005)

Page 17: Lecture 4 Cluster Computing

OSCAR

- Open Source Cluster Application Resources
- Fully integrated software bundle to ease deployment and management of a cluster
- Provides:
  - Management wizard
  - Command-line tools
  - System Installation Suite

Page 18: Lecture 4 Cluster Computing

Overview of Process

1. Install an OSCAR-approved Linux distribution
2. Install the OSCAR distribution
3. Create node image(s)
4. Add nodes
5. Start computing

Page 19: Lecture 4 Cluster Computing

OSCAR Management Wizard

- Download, install, and remove OSCAR packages
- Build a cluster image
- Add/remove cluster nodes
- Configure networking
- Reimage or test a node with the Network Boot Manager

Page 20: Lecture 4 Cluster Computing

OSCAR Command-Line Tools

- Everything the wizard offers
- yume – update node packages
- C3 – the Cluster Command Control tools
  - Provide cluster-wide versions of common commands
  - Concurrent execution:
    - Example 1: copy a file from the head node to all visualization nodes
    - Example 2: execute a script on all compute nodes (see the sketch below)
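A rough sketch of those two operations with the C3 tools (the file and script paths are invented for illustration; by default the commands act on the cluster defined in /etc/c3.conf, and a node range can be given to target a subset such as the visualization nodes):

    # example 1: push a file from the head node to the cluster nodes
    cpush /opt/config/settings.conf /opt/config/settings.conf

    # example 2: run a script on all compute nodes
    cexec /opt/scripts/cleanup.sh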

Page 21: Lecture 4 Cluster Computing

C3 List of Commands

- cexec: execute any standard command on all cluster nodes
- ckill: terminate a user-specified process
- cget: retrieve files or directories from all cluster nodes
- cpush: distribute files or directories to all cluster nodes
- cpushimage: update the system image on all cluster nodes using an image captured by the SystemImager tool

Page 22: Lecture 4 Cluster Computing

List of Commands (cont.)

- crm: remove files or directories
- cshutdown: shut down or restart all cluster nodes
- cnum: return a node range number based on a node name
- cname: return node names based on node ranges
- clist: return all clusters and their type in a configuration file

Page 23: Lecture 4 Cluster Computing

Example c3 Configuration

# /etc/c3.conf
#
# describes cluster configuration
#
cluster gcb {
    gcb.fiu.edu         # head node
    dead placeholder    # change command line to 1 indexing
    compute-0-[0-8]     # first set of nodes
    exclude 5           # offline node in the range (killed by J. Figueroa)
}

Page 24: Lecture 4 Cluster Computing

OPIUM

The OSCAR Password Installer and User Management tool:
- Synchronize user accounts
- Set up passwordless SSH
- Periodically check for changes in passwords

Page 25: Lecture 4 Cluster Computing

SIS

System Installation Suite
- Installs Linux systems over a network
- Image-based
- Allows different images for different nodes
- Nodes can be booted from network, floppy, or CD

Page 26: Lecture 4 Cluster Computing

Cluster Solutions

- Do-it-yourself (DIY)
- OSCAR
- Rocks
- Pelican HPC (formerly Parallel Knoppix)
- Microsoft Windows CCE
- OpenMosix (project closed March 2008)
- Clustermatic (no activity since 2005)

Page 27: Lecture 4 Cluster Computing

Rocks

Disadvantages
- Tight coupling of software
- Highly automated

Advantages
- Highly automated... but also flexible

Page 28: Lecture 4 Cluster Computing

Rocks

The following 25 slides are property of the UC Regents.

(Pages 29-44: Rocks overview slides, property of the UC Regents; screenshots only, no text transcribed)

Page 45: Lecture 4 Cluster Computing


Determine number of nodes

(Pages 46-51: Rocks slides, property of the UC Regents; screenshots only, no text transcribed)

Page 52: Lecture 4 Cluster Computing


Rocks Installation Simulation

Slides courtesy of David Villegas and Dany Guevara

(Pages 53-67: installation simulation screenshots; no text transcribed)

Page 68: Lecture 4 Cluster Computing

Installation of Compute Nodes

- Log into the frontend node as root
- At the command line, run:
    > insert-ethers

(Pages 69-70: installation simulation screenshots; no text transcribed)

Page 71: Lecture 4 Cluster Computing

Installation of Compute Nodes

- Turn on the compute node
- Select PXE boot, or insert the Rocks CD and boot off of it

(Pages 72-74: installation simulation screenshots; no text transcribed)

Page 75: Lecture 4 Cluster Computing

Cluster Administration

- Command-line tools
- Image generation
- Cluster troubleshooting
- User management

Page 76: Lecture 4 Cluster Computing

Command-Line Tools

- cluster-fork – execute a command on all nodes (serially)
- cluster-kill – kill a process on all nodes
- cluster-probe – get information about cluster status
- cluster-ps – query nodes for a running process by name
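As a quick illustration, checking load or hunting for a process across all nodes looks like this (the process name is just an example):

    # run a command on every compute node, one node at a time
    cluster-fork uptime

    # look for a process by name on all nodes
    cluster-ps rogue_app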

Page 77: Lecture 4 Cluster Computing

Image Generation

- Basis: Red Hat Kickstart file, plus XML for flexibility and dynamic content (i.e. support for "macros")
- Image location: /export/home/install
- Customization: rolls and extend-compute.xml
- Command: rocks-dist
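After editing extend-compute.xml, the distribution that compute nodes install from is rebuilt from the install directory; roughly (the exact invocation differs between Rocks versions, so treat this as a sketch):

    cd /export/home/install
    rocks-dist dist   # rebuild the distribution tree served to compute nodes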

Page 78: Lecture 4 Cluster Computing


Image Generation

Source: http://www.rocksclusters.org/rocksapalooza/2007/dev-session1.pdf

Page 79: Lecture 4 Cluster Computing

Example

Goal: make a regular node a visualization node

Procedure:
1. Figure out what packages to install
2. Determine what configuration files to modify
3. Modify extend-compute.xml accordingly (see the sketch below)
4. (Re-)deploy the nodes
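A sketch of what such an extend-compute.xml might contain, assuming the Rocks kickstart-XML layout; the package names, file paths, and the xorg.conf copy are placeholders rather than a tested recipe:

    <?xml version="1.0" standalone="no"?>
    <kickstart>
      <!-- extra packages for a visualization node (names are illustrative) -->
      <package>xorg-x11-server-Xorg</package>
      <package>xdmx</package>
      <post>
    # commands run on the node after installation, e.g. install a custom xorg.conf
    cp /share/apps/config/xorg.conf /etc/X11/xorg.conf
      </post>
    </kickstart>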

Page 80: Lecture 4 Cluster Computing

Figure Out Packages

- X Window System related: X, fonts, display manager
- Display wall: XDMX, Chromium, SAGE

Page 81: Lecture 4 Cluster Computing

Modify Config Files

- X configuration: xorg.conf, xinitrc
- Display manager configuration

Page 82: Lecture 4 Cluster Computing

User Management

- Rocks directory: /var/411
- Common configuration files distributed through it:
  - autofs-related files
  - /etc/group, /etc/passwd, /etc/shadow
  - /etc/services, /etc/rpc
- All encrypted
- Helper command: rocks-user-sync
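Adding a user then looks much like on a single machine, followed by a sync so the 411 files propagate to the compute nodes (the user name is a placeholder; later Rocks releases replace the helper with "rocks sync users"):

    useradd jdoe        # create the account on the frontend
    passwd jdoe         # set its password
    rocks-user-sync     # push the updated 411 files out to the compute nodes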

Page 83: Lecture 4 Cluster Computing

Start Computing

- Rocks is now installed
- Choose an MPI runtime: MPICH, OpenMPI, or LAM-MPI
- Start compiling and executing (see the example below)
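A minimal end-to-end check, assuming one of the MPI runtimes above is on the PATH (the machine file name and process count are only an illustration):

    /* hello_mpi.c - each rank reports itself */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

Compile and run it across the cluster:

    mpicc hello_mpi.c -o hello_mpi
    mpirun -np 8 -machinefile machines ./hello_mpi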

Page 84: Lecture 4 Cluster Computing

Pelican HPC

LiveCD for instant cluster creation

Advantages
- Easy to use
- A lot of built-in software

Disadvantages
- Not persistent
- Difficult to add software

Page 85: Lecture 4 Cluster Computing

Microsoft Solutions

Windows Server 2003 Compute Cluster Edition (CCE)
- Microsoft Compute Cluster Pack (CCP)
- Microsoft MPI (based on MPICH2)
- Microsoft Scheduler

Page 86: Lecture 4 Cluster Computing

Microsoft CCE

Advantages
- Using Remote Installation Services (RIS), compute nodes can be added simply by turning them on
- May be better for those familiar with the Microsoft environment

Disadvantages
- Expensive
- Only for 64-bit architectures
- Proprietary
- Limited application base

Page 87: Lecture 4 Cluster Computing

References

- http://pareto.uab.es/mcreel/PelicanHPC/
- http://pareto.uab.es/mcreel/ParallelKnoppix/
- http://www.gentoo.org/doc/en/hpc-howto.xml
- http://www.clustermatic.org
- http://www.microsoft.com/windowsserver2003/ccs/default.aspx
- http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/ref-guide/ch-nfs.html
- portmap man page
- http://www.rocksclusters.org/rocksapalooza
- http://www.gentoo.org/doc/en/diskless-howto.xml
- http://www.openclustergroup.org