scientiﬁc programming 13 (2005) 173–175 ios press book...

Scientific Programming 13 (2005) 173–175 173IOS Press

Book review

High Performance Linux Clusters by A. Joseph D.Sloan, O’Reilly & Associates Inc., Sebastopol, CA,USA, 2004. ISBN: 0-596-00570-9

High Performance Linux Clusters describes how toprovide substantial computational resources for large-scale computing. The approach is to use low-cost orfreely available components arranged as a computa-tional cluster. The book has 350 pages and includes aPreface, an Index, and five Parts divided into 17 Chap-ters and an Appendix. The Parts introduce the ideaof cluster computing, provide a quick-start overview,describe how to configure clusters for particular needs,and tell how to write message-passing applications.The fifth Part is the Appendix, which includes copiousreferences, both in-print and on-the-network.

The book’s goal provides guidance when buildingthe most effective distributed computing resource giventight budgetary constraints. These days, this means us-ing hardware based on Intel chips (or clones), Ethernet,and software based on the Linux operating system. Ananalogy is a couple just starting a family, whose budgetrequires the investment of “sweat equity” in order toown a dream house. So, this book’s goal is to help thescientist, engineer or economist whose program takestoo long to run on the available computer, and who hasonly a limited budget to invest. The book’s advice:invest “sweat equity” to achieve the goal of reducingthe time-to-solution via building a cluster. So how tobuild a cluster? This book provides check lists to assistwith both the planning and the building of a cluster.

The book covers hardware and network issues, butthe main emphasis is a step-by-step guide to selecting,obtaining, configuring, installing, and finally, using, thesoftware which makes a LAN into a cluster. The bookpromises to guide the reader through the whole process.It won’t delve deeply into any particular step but ratherprovides a strategic overview of the process. Follow theplan of the book, so it promises, and the result will be aworking, useful cluster. Supercomputing-on-a-budgetmay be had for those who merely have the need, butnot the resources, for jobs that would otherwise require“big iron”.

The Preface introduces the plan of the book. The planis to use freely available software (alas, the hardwaremust be found somehow). The software should bedownloaded from the Internet. The author has donethe reconnoitering, and has found out which softwareto get. The hardware may be recycled computers, ormight be purchased new. How to keep the costs low iscertainly discussed. Then one needs the applicationssoftware, which probably means doing some actualprogramming by using a message-passing library.

Chapter One starts with the question, what is a clus-ter? If a department has its computers on a Local AreaNetwork, is that already a cluster? Sadly, no, a LAN isnot a cluster. There’s a further layer of logical connec-tivity needed to go from LAN to cluster. That transitionis what this book describes. Don’t think that “grid”is the latest buzzword for cluster; grids have a layerof abstraction beyond that present with clusters. Withgrids, the user is not expected to know the name oraddress of the eventual resource provider – the namesare replaced by descriptions, the services are “virtual-ized” by a layer of software providing an abstractionbeyond clustering. With cluster computing, the userswill know the names of the computers providing theservices. The limitations of parallel computing, specif-ically expected speed-ups and the expected scaling asproblem size increases are discussed.

Chapter Two prompts one to state clearly the goalsfor building a cluster. What question is one trying toanswer and how is the cluster expected to help get theanswer? I especially like this chapter. It prompts theasking of a series of questions pertinent to decidingon how to proceed. The first step in any engineeringproject (and building a cluster certainly qualifies as en-gineering!) is to state clear goals. The author examinesthe likely implications of the hardware and softwarechoices available. The idea of cluster kits is presented,both download based and CD-ROM based. Next, thereis a discussion of benchmarking the initial results toestablish a baseline in anticipation of measuring theresults of software configuration changes or hardwareadditions.

ISSN 1058-9244/05/$17.00 © 2005 – IOS Press and the authors. All rights reserved

174 Book review

Chapter Three discusses the variables in the deci-sions regarding hardware. Hand-me-down hardware ischeaper, but may constrain the capability of the result-ing cluster. So how to reduce the costs of new hard-ware? The compute nodes will not continually needkeyboards and monitors, so one can eliminate them.But what to do if the BIOS will not boot without akeyboard? The author discusses solutions, in this casea switch to connect one keyboard-and-monitor pair toany of several computers. And how to build the net-work? Will good old Ethernet suffice? It’s cheaperthan a dedicated high-performance interconnect, andsuitable for many purposes.

Chapter Four discusses the variables in the decisionsthat are to be taken regarding the software. The au-thor’s bias in favor of Open Source software appears tobe well placed. Everything needed for fully function-ing clusters can be downloaded for free on the Inter-net. Indeed, one question facing the cluster maker isthe choice of alternatives. Linux itself and the atten-dant configuration and security issues are the subjecthere. For example, the trades involved with havinga single front-end connection to outside networks arediscussed. A dedicated front-end allows much easiercommunications within the cluster itself. For example,some packages used are easier, or are only possible, toconfigure if the secure shell is present.

Chapter Five discusses openMosix. Many virtualmemory systems have a set of addresses for the userspace of a process, and a distinct set for the systemaddresses of a process. openMosix is a set of patches tothe kernel which allows the user addresses to be movedto a different node within the cluster. System calls arestill processed on the originatingnode. One difficulty isthat there is little control over which process is executedon which node. The openMosix system is supposedto balance the load automatically. Nevertheless, asprocesses make system calls, there is communicationfrom the node to which user addresses have been movedto the system space on the originating node.

The hardware choice is almost certainly Intel, or aclone. The software choice is Linux, perhaps Red Hatalthough other Linux sources will do. So the biggestdecision remaining is how to establish the cluster soft-ware itself. The author presents this as a choice be-tween OSCAR and ROCKS. These are two schemesfor selecting and installing cluster software. ROCKSincludes the installation of Red Hat Linux, OSCARdoes not.

Chapter Six discusses OSCAR. OSCAR is the OpenSource Cluster Application Resources package. Basi-

cally, OSCAR provides the administrator with a graphicinterface to guide the choice of included software. Itdownloads the individual packages and installs them.It ensures that they have mutually compatible, and use-ful if not optimal, configurations. The author remarksthat OSCAR delays, rather than permanently reduces,the need to eventually learn cluster administration. Thekey to OSCAR is the opd, the OSCAR Package Down-loader. This is the software which brings the packagesto be included on the cluster to the system. OSCARis supported by the Open Cluster Group, which in turnis supported by Dell, IBM, Intel, NCSA, and ORNL.The chapter describes the installation and configurationprocess.

Chapter Seven discusses ROCKS. ROCKS is fromNPACI. The most significant difference betweenROCKS and OSCAR is that ROCKS installs Red HatLinux automatically, so the administrator doesn’t haveto prepare the nodes. ROCKS allows additional soft-ware to be selected in the form of rolls (get it? ROCKSand rolls). One of the rolls is the Intel compilers,C/C++ and Fortran. A license is required from Intel touse them. Again, the chapter describes the installationand configuration process.

Chapter Eight discusses cloning systems, that is, au-tomating the process of duplicating system software.Note that a system cannot be duplicated completely inthat it would give the clone the same network param-eters as the original system. So the cloning has to be“almost” with just enough differences to allow the net-work to work. Use of Kickstart. g4u (ghost for Unix),and SystemImager are discussed.

Chapter Nine discusses programming software. Thediscussion of programming languages is the only sec-tion of the book I found not to be helpful. First, Fortran(note the lowercase) means Fortran 2003. The GnuCompiler Collection has a Fortran 95 component, to bemerged into the GCC main distribution with the 4.0.0release. (I have both the g95 fork and the mainline gfor-tran versions of Gnu Fortran on my Linux computertoday.) Even if considering only free compilers, a dis-cussion of Fortran 77 is as out-of-date as a discussionof K&R C.

The most unhelpful suggestion is that the program-mer learns a new language while simultaneously learn-ing message passing! However, I feel the bigger issuehas been missed. This is, after all, High-PerformanceClusters. The missing issue is that of Gnu CompilerCollection (GCC) compilers versus commercial com-pilers. A homogeneous cluster needs only a single-seatfor the compiler, perhaps on the front-end node, or on

Book review 175

a dedicated compile server-node. An earlier chaptermentioned the Intel compilers, which generate more ef-ficient code on Intel hardware than the GCC compilers,as commercial compilers do generally. If one obtains amodest increase in execution rate, a commercial com-piler is well worth the cost. If one can qualify for theIntel non-commercial license, the Intel compilers arefree (but read Intel’s definition of “non-commercial”carefully, it’s more strict than you might think!). Inany case, the Academic prices of several compiler ven-dors are modest. This means one may have high per-formance compilers for both C/C++ and Fortran forless than the costs of a single node. No matter howfond one is of the GCC, or of free software in general,getting several nodes worth of execution for the costsof a single node must be a good buy. The discussionof high performance computing will be helped by areference to Dowd & Severance’s excellent book HighPerformance Computing; unfortunately out-of-print,from the same publisher.

But that’s enough, let’s move forward. Some ver-sion of MPI will most likely be the message-passinglibrary used on a cluster. The next chapter describes theinstallation of LAM/MPI and MPICH. The discussionthen progresses to debuggers. The section on HDF5states that there is no Fortran binding, which is wrong.Simply go to the website mentioned in the book to getthe Fortran binding. I’m especially pleased to see thesection on SPRNG, parallel random-numbergeneratorsmay well be needed with cluster computing. It’s goodto see that the needs of these users are not overlooked.

Chapter Ten discusses software for managing a clus-ter. Now that there are many nodes to maintain, andthe administrator must repeat every maintenance itemon each node. That is merely a tedious task (perhapsto be delegated to a student) if there are only 8 nodesor 16 nodes. When there are hundreds of nodes, it’s animpossible task. So there’s software to automate dis-tribution of configuration files throughout the cluster.This chapter discusses the C3 and Ganglia packages.

Now that the cluster is a well-administered,users willwant to run programs on it. Unless the supported usercommunity is so small that there’s little chance of onejob interfering with another there must be schedulingsoftware. A small group of supported users will bene-fit from restart, monitoring and notification services, alarger group will demand these services. The subject

of Chapter Eleven is OpenPBS and the Maui scheduler.After actual jobs start running, input/output becomes

a concern. So Chapter Twelve discusses parallel filesystems. The emphasis is on PVFS, the Parallel VirtualFile System.

Part IV, which comprises Chapters 13, 14, 15, 16,17, is an all too brief introduction to MPI-1. A begin-ner needs a more thorough discussion. But the basicsof message passing are introduced, along with debug-ging, timing and profiling. For more than just gettingstarted, one will want to read further. Fortunately, sev-eral good books are available; some are mentioned inthe Appendix.

Now the trip from plans to computing is complete.As Eisenhower said, “Plans are worthless, but planningis essential.” Does this book have the answer to allquestions? No, of course not and it doesn’t make thatclaim. But it does provide a strategic plan, it tells onewhat questions to ask, and where to look for answers.The cluster administrator will want to read the instruc-tions accompanyingeach package anyway; as packagesare being updated. The instructions that a colleaguehas used recently might not apply to the version down-loaded today. What this book does best is to organizethe questions, so the prospective cluster administratormay be able to anticipate the answers with some pre-science. While the plans may or may not be worthless,the planning remains essential.

If I had to assemble a cluster for a research group, Iwould certainly want this book at hand, before, during,and after the effort. This book will help with the plan-ning of a cluster, and with the actual assembling of it.I would use this book to write check lists for each step,starting with an overall check list. Next I would followeach chapter to develop a more detailed check list toimplement each step of the overall check list. The bookis small enough for other individuals on the team toread it. Thus informed, they will be able to make help-ful contributions to the process, each individual mayanticipate to the needs of the jobs. With this book, anindividual or team may be able to quickly converge onthe best use of limited resources in the form of clustercomputing. And that was the goal.

Dan NaglePurple Sage Software Inc.

USA

Submit your manuscripts athttp://www.hindawi.com

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014


Distributed Sensor Networks


Advances in

FuzzySystems

Hindawi Publishing Corporationhttp://www.hindawi.com

Volume 2014


ReconfigurableComputing

Hindawi Publishing Corporation http://www.hindawi.com Volume 2014


Applied Computational Intelligence and Soft Computing

Advances in

Artificial Intelligence


Advances inSoftware EngineeringHindawi Publishing Corporationhttp://www.hindawi.com Volume 2014


Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications


Hindawi Publishing Corporation

http://www.hindawi.com Volume 2014

Advances in

Multimedia


Biomedical Imaging


ArtificialNeural Systems

Advances in


RoboticsJournal of



Computational Intelligence and Neuroscience

Industrial EngineeringJournal of


Modelling & Simulation in EngineeringHindawi Publishing Corporation http://www.hindawi.com Volume 2014

The Scientific World JournalHindawi Publishing Corporation http://www.hindawi.com Volume 2014


Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in


scientiﬁc programming 13 (2005) 173–175 ios press book...

Documents