
conference reports

LISA ’04: 18th Large Installation System Administration Conference

Atlanta, Georgia
November 14–19, 2004

SPECIAL AWARDS

Doug Hughes was the recipient of the first Chuck Yerkes Award for Outstanding Individual Contribution on Member Forums, and Brent Chapman was the recipient of the SAGE Outstanding Achievement Award.

CONFERENCE SUMMARIES

Configuration Management Workshop

Paul Anderson, University of Edinburgh

Summarized by John Hawkins

This year’s configuration workshop was attended by 27 people, from a range of academic and commercial organizations with often widely differing requirements for their configuration management tools.

In the three years since the first workshop, there has been greater recognition that the configuration problem extends beyond configuring single machines toward methods of managing collections of nodes, often in a decentralized manner or by a devolved management team.

This workshop took a slightly different form from previous ones. Each of the four sessions followed a different theme, concentrating on a separate area of the problem. The traditional presentations of attendees’ tools were absent this time, helping to avoid the “tool wars” of previous workshops. It is now widely agreed that no current tool adequately solves all the problems in configuration management, and that collaboration between researchers in the field and a refocusing on the principles behind tool design, rather than the refinement of any one tool, are required for future progress.

As usual, a range of polls were taken. The majority of attendees regard themselves as tool developers, but a much smaller number have written tools that are also used by others. Many manage either high-performance clusters or Windows machines, and well over half indicated a strong interest in the theoretical research issues in system configuration, in addition to the practical concerns of tool development.

Some attendees felt that those within the configuration community are approaching a consensus, but others thought there is some way to go yet. The problems involved are only beginning to be specified in words whose meanings are agreed on, and there is still much duplication of work.

Attempts were made to define a number of terms in common usage but with slightly ambiguous or overloaded meanings. “Policy” was suggested to be a description of what a machine is intended to do, including intended collaborations as well as configurations. “Service Level Agreements” (SLAs) were defined as relationships promising service within specific tolerances.

Effort is required on the specification of intermachine relationships. A tool may correctly configure a machine in isolation, but more complex ways of capturing the details of a service specification are needed to ensure that intermachine relationships hold, thus producing the desired service behaviors.


Our hearty thanks go:

To the LISA ’04 scribe coordinator: John Sechrest

To the LISA ’04 summarizers: Tristan Brown, Rebecca Camus, Andrew Echols, John Hawkins, Jimmy Kaplowitz, Peter H. Salus, John Sechrest, Josh Simon, Gopalan Sivathanu, and Josh Whitlock

And to the EuroBSDCon 2004 summarizer: Jan Schaumann


SLAs cross a dividing line, since they require active monitoring. While it is essential that this monitoring be integrated into the tool, it should be part of a layer distinct from the specification language currently used.

As soon as dynamic properties are introduced, much of the certainty that previously existed is lost, and it will no longer be possible to tell what’s true or false at any particular time, thus moving into the realm of probabilities. This is a fundamental problem that must be lived with. It was suggested that this uncertainty has at least two dimensions, time and value uncertainty.

The possibility of a set of standards for configuration was discussed, with the POSIX standard for UNIX as an analogy. The use of a low-level API for configuration was also suggested.

Mark Burgess led a session on decentralized configuration, illustrated by Ed Smith’s simulation of a decentralized service manager capable of reconfiguring in response to node failure.

A move away from centrally managed systems is required to cope both with problems of scale and with vulnerability to central points of failure that are becoming problematic for large sites. However, this move brings with it reductions in predictability, trust, and relative simplicity of management. Decentralized management allows autonomy where some configuration decisions are made by the system by use of protocols, including those for negotiation and service discovery. To govern this autonomous behavior, the system must have in place an awareness of its environment that is unnecessary under central management and policy control.

There was some discussion of pervasive computing, the management of large numbers of small devices, but this is currently an open problem, and it may be too early to consider it in any detail since the range of requirements of such systems is not yet fully understood.

Luke Kanies and Alva Couch talked of the difficulty of achieving more widespread adoption of configuration management tools. Barriers to adoption include the hostility of system administrators used to the current ways of working, the complex task of respecifying the configuration of the site under the new tool, and unhelpful management attitudes not aided by poor cost models and lack of trust in often inadequately proven systems.

Alva’s presentation described how the cost of configuration management goes through four phases, each phase reaching a point where the cost rapidly increases and a more sophisticated approach to configuration management must be adopted. Most sites have reached the point at the end of the second phase, where the configuration is managed “incrementally,” for example with cfengine, but encounter problems with hidden preconditions requiring bare-metal rebuilds. They are not aware of how to make the transition to a “proscriptive” management strategy.

Site managers need to know at what point it becomes more economical to adopt a heavyweight tool such as LCFG, with huge initial costs in setup and staff training but more robust results in handling large numbers of machines.

The problem of loss of institutional memory between the “incremental” and “proscriptive” phases would be alleviated if data mining techniques could be used to pull out a large proportion of the current configuration and convert it to configuration data for the new system. There was some discussion as to whether this is intractable.

A session on case studies with Steven Jenkins and John Hawkins attempted to increase the number of specific examples of configuration problems tool users are actually grappling with. Questionnaires were distributed, and although these have yet to be analyzed at the time of writing, the response was impressive, and it is hoped that this exercise will prove fruitful.

On devolved aspect resolution, it was suggested that the system should follow the human process of resolution. Political lines are important and should be reflected by the machine aspect resolution. An expert system could be utilized to predict the impact of the choice of value.

Suggestions about where research should focus from this point included the formalization and documentation of the collective knowledge so far, provision of limited user control, description of conflicts within configurations and procedures for their resolution, mechanisms for configuration transactions, and the identification of a wider range of case-study examples with a variety of candidate tools on which to try them.

This is likely to remain an active area for some time, but there was a general feeling of optimism at the workshop that solutions to many of the problems are reachable.

Sysadmin Education Workshop

John Sechrest, PEAK Internet Services; Curt Freeland, University of Notre Dame

Summarized by John Sechrest

The system administration education workshop addresses the process of system administration education at the university level. This was the seventh year of the workshop. Previous materials for system administration course content were discussed, and an earlier curriculum showing how many different courses might fit together into a degree program was reviewed.


A new Web site (http://education.sage.org) was unveiled as a starting point for information collection, and an online content management tool called Drupal was explored (http://education.sage.org/drupal).

The goal of the Web site is to enable more collaboration and cooperation between groups working on university sysadmin education.

Over the last two years, there has been a reduction in the number of universities that support system administration courses. The changes in the economy are being felt in the universities.

There was a substantial discussion of just-in-time learning materials for university training of existing system administration staff and how these materials might be shared in the context of other courses. There was also discussion of how online learning modules might be useful.

The system administration education mailing list is now [email protected].

Advanced Topics Workshop

Adam S. Moskowitz, Menlo Computing

Summarized by Josh Simon

The Advanced Topics Workshop started with a quick overview of the new moderation software and continued with introductions around the room—businesses (including consultants) outnumbered universities by about 2 to 1, and the room included three former LISA program chairs and six former members of the USENIX Board or SAGE Executive Committee.

Our first topic was introducing the concepts of disciplined infrastructure to people (i.e., it’s more than just cfengine or isconf, or infrastructure advocacy, or getting rid of the ad hoc aspects). Some environments have solved this problem at varying levels of scale; others have the fear of change paralyzing the system administration staff. One idea is to offload the “easy” tasks either to automation (while avoiding the “one-off” problem and being careful with naming standards) or to more junior staff so that senior staff can spend their time on more interesting things. Management buy-in is essential; exposing all concerned to LISA papers and books in the field has helped in some environments. Like many of our problems, this is a sociological one and not just a technical one. Remember that what works on systems (e.g., UNIX and Windows boxes) may not work for networks (e.g., routers and switches), which may be a challenge for some of us. We also noted that understanding infrastructures and scalability is very important, regardless of whether you’re in systems, networks, or development. Similarly important is remembering two things: First, ego is not relevant, code isn’t perfect, and a developer’s ego does not belong in the code. Second, the perfect is the enemy of the good; sometimes you have to acknowledge there are bugs and release it anyway.

After the morning break, we discussed self-service, where sysadmin tasks are traditionally handed off (ideally in a secure manner) to users. Ignoring for the moment special considerations (like HIPAA and SOX), what can we do about self-service? A lot of folks are using a number of Web forms or automated emails, including the business process (e.g., approvals), not just the request itself. One concern is to make sure the process is well defined (all edge cases and contingencies planned for). We’ve also got people doing user education (“we’ve done the work, but if you want to do it yourself the command is . . .”). Constraining possibilities to allow only the right thing, not the wrong thing, is a big win here.

Next we discussed metrics. Some managers believe you have to measure something before you can control it. What does this mean? Well, there are metrics for service goals (availability and reliability are the big two), in-person meetings for when levels aren’t met, and so on. Do the metrics help the SAs at all, or just management? They can help the SAs identify a flaw in procedures or infrastructure, or show an area for improvement (such as new hardware purchases or upgrades). We want to stress that you can’t measure what you can’t describe. Do any metrics other than “customer satisfaction” really matter? Measure what people want to know about or are complaining about; don’t just measure everything and try to figure out what’s wrong from the (reams of) data. Also, measuring how quickly a ticket got closed is meaningless: Was the problem resolved, or was the ticket merely closed? Was the ticket reopened? Was it reopened because of a failure in work we did, or because the user had a similar problem and didn’t open a new ticket? What’s the purpose of the metrics? Are we adding people or laying them off? Quantifying the behavior of systems is easy; quantifying the behavior of people (which is the real problem here) is very hard. But tailor the result in the language of the audience, not just numbers. Most metrics that are managed and monitored centrally have no meaningful value; metrics cannot stand alone, but need context to be meaningful. Some problems have technical solutions, but metrics are not one of them. What about trending? How often and how long do you have to measure something before it becomes relevant? Not all metrics are immediate.

After a little bit of network troubleshooting (someone’s Windows XP box was probing port 445 on every IP address in the network from the ATW), we next discussed virtualized commodities such as User Mode Linux. Virtual machines have their uses—for research, for subdividing machines, for providing easily wiped generic systems for firewalls or DMZ’d servers where you worry about them being hacked, and so on. There are still risks, though, with reliance on a single point of failure (the hardware machine) theoretically impacting multiple services on multiple (virtual) machines.

Next we discussed how to get the most out of wikis as internal tools. What’s out there better than TWiki? We want authentication out of LDAP/AD/Kerberos, among other things. The conference used PurpleWiki, which seems to be more usable. There’s a lot of pushback until there’s familiarity. Wikis are designed for some specific things, but not everything. You need to be able to pause and refactor discussions if you use one as, for example, an email-to-wiki gateway. (There is an email-to-wiki gateway that Sechrest wrote.) If email is the tool most people use, merging email into a wiki may be a big win. Leading by example—take notes in the wiki in real time, format after the fact, organize it after you’re done—may help sell it to your coworkers.

Next we listed our favorite tool of the past year and had shorter discussions about Solaris 10, IPv6, laptop vendors, backups, and what’s likely to affect us on the technology front next year. We finished off by making our annual predictions and reviewing last year’s predictions; we generally did pretty well.

KEYNOTE ADDRESS

Going Digital at CNN

Howard Ginsberg, CNN

Summarized by Gopalan Sivathanu

Howard spoke about CNN’s plan to move video storage, playout, and editing from physical media to file-based operations on disk. The main motivating factors behind this change are improving the speed of general operations like editing and transfer, and providing a better means of locating archived data. He pointed out that with file operations, searching becomes much easier than pulling out physical videotapes.

According to CNN, using a digital format for storing video simplifies access, eases manageability, and provides ready adaptability and a competitive edge in expediting operations so as to bring news at the right time. Howard introduced the method that CNN plans to adopt to bring about this change: a system called the “Integrated Production Environment” (IPE), whose operations would broadly be managing media assets, scheduling the various operations on the media, production, playout, and, finally, archiving old data so that it can be accessed quickly whenever required.

While speaking about the complexity involved in such a huge transition, Howard presented numbers for the approximate storage space required to store one day’s video in high-res and low-res formats. He pointed out that, since high-res video might require around 22GB of storage per day, performing operations like editing and file transfer, as required for acquiring, archiving, and transferring data, is a big challenge.

Howard then discussed the various pros and cons of making this transition. The major pros he pointed out are parallelism in recording and editing, sharing, versioning, flexibility, and transfer speed. The main disadvantages are the absence of “at scale” reference architectures and the mismatch between deployment time and the pace of technological improvement, wherein technological development renders the system obsolete by the time it’s deployed.

Howard pointed out that providing forward and backward compatibility between software and hardware is one of the important challenges in making the transition. He then described the project phases and workflow and gave a brief overview of the system architecture.

SPAM/EMAIL

Scalable Centralized Bayesian Spam Mitigation with Bogofilter

Awarded Best Paper!

Jeremy Blosser and David Josephsen, VHA Inc.

Summarized by Gopalan Sivathanu

Jeremy and David presented the paper together. Jeremy began by describing the spam problem and existing solutions such as checksumming and sender verification. Before introducing their own solution, he went into the history of Bayesian filtering and discussed its disadvantages. He then described their Bogofilter Bayesian classifier, which they implemented on a Linux and Qmail platform, giving the salient features of the method and its capabilities. They believe the success rate to be as high as 98–99%.

They displayed a graph showing the percentage of inbound mail blocked as spam, relative to total inbound mail, before and after implementing Bogofilter. It showed a sharp increase in the amount of mail filtered after Bogofilter was installed in late April of 2003, and the performance has persisted for over a year.

Jeremy and David described the training phase required for Bogofilter: sorting several days’ worth of messages into spam and non-spam teaches Bogofilter how to differentiate spam in a given environment. They explained that, after training, the parameters of Bogofilter have to be tuned by running it over a sample of presorted mail, logging the output, and comparing error rates. They gave a 10-minute demonstration of their automated training scripts.
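To make the training idea concrete, here is a minimal sketch of the kind of Bayesian scoring Bogofilter performs (my illustration, not the authors’ code; the tokenizer, smoothing, and 0.9 cutoff are invented stand-ins for the parameters they tune):

```python
import math
import re
from collections import Counter

def tokens(text: str):
    """Lowercase word tokens; real filters use much richer lexers."""
    return re.findall(r"[a-z0-9$!']+", text.lower())

class BayesFilter:
    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of non-spam messages containing it
        self.nspam = 0          # spam messages trained on
        self.nham = 0           # non-spam messages trained on

    def train(self, text: str, is_spam: bool):
        bucket = self.spam if is_spam else self.ham
        bucket.update(set(tokens(text)))  # count each token once per message
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def spam_probability(self, text: str) -> float:
        """Combine per-token spam likelihoods in log space."""
        log_odds = 0.0
        for tok in set(tokens(text)):
            # Laplace-smoothed probability of the token appearing in each corpus.
            p_spam = (self.spam[tok] + 1) / (self.nspam + 2)
            p_ham = (self.ham[tok] + 1) / (self.nham + 2)
            log_odds += math.log(p_spam) - math.log(p_ham)
        return 1 / (1 + math.exp(-log_odds))

f = BayesFilter()
f.train("cheap meds buy now", is_spam=True)
f.train("meeting agenda for tuesday", is_spam=False)
print(f.spam_probability("buy cheap meds now") > 0.9)  # True with this tiny corpus
```

Tuning, in this picture, amounts to adjusting the cutoff and tokenization until the error rates on a presorted sample are acceptable.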

INVITED TALKS

What Is This Thing Called System Configuration?

Alva Couch, Tufts University

Summarized by Jimmy Kaplowitz

Alva Couch talked about efforts to find a “good enough” path to configuration management. The ideal would be to describe the desired high-level behavior to the computer, but today we still manually translate this into an implementation specification for the computers to interpret. Couch distinguished between host-based configuration, where individual machines are set up to cooperate in providing a service, and the more fault-tolerant network-based method of configuration, where a service is set up network-wide. The languages used in system configuration can be either procedural, an intuitive paradigm that shows the steps of implementation that lead to the desired result, or declarative, a less obvious but clear form of language that simply states the desired intent and leaves it to software to determine the implementation. These contrasts reflect a continuum of rigor leading from manual and unstructured changes on individual hosts to declarative specifications applied to the entire fabric of the network. The most important aspect of a configuration system, however, is the discipline of the administrators adopting it; this matters much more than the software involved.
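As a rough illustration of the procedural/declarative contrast (my sketch, not Couch’s; the package and command names are hypothetical), the same intent—“ntpd is installed and running”—can be written as imperative steps or as a desired-state declaration handed to a convergence engine:

```python
import subprocess

# Procedural: spell out the steps that produce the result.
def configure_ntp_procedural():
    subprocess.run(["apt-get", "install", "-y", "ntp"], check=True)
    subprocess.run(["service", "ntp", "start"], check=True)

# Declarative: state the intent; an engine decides the steps.
desired_state = {
    "package:ntp": {"ensure": "installed"},
    "service:ntp": {"ensure": "running"},
}

def converge(state: dict):
    """Toy engine: a real tool probes current state and repairs only drift."""
    for resource, goal in state.items():
        kind, name = resource.split(":", 1)
        print(f"ensuring {kind} {name!r} is {goal['ensure']}")

converge(desired_state)
```

The declarative form is harder to write at first but survives drift: the engine can be rerun at any time to restore the stated intent.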

Configuration specifications can either be proscriptive, specifying everything, or incremental, where some requirements are specified but others are not. Beginners, Couch says, are not proscriptive enough, allowing the presence of latent preconditions that differ among hosts. The next step beyond this is a federated network, where software can change the function of individual machines in response to usage patterns and other conditions. As network management systems gain complexity from ad hoc to federated, they become harder to start managing but easier to continue and finish managing. Simpler configuration management systems are better on smaller-scale networks and over shorter periods of time, but in large and long-lived networks the more rigorous techniques help.

Configuration languages, at their core, describe data and state, not the algorithms with which programming languages concern themselves. Existing languages to manage federated networks have classes of machines, but they are insufficient when the classes overlap. The best solutions use aspect composition to satisfy constraints that specify what properties the administrators require the network configuration to possess. These constraint-based languages are useful but hard. Configuration language theory is still in its infancy.

Anomaly Detection: Whatever Happened to Computer Immunology?

Mark Burgess, Oslo University College

Summarized by Jimmy Kaplowitz

Anomaly detection in the context of computers is the process of monitoring systems and looking for anything unusual, or “funny,” and maybe fixing it, too. To do this it is necessary to determine what an anomaly is. Are intrusions anomalies? What about viruses and malware? Should the system signal that regulation is needed? Regardless, the management of the anomaly detection system should be simple and flexible.

Burgess emphasized the scientific approach to this problem. He says there are two types of knowledge: uncertain knowledge about the real world, such as that found in biology and experimental physics, and certain knowledge about the fantasy worlds of math and modeling. It is hard to define what kind of pattern an anomaly detection system should be looking for, but a rough guess would be a noticeable repetition or a continuous variation. This is modeled with rules, and the knowledge we have about normal behavior is modeled with parameterized expressions.

Anomalies can be modeled as discrete events. For this task, one can use languages anywhere on the Chomsky hierarchy, ranging from regular languages to recursively enumerable languages. In practice, all discrete anomalies are finite in length, so regular languages and the regular expressions that describe them will suffice. It is also possible to use techniques such as epitope matching or probabilistic searching of the pattern space.
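A minimal illustration of the discrete, regular-language view (my example, not Burgess’s; the patterns are invented): each anomaly class is a finite pattern, and detection is just matching log lines against those patterns.

```python
import re

# Each anomaly class is a finite pattern, i.e., a regular language.
ANOMALY_PATTERNS = {
    "auth_failure": re.compile(r"Failed password for (\S+) from ([\d.]+)"),
    "disk_error": re.compile(r"I/O error.* dev (\w+)"),
}

def scan(lines):
    """Yield (anomaly_class, matched_groups) for each suspicious line."""
    for line in lines:
        for name, pattern in ANOMALY_PATTERNS.items():
            m = pattern.search(line)
            if m:
                yield name, m.groups()

log = [
    "Jan 1 00:00:01 host sshd[42]: Failed password for root from 10.0.0.5",
    "Jan 1 00:00:02 host kernel: I/O error, dev sda, sector 128",
]
for event in scan(log):
    print(event)
```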

Continuous modeling of anomalies is a high-level way to attack the problem. This always involves intrinsic uncertainty, since each point represents a summary of many values. It is possible to include time in the model, considering sequences rather than individual anomalous symbols, but remembering all things doesn’t usually help. On the other hand, forgetting temporal details implies certainty.

In fact, whether the model is discrete or continuous, there will always be uncertainty, either over whether a symbol is correctly matched or over whether the shape of a trend is significant. How to deal with this uncertainty is an important matter of policy. One example of a policy question is where to set thresholds: small variations should be ignored, but striking differences need to be dealt with. Also, one can focus either on high-entropy anomalies, which are very broad, or on specific, focused low-entropy anomalies.


Burgess’ conclusion re-emphasized that there is inevitable uncertainty in anomaly detection and that policy is necessary to resolve it. We lack causally motivated models to explain the reasons behind the anomalies that occur; human beings are still the state of the art for anomaly detection. It is important not to make the sample size too large when determining whether an event is anomalous, since with a large enough sample nearly everything looks like normal variation. Burgess finished by saying that the next step for anomaly detection is to develop higher-level languages to describe policy.

What Information Security Laws Mean for You

John Nicholson, Shaw Pittman

Summarized by Rebecca Camus

With the prevalence of e-commerce over the past several years, more and more emphasis has been placed on guaranteeing consumers that their personal information will be secure in a business’s databases and while the customer is performing online transactions. John Nicholson addressed the topic of information security by first giving the audience a crash course in civics, then presenting both federal and state-level regulations and laws that businesses must comply with, and finally discussing what the government is doing to enforce these regulations.

Nicholson began by delineating the differences between state and federal regulations and how federal and state laws interact. Federal law has the power to preempt state law on controversial issues relating to individual protections. However, states have the power to make these laws stricter if they wish.

Nicholson next covered federal laws. In the past few years, the federal government has passed several information security laws that all businesses must comply with. Those presented in this lecture, with brief explanations of each, are:

Federal Information Security Management Act: requires each business to inventory its computer systems, identify and provide appropriate security protections, and develop, document, and implement information security programs.

Gramm-Leach-Bliley Act: requires financial institutions to protect nonpublic financial information by developing, implementing, and maintaining information security programs.

Health Insurance Portability and Accountability Act: requires all health-related industries (including any company that deals with health-related businesses, or companies that self-insure their employees) to install safeguards protecting the security and confidentiality of their customers’ identifiable information.

Sarbanes-Oxley Act: requires that all stored information be accurate.

In addition to the requirements of each individual law, they all require that companies perform frequent testing and risk assessments and fix all problems found.

Prompted by the increase in identity theft, California has led the way with state-level information security laws. Even businesses not located in California but with customers who are California residents must comply with California’s new information security laws. Under California’s SB 1386, a resident’s full name, or first initial and last name, may not be exposed in combination with his or her Social Security number, driver’s license number, California identification card number, or account, credit card, or debit card number together with any required access code or password. In addition to SB 1386, California has also passed AB 1950, which requires all businesses dealing with California residents (even third-party companies who have purchased information about California residents) to implement and maintain reasonable security procedures to protect the information from unauthorized access or modification.

All of these regulations are being heavily monitored and enforced by the Federal Trade Commission. Businesses that violate any of these laws or regulations are subject to heavy fines and possible lawsuits.

INTRUSION AND VULNERABILITY DETECTION

Summarized by Josh Whitlock

A Machine-Oriented Vulnerability Database for Automated Vulnerability Detection and Processing

Sufatrio, Temasek Laboratories, National University of Singapore; Roland H. C. Yap, School of Computing, National University of Singapore; Liming Zhong, Quantiq International

The motivation for the work is that CERT-reported vulnerabilities increased greatly between 1995 and 2002. At first glance the increase looks exponential, although there were dips within that period. To address this increase, system administrators look at vulnerability reports and use databases of vulnerabilities. However, such human intervention does not scale with the growing number of vulnerabilities. The solution is to increase the speed of addressing new alerts by moving from human-language reports to machine-readable ones. The goal of the work is to create a new framework and database that can be used by third-party tools, by making the framework declarative and flexible.

Related work on this subject includes the Purdue Coop vulnerability database, ICAT, Windows Update, and Windows Software Update Server. ICAT is not designed for automatic responses but is geared more toward mere vulnerability searches. Windows Update and Windows Software Update Server, two automatic patch update systems, are closed and have black-box updates. Issues with these products include concerns that black-box scanners and updaters might leak information, and that they do not address questions such as what is affected by a vulnerability in a system.

To produce a prototype of the framework, approximately 900 CERT advisories were examined manually. Information scavenged from the advisories included the conditions for vulnerabilities to exist and the impact of the vulnerabilities. A scanner written in Perl and a MySQL database were used as the prototype. A symbolic language was created and used to store a vulnerability source, an environment source, a vulnerability consequence, and an exploit.

The conclusions reached were that manual translation of vulnerabilities is not feasible given the increasing volume; that the key to a better framework is the abstraction of vulnerabilities rather than new tools; and that the issue needs the involvement of many organizations, including CERT, BugTraq, and SANS.

DigSig: Run-time Authentication of Binaries at Kernel Level

Axelle Apvrille, Trusted Logic; Serge Hallyn, IBM LTC; David Gordon, Makan Pourzandi, and Vincent Roy, Ericsson

DigSig is a kernel module whose main intent is to provide protection from trojan horses. The module achieves this goal using public-key cryptography: known trusted programs are signed with a private key that is kept hidden, and the signatures are verified at program load using the corresponding public key. Previous attempts include CryptoMark and modules from IBM Research, but DigSig seeks to contribute by being practical, simple to install, and efficient.

DigSig uses SHA-1 digests with RSA signatures, supports signature revocation, and has good performance. The main problems for DigSig include the lack of support for network file systems, the lack of script handling (allowing trojaned scripts to go uncaught), and no protection against vulnerabilities such as buffer overflows.
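As a userspace analogy to what DigSig checks in the kernel (a sketch under stated assumptions, not the authors’ code), verification means checking an RSA signature over the binary’s contents. This assumes the third-party Python `cryptography` package, and the file paths are placeholders:

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.exceptions import InvalidSignature

def verify_binary(binary_path: str, sig_path: str, pubkey_path: str) -> bool:
    """Return True iff the detached RSA signature over the binary verifies."""
    with open(pubkey_path, "rb") as f:
        public_key = serialization.load_pem_public_key(f.read())
    with open(binary_path, "rb") as f:
        data = f.read()
    with open(sig_path, "rb") as f:
        signature = f.read()
    try:
        # DigSig's era used SHA-1; modern code should prefer SHA-256.
        public_key.verify(signature, data, padding.PKCS1v15(), hashes.SHA1())
        return True
    except InvalidSignature:
        return False

# A loader in DigSig's position would refuse to execute any binary
# for which this check fails.
```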

I3FS: An In-Kernel Integrity Checker and Intrusion Detection File System

Swapnil Patil, Anand Kashyap, Gopalan Sivathanu, and Erez Zadok, Stony Brook University

The motivation for I3FS (pronounced “I-cubed FS”) is that intrusions are on the rise, prevention is nearly impossible, and effective intrusion handling requires timely detection. The most common approach is to run scheduled integrity checks over files that should remain invariant, using non-kernel programs for the detection. I3FS instead lives in the kernel at the file system layer and uses checksums for its integrity checks. A look at other tools shows that many are user-level tools, including Tripwire, Samhain, and AIDE; the Linux Intrusion Detection System is one of the few kernel tools. I3FS was designed with a threat model of unauthorized replacement of key files, modification of files through raw disk access, and modification of data on the network. I3FS is a stackable file system. The administrator can choose which files to protect and the policy for protecting them; policies range from which inode fields to check to the frequency at which checks are performed.

I3FS is implemented using an in-kernel Berkeley database and a B+ tree format. With caching, performance is good.
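The core check reduces to comparing a file’s current checksum against a stored baseline. A minimal userspace sketch of that idea follows (I3FS does this in-kernel against its database on access; the protected paths and the choice of MD5 here are illustrative assumptions):

```python
import hashlib
import os

def checksum(path: str) -> str:
    """Hash of file contents, read in chunks; stands in for I3FS's per-file checksum."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(paths):
    """Computed once at enrollment time, when the files are known good."""
    return {p: checksum(p) for p in paths}

def verify(baseline: dict):
    """Yield paths whose contents no longer match the stored baseline."""
    for path, expected in baseline.items():
        if not os.path.exists(path) or checksum(path) != expected:
            yield path

protected = ["/etc/passwd", "/bin/ls"]   # files the policy protects
baseline = build_baseline(protected)
for tampered in verify(baseline):        # run on access or on a schedule
    print(f"integrity violation: {tampered}")
```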

INVITED TALKS

LiveJournal’s Backend and memcached: Past, Present, and Future

Lisa Phillips and Brad Fitzpatrick, LiveJournal.com

Summarized by Andrew Echols

LiveJournal started out as a college hobby project, providing blogging, forums, social networking, and aggregator services. Today, LiveJournal boasts over 5 million user accounts and 50 million page views every day. LiveJournal quickly outgrew its capacity several times over, starting with a single shared server and scaling up to five dedicated servers.

Each step along this path had problems with reliability and points of failure, with each upgrade merely trying to remedy the problem at hand. Eventually, the user database was partitioned into separate user clusters, but as growth continued, this system became chaotic as well.

Today, there is a global database cluster with 10 individual user database clusters, as well as separate groups of Web servers, proxy servers, and load balancers. Furthermore, master-master clusters were implemented so that each database master has an identical cold spare ready in the event of a hardware failure.

LiveJournal also uses a large amount of custom software that has been open sourced. Perlbal is a load balancer written in Perl that provides single-threaded, event-based load balancing. Perlbal also has multiple queues for prioritizing free and paid users.

MogileFS is a distributed user-space file system in which files belong to classes. The file system tracks which devices files are on and utilizes multiple tracker databases.

Memcached provides a distributed memory cache, taking advantage of unused system memory to reduce database hits. Memcached is now in use at many sites, including Slashdot, Wikipedia, and Meetup.

NFS, Its Applications and Future

Brian Pawlowski, Network Appliance

Summarized by John Sechrest

Brian Pawlowski is active in the development of NFS version 4. NFS is a distributed file system protocol that started in 1985. The current version is version 3; the new revision, version 4, was influenced by AFS. Brian outlined how NFS was used in the past and contrasted it with how it is being used today. Grid computing and the larger Internet are putting more demands on distributed file systems.

NFSv4, an openly specified distributed file system with reduced latency and strong security, is well suited for complex WAN deployments and firewalled architectures. NFSv4 represents a huge improvement in execution and coordination over NFSv3, also improves multi-platform support, and is extensible. It lays the groundwork for migration/replication and global naming.

NFSv4 adds Kerberos V5 (RFC 1510) authentication as a way to create a secure distributed file system. It also provides finer-grained access control, including optional support for ACLs.

NFSv4 is available now on some platforms, and more will support it over the next six months. It is available for Linux 2.6, but not by default; you must enable it explicitly.

Future development in NFS may include session-based NFS, directory delegations, the completion of migration/replication, failover, proxy NFS, and a uniform global namespace.

Brian spent some time outlining how NFS works in a grid or clustered environment.

NFSv4 seems like a substantial transformation, one that will make a significant difference for distributed file systems.

CONFIGURATION MANAGEMENT

Summarized by Josh Whitlock

Nix: A Safe and Policy-Free System for Software Deployment

Eelco Dolstra, Merijn de Jonge, and Eelco Visser, Utrecht University

Transferring software from machine to machine is hard, especially when packages don’t work right. There can be difficulties with multiple versions and unreliable dependency information. The central idea of Nix is to store all packages in isolation from one another so that dependency and version problems evaporate. This is done by naming the path of each package with an MD5 hash of all the inputs used to build the package, thus producing automatic versioning. When a Nix package is installed, any packages it depends on are automatically installed too. Versioning is dealt with by having symlinks that point to the current environment: rolling back to a previous version means changing a symlink to point to the old version, and deleting a previous version means removing its symlink and removing the version from disk. Nix thus allows for safe coexistence of versions, reliable dependencies, atomic upgrades and rollbacks, and multiple concurrent configurations.
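To illustrate the hashing idea (a sketch in the spirit of the paper, not Nix’s actual implementation; the inputs shown are invented), a package’s store path is derived from everything that went into building it, so any change to an input yields a distinct path that can coexist with the old one:

```python
import hashlib
import json

def store_path(name: str, version: str, build_inputs: dict) -> str:
    """Derive an isolated install path from a hash of all build inputs."""
    # Canonicalize the inputs so the same build always hashes the same way.
    canonical = json.dumps(
        {"name": name, "version": version, "inputs": build_inputs},
        sort_keys=True,
    )
    digest = hashlib.md5(canonical.encode()).hexdigest()  # Nix of that era used MD5
    return f"/nix/store/{digest}-{name}-{version}"

inputs = {"src": "hello-2.1.1.tar.gz", "cc": "gcc-3.3", "flags": "--enable-nls"}
print(store_path("hello", "2.1.1", inputs))
# Changing any input (e.g., cc: gcc-3.4) produces a different path,
# so both builds can be installed side by side.
```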

INVITED TALK

Documentation

Mike Ciavarella, University of Melbourne

Summarized by Rebecca Camus

Mike Ciavarella presented on the importance of documentation to system administrators. Instead of simply stating how essential documentation is, he sparked many people’s interest by comparing the work of a system administrator to Alice from the book Through the Looking Glass.

He began the discussion by reinforcing the fact that system administrators are very different from the users they support. Many times these differences lead to a lack of communication between the user and the system administrator, which, in turn, causes the sysadmin to feel underappreciated and leads to a sense of frustration. Such sysadmins often feel little motivation to use valuable time documenting their systems.

It is essential for system administrators always to document their work, for multiple reasons. A system administrator will often be working on several projects at once, and it is very easy to forget what one has previously done on a project when dealing with several other concurrent projects. Also, the system administrator has to think of the future: it is common for things to go wrong in a system that has not been worked on recently, and it is helpful to be able to reference documentation that will help solve the problem at hand. Finally, a system administrator may move on to other projects, and it is helpful to those dealing with the system to be able to reference the material that was written when the system was created.

Overall, documentation will improve the way system administrators are perceived, emphasizing their professionalism and ultimately saving time and reducing stress.

GURU SESSION

Linux

Bdale Garbee, HP Linux CTO/Debian

Summarized by Tristan Brown

Topics included HP’s commitment to Linux, stabilization of the Linux kernel, and future directions for Linux.

HP has a huge commitment to Linux across their product line.


Every division uses Linux to some degree, although some (such as the laptop division) would like to see more. Linux represents a source of future growth, given its currently immature state in the market. One problem currently faced is the variety of distributions that are used and supported, even at HP. Not only are tier-A distributions used (e.g., SuSE or Red Hat), but so are near-tier-A distributions, such as some international distributions. For many people, a noncommercial distribution like Debian is required. Efforts to create one Linux strategy are underway at HP.

The discrepancies between versions of the Linux kernel can be a problem, as many libraries and applications need to be compiled against a specific kernel. This has traditionally been a problem for Linux, although recent efforts to stabilize major interfaces have resulted in less churn. This is a good thing, given the tendency for closed-source drivers to be compiled for specific versions. Certification is also easier, since certification is typically done for a specific kernel. The end result is that the kernel can now be considered a serious tool for end users.

There are many niches Linux is entering or will enter, beyond the typical desktop or server setup. A significant portion of high-performance computing is done using Linux clusters, and Linux is beginning to enter the financial markets. At the other end of the spectrum, HP is beginning to work with Linux in the embedded space, although not much for real-time uses. They are also taking Linux to the DSP market.

NETWORKING

autoMAC: A Tool for Automating Network Moves, Adds, and Changes

Christopher J. Tengi, Joseph R. Crouthamel, Chris M. Miller, and Christopher M. Sanchez, Princeton University; James M. Roberts, Tufts University

Summarized by Tristan Brown

Network administration at Princeton’s Computer Science department, with more than 1500 hosts and 100 subnets, is less than trivial. To manage this system, the autoMAC tools were developed. The toolset replaces a largely manual process of entering host information into a database, configuring a switch, and connecting a patch cable with a fully automatic approach. Key to this system are several tools:

Web registration system. This allows administrators and users to enter configuration information that is validated and automatically entered into the DHCP, DNS, and NIS databases. It replaces a previous manual system that involved editing a file using vi.

NetReg server. This server answers DHCP requests for all hosts and acts as a gateway for unauthenticated users. When an unrecognized host is attached to the network, all HTTP requests are intercepted and redirected to a registration page.

FreeRADIUS. This technology, originally created for wireless access points, allows a device’s MAC address to be used as an authentication key without any special configuration on the client end. When the switch sees a device, it asks the server for information about its MAC address. Known devices are automatically switched to their appropriate VLAN, while unknown devices end up on the registration VLAN.
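The decision the RADIUS server makes can be pictured as a simple table lookup (a toy sketch, not the autoMAC code; the MAC addresses and VLAN numbers are invented):

```python
REGISTRATION_VLAN = 999

# Registered MAC address -> assigned VLAN, as populated from the host database.
MAC_TO_VLAN = {
    "00:0a:95:9d:68:16": 104,   # a registered desktop
    "00:1b:63:84:45:e6": 210,   # a registered printer
}

def vlan_for(mac: str) -> int:
    """Known devices land on their VLAN; unknown ones go to registration."""
    return MAC_TO_VLAN.get(mac.lower(), REGISTRATION_VLAN)

print(vlan_for("00:0A:95:9D:68:16"))  # 104
print(vlan_for("de:ad:be:ef:00:01"))  # 999: redirected to the NetReg page
```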

With these tools, adding a host to the database is simple, and moving from one Ethernet port to another requires no configuration changes at all. Future improvements include automatic virus scanning that moves infected hosts to a quarantine VLAN. More information is available at http://www.cs.princeton.edu/autoMAC/.

More Netflow Tools for Performance and Security

Carrie Gates, Michael Collins, Michael Duggan, Andrew Kompanek, and Mark Thomas, Carnegie Mellon University

Network analysis for large ISP networks can involve massive data sets consisting of over 100 million flow records per day, stored for a period of months. To deal with this demand for storage and processing, the SiLK tools were developed. Based on Netflow logs, the custom SiLK format takes less than half the space, and several accompanying tools perform statistical analysis on the data.

The SiLK logfile format starts with the directory tree, with data segregated by log date and type of traffic. Several fields from the Netflow log are removed, and others, such as time, are stored with reduced range or precision. HTTP traffic is isolated from the rest, allowing the transport type and port number fields to be reduced to two bits of data specifying only the TCP port used. This results in a two-byte savings over other packet types and reduces each entry to less than half the size of the original Netflow record.
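The space savings come from classic record packing: drop fields that the file’s location already implies and store the rest at reduced precision. The layout below is an invented illustration of the idea, not SiLK’s actual on-disk format:

```python
import struct

# Invented compact layout: 4-byte source IP, 4-byte dest IP, 2-byte byte
# count in units of 64 bytes, and 2-byte start time in seconds within the
# file's hour. The log date and traffic type are implied by the directory
# the file lives in, so they cost nothing per record.
RECORD = struct.Struct("!IIHH")

def pack_flow(src_ip: int, dst_ip: int, nbytes: int, start_offset_s: int) -> bytes:
    return RECORD.pack(src_ip, dst_ip, min(nbytes // 64, 0xFFFF), start_offset_s)

rec = pack_flow(0x0A000001, 0xC0A80101, 123456, 2125)
print(RECORD.size, "bytes per record")  # 12 bytes vs. dozens in raw Netflow
print(RECORD.unpack(rec))
```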

To complement the compact logfile format, SiLK includes four tools to analyze traffic patterns and seven utilities for summarization. The tools all operate across the directory structure and can work with sets of IP addresses. Analysis performed with the tool set can be used by administrators in a variety of ways, such as detecting virus traffic on the network.

The SiLK tool set is available at http://silktools.sourceforge.net.


INVITED TALK

Flying Linux

Dan Klein, USENIX

Summarized by Tristan Brown

This talk posed a simple question: Is Linux robust enough to power the computers responsible for fly-by-wire in state-of-the-art aircraft? These computers operate with real-time constraints and no tolerance for unexpected failure. Linux faces several difficulties meeting these demands.

Linux developers work on activities they enjoy, not necessarily the drudge work required to eliminate elusive, rarely encountered bugs. Corporate developers are paid to do the work the Linux developers don’t do.

The Linux source contains millions of lines of code, many of them experimental, untested, or irrelevant to a fly-by-wire system. Knowing which components to compile in is a difficult task, and tracking the interrelations between these systems is all but impossible. A GraphViz graph of the FreeBSD kernel (fewer than half the lines of code of Linux) is so complex that individual nodes and edges are no longer distinguishable.

Linux is the most exploited operating system, counting only manual attacks. The kernel receives hundreds of patches from all over the world. Which contributors can you trust? The NSA has decided to harden Linux, but are we sure vulnerabilities haven’t slipped in through a back door?

The loose, open development model of Linux isn’t ideal for creating a secure, stable platform. Ignoring this, however, there is yet another problem. Fly-by-wire systems are computers: they can react only to the situations they have been programmed for. Mechanical or sensor failures can result in the computer taking the wrong course of action. Human pilots have the ability to adapt to new situations much better than their electronic counterparts, although this advantage is steadily shrinking.

The unfortunate conclusion is that Linux, like its penguin mascot, may never fly. The difficult task of creating a hardened, real-time system can be left to purpose-built software, and Linux can remain the general-purpose operating system it was designed to be.

SPAM MINI-SYMPOSIUM

Filtering, Stamping, Blocking, Anti-Spoofing: How to Stop the Spam

Joshua Goodman, Microsoft Research

Summarized by John Hawkins

This talk gave a broad picture of the difficulties currently being caused by the widespread proliferation of spam, the techniques recently adopted by spammers to get around ever more complex spam-filtering tools, and possible solutions for dealing with spam.

A number of statistics were presented, including the results of a poll in which 40% of respondents stated that spam overload was the biggest problem faced by IT staff. Spam has increased from around 8% of all email in 2001 to at least 50% currently. Wasted time dealing with unwanted messages, offense caused by often obscene content, and a lowering of trust in email systems make spam a major problem.

Examples of different spammer techniques were given, such as enclosing content in an image, swapping letters of words, and using coded HTML to confuse filters. Techniques for gaining access to computers to turn them into spam relays were also mentioned.

A range of methods to deal with spam were discussed, some of which may be combined to increase effectiveness: better filtering, using machine-learning techniques to identify more complex patterns; honeypots, which should never receive “good” mail, to identify messages that are definitely spam; charging small amounts per message as an economic barrier to profitable spam; and computational challenges requiring a calculation on the client, making it difficult to send many messages simultaneously. Many of the techniques discussed are applicable only in certain situations, and most have significant drawbacks. Some proved unpopular with the audience, which was composed mainly of email admins.

Approaches to anti-spoofing of email addresses, types of non-email spam, and legal approaches to tackling spam, such as the CAN-SPAM Act, were also mentioned.

The speaker pressed the case for a conference dedicated to spam-related issues, given the increasing seriousness and complexity of the problem.

Lessons Learned Reimplementing an ISP Mail Service Infrastructure to Cope with Spam

Doug Hughes, Global Crossing

Summarized by John Hawkins

In contrast to the previous talk by Joshua Goodman, which gave a wide-angle view of current spam issues, this talk discussed the specifics of dealing with spam while managing an engineering mail platform with 200 users and a number of ISPs, including one with 1096 users, with 4 to 6 million mails processed a day.

The talk began with some statistics identifying the source and nature of the spam dealt with and how some of the trends are changing. Of the mail received from Italy, for example, 90% is currently being blocked as spam. It was found that 98% of spam could be blocked by examining the headers alone.

The main part of the talk was more technical, covering how the sites’ mail systems were configured to deal with spam. The majority of the filtering employs a modified version of smtpd using a binary search tree to store logs, which vastly reduces search time. Sender addresses can be compared with those previously seen to assess whether a message is likely to be genuine.
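The data-structure trick is easy to picture: keeping previously seen sender addresses in an ordered structure makes each lookup logarithmic instead of a linear scan of raw logs. Here is a toy sketch using Python’s bisect as a stand-in for the talk’s binary search tree, with invented addresses:

```python
import bisect

class SenderHistory:
    """Sorted list of previously seen sender addresses (stand-in for a BST)."""
    def __init__(self):
        self._seen = []

    def record(self, addr: str):
        i = bisect.bisect_left(self._seen, addr)
        if i == len(self._seen) or self._seen[i] != addr:
            self._seen.insert(i, addr)   # keep the list sorted

    def seen_before(self, addr: str) -> bool:
        # O(log n) search instead of scanning the logs line by line.
        i = bisect.bisect_left(self._seen, addr)
        return i < len(self._seen) and self._seen[i] == addr

history = SenderHistory()
history.record("alice@example.com")
print(history.seen_before("alice@example.com"))        # True: likely genuine
print(history.seen_before("rnd-9f2@spamhost.invalid")) # False: more suspicious
```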

The system was shown to be very effective, blocking up to 98% of spam while giving virtually no false positives.

INVITED TALK

Grid Computing: Just What Is It and Why Should I Care?

Esther Filderman and Ken McInnis, Pittsburgh Supercomputing Center

Summarized by Josh Whitlock

There are different types of grid computing, including utility computing (where processing cycles are for sale), distributed computing, and high-performance resource sharing. What grid computing actually is lies somewhere among these types. Examples of grids include the CERN Large Hadron Collider Computing Grid, supporting approximately 12 petabytes of data per year and more than 6000 users, and the Electronic Arts grid for The Sims Online game, with approximately 250,000 players. The challenges of grid computing include flexible virtual organizations having different site policies and disparate computing needs.

Components of grids include security, toolkits, job management, data movement, and Web portals. Security for grid computing is basic X.509. The standard toolkit is Globus. Condor is a job manager for grids; while flexible and portable, it is not open source or simple to use. Data movement is supported with programs such as GridFTP, which is built on top of regular FTP but is used for high-performance, secure, and reliable data transfer. Web portals are used to provide a consistent front end to a grid, which makes maintenance less difficult.

Grids are most successful when they have a purpose: setting up a grid at a university that everyone can use for their own purposes is not a good idea, but setting one up for a research project is. Political challenges in running a grid include coordinating work between sites that don’t trust one another and implementing common policies. Technical challenges include having interoperable software between sites and synchronizing upgrades to the grid software.

MONITORING AND TROUBLESHOOTING

Summarized by Andrew Echols

FDR: A Flight Data Recorder Using Black-Box Analysis of Persistent State Changes for Managing Change and Configuration

Chad Verbowski, John Dunagan, Brad Daniels, and Yi-Min Wang, Microsoft Research

The Flight Data Recorder concept presented in this talk targets auditing, configuration transactions, and "what if" scenarios. In auditing scenarios, the FDR would assist in troubleshooting by identifying what is on a machine, and by answering questions like what is installed, where, when, and by whom. Changes are then grouped logically into change actions.

These change actions can be treated as transactions, allowing groups of changes to be automatically rolled back if they are no longer wanted, were poorly done, had an adverse effect, or were malicious.

The information gathered also aids in answering "what if" questions regarding the impact of certain changes. This requires the formation of "application manifests," which are records of an application's frequency of interactions, all states read, all states written, and how often it is run.

The approach taken in the research is to get into the lowest level of the OS. There, state interactions can be monitored and understood. To be useful, the information gathered must then be analyzed and presented in a way useful to humans. This is implemented as a service that logs gathered information locally. Periodically, the logs are uploaded to central storage, where they can be processed and inserted into a database. Finally, a client may view this information online or retrieve it for offline use.

Real-Time Log File Analysis Using the Simple Event Correlator (SEC)

John P. Rouillard, University of Massachusetts, Boston

All forms of log analysis result in false reports, but the Simple Event Correlator aims to perform automated log analysis while minimizing errors through proper specification of recognition criteria. Information can be gained from logs by looking for certain patterns. These patterns may appear across multiple logs. Absences of events and relationships between events should also be noted.

Relationships between events may take the form of one event taking place before or after another, or in a sequence. Events may also be coincident within a certain period of time. If there is some event that occurs periodically, its absence would be significant.

Relationships between events are important for combining events. For example, some program may emit an error without a username attached to it, but only after another related event that contains a username. Correlation may take place by watching for one event, then expecting another within a certain amount of time that can be matched with the first.

SEC allows for analysis of logs in real time using simple rules for single events, as well as many complex rules which support matching multiple events within given windows or thresholds and as pairs. With proper use of these rules, relationships between events such as those mentioned above may easily be detected.
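
To make this concrete, here is a minimal rule of the threshold variety (my illustration, not from the talk; the pattern, window, and threshold values are hypothetical):

    # Alarm once if a user accumulates 3 failed SSH logins within 60s.
    type=SingleWithThreshold
    ptype=RegExp
    pattern=sshd\[\d+\]: Failed password for (\S+)
    desc=Repeated SSH login failures for $1
    action=write - repeated SSH login failures for $1
    window=60
    thresh=3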

INVITED TALK

A New Approach to Scripting

Trey Harris, Amazon.com

Summarized by Josh Whitlock

People think of scripts as being different from programs. People think of Perl, Python, and bash shell code as scripts, while they think of C, C++, and Java code as programs. Scripts are interpreted, while programs are compiled. In fact, scripts are a subset of programs. They are programs that make heavy use of their environment (files and directories), define few complex data structures, have no outer event loop, are run by either the author or an administrator, and have the primary purpose of ensuring that a given state is achieved on a system. Scripts should be distinguished from programs because good programming methodology is well researched while good scripting methodology is not.

As an example, consider a script that wants to mount a remote file system via NFS. If the script does nine things, there are nine places where it can fail. Restarting the script when it fails could result in unwanted and incorrect duplication of tasks (e.g., multiple mountings). Adding error-checking code to the script simply convolutes the code; the problem with such error checking is that it is syntactic, tied to the implementation, when it needs to be semantic. The Commands::Guarded module for Perl is available from CPAN. Each guard consists of two parts: a condition, or guard statement, and an executable statement. For example:

ensure { $var == 0 } using { $var = 0 };

means "if $var is not zero, then set $var equal to zero." To use guarded commands effectively, decompose the code into atomic steps. For each step, write the necessary and sufficient condition for the step to complete. Use the guards as the conditions for each step.
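
As a sketch of how the NFS example might decompose into guarded steps (the server, mount point, and is_mounted() helper are my own illustrations, following the ensure/using form shown above):

    use Commands::Guarded;

    my $server = "fileserver:/export/home";   # hypothetical NFS export
    my $mnt    = "/mnt/home";                 # hypothetical mount point

    # Illustrative helper: is the directory already a mount point?
    sub is_mounted {
        my ($dir) = @_;
        open my $fh, "<", "/proc/mounts" or return 0;
        return scalar grep { (split " ")[1] eq $dir } <$fh>;
    }

    # Each atomic step states its postcondition; the body runs only
    # when the guard is not already satisfied, so reruns are safe.
    ensure { -d $mnt }
    using  { mkdir $mnt or die "mkdir $mnt: $!" };

    ensure { is_mounted($mnt) }
    using  { system("mount", $server, $mnt) == 0 or die "mount failed" };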

There are many benefits to using guarded commands. They make scripts more resilient because they make testing semantic. If the environment changes, the script is more likely to continue to function.

Guarded commands reduce the amount of error-checking code, thus making the code easier to read and maintain. If a script terminates prematurely, on the next run it will pick up exactly where it left off, thanks to the guards.

GURU SESSION

AFS

Esther Filderman, The OpenAFS Project

Summarized by Peter H. Salus

Esther (Moose) Filderman knows more about AFS than anyone else I've ever encountered. She displayed this knowledge and her quick wit for about an hour, at which point the session dissolved into a sequence of queries from the floor, comments, and idle remarks.

She began by noting that "AFS was rather clunky and difficult to administer, but then the universe changed." What happened, of course, was that it moved from Carnegie Mellon to a startup called Transarc (later acquired by IBM), and "Andrew" was eventually open-sourced.

Originating in 1986, the core has been maintained and guided since 1988 (when Moose joined the crew). It is now "tightly secure," with a Kerberos-like mechanism. "AFS is a good complement to Kerberos," Moose said. It has the structure of a distributed file system, and users can't tell what's what: everything looks like /afs.

There are lots of protections available, but AFS is still "slow," though "speed is coming along nicely." The biggest problem (of course) is with I/O: "Read times are better; write times are getting better."

Apparently, when there are lots of path or sharing changes, one should opt for AFS over NFS.

Definitely interesting and worthwhile.

SYSTEM INTEGRITY

Summarized by John Sechrest

LifeBoat: An Autonomic Backup and Restore Solution

Ted Bonkenburg, Dejan Diklic, Benjamin Reed, Mark Smith, Steve Welch, and Roger Williams, IBM Almaden Research Center; Michael Vanover, IBM PCD

LifeBoat is a backup solution designed for ease of use. The goal is to reduce the total cost of ownership for a system by reducing client support costs.

Over 50% of current support costs involve PC clients. While these PC clients often have mission-critical software on them, the underlying server automation generally does not reach down to the PC clients. LifeBoat is aimed at resolving this by creating a complete rescue and recovery environment.

Targets for the backup system can be network peers, dedicated servers, or local devices. This means that the backup must include file data as well as file metadata. In order for it to be user-friendly, it must enable users to restore individual files. More critically, it must support special processing for open files locked by the Windows OS. LifeBoat provides a kernel driver to obtain file handles for reading locked files.

In order to manage the backed-up data, LifeBoat uses an object storage device called SCARED that organizes local storage into a flat namespace of objects. The data is not interpreted; it can be encrypted at the client and stored encrypted on the storage devices. These storage devices authenticate clients directly, supporting a method of keeping the data pathways secure.

LifeBoat can use centralized servers, peers, or local devices for backup. This offers some flexibility in the overall problem of finding a good place to put data when you are on the road with your laptop.

LifeBoat is packaged as a Linux boot CD providing software used for maintenance. This allows for rescue and restoration of a system when there is a problem.

While the term "autonomic" is perhaps being overused in this case, the system looks as though it enables users to easily back up systems, including mobile systems. This is one of the great problems in an organization supporting PC systems. And the solution looks as though it will integrate well in a corporate environment supporting autonomous server configuration strategies.

PatchMaker: A Physical Network Patch Manager Tool

Joseph Crouthamel, James Roberts, Christopher M. Sanchez, and Christopher Tengi, Princeton University

What do you do when you have lots of networks and wires, with 1500 hosts, 675 switch ports, and 1100 patch ports, and you want to keep track of what is plugged into what? Often a huge amount of the time spent debugging a problem involves locating the port causing the problem. Hand-tracing old wires is very time-consuming.

There used to be a Perl CGI for patch information, but it did not scale very well. As the port count increased, they felt they needed a better tracking system. By putting a Web interface on the front end, it became more useful to a broad group of people.

This program has a searchable patch database. It supports port monitoring and management and enables direct changes to the VLAN information. It allows the cables to be documented, the switch to be managed directly, and information about each host to be tracked. Presenting a visual view of a patch panel makes it more effective.

This is an open source package written with MySQL, PHP, DHTML, and CSS. It supports SNMP/sFlow and uses MRTG/nMon to monitor network activities.

They keep information on patches, racks, hosts, panels, port counts, SNMP, and more.

In the future, they hope to add GUI creation, user authentication and privilege levels, change logging, QoS/rate limiting, security ACL management, switch configuration, file management, and end-user PortMaker.

This project looks like a local solution that is trying to evolve into a set of broader network management tools. That the whole site can be managed through managing the configuration is an important step toward bringing automation to network management. It is too bad that they did not then integrate this data and structure with a configuration management system. When hosts are deployed by a management system, you need to deploy the network as well. This package will work for people who patch by hand, but it has not progressed to the point where a configuration management system could do it.

http://www.cs.princeton.edu/patchmaker

Who Moved My Data? A Backup Tracking System for Dynamic Workstation Environments

Gregory Pluta, Larry Brumbaugh, William Yurcik, and Joseph Tucek, NCSA/University of Illinois

NCSA has a lot of highly mobile computers, employees, and students. With increased laptop use, in particular, they found that they had to rethink their approach to data management.

Far too many people look at laptops as a way to provide persistent data storage, and this puts the data at risk. As the systems move in and out of the network, the backup process can easily fail.

NCSA started to look at the percentage of laptops vs. the backup success rate. They wanted to be able to find the important data that was vital to the organization.

They set up a backup tracking system with authentication servers. This integrates with a Web-based application to go through the data and search for material. It makes a list of machines and users who have not been backed up and allows them to work to increase the backup success rate.

This talk clearly shows that for organizations that have information as their main product, it is vital to work out a meaningful backup process. For the most part, people do not back up PCs, and they don't back up laptops. NCSA's system provides an organizational framework to enable these backups. This was a good measure for NCSA, but it leaves me wishing for a good distributed file system that knows what to do with mobile users.

SPAM MINI-SYMPOSIUM

What Spammers Are Doing to Get Around Bayesian Filtering & What We Can Expect for the Future

John Graham-Cumming, Electric Cloud

Summarized by Jimmy Kaplowitz

Graham-Cumming started by describing several spam-related trends that have occurred in the past year or so. Most major email clients support adaptive filtering, with the notable exception of Outlook. Spam is getting slightly simpler and less tricky, but not very fast. Spam is also using Cascading Style Sheets.

The speaker then described specific tricks spammers are using, ranging from hard-to-read colors and Internet Explorer's odd handling of hexadecimal color codes to Web bugs and nonexistent zero-width images. From July 2003 to April 2004, the email security firm Sophos noted the relative frequencies of certain tricks. Spammers try to hide red-flag words and include innocuous words. About 10% of spam uses obscure Web site addresses, 20% uses simple trickery such as the insertion of spaces or punctuation characters into red-flag words, and another 20% inserts HTML comments to break up red-flag words. About 80% of spam uses trickery of some sort, but the percentage that doesn't is increasing. Even some bulk email software manuals point out that anti-filtering tricks often make spam easier to filter. There are even spam-sending programs that include SpamAssassin technology to allow spammers to test and tweak their mail for maximum penetration. A very common occurrence in spam is for the text/plain MIME part to contain very different content from the text/html MIME part.

The speaker presented seven tough questions to ask vendors of anti-spam software. The first question is, "How do you measure your false positive rate?" Claims of 99.999% accuracy are meaningless, since a mail filter that deletes all mail can claim to have removed 100% of all spam. Ask what the false positive and false negative rates are, and how the vendor knows. Another question is, "How often do you react to changes in spammer techniques?" The third question, designed to cut through the hype, is, "What are your top two ways of catching spam?" In order to prevent spammers from receiving feedback on your reading preferences, another important behavior to confirm is that the software prevents Web bugs from firing. It also must properly handle legitimate bulk mailings as non-spam, deal gracefully with user reports of legitimate mail as spam, and have a good method to separate non-English spam from other non-English mail.

At the end of the talk, Graham-Cumming predicted that spammers will continue to reduce their use of obfuscations and other trickery and that the set of popular spam tricks will continue to change.

PLENARY SESSION

A System Administrator's Introduction to Bioinformatics

Bill Van Etten, The BioTeam

Summarized by John Sechrest

BioTeam is an organization of scientists, developers, and IT professionals with a shared objective: vendor-agnostic professional services aimed at solving large-scale bioinformatics problems for large companies.

Bill Van Etten provided a quick but detailed introduction to genetics and genomics. He then went on to talk about how computing is impacting current biological activities.

HISTORY

1866 Genetic theory published—Mendel
1869 DNA discovered—Miescher
1952 DNA is genetic material—Hershey
1953 DNA structure—Watson & Crick
1959 Protein structure determined—Perutz, Kendrew
1966 Genetic code
1977 DNA sequenced—Sanger
1988 Human Genome Project started
2001 Human genome sequenced

GENETICS TRIVIA

Everything you are is either protein or the result of protein action.

Proteins are folded strings of amino acids (of which there are 20).
Protein structure is important.
A gene is a hunk of DNA that defines a protein.
There are 3 billion DNA letters (drawn from an alphabet of only 4 characters) in the human genome.
Five percent of human DNA contains genes.
999 out of 1000 DNA letters are identical between any two people.
Human genes are 98% the same as those of a chimpanzee.
Human genes are 50% similar to those of a banana.

In 1953, it was discovered that DNA has a 3D structure in the form of a double helix. The genetic code was later broken, and Sanger DNA sequencing followed: DNA strands of all possible lengths from a known starting point are produced, each strand ending with a radioactive dideoxy nucleotide which terminates the chain, and the strands are then sorted by weight.

You get something like:

ACTGAGTGAGCTAACTCACATTAATTGCGTT

It all happens in a single capillary tube, and the results are read via laser spectrometer. This leads to high-throughput sequencing (see http://www.sanger.ac.uk/Info./IT).

And this leads to an explosion of public sequence databases, which, in turn, leads to a large number of computing-related activities: genetic mapping, sequence analysis, genome annotation, functional genomics, comparative genomics, expression analysis, coding regions, and genetic coding.

The growth of genetic scientific inquiry has followed the growth in computing power. All of this information can be processed through computer-aided data analysis.

There are some interpersonal differences between wetlab people and computer scientists: Wetlab people know that biology is unreliable but think computers work well all the time. Computer people think that biology works great but that computers have problems all the time.

The half-life of scientific information is five years.

SEQUENCING SEARCHING ALGORITHMS

There are several sequence searching algorithms, which build and map sequences. These include BLAT, BLAST, and HMMER (based on hidden Markov models, HMMs). They involve many pattern-matching approaches. Each tool is particular to each pattern and structure.

Bioinformatics is full of problems that are embarrassingly parallel. There is a great deal of data, but it is generally easy to decompose the problem into a number of little problems. This type of problem leads to a great many cluster and grid-style solutions.

But because many of the people who are trying to use these systems are scientists who have a hard time with computing details, they found that a portal architecture was a fabulous step forward.

All the gory details of LSF, Grid Engine, and Condor are really distractions for scientists who are just trying to get data back out.

While clusters are cheap ways to increase computing power, they are often hard to build, manage, and use. It is also difficult for scientists to map computing to computers in the cluster, making it hard to achieve high throughput and high performance.

BioTeam has developed a tool called iNquiry, which is a rapid cluster installer with a persistent graphical user interface. It is built around the Apple Xserve. You can see an example of it at http://workgroupcluster.apple.com (ask for a login).

It supports the automatic deployment of a cluster for biocomputing, including network services; DRM (LSF, etc.); admin tools; monitoring tools; and biological applications (200 of them).

The most impressive idea I took from this talk was that they wrote a DTD for describing command-line interfaces: an XML description of a command line is used to generate Web interfaces for the command-line tools. This hides the system from the scientists in an elegant way that has leverage.
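
The flavor of the idea looks roughly like this (a hypothetical sketch; the element and attribute names are invented here, not Pise's actual DTD):

    <!-- One XML file per tool, from which a Web form can be generated -->
    <command name="blastall">
      <parameter name="program" flag="-p" type="choice">
        <value>blastn</value>
        <value>blastp</value>
      </parameter>
      <parameter name="database" flag="-d" type="string"/>
      <parameter name="query" flag="-i" type="infile"/>
    </command>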

As a serious win on the cool marketing front, they used an Apple iPod to build the cluster. All the data is on the iPod, and it drives the boot and install process (and the sysadmin gets to keep the iPod).

While this works well on the Mac OS X-based Xserve, it works less well on some Linux system hardware platforms. It uses a system imager.

The core work for the DTD/XML solution was done by Catherine Letondal. It is called Pise:

http://www.pasteur.fr/recherche/unites/sis/Pise/CCP11/s6.html

http://www.pasteur.fr/recherche/unites/sis/Pise/

http://bioinformatics.oupjournals.org/cgi/reprint/17/1/73.pdf

For more details that might interest you, read Bill Bryson's book A Short History of Nearly Everything.

Web site: http://bioteam.net

SECURITY

Summarized by Tristan Brown

Making a Game of Network Security

Marc Dougherty, Northeastern University

The idea is to make a closed, firewalled network with a collection of hosts running various unpatched and insecure services. Give each team a host, and give the teams two days to defend their machines while exploiting everyone else's. The result is Capture the Flag tools, a framework for online war games. Run services on your host and gain defensive points. Compromise other people's servers and gain offensive points. The toolkit facilitates the competition in two ways:

1. Service verification. A flag service is run on each host, allowing the central server to perform confirmation of the actual flag file on the server. The process allows the central server to determine what team is in control of each server and to defeat simple cron jobs designed to ensure that one team's flag file stays in place. This is a complex procedure and is currently being overhauled to become even more robust.

2. Scoring. A scoreboard showing each team's progress is kept available. This simple touch is an effective motivator to keep people working on the project.

The tool framework has room for future expansion. Different games can be implemented (verification is based on Perl scripts), and a larger game based on a large network of networks or autonomous systems is being planned. More information can be found at http://crew.ccs.neu.edu/ctf/.

Securing the PlanetLab Distributed Testbed: How to Manage Security in an Environment with No Firewalls, with All Users Having Root, and No Direct Physical Control of Any System

Paul Brett, Mic Bowman, Jeff Sedayao, Robert Adams, Rob Knauerhase, and Aaron Klingaman, Intel Corporation

PlanetLab is an environment with unique security requirements: the system must remain secure despite the fact that owners and users of each system have root access on the machines. To ensure availability, several tools are used. Nodes run virtual servers to allow users to operate as root in their own virtual server but not in any others. Packets are tagged to identify the originator of all data, thus ensuring accountability among the researchers. Traffic control is utilized to ensure that no one node uses excessive system bandwidth. In the event of a compromise, systems can be remotely rebooted with a magic packet, causing them to start off of a CD. The machine then enters debug mode, restricting access and allowing remote forensics to be performed. Patches may be applied, bringing the node back to a known, safe state.

Because of all the measures taken to protect the network, a large compromise resulting in rooted nodes was detected and contained within minutes. Further security reviews have since resulted in an effort to decentralize administration and eliminate reusable passwords. To support future expansion, a federated decentralization scheme is being implemented.

Secure Automation: Achieving LeastPrivilege with SSH, Sudo, and Suid

Robert A. Napier, Cisco Systems

Automation and scripting across the boundary between hosts is a difficult process often subject to security vulnerabilities. Although insecure tools such as Telnet and RSH may no longer be used, SSH and sudo can still be used to exploit a system. A design paradigm of small, simple scripts that do one task very well and with the least privilege possible can help to prevent unauthorized access due to an improperly written script. Several techniques can be used to harden scripts:

SSH command keys can be used to perform a command on another host without the possibility of using the session to execute a shell. This ensures that a script can only perform its single task on the remote machine, and it keeps attackers from performing unwanted actions on the remote machine.
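
For example, a forced command in OpenSSH's authorized_keys file might look like this (the command path is illustrative, and the key body is elided):

    # The key may run only the named command: no shell, PTY, or forwarding.
    command="/usr/local/sbin/fetch-logs",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA...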

Properly configured, sudo can give scripts the ability to do just what they need and no more. It should not be used to give scripts complete root access.
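
A sudoers entry in that spirit grants one account one exact command (the account name and path here are hypothetical):

    # /etc/sudoers: reportgen may run exactly this command as root
    reportgen ALL = (root) NOPASSWD: /usr/local/sbin/gather-stats --daily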

Scripts should have a unique user ID to isolate their domain on the system even further.

Setuid and setgid can both be used to allow a script to work as another user. Setgid has the advantage that fewer exploits have been performed against it. Setuid to root is generally a bad idea for a script.

By following the principle of least privilege and asking, "How much access do I need?" rather than "How much security do I need?" the overall vulnerability due to scripts can be reduced.

INVITED TALK

System Administration and Sex Therapy: The Gentle Art of Debugging

David Blank-Edelman, Northeastern University

Summarized by Peter H. Salus

Blank-Edelman's underlying thesis is that we can improve our system administration through learning from other fields. He likes sex therapy. This is because:

Debugging is getting harder.

Debugging is not merely binary.

Both fields (sysadmin and sex therapy) deal with complex systems.

The phenomena are harder because of interdependency; because we don't control the "parts" and haven't written the software; and because as availability increases, the level aided decreases.

We all feel that "sex should just work." Blank-Edelman listed a number of male and female myths concerning sex that he found in several therapy books. He then moved to "system administration should just work," where the principal assumptions involve:

Plug and play

It’s just (merely) . . .

No administration toolkit

Printing

The Internet

“You have the source, right?”

Blank-Edelman drew an analogy between sex therapy and Agans' (2002) "Nine Indispensable Rules":

1. Understand.

2. Make it fail.

3. Look at it.

4. Divide and conquer.

5. Change one thing at a time.

6. Keep an audit trail.

7. Check the plug.

8. Get a fresh view.

9. If you didn’t fix it, it isn’t fixed.

Good advice.

The comparative metaphor was attractive, and Blank-Edelman is a good presenter, but it wore thin long before the 90 minutes were up, and there were too many old and trite jokes and anecdotes.

GURU SESSION

RAID/HA/SAN (with a Heavy Dose of Veritas)

Doug Hughes, Global Crossing; Darren Dunham, TAOS

Summarized by Andrew Echols

The RAID/HA/SAN guru session consisted of a Q&A, primarily about using Veritas products for RAID, high availability, and SAN setups.

The gurus recommended a few mailing lists for questions related to the session topics. Each list is at <list-name>@mailman.eng.auburn.edu, and uses <list-name>-request@mailman.eng.auburn.edu to subscribe:

Veritas Applications: veritas-app

Veritas Backup Products: veritas-bu


Veritas High Availability Products: veritas-ha

Veritas VxVM, VxFS, and HSM: veritas-vx

Sun storage appliance: ssa-managers

A small sample of the topics discussed follows.

Regarding experience with VxVM and Sun Cluster, it was noted that Solaris 10 has a new file system, ZFS, which has its own volume management. There are known issues, but Sun is likely motivated to drop the requirement for VxVM to use Sun Cluster. Sun is also pushing ZFS as something that just does the right thing and hides the details.

In another case, a user recently started using Veritas Foundation Suite 4 and wanted to reduce the number of inodes. There appeared to be a very high amount of overhead, but it was suggested that there might be differences between definitions of gigabytes (2^30 vs. 10^9) between programs. It was also noted that there may be problems with the size of the private region.

Commenting on the maturity of the Linux Volume Manager, the gurus felt that it had some nice features, primarily a combination of those provided by Veritas and AIX. The naming scheme is good and fairly robust. It still needs mirroring, but that can be worked around at the RAID level.

THEORY

Summarized by John Sechrest

Experience in Implementing an HTTP Service Closure

Steven Schwartzberg, BBN Technologies; Alva Couch, Tufts University

This talk is part of an ongoing discussion in configuration management: how can you create a system which can programmatically be configured in a coherent and consistent way?

Two concepts put forward are "closures" and "conduits." Closures act as elements of the network, and conduits are the mechanism by which closures talk to each other.

A closure is a self-contained application that is self-managing and self-healing, and where all operations affecting the closure must be within this or another closure.

If this is so, then the closure will:

Provide consistency

Create a common command language

Operate the same way regardless of OS or other applications

Provide security

Provide a single interface into the closure

Detect intrusion and potentially repair modifications

Hide complexity

Understand the dependencies and be a mediator

Provide configuration language in plain English

Eliminate typographical errors

All of these things are good ideas.

This was proposed in an earlier paper, but how does it work in reality? This talk covered the experience of building an HTTP closure:

1. Start with RH 9 with Apache.

2. Create a protected directory.

3. Wrap Apache binaries with scripts.

4. Move all HTML, logs, configuration files, and modules into the repository.

5. Create a Perl script to allow management of the closure.

For the first iteration of the closure language, they supported a limited command set:

assert—Create a virtual domain.

post—Upload a file into the domain.

retract—Remove a file from the domain.

allow—Grant permission to users and modify configuration options.

deny—Remove permissions as above.

For example:

assert foo.edu
post info.html www.foo.edu/info.html
allow cg www.foo.edu/cgi-bin
retract www.foo.edu/information

But this leads to problems and ambiguity. They had problems with how they specified files vs. directories, the renaming of files, and dealing with indexes.

In order to address these problems, they tried to gain idempotence (making order not matter) by hiding the complexity of the commands and trying to create statelessness. Post must only deal with directories, so you load the whole site each time.

This was an interesting experiment. It showed that there were many places where procedural complexity and all the details of a configuration make it hard to present a simplified solution. This was a good first step in understanding how a closure might be built. However, Apache seems like a difficult example to process. Perhaps a simpler service would be a better starting point for illustrating the idea of closure.

Meta Change Queue: Tracking Changes to People, Places, and Things

Jon Finke, Rensselaer Polytechnic Institute

At RPI, all of the data about all of the different systems is stored in databases.

Getting this data where it needs to go, and doing it in a timely manner, has taken some thought. Many of the issues are related to who owns the data and who needs the data. Often these are in different parts of the organization.


Different data sets—CMMS, LDAP, ID card, Space—need to be integrated into the process. By setting up a Meta_Change_Queue table, the changes to the data sets can be identified. A listener can be set up to catch and process these data changes in a low-overhead way.

This table uses a vendor-provided trigger to input real-time data. This protects the business logic of the program, which allows third-party programs to work completely and still be integrated into the larger campus data process.

Through this mechanism, a single sign-on system for campus has been created and maintained.

In order to do this, Jon needed to:

Index a key column and set it to null after processing

Import with a trigger, which works well with Oracle target systems

Include a low-priority batch queue

A key idea is to use a trigger to catch deletions and move the deleted rows to a history table.
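
A minimal sketch of that delete-to-history idea in Oracle PL/SQL (the table and column names are hypothetical; Jon's real code differs):

    CREATE OR REPLACE TRIGGER person_delete_to_history
      BEFORE DELETE ON person
      FOR EACH ROW
    BEGIN
      -- Preserve the row being deleted, with a timestamp
      INSERT INTO person_history (person_id, name, deleted_on)
      VALUES (:OLD.person_id, :OLD.name, SYSDATE);
    END;
    /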

All source code is available on Jon's Web site (www.rpi.edu/~finkej/), but it is not packaged and is mostly in PL/SQL.

Solaris Zones: Operating System Support for Consolidating Commercial Workloads

Daniel Price and Andrew Tucker, Sun Microsystems, Inc.

Solaris Zones are a way to run multiple workloads on the same machine. They are akin to jails in BSD. Because all zones use the same kernel, they gain efficiency, but they do not gain the independence of multiple kernels as User Mode Linux does. These are not virtual machines.

A set of processes looks the same at the base machine. Zones support a reduced root-permissions structure, in order to prevent escaping from a zone the way one can from a chroot.

They support per-zone resource limits.
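
The basic lifecycle is only a few commands (the zone name and path here are illustrative):

    zonecfg -z web 'create; set zonepath=/zones/web; commit'
    zoneadm -z web install
    zoneadm -z web boot
    zlogin web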

This was an interesting introduction to Solaris Zones. However, it did end up coming off as a sales talk more than a comparison with the state of the art. Solaris Zones are a good thing, just as jails are a good thing. I wished there had been a broader perspective in this talk.

INVITED TALK

Used Disk Drives

Simson Garfinkel, MIT Computer Science and Artificial Intelligence Laboratory

Summarized by Jimmy Kaplowitz

Simson Garfinkel explained that much data is frequently left behind when a computer owner tries to remove it from the hard drive. This is because of the common choice to use quick but vastly incomplete erasure methods such as the DEL and FORMAT commands of Microsoft operating systems. The data left behind by these tools stays around for a long time, since hard drives are quite long-lived. Physical security and OS-level access controls fail to ensure data privacy when drives are repurposed or given to a new owner. Cryptography, which would provide that assurance, is rarely used on-disk.

The author did a study of 235 used hard drives, some of which were obtained for as little as $5. The contents of the drives included runnable programs and private medical information. The data was deleted to different degrees, so the author developed several different levels of deletion to keep these separate. Level 0 files are ordinary files in the file system. Level 1 files are temporary files in standard temporary directories. Level 2 includes recoverable deleted files, such as those in DOS where only the first letter of the filename is lost. Level 3 is partially overwritten files.

An overview of digital forensics tools was given. Among hard-drive forensics tools, the two main categories were consumer and professional tools. All of them were able to undelete files at level 2 and search for text in level 3 files. The professional tools also had knowledge of file formats, were able to perform hash-based searching, and could provide reports and timelines useful for auditing and legal testimony purposes.

The author then described the challenge of performing forensics on many drives at once with very little time for each drive, explaining some of his choices of tools as well as some of the gotchas he had to overcome. In total, he worked with 125GB of data. The data included very common parts of the OS, Web caches, authentication cookies, credit card numbers, email addresses, medical records, personal and HR correspondence, personnel records, and similar items.

Garfinkel told several more horror stories about really sensitive data being revealed, and suggested the creation of a US privacy commissioner or some other federal auditing role. He cautioned that all of his warnings about recoverable data on hard drives apply equally well to USB drives, digital cameras, and other types of media. However, when faced with a police officer ordering deletion of (for example) a picture on a digital camera, the recoverability of dirty deletion can be used to preserve one's data.

The author mentioned the Department of Defense's standard for sanitizing media containing non-classified data and then proceeded to discuss other tools for overwriting media. The tools discussed range from standard UNIX dd to commercial tools such as Norton Disk Doctor. It was pointed out that Windows XP and NT hide a little-known sanitization program in their installation; it's called cipher.exe.
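
For reference, the dd approach is a one-liner, and cipher.exe scrubs free space (the device and drive names below are illustrative; the dd command destroys everything on the disk, so double-check the device first):

    dd if=/dev/zero of=/dev/hdb bs=1M     # overwrite an entire disk with zeros
    cipher /w:C:\                         # Windows: wipe deleted data from free space on C: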

The author discussed the exotic threat of hard drives with hostile firmware that lie about their state and snoop on other hard drives in the same system. It is even possible for such hard drives to infect other drives with hard-drive viruses. There is a level 4 part of the disk, the vendor area, which is where the firmware lives. This can be overwritten by really expensive sanitization tools.

The talk concluded with a sobering question: If the author was able to get so much private info for under $1000, who else is doing this?

Work-in-Progress Reports

Summarized by John Sechrest

WiPs are brief introductions to topics that people are currently working on. Seven people stood up and gave rapid descriptions of ongoing projects.

OGSAConfig—A proof-of-concept implementation of an architecture to make grid fabrics more manageable

Edmund Smith, University of Edinburgh

Split the problem down for a grid by allowing each node in the grid to get a different profile. You can do this with a constraint-resolution problem.

OGSAConfig project: http://groups.inf.ed.ac.uk/ogsaconfig/

CfengineDoc—A tool which uses GraphViz and NaturalDocs to document the import execution order of cfengine scripts

Christian Pearce, Perfect Order

How to describe the meaning of a cfengine file? Used Inkscape to draw a cfengine config file and then discovered GraphViz. Then got AutoDoc and found NaturalDocs. Uses it to describe the SysNav interface.

http://cfenginedoc.com
http://naturaldocs.com
http://autodoc.com
http://cfwiki.org
http://sysnav.com
http://inkscape.com

Verifiable Secure Electronic Voting—An electronic voting system that allows voter verification of individual votes, and makes a passing nod toward being a secure and private voting environment

Zed Pobre, Louisiana State University

Uses a UID + voting ID + MD5. Supports many different voting styles.

Grand Unified Documentation Theory, or What's wrong with current documentation standards. Can we fix it?

David Nolan, CMU

Trying to create an interface to a single place for documentation using a wiki front end, with some customizations for raw access and a choice of editor.

Mailing lists: lists.andrew.cmu.edu

CADUBI: Creative ASCII DrawingUtility by Ian

Ian Langworth, Northeastern University

Received Best WiP Award!

CADUBI is in all kinds of ports. It is Old Crap, very old.

CADUBI 1.2: two maintainers to the last version.

Go AWAY . . . Use TetraDraw.

(This was incredibly humorous.)

Geographic Firewalling

Peter Markowski, Northeastern University

Wireless firewalls are natively geographic entities. This way you can choose where people get services. Limit your perimeter.

Portable Cluster Computers and InfiniBand Clusters

Mitch Williams, Sandia National Laboratories

Received 2nd Best WiP Award!

A cluster in a box: it is one foot tall, has four CPUs, and by using LinuxBIOS it can boot in 12 seconds. Working on adding InfiniBand and creating embedded Linux clusters.

INVITED TALKS

Lessons Learned from Howard Dean's Digital Campaign

Keri Carpenter, UC Irvine; Tom Limoncelli, Consultant

Summarized by Andrew Echols

Howard Dean's presidential campaign was unconventional in its approach to putting the campaign online. Traditionally, candidate activities take the form of getting out to the people, giving speeches, and kissing babies. Online campaigning takes this approach further, putting more content online and organizing supporters over the Web.

Joe Trippi, the Dean campaign manager and a veteran of the dot-com boom, wanted to run an "open source" campaign. Traditional campaigns have been online before, but they have limited themselves to creating, essentially, a brochure Web site. The Dean campaign, however, created an online social movement with tools like blogs, meet-ups, and mailing lists. Furthermore, control of the message was opened up to Web site visitors. Supporters were trusted and expected to aid in crafting the movement.

The campaign blog provided up-to-the-minute articles and discussion of activities. It provided a central online community for the campaign. It also allowed visitors to post comments, opening up the dialog.

The potential of the online campaign is apparent when one considers where it took the campaign. In January 2003, the campaign had only 432 known supporters and $157,000 in the bank. By the end of the campaign, it had raised over $50 million from 300,000 individuals, with 640,000 supporters on a mailing list and 189,000 meet-ups. The online campaign's success story is that it took Howard Dean from obscurity to being, for a time, the front-running contender for the Democratic nomination.

Storage Security: You’re Fooling Yourself

W. Curtis Preston, Glasshouse Technologies

Summarized by Josh Whitlock

Until recently, storage people did not have to worry that much about security. Because storage was not meant to be accessible on a network, security was not designed into storage technology. Now that network security is becoming more of an issue, storage security must be reconsidered. One problem is that storage and security people speak different languages. Storage people talk in terms of zones, LUNs, and mirroring, while security people talk in terms of ACLs, VLANs, and DMZs. By learning how each other operates and working together, storage and security people will be better able to locate and address vulnerabilities.

There are a number of security issues for storage. People can get inside the SAN and from there have full access to the data by exploiting configuration mistakes such as open management interfaces and default passwords. Management interfaces are often plugged into the corporate management backbone, are visible from the LAN, use plaintext passwords, and often have default passwords. This problem can be solved by putting management interfaces on a separate LAN and upgrading to protocols like SSL and SSH that don't transmit plaintext passwords. The backup person can be the greatest security risk to the SAN, because that person has root access. Depending on the backup person's security astuteness, a hacker could gain access to the backup server via social engineering and insecure server configuration.

A hacker with access to the backup server has complete control over all the data. The greatest threat to tape backups is theft. Social engineering could be used to sway a low-income employee transporting the backups into stealing the tapes. Discarded media that are not sanitized properly before being sold can be scavenged for data. SAN security issues center on zones. Among enforcement methods, hardware-enforced zones are the only type of zoning that offers a meaningful level of access control. LUN masking hides LUNs from specific servers; it should not be considered a security mechanism, but it commonly is.

Suggestions for improving storage security include shutting down whatever is non-essential, testing NAS servers for weaknesses, and putting the NAS on a separate LAN. Learning about weaknesses, planning for security, and working with vendors to make storage technologies more secure are other means of improving security.

IMPRESSIONS FROM A LISA NEWBIE

Chris Kacoroski, Comcast

Hi,

As a LISA '04 newbie with a focus on configuration management training, I attended tutorials every day and missed many of the technical sessions. What follows are my opinions, which you are obviously free to agree or disagree with.

Overview:

Having all areas covered by wireless really changes how the audience listens to the speakers. I would go to a session, and while the speaker was setting up, I would go online and check out the servers at work, along with my favorite Web sites. I noticed several other people also doing this. If the speaker was covering something I was not interested in, I responded to emails. With the IRC channel, I saw some folks holding side chats while speakers were talking. This is a good thing for folks like me who do not have a backup in their job (one day I was able to fix a problem with our name servers), but from a speaker's viewpoint, I would think it could be very frustrating that some people are not paying attention.

I was impressed by the LISA leadership and staff's emphasis on learning from each other. They encouraged this via the LISA bingo game, where you had to get signatures of people, the lunches provided with the tutorials, and the acknowledgment of the importance of the "hallway track," where I definitely learned the most. I was also impressed by the folks who taught the tutorials; these are people who write books (e.g., O'Reilly's LDAP System Administration by Gerald Carter).

The days were very intense, with sessions from 9 to 5:30. Then there were BoFs from 7 to 11 p.m. After that, the hallway track typically continued from 1 to 3 a.m. I averaged about five hours of sleep a night. I definitely had information overload.

The high points for me were:

System Logging and Analysis Tutorial

Marcus Ranum

I liked the idea of "artificial ignorance," where if you see a log message that is okay, you filter it out but keep a count of how often it happens. In a short period of time, the only messages that get past the filters are ones that you have not seen before and that you need to look at. By keeping a count of messages, you can be alerted if a message that normally happens 100 times a day jumps to 1000 (something probably happened). I also learned just how unreliable UDP really is and ways around this problem (using TCP or local server filtering).
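
A syslog-ng fragment for the TCP approach, for instance, looks like this (the host name is illustrative, and a source s_local is assumed to be defined elsewhere in the config):

    destination d_loghost { tcp("loghost.example.com" port(514)); };
    log { source(s_local); destination(d_loghost); };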

Marcus briefly mentioned logwatch, plog, and logbayes, and went into detail on his NBS tool, which is used to implement the idea of artificial ignorance. Marcus is a good presenter who makes the material real through several real-life examples.
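
The classic shape of artificial ignorance is a shell pipeline like the following sketch (file names illustrative): strip the timestamp and hostname, drop every pattern already known to be OK, and count what is left.

    sed 's/^.\{15\} [^ ]* //' /var/log/messages |
      grep -v -f known-ok-patterns |
      sort | uniq -c | sort -rn | head -20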

Cfengine Tutorials

Mark Burgess

The introductory tutorial was not that interesting, but the advanced tutorial was excellent. I picked up enough new ideas and hints that I really need to go back and re-read the manual (some of this is due to a new version that has some significant changes, such as subroutines). Mark even created a patch to resolve a specific problem I have been having with my use of cfengine.

Change Management Guru Session

Gene Kim

The book provided with this session is very good, as it provides a recipe that my management can use to start to implement good change-management practices. Because this book is based on real data about the impact of unmanaged changes, I think I will have a good chance of getting it implemented.

Configuration Management Session

Alva Couch/Mark Burgess

Mark's discussion on computer immunology did not grab me (too far out in the future), but Alva's discussion about the phases of configuration management and the difficulty of moving from one level to another was excellent, as he also listed the benefits of moving to the next level. I will be able to use this to show my management where I am trying to go. The phases are:

Level 0—Ad hoc: Uses ad hoc tools to manage the systems. Pretty much anything goes. There are no barriers, since everybody does their own thing. Also, there is no documentation, reproducibility, or interchangeability of systems. This starts to fail when the number of systems goes above 50–100.

Level 1—Incremental: Uses incremental management, implementing a tool such as cfengine and just managing a few things on the machines (see the sketch after this list). One barrier is that each sysadmin has to be trained on the tool you use. This level provides documentation of the systems and ease of updating thousands of systems. This starts to fail when you need to manage entire systems, not just parts of them.

Level 2—Proscriptive: Controls the entire system by building from bare metal for each system. A barrier is loss of memory of how a machine was set up (e.g., why a machine was configured in a certain manner). It takes lots of work to document exactly how the system was set up before it was rebuilt, and you need to do a lot of testing to ensure that the bare-metal rebuild will work. This provides guaranteed reproducibility, where you can create any system from bare metal.

Level 3—Enterprise: Divorces the hardware from the machine's use so machines are interchangeable; we can move services between machines as needed. All machines are created alike. A barrier is the loss of freedom to take care of your own machine (e.g., you do not know which machine will be the Web server). The benefit is that any machine is interchangeable with another, kind of like thin clients, where a person can log on to any terminal and get their own environment. Companies like CNN use this for their Web server farms.
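
Here is the level 1 sketch promised above: a cfengine 2-style fragment, assuming illustrative paths, that manages only a couple of things and leaves the rest of the machine alone.

    control:
       actionsequence = ( files processes )

    files:
       # Keep one file's ownership and permissions correct
       /etc/resolv.conf mode=0644 owner=root group=root action=fixall

    processes:
       # Restart sshd if it is not running
       "sshd" restart "/usr/sbin/sshd"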

Alva then went on to show how each organization will need to determine which level has the best payoff. For example, his group has decided to stay at level 2, because they rebuild machines every six months and it makes no sense to go to level 3. Paul Anderson's group has moved to level 3, because they do not rebuild machines every six months, so the cost-benefit equation makes sense.

BoFs & Hallway Track

The asset management BoF had the idea of tracking MAC address to IP address to switch port in order to alert you to any changes in the data center. I really liked what one participant had to say: he used SNMP to get all kinds of data from his servers; the only thing he was lacking was an automated process to determine what power plug a computer was using.
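
The raw data collection for that kind of tracking can be sketched with net-snmp (community string and host names illustrative): the bridge forwarding table gives MAC-to-switch-port, and a router's ARP table gives MAC-to-IP.

    snmpwalk -v2c -c public switch1 1.3.6.1.2.1.17.4.3.1.2   # dot1dTpFdbPort
    snmpwalk -v2c -c public router1 1.3.6.1.2.1.4.22.1.2     # ipNetToMediaPhysAddress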

As I said at the beginning, I focused on configuration management. My environment consists of about 100 servers (Mac, Linux, Solaris, Windows), 1500 need-to-be-managed Mac workstations, and 700 Windows workstations. What I learned from the BoFs, hallway track, and sessions was:

Cfengine is a good choice for meto manage my systems.

I need to move to automated builds of systems.

Cfengine has a lot of functionality that I was not aware of. I had been trying to figure out hacks for certain items that cfengine does out of the box.

Many folks who use cfengine do not really use it in depth but only use a small portion of it. Mark Burgess was the only person I found who really understood all of it.

I have about a two- to three-year project ahead of me to get everything managed.

A monitoring system that uses SNMP to tie MAC address to IP address to switch port has lots of benefits for asset management and sending alerts when things change.

Change management is critical. Tracking the success rate of approved changes really makes it much easier to diagnose and resolve problems, since most problems result from recent changes.

cheers,

ski
