Climb Out of the Hole

CPTE 433 Chapter 2

Adapted by John Beckett from The Practice of System & Network Administration by Limoncelli, Hogan, & Chalup

The Problem

You are too busy fighting fires to get things done right.

You have lots of fires because you aren’t doing things right.

What Is “Doing Things Right”?

• Use a trouble-ticket system
• Manage quick requests properly
• Adopt time-saving policies
• Start every host in a known state
• …

Use a Trouble-Ticket System

• Ensures that each ticket goes to completion
  – User, dispatcher, and tech sign off
• Documents the current state of unresolved tickets
• Provides important historical data for management and planning

[Figure labels: Dashboard, Management Interface, Service]
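
The sign-off rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the design of any particular ticketing product: the `Ticket` class, its field names, and the `unresolved` helper are all assumptions made for the example. A ticket counts as complete only once the user, the dispatcher, and the tech have each signed off.

```python
from dataclasses import dataclass, field

# Role names follow the slide's sub-bullet; treating them as strings is an assumption.
REQUIRED_SIGNOFFS = ("user", "dispatcher", "tech")


@dataclass
class Ticket:
    """Hypothetical minimal ticket record, for illustration only."""
    ticket_id: int
    summary: str
    signoffs: set = field(default_factory=set)

    def sign_off(self, role: str) -> None:
        # Reject roles that are not part of the sign-off rule.
        if role not in REQUIRED_SIGNOFFS:
            raise ValueError(f"unknown role: {role}")
        self.signoffs.add(role)

    def is_complete(self) -> bool:
        # Complete only when every required role has signed off.
        return self.signoffs == set(REQUIRED_SIGNOFFS)


def unresolved(tickets):
    """Return the tickets that still lack at least one sign-off."""
    return [t for t in tickets if not t.is_complete()]
```

A list produced by `unresolved` is the kind of raw data a management dashboard could summarize.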

Trouble Ticket Cycle

[Diagram: the Trouble Ticket System linked to SA Dispatch, SAs, SA Mgt, Clients, and Admin]

A trouble ticket system is the core of SA management

Trouble Ticket System Messages

• SA Dispatch
  > Enter ticket
  > Reactivate ticket
  < Ticket status
• SAs
  < Ticket information
  > Tasks done
  > Needs
  > Add’l tickets
• SA Mgt
  < Dashboard
  > Decisions
• Client
  < Status information
  > Add’l data
• Administration
  < Value delivered
  > Money needed

Managing Quick Requests

• Have a shield
  – SA assigned to quick requests
• You may want to rotate this duty
• Are all SAs equally trained or do they specialize?
• Precursor tasks (e.g. password resets) need priority
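
One simple way to rotate shield duty is a weekly round-robin. The sketch below assumes a weekly rotation and a list that already contains only the SAs trained to handle quick requests; the function name `shield_for_week` and the sample names are hypothetical.

```python
def shield_for_week(eligible_sas, week_number):
    """Return the SA on shield duty for the given week (plain round-robin).

    eligible_sas: SAs trained to take quick requests (an assumption: where
    SAs specialize, only some of them belong in this list).
    """
    if not eligible_sas:
        raise ValueError("need at least one SA who can take quick requests")
    return eligible_sas[week_number % len(eligible_sas)]


# Example rotation over three hypothetical SAs.
team = ["ana", "raj", "lee"]
this_week = shield_for_week(team, 4)
```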

Trouble Ticket Response Time Is (Sort of) Like Network Packets

• Latency
  – How long it takes for a problem to reach the person who can solve it
• Bandwidth
  – How quickly a person can solve a problem
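
Both measures can be computed from ticket timestamps. The sketch below is one possible reading of the analogy, not a definition from the book: latency is taken as the time between a ticket being opened and reaching the SA who can solve it, and bandwidth as tickets resolved per hour worked. The function names and sample values are made up for illustration.

```python
from datetime import datetime, timedelta


def ticket_latency(opened: datetime, reached_solver: datetime) -> timedelta:
    """Time a ticket waited before reaching the person who can solve it."""
    return reached_solver - opened


def solve_rate_per_hour(resolved_count: int, hours_worked: float) -> float:
    """Assumed bandwidth measure: tickets resolved per hour of work."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return resolved_count / hours_worked


# Illustrative values only.
opened = datetime(2024, 3, 4, 9, 0)
reached = datetime(2024, 3, 4, 11, 30)
latency = ticket_latency(opened, reached)   # 2 hours 30 minutes
rate = solve_rate_per_hour(12, 8)           # 1.5 tickets per hour
```

Tracking the two numbers separately shows whether slow response comes from routing (latency) or from the time spent on each problem (bandwidth).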

Time-Saving Policies

• Define how people are to get help from your group
• Define scope of responsibility for your team
• Define emergency
• Define “quick request”
• These policies are the responsibility of SA management
  – Individual SA discretion needs to be defined

Policy Tradeoffs

• Individuals define their own policies
  – Users gravitate toward “loose” individuals as long as they have success, then chaos grows
• Individuals stick with group policies
  – Users get consistent results
  – SAs are not as productive

• Loose policies
  – Quick response for trivial requests
  – Poor response for longer requests because of interruptions
• Tight policies
  – Consistent response for all requests
  – Sequencing problems

Start Every Host in a Known State

• Have standard build methods for servers and clients
  – Make this a key part of your technology platform
• Limit the number of options
  – Record the options in that workstation’s entry in your inventory/ticketing system
• Automate the build process
• New projects start with a standard build
  – Document steps to final state
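
Limiting the build options and recording them per host can be sketched as a small validation step. Everything named here is hypothetical: the option catalogue, the `record_build` function, and the dictionary standing in for the inventory are assumptions, since the slide does not name a tool or a set of options.

```python
# Hypothetical catalogue of permitted build options; a real site defines its own.
ALLOWED_OPTIONS = {
    "role": {"server", "client"},
    "disk_layout": {"standard", "large-data"},
}


def record_build(inventory, hostname, options):
    """Validate the chosen options and store them in the host's inventory entry."""
    for key, value in options.items():
        if key not in ALLOWED_OPTIONS:
            raise ValueError(f"unsupported option: {key}")
        if value not in ALLOWED_OPTIONS[key]:
            raise ValueError(f"unsupported value for {key}: {value}")
    # Copy so later changes to the caller's dict do not alter the record.
    inventory[hostname] = dict(options)
    return inventory[hostname]


inventory = {}
record_build(inventory, "ws-014", {"role": "client", "disk_layout": "standard"})
```

Rejecting anything outside the catalogue keeps every host traceable to a known starting state.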

Other tips

• Make email work well
  – Stable, reliable, functional
• Fix the biggest time drain
  – Identify what’s bleeding you and give it the resources needed to solve it
  – “Rinse and repeat”
• Quick fixes
  – Get production going
  – Cost more to fix properly later on

More tips

• Sufficient power and cooling
  – People and technology malfunction if they aren’t comfortable
• Simple monitoring
  – Email-enable systems that might fail
  – Set up a Web-based dashboard
    • Work to make it more comprehensive
    • Practice using it
    • Update it as reality changes
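
"Email-enabling" a system can be as small as a script that mails the SA team when a check fails. The sketch below uses Python's standard `email.message` and `smtplib` modules; the addresses, the subject format, and the assumption that a mail relay is listening on `localhost` are placeholders, not details from the slides.

```python
import smtplib
from email.message import EmailMessage


def build_alert(host, problem, to_addr, from_addr):
    """Build an alert email for a failing host (the subject format is an assumption)."""
    msg = EmailMessage()
    msg["Subject"] = f"[ALERT] {host}: {problem}"
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg.set_content(f"Host {host} reported: {problem}")
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Send the alert through an SMTP relay; the relay host is a placeholder."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


alert = build_alert("db01", "disk 95% full",
                    "sa-team@example.com", "monitor@example.com")
```

Building the message separately from sending it makes the alert text easy to check without a mail server.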

Beckett’s Tips

• Clearly identify and label each device and service with a unique name that does not collide with any other namespace
  – Never, never, never re-designate (rooms, devices, etc.)
• Solo operation with no trouble ticket system? Start with 3x5 cards
  – Perhaps print one side with basic info like who reported it, phone number, resource failing or needing upgrade, and 1-line description
• Expect to outgrow a trouble ticket system
  – If it’s comprehensive enough for the future, you might never get it off the ground in the first place
• Work through your trouble ticket system, not around it
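
The fields suggested for the 3x5 card also work as a first ticket record once the cards are outgrown. The sketch below is an assumption-laden illustration: the `TicketCard` class, its field names, and the rendered layout are hypothetical, and only the four fields themselves come from the tip above.

```python
from dataclasses import dataclass


@dataclass
class TicketCard:
    """Hypothetical digital stand-in for the printed 3x5 card."""
    reported_by: str
    phone: str
    resource: str      # resource failing or needing upgrade
    description: str   # kept to one line, as on the card

    def __post_init__(self):
        # Enforce the card's one-line description.
        if "\n" in self.description:
            raise ValueError("description must fit on one line")

    def render(self) -> str:
        """Return the card as four labeled lines of text."""
        return "\n".join([
            f"Reported by: {self.reported_by}",
            f"Phone: {self.phone}",
            f"Resource: {self.resource}",
            f"Description: {self.description}",
        ])


card = TicketCard("J. Doe", "555-0100", "lab printer", "Jams on duplex jobs")
```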