IT2204: Systems Administration I 7. Device Management

Upload: bernice-mcbride

Post on 27-Dec-2015



Managing desktops

Managing many desktops (workstations) is best done by automation on supported platforms.
  − Main duties for workstations
    □ Loading the system and software
    □ Updating system software and applications
    □ Configuring network parameters
  − All three must be done right
    □ The initial load must be consistent across all machines
    □ Updates must be quick
    □ Network configuration must be managed centrally

Machine life cycle

A machine moves through five states, connected by several transitions.
Plan for each of the states and each of the transitions.

Machine life cycle

• Machine states
  New
    − A new machine
  Clean
    − OS installed, but not yet configured for the environment
  Configured
    − Set up (configured) correctly for the operating environment
  Unknown
    − Misconfigured, broken, newly discovered, outdated configuration, etc.
  Off
    − Retired computer

Machine life cycle

• State transitions
  Build
    − Transition from the new state to the clean state
    − Set up hardware and install the OS
  Initialize
    − Configure for the environment; often part of build
  Update
    − Install new software
    − Patch old software
    − Change configurations

Machine life cycle

• State transitions
  Entropy
    − Undisciplined/unmanageable changes to configurations
    − Major environment changes
    − Unexplained problems
  Debug
    − Bring the machine back to the correct configured state
  Rebuild
    − Rebuild the machine
      □ Possibly because of a major OS revision
      □ Or because the changes are so drastic that simple updates make no sense
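
The states and transitions above can be sketched as a small state machine. This is only an illustration: the slides spell out the endpoints of build, initialize, entropy and debug, while the endpoints chosen here for update and rebuild, and the `retire` transition name, are assumptions.

```python
# Machine life cycle as a lookup table: (current state, transition) -> next state.
# Endpoints for "update", "rebuild" and the "retire" name are assumptions.
TRANSITIONS = {
    ("new", "build"): "clean",
    ("clean", "initialize"): "configured",
    ("configured", "update"): "configured",
    ("configured", "entropy"): "unknown",
    ("unknown", "debug"): "configured",
    ("configured", "rebuild"): "clean",
    ("unknown", "rebuild"): "clean",
    ("configured", "retire"): "off",  # hypothetical transition name
    ("unknown", "retire"): "off",     # hypothetical transition name
}


def next_state(state, transition):
    """Return the state reached by a transition, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, transition)]
    except KeyError:
        raise ValueError(f"cannot {transition!r} a machine in state {state!r}")


def run(transitions, state="new"):
    """Apply a sequence of transitions starting from the given state."""
    for transition in transitions:
        state = next_state(state, transition)
    return state
```

Planning for a transition then amounts to deciding which tool or procedure implements each entry in the table, for example an automated installer for build and rebuild.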

Automated Installations

• Advantages
  Saves time/money
    − Boot the computer, then go do something else.
  Ensures consistency
    − No chance of entering wrong input during install.
    − Avoids user requests due to mistakes in configuration.
    − What works on one desktop works on all.
  Allows for fast system recovery
    − Rebuild system with auto-install vs. slow tapes.

Automated Installations

Full automation is better than partial automation
  − Eliminates prompts in installation scripts
  − Can send a notification when the installation is complete
Partial automation (better than none)
  − Needs proper documentation for consistency

Vendor Installations

• Weaknesses
  You always need to reload the OS on new machines (you may have your own OS preference)
    − You need to configure the host for your environment.
    − Eventually you'll reload the OS on a desktop, leaving you with two platforms to support: the vendor OS install and your OS install.
    − Vendors change their OS images from time to time, so systems bought today have a different OS from systems bought a few months ago.

Your own Installations

Place little trust in the vendor's pre-installed OS
  − Adding new apps to your own clean install is easier because you installed it for your own environment. There is no need to contact the vendor for installation help.
  − There may be a need for special apps or add-ons.
  − You may eventually need a re-install
    □ It may be different from the vendor's install
    □ Make sure the required drivers and software are available

System and Application Updates

The software we install may turn out to have bugs and security loopholes.
New applications are also released from time to time.
We may need to update or upgrade our software.
Updates should be automated too. Automation systems include:
  − Solaris autopatch
  − Windows
  − Linux package updaters, e.g. yum, apt
• Updates can be retrieved from the online software update centers of the different OS vendors.

System and Application Updates

• A software update provides bug fixes for features that aren't working right and minor software enhancements, and sometimes includes new drivers.
• Some software updates are free, and are sometimes called a patch because the update is installed over software you're already using and isn't a full software installation.

System and Application Updates

• A software upgrade requires the purchase of a new version of your software, usually at a lower price than you would pay if you were buying the software for the first time.
• Some companies offer a free upgrade to the latest version if you bought your software recently, so register the software when you install it so that you know whether you qualify for free upgrades.

System and Application Updates

• A patch is a software update consisting of code inserted (or patched) into the code of an executable program. Typically, a patch is installed into an existing software program. Patches are often temporary fixes between full releases of a software package.
• Patches may do any of the following:
  – Fix a software bug
  – Install new drivers
  – Address new security vulnerabilities
  – Address software stability issues
  – Upgrade the software

• A patch is designed to update a computer program or its supporting data, to fix or improve it. This includes fixing security vulnerabilities and other bugs, and improving usability or performance.
• Though meant to fix problems, poorly designed patches can sometimes introduce new problems.


Network configuration

Done over the network, typically using DHCP

− Eliminates time wastage and manual error

− More secure – only authorized systems have access

− Centralized control makes updates and changes a lot easier (e.g. new DNS server)

Managing Servers

Different from a workstation
  − Serves many users
  − Requires reliability and high uptime (a server should always be on)
  − Requires tight security
  − Often expected to last longer
  − More expensive
  − Typically has a different OS configuration from desktops
  − Deployed within a data center
  − Often has maintenance contracts
  − Has backup systems
  − Has better remote access

Server HW

Buy server HW for servers
  − Specialized hardware
  − Offers more internal space
  − Offers more CPU performance
  − Offers high-performance I/O (both disk and network)
  − Has more upgrade options
  − Can be rack mountable
For reliability, use known vendors

Vendor server product lines

• The typical vendor has three product lines:
1. Home
2. Business
3. Server

Vendor server product lines

Home
  − Absolute cheapest purchase price
  − Original equipment manufacturer (OEM) components change often
  − Focuses on being the lowest cost at the outset. Consumers make purchasing decisions based on price. (Any add-on features can be considered later and purchased at a premium.)

Vendor server product lines

Business
  − Longer life, reduced total cost of ownership
  − Fewer component changes
  − Concentrates on the total cost of ownership. Businesses tend to keep their computers longer than the average home consumer, so manufacturers have to keep a large pool of spare parts for maintaining these computers.
  − Usually higher-quality components than those of the home line.

Vendor server product lines

Server
  − Lowest cost per performance metric
  − Components and design are easier to service
  − The server line tends to focus on the lowest total cost of ownership. For example, a server is designed with higher-quality components that last considerably longer than those in the home or business lines. A server is also designed to process a higher workload than a home-line computer.

Maintenance contracts

Vendors have a variety of service contracts
  − Customer-purchased spare parts get replaced when they get used up
How to select a maintenance contract?
  − Think of needs
    □ Non-critical hosts: a next-day or two-day response time is likely reasonable, or perhaps no contract at all
    □ Large groups of similar hosts: use the spares approach
    □ Controlled model: use only a small set of distinct technologies so that few spare-part kits are needed

Maintenance contracts

How to select a maintenance contract?
  − Think of needs
    □ Critical host: stock failure-prone and interchangeable parts (power supplies, hard drives); get a same-day contract
    □ Large variety of models from the same vendor: sufficiently large sites may opt for a contract with an on-site technician

Data backups

Servers often have critical data that must be backed up
  − Client data is often backed up on the server
  − Think of a separate administrative network
    □ Keeps bandwidth-hungry backup jobs off the production network
    □ Provides alternate access during network problems
    □ May require additional NICs, cabling, switches, etc.

Servers in the data center

Servers should be located in data centers
  − Data centers provide
    □ Proper power (enough power, conditioned, UPS, maybe a generator)
    □ Fire protection/suppression
    □ Networking
    □ Sufficient air conditioning (climate controlled)
    □ Physical security

Remote administration

Data centers are expensive and may be distant from the admin office
Servers should not require physical presence at a console
  − The typical solution is a console server
    □ Eliminates the need for a keyboard and screen
    □ Can see booting, can send special keystrokes
    □ Access to the console server can be remote (e.g., ssh, rdesktop)
  − Power cycling is provided by remote-access power strips
  − Media insertion and hardware servicing are still problems

RAID: Redundant Array of Independent Disks

RAID: Redundant array of independent disks

RAID
  − A system in which two or more disks are physically linked together to form a single logical, large-capacity storage device that offers a number of advantages over conventional hard disk storage
  − Makes many smaller disks appear as one large disk to a server

Why RAID?

Superior performance and system reliability
Improved resilience through increased redundancy
Lower costs

Why RAID? Performance

Parallelism, the ability to access multiple disks at the same time, allows data to be written to or read from an array faster than is possible with a single drive.
  − Typically used in large file servers and transaction or application servers, where data accessibility is critical and fault tolerance is required.
  − Today, RAID is also used in desktop environments for CAD and for multimedia editing and playback, where higher transfer rates are needed.
Performance increases because the server has more "spindles" to read from or write to when data is accessed.

Why RAID? More resilience

Redundancy keeps a copy of the data available within the storage array when a drive fails.
Failure of the whole array can be prevented by swapping in a new drive without turning the system off.
RAID performance depends on the number of drives used in the array.

Why RAID? More resilience

Redundancy is achieved either by writing the same data to multiple drives (mirroring) or by storing computed parity data across the array, such that the failure of one or more disks in the array does not result in data loss.
A failed disk may be replaced by a new one, and the lost data reconstructed from the remaining data and the parity data.

Why RAID? Cheaper

RAID provides the same or greater storage capacity as a single large drive by combining several smaller drives, which can make a significant price difference.
When self-repairing configurations are used that do not need humans to replace drives, storage could become cheaper and more reliable.

RAID Techniques

• Two principal techniques are employed:
1. Mirroring
  − The first implementation of RAID, typically requiring two individual drives of similar capacity. One drive is the active drive and the secondary drive is the mirror.
  − The technique provides a simple form of redundancy by automatically writing data to the mirror drive whenever it is written to the active drive.

RAID Techniques

• Two principal techniques are employed:
2. Striping
  − This technique provides increased performance. It is a method of mapping data across the physical drives in an array to create a large virtual drive.
  − The data is subdivided into consecutive segments or stripes that are written sequentially across the drives in the array.
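
The mapping from consecutive segments to drives can be sketched in a few lines. This is a minimal illustration assuming a simple round-robin placement; the function names and the tiny stripe size are hypothetical, and real controllers use much larger stripe units.

```python
def locate_unit(unit_index, num_disks):
    """Map a stripe-unit number to (disk index, position on that disk)."""
    return unit_index % num_disks, unit_index // num_disks


def stripe(data, num_disks, stripe_size):
    """Cut data into stripe units and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), stripe_size):
        disk, _ = locate_unit(start // stripe_size, num_disks)
        disks[disk].append(data[start:start + stripe_size])
    return disks


# Twelve bytes over three disks, two bytes per stripe unit.
layout = stripe(b"ABCDEFGHIJKL", num_disks=3, stripe_size=2)
```

Because neighbouring units land on different disks, a large sequential read or write keeps every disk busy at once, which is where the performance gain comes from.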

RAID levels

Several levels of RAID are used
RAID 0 – “Disk Striping”
  − It is technically not a RAID level since it provides no fault tolerance. Data is written in blocks across multiple drives, so one drive can be writing or reading a block while the next is seeking the next block.

[Figure: RAID 0: Striping]

RAID levels: RAID 0

  − Advantages: higher access rate, full utilization of the array capacity, easy to implement.
  − Disadvantage: there is no fault tolerance. If one drive fails, the entire contents of the array become inaccessible.
  − Ideal for non-critical systems.

RAID levels: RAID 1

RAID 1 – “Disk Mirroring”
  − Provides redundancy by storing data twice. Data is written to both the data disk (active) and one or more mirror disks.
  − If one fails, the controller uses either the data drive or the mirror drive for data recovery and continues operation.

[Figure: RAID 1: Mirroring]

RAID levels: RAID 1

  − Advantage: it provides the best protection of data, since the array management software simply directs all application requests to the surviving disk when one disk fails.
  − Disadvantages: no improvement in data access speed, and higher cost, since twice the number of drives is required.
  − Ideal for mission-critical systems.

RAID levels: RAID 3

RAID 3
  − Data blocks are subdivided (striped) and written in parallel on two or more drives. An additional drive stores parity information for error correction/recovery. A minimum of 3 disks is needed for a RAID 3 array.
  − Since parity is used, a RAID 3 stripe set can withstand a single disk failure without losing data or access to data.
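
Why a single parity drive is enough to survive one disk failure can be shown with XOR parity, the usual parity function in RAID. The sketch below is illustrative only: the helper name and the two-byte blocks are made up, and a real array works on whole stripes in the controller or driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one (tiny) block of the same stripe.
data_blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data_blocks)  # what the parity drive stores

# Simulate losing the second data disk: XOR the survivors with the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, so `rebuilt` equals the lost data. With two disks missing there is not enough information left, which is why single parity tolerates only one failure.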

[Figure: RAID 3]

RAID levels: RAID 3

  − Advantages: it provides high throughput (both read and write) for large data transfers, and disk failures do not significantly slow down throughput.
  − Disadvantages: this technology is fairly complex and too resource intensive to be done in software; performance is slower for random, small I/O operations.

RAID levels: RAID 5

RAID 5 – “Data striping with parity”
  − The most common secure RAID level.
  − Similar to RAID 3, except that data are transferred to disks by independent read and write operations (not in parallel).
  − The written data chunks are also larger.
  − Instead of a dedicated parity disk, parity information is spread across all the drives. A minimum of 3 disks is needed for a RAID 5 array.
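
Spreading parity across all the drives means the parity block moves to a different disk on each stripe. The sketch below shows one possible rotation; it is an assumption for illustration, since the exact layout differs between implementations.

```python
def parity_disk(stripe, num_disks):
    """Disk index that holds the parity block for a stripe (one possible rotation)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe, num_disks):
    """Return 'P' for the parity disk and 'D' for data disks in this stripe."""
    p = parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(num_disks)]


# Layout of the first four stripes on a three-disk array.
rows = [stripe_layout(s, 3) for s in range(4)]
```

Because no single disk holds all the parity, parity updates are shared between the drives instead of queuing on one dedicated parity disk as in RAID 3.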

[Figure: RAID 5 – “Data striping with parity”]

RAID levels: RAID 5

RAID 5 – “Data striping with parity”
  − A RAID 5 array can withstand a single disk failure without losing data or access to data. Although RAID 5 can be achieved in software, a hardware controller is recommended. Often extra cache memory is used on these controllers to improve write performance.

RAID levels: RAID 5

  − Advantages: read transactions are very fast, while write transactions are somewhat slower (because the parity has to be calculated).
  − Disadvantages: disk failures have an effect on throughput, although this is still acceptable; like RAID 3, this is complex technology.
  − RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance.
  − It is ideal for file and application servers.

RAID levels

• RAID 0 and RAID 1 combinations
Combines the advantages (and disadvantages) of RAID 0 and RAID 1 in one single system.
  − It provides security by mirroring all data on a secondary set of disks (disks 3 and 4 in the drawing) while using striping across each set of disks to speed up data transfers.


RAID levels

• RAID 0 and RAID 1 combinations
RAID 1+0 (or 10) is a mirrored data set (RAID 1) which is then striped (RAID 0), hence the "1+0" name. A RAID 10 array normally requires a minimum of four drives: two mirrored pairs, with data striped across the pairs for the speed benefit.
RAID 0+1 (or 01) is a striped data set (RAID 0) which is then mirrored (RAID 1). A RAID 0+1 array requires a minimum of four drives: two to hold the striped data, plus another two to mirror the first pair.
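
The storage cost of each level can be compared with a small calculation. This is a sketch under stated assumptions: all disks are the same size, mirrors are two-way, and the function name and level labels are hypothetical.

```python
def usable_capacity(level, num_disks, disk_size):
    """Usable capacity of an array of equally sized disks for a given RAID level."""
    if level == "0":
        return num_disks * disk_size        # striping only, nothing set aside
    if level == "1":
        return disk_size                    # every disk holds the same data
    if level in ("3", "5"):
        if num_disks < 3:
            raise ValueError("parity RAID needs at least 3 disks")
        return (num_disks - 1) * disk_size  # one disk's worth goes to parity
    if level in ("10", "01"):
        if num_disks % 2:
            raise ValueError("two-way mirrored stripes need an even number of disks")
        return num_disks // 2 * disk_size   # half the disks are mirror copies
    raise ValueError(f"unknown RAID level {level!r}")
```

For example, four 500 GB disks give 2000 GB as RAID 0, 1500 GB as RAID 5 and 1000 GB as RAID 10, which is the capacity price paid for the added redundancy.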

RAID Implementations

Data distribution across multiple drives can be managed either by dedicated hardware or by software.
When done in software, the software may be part of the OS, or it may be part of the firmware and drivers supplied with the card.

RAID Implementations

• Operating system based (Software RAID)
• Software implementations are now provided by many OSs.
• A software layer sits above the disk device drivers and provides an abstraction layer between the logical drives (RAIDs) and physical drives.
• The most commonly supported levels are RAID 0 and RAID 1, followed by RAID 1+0, RAID 0+1, and RAID 5.

RAID Implementations

• Software RAID
Apple's Mac OS X Server supports RAID 0, RAID 1, RAID 5 and RAID 1+0.
FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5 and all layerings of the above via GEOM modules (GEOM is the main storage framework of the FreeBSD OS). It also supports RAID 0, RAID 1, RAID-Z, and RAID-Z2 (similar to RAID 5 and RAID 6 respectively), plus nested combinations of those, via ZFS, Sun's file system and logical volume manager.
Linux supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6 and all layerings of the above.

RAID Implementations

• Software RAID
Microsoft's server OSs support 3 RAID levels: RAID 0, RAID 1, and RAID 5. Some of the Microsoft desktop OSs also support RAID, such as Windows XP Professional, which supports RAID level 0 in addition to spanning multiple disks, but only when using dynamic disks and volumes. Windows XP supports RAID 0, 1, and 5 with a simple file patch. RAID functionality in Windows is slower than hardware RAID, but allows a RAID array to be moved to another machine with no compatibility issues.
NetBSD supports RAID 0, RAID 1, RAID 4 and RAID 5 (and any nested combination of those, like 1+0) via its software implementation, named RAIDframe.

RAID Implementations

• Software RAID
OpenBSD aims to support RAID 0, RAID 1, RAID 4 and RAID 5 via its software implementation, softraid.
OpenSolaris and Solaris 10 support RAID 0, RAID 1, RAID 5 (or the similar “RAID Z” found only on ZFS), and RAID 6 (and any nested combination of those, like 1+0) via ZFS, and can now boot from a ZFS volume on both x86 and UltraSPARC (Sun's microprocessor).
  − Through SVM, Solaris 10 and earlier versions support RAID 1 for the boot filesystem, and add RAID 0 and RAID 5 support (and various nested combinations) for data drives.

RAID Implementations

• Hardware RAID
Hardware RAID controllers use different, proprietary disk layouts, so it is not usually possible to span controllers from different manufacturers.
They do not require processor resources, the BIOS can boot from them, and tighter integration with the device driver may offer better error handling.

RAID Implementations

• Hardware RAID
A hardware implementation of RAID requires at least a special-purpose RAID controller. On a desktop system this may be a PCI expansion card, a PCI-e expansion card, or built into the motherboard. Controllers supporting most types of drive may be used: IDE/ATA, SATA, SCSI, SSA, Fibre Channel, sometimes even a combination.
The controller and disks may be in a stand-alone disk enclosure, rather than inside a computer.
The enclosure may be directly attached to a computer, or connected via SAN.

RAID Implementations

• Hardware RAID
Most hardware implementations provide a read/write cache, which, depending on the I/O workload, will improve performance. In most systems the write cache is non-volatile (battery-protected), so pending writes are not lost on a power failure.
Hardware implementations provide guaranteed performance, add no overhead to the local CPU complex and can support many operating systems, as the controller simply presents a logical disk to the operating system.

RAID Implementations

• Reading Assignment
What are the advantages and disadvantages of SW RAID?
What are the advantages and disadvantages of HW RAID?

RAID and Backup!

• RAID is no substitute for back-up!
All RAID levels except RAID 0 offer protection from a single drive failure. A RAID 6 system even survives 2 disks dying simultaneously. For complete security, the data on a RAID system still needs to be backed up.
  − A back-up comes in handy if all drives fail simultaneously because of a power spike.
  − It is a safeguard if the storage system gets stolen.
  − Back-ups can be kept off-site at a different location. This can come in handy if a natural disaster or fire destroys your workplace.

RAID and Backup!

• RAID is no substitute for back-up!
  − The most important reason to back up multiple generations of data is user error. If someone accidentally deletes some important data and this goes unnoticed for several hours, days or weeks, a good set of back-ups ensures you can still retrieve those files.
  − **Read about existing backup strategies

Redundant Power supplies

• Power supplies are the 2nd most failure-prone part
• Ideally, servers should have redundant power supplies (RPSs)
  – The server will still operate if one power supply fails
  – Should have separate power cords
  – Should draw power from different sources (e.g., separate UPS)

Hot-swap components

• Redundant components should be hot-swappable
  – New components can be added without downtime
  – Failed components can be replaced without an outage
• Hot-swap components increase cost
  – But consider the cost of downtime
• Always check
  – Does the OS fully support hot-swapping components?
  – What parts are not hot-swappable?
  – How long/severe is the service interruption?

But Servers are expensive!

• Is there an alternative?
• Server appliances
  – Dedicated-purpose, already optimized
  – Examples: file servers, web servers, email, DNS, routers, etc.
• Many inexpensive workstations
  – Common approach for web services
    □ Google, Hotmail, Yahoo, etc.
  – Use full redundancy to counter unreliability
  – Can be useful (but need to consider total costs, e.g., support and maintenance, not just purchase price)
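
How redundancy counters unreliability can be estimated with a short calculation. The sketch below assumes that machines fail independently and that any single machine can carry the service alone; both are simplifications, and the 90% figure is made up for illustration.

```python
def combined_availability(single, copies):
    """Probability that at least one of `copies` independent machines is up."""
    return 1 - (1 - single) ** copies


# Three cheap machines that are each up 90% of the time.
three_cheap = combined_availability(0.90, 3)
```

Under these assumptions three 90% machines together reach about 99.9% availability. The support and maintenance effort grows with the number of machines, which is why the total cost, not only the purchase price, has to be compared.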


Managing Services

• Services distinguish a structured computing environment from a bunch of standalone computers

• Larger groups are typically linked by shared services that ease communication and optimize resources

• Typical environments have many services

– DNS, email, authentication, networking, printing

– Remote access, license servers, DHCP, software repositories, backup services, Internet access, file service

Managing Services

• Providing a service means
  – Not just putting together hardware and software
  – Making the service reliable
  – Scaling the service
  – Monitoring, maintaining, and supporting the service

Provide Good Solid Services

• Get customer requirements
  – Reason for service
    □ How the service will be used
    □ Features needed vs. desired
    □ Level of reliability required
    □ Justifies budget level
  – Define a service level agreement (SLA)
    □ Enumerates services
    □ Defines level of support provided
    □ Response time commitments for various kinds of problems
  – Estimate satisfaction from demos or small usability trials

Provide Good Solid Services

• Get operational requirements
  – What other services does it depend on?
    □ Only services/systems built to the same standards or higher
    □ Integration with existing authentication or directory services?
  – How will the service be administered?
  – Will the service scale for growth in usage or data?
  – How is it upgraded? Will it require touching each desktop?
  – Consider high-availability or redundant hardware
  – Consider network impact and performance for remote users
• Revisit budget after considering operational concerns

Provide Good Solid Services

• Consider an open architecture
  – e.g. open protocols and open file formats
  – Proprietary protocols and formats can be changed, which may cause dependent systems/vendors to become incompatible
  – Beware of vendors who “embrace and extend” so that claims can be made for standards support, while not providing customer interoperability
  – Open protocols allow different parties to select client vs. server portions separately
  – Open protocols change slowly, typically in upward compatible ways, giving maximum product choice
  – No need for protocol gateways (another system/service)

Provide Good Solid Services

• Favor simplicity over complexity
  KISS: Keep It Simple, Stupid
  – Simple systems are more reliable, easier to maintain, and less expensive
  – Typically a features vs. reliability trade-off
• Take advantage of vendor relationships
  – Provide recommendations for standard services
  – Let multiple vendors compete for your business
  – Understand where the product is going
  – Attempt to favor vendors who develop natively on your platform (not port to it)

Provide Good Solid Services

• Machine independence
  – Clients should access services using generic names
    □ e.g., www, calendar, pop, imap, etc.
  – Moving services to different machines becomes invisible to users
  – Consider (at the start) what it will take to move the service to a new machine
• Supportive environment
  – Data center provides power, AC, security, networking
  – Only rely on systems/services also found in the data center (within the protected environment)
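
The idea behind generic service names can be sketched with a plain dictionary standing in for DNS alias records. All host names below are hypothetical, and a real site would keep these aliases in its DNS zone rather than in code.

```python
# Generic service name -> machine currently providing the service.
# The host names are made-up examples.
SERVICE_ALIASES = {
    "www": "host-a.example.com",
    "imap": "host-b.example.com",
    "calendar": "host-b.example.com",
}


def resolve(service):
    """Return the machine that currently provides a service."""
    return SERVICE_ALIASES[service]


def move_service(service, new_host):
    """Point the generic name at a new machine; clients keep using the same name."""
    SERVICE_ALIASES[service] = new_host
```

Clients are configured with `imap`, never with the machine name, so after `move_service("imap", "host-c.example.com")` they reach the new machine without any change on their side.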

Provide Good Solid Services

• Reliability
  – Build on reliable hardware
  – Exploit redundancy when available
    □ Plug redundant power supplies into different UPSs on different circuits
  – Components of a service should be tightly coupled
    □ Reduce single points of failure
      – e.g., all on the same power circuit, network switch, etc.
    □ Includes dependent services
      – e.g., authentication, authorization, DNS, etc.

Provide Good Solid Services

• Reliability
  – Make the service as simple as possible
  – Put independent services on separate machines when possible, e.g. a DNS service independent of a mail service
    □ But put the multiple parts of a single service together

Provide Good Solid Services

• Restrict access
  – Customers should not need physical access to servers
    □ Fewer people
    □ Eliminate any unnecessary services on the server (security)
• Centralization and standards
  – Building a service = centralizing management of the service
  – May be desirable to standardize the service and centralize it within the organization as well
    □ Makes support easier, reducing training costs
    □ Eliminates redundant resources

Provide Good Solid Services

• Performance
  – If a complicated service is deployed, but slow, it is unsuccessful
  – Need to build in the ability to scale
    □ Can't afford to build servers for the service every year
    □ Need to understand how the service can be split across multiple machines if needed
  – Estimate the capacity required for production (and get room for growth)
  – First impression of the user base is very difficult to correct
  – When choosing hardware, consider whether the service is disk I/O, memory, or network bound
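
A capacity estimate with room for growth can be as simple as the calculation below. The compound-growth model, the headroom factor and all numbers are assumptions for illustration; real estimates should come from measured load.

```python
def required_capacity(current_load, annual_growth, years, headroom=0.2):
    """Capacity to provision now so the service still fits after `years` of growth.

    current_load  -- today's peak load, in any unit (users, GB, requests/s)
    annual_growth -- expected yearly growth as a fraction (0.5 means 50%)
    headroom      -- extra safety margin on top of the projection
    """
    projected = current_load * (1 + annual_growth) ** years
    return projected * (1 + headroom)


# 100 units today, growing 50% a year, sized for two years with 20% headroom.
target = required_capacity(100, 0.5, 2)
```

Here the projection is 225 units after two years, so roughly 270 units would be provisioned. If that exceeds what one machine can deliver, the estimate already shows that the service will need to be split across machines.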

Provide Good Solid Services

• Monitoring
  – Helpdesk or front-line support must be automatically alerted to problems
  – Customers that notice major problems before sysadmins are getting poor service
  – Need to monitor for capacity planning as well
• Service roll-out
  – First impressions
    □ Have all documentation available
    □ Helpdesk fully trained
    □ Use slow roll-out (helps clients adjust to service)
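
Automatic alerting, as described under Monitoring, can start from a very small availability probe. The sketch below only checks that a TCP port accepts connections; the service list, host names and alert wording are hypothetical, and a production site would normally use a dedicated monitoring system.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services, probe=port_is_open):
    """Return one alert message for every service that fails the probe."""
    alerts = []
    for name, (host, port) in services.items():
        if not probe(host, port):
            alerts.append(f"ALERT: {name} on {host}:{port} is not responding")
    return alerts
```

Running `check_services({"www": ("www.example.com", 80)})` from a scheduler and forwarding any returned alerts to the helpdesk means support staff hear about an outage before customers do. The `probe` parameter lets the check be replaced, for example in tests.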


Q & A