Have you been stalking your servers?

Post on 10-May-2015

DESCRIPTION

Presentation for DrupalCon Prague 2013 https://prague2013.drupal.org/session/have-you-been-stalking-your-servers

TRANSCRIPT

Have you been stalking your servers?

Marji Cermak, Sysadmin & DevOps Engineer at Morpht

marji@morpht.com, @cermakm

The rule of 3 things

picture: http://www.flickr.com/photos/helenaperezgarcia/5692392667/

1. What is monitoring and why do you want to monitor

2. Some monitoring tools available for you

3. It is easy to start with monitoring.

Part 1

What is monitoring and why do you want to monitor

photo: http://www.flickr.com/photos/tiagopadua/7903366470/

Monitoring

Monitoring is an intermittent (regular or irregular) series of observations in time, carried out to show the extent of compliance with a formulated standard or degree of deviation from an expected norm.

J. M. Hellawell (1991), modified by A. Brown (2000), http://jncc.defra.gov.uk/page-2268 (nature conservation area)

Why you need to monitor

● to know about the bad news before your customers (or your boss)

● to scale up your server in advance

● to tune up your app

Why you need to monitor (cont.)

● to prove your uptime of 99.999% :)

The fun of the nines

Source: http://en.wikipedia.org/wiki/High_availability

Nines: http://en.wikipedia.org/wiki/List_of_unusual_units_of_measurement#Nines
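The table behind "the nines" is plain arithmetic: the downtime budget is the unavailable fraction multiplied by the period. A minimal sketch in Python, assuming a 365-day year; the function name is made up for this illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def allowed_downtime_seconds(nines: int) -> float:
    """Downtime budget per year for an availability of `nines` nines.

    Three nines means 99.9%, five nines means 99.999%, and so on.
    """
    unavailability = 10 ** (-nines)  # e.g. 5 nines -> 0.00001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(1, 7):
        availability = 100 * (1 - 10 ** (-n))
        minutes = allowed_downtime_seconds(n) / 60
        print(f"{n} nines ({availability:.4f}%): {minutes:,.2f} minutes per year")
```

Five nines leaves a little over five minutes of downtime per year, which is why the claim is hard to prove without monitoring data.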

Why you need to monitor (cont.)

● to prove your uptime of 99.999% :)

● to minimise downtime (expensive)

● to capture customer information

Why you need to monitor (cont.)

● to have data / metrics to diagnose

Diagnosing your collected data

Watch out for:

● trends

● spikes

● irregularities

● thresholds
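Three of these checks (thresholds, spikes, trends) can be sketched on a plain list of samples. This is an illustration only, not how Nagios or Munin evaluate data; the function names, the window size and the spike factor are assumptions chosen for the example.

```python
from statistics import mean


def over_threshold(samples, limit):
    """Return the indexes of samples that exceed a fixed limit."""
    return [i for i, value in enumerate(samples) if value > limit]


def spikes(samples, window=3, factor=2.0):
    """Return indexes where a sample exceeds `factor` times the mean
    of the previous `window` samples."""
    found = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if baseline > 0 and samples[i] > factor * baseline:
            found.append(i)
    return found


def trend(samples):
    """Least-squares slope: average change per sample interval.

    Needs at least two samples.
    """
    n = len(samples)
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(samples))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den
```

For example, `trend` applied to daily disk usage readings gives the growth per day, which is the number needed to scale up in advance.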

Areas to monitor

● network (photo: http://www.flickr.com/photos/misja_klimov/2120956405/)

● server (photo: http://www.flickr.com/photos/johnjack/3666997634/)

● services (photos: http://www.flickr.com/photos/agustingodet/3691794089/, http://www.flickr.com/photos/agustingodet/3691792393/)

● applications (photo: http://www.flickr.com/photos/cheerfulstoic/942211994/)

● users (photo: http://www.flickr.com/photos/jimmysmith/99528596/)

Drupal Areas to monitor

● network

● server

● services

○ webserver

○ database

● applications - your Drupal site(s)

● users

Part 2

Some monitoring tools available for you

Meet Nagios, Munin and others

● Nagios

● Munin

● APC dashboard

● related Drupal modules

Nagios /ˈnɑːɡiːoʊs/

● system, network and infrastructure monitoring software application

● monitors and alerts

Nagios /ˈnɑːɡiːoʊs/

Provides monitoring of:

● network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH),

● host resources (processor load, disk usage, system logs),

● anything else, such as probes (temperature, alarms, etc.).

Many plugins available.
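A Nagios plugin is just a program that prints one status line and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is a hypothetical disk usage check in Python; the thresholds, labels and function names are assumptions made for the example, not part of any shipped plugin.

```python
import shutil

# Exit codes that Nagios plugins use to report a state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to an (exit code, status line) pair."""
    if percent_used >= crit:
        state = CRITICAL
    elif percent_used >= warn:
        state = WARNING
    else:
        state = OK
    # Text after the "|" is performance data: label=value;warn;crit
    line = (f"DISK {STATE_NAMES[state]} - {percent_used:.1f}% used"
            f" | used_pct={percent_used:.1f}%;{warn:.0f};{crit:.0f}")
    return state, line


def main(path="/"):
    """Check one filesystem, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    state, line = evaluate(100.0 * usage.used / usage.total)
    print(line)
    return state

# When saved as a plugin script, finish with: sys.exit(main())
```

Nagios only looks at the exit code and the first line of output, so the same pattern works for any probe that can be turned into a number.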

Nagios /ˈnɑːɡiːoʊs/

Name and pronunciation:

● formerly NetSaint; "Nagios" is a recursive acronym: "Nagios Ain't Gonna Insist On Sainthood"

● "Agios" is a transliteration of the Greek word άγιος (saint)

Nagios /ˈnɑːɡiːoʊs/

● alerts by email/pager/IM...

● alerts to different contacts

● notification escalation

● service / host dependencies

● soft / hard states
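Soft and hard states are what stop a single failed check from paging anyone: a problem stays "soft" while Nagios retries, and only becomes "hard" (and notifies) once the configured number of attempts has failed. The class below is a simplified model of that idea in Python, not Nagios code; it ignores re-notification intervals, escalations and dependencies, and all names apart from `max_check_attempts` are made up.

```python
class ServiceState:
    """Tracks soft/hard state for one service, loosely following Nagios' model."""

    def __init__(self, max_check_attempts=3):
        self.max_check_attempts = max_check_attempts
        self.attempt = 0          # consecutive non-OK results seen so far
        self.state_type = "HARD"  # a service that is OK is in a hard OK state
        self.ok = True

    def record(self, check_ok):
        """Feed one check result; return True if a notification should go out."""
        if check_ok:
            # Recovery is only announced if the problem had become hard.
            notify = (not self.ok) and self.state_type == "HARD"
            self.ok, self.attempt, self.state_type = True, 0, "HARD"
            return notify
        was_hard_problem = (not self.ok) and self.state_type == "HARD"
        self.ok = False
        self.attempt = min(self.attempt + 1, self.max_check_attempts)
        if self.attempt < self.max_check_attempts:
            self.state_type = "SOFT"  # still retrying, stay quiet
            return False
        self.state_type = "HARD"
        # Notify once, when the problem first turns hard.
        return not was_hard_problem
```

With three attempts configured, two failed checks followed by a success produce no alert at all, while three failures in a row produce one problem notification and, later, one recovery notification.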

Nagios Addons

NRPE (Nagios Remote Plugin Executor): executes plugins on remote Linux/Unix hosts

NSCA (Nagios Service Check Acceptor): sends passive check results from remote Linux/Unix hosts to Nagios

image source: http://nagios.sourceforge.net/docs/3_0/addons.html

Drupal and Nagios

Munin

● network/system monitoring application

● outputs graphs through a web interface

● many plugins
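Munin plugins follow a small text protocol: run with the argument `config`, a plugin prints the graph definition; run without arguments, it prints the current values as `<field>.value <number>` lines. A minimal sketch in Python, assuming a Unix host; the graph title and the field name `load` are chosen for this example only.

```python
import os
import sys


def config_lines():
    """Graph metadata, requested by running the plugin with the argument "config"."""
    return [
        "graph_title Load average (example)",
        "graph_vlabel load",
        "graph_category system",
        "load.label 1 minute load",
    ]


def value_lines(load_1min):
    """Current reading, one "<field>.value <number>" line per field."""
    return [f"load.value {load_1min:.2f}"]


def run(argv):
    """Dispatch on the first command-line argument, as munin-node expects."""
    if len(argv) > 1 and argv[1] == "config":
        return config_lines()
    load_1min, _, _ = os.getloadavg()  # Unix only
    return value_lines(load_1min)


if __name__ == "__main__":
    print("\n".join(run(sys.argv)))
```

Because the protocol is only lines of text, plugins can be written in any language available on the node.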

Munin

● master / node architecture

● the master connects to all nodes at regular intervals

● it uses RRDtool (round-robin database tool, handles time-series data)
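"Round robin" means the store has a fixed size: new data overwrites the oldest, and several raw samples can be consolidated into one stored row. The class below illustrates that idea in Python. It is not RRDtool's file format or API; the class and parameter names are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: every `steps` raw samples are averaged into one row,
    and only the newest `rows` rows are kept."""

    def __init__(self, rows, steps=1):
        self.steps = steps
        self.rows = deque(maxlen=rows)  # oldest row drops off automatically
        self._pending = []

    def update(self, value):
        """Add one raw sample; store an averaged row once `steps` samples arrived."""
        self._pending.append(value)
        if len(self._pending) == self.steps:
            self.rows.append(sum(self._pending) / self.steps)
            self._pending.clear()
```

Keeping a few archives with different `steps` values is what lets a graphing tool show a detailed day and a coarse year without the data growing forever.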

Munin Example

Drupal and Munin

Nagios & Munin

● they complement each other

● Nagios normally alerts on one “service”

● Munin can be used to correlate different things

APC - what is it?

The Alternative PHP Cache (APC) is a free and open opcode cache for PHP.

Its goal is to provide a free, open, and robust framework for caching and optimising PHP intermediate code.

Inside your webserver (not a webcache)

Monitoring APC: memory usage, hits & misses

Monitoring APC: fragmentation

Monitoring APC: memory usage

Monitoring APC: files in cache
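The numbers behind these graphs reduce to two ratios. A small sketch in Python, assuming the hit, miss and memory counters have already been read from the cache; the function names and the sample figures in the usage note are hypothetical.

```python
def hit_ratio(hits, misses):
    """Share of opcode cache lookups served from the cache, in percent."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def memory_used_percent(total_bytes, free_bytes):
    """Share of the cache's shared memory that is in use, in percent."""
    return 100.0 * (total_bytes - free_bytes) / total_bytes
```

For instance, 980 hits and 20 misses give a hit ratio of 98%. A falling hit ratio together with memory usage close to 100% suggests the cache is too small for the code base.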

Other monitoring tools

● Collectd

● Graphite

● Shinken

● Sensu

● New Relic

● Pingdom

Part 3

It is easy to start with monitoring.

How to install these tools?

Munin

sudo apt-get install munin munin-node

Nagios

sudo apt-get install nagios3

APC dashboard

apc.php script from the php-apc package

How to configure these?

● It is a bit fiddly

● There are many guides targeting beginners

● You don’t want to do it again and again

puppet – a quick way to start

system for automating system administration tasks

puppet – a quick way to start

● a declarative language for expressing system configuration,

● a client and server for distributing it

● and a library for realising the configuration.

puppet – a quick way to start

1. clone the stalk-your-box repo

2. run puppet apply on the code

3. monitor!

A quick way to start

$ git clone git://github.com/morpht/stalk-your-box.git /tmp/stalk-your-box

Cloning into '/tmp/stalk-your-box'...
remote: Counting objects: 23, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 23 (delta 1), reused 23 (delta 1)
Receiving objects: 100% (23/23), 11.35 KiB, done.
Resolving deltas: 100% (1/1), done.

A quick way to start

$ cd /tmp/stalk-your-box/
$ sudo puppet apply --modulepath=modules manifest.pp

notice: /Stage[main]/Nagios::Server/Package[nagios3]/ensure: ensure changed 'purged' to 'present'

notice: /Stage[main]/Nagios::Server/File[/etc/nagios3/htpasswd.users]/ensure: created

notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: Adding password for user nagiosadmin

notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: executed successfully

notice: /Stage[main]/Munin::Node/Package[libcache-cache-perl]/ensure: ensure changed 'purged' to 'present'

notice: /Stage[main]/Munin::Node/Package[munin-node]/ensure: ensure changed 'purged' to 'present'

notice: /Stage[main]/Munin::Node/File[munin-node.conf]/content: content changed '{md5}e486786f866d7d7e025dea401c300e7b' to '{md5}dbf97a87a8da86ef68155815ecae3c1c'

notice: /Stage[main]/Munin::Server/Service[apache2]: Triggered 'refresh' from 1 events

notice: Finished catalog run in 44.26 seconds

What this gives you

Manifest.pp

Summary

It is easy to start with monitoring.

The fun part - what’s wrong?

What’s wrong here?

Questions

Here is the get-started monitoring repo: https://github.com/morpht/stalk-your-box

Marji Cermak, Sysadmin & DevOps Engineer at Morpht

marji@morpht.com, @cermakm

Resources

Rule of Three: en.wikipedia.org/wiki/Rule_of_three_(writing)

Nagios: http://www.nagios.org/

Munin: http://munin-monitoring.org/

Nagios module: https://drupal.org/project/nagios

Munin module: https://drupal.org/project/munin

Munin plugins (experimental): https://drupal.org/sandbox/murrayw/2084281

Sensu: http://sensuapp.org

MySQLTuner: http://MySQLTuner.pl

THANK YOU!

WHAT DID YOU THINK?

Locate this session at the DrupalCon Prague website: http://prague2013.drupal.org/schedule

Click the “Take the survey” link
