Why favour Icinga over Nagios @ OSDC 2015


Upload: icinga

Post on 18-Jul-2015


www.icinga.org

Why favour Icinga over Nagios

OSDC Berlin - 23rd April 2015

•  Bernd Erk

•  Working @NETWAYS

•  Icinga Co-Founder

•  @gethash

THE ICINGA PROJECT

Open Source Enterprise Monitoring

Icinga is a scalable and extensible monitoring system which checks the availability of your resources, notifies users of outages and provides extensive BI data.

•  Originally forked from Nagios in 2009

•  Independent version Icinga 2 since 2014

You?

•  Icinga Core: C-based source; IDOUtils for MySQL, PostgreSQL, Oracle

•  Icinga 2: C++-based source with multiple components (IDO, Livestatus, Cluster, …)

•  Icinga Web: based on PHP using ExtJS, Agavi MVC

•  Icinga Web 2: based on PHP / responsive design

•  Icinga Reports: based on Jasper Reports

•  Icinga Doc: based on Markdown

•  Icinga Quality, Testing and Community Support: website and open source ticketing system

•  3rd Party Tools

ICINGA 2 - INTRODUCTION

•  Monitors everything

•  At regular intervals

•  Gathers status

•  Collects performance data

•  Notifies using any channel

•  Detects dependencies

•  Handles events in a configured way

•  Forwards logs to Logstash and Graylog

•  Passes performance data to Graphite, OpenTSDB or InfluxDB

•  Based on C++ and Boost

•  Supports MySQL and PostgreSQL

•  Includes an extensive template library

•  Version 2.3.4 was released a few days ago

•  Puppet, Chef and Ansible support

•  Packages and Vagrant box available

WHY IS NAGIOS™ GOOD?

•  Monitoring things is very easy

•  Very simple software stack

•  No complex external dependencies

•  Active checks are powerful

•  Gathering performance data

•  Huge community

•  Thousands of Plugins

OK, BUT WHY ICINGA THEN?

NAGIOS DOES NOT SCALE

•  It is just a single loop

•  Limitations using external interfaces

•  Icinga 2 is a multithreaded C++ Core

•  Load is distributed automatically

•  Ability to monitor thousands of devices at intervals of seconds

ADDING MODULES IS HARD

# tar xzvf mk-livestatus-1.2.4.tar.gz
# cd mk-livestatus-1.2.4
# ./configure --prefix=/usr/local/icinga --exec-prefix=/usr/local/icinga
# make
# cp src/livestatus.o /usr/local/icinga/bin

define module {
        module_name     mklivestatus
        path            /usr/local/icinga/bin/livestatus.o
        module_type     neb
        args            /usr/local/icinga/var/rw/live
}

 

(Diagram: Icinga 2 features Checker, Notify, GELF, Perfdata, Graphite, IDO, Compat, Livestatus)

•  We have a powerful CLI

•  Adding new features is easy

•  You can build really sophisticated setups … but you don’t have to

# icinga2 feature enable livestatus

# icinga2 feature enable ido-mysql
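As a rough illustration of what enabling a feature activates, here is a sketch of a feature configuration file for the Graphite writer named in the feature list. The host and port values are assumptions for a local Graphite instance, not values from the talk, and the `library` line reflects 2.3-era sample configs.

```
/* Sketch of a feature file such as features-available/graphite.conf. */

/* Assumption: 2.3-era configs load the writer type with this statement;
 * other releases may not need it. */
library "perfdata"

object GraphiteWriter "graphite" {
  /* Assumed address of a local Graphite (carbon) listener. */
  host = "127.0.0.1"
  port = 2003
}
```

Enabling the feature makes this object part of the running configuration after the next restart; disabling it removes it again without touching the rest of the config.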

DEMO

NO CLUSTERING AND DISTRIBUTION

•  There is no integrated failover mechanism

•  Configuration is not distributed

•  No shared monitoring information

•  Zones for multitenancy environments

•  Support for logical splits in the config

•  Availability and scaling zones

•  Automatic redistribution of checks
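The zone concept listed above can be sketched with a small configuration. This is a minimal example under stated assumptions: all endpoint names and addresses are hypothetical (example.org names, documentation IP ranges), and a real setup would also need certificates for each node.

```
/* Two endpoints in one zone: an availability zone whose members
 * share the checks between them. Names and addresses are hypothetical. */
object Endpoint "master1.example.org" {
  host = "192.0.2.11"
}

object Endpoint "master2.example.org" {
  host = "192.0.2.12"
}

object Zone "master" {
  endpoints = [ "master1.example.org", "master2.example.org" ]
}

/* A child zone, e.g. for a separate site or tenant. */
object Endpoint "satellite1.example.org" {
  host = "192.0.2.21"
}

object Zone "satellite" {
  endpoints = [ "satellite1.example.org" ]
  parent = "master"
}
```

The `parent` attribute is what builds the hierarchy: the satellite zone reports to the master zone, while the two master endpoints act as peers.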

(Diagram: cluster nodes, each running a subset of IDO, Config, Checker, Livestatus, Perfdata and GELF)

SECURITY IS A MESS

•  NSCA works, but not in a good way

•  NRPE has a couple of security issues

•  You can make it secure … by hand

•  Bidirectional communication using SSL

•  “binlog”-like retention for events

•  Distributed features throughout the cluster
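The SSL-secured cluster communication mentioned above is configured through the API listener. The following is a sketch assuming the 2.3-era layout, where certificates live under a `pki` directory and are named after the node; the paths and the port are assumptions and later releases may manage certificates differently.

```
/* Sketch of an API listener for encrypted cluster communication.
 * Paths assume the 2.3-era pki directory layout. */
object ApiListener "api" {
  cert_path = SysconfDir + "/icinga2/pki/" + NodeName + ".crt"
  key_path = SysconfDir + "/icinga2/pki/" + NodeName + ".key"
  ca_path = SysconfDir + "/icinga2/pki/ca.crt"

  /* Assumed port for cluster connections. */
  bind_port = 5665
}
```

Every node presents a certificate signed by the same CA, so both sides of a connection are authenticated rather than only the server.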

CONFIGURATION LIMITS

define service{
        host_name               linux1,linux2,linux3,...,linux9
        service_description     ssh-check
        other service directives ...
}

apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"
  assign where host.address && host.vars.os == "Linux"
}

apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"
  assign where host.address && host.vars.os == "Linux"
  ignore where host.vars.test == true
}
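To make the effect of the apply rule above concrete, here is a sketch of two hosts it would be evaluated against. The host names, addresses and the `generic-host` template are assumptions for illustration; only `vars.os` and `vars.test` come from the rule itself.

```
/* Matches the rule: has an address and vars.os is "Linux",
 * so it would receive the "ssh" service. */
object Host "linux1" {
  import "generic-host"
  address = "192.0.2.31"
  vars.os = "Linux"
}

/* Also matches "assign where", but vars.test is true,
 * so "ignore where" excludes it and no "ssh" service is created. */
object Host "linux-test1" {
  import "generic-host"
  address = "192.0.2.32"
  vars.os = "Linux"
  vars.test = true
}
```

Adding a tenth Linux host then only requires a new Host object with the right custom variables; the service definition itself stays untouched.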

define hostgroup{
        hostgroup_name          linux-servers
        alias                   Linux Servers
        members                 linux1,linux2,linux3
}

object Host "mysql-server1" {
  address = "10.0.0.1"
  check_command = "hostalive"
}

object HostGroup "mysql-server" {
  display_name = "MySQL Server"
  assign where match("*mysql*", host.name)
}

ONE MORE THING …

object Service "webservice" {
  import "generic-service"
  check_command = "load"
  host_name = "a really great server"
  vars.load_wload1 = {{
    if (get_time_period("9to5").is_inside) {
      return 40
    } else {
      return 60
    }
  }}
}
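The function above looks up a time period named "9to5", which has to exist as an object. A minimal sketch of such a definition follows; the weekday ranges are an assumption about what "9to5" is meant to cover, and `legacy-timeperiod` is the template shipped with the template library.

```
/* Sketch of the time period referenced by get_time_period("9to5").
 * The ranges are assumed, not taken from the talk. */
object TimePeriod "9to5" {
  import "legacy-timeperiod"

  display_name = "9to5"
  ranges = {
    "monday"    = "09:00-17:00"
    "tuesday"   = "09:00-17:00"
    "wednesday" = "09:00-17:00"
    "thursday"  = "09:00-17:00"
    "friday"    = "09:00-17:00"
  }
}
```

With this in place, the warning threshold evaluates to 40 while the current time is inside one of the ranges and to 60 otherwise.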

•  Different config format

•  You won’t miss the old config!

•  It is really time for a change

•  You will love it!

WHAT YOU SEE IS WHAT YOU GET

HOPEFULLY NOT!

•  Parsing status.dat is not really cool

•  Executing commands is hard

•  Very inflexible architecture

•  Limitations in current Icinga interfaces

•  Really hard to extend and integrate

•  No unified interface so far

•  Easy to extend and embed

•  Multiple authentication providers

•  Support for database and Livestatus backends

•  Responsive

(Diagram: Icinga Web 2 with modules Monitoring, Docs, BP, Graphite, PNP)

Demo

THE COMMUNITY

You?

•  Portland 2015: October 10th

•  Kuala Lumpur 2015: June 9th

CONCLUSION

•  Download Icinga 2

•  Rethink your configuration

•  Install Icinga Web 2 and play with it

•  Give us feedback

#icinga  

THANK YOU!

www.icinga.org

dev.icinga.org

git.icinga.org

@icinga

/icinga

+icinga