Nagios Log Server Architecture Overview
By Jesse Olson
Introducing: Myself
• Support Technician/Community Ambassador
• jolson
Topics Covered
• Elasticsearch, Logstash, Kibana
• Subsystems (Jobs and Poller)
• Backup architecture
• Best practices
Elasticsearch
• Database
• JSON Object storage
• Java based
• RESTful HTTP API
• Scalable & Redundant
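As a rough illustration of the "JSON Object storage" and "RESTful HTTP API" points, the sketch below builds (but does not send) an HTTP request that would store one JSON document. It assumes the 1.x-era `/{index}/{type}` URL layout; the host, port 9200, index name `logs-example`, and type name `syslog` are hypothetical values chosen for illustration, not taken from the talk.

```python
import json
from urllib.request import Request


def build_index_request(base_url, index, doc_type, document):
    """Build a POST request that would store one JSON document.

    Nothing is sent here; pass the result to urllib.request.urlopen
    against a reachable Elasticsearch instance to actually index it.
    """
    url = f"{base_url.rstrip('/')}/{index}/{doc_type}"
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: host, index, and type names are illustrative only.
request = build_index_request(
    "http://localhost:9200",
    "logs-example",
    "syslog",
    {"host": "something.arkham.local", "status": "On"},
)
print(request.get_method(), request.full_url)
```

Keeping the request construction separate from sending makes the sketch safe to run without a cluster.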
Elasticsearch Terminology
• Instance
• Cluster
• Shard
• Index
Redundancy and Performance Through Shards
[Diagram: a cluster of three instances (Instance 1, Instance 2, Instance 3) holding index 01012016, which is split into primary shards P1 to P5 and replica shards R1 to R5.]
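The idea behind the diagram, that each primary shard and its replica live on different instances, can be sketched as follows. This is a simple round-robin placement written for illustration; it is not Elasticsearch's actual allocation algorithm, and the function name `place_shards` is hypothetical.

```python
from collections import defaultdict


def place_shards(instances, shard_count):
    """Assign each primary and its replica to different instances.

    Round-robin placement: primary i goes to one instance and its
    replica goes to the next one, so losing any single instance
    still leaves one copy of every shard.
    """
    if len(instances) < 2:
        raise ValueError("need at least two instances to hold replicas")
    layout = defaultdict(list)
    for shard in range(1, shard_count + 1):
        primary_home = instances[(shard - 1) % len(instances)]
        replica_home = instances[shard % len(instances)]
        layout[primary_home].append(f"P{shard}")
        layout[replica_home].append(f"R{shard}")
    return dict(layout)


layout = place_shards(["Instance 1", "Instance 2", "Instance 3"], 5)
for instance, shards in layout.items():
    print(instance, shards)
```

Because a primary and its replica never share an instance, any one instance can fail without losing data, and searches can be served from either copy.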
It Just Works.
• Common Problems with Elasticsearch
  • Out of memory
  • Out of disk space
  • High latency between instances
  • Massive deployment
• Pitfalls aren't unique to Nagios Log Server
Logstash
• Log collection - inputs
• Log processing - filters
• Log exporting - outputs
Inputs
couchdb_changes drupal_dblog elasticsearch exec eventlog file ganglia gelf generator graphite github heartbeat heroku http http_poller irc imap jdbc jmx kafka log4j lumberjack meetup pipe puppet_facter relp rss rackspace rabbitmq redis snmptrap stdin sqlite s3 sqs stomp syslog tcp twitter unix udp varnishlog wmi websocket xmpp zeromq
• tcp
• udp
• syslog
Filters
aggregate alter anonymize collate csv cidr clone cipher checksum date dns drop elasticsearch extractnumbers environment elapsed fingerprint geoip grok i18n json json_encode kv mutate metrics multiline metaevent prune punct ruby range syslog_pri sleep split throttle translate uuid urldecode useragent xml zeromq
• grok
• mutate
• geoip
Outputs
boundary circonus csv cloudwatch datadog_metrics email elasticsearch exec file google_bigquery google_cloud_storage ganglia gelf graphtastic graphite hipchat http irc influxdb juggernaut jira kafka lumberjack librato loggly mongodb metriccatcher nagios null nagios_nsca opentsdb pagerduty pipe riemann redmine rackspace rabbitmq redis riak s3 sqs stomp statsd solr_http sns syslog stdout tcp udp webhdfs websocket xmpp zeromq
• elasticsearch
• hipchat
• nagios
[Diagram: log data flows from Inputs (tcp/udp/syslog) through Filters (grok/mutate/geoip) to Outputs (elasticsearch) and into the NLS Cluster.]
Raw log line:
DEV something.arkham.local IP 192.168.1.1 COUNTRY US CODE ERROR STATUS ON
Parsed fields:
Field     Value
Device    something.arkham.local
IP        192.168.1.1
Country   US
Response  Error
Status    On
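The filter step, turning the raw line into named fields, can be approximated in a few lines. Grok patterns are built on regular expressions, but the sketch below is plain Python rather than Logstash configuration; the pattern and the `parse_line` helper are assumptions written to fit the single sample line from the slide.

```python
import re

# Pattern for the sample line on the slide; this is plain Python,
# not Logstash grok syntax.
LINE_PATTERN = re.compile(
    r"DEV (?P<Device>\S+) "
    r"IP (?P<IP>\S+) "
    r"COUNTRY (?P<Country>\S+) "
    r"CODE (?P<Response>\S+) "
    r"STATUS (?P<Status>\S+)"
)


def parse_line(line):
    """Split a raw log line into named fields, or return None on no match."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Mutate-style cleanup: normalise the casing of the two status words.
    fields["Response"] = fields["Response"].capitalize()
    fields["Status"] = fields["Status"].capitalize()
    return fields


raw = "DEV something.arkham.local IP 192.168.1.1 COUNTRY US CODE ERROR STATUS ON"
for name, value in parse_line(raw).items():
    print(f"{name:<10}{value}")
```

Once a line is split into fields like these, each field can be searched and graphed individually instead of matching on the whole message.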
Kibana
Nagios Log Server Community
How does Nagios Log Server differ from the ELK Stack?
Nagios Log Server Literal ELK stack
Key Differences
• Users
• Alerting
• Backups
• Security
• Support
• Administration Time
Installation Differences
Nagios Log Server
cd /tmp
wget assets.nagios.com/downloads/nagios-log-server/nagioslogserver-latest.tar.gz
tar xzf nagioslogserver-latest.tar.gz
cd nagioslogserver
./fullinstall
./upgrade
ELK Stack
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
echo 'deb http://packages.elasticsearch.org/elasticsearch/1.4/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list
sudo apt-get update
sudo apt-get -y install elasticsearch=1.4.4
sudo vi /etc/elasticsearch/elasticsearch.yml
# add to elasticsearch.yml: network.host: localhost
sudo service elasticsearch restart
sudo update-rc.d elasticsearch defaults 95 10
cd ~; wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.1-linux-x64.tar.gz
tar xvf kibana-*.tar.gz
vi ~/kibana-4*/config/kibana.yml
# add to kibana.yml: host: "localhost"
sudo mkdir -p /opt/kibana
sudo cp -R ~/kibana-4*/* /opt/kibana/
cd /etc/init.d && sudo wget https://gist.githubusercontent.com/thisismitch/8b15ac909aed214ad04a/raw/bce61d85643c2dcdfbc2728c55a41dab444dca20/kibana4
sudo chmod +x /etc/init.d/kibana4
sudo update-rc.d kibana4 defaults 96 9
sudo service kibana4 start
sudo apt-get install nginx apache2-utils
sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin
sudo vi /etc/nginx/sites-available/default
sudo service nginx restart
echo 'deb http://packages.elasticsearch.org/logstash/1.5/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash.list
sudo apt-get update
sudo apt-get install logstash
sudo mkdir -p /etc/pki/tls/certs
sudo mkdir /etc/pki/tls/private
sudo vi /etc/ssl/openssl.cnf
cd /etc/pki/tls
sudo openssl req -config /etc/ssl/openssl.cnf -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
cd /etc/pki/tls; sudo openssl req -subj '/CN=logstash_server_fqdn/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
sudo vi /etc/logstash/conf.d/01-lumberjack-input.conf
sudo vi /etc/logstash/conf.d/10-syslog.conf
sudo vi /etc/logstash/conf.d/30-lumberjack-output.conf
sudo service logstash restart
scp /etc/pki/tls/certs/logstash-forwarder.crt user@client_server_private_address:/tmp
echo 'deb http://packages.elasticsearch.org/logstashforwarder/debian stable main' | sudo tee /etc/apt/sources.list.d/logstashforwarder.list
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get update
sudo apt-get install logstash-forwarder
sudo mkdir -p /etc/pki/tls/certs
sudo cp /tmp/logstash-forwarder.crt /etc/pki/tls/certs/
Subsystems: Jobs and Poller
• Queue Based
• Automatic
• Cron Controlled
Jobs Subsystem
• Apply configuration
• Changing timezone
• Snapshots
• Start or stop services
• Alerts
• Backups
/usr/local/nagioslogserver/var/jobs.log
Poller Subsystem
• Keeps instances clustered
• Checks for updates
• Elasticsearch service status
• Logstash service status
• Instance IP address
• Instance hostname
/usr/local/nagioslogserver/var/poller.log
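"Queue based" and "cron controlled" can be pictured as a list of pending jobs that is drained each time a scheduler fires. The sketch below is a hypothetical model written for illustration, not how Nagios Log Server implements its subsystems; the class name `JobQueue` and the job names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class JobQueue:
    """Hypothetical queue that a cron-style tick drains.

    Each entry is (run_at, name); the names are illustrative only.
    """
    pending: list = field(default_factory=list)

    def enqueue(self, run_at, name):
        self.pending.append((run_at, name))

    def tick(self, now):
        """Run every job whose time has come; keep the rest queued."""
        due = sorted(
            (job for job in self.pending if job[0] <= now),
            key=lambda job: job[0],
        )
        self.pending = [job for job in self.pending if job[0] > now]
        return [name for _, name in due]


queue = JobQueue()
queue.enqueue(60, "apply_configuration")
queue.enqueue(120, "snapshot")
queue.enqueue(60, "alert_check")

# Each call stands in for one cron invocation at the given time (seconds).
print(queue.tick(60))
print(queue.tick(120))
```

Jobs that are not yet due simply stay in the queue until a later tick, which is what makes the processing automatic from the user's point of view.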
Backup Architecture
• Configuration Backup
• Snapshots
• Log backups
One reason you might need a backup server.
Configuration Backups
[jolson@localhost ~]# ls -lh /store/backups/nagioslogserver
Sep 3 nagioslogserver.2015-09-03.1441308221.tar.gz
Sep 4 nagioslogserver.2015-09-04.1441394621.tar.gz
Sep 5 nagioslogserver.2015-09-05.1441481022.tar.gz
Sep 6 nagioslogserver.2015-09-06.1441567426.tar.gz
Sep 7 nagioslogserver.2015-09-07.1441653826.tar.gz
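The file names in the listing carry a date and a trailing number that appears to be a Unix epoch timestamp (the values are roughly 86400 seconds apart, one per day). Under that assumption, a small sketch can pick out the newest backup; the helper names `parse_backup` and `newest_backup` are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed layout: nagioslogserver.<YYYY-MM-DD>.<epoch seconds>.tar.gz
BACKUP_NAME = re.compile(
    r"nagioslogserver\.(?P<date>\d{4}-\d{2}-\d{2})\.(?P<epoch>\d+)\.tar\.gz"
)


def parse_backup(name):
    """Return (date string, UTC datetime) from a backup file name."""
    match = BACKUP_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised backup name: {name}")
    created = datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc)
    return match["date"], created


def newest_backup(names):
    """Pick the most recent backup by its embedded epoch timestamp."""
    return max(names, key=lambda name: parse_backup(name)[1])


names = [
    "nagioslogserver.2015-09-03.1441308221.tar.gz",
    "nagioslogserver.2015-09-04.1441394621.tar.gz",
    "nagioslogserver.2015-09-05.1441481022.tar.gz",
    "nagioslogserver.2015-09-06.1441567426.tar.gz",
    "nagioslogserver.2015-09-07.1441653826.tar.gz",
]
print(newest_backup(names))
```

Sorting on the embedded timestamp rather than on file modification time keeps the result stable even if the archives are copied to another server.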
Snapshots
/usr/local/nagioslogserver/snapshots
Log Backups
[Diagram: a cluster of four instances (i1, i2, i3, i4) connected to an NFS server.]
Best Practices
• 60GB Memory per instance
• Rotation Schedule
• Avoiding Split Brain
Avoiding Split Brain
Instance 1 Instance 2 Instance 3
Minimum Master Nodes: 2
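The value 2 for three instances is consistent with the usual majority rule for avoiding split brain: require more than half of the master-eligible instances before a master can be elected. In Elasticsearch 1.x this corresponds to the `discovery.zen.minimum_master_nodes` setting. The sketch below assumes every instance is master-eligible; the function name is hypothetical.

```python
def minimum_master_nodes(master_eligible):
    """Smallest majority of master-eligible instances: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("cluster needs at least one master-eligible instance")
    return master_eligible // 2 + 1


for count in (1, 2, 3, 4, 5):
    print(count, "instances ->", minimum_master_nodes(count))
```

With a majority required, two halves of a partitioned cluster can never both elect a master, because at most one side can hold more than half of the instances.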
Thank you!
Any Questions?