Upload: theo-schlossnagle

Post on 15-Jan-2015

Page 1: Noit ocon-2010

Reconnoiter: Large-scale trending and fault-detection

another product built from pain

Sunday, August 1, 2010

Page 2: Noit ocon-2010

Goals

• make checks cheap: 10000+ checks on cheap hardware

• centralized configuration management.

• decentralized configuration manipulation.

• decouple data collection from:

  • visualization/trending

  • fault-detection

• make life suck just a little less.

Page 3: Noit ocon-2010

Architectural Design

Page 4: Noit ocon-2010

System Components

• noitd

  • C, hybrid thread/event model, async I/O, small, fast, efficient,

  • extensible in Lua.

• stratcond

  • C, brokers and aggregates data, feeds PostgreSQL,

  • feeds Esper complex event processing system for fault-detection,

  • has a comet-style webserver built in for feeding web clients.

• postgres

• webconsole

  • PHP, almost entirely AJAX based,

  • uses canvas to draw.

Page 5: Noit ocon-2010

Some basics on the architecture

• Everything important happens over SSL:

  • Services exposed over SSL with certificates,

  • Clients connect using client certificates as well.

• Designed so that data collection is journalled and replayed

  • prevents data loss due to transient networking issues between the NOC and the data center.

• There are a lot of moving parts...

  • designed to work when the parts don’t get along.
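The journal-and-replay idea on this slide can be illustrated with a small shell sketch. This is not Reconnoiter's journal format or code; the file layout, the `record`, `deliver` and `replay` functions, and the `link_up` flag are all hypothetical names made up for the illustration. The point it shows: collection only appends locally, and a checkpoint advances only after a sample has been delivered, so an outage delays data rather than losing it.

```shell
#!/bin/sh
# Hypothetical journal-and-replay sketch; not Reconnoiter's actual format.
set -eu

journal=$(mktemp)      # append-only journal of collected samples
checkpoint=$(mktemp)   # number of journal lines already delivered
delivered=$(mktemp)    # stand-in for the central collector
echo 0 > "$checkpoint"
link_up=1              # set to 0 to simulate a network outage

# Collection only ever appends to the local journal.
record() {
    printf '%s\n' "$1" >> "$journal"
}

# Stand-in for sending one line upstream; fails while the link is down.
deliver() {
    [ "$link_up" -eq 1 ] || return 1
    printf '%s\n' "$1" >> "$delivered"
}

# Replay every line past the checkpoint, advancing it only on success.
replay() {
    done_count=$(cat "$checkpoint")
    tail -n "+$((done_count + 1))" "$journal" > "$journal.pending"
    while IFS= read -r line; do
        deliver "$line" || break
        done_count=$((done_count + 1))
        echo "$done_count" > "$checkpoint"
    done < "$journal.pending"
    rm -f "$journal.pending"
}

record "cpu 0.42"
replay            # link is up: first sample is delivered
link_up=0
record "cpu 0.57"
replay            # link is down: sample stays in the journal
link_up=1
record "cpu 0.61"
replay            # link is back: both pending samples arrive in order
```

After the last `replay`, the checkpoint equals the number of journal lines and the collector file holds all three samples in their original order.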

Page 6: Noit ocon-2010

Installing the code

• svn co https://labs.omniti.com/reconnoiter/tags/wangle

• cd wangle

• autoconf

• ./configure

• make

• make install

Page 7: Noit ocon-2010

Installing the database

• createdb reconnoiter

• createlang -d reconnoiter plpgsql

• createuser reconnoiter

• createuser stratcon

• createuser prism

• psql -U postgres reconnoiter

  • BEGIN;

  • \i sql/reconnoiter_ddl_dump.sql

  • COMMIT;

• install the crontab in sql/crontab

Page 8: Noit ocon-2010

Installing the web console

• The web UI lives in trunk/web/ui

• Set up Apache (left as an exercise for the reader)

Page 9: Noit ocon-2010

SSL is underneath everything

• Set up SSL certs:

  • need a CA (even if a dummy CA)

  • need a cert (signed) for each noitd server

  • need a client cert (signed) for the stratcond server

  • need a web cert (signed) for the stratcond server (future**)

• More details by running make test and looking into:

  • test-noit.conf

  • test-stratcon.conf
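One way to satisfy the certificate list above with stock OpenSSL is sketched below. It is an illustration, not the project's own procedure: the output directory, the `issue_cert` helper, the common names, the key size and the 365-day lifetime are all assumptions. How noitd and stratcond are pointed at the resulting files is not covered here; the slide refers to test-noit.conf and test-stratcon.conf for that.

```shell
#!/bin/sh
# Sketch only: a throwaway CA plus two signed certs. All names are examples.
set -eu

dir=reconnoiter-demo-certs
mkdir -p "$dir"

# 1. Dummy CA: self-signed certificate and unencrypted key.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=Example Dummy CA" \
    -keyout "$dir/ca.key" -out "$dir/ca.crt"

# 2. For each certificate: create a key and CSR, then sign the CSR with the CA.
issue_cert() {
    name=$1   # file prefix, e.g. "noit"
    cn=$2     # certificate common name
    openssl req -newkey rsa:2048 -nodes -subj "/CN=$cn" \
        -keyout "$dir/$name.key" -out "$dir/$name.csr"
    openssl x509 -req -in "$dir/$name.csr" -days 365 \
        -CA "$dir/ca.crt" -CAkey "$dir/ca.key" -CAcreateserial \
        -out "$dir/$name.crt"
}

issue_cert noit "noit-1.example.com"        # server cert for one noitd
issue_cert stratcon "stratcon.example.com"  # client cert for stratcond

# 3. Confirm both certs chain back to the dummy CA.
openssl verify -CAfile "$dir/ca.crt" "$dir/noit.crt" "$dir/stratcon.crt"
```

A dummy CA like this is only suitable for testing; the keys are written unencrypted so the daemons can start unattended.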

Page 10: Noit ocon-2010

Configure noitd

• noit.conf

  • pretty good as a starting point, just clear out all the checks

  • module loading happens at boot time, so make sure all the modules you want are loaded.

  • checks can be added at run time via the text console.

• noitd must be started as root; it drops privileges after module initialization.

• /usr/local/sbin/noitd -c /usr/local/etc/noit.conf

Page 11: Noit ocon-2010

Configure stratcond

• stratcon.conf

  • pretty good as a starting point, just clear out all the noit addresses

  • each noit in the field requires a line in the stratcon.conf file:

    • <noit address="10.225.209.25" port="43191" />

  • database passwords must be configured

  • hostname and document_domain must be configured

• /usr/local/sbin/stratcond -c /usr/local/etc/stratcon.conf
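With more than a handful of noitd instances, the per-noit lines can be generated rather than typed. The sketch below is a hypothetical helper, not part of Reconnoiter: it prints one `<noit ... />` line per address in the format shown on the slide. Only the first address and the port come from the slide; the other two addresses are invented for the example, and where the lines belong inside stratcon.conf should be taken from the sample file.

```shell
#!/bin/sh
# Hypothetical helper: emit one <noit/> line per collector address.
set -eu

noit_port=43191   # port from the slide's example line
# First address is from the slide; the other two are made-up examples.
noit_addresses="10.225.209.25 10.225.209.26 10.225.209.27"

noit_lines=""
for addr in $noit_addresses; do
    noit_lines="${noit_lines}<noit address=\"${addr}\" port=\"${noit_port}\" />
"
done

# Print the lines so they can be pasted into stratcon.conf.
printf '%s' "$noit_lines"
```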

Page 12: Noit ocon-2010

Thank You

• Documentation at https://labs.omniti.com/trac/reconnoiter

• Commercial support available from OmniTI.

• Please join in.

• OmniTI is hiring: http://omniti.com/is/hiring

• Thanks!
