Log analysis in the accelerator sector
Steen Jensen, BE-CO-DO
Openlab workshop on data analytics, CERN, November 16th 2012
Outline
- Introduction to the controls system
- Our motivation for log analysis
- Requirements
- The current setup
- Use cases
- Conclusions
CERN’s accelerator complex
CERN’s Control Centre (CCC)
Control system infrastructure
[Diagram. Labels: operator consoles, fixed displays, other operational consoles; file servers, application servers, PVSS servers; real-time Linux and LynxOS computers, WorldFIP systems, PLCs; GPN, Technet, Oracle, timing. Equipment: cryo, vacuum, magnet control, RF, quench protection, beam instrumentation, kickers, machine protection, …]
Hardware
- 425 consoles
- 300 servers
- 1300 front ends
- 600 module types
- ~85,000 device instances
Software
- Java (operational code)
  - 400 GUIs, 200 servers
  - ~8 MLOC, > 1000 jars
  - Up to 1000 processes
  - 80 people, 10 groups
- C/C++
  - > 1300 front-end servers
  - Real-time processes
  - Drivers (Linux/LynxOS)
  - 80 people, 8 groups
Software environment
- Operating systems: Linux, LynxOS, Windows
- Languages: Java, C++, C, scripts
- Developers: staff, fellows and students, as primary or secondary activity, from CO, OP and equipment groups
- Legacy: 10+ years
Connections in middleware layer
Motivation
- Further improve quality and availability of the controls system
- New responsibility model => non-experts must be able to quickly identify whom to call
- Increase diagnostic efficiency
- Widen and deepen diagnostic capabilities
- Proactive and preventive maintenance and evolution
- We already have many log files that can be exploited better
Log data characteristics
- Continuous stream of text messages
- ~2 Gb per day, bursts > 10 Gb per day
- Multiple transport mechanisms: syslog, log4j, JMS
- Scattered, system-specific log files
- Diversity in format and length
- Ambiguity in log level interpretation
- Variations in frequency
- Differences in relevance
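One way to reduce the ambiguity in log level interpretation is to map every source onto a single common scale before analysis. The following is a minimal sketch of that idea, not the setup described in the talk: the `Level` scale, the two mapping tables and `normalize_level` are all hypothetical, and the choice of which syslog severities count as "critical" or "info" is an assumption.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical common severity scale used after normalization."""
    DEBUG = 10
    INFO = 20
    WARNING = 30
    ERROR = 40
    CRITICAL = 50


# syslog numeric severities (0 = emergency ... 7 = debug).
# Grouping 0-2 as CRITICAL and 5-6 as INFO is an assumption.
SYSLOG_TO_LEVEL = {
    0: Level.CRITICAL, 1: Level.CRITICAL, 2: Level.CRITICAL,
    3: Level.ERROR, 4: Level.WARNING,
    5: Level.INFO, 6: Level.INFO, 7: Level.DEBUG,
}

# log4j level names; folding TRACE into DEBUG is an assumption.
LOG4J_TO_LEVEL = {
    "TRACE": Level.DEBUG, "DEBUG": Level.DEBUG, "INFO": Level.INFO,
    "WARN": Level.WARNING, "ERROR": Level.ERROR, "FATAL": Level.CRITICAL,
}


def normalize_level(source: str, raw) -> Level:
    """Translate a source-specific level into the common scale."""
    if source == "syslog":
        return SYSLOG_TO_LEVEL[int(raw)]
    if source == "log4j":
        return LOG4J_TO_LEVEL[str(raw).upper()]
    raise ValueError(f"unknown log source: {source!r}")
```

With such a mapping, a query like "everything at ERROR or above" means the same thing regardless of which transport delivered the message.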
Use cases
- Allow people to diagnose systems other than their own
- Facilitate correlation of messages across systems
- Detect what has changed if something does not work
- Observe trends over time
- Easily obtain customized views of one or more systems
- Receive notifications based on custom criteria
Non-functional requirements
- Easy to use and access for specialists and non-specialists
- Overviews and graphical visualisations
- Efficient data mining
- Allow for knowledge capture, sharing and re-use
grep/awk/sed do not provide this.
Having looked at Splunk, we consider it unrealistic to develop our own system.
Log data pre-processing
- Centralization
- Filtering of irrelevant messages
- Throttling and burst protection
Volume is down from 2+ Gb/day to ~100 Mb/day. This does not count Java logs, so it is likely to increase considerably.
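The filtering and throttling steps can be sketched as a small pipeline stage. This is an illustration under stated assumptions, not the pre-processing used at CERN: the record shape `(timestamp, source, message)`, the `BurstThrottle` class, the `is_relevant` rule and the "heartbeat" pattern are all hypothetical.

```python
from collections import defaultdict, deque


class BurstThrottle:
    """Allow at most `limit` messages per source within `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._recent = defaultdict(deque)  # source -> timestamps let through
        self.dropped = defaultdict(int)    # source -> number of messages dropped

    def allow(self, source: str, timestamp: float) -> bool:
        recent = self._recent[source]
        # Forget timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            self.dropped[source] += 1
            return False
        recent.append(timestamp)
        return True


def is_relevant(message: str, ignored_substrings=("heartbeat",)) -> bool:
    """Hypothetical relevance rule: drop messages containing ignored text."""
    return not any(text in message for text in ignored_substrings)


def preprocess(records, throttle):
    """Yield the (timestamp, source, message) records that survive both steps."""
    for timestamp, source, message in records:
        if is_relevant(message) and throttle.allow(source, timestamp):
            yield timestamp, source, message
```

Keeping a per-source drop counter means a burst still leaves a trace: the number of suppressed messages can be reported even though the messages themselves are not forwarded.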
Splunk setup
Central instance receiving pre-filtered messages via syslog and JMS
- Automated alerts
- Dashboards and saved searches
- Manual monitoring and follow-up
Splunk dashboard
RBAC denials on SET
Detailed case: Leap second
A leap second (23:59:60 UTC on June 30th) was added globally after 01:59:59 local time on July 1st 2012. Using Splunk, a system administrator discovered that:
- Some computers failed to add the leap second on time
- Certain types of computers added the leap second at a later moment
- There seems to be no pattern over time
- No computer added a leap second more than once
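The kind of grouping behind these findings can be sketched as follows. This is a hypothetical reconstruction, not the Splunk search that was actually used: the input format (pairs of host name and the time at which that host logged its leap-second insertion), the `classify_hosts` function and the two-second tolerance are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Last regular second before the 2012 leap second (23:59:60 UTC on June 30th).
# Python's datetime cannot represent second 60, so this boundary is used instead.
LEAP_BOUNDARY = datetime(2012, 6, 30, 23, 59, 59, tzinfo=timezone.utc)


def classify_hosts(all_hosts, insert_events, tolerance=timedelta(seconds=2)):
    """Map each host to 'on time', 'late' or 'missing'.

    insert_events: iterable of (host, datetime) pairs, one per logged
    leap-second insertion (hypothetical input format).
    """
    first_seen = {}
    for host, when in insert_events:
        if host not in first_seen or when < first_seen[host]:
            first_seen[host] = when

    result = {}
    for host in all_hosts:
        when = first_seen.get(host)
        if when is None:
            result[host] = "missing"
        elif when - LEAP_BOUNDARY <= tolerance:
            result[host] = "on time"
        else:
            result[host] = "late"
    return result
```

Counting events per host in the same input would cover the last finding, that no computer added a leap second more than once.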
Use cases
- RBAC security
  - Tokens missing, malformed or expired
- CMW client issues
  - Slow consumer, zombies, incorrect accesses, error handling
- Timing system
  - Telegram layout issues
- Performance testing of new system
- CO frameworks
  - Improvement: token check prior to access
  - Bug: applying wrong token in certain cases
  - Bug: loop on timing error
Use cases, continued
- Design considerations
  - Architectural decisions based on usage info
- Driver error reporting, missing module
- System issues
  - yum updates misbehaving due to race condition
  - Separation between operational and test environments
Conclusions
- Log analysis is very valuable – we learned a lot using Splunk
  - Knowledge about control system usage
  - Proactive detection, trimming, improving
  - Reactive diagnostics
- It is a continuous discover-learn-improve process
- It involves a mix of
  - automated alerts
  - manual monitoring
  - ad hoc data mining
- Experience allows improving log data quality
  - Culture: developer usage
  - Structure: key-value pairs
- Splunk is a very promising tool in CO’s context, and projects show strong interest
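The point about structuring log data as key-value pairs can be illustrated with a small formatter and parser. This is a minimal sketch, assuming a `key="value"` convention with backslash escaping. The helper names `format_kv` and `parse_kv` and the example field names are hypothetical and not taken from the controls system.

```python
import re


def format_kv(**fields) -> str:
    """Render fields as space-separated key="value" pairs."""
    parts = []
    for key, value in fields.items():
        # Escape backslashes first, then double quotes.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{text}"')
    return " ".join(parts)


# A key, an equals sign, then a quoted value that may contain escaped characters.
_KV_PATTERN = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_kv(line: str) -> dict:
    """Recover the fields from a line produced by format_kv."""
    return {
        key: re.sub(r"\\(.)", r"\1", value)
        for key, value in _KV_PATTERN.findall(line)
    }
```

A message written this way, for example `format_kv(system="timing", level="ERROR", msg="telegram layout mismatch")`, can be searched and aggregated by field without a system-specific parsing rule for each log file.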