Towards Structured Log Analysis
DESCRIPTION
The value of software log file analysis has grown steadily along with the value of information to organizations. Log management tools still have a lot to deliver before they give their customers the full strength of log information. In addition to traditional uses such as testing functional conformance, troubleshooting and performance benchmarking, log analysis has proven its capabilities in fields like intrusion detection and compliance evaluation. This is reflected in the emphasis placed on log analysis in regulations like PCI DSS, FISMA and HIPAA and in frameworks such as ISO 27001 and COBIT.
In this paper we present an in-depth analysis of current log analysis domains and common problems. A practical guide to the use of a few popular log analysis tools is also included. Lack of proper support for structured analysis is identified as one major flaw in existing tools. We then describe a framework we developed for structured log analysis, with a view to providing a solution to open problems in the domain. The core strengths of the framework are its ability to handle many log file formats that are not well served by existing tools and its sophisticated infrastructure for automating recurring log analysis procedures. We demonstrate the usefulness of the framework with a simple experiment.
TRANSCRIPT
Towards Structured Log Analysis
Coding Log Analysis Intelligence
JCSSE 2012
Dileepa Jayathilake, 99X Technology
Log Analysis in Use
Troubleshooting
Functional Conformance
Monitoring System Health
Statistical Insight
Log Analysis Domains
Web server logs
Network logs
Security logs
System logs
Application logs
Access patterns, traffic patterns, client platforms, user agents, client technologies, attacks, HTTP errors, etc.
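As a rough illustration of the kind of insight web server logs offer, the sketch below counts HTTP error statuses and user agents. It assumes the Combined Log Format commonly written by Apache and nginx; the sample lines and the `summarize` helper are hypothetical and not part of the presented framework.

```python
import re
from collections import Counter

# Combined Log Format is an assumption; other servers may log differently.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def summarize(lines):
    """Count HTTP error statuses and user agents; skip lines that do not match."""
    errors, agents = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a recognizable access-log line
        status = int(m.group("status"))
        if status >= 400:
            errors[status] += 1
        agents[m.group("agent") or "-"] += 1
    return errors, agents

# Hypothetical sample data for illustration only.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2012:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [10/Oct/2012:13:55:40 +0000] "GET /missing HTTP/1.1" 404 512 "-" "curl/7.26.0"',
    'garbage line',
]
errors, agents = summarize(SAMPLE)
```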
Automation will save a lot of cost
Manually dealing with vast amounts of log information is difficult
Manual analysis requires familiarity with the log format
Even with expertise, manual log analysis is laborious
Manual analysis hinders reuse of recurring analysis patterns
Log Analysis Automation Challenges
Lack of a standard
• “Universal Format for Logger Messages” - Expired without a successor
• “Syslog” - Serves only a limited range of system logs
Log file corruptions
• Erased parts of a log file, intermixed log entries, entries in the wrong order, and garbage in the middle of log files
Inappropriate log content
• Stems from developers misjudging the importance of log entries
Varying log semantics
• The format and the content logged can continue to evolve
Huge sizes of log files
• Log files can easily grow to gigabytes in a commercial environment
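Two of these challenges, corruption and file size, can be sketched together. The example below assumes a hypothetical line format that starts with a `YYYY-MM-DD HH:MM:SS` timestamp. It reads lazily, so a file object can be passed in without loading a gigabyte-sized log into memory, and it flags garbage lines and out-of-order entries instead of failing on them. The `scan` function and its issue labels are illustrative names only.

```python
from datetime import datetime

STAMP_WIDTH = 19  # len("2012-05-30 10:00:00"), an assumed timestamp layout

def scan(lines):
    """Lazily yield (line_number, timestamp, text, issue) for each line."""
    latest = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        try:
            when = datetime.strptime(line[:STAMP_WIDTH], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            # No parsable timestamp: treat the line as garbage and move on.
            yield number, None, line, "garbage"
            continue
        issue = None
        if latest is not None and when < latest:
            issue = "out_of_order"
        else:
            latest = when
        yield number, when, line[STAMP_WIDTH + 1:], issue

# Hypothetical sample; in practice pass an open file object instead.
SAMPLE = [
    "2012-05-30 10:00:00 service started\n",
    "2012-05-30 10:00:05 request handled\n",
    "@@@@ partial wri\n",
    "2012-05-30 10:00:02 late entry\n",
]
issues = [(n, issue) for n, _, _, issue in scan(SAMPLE) if issue]
```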
Existing Log Management Tools
Splunk
LogRhythm
ArcSight Logger
Loggly
LogLogic
AWStats
SecureVue
Rich log management capabilities
Unstructured search on log content
Assume line logs
Do not use structure semantics
Supported functionalities
Identifying common constructs
Log indexing
Handling different log sources
Dealing with different log types
Rich user interfaces
Alerts
Intrusion detection
Compliance validation
Automate recurring analysis procedures
Structured Log Analysis
Why Structured Log Analysis?
Many log files manifest a structure
Analysis needs contextual correctness
Automation requires a structure-aware tool
Example
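A minimal illustration of why structure matters, assuming a hypothetical format in which each entry starts with a bracketed level and may continue over indented lines. A line-oriented search for "timeout" would return only the continuation line; grouping lines into entries first keeps the context (which entry, at which level) attached to the match. The format and the `group_entries` helper are assumptions made for this sketch.

```python
import re

# Hypothetical entry header: "[LEVEL] text"; anything else continues the entry.
ENTRY_START = re.compile(r"^\[(?P<level>[A-Z]+)\] (?P<text>.*)$")

def group_entries(lines):
    """Group physical lines into logical entries with their continuation lines."""
    entries = []
    for line in lines:
        m = ENTRY_START.match(line)
        if m:
            entries.append({"level": m.group("level"), "lines": [m.group("text")]})
        elif entries:
            entries[-1]["lines"].append(line.strip())
        # Continuation lines that precede any entry header are dropped.
    return entries

SAMPLE = [
    "[INFO] job 17 started",
    "[ERROR] job 17 failed",
    "    at worker.run(worker.py:42)",
    "    caused by: timeout",
    "[INFO] job 18 started",
]
entries = group_entries(SAMPLE)
# Search within whole entries, so the match carries its level and first line.
timeouts = [e for e in entries if any("timeout" in text for text in e["lines"])]
```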
Structured Log Analysis Framework
Conclusions
Existing tools solve a subset of automated log analysis requirements, but ignore the importance of structure
New declarative language is capable of expressing any log file format and is resilient to corruptions
The scripting language provides solid infrastructure for rule based automation
Data management scheme offers flexibility
Current UI generation method is not appropriate
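The rule-based automation mentioned in the conclusions can be pictured roughly as follows. The framework's own scripting language is not shown in this transcript, so Python stands in for it here; the `Rule` type, the rule names and the entry fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over one parsed log entry

# Hypothetical rules encoding a recurring analysis procedure.
RULES = [
    Rule("error_entry", lambda e: e["level"] == "ERROR"),
    Rule("slow_request", lambda e: e.get("duration_ms", 0) > 1000),
]

def apply_rules(entries, rules):
    """Return, for each rule name, the entries that rule matched."""
    findings = {rule.name: [] for rule in rules}
    for entry in entries:
        for rule in rules:
            if rule.matches(entry):
                findings[rule.name].append(entry)
    return findings

# Entries as a structure-aware parser might produce them (made-up data).
ENTRIES = [
    {"level": "INFO", "message": "GET /", "duration_ms": 35},
    {"level": "INFO", "message": "GET /report", "duration_ms": 2400},
    {"level": "ERROR", "message": "database unreachable"},
]
findings = apply_rules(ENTRIES, RULES)
```

Keeping the rules as data separate from the engine is what lets the same analysis be rerun on new log files without rewriting it.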
Future Work
Add more log management capabilities
Real time analysis
Built-in format declarations for common log formats
Optimize data management module to handle heterogeneous data efficiently
UI generation based on HTML5