tb3248 - syslog connector performance tuning - girish mantry moehadi liang(1)

Upload: dru5h

Post on 02-Jun-2018


  • 8/10/2019 TB3248 - Syslog Connector Performance Tuning - Girish Mantry Moehadi Liang(1)


    Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

    Syslog Connector Performance Tuning

    Girish Mantry, Moehadi Liang

    Technical Solutions Consultants


    Agenda

    In this session we will take a look at:

    Syslog connector variants

    Connector components and operation

    Stages in the event flow

    Performance bottlenecks and tuning at each stage

    Out of memory problems and tuning

    Customer cases

    General recommendations


    Syslog connector variants, components, operation and event flow


    Syslog Connector Variants

    Network Listeners (supported on all platforms; configurable interfaces and ports)

      Syslog Daemon: UDP, Raw TCP; default port 514

      Syslog NG Daemon: UDP, Raw TCP, TLS; default port 1999

      ArcSight CEF Encrypted Syslog (UDP): UDP with symmetric key encryption; default port 514; only CEF format

    File Readers

      Syslog Pipe: Unix pipe

      Supported only ...

      Work in conjunction with ...


    Syslog Connector Components

    [Diagram: Device Type 1 ... Device Type N -> Queue (raw events) -> Subagents (parsed events) -> Main Flow (components C1, C2) -> Destination Flows (components C1, C2; processed events) -> ESM Transport, Logger Transport]

    Note: Queuing only applies to network listeners and not for file readers


    Event Flow

    Event Reception

      Receives network packets on UDP/TCP sockets

      Extracts human-readable raw syslog events from network packets

    Event Queuing

      Raw events are written to a queue of files on the file system in the order in which they are received

    Event Parsing

      Raw events are picked up from the file queue in a FIFO manner and parsed using regular expressions

      Information from device log formats is normalized into the ArcSight event format

    Event Processing

      Normalized events are categorized and processed in many ways useful for correlation and asset modeling

      Events are batched, filtered or aggregated as required for efficiency
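    The four stages can be sketched in a few lines of Python. This is only an illustration of the flow described on this slide, not the connector's code: the in-memory deque stands in for the on-disk file queue, the single regex and the event field names are hypothetical, and unparsed messages are simply skipped for brevity.

```python
import re
from collections import deque

# Hypothetical parser regex for illustration; real parsers are device specific.
SSH_FAIL = re.compile(r"Failed password for (?P<user>\S+) from (?P<src>\S+)")


def run_pipeline(raw_events, batch_size=100):
    """Queue raw events, parse them in FIFO order, and group them into batches."""
    queue = deque()
    for raw in raw_events:           # reception + queuing, in arrival order
        queue.append(raw)

    batches, batch = [], []
    while queue:
        raw = queue.popleft()        # FIFO pickup from the queue
        match = SSH_FAIL.search(raw)  # parsing with a regular expression
        if match is None:
            continue                 # sketch only: skip what we cannot parse
        event = {"name": "ssh login failed", **match.groupdict()}  # normalization
        batch.append(event)
        if len(batch) == batch_size:  # processing: batching
            batches.append(batch)
            batch = []
    if batch:
        batches.append(batch)
    return batches
```

    Calling run_pipeline(lines, batch_size=2) on a list of raw syslog lines returns lists of at most two normalized events each, in the order the lines were received.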


    Performance Bottlenecks in the Event Flow and Tuning


    Event Reception

    Choice of Transport Protocol

    UDP performs better on reliable networks

    Use Raw TCP on unreliable networks

    Use TLS for encrypted transport with Syslog NG

    Bottleneck (when dealing with Raw TCP or TLS)

    Java applications do not know when a client closes the connection with a FIN

    Connections remain idle in a CLOSE_WAIT state until closed explicitly by the application

    Idle connections can grow over a period of time and can exceed the connector limit or OS limit

    Happens faster with large number of devices or with devices that create new connections frequently

    Tuning

    Parameter: tcppeerclosedchecktimeout
    Default: -1
    Recommendation: Set it to 30000 msec or higher to tell the connector to check for connections closed by peers and close them on the connector side as well

    Parameter: tcpmaxsockets
    Default: 1000
    Recommendation: Increase it as required to accommodate simultaneous connections from a large number of devices
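    The CLOSE_WAIT problem arises when the receiving application never notices that the peer has sent a FIN. The Python sketch below shows one generic way to notice it: a non-blocking peek that returns zero bytes means the peer has closed. It is not the connector's implementation, the helper names are hypothetical, and socket.socketpair() is used as a stand-in for real TCP connections.

```python
import socket


def peer_has_closed(sock):
    """Return True if the peer has closed its end and no data is pending.

    Note: this switches the socket to non-blocking mode.
    """
    sock.setblocking(False)
    try:
        data = sock.recv(1, socket.MSG_PEEK)  # peek so no payload is consumed
    except BlockingIOError:
        return False      # nothing to read and no FIN: connection is just idle
    except ConnectionResetError:
        return True       # peer aborted the connection
    return data == b""    # zero bytes means orderly shutdown by the peer


def reap_closed(connections):
    """Close sockets whose peer has gone away; return the ones still open."""
    still_open = []
    for conn in connections:
        if peer_has_closed(conn):
            conn.close()  # releases the descriptor instead of leaving it idle
        else:
            still_open.append(conn)
    return still_open
```

    Running reap_closed periodically over the list of accepted sockets is the same idea as checking for peer-closed connections on a timeout: the longer the interval, the more idle sockets can accumulate between checks.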


    Event Parsing

    Inspection and Device Type Detection

    Multiple subagents with one subagent per device type with a parser that has a regex to match something unique in the log

    Subagent parsers are ordered such that specific regexes come ahead of generic ones to detect device types accurately

    Connector inspects messages from senders, applying the regexes in order to detect the device type, and associates the subagent when a match is found. A single sender could be associated with multiple device types and subagents

    Associated subagent parsers are used to parse messages from a sender, and the inspection process is not reapplied unless a message of a different type is encountered from the same sender

    Syslog senders and their associated subagent types can be seen in current/user/agent/syslog.properties

    Bottleneck

    Inspection process involving regex matching could be expensive because connector has more than 100 subagents

    Tuning

    If you are sure of the device types in your environment, you can restrict the subagent list using the following properties

    Parameter: usecustomsubagentlist
    Default: false
    Recommendation: Set it to true to make the connector consider the customized subagent list

    Parameter: customsubagentlist
    Default: List of subagents (>100)
    Recommendation: Set it to the restricted subagent list based on the device types in your environment. Keep the original relative order of subagents so as not to affect the accuracy of subagent detection
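    To make the cost of inspection concrete, here is a small Python sketch of ordered detection with a per-sender association. It only mimics the behavior described on this slide: the three subagent names and their regexes are hypothetical, and the real connector's matching logic may differ. The regex_tests counter shows how many regex evaluations each approach needs.

```python
import re

# Hypothetical subagent list in its original order; names and patterns are illustrative.
SUBAGENTS = [
    ("cisco_asa", re.compile(r"%ASA-\d-\d+")),
    ("sshd", re.compile(r"\bsshd\[\d+\]")),
    ("sudo", re.compile(r"\bsudo:")),
]


def restrict(subagents, wanted):
    """Keep only the wanted subagents, preserving their original relative order."""
    return [(name, pattern) for name, pattern in subagents if name in wanted]


class Detector:
    def __init__(self, subagents):
        self.subagents = list(subagents)
        self.patterns = dict(self.subagents)
        self.by_sender = {}     # sender -> subagent names already associated
        self.regex_tests = 0    # number of regex evaluations performed so far

    def _test(self, name, message):
        self.regex_tests += 1
        return self.patterns[name].search(message) is not None

    def detect(self, sender, message):
        known = self.by_sender.setdefault(sender, [])
        for name in known:                    # fast path: associated subagents
            if self._test(name, message):
                return name
        for name, _ in self.subagents:        # inspection: walk the ordered list
            if name not in known and self._test(name, message):
                known.append(name)
                return name
        return None
```

    With the full list, the first sudo message from a sender costs three regex tests; with restrict(SUBAGENTS, {"sudo"}) it costs one. With more than 100 subagents the gap is correspondingly larger.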


    Event Parsing - continued

    Regular expressions in parsers

    Bottleneck

    A badly written regular expression in the parser can be a big performance hit on the connector

    Optimization

    For supported device types, development has already optimized the regular expressions in the respective parsers. If you are writing syslog flex connector parsers, consider the following guidelines.

    Make your regexes only as generic as needed. Specific regular expressions perform better than generic ones

    Use generic greedy expressions like .* and .+ at the end and not in the beginning or middle of a regular expression. Replace them with non-greedy equivalents like .*? and .+? with a clear character or token marking the boundary.

    Use of greedy expressions with more specific characters or meta characters is okay, e.g. \s+ for a continuous string of whitespace, \d+ for a continuous string of numerals or \w+ for a continuous string of alphanumerics

    Maximum number of devices

    Bottleneck

    Connector allows up to a max of 5000 devices and does not process events from newer devices once this limit is reached

    Tuning

    Parameter Default Recommendation

    syslog.max.device.count 5000 Increase it as required to match the number of devices in your environment
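    The regular-expression guidelines on this slide can be illustrated with a pair of patterns that extract the same fields. The sketch uses Python's re module rather than the connector's regex engine, and the log line and field names are made up for the example; relative timings will vary with the engine and the input, so none are claimed here.

```python
import re
import timeit

# Made-up log line for illustration.
LINE = "Jan 12 10:15:01 fw01 action=deny src=10.1.1.7 dst=192.0.2.9 msg=blocked by policy 12"

# Discouraged style: generic greedy .* at the beginning and in the middle,
# which forces the engine to backtrack to find each boundary.
LOOSE = re.compile(r".*src=(.*) dst=(.*) msg=(.*)")

# Preferred style: specific classes (\S+, \s+) with clear token boundaries,
# and a generic greedy .* only at the very end.
TIGHT = re.compile(r"src=(\S+)\s+dst=(\S+)\s+msg=(.*)")


def extract(pattern, line):
    """Return the captured fields, or None when the line does not match."""
    match = pattern.search(line)
    return match.groups() if match else None


def seconds_for(pattern, line, repeat=10000):
    """Time repeated searches so the two styles can be compared locally."""
    return timeit.timeit(lambda: pattern.search(line), number=repeat)
```

    Both patterns return the same three fields for LINE; seconds_for(LOOSE, LINE) and seconds_for(TIGHT, LINE) can be compared on representative traffic, including lines that do not match, where backtracking tends to cost the most.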


    Event Processing

    Agent Batching

    Batch size controls how many events go together from component to component in the event flow and eventually to the destination

    Doubling or tripling default size of 100 could help improve the performance internally as well as over networks with latency

    Do not increase beyond that because it could have a negative impact by increasing memory requirements to hold the batches

    Categorization

    Categorization files for different device types are loaded into memory and some of those can be big

    Connector base memory usage can be high when dealing with a large number of device types

    Java heap space may need to be bumped up

    External Map File Processing

    External map file query is executed for every batch of events

    Make sure the query is simple and returns fast, if you are using this feature

    Connector Filtering

    Make sure that the filter condition is optimized and not extremely complex


    Event Processing - continued

    Field Based Aggregation

    Groups events with the same values in specified fields into buckets and produces aggregated events on time interval expiry or on reaching the event threshold

    Restrict the field set to the minimum required and choose an optimal event threshold value to keep the number of event field comparisons low

    Choose an optimal time interval not to block the event flow for too long

    Avoid using preserve common fields setting in a high event volume environment
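    A rough Python sketch of field based aggregation, assuming the behavior described above: events are bucketed by the values of the chosen fields, and a bucket is emitted when its event threshold is reached or its time interval expires. The class, its parameters and the aggregatedEventCount field name are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
class Aggregator:
    """Bucket events by selected fields; emit on threshold or interval expiry."""

    def __init__(self, fields, threshold, interval):
        self.fields = tuple(fields)   # fields that must match to share a bucket
        self.threshold = threshold    # emit once a bucket holds this many events
        self.interval = interval      # emit once a bucket is this old (seconds)
        self.buckets = {}             # key -> {"start", "count", "fields"}

    def add(self, event, now):
        """Add one event at time `now`; return any aggregated events produced."""
        out = self.flush_expired(now)
        key = tuple(event.get(f) for f in self.fields)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = {"start": now, "count": 0, "fields": dict(zip(self.fields, key))}
            self.buckets[key] = bucket
        bucket["count"] += 1
        if bucket["count"] >= self.threshold:
            out.append(self._emit(key))
        return out

    def flush_expired(self, now):
        """Emit every bucket whose time interval has expired."""
        expired = [k for k, b in self.buckets.items() if now - b["start"] >= self.interval]
        return [self._emit(k) for k in expired]

    def _emit(self, key):
        bucket = self.buckets.pop(key)
        return {**bucket["fields"], "aggregatedEventCount": bucket["count"]}
```

    Every extra field in the key adds a comparison per event and tends to create more, smaller buckets, which is why a minimal field set is recommended. A long interval keeps low-volume buckets (and their events) waiting longer before anything is sent on.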

    Name Resolution

    Name resolutions are done in background threads and the event flow is not normally blocked for the answers to come back

    If the Wait For Name Resolution feature is enabled, then the event flow is blocked for a certain timeout period for the answers to come back

    Do not enable Wait For Name Resolution feature in an environment requiring frequent resolutions


    Event Transport

    Event caching can occur for a number of reasons; network latency and problems in the destination are the common ones

    Bottleneck

    Excessive caching can cause delays in events reaching their destination

    When cache becomes full, connector starts dropping events

    Tuning

    Enable transport multithreading (except when the root cause is a problem in the destination)

    For the logger smart message transport, turn on the https persistent connection feature

    Increase the cache size to hold events for longer in the cache and prevent loss of events

    Parameter: http.transport.threadcount
    Default: 1
    Recommendation: Applies only to the ESM transport. Increase it by small increments as required

    Parameter: transport.loggersecure.threads
    Default: 1
    Recommendation: Applies only to the logger secure transport. Increase it by small increments as required

    Parameter: transport.loggersecure.connection.persistent
    Default: false
    Recommendation: Applies only to the logger secure transport. Change it to true to reuse HTTPS connections instead of tearing them down for every batch of events

    Parameter: Cache Size
    Default: 1GB
    Recommendation: Increase it as required up to a limit of 50GB. This is a destination setting configured using ESM console, connector appliance GUI or local connector ...


    Out of Memory Problems and Tuning


    Java Process Memory and Management

    Memory allocated to a java process consists of Heap Space and Native Memory

    Heap space is allocated as instructed by java run time parameters

    -Xms (initial heap size), -Xmx (maximum heap size), 256 MB by default on connectors

    Native memory size = Process memory size - Size of heap space

    Garbage Collection reclaims the memory of unused objects

    Minor collections (GC) reclaim memory in the YOUNG generation and move survivors into OLD

    Major collections (Full GC) reclaim memory in all of the heap space and take much longer

    JVM stops the application threads during GC or Full GC

    Frequent Full GCs affect application performance severely

    A clear indicator of the need to increase the maximum heap size

    Out of memory errors can happen in any of these memory areas

    Memory limitations in 32 bit connector build

    Total addressable space is 4GB, Kernel space ranges from 1GB to 2GB depending on OS

    User space available for process is 2GB to 3GB depending on OS

    Limits exist on max heap space: 1GB (connector appliances), 1.5 GB (Windows), 2 GB (Unix)

    Use 64 bit connector build for higher memory
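    Since native memory is whatever remains of the process memory after the heap is taken out, a larger heap on a 32 bit build leaves less room for thread stacks, socket buffers and JNI allocations. The arithmetic can be sketched as below; the function name is hypothetical, and pairing a 2 GB user space with a 1.5 GB heap (or 3 GB with 2 GB) is an assumption made for illustration, not a statement about a specific OS.

```python
def native_headroom_mb(user_space_mb, max_heap_mb):
    """Native memory left in the process once the maximum heap is reserved."""
    if max_heap_mb >= user_space_mb:
        raise ValueError("heap does not fit in the process address space")
    return user_space_mb - max_heap_mb


# Assumed pairings, for illustration only.
print(native_headroom_mb(2048, 1536))  # 512
print(native_headroom_mb(3072, 2048))  # 1024
```

    This is why raising -Xmx on a 32 bit build can trade a heap space error for a native memory error.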

    [Diagram: Process Memory. Labels: YOUNG Generation (newly created objects); OLD Generation (old objects surviving minor collections); PERMANENT Generation (classes, methods); Code Generation; Socket Buffers; Thread Stacks; Direct Memory Space; JNI Code; Garbage Collection; JNI Allocated Memory]


    Dealing with Java out-of-memory errors

    Error: java.lang.OutOfMemoryError: Java heap space, or java.lang.OutOfMemoryError: Requested array size exceeds VM limit

    Root cause and recommendation:

      Garbage collection is unable to free up more space and memory could not be allocated for new objects

      Increase the maximum heap size using the -Xmx option in increments as required up to the limit

      If this still does not help, there could be a potential memory leak or a bug; open a support incident with logs and heap dumps

    Error: java.lang.OutOfMemoryError: PermGen space

    Root cause and recommendation:

      Permanent generation area has become full due to loading many classes statically or creating them dynamically, or creating too many interned strings

      Default max size of PermGen space is 64 MB. Increase it in small increments using -XX:MaxPermSize

    Error: java.lang.OutOfMemoryError: Unable to create a new native thread

    Root cause and recommendation:

      JVM is low on native memory and unable to create a new VM thread. Make more native memory available by:

      Reducing the heap space using the -Xms and -Xmx options

      Reducing the stack space using the -Xss option

    Error: Out of Memory Error (allocation.cpp:211), pid=16950, tid=1855142800

    Root cause and recommendation:

      Displayed in the fatal error logs when the JVM crashes due to a malloc failure. The system could be out of RAM or swap space, or the process size limit was hit on a 32 bit system. Take one or more of the following actions:

      Reduce memory load on the system or increase physical memory or swap space

      Decrease the number of application threads, reduce the java heap space and stack space


    Adjusting memory options

    On a software connector, add or edit settings in a file under current/user/agent folder

    agent.wrapper.conf when running as a service

    setmem.sh (Unix) or setmem.bat (Windows) when running as a standalone application. This file may have to be created if it does not exist

    set ARCSIGHT_MEMORY_OPTIONS="-Xms256m -Xmx256m" (Example only. Add or remove options as required inside the double quotes)

    export ARCSIGHT_MEMORY_OPTIONS (only on Unix)

    On a connector appliance

    Only heap space can be changed using a container command Configure Memory Settings

    Other settings can be changed using SSH or diagnostic tools file editor using the same mechanisms as for a software connector

    Memory Type: Heap Space
    Running as a service: wrapper.java.initmemory=256 (initial heap size), wrapper.java.maxmemory=256 (maximum heap size)
    Running as a standalone application: -Xms256m -Xmx1024m
    Note: It is recommended to increase only the max heap size

    Memory Type: Perm Gen Space
    Running as a service: Add additional java parameters with adjusted indexes: wrapper.java.additional.7=-XX:PermSize=64m, wrapper.java.additional.8=-XX:MaxPermSize=128m
    Running as a standalone application: -XX:PermSize=64m -XX:MaxPermSize=128m
    Note: It is recommended to increase only the max perm size

    Memory Type: Stack space
    Running as a service: Add an additional java parameter with adjusted index: wrapper.java.additional.9=-Xss64k
    Running as a standalone application: -Xss64k
    Note: Default stack size is OS dependent. Adjust and observe. Too low a value can cause StackOverflowError


    Customer Cases


    Customer Case 1: Troubleshooting

    Changing the transport protocol to UDP or Raw TCP did not help

    Could not reproduce the problem in house

    Customer captured tcpdump packets and analyzed them using Wireshark

    Large number of TCP Window Full messages

    SEQ/ACK analysis showed that at times there is more than 10KB of data in flight, indicating that the receiver is too slow to process the incoming flood of packets

    TCP receive buffer and window sizes got reduced over time which contributed to the slow reception

    Further enquiries revealed that the Syslog NG connector is receiving TLS data from 2 other sources

    With this new discovery of customer environment, problem could also be reproduced in house

    Observed high memory usage and increased the heap space to 1024 MB, but it did not help

    Root Cause

    Destination Syslog NG connector did not close TCP connections when sources closed connections

    The growing number of TCP connections forced the receive buffer size to be reduced, causing slower reception

    Solution

    Set the tcppeerclosedchecktimeout parameter to 30000 msec (half a minute)

    This parameter tells the connector to proactively check and close any TCP sockets

    Problem statement: CEF Syslog TLS destination ... 200 eps, while ESM and ... cache for the same event ...

    [Diagram: Source Connector, Syslog NG Source 1 and Syslog NG Source 2 sending to the destination Syslog NG connector]


    Customer Case 2

    Huge difference of ...

    Fortigate Firewall ...

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:35:29 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:37:08 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:38:52 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:40:30 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:42:06 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:43:47 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:45:32 | [Full GC

    agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:47:10 | [Full GC

    [Charts: Queue Rate (SLC) vs Events/Sec (SLC); Queue Drop Count; Memory Usage (Total vs Used); Events/Sec (SLC) vs Throughput (SLC); Cache Size and Current Drop Count; Frequent Full GCs]

    Observations

    Incoming event rate was much higher than the processing rate and the connector was queuing heavily

    During peak hours, queuing exceeded the size limit and a huge number of events were dropped

    Caching was observed during peak hours and some events were dropped when the cache size limit was exceeded

    High memory usage and frequent Full GCs were observed, affecting the performance of the connector
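    How frequent the Full GCs were can be read straight off the wrapper log lines shown on this slide. The Python sketch below parses those eight lines (copied as they appear here, including the file-name prefix) and computes the gap between consecutive Full GCs; the function names are hypothetical.

```python
from datetime import datetime

# Log lines as shown on the slide (truncated after "[Full GC").
LOG_LINES = [
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:35:29 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:37:08 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:38:52 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:40:30 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:42:06 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:43:47 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:45:32 | [Full GC",
    "agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:47:10 | [Full GC",
]


def full_gc_times(lines):
    """Return the timestamps of all lines that report a Full GC."""
    times = []
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 4 and parts[3].startswith("[Full GC"):
            times.append(datetime.strptime(parts[2], "%Y/%m/%d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Seconds elapsed between each pair of consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


gaps = intervals_seconds(full_gc_times(LOG_LINES))
print(gaps)                   # [99.0, 104.0, 98.0, 96.0, 101.0, 105.0, 98.0]
print(sum(gaps) / len(gaps))  # about 100 seconds between Full GCs
```

    A Full GC roughly every 100 seconds, each one pausing the application threads, is consistent with the observation that the heap was too small for this event volume.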


    Customer Case 2 Solution

    Machines hosting the connectors were very powerful (64 bit Linux, 48 core CPU, 128 GB RAM, 600 GB hard disk)

    Actions Taken

    Increased the java heap size to 2048 MB to reduce the frequency of full GCs

    Enabled syslog parser multi-threading to keep up with the queuing rate

    Increased the file queue size from 100 to 2000 files of 10 MB each, equivalent to 20 GB in total size, to prevent dropping of events from the file queue

    Increased the cache size from 1GB to 10GB to prevent dropping of events from cache during peak hours

    The above measures improved the performance of the connector significantly

    Where they did not solve the problem completely, we asked the customer to split the event volume among multiple syslog connectors


    Customer Case 3

    Customer had loggers ... events to an ESM ... loggers were experiencing ... caching

    [Diagram label: Onboard Connector]

    Observations

    Time per Batch = round trip time taken for a batch of events to travel from logger to ESM and for the acknowledgment for the batch to come back from ESM to logger

    The US logger took an average of 40 msec/batch and the UK logger took an average of over 500 msec/batch

    This large difference in the round trip time is indicative of network latency due to geographical distance and is the root cause of caching in the UK logger

    Solution

    Enabled multithreading on the ESM transport with a thread count of 2; this showed an improvement in throughput

    Increased the thread count to 7 (number of processors in the CPU) and caching went away completely
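    A back-of-the-envelope model shows why the round trip time caps throughput and why extra transport threads help. It assumes that each thread sends one batch and waits for its acknowledgment before sending the next, that threads scale linearly (an optimistic upper bound), and that the batch size is the default of 100 mentioned earlier; the case description does not state the batch size actually in use, and the function name is hypothetical.

```python
def max_throughput_eps(batch_size, time_per_batch_ms, threads=1):
    """Upper bound on events per second when each thread waits for every ack."""
    return threads * batch_size * 1000.0 / time_per_batch_ms


print(max_throughput_eps(100, 40))              # 2500.0  (US logger, 1 thread)
print(max_throughput_eps(100, 500))             # 200.0   (UK logger, 1 thread)
print(max_throughput_eps(100, 500, threads=7))  # 1400.0  (UK logger, 7 threads)
```

    Under these assumptions a single thread on the 500 msec link cannot move more than about 200 events per second, so any sustained input above that rate ends up in the cache until more batches are kept in flight.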

    [Charts: USA Logger, Time per Batch ~ 40 msec; UK Logger, Time per Batch > 500 msec]


    Recommendations


    Some Recommendations

    Evaluate the number of devices, device types and event volume early in your deployment cycle

    Split the load among multiple connectors when the incoming event rate exceeds the achievable throughput, which varies based on the underlying platform and environment

    When splitting the load, consider grouping the devices of the same type on one connector and those of another type on a different connector

    Evaluate the total capacity of your machine and the other processes running on it to determine the number of connectors to install on a single machine

    Cumulative heap size allocated to connectors and other java processes should be well below the memory available on the system

    Use 64 bit syslog connector builds to overcome the out of memory errors


    Session TB3248 Speakers Girish Mantry, Moehadi Liang
