SNYPR 6.3.1 On-Prem
Architecture Guide
Date Published: 7/9/2020
Table of Contents

Introduction
SNYPR Architecture
  Hadoop Components
  High Availability
Data Ingestion
  Phase 1: Collect and Publish
  Phase 2: Enrichment
  Phase 3: Processing
  Indexing Incoming Events
Deployment Alternatives
  Dedicated SNYPR Deployment
  SNYPR Deployment with Existing Hadoop Infrastructure
Deployment Assumptions
  SNYPR Kafka Topic Partitioning Reference
  SNYPR Search Shard Allocation Reference
  SNYPR YARN Resource Allocation Reference
Search Deployment Options
  Search Index Storage Estimates
Disaster Recovery Alternatives
  Alternatives
  Considerations
Network Bandwidth
  Network Bandwidth Characteristics by Tier
  Network Bandwidth Requirements from RIN Collection Tier to Messaging Tier
Virtual Infrastructure
  Considerations for Virtual Deployments
SNYPR Cloud Deployment
  Deployment Architecture
  Considerations
    Amazon EC2
    Microsoft Azure
    Google Cloud
SNYPR Reference Hardware
Reference Server Specifications
  Hardware Specifications
  Server Mount Point
  Alternatives for Limiting the Size of the Infrastructure
Sizing and Capacity Planning
  Server Types
  Assumptions
  Deployment Option 1: Dedicated SNYPR Data Lake
  Deployment Option 2: Existing Data Lake
Spark Jobs Configuration for Kerberized Kafka
Network Tuning Recommendations
RIN Syslog Configuration
Hadoop Cluster Tuning Recommendations
  Hadoop Cluster Performance
  Hadoop Cluster Log Configuration
Remote Ingestion Node Tuning
  Reference Server Used Is a VM
  Server Preparation
  Recommended Tools for Network Statistics
  Tune Server Network Parameters
  Performance Scenarios
  Best Practices
  Common Errors
  References
Introduction
SNYPR is a big data security analytics platform built on Hadoop that utilizes Securonix machine-learning-based anomaly detection techniques and threat models to detect sophisticated cyber and insider attacks. SNYPR uses Hadoop both as its distributed security analytics engine and long-term data retention engine. Hadoop nodes can be added as needed, allowing the solution to scale horizontally to support hundreds of thousands of events per second (EPS).
SNYPR features:
- Supports a rich variety of security data, including security event logs, user identity data, access privileges, threat intelligence, asset metadata, and netflow data.
- Normalizes, indexes, and correlates security event logs, network flows, and application transactions.
- Utilizes machine learning-based anomaly detection techniques, including behavior profiling, peer group analytics, pattern analysis, and event rarity, to detect advanced threats.
- Provides out-of-the-box threat and risk models for detection and prioritization of insider threat, cyber threat, and fraud.
- Risk-ranks entities involved in threats to enable an entity-centric (user or device) approach to mitigating threats.
- Provides Spotter, a blazing-fast search feature with normalized search syntax that enables investigators to investigate today's threats and track advanced persistent threats over long periods of time, with all data available at all times.
Documentation Conventions
There are different font styles used throughout the SNYPR documentation to indicate specific information. The table below describes the common formatting conventions used in the documentation:
Bold font
  Words in bold can indicate the following:
  - Buttons that you need to click
  - Fields in the user interface (UI)
  - Menu options in the UI
  - Information you need to type or select

  Indicates commands or code.

Menu navigation
  The navigation path to reach a specific screen in the UI is separated by a greater than symbol (>). For example, Menu > Administration.

UPPERCASE FONT
  All uppercase words are acronyms.

Folders and folder paths
  Quotation marks are used around a folder name or folder path. For example, "C:\Documents\UserGuide".
The following documents are available for SNYPR, along with the intended audience for each:

Installation Guide
  System administrators, system integrators, and deployment teams who need to install the application.

RIN Installation Guide
  On-boarding team and deployment engineers who need to install the RIN to connect to the SNYPR application to ingest data.

Data Integration Guide
  Data integrators who need to import activity and enrichment datasources to support existing and custom use cases.

Content Guide
  - Data integrators and deployment engineers who need to use existing connectors to import data and deploy available content.
  - Content developers who need to use the out-of-the-box content to detect the threats to your organization.

Analytics Guide
  Content developers who need to use the existing content and custom analytics available in the SNYPR platform to develop use cases to detect the threats to your organization.

Security Analyst Guide
  - Information security professionals and security analysts who need to detect and manage threats.
  - Risk and compliance officers and IT specialists who need to use SNYPR reporting capabilities to monitor and remediate compliance.

Access Analytic Guide
  - Information security professionals and security analysts who need to detect and remediate high-risk access due to orphaned accounts, privilege creep, or account compromise.
  - Compliance officers and data owners who need to review and remediate access for privilege creep, SOD violations, and orphaned accounts.

Administrator Guide
  - System administrators and service providers who need information about how to monitor and administer the platform at a systems level.
  - Business managers and other users in a supervisory role who need information about how to use SNYPR to grant employees and partners access to applications, check for policy violations, and manage cases.
SNYPR Architecture
Hadoop Components
SNYPR uses a Hadoop cluster for processing all data. The core Hadoop components include the following services:
- HDFS (Hadoop Distributed File System): Used to store security events and violations. Data is stored in compressed Parquet format.
- YARN (Yet Another Resource Negotiator): Provides resource management capabilities for jobs.
- Spark Streaming: Processing framework for live streaming data.
- HBase: Distributed NoSQL data store on HDFS used to store the results of the analytics.
- Kafka: Horizontally scalable message bus used to manage the delivery of incoming security events.
- Impala (CDH) or Hive (HDP): Provides a SQL interface to the data stored in HDFS.
- ZooKeeper: Cluster management software that maintains configurations and synchronization services across nodes within a cluster.
Note: The Hadoop cluster is configured for high availability based on Hadoop deployment best practices.
High Availability
The SNYPR solution includes high availability for all components of the infrastructure. The Hadoop cluster is configured for high availability based on Hadoop deployment best practices. This includes, at a minimum, high availability of the HDFS NameNodes and YARN ResourceManagers, a minimum of three ZooKeeper servers, and a minimum of three Kafka brokers. High availability for the SNYPR servers that leverage the Hadoop cluster is described below.

SNYPR Application Server
High availability of the SNYPR Console is provided with an HA configuration of two nodes, with the user interface active on one of the two nodes during normal operation. MySQL replication and a Redis cluster are configured, as well as backup of the file system where the configuration data is stored (referred to as SECURONIX_HOME). A load balancer is configured for access to the user interface.
SNYPR-EYE Server
High availability of the SNYPR-EYE Server is provided with an HA configuration of two nodes, with the user interface active on one of the two nodes during normal operation. MySQL replication, as well as backup of the file system where the configuration data is stored (referred to as SNYPR-EYE_HOME), is configured on these servers for high availability, and a load balancer is configured for access to the user interface.
SNYPR Search Server
High availability of the SNYPR Search Servers is configured for each SNYPR Search cell in the deployment. A SNYPR Search cell includes a Local Event Indexer (LEI) as well as multiple search instances. A search cell with high availability includes at least two SNYPR Search servers. The LEI process runs on the primary server, indexing the incoming event data from the Enriched topic on Kafka. A search server provides a replica of all indexed data on another server. During a failover, the LEI is started on the second search server to enable active indexing on that server.
SNYPR Remote Ingestion Nodes
At least two SNYPR Remote Ingestion Nodes (RINs) are recommended for high availability in each location where they are deployed. RINs are typically installed in each major data center in close proximity to the logs being collected. The data collected by the RINs and forwarded to the Kafka brokers is sent in compressed batches that reduce the network transfer by roughly 90%. The RINs also encrypt the payload and support SSL and mutual authentication as well as Kerberos authentication.
The RINs collect data through two different methods: the push method and the pull method. The push method uses the embedded syslog server to collect and forward data to the Kafka topics. The pull method uses the Securonix connectors installed on the RIN to connect to the APIs, gather the logs, and forward them to the Kafka topic. High availability is provided on the Kafka brokers by having three separate Kafka brokers and replication of the topics for availability.
A sticky load balancer is recommended for incoming syslog traffic to the Remote Ingestion Nodes.
Hadoop Cluster Guidance for High Availability
The Hadoop infrastructure services are configured for high availability. The recommended settings are as follows:
- At least three Kafka brokers with in-sync replicas (ISR) = 3
- HDFS replication factor = 3
- Kafka message retention = 2 days
- HA NameNode
- HA ResourceManager
- Minimum of three ZooKeeper servers
- If security is required:
  - Kerberos authentication of all services in the Hadoop cluster
  - Encryption of HDFS folders with HDFS encryption for sensitive resource data
  - Authorization to protect access to data in the Hadoop cluster, using the native tools (Ranger for Hortonworks, Sentry for Cloudera)
  - The SNYPR edge nodes for ingestion and the Console user interface interact with the Hadoop services and support Kerberos

Note: This is not a complete list. It is recommended that you follow Hadoop deployment best practices.
In addition to the storage required for the data, the compute and memory required for running the SNYPR jobs must be available in the Hadoop cluster. The SNYPR solution includes several jobs that run in the cluster, and YARN is used to schedule the resources. The primary SNYPR jobs and their resource allocations are listed in the SNYPR YARN Resource Allocation Reference section of this guide.

The specific infrastructure required is based on the required peak ingestion rate. Request specific deployment guidance from Securonix.
Data Ingestion
SNYPR includes a data ingestion pipeline that performs normalization, context enrichment, and correlation.
All event data in SNYPR is stored in a super-enriched format. The Open Event Format (OEF) is a self-describing format capable of supporting information from heterogeneous data sources, while also adding enrichment data sets like user identity data, threat intelligence feeds, asset information, and others. This format enables events to be contextually enriched at ingestion time, ensuring that historical changes to the enrichment data are captured with the event at the time it occurred. The original source event is always maintained in the OEF event. (See https://openeventformat.org for details.) The three phases of the SNYPR event ingestion pipeline are described below.
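As an illustration of the idea, here is a minimal sketch of a self-describing enriched event. The field names and values are hypothetical, not the actual OEF schema; the point is that the raw event and the enrichment captured at ingestion time travel together.

```python
import json

# Hypothetical enriched event; field names are illustrative only.
raw_event = "Oct 10 13:55:36 fw01 DENY TCP 10.1.2.3:4433 -> 203.0.113.9:443"

enriched_event = {
    "rawevent": raw_event,               # original source event, always preserved
    "resourcename": "fw01",              # tagged at ingestion
    "eventtime": "2020-10-10T13:55:36Z",
    # Enrichment captured at ingestion time, so later changes to identity
    # or threat-intel data do not rewrite history:
    "user": {"employeeid": "E1001", "department": "Finance"},
    "geolocation": {"dst_country": "US"},
    "threatintel": {"dst_ip_listed": False},
}

print(json.dumps(enriched_event, indent=2))
```

Because the enrichment is stored with the event, a later search for the user's department at the time of the event does not depend on the current state of the HR feed.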
Phase 1: Collect and Publish
In this phase, events are collected and a SNYPR publisher on the Remote Ingestion Node (RIN) forwards the messages to the Kafka raw topic. There are multiple types of SNYPR publishers, including the ingestion node that uses the SNYPR Connector Library and the syslog publisher that forwards messages directly to the Kafka raw topic. The SNYPR publishers forward all events to the raw topic in the SNYPR transport format. This transport format adds metadata to the source events to describe the event source and tag the events for processing in the enrichment job. The SNYPR publishers also support batching, compression, and encryption of the published events, which minimizes the bandwidth required for transmission to the centralized Kafka brokers.
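The bandwidth effect of batching and compressing similar log lines can be sketched with standard zlib compression. This is an illustration of the principle only, not the actual SNYPR transport format: syslog lines in a batch share most of their structure, so the batch compresses very well.

```python
import zlib

# A batch of similar syslog lines (illustrative data).
batch = "\n".join(
    f"Oct 10 13:55:{i:02d} fw01 ACCEPT TCP 10.1.2.{i}:4433 -> 203.0.113.9:443"
    for i in range(60)
).encode()

compressed = zlib.compress(batch, level=6)
print(f"raw: {len(batch)} bytes, compressed: {len(compressed)} bytes")
print(f"reduction: {1 - len(compressed) / len(batch):.0%}")
```

The exact ratio depends on the log source; highly repetitive machine data is what makes reductions on the order the document describes plausible.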
Phase 2: Enrichment
The SNYPR Enrichment Spark Streaming job is responsible for event filtering, normalization, and context enrichment of the raw logs. During context enrichment, context is added to the incoming log data, including enrichment from user HR sources, geolocation information, threat intelligence data, and other lookup data like internal network maps and asset data. Additionally, the raw event log message is stored in its original format as one of the columns in the normalized schema.
Single Pipeline
During data ingestion at the enrichment phase, either a single ingestion pipeline or Pipeline Orchestration can be configured. In a single pipeline configuration, all data sent from the Remote Ingestion Nodes is processed by one enrichment job. This is used for small deployments with similar data sources.
Pipeline Orchestration
SNYPR includes a Pipeline Orchestration feature that allows the enrichment process to intelligently distribute the enrichment phase across multiple enrichment pipelines. In addition to parallel ingestion at the enrichment phase, this feature ensures that resources that take longer to process each event are split off from the main enrichment pipeline into an alternate pipeline. Those resources can then be analyzed and optimized outside the main enrichment pipeline without affecting the performance of the other resources being ingested.
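The routing idea can be sketched as follows: resources whose per-event processing cost exceeds a threshold are assigned to an alternate pipeline so they cannot slow down the main one. The resource names, costs, and threshold below are hypothetical, not actual SNYPR configuration values.

```python
# Hypothetical per-resource average processing cost in milliseconds per event.
resource_cost_ms = {
    "windows-security": 0.4,
    "proxy": 0.6,
    "custom-xml-app": 9.5,   # expensive parsing/enrichment
    "dns": 0.3,
}

SLOW_THRESHOLD_MS = 5.0  # illustrative cutoff

def assign_pipelines(costs, threshold=SLOW_THRESHOLD_MS):
    """Split resources between the main and an alternate enrichment pipeline."""
    main = [r for r, ms in costs.items() if ms <= threshold]
    alternate = [r for r, ms in costs.items() if ms > threshold]
    return {"main": main, "alternate": alternate}

print(assign_pipelines(resource_cost_ms))
# custom-xml-app lands in the alternate pipeline; the rest stay on main.
```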
Phase 3: Processing
The third phase of the event ingestion pipeline is a parallel phase in which multiple Spark Streaming jobs subscribe to the enriched topic and perform indexing, store enriched events in HDFS, and analyze the events for threats.

The ingested data is stored for long-term retention in HDFS as Parquet files and made accessible as Hive database tables that are partitioned by resource, year, and day.
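Partitioning by resource, year, and day means each event lands under a predictable directory, so queries filtering on those columns scan only the matching partitions. The sketch below uses a hypothetical Hive-style path layout, not the actual SNYPR directory scheme.

```python
from datetime import date

def partition_path(resource: str, event_date: date, base="/data/snypr/enriched"):
    """Hypothetical Hive-style partition path for an enriched event."""
    return (f"{base}/resource={resource}"
            f"/year={event_date.year}/day={event_date.timetuple().tm_yday}")

print(partition_path("fw01", date(2020, 7, 9)))
# A query such as
#   SELECT ... WHERE resource = 'fw01' AND year = 2020 AND day = 191
# reads only this partition instead of the whole table.
```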
The solution also indexes the data and stores it in SNYPR Search Solr collections. The solution creates additional index collections as the data size passes a configurable threshold, and maintains a control index for executing parallel queries across the entire set of collections. The index files are maintained on local storage on the dedicated SNYPR Search servers. This configuration provides parallel query execution across all the collections for deterministic response times for interactive use by the SNYPR user interface.
The log compliance data is stored in a read-only format that cannot be modified. SNYPR supports strong authentication, authorization, and encryption of the Hadoop infrastructure. SNYPR also provides application-layer encryption and masking that can be enabled selectively.
SNYPR uses edge nodes for the user interface and for the SNYPR Search nodes. All processing and long-term storage of data is done within the Hadoop cluster. SNYPR provides a feature called Spotter as an integral part of the solution. This feature provides online searching and visualization of event data for the configured index retention period.
The SNYPR Remote Ingestion Node includes the connectors that are used to ingest the log data. The connectors leverage the specific log source APIs or files to access the log data. The incoming log messages are associated with a Job ID and a Resource ID before they are submitted to Kafka so that they can be processed by the Spark Streaming enrichment job. The connectors also perform offset management of the log data source to ensure that all log messages are obtained and, in some cases, pre-processing of the source data. An example of pre-processing is the IronPort syslog connector, which converts multi-line messages into a single line for publishing to Kafka.
Indexing Incoming Events
SNYPR includes dedicated SNYPR Search servers. These servers are edge nodes in the Hadoop cluster that consume the enriched messages from the Kafka topic and perform local indexing on the search servers. The search indexes are designed to optimize search performance by parallelizing searches across multiple sub-indexes, or SNYPR Search collections. Each collection is further distributed across a configured number of shards to ensure distribution of the workload. Each Solr server in the cluster is allocated CPU and memory to allow the SNYPR Search server to perform optimally.

The indexed events are ingested in real time by the solution. The SNYPR indexing job is a distributed Spark Streaming job that runs within the Hadoop cluster. The compute and memory resources used for indexing are reserved capacity to ensure that events are ingested at the rate at which they arrive. This allows the indexing of ingested events to be parallelized across the cluster to meet the deployment requirements of the solution.
An index control core collection is used to track the number of collections that the solution is hosting. The solution maintains a maximum-documents-per-collection threshold and dynamically creates additional collections as more events are imported into the environment. The solution also provides the ability to deduplicate redundant event data from the indexes during ingestion.
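The rollover behavior can be sketched as follows: a control record tracks the active collection's document count, and once it would pass the configured threshold, a new collection is created and made active. The threshold value and collection names here are illustrative, not SNYPR defaults.

```python
MAX_DOCS_PER_COLLECTION = 100_000_000  # illustrative threshold

class CollectionManager:
    """Sketch of threshold-based index collection rollover."""
    def __init__(self, max_docs=MAX_DOCS_PER_COLLECTION):
        self.max_docs = max_docs
        self.collections = {"activity_1": 0}  # control index: name -> doc count
        self.active = "activity_1"

    def index(self, n_docs):
        if self.collections[self.active] + n_docs > self.max_docs:
            # Roll over: create the next collection and make it active.
            next_name = f"activity_{len(self.collections) + 1}"
            self.collections[next_name] = 0
            self.active = next_name
        self.collections[self.active] += n_docs

mgr = CollectionManager(max_docs=1000)
for _ in range(25):
    mgr.index(100)              # 2,500 docs at 1,000 docs per collection
print(sorted(mgr.collections))  # three collections after rollover
```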
Searching
The Spotter search interface allows users to search across all events. Interactive and deterministic response times for searches are obtained by executing parallel searches across the collections. This approach ensures that the size of each index is optimized and that the infrastructure can grow to support larger indexes without impacting the user experience. The search results are returned incrementally to the user interface and displayed as they arrive to ensure the responsiveness of the Spotter interface.
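The fan-out pattern behind this can be sketched with a thread pool: the same query is issued to every collection in parallel and partial results are surfaced as each collection responds. The query function here is a stand-in, not the Spotter or Solr API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def search_collection(collection, query):
    """Stand-in for a per-collection Solr query."""
    # Pretend each collection returns a list of matching event IDs.
    return [f"{collection}:{query}:{i}" for i in range(2)]

def spotter_style_search(collections, query):
    """Fan the query out across collections; yield results as they arrive."""
    with ThreadPoolExecutor(max_workers=len(collections)) as pool:
        futures = {pool.submit(search_collection, c, query): c for c in collections}
        for future in as_completed(futures):
            yield from future.result()   # incremental results to the UI

hits = list(spotter_style_search(["activity_1", "activity_2", "activity_3"], "DENY"))
print(len(hits))  # 6 partial results merged from 3 collections
```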
Deployment Alternatives
The SNYPR solution utilizes services in a Hadoop cluster. SNYPR provides the following deployment options:
- SNYPR UEBA: SNYPR User and Entity Behavior Analytics (UEBA) provides security analytics on security events. Events are stored only during the processing of the analytics.
- SNYPR Security Analytics Data Lake: This solution provides security analytics on security events. Events are stored for historical purposes, and a high-performance threat hunting solution is provided for searching and visualization of events.
Dedicated SNYPR Deployment
The Securonix SNYPR solution shown in the diagram above illustrates the services that are used within SNYPR. In this deployment diagram, SNYPR is deployed with a dedicated Security Analytics Data Lake. In this configuration, the Master nodes include the SNYPR Console and the Cloudera Manager service, as well as other services like the HDFS NameNode, the YARN ResourceManager, ZooKeeper, and other services used by the Hadoop cluster.
Based on the size of the deployment (events per second (EPS), analytics processed, retention period) and the features being supported (UEBA, Security Analytics Platform, Data Lake), the SNYPR architecture scales to meet the deployment requirements. For a small UEBA deployment, a limited number of servers is deployed and a dedicated SNYPR Search server is used for index storage. The deployment includes between 3 and 6 Hadoop servers along with a dedicated SNYPR Search server. The SNYPR application and the Redis service are collocated with the Hadoop master services.
For a medium UEBA deployment, full high availability of all services is configured for the servers that are deployed, and two dedicated SNYPR Search Servers are used for index storage. The deployment includes between 6 and 10 Hadoop servers along with two dedicated SNYPR Search servers and two dedicated SNYPR Application Servers.
SNYPR Deployment with Dedicated Security Analytics Data Lake – Medium – UEBA
For a large Security Analytics Data Lake deployment, full high availability of all services is configured for all servers that are deployed, and at least two dedicated SNYPR Search Servers are used for index storage. The deployment includes between 6 and 10 Hadoop servers along with three dedicated Kafka brokers and two dedicated SNYPR Search servers.
SNYPR Deployment with Dedicated Security Analytics Data Lake – Large – Security Analytics Data Lake
SNYPR Deployment with Existing Hadoop Infrastructure
The SNYPR solution shown in the following diagram (Figure 5) illustrates the services for SNYPR that are added to an existing Hadoop cluster. The SNYPR Application, SNYPR Search, and SNYPR-EYE nodes are shown on the top, and the existing Hadoop cluster is shown in the box on the bottom. For the supported Hadoop distributions, please see the SNYPR Installation Guide.
Logical SNYPR Architecture – Existing Hadoop Cluster
Deployment Assumptions
Deploying a SNYPR environment requires many considerations for each of the components of the solution.
For a standard deployment architecture, the following is recommended:
- Fast network access for the Hadoop cluster and edge nodes: 10 Gigabit Ethernet with jumbo frames configured on all switches and network interfaces (MTU=9000)
- All services running in a single data center
- A balanced SNYPR cluster with similar nodes (CPU, memory, storage, network)
- Securonix SNYPR using standard Securonix connectors for data ingestion; the exact sources of event data are deployment specific
- The log event data made available to the SNYPR environment (Ingestion Nodes), or direct connector access to log sources, based on the connector used
- Recommended storage bandwidth: 1,000 IOPS per Hadoop and SNYPR Search server
- Purging online event data after the retention period to minimize required storage, unless there is a business need for long-term historical searching; violation and behavior data is not purged
- Java 8 used by the cluster
For Hadoop tuning, see the Hadoop Cluster Tuning Recommendations section in this guide.
SNYPR Kafka Topic Partitioning Reference

Kafka Topics (10,000 - 20,000 EPS)

Topic                  Partitions  Replication
tenantid-Raw           75          2
tenantid-Enriched      75          2
tenantid-Ops           1           2
tenantid-Tiertwo       75          2
tenantid-Control       1           2
tenantid-IndexerCount  1           2
tenantid-Violations    75          2
tenantid-User          1           2
tenantid-Count         1           2
tenantid-Preview       1           2
SNYPR Search Shard Allocation Reference

Solr Collections (10,000 - 20,000 EPS, 9 servers)

Collection                      Shards  Replication
tenantid-activity               12      2
tenantid-violation              12      2
tenantid-whitelist              1       2
tenantid-entitymetadata         1       2
tenantid-tpi                    1       2
tenantid-eeocontrolcore         1       2
tenantid-lookup                 1       2
tenantid-ipmapping              1       2
tenantid-watchlist              1       2
tenantid-dailyviolationsummary  1       2
tenantid-users                  1       2
tenantid-riskscorecard          1       2
tenantid-entityrelation         1       2
tenantid-access                 1       2
SNYPR YARN Resource Allocation Reference
The SNYPR Spark applications are configured based on the ingestion rate that must be supported. The table below is an example of the resource allocation for a deployment that supports 20,000 events per second with a typical workload. There are many variables affecting a deployment and the specific sizing recommended; contact Securonix for specific information.
Spark Streaming YARN Resources (10,000 - 20,000 EPS)

                      Driver           Executors
Job                   vCPU  Mem (GB)   Number  vCPU  Mem (GB)
Event Enrichment      6     2          80      1     3
Event Ingestion       6     2          20      1     2
Behavior Analytics    1     2          10      1     4
Policy Engine IEE     1     2          40      1     2
Policy Engine AEE     1     2          10      1     3
Risk Generation       1     2          10      2     2
Traffic Analyzer      1     2          10      1     4
Behavior Profile      1     2          6       1     2
Robotic Behavior      1     2          10      1     3
Event Archiver        1     2          10      1     1
Phishing              1     2          1       1     4

YARN resources subtotal: drivers 21 vCPU, 22 GB; executors 217 vCPU, 546 GB (executor totals are the number of executors multiplied by the per-executor vCPU and memory).
Total YARN resources: 238 vCPU, 568 GB.
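The totals in the table follow directly from the per-job allocations: total executor vCPU and memory are the number of executors multiplied by the per-executor figures, added to the driver resources. A quick check:

```python
# (driver_vcpu, driver_gb, num_executors, exec_vcpu, exec_gb) per job,
# copied from the table above.
jobs = {
    "Event Enrichment":   (6, 2, 80, 1, 3),
    "Event Ingestion":    (6, 2, 20, 1, 2),
    "Behavior Analytics": (1, 2, 10, 1, 4),
    "Policy Engine IEE":  (1, 2, 40, 1, 2),
    "Policy Engine AEE":  (1, 2, 10, 1, 3),
    "Risk Generation":    (1, 2, 10, 2, 2),
    "Traffic Analyzer":   (1, 2, 10, 1, 4),
    "Behavior Profile":   (1, 2, 6, 1, 2),
    "Robotic Behavior":   (1, 2, 10, 1, 3),
    "Event Archiver":     (1, 2, 10, 1, 1),
    "Phishing":           (1, 2, 1, 1, 4),
}

driver_vcpu = sum(j[0] for j in jobs.values())
driver_gb = sum(j[1] for j in jobs.values())
exec_vcpu = sum(j[2] * j[3] for j in jobs.values())
exec_gb = sum(j[2] * j[4] for j in jobs.values())

print(driver_vcpu, driver_gb, exec_vcpu, exec_gb)    # 21 22 217 546
print(driver_vcpu + exec_vcpu, driver_gb + exec_gb)  # 238 568
```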
Search Deployment Options
SNYPR Search is a high-performance indexing and search solution that stores all activity events in the environment that are accessed by the user interface.
SNYPR Search is deployed on an edge node in the Hadoop cluster. It requires access tothe SNYPR Console on the application server and the Kafka Brokers. These serversperform event indexing as well as storage of all violation data and related informationused by the SNYPR user interface.
Embedded
  Description: Limited search server for small UEBA deployments. Limited to one search cell.
  Indexing rate per Search Cell: 3k average EPS / 5k peak EPS
  Retention: 7 days

Dedicated
  Description: Dedicated search server for small UEBA or Security Analytics Data Lake deployments.
  Indexing rate per Search Cell: Multiple Search Cells are supported (configured for increased performance); each cell supports 10k average EPS / 15k peak EPS. Redundancy of search indexes with replication can be configured for high availability and faster search performance.
  Retention: 30 days or more
Search Embedded
An embedded deployment of SNYPR Search is collocated with the SNYPR Application and shares the resources on that server. The resources required for an embedded deployment of SNYPR Search are:
- 10 CPU
- 16 GB RAM
- 1 TB usable storage
An embedded SNYPR Search server is for small UEBA deployments and is limited to 3,000 EPS average and 5,000 EPS peak, with 7 days of retention. For deployment scenarios with greater requirements, SNYPR Search Dedicated servers are used.
SNYPR Search Embedded Mode
Search Dedicated
The SNYPR Search Dedicated deployment options are shown in the diagram below. A SNYPR Search Standard deployment uses a single dedicated server for indexing and searching. A SNYPR Search High Performance Cell includes separate servers for indexing and searching. In a high-performance cell, the indexes are replicated across servers for redundancy and to isolate the indexing workload from the search workload.
SNYPR Search Dedicated
Search Index Storage Estimates

Assumptions: average message size 600 bytes. Embedded retains 7 days with 1 replica; Premium 30 Days retains 30 days with 1 replica; Premium 30 Days with Replica retains 30 days with 2 replicas.

EPS     Events/Day     GB/Day  Embedded 7 Days (GB)  Premium 30 Days (GB)  Premium 30 Days w/ Replica (GB)
1,000   86,400,000     48      169                   724                   1,448
2,500   216,000,000    121     422                   1,810                 3,621
5,000   432,000,000    241     845                   3,621                 7,242
7,500   648,000,000    362     N/A                   5,431                 10,863
10,000  864,000,000    483     N/A                   7,242                 14,484
15,000  1,296,000,000  724     N/A                   10,863                21,726
20,000  1,728,000,000  966     N/A                   14,484                28,968
Assumptions: average message size 600 bytes. Premium 60 Days retains 60 days (1 replica, or 2 with replica); Premium 90 Days retains 90 days (1 replica, or 2 with replica).

EPS     Events/Day     GB/Day  60 Days (GB)  60 Days w/ Replica (GB)  90 Days (GB)  90 Days w/ Replica (GB)
1,000   86,400,000     48      1,448         2,897                    2,173         4,345
2,500   216,000,000    121     3,621         7,242                    5,431         10,863
5,000   432,000,000    241     7,242         14,484                   10,863        21,726
7,500   648,000,000    362     10,863        21,726                   16,294        32,589
10,000  864,000,000    483     14,484        28,968                   21,726        43,452
15,000  1,296,000,000  724     21,726        43,452                   32,589        65,178
20,000  1,728,000,000  966     28,968        57,936                   43,452        86,904
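The figures in these tables are consistent with a simple model: raw enriched data per day is EPS multiplied by the average message size and 86,400 seconds, and index storage per replica comes out to roughly half the raw daily volume. The 0.5 index-to-raw ratio is inferred from the tables themselves, not a published Securonix constant:

```python
def daily_gb(eps, avg_msg_bytes=600):
    """Raw enriched data generated per day, in GB (binary gigabytes)."""
    return eps * avg_msg_bytes * 86_400 / 2**30

def index_storage_gb(eps, days, replicas, index_ratio=0.5):
    """Estimated index storage; index_ratio is inferred from the tables."""
    return daily_gb(eps) * days * replicas * index_ratio

# 1,000 EPS examples matching the tables above:
print(round(daily_gb(1_000)))                 # 48 GB/day
print(round(index_storage_gb(1_000, 30, 1)))  # 724 GB
print(round(index_storage_gb(1_000, 90, 2)))  # 4345 GB
```

This makes it easy to interpolate for ingestion rates or retention periods not listed in the tables, though actual sizing should still come from Securonix.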
SNYPR Extra Large Deployments
The sizing guidelines in this document are references for deployment of SNYPR. The solution will support much larger deployments based on customer requirements.

For large deployments, the search servers are dedicated servers rather than being collocated on the Compute/Storage nodes. This allows the search indexers to scale as needed without impacting other services. This includes Solr and a dedicated ZooKeeper configuration to avoid contention.
There is no upper limit to the deployment size. The deployment architecture for extra-large deployments will be determined based on the specific deployment requirements. Contact Securonix for details.
The major variables that dictate the deployment recommendations include:
- Ingestion rate (events per second) of security event data
- Number of users interacting with the application interactively
- The data retention requirements for online data
- The data retention requirements for log data
- The disaster recovery strategy
Disaster Recovery Alternatives
SNYPR can be deployed to meet several disaster recovery objectives. Because of the size of the solution and the costs associated with disaster recovery, several DR alternative strategies are available. Since SNYPR can be deployed with an existing Hadoop environment, the disaster recovery strategy must align with the DR strategy for the Hadoop infrastructure being used for SNYPR. The alternatives in this document assume a dedicated Hadoop infrastructure for SNYPR and describe the disaster recovery considerations for the entire solution, including Hadoop. If an existing Hadoop environment is used, the same considerations are relevant, but the actual configuration of the Hadoop disaster recovery is assumed to be part of the existing Hadoop infrastructure.
Alternatives
The SNYPR disaster recovery alternatives include:

1. Advanced DR with full infrastructure: identical infrastructure with data replication from the primary site to the DR site, with the ability to continue processing in-flight messages from the Kafka brokers at the DR site.
2. Full DR with full infrastructure: identical infrastructure with select data replication from the primary site to the DR site, with the ability to rebuild search indexes after a DR event from the historical enriched event data, and the ability to process new activity events at the DR site.
3. Limited DR with limited infrastructure: limited infrastructure with violation, summary, and configuration data only, and the ability to process new activity events.
Considerations
The considerations for disaster recovery must be made for each service included in the solution. The primary considerations for each of the node types are described as follows:
- SNYPR Console Nodes: The SNYPR Console nodes include the SNYPR user interface and the SNYPR configuration database.
- SNYPR Search Servers: These are dedicated search nodes that include a local event indexer and multiple search instances for distributed searches. These servers are edge nodes in a Hadoop cluster that read data from Kafka and index the data to local storage on the search servers. The SNYPR Search servers include optimizations for maximum search performance and density on a physical server. Apache Solr is used for the underlying search server.
- SNYPR-EYE Server: This is a SNYPR monitoring and alerting server that is used for the configuration and operational health monitoring of all SNYPR services, including all the servers in the Hadoop cluster, the processes on the SNYPR Console, the SNYPR Spark Streaming applications running in the YARN cluster (including the data ingestion performance of all resources), and the performance and health of the SNYPR Search processes. The SNYPR-EYE solution installs and manages SNYPR-EYE agents on the servers in the environment for local monitoring.
- SNYPR Remote Ingestion Nodes: These include the ingestion servers with the connectors, the incoming activity log files, and the Kafka brokers with the in-flight messages.
- Hadoop Master: These nodes include the Hadoop administration services, such as Cloudera Manager and ZooKeeper, when Hadoop is deployed as part of the solution. The considerations for disaster recovery at this tier include file system replication with rsync or a backup and restore strategy, as well as MySQL database replication for the SNYPR configuration database and the Hive metastore.
- Compute / Storage Nodes: The SNYPR Compute / Storage nodes include HDFS and all the files stored by the system in HDFS for Hive / Impala table access, Solr indexes, and HBase tables. The considerations for disaster recovery at this tier include replication (using distcp) or backup and recovery of the HDFS data, HBase replication (using the WALs), and replication of the Solr collection schema data.
- Kafka Brokers: The considerations for disaster recovery at this tier include Kafka MirrorMaker for the in-flight Kafka messages.
The exact disaster recovery strategy implemented should be in alignment with the business continuity requirements for each deployment. The following table shows the alternatives for disaster recovery configuration and their impact on business continuity.
                             Advanced DR with      Full DR with          Limited DR with
                             Full Infrastructure   Full Infrastructure   Limited Infrastructure
DR Target                    1 day                 1 week                1 week (violations,
                                                                        behavior and data only)
Configuration Data           X                     X                     X
Violation Data               X                     X                     X
Case Management              X                     X                     X
Behavior Summaries           X                     X                     X
Historical Enriched Events   X                     X                     X
Search Indexes               X                     Rebuild search        X
                                                   indexes after DR
                                                   initiation
Kafka In-Flight Messages     X                     X                     X
Unprocessed Event Files      X                     X                     X
The availability of the data that SNYPR needs at the disaster site, as well as network failover and end-user access to the disaster recovery infrastructure, must also be considered. The typical services that are needed at the disaster site to continue processing are shown in the diagram below. This includes user and access data, as well as event logs that are ingested by the solution. For details, refer to Cloudera Backup and Disaster Recovery at: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_bdr_about.html
Network Bandwidth

A SNYPR deployment includes network transfer of several types of data into the solution. This includes user, access, TPI, event log, network map, and other types of data for a typical deployment. Due to the potential sensitivity of some of this data, a virtual private cloud may be required for each deployment. In addition to the security considerations, the infrastructure requires sufficient network bandwidth. The types of network traffic used by the solution are:
- End user access to the Securonix user interface
- Import of user, access, and TPI data into the master nodes
- Cluster communication and synchronization between the cluster nodes
- Import of event log data into the child nodes
The largest network traffic requirement is the transfer of event log data from the sources to the child nodes for import through the solution's connectors. The network traffic rate from the event log sources to the child nodes can be calculated by multiplying the events per second by the average message size.

For example, 5,000 events per second (EPS) to two ingestion nodes in the deployment, with an average message size of 500 bytes, will require 2.5 MB per second, or roughly 25 Mb per second, of bandwidth.
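The calculation above can be sketched in a few lines. Note that the ~25% protocol-overhead allowance used here is an assumption introduced to match the "roughly 25 Mb per second" figure; it is not a value stated by the guide.

```python
# Back-of-the-envelope bandwidth estimate for event log transfer.
def ingest_bandwidth_mbps(eps, avg_msg_bytes, overhead=1.25):
    """Required network bandwidth in megabits per second.

    overhead: assumed ~25% allowance for protocol framing and
    retransmission (an illustrative assumption, not a Securonix figure).
    """
    bytes_per_sec = eps * avg_msg_bytes          # 5,000 x 500 = 2.5 MB/s
    return bytes_per_sec * 8 * overhead / 1_000_000

print(ingest_bandwidth_mbps(5_000, 500))  # 25.0 (Mb/s)
```

Without the overhead allowance, the raw figure is exactly 20 Mb/s; the extra headroom is why the guide rounds up to 25 Mb/s.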
Network Bandwidth Characteristics by Tier
Admin Tier

Description: This tier is where end users log in to the user interface (traffic on port 443). This tier also includes all management services for the cluster and connects to the Compute / Storage / Search tier and Messaging tier for various services. Incoming connections include web services on port 443, MySQL configuration for Spark jobs, Redis, ZooKeeper, and other Hadoop cluster services. This tier hosts the management services that the agents on the Admin, Compute / Storage, and Messaging tiers communicate with.
Network requirements: 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers.

Compute / Storage / Search Tier

Description: Network traffic to these servers includes Spark, Impala, HDFS, and HBase services. Outbound traffic to services in the Admin tier and the Messaging tier is also required.
Network requirements: 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers.

SNYPR Search Tier

Description: Network traffic to these servers includes SNYPR Search (Solr). Outbound traffic to services in the Kafka Messaging tier is required.
Network requirements: 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers.

Messaging Tier

Description: This tier includes incoming traffic to the Kafka brokers (SSL traffic to port 9093, and ZooKeeper traffic on port 2181).
Network requirements: 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers.

Collection Tier

Description: This server collects logs and provides a syslog server on port 514. The connectors on the server also collect logs with native protocols. The primary network traffic from this tier is to the Admin tier on port 443 for web services and to the Kafka brokers in the Messaging tier on port 9093 (SSL).
Network requirements: Remote data center with outbound network access to the centralized data center.
If 10 gigabit Ethernet is not available and gigabit Ethernet is used in the deployment, then the performance of the deployment will be limited by the network performance.
Network Bandwidth Requirements from RIN Collection Tier to Messaging Tier

The table below displays the network bandwidth requirements from the Remote Ingestion Node (RIN) collection tier to the messaging tier (Kafka brokers).
Average EPS                                  20,000 EPS
Number of RINs                               1 RIN
Average message size                         600 bytes
Transferred to Kafka after compression (%)   30%
Total traffic to Kafka                       36 Mbits/s
Traffic per RIN to Kafka
(assuming equal distribution)                36 Mbits/s
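The 36 Mbits/s figure above can be reproduced as follows. The 25% protocol-overhead allowance is an assumption used here to match the table's figure (20,000 EPS x 600 bytes x 8 bits x 30% = 28.8 Mbits/s before overhead); it is not stated explicitly in the table.

```python
# RIN-to-Kafka bandwidth estimate after compression.
def kafka_bandwidth_mbps(eps, avg_msg_bytes, compressed_ratio, overhead=1.25):
    """Bandwidth in Mbits/s for compressed event traffic to Kafka.

    compressed_ratio: fraction of raw volume actually transferred
    (0.30 in the table above). overhead is an assumed allowance for
    protocol framing, chosen to match the published 36 Mbits/s figure.
    """
    raw_bits_per_sec = eps * avg_msg_bytes * 8
    return raw_bits_per_sec * compressed_ratio * overhead / 1_000_000

total = kafka_bandwidth_mbps(20_000, 600, 0.30)
print(round(total))  # 36 (Mbits/s); with 1 RIN, per-RIN traffic is the same
```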
Virtual Infrastructure

Due to the high-performance requirements of the solution, physical servers or dedicated cloud instances are recommended. A virtual infrastructure can be considered for small deployments or non-production environments.
Considerations for virtual deployments

- These are VMs that can be deployed as needed on the vSphere cluster, without over-subscription of either CPU or memory resources. Configure CPUs along physical socket boundaries. According to VMware, one VM per NUMA node is advisable.
- These nodes house the Cloudera Master services and serve as the gateway/edge device that connects the rest of the customer's network to the Cloudera cluster.
- Care should be taken to ensure that automated movement of VMs is disabled. There should be no DRS or vMotion of VMs allowed in this deployment model. This is critical because VMs are tied to physical disks, and movement of VMs within the cluster will result in data loss.
- Configure Distributed Resource Scheduler (DRS) rules so that there is strong negative affinity between the master node VMs. This ensures that no two master nodes are provisioned or migrated to the same physical vSphere host.
- A key configuration parameter to consider is the MTU size: ensure that the same MTU size is set at the physical switches, guest OS, ESXi VMNIC, and vSwitch layers. This is relevant when enabling jumbo frames (9000 MTU), which are recommended for Hadoop environments.
- Set up virtual disks in "independent persistent" mode for optimal performance. Eager Zeroed Thick virtual disks provide the best performance.
- Each provisioned disk is mapped to one vSphere datastore (which in turn contains one VMDK or virtual disk).
- The VMXNET3 NIC should be configured.
- Disable or minimize anonymous paging by setting vm.swappiness=0 or 1.
- VMs on the same physical host are affected by the same hardware failure. To match the reliability of a physical deployment, avoid replicating data across two virtual machines on the same host.
SNYPR Cloud Deployment

The SNYPR solution can be deployed in a cloud environment. Several considerations must be addressed when deploying SNYPR in the cloud, including the following:
- Infrastructure selection: The infrastructure used should provide equivalent resources (CPU, memory, and storage capacity and bandwidth) to the physical server recommendations listed in this document.
- Deployment architecture: SNYPR can be deployed exclusively in the cloud or as a hybrid cloud / on-site topology.
- Network access: The infrastructure must have access to the data (user, access, event log, TPI, etc.) that will be used. A virtual private cloud may be required for transmission of sensitive data.
Infrastructure Selection

SNYPR can be deployed in public or private cloud environments. Based on the deployment requirements of the solution, the specific infrastructure used for each cloud environment should be selected to ensure that the appropriate resources are available. This includes selection of the appropriate virtual instance types to support the CPU, memory, storage, and network bandwidth requirements of the solution.
Deployment Architecture

The deployment of SNYPR includes a Hadoop cluster as well as servers for the user interface and for event ingestion. When SNYPR is deployed in a cloud environment, there are two primary deployment alternatives. The first is a Securonix Cloud deployment where all servers in the cluster are hosted in the cloud.
The second is a Securonix Cloud / On-Premise deployment where the console nodesare deployed in the cloud and the ingestion nodes are deployed on-premise.
Considerations

This section contains considerations for the following topics:

- Amazon EC2
- Microsoft Azure
- Google Cloud
Amazon EC2

There are several Amazon EC2 instance types that are a good fit for deploying Securonix. The M4 general purpose instances are recommended. These are defined by Amazon as: "M4 instances are the latest generation of General-Purpose Instances. This family provides a balance of compute, memory, and network resources, and it is a good choice for many applications."
Features

- 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors
- EBS-optimized by default at no additional cost
- Support for Enhanced Networking
- Balance of compute, memory, and network resources
                     Hadoop       Compute /     Kafka        SNYPR        SNYPR
                     Master       Storage                    Search       Search
Amazon EC2
Instance Type        R5.4xlarge   M4.16xlarge   M5.2xlarge   M5.4xlarge   M4.16xlarge
RAM (GB)             128          256           32           64           256
vCPU                 16           64            8            16           64
Storage (GB, split
into multiple
EBS volumes)         10,000       10,000        3,000        3,000        10,000
Amazon provides several alternatives for the instance types used, like the R3.8XL and the D2.8XL, which are also good options. The storage chosen should provide adequate bandwidth to the volume used. This is the equivalent of 1,000 IOPS per instance to the selected storage type.
In addition to standard Amazon AWS EC2 instances, the guidance for deploying Cloudera in Amazon Web Services is recommended. See the following link: https://www.cloudera.com/partners/solutions/amazon-web-services.html.
Microsoft Azure

Several Azure virtual machine instance types (https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) are a good fit for deploying Securonix. A G4 (East US 2), D15 v2 (East US 2), or H16m (South Central US) instance type is recommended.

The Dv2 instances are recommended. These are defined by Microsoft as:

"D11-15 v2 instances are based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, and can achieve 3.1 GHz with Intel Turbo Boost Technology 2.0. D11-15 v2 are ideal for memory-intensive enterprise applications. D15 v2 instance is isolated to hardware dedicated to a single customer.

For persistent storage, use the variant "Dsv2" VMs and purchase Premium Storage separately."
                     Hadoop   Compute /   Kafka    SNYPR    SNYPR
                     Admin    Storage     Broker   Search   Console
Microsoft Azure
Instance Type        E16 v3   D64 v3      E16 v3   D64 v3   E16 v3
RAM (GB)             128      256         128      256      128
vCPU                 16       64          16       64       16
Storage (GB, split
into multiple
volumes)             3,000    10,000      5,000    10,000   5,000
Microsoft provides several alternatives for the storage for the instances used. The storage chosen should provide adequate bandwidth to the volume used. This is the equivalent of 1,000 IOPS per instance to the selected storage type.
In addition to standard Azure instances, the following guidance for deploying Cloudera in Microsoft Azure is recommended: https://www.cloudera.com/more/news-and-blogs/press-releases/2015-09-24-cloudera-enterprise-data-hub-edition-provides-enterprise-ready-hadoop-for-microsoft-azure.html.
Google Cloud

The table below shows an example configuration for a Google Cloud SNYPR architecture with 10,000 EPS and 30 days of search index storage.
Type                  Quantity   Instance Type    CPU       Memory   Storage
Master Servers        3          N1-Highmem-16    16 vCPU   104 GB   1 x /root 500 GB (SSD),
                                                                     1 x /zookeeper 250 GB (SSD)
SNYPR Console         1          N1-Standard-16   16 vCPU   60 GB    1 x /root 500 GB (SSD),
Servers                                                              8 x /data 500 GB (standard)
Compute / Storage     6          N1-Standard-64   64 vCPU   240 GB   1 x /root 128 GB (SSD),
Servers                                                              5 x /search[1-10] 1000 GB (standard)
Search / Storage      1          N1-Highmem-64    64 vCPU   416 GB   1 x /root 128 GB (SSD),
Servers                                                              10 x /search[1-10] 5500 GB (SSD)
Kafka Ingestion       3          N1-Standard-8    8 vCPU    30 GB    1 x /root 128 GB (SSD),
Servers                                                              1 x /zookeeper 256 GB (SSD),
                                                                     3 x /data 1024 GB (standard)
Remote Ingestion      1          N1-Standard-8    8 vCPU    30 GB    1 x /root 128 GB (SSD),
Nodes                                                                3 x /data 2000 GB (standard)
SNYPR Reference Hardware

The SNYPR architecture includes the following nodes that integrate with the Hadoop services:
SNYPR Application Server (Console) - User interface, configuration db, Redis

These are edge nodes in a Hadoop cluster that are used for the SNYPR user interface and the configuration repository for all components used by the solution. Each Console node performs the following tasks:

- Provide visualizations for monitoring events, threat management dashboards, investigations, and incident response
- Build custom dashboards with visualizations for viewing violation and event data
- Configure all ingestion jobs - user identities, access privileges, threat intelligence, security events, and others
- Provide an administration interface for application support personnel and administrators
- Configure all policies and analytics, including behavior-based anomaly detection, peer-based analytics, threat modeling, and risk analytics
SNYPR EYE - SNYPR EYE interface, configuration db

The SNYPR EYE server is a SNYPR monitoring and alerting server that is used for the configuration and operational health monitoring of all SNYPR services, including all the servers in the Hadoop cluster, the processes on the SNYPR Console, the SNYPR Spark Streaming applications running in the YARN cluster (including the data ingestion performance of all resources), and the performance and health of the SNYPR Search processes. The SNYPR EYE solution installs and manages SNYPR-EYE agents on the servers in the environment for local monitoring.
SNYPR Remote Ingestion Node

These nodes are edge nodes in a Hadoop cluster that are used to ingest security event log data into the environment with the Securonix connectors. Each SNYPR ingestion node performs the following tasks:

- Import events from log sources
- Publish events to the Kafka message bus with batching, compression, and encryption
- Accept incoming log files on syslog
- Cache in-transit messages
Hadoop Master - Hadoop cluster management services

These are the master servers in the Hadoop cluster.
Hadoop Compute / Storage Nodes

These are the main nodes in a Hadoop cluster that are used to store compressed data and process all the jobs associated with SNYPR. Each SNYPR Compute / Storage node performs the following tasks:

- Fetch data from the ingestion nodes.
- Perform all the jobs associated with SNYPR based on the configuration stored in the Master node, including parsing, indexing, analytics, and storage.
- Store data with 90% compression in structured JSON format.
- Pass processed data to the SNYPR Search indexes that are used by the SNYPR Console for review by the end user.
Hadoop Kafka Broker - Kafka broker, dedicated ZooKeeper

Kafka broker servers for in-transit messages, with ZooKeeper servers dedicated to Kafka. These servers use local storage for in-transit messages.
Reference Server Specifications

This section contains recommendations for the following topics:

- Hardware Specifications
- Server Mount Point
Hardware Specifications

The hardware specifications for the infrastructure are listed in the following table:
Configuration        SNYPR-M1:         SNYPR-M2:         SNYPR-M3:
                     Hadoop Master     Hadoop Master     Hadoop Master with
                                       with SNYPR        SNYPR and Kafka
Server Model         Dell R640         Dell R640         Dell R640
CPU                  2 x Intel Xeon    2 x Intel Xeon    2 x Intel Xeon
                     Gold 5120 2.2G,   Gold 5120 2.2G,   Gold 5120 2.2G,
                     14C/28T           14C/28T           14C/28T
Memory               256GB RDIMM,      256GB RDIMM,      256GB RDIMM,
                     2666MT/s          2666MT/s          2666MT/s
Boot Storage         2 x 1.6TB SSD     2 x 1.6TB SSD     2 x 1.6TB SSD
                     SATA Mix Use      SATA Mix Use      SATA Mix Use
                     12Gbps 512e       12Gbps 512e       12Gbps 512e
Additional Storage   4 x 2.4TB 10K     6 x 2.4TB 10K     8 x 2.4TB 10K
                     RPM SAS 12Gbps    RPM SAS 12Gbps    RPM SAS 12Gbps
                     4Kn               4Kn               4Kn
Network              10GE              10GE              10GE
Power                2 x 1100W         2 x 1100W         2 x 1100W
Rack Units           1RU               1RU               1RU
Configuration        SNYPR-C1:          SNYPR-C2:         SNYPR-C3:
                     Standard Density   High Density      Maximum Density
                     Compute/Storage    Compute/Storage   Compute/Storage
Server Model         Dell R640          Dell R740xd       Dell R740xd
CPU                  2 x Intel Xeon     2 x Intel Xeon    2 x Intel Xeon
                     Gold 5120 2.2G,    Gold 5120 2.2G,   Gold 5120 2.2G,
                     14C/28T            14C/28T           14C/28T
Memory               256GB RDIMM,       256GB RDIMM,      256GB RDIMM,
                     2666MT/s           2666MT/s          2666MT/s
Boot Storage         2 x 1.6TB SSD      2 x 1.6TB SSD     2 x 1.6TB SSD
                     SATA Mix Use       SATA Mix Use      SATA Mix Use
                     12Gbps 512e        12Gbps 512e       12Gbps 512e
Additional Storage   10 x 2.4TB 10K     24 x 2.4TB 10K    30 x 2.4TB 10K
                     RPM SAS 12Gbps     RPM SAS 12Gbps    RPM SAS 12Gbps
                     4Kn                4Kn               4Kn
Network              10GE               10GE              10GE
Power                2 x 1100W          2 x 1100W         2 x 1100W
Rack Units           1RU                2RU               2RU
Configuration        SNYPR-SEARCH1:          SNYPR-SEARCH3:
                     Standard Density        Maximum Density
                     Compute/Storage         Compute/Storage
Server Model         Dell R640               Dell R740xd
CPU                  2 x Intel Xeon Gold     2 x Intel Xeon Gold
                     5120 2.2G, 14C/28T      5120 2.2G, 14C/28T
Memory               256GB RDIMM, 2666MT/s   256GB RDIMM, 2666MT/s
Boot Storage         2 x 1.6TB SSD SATA      2 x 1.6TB SSD SATA
                     Mix Use 12Gbps 512e     Mix Use 12Gbps 512e
Additional Storage   10 x 2.4TB 10K RPM      30 x 2.4TB 10K RPM
                     SAS 12Gbps 4Kn          SAS 12Gbps 4Kn
Network              10GE                    10GE
Power                2 x 1100W               2 x 1100W
Rack Units           1RU                     2RU
Configuration        SNYPR-K3:         SNYPR-R1:          SNYPR-S3:
                     Kafka Brokers     Remote Ingestion   SNYPR Console
                                       Node
Server Model         Dell R640         Dell R640          Dell R640
CPU                  2 x Intel Xeon    2 x Intel Xeon     2 x Intel Xeon
                     Gold 5120 2.2G,   Gold 5120 2.2G,    Gold 5120 2.2G,
                     14C/28T           14C/28T            14C/28T
Memory               128GB RDIMM,      64GB RDIMM,        128GB RDIMM,
                     2666MT/s          2666MT/s           2666MT/s
Boot Storage         2 x 1.6TB SSD     2 x 1.6TB SSD      2 x 1.6TB SSD
                     SATA Mix Use      SATA Mix Use       SATA Mix Use
                     12Gbps 512e       12Gbps 512e        12Gbps 512e
Additional Storage   10 x 2.4TB 10K    4 x 2.4TB 10K      4 x 2.4TB 10K
                     RPM SAS 12Gbps    RPM SAS 12Gbps     RPM SAS 12Gbps
                     4Kn               4Kn                4Kn
Network              10GE              10GE               10GE
Power                2 x 1100W         2 x 1100W          2 x 1100W
Rack Units           1RU               1RU                1RU
Alternate hardware configurations can be used, but equivalent specifications are required for CPU, memory, network bandwidth, and storage capacity and bandwidth.
Server Mount Point

The storage mount point configuration for each of the servers is listed in the table below:
Mount Point    SNYPR-M1:   SNYPR-M2:       SNYPR-M3:       Comments
               Hadoop      Hadoop Master   Hadoop Master
               Master      with SNYPR      with SNYPR
                                           and Kafka
/              100 GB      100 GB          100 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/boot          2 GB        2 GB            2 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
swap           10 GB       10 GB           10 GB           RAID 1 (1.6 TB mixed use SSD drives), xfs
/zookeeper     100 GB      100 GB          100 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/var           800 GB      800 GB          800 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/dfs           200 GB      200 GB          200 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/securonix     4.2 TB      6.3 TB          8.4 TB          RAID 10, xfs; if syslog is used locally, use a higher storage amount
/snyprsearch   -           -               -               RAID 6
/data1         -           -               2.1 TB          JBOD, xfs, noatime
/data2         -           -               2.1 TB          JBOD, xfs, noatime
/data3         -           -               2.1 TB          JBOD, xfs, noatime
/data4         -           -               2.1 TB          JBOD, xfs, noatime
Mount Point       SNYPR-C1:          SNYPR-C2:         SNYPR-C3:         Comments
                  Standard Density   High Density      Maximum Density
                  Compute/Storage    Compute/Storage   Compute/Storage
/                 100 GB             100 GB            100 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/boot             2 GB               2 GB              2 GB              RAID 1 (1.6 TB mixed use SSD drives), xfs
swap              10 GB              10 GB             10 GB             RAID 1 (1.6 TB mixed use SSD drives), xfs
/zookeeper        100 GB             100 GB            100 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/var              800 GB             800 GB            800 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/dfs              200 GB             200 GB            200 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/securonix        -                  -                 -                 RAID 10, xfs; if syslog is used locally, use a higher storage amount
/snyprsearch      -                  -                 -                 RAID 6
/data1-/data10    2.1 TB each        2.1 TB each       2.1 TB each       JBOD, xfs, noatime
/data11-/data24   -                  2.1 TB each       2.1 TB each       JBOD, xfs, noatime
/data25-/data30   -                  -                 2.1 TB each       JBOD, xfs, noatime
Mount Point    SNYPR-SEARCH1:     SNYPR-SEARCH3:    Comments
               Standard Density   Maximum Density
               Compute/Storage    Compute/Storage
/              100 GB             100 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/boot          2 GB               2 GB              RAID 1 (1.6 TB mixed use SSD drives), xfs
swap           10 GB              10 GB             RAID 1 (1.6 TB mixed use SSD drives), xfs
/zookeeper     100 GB             100 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/var           800 GB             800 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/dfs           200 GB             200 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
/securonix     -                  -                 RAID 10, xfs; if syslog is used locally, use a higher storage amount
/snyprsearch   17 TB              60 TB             RAID 6
Mount Point      SNYPR-K3:       SNYPR-R1:          SNYPR-S3:       Comments
                 Kafka Brokers   Remote Ingestion   SNYPR Console
                                 Node
/                100 GB          100 GB             100 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/boot            2 GB            2 GB               2 GB            RAID 1 (1.6 TB mixed use SSD drives), xfs
swap             10 GB           10 GB              10 GB           RAID 1 (1.6 TB mixed use SSD drives), xfs
/zookeeper       100 GB          -                  -               RAID 1 (1.6 TB mixed use SSD drives), xfs
/var             800 GB          1000 GB            1000 GB         RAID 1 (1.6 TB mixed use SSD drives), xfs
/dfs             200 GB          200 GB             200 GB          RAID 1 (1.6 TB mixed use SSD drives), xfs
/securonix       -               4.2 TB             4.2 TB          RAID 10, xfs; if syslog is used locally, use a higher storage amount
/snyprsearch     -               -                  -               RAID 6
/data1-/data10   2.1 TB each     -                  -               JBOD, xfs, noatime
Alternatives for Limiting the Size of the Infrastructure

The recommended architecture assumes full functionality and full access to indexed data and source data for the duration of the retention period.

Other factors may reduce the size of the recommended infrastructure, such as a reduction in the volume of log data or filtering of some log data to avoid storage of unneeded events.

You can also configure the Hadoop compute and storage nodes to use very dense storage per node. The following table shows an example configuration with dense storage.
Model           vCPU   Memory (GB)   Storage (TB)
SNYPR-S3        56     256           3
SNYPR-SEARCH1   56     256           17
SNYPR-R3        32     64            3
SNYPR-M3        56     256           9
SNYPR-C1        56     256           21
SNYPR-K3        56     64            15
The following table shows search index storage estimates for the Premium retention options (60 or 90 days of search retention, with 1 or 2 replicas):

EPS      Avg       Events per      GB/   60 Days,    60 Days,     90 Days,    90 Days,
         Message   Day             Day   1 Replica   2 Replicas   1 Replica   2 Replicas
         Size                            (GB)        (GB)         (GB)        (GB)
1,000    600       86,400,000      48    1,448       2,897        2,173       4,345
2,500    600       216,000,000     121   3,621       7,242        5,431       10,863
5,000    600       432,000,000     241   7,242       14,484       10,863      21,726
7,500    600       648,000,000     362   10,863      21,726       16,294      32,589
10,000   600       864,000,000     483   14,484      28,968       21,726      43,452
15,000   600       1,296,000,000   724   21,726      43,452       32,589      65,178
20,000   600       1,728,000,000   966   28,968      57,936       43,452      86,904
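The search storage figures above can be reproduced with the sketch below. The 0.5 index ratio (search index occupying roughly half the raw ingest volume) is inferred from the published figures, not a documented Securonix constant, and GB here means GiB (bytes divided by 2^30).

```python
# Search index storage estimate, matching the table above.
def search_storage_gb(eps, avg_msg_bytes=600, days=60, replicas=1,
                      index_ratio=0.5):
    """Estimated search index storage in GB (GiB).

    index_ratio: assumed fraction of raw ingest volume occupied by the
    index; 0.5 is inferred from the table, not an official figure.
    """
    events_per_day = eps * 86_400
    gb_per_day = events_per_day * avg_msg_bytes / 2**30  # raw ingest, GiB/day
    return gb_per_day * days * replicas * index_ratio

print(round(search_storage_gb(1_000, days=60)))              # 1448
print(round(search_storage_gb(1_000, days=90, replicas=2)))  # 4345
```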
Sizing and Capacity Planning

Securonix provides sizing and capacity planning. The considerations and examples of the sizing recommendations are provided here. (Contact Securonix Support for more specific deployment recommendations.)
Server Types

Type                     Description                            Deployment
SNYPR Application        SNYPR Console user interface and       Dedicated SNYPR server
                         SNYPR EYE monitoring server
SNYPR Search             SNYPR Search and indexer servers       Dedicated SNYPR server
                         with local storage of search indexes
Remote Ingestion Nodes   Remote ingestion servers for log       Dedicated SNYPR server
                         collection
Hadoop Master            Hadoop management server               Dedicated SNYPR server or
                                                                existing Hadoop cluster
Compute / Storage        Hadoop compute and storage server      Dedicated SNYPR server or
                                                                existing Hadoop cluster
Kafka Brokers            Kafka brokers for transient messages   Dedicated SNYPR server or
                                                                existing Hadoop cluster
Assumptions

Several assumptions are made when providing a sizing recommendation. The following list is an example of input assumptions and related sizing recommendations:

- EPS input is pre-filtered EPS (for UEBA, 40% filtering is assumed; for SDL, no filtering)
- If EPS is greater than 2,500 EPS, dedicated SNYPR Search servers are required
- YARN will use 50% of the memory on the compute nodes
- In a small cluster (fewer than 6 nodes), the master will have 20 vCPUs for YARN and 64 GB RAM
- Kafka compression from the RIN = 3x (30% of source)
- Additional adjustment considerations:
  - HDFS data nodes - resources required: 4 vCPUs of compute on data node servers, 8 GB
  - HBase region servers - resources required: 5 percent of compute on region servers, 16 GB
  - Search - if 7200 RPM drives are used, reduce EPS by 20 percent
  - Custom hardware requires a YARN memory-to-vCPU ratio of 3.2 GB per vCPU or higher
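The memory-to-vCPU rule above can be checked as a quick calculation. The node figures and YARN percentages below are illustrative, taken from the reference compute node (256 GB, 56 threads) and the sample sizing inputs (60% memory, 70% vCPU given to YARN); they are not requirements of the check itself.

```python
# Checks the rule of thumb stated above: the memory allocated to YARN
# should be at least 3.2 GB per YARN vCPU.
def yarn_ratio_ok(node_mem_gb, node_vcpu, mem_pct=0.60, vcpu_pct=0.70,
                  min_ratio=3.2):
    yarn_mem_gb = node_mem_gb * mem_pct    # memory handed to YARN
    yarn_vcpu = node_vcpu * vcpu_pct       # vCPUs handed to YARN
    return yarn_mem_gb / yarn_vcpu >= min_ratio

# 256 GB / 56 vCPU node: 153.6 GB / 39.2 vCPU ~= 3.9 GB per vCPU
print(yarn_ratio_ok(256, 56))  # True
```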
Type                        Sample Input Values
Deployment Type             SDL
Average Events Per Second   20,000
Peak EPS                    20,000
Peak Filtered EPS           20,000
Average Message Size        600 bytes
Identities                  20,000
Analytics Complexity        Medium
Search Retention            30 days
Long Term Storage           180 days
Kafka Retention             2 days
HDFS Replication            2
Solr HA                     No
Excess Capacity             None
RIN HA                      No
SNYPR Console HA            No
SNYPR Console Dedicated     Yes
YARN vCPU Percent           70
YARN Memory Percent         60
Type | Output Calculated Values
Peak EPS | 20,000 EPS
Peak Filtered EPS | 20,000 EPS
Daily Events | 1,728,000,000 events per day
Daily Ingestion Size | 966 GB/day
Total Search Events | 52 billion events
Total Search Storage | 14,484 GB
Total Long Term Events | 311 billion events
Total Long Term (HDFS) Storage | 109,499 GB
Identities | 20,000
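The calculated values follow directly from the sample inputs. A minimal sketch of the arithmetic, assuming binary (1024-based) GB; the variable names are illustrative, not product settings:

```shell
#!/bin/sh
# Reproduce the calculated sizing values from the sample inputs (illustrative).
EPS=20000            # peak filtered events per second
MSG_BYTES=600        # average message size in bytes
SEARCH_DAYS=30       # search retention in days
LONGTERM_DAYS=180    # long-term (HDFS) retention in days

DAILY_EVENTS=$((EPS * 86400))                      # 1,728,000,000 events per day
SEARCH_EVENTS=$((DAILY_EVENTS * SEARCH_DAYS))      # ~52 billion events
LONGTERM_EVENTS=$((DAILY_EVENTS * LONGTERM_DAYS))  # ~311 billion events
DAILY_GB=$(awk "BEGIN {printf \"%.0f\", $DAILY_EVENTS * $MSG_BYTES / 1024 ^ 3}")

echo "Daily events:     $DAILY_EVENTS"
echo "Daily ingestion:  ${DAILY_GB} GB/day"
echo "Search events:    $SEARCH_EVENTS"
echo "Long-term events: $LONGTERM_EVENTS"
```

With the sample inputs this reproduces the table above: 1,728,000,000 daily events and roughly 966 GB of daily ingestion.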
Deployment Option 1: Dedicated SNYPR Data Lake

Server Type | Quantity | Model
SNYPR Application | 1 | SNYPR-S3
SNYPR Search | 2 | SNYPR-SEARCH1
Remote Ingestion Nodes | 2 | SNYPR-R3
Hadoop Master | 3 | SNYPR-M2
Compute / Storage | 8 | SNYPR-C1
Kafka Brokers | 3 | SNYPR-K2
Total Servers | 19 |
Deployment Option 2: Existing Data Lake

Existing Hadoop Recommendation

SNYPR Edge Node Server Type | Quantity | Model
SNYPR Application | 1 | SNYPR-S3
SNYPR Search | 2 | SNYPR-SEARCH1
Remote Ingestion Nodes | 2 | SNYPR-R3
Total SNYPR Edge Node Servers | 5 |

Existing Hadoop Capacity Required

Resource | Capacity
YARN Capacity | 320 vCPU, 1,229 GB RAM
HDFS Storage (3x replication) | 107 TB
Kafka Storage (2 days retention) | 45 TB
Spark Jobs Configuration for Kerberized Kafka

When running the SNYPR Spark applications in a Kerberized cluster, add the parameters below to the Spark job scripts so the jobs can connect to secure Kafka:
--driver-java-options "-
Djava.security.auth.login.config=/opt/keytabs/jaas.conf -
Djute.maxbuffer=50000000 -Dspark.driver.userClassPathFirst=true -
Dspark.executor.userClassPathFirst=true" \
--conf "spark.executor.extraJavaOptions=-
Djava.security.auth.login.config=/opt/keytabs/jaas.conf -
XX:+UseConcMarkSweepGC -
Dlog4j.configuration=./conf/log4j.properties -
Djute.maxbuffer=50000000 -Xss1G" \
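In context, the options above attach to a full spark-submit invocation. A hedged sketch of how they might be wired in; the master/deploy-mode flags, class name, and jar path are placeholders (not actual SNYPR job names), while the jaas.conf path and Java options come from the guide:

```shell
# Illustrative spark-submit wrapper for a Kerberized cluster.
# com.example.SparkJob and example-job.jar are placeholders.
spark-submit \
  --master yarn --deploy-mode client \
  --driver-java-options "-Djava.security.auth.login.config=/opt/keytabs/jaas.conf -Djute.maxbuffer=50000000 -Dspark.driver.userClassPathFirst=true -Dspark.executor.userClassPathFirst=true" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/opt/keytabs/jaas.conf -XX:+UseConcMarkSweepGC -Dlog4j.configuration=./conf/log4j.properties -Djute.maxbuffer=50000000 -Xss1G" \
  --class com.example.SparkJob /opt/snypr/jobs/example-job.jar
```

The same -Djava.security.auth.login.config setting appears in both the driver and executor options because every JVM that talks to secure Kafka needs the JAAS configuration.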
Network Tuning Recommendations

The network configuration can have a dramatic performance impact on the environment. The network tuning guidance in this section can be used to optimize the network configuration for the Linux servers in the environment.
Modify Network Kernel Settings

Edit the network tuning parameters in the /etc/sysctl.conf file:
# vi /etc/sysctl.conf
Edit the following values:
# allow testing with buffers up to 128MB
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# increase Linux autotuning TCP buffer limit to 64MB
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
# recommended default congestion control is htcp
net.ipv4.tcp_congestion_control=htcp
# recommended for hosts with jumbo frames enabled (only relevant for systems with 10G interfaces)
net.ipv4.tcp_mtu_probing=1
# recommended for CentOS7
net.core.default_qdisc = fq
For the above changes to take effect, reboot the server, or apply them immediately with sysctl -p.
Increase the Transmit Queue Length

Set the txqueuelen permanently:
vi /etc/rc.local
Add the following (this is the interface where you will receive data):
/sbin/ifconfig em1 txqueuelen 10000
To validate:
# ifconfig em1 | grep txque
ether 90:b1:1c:1f:e6:1b txqueuelen 10000 (Ethernet)
Location | Value
/etc/sysctl.conf | vm.swappiness = 10
/etc/security/limits.conf | hdfs - nofile 32768
/etc/security/limits.conf | mapred - nofile 32768
/etc/security/limits.conf | hbase - nofile 32768
/etc/security/limits.conf | yarn - nofile 32768
/etc/security/limits.conf | solr - nofile 32768
/etc/security/limits.conf | sqoop2 - nofile 32768
/etc/security/limits.conf | spark - nofile 32768
/etc/security/limits.conf | hive - nofile 32768
/etc/security/limits.conf | impala - nofile 32768
/etc/security/limits.conf | hue - nofile 32768
/etc/security/limits.conf | kafka - nofile 32768
/etc/security/limits.conf | hdfs - nproc 32768
/etc/security/limits.conf | mapred - nproc 32768
/etc/security/limits.conf | hbase - nproc 32768
/etc/security/limits.conf | yarn - nproc 32768
/etc/security/limits.conf | solr - nproc 32768
/etc/security/limits.conf | sqoop2 - nproc 32768
/etc/security/limits.conf | spark - nproc 32768
/etc/security/limits.conf | hive - nproc 32768
/etc/security/limits.conf | impala - nproc 32768
/etc/security/limits.conf | hue - nproc 32768
/etc/security/limits.conf | kafka - nproc 32768
/etc/security/limits.d/20-nproc.conf | hdfs - nproc 32768
/etc/security/limits.d/20-nproc.conf | mapred - nproc 32768
/etc/security/limits.d/20-nproc.conf | hbase - nproc 32768
/etc/security/limits.d/20-nproc.conf | yarn - nproc 32768
/etc/security/limits.d/20-nproc.conf | solr - nproc 32768
/etc/security/limits.d/20-nproc.conf | sqoop2 - nproc 32768
/etc/security/limits.d/20-nproc.conf | spark - nproc 32768
/etc/security/limits.d/20-nproc.conf | hive - nproc 32768
/etc/security/limits.d/20-nproc.conf | impala - nproc 32768
/etc/security/limits.d/20-nproc.conf | hue - nproc 32768
/etc/security/limits.d/20-nproc.conf | kafka - nproc 32768
/sys/kernel/mm/transparent_hugepage/defrag | echo never
/sys/kernel/mm/transparent_hugepage/enabled | echo never
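Since every service account in the table gets identical nofile/nproc limits, the entries can be generated with a short loop rather than typed by hand. A sketch; the account list is taken from the table, so trim it to the services actually installed:

```shell
#!/bin/sh
# Emit the limits.conf entries from the table for each Hadoop service account.
USERS="hdfs mapred hbase yarn solr sqoop2 spark hive impala hue kafka"
for u in $USERS; do
  echo "$u - nofile 32768"   # max open file descriptors
  echo "$u - nproc 32768"    # max processes/threads
done
# Review the output, then append it to /etc/security/limits.conf
# (and the nproc lines to /etc/security/limits.d/20-nproc.conf).
```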
Proposed Configuration Tuning

jetty.conf for SNYPR Search: increase the default timeout from 50K ms to 180K ms.
/etc/sysctl.conf
# --------------------------------------------------------------------
# The following allow the server to handle lots of connection requests
# --------------------------------------------------------------------
# Increase number of incoming connections that can queue up
# before dropping
net.core.somaxconn = 50000
# Handle SYN floods and large numbers of valid HTTPS connections
net.ipv4.tcp_max_syn_backlog = 30000
# Increase the length of the network device input queue
net.core.netdev_max_backlog = 20000
# Increase system file descriptor limit so we will (probably)
# never run out under lots of concurrent requests.
# (Per-process limit is set in /etc/security/limits.conf)
fs.file-max = 100000
# Widen the port range used for outgoing connections
net.ipv4.ip_local_port_range = 10000 65000
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# --------------------------------------------------------------------
# The following help the server efficiently pipe large amounts of data
# --------------------------------------------------------------------
# Disable source routing and redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
# Disable packet forwarding.
net.ipv4.ip_forward = 0
net.ipv6.conf.all.forwarding = 0
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# Turn on the tcp_window_scaling
net.ipv4.tcp_window_scaling = 1
# Turn on the tcp_timestamps
net.ipv4.tcp_timestamps = 1
# Turn on the tcp_sack
net.ipv4.tcp_sack = 1
# Change Congestion Control (default: reno)
net.ipv4.tcp_congestion_control=htcp
# Increase Linux autotuning TCP buffer limits
# Set max to 16MB for 1GE and 32M (33554432) or 54M (56623104) for 10GE
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
# --------------------------------------------------------------------
# The following allow the server to handle lots of connection churn
# --------------------------------------------------------------------
# Disconnect dead TCP connections after 1 minute
net.ipv4.tcp_keepalive_time = 60
# Wait a maximum of 5 * 2 = 10 seconds in the TIME_WAIT state after a FIN, to handle
# any remaining packets in the network.
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 10
# How long to keep ESTABLISHED connections in conntrack table
# Should be higher than tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
net.netfilter.nf_conntrack_tcp_timeout_established = 300
net.netfilter.nf_conntrack_generic_timeout = 300
# Allow a high number of timewait sockets
net.ipv4.tcp_max_tw_buckets = 2000000
# Timeout broken connections faster (amount of time to wait for FIN)
net.ipv4.tcp_fin_timeout = 10
# Let the networking stack reuse TIME_WAIT connections when it thinks it's safe to do so
net.ipv4.tcp_tw_reuse = 1
# Determines the wait time between isAlive interval probes (reduce from 75 sec to 15)
net.ipv4.tcp_keepalive_intvl = 15
# Determines the number of probes before timing out (reduce from 9 to 5)
net.ipv4.tcp_keepalive_probes = 5
# -------------------------------------------------------------
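The conntrack guidance above requires the established-connection timeout to exceed tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl. With the values in this file, the check is simple arithmetic:

```shell
#!/bin/sh
# Verify the conntrack established timeout exceeds the keepalive detection window.
KEEPALIVE_TIME=60          # net.ipv4.tcp_keepalive_time
KEEPALIVE_INTVL=15         # net.ipv4.tcp_keepalive_intvl
KEEPALIVE_PROBES=5         # net.ipv4.tcp_keepalive_probes
CONNTRACK_ESTABLISHED=300  # net.netfilter.nf_conntrack_tcp_timeout_established

WINDOW=$((KEEPALIVE_TIME + KEEPALIVE_PROBES * KEEPALIVE_INTVL))  # 60 + 5*15 = 135 s
if [ "$CONNTRACK_ESTABLISHED" -gt "$WINDOW" ]; then
  echo "OK: established timeout (${CONNTRACK_ESTABLISHED}s) > keepalive window (${WINDOW}s)"
else
  echo "WARNING: raise nf_conntrack_tcp_timeout_established above ${WINDOW}s"
fi
```

With the recommended settings the keepalive window is 135 seconds, comfortably below the 300-second established timeout.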
RIN Syslog Configuration

When the NDB (Data Broker) is used in the design, the value of the parameter below needs to be 0:
net.ipv4.conf.enp94s0f1.rp_filter = 0
The NDB is a one-way device and will not acknowledge the packets it receives, regardless of what the OS may send. For this reason, the kernel will drop packets if the above is set to 1.
Hadoop Cluster Tuning Recommendations

Table 1 lists the tuning parameters for each of the services in the Hadoop cluster that optimize Hadoop cluster performance for the SNYPR workloads.
Hadoop Cluster Performance
Service | Scope | Setting | Value | Conservative | Optimal
Yarn | All | YARN container memory | 60 GB | 60 GB | 70 GB
Yarn | All | YARN container memory maximum | 4 GB | 4 GB | 4 GB
Yarn | All | Java heap size of NodeManager | 850 MB | 1 GB | 850 MB
Yarn | All | Java heap size of ResourceManager | 2 GB | 2 GB | 2 GB
Yarn | All | ZooKeeper client timeout (zkClientTimeout) | 1 min | 1 min | 1 min
HBase | All | Java heap size of HBase Master in bytes | 1 GB | 1 GB | 1 GB
HBase | All | Java heap size of HBase Thrift in bytes | 1 GB | 1 GB | 1 GB
HBase | All | Java heap size of HBase RegionServer in bytes | 15 GB | 20 GB | 20 GB
HBase | Cloudera | hbase.rpc.timeout | 15 min | 10 min | 15 min
HBase | Cloudera | HBase client scanner timeout | 15 min | 10 min | 15 min
HBase | Cloudera | RegionServer lease period | 15 min | 10 min | 15 min
HBase | All | zookeeper.session.timeout | 90000 | 90000 | 90000
HBase | Cloudera | HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml | - | name: hbase.ipc.warn.response.time, value: 500 | name: hbase.ipc.warn.response.time, value: 500
HDFS | All | Java heap size of NameNode in bytes | 8 GB | 8 GB | 16 GB
HDFS | All | Java heap size of DataNode in bytes | 8 GB | 8 GB | 8 GB
HDFS | All | Maximum concurrent moves | 300 | 300 | 300
HDFS | All | DataNode balancing bandwidth | 10 MB default, 1 GB optional | 10 MB default, 1 GB optional | 10 MB default, 1 GB optional
HDFS | All | Maximum memory used for caching | 2 GB | 2 GB | 2 GB
HDFS | All | Maximum number of transfer threads | 16000 | 16000 | 16000
Impala | Cloudera | Java heap size of Catalog Server in bytes | 2 GB | 4 GB | 2 GB
Impala | Cloudera | Impala Daemon memory limit | 12 GB | 20 GB | 12 GB
Spark | All | Java heap size of History Server in bytes | 512 MB | 512 MB | 512 MB
Spark 2 | All | Java heap size of History Server in bytes | 512 MB | 512 MB | 512 MB
Hive | All | Spark executor maximum Java heap size | 256 MB | 256 MB | 256 MB
Hive | All | Spark driver maximum Java heap size | 256 MB | 256 MB | 256 MB
Hive | All | Spark driver memory overhead | 26 MB | 256 MB | 26 MB
Hive | All | Spark executor memory overhead | 26 MB | 256 MB | 26 MB
Hive | All | Java heap size of Hive Metastore Server in bytes | 4 GB | 4 GB | 4 GB
Kafka | All | Maximum message size (message.max.bytes) | 10 MiB | 10 MiB | 10 MiB
Kafka | All | Replica maximum fetch size (replica.fetch.max.bytes) | 15 MiB | 15 MiB | 15 MiB
Kafka | All | Kafka Broker logging level | - | ERROR | ERROR
Kafka | All | Kafka MirrorMaker logging threshold | - | ERROR | ERROR
Kafka | All | ZooKeeper session timeout (zookeeper.session.timeout.ms) | 6 s | 6 s | 6 s
Kafka | All | Number of replica fetchers | 2 | 2 | 4
Kafka | All | Open file limit (maximum file descriptors) | 100000 | 100000 | 100000
Kafka | All | Java heap size of Broker | 8 GB | 8 GB | 8 GB
Kafka | All | Data retention hours (log.retention.hours) | 7 days | 7 days | 7 days
Kafka | All | Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties | - | see property list below | see property list below
HDFS | All | Blocks with corrupt replicas monitoring thresholds | warning: 0.5, critical: 1 | warning: 0.5, critical: 1 | warning: 0.5, critical: 1
HDFS | All | Under-replicated block monitoring thresholds | warning: 10, critical: 40 | warning: 10, critical: 40 | warning: 10, critical: 40
HDFS | All | Replication factor | 2 | 2 | 2
HDFS | All | Minimal block replication | 1 | 1 | 1
HDFS | All | Maximal block replication | 512 | 512 | 512
HDFS | Cloudera | Safemode threshold percentage | 0.999 | 0.999 | 0.999
Impala | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
Kafka | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
Spark | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
Yarn | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
ZooKeeper | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
ZooKeeper-Kafka | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
HBase | Cloudera | Dump heap when out of memory | disabled | disabled | disabled
Kafka | All | Minimum number of replicas in ISR (min.insync.replicas) | - | 1 | 1
HBase | All | Java configuration options for HBase RegionServer | - | see JVM options below | see JVM options below
ZooKeeper | Cloudera | Jute max buffer | 90 MB | 90 MB | 90 MB
ZooKeeper | All | Java heap size of ZooKeeper Server in bytes | 6 GB | 8 GB | 6 GB
ZooKeeper | All | Minimum session timeout | 8000 | 8000 | 8000
ZooKeeper | All | Maximum session timeout | 90000 | 90000 | 90000
ZooKeeper | All | Canary connection timeout | 20 seconds | 20 seconds | 20 seconds
ZooKeeper | All | Tick time (tickTime) | 4000 | 4000 | 4000
ZooKeeper | All | Maximum client connections (maxClientCnxns) | 8000 | 8000 | 8000
ZooKeeper-Kafka | All | Java heap size of ZooKeeper Server in bytes | 8 GB | 8 GB | 8 GB
ZooKeeper-Kafka | All | Tick time (tickTime) | 4000 | 2000 | 4000
ZooKeeper-Kafka | Cloudera | Jute max buffer | 50 MB | 50 MB | 50 MB
ZooKeeper-Kafka | All | Maximum client connections (maxClientCnxns) | 6000 | 6000 | 6000
ZooKeeper-Kafka | All | minSessionTimeout | 4000 | 4000 | 4000
ZooKeeper-Kafka | All | maxSessionTimeout | 90000 | 60000 | 90000
YARN | All | yarn.resourcemanager.am.max-retries, yarn.resourcemanager.am.max-attempts | 20 | 20 | 20
Impala | All | Impala Daemon safety valve | --enable_partitioned_aggregation=true --enable_partitioned_hash_join=true

Kafka Broker safety valve property list for kafka.properties (Conservative and Optimal):

num.network.threads=16
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
replica.fetch.wait.max.ms=500
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.high.watermark.checkpoint.interval.ms=5000
controller.socket.timeout.ms=30000
controller.message.queue.size=10
zookeeper.sync.time.ms=2000
queued.max.requests=16
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
allow.everyone.if.no.acl.found=true

HBase RegionServer JVM options (Conservative and Optimal):

-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ParallelGCThreads=20 -XX:ConcGCThreads=15 -XX:+UnlockExperimentalVMOptions -XX:G1MixedGCLiveThresholdPercent=85 -XX:G1HeapWastePercent=2 -XX:InitiatingHeapOccupancyPercent=35 -XX:+PrintReferenceGC -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=20M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/hbase/gc.log
Hadoop Cluster Log Configuration
Service | Level | Property
HBase | ERROR | Gateway Logging Threshold
HBase | ERROR | HBase REST Server Logging Threshold
HDFS | ERROR | DataNode Logging Threshold
HDFS | ERROR | Failover Controller Logging Threshold
HDFS | ERROR | Gateway Logging Threshold
HDFS | ERROR | HttpFS Logging Threshold
HDFS | ERROR | JournalNode Logging Threshold
HDFS | ERROR | NFS Gateway Logging Threshold
HDFS | ERROR | NameNode Block State Change Logging Threshold
HDFS | ERROR | NameNode Logging Threshold
HDFS | ERROR | SecondaryNameNode Logging Threshold
Hive | ERROR | Gateway Logging Threshold
Hive | ERROR | Hive Metastore Server Logging Threshold
Hive | ERROR | HiveServer2 Logging Threshold
Hive | ERROR | WebHCat Server Logging Threshold
Impala | ERROR | Impala Catalog Server Logging Threshold
Impala | ERROR | Impala Daemon Logging Threshold
Impala | ERROR | Impala Llama ApplicationMaster Logging Threshold
Impala | ERROR | Impala StateStore Logging Threshold
Kafka | ERROR | Gateway Logging Threshold
Kafka | ERROR | Kafka Broker Logging Threshold
Kafka | ERROR | Kafka MirrorMaker Logging Threshold
Key Value Store | ERROR | Lily HBase Indexer Logging Threshold
Oozie | ERROR | Oozie Server Logging Threshold
Spark | ERROR | Shell Logging Threshold
Spark | ERROR | Gateway Logging Threshold
YARN | ERROR | History Server Logging Threshold
YARN | ERROR | Gateway Logging Threshold
YARN | ERROR | JobHistory Server Logging Threshold
YARN | ERROR | NodeManager Logging Threshold
YARN | ERROR | ResourceManager Logging Threshold
Zookeeper | ERROR | Server Logging Threshold
Cloudera Manager | ERROR | Activity Monitor Logging Threshold
Cloudera Manager | ERROR | Alert Publisher Logging Threshold
Cloudera Manager | ERROR | Event Server Logging Threshold
Cloudera Manager | ERROR | Host Monitor Logging Threshold
Cloudera Manager | ERROR | Service Monitor Logging Threshold
Remote Ingestion Node Tuning

The configurations below have been tested to support TCP connections from 3,000 to 5,000 hosts providing a continuous stream of data. The NIC on the server is 10G to support the increased load.
If the data is forwarded from a SIEM to the RIN, the number of TCP connections made is minimal (fewer than 50). In that scenario, a high number of connections is not a bottleneck and aggressive tuning is not suggested. For high-EPS environments, dedicated resources are required.
Reference Server Used is a VM

Compare your server to the reference server. Reference lscpu output:

- Architecture: x86_64
- CPU op-mode(s): 32-bit, 64-bit
- Byte Order: Little Endian
- CPU(s): 8
- On-line CPU(s) list: 0-7
- Thread(s) per core: 1
- Core(s) per socket: 2
- Socket(s): 4
- NUMA node(s): 1
- Vendor ID: GenuineIntel
- CPU family: 6
- Model: 58
- Model name: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
- Stepping: 0
- CPU MHz: 2700.000
- BogoMIPS: 5400.00
- Hypervisor vendor: VMware
- Virtualization type: full
- L1d cache: 32K
- L1i cache: 32K
- L2 cache: 256K
- L3 cache: 30720K
- NUMA node0 CPU(s): 0-7
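Comparing a candidate VM to this reference can be scripted by parsing the same lscpu fields. A sketch; the reference value is taken from the list above:

```shell
#!/bin/sh
# Compare the local VM's CPU count against the reference RIN server (8 vCPUs).
REF_CPUS=8
cpus=$(lscpu | awk -F: '/^CPU\(s\):/ {gsub(/ /, "", $2); print $2; exit}')
echo "This server: ${cpus} vCPUs (reference: ${REF_CPUS})"
if [ "${cpus:-0}" -lt "$REF_CPUS" ]; then
  echo "WARNING: fewer vCPUs than the reference server; expect lower sustainable EPS"
fi
```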
Server Preparation

Adding network monitoring tools is recommended for gathering statistics and debugging when errors are present. Requesting dedicated resources ensures optimal performance for the collectors.
Dedicated resources, and VM configurations that set the latency sensitivity to High (Edit Settings > VM Options > Latency Sensitivity), are essential in any of the following scenarios:

- Unfiltered EPS > 10K
- Inbound TCP connections > 10
- Complex filters
Recommended Tools for Network Statistics

- Install netstat: rpm -ivh net-tools-1.60-114.el6.x86_64.rpm. For more rpm packages: https://rpmfind.net/linux/rpm2html/search.php?query=%2Fbin%2Fnetstat
- Install ethtool: rpm -ivh ethtool-3.5-6.el6.x86_64.rpm. For additional rpm packages: http://fr2.rpmfind.net/linux/rpm2html/search.php?query=ethtool
Tune Server Network Parameters

Edit the sysctl.conf file:
vi /etc/sysctl.conf
Add the following parameters to sysctl.conf:
# --------------------------------------------------------------------
# The following allow the server to handle lots of connection requests
# --------------------------------------------------------------------
# Increase number of incoming connections that can queue up
# before dropping
net.core.somaxconn = 50000
# Handle SYN floods and large numbers of valid HTTPS connections
net.ipv4.tcp_max_syn_backlog = 30000
# Increase the length of the network device input queue
net.core.netdev_max_backlog = 20000
# Increase system file descriptor limit so we will (probably)
# never run out under lots of concurrent requests.
# (Per-process limit is set in /etc/security/limits.conf)
fs.file-max = 100000
# Widen the port range used for outgoing connections
net.ipv4.ip_local_port_range = 10000 65000
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# --------------------------------------------------------------------
# The following help the server efficiently pipe large amounts of data
# --------------------------------------------------------------------
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# Keep tcp_window_scaling on
net.ipv4.tcp_window_scaling = 1
# Turn off the tcp_timestamps
net.ipv4.tcp_timestamps = 0
# Turn off the tcp_sack
net.ipv4.tcp_sack = 0
# Increase Linux autotuning TCP buffer limits
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
# allow testing with buffers up to 128MB
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# increase Linux autotuning TCP buffer limit to 64MB
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
# recommended default congestion control is htcp
net.ipv4.tcp_congestion_control=htcp
# recommended for hosts with jumbo frames enabled (only relevant for systems with 10G interfaces)
net.ipv4.tcp_mtu_probing=1
Reload sysctl configurations:
sysctl -p
Increase the transmit queue length for 10G NICs:

/sbin/ifconfig <interface where you will receive data> txqueuelen 10000
Set the txqueuelen permanently:

vi /etc/rc.local

Add the following (this is the interface where you will receive data):

/sbin/ifconfig <interface where you will receive data> txqueuelen 10000
Syslog-NG Configurations
High EPS Environment
## For improving performance with lots of connections:
## max_connections = active_connections
## log_iw_size = number of active_connections * EPS
## log_fetch_limit = 10000
## flush_lines = 10000
## log_fifo_size = log_iw_size * 10
##
## For improving performance with a few connections but a high amount of traffic:
## log_iw_size = number of active_connections * 100,000,
##   or number of active_connections * EPS, whichever is greater
## log_fetch_limit = number of active_connections * 100,000,
##   or number of active_connections * EPS, whichever is greater
## log_fifo_size = log_iw_size * 10
## flush_lines = 10,000 or greater

options {
    ## Specifies how many lines are flushed to a destination at a time.
    ## The syslog-ng OSE application waits for this number of lines to
    ## accumulate and sends them off in a single batch. Increasing this number
    ## increases throughput as more messages are sent in a single batch, but
    ## also increases message latency.
    flush_lines (10000);

    ## Enable syslog-ng OSE to run in multithreaded mode and use multiple CPUs
    threaded(yes);

    ## The time to wait in seconds before an idle destination file is closed
    time-reap(3);

    ## The time to wait in seconds before a dead connection is reestablished
    time-reopen(2);

    ## MARK messages are generated when there is no message traffic, to inform
    ## the receiver that the connection is still alive. The destination driver
    ## drops all MARK messages. If an explicit mark-mode() is not given to the
    ## drivers where none is the default value, then none will be used.
    mark-mode(none);

    ## STATS are log messages sent by syslog-ng, containing statistics about
    ## dropped log messages. Set to 0 to disable the STATS messages.
    stats-freq(0);

    ## If a client sends the log message directly to the syslog-ng server, the
    ## chain-hostnames() option is enabled on the server, and the client sends
    ## a hostname in the message that is different from its DNS hostname (as
    ## resolved from DNS by the syslog-ng server), then the server can append
    ## the resolved hostname to the hostname in the message (separated with a
    ## / character) when the message is written to the destination.
    chain_hostnames (off);

    ## Use DNS for name resolution
    use_dns (no);
    dns_cache(no);
    use_fqdn (no);

    ## Create directories and set permissions for files getting generated.
    ## Set the permission for directories where the file will be read from.
    create_dirs (yes);
    keep_hostname (yes);
    log_msg_size(10000);
    dir_owner(securonix);
    dir_group(securonix);
    owner(securonix);
    group(securonix);
    dir_perm(0775);
    perm(0775);

    ## From TCP and unix-stream sources, syslog-ng reads a maximum of
    ## log-fetch-limit() messages from every connection of the source
    log-fetch-limit(1000);
};

## Sizing and setting buffer sizes: the number of connections to the source is
## set using the max-connections() parameter.
source s_network { network(transport("tcp") ip(0.0.0.0) port(10517) max-connections(10000) keep-alive(yes) so_rcvbuf(161920000) log-iw-size(161920000)); };

## Every destination has an output buffer (log-fifo-size()).
destination d_file { file("/opt/Ingester/import/in/windows/windows_$R_YEAR$R_MONTH$R_DAY$R_HOUR$R_MIN.log" log-fifo-size(1619200000000)); };

## Flow-control uses a control window to determine if there is free space in
## the output buffer for new messages. Every source has its own control
## window; the log-iw-size() parameter sets the size of the control window.
## Add in filters as per requirements.
log {
    source(s_network);
    destination(d_file);
    ## If the output buffer becomes full and flow-control is not used,
    ## messages may be lost. Comment out if causing impact on the source
    ## systems' buffer.
    flags(flow-control);
};
Low EPS Environment
options {
    flush_lines (1000);
    threaded(yes);
    time-reap(3);
    time-reopen(2);
    mark-mode(none);
    stats-freq(0);
    chain_hostnames (off);
    use_dns (no);
    dns_cache(no);
    use_fqdn (no);
    create_dirs (yes);
    keep_hostname (yes);
    log_msg_size(10000);
    dir_owner(securonix);
    dir_group(securonix);
    owner(securonix);
    group(securonix);
    dir_perm(0775);
    perm(0775);
    log-fetch-limit(1000);
};

source s_network { network(transport("tcp") ip(0.0.0.0) port(10517) max-connections(1000) keep-alive(yes)); };

## Every destination has an output buffer (log-fifo-size()).
destination d_file { file("/opt/Ingester/import/in/windows/windows_$R_YEAR$R_MONTH$R_DAY$R_HOUR$R_MIN.log"); };

## Flow-control uses a control window to determine if there is free space in
## the output buffer for new messages. Every source has its own control
## window; the log-iw-size() parameter sets the size of the control window.
## Add in filters as per requirements.
log {
    source(s_network);
    destination(d_file);
    ## If the output buffer becomes full and flow-control is not used,
    ## messages may be lost. Comment out if causing impact on the source
    ## systems' buffer.
    flags(flow-control);
};
Performance Scenarios

For improving performance with a lot of connections:
max_connections = active_connections
log_iw_size = number of active_connections * EPS
log_fetch_limit = 10000
flush_lines = 10000
log_fifo_size = log_iw_size * 10
For improving performance with few connections, but a high amount of traffic:
log_iw_size = number of active connections * 100,000, or number of active connections * EPS, whichever is greater
log_fetch_limit = number of active connections * 100,000, or number of active connections * EPS, whichever is greater
log_fifo_size = log_iw_size * 10
flush_lines = 10,000 or greater
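The sizing rules above can be wrapped in a small calculator. A sketch for the few-connections, high-traffic case; the input numbers are examples only:

```shell
#!/bin/sh
# Compute syslog-ng buffer sizes for few connections with high traffic.
ACTIVE_CONNECTIONS=50
EPS=20000

a=$((ACTIVE_CONNECTIONS * 100000))   # connections * 100,000
b=$((ACTIVE_CONNECTIONS * EPS))      # connections * EPS
# Take whichever is greater, per the rule above.
if [ "$b" -gt "$a" ]; then log_iw_size=$b; else log_iw_size=$a; fi
log_fifo_size=$((log_iw_size * 10))

echo "log_iw_size   = $log_iw_size"
echo "log_fifo_size = $log_fifo_size"
```

With 50 connections at 20,000 EPS the connections * 100,000 term dominates, giving log_iw_size = 5,000,000 and log_fifo_size = 50,000,000.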
Best Practices
Data Collection

The fastest way the syslog-ng application can receive log messages from the network is using plain TCP transport with the network() source driver. By default, syslog-ng runs in multithreaded mode to scale to multiple CPUs or cores for increased performance.
A TCP-based network source will scale based on the number of active connections.
This means that if there are 10 incoming TCP connections all coming to the samenetwork source, then that source can use 10 threads, one thread for each connection.
A higher stats_level decreases performance. For example, stats_level(2) costs about 10% in performance.
Data Processing and Filtering

Message processors (such as filters, rewrite rules, and parsers) are executed sequentially by the reader thread.
Simple filtering (for example, filtering on facility or tag) has no impact on performance at all. However, regular expressions, even simple ones, significantly decrease the message-processing rate, by about 40-45%.
Use the simplest filters possible when filtering incoming messages. If a message can be filtered with several types of filters, check the measured data: when a message is filtered with a regexp, syslog-ng performance can drop to 55-60% of the original level, whereas tag or facility filters cause no decrease in performance.
When using multiple filters one after the other, or connecting filters with the logical AND/OR operators, the order of filters has a significant impact on performance. Place the filters that are most likely to match the incoming log messages at the top of the configuration file.
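As an illustration of this ordering, the sketch below applies a cheap facility filter before a more expensive message match. The filter names (f_auth, f_pattern) and the pattern are hypothetical examples, not part of the SNYPR configuration:

```
filter f_auth    { facility(auth); };              ## cheap: evaluated first
filter f_pattern { message("Failed password"); };  ## more expensive match

log {
    source(s_network);
    filter(f_auth);      ## most messages are rejected here, cheaply
    filter(f_pattern);   ## the costly match only runs on the remainder
    destination(d_file);
};
```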
Data Connections
If there are several thousand active connections simultaneously, it is advised to place relay syslog-ng instances on another computer in front of the syslog-ng server. Switching between active connections is time-consuming, while the amount of incoming messages per connection is usually not significant. Relays solve this problem because they consolidate the logs: syslog-ng can easily handle a large volume of log messages sent over a few connections.
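A relay tier can be sketched as a syslog-ng instance that accepts the many client connections and forwards them to the central server over a small number of connections. The address 10.0.0.10 and port 514 below are placeholders:

```
## On the relay: fan in thousands of client connections,
## forward to the central syslog-ng server over one connection.
source s_clients { network(transport("tcp") port(514)); };
destination d_central { network("10.0.0.10" transport("tcp") port(514)); };
log { source(s_clients); destination(d_central); };
```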
For the system to support a large number of TCP connections, irrespective of the EPS, a dedicated 10G NIC is preferred. NIC bonding can be carried out if the VM cannot provide a dedicated 10G NIC.
Debugging
Obtain a pcap file at the relevant interface and port to identify the source of the problem:
tcpdump -i eth0 -nnAs0 tcp dst port 517 -w /Securonix/tcpdump-06052018-183200.pcap
Common Errors
Rx Drops
Rx drops indicate a network issue. This could mean a faulty network, a faulty cable, or a bad interface.
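Rx drops can be confirmed from the interface statistics. The snippet below parses a captured sample of `ip -s link` output so the extraction step is reproducible; the interface counters shown are illustrative. On a live node, replace the embedded sample with the real command output.

```shell
# Sketch: extract the 'dropped' counter from `ip -s link show eth0` output.
# The sample text is embedded for illustration; on a live node use:
#   stats=$(ip -s link show eth0 | grep -A1 'RX:')
stats='    RX:  bytes packets errors dropped  missed   mcast
    123456789  987654      0      42       0       0'

# The counters appear on the line after the RX: header; 'dropped' is field 4.
rx_dropped=$(printf '%s\n' "$stats" | awk 'NR==2 {print $4}')
echo "rx_dropped=$rx_dropped"
```

A steadily increasing counter across repeated samples, rather than a single nonzero reading, is the signal worth investigating.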
Interface Not Sending ACK
This scenario implies that there is contention on the NIC and the existing NIC is unable to handle the load.
Files Not Getting Created
This scenario implies either a configuration error in syslog-ng or that the file-handle limit for the user creating the files has been reached.
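The open-file limit can be checked with ulimit, run as the user that owns the syslog-ng process. A low limit (for example, 1024) is easily exhausted when many destination files and network connections are open at once:

```shell
# Sketch: show the per-process open-file limit for the current shell.
# Run as the user that owns the syslog-ng process.
nofile=$(ulimit -n)
echo "open-file limit: $nofile"
```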
References
Reason to turn off tcp_sack: http://rtodto.net/effect-of-tcp-sack-on-throughput/
Reason to turn off tcp_timestamps: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/tuning_guide/reduce_tcp_performance_spikes
Result of modifying Latency Sensitivity: https://www.brianjgraf.com/2016/10/12/enabling-latency-sensitivity-option-on-vms-should-i-do-it/
Results of tuning syslog-ng: https://syslog-ng.com/documents/html/syslog-ng-pe-6.0-guides/en/syslog-ng-pe-v6.0-performance-whitepaper/pdf/syslog-ng-pe-v6.0-performance-whitepaper.pdf
syslog-ng flow-control guidelines: https://syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/configuring-flow-control.html