abhinav nekkanti, sourav pal & tameem anwar splunk · abhinav nekkanti - sr. software engineer,...

Copyright © 2016 Splunk Inc.

Abhinav Nekkanti, Sourav Pal & Tameem Anwar, Splunk

Harnessing Performance and Scalability with Parallelization

Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Agenda

• Under-utilized hardware?
• Multiple ingestion pipelines
• Parallelizing search
• Performance
• Best practices

About Us

Abhinav Nekkanti - Sr. Software Engineer, Splunk
– Ingestion Pipeline

Sourav Pal - Principal Engineer, Splunk
– Search Parallelization

Tameem Anwar - Software Engineer, Splunk
– Performance

3-Tier Architecture

[Figure: forwarders send raw data to indexers; search heads send searches to indexers and receive search results.]

Insight into the Indexer

[Figure: traditional indexer hosts. The splunkd server daemon receives raw data and writes buckets (B) to disk; Splunk search processes (SP) read the buckets and return search results.]

Splunkd Server Daemon / Pipeline Set

[Figure: ingestion pipeline set. Labels in the diagram:]
• Input pipelines: TCP/UDP pipeline, Tailing, FIFO pipeline, FSChange, Exec pipeline
• Parsing Queue, Parsing Pipeline: utf8, header, linebreaker
• Agg Queue, Merging Pipeline: aggregator
• Typing Queue, Typing Pipeline: regex replacement, annotator
• Index Queue, Index Pipeline: tcp out, syslog out, indexer

Indexer Core Utilization Rule of Thumb

Process                  Cores (approx.)
Splunkd Server Daemon    4 to 6 cores
Splunk Search Process    1 core / search process

Example core utilization of an indexer host:
– 4 to 6 cores for the Splunkd server daemon
– 10 x 1 cores for Splunk search processes
– Total cores used: 14 to 16 cores
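The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch, not a sizing tool: the function name and its parameters are hypothetical, and the only inputs taken from the slide are the 4 to 6 daemon cores and the 1 core per search process.

```python
def estimate_indexer_cores(concurrent_searches, daemon_cores=(4, 6), cores_per_search=1):
    """Return a (low, high) estimate of cores in use on one indexer host.

    daemon_cores and cores_per_search default to the rule-of-thumb values
    from the slide; everything else here is illustrative.
    """
    low, high = daemon_cores
    search_cores = concurrent_searches * cores_per_search
    return low + search_cores, high + search_cores


# Ten concurrent search processes, as in the slide's example.
print(estimate_indexer_cores(10))
```

On a host with many more cores than this estimate, the remainder is the headroom that the rest of the talk tries to put to use.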

Under-utilized Indexer

[Figure: indexer host with Splunk search processes (SP) reading buckets (B) from disk, and unutilized CPU, memory, network, and disk resources. Chart: core utilization %, axis from 0 to 3200.]

Performance Enhancements

Multiple Pipeline Sets
– Parallel ingestion pipeline sets
– Improves resource utilization of the host machine

Search Improvements
– Faster batch searches using parallel search pipelines
– Scheduler improvements
– Faster summary building

Multiple Ingestion Pipeline Sets

Splunkd with Multiple Ingestion Pipeline Sets

[Figure: indexer with 3 pipeline sets. Raw data enters splunkd, and each pipeline set writes its own buckets (B) to disk.]

Configuring Multiple Ingestion Pipeline Sets

$SPLUNK_HOME/etc/system/local/server.conf

[general]
parallelIngestionPipelines = 2
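To check what a given server.conf fragment would request, the stanza can be parsed with Python's standard configparser. This is a small sketch under assumptions: the helper name is hypothetical, the fallback of 1 assumes a single pipeline set when the setting is absent, and real Splunk configuration layering (default, local, app directories) is not modelled.

```python
import configparser

SERVER_CONF = """
[general]
parallelIngestionPipelines = 2
"""


def ingestion_pipeline_count(conf_text, default=1):
    """Read parallelIngestionPipelines from a server.conf-style fragment."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the setting name case-sensitive
    parser.read_string(conf_text)
    return parser.getint("general", "parallelIngestionPipelines", fallback=default)


print(ingestion_pipeline_count(SERVER_CONF))
```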

Multiple Ingestion Pipeline Sets – Details

• Each pipeline set has its own set of queues, pipelines, and processors
– Exceptions are input pipelines, which are usually singletons
• No state is shared across pipeline sets
• Data from a unique source is handled by only one pipeline set at a time
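The "one source, one pipeline set" property can be illustrated with a toy dispatcher. The slides do not say how Splunk picks a pipeline set for a source, so the stable hash used below is purely an assumption for illustration, and every name in the sketch is hypothetical.

```python
import zlib


def pipeline_set_for(source, num_pipeline_sets):
    """Map a source name to one pipeline set index (illustrative hashing only)."""
    return zlib.crc32(source.encode("utf-8")) % num_pipeline_sets


def dispatch(events, num_pipeline_sets):
    """Place (source, event) pairs on per-pipeline-set queues with no shared state."""
    queues = [[] for _ in range(num_pipeline_sets)]
    for source, event in events:
        queues[pipeline_set_for(source, num_pipeline_sets)].append((source, event))
    return queues


events = [("/var/log/a.log", "e1"), ("/var/log/b.log", "e2"), ("/var/log/a.log", "e3")]
for index, queue in enumerate(dispatch(events, 3)):
    print(index, queue)
```

Because all events of one source land on the same queue, their relative order is preserved without any coordination between pipeline sets.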

Multiple Ingestion Pipeline Sets over Network

[Figure: a forwarder (splunkd) with 3 pipeline sets reads file and script inputs and sends data over TCP to an indexer with 3 pipeline sets, which writes buckets (B) to disk.]

Multiple Ingestion Pipeline Sets – Monitor Input

• Each pipeline set has its own set of TailReader, BatchReader, and ArchiveProcessor
• Enables parallel reading of files and archives on forwarders
• Each file/archive is assigned to one pipeline set

Multiple Ingestion Pipeline Sets - Forwarding

Forwarder:
– One tcp output processor per pipeline set
– Multiple tcp connections from the forwarder to different indexers at the same time
– Load balancing rules applied to each pipeline set independently

Indexer:
– Every incoming tcp forwarder connection is bound to one pipeline set on the indexer

Multiple Ingestion Pipeline Sets - Indexing

• Every pipeline set independently writes new data to indexes
• Data is written in parallel to better utilize resources
• Buckets produced by different pipeline sets could have overlapping time ranges

Search: Parallelization Efforts
Performance Improvements

Search Parallelization: Performance Improvement

Splunk searches are faster.

• Parallelizing the search pipeline
• Improving the search scheduler
• Summary building is parallelized and faster

Search Pipeline

• Cursored Search: time-ordered data retrieval.
– Reading order: … B6 B5 B4 B3 B2 B1
– Iterates over time, hence needs to read buckets based on the time ordering.
• Batch Search: bucket-ordered data retrieval.
– Reading order, option 1: … B3 B5 B1 B2 B1 B6
– Reading order, option 2: … B6 B5 B4 B3 B2 B1
– Reading order, option 3: … B6 B5 B4 B7 B4 B9
– Iterates over buckets; time ordering is not needed.

[Figure: search pipeline at the peer. Target search bucket IDs (B1 to B9) are read from the indexer's disk by search processors, followed by search post-processing, then serialize and transmit.]

Facilitates parallel processing of buckets independently across multiple pipelines.

Batch Search: Pipeline Parallelization

[Figure: target search buckets on the indexer's disk are divided among Search Pipelines 1 to 4, each with its own search processors and threads (T = thread). Results pass through search post-processing, the aggregator and serializer, and transmit (I/O).]

Batch Search: Pipeline Parallelization

• Under-utilized indexers provide us an opportunity to execute multiple search pipelines.
• Batch search's time-unordered data access mode is ideal for multiple search pipelines.
• No state is shared, i.e. no dependency exists across search pipelines.
• Peer/indexer-side optimizations.
• Takeaway:
– Under-utilized indexers are candidates for search pipeline parallelization.
– Do NOT enable if indexers are loaded.
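The idea that batch search can hand buckets to independent pipelines in any order can be sketched with a thread pool. This is a conceptual illustration only: the bucket contents, function names, and the use of Python threads are assumptions, and the slides do not show Splunk's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical buckets: bucket id -> raw events.
BUCKETS = {
    "B1": ["error disk full", "info ok"],
    "B2": ["info ok"],
    "B3": ["error timeout", "error disk full"],
    "B4": ["warn slow"],
}


def search_bucket(events, term):
    """Search one bucket; needs no state from any other bucket."""
    return [event for event in events if term in event]


def batch_search(buckets, term, max_pipelines=2):
    """Search buckets in whatever order workers finish, then aggregate."""
    matches = []
    with ThreadPoolExecutor(max_workers=max_pipelines) as pool:
        futures = [pool.submit(search_bucket, events, term) for events in buckets.values()]
        for future in as_completed(futures):  # completion order, not time order
            matches.extend(future.result())
    return matches


print(sorted(batch_search(BUCKETS, "error", max_pipelines=4)))
```

A cursored search could not be written this way, because it would have to consume buckets strictly in time order.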

Configuring the Batch Search in Parallel Mode

• How to enable?

$SPLUNK_HOME/etc/system/local/limits.conf

[search]
batch_search_max_pipeline = 2

• What to expect?
– Search performance, in terms of retrieving search results, improves.
– Increase in the number of threads.

Search Scheduler Improvements

Scheduler improvements in Splunk Enterprise:
– Priority scoring
– Schedule windows

Performance improvements over previous schedulers:
– Lower lag
– Fewer skipped searches

Search Scheduler Improvements: Priority Score

Problem: Simple single-term priority scoring could result in saved search lag, skipping, and starvation (under CPU constraint).

score(j) = next_runtime(j)
         + average_runtime(j) × priority_runtime_factor
         - skipped_count(j) × period(j) × priority_skipped_factor
         + schedule_window_adjustment(j)

Solution: Better multi-term priority scoring mitigates problems and improves performance by 25%.
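The scoring formula can be written out as a small function to see how the terms interact. This is a sketch of the arithmetic only: the field names mirror the formula, but the units, the factor values used in the example, and the reading that a lower score runs sooner are all assumptions not stated on the slide.

```python
from dataclasses import dataclass


@dataclass
class SavedSearch:
    next_runtime: float       # e.g. epoch seconds of the next scheduled run
    average_runtime: float    # seconds
    skipped_count: int
    period: float             # seconds between scheduled runs
    schedule_window_adjustment: float = 0.0


def priority_score(job, priority_runtime_factor, priority_skipped_factor):
    """Multi-term score, term by term as in the formula on the slide."""
    return (
        job.next_runtime
        + job.average_runtime * priority_runtime_factor
        - job.skipped_count * job.period * priority_skipped_factor
        + job.schedule_window_adjustment
    )


# Illustrative factor values only (assumed, not taken from the slide).
job = SavedSearch(next_runtime=1000, average_runtime=30, skipped_count=2, period=60)
print(priority_score(job, priority_runtime_factor=10, priority_skipped_factor=1))
```

Each skipped run subtracts a full period from the score, so a search that keeps getting skipped moves steadily relative to its peers, which is how the multi-term score can counter starvation.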

Search Scheduler Improvements

Problem: The scheduler cannot distinguish searches that (A) really should run at a specific time (just like cron) from those that (B) don't have to. This can cause lag or skipping.

Solution: Give a schedule window to searches that don't have to run at specific times.

Example: For a given search, it's OK if it starts running sometime between midnight and 6am, but you don't really care when specifically.

• A search with a window helps other searches.
• Search windows should not be used for searches that run every minute.
• Search windows must be less than a search's period.
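The two rules for schedule windows can be checked mechanically before a window is assigned. The helper below is a hypothetical sketch that expresses both period and window in minutes; it is not a Splunk API.

```python
def validate_schedule_window(period_minutes, window_minutes):
    """Return a list of rule violations; an empty list means the window is acceptable."""
    problems = []
    if period_minutes <= 1 and window_minutes > 0:
        problems.append("windows should not be used for searches that run every minute")
    if window_minutes >= period_minutes:
        problems.append("window must be less than the search's period")
    return problems


# A daily search that may start any time within six hours.
print(validate_schedule_window(period_minutes=24 * 60, window_minutes=6 * 60))
# An hourly search with a one-hour window breaks the second rule.
print(validate_schedule_window(period_minutes=60, window_minutes=60))
```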

Configuring Search Scheduler

$SPLUNK_HOME/etc/system/local/limits.conf

[scheduler]
max_searches_perc = 50

# Allow value to be 75 anytime on weekends.
max_searches_perc.1 = 75
max_searches_perc.1.when = * * * * 0,6

# Allow value to be 90 between midnight and 5am.
max_searches_perc.2 = 90
max_searches_perc.2.when = * 0-5 * * *
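To see which percentage such a configuration would yield at a given moment, the cron-style conditions can be evaluated by hand. The sketch below makes several assumptions: it supports only `*`, comma lists, and ranges; it treats all five fields as AND conditions; and when several rules match it lets the last-listed (highest-numbered) rule win, which is a guess rather than documented behaviour.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one cron field that uses only '*', 'a,b' lists, or 'a-b' ranges."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    pairs = [(minute, when.minute), (hour, when.hour), (day, when.day),
             (month, when.month), (weekday, cron_weekday)]
    return all(field_matches(field, value) for field, value in pairs)


def effective_max_searches_perc(base, rules, when):
    """rules: list of (value, when_expr) in numbered order; last match wins (assumed)."""
    result = base
    for value, expr in rules:
        if cron_matches(expr, when):
            result = value
    return result


RULES = [(75, "* * * * 0,6"), (90, "* 0-5 * * *")]
print(effective_max_searches_perc(50, RULES, datetime(2016, 9, 24, 12, 0)))  # a Saturday noon
```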

Search: Parallel Summarization

• Building summary data sequentially for data models and saved reports is slow.
• The summary building process has been parallelized.

Summary Building Parallelization

[Figure: sequential summary building, where the scheduler runs one auto-summary search every N minutes, compared with parallelized summary building, where the scheduler runs several auto-summary searches.]

Configuring Summary Building for Parallelization

$SPLUNK_HOME/etc/system/local/savedsearches.conf

[default]
auto_summarize.max_concurrent = 1

$SPLUNK_HOME/etc/system/local/datamodels.conf

[default]
acceleration.max_concurrent = 2

Performance

Performance Tests

• System Info
o 2x12 Xeon 2.30 GHz
o 24 cores (48 w/ HT)
o 64 GB RAM
o 8 x 300 GB 15k RPM disks in RAID-0
o 1 Gb Ethernet NIC
o CentOS 7.6
• No other load on the box

Indexing

• Index a 100 GB generic syslog dataset. No search loads.
• Average indexing throughput with 1 pipeline: 41.40 MB/s
• Average indexing throughput with 2 pipelines: 78.80 MB/s
• 90% increase in average indexing throughput
• On average, Splunk utilized 2x CPU cores, 1.3x memory, and 2x disk IOPS

Pipelines    Time taken (minutes)
1            40.25
2            21.16
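The reported throughput figures can be cross-checked against the elapsed times. This is a worked calculation under two assumptions: that 1 GB is counted as 1000 MB, and that the times are decimal minutes. Under those assumptions the results land within rounding of the 41.40 MB/s and 78.80 MB/s on the slide.

```python
def throughput_mb_per_s(dataset_gb, minutes):
    """Average throughput, assuming 1 GB = 1000 MB and decimal minutes."""
    return dataset_gb * 1000 / (minutes * 60)


def percent_increase(before, after):
    return (after / before - 1) * 100


one_pipeline = throughput_mb_per_s(100, 40.25)
two_pipelines = throughput_mb_per_s(100, 21.16)
print(round(one_pipeline, 2), round(two_pipelines, 2))
print(round(percent_increase(41.40, 78.80)))  # using the slide's reported figures
```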

Forwarding

• UF sending a 100 GB syslog dataset (1k files)
• 70% increase in average throughput
• On average, Splunk utilized 2x the resources

Pipelines    Average throughput
1            33.6 MB/s
2            57.1 MB/s

Splunk without Parallelization

[Figure: Machine 1 runs 4 forwarders reading data sources, Machine 2 runs the indexer, Machine 3 runs the search head.]

Splunk with Parallelization

[Figure: Machine 1 runs a single forwarder with 4 ingestion pipeline sets reading data sources, Machine 2 runs the indexer with 4 ingestion pipeline sets and 4 search pipeline sets, Machine 3 runs the search head.]

Burst in Indexing Load + Searches

• Data forwarded @ 10 MB/s + monitor a 100 GB dataset
• Number of concurrent searches: 4
• Splunk without parallelization: average indexing throughput 39.12 MB/s
• Splunk with parallelization: average indexing throughput 94.7 MB/s
• 142% increase in average indexing throughput

Ingestion pipelines    Time (mins)
1                      53
4                      22.5

Batch Mode Sparse Search

• Sparse search: characterized predominantly by returning some events per bucket
• 1 search pipeline vs. 4 search pipelines
• Search is 2.4x faster with search parallelization

Search pipelines    Time (seconds)
1                   9.51
4                   3.90

Batch Mode Dense Search

• Dense search: characterized predominantly by returning many events per bucket
• 1 search pipeline vs. 4 search pipelines
• Search is 3.4x faster with search parallelization

Search pipelines    Time (minutes)
1                   15.5
4                   4.57
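The speedup figures for the sparse and dense searches follow directly from the timings, and dividing by the pipeline count gives a rough sense of how much of the ideal 4x was achieved. The efficiency measure is an added illustration and does not appear in the slides; the function names are hypothetical.

```python
def speedup(baseline_time, parallel_time):
    return baseline_time / parallel_time


def parallel_efficiency(baseline_time, parallel_time, pipelines):
    """Fraction of the ideal linear speedup that was achieved."""
    return speedup(baseline_time, parallel_time) / pipelines


# Sparse search: 9.51 s with 1 pipeline, 3.90 s with 4.
print(round(speedup(9.51, 3.90), 1), round(parallel_efficiency(9.51, 3.90, 4), 2))
# Dense search: 15.5 min with 1 pipeline, 4.57 min with 4.
print(round(speedup(15.5, 4.57), 1), round(parallel_efficiency(15.5, 4.57, 4), 2))
```

The dense search comes closer to linear scaling, which fits the intuition that more per-bucket work leaves more to parallelize.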

Scheduled Searches Setup

• 10 searches are scheduled to run every minute
• 5 longer-running searches (~40s)
• 5 shorter-running searches (~15s)
• Test configured to run only 3 scheduled searches concurrently

Scheduled Searches

• Skipped vs. successful searches over a 30-minute window
• 30% increase in successful searches
• This optimization does not use additional system resources

Version    Searches completed
6.2        191
6.5        248

CPU Utilization

• Burst in indexing load + searches
• CPU utilized by splunkd & search process

Ingestion pipelines    Search pipelines    CPU utilized
1                      1                   990%
4                      4                   2437%

Memory Utilization

• Burst in indexing load + searches
• Resident memory utilized by splunkd & search process

Ingestion pipelines    Search pipelines    Memory utilized
1                      1                   3.32 GB
4                      4                   4.59 GB

Disk I/O

• Burst in indexing load + searches
• Average read and write operations per second

Ingestion pipelines    Search pipelines    Average disk IOPS
1                      1                   202
4                      4                   579

Final Thoughts

• What is my current workload?
o Data volume – daily and peak
o Search volume – concurrent and total
o System resource usage
• How do I approach these features?
o System significantly under-utilized?
o Search pipelines
  – Lot of batch mode searches?
o Parallel ingestion pipelines
  – Handling bursts in data? Peaks in data?
  – Reading large number of files in parallel?
• Don't forget about horizontal scaling

THANK YOU
