Harnessing Performance and Scalability with Parallelization

Abhinav Nekkanti, Sourav Pal & Tameem Anwar, Splunk

Copyright © 2016 Splunk Inc.


Post on 31-May-2020


Page 1: Abhinav Nekkanti, Sourav Pal & Tameem Anwar Splunk · Abhinav Nekkanti - Sr. Software Engineer, Splunk – Ingestion Pipeline Sourav Pal - Principal Engineer, Splunk – Search Parallelization


Page 2: Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Page 3: Agenda

• Under-utilized hardware?
• Multiple ingestion pipelines
• Parallelizing search
• Performance
• Best practices

Page 4: About Us

Abhinav Nekkanti - Sr. Software Engineer, Splunk – Ingestion Pipeline

Sourav Pal - Principal Engineer, Splunk – Search Parallelization

Tameem Anwar - Software Engineer, Splunk – Performance

Page 5: 3 Tier Architecture

[Diagram: forwarders send raw data to indexers; search heads send searches to the indexers and receive search results.]

Page 6: Insight into the Indexer

[Diagram: a traditional indexer host. Raw data enters the Splunkd server daemon, which writes buckets (B) to disk. Multiple Splunk search processes (SP) read the buckets and return search results.]

Page 7: Splunkd Server Daemon / Pipeline Set

[Diagram: an ingestion pipeline set.]

• Input pipelines: TCP/UDP pipeline, Tailing, FIFO pipeline, FSChange, Exec pipeline
• Parsing Queue → Parsing Pipeline: utf8, header, linebreaker
• Agg Queue → Merging Pipeline: aggregator
• Typing Queue → Typing Pipeline: regex replacement, annotator
• Index Queue → Index Pipeline: tcp out, syslog out, indexer

Page 8: Indexer Core Utilization

Rule of thumb:

Process                   Cores (approx.)
Splunkd Server Daemon     4 to 6 cores
Splunk Search Process     1 core / search process

Example core utilization of an indexer host:
– 4 to 6 cores for the Splunkd server daemon
– 10 × 1 cores for Splunk search processes
– Total cores used: 14 to 16 cores
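As a rough illustration of the rule of thumb above, the arithmetic can be sketched in Python. The function name and its parameters are made up for this example; only the 4 to 6 cores for splunkd and 1 core per search process come from the slide.

```python
def estimate_indexer_cores(concurrent_searches, splunkd_cores=(4, 6), cores_per_search=1):
    """Return (low, high) estimated cores in use on one indexer host.

    splunkd_cores and cores_per_search default to the slide's rule of thumb;
    they are approximations, not measured values.
    """
    low, high = splunkd_cores
    search_cores = concurrent_searches * cores_per_search
    return low + search_cores, high + search_cores


# Slide example: 10 concurrent search processes.
low, high = estimate_indexer_cores(concurrent_searches=10)
print(f"Estimated cores in use: {low} to {high}")  # Estimated cores in use: 14 to 16
```

On a host with more cores than this estimate, the difference is the headroom that the parallelization features in this talk are meant to use.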

Page 9: Under-utilized Indexer

[Diagram: an indexer host with the Splunkd server daemon, Splunk search processes (SP), and buckets (B) on disk, next to unutilized CPU, memory, network, and disk resources. Chart: core utilization (%), axis from 0 to 3200.]

Page 10: Performance Enhancements

Multiple Pipeline Sets
– Parallel ingestion pipeline sets
– Improves resource utilization of the host machine

Search Improvements
– Faster batch searches using parallel search pipelines
– Scheduler improvements
– Faster summary buildup

Page 11: Multiple Ingestion Pipeline Sets

Page 12: Splunkd with Multiple Ingestion Pipeline Sets

[Diagram: an indexer with 3 pipeline sets. Raw data enters the Splunkd server daemon, and each pipeline set writes its own buckets (B) to disk.]

Page 13: Configuring Multiple Ingestion Pipeline Sets

$SPLUNK_HOME/etc/system/local/server.conf

[general]
parallelIngestionPipelines = 2

Page 14: Multiple Ingestion Pipeline Sets – Details

• Each pipeline set has its own set of queues, pipelines, and processors
  – Exceptions are input pipelines, which are usually singletons
• No state is shared across pipeline sets
• Data from a unique source is handled by only one pipeline set at a time

Page 15: Multiple Ingestion Pipeline Sets over Network

[Diagram: a Splunkd forwarder with 3 pipeline sets reads its inputs (File, File, Script) and sends data over TCP to an indexer with 3 pipeline sets, whose Splunkd server daemon writes buckets (B) to disk.]

Page 16: Multiple Ingestion Pipeline Sets – Monitor Input

• Each pipeline set has its own TailReader, BatchReader, and ArchiveProcessor
• Enables parallel reading of files and archives on forwarders
• Each file/archive is assigned to one pipeline set
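One way to picture the "one file, one pipeline set" rule is a deterministic mapping from a file path to a pipeline set index. The slides do not say how Splunk picks the pipeline set, so the hash-based scheme below is purely an assumption for illustration, and `assign_pipeline_set` is a hypothetical helper, not a Splunk API.

```python
import zlib
from collections import defaultdict


def assign_pipeline_set(source_path, pipeline_sets):
    """Map a source to exactly one pipeline set index (hypothetical scheme).

    A stable CRC32 hash is used so the same path always lands on the
    same pipeline set; real Splunk behavior may differ.
    """
    if pipeline_sets < 1:
        raise ValueError("pipeline_sets must be at least 1")
    return zlib.crc32(source_path.encode("utf-8")) % pipeline_sets


# Group a few example files by the pipeline set that would read them.
files = ["/var/log/messages", "/var/log/secure", "/var/log/app/a.log", "/var/log/app/b.log"]
by_set = defaultdict(list)
for path in files:
    by_set[assign_pipeline_set(path, pipeline_sets=3)].append(path)

for index in sorted(by_set):
    print(index, by_set[index])
```

The point of the sketch is only that each file is owned by a single pipeline set, so several files can be read in parallel without sharing state.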

Page 17: Multiple Ingestion Pipeline Sets – Forwarding

Forwarder:
– One tcp output processor per pipeline set
– Multiple tcp connections from the forwarder to different indexers at the same time
– Load balancing rules applied to each pipeline set independently

Indexer:
– Every incoming tcp forwarder connection is bound to one pipeline set on the indexer

Page 18: Multiple Ingestion Pipeline Sets – Indexing

• Every pipeline set will independently write new data to indexes
• Data is written in parallel to better utilize resources
• Buckets produced by different pipeline sets could have overlapping time ranges

Page 19: Search: Parallelization Efforts

Performance Improvements

Page 20: Search Parallelization: Performance Improvement

Splunk searches are faster.

• Parallelizing the search pipeline
• Improving the search scheduler
• Summary building is parallelized and faster

Page 21: Search Pipeline

• Cursored Search: time-ordered data retrieval.
  Reading order: … B6 B5 B4 B3 B2 B1
  Iterates over time, hence needs to read buckets based on the time ordering.
• Batch Search: bucket-ordered data retrieval.
  Reading order, option 1: … B3 B5 B1 B2 B1 B6
  Reading order, option 2: … B6 B5 B4 B3 B2 B1
  Reading order, option 3: … B6 B5 B4 B7 B4 B9
  Iterates over buckets; time ordering is not needed.
  Facilitates parallel processing of buckets independently across multiple pipelines.

[Diagram: search pipeline at the peer. Target search bucket ids (B1 to B9, b11) on the indexer disk feed the search processors, followed by search post-processing, then serialize and transmit.]

Page 22: Batch Search: Pipeline Parallelization

[Diagram: target search buckets (B1 to B9, b11) on the indexer disk are read by four search pipelines (Search Pipeline 1 to 4). Each pipeline runs search processors and search post-processing on its own thread (T = thread); an aggregator and serializer then transmits the results (I/O).]

Page 23: Batch Search: Pipeline Parallelization

• Under-utilized indexers provide an opportunity to execute multiple search pipelines.
• Batch search's time-unordered data access mode is ideal for multiple search pipelines.
• No state is shared, i.e. no dependency exists across search pipelines.
• Peer/indexer-side optimizations.

Takeaway:
– Under-utilized indexers are candidates for search pipeline parallelization.
– Do NOT enable if indexers are loaded.
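The idea that buckets can be searched independently and in any order can be sketched conceptually in Python. This is not Splunk's implementation: `search_bucket` and `batch_search` are hypothetical names, the in-memory "buckets" are toy data, and the `pipelines` argument only loosely plays the role of the number of search pipelines.

```python
from concurrent.futures import ThreadPoolExecutor


def search_bucket(events, term):
    """Scan one bucket on its own; no state is shared with other buckets."""
    return [event for event in events if term in event]


def batch_search(buckets, term, pipelines=2):
    """Search every bucket using up to `pipelines` workers, then aggregate."""
    with ThreadPoolExecutor(max_workers=pipelines) as pool:
        per_bucket_hits = pool.map(
            lambda events: search_bucket(events, term), buckets.values()
        )
        results = []
        for hits in per_bucket_hits:
            results.extend(hits)  # aggregation step before "transmit"
    return results


buckets = {
    "B1": ["error disk full", "info started"],
    "B2": ["info heartbeat"],
    "B3": ["error timeout", "error retry"],
}
hits = batch_search(buckets, "error", pipelines=2)
print(len(hits))  # 3
```

Because no bucket depends on another, adding workers changes only how quickly the per-bucket scans finish, not the set of results. It also adds threads, which is why the slide warns against enabling this on indexers that are already loaded.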

Page 24: Configuring Batch Search in Parallel Mode

• How to enable?

$SPLUNK_HOME/etc/system/local/limits.conf

[search]
batch_search_max_pipeline = 2

• What to expect?
  – Search performance, in terms of retrieving search results, improves.
  – The number of threads increases.

Page 25: Search Scheduler Improvements

Scheduler improvements in Splunk Enterprise:
– Priority scoring
– Schedule windows

Performance improvements over previous schedulers:
– Lower lag
– Fewer skipped searches

Page 26: Search Scheduler Improvements – Priority Score

Problem: Simple single-term priority scoring could result in saved search lag, skipping, and starvation (under CPU constraint).

Solution: Better multi-term priority scoring mitigates these problems and improves performance by 25%.

score(j) = next_runtime(j)
         + average_runtime(j) × priority_runtime_factor
         − skipped_count(j) × period(j) × priority_skipped_factor
         + schedule_window_adjustment(j)
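The multi-term score can be written out as a small Python function to see how the terms interact. The formula follows the slide; the factor values (10 and 1) and the sample inputs are placeholders chosen for this sketch, and the reading that a lower score is scheduled earlier is an assumption inferred from the fact that skipped runs subtract from the score.

```python
def priority_score(
    next_runtime,
    average_runtime,
    skipped_count,
    period,
    schedule_window_adjustment=0.0,
    priority_runtime_factor=10.0,  # placeholder value for illustration
    priority_skipped_factor=1.0,   # placeholder value for illustration
):
    """Multi-term priority score for one saved search, per the slide formula."""
    return (
        next_runtime
        + average_runtime * priority_runtime_factor
        - skipped_count * period * priority_skipped_factor
        + schedule_window_adjustment
    )


# Two searches due at the same time (t = 1000 s), each running every 60 s.
never_skipped = priority_score(next_runtime=1000, average_runtime=40, skipped_count=0, period=60)
skipped_twice = priority_score(next_runtime=1000, average_runtime=40, skipped_count=2, period=60)
print(never_skipped, skipped_twice)  # 1400.0 1280.0
```

With these inputs, the search that has already been skipped twice gets the smaller score, which is how the extra term counteracts starvation.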

Page 27: Search Scheduler Improvements

Problem: The scheduler cannot distinguish searches that (A) really should run at a specific time (just like cron) from those that (B) don't have to. This can cause lag or skipping.

Solution: Give a schedule window to searches that don't have to run at specific times.

Example: For a given search, it's OK if it starts running sometime between midnight and 6am, but you don't really care when specifically.

• A search with a window helps other searches.
• Search windows should not be used for searches that run every minute.
• Search windows must be less than a search's period.
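The two constraints on schedule windows can be expressed as a simple check. This is a hypothetical helper written for illustration (it is not part of Splunk), with both values given in minutes.

```python
def validate_schedule_window(period_minutes, window_minutes):
    """Return a list of problems with a proposed schedule window (empty if fine)."""
    problems = []
    if window_minutes >= period_minutes:
        problems.append("window must be less than the search period")
    if period_minutes <= 1 and window_minutes > 0:
        problems.append("searches that run every minute should not use a window")
    return problems


# Daily search that may start any time between midnight and 6am: acceptable.
print(validate_schedule_window(period_minutes=24 * 60, window_minutes=6 * 60))  # []

# Hourly search with a 60-minute window: rejected, window equals the period.
print(validate_schedule_window(period_minutes=60, window_minutes=60))
```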

Page 28: Configuring Search Scheduler

$SPLUNK_HOME/etc/system/local/limits.conf

[scheduler]
max_searches_perc = 50

# Allow value to be 75 anytime on weekends.
max_searches_perc.1 = 75
max_searches_perc.1.when = * * * * 0,6

# Allow value to be 90 between midnight and 5am.
max_searches_perc.2 = 90
max_searches_perc.2.when = * 0-5 * * *

Page 29: Search: Parallel Summarization

• The sequential nature of building summary data for data models and saved reports is slow.
• The summary building process has been parallelized.

Page 30: Summary Building Parallelization

[Diagram: sequential vs. parallelized summary building. In sequential summary building, the scheduler runs one auto summary search every N minutes; in parallelized summary building, the scheduler runs several auto summary searches.]

Page 31: Configuring Summary Building for Parallelization

$SPLUNK_HOME/etc/system/local/savedsearches.conf

[default]
auto_summarize.max_concurrent = 1

$SPLUNK_HOME/etc/system/local/datamodels.conf

[default]
acceleration.max_concurrent = 2

Page 32: Performance

Page 33: Performance Tests

• System info
  o 2x12 Xeon 2.30 GHz
  o 24 cores (48 w/ HT)
  o 64 GB RAM
  o 8x300 GB 15k RPM disks in RAID-0
  o 1Gb Ethernet NIC
  o CentOS 7.6
• No other load on the box

Page 34: Indexing

• Index a 100 GB generic syslog dataset. No search loads.
• Average indexing throughput – 41.40 MB/s

Pipelines   Time taken (minutes)
1           40.25
2

Page 35: Indexing

• Average indexing throughput – 78.80 MB/s
• 90% increase in average indexing throughput
• On average Splunk utilized 2x CPU cores, 1.3x memory, and 2x disk IOPS

Pipelines   Time taken (minutes)
1           40.25
2           21.16
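The reported figures can be cross-checked with a few lines of Python. The helper names are made up for this sketch, and the conversion 1 GB = 1000 MB is an assumption; it is used here because it reproduces the throughput numbers on the slides closely.

```python
def throughput_mb_per_s(dataset_gb, minutes, mb_per_gb=1000):
    """Average throughput for indexing a dataset of `dataset_gb` in `minutes`."""
    return dataset_gb * mb_per_gb / (minutes * 60)


def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


one_pipeline = throughput_mb_per_s(100, 40.25)
two_pipelines = throughput_mb_per_s(100, 21.16)
gain = percent_increase(one_pipeline, two_pipelines)
print(f"{one_pipeline:.1f} MB/s -> {two_pipelines:.1f} MB/s, +{gain:.0f}%")
# 41.4 MB/s -> 78.8 MB/s, +90%
```

The second pipeline set does not quite double throughput (about 1.9x), which is in line with the slide's note that CPU and disk IOPS usage roughly doubled.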

Page 36: Forwarding

• UF sending a 100 GB syslog dataset (1k files)
• 70% increase in average throughput
• On average Splunk utilized 2x the resources

Pipelines   Average throughput
1           33.6 MB/s
2           57.1 MB/s

Page 37: Splunk without Parallelization

[Diagram: 4 forwarders with their data sources, an indexer, and a search head, spread across Machine 1, Machine 2, and Machine 3.]

Page 38: Splunk with Parallelization

[Diagram: a single forwarder with 4 ingestion pipeline sets and its data sources, an indexer with 4 ingestion pipeline sets and 4 search pipeline sets, and a search head, spread across Machine 1, Machine 2, and Machine 3.]

Page 39: Burst in Indexing Load + Searches

Splunk without Parallelization
• Data forwarded @ 10 MB/s + monitor 100 GB dataset
• Average indexing throughput – 39.12 MB/s
• Number of concurrent searches – 4

Ingestion pipelines   Time (mins)
1                     53
4

Page 40: Burst in Indexing Load + Searches

Splunk with Parallelization
• Data forwarded @ 10 MB/s + monitor 100 GB dataset
• Average indexing throughput – 94.7 MB/s
• 142% increase in average indexing throughput
• Number of concurrent searches – 4

Ingestion pipelines   Time (mins)
1                     53
4                     22.5

Page 41: Batch Mode Sparse Search

• Sparse search – characterized predominantly by returning some events per bucket
• 1 search pipeline vs 4 search pipelines
• Search is 2.4x faster with search parallelization

Search pipelines   Time (seconds)
1                  9.51
4                  3.90

Page 42: Batch Mode Dense Search

• Dense search – characterized predominantly by returning many events per bucket
• 1 search pipeline vs 4 search pipelines
• Search is 3.4x faster with search parallelization

Search pipelines   Time (minutes)
1                  15.5
4                  4.57
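The speedups quoted for sparse and dense searches follow directly from the timing tables. The sketch below recomputes them and adds a parallel-efficiency figure (speedup divided by the number of pipelines); that metric is not on the slides and is included only as one conventional way to compare the two cases.

```python
def speedup(baseline_time, parallel_time):
    """How many times faster the parallel run is than the baseline."""
    return baseline_time / parallel_time


def parallel_efficiency(baseline_time, parallel_time, pipelines):
    """Speedup per pipeline; 1.0 would mean perfect linear scaling."""
    return speedup(baseline_time, parallel_time) / pipelines


# Timings from the slides: 1 search pipeline vs 4 search pipelines.
sparse = speedup(9.51, 3.90)   # seconds
dense = speedup(15.5, 4.57)    # minutes
print(f"sparse: {sparse:.1f}x, efficiency {parallel_efficiency(9.51, 3.90, 4):.0%}")
print(f"dense:  {dense:.1f}x, efficiency {parallel_efficiency(15.5, 4.57, 4):.0%}")
# sparse: 2.4x, efficiency 61%
# dense:  3.4x, efficiency 85%
```

In these two measurements the dense search, which returns many events per bucket, scales closer to linearly across the four pipelines than the sparse search does.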

Page 43: Scheduled Searches Setup

• 10 searches are scheduled to run every minute
• 5 longer running searches (~40s)
• 5 shorter running searches (~15s)
• Test configured to run only 3 scheduled searches concurrently

Page 44: Scheduled Searches

• Skipped vs. successful searches – 30 minute window
• 30% increase in successful searches
• This optimization does not use additional system resources

Version   Searches completed
6.2       191
6.5       248

Page 45: CPU Utilization

• Burst in indexing load + searches
• CPU utilized by splunkd & search process

Ingestion pipelines   Search pipelines   CPU utilized
1                     1                  990%
4                     4                  2437%

Page 46: Memory Utilization

• Burst in indexing load + searches
• Resident memory utilized by splunkd & search process

Ingestion pipelines   Search pipelines   Memory utilized
1                     1                  3.32 GB
4                     4                  4.59 GB

Page 47: Disk I/O

• Burst in indexing load + searches
• Average read and write operations per second

Ingestion pipelines   Search pipelines   Average disk IOPS
1                     1                  202
4                     4                  579

Page 48: Final Thoughts

• What is my current workload?
  o Data volume – daily and peak
  o Search volume – concurrent and total
  o System resource usage
• How do I approach these features?
  o System significantly under-utilized?
  o Search pipelines
    • Lot of batch mode searches?
  o Parallel ingestion pipelines
    • Handling bursts in data? Peaks in data
    • Reading a large number of files in parallel?
• Don't forget about horizontal scaling

Page 49: Thank You