Copyright © 2016 Splunk Inc.
David Veuve, Staff Security Strategist, Splunk
How to Scale: From _raw to tstats (and beyond!)
Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
How to Use This Presentation
This PDF is intended to be a reference guide, to complement the actual presentation. If you've already dabbled in tstats, feel free to read through. If you're new to tstats, I highly recommend watching the video presentation first. I don't do quite a good enough job with this slide version for it to stand alone. Please find the video recording on the .conf website, maybe mid-to-late October (ask your Splunk team for updates if you don't see it by then).
Agenda
1. Intro
2. David's Story
3. Overview of Techniques (SI, RA, AP, tstats)
4. Data Models – What you need to know
5. How to transition from _raw to tstats
6. When Data Model Acceleration doesn't work
7. Real World Examples
8. Advanced Topics
Personal Introduction
• David Veuve – Staff Security Strategist, Security Product Adoption
• SME for Architecture, Security, Analytics
• [email protected]
• Former Splunk Customer (for 3 years, 3.x through 4.3)
• Primary author of Search Activity app
• Former Talks:
– Security Ninjutsu Part Three: .conf 2016 (This year!)
– Security Ninjutsu Part Two: .conf 2015
– Security Ninjutsu Part One: .conf 2014
– Passwords are for Chumps: .conf 2014
Why all this?
• Getting results fast is great, but only half the puzzle.
• If you / your team are writing searches that will run for 100 CPU hours per day, suppose that's 50% of your cluster's time.
• What if we could shrink that to 10 CPU hours? Your cluster just went from 75% utilized to 30% utilized.
• Search acceleration lowers your TCO
• Search acceleration saves you time waiting
• Search acceleration lets you ask all of the questions
Why this talk? Why now?
• tstats isn't that hard, but we don't have very much to help people make the transition.
• Everything that Splunk Inc. does is powered by tstats.
• I've taught a lot of people in smaller groups about Search Acceleration technologies.
• To the masses!
Who are you?
• You are either a *super* hardcore dev, or you're not brand new to Splunk.
• You've played with SPL. You understand how it works.
• You're probably comfortable with stats.
• People probably come ask you for help building queries or solving problems.
What will you get?
• You'll understand how to make queries that wow people.
• You'll cement yourself as *the* office or user-group search ninja.
• You'll happily learn how easy it is.
David's Story
Just a boy, standing in front of a search command, asking it to show the syntax error.
Where I Started
• Customer at an advertising company
• Was a casual user, when I was handed a Business Analytics project
• Going from tens or hundreds of data points to millions
• Built tiered summary indexes
• Auto-switched between high granularity and low based on selected time windows
• Tons of help from Nick Mealy @ Sideview
Then I Took a Break
• I took two years off of Splunk, missing 5.x and the initial 6.0 release.
• Splunk released Report Acceleration
• Splunk released Data Model Acceleration
I Came to Splunk
• I rebuilt my dashboard. From Splunk 4 to Splunk 6, load time went from 1.5 min to 27 seconds.
• I used Report Acceleration – load time went down to 6 seconds
• But then I had a bunch of different searches running…
I Helped a Finance Company
• They wanted multiple dashboards, drilldown, searches, on 18 key fields in 2000-line XML documents.
• Built an accelerated data model with 18 calculated spath fields
• Used the pivot interface to build dashboards
• 30 day unaccelerated load time would have been 2 days, if I could wait
• 30 day accelerated load time was 15 seconds
I Helped a Health Care Company
• They wanted distinct count of dest_ip per src_ip per day, averaged and stdev'd.
• Running over raw wasn't even considered.
• Depending on the analysis, we can search and process over 1 billion results/minute.
Techniques
It's all about the technique…
Summary Indexing
• Take the search you're running right now, and store the results in a new index. No license required.
• How:
– Just add | collect in your search, specifying destination index (maybe "summary")
– Probably don't want to use sistats, sitop, si-anything. They're not really valuable.
– http://www.davidveuve.com/tech/how-i-do-summary-indexing-in-splunk/
• Examples:
– Store # of logins, # of distinct hosts, # of … per user/device/etc.
– Email logs are horrible and slow to process – store the output
– ITSI Metric searches
Summary Indexing (2)
• Why: You're not accelerating raw events, you're accelerating the result of a search. We can't accelerate a search-based data model. So: summary indexing
• Why not?
– No multiple levels of time granularity
– Manual coordination of summary indexing
– Missed searches
Report Acceleration
• Takes a single saved search, with stats/timechart/top/chart, and pre-computes the aggregates at multiple time buckets (per 10m, per hour, per day, etc., based on your acceleration range).
• Automatically switches between acceleration and raw data access when needed.
• You cannot query the data in ways that you didn't plan for originally
Report Acceleration (2)
• How:
– Go into the saved search configuration and check the Accelerate box
– Decide over what time range you'd like to accelerate
– Keep in mind that longer time ranges => less granularity (so if you choose 1 year, you'll lose 10 min or 1 hr buckets)
• Example
– My exec dashboard needs to load, like, immediately.
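If you prefer configuration files to the UI, the Accelerate checkbox corresponds roughly to settings like these in savedsearches.conf (a sketch; the stanza name and search are hypothetical, and the acceleration range here assumes 30 days):

```ini
[Exec Dashboard Summary]
search = index=web | timechart span=1h count by host
# Turn on Report Acceleration for this saved search
auto_summarize = 1
# Accelerate over the last 30 days
auto_summarize.dispatch.earliest_time = -30d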
Normal Search Example
• You ask for a statistical search
• Indexers return minimum necessary statistics (e.g., an avg needs sum/count)
• SH computes final result (sum(sum)/sum(count))
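To make that concrete: an average is never shipped around as an average. Each indexer contributes a partial sum and count, and the search head divides at the end. You can mimic the decomposition by hand (a sketch; the index and field names are assumptions):

```spl
index=web
| stats sum(bytes) as total_bytes, count as event_count
| eval avg_bytes = total_bytes / event_count
```

This produces the same number as | stats avg(bytes), and it is exactly the shape of the partial statistics the indexers return.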
Report Acceleration Example
• SH regularly requests minimum necessary statistics (e.g., avg needs sum/count) split into time buckets
• Later, when the user requests values, the SH already knows the answer.
Report Acceleration (3)
• Why?
– You've got a small, modest dataset with low split-by cardinality, where you are willing to be crafty to run multiple queries
– Auto fallback to raw logs, auto backfill and recovery, auto time granularity
– SUPER FAST
– Easy
• Why Not?
– Mostly limited to a single search per job
– Only support for basic analytics
– Kinda a black art, not that widely used
Accelerated Pivot
• Drag-and-drop basic stats interface, with the overwhelming power of accelerated data models on the backend
• How:
– Build a data model (more on that later)
– Accelerate it
– Use the pivot interface
– Save to dashboard and get promoted
• Examples
– Your first foray into accelerated reporting
– Anything that involves stats
Accelerated Pivot (2)
• Why?
– Super easy
– Automatically switch between raw logs and accelerated data
– Data Model Acceleration = 💯
• Why Not?
– Not entirely accelerated by default
– Can't go summaries-only in the UI
– Pivot search language is weirder than tstats
tstats
• Operates on accelerated data models or tscollect files (and index-time field extractions, such as source, host, index, sourcetype, and those ITSI or occasional others)
• Can only do stats – no raw logs (today!)
• Is faster than you've ever imagined life to be.
• How:
– Different search syntax, which takes adjustment, but actually really similar to normal stats.
– | tstats count where index=* groupby index sourcetype
– Bring a four-point seat harness 'cause we're going FAST
tstats (2)
• Why?
– Distributed indexed field searching with the flexibility of search language to define syntax
– summariesonly=t
– Faster than you've ever been.
Data Models – What You Need to Know
Something clever here…
Data Model Basics
• Essentially anything you can define in props and transforms can go into an accelerated data model
• Only raw events – can't accelerate a data model based on searches, or with transaction, or etc. Go check out summary indexing
• Favorite example: | eval myfield=spath(_raw, "path.to.my.field") is slow. Put that in your data model, and pivot/tstats queries will be super fast
• Next five slides from David Marquardt's .conf 2013 preso: http://conf.splunk.com/session/2013/WN69801_WhatsNew_Splunk_DavidMarquardt_UnderstandingSplunkAccelerationTechnologies.pdf
Splunk Enterprise Index Structure
[Diagram: an index comprises a Home Path (hot/warm buckets such as hot_v1_100, hot_v1_101, db_lt_et_70, db_lt_et_80, db_lt_et_101), a Cold Path, and a Thawed Path. Each bucket holds the compressed rawdata journal plus *.tsidx and *.data files, along with source/sourcetype/host metadata (e.g., 1 source::/my/log, 2 source::/blah). A TSIDX file pairs a LEXICON of terms (apple, beer, coke, ice, java, cream, …) with POSTING lists pointing into events such as "apple pie and ice cream is delicious" and "an apple a day keeps doctor away".]
Raw Data Stored at Offsets

Posting value | Seek address | _time
0             | 42           | 1331667091
1             | 78           | 1331667091
2             | 120          | 1331667091
3             | 146          | 1331667091
4             | 170          | 1331667091
5             | 212          | 1331667091
6             | 240          | 1331667091

Raw events:
Deep likes Bud light
Amrit likes Makers
Ledion likes cognac
Dave likes Jack Daniels
Zhang likes vodka
Deep likes Makers
Dave likes Makers
Raw Data Gets Indexed

Term    | Postings List
Amrit   | 1
Bud     | 0
Daniels | 3
Dave    | 3, 6
Deep    | 0, 5
Jack    | 3
Ledion  | 2
Makers  | 1, 5, 6
Zhang   | 4
cognac  | 2
likes   | 0, 1, 2, 3, 4, 5, 6
light   | 0
vodka   | 4

(Raw events as on the previous slide.)
• Each word in the raw event is indexed
• The TSIDX will store the offset #, and location in the gzip'd journal
• Querying dave makers returns #6
Reading Compressed Rawdata
journal.gz chunk boundaries: 0, 78, 148, 236, 380, 434, 506
Example: Reading offsets (120, 170)
1. Group offsets into residing chunks
– 120 falls into range (78, 148)
– 170 falls into range (148, 236)
2. Read data off disk and decompress
3. Run through field extractions
4. Recheck filters
5. Run calculations
This is disk + CPU EXPENSIVE
Storing Indexed Fields in TSIDX

Term          | Postings List
bar::AB       | 1, 3, 7, 39, 98
bar::cez      | 0, 6, 9, 12
bar::xyz      | 3, 4, 5, 6
baz::1        | 3, 6, 85
baz::2567     | 0, 5
baz::462      | 3, 24, 45
baz::98       | 2, 3, 5, 8, 9
baz::99023    | 1, 5, 6, 76, 99
foo::afdjsi   | 4, 567, 2345
foo::aghdafo  | 2, 234, 6667
foo::bazcxuid | 0, 1, 623, 7777
foo::cef      | 0, 1, 2, 3, 4, 43
foo::zaz      | 4

Big Idea: Use the lexicon as a field-value store!
By simply separating fields and values with "::" we can store sufficient information to run more interesting queries.
Data Model queries don't ever visit raw logs. They live entirely within TSIDX!
How to Transition from _raw to tstats
A whole new world (don't you dare close your eyes)
Process Overview
• Build your data model with whatever fields you could care about
• Start with your raw search
• Identify the aggregation that you want to do
– stats avg(bytes), dc(host), whatever else
• Make the minor syntax adjustments for tstats
Example Without Data Models
Raw: index=* | stats count by index, sourcetype
tstats: | tstats count where index=* groupby index, sourcetype
Example With Data Models
Raw: tag=network tag=traffic | stats dc(dest_ip) by src_ip
tstats: | tstats dc(All_Traffic.dest_ip) from datamodel=Network_Traffic groupby All_Traffic.src_ip
Challenge: Identifying Fields
• What fields are actually in a data model?
• How did I know to use "All_Traffic.dest_ip" instead of "dest_ip" or instead of "Network_Traffic.dest_ip"?
• To figure it out, we can look at the data model definition via pivot, or at the resulting tsidx files via walklex
• Pivot doesn't require SSH access, but still leaves you guessing for parts
• walklex is much more accurate and preferable
Identifying Fields via Pivot
The data model is Network_Traffic, and the root event node is "All_Traffic", so fields should mostly be All_Traffic.fieldname
Identifying Fields via Walklex
• Find the TSIDX file on your indexer (let's assume a data model)
– Path set in your index config, but by default in the index folder
– Usually $SPLUNK_HOME/var/lib/splunk/<INDEX>/datamodel_summary/<BUCKET_ID>/<SEARCH_HEAD_GUID>/<DATAMODEL_NAME>/<TIMERANGE>.tsidx
– Good news: that's by far the hard part
– Example: /opt/splunk/var/lib/splunk/defaultdb/datamodel_summary/1772_813B72E7-6743-4F46-9DE6-536F78929EDD/813B72E7-6743-4F46-9DE6-536F78929EDD/DM_Splunk_SA_CIM_Network_Traffic/1466344886-1466326949-3864670955536478127.tsidx
• Run walklex, either with an empty string "" or a wildcard "*dest_ip*"
– $SPLUNK_HOME/bin/splunk cmd walklex <TSIDX FILE> ""
Example Walklex
Example Walklex for a Particular Field
Example: Distinct Count of Walklex Fields
• /opt/splunk/bin/splunk cmd walklex 1457540473-1457196480-3287925045170504614.tsidx "" | tr -s " " | cut -d " " -f3 | grep "::" | awk -F"::" '{print $1;}' | sort | uniq -c
tstats where clause
• Works surprisingly like the initial search criteria of a raw search
• where index=* sourcetype=pan_traffic OR sourcetype=pan:traffic
– Just like normal search
• | tstats count where index=pan 10.1.1.1
– With non-datamodel data, 10.1.1.1 will be in the tsidx.
• where earliest=-24h
– Note that there is a bug in 6.3, 6.4 where a more restrictive time picker range doesn't override the earliest=… (unlike in raw search – this is a bug)
tstats grouping by
• When grouping by values (e.g., src_ip, sourcetype, etc.) it's like a normal stats … by …
– | tstats count where index=* groupby source, index
• You can also group by time, without using the bucket command
– | tstats count where earliest=-24h index=* groupby index _time span=1h
Bugs and Surprises
• There's a bug in 6.3/6.4 with earliest and latest where tstats doesn't override the time picker, so it's easiest to leave your time picker at all time.
• Sometimes tstats handles where clauses in surprising ways. For example: no underscores in values, no splunk_server_group, no CIDR matches (All_Traffic.dest_ip != 172.16.1.0/24 – fail. All_Traffic.dest_ip != 172.16.1.* – success)
When Data Model Acceleration or tstats Don't Work
A sad, sad day…
On the Output of a stats Command
• Sadly, you can't accelerate a search-based data model, so no luck.
• This is where Summary Indexing comes in
• You can also do index-time field extractions on summary indexes if you're fancy, and then tstats on those!
Workaround: Stats -> SI + Index Time -> tstats
• Creating index-time fields is a hassle, involving fields.conf, props.conf, transforms.conf, but it works on summary-indexed data.
• For example, from ITSI, we index the field indexed_itsi_kpi_id from summary indexed searches (sourcetype: stash_new)

props.conf:
[stash_new]
TRANSFORMS-set_kpisummary_index_fields = set_kpisummary_kpiid

transforms.conf:
[set_kpisummary_kpiid]
REGEX = itsi_kpi_id\s*=\s*([^\s,]+)
WRITE_META = true
FORMAT = indexed_itsi_kpi_id::$1

fields.conf:
[indexed_itsi_kpi_id]
INDEXED = true
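Once the field is indexed, tstats can filter and group by it directly in the summary index (a sketch — the index name itsi_summary is an assumption, not from the deck):

```spl
| tstats count where index=itsi_summary sourcetype=stash_new
    groupby indexed_itsi_kpi_id _time span=1h
```

Because indexed_itsi_kpi_id lives in the tsidx lexicon as field::value terms, this never touches the raw stash events.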
When Your Cardinality is Crazy High
• tstats can process huge numbers of events (billions, trillions, no problem).
• But if we have to store millions of rows in memory based on your split-by, that can be rough
• Example: a 300,000-person company tracks # of logins per user per day over 100 days. 300,000 * 100 = 30M rows, which means writing partial results to disk, and sadness.
• Better approach is to summary index each day, and then use tstats to process those results, either via index-time summarization or DMA
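A sketch of that approach (index and field names are assumptions): a scheduled search writes one small row per user per day, so the 100-day analysis only ever touches the summary rows.

```spl
index=auth action=success earliest=-1d@d latest=@d
| stats count as logins by user
| collect index=summary source="daily_logins"
```

The follow-up analysis then becomes something like index=summary source="daily_logins" earliest=-100d | stats avg(logins) stdev(logins) by user – roughly 30M tiny rows instead of billions of raw events.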
When Any Cardinality is Crazy Crazy High
• tstats efficiency is fundamentally based on the assumption that a particular value will be used a few times.
• If you have millions of events, each with 10 data points, with 10 points of precision such that repeat values are unlikely, your tsidx file will be absolutely massive.
Real World Examples
When things stop being slow, and start getting real.
Splunk(x) – Index Searches
• For running our Splunk internal UBA project, we needed to know what sourcetypes were in the system.
• _raw: index=* earliest=-24h | bucket _time span=1h | stats count by sourcetype, _time
– Time to complete: 68,476 seconds (19 hours)
• tstats: | tstats count where index=* groupby sourcetype _time span=1h
– Time to complete: 6.19 seconds
• Speed Difference: 11,062x (not percent, eleven thousand times faster)
• Query Length difference: 18 characters shorter
Financial Customer XML Use Case
• What Technology?
– Accelerated Data Models with Pivot
• Why?
– Heavy XML parsing meant search queries were terribly slow
– Pivot was very easy to use
• Result
– Very high scale, very happy customer
Financial Customer XML Use Case (2)
• No XML extraction
• Raw: 8.811 seconds
– index=xx-xxxx sourcetype=xxx-xxx splunk_server=myserver01.myserver.local ParticularLogIdentifier host=*ServerType* | timechart count by host
• Accelerated Pivot: 1.25 seconds
– | pivot XXXXXXX YYYYYY count(YYYYYY) AS "Number of Events" SPLITROW _time AS _time PERIOD auto SPLITCOL host FILTER host is "*ServerType*" SORT 100 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 1
• tstats Summaries Only: 0.896 seconds
– | tstats summariesonly=t count from datamodel=XXXXXXX where (nodename=YYYYYY) (YYYYYY.host="*ServerType*") groupby _time
• Speed Difference: 9.9x faster
• Query Length: 18 characters shorter
Financial Customer XML Use Case (3)
• Single XML extraction via spath
• _raw: 299.763 seconds
– index=xx-xxxx sourcetype=xxx-xxx splunk_server=myserver01.myserver.local ParticularLogIdentifier | eval RuleId=spath(_raw, "___path____.___to__._____very_______._____long___._:_xml___._:____._:_____.______.__________.__________.______________.______") | timechart count by RuleId
• Accelerated Pivot: 2.4 seconds
– | pivot XXXXXXX YYYYYY count(YYYYYY) AS "Number of Events" SPLITROW _time AS _time PERIOD auto SPLITCOL RuleId SORT 100 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 1
• tstats summariesonly: 2.04 seconds
– | tstats summariesonly=t count from datamodel=XXXXXXX where (nodename=YYYYYY) groupby RuleId _time
• Speed Difference: about 146.9x faster
• Query Length: 50 characters shorter
Financial Customer XML Use Case (4)
• Heavy XML extraction (mentioned earlier). Searches anonymized…
• An entire dashboard of unaccelerated pivots with lots of XML spath
– Time to complete: 172,800 seconds (2 days)
• An entire dashboard of accelerated pivots
– Time to complete: 16 seconds
• Speed Difference: about 10,000x
• Time Taken to Build 14-Panel Dashboard via Pivot: 15 minutes
ES Endpoint + Proxy + AV
• What Technology?
– ES Data Models + tstats
• Why?
– ES Data Models were already built, and multiple data sources, so tstats append=t
• Result
– Super fast search, highly scalable.
– Data Models make things easier
• Downside
– In this case, a 19-second savings every 15 minutes = a $211 ROI/year on a $300k Splunk infrastructure… maybe not enough?
ES Endpoint + Proxy + AV
• From last year's Security Ninjutsu Part Two, correlating sysmon with proxy and AV data.
• _raw:
[search tag=malware earliest=-20m@m latest=-15m@m | table dest | rename dest as src]
earliest=-20m@m (sourcetype=sysmon OR sourcetype=carbon_black eventtype=process_launch) OR (sourcetype=proxy category=uncategorized)
| stats count(eval(sourcetype="proxy")) as proxy_events count(eval(sourcetype="carbon_black" OR sourcetype="sysmon")) as endpoint_events by src
| where proxy_events > 0 AND endpoint_events > 0
– 21 seconds
ES Endpoint + Proxy + AV (2)
• tstats:
| tstats prestats=t summariesonly=t count(Malware_Attacks.src) as malwarehits from datamodel=Malware where Malware_Attacks.action=allowed groupby Malware_Attacks.src
| tstats prestats=t append=t summariesonly=t count(web.src) as webhits from datamodel=Web where web.http_user_agent="shockwave flash" groupby web.src
| tstats prestats=t append=t summariesonly=t count(All_Changes.dest) from datamodel=Change_Analysis where sourcetype=carbon_black OR sourcetype=sysmon groupby All_Changes.dest
| rename web.src as src Malware_Attacks.src as src All_Changes.dest as src
| stats count(Malware_Attacks.src) as malwarehits count(web.src) as webhits count(All_Changes.dest) as process_launches by src
– 2 seconds
ES Endpoint + Proxy + AV (3)
• Speed Difference: 10.5x
– It doesn't always have to be 10,000x. 10x or even 3x is still a huge reduction in resources.
• Query Length difference: 282 characters longer
– Multiple namespaces can make things longer, and also maybe more complicated sometimes. Worth it though.
Advanced Topics
Because it's been straightforward so far, right?
allow_old_summaries and summariesonly
• These two settings are perhaps the most important to tstats.
• summariesonly means that we won't automatically fall back to raw data – this means fast results, and much more of a difference than you would probably expect. If searching 100 days of data, and 15 minutes aren't accelerated, we probably don't care.
• allow_old_summaries is key for two scenarios:
– You leverage the Common Information Model, which is periodically updated, and you want to be able to search data from an earlier version (very likely)
– You have multiple apps with different global config sharing settings, and you want to search from an app that didn't *generate* the data model originally.
allow_old_summaries and summariesonly (2)
• While these settings are automatically set to true in ES (and probably other Splunk-owned apps), because they are so key you may want to set them to true automatically across the system via limits.conf
• Big impact: pivot will use whatever the default is
– Note: the pivot user interface actually runs tstats. The pivot search command is not impacted – I know, I know.
prestats=t
• tstats can be fed into upstream stats. For example, tstats _time span=… put directly into a graph looks terrible.
• | tstats prestats=t count where index=* groupby _time span=1d index | timechart span=1d count by index
chunk_size
• How much data will be retrieved by tstats from a tsidx file at once
• Tradeoff between memory, sorting, and other factors
• Default value (10000000 – 10MB) is usually the right fit.
– Lowering that could significantly hurt performance.
– For very high cardinality, raising it to 50MB or 100MB may be beneficial
– Worth testing out only for a long-running search you will use regularly
Searching Across Multiple Namespaces
• With normal search, you can use as many different indexes, sourcetypes, etc. as you want, with reckless abandon.
• With tstats, you can use append=t, but it requires prestats=t. Frequently requires munging with eval along the way.
• | tstats prestats=t dc(All_Traffic.dest) from datamodel=Network_Traffic groupby All_Traffic.src
| tstats prestats=t append=t count from datamodel=Malware groupby Malware_Attacks.dest
| eval system=coalesce('All_Traffic.src', 'Malware_Attacks.dest')
| stats dc(All_Traffic.dest), count by system
Searching Across Multiple Namespaces (2)
• If you are querying the same parameters in the first and second query, such as comparing time spans or looking at two counts, use eval with coalesce to define a field
| tstats prestats=t append=t count from datamodel=Malware where earliest=-24h groupby Malware_Attacks.dest
| eval range="current"
| tstats prestats=t append=t count from datamodel=Malware where earliest=-7d latest=-24h groupby Malware_Attacks.dest
| eval range=coalesce(range, "past")
| chart count over Malware_Attacks.dest by range
Searching Across Multiple Namespaces (3)
You can also use different fields, such as count(Malware_Attacks.src), count(web.src), and etc.
• | tstats prestats=t summariesonly=t count(Malware_Attacks.src) as malwarehits from datamodel=Malware where Malware_Attacks.action=allowed groupby Malware_Attacks.src   <- Pull Malware Data
• | tstats prestats=t append=t summariesonly=t count(web.src) as webhits from datamodel=Web where web.http_user_agent="shockwave flash" groupby web.src   <- Pull Web (Proxy) Data
• | tstats prestats=t append=t summariesonly=t count(All_Changes.dest) from datamodel=Change_Analysis where sourcetype=carbon_black OR sourcetype=sysmon groupby All_Changes.dest   <- Pull Endpoint Data
• | rename web.src as src Malware_Attacks.src as src All_Changes.dest as src   <- Normalize Field Names
• | stats count(Malware_Attacks.src) as malwarehits count(web.src) as webhits count(All_Changes.dest) as process_launches by src   <- Do Count
Drilldown
• Drilldowns from tstats queries don't often work correctly
• Best to put that in a dashboard where you can manually define the drilldown
_indextime
• While the Splunk UI doesn't show _indextime normally, you can use it because it is an indexed field. Just | eval _time=_indextime
• You can't do aggregations on it, but you can filter!
• Both the time range picker *AND* _indextime apply:
| tstats count min(_time) as min_time max(_time) as max_time where
  [| stats count as search | eval search="_indextime>" . relative_time(now(), "-7d") | table search]
  index=* groupby _indextime
| eval lag=_indextime - (min_time + max_time)/2
| eval _time=_indextime
| timechart avg(lag)
A Special Note About Time
_time is special with tstats, for a couple of reasons:
• You can't do avg(_time) or range(_time)
• You can do min(_time) and max(_time), and of course groupby _time span=10m (or whatever time)
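For example, a quick data-freshness check per sourcetype can lean entirely on the allowed min/max functions (a sketch; the age_seconds field name is an assumption):

```spl
| tstats min(_time) as first_seen max(_time) as last_seen count
    where index=* groupby sourcetype
| eval age_seconds = now() - last_seen
```

This stays inside the tsidx files, so it runs in seconds even across every index.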
Cardinality
• Data models are phenomenal with split-by cardinality, e.g.:
– | tstats avg(bytes) from datamodel=Network_Traffic groupby All_Traffic.dest_ip
• Data models are less great with overwhelming field cardinality, when tracking metric data
• Round off irrelevant data points. If you have temperature to 7 decimal places, but 1 decimal place is all that actually matters, just accelerate that.
– Don't include the unrounded field in your data model, because then the acceleration will store it and you'll use more disk space.
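In practice that means defining the data model field as the rounded value rather than passing the raw field through (a sketch; the temperature field name is an assumption):

```spl
| eval temperature = round(temperature, 1)
```

At one decimal place the lexicon holds at most a few thousand distinct temperature::value terms over a realistic range; at seven decimal places nearly every event mints a new term, and the tsidx balloons.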
Schema on What?
• Data models are a great combination of schema-on-read and schema-on-write.
• As with everything in Splunk, you can flexibly define and change your schema, rebuild tsidx, etc.
• But for accelerated data models, you get all the performance of schema-on-write… without losing the flexibility to redefine and rebuild as needed.
– Obviously, for VERY large data models, you might not want to wait for a rebuild, but you can apply changes moving forward
Quirks of Data Model Acceleration
• Second-level time compression. You can't look at milliseconds or microseconds for _time without hijinks (separate field and separate filtering)
• Requires stats. It's called tstats for a reason – there's no tstats raw in 6.4.
• The | datamodel search command was the devil < 6.4 – much better in the newest release
• Interrogating fields is a hassle
• TSIDX trades disk space for performance
Summary
Let's pull it all together, team
• Getting started: use accelerated pivot on data models
• Getting started w/ tstats: use tstats on normal indexed data
– counting events
– looking for index-time lag
• tstats is actually really easy
• That said, there are some weird quirks.
– Check out the PDF
THANK YOU