Taming Your Data

Copyright © 2014 Splunk Inc. Mark Runals Sr Security Engineer The Ohio State University Taming Your Data


Post on 21-Jun-2020


Page 1: Taming Your Data

Copyright © 2014 Splunk Inc.

Mark Runals, Sr Security Engineer, The Ohio State University

Taming Your Data

Page 2

Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.


Page 4

Agenda

• OSU Splunk deployment – environmental background
• Props/field extraction score methodology
• Look at data curator app

FYI - Splunk Admin Focused Presentation

Page 5

Some Background & Program Drivers

OSU Environment
135 distributed IT units around OSU
• Each group is autonomous
• No standardization
• Huge variety of technologies
• Splunk use not mandatory

+

Desired lightweight onboarding process
• For units & for Splunk team

=

Incredible roll-on/adoption rate

Page 6

Fast Forward a Year or 2 +/-

• 2TB of data
• 1,800+ Splunk agents
• 10k devices
• 12 types of firewalls
• Multiple OS
• 90+ teams with data in Splunk
• 700+ sourcetypes – many 'learned'
• 350+ people

Page 7

Fast Forward a Year or 2 +/-

Is data being ingested correctly?
What fields have been defined? Where?
What types of data are in Splunk?
What's not configured correctly?

Page 8

Issue Overview

Out of the box and without specific data definition, Splunk will generally ingest data correctly
• Host names
• Sourcetypes
• Timestamp
• Line breaking
• Auto key-value fields

At best, though, this isn't efficient. At worst, it can strain your deployment and may drop/lose events.

Factors in play
• Hardware
• Ratio of indexers to total log volume
• Sourcetype velocity
• Data distribution (forwarders pre 5.0.4 will favor first indexer listed in autoLB outputs.conf)
• Weird date/time information in your logs
• Etc…

Page 9

Data Import/Definition Pipeline (Mark's View)

Get Data to Splunk → Data Management → Knowledge Management

DM = Index Time Processing
• Sourcetyping
• Line breaking
• Timestamp
• Host field
• etc

KM = Search Time Processing
• Base level field extraction
• Normalized field names
• Field name alignment within Common Information Model (CIM)
• Knowledge objects

Page 10

The Plan

Data Management: Score based on 'Getting Data in Correctly' .conf 2012 preso

Knowledge Management: Score based on length of fields relative to _raw length (conversation with Kevin Meeks)

Data Curator App

Data Taxonomy: Create way to classify sourcetypes

Identify Common Issues: Munge through internal logs

Page 11

Data Management – Props Score

[mah_data_stanza]
TIME_PREFIX =
MAX_TIMESTAMP_LOOKAHEAD =
TIME_FORMAT =
SHOULD_LINEMERGE =
LINE_BREAKER =
TRUNCATE =
TZ =

Page 12

Data Management – Props Score

TIME_PREFIX =   +1
MAX_TIMESTAMP_LOOKAHEAD =   +1
TIME_FORMAT =   +1
OR
DATETIME_CONFIG =   +3

Page 13

Data Management – Props Score

SHOULD_LINEMERGE = False   +1

…but what if my data should be merged?

Page 14

Data Management – Props Score

SHOULD_LINEMERGE = True   +1

AND

One of these is populated
• BREAK_ONLY_BEFORE
• MUST_BREAK_AFTER
• MUST_NOT_BREAK_BEFORE
• MUST_NOT_BREAK_AFTER

Page 15

Data Management – Props Score

LINE_BREAKER =   +1

Default is ([\r\n]+)

Don't want to line break? ((?!)) or ((*FAIL)) are a couple of options*

*http://answers.splunk.com/answers/106075/each-file-as-one-single-splunk-event

Page 16

Data Management – Props Score

TRUNCATE =

Default is 10000

+1

Game your score!
• Set this to anything other than the default, e.g. 10001 or 999999

+0

Page 17

Data Management – Props Score

TZ =   +1

If setting this across your environment isn't possible/practical, reduce the max score macro in the app. It's used as a variable.

Macro: props_score_upper_bounds = 7 → 6

Page 18

Data Management – Props Score

Max Score = 7

(st_score * `props_score_scale`) / `props_score_upper_bounds`

10
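The point values on the preceding slides can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Data Curator app's actual search. The function names are made up, treating `props_score_scale` as 10 is a reading of the slide annotation, and awarding the TRUNCATE point only for a non-default value is an assumption about what the "+1"/"+0" labels mean.

```python
# Illustrative sketch of the props score; names and the TRUNCATE rule are assumptions.
TIME_KEYS = ("TIME_PREFIX", "MAX_TIMESTAMP_LOOKAHEAD", "TIME_FORMAT")
MERGE_KEYS = ("BREAK_ONLY_BEFORE", "MUST_BREAK_AFTER",
              "MUST_NOT_BREAK_BEFORE", "MUST_NOT_BREAK_AFTER")
TRUNCATE_DEFAULT = "10000"


def is_set(stanza, key):
    """True when the key is present with a non-empty value."""
    return bool(str(stanza.get(key, "")).strip())


def props_score(stanza):
    """Raw score (0-7) for one props.conf stanza given as a dict."""
    score = 0
    # DATETIME_CONFIG counts as all three timestamp settings (+3).
    if is_set(stanza, "DATETIME_CONFIG"):
        score += 3
    else:
        score += sum(1 for key in TIME_KEYS if is_set(stanza, key))
    # False scores on its own; True needs one of the break settings.
    merge = str(stanza.get("SHOULD_LINEMERGE", "")).strip().lower()
    if merge in ("false", "0"):
        score += 1
    elif merge in ("true", "1") and any(is_set(stanza, k) for k in MERGE_KEYS):
        score += 1
    if is_set(stanza, "LINE_BREAKER"):
        score += 1
    # Assumption: TRUNCATE only scores when it differs from the default.
    if is_set(stanza, "TRUNCATE") and str(stanza["TRUNCATE"]).strip() != TRUNCATE_DEFAULT:
        score += 1
    if is_set(stanza, "TZ"):
        score += 1
    return score


def scaled_score(st_score, scale=10, upper_bounds=7):
    """(st_score * props_score_scale) / props_score_upper_bounds."""
    return (st_score * scale) / upper_bounds


example = {
    "TIME_PREFIX": "^",
    "MAX_TIMESTAMP_LOOKAHEAD": "25",
    "TIME_FORMAT": "%Y-%m-%d %H:%M:%S",
    "SHOULD_LINEMERGE": "false",
    "LINE_BREAKER": r"([\r\n]+)",
}
print(props_score(example), scaled_score(props_score(example)))
```

Lowering `upper_bounds` to 6 mirrors the macro change suggested for environments where TZ cannot be set everywhere.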

Page 19

Props Score Caveats

There are a lot of additional props settings that could be applicable for your data/environment.

This method/app doesn't address host fields that are incorrect

syslog

Default host field?

Splunk UF


Page 21

Field Extraction Score Methodology

10.10.10.10 - - [20/Aug/2014:13:44:03.151 -0400] "POST /services/broker/phonehome/connection_10.10.10.10_8089_10.10.10.10_TEST-TS_68D82260-CC1D-4203-83CA-6E24F9FE6538 HTTP/1.0" 200 24 - - - 1ms

Length of Fields
1. Account for any autokv field names
2. Do convoluted search to get length of fields
3. Account for timestamp in log
4. Get total length

_raw length
1. Remove spaces
2. Remove newline characters
3. Get _raw length

Length of Fields / _raw length = % of event that has fields defined

Page 22

Field Extraction Score Methodology

Character counts annotated on the sample event: 11, 2, 3, 11, 11, 7, 36, 8, 3, 4

Page 23

Field Extraction Score Methodology

* Not a great example – Splunk forwarder phonehome logs actually have +100% field length compared to _raw

Page 24

Field Extraction Score Methodology

Caveats/Considerations

• Doesn't account for field alias (will artificially inflate score)
• If field extraction % is over 100 the score is set to 100
• Directionally correct is about the best this will get
• Fields extracted != field value
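As a rough illustration of the ratio described on these slides, the following Python sketch divides the combined length of extracted field values by the length of `_raw` with whitespace removed, and caps the result at 100. It is a simplification and not the app's search: it ignores the autokv field-name and timestamp adjustments, the function name is hypothetical, and stripping whitespace from field values as well as from `_raw` is an assumption.

```python
import re


def field_coverage(raw, fields):
    """Percent of _raw accounted for by field values, capped at 100."""
    # Remove spaces and newline characters before measuring _raw.
    raw_len = len(re.sub(r"\s+", "", raw))
    if raw_len == 0:
        return 0.0
    # Assumption: whitespace in values is dropped too, so both sides match.
    field_len = sum(len(re.sub(r"\s+", "", str(value)))
                    for value in fields.values())
    # Aliased or overlapping fields can push the ratio past 100, hence the cap.
    return min(100.0, 100.0 * field_len / raw_len)


print(field_coverage("user=bob action=login",
                     {"user": "bob", "action": "login"}))
```

An aliased field counts its value twice, which is how the field alias caveat above inflates the score.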

Page 25

Data Taxonomy

Version 1 – deprecated out of the box

Designed to answer "What type of data is in Splunk?"

Created a 2nd field classification csv for several hundred sourcetypes
• Data family
• Data subtype

Very useful but too many one-to-many relationships based on data use

netstat: Configuration? Networking?

Too many server *
• Server Monitoring
• Server Information
• Server Configuration
• Server Performance

Page 26

Data Taxonomy – Interactive Host Dashboard

Host A

Page 27

Data Taxonomy – Interactive Host Dashboard

Host B

Page 28

Data Curator App

Goals
• Flexible scoring scale
• Generate aggregate, system maturity scores
• Generate ~accurate individual maturity score
• Show what app/package contained props settings
• Show current props settings
• Highlight issues related to/solvable by props settings
  – Line breaking
  – Timestamp
  – Transforms issues

Take Note!
• Will NOT tell you what the settings should be
• Requires Splunk 6 search head
• Only able to work through issues I saw in my environment - you may have others.
• I can troubleshoot my app – not your deployment =)

Page 29

Deployment At A Glance

Page 30

Props Score Breakdown

Holy Crap!! Lots of Work

…but before you slit your wrists

Page 31

Props Score Breakdown

Page 32

Learned Sourcetypes (-too_small OR -#)

Beware of diminishing returns on working the 'long tail'

Page 33

Sourcetype Deep Dive Dashboard

Avamar Logs

Page 34

Sourcetype Deep Dive Dashboard

Avamar Logs

Not all items factor into score

Page 35

Sourcetype Deep Dive Dashboard

Avamar Logs

Loaded score based on volume of events per punct.

Score created on the fly

Page 36

Sourcetype Deep Dive Dashboard

Avamar Logs

Based on volume of events per punct. Quick way to see how unique logs in a particular sourcetype are.

Had 75 unique punct
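The idea of counting events per punct can be sketched outside Splunk. The helper below only approximates the built-in punct field by dropping letters and digits, showing spaces as underscores, and truncating to 30 characters. It is not Splunk's implementation, and the function names are hypothetical.

```python
import re
from collections import Counter


def rough_punct(event, limit=30):
    """Rough stand-in for punct: keep punctuation, show spaces as underscores."""
    pattern = re.sub(r"[A-Za-z0-9]", "", event).replace(" ", "_")
    return pattern[:limit]


def punct_distribution(events):
    """Return (pattern, count, percent of events), most common first."""
    counts = Counter(rough_punct(event) for event in events)
    total = sum(counts.values())
    return [(pattern, count, 100.0 * count / total)
            for pattern, count in counts.most_common()]


sample = ["a=1 b=2", "c=3 d=4", "error: disk full"]
for pattern, count, pct in punct_distribution(sample):
    print(pattern, count, round(pct, 1))
```

A sourcetype dominated by one or two patterns is fairly uniform; one with dozens of patterns, like the 75 seen here, probably needs more than one set of extractions.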

Page 37

Sourcetype Deep Dive Dashboard

ABDCB (learned)

Page 38

Sourcetype Deep Dive Dashboard

Argus

Page 39

Identifying Date/Time Issues

Page 40

Identifying Date/Time Issues

These events don't have timestamps!

Page 41

Identifying Date/Time Issues

What if Splunk thinks the last known good timestamp was 6 years ago?

Page 42

Identifying Date/Time Issues

Page 43

Date/Time Workspace Dashboard

Pre-populated with sourcetypes having issues

(DATETIME_CONFIG added to view after screenshot)

Additional Dashboard Elements
• Clustered internal logs giving you a level of visibility
• 100 most recent events

(No time information set)

Page 44

Line Breaking/Truncate Workspace Dashboard

Page 45

Line Breaking/Truncate Workspace Dashboard

Page 46

Line Breaking Sanity Check Dashboard

Sourcetypes have line breaking set but have multiple line counts in recent events

Page 47

Line Breaking Sanity Check

Set in multiple apps; potential problem down the road?

Page 48

Query Troubleshooting

Two main scheduled searches that are somewhat computationally expensive.

Dashboard allows admin to compare run length & frequency to coverage

Sourcetype field length percentage query

Page 49

Extract/Report/Transforms Issues

Example Internal Warning Logs

08-21-2014 08:55:46.348 -0400 WARN  SearchOperator:kv - IndexOutOfBounds invalid The FORMAT capturing group id: id=7, transform_name='Message'

08-21-2014 08:59:02.854 -0400 WARN  SearchOperator:kv - Invalid key-value parser, ignoring it, transform_name='extract_cmd_change'

08-21-2014 08:59:03.345 -0400 WARN  SearchOperator:kv - Invalid key-value parser, ignoring it, transform_name='(?i)^(?:[^\|]*\|){3}(?P<dest_domain>[^\|]+)'

…wut? Which app? In props or transforms?

Solution: grep -r through 520+ packages in deployment-apps directory for 'Message'?
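For comparison, the grep approach can be narrowed to just the props.conf and transforms.conf files under a directory of apps. This is a minimal sketch of that manual search, not how the Data Curator app finds the offending package. The function name is hypothetical and the match is a plain substring test.

```python
import os


def find_transform(root, name, conf_files=("props.conf", "transforms.conf")):
    """Return (path, line number, line) for each mention of `name` in the conf files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename not in conf_files:
                continue
            path = os.path.join(dirpath, filename)
            # Tolerate odd encodings rather than abort the whole walk.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_no, line in enumerate(handle, start=1):
                    if name in line:
                        hits.append((path, line_no, line.rstrip("\n")))
    return hits


# Example call (the path is an assumption about where deployment-apps lives):
# for path, line_no, line in find_transform("/opt/splunk/etc/deployment-apps", "Message"):
#     print(f"{path}:{line_no}: {line}")
```

The path in each hit names the app and says whether the reference sits in props or transforms. A name as generic as 'Message' can still return many unrelated hits.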

Page 50

Extract/Report/Transforms Issues

Page 51

Extract/Report/Transforms Issues

Only 5 tokens

Page 52

Extract/Report/Transforms Issues

Anyone know what the issue is?

Page 53

Extract/Report/Transforms Issues

Should be an EXTRACT

Page 54

KM – Sourcetype Fields Comparison

Bottom of explanatory text. There is a freeform text search box at top of dashboard

Page 55

App Roadmap

Now
• Props maturity scores
• Field extraction scores
• Issues workspaces
• Data taxonomy

Relatively non-scaling

Next
• Dashboard optimization (i.e. searchTemplate)
• Tag based data taxonomy
• Any initial app bug fixes

After Next
• Tie in data model fields
• Field value?
• Expand issue troubleshooting

Based on community feedback

Page 56

?

Check out the Forwarder Health app in Splunkbase

Blog: runals.blogspot.com

.conf 14 updated Getting Data in Correctly presentation – Andrew Duca

Page 57

THANK YOU