Copyright © 2014 Splunk Inc. Mark Runals Sr Security Engineer The Ohio State University Taming Your Data
Transcript
  • Copyright © 2014 Splunk Inc.

    Mark Runals, Sr Security Engineer, The Ohio State University

    Taming Your Data

  • Disclaimer

    2

    During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.


  • Agenda

    4

    • OSU Splunk deployment – environmental background
    • Props/field extraction score methodology
    • Look at the data curator app

    FYI - Splunk Admin Focused Presentation

  • Some Background & Program Drivers

    5

    OSU environment: 135 distributed IT units around OSU
    • Each group is autonomous
    • No standardization
    • Huge variety of technologies
    • Splunk use not mandatory

    Desired lightweight onboarding process
    • For units & for the Splunk team

    OSU environment + lightweight onboarding process = incredible roll-on/adoption rate

  • Fast Forward a Year or 2 +/-

    6

    • 2TB of data
    • 1,800+ Splunk agents
    • 10k devices
    • 12 types of firewalls
    • Multiple OSes
    • 90+ teams with data in Splunk
    • 700+ sourcetypes – many 'learned'
    • 350+ people

  • Fast Forward a Year or 2 +/-

    7

    Is data being ingested correctly? What fields have been defined? Where? What types of data are in Splunk? What's not configured correctly?

  • Issue Overview

    8

    Out of the box, and without specific data definition, Splunk will generally ingest data correctly:
    • Host names
    • Sourcetypes
    • Timestamp
    • Line breaking
    • Auto key-value fields

    At best, though, this isn't efficient. At worst, it can strain your deployment and may drop/lose events.

    Factors in play:
    • Hardware
    • Ratio of indexers to total log volume
    • Sourcetype velocity
    • Data distribution (forwarders pre-5.0.4 will favor the first indexer listed in autoLB outputs.conf)
    • Weird date/time information in your logs
    • Etc…

  • Data Import/Definition Pipeline (Mark's View)

    9

    Get Data to Splunk → Data Management → Knowledge Management

    DM = index-time processing
    • Sourcetyping
    • Line breaking
    • Timestamp
    • Host field
    • etc.

    KM = search-time processing
    • Base-level field extraction
    • Normalized field names
    • Field name alignment with the Common Information Model (CIM)
    • Knowledge objects

  • The Plan

    10

    • Data Management – score based on the 'Getting Data in Correctly' .conf 2012 preso
    • Knowledge Management – score based on length of fields relative to _raw length (conversation with Kevin Meeks); Data Curator App
    • Data Taxonomy – create a way to classify sourcetypes
    • Identify Common Issues – munge through internal logs

  • Data Management – Props Score

    11

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE =
    LINE_BREAKER =
    TRUNCATE =
    TZ =

  • Data Management – Props Score

    12

    [mah_data_stanza]
    TIME_PREFIX =                 +1
    MAX_TIMESTAMP_LOOKAHEAD =     +1
    TIME_FORMAT =                 +1   (OR  DATETIME_CONFIG =   +3)
    SHOULD_LINEMERGE =
    LINE_BREAKER =
    TRUNCATE =
    TZ =
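As a hedged illustration (the stanza name and all values here are hypothetical, not OSU's actual config), a sourcetype earning those timestamp points might look like:

```ini
# Hypothetical example -- illustrative values only
[mah_data_stanza]
# +1: regex positioning Splunk just before the timestamp
TIME_PREFIX = ^\[
# +1: don't scan further into the event than the timestamp needs
MAX_TIMESTAMP_LOOKAHEAD = 30
# +1: an explicit strptime format beats timestamp autodetection
TIME_FORMAT = %d/%b/%Y:%H:%M:%S.%3N %z
```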

  • Data Management – Props Score

    13

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE = False      +1
    LINE_BREAKER =
    TRUNCATE =
    TZ =

    ….but what if my data should be merged?

  • Data Management – Props Score

    14

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE = True       +1
    LINE_BREAKER =
    TRUNCATE =
    TZ =

    AND one of these is populated:
    BREAK_ONLY_BEFORE
    MUST_BREAK_AFTER
    MUST_NOT_BREAK_BEFORE
    MUST_NOT_BREAK_AFTER
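For instance (a sketch with a hypothetical stanza name, assuming multiline events such as stack traces where each event begins with a bracketed timestamp), the merge case might look like:

```ini
# Hypothetical multiline sourcetype: merge continuation lines into one event
[mah_multiline_stanza]
SHOULD_LINEMERGE = True
# start a new event only where a line begins with a [dd/Mon/yyyy timestamp
BREAK_ONLY_BEFORE = ^\[\d{2}/\w{3}/\d{4}
```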

  • Data Management – Props Score

    15

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE =
    LINE_BREAKER =                +1
    TRUNCATE =
    TZ =

    Default is ([\r\n]+)

    Don't want to line break? ((?!)) or ((*FAIL)) are a couple of options*

    *http://answers.splunk.com/answers/106075/each-file-as-one-single-splunk-event
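Following that answers.splunk.com tip, a sketch of a stanza that never line-breaks (hypothetical stanza name; the regex deliberately never matches, so each file arrives as a single event):

```ini
# Hypothetical: ingest each file as one event via a regex that can never match
[mah_wholefile_stanza]
SHOULD_LINEMERGE = False
LINE_BREAKER = ((?!))
# disable truncation so a large single-event file isn't cut off
TRUNCATE = 0
```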

  • Data Management – Props Score

    16

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE =
    LINE_BREAKER =
    TRUNCATE =                    +1
    TZ =

    Default is 10000

    Game your score! → Set this to anything other than the default, i.e. 10001 or 999999   +0

  • Data Management – Props Score

    17

    [mah_data_stanza]
    TIME_PREFIX =
    MAX_TIMESTAMP_LOOKAHEAD =
    TIME_FORMAT =
    SHOULD_LINEMERGE =
    LINE_BREAKER =
    TRUNCATE =
    TZ =                          +1

    If setting this across your environment isn't possible/practical, reduce the max-score macro in the app. It's used as a variable.

    Macro: props_score_upper_bounds = 7 → 6

  • Data Management – Props Score

    18

    Max score = 7

    (st_score * `props_score_scale`) / `props_score_upper_bounds`

    props_score_scale = 10
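In SPL terms (a sketch of mine — the field and macro values are assumed from the slide, with the scale taken as 10), the normalization is just:

```
| eval normalized_score = round((st_score * 10) / 7, 1)
```

A stanza scoring 5 of its 7 possible points would normalize to 5 × 10 / 7 ≈ 7.1 on the 10-point scale.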

  • Props Score Caveats

    19

    There are a lot of additional props settings that could be applicable for your data/environment.

    This method/app doesn't address host fields that are incorrect.

    (Diagram: syslog vs. Splunk UF — which one sets the default host field?)


  • Field Extraction Score Methodology

    21

    10.10.10.10 - - [20/Aug/2014:13:44:03.151 -0400] "POST /services/broker/phonehome/connection_10.10.10.10_8089_10.10.10.10_TEST-TS_68D82260-CC1D-4203-83CA-6E24F9FE6538 HTTP/1.0" 200 24 - - - 1ms

    Length of fields:
    1. Account for any autokv field names
    2. Do a convoluted search to get the length of the fields
    3. Account for the timestamp in the log
    4. Get total length

    _raw length:
    1. Remove spaces
    2. Remove newline characters
    3. Get _raw length

    Length of fields ÷ _raw length = % of the event that has fields defined

  • Field Extraction Score Methodology

    22

    (Per-field lengths overlaid on the sample event: 11, 2, 3, 11, 11, 7, 36, 8, 3, 4)

  • Field Extraction Score Methodology

    23

    * Not a great example – Splunk forwarder phonehome logs actually have +100% field length compared to _raw
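A rough SPL sketch of that "convoluted search" (my own assumption, not the app's actual query — the index and sourcetype names are hypothetical, and like the slides say, this is only directionally correct):

```
index=mah_index sourcetype=mah_sourcetype
| eval total_field_len = 0
| foreach * [ eval total_field_len = total_field_len + if(isnotnull('<<FIELD>>'), len('<<FIELD>>'), 0) ]
| eval raw_len = len(replace(_raw, "\s+", ""))
| eval pct_defined = min(round(100 * total_field_len / raw_len, 1), 100)
| stats avg(pct_defined) AS avg_pct_defined
```

The min(…, 100) cap matches the caveat that extraction percentages over 100 are set to 100.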

  • Field Extraction Score Methodology

    24

    Caveats/Considerations

    • Doesn't account for field aliases (will artificially inflate the score)
    • If the field extraction % is over 100, the score is set to 100
    • Directionally correct is about the best this will get
    • Fields extracted != field value

  • Data Taxonomy

    25

    Version 1 – deprecated out of the box

    Designed to answer "What type of data is in Splunk?"

    Created a 2nd field classification csv for several hundred sourcetypes
    • Data family
    • Data subtype

    Very useful, but too many one-to-many relationships based on data use

    (Example: netstat — Configuration? Networking? Server Monitoring, Server Information, Server Configuration, Server Performance… too many "server *" categories)

  • Data Taxonomy – Interactive Host Dashboard

    26  

    Host  A  

  • Data Taxonomy – Interactive Host Dashboard

    27  

    Host  B  

  • Data Curator App

    28

    Goals
    • Flexible scoring scale
    • Generate aggregate, system maturity scores
    • Generate ~accurate individual maturity scores
    • Show which app/package contained props settings
    • Show current props settings
    • Highlight issues related to/solvable by props settings
      – Line breaking
      – Timestamp
      – Transforms issues

    Take Note!
    • Will NOT tell you what the settings should be
    • Requires a Splunk 6 search head
    • Only able to work through issues I saw in my environment - you may have others
    • I can troubleshoot my app – not your deployment =)

  •  Deployment  At  A  Glance  

    29  

  • Props Score Breakdown

    30

    Holy Crap!! Lots of Work

    ….but before you slit your wrists

  •  Props  Score  Breakdown  

    31  

  • Learned Sourcetypes (-too_small OR -#)

    32

    Beware of diminishing returns on working the 'long tail'

  • Sourcetype Deep Dive Dashboard

    33

    Avamar Logs

  • Sourcetype Deep Dive Dashboard

    34

    Avamar Logs

    Not all items factor into the score

  • Sourcetype Deep Dive Dashboard

    35

    Avamar Logs

    Loaded score based on volume of events per punct. Score created on the fly

  • Sourcetype Deep Dive Dashboard

    36

    Avamar Logs — based on volume of events per punct. A quick way to see how unique the logs in a particular sourcetype are.

    Had 75 unique punct
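A sketch of the underlying idea (an assumed search — the index and sourcetype names are hypothetical): group events by their punct pattern and count each pattern's volume.

```
index=mah_index sourcetype=avamar_logs
| stats count by punct
| sort - count
```

A sourcetype with 75 unique punct values, as above, yields 75 rows here; heavily skewed counts suggest a few dominant log formats plus a long tail.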

  •  Sourcetype  Deep  Dive  Dashboard  

    37  

ABDCB (learned)

  •  Sourcetype  Deep  Dive  Dashboard  

    38  

    Argus  

  • Identifying Date/Time Issues

    39

  • Identifying Date/Time Issues

    40

    These events don't have timestamps!

  • Identifying Date/Time Issues

    41

    These events don't have timestamps! What if Splunk thinks the last known good timestamp was 6 years ago?


  • Date/Time Workspace Dashboard

    43

    Pre-populated with sourcetypes having issues

    (DATETIME_CONFIG added to the view after the screenshot)

    Additional dashboard elements
    • Clustered internal logs giving you a level of visibility
    • 100 most recent events (no time information set)
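One way to surface those clustered internal timestamp warnings yourself (a sketch — DateParserVerbose is the splunkd component that logs timestamp-parsing failures, though the exact messages vary by Splunk version):

```
index=_internal sourcetype=splunkd component=DateParserVerbose
| cluster showcount=true
| table cluster_count _raw
| sort - cluster_count
```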

  •  Line  Breaking/Truncate  Workspace  Dashboard  

    44  

  •  Line  Breaking/Truncate  Workspace  Dashboard  

    45  

  •  Line  Breaking  Sanity  Check  Dashboard  

    46  

    Sourcetypes have line breaking set but have multiple line counts in recent events
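That check can be approximated with the default linecount field (a sketch of my own, not necessarily the dashboard's query):

```
index=* earliest=-24h
| stats dc(linecount) AS distinct_line_counts by sourcetype
| where distinct_line_counts > 1
```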

  • Line Breaking Sanity Check

    47

    Sourcetypes have line breaking set but have multiple line counts in recent events

    Set in multiple apps; a potential problem down the road?

  • Query Troubleshooting

    48

    Two main scheduled searches that are somewhat computationally expensive.

    Dashboard allows an admin to compare run length & frequency to coverage

    Sourcetype field length percentage query

  • Extract/Report/Transforms Issues

    49

    Example internal warning logs:

    08-21-2014 08:55:46.348 -0400 WARN  SearchOperator:kv - IndexOutOfBounds invalid The FORMAT capturing group id: id=7, transform_name='Message'

    08-21-2014 08:59:02.854 -0400 WARN  SearchOperator:kv - Invalid key-value parser, ignoring it, transform_name='extract_cmd_change'

    08-21-2014 08:59:03.345 -0400 WARN  SearchOperator:kv - Invalid key-value parser, ignoring it, transform_name='(?i)^(?:[^\|]*\|){3}(?P[^\|]+)'

    …wut? Which app? In props or transforms?

    Solution: grep -r through 520+ packages in the deployment-apps directory for 'Message'?
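Instead of grepping the filesystem, Splunk's REST API can locate a transform and the app that ships it (a sketch, run from the search head; 'Message' is the transform name from the warning above):

```
| rest /services/configs/conf-transforms
| search title="Message"
| table title eai:acl.app REGEX FORMAT
```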

  •  Extract/Report/Transforms  Issues  

    50  

  •  Extract/Report/Transforms  Issues  

    51  

    Only 5 tokens

  • Extract/Report/Transforms Issues

    52

    Anyone know what the issue is?

  • Extract/Report/Transforms Issues

    53

    Should be an EXTRACT

  • KM – Sourcetype Fields Comparison

    54

    Bottom of explanatory text. There is a freeform text search box at the top of the dashboard

  • App Roadmap

    55

    Now
    • Props maturity scores
    • Field extraction scores
    • Issues workspaces
    • Data taxonomy (relatively non-scaling)

    Next
    • Dashboard optimization (ie searchTemplate)
    • Tag-based data taxonomy
    • Any initial app bug fixes

    After Next
    • Tie in data model fields
    • Field value?
    • Expand issue troubleshooting based on community feedback

  •

    56

    ?

    Check out the Forwarder Health app on Splunkbase

    Blog: runals.blogspot.com

    .conf14 updated 'Getting Data in Correctly' presentation – Andrew Duca

  • THANK  YOU  

