
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Date post: 02-Dec-2014
Upload: alexandre-pinto
Description:
Follow along with the R Markdown file at http://rpubs.com/alexcpsec/tiq-test-Summer2014-2

Source code available at:
https://github.com/mlsecproject/tiq-test
https://github.com/mlsecproject/tiq-test-Summer2014
https://github.com/mlsecproject/combine

Full Abstract: Threat Intelligence feeds are now being touted as the saving grace for SIEM and log management deployments, and as a way to supercharge incident detection and even response practices. We have heard similar promises before as an industry, so it is only fair to try to investigate. Since the actual number of breaches and attacks worldwide is unknown, it is impossible to measure how good threat intelligence feeds really are, right? Enter a new scientific breakthrough developed over the last 300 years: statistics!

This presentation will consist of a data-driven analysis of a cross-section of threat intelligence feeds (both open-source and commercial) to measure their statistical bias, overlap, and representativeness of the unknown population of breaches worldwide. Are they a statistically good measure of the population of "bad stuff" happening out there? Is there even such a thing? How tuned to your specific threat surface are those feeds anyway? Regardless, can we actually make good use of them even if the threats they describe have no overlap with the actual incidents you have been seeing in your environment?

We will provide an open-source tool for attendees to extract, normalize and export data from threat intelligence feeds to use in their internal projects and systems. It will be pre-configured with current OSINT network feeds and easily extensible for private or commercial feeds. All the statistical code written and research data used (from the open-source feeds) will be made available in the spirit of reproducible research. The tool itself can be used by attendees to perform the same type of tests on their own data.

Join Alex and Kyle on a journey through the actual real-world usability of threat intelligence to find out which mix of open source and private feeds is right for your organization.
Transcript
Page 1: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Measuring the IQ of your Threat Intelligence Feeds (#TIQtest)

Alex Pinto
MLSec Project
@alexcpsec @MLSecProject

Kyle Maxwell
Researcher
@kylemaxwell

Page 2: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

whoami(s)

Alex Pinto
• Science guy at MLSec Project
• ML trainer
• Network security aficionado
• Tortured by SIEMs as a child
• Hacker Spirit Animal™: CAFFEINATED CAPYBARA

Kyle Maxwell
• Researcher at [REDACTED]
• Math Smuggler
• Recovering Incident Responder
• GPL zealot
• Hacker Spirit Animal™: AXIOMATIC ARMADILLO

(https://secure.flickr.com/photos/kobashi_san/) (http://www.langorigami.com/art/gallery/gallery.php?tag=mammals&name=armadillo)

Page 3: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Agenda

• Threat Intel 102
• Measuring Intelligence
• Data Preparation
• Testing the Data
• Tools:
  • COMBINE
  • TIQ-TEST
• Some parting ideas

(http://www.savagechickens.com/2008/12/iq-test.html)

Page 4: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Threat Intel 102: Capability and Intent

• What are they able to do?
• What are they intending to do?

Page 5: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Threat Intel 102: Cage Matches

• Signatures vs Indicators
• Data vs Intelligence
• Tactical vs Strategic
• Atomic vs Composite

Page 6: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Threat Intel 102: Pyramid of Pain

“Simple” and “easy” aren’t always (David Bianco – Pyramid of Pain)

Page 7: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

What about IP addresses?

• Approximately same value as hostnames (APT vs DGA)
• Finite resource (until IPv6, that is)
• Managed / controlled by orgs
• Difficulty / economic incentives / implied “cost”
• Also, recyclable

(https://xkcd.com/865/)

Page 8: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Given IP addresses harvested from TI feeds, can we measure how much they “help” our defense metrics?

Page 9: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing TIQ-TEST

• All these tests are available as R functions at
  • https://github.com/mlsecproject/tiq-test
• Have fun, prove me wrong, suggest stuff
• Tools that implement those tests
• Sample data + R Markdown file
• The excuse to learn a statistical language you were waiting for!

Page 10: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Sources – Types of data

• Extract the “raw” information from indicator feeds
• Both IP addresses and hostnames were extracted

Page 11: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Sources – Feeds Selected

• Data was separated into “inbound” and “outbound”

Page 12: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Preparation and Cleansing

• Convert the hostname data to IP addresses:
  • Active IP addresses for the respective date (“A” query)
  • Passive DNS from Farsight Security (DNSDB)
• We removed non-public IPs from the dataset (RFC1918)
  • Yeah, we know it is a “parking technique”

(https://xkcd.com/742/)
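The RFC 1918 cleansing step can be sketched in a few lines of Python. This is only an illustration using the standard `ipaddress` module, not the code from the tiq-test repository; the function names and the sample addresses are assumptions made for the example.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip_text):
    """Return True when ip_text is an IPv4 address inside an RFC 1918 range."""
    address = ipaddress.ip_address(ip_text)
    if address.version != 4:
        return False
    return any(address in network for network in RFC1918_NETWORKS)


def drop_rfc1918(indicators):
    """Keep only the indicators that fall outside RFC 1918 space."""
    return [ip for ip in indicators if not is_rfc1918(ip)]


# Hypothetical sample; the public addresses come from documentation ranges.
sample = ["10.1.2.3", "198.51.100.7", "172.16.0.9", "192.168.1.1", "203.0.113.7"]
print(drop_rfc1918(sample))
```

Filtering on an explicit list of networks keeps the rule identical to what the slide names (RFC 1918 only), rather than also dropping loopback or link-local addresses.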

Page 13: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Preparation and Cleansing

• For each IP record (including the ones from hostnames):
  • Add asnumber and asname (from MaxMind ASN DB)
  • Add country (from MaxMind GeoLite DB)
  • Add rhost (again from DNSDB) – most popular “PTR”
• The experiments will be around ASNs and Geolocation
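A rough sketch of the enrichment step, assuming a tiny in-memory prefix table in place of the MaxMind databases. The table contents, the `entity` field name, and the `enrich` function are hypothetical; only the `asnumber`, `asname`, and `country` fields come from the slide, and the `rhost` lookup is left out because it needs DNSDB access.

```python
import ipaddress

# Hypothetical stand-in for the ASN and country databases: prefix -> metadata.
TOY_PREFIX_DB = {
    "198.51.100.0/24": {"asnumber": 64500, "asname": "EXAMPLE-NET-A", "country": "US"},
    "203.0.113.0/24": {"asnumber": 64501, "asname": "EXAMPLE-NET-B", "country": "BR"},
}


def enrich(ip_text, prefix_db=TOY_PREFIX_DB):
    """Attach asnumber, asname and country to one IP record.

    Fields stay None when no prefix in the table covers the address.
    """
    record = {"entity": ip_text, "asnumber": None, "asname": None, "country": None}
    address = ipaddress.ip_address(ip_text)
    for prefix, metadata in prefix_db.items():
        if address in ipaddress.ip_network(prefix):
            record.update(metadata)
            break
    return record


print(enrich("203.0.113.9"))
print(enrich("192.0.2.1"))
```

A real deployment would swap the toy table for proper database lookups; the point here is only the shape of the enriched record that the later tests group by.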

Page 14: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Preparation and Cleansing

• However, we will NOT be using maps. Just let it go.

Page 15: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Data Preparation and Cleansing

• Small enriched sample:

Page 16: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Testing the Data

• Let’s generate some interesting metrics:
  • NOVELTY – How often do they update themselves?
  • OVERLAP – How do they compare to what you got?
  • POPULATION – what is in them anyway?
• Population is tricky:
  • Could mean the entire world (all IPv4 space)
  • Should ideally mean YOUR world

Page 17: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

But WHAT IS THE IQ?!??1?

• We will withhold judgment
• The best data composition is the best one for you
• We will do our best to explain results so you can decide.
• Maybe on further (or more private) research…

Page 18: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Novelty Test – measuring added and dropped indicators
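One way to picture the novelty measurement is a set comparison between two consecutive daily snapshots of the same feed. The sketch below is a Python illustration, not the R function from tiq-test; the choice of denominators for the two ratios is an assumption of this example.

```python
def novelty(previous_day, current_day):
    """Count indicators added and dropped between two daily snapshots.

    added_ratio is relative to the current snapshot and dropped_ratio to the
    previous one (an assumption made for this sketch).
    """
    previous, current = set(previous_day), set(current_day)
    added = current - previous
    dropped = previous - current
    return {
        "added": len(added),
        "dropped": len(dropped),
        "added_ratio": len(added) / len(current) if current else 0.0,
        "dropped_ratio": len(dropped) / len(previous) if previous else 0.0,
    }


# Hypothetical snapshots built from documentation address ranges.
yesterday = ["198.51.100.1", "198.51.100.2", "203.0.113.5", "203.0.113.6"]
today = ["203.0.113.5", "203.0.113.6", "192.0.2.44"]
print(novelty(yesterday, today))
```

Running this over every consecutive pair of days gives one added/dropped series per feed, which is the kind of data a novelty chart would plot.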

Page 19: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Novelty Test (1)

Page 20: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Novelty Test (2)

Page 21: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Overlap Test – More data is better, but make sure it is not the same data
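The overlap idea can be sketched as a pairwise comparison: for every ordered pair of feeds, what fraction of the first feed's indicators also appear in the second? This is a Python illustration under that assumed definition, with hypothetical feed names and data; it is not the tiq-test implementation.

```python
from itertools import permutations


def overlap_matrix(feeds):
    """Map each ordered pair (a, b) to the share of a's indicators found in b."""
    sets = {name: set(values) for name, values in feeds.items()}
    matrix = {}
    for a, b in permutations(sets, 2):
        shared = len(sets[a] & sets[b])
        matrix[(a, b)] = shared / len(sets[a]) if sets[a] else 0.0
    return matrix


# Hypothetical feeds built from documentation address ranges.
feeds = {
    "feed_a": ["198.51.100.1", "198.51.100.2", "203.0.113.5", "203.0.113.6"],
    "feed_b": ["203.0.113.5", "192.0.2.44"],
}
for pair, share in overlap_matrix(feeds).items():
    print(pair, share)
```

The matrix is deliberately asymmetric: a small feed can be half contained in a large one while covering only a quarter of it, and both numbers matter when deciding whether a second feed adds anything.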

Page 22: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Overlap Test

Page 23: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Overlap Test

Page 24: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Overlap Test

Page 25: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Population Test

• Let us use the ASN and GeoIP databases that we used to enrich our data as a reference of the “true” population.
• But, but, human beings are unpredictable! We will never be able to forecast this!
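As a minimal sketch of the comparison behind the population test, the snippet below turns the country labels of a feed into proportions and subtracts a reference distribution. The reference numbers and the feed contents are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter


def proportions(labels):
    """Convert a list of labels (e.g. country codes) into shares of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def compare_to_reference(feed_labels, reference_proportions):
    """Return feed share minus reference share for every label seen in either."""
    feed = proportions(feed_labels)
    labels = set(feed) | set(reference_proportions)
    return {
        label: feed.get(label, 0.0) - reference_proportions.get(label, 0.0)
        for label in labels
    }


# Hypothetical data: ten enriched indicators and an invented reference population.
feed_countries = ["US"] * 5 + ["BR"] * 3 + ["FR"] * 2
reference = {"US": 0.4, "BR": 0.1, "FR": 0.1, "CN": 0.4}
for country, difference in sorted(compare_to_reference(feed_countries, reference).items()):
    print(country, round(difference, 3))
```

A positive difference means the feed over-represents that country relative to the reference; whether the gap is meaningful is a question for the statistical tests discussed later in the talk.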

   

Page 26: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Page 27: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Page 28: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Can we get a better look?

• Don’t like squinting either
• Statistical inference-based comparison models (hypothesis testing)
  • Exact binomial tests (when we have the “true” pop)
  • Chi-squared proportion tests (similar to independence tests)
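To make the exact binomial test concrete, here is a standard-library Python sketch. It computes a two-sided p-value by summing the probabilities of every outcome that is no more likely than the observed one, which is one common convention for exact tests; the function names, the tolerance, and the example numbers are assumptions of this illustration rather than the tiq-test code.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p_value(successes, trials, expected_proportion):
    """Two-sided exact binomial p-value.

    Sums the probability of all outcomes whose likelihood does not exceed
    that of the observed count (with a small relative tolerance for ties).
    """
    observed = binomial_pmf(successes, trials, expected_proportion)
    threshold = observed * (1 + 1e-7)
    total = sum(
        probability
        for probability in (
            binomial_pmf(k, trials, expected_proportion) for k in range(trials + 1)
        )
        if probability <= threshold
    )
    return min(1.0, total)


# Hypothetical case: 30 of 100 feed indicators fall in a country that holds
# 10% of the reference population.
print(exact_binomial_p_value(30, 100, 0.10))
```

A very small p-value here says the feed's share for that country is unlikely to be a random draw from the reference population, which is the kind of evidence the slide is after.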

Page 29: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Can we get a better look?

• We can better estimate, with confidence intervals, our measures of error.
• Also, p-values! (with apologies to Alex Hutton)
  • We promise to be very conservative in using them.
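For a sense of what a confidence interval on an observed proportion looks like, the sketch below computes a Wilson score interval in plain Python. The slides do not say which interval method was used, so Wilson is this example's choice, and the default `z=1.96` (roughly 95% confidence) and the sample counts are assumptions.

```python
from math import sqrt


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z is the normal quantile for the desired confidence level
    (1.96 is approximately 95%).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denominator = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denominator
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denominator
    )
    return center - half_width, center + half_width


# Hypothetical case: 30 of 100 indicators in a feed map to one country.
low, high = wilson_interval(30, 100)
print(round(low, 4), round(high, 4))
```

If the reference proportion for that country sits outside the interval, the feed's share differs from the reference by more than sampling noise would comfortably explain.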

 

Page 30: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Page 31: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Hacker Spirit Animal™ Guide

• US – Eagle
• CA – Moose
• FR – Frog
• GB – Bulldog
• AU – Koala
• BR – Capybara / Toucan
• Texas – Armadillo

• Disclaimer: we do not endorse Geolocation-based attribution

Page 32: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Trend Comparison

Page 33: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Page 34: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Page 35: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing COMBINE

• Harvesting feeds takes some work.
• Most of us let somebody else do it without thinking about what it actually takes.

Page 36: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing COMBINE

https://github.com/mlsecproject/combine

Page 37: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing COMBINE

• Components:
  1. Reaper gathers the threat data directly from feeds.
  2. Thresher normalizes it into a simplistic data model.
  3. Winnower optionally performs basic validation or enrichment.
  4. Baler transforms the data into CybOX, CSV, JSON, and CIM. (Only CSV and JSON work right now). Could also write others fairly easily. (nudge nudge, wink wink)
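The four stages can be pictured as a small pipeline. The Python sketch below mirrors the stage names from the slide but is not COMBINE's actual code: the function names, the record fields (`entity`, `type`, `source`), and the inline sample feeds are all hypothetical, and the "reaper" reads strings instead of fetching URLs so the example stays self-contained.

```python
import csv
import io
import ipaddress
import json

# Hypothetical raw feed contents standing in for downloaded feed files.
RAW_FEEDS = {
    "example-feed-a": "# comment line\n198.51.100.7\n203.0.113.9\n",
    "example-feed-b": "203.0.113.9\nnot-an-ip\n",
}


def reap(raw_feeds):
    """Reaper: gather raw lines from every feed, tagged with their source."""
    return [(source, line) for source, text in raw_feeds.items()
            for line in text.splitlines()]


def thresh(harvested):
    """Thresher: normalize raw lines into a simple record model."""
    records = []
    for source, line in harvested:
        value = line.strip()
        if not value or value.startswith("#"):
            continue
        records.append({"entity": value, "type": "IPv4", "source": source})
    return records


def winnow(records):
    """Winnower: basic validation, keeping only parseable IPv4 addresses."""
    valid = []
    for record in records:
        try:
            ipaddress.IPv4Address(record["entity"])
        except ValueError:
            continue
        valid.append(record)
    return valid


def bale(records, output_format="csv"):
    """Baler: serialize the records as CSV or JSON text."""
    if output_format == "json":
        return json.dumps(records, indent=2)
    if output_format == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["entity", "type", "source"])
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {output_format}")


records = winnow(thresh(reap(RAW_FEEDS)))
print(bale(records, "csv"))
```

Keeping each stage as a plain function over lists makes it easy to add another output format to the last step or to drop in an extra validation pass without touching the others.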

Page 38: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing COMBINE

• Always trying to feed it more. Lots of possibilities, including your own data sources.
• We clearly do NOT endorse any included feeds.

Page 39: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Introducing COMBINE

• Enrichments - think metadata.
  • AS, geolocation
  • DNS resolutions courtesy of Farsight DNSDB
    • Ask them for an API key to test it, tell them Alex Pinto sent you ;)

Page 40: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

MLSec Project

• Both projects have been released as GPLv3 by MLSec Project
• Will replace the internal versions we have on the main code
• Looking for participants and data sharing agreements
• Liked TIQ-TEST? We can benchmark your private feeds using these and other techniques
• Visit https://www.mlsecproject.org, message @MLSecProject or just e-mail me.

Page 41: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Take Aways

• Analyze your data.
• Extract value from it!
• Try before you buy! Different test results mean different things to different orgs.
• Use the tools! Suggest new tests!
• Share data with us! We take good care of it, make sure it gets proper exercise.

Page 42: Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)

Thanks!

• Q&A?
• Feedback!

Alex Pinto
@alexcpsec @MLSecProject

Kyle Maxwell
@kylemaxwell

“The measure of intelligence is the ability to change.” – Albert Einstein

