ICSI Honeyfarm Status

The ICSI Honeyfarm Cui, Paxson, Weaver


Weidong Cui, Vern Paxson

Nicholas Weaver

Page 2


General Concept: A Breeding Ground for Worms

• We want a controlled, automatic breeding ground for worms and other self-propagating attacks:
  – Worm attacks a "monitored" address and begins to propagate in our system
• As the worm propagates, we have a suite of automatic analyzers to study the worm
  – What can it infect?
  – Any particulars of interest?
  – How does it attack?
• And automatically analyze defense strategies
  – Does this signature block the worm?
• All within a very short time: a few seconds
  – And with a single point of trust for exporting information
• Also want to leverage the infrastructure for detecting other things:
  – Human attackers/non-self-propagating attacks
  – Non-random worms

Page 3


Honeyfarm: Objectives

• We use network telescopes and a honeyfarm to detect scanning worms
  – Network Telescopes
    • Distributed unallocated IP address ranges
  – Honeyfarm:
    • Centralized cluster of honeypots
    • On-demand: emulating a large number of hosts on a small number of honeypots
• Detecting self-propagation
  – Detect self-propagation inside the honeyfarm by redirecting propagations from one honeypot to other honeypots
• Other detectors possible:
  – Tripwire/modification detectors
  – Monitored honeypots, etc…

[Figure: diagram with the labels Global Internet, Controller, Honeypots, Honeyfarm]

Page 4


The Overall Goal

• Framework for automatically detecting and analyzing new worms and other attacks
  – For self-propagating attacks, we want to generate:
    • Vulnerability signatures: What is vulnerable
    • Behavior signatures: What the worm needs to propagate
    • Attack signatures: Signatures which detect and block the attack
    • All signatures should be verified for effectiveness
  – For non-self-propagating attacks, as much of the above as possible
  – Based on providing a fertile ground for constrained propagation
• Receive data from multiple sources
  – Small distributed telescopes, Large telescopes
  – Spam, Crawling?
• For a RANDOM worm, with k addresses, V victims, and M systems infected:
  – $P_{\mathrm{detect}} = 1 - \left(\frac{V-k}{V}\right)^{M}$ after M machines infected
  – High probability of detection when M = V/k
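
The detection formula above can be evaluated directly. The following is a minimal sketch; the function name and the sample values of k and V are illustrative assumptions, not figures from the talk. It shows that at M = V/k the detection probability is roughly 1 - 1/e, about 63%, and that it climbs quickly as M grows past that point.

```python
def detection_probability(k: int, V: int, M: int) -> float:
    """Evaluate P_detect = 1 - ((V - k) / V) ** M for a uniformly random worm.

    k, V and M are the quantities named on the slide; the argument checks
    below are an added assumption about which inputs make sense.
    """
    if V <= 0 or not 0 <= k <= V or M < 0:
        raise ValueError("need V > 0, 0 <= k <= V and M >= 0")
    return 1.0 - ((V - k) / V) ** M


# Hypothetical sizes chosen only to illustrate the M = V/k rule of thumb.
K_EXAMPLE = 1_000
V_EXAMPLE = 1_000_000

if __name__ == "__main__":
    for multiple in (0.5, 1, 2, 4):
        M = int(multiple * V_EXAMPLE / K_EXAMPLE)
        p = detection_probability(K_EXAMPLE, V_EXAMPLE, M)
        print(f"M = {M:5d}  P_detect = {p:.3f}")
```
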

Page 5


ICSI's Honeyfarms

• Honeyfarm Safety
• ICSI's features:
  – Windows Centric
  – Hot Telescope
  – Replay
    • Replay-based filtering
  – Spam Telescope
• The Main ICSI Honeyfarm
• Other possibilities:
  – "Run this" Wormholes

Page 6


ICSI Focus: Windows

• Microsoft Windows is our primary (currently only) hosted OS
• This requirement dictates VM choice:
  – VMWare Workstation or ESX server
  – Workstation: prototyping
    • Limited scalability
    • Runs on everything
  – ESX Server: production
    • Stringent hardware requirements
    • Memory sharing for (some) scalability
      – Could be better
      – But can work across multiple close variants due to coalescing
• For now, NO host-OS specific customization
  – Dictates mechanism for demand allocation: NAT, instead of customization
  – Allows the possibility of non-virtual honeypots as well
    • ?Apple Systems?

Page 7


ICSI's Architecture

[Figure: architecture diagram. Recoverable labels: Attacker, Network Telescope, GRE Tunnel, Honeyfarm, Filtering, Mapping, Containment, Policing, Detection, VM Clusters, VManager]

Page 8


Note on Architecture

• Most components implemented in Click
  – Provides a modular, reusable framework
• Components in red we want to merge with UCSD
  – Need to better coordinate in this area
  – Relatively low overlap so far, but need

Page 9


Safety: A Common Focus Of Both UCSD and ICSI

• What if a worm propagates through the honeyfarm and then infects somebody else?
  – "But they would get infected anyway" doesn't cut it…
• Two safety features:
  – Containment: the basic decision making on what is allowed outbound
    • Connections back to the infecting host
    • Some "phone-home" channels may also be allowed
      – Much malcode grabs code from a third-party site
  – An independent policing module
    • Shuts down the honeyfarm once it detects any abnormal behavior on outbound connections
    • This is a safety belt; it should NEVER actually be invoked
• Want a third safety feature as well:
  – A monitoring system which observes the control-plane
  – Has the ability to turn off the honeyfarm by power-sequencing the network connections
• Many more details on policies in UCSD's talk
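
The split between containment and independent policing described above can be sketched as two small objects. This is only an illustration under stated assumptions: the class names, the per-honeypot "infector" table, and the explicit phone-home allowlist are hypothetical and do not describe the actual Click modules.

```python
from dataclasses import dataclass, field


@dataclass
class Containment:
    """Decides which outbound connections a honeypot may make (hypothetical model)."""
    infector_of: dict = field(default_factory=dict)  # honeypot IP -> infecting host IP
    phone_home: set = field(default_factory=set)     # explicitly allowed (ip, port) pairs

    def allows(self, honeypot: str, dst_ip: str, dst_port: int) -> bool:
        # Connections back to the host that infected this honeypot are allowed.
        if self.infector_of.get(honeypot) == dst_ip:
            return True
        # Selected phone-home channels may also be allowed.
        return (dst_ip, dst_port) in self.phone_home


@dataclass
class Policer:
    """Safety belt: watches traffic that actually left and trips on any violation."""
    policy: Containment
    tripped: bool = False

    def observe_outbound(self, honeypot: str, dst_ip: str, dst_port: int) -> None:
        # If containment worked, this branch is never taken.
        if not self.policy.allows(honeypot, dst_ip, dst_port):
            self.tripped = True  # a real system would shut the honeyfarm down here
```

Keeping the policer as a separate object that only observes mirrors the idea that it should never fire unless containment itself has failed.
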

Page 10


The Telescopes

• We have 4 /16s arranged as two (almost) contiguous /15s belonging to ESNet…
  – Network is directly advertised and routed by ESNet
• But we also have, on loan, a "special" /23 netblock
  – Also advertised and routed by ESNet
• Much malcode is NOT random:
  – Linear scanners starting from the local address:
    • Blaster and others
  – Local subnet preference
    • Nimda, etc.
• By selecting highly-likely addresses, we can gain an advantage in detection time
  – Local subnet preferences in particular have proven very effective

Page 11


How Hot Is The Hot Address Range?

[Figure: bar chart. Y-axis: Number of TCP Connections (0 to 160000). X-axis: the /23 and each of the four /16s. Bar value labels are fused in extraction: 801351394451433]

Page 12


Filtering

• But we can't allow all communication:
  – Honeypot allocation/deallocation is very expensive for us
    • VMWare doesn't support a lightweight clone
• We want to filter out known threats
  – But we still want to detect new attacks for existing vulnerabilities
    • We want to detect Welchia as well as Blaster:
      – New attacks may require new signatures
      – New variants may be substantially more disruptive
  – And we would like to avoid identification by attackers as a honeypot system
• Thus we need a low-cost mechanism to say whether an attack is worth forwarding to a real honeypot

Page 13


Basic Filtering

• Scan filtering
  – Allow traffic to the first N destinations from a source.
  – Intuition: Scans from a source are homogeneous
• Init-Data filtering
  – Detect known attacks by looking at the first data transfer from a source
  – Intuition: Many simple attacks (e.g., CodeRed, Blaster, Slammer) can be filtered.
  – Scheme: Acknowledge SYNs and any data packets that follow
    • University of Michigan scheme
• Is this enough?
  – Far too many active sources on the Internet
  – No, many attacks require complicated "conversations" before exposing their unique malicious intention
    • See Pang et al., "Characterizing Internet Background Radiation"
• Application-level responders are expensive in terms of development
  – Also, can't do "cut-through forwarding" if the attack deviates from the known script
• Our idea: replay-based filtering
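
The scan-filtering rule above (allow traffic to the first N destinations from a source) can be sketched in a few lines. The class name and the choice to keep forwarding to destinations already admitted are assumptions made for illustration; the slides do not say how repeat contacts are handled.

```python
from collections import defaultdict


class ScanFilter:
    """Forward traffic from each source only to its first N distinct destinations."""

    def __init__(self, n: int) -> None:
        self.n = n
        self._admitted = defaultdict(set)  # source -> destinations already admitted

    def allow(self, src: str, dst: str) -> bool:
        admitted = self._admitted[src]
        if dst in admitted:
            return True           # assumption: an admitted pair stays admitted
        if len(admitted) < self.n:
            admitted.add(dst)     # still within the first N destinations
            return True
        return False              # further scans from this source are filtered out
```

With N = 2, a source probing x, y, z reaches honeypots for x and y only, on the intuition that the remaining probes would look the same.
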

Page 14


Application Independent Replay

• Positively identifying a probe as coming from a known or unknown source requires a complex dialog
  – E.g., Windows SMB file transfer
• We can't build target-specific responders
  – Too many variants and new targets
• Can we use an existing dialog as a script for replaying an application session?
  – Take one or two instances of a dialog
    • E.g., a recorded attack by a particular worm against one of our honeypots
  – Recognize certain idioms:
    • Addresses, ports, and names encoded in the dialog
    • Ports which open for subsequent transfers
    • "Cookies" or session identifiers
    • Length fields
    • Prestated arguments
• Then use the current interaction as a guide
  – Update ports/addresses/subsequent connections as appropriate
  – Mimic back cookies and other changes
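
Two of the idioms listed above, embedded addresses and echoed cookies, can be illustrated with a toy rewrite step. This is a rough sketch under strong assumptions: the cookie is assumed to sit at a known byte range of the request and to be echoed verbatim in the response, and both function names are hypothetical. It does not attempt length-field fix-ups, which a real replayer would need whenever a substitution changes the message size.

```python
import socket
from typing import Tuple


def rewrite_addresses(message: bytes, recorded_ip: str, live_ip: str) -> bytes:
    """Swap the recorded peer address for the live one, in text and packed form."""
    out = message.replace(recorded_ip.encode(), live_ip.encode())
    # Packed 4-byte form; in binary payloads this can match by coincidence.
    return out.replace(socket.inet_aton(recorded_ip), socket.inet_aton(live_ip))


def replay_response(recorded_request: bytes, recorded_response: bytes,
                    live_request: bytes, cookie_span: Tuple[int, int]) -> bytes:
    """Return the recorded response with the live session cookie mimicked back."""
    start, end = cookie_span
    recorded_cookie = recorded_request[start:end]
    live_cookie = live_request[start:end]
    if not recorded_cookie:
        return recorded_response  # nothing to substitute
    return recorded_response.replace(recorded_cookie, live_cookie)
```

For example, replaying a recorded response `OK id=AAAA` against a live request carrying `id=ZZZZ` at the same offset yields `OK id=ZZZZ`.
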

Page 15


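Responder-Side Replay

[Figure: two message-sequence diagrams. Original Flow: Attacker and Victim, messages 1, 2, 3, 4, 5, ending in "Infected!". Replay Flow: Attacker and Filter, messages 1', 2', 3', 4', 5, ending in "Detected!"]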

Page 16


Replay Status

• This works for single dialogs
  – For both the initiator (client) and responder (server)
• Tested with:
  – NFS file manipulation
  – FTP file transfer
    • Including changing the filename argument for the client
  – CIFS/SMB file transfers
  – The Blaster worm
  – W32.Randex.D worm
    • Performs attack through open file shares
• Currently expanding to support multiple, simultaneous dialogs
  – Primarily for server-side replay to act as a radiation filter
  – Possibility: Recognize commands by where dialogs diverge?
• Also desire replay for:
  – "Toxicology Screen": For this attack, what can get infected
  – Testing network devices, evaluating servers, interacting with Internet servers for measurement purposes…

Page 17


Replay-Based Filters

• There are 1700 different application dialogs among 143224 connections to port 445/tcp
  – Connections to active honeypots
  – Used tethereal to generate a one-line summary for each data packet
  – Formulated each dialog in a canonical format
• Want to ignore anything in the "known" dialogs set, while allowing anything in the "unknown" set
• So use replay:
  – Replay as the server with the group of known dialogs
    • If replay successful, classify and ignore that source
  – If replay fails, begin replaying the new dialog against a honeypot as the client
    • Using the previous dialog as the starting script
    • Also, mark source as unknown and allow it to contact a live honeypot if seen again
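
The known/unknown decision above can be sketched as a small state machine over sources. This is a simplified illustration: it assumes each dialog has already been reduced to a canonical tuple of per-packet summaries and that "replay successful" means an exact match against a known dialog. The class, method names, and return strings are hypothetical.

```python
class ReplayFilter:
    """Classify sources by whether their dialog matches a known canonical dialog."""

    def __init__(self, known_dialogs) -> None:
        self.known = {tuple(d) for d in known_dialogs}
        self.ignored = set()   # sources whose dialog replayed successfully
        self.unknown = set()   # sources whose dialog diverged from every known one

    def on_new_connection(self, source: str) -> str:
        if source in self.unknown:
            return "honeypot"  # previously unknown: allow contact with a live honeypot
        if source in self.ignored:
            return "drop"      # already classified as a known dialog
        return "replay"        # first contact: answer by server-side replay

    def finish_dialog(self, source: str, client_messages) -> str:
        if tuple(client_messages) in self.known:
            self.ignored.add(source)
            return "ignore"
        self.unknown.add(source)
        # The failed dialog becomes the starting script for client-side replay.
        return "replay-against-honeypot"
```
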

Page 18


[Figure: message-sequence diagram. Recoverable labels: Attacker, Filter, VM; messages 1 to 5 shown twice; "Known?"; Responder-Side Replay; Initiator-Side Replay; "Infected!"]

Page 19


The Spam Telescope

• Half of the emails to @acme.com are sent to our email server
  – 100,000 messages per day
  – 6000 unique executables in 4 days
• We implemented a real time process to parse emails and retrieve attachments
  – Hash attachments to gain some statistics
• We plan to run attached executables on our honeypots to detect new email worms or multimode worms
  – Use email to penetrate the firewall, then exploit with local exploits
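
The parse-and-hash step above can be approximated with Python's standard `email` package. This is a sketch rather than the process actually deployed: the use of SHA-256, the function names, and the sample message are assumptions made for illustration.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_hashes(raw_message: bytes):
    """Return (filename, sha256 hex digest) for each attachment in a raw email."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    hashes = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:                      # multipart attachment: skipped here
            continue
        hashes.append((part.get_filename(), hashlib.sha256(payload).hexdigest()))
    return hashes


def sample_message() -> bytes:
    """Build a tiny test message with one fake executable attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"
    msg["To"] = "someone@example.org"
    msg["Subject"] = "test"
    msg.set_content("see attachment")
    msg.add_attachment(b"MZ\x90\x00", maintype="application",
                       subtype="octet-stream", filename="a.exe")
    return msg.as_bytes()
```

Counting distinct digests over a stream of messages would give the kind of "unique executables" statistic quoted above.
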

Page 20


The Main Honeyfarm

• Located at LBNL in ESNet's machine room
  – Designed around HP DL360 G4 1U, dual processor servers
• Currently:
  – 1 server as "head unit"
    • Previous head was a DL380, but suffered a catastrophic motherboard failure
  – 7 servers running ESX for honeypots
• Near term expansion (next couple of weeks)
  – Convert one ESX server into raw Linux for processing acme.com email
    • Attach 3 TB disk array for tertiary storage
  – Add 6 more 1U servers
  – Add a redundant switch
  – Increase the disk space on the existing servers
• Generous support from:
  – ESNet: Network connectivity and rackspace
  – Hewlett Packard: Equipment
  – Microsoft: OS and software licenses
  – VMWare: VMWare licenses

Page 21


Possibility: The "Run This" Wormholes

• We also want small, easy to use endpoints:
  – Distributed secrets
  – Endpoints in LANs
  – Nonblacklistable endpoints for crawlers
• Our plan is to create a "Run This" endpoint in Click
  – Creates a new MAC address derived from the host's MAC
    • Obtain DHCP lease
    • Open GRE tunnel to the specified honeyfarm
  – All traffic is forwarded through the tunnel
  – Outgoing traffic is strongly policed by the "Run This" module:
    • Limited fanout
    • No contacting local addresses
    • ?What to do about LAN broadcast packets?
• Goal is an easy to use and trustable endpoint
  – Which does not trust the honeyfarm.
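
The two outbound policing rules named above (limited fanout, no contacting local addresses) could look roughly like this at the endpoint. The class name, the single local prefix, and the fanout accounting are illustrative assumptions; the open question about LAN broadcast packets is deliberately left unhandled.

```python
import ipaddress


class OutboundPolicer:
    """Police traffic leaving the tunnel: block local addresses and cap fanout."""

    def __init__(self, local_net: str, max_fanout: int) -> None:
        self.local_net = ipaddress.ip_network(local_net)
        self.max_fanout = max_fanout
        self.destinations = set()  # distinct remote addresses contacted so far

    def allow(self, dst: str) -> bool:
        addr = ipaddress.ip_address(dst)
        if addr in self.local_net:
            return False           # never contact hosts on the endpoint's own LAN
        if addr in self.destinations:
            return True            # destination already counted against the fanout
        if len(self.destinations) >= self.max_fanout:
            return False           # fanout limit reached
        self.destinations.add(addr)
        return True
```

Because these checks run on the endpoint itself, they hold even if the honeyfarm at the other end of the tunnel misbehaves, which matches the goal of an endpoint that does not trust the honeyfarm.
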

