The ICSI Honeyfarm Cui, Paxson, Weaver
ICSI Honeyfarm Status
Weidong Cui
Vern Paxson
Nicholas Weaver
General Concept: A Breeding Ground for Worms
• We want a controlled, automatic breeding ground for worms and other self-propagating attacks:
  – Worm attacks a "monitored" address and begins to propagate in our system
• As the worm propagates, we have a suite of automatic analyzers to study the worm
  – What can it infect?
  – Any particulars of interest?
  – How does it attack?
• And automatically analyze defense strategies
  – Does this signature block the worm?
• All within a very short time: a few seconds
  – And with a single point of trust for exporting information
• Also want to leverage the infrastructure for detecting other things:
  – Human attackers/non-self-propagating attacks
  – Non-random worms
Honeyfarm: Objectives
• We use network telescopes and a honeyfarm to detect scanning worms
  – Network Telescopes
    • Distributed unallocated IP address ranges
  – Honeyfarm:
    • Centralized cluster of honeypots
    • On-demand: emulating a large number of hosts on a small number of honeypots
• Detecting self-propagation
  – Detect self-propagation inside the honeyfarm by redirecting propagation attempts from one honeypot to other honeypots
• Other detectors possible:
  – Tripwire/modification detectors
  – Monitored honeypots, etc…
[Figure: diagram with the labels Global Internet, Controller, Honeypots, and Honeyfarm]
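The redirection idea above can be sketched as a small bookkeeping structure. This is only an illustration under assumptions: the class name `PropagationTracker`, the integer honeypot IDs, and the two-hop threshold are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PropagationTracker:
    """Counts honeypot-to-honeypot hops for each infection chain."""
    threshold: int = 2                              # assumed: hops needed to flag self-propagation
    generation: dict = field(default_factory=dict)  # honeypot id -> hops from the outside attacker
    next_id: int = 0

    def _allocate(self, gen: int) -> int:
        hp = self.next_id
        self.next_id += 1
        self.generation[hp] = gen
        return hp

    def inbound_from_internet(self) -> int:
        """External traffic hit a monitored address: allocate a generation-0 honeypot."""
        return self._allocate(0)

    def outbound_from(self, honeypot: int) -> tuple[int, bool]:
        """Redirect an outbound attempt to a fresh honeypot.

        Returns the new honeypot id and whether the chain is now long
        enough to be treated as self-propagation.
        """
        gen = self.generation[honeypot] + 1
        target = self._allocate(gen)
        return target, gen >= self.threshold


tracker = PropagationTracker()
first = tracker.inbound_from_internet()
second, flagged_1 = tracker.outbound_from(first)   # first hop: not yet conclusive
third, flagged_2 = tracker.outbound_from(second)   # the redirected honeypot attacks in turn
```

With the assumed threshold of two, the chain is flagged only once a honeypot that was itself reached by redirection starts attacking another honeypot.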
The Overall Goal
• Framework for automatically detecting and analyzing new worms and other attacks
  – For self-propagating attacks, we want to generate:
    • Vulnerability signatures: What is vulnerable
    • Behavior signatures: What the worm needs to propagate
    • Attack signatures: Signatures which detect and block the attack
    • All signatures should be verified for effectiveness
  – For non-self-propagating attacks, as much of the above as possible
  – Based on providing a fertile ground for constrained propagation
• Receive data from multiple sources
  – Small distributed telescopes, large telescopes
  – Spam, Crawling?
• For a RANDOM worm, with k addresses, V victims, and M systems infected:
  – $P_{\text{detect}} = 1 - \left(\frac{V-k}{V}\right)^{M}$ after M machines infected
  – High probability of detection when M = V/k
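A quick numeric check of the detection formula, using the slide's symbols k, V, and M. The concrete values below are made up purely for illustration. At M = V/k the probability is close to 1 − 1/e, roughly 0.63, and it keeps rising as more machines are infected.

```python
import math


def p_detect(k: int, V: int, M: int) -> float:
    """Detection probability after M infections: 1 - ((V - k) / V) ** M."""
    return 1.0 - ((V - k) / V) ** M


# Hypothetical numbers, chosen only to show the shape of the curve.
V = 1_000_000
k = 1_000
for M in (V // (10 * k), V // k, 3 * V // k):
    print(f"M = {M:5d}  P_detect = {p_detect(k, V, M):.3f}")
```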
ICSI's Honeyfarms
• Honeyfarm Safety
• ICSI's features:
  – Windows Centric
  – Hot Telescope
  – Replay
    • Replay-based filtering
  – Spam Telescope
• The Main ICSI Honeyfarm
• Other possibilities:
  – "Run this" Wormholes
ICSI Focus: Windows
• Microsoft Windows is our primary (currently only) hosted OS
• This requirement dictates VM choice:
  – VMWare Workstation or ESX Server
  – Workstation: prototyping
    • Limited scalability
    • Runs on everything
  – ESX Server: production
    • Stringent hardware requirements
    • Memory sharing for (some) scalability
      – Could be better
      – But can work across multiple close variants due to coalescing
• For now, NO host-OS specific customization
  – Dictates mechanism for demand allocation: NAT, instead of customization
  – Allows the possibility of non-virtual honeypots as well
    • ?Apple Systems?
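NAT-based demand allocation can be pictured as a table that binds a probed telescope address to whichever honeypot is free, so the guest OS itself never needs reconfiguring. The sketch below is an assumption-laden illustration: the class `NatAllocator` and its methods are hypothetical and are not the honeyfarm's actual Click elements.

```python
class NatAllocator:
    """Maps monitored (telescope) addresses onto a small pool of honeypots."""

    def __init__(self, honeypot_addrs):
        self.free = list(honeypot_addrs)   # honeypots not currently bound
        self.bound = {}                    # telescope address -> honeypot address

    def lookup(self, telescope_addr):
        """Return the honeypot serving this address, binding one on demand."""
        if telescope_addr in self.bound:
            return self.bound[telescope_addr]
        if not self.free:
            return None                    # pool exhausted: caller must drop or filter
        honeypot = self.free.pop()
        self.bound[telescope_addr] = honeypot
        return honeypot

    def release(self, telescope_addr):
        """Unbind an address so its honeypot can be reused."""
        honeypot = self.bound.pop(telescope_addr, None)
        if honeypot is not None:
            self.free.append(honeypot)
```

Because the rewriting happens outside the guest, the same table could in principle point at a non-virtual honeypot as well.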
ICSI's Architecture
[Figure: architecture diagram with the labels Attacker, Network Telescope, GRE Tunnel, Honeyfarm, Filtering, Mapping, Containment, Policing, Detection, VM Clusters, and VManager]
Note on Architecture
• Most components implemented in Click
  – Provides a modular, reusable framework
• Components in red we want to merge with UCSD
  – Need to better coordinate in this area
  – Relatively low overlap so far, but need
Safety: A Common Focus of Both UCSD and ICSI
• What if a worm propagates through the honeyfarm and then infects somebody else?
  – "But they would get infected anyway" doesn't cut it…
• Two safety features:
  – Containment: the basic decision making on what is allowed outbound
    • Connections back to the infecting host
    • Some "phone-home" channels may also be allowed
      – Much malcode/attacks grab code from a third-party site
  – An independent policing module
    • Shuts down the honeyfarm once it detects any abnormal behavior on outbound connections
    • This is a safety belt; it should NEVER actually be invoked
• Want a third safety feature as well:
  – A monitoring system which observes the control plane
  – Has the ability to turn off the honeyfarm by power-sequencing the network connections
• Many more details on policies in UCSD's talk
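The split between containment and policing can be sketched as two independent checks. Everything named here (`Containment`, `Policer`, the phone-home allowlist) is hypothetical and stands in for whatever policy the real modules apply. The point is that the policer keeps its own view of what is acceptable and trips a shutdown if containment ever lets something unexpected through.

```python
class Containment:
    """Decides which outbound connections a honeypot may make."""

    def __init__(self, phone_home=()):
        self.infector = {}                  # honeypot -> host that attacked it
        self.phone_home = set(phone_home)   # assumed allowlist of third-party sites

    def record_infection(self, honeypot, attacker):
        self.infector[honeypot] = attacker

    def allows(self, honeypot, dst):
        return dst == self.infector.get(honeypot) or dst in self.phone_home


class Policer:
    """Independent safety belt that watches traffic actually leaving."""

    def __init__(self, expected_destinations):
        self.expected = set(expected_destinations)
        self.shutdown = False

    def observe_egress(self, dst):
        """Return True while traffic may keep flowing; latch shut on any surprise."""
        if dst not in self.expected:
            self.shutdown = True
        return not self.shutdown
```

Once latched, the policer in this sketch stays shut until someone resets it by hand, which matches the intent that it should never fire in normal operation.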
The Telescopes
• We have 4 /16s arranged as two (almost) contiguous /15s belonging to ESNet…
  – Network is directly advertised and routed by ESNet
• But we also have, on loan, a "special" /23 netblock
  – Also advertised and routed by ESNet
• Much malcode is NOT random:
  – Linear scanners starting from the local address:
    • Blaster and others
  – Local subnet preference
    • Nimda, etc.
• By selecting highly-likely addresses, we can gain an advantage in detection time
  – Local subnet preferences in particular have proven very effective
How Hot Is The Hot Address Range?
[Figure: bar chart of the number of TCP connections (y-axis, 0 to 160,000) for the /23 and each of the four /16s; legend: ports 80, 135, 139, 445, 1433]
Filtering
• But we can't allow all communication:
  – Honeypot allocation/deallocation is very expensive for us
    • VMWare doesn't support a lightweight clone
• We want to filter out known threats
  – But we still want to detect new attacks for existing vulnerabilities
    • We want to detect Welchia as well as Blaster:
      – New attacks may require new signatures
      – New variants may be substantially more disruptive
  – And we would like to avoid identification by attackers as a honeypot system
• Thus we need a low-cost mechanism to say whether an attack is worth forwarding to a real honeypot
Basic Filtering
• Scan filtering
  – Allow traffic to the first N destinations from a source.
  – Intuition: Scans from a source are homogeneous
• Init-Data filtering
  – Detect known attacks by looking at the first data transfer from a source
  – Intuition: Many simple attacks (e.g., CodeRed, Blaster, Slammer) can be filtered.
  – Scheme: Acknowledge SYNs and any data packets following them
    • University of Michigan scheme
• Is this enough?
  – Far too many active sources on the Internet
  – No, many attacks require complicated "conversations" before exposing their unique malicious intention
    • See Pang et al "Characterizing Internet Background Radiation"
• Application-level responders are expensive in terms of development
  – Also, can't do "cut-through forwarding" if the attack deviates from the known script
• Our idea: replay-based filtering
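Scan filtering as described above amounts to a per-source counter of distinct destinations. A minimal sketch, assuming addresses are plain strings and that state is never expired (a real filter would need timeouts). The class name `ScanFilter` is hypothetical.

```python
from collections import defaultdict


class ScanFilter:
    """Forward traffic only to the first N distinct destinations per source."""

    def __init__(self, n: int):
        self.n = n
        self.seen = defaultdict(set)   # source -> destinations already admitted

    def allow(self, src: str, dst: str) -> bool:
        destinations = self.seen[src]
        if dst in destinations:
            return True                # an already-admitted conversation continues
        if len(destinations) < self.n:
            destinations.add(dst)
            return True
        return False                   # the rest of the scan is assumed to look the same


scan_filter = ScanFilter(n=2)
decisions = [scan_filter.allow("198.51.100.7", f"192.0.2.{i}") for i in range(1, 5)]
```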
Application-Independent Replay
• Positively identifying a probe as being from a known or unknown source requires a complex dialog
  – E.g., Windows SMB file transfer
• We can't build target-specific responders
  – Too many variants and new targets
• Can we use an existing dialog as a script for replaying an application session?
  – Take one or two instances of a dialog
    • E.g., a recorded attack by a particular worm against one of our honeypots
  – Recognize certain idioms:
    • Addresses, ports, and names encoded in the dialog
    • Ports which open for subsequent transfers
    • "Cookies" or session identifiers
    • Length fields
    • Prestated arguments
• Then use the current interaction as a guide
  – Update ports/addresses/subsequent connections as appropriate
  – Mimic back cookies and other changes
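The "update and mimic back" step can be illustrated with a toy substitution pass over one recorded message. This is a sketch under strong assumptions: the dynamic fields are taken to be already identified as byte strings, the function `adapt` and the field names are hypothetical, and length fields (which would need recomputing after a substitution) are not handled.

```python
import re


def adapt(recorded_msg: bytes, recorded_fields: dict, live_fields: dict) -> bytes:
    """Rewrite a recorded message so its dynamic fields carry live values.

    recorded_fields: field name -> bytes seen in the recorded dialog
    live_fields:     field name -> bytes observed in the current session
    """
    mapping = {
        old: live_fields[name]
        for name, old in recorded_fields.items()
        if old and name in live_fields
    }
    if not mapping:
        return recorded_msg
    # One pass over the message, longest match first, so a substituted
    # value is never rewritten a second time.
    alternatives = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(b"|".join(re.escape(old) for old in alternatives))
    return pattern.sub(lambda m: mapping[m.group(0)], recorded_msg)


recorded = b"SESSION=abc123;HOST=10.0.0.5;PORT=4444"
replayed = adapt(
    recorded,
    {"cookie": b"abc123", "addr": b"10.0.0.5", "port": b"4444"},
    {"cookie": b"zz9", "addr": b"192.0.2.7"},
)
```

Fields with no live counterpart, such as the port in this example, are left exactly as recorded.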
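Responder-Side Replay
[Figure: two message timelines. Original flow: Attacker and Victim exchange messages 1 to 5, ending in "Infected!". Replay flow: Attacker and Filter exchange messages 1', 2', 3', 4', and 5, ending in "Detected!"]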
Replay Status
• This works for single dialogs
  – For both the initiator (client) and responder (server)
• Tested with:
  – NFS file manipulation
  – FTP file transfer
    • Including changing the filename argument for the client
  – CIFS/SMB file transfers
  – The Blaster worm
  – W32.Randex.D worm
    • Performs attack through open file shares
• Currently expanding to support multiple, simultaneous dialogs
  – Primarily for server-side replay to act as a radiation filter
  – Possibility: Recognize commands by where dialogs diverge?
• Also desire replay for:
  – "Toxicology Screen": For this attack, what can get infected
  – Testing network devices, evaluating servers, interacting with Internet servers for measurement purposes…
Replay-Based Filters
• There are 1700 different application dialogs among 143224 connections to port 445/tcp
  – Connections to active honeypots
  – Used tethereal to generate a one-line summary for each data packet
  – Formulated each dialog in a canonical format
• Want to ignore anything in the "known" dialogs set, while allowing anything in the "unknown" set
• So use replay:
  – Replay as the server with the group of known dialogs
    • If replay successful, classify and ignore that source
  – If replay fails, begin replaying the new dialog against a honeypot as the client
    • Using the previous dialog as the starting script
    • Also, mark source as unknown and allow it to contact a live honeypot if seen again
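The known/unknown decision can be sketched as matching an incoming dialog, message by message, against the set of known dialogs. This is a simplification under assumptions: each message is reduced to a canonical one-line summary string (the summaries below are invented), and matching is exact, whereas real replay would also adapt dynamic fields.

```python
def classify(incoming, known_dialogs):
    """Return "known" if the incoming dialog follows some known dialog to the end.

    incoming:      sequence of canonical message summaries from the source
    known_dialogs: iterable of sequences in the same canonical form
    """
    candidates = list(known_dialogs)
    for i, msg in enumerate(incoming):
        # Keep only the known dialogs that still agree at position i.
        candidates = [d for d in candidates if i < len(d) and d[i] == msg]
        if not candidates:
            return "unknown"           # diverged: hand off to a real honeypot
    if any(len(d) == len(incoming) for d in candidates):
        return "known"                 # replay ran to completion: ignore the source
    return "unknown"


# Hypothetical canonical summaries for two known port-445 dialogs.
known = [
    ("NEGOTIATE", "SESSION_SETUP", "TREE_CONNECT"),
    ("NEGOTIATE", "SESSION_SETUP", "WRITE"),
]
verdict = classify(("NEGOTIATE", "SESSION_SETUP", "NEW_REQUEST"), known)
```

The position at which the candidate list empties is also where the dialogs diverge, which is the point a client-side replay against a honeypot would have to pick up from.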
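[Figure: message timelines for Attacker, Filter, and VM. Messages 1 to 5 between Attacker and Filter are labeled Responder-Side Replay and lead to a "Known?" decision; messages 1 to 5 between Filter and VM are labeled Initiator-Side Replay and end in "Infected!"]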
The Spam Telescope
• Half of the emails to @acme.com are sent to our email server
  – 100,000 messages per day
  – 6000 unique executables in 4 days
• We implemented a real-time process to parse emails and retrieve attachments
  – Hash attachments to gain some statistics
• We plan to run attached executables on our honeypots to detect new email worms or multimode worms
  – Use email to penetrate the firewall, then exploit with local exploits
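Parsing messages and hashing their attachments can be done with Python's standard library alone. The sketch below is not the deployed pipeline: SHA-256 is an assumed choice of hash, and the helper names are hypothetical.

```python
import hashlib
from collections import Counter
from email import message_from_bytes


def attachment_hashes(raw_email: bytes) -> list[str]:
    """Return the SHA-256 hex digest of every attachment in one raw message."""
    msg = message_from_bytes(raw_email)
    digests = []
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            payload = part.get_payload(decode=True)   # undoes base64 / quoted-printable
            if payload:
                digests.append(hashlib.sha256(payload).hexdigest())
    return digests


def count_attachments(raw_emails) -> Counter:
    """Tally how often each distinct attachment appears across many messages."""
    counts = Counter()
    for raw in raw_emails:
        counts.update(attachment_hashes(raw))
    return counts
```

The number of keys in the resulting counter is the number of unique attachments, and the counts show which ones are being mass-mailed.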
The Main Honeyfarm
• Located at LBNL in ESNet's machine room
  – Designed around HP DL360 G4 1U, dual-processor servers
• Currently:
  – 1 server as "head unit"
    • Previous head was a DL380, but suffered a catastrophic motherboard failure
  – 7 servers running ESX for honeypots
• Near-term expansion (next couple of weeks)
  – Convert one ESX server into raw Linux for processing acme.com email
    • Attach 3 TB disk array for tertiary storage
  – Add 6 more 1U servers
  – Add a redundant switch
  – Increase the disk space on the existing servers
• Generous support from:
  – ESNet: Network connectivity and rackspace
  – Hewlett Packard: Equipment
  – Microsoft: OS and software licenses
  – VMWare: VMWare licenses
Possibility: The "Run This" Wormholes
• We also want small, easy-to-use endpoints:
  – Distributed secrets
  – Endpoints in LANs
  – Nonblacklistable endpoints for crawlers
• Our plan is to create a "Run This" endpoint in Click
  – Creates a new MAC address derived from the host's MAC
    • Obtain DHCP lease
    • Open GRE tunnel to the specified honeyfarm
  – All traffic is forwarded through the tunnel
  – Outgoing traffic is strongly policed by the "Run This" module:
    • Limited fanout
    • No contacting local addresses
    • ?What to do about LAN broadcast packets?
• Goal is an easy-to-use and trustable endpoint
  – Which does not trust the honeyfarm.
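The slides do not say how the new MAC address is derived from the host's. One plausible approach, shown here purely as an assumption, is to hash the host MAC and then force the result to be a locally administered unicast address, so it cannot collide with a vendor-assigned one. The function name and the salt are hypothetical.

```python
import hashlib


def derive_mac(host_mac: str, salt: bytes = b"run-this") -> str:
    """Derive a stable, locally administered unicast MAC from the host's MAC."""
    raw = bytes.fromhex(host_mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("expected a 48-bit MAC address")
    derived = bytearray(hashlib.sha256(salt + raw).digest()[:6])
    # First octet: set the locally administered bit (0x02),
    # clear the multicast bit (0x01).
    derived[0] = (derived[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in derived)


endpoint_mac = derive_mac("00:11:22:33:44:55")
```

The derivation is deterministic, so the endpoint would present the same address on every start, which may help it keep a stable DHCP lease.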