Bouncer: securing software by blocking bad input
Miguel Castro
Manuel Costa, Lidong Zhou, Lintao Zhang, and Marcus Peinado
Microsoft Research
Software is vulnerable
• bugs are vulnerabilities
• attackers can exploit vulnerabilities
  – to crash programs
  – to gain control over the execution
• vulnerabilities are routinely exploited
• we keep finding new vulnerabilities
How we secure software today
• static analysis to remove vulnerabilities
• checks to prevent exploits
  – type-safe languages
  – unsafe languages with instrumentation
• but what do you do when a check fails?
  – (usually) no code to recover from failure
  – restarting may be the only option
• bad because programs are left vulnerable to
  – loss of data and low-cost denial of service attacks
Blocking bad input
• main idea: we know the bug and have an exploit for it, but we don't have a patch yet
• Bouncer filters check input when it is received
• filters block bad input before it is processed
  – example: drop TCP connection with bad message
  – input is bad if it can exploit a vulnerability
• most programs can deal with input errors
• programs keep working under attack
  – correctly because filters have no false positives
  – efficiently because filters have low overhead
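The idea of blocking bad input at the boundary can be sketched as a guard that runs before the program's normal handler. A minimal C sketch; `bouncer_accept`, `filter_matches`, `process_msg`, and the toy blocking condition are hypothetical stand-ins for a generated Bouncer filter:

```c
#include <stdbool.h>
#include <stddef.h>

static int processed = 0;

/* hypothetical filter: encodes the exploit conditions for one
   vulnerability (here a toy condition on the first byte) */
static bool filter_matches(const unsigned char *msg, size_t len) {
    return len > 0 && msg[0] == 0xFF;
}

/* the program's normal message handler */
static void process_msg(const unsigned char *msg, size_t len) {
    (void)msg; (void)len;
    processed++;
}

/* returns true if the message was processed, false if dropped */
bool bouncer_accept(const unsigned char *msg, size_t len) {
    if (filter_matches(msg, len))
        return false;          /* e.g. drop the TCP connection */
    process_msg(msg, len);     /* the program only sees filtered input */
    return true;
}
```

Because the filter rejects input before processing starts, the program needs no recovery code: a dropped message looks like an ordinary input error.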
Outline
• architecture
• symbolic execution
• symbolic summaries
• precondition slicing
• evaluation
Bouncer architecture
[diagram: attacks reach a program instrumented to detect attacks & log inputs, yielding a sample exploit; a program instrumented to detect attacks & generate trace produces a trace; filter conditions are generated for the sample; generation of alternative exploits feeds new exploits back as samples; the sample conditions are combined into the filter]
Example
• vulnerable code:

char buffer[1024];
char p0 = 'A';
char p1 = 0;
if (msg[0] > 0) p0 = msg[0];
if (msg[1] > 0) p1 = msg[1];
if (msg[2] == 0x1) {
    sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0);
    StartServer(buffer, p1);
}

• sample exploit: 1 1 1 97 97 97 97 … 0
Symbolic execution
• analyze trace to compute path conditions
  – execution with any input that satisfies the path conditions follows the path in the trace
  – inputs that satisfy the path conditions are exploits
• execution follows same path as with sample exploit
• use path conditions as initial filter
  – no false positives: only block potential exploits
Computing the path conditions
• start with symbolic values for input bytes: b0,…
• perform symbolic execution along the trace
• keep symbolic state for memory and registers
• add conditions on symbolic state for:
  – branches: ensure same outcome
  – indirect control transfers: ensure same target
  – load/store to symbolic address: ensure same target
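The example slides that follow can be mimicked with a toy symbolic evaluator over the first three instructions of the trace. A hedged C sketch, assuming the slides' s-expression notation; the string representation is illustrative, not Bouncer's internal format:

```c
#include <stdio.h>
#include <string.h>

/* symbolic state for the registers touched by the trace prefix */
char eax[64], eflags[64], path_cond[128];

void sym_exec_prefix(void) {
    /* movsx eax, [msg]: eax now holds the symbolic value (movsx b0) */
    strcpy(eax, "(movsx b0)");
    /* cmp eax, 0: eflags now depend on the comparison */
    snprintf(eflags, sizeof eflags, "(cmp %s 0)", eax);
    /* jle not taken in the trace: add the condition that jg holds,
       i.e. b0 > 0, to ensure the same branch outcome */
    snprintf(path_cond, sizeof path_cond, "(jg %s)", eflags);
}
```

Running `sym_exec_prefix` produces exactly the condition shown on the next slide: `(jg (cmp (movsx b0) 0))`, i.e. b0 > 0.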
Example symbolic execution
mov eax, msg
movsx eax, [eax]
cmp eax, 0
jle L
mov p0, al
L: mov eax, msg
movsx eax, [eax+1]
cmp eax, 0
jle M
mov p1, al
M:
symbolic state:
  *msg: b0, b1, …
  eax: (movsx b0)
  eflags: (cmp (movsx b0) 0)
  p0: b0
path conditions:
  (jg (cmp (movsx b0) 0))   [b0 > 0]
Example symbolic execution
mov eax, msg
movsx eax, [eax]
cmp eax, 0
jle L
mov p0, al
L: mov eax, msg
movsx eax, [eax+1]
cmp eax, 0
jle M
mov p1, al
M:
symbolic state:
  *msg: b0, b1, …
  eax: (movsx b1)
  eflags: (cmp (movsx b0) 0)
  p0: b0
path conditions:
  (jg (cmp (movsx b0) 0))   [b0 > 0]
Example symbolic execution
mov eax, msg
movsx eax, [eax]
cmp eax, 0
jle L
mov p0, al
L: mov eax, msg
movsx eax, [eax+1]
cmp eax, 0
jle M
mov p1, al
M:
symbolic state:
  *msg: b0, b1, …
  eax: (movsx b1)
  eflags: (cmp (movsx b1) 0)
  p0: b0
path conditions:
  (jg (cmp (movsx b0) 0))   [b0 > 0]
  and (jg (cmp (movsx b1) 0))   [b1 > 0]
Properties of path conditions
• path conditions can filter with no false positives:
  b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ b1503 = 0 ∧ bi ≠ 0 for all 2 < i < 1503
• they catch many exploit variants [Vigilante]
• but they usually have false negatives:
  – fail to block exploits that follow a different path
  – example: they will not block exploits with b0 ≤ 0
– attacker can craft exploits that are not blocked
• we generalize filters to block more attacks
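The path conditions above translate directly into an executable filter. A C sketch of this initial filter (`initial_filter_blocks` is a hypothetical name), treating message bytes as signed to match the example's `movsx`; note how a variant with b0 ≤ 0 slips past it, which is exactly the false-negative problem on this slide:

```c
#include <stdbool.h>
#include <stddef.h>

/* initial filter from the path conditions:
   b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ b1503 = 0 ∧ bi ≠ 0 for all 2 < i < 1503 */
bool initial_filter_blocks(const signed char *b, size_t len) {
    if (len < 1504)
        return false;                  /* too short to satisfy the conditions */
    if (!(b[0] > 0 && b[1] > 0 && b[2] == 1 && b[1503] == 0))
        return false;
    for (size_t i = 3; i < 1503; i++)  /* bi != 0 for all 2 < i < 1503 */
        if (b[i] == 0)
            return false;
    return true;                       /* input satisfies all path conditions */
}
```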
Symbolic summaries
• symbolic execution in library functions
  – adds many conditions
  – little information to guide analysis to remove them
• symbolic summaries
  – use knowledge of library function semantics
  – replace conditions added during library calls
  – are generated automatically from a template
• template is written once for each function
• summary is computed by analyzing the trace
Example symbolic summary
• the vulnerability is in the call
  – sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0)
• the symbolic summary is
  – bi ≠ 0 for all 2 < i < 1016
  – it constrains the formatted string to fit in buffer
  – it is expressed as a condition on the input
• summary is computed by
  – using concrete and symbolic argument values
  – traversing trace backwards to find size of buffer
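The bound 1016 can be reproduced from sprintf's semantics and the 1024-byte buffer. A hedged arithmetic sketch (`summary_blocks` is a hypothetical name): besides the %s argument, sprintf writes `\servers\` (9 chars), a `\` (1), the %c character (1), and a NUL (1), so the string at msg+3 overflows as soon as its first 1013 bytes are all nonzero:

```c
#include <stddef.h>

/* buffer is 1024 bytes; fixed output besides %s is 9 + 1 + 1 + 1 bytes */
enum { BUF = 1024, OVERHEAD = 9 + 1 + 1 + 1 };

/* the formatted string overflows buffer iff strlen(msg+3) > BUF - OVERHEAD,
   i.e. iff bytes b3 .. b1015 are all nonzero -- the slide's condition
   bi != 0 for all 2 < i < 1016 */
int summary_blocks(const char *msg, size_t len) {
    const size_t limit = 3 + (BUF - OVERHEAD) + 1;   /* = 1016 */
    if (len < limit)
        return 0;
    for (size_t i = 3; i < limit; i++)
        if (msg[i] == 0)
            return 0;   /* a NUL terminator keeps the string inside buffer */
    return 1;           /* formatted string would overflow: block this input */
}
```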
Pre-condition slicing
• analyze code and trace to compute a path slice:
  – slice is a subsequence of trace instructions whose execution is sufficient to exploit the vulnerability
• generalize filter using the slice
  – keep conditions added by instructions in slice
  – discard the other conditions
  – reduces false negative rate
  – does not introduce false positives
Computing the slice
• add instruction with the vulnerability to slice
• traverse trace backwards
• track dependencies for instructions in slice
• add instructions to slice when
  – branches:
    • path from branch may not visit last slice instruction
    • path to last slice instruction can change dependencies
  – other instructions: can change dependencies
• combination of static and dynamic analysis
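The backward pass can be sketched as a dependency walk over the trace. A simplified C illustration that ignores the branch rules and the static analysis; instructions are modeled as (def, uses) pairs over named locations, and all names (`struct insn`, `path_slice`) are hypothetical:

```c
#include <stdbool.h>
#include <string.h>

#define MAXDEPS 16

/* each trace entry: the location it defines and up to three it reads */
struct insn { const char *def; const char *uses[3]; };

static bool tracked(const char *deps[], int n, const char *loc) {
    for (int i = 0; i < n; i++)
        if (loc && strcmp(deps[i], loc) == 0)
            return true;
    return false;
}

/* walk the trace backwards from the vulnerability; keep an instruction
   iff it defines a tracked dependency, then track what it uses.
   returns slice size; kept trace indices go to out[] (newest first) */
int path_slice(const struct insn *trace, int n,
               const char *deps[], int *ndeps, int out[]) {
    int k = 0;
    for (int i = n - 1; i >= 0; i--) {
        if (!tracked(deps, *ndeps, trace[i].def))
            continue;                  /* cannot change the dependencies */
        out[k++] = i;
        for (int u = 0; u < 3 && trace[i].uses[u]; u++)
            if (!tracked(deps, *ndeps, trace[i].uses[u]) && *ndeps < MAXDEPS)
                deps[(*ndeps)++] = trace[i].uses[u];
    }
    return k;
}
```

On the tail of the slide's trace (instructions b and c, which compute ecx = msg + 3 for the sprintf call) the walk keeps both instructions and ends up tracking msg itself.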
Example
char buffer[1024];
char p0 = 'A';
char p1 = 0;
if (msg[0] > 0) p0 = msg[0];
if (msg[1] > 0) p1 = msg[1];
if (msg[2] == 0x1) {
    sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0);
    StartServer(buffer, p1);
}
Slicing example
1  mov eax, msg
2  movsx eax, [eax+1]
3  cmp eax, 0
4  jle 6
5  mov p1, al
6  mov eax, msg
7  movsx eax, [eax+2]
8  cmp eax, 1
9  jne N
a  movsx eax, p0
b  mov ecx, msg
c  add ecx, 3
d  push eax   # call sprintf
e  push ecx

dependencies: msg[3], msg[4], msg[5], …, ecx, msg, eflags, eax, msg[2]
slice: …, e, d, c, b, 9, 8, 7, 6
Example filter after each phase
• after symbolic execution:
  b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ b1503 = 0 ∧ bi ≠ 0 for all 2 < i < 1503
• after symbolic summary:
  b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ bi ≠ 0 for all 2 < i < 1016
• after slicing:
  b2 = 1 ∧ bi ≠ 0 for all 2 < i < 1016
• the last filter is optimal
Deployment scenarios
• distributed scenario
  – instrument production code to detect exploits
  – run Bouncer locally on each exploit we detect
  – deploy improved filter after processing an exploit
• centralized scenario
  – software vendor runs cluster to compute filters
  – vendor receives sample exploits from customers
  – run Bouncer iterations in parallel in the cluster
Evaluation
• implemented Bouncer prototype
  – detected memory corruption attacks with DFI
  – generated traces with Nirvana
  – used Phoenix to implement slicing
• evaluated Bouncer with four real vulnerabilities
  – SQL server, ghttpd, nullhttpd, and stunnel
  – started from sample exploit described in literature
  – ran iterations with search for alternative exploits
  – single machine; max experiment duration: 24h
Filter accuracy
service      false positives   false negatives
SQL server   no                no
ghttpd       no                yes
nullhttpd    no                yes
stunnel      no                no

• Bouncer filters have no false positives
• perfect filters for two vulnerabilities
Conditions after each phase
[chart]

Filter generation time
[chart]
Throughput with filters
[chart: 50 Mbits/sec]
Conclusion
• filters block bad input before it is processed
• Bouncer filters have
  – low overhead
  – no false positives
  – no false negatives for some vulnerabilities
• programs keep running under attack
• a lot left to do