ROSAEC Clinic Internals
Jaeho Shin*
ROPAS Show & Tell, 2011-01-14
* Joint work with Sungkeun Cho, Kihong Heo, Jisoo Jung, Jinyoung Kim, Seungjoon Lee, Woosuk Lee, and Hakjoo Oh
Contents
1. Tour
2. Design and Implementation
3. Preprocessing Issues & Ideas
Tour
ROSAEC Clinic Service
From User’s Point of View
ROSAEC Clinic Dashboard
From Our Point of View
ROSAEC Clinic Service
From User’s Point of View
Goal of ROSAEC Clinic
• To let the public experience static analysis technology
• To collect samples of erroneous code for better research
Design and Implementation
Overview
• Web Interface (public): lighttpd, php
• Clinic Engine (internal): frontend, notification, analyzer (Sparrow), manager, reporter, classifier
• Workspace: holds the Cases
• Built with: coreutils, bash, sandbox, xsltproc, php
Structure for a Case
• Case: a unique directory for each analysis
• Subdirectories: input, build, analysis, alarms, report, log
• Files shown in the diagram: *.started, *.finished, *.output, alarms.*, .target, target, ...
Structure for Organizing Cases
• Workspace
• run: new, preprocessing, analyzing, ..., classifying
• archive: 2010, 2011, ...
• pool: Case, Case, Case, ...
• log
• etc
State Transition
• States: new, confirmed, received, approving, preprocessing, analyzing, analyzed, classifying, reporting, reported, finished, failed, failed-notified
• Actions on the way to finished: owner submits code (new), notify owner, owner confirms, notify admin, admin approves, run preprocessors, run analyzers, notify admin, admin classifies alarms, admin finishes classification, generate report, notify owner, record in archive
• Whenever something goes wrong: failed, notify admin, failed-notified
State Transition Table
engine/manager/clinic-process
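A minimal sketch of how such a transition table could be written down. The pairing of states, actions, and next states is an assumption read off the labels in the diagram; it is not taken from engine/manager/clinic-process, and the names `TRANSITIONS`, `step`, and `run` are hypothetical.

```python
# Hypothetical table: state -> (action to perform, next state).
# The pairing is an assumption based on the diagram labels.
TRANSITIONS = {
    "new": ("notify owner", "received"),
    "received": ("owner confirms", "confirmed"),
    "confirmed": ("notify admin", "approving"),
    "approving": ("admin approves", "preprocessing"),
    "preprocessing": ("run preprocessors", "analyzing"),
    "analyzing": ("run analyzers", "analyzed"),
    "analyzed": ("notify admin", "classifying"),
    "classifying": ("admin finishes classification", "reporting"),
    "reporting": ("generate report", "reported"),
    "reported": ("notify owner", "finished"),
    "failed": ("notify admin", "failed-notified"),
}


def step(state, ok=True):
    """Return the next state; any failure sends the case to 'failed'."""
    if not ok:
        return "failed"
    _action, next_state = TRANSITIONS[state]
    return next_state


def run(state="new"):
    """Follow the table until a state with no outgoing transition."""
    path = [state]
    while state in TRANSITIONS:
        state = step(state)
        path.append(state)
    return path
```

Keeping the table as data makes the failure path (failed, then failed-notified) as easy to follow as the normal one.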
Analyzer Drivers
• preprocess.X and analyze.X for each analyzer X
• Clinic Engine: frontend, analyzer (Sparrow, X)
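A sketch of how the engine could discover analyzers from the driver naming scheme preprocess.X / analyze.X. The drivers directory, the function names, and passing the Case directory as the only argument are assumptions made for illustration.

```python
import os


def list_analyzers(drivers_dir):
    """Return every X that has both a preprocess.X and an analyze.X driver."""
    names = os.listdir(drivers_dir)
    pre = {n[len("preprocess."):] for n in names if n.startswith("preprocess.")}
    ana = {n[len("analyze."):] for n in names if n.startswith("analyze.")}
    return sorted(pre & ana)


def driver_commands(drivers_dir, analyzer, case_dir):
    """Build the two command lines for one analyzer, in the order they run.

    Assumption: each driver takes the Case directory as its only argument.
    """
    return [
        [os.path.join(drivers_dir, step + "." + analyzer), case_dir]
        for step in ("preprocess", "analyze")
    ]
```

With this layout, adding a new analyzer X means dropping two driver scripts into the directory; nothing else in the engine has to change.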
Preprocessing Issues & Ideas
Preprocess?
• Build: Source code ➥ Bin
• Preprocess?: Source code ➥ Pure C code ➥ Airac
• Static analyzers usually require pure C code as input
Pure C Extraction
• Build: Source code ➥ Bin
• Pure C Extraction: Source code ➥ Pure C code ➥ Airac
• We currently observe the build process to get a copy of the pure C code
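One way a build could be observed, sketched under assumptions: the slides do not say how the observation is done. Here a hypothetical helper turns each compiler command seen during the build into a matching `-E` (preprocess only) command that writes a `.i` file. The function name and the `pure-c` output directory are made up for illustration.

```python
import os


def preprocess_commands(argv, out_dir="pure-c"):
    """Turn one compile command into `-E` commands, one per .c source.

    Simplified: drops -c/-S/-E, the -o output, and object/archive inputs;
    every other argument (e.g. -I, -D) is passed through in order.
    """
    compiler, args = argv[0], argv[1:]
    sources, flags = [], []
    skip_next = False
    for arg in args:
        if skip_next:            # argument that belonged to a preceding -o
            skip_next = False
        elif arg == "-o":
            skip_next = True
        elif arg.startswith("-o") or arg in ("-c", "-S", "-E"):
            pass
        elif arg.endswith((".o", ".a")):
            pass
        elif arg.endswith(".c"):
            sources.append(arg)
        else:
            flags.append(arg)
    commands = []
    for src in sources:
        base = os.path.splitext(os.path.basename(src))[0]
        out = os.path.join(out_dir, base + ".i")
        commands.append([compiler, "-E"] + flags + [src, "-o", out])
    return commands
```

Reusing the flags of the real compile command matters because include paths and macro definitions decide what the preprocessed code looks like.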
Security Issue
Since our service runs builds, malicious commands uploaded by users can cause damage, e.g.
• rm -rf ~
• mail [email protected] </etc/passwd
Sandbox for Builds
• ROSAEC Clinic Service system: Clinic Engine (frontend, sandbox), Workspace, and its own /bin, /usr/bin, /usr/lib, /usr/share, /lib, /etc, /usr/include, ...
• Sandbox system: a separate /bin, /usr/bin, /usr/lib, /usr/share, /lib, /etc, /usr/include, ..., plus the build directory of the Case
So, we perform builds in a sandbox
• with chroot(2)
• in a separate system created with debootstrap(8)
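A sketch of how a build command could be wrapped so that it runs inside the chroot. It relies on the GNU coreutils `chroot` command and its `--userspec` option; the sandbox path, the `/build` working directory inside the sandbox, and the `nobody:nogroup` account are assumptions, as is the idea that the Case's build directory is made visible at `/build`.

```python
import shlex


def sandbox_command(sandbox_root, build_cmd, workdir="/build",
                    userspec="nobody:nogroup"):
    """Return an argv that runs build_cmd inside the chroot at sandbox_root.

    workdir is a path *inside* the sandbox (assumed to hold the Case's
    build directory). Running the result requires root privileges.
    """
    script = "cd {} && {}".format(
        shlex.quote(workdir),
        " ".join(shlex.quote(arg) for arg in build_cmd),
    )
    return ["chroot", "--userspec=" + userspec, sandbox_root,
            "/bin/sh", "-c", script]
```

Dropping to an unprivileged user inside the chroot matters, since a root process can escape a plain chroot.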
Remaining Issues
Source code ➥ Pure C code
• How should we build? GNU Make, Autotools, configure, CMake, SCons, Visual Studio, Xcode, ...
• External libraries: have we got all the header files?
We can’t run analyzers unless the user gives us full information
Idea 1
Provide our preprocessor, and let users collect the pure C code themselves
Exercise
a.c:
    int main(void) {
        int s = 0;
        for (int i=1; i<=10; i++)
            s += i;
        return s;
    }
Will this compile with “gcc a.c”?
    $ gcc a.c
    a.c: In function ‘main’:
    a.c:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
    a.c:3: note: use option -std=c99 or -std=gnu99 to compile your code
No.
Idea 2
Recover from trivial compile errors, e.g.
• -std=c99
• #include “X” where X exists somewhere
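A sketch of the recovery step for the two examples above: read the compiler's error output and propose extra flags for a retry. The function name and the matching rules are assumptions. The missing-header pattern follows the usual gcc wording "X: No such file or directory".

```python
import os
import re

# Matches e.g. "fatal error: foo.h: No such file or directory".
MISSING_HEADER = re.compile(r"([\w./+-]+\.h): No such file or directory")


def suggest_flags(stderr_text, source_root):
    """Propose extra compiler flags based on the error output."""
    flags = []
    # gcc itself suggests -std=c99 for C99-only constructs.
    if "-std=c99" in stderr_text:
        flags.append("-std=c99")
    # For a missing header, look for it somewhere under the source tree.
    for header in MISSING_HEADER.findall(stderr_text):
        for dirpath, _dirs, _files in os.walk(source_root):
            if os.path.isfile(os.path.join(dirpath, header)):
                flag = "-I" + dirpath
                if flag not in flags:
                    flags.append(flag)
                break
    return flags
```

The build would then be retried once with the suggested flags appended; anything the sketch cannot explain is left for the user to fix.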
Idea 3
• *.h names ⇌ Debian *-dev packages
• Index the names of header files in popular open source libraries, and install the matching packages in the sandbox before builds
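A sketch of the header index. The two sample entries only illustrate the shape of the data; a real index would be generated from the package archive. All names here (`SAMPLE_PACKAGES`, `build_index`, `packages_for`) are hypothetical.

```python
import re

# Illustrative sample only: package name -> headers it ships.
SAMPLE_PACKAGES = {
    "zlib1g-dev": ["zlib.h"],
    "libpng-dev": ["png.h"],
}

# Matches both #include <X> and #include "X".
INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def build_index(packages):
    """Invert package -> headers into header -> list of packages."""
    index = {}
    for package, headers in packages.items():
        for header in headers:
            index.setdefault(header, []).append(package)
    return index


def packages_for(source_text, index):
    """List the packages whose headers are included by the source text."""
    needed = []
    for header in INCLUDE.findall(source_text):
        for package in index.get(header, []):
            if package not in needed:
                needed.append(package)
    return needed
```

Headers that are not in the index (such as stdio.h here) are simply ignored, so the lookup is safe to run on every submission before the build starts.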
Q&A + Discussion
Development
• Git: git clone ropas.snu.ac.kr:~netj/2010/rosaec-clinic
• Fully testable with a local instance
Remaining Work
• Feedback System & UI
• Better Report