Fine-Grained Tracking of Grid Infections
Ashish Gehani
SRI
Basim Baig, Salman Mahmood, Dawood Tariq, Fareed Zaffar
LUMS
Fine-Grained Tracking of Grid Infections – p. 1/18
Introduction
Grid semantics
  Not middleware-specific
  Distributed system
  "Application community"
Infection
  Security, Reliability, Quality-of-Service
Constraints
  Fine-grained monitoring
  Grid-wide correlation
  Timely analysis
Motivation
Attractive attack platform
  Access to large set of resources
Automatic privilege escalation
  Single sign-on
Significant consequences
  Integrity loss of valuable data
Exposed services
  Open ports for callbacks
Application Community
[Figure: Grid nodes and a Risk Monitor, with flows labeled Anomalies and Threat Digest between them]
Central Monitoring
Collect Grid-wide anomalies
Raw stream saturates network
  35 clients, 10 Mb/s (Oliner et al., RAID 2010)
  Only event types
  No arguments
Must scale to hundreds of nodes
Local Monitoring
Framed as set operations
Application activity
  Set of events
Normal behavior
  Union of events during training
Anomalous behavior
  Difference set (by subtracting normal)
Correlating node activity
  Intersection of anomaly sets
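
The set framing above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the event tuples, the file names in the training data, and the function names are all hypothetical, since the slides do not give the event format.

```python
# Hypothetical event records; the real event format is not given in the slides.
training_runs = [
    {("open", "a.txt"), ("read", "a.txt"), ("close", "a.txt")},
    {("open", "a.txt"), ("write", "b.txt"), ("close", "a.txt")},
]


def normal_behavior(runs):
    """Normal behavior: union of the events seen across all training runs."""
    normal = set()
    for events in runs:
        normal |= events
    return normal


def anomalies(activity, normal):
    """Anomalous behavior: observed activity minus the normal set."""
    return activity - normal


def correlate(anomaly_sets):
    """Correlated activity: anomalies common to all of the given nodes."""
    if not anomaly_sets:
        return set()
    return set.intersection(*anomaly_sets)


normal = normal_behavior(training_runs)
node_a = anomalies({("open", "a.txt"), ("write", "dump.log")}, normal)
node_b = anomalies(
    {("read", "a.txt"), ("write", "dump.log"), ("open", "c.txt")}, normal
)
shared = correlate([node_a, node_b])
```

Here only the write to dump.log is anomalous on both hypothetical nodes, so it is the only event left after the intersection.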
Approach
Decompose sets into epochs
Compress epoch activity
Collect data provenance
Map anomalies to provenance
Epoch Compression
[Figure: on a Grid node, application system calls cross from user to kernel space and are audited into a log; an Anomaly Detector and a Bloom Filter produce the Digest sent to the Risk Monitor]
Set representation
  Allows use of Bloom filters
Fold filter ⌈log(f + b)⌉ times
  Increase update frequency by f
  Decrease bandwidth used by b
More false positives
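
One common way to fold a Bloom filter is to OR its upper half onto its lower half, which halves the size while keeping every inserted item findable. The sketch below assumes that scheme; the slides do not say which folding method is used. The class name, the SHA-256-based hashing, and the choice of three hash functions are illustrative assumptions and are not the API of the Open Bloom Filter library.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter whose size can be halved by folding (illustrative only)."""

    def __init__(self, size_bits, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [0] * size_bits

    def _hashes(self, item):
        # Derive num_hashes wide hash values from SHA-256 with a per-hash salt.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big")

    def add(self, item):
        for h in self._hashes(item):
            self.bits[h % self.size] = 1

    def __contains__(self, item):
        return all(self.bits[h % self.size] for h in self._hashes(item))

    def fold(self):
        """Return a filter of half the size: OR the upper half onto the lower."""
        if self.size % 2:
            raise ValueError("cannot fold a filter with an odd number of bits")
        half = self.size // 2
        folded = BloomFilter(half, self.num_hashes)
        folded.bits = [self.bits[i] | self.bits[i + half] for i in range(half)]
        return folded


full = BloomFilter(4000)
for event in ["open:a.txt", "write:dump.log"]:
    full.add(event)
small = full.fold().fold()  # 4000 bits -> 2000 bits -> 1000 bits
```

Because the folded size divides the original size, a position computed modulo the smaller size always lands on the bit that absorbed the original one, so folding never introduces false negatives. It can only raise the fraction of set bits, which is where the extra false positives come from.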
Correlating Activity
Combine Bloom filters
  Counting filter
Event on τ nodes
  Corresponding buckets are ≥ τ
Construct vaccination
  Bloom filter bit 1 ⇐⇒ counting filter bucket ≥ τ
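
The combination step can be sketched as follows, assuming each node's Bloom filter arrives as an equal-length list of 0/1 bits. The function names and the example bit patterns are hypothetical.

```python
def combine(filters):
    """Sum equal-length Bloom filter bit arrays into a counting filter."""
    if not filters:
        return []
    length = len(filters[0])
    if any(len(f) != length for f in filters):
        raise ValueError("all filters must have the same length")
    return [sum(bits) for bits in zip(*filters)]


def vaccination(counting, tau):
    """Bit i is 1 exactly when counting bucket i is at least tau."""
    return [1 if count >= tau else 0 for count in counting]


# Hypothetical 8-bit filters reported by three nodes.
node_filters = [
    [1, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
]
counting = combine(node_filters)
vaccine = vaccination(counting, tau=3)
```

With τ = 3, only the buckets set on all three nodes survive into the vaccination filter.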
Data Provenance
[Figure: timeline of a process execution whose open()/close() calls read File 1 and File 2 and write File 3, shown alongside the resulting provenance graph linking File 3 to the process, File 1, File 2, and the owner]
Record few arguments
  Process creation, File versions
  File access, modification
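
A provenance store that records only these few arguments might look like the sketch below. The class, its method names, the relation labels, and the rule that each write creates a new file version are assumptions made for illustration; the slides do not describe the actual schema.

```python
from collections import defaultdict


class ProvenanceStore:
    """Toy provenance log of process creation and versioned file access."""

    def __init__(self):
        self.version = defaultdict(int)  # file path -> current version number
        self.edges = []                  # (source, relation, target) triples

    def process_created(self, parent, child):
        self.edges.append((parent, "created", child))

    def file_read(self, process, path):
        # A read links the process to the file version current at that time.
        self.edges.append(((path, self.version[path]), "read_by", process))

    def file_written(self, process, path):
        # Assumption: every write yields a new version of the file.
        self.version[path] += 1
        self.edges.append((process, "wrote", (path, self.version[path])))


store = ProvenanceStore()
store.file_read("Process", "File 1")
store.file_read("Process", "File 2")
store.file_written("Process", "File 3")
```

Versioning files keeps the graph acyclic: a process that reads and then rewrites a file links the old version to the new one instead of looping back on a single node.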
Anomaly Tracking
Synthetic attack
  Unexpected write of dump.log
[Figure: provenance graph of the synthetic attack, connecting the processes boinc.exe, iexplore.exe, sdbot05b.exe, and syscfg32.exe to the files dump.log, userenv.log, prefs-1.js, application.ini, config.ini, targets.txt, SDBOT05B.EXE-0A1D8339.pf, and SYSCFG32.EXE-2E372F88.pf through Process Create, Read, and Write edges]
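
Mapping an anomaly to provenance amounts to walking the graph backwards from the anomalous item. The sketch below does this with a breadth-first search over (source, target) edges. Apart from dump.log, the node names and edges are hypothetical placeholders and do not reproduce the attack graph on this slide.

```python
from collections import defaultdict, deque


def ancestors(edges, start):
    """Return every node from which `start` is reachable (breadth-first)."""
    parents = defaultdict(set)
    for source, target in edges:
        parents[target].add(source)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical edges: each pair means "target was derived from source".
edges = [
    ("proc_a", "proc_b"),     # proc_a created proc_b
    ("input.cfg", "proc_b"),  # proc_b read input.cfg
    ("proc_b", "dump.log"),   # proc_b wrote dump.log
    ("proc_a", "other.log"),  # unrelated write, not part of the lineage
]
lineage = ancestors(edges, "dump.log")
```

The result contains only the processes and files that could have influenced the anomalous write, which is the subset an analyst would inspect first.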
Evaluation Platform
Microsoft Windows XP (SP3)
BOINC 6.10.43 volunteer Grid application
Process Monitor 2.7 tool
Open Bloom Filter library
Synthetic infection
  Internet Explorer vulnerability
  Windows CreateRemoteThread()
  MailBoy 2004 injected
Workload
24 hours 20 minutes
1.5 million events
Raw log: 216 MB / Grid node
Anomaly detection with 11-tuples
MailBoy 2004 as spam relay
  20 threads
  30-second timeout
  1,700 email addresses
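
The slides do not define the 11-tuples. One plausible reading, assumed here, is a sliding window over 11 consecutive events, where a window never seen during training counts as an anomalous sequence. The function names are hypothetical, and the usage example uses a window of 3 and single-letter events to stay short.

```python
def windows(events, n=11):
    """Yield every run of n consecutive events as a tuple."""
    for i in range(len(events) - n + 1):
        yield tuple(events[i:i + n])


def train(events, n=11):
    """Normal model: the set of all n-tuples seen in the training trace."""
    return set(windows(events, n))


def detect(events, normal, n=11):
    """Return the n-tuples in the observed trace that training never produced."""
    return [w for w in windows(events, n) if w not in normal]


# Window of 3 for brevity; the workload above uses 11.
normal = train(list("abcabcabc"), n=3)
flagged = detect(list("abcabxabc"), normal, n=3)
```

A single unexpected event ("x") taints every window that overlaps it, so one deviation shows up as up to n anomalous tuples.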
Storage
[Figure: storage space used (KB, log scale) versus number of events, comparing disk storage with a 4000-bit Bloom filter and a 2000-bit Bloom filter]
Normal Operation
[Figure: false positives normalized by anomalous sequences (0 to 1) versus total number of events, for 500-bit, 1000-bit, 2000-bit, and 4000-bit Bloom filters]
Malware Injected
[Figure: false positives normalized by anomalous sequences (0 to 1) versus total number of events, for 500-bit, 1000-bit, 2000-bit, and 4000-bit Bloom filters]
Provenance Database
[Figure: number of unique identifiers versus time (minutes), showing file identifiers in normal data, file identifiers in attack data, and process identifiers in normal and attack data]
Conclusion
Apparent tension
  Fine-grained anomaly detection
  Grid-wide monitoring
Solution
  Audit provenance on Grid nodes
  Compress event stream
  Map anomalies to provenance
Acknowledgement
  NSF Grant OCI-0722068