1
Beauty and the Burst: Remote Identification of Encrypted Video Streams
Roei Schuster
Cornell Tech, Tel Aviv University
Vitaly Shmatikov
Cornell Tech
Eran Tromer
Columbia University, Tel Aviv University
2
Video traffic is interesting
4
What can still be learned?
Video traffic is encrypted
8
victim, streaming service
Metadata! packet times, sizes, …
Victim is watching “Beauty and the Beast”!
Traffic analysis for video identification
12
[RLLTBD ’11], [ARNL ’12], [MFWS ’13], …
Where do bursts come from?
Initial buffering, then “on”/“off” bursts
[Plot: packet size (bytes) vs. time (seconds)]
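To make the “on”/“off” pattern concrete, the sketch below groups a captured packet trace into bursts by splitting wherever the link stays idle for longer than a gap. The function name `extract_bursts`, the 0.5-second gap, and the sample trace are illustrative assumptions, not values taken from the talk.

```python
from typing import List, Tuple

def extract_bursts(packets: List[Tuple[float, int]],
                   gap: float = 0.5) -> List[Tuple[float, int]]:
    """Group (timestamp_s, size_bytes) packets into bursts.

    A new burst starts whenever more than `gap` seconds pass without a
    packet (the gap value is an assumption for illustration).
    Returns a list of (burst_start_time, total_bytes).
    """
    bursts: List[Tuple[float, int]] = []
    start = None
    total = 0
    last_time = None
    for t, size in sorted(packets):
        if last_time is not None and t - last_time > gap:
            bursts.append((start, total))   # idle period: close current burst
            start, total = None, 0
        if start is None:
            start = t
        total += size
        last_time = t
    if start is not None:
        bursts.append((start, total))
    return bursts

# Hypothetical trace: three packets, an idle period, then two more packets.
trace = [(0.00, 1500), (0.01, 1500), (0.02, 300),
         (5.00, 1500), (5.02, 1500)]
print(extract_bursts(trace))
```

Only packet times and sizes are used, which is exactly the metadata that remains visible when the stream is encrypted.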
15
streaming service
Pulp Fiction
Die Hard
12 Monkeys
The Fifth Element
Die Hard II
Armageddon
Video representation on server
18
Pulp Fiction
Die Hard
12 Monkeys
The Fifth Element
Die Hard II
Armageddon
video stored in segment files; segment = a few seconds of playback
segment1.m4s: 0-5sec
segment2.m4s: 5-10sec
segment3.m4s: 10-15sec
segment4.m4s: 15-20sec
MPEG-DASH standard: widely adopted by Netflix, YouTube, others
Video representation on server
21
[Flowchart: client asks “buffer below threshold?”; yes: request next segment from server; no: check again]
server: segment1.m4s, segment2.m4s, segment3.m4s, segment4.m4s, segment5.m4s, segment6.m4s
segment fetched every few seconds
fetching causes a traffic burst
DASH client-server interaction (simplified)
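The simplified client loop can be sketched as a toy simulation. Everything here is an assumption for illustration: downloads are treated as instantaneous, playback drains one second of buffer per second of waiting, and the 5-second segments and 10-second threshold are made-up parameters rather than values used by any real player.

```python
def simulate_dash_client(segment_sizes, segment_seconds=5.0,
                         threshold=10.0, tick=1.0):
    """Toy buffer-threshold client.

    Returns [(time_s, bytes_fetched)]: one entry per segment request,
    i.e. one traffic burst per fetched segment.
    """
    buffer_s = 0.0      # seconds of video currently buffered
    now = 0.0           # wall-clock time in seconds
    next_seg = 0
    fetches = []
    while next_seg < len(segment_sizes):
        if buffer_s < threshold:
            # Buffer below threshold: request the next segment.
            fetches.append((now, segment_sizes[next_seg]))
            buffer_s += segment_seconds
            next_seg += 1
        else:
            # Buffer is full enough: wait while playback drains it.
            now += tick
            buffer_s -= tick
    return fetches

# Hypothetical segment sizes in bytes.
print(simulate_dash_client([100, 200, 300, 400, 500]))
```

In this toy run the first segments are fetched back to back (initial buffering), after which one segment is fetched every `segment_seconds`, and each fetch carries that segment's size.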
22
Different seconds of video require different numbers of bytes to encode
[Plot: “Iguana vs. Snakes” VBR; bitrate (bytes) vs. time (seconds)]
Variable bit rate encoding
23
Phases of Iguana vs Snakes in Bitrate
[Plot: bitrate (bits per second) vs. time (seconds), annotated phase by phase]
• scenery, movement, tension rising
• tension peaking, iguana is still
• chase
• iguana almost captured
• iguana safe, resting
29
Pulp Fiction
Die Hard
12 Monkeys
The Fifth Element
Die Hard II
Armageddon
Segment1.m4s: 0-5sec
Segment2.m4s: 5-10sec
Segment3.m4s: 10-15sec
Segment4.m4s: 15-20sec
Segment5.m4s: 20-25sec
Variable bit rate → variable segment size
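A small worked example of why variable bit rate implies variable segment size: if every segment covers the same playback duration, its size is the sum of the per-second byte counts in that window. The per-second values below are invented for illustration.

```python
def segment_sizes(bytes_per_second, segment_seconds=5):
    """Sum per-second encoded sizes into fixed-duration segments."""
    return [sum(bytes_per_second[i:i + segment_seconds])
            for i in range(0, len(bytes_per_second), segment_seconds)]

# Hypothetical VBR profile: a calm scene, an action scene, then a medium scene.
vbr = [100] * 5 + [400] * 5 + [250] * 5
print(segment_sizes(vbr))
```

Equal-length segments therefore inherit the bitrate profile of the content they encode.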
30
Variable segment size → variable burst size
[Plot: burst size (bytes) vs. time (seconds); initial buffering followed by on/off bursts]
32
content → VBR pattern → segments → burst sizes (over stream time)
MPEG-DASH leak
36
[Plot: burst sizes vs. stream time]
Does the pattern of burst (segment) sizes uniquely characterize a title?
• Diversity: empirically measure pairwise distances for 3500 downloaded and segmented YouTube titles
• Consistency: empirically evaluate attacker’s measurement error bound
• ~20% of YouTube titles have fingerprints
Can we learn a title’s identifying pattern?
From a leak to a fingerprint
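One way to read the diversity and consistency checks together is sketched below. The distance function (largest per-segment difference) and the criterion (a title counts as fingerprintable when its nearest other title is more than twice the measurement error bound away) are assumptions chosen for illustration; the slides do not state the exact definitions used in the study.

```python
from itertools import combinations

def distance(a, b):
    """Largest absolute per-segment size difference (an assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))

def fingerprintable(titles, error_bound):
    """Return the titles whose nearest neighbour is farther than 2 * error_bound.

    With a true metric, any measurement within error_bound of such a title
    is strictly closer to it than to every other title.
    """
    nearest = {name: float("inf") for name in titles}
    for (n1, s1), (n2, s2) in combinations(titles.items(), 2):
        d = distance(s1, s2)
        nearest[n1] = min(nearest[n1], d)
        nearest[n2] = min(nearest[n2], d)
    return {name for name, d in nearest.items() if d > 2 * error_bound}

# Hypothetical segment-size sequences (bytes).
titles = {
    "title A": [500, 2000, 1250, 900],
    "title B": [510, 1990, 1260, 905],   # nearly identical to title A
    "title C": [1500, 700, 300, 2200],
}
print(fingerprintable(titles, error_bound=50))
```

In this toy data only "title C" qualifies, because "title A" and "title B" lie within the assumed error margin of each other.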
43
attacker network: Pulp Fiction, Die Hard, 12 Monkeys, Armageddon → training → detectors
victim network: Armageddon → metadata → detectors
Victim is watching “Armageddon”!
Attack overview
44
[Diagram: attacker network trains detectors on Pulp Fiction, Die Hard, 12 Monkeys, Armageddon; victim network streams Armageddon and leaks metadata; open question: vantage point?]
Attack details
45
On-path vantage points (observe the bursts): Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs
Scenario I: on-path attack
46
[Diagram: attacker network trains detectors on Pulp Fiction, Die Hard, 12 Monkeys, Armageddon; victim network streams Armageddon and leaks metadata; highlighted step: machine learning]
Attack details
47
• Very good at learning high-level concepts that are hard to express formally (e.g., “traffic traces are similar”)
• Existing NN architectures are very accurate on classification and detection problems
Deep neural networks
48
• Robust: can operate on noisy and coarse measurements
• Agnostic to protocol-specific attributes (e.g., QUIC vs. TLS)
• Can learn features other than burst patterns, e.g., arrival patterns of individual packets
• Can use multiple session representations, train on all at once
Advantages of neural networks
49
Each feature is a time-series, sampled at 0.25-second intervals (example: bytes per second)
[Figure: packet size vs. time (seconds), with 1500- and 300-byte packets grouped into bins at 0, 0.25, 0.5, 0.75, 1; each bin yields one sample of the bytes-per-second series]
Features considered: downstream/upstream/total values of bytes per second, packets per second, average packet length, and burst sizes
Features
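The binning described on this slide can be sketched as follows, for a single traffic direction. The function name, the dictionary keys, and the sample packets are assumptions for illustration; only the 0.25-second interval and the feature names come from the slide.

```python
def time_series_features(packets, duration, interval=0.25):
    """Turn (timestamp_s, size_bytes) packets into per-interval time series."""
    n_bins = int(round(duration / interval))
    byte_counts = [0] * n_bins
    pkt_counts = [0] * n_bins
    for t, size in packets:
        idx = int(t / interval)          # which 0.25-second bin the packet falls in
        if 0 <= idx < n_bins:
            byte_counts[idx] += size
            pkt_counts[idx] += 1
    return {
        "bytes_per_second": [b / interval for b in byte_counts],
        "packets_per_second": [c / interval for c in pkt_counts],
        "avg_packet_length": [b / c if c else 0.0
                              for b, c in zip(byte_counts, pkt_counts)],
    }

# Hypothetical one-second trace: one 1500-byte packet in the first bin,
# then two 1500-byte packets and one 300-byte packet in the third bin.
packets = [(0.10, 1500), (0.60, 1500), (0.65, 1500), (0.70, 300)]
feats = time_series_features(packets, duration=1.0)
print(feats["bytes_per_second"])
```

Running the same function separately on downstream, upstream, and combined packets would give the downstream/upstream/total variants listed above.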
50
[Diagram: attacker network trains neural-net detectors on Pulp Fiction, Die Hard, 12 Monkeys, Armageddon; victim network streams Armageddon and leaks metadata to an on-path attacker]
Attack
55
10 titles 100 1-minute sessions
18 titles 100 3-minute sessions + 3500 sessions of different other titles
10 titles 100 1.5-minute sessions
100 classes
18+1=19 classes
10 classes
10 classes
98.5% accuracy
99.5% accuracy
98.6% accuracy
92.5% accuracy
open-world identification
100 titles 100 1-minute sessions
Datasets and identification experiments
57
[Figure: confusion matrices (x-axis: predicted label) for Netflix and YouTube, each using the feature: total burst size]
“unknown” class, 3500 samples: exactly 2 false positives
No recurrent confusions (despite many same-series titles)
Empirical results: confusion matrices
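For readers unfamiliar with how such matrices are read, here is a minimal sketch that tallies a confusion matrix and counts false positives against an “unknown” class. The labels and predictions are invented; they are not the experiment's data.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def false_positives(matrix, negative="unknown"):
    """Sessions of the negative class that were predicted as some known title."""
    return sum(n for (true, pred), n in matrix.items()
               if true == negative and pred != negative)

# Hypothetical open-world results: two known titles plus "unknown" sessions.
true = ["A", "A", "B", "unknown", "unknown", "unknown"]
pred = ["A", "A", "B", "unknown", "A", "unknown"]
matrix = confusion_matrix(true, pred)
print(matrix[("unknown", "A")], false_positives(matrix))
```

Off-diagonal cells in the “unknown” row are the false positives; a recurrent confusion would show up as one off-diagonal cell with a consistently high count.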
58
0 false positives with 0.988 recall
0.0005 false positive rate with 0.93 recall
Netflix (feature: total burst size)
YouTube (feature: total burst size)
Tuning for precision
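Tuning for precision can be illustrated as choosing a confidence threshold: the detector reports a title only when its score clears the threshold, trading recall for fewer false positives. The sketch below is one simple way to pick such a threshold; the scores, the function name, and the selection rule are assumptions, not the procedure used in the experiments.

```python
def tune_threshold(scores, is_target, max_fpr):
    """Pick the lowest threshold whose false positive rate is <= max_fpr.

    scores: detector confidence per session; is_target: True if the session
    really is the target title. Both classes must be non-empty.
    Returns (threshold, false_positive_rate, recall), or None if no observed
    score works as a threshold.
    """
    negatives = [s for s, y in zip(scores, is_target) if not y]
    positives = [s for s, y in zip(scores, is_target) if y]
    for thr in sorted(set(scores)):
        fpr = sum(s >= thr for s in negatives) / len(negatives)
        if fpr <= max_fpr:
            # FPR only falls as thr rises, so the first hit keeps recall highest.
            recall = sum(s >= thr for s in positives) / len(positives)
            return thr, fpr, recall
    return None

# Hypothetical scores: four target sessions followed by four non-target sessions.
scores = [0.99, 0.95, 0.90, 0.60, 0.80, 0.40, 0.30, 0.10]
is_target = [True, True, True, True, False, False, False, False]
print(tune_threshold(scores, is_target, max_fpr=0.0))
print(tune_threshold(scores, is_target, max_fpr=0.25))
```

Demanding zero false positives forces a higher threshold and costs some recall, which is the same trade-off the numbers on this slide express.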
59
[Diagram: attacker network trains neural-net detectors on Pulp Fiction, Die Hard, 12 Monkeys, Armageddon; victim network streams Armageddon and leaks metadata; open question: vantage point?]
Attack details
60
[Diagram: victim network and its bursts; on-path vantage points: Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs]
Off-path attackers
68
A visited webpage? A smartphone app?
bursts
victim network
Web ad
Three-fold confinement: different device, browser process, sandboxed iframe
Off-path attackers
Example: checking Facebook feed while streaming “Armageddon”
75
[Diagram: JavaScript attacker client in the neighbor’s browser exchanges messages with the attacker Web site; the viewer’s bursts cause congestion, which delays the messages]
Noisy, coarse estimate of actual traffic bursts
Cross-device attack
77
For each traffic burst, compute the aggregate delay induced. Use the resulting time series as input to the neural network.
[Plot: message delays and traffic burst sizes (scaled down); delay (milliseconds) vs. time (seconds)]
Delay-bursts
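A rough sketch of the aggregation step is shown below: message delays above a baseline are treated as congestion, consecutive delayed messages are grouped, and each group's excess delay is summed into one delay-burst. The baseline, the grouping gap, and the sample measurements are assumptions for illustration; the slides do not give the exact aggregation used.

```python
def delay_bursts(delays, baseline_ms, gap=0.5):
    """Aggregate per-message delays into delay-bursts.

    delays: list of (time_s, delay_ms) measured by the attacker's client.
    Returns [(burst_start_time_s, total_excess_delay_ms)].
    """
    series = []
    start, total, last = None, 0.0, None
    for t, d in sorted(delays):
        excess = d - baseline_ms
        if excess <= 0:
            continue                        # message was not noticeably delayed
        if last is not None and t - last > gap:
            series.append((start, total))   # quiet period: close the delay-burst
            start, total = None, 0.0
        if start is None:
            start = t
        total += excess
        last = t
    if start is not None:
        series.append((start, total))
    return series

# Hypothetical measurements: congestion around t = 0.1-0.2 s and again at t = 5 s.
measured = [(0.0, 20), (0.1, 80), (0.2, 120), (0.3, 25), (5.0, 70), (5.1, 22)]
print(delay_bursts(measured, baseline_ms=30))
```

The resulting series plays the role that burst sizes play for an on-path observer, only noisier and coarser.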
78
Delay-bursts vs. traffic bursts
delay-bursts time series: the delays induced by traffic bursts
79
Accuracy: 0.965
false positive rate: 0.003, recall 0.933
1/10 cross-device attack: precision vs. recall
80
JavaScript detector code
Browser
attacker Web site
Cross-device attack
neighbor
viewer
81
attacker Web site
Cross-site attack
Streaming client
victim PC browser window
browser window
JavaScript detector code
82
• Modern streaming traffic characteristics
– Title bitrate pattern unique when sampled at few-seconds granularity
– Fetching at segment granularity (= every few seconds)
• Optimizes “quality of experience”, server load, and network bandwidth utilization
• However, information leakage is intrinsic…
[Flowchart: buffer below threshold? yes: fetch next segment; no: check again]
Mitigating the DASH leak
83
• Further information and the paper:
https://beautyburst.github.io/
Thank you!