WiFi Issues In The Wild
A view from the cloud
#WLPC Phoenix 2018
KN Gopinath (VP of Engineering)
Karan Gupta (Senior Researcher)
© Mojo Networks
Mojo Cloud Architecture - Inception in 2007, Evolving Continuously
1. Management Plane - Centralized: Access | Security | Engagement
   Management traffic to and from cloud
2. Control Plane - Distributed: Zero functionality loss if connectivity with cloud breaks
3. Data Plane - Flexible
3a. Local Breakout of Data
3b. Data Tunnel
Virtualized Traffic Aggregator
Data Integration (Syslog, RESTful APIs)
• Logging
• SIEM
• CRM
SCALE: One example: A Cloud instance is managing 100,000 LIVE APs
ROBUST: Works even if connectivity to manager is LOST
Cognitive Wi-Fi Powered Through Cloud
1. Management Plane - Centralized
2. Control Plane - Distributed
3. Data Plane - Flexible
3a. Local data breakout
3b. Data tunnel
4. Cognition Plane - Intelligent
BIG DATA: Store key client parameters & run ML algorithms.
SMART EDGE APs: Failure Detection, "Needle in the haystack"
Wi-Fi User Experience: Connectivity
Client Events: Sent real-time to the cloud.
Client Journey: Extending the philosophy of Mojo Packets (launched in 2013)
Wi-Fi User Experience: Performance
Anomaly Detection & Baselining of various performance parameters computed and stored in the cloud.
User Application Experience
NEW! Launching at WLPC!
Deep Packet Inspection, Machine Learning
Failure Analysis Overview
Reference Data
Connectivity Failures
Performance Failures
Conclusion
Reference Data
Clients: 237k+
Associations: 31M
Applications: 400+
Duration: 1 week
Verticals: Enterprise, Education, Manufacturing, Retail & Hospitality
An anonymized subset of data from our production cloud.
The State of Apps
[Bar chart: % of customers (0 to 100) in which each application was seen]
Applications shown: Google APIs, Amazon Web Services, Apple, Google Play, iTunes, YouTube, Microsoft, Akamai, iCloud, gmail, Doubleclick, Windows Update, MS Online, Apple Update, Facebook Video, MS Office 365, Yahoo, Google Ads, Amazon, Spotify, ICMP, Adobe, Google Drive, MS Outlook, STUN, Snapchat, Netflix Video Stream, Exchange Online, CIFS, WhatsApp Media Message, APNS, AppNexus, Google Analytics, Crashlytics, Integral Ad Science
Wi-Fi Connectivity Failure & Latencies
WiFi almost always gets blamed!
Client Connectivity Failures
Ultimate Truth: Instrumented AP Driver Code to tap into client’s state machine.
Association Failures
• AP association limit exceeded
• Capability mismatch
• Generic assoc. failure
Authentication Failures
• Fast roaming failed
• RADIUS auth. failure
• RADIUS Server not reachable
• Incorrect PSK
• EAPOL 4-Way handshake failed
Network Failures
• DHCP failure
• DNS failure
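The taxonomy above can be written down as a simple lookup table. This is only an illustrative sketch: the reason strings are copied from the slide, while the `Category` enum, the `FAILURE_CATEGORY` table, and `classify` are hypothetical names, not identifiers from any real AP driver.

```python
from enum import Enum

class Category(Enum):
    ASSOCIATION = "Association"
    AUTHENTICATION = "Authentication"
    NETWORK = "Network"

# Reason text follows the slide; the mapping itself is an illustration.
FAILURE_CATEGORY = {
    "AP association limit exceeded": Category.ASSOCIATION,
    "Capability mismatch": Category.ASSOCIATION,
    "Generic assoc. failure": Category.ASSOCIATION,
    "Fast roaming failed": Category.AUTHENTICATION,
    "RADIUS auth. failure": Category.AUTHENTICATION,
    "RADIUS Server not reachable": Category.AUTHENTICATION,
    "Incorrect PSK": Category.AUTHENTICATION,
    "EAPOL 4-Way handshake failed": Category.AUTHENTICATION,
    "DHCP failure": Category.NETWORK,
    "DNS failure": Category.NETWORK,
}

def classify(reason):
    """Return the failure category for a reason, or None if it is unknown."""
    return FAILURE_CATEGORY.get(reason)

print(classify("Incorrect PSK").value)
print(classify("DNS failure").value)
```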
The State of WiFi Connectivity
Successful Connections: 94%
Failed Connections: 6%
On average, more than 3000 connections happen in our network per minute.
Approximately 6% of connections fail.
Note that this corresponds to about 5% of the clients.
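As a rough cross-check, the reference data (31M associations over 1 week) is consistent with the per-minute rate quoted here. The sketch below assumes that each association counts as one connection and that load is averaged evenly over the week; neither assumption is stated in the slides.

```python
associations = 31_000_000        # "31M" from the reference data
minutes_per_week = 7 * 24 * 60   # 10,080 minutes in one week

per_minute = associations / minutes_per_week
failed_per_minute = per_minute * 0.06   # approximately 6% of connections fail

print(f"{per_minute:.0f} connections per minute")
print(f"{failed_per_minute:.0f} failed connections per minute")
```

This works out to a little over 3000 connections per minute, in line with the figure on the slide.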
Why do we see Connectivity Failures in an otherwise well configured and operational Network?
Association: 2%
Authentication: 46%
Network: 52%
More than 50% of the connection failures are due to wired side issues.
Failure Distribution - We DID notice some connectivity errors that were transient in nature
19% PSK errors, possibly due to guest users and/or BYOD.
13% EAPOL errors, mostly transient and self-correcting in nature.
Assoc. Failure: 2%
MAC Filtering: 4%
RADIUS Auth Failure: 4%
RADIUS Server Unresponsive: 1%
Incorrect PSK: 19%
EAPOL Handshake Failure: 13%
Portal Failures: 5%
DHCP Failure: 25%
DNS Failure: 27%
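The per-cause shares can be rolled up into the three categories used on the previous slide. In this sketch the percentages come from the chart; placing MAC filtering and portal failures under Authentication is an assumption, inferred because that grouping reproduces the 2% / 46% / 52% split.

```python
# Percent of failed connections per cause, as labelled on the chart.
shares = {
    "Assoc. Failure": 2,
    "MAC Filtering": 4,
    "RADIUS Auth Failure": 4,
    "RADIUS Server Unresponsive": 1,
    "Incorrect PSK": 19,
    "EAPOL Handshake Failure": 13,
    "Portal Failures": 5,
    "DHCP Failure": 25,
    "DNS Failure": 27,
}

# Assumed grouping (MAC filtering and portal under Authentication).
groups = {
    "Association": ["Assoc. Failure"],
    "Authentication": [
        "MAC Filtering",
        "RADIUS Auth Failure",
        "RADIUS Server Unresponsive",
        "Incorrect PSK",
        "EAPOL Handshake Failure",
        "Portal Failures",
    ],
    "Network": ["DHCP Failure", "DNS Failure"],
}

totals = {cat: sum(shares[c] for c in causes) for cat, causes in groups.items()}
print(totals)
```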
Connectivity – With Respect To 2.4 & 5 GHz Bands
[Bar chart: % of clients (0 to 8) with Association, Authentication, and Network failures, 2.4GHz vs 5GHz]
How many of you think that the connectivity issues are similar in both the bands?
Connectivity Failures – With Respect To Verticals
[Bar chart: % of connections (0% to 100%) by failure type (Association, Authentication, Network) for Enterprise, Manufacturing, Education, Retail & Hospitality]
DHCP/DNS errors predominate in Retail & Hospitality, due to the transient nature of guest users.
State of AAA Latency For Successful Connections
[Histogram: % of customers (0 to 20) by baseline AAA latency (ms), in 20 ms bins from (0,20] to (960,980]]
~ 62% of the customers have their baseline latency below 500ms
State of DHCP Latency For Successful Connections
[Histogram: % of customers (0 to 100) by baseline DHCP latency (ms), in 20 ms bins from (0,20] to (960,980]]
~ 80% of the customers have their baseline latency below 20ms
And ~ 90% have their baseline latency below 100ms
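A sketch of how a histogram like this and the "below X ms" figures could be derived. It assumes each customer's baseline is the median of its per-connection latencies and that bins are 20 ms wide and right-closed, as the axis labels suggest. The customer names and latency samples are made up for illustration and are not taken from the dataset.

```python
import math
from collections import Counter
from statistics import median

BIN_MS = 20

def bin_label(latency_ms):
    """Return the right-closed 20 ms bin, e.g. 20 -> '(0,20]', 21 -> '(20,40]'."""
    # Values at or below zero are clamped into the first bin.
    hi = max(1, math.ceil(latency_ms / BIN_MS)) * BIN_MS
    return f"({hi - BIN_MS},{hi}]"

def share_at_or_below(baselines, threshold_ms):
    """Percent of customers whose baseline is at or below threshold_ms."""
    hits = sum(1 for b in baselines.values() if b <= threshold_ms)
    return 100.0 * hits / len(baselines)

# Hypothetical per-connection DHCP latencies (ms) for five customers.
samples = {
    "customer_a": [3, 4, 5],
    "customer_b": [10, 12, 18],
    "customer_c": [15, 19, 30],
    "customer_d": [8, 9, 14],
    "customer_e": [90, 120, 200],
}

baselines = {name: median(vals) for name, vals in samples.items()}
histogram = Counter(bin_label(b) for b in baselines.values())

print(dict(histogram))
print(share_at_or_below(baselines, 20))
```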
Wi-Fi Performance Failures & Latencies
Client Performance Failures – An Overview
An AP monitors 300+ counters, but the following four key metrics capture client health: Low Data Rate, Low RSSI, Stickiness, High Retry Rate.
Low Data Rate guideline is 20 Mbps.
Low RSSI guideline is -70 dBm.
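The two stated guidelines can be expressed as a simple threshold check. This is a sketch only: the strict "less than" comparison and the function name `health_flags` are assumptions, and the retry-rate and stickiness metrics are left out because the slides give no thresholds for them.

```python
LOW_DATA_RATE_MBPS = 20   # guideline from the slide
LOW_RSSI_DBM = -70        # guideline from the slide

def health_flags(data_rate_mbps, rssi_dbm):
    """Return the guideline violations for one client sample."""
    flags = []
    if data_rate_mbps < LOW_DATA_RATE_MBPS:
        flags.append("Low Data Rate")
    if rssi_dbm < LOW_RSSI_DBM:
        flags.append("Low RSSI")
    return flags

print(health_flags(12, -75))   # both guidelines violated
print(health_flags(54, -60))   # healthy sample, no flags
```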
The State of WiFi Performance
[Bar chart: % of clients (0 to 100) per category: Low Data Rate, Low RSSI, Stickiness, High Retry Rate, Unaffected]
19% of clients are affected by performance issues
(vs. 5% of clients affected by connectivity issues).
Low Data Rate (17%) is the dominant factor.
Performance – With Respect To 2.4 & 5 GHz Bands
[Bar chart: % of clients (0 to 40) with Low Data Rate, Low RSSI, High Retry Rate, and Stickiness, 2.4GHz vs 5GHz]
The 2.4GHz band has about 3 times the performance issues of the 5GHz band.
2.4GHz: 30% unique clients affected (3x of 5GHz)
5GHz: 11% unique clients affected
Clients have a tendency to choose 2.4GHz at lower RSSIs.
Tx power and cell size matter.
Performance Issues – With Respect To Verticals
[Bar chart: % of clients (0 to 100) with Low Data Rate, Low RSSI, High Retry Rate, and Stickiness for Enterprise, Retail & Hospitality, Manufacturing, Education]
State of DNS Latency in % Customers
[Histogram: % of customers (0 to 50) by baseline DNS latency (ms), in 20 ms bins from (0,20] to (960,980]]
~ 70% of the customers have their baseline latency below 100ms
State of WAN Latency in % Customers
[Histogram: % of customers (0 to 30) by baseline WAN latency (ms), in 20 ms bins from (0,20] to (960,980]]
~ 51% of the customers have their baseline latency below 100ms
Benchmarking Latencies Across Verticals
[Bar chart: median latency (ms) by vertical, scale 0 to 500]
Vertical            DHCP  DNS  WAN  AAA
Education              4   40   79  376
Enterprise             4   44   89  335
Manufacturing          4   39   75  467
Retail_Hospitality     4   50  108  453
Clearly AAA latency needs to be optimized.
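To put a number on this, the sketch below compares the median AAA latency with the sum of the other three medians in each vertical. The values are the chart's data labels; assigning them to verticals assumes the labels follow the axis order (Education, Enterprise, Manufacturing, Retail_Hospitality). The comparison of medians is only indicative, since medians of separate stages do not add up to a median end-to-end time.

```python
# Median latency in ms per vertical: (DHCP, DNS, WAN, AAA).
# Vertical assignment assumes the data labels follow the axis order.
medians = {
    "Education": (4, 40, 79, 376),
    "Enterprise": (4, 44, 89, 335),
    "Manufacturing": (4, 39, 75, 467),
    "Retail_Hospitality": (4, 50, 108, 453),
}

aaa_ratio = {}
for vertical, (dhcp, dns, wan, aaa) in medians.items():
    others = dhcp + dns + wan
    aaa_ratio[vertical] = aaa / others
    print(f"{vertical}: AAA {aaa} ms vs DHCP+DNS+WAN {others} ms "
          f"({aaa_ratio[vertical]:.1f}x)")
```

Under that assumption, the AAA median is more than twice the other three combined in every vertical.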
OCE: Short Connection Times
Frame exchange between client and AP:
1. Auth. Frame (FILS AUTH)
2. Auth. Response (FILS AUTH)
3. Assoc. Request
4. Assoc. Response
EAP-RP used to reduce delays; FILS key generated during initial authentication with AAA.
Request for IP address (DHCP messages) can be piggybacked; AP can send IP address.
Higher layer packet container IE added to assoc. frames.
Avoids delays due to security and IP address assignment.
Key Take-Aways
Connectivity failures observed in production: ~6% of connections, ~5% of clients.
Can affect user experience
It's NOT always Wi-Fi
DHCP & DNS failures dominate
Low link speed and low RSSI are the dominant performance factors.
Need good visualization & automated analysis tools
Benchmark organizations against their peers/vertical
Improve overall user satisfaction of existing deployments
Cross customer analysis possible only through cloud.
Thank You!
@gopinathkn @mojonetworks_
Abhishek Kunal