WiFi Issues In The Wild A view from the cloud #WLPC Phoenix 2018 KN Gopinath (VP of Engineering) Karan Gupta (Senior Researcher)


Post on 30-Dec-2019


Page 1

WiFi Issues In The Wild
A view from the cloud #WLPC Phoenix 2018

KN Gopinath (VP of Engineering)
Karan Gupta (Senior Researcher)

Page 2

© Mojo Networks

Mojo Cloud Architecture - Inception in 2007, Evolving Continuously

1. Management Plane - Centralized: Access | Security | Engagement. Management traffic to and from cloud.
2. Control Plane - Distributed: Zero functionality loss if connectivity with cloud breaks.
3. Data Plane - Flexible
3a. Local Breakout of Data
3b. Data Tunnel
Virtualized Traffic Aggregator

Data Integration (Syslog, RESTful APIs)
• Logging
• SIEM
• CRM

SCALE: One example: A Cloud instance is managing 100,000 LIVE APs

ROBUST: Works even if connectivity to manager is LOST

Page 3

Cognitive Wi-Fi Powered Through Cloud

1. Management Plane - Centralized
2. Control Plane - Distributed
3. Data Plane - Flexible
3a. Local data breakout
3b. Data tunnel
4. Cognition Plane - Intelligent

BIG DATA: Store key client parameters & run ML algorithms.

SMART EDGE APs: Failure Detection, “Needle in the haystack”

Page 4

Wi-Fi User Experience: Connectivity

Client Events: Sent in real time to the cloud.

Client Journey: Extending the philosophy of Mojo Packets (launched in 2013)

Page 5

Wi-Fi User Experience: Performance

Anomaly Detection & Baselining of various performance parameters computed and stored in the cloud.
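The slide does not say which baselining method is used. As a minimal sketch, assuming a simple median and median-absolute-deviation (MAD) rule rather than the actual Mojo algorithm, a per-metric baseline and anomaly check might look like this. The function name, the factor `k`, and the sample latencies are all hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def baseline_and_anomalies(
    history: Sequence[float], recent: Sequence[float], k: float = 3.0
) -> Tuple[float, List[float]]:
    """Return (baseline, anomalies) using a median/MAD rule.

    ASSUMPTION: this rule is illustrative only; the source does not
    describe the algorithm that runs in the cloud.
    """
    base = median(history)
    # Median absolute deviation: a robust measure of the normal spread.
    mad = median([abs(x - base) for x in history])
    limit = k * mad
    # Flag recent values that sit further from the baseline than the limit.
    anomalies = [x for x in recent if abs(x - base) > limit]
    return base, anomalies


# Hypothetical latency samples in ms for one customer.
history_ms = [40, 42, 38, 41, 39, 40, 43]
recent_ms = [41, 43, 120]
baseline_ms, outliers = baseline_and_anomalies(history_ms, recent_ms)
print(baseline_ms, outliers)
```

A median-based rule is used here only because it is robust to the occasional extreme sample; if the history has no spread at all, any deviation from the baseline is flagged.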

Page 6

User Application Experience

NEW! Launching at WLPC!

Deep Packet Inspection, Machine Learning

Page 7

Failure Analysis Overview

Reference Data

Connectivity Failures

Performance Failures

Conclusion

Page 8

Reference Data

Clients: 237k+
Associations: 31M
Applications: 400+
Duration: 1 week
Verticals: Enterprise, Education, Manufacturing, Retail & Hospitality

The data is anonymized and is a subset from our production cloud.

Page 9

The State of Apps

[Bar chart: % of customers (0 to 100) using each application. Applications shown: Google, Google APIs, Amazon Web Services, Apple, Facebook, Google Play, iTunes, YouTube, Microsoft, Akamai, iCloud, gmail, Doubleclick, Windows Update, Twitter, MS Online, Apple Update, Facebook Video, MS Office 365, Yahoo, Google Ads, Amazon, Spotify, ICMP, Adobe, Google Drive, MS Outlook, STUN, LinkedIn, Snapchat, Netflix Video Stream, Exchange Online, CIFS, WhatsApp Media Message, Instagram, APNS, AppNexus, Google Analytics, Crashlytics, Integral Ad Science.]

Page 10

Wi-Fi Connectivity Failure & Latencies

Page 11

WiFi almost always gets blamed!

Page 12

Client Connectivity Failures

Ultimate Truth: Instrumented AP Driver Code to tap into client’s state machine.

Association Failures
• AP association limit exceeded
• Capability mismatch
• Generic assoc. failure

Authentication Failures
• Fast roaming failed
• RADIUS auth. failure
• RADIUS Server not reachable
• Incorrect PSK
• EAPOL 4-Way handshake failed

Network Failures
• DHCP failure
• DNS failure

Page 13

The State of WiFi Connectivity

Successful Connections: 94%
Failed Connections: 6%

More than 3000 connections happen in our network per minute (on average).

Approximately 6% of connections fail.

Note that this corresponds to about 5% of the clients.
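The per-minute figure can be cross-checked against the Reference Data slide. The sketch below assumes that each of the 31M associations in the one-week sample counts as one connection attempt, which the slides do not state explicitly; the variable and function names are illustrative.

```python
# Figures quoted on the Reference Data and State of WiFi Connectivity slides.
ASSOCIATIONS_PER_WEEK = 31_000_000  # "31M" associations over 1 week
FAILURE_SHARE = 0.06                # "approximately 6% of connections fail"

MINUTES_PER_WEEK = 7 * 24 * 60      # 10,080 minutes


def per_minute(total_per_week: float) -> float:
    """Average number of events per minute, given a one-week total."""
    return total_per_week / MINUTES_PER_WEEK


# ASSUMPTION: one association is treated as one connection attempt.
connections_per_minute = per_minute(ASSOCIATIONS_PER_WEEK)
failed_per_minute = connections_per_minute * FAILURE_SHARE

print(f"connections/min: {connections_per_minute:.0f}")
print(f"failed/min:      {failed_per_minute:.0f}")
```

Under that assumption the weekly total works out to a little over 3000 connections per minute, in line with the slide, and to roughly 185 failed connections per minute.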

Page 14

Why do we see Connectivity Failures in an otherwise well configured and operational Network?

Association: 2%
Authentication: 46%
Network: 52%

More than 50% of the connection failures are due to wired-side issues.

Page 15

Failure Distribution - We DID notice some connectivity errors that were transient in nature

19% PSK Errors. Possibly due to Guest users and/or BYOD

13% EAPOL Errors. Mostly transient and self-correcting.

• Assoc. Failure: 2%
• MAC Filtering: 4%
• RADIUS Auth Failure: 4%
• RADIUS Server Unresponsive: 1%
• Incorrect PSK: 19%
• EAPOL Handshake Failure: 13%
• Portal Failures: 5%
• DHCP Failure: 25%
• DNS Failure: 27%
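The per-cause shares on this slide can be rolled up into the three categories shown on the previous slide. The sketch below is illustrative: the grouping of MAC filtering and portal failures under Authentication is an assumption (the slides do not place them in a category), chosen because it reproduces the 2% / 46% / 52% split.

```python
from collections import defaultdict
from typing import Dict

# Share of failed connections per cause, in percent, as labeled on the chart.
CAUSE_SHARE: Dict[str, int] = {
    "Assoc. Failure": 2,
    "MAC Filtering": 4,
    "RADIUS Auth Failure": 4,
    "RADIUS Server Unresponsive": 1,
    "Incorrect PSK": 19,
    "EAPOL Handshake Failure": 13,
    "Portal Failures": 5,
    "DHCP Failure": 25,
    "DNS Failure": 27,
}

# ASSUMPTION: MAC filtering and portal failures are counted under
# Authentication; the slides do not say which category they belong to.
CATEGORY_OF: Dict[str, str] = {
    "Assoc. Failure": "Association",
    "MAC Filtering": "Authentication",
    "RADIUS Auth Failure": "Authentication",
    "RADIUS Server Unresponsive": "Authentication",
    "Incorrect PSK": "Authentication",
    "EAPOL Handshake Failure": "Authentication",
    "Portal Failures": "Authentication",
    "DHCP Failure": "Network",
    "DNS Failure": "Network",
}


def share_by_category(
    cause_share: Dict[str, int], category_of: Dict[str, str]
) -> Dict[str, int]:
    """Sum the per-cause percentages within each failure category."""
    totals: Dict[str, int] = defaultdict(int)
    for cause, share in cause_share.items():
        totals[category_of[cause]] += share
    return dict(totals)


category_share = share_by_category(CAUSE_SHARE, CATEGORY_OF)
print(category_share)
```

With this grouping, DHCP and DNS failures together account for the "more than 50%" attributed to the wired side.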

Page 16

Connectivity – With Respect To 2.4 & 5 GHz Bands

[Bar chart: % of clients (0 to 8) with Association, Authentication, and Network failures, in the 2.4GHz and 5GHz bands.]

How many of you think that the connectivity issues are similar in both the bands?

Page 17

Connectivity Failures – With Respect To Verticals

[Bar chart: % of connections (0% to 100%) by failure category (Association, Authentication, Network) for Enterprise, Manufacturing, Education, and Retail & Hospitality.]

DHCP/DNS errors predominate in Retail & Hospitality, due to the transient nature of guest users.

Page 18

State of AAA Latency For Successful Connections

[Histogram: % of customers (y-axis, 0 to 20) by AAA latency in ms (x-axis, 20 ms bins labeled from (0,20] to (960,980]).]

~ 62% of the customers have their baseline latency below 500ms
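The histogram uses half-open 20 ms bins such as (0,20] and (40,60]. As a rough sketch of how such a chart and a "share below a threshold" figure could be derived from per-customer baseline latencies, consider the following. The function names and the sample values are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def bin_label(latency_ms: float, width: int = 20) -> str:
    """Label of the half-open bin (lo,hi] that contains latency_ms."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    hi = math.ceil(latency_ms / width) * width
    return f"({hi - width},{hi}]"


def histogram_percent(
    latencies_ms: Sequence[float], width: int = 20
) -> Dict[str, float]:
    """Percentage of customers that fall into each latency bin."""
    counts = Counter(bin_label(x, width) for x in latencies_ms)
    total = len(latencies_ms)
    return {label: 100.0 * n / total for label, n in counts.items()}


def share_at_most(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Percentage of customers whose baseline is at or below the threshold."""
    hits = sum(1 for x in latencies_ms if x <= threshold_ms)
    return 100.0 * hits / len(latencies_ms)


# Hypothetical per-customer baseline latencies in ms.
sample_ms = [12, 18, 35, 60, 240, 480, 520, 900]
print(histogram_percent(sample_ms))
print(share_at_most(sample_ms, 500))
```

Because the bins are closed on the right, a baseline of exactly 20 ms lands in (0,20], and a threshold such as 500 ms lines up with the upper edge of the (480,500] bin.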

Page 19

State of DHCP Latency For Successful Connections

[Histogram: % of customers (y-axis, 0 to 100) by DHCP latency in ms (x-axis, 20 ms bins labeled from (0,20] to (960,980]).]

~ 80% of the customers have their baseline latency below 20ms

And ~ 90% have their baseline latency below 100ms

Page 20

Wi-Fi Performance Failures & Latencies

Page 21

Client Performance Failures – An Overview

An AP monitors 300+ counters, but the following four key metrics capture client health: low data rate, low RSSI, stickiness, and high retry rate.

Low Data Rate guideline is 20 Mbps.

Low RSSI guideline is -70 dBm.
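The two guidelines above can be turned into a simple per-client check. The sketch below is illustrative only: the 20 Mbps and -70 dBm thresholds come from the slide, while the retry-rate threshold, the strict "below" comparison, and all names are assumptions. Stickiness is left out because the slides do not define how it is measured.

```python
from dataclasses import dataclass
from typing import List

LOW_DATA_RATE_MBPS = 20.0   # guideline from the slide
LOW_RSSI_DBM = -70.0        # guideline from the slide
# ASSUMPTION: hypothetical threshold; the slide gives no retry-rate guideline.
HIGH_RETRY_RATE_PCT = 30.0


@dataclass
class ClientSample:
    """One observation of a client as seen by the AP (hypothetical shape)."""
    data_rate_mbps: float
    rssi_dbm: float
    retry_rate_pct: float


def health_flags(sample: ClientSample) -> List[str]:
    """Return the performance issues that apply to this client sample."""
    flags: List[str] = []
    # ASSUMPTION: a value strictly below the guideline counts as "low".
    if sample.data_rate_mbps < LOW_DATA_RATE_MBPS:
        flags.append("Low Data Rate")
    if sample.rssi_dbm < LOW_RSSI_DBM:
        flags.append("Low RSSI")
    if sample.retry_rate_pct > HIGH_RETRY_RATE_PCT:
        flags.append("High Retry Rate")
    return flags


print(health_flags(ClientSample(12.0, -75.0, 5.0)))
print(health_flags(ClientSample(100.0, -55.0, 5.0)))
```

A client can carry several flags at once, which is why the per-issue percentages on later slides need not add up to the share of unique affected clients.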

Page 22

The State of WiFi Performance

[Bar chart: % of clients (0 to 100) in each category: Low Data Rate, Low RSSI, Stickiness, High Retry Rate, Unaffected.]

19% of the clients are affected by performance issues.

(vs. 5% of clients that are affected by connectivity issues)

Low Data Rate (17%) is the dominant factor.
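A single client can show more than one issue, so the share of unique affected clients is the size of the union of the per-issue client sets, not the sum of the per-issue percentages. The sketch below uses made-up client IDs, chosen only so that the result matches the 17% and 19% headline figures; the 9% low-RSSI share and all names are hypothetical.

```python
from typing import Dict, Set


def unique_affected_percent(
    issues: Dict[str, Set[str]], total_clients: int
) -> float:
    """Percentage of clients that appear in at least one issue set."""
    affected: Set[str] = set().union(*issues.values())
    return 100.0 * len(affected) / total_clients


# Hypothetical population of 100 clients with overlapping issues.
TOTAL_CLIENTS = 100
issues = {
    "Low Data Rate": {f"client-{i}" for i in range(0, 17)},   # 17 clients
    "Low RSSI": {f"client-{i}" for i in range(10, 19)},       # 9 clients
}

summed = sum(len(clients) for clients in issues.values())
print(f"sum of per-issue counts: {summed}")
print(f"unique affected: {unique_affected_percent(issues, TOTAL_CLIENTS)}%")
```

In this made-up population the per-issue counts add up to 26, but only 19 distinct clients are affected because 7 of them have both issues.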

Page 23

Performance – With Respect To 2.4 & 5 GHz Bands

[Bar chart: % of clients (0 to 40) affected by Low Data Rate, Low RSSI, High Retry Rate, and Stickiness, in the 2.4GHz and 5GHz bands.]

The 2.4GHz band has about 3 times the performance issues of the 5GHz band.

• 2.4GHz: 30% unique affected (3x of 5GHz)
• 5GHz: 11% unique affected

Clients have a tendency to choose 2.4 GHz at lower RSSIs.

Tx power and cell size matter.

Page 24

Performance Issues– With Respect To Verticals

[Bar chart: % of clients (0 to 100) affected by Low Data Rate, Low RSSI, High Retry Rate, and Stickiness, for Enterprise, Retail & Hospitality, Manufacturing, and Education.]

Page 25

State of DNS Latency in % Customers

[Histogram: % of customers (y-axis, 0 to 50) by DNS latency in ms (x-axis, 20 ms bins labeled from (0,20] to (960,980]).]

~ 70% of the customers have their baseline latency below 100ms

Page 26

State of WAN Latency in % Customers

[Histogram: % of customers (y-axis, 0 to 30) by WAN latency in ms (x-axis, 20 ms bins labeled from (0,20] to (960,980]).]

~ 51% of the customers have their baseline latency below 100ms

Page 27

Benchmarking Latencies Across Verticals

Median latency (ms) by vertical:

Vertical            DHCP  DNS  WAN  AAA
Education              4   40   79  376
Enterprise             4   44   89  335
Manufacturing          4   39   75  467
Retail_Hospitality     4   50  108  453

Clearly AAA latency needs to be optimized.
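One way to use these medians is to benchmark a single customer against its vertical. The sketch below assumes that the chart's data labels are listed series by series (DHCP, DNS, WAN, AAA), each in axis order (Education, Enterprise, Manufacturing, Retail_Hospitality); that reading, the function name, and the sample customer are assumptions rather than anything stated in the deck.

```python
from typing import Dict

# Median latency in ms per vertical, read from the chart's data labels.
# ASSUMPTION: labels are ordered series by series, each in axis order.
VERTICAL_MEDIAN_MS: Dict[str, Dict[str, float]] = {
    "Education": {"DHCP": 4, "DNS": 40, "WAN": 79, "AAA": 376},
    "Enterprise": {"DHCP": 4, "DNS": 44, "WAN": 89, "AAA": 335},
    "Manufacturing": {"DHCP": 4, "DNS": 39, "WAN": 75, "AAA": 467},
    "Retail_Hospitality": {"DHCP": 4, "DNS": 50, "WAN": 108, "AAA": 453},
}


def compare_to_vertical(
    vertical: str, measured_ms: Dict[str, float]
) -> Dict[str, float]:
    """Ratio of a customer's latency to its vertical's median, per service.

    A ratio above 1.0 means the customer is slower than its peers.
    """
    peers = VERTICAL_MEDIAN_MS[vertical]
    return {svc: measured_ms[svc] / peers[svc] for svc in measured_ms}


# Hypothetical Enterprise customer.
ratios = compare_to_vertical("Enterprise", {"DNS": 88, "AAA": 335})
print(ratios)
```

Expressing the result as a ratio keeps services with very different scales, such as DHCP at a few milliseconds and AAA at several hundred, comparable on one report.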

Page 28

OCE: Short Connection Times

Frame exchange:
• Auth. Frame (FILS AUTH)
• Auth. Response (FILS AUTH)
• Assoc. Request
• Assoc. Response

Notes:
• EAP-RP used to reduce delays; FILS key generated during initial authentication with AAA
• DHCP Messages: Higher layer packet container IE added to assoc. frames
• Request for IP address can be piggybacked
• AP can send IP address
• Avoids delays due to security and IP address assignment

Page 29

Key Take-Aways

Connectivity failures observed in production: about 6% of connections, affecting about 5% of clients.

Can affect user experience

It's NOT always Wi-Fi

DHCP & DNS failures dominate

Low link speed and low RSSI are the dominant performance factors.

Need good visualization & automated analysis tools

Benchmark organizations against their peers/vertical

Improve overall user satisfaction of existing deployments

Cross-customer analysis is possible only through the cloud.

Page 30

Thank You!

@gopinathkn @mojonetworks_

Abhishek Kunal