Date post: | 06-Jan-2017 |
Category: | Technology |
Upload: | thousandeyes |
1
Network Intelligence Without Borders
Mohit Lad, CEO and Co-founder
2
About ThousandEyes
Founded by network experts; strong investor backing
Relied on for critical operations by leading enterprises
Recognized as an innovative new approach
ThousandEyes delivers network intelligence into every network.
30 Fortune 500 companies
5 of the top 5 SaaS companies
4 of the top 6 US banks
3
When You Think of Network Troubleshooting
4
Legacy Environments
• On-premises apps
• Users in branch offices over wired connections
• MPLS backbone
Diagram labels: NY Branch, HK Branch, Datacenter, MPLS
5
Internet Centric Environment
• Adoption of cloud applications
• Split-tunnel from branch offices
• Direct Internet connectivity between branch offices
• Wireless becoming primary connectivity at branch offices
• Remote users accessing cloud applications directly
Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet
6
ThousandEyes Cloud Agents
Diagram labels: NY Branch, Datacenter, O365, Internet
7
ThousandEyes Enterprise Agents
Diagram labels: NY Branch, Datacenter, O365, Internet
8
ThousandEyes Endpoint Agents
Diagram labels: NY Branch, Datacenter, O365, Internet
9
Product Design Principles
Intuitive & Effective UI
• Powerful visualizations to model complex data
• UI design that is reusable and scalable
• Seamless support help
Harness the Power of SaaS
• Minimal deployment effort
• Auto-updates
• Centralized configuration
• Cross-customer data correlation and analysis
• Easy data sharing between different customers
Innovative Data Collection & Analytics
• Measure black-box environments using active probing
• Measure with minimum instrumentation
10
Rest of the Day
• Tackling Hybrid Network Environments with Enterprise Agents – Nick Kephart
• End-to-End Visibility with Endpoint Agent – Scott Cressman, Martin Dam
• Internet Outage Detection – Ricardo Oliveira
11
Tackling Hybrid Network Environments with Enterprise Agents
Nick Kephart
12
Enterprise Agent: Internal Vantage Point
Key Use Cases
• Internet connectivity of ISP ingress and egress
• WAN visibility between branches and data centers
• Performance of web, voice and FTP application traffic
Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet
13
Deploying Enterprise Agents
• Easily deployable across the enterprise WAN and data center
Deployment options:
• Virtual Appliance
• Linux Package
• Docker Container (New): locations with containerized monitoring and operations tools
• Intel NUC Installer (New): for remote branches and stores with limited IT infrastructure
• Cisco IOS Virtual Container (New): branch and WAN routers (IOS XE 3.17+ on ASR 1000 and ISR 4000)
14
Visualizing the Entire Network Path
Highlights
• Forward and reverse path (helpful for asymmetric routing)
• Measure and locate changes in loss, latency and QoS in each direction
• Test UDP in addition to TCP
15
End-to-End Visibility with Endpoint Agent
Scott Cressman, Martin Dam
16
End User Visibility Challenges
• Remote and traveling workers
• SaaS deployments
• LAN and WAN issues in satellite offices
Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet
17
Today’s “Solutions”
18
Enter ThousandEyes Endpoint Agent
You can’t get this from any other monitoring solution, period.
• Extends visibility to the end-user, in the office, at home, on-the-go
• Troubleshoot individual user sessions with live performance data
• Analyze trends across user populations, applications, geographies
19
How Endpoint Agent Works
• Lightweight client software: Windows 7+, Mac OS X 10.9+
• Negligible resource consumption: typically <1% CPU, <40MB mem, <50MB disk
• Easy deployment via standard tools: msi & pkg installers w/ auto-registration
• End-user & background components: browser plugin (Chrome & IE) & system service
• Always up-to-date: updates automatically, runs in the background
Data collected:
• WEB/APPLICATION: completion, availability, response time, page load waterfall
• NETWORK: loss, latency, jitter, failures, path visualization, wireless topology, VPN, proxy, Wi-Fi quality
(live user sessions!)
Browser-based web applications
• Only collects data for domains you choose to monitor
• Data streamed instantly to ThousandEyes service
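To make the NETWORK metrics above concrete, here is a minimal sketch of how loss, latency and jitter could be derived from one round of probe round-trip times. It is not the Endpoint Agent's implementation: the function name `summarize_probes` is hypothetical, and jitter is assumed here to be the mean absolute difference between consecutive replies.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize one round of probes. None marks a probe with no reply."""
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    sent = len(rtts_ms)

    # Loss: share of probes that never got a reply.
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0

    # Latency: average round-trip time of the probes that did reply.
    latency_ms = mean(replies) if replies else None

    # Jitter (assumed definition): mean absolute difference between
    # consecutive replies.
    deltas = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter_ms = mean(deltas) if deltas else None

    return {"loss_pct": loss_pct, "latency_ms": latency_ms, "jitter_ms": jitter_ms}


summary = summarize_probes([10.0, 12.0, None, 11.0])
```

With four probes and one missing reply, `summary` reports 25% loss, 11 ms average latency and 1.5 ms jitter.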
20
Complete Visibility from End User to Application
21
Internet Outage Detection
Ricardo Oliveira, CTO and Co-founder
22
The Problem Landscape
• Lack of visibility into apps relying on the Internet (UCaaS, SaaS, IaaS, PaaS)
• Lack of visibility into wireless/remote/mobile users
• Traditional NPM solutions are designed for static clients and on-prem apps
– Packet capture
– SNMP polling
Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet
23
ThousandEyes Agents
Diagram labels: NY Branch, Datacenter, O365, Internet
24
Drive for Internet Outage Detection
• The Internet is a shared network: the same event impacts multiple customers
• Harness data from multiple customers for more accurate inference of the problem
• Drive more value to customers with knowledge of the depth and breadth of the problem
25
Overview: Internet Outage Detection
Traffic Outage Detection
• Detect outages in ISPs and understand their impact both globally and as it relates to a specific customer
Routing Outage Detection
• See the global and account scope, as well as likely root cause of BGP reachability outages
26
How Traffic Outage Detection Works
1. Anonymized (HTTP) traffic data is aggregated from all tests across the entire user base
2. Algorithms then look for patterns in path traces terminating in the same ISP
3. Noisy interfaces and networks not belonging to ISPs are excluded
Diagram labels: New York Cloud Agent, Boston Enterprise Agent, Los Angeles Cloud Agent; Level 3 in San Jose, Cogent in Denver; Salesforce, NY Times; Customer 1, Customer 2
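The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not ThousandEyes' algorithm: the `Trace` fields, the `find_traffic_outages` function, the `min_tests` threshold and the example addresses (taken from documentation ranges) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Trace:
    """One anonymized path trace (all field names are hypothetical)."""
    test_id: str           # anonymized test identifier
    reached_target: bool   # True if the trace reached the application
    last_hop: str          # last interface that responded
    last_hop_network: str  # network that owns that interface


def find_traffic_outages(
    traces: Iterable[Trace],
    isp_networks: Set[str],
    noisy_interfaces: Set[str],
    min_tests: int = 2,  # assumed threshold, not taken from the slides
) -> Dict[Tuple[str, str], List[str]]:
    """Group failed traces by the ISP interface where they terminate."""
    affected = defaultdict(set)
    for trace in traces:
        if trace.reached_target:
            continue  # step 2: only traces that terminate early matter
        if trace.last_hop_network not in isp_networks:
            continue  # step 3: skip networks not belonging to ISPs
        if trace.last_hop in noisy_interfaces:
            continue  # step 3: skip noisy interfaces
        affected[(trace.last_hop_network, trace.last_hop)].add(trace.test_id)
    # Keep interfaces where several independent tests terminate.
    return {key: sorted(ids) for key, ids in affected.items() if len(ids) >= min_tests}


# Step 1: traces aggregated from tests owned by different customers.
traces = [
    Trace("customer1-nytimes", False, "192.0.2.1", "Level 3"),
    Trace("customer2-salesforce", False, "192.0.2.1", "Level 3"),
    Trace("customer1-salesforce", True, "198.51.100.7", "Salesforce"),
    Trace("customer2-nytimes", False, "203.0.113.9", "Cogent"),
    Trace("customer1-internal", False, "198.51.100.20", "Customer 1"),
]
outages = find_traffic_outages(traces, {"Level 3", "Cogent"}, {"203.0.113.9"})
```

Only the Level 3 interface survives the filters here: two tests from different customers terminate at it, while the Cogent interface is listed as noisy and the remaining traces either reached their target or ended outside an ISP.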
27
Traffic Outage Detection
Account scope
Global scope
Severity and scope of the issue at this interface
28
Traffic Outages All the Time
• ~170 affected interfaces / hour
29
Routing Outage Detection
Aggregates reachability issues in routing data from 350 routers
Global scope
Account scope
Root cause analysis
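As a rough illustration of aggregating reachability across routers, the sketch below computes, for each prefix, the share of routers that still hold a route to it. The input shape and the function names are assumptions made for this example, not the product's data model.

```python
from typing import Dict, Iterable, List, Set


def reachability_pct(router_routes: Dict[str, Set[str]], prefix: str) -> float:
    """Percentage of routers that currently hold a route to the prefix."""
    if not router_routes:
        return 0.0
    holders = sum(1 for routes in router_routes.values() if prefix in routes)
    return 100.0 * holders / len(router_routes)


def affected_prefixes(
    router_routes: Dict[str, Set[str]],
    prefixes: Iterable[str],
    threshold_pct: float = 100.0,  # assumed: anything below full reachability
) -> List[str]:
    """Prefixes whose reachability across all routers is below the threshold."""
    return [p for p in prefixes if reachability_pct(router_routes, p) < threshold_pct]


# Hypothetical snapshot: which prefixes each router has a route for.
snapshot = {
    "router-a": {"10.20.0.0/16", "10.20.30.0/24"},
    "router-b": {"10.20.0.0/16"},
    "router-c": {"10.20.0.0/16"},
    "router-d": {"10.20.0.0/16"},
}
affected = affected_prefixes(snapshot, ["10.20.0.0/16", "10.20.30.0/24"])
```

In this snapshot the /24 is reachable from one of four routers (25%), so it is the only prefix reported as affected.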
30
Routing Outages All the Time
• ~1.6k prefixes affected / hour
31
Recent Major Outages Detected
• Hurricane Electric route leak affecting AWS
• Trans-Atlantic issues in Level 3
– https://blog.thousandeyes.com/trans-atlantic-issues-level-3-network/
• Tata and TI Sparkle issues with submarine cable
– https://blog.thousandeyes.com/smw-4-cable-fault-ripple-effects-across-networks/
• Hurricane Electric removed >500 prefixes
• Tata cable cut in Singapore affecting Dropbox
• Level 3, NTT routing issues affecting JIRA
– https://blog.thousandeyes.com/identifying-root-cause-routing-outage-detection/
• Widespread issues in Telia’s network in Ashburn
– https://blog.thousandeyes.com/analyzing-internet-issues-traffic-outage-detection/
Timeline dates: April 23, May 3, May 20, June 6, June 24, July 10, July 17
32
Examples of Notable Outages
33
1. Network Layer Issues in Telia in Ashburn
Detected outage coincides with packet loss spikes
Ashburn, VA is “ground zero” for this outage
https://fvqmu.share.thousandeyes.com/
34
Specific Failure Points in Telia
High severity and wide scope (Outages affecting at least 20 tests for a NA/EU interface are likely to be wide in scope)
Terminal nodes in Telia
35
2. Hurricane Electric Route Flap
Detected outage coincides with spike in AS path changes
Root cause analysis points to Hurricane Electric and Telx
https://njjgkif.share.thousandeyes.com/
36
Route Flap by Hurricane Electric
Hurricane Electric
Routes flap from using HE to NTT, then back to HE
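A route flap like this one shows up as a burst of AS path changes in which the route leaves its original path and later returns to it. The sketch below is a minimal illustration of that check; the function names and the simplified path tuples are assumptions, and real AS paths would carry AS numbers rather than provider names.

```python
from typing import Sequence, Tuple

Path = Tuple[str, ...]


def count_path_changes(paths: Sequence[Path]) -> int:
    """Number of times the observed path differs from the previous observation."""
    return sum(1 for prev, cur in zip(paths, paths[1:]) if prev != cur)


def is_flap(paths: Sequence[Path]) -> bool:
    """True if the route left its initial path and later came back to it."""
    if not paths:
        return False
    initial = paths[0]
    left_initial = False
    for path in paths[1:]:
        if path != initial:
            left_initial = True
        elif left_initial:
            return True
    return False


# Time-ordered observations for one prefix: via HE, then NTT, then HE again.
observed = [
    ("Hurricane Electric", "destination"),
    ("NTT", "destination"),
    ("Hurricane Electric", "destination"),
]
changes = count_path_changes(observed)
flapped = is_flap(observed)
```

For the HE to NTT to HE sequence, `changes` is 2 and `flapped` is `True`; a route that moves to NTT and stays there counts one change and is not reported as a flap.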
37
Traffic Issues in Hurricane Electric
Hurricane Electric
38
3. NTT and Level 3 Routing Issues Affect JIRA
JIRA saw 0% availability and 100% packet loss
Most affected interfaces are in Ashburn, VA
https://ncigwwph.share.thousandeyes.com/
39
Traffic Terminating in NTT
Traffic paths originally traversed Level 3 and NTT
Traffic paths then change to traverse only NTT, terminating there
40
JIRA’s /24 Prefix Becomes Unreachable
As the primary upstream ISP, Level 3 is associated with the most affected routes
Routes through upstream ISPs NTT and Level 3 all withdrawn
41
Routers Begin Using Misconfigured /16 Prefix
The backup /16 prefix directs to NTT, not JIRA’s network. This is why the traffic path changed to traverse only NTT, terminating there when JIRA’s IP couldn’t be found in NTT’s network.
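The fallback described above follows from longest-prefix matching: routers prefer the most specific route, so once the /24 is withdrawn the covering /16 takes over. The sketch below illustrates this with Python's standard `ipaddress` module. The prefixes and the destination address are hypothetical placeholders, not JIRA's real address space.

```python
import ipaddress
from typing import Dict, Optional, Tuple


def best_route(dest: str, routes: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (prefix, next_hop) of the most specific route covering dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest-prefix match: the largest prefix length wins.
    network, next_hop = max(matches, key=lambda item: item[0].prefixlen)
    return str(network), next_hop


# Hypothetical routing table before the outage.
routes = {
    "10.20.30.0/24": "JIRA network",  # specific prefix for the application
    "10.20.0.0/16": "NTT",            # backup covering prefix
}
before = best_route("10.20.30.5", routes)

# The /24 is withdrawn; only the backup /16 still covers the address.
del routes["10.20.30.0/24"]
after = best_route("10.20.30.5", routes)
```

Before the withdrawal the /24 wins and traffic heads to the application's network. Afterwards the same destination matches only the /16, so traffic is sent toward NTT, which is consistent with the paths terminating there.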
42
What’s Next
Traffic Outages @ Cloud
• IaaS/PaaS (CDNs, hosting, DNS providers)
• SaaS (+ app context)
Routing Outages
• Leaks and hijacks
Outage Event Stream
• Outage geo + topology maps
• Alerts based on outage impact/location/type/etc.
43
Outage Created by Level 3 Flap
https://btwzofam.share.thousandeyes.com
44
Tips for Diagnosing Internet Outages
• Look for purple indicators and the ‘Outage Detected’ dropdown when investigating issues; these indicate detected outages!
• Use quick links or select specific nodes/ASes to see how paths have changed over time
• Correlate data from the web, network and routing layers to analyze root cause
• See our blogs and Knowledge Base articles for more info:
– Blog on Traffic Outage Detection: https://blog.thousandeyes.com/analyzing-internet-issues-traffic-outage-detection/
– Blog on Routing Outage Detection: https://blog.thousandeyes.com/identifying-root-cause-routing-outage-detection/
– Knowledge Base: https://support.thousandeyes.com/entries/110214366
45
Thank You
@thousandeyes