Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Upload: tavion-legate

Post on 31-Mar-2015


Page 1: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Inter-Domain Traffic Engineering

Principles, Applications and Case Studies

Page 2: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Who We Are

Josh Wepman Applications Engineer/Snake Oil Salesman Ixia NetOps [email protected]

Joe Abley Toolmaker/Engineer/Token Canadian MFN PAIX [email protected]

Page 3: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

What We Are Talking About

Inter-domain Measurement, Analysis and Control

Improving Connectivity

With whom? Where? At what speed?

Page 4: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

What we are NOT talking about

MPLS

DiffServ

RSVP

CR-LDP

All sorts of other words with lots of capital letters that have become associated with “traffic engineering…”

Page 5: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Goals For The Afternoon

Methods and concepts on how to "improve" inter-domain connectivity

Depending on who YOU are, "improve" will have different meanings

Finding ways to reduce impact of failure in peer or transit networks, a.k.a. "increasing reliability"

WARNING: Some operational complexity may arise! Put on your peril-sensitive glasses...

Page 6: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Presentation Outline

Inter-Domain TE Goals Definition

Inter-domain TE Measurement

Applying Data to Address Your Goals

Eliciting Control and the Feedback-Loop

Conceptual Examples

Who is Doing This Stuff?

Real_Live_Network Examples

No Questions? Good!

Page 7: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Inter-Domain TE Goals Definition

Iteration-1 – Conceptual

Define Goals, Measure, Analyze, Refine Goals, Action

What is it you need to accomplish?

Page 8: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Examples of Goals

Need to offload my "NSFnet" peering links outbound (congestion management)

Need to expand my inter-domain peering links cluefully (growth)

Need to find some people to provide my services to (sales)

That's right, I said it…sell stuff!!!

Page 9: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Adjusting Your Assumptions

Be prepared to adjust your assumptions based on measured data!

What you planned to do, and what you end up doing may change substantially.

Do not fear - this is real network data!

Clue should increase as valid network data becomes available and is consulted

Page 10: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Data Needs…

What data sets are required?

Flow-export data

BGP routing data

Active measurement data

SNMP

Some public tools available (cflowd, zebra, ping, scotty, etc)

Some commercial products available…

Page 11: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Inter-domain TE Measurement

Also Known As: Getting good, problem/goal specific data!

Page 12: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Assumed Network Model

Hierarchical Network Model

Ingress/Egress Network services are separated from Transit Services

Works in other network models (as we will show), but this is what we are focusing on...

Page 13: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Hierarchical Network Model

[Diagram: hierarchical network model. Labels: Core1, Core2 (Core Network Services), Peer1, Peer2, AS2, AS3, AS3, AS4, AS9, LocalASN, RemoteASN]

Page 14: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Types of Data to Measure

Routing Data: focus here is BGP

Traffic Data: Flow-export V5 is the focus here

Active Measurement Performance Data: Ping/Traceroute/One-way delay/Jitter

Page 15: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Routing Data

Routers generally do this well

Core competency by design (Routers route...)

Different data sets are available for measurement

IBGP (Good if you are looking at the whole system, looking outbound or using a flat network model)

Route-Reflection (Often needed for inbound analysis, can create some complexity in flat network models)

EBGP (Good for seeing your neighbor's view of you)

Choose the right one to measure based on your needs/goals

Page 16: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Routing Data – In/Outbound

[Diagram: IBGP vs. Route-Reflection. Labels: Core1, Core2 (Core Network Services), Peer1, Peer2, AS2, AS3, AS3, AS4, AS9, LocalASN, RemoteASN, Collector, Routes, Data]

Page 17: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Routing Data – In/Outbound

When your goal is outbound characterization, and your measurement point is the exit point for traffic, IBGP is your guy/girl/other.

Routes are always external, and thus always propagated (sans election and policy of course)

“Protocols hate being anthropomorphized”

When your goal is inbound characterization, and your measurement point is the entry point for traffic, Route-Reflection must be used.

Only way to get internal routes “cleanly”

Page 18: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Route Data – Full Mesh (tangent)

Value of full mesh monitoring…

Historical route tracking

Policy benchmarking

Tracking MED-selection issues

Identifying disasters the FIRST time, cluefully

Don’t just wait for it to happen again! PLEASE! For everyone’s sake!

Slightly off topic, but pretty darn important!

Page 19: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Route Data – Full Mesh (pic)

[Diagram: six Core1/Core2 router pairs and a Collector]

Page 20: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Traffic Accounting Data

Also Known As:

Flow-export

NetFlow

Cflow

A MAJOR pain in the AS!

Page 21: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

The Quick Skinny on Flow

Packet and Byte counters per unique set of traffic attributes

Measured from strategic routers per input interface

Which interfaces depends on your defined goals/needs...

Has come a long way in the last few years

In some respects…

Page 22: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Flow Data Inbound - Easy

[Diagram: inbound flow data collection. Labels: Core1, Core2 (Core Network Services), Peer1, Peer2, AS2, AS3, AS3, AS4, AS9, LocalASN, RemoteASN, Collector, Routes, Data]

Page 23: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Flow Data Outbound - Easy

[Diagram: outbound flow data collection. Labels: Core1, Core2 (Core Network Services), Peer1, Peer2, AS2, AS3, AS3, AS4, AS9, LocalASN, RemoteASN, Collector, Routes, Data]

Page 24: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Flow Data Outbound - Harder

[Diagram: non-hierarchical topology. Labels: Core (x5), AS2, AS3, AS4, AS6]

Page 25: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Flow Data Outbound - Harder

Since flow-export data is inbound only, all potential feeder links in a non-hierarchical, mixed services device must be accounted for in order to catch all traffic outbound

Issue: How do you know what data coming in core link4 is bound for the local external link? Route Reflection is bad here! Can double-count!

Problem exacerbated by complex policy

Page 26: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

18 Words or less on flow data

Micro-management of networks based on flows == BAD

Macro-management of networks based on flows == GOOD

Page 27: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Operational Challenges (1)

Keep this in mind!

Gilb’s Law: “Anything can be measured in a way that is

superior to not measuring it at all.”

Page 28: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Operational Challenges (2)

ACLs vs. data-export in the great beast!

Sampled NetFlow on the GSR is usually distributed to the LCs

ACL > SNF > PIRC > IP Coloring > BGP Policy accounting > FR Traffic policing (which is not FR traffic shaping)

Apparently this changes in 12.0(18)S

Page 29: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Operational Challenges (3)

Some releases of JUNOS have bugs where only flow data from the highest-numbered ifIndex gets exported

Check for PR20159

Page 30: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Operational Challenges (4)

On high-speed interfaces, the best you can realistically do is sample at some ratio < 1:1

If you need to count bytes, this will introduce errors

If you need to compare samples, make sure the samples are normalized

This does NOT mean multiply by interval!

Lack of current research on statistical validity of flow data based on samples

Last research circa 1993

Research predates substantial HTTP traffic

Page 31: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Operational Challenges (5)

The Gilb-Wepman Construct: “The total P.I.T.A. factor experienced through the process of network measurement is far less than the total P.I.T.A. factor experienced through planning and engineering a network without network measurements.”

P.I.T.A. = Pain In The Ass

Those without customers may be unfamiliar with this term

Page 32: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Performance Data

Active measurement Round-trip vs. one-way

mrtg and link utilization

Important, but not part of our examples Short on time sadly…

Helps in goal selection and re-selection Bottom line – is it better or worse?

Page 33: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Applying Data to your Goals

What to do with all this data?

Traffic Accounting Data applied to Routing data?

Traffic Load per <something> attribute or route

The focus here is on traffic stats (byte and packet rates) per AS-PATH

Page 34: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

AS-PATH / Traffic-data tables

Traffic load per AS-PATH creates a tree of traffic relationships

(101) X-bits/sec
(101,1234) Y-bits/sec
(101,1234,9995) Z-bits/sec

101 -> 1234 -> 9995
X+Y+Z -> Y+Z -> Z

Addresses the middle mile AS’s instead of traditional first or last ASN.

Allows "TO" (source/sink) and "THROUGH" (transit) values instead of just "TO" values.
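The TO/THROUGH split above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout and the numeric rates standing in for X, Y and Z are assumptions, not output from any tool mentioned in the slides.

```python
from collections import defaultdict


def to_and_through(path_rates):
    """Split per-AS-PATH rates into TO and THROUGH totals per AS.

    path_rates maps an AS-PATH tuple (nearest AS first) to a rate in bits/sec.
    The last AS on a path is the source/sink ("TO"); every AS before it
    carries the traffic as transit ("THROUGH").
    """
    to_rate = defaultdict(float)
    through_rate = defaultdict(float)
    for path, bps in path_rates.items():
        if not path:
            continue
        last = path[-1]
        to_rate[last] += bps
        # A set removes repeated ASNs so prepended paths are counted once.
        for asn in set(path[:-1]) - {last}:
            through_rate[asn] += bps
    return dict(to_rate), dict(through_rate)


# Hypothetical rates standing in for X, Y and Z on the slide.
rates = {
    (101,): 300.0,            # X
    (101, 1234): 200.0,       # Y
    (101, 1234, 9995): 100.0, # Z
}
to_rate, through_rate = to_and_through(rates)

# Total seen per AS: 101 carries X+Y+Z, 1234 carries Y+Z, 9995 carries Z.
total_101 = to_rate[101] + through_rate[101]
total_1234 = to_rate[1234] + through_rate[1234]
total_9995 = to_rate[9995] + through_rate.get(9995, 0.0)
```

Keeping the two totals separate is what exposes the middle-mile networks: AS 1234 here has a modest TO value but also a THROUGH value that a last-ASN-only report would hide.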

Page 35: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Data Aggregation - Time

Aggregate data over timeframes (macro-level view)

Long term averages

Short term benchmarks

Of course, short term means “~long term”.

Micro-management of networks based on flows == BAD!

Page 36: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Data Aggregation - Interfaces

Aggregate across the set of interfaces that represent your problem statement

What interfaces am I interested in?

Can be interface specific (one)

Can be router specific (many)

Can be domain wide (all)

Can be N of M interfaces (some)

Pretty common…

Page 37: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

What to do with all this?

What does one do once they have all this data?

Page 38: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Eliciting Control and The Feedback Loop

Sit down, Josh

Begone with your Snake Oil

It’s time to beat on some routers

Page 39: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Assumptions about your Routing Architecture

Routes to external networks are in BGP

Your IGP tells you how to find the NEXT_HOP addresses in BGP

We select exit points for traffic based on BGP path selection, not some other weird thing

If your routing policy differs significantly from this, you have more problems than measurement can solve

Page 40: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Fixing Outbound Traffic

Mark policy on BGP routes at the place where you learn them

General policy -- prefer peering links over expensive transit links, prefer private peering links over public peering links

Specific policy -- temporarily avoid NAP X for traffic to AS Y, prefer AS C to reach remote network D

Page 41: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Tweakable Knobs

LOCAL_PREF

MED

AS_PATH

Check your vendor’s BGP path selection tiebreaker list, and choose a set of knobs that gives you the kind of control your policy dictates
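To make the knobs concrete, here is a deliberately simplified sketch of how the three attributes might interact when choosing an exit. It is not any vendor's real decision process: actual tiebreaker lists are longer, and MED is normally compared only between routes from the same neighbouring AS, which this sketch ignores. The Route class, the exit names and the AS numbers (taken from the documentation range) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    exit_point: str   # hypothetical label for where the route was learned
    local_pref: int
    as_path: tuple
    med: int = 0


def preference_key(route):
    # Assumed order: higher LOCAL_PREF wins, then shorter AS_PATH,
    # then lower MED. Real implementations have more steps.
    return (-route.local_pref, len(route.as_path), route.med)


def best_route(candidates):
    return min(candidates, key=preference_key)


# General policy from the earlier slide, expressed as LOCAL_PREF values:
# private peering over public peering over expensive transit.
candidates = [
    Route("transit-1", local_pref=100, as_path=(64500, 64510)),
    Route("public-peer-1", local_pref=200, as_path=(64501, 64502, 64510)),
    Route("private-peer-1", local_pref=300,
          as_path=(64503, 64504, 64505, 64510)),
]
best = best_route(candidates)  # private-peer-1, despite the longest AS_PATH
```

The point of the example is the ordering: a LOCAL_PREF set where the route is learned overrides AS_PATH length, so the later knobs only matter between routes your policy already treats as equal.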

Page 42: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Control of Outbound Traffic

Danger, Will Robinson!

Helpdesk phone may ring

Small change, pause, check, log, pause, breathe, repeat

Exit selection is a reasonably precise science

Page 43: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Fixing Inbound Traffic

Controlling inbound traffic flow is all about trying to influence the BGP path selection decisions that happen in networks you don’t control

Some of those networks you pay money to. Money is sometimes an appropriate weapon

It’s nice to buy people drinks at NANOG

Page 44: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Tweakable Knobs

Provider-specific knobs

whois -h whois.ra.net as1755

CIDR abuse

Cheap trick

Longest prefix wins

AS_PATH stuffing

AS_PATH pollution

Another cheap trick
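Why "longest prefix wins" makes CIDR abuse work can be shown with Python's standard ipaddress module. The prefixes below come from the documentation address ranges and the provider names are made up; the sketch only illustrates the forwarding rule, not any particular router's lookup.

```python
import ipaddress


def longest_match(destination, routes):
    """Return the (network, via) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, via) for net, via in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)


# Hypothetical announcements: the aggregate through one provider and a
# more-specific half of it through another.
routes = [
    (ipaddress.ip_network("198.51.100.0/24"), "provider-A"),
    (ipaddress.ip_network("198.51.100.0/25"), "provider-B"),
]

low_half = longest_match("198.51.100.10", routes)    # covered by the /25
high_half = longest_match("198.51.100.200", routes)  # only the /24 matches
```

Traffic for the lower half of the block follows the /25 regardless of AS_PATH length or anyone's LOCAL_PREF, which is why the trick is effective and also why it costs everyone else a routing table slot.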

Page 45: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Responsible Citizenship

Some tweakable knobs have an unwelcome impact on the networks of others

Have you met my friend, MED?

Your relationship with your target networks is symbiotic

It is inappropriate to make demands of someone else’s routing policy, but asking nicely is OK

Page 46: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Conceptual Examples (1)

Who are the top consumers of my network resources?

Top sources of traffic

Top sinks of traffic

Asymmetry

Page 47: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Conceptual Examples (2)

Traffic Aggregation Points and Peering Optimisation

Appropriate network expansion

Offloading the expensive peer

Mitigating settlement fees and traffic ratios

Mitigating congestion

Do it without MED selection issues

Maximize route availability (N>1 copies, not 1 or 0)

Page 48: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Conceptual Examples (3)

Theft-over-IP (how to know when peers are stealing from you)

Peers dumping traffic at you for routes you didn’t send them

Rather rude

Catch them in the act
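One way to catch this is to compare the destinations seen in flow records on a peering interface against the prefixes actually advertised to that peer. The sketch below assumes the flow data has already been reduced to (destination address, byte count) pairs and that the advertised prefix list is known; both inputs, and all addresses shown, are hypothetical.

```python
import ipaddress


def unadvertised_traffic(flows, advertised):
    """Sum bytes per destination that falls outside every advertised prefix.

    flows: iterable of (destination_ip, byte_count) seen inbound from a peer.
    advertised: prefixes we actually sent to that peer.
    """
    networks = [ipaddress.ip_network(prefix) for prefix in advertised]
    suspects = {}
    for dst, nbytes in flows:
        addr = ipaddress.ip_address(dst)
        if not any(addr in net for net in networks):
            suspects[dst] = suspects.get(dst, 0) + nbytes
    return suspects


# Hypothetical inputs using documentation address ranges.
advertised = ["192.0.2.0/24", "198.51.100.0/24"]
flows = [
    ("192.0.2.10", 1500),
    ("203.0.113.7", 4000),
    ("203.0.113.7", 1000),
    ("198.51.100.9", 700),
]
suspects = unadvertised_traffic(flows, advertised)
```

With sampled flow data the byte totals are only relative, but a non-empty result is still evidence worth a conversation with the peer.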

Page 49: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Who is doing this stuff?

Yahoo! - Jeffrey Papen (TUNDRA Tool)

Peering Analysis, Capacity Planning, Performance Analysis

Features:

Custom macros for AS analysis:

Source and Destination AS bandwidth details

Transit AS (hop counts) bandwidth summary data

Bandwidth forecasting; peering merit analysis

Billing formulas for cost/benefit budget analysis

Also:

Analyze internal usage for Charge Back Billing

POP-to-POP Network Performance Analysis (latency / loss)

DOS attack detection

Page 50: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Destination vs. Transit Traffic – UUNet (Yahoo – TUNDRA Output)

Page 51: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Who is doing this stuff?

MFN

Lots of people, we think

Not enough people, we think

Page 52: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Real Live Network Examples 1

We peer with a particular large regional ISP in several places. Due to various familiar reasons, the demands on the peering circuits approach supply

Who are the top talkers and top listeners that we reach via this peer?

Maybe we can peer with them directly

Not just sinks, but traffic aggregation points (middle mile)

Page 53: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Network Facts

Topology is not pure core/edge in some locations, so we might expect some complexities

All peering routers happen to be GSR12000s

Peering circuits are all OC12

Backbone links are mostly OC48

Page 54: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Data Collection

Relative traffic volumes

Low NetFlow sample ratio is OK

Turning on “ip route-cache flow sampled” seems like it can cause traffic belches

Turn off all inbound ACLs on peering interfaces

Turn off all outbound ACLs on peering routers

Drink from the Hose

Take off every /var

Page 55: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Analysis of Data

Relative byte count through and to networks reached through the peer in question

Ranked list of peering candidates

Absolute numbers don’t really matter; we have a list of people we should be talking to, in order of how useful they would be to peer with

Page 56: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

SeeASP Output

Facets:
TimeInterval   : 12/4/01 11:03:59.55 - 12/6/01 13:40:10.02 EST
RouterIpv4Addr : 63.136.120.65
RouterAS       : 3549
RouterName     : DiamondJoe

   AS P    ppsThru      bpsThru      ppsTo        bpsTo   ppsTotal     bpsTotal
----- - ---------- ------------ ---------- ------------ ---------- ------------
 3561 P       1.05       933.57      74.34      64.633K      75.39      65.567K
  701 P       4.63       2.401K      35.21      14.653K      39.84      17.054K
  209 P       0.82       7.324K          0         1.36       0.82       7.325K
 3967 P        0.6       297.19       11.3       5.694K      11.91       5.991K
 6461 P          0          3.1      11.51       4.790K      11.51       4.793K
 8112 -          0            0       0.57       4.699K       0.57       4.699K
19262 -       0.57       4.699K          0            0       0.57       4.699K
 7018 P       8.44       4.244K          0         1.26       8.44       4.246K
    1 P        8.6       3.576K          0         1.56        8.6       3.578K
   87 -          0            0       8.16       3.396K       8.16       3.396K
  286 -       0.24       2.621K          0         0.05       0.24       2.621K
 2603 -       0.24       2.620K          0         0.24       0.24       2.620K
 1653 -          0         0.05       0.24       2.619K       0.24       2.620K
10764 -          0            0       5.36       2.230K       5.36       2.230K
  703 -          0         1.25       4.36       1.815K       4.37       1.816K
 7660 -          0            0       3.23       1.344K       3.23       1.344K
 3549 -          0            0       2.75       1.306K       2.75       1.306K
14265 -          0            0       1.05          934       1.05          934
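The ranking step described under Analysis of Data can be sketched as follows, using a handful of AS and bpsTotal values copied from the SeeASP output. Two things here are assumptions: that the "K" suffix means a factor of 1000, and that ranking by bpsTotal share is a reasonable stand-in for "how useful they would be to peer with". The helper names are made up for the example.

```python
def parse_rate(text):
    """Convert a value such as '64.633K' to a float.

    The 'K' suffix is assumed to mean x1000; the output does not say.
    """
    if text.endswith("K"):
        return float(text[:-1]) * 1000.0
    return float(text)


def rank_by_share(rows):
    """Return (asn, share_of_listed_traffic) pairs, largest share first."""
    rates = [(asn, parse_rate(bps)) for asn, bps in rows]
    total = sum(rate for _, rate in rates)
    ranked = sorted(rates, key=lambda item: item[1], reverse=True)
    return [(asn, rate / total) for asn, rate in ranked]


# (AS, bpsTotal) pairs taken from a few rows of the output.
rows = [
    ("3967", "5.991K"),
    ("14265", "934"),
    ("3561", "65.567K"),
    ("209", "7.325K"),
    ("701", "17.054K"),
]
ranked = rank_by_share(rows)
for asn, share in ranked:
    print(f"AS{asn}: {share:.1%} of listed traffic")
```

Working in shares rather than bits per second matches the slide's point that absolute numbers from sampled data do not matter for this question; only the order does.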

Page 57: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Real Live Network Examples 2

AS R wants to peer

That’s fine, we’ll public peer with anybody. We’re easy.

AS R wants to private peer right away, since they say we send them 140M of traffic already

Can we confirm those numbers before we dedicate a port to them?

Page 58: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Network Facts

We currently reach AS R through AS T

We peer with AS T in six places

One of the peering routers is a 7500, which doesn’t do SNF

One of the peering routers is a router which is also being used to collect data to answer the previous question

Page 59: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

More Network Facts

Topology is not edge/core everywhere

We want numbers out of this, so we need to manage the SNF ratios

K1dd13s keep attacking the routers

Ops folk attack K1dd13s with ACLs

The ACL attacks the SNF

The SNF dies!

Page 60: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Analysis

We only have traffic samples, but we want absolute numbers

We have interface byte and packet counters

We can take AS R traffic as a proportion of all AS T traffic, and divide up the mrtg/duck data in proportion
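The proportional split can be written down directly. This sketch assumes one peering interface at a time: sampled byte counts for AS R and for all AS T traffic on that interface, plus the interface byte counter delta over the same period. The function name and all numbers are illustrative, not measurements from the network described.

```python
def estimate_absolute_bps(sampled_target_bytes, sampled_total_bytes,
                          counter_bytes, seconds):
    """Apportion an interface byte counter by a sampled traffic share.

    sampled_target_bytes: sampled bytes attributed to the AS of interest.
    sampled_total_bytes:  all sampled bytes on the same interface.
    counter_bytes:        interface counter delta over the period.
    seconds:              length of the period.
    Returns an estimated rate in bits per second.
    """
    if sampled_total_bytes <= 0 or seconds <= 0:
        raise ValueError("need a non-empty sample and a positive period")
    share = sampled_target_bytes / sampled_total_bytes
    return share * counter_bytes * 8 / seconds


# Hypothetical five-minute interval on one peering interface.
estimate = estimate_absolute_bps(
    sampled_target_bytes=2_000,
    sampled_total_bytes=10_000,
    counter_bytes=4_500_000_000,
    seconds=300,
)
# 20% of a 120 Mbit/s average comes to roughly 24 Mbit/s.
```

Because the share is a ratio of two sampled values from the same interface, the sampling ratio cancels out, which sidesteps the normalization problem. Repeating the calculation per peering point and summing gives a figure to hold up against the claimed 140M; routers that cannot export sampled flows have to be left out or estimated separately.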

Page 61: Inter-Domain Traffic Engineering Principles, Applications and Case Studies

Summary

What did we talk about?

Answering specific, ad-hoc questions by attacking them with numbers

Inter-Domain Traffic Engineering is an iterative process (lather, rinse, repeat)

What didn’t we talk about?

Experience exporting from Juniper (and other non-Cisco) routers

Construction of a full-time, general-purpose measurement infrastructure

What if my vendor does not support flow-export and traffic accounting?

Questions? No? Good.