A Hierarchical IPv4 Framework
Patrick Frejborg [email protected] 24 Feb 2009
Why hIPv4?
Addressing RFC 4984
It is commonly recognized that today’s Internet routing and addressing system is facing serious scaling problems. The ever increasing user population, as well as multiple other factors including multi-homing, traffic engineering, and policy routing, have been driving the growth of the Default Free Zone (DFZ) routing table size at an increasing and potentially alarming rate. While it has been long recognized that the existing routing architecture may have serious scalability problems, effective solutions have yet to be identified, developed, and deployed.
Influence sources
The Locator ID Separation Protocol (LISP) development work at the IRTF
MPLS solutions, mainly the shim header that made it possible to create new services on top of an IP backbone
Anycast Rendezvous Point (RP) with Multicast Source Discovery Protocol (MSDP)
IPv6 installations at enterprises
  - Why would enterprises migrate to IPv6, and what will they gain?
  - A bigger migration project than Y2K, for what reason?
  - Applications have to be ported to IPv6, which is a lot of work. Who will sponsor it?
  - The shortage of IPv4 addresses is not an enterprise's problem; enterprises will use NAT instead!
PSTN architecture
  - Have you seen or heard that the PSTN will soon run out of decimal numbers and that we have to migrate to hexadecimal keypads?
  - I am not aware of scalability issues with SS7 either; hidden prefixes are used between PSTN switches to solve routing issues
So, what if…
What if we borrow concepts from existing solutions and glue them together?
  - The basic ideas and goals of LISP are definitely interesting, especially the Routing Locator (RLOC) and Endpoint ID (EID) concepts
  - The MPLS forwarding and shim header concept
  - Anycast RP
  - The numbering architecture of the PSTN, i.e. the country code and national destination code concepts, ported to the IPv4 world: an “Internet country” is an Autonomous System or an area of a service provider!
The trade-offs are
  - New hardware is needed at some spots in the Internet
  - A minor software upgrade for Internet routers
  - Extensions are needed for DNS and DHCP
  - An extension to the current IPv4 stack at hosts, but most applications continue to use the IPv4 socket API (stream and datagram sockets)
  - Raw socket applications need to be enhanced
Some basic rules (1)
Allocate a globally unique IPv4 block for RLOC allocations, hereafter called the Global RLOC Block (GRB)
Assign one RLOC to each Autonomous System (AS) or service provider; this AS or service provider area is called a RLOC realm
Only GRB prefixes are exchanged between RLOC realms
A multihomed enterprise with an AS number will have a RLOC assigned and is thus a RLOC realm
Regional Internet Registries will allocate Provider Independent (PI) IP addresses to enterprises, both single-homed and multihomed. This assignment is unique in the country or countries where the IP block is deployed
Residential/consumer customers will use Provider Aggregatable (PA) IP addresses
Some basic rules (2)
Introduce extensions to current protocols
  - DNS: add a RLOC record for each host
  - DHCP: add a RLOC option for a scope
  - Current IGP and BGP are still valid routing protocols
Define a “shim” header that contains RLOC and EID information. The new shim header is called a LISP header
When the LISP header is inserted into an IPv4 datagram, the new header combination is called a hIPv4 header
Introduce new functionality; routing is still done on the IPv4 forwarding plane
  - LISP Switch Router (LSR): in certain situations the LSR shall swap the IPv4 and LISP headers
  - The RLOC identifier is configured as an Anycast address on one or several LSRs within a RLOC realm
  - Intermediate routers need to support hIPv4 in the control plane in order to reply to ICMP requests
Outcome, when hIPv4 is fully implemented
Gaining several “recyclable” IPv4 address blocks
  - Allocations of PI blocks are unique within the country or countries of deployment
  - PA addresses are only locally significant within the RLOC realm
Creating hierarchy at the control plane
  - Only GRB prefixes are announced between RLOC realms
  - Multihomed enterprises will only advertise their assigned RLOC to the service providers
  - Single-homed PI addresses are installed in the RIB of the local RLOC realm
  - PA addresses are installed in the RIB of the local RLOC realm
  - The current size of the Default Free Zone (DFZ) RIB is decreased
  - No or minor changes to the current DFZ topology
No new signaling protocols and no overlay topology are introduced; instead, AS destination-based routing with IPv4 as the forwarding plane!
Life of a hIPv4 connection
[Figure: six RLOC realms, AS 1 to AS 6, with RLOCs 172.16.0.1 to 172.16.0.6 and six LSRs; client 10.1.1.1 in the realm with RLOC 172.16.0.3; server www.foo.com, 10.2.2.2, in the realm with RLOC 172.16.0.5]
Client -> Server
DNS query: www.foo.com? Response: A-record 10.2.2.2, RLOC 172.16.0.5
IPv4 API at the client: S:10.1.1.1 D:10.2.2.2
Sent by the client: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
After the swap at the remote LSR: IPv4 header S:172.16.0.3 D:10.2.2.2, LISP header R:172.16.0.5 E:10.1.1.1
IPv4 API at the server: S:10.1.1.1 D:10.2.2.2
[Figure: same six-realm topology, client 10.1.1.1 and server www.foo.com, 10.2.2.2]
Server -> Client
IPv4 API at the server: S:10.2.2.2 D:10.1.1.1
Sent by the server: IPv4 header S:10.2.2.2 D:172.16.0.3, LISP header R:172.16.0.5 E:10.1.1.1
After the swap at the LSR in the client's realm: IPv4 header S:172.16.0.5 D:10.1.1.1, LISP header R:172.16.0.3 E:10.2.2.2
IPv4 API at the client: S:10.2.2.2 D:10.1.1.1
The hIPv4 header
Version 4 is still valid, but new protocol IDs are needed for the current IPv4 protocols (ICMP, IGMP, TCP, UDP, IP in IP, GRE, ESP, AH etc.) so that the stack can identify whether an IPv4 or a hIPv4 header is applied
Forwarding network devices will calculate the IPv4 header checksum at each hop
Hosts shall calculate the TCP and UDP pseudoheader checksum including the RLOC and EID values
Since the remote LSR will swap the IPv4 and LISP headers, the TCP checksum will be bogus, unless…
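A minimal Python sketch of the pseudoheader rule, under an assumed layout (source, destination, RLOC, EID, a zero byte, the protocol ID and the segment length). The slides do not specify the layout, and the names `ones_complement_sum` and `hipv4_checksum` are hypothetical. It illustrates an arithmetic property only, not a resolution stated by the author: a one's-complement sum is order-independent, so a checksum that covers all four addresses does not change when the source/RLOC and destination/EID values trade places.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def hipv4_checksum(src: str, dst: str, rloc: str, eid: str,
                   proto: int, segment: bytes) -> int:
    """Checksum over an ASSUMED pseudoheader: src, dst, RLOC, EID, 0, proto, length."""
    pseudo = b"".join(ipaddress.IPv4Address(a).packed
                      for a in (src, dst, rloc, eid))
    pseudo += struct.pack("!BBH", 0, proto, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


# Values from the Client -> Server example; the payload is arbitrary.
as_sent = hipv4_checksum("10.1.1.1", "172.16.0.5", "172.16.0.3", "10.2.2.2", 6, b"hello")
swapped = hipv4_checksum("172.16.0.3", "10.2.2.2", "172.16.0.5", "10.1.1.1", 6, b"hello")
print(as_sent == swapped)
```

Under this assumed layout the host-computed value would still verify after the swap; a pseudoheader that covers only the IPv4 source and destination would not have that property.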
LSR functionality
The assigned RLOC shall be configured as an Anycast address and announced to the Internet
When the destination address in the IPv4 header of the hIPv4 packet is equal to the RLOC at the remote LSR, then
  - verify the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation
  - replace the source address in the IPv4 header with the RLOC address of the LISP header
  - replace the destination address in the IPv4 header with the EID address of the LISP header
  - replace the RLOC address in the LISP header with the destination address of the IPv4 header
  - replace the EID address in the LISP header with the source address of the IPv4 header
  - decrease the TTL by one
  - calculate the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation
  - forward the datagram based on the destination address of the IPv4 header
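The swap steps above can be sketched in Python as follows. `HIPv4Header` and `lsr_swap` are hypothetical names, addresses are kept as strings, and checksum verification and recalculation are left out. The four replacements are applied at once, using the values the packet arrived with, so that each field receives the original content of its counterpart.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HIPv4Header:
    src: str   # source address in the IPv4 header
    dst: str   # destination address in the IPv4 header
    rloc: str  # RLOC field of the LISP (shim) header
    eid: str   # EID field of the LISP (shim) header
    ttl: int = 64


def lsr_swap(pkt: HIPv4Header, local_rloc: str) -> HIPv4Header:
    """Swap IPv4 and LISP header values when the packet targets this LSR's RLOC."""
    if pkt.dst != local_rloc:
        return pkt  # the swap does not apply; ordinary forwarding is out of scope here
    return replace(pkt, src=pkt.rloc, dst=pkt.eid,
                   rloc=pkt.dst, eid=pkt.src, ttl=pkt.ttl - 1)


# Client -> Server packet arriving at the LSR that holds RLOC 172.16.0.5.
arriving = HIPv4Header(src="10.1.1.1", dst="172.16.0.5",
                       rloc="172.16.0.3", eid="10.2.2.2")
print(lsr_swap(arriving, "172.16.0.5"))
```

After the swap the IPv4 destination is the server's locally significant address, so the packet can be forwarded inside the realm by ordinary IPv4 routing.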
The hIPv4 stack functionalities
The IPv4 socket API still uses the tuples
RLOC identifiers are provided by the DHCP and DNS schemas
The hIPv4 stack must assemble the outgoing datagram with
  - local IP address -> src IP address
  - remote IP address -> EID
  - local RLOC -> RLOC
  - remote RLOC -> dst IP address
The hIPv4 stack must present the headers of the incoming datagram to the IPv4 socket API as
  - src IP address -> remote RLOC
  - dst IP address -> local IP address
  - RLOC -> local RLOC
  - EID -> remote IP address
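The two mappings above can be written down as a small Python sketch. The names `Wire`, `assemble_outgoing` and `present_incoming` are hypothetical, and only the address fields are modelled.

```python
from typing import NamedTuple


class Wire(NamedTuple):
    src: str   # IPv4 header source address
    dst: str   # IPv4 header destination address
    rloc: str  # LISP header RLOC field
    eid: str   # LISP header EID field


def assemble_outgoing(local_ip: str, remote_ip: str,
                      local_rloc: str, remote_rloc: str) -> Wire:
    """Outgoing: the remote RLOC becomes the destination, the remote IP rides as EID."""
    return Wire(src=local_ip, dst=remote_rloc, rloc=local_rloc, eid=remote_ip)


def present_incoming(pkt: Wire) -> dict:
    """Incoming (after the LSR swap): recover what the IPv4 socket API expects."""
    return {
        "remote_rloc": pkt.src,
        "local_ip": pkt.dst,
        "local_rloc": pkt.rloc,
        "remote_ip": pkt.eid,
    }


# Client side of the Client -> Server example.
print(assemble_outgoing("10.1.1.1", "10.2.2.2", "172.16.0.3", "172.16.0.5"))
# Server side, for the same packet after the swap at the remote LSR.
print(present_incoming(Wire("172.16.0.3", "10.2.2.2", "172.16.0.5", "10.1.1.1")))
```

The application on each side still sees only the local and remote IP addresses; the RLOC values stay inside the stack.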
Considerations
Src IP = Dst IP considerations
Since source and destination addresses are only locally significant within a RLOC realm, there is a slight chance that the source and destination addresses at the API will be the same when a connection is established between RLOC realms.
The connection is still unique, since two processes communicating over TCP form a logical connection that is uniquely identified by the tuple involved, that is by the combination of <local_IP_address, local_port, remote_IP_address, remote_port>
[Figure: same six-realm topology; a client with address 10.2.2.2 in the realm with RLOC 172.16.0.4 connects to www.foo.com, 10.2.2.2, in the realm with RLOC 172.16.0.5]
DNS query: www.foo.com? Response: A-record 10.2.2.2, RLOC 172.16.0.5
IPv4 API at the client: S:10.2.2.2 D:10.2.2.2
Sent by the client: IPv4 header S:10.2.2.2 D:172.16.0.5, LISP header R:172.16.0.4 E:10.2.2.2
After the swap at the remote LSR: IPv4 header S:172.16.0.4 D:10.2.2.2, LISP header R:172.16.0.5 E:10.2.2.2
IPv4 API at the server: S:10.2.2.2 D:10.2.2.2
“Identical connection situation”
Since source and destination addresses are only locally significant within a RLOC realm, there is a slight chance that the source address, destination address and source port at the API will be the same when two clients residing in separate RLOC realms establish connections to a server in a third RLOC realm.
A connection is unique since two processes communicating over TCP form a logical connection that is uniquely identified by the tuple involved, that is by the combination of <local_IP_address, local_port, remote_IP_address, remote_port>
But if both clients use the same source IP address and the same source port, the connection is no longer unique!
The solution is that the hIPv4 stack must use the RLOC information to accept only one unique connection; the “identical connection” is not allowed and the client is informed by an ICMP notification
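A minimal Python sketch of this rule, assuming the server-side stack keeps the remote RLOC next to each 4-tuple. `ConnectionTable` and `accept` are hypothetical names, and the ICMP notification is only indicated by the `False` return value.

```python
class ConnectionTable:
    """Tracks which remote RLOC owns each <local IP, local port, remote IP, remote port>."""

    def __init__(self) -> None:
        self._owner = {}  # 4-tuple -> remote RLOC of the accepted connection

    def accept(self, local_ip: str, local_port: int,
               remote_ip: str, remote_port: int, remote_rloc: str) -> bool:
        key = (local_ip, local_port, remote_ip, remote_port)
        owner = self._owner.get(key)
        if owner is not None and owner != remote_rloc:
            # Same tuple from a different RLOC realm: the "identical connection".
            # A real stack would send the ICMP notification here.
            return False
        self._owner[key] = remote_rloc
        return True


table = ConnectionTable()
# Two clients with address 10.1.1.1 and the same source port, in different realms.
print(table.accept("10.2.2.2", 80, "10.1.1.1", 40000, "172.16.0.3"))
print(table.accept("10.2.2.2", 80, "10.1.1.1", 40000, "172.16.0.4"))
```

The second client can retry from another source port, which yields a different tuple and is accepted.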
[Figure: same six-realm topology; two clients, both with address 10.1.1.1, one in the realm with RLOC 172.16.0.3 and one in the realm with RLOC 172.16.0.4, connect to www.foo.com, 10.2.2.2, in the realm with RLOC 172.16.0.5]
DNS query from each client: www.foo.com? Response: A-record 10.2.2.2, RLOC 172.16.0.5
IPv4 API at both clients and at the server: S:10.1.1.1 D:10.2.2.2
Sent by the client in realm 172.16.0.4: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.4 E:10.2.2.2
After the swap at the remote LSR: IPv4 header S:172.16.0.4 D:10.2.2.2, LISP header R:172.16.0.5 E:10.1.1.1
Sent by the client in realm 172.16.0.3: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
After the swap at the remote LSR: IPv4 header S:172.16.0.3 D:10.2.2.2, LISP header R:172.16.0.5 E:10.1.1.1
Traceroute considerations
The routers and devices in the path to the remote RLOC realm need to support ICMP extensions
ICMP services are deployed in the control plane; the forwarding plane remains intact
  - That is, a software upgrade is needed for the control plane
  - The hIPv4 ICMP extensions shall be compatible with RFC 4884
[Figure: same six-realm topology; client 10.1.1.1 runs traceroute www.foo.com]
Traceroute, 1 (intra-AS)
DNS response: A-record 10.2.2.2, RLOC 172.16.0.5
Probe sent by the client: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
ICMP reply with ICMP extensions: IPv4 header S:172.16.0.3 D:10.1.1.1, LISP header R:172.16.0.5 E:OIF
The figure also carries the label S:10.2.2.2 D:10.1.1.1
[Figure: same six-realm topology; client 10.1.1.1 runs traceroute www.foo.com]
Traceroute, 2 (inter-AS)
DNS response: A-record 10.2.2.2, RLOC 172.16.0.5
Probe sent by the client: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
ICMP reply with ICMP extensions, as sent: IPv4 header S:OIF D:172.16.0.3, LISP header R:172.16.0.1 E:10.1.1.1
ICMP reply with ICMP extensions, after the swap at the LSR in the client's realm: IPv4 header S:172.16.0.1 D:10.1.1.1, LISP header R:172.16.0.3 E:OIF
The figure also carries the label S:10.2.2.2 D:10.1.1.1
[Figure: same six-realm topology; client 10.1.1.1 runs traceroute www.foo.com]
Traceroute, 3 (target-AS)
DNS response: A-record 10.2.2.2, RLOC 172.16.0.5
Probe sent by the client: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
Probe after the swap at the LSR in the target realm: IPv4 header S:172.16.0.3 D:10.2.2.2, LISP header R:172.16.0.5 E:10.1.1.1
ICMP reply with ICMP extensions, as sent: IPv4 header S:OIF D:172.16.0.3, LISP header R:172.16.0.5 E:10.1.1.1
ICMP reply with ICMP extensions, after the swap at the LSR in the client's realm: IPv4 header S:172.16.0.5 D:10.1.1.1, LISP header R:172.16.0.3 E:OIF
The figure also carries the label S:10.2.2.2 D:10.1.1.1
Multicast considerations
The source address (S) for a group (G) is no longer visible outside the local RLOC realm (only GRB prefixes are seen); therefore Reverse Path Forwarding (RPF) is only valid within the local RLOC realm
In order to enable RPF globally for an (S,G), the multicast-enabled LSR (mLSR) in the source RLOC realm must replace the source address with the local RLOC identifier
The LSR in the source RLOC realm shall act as an Anycast RP with MSDP capabilities
  - The mLSR will decide which multicast groups are announced to other ASes
  - The receiver will locate the source via MSDP; the shared tree can be established to the mLSR
  - The Source Specific Multicast schema will need an extension: RLOC and EID options shall be added to SSM
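The RPF argument can be illustrated with a short Python sketch. The function `rpf_check`, the interface names and the toy routing table are all hypothetical; the table holds only GRB prefixes, as a router outside the source realm would.

```python
import ipaddress


def rpf_check(source: str, in_iface: str, rib: dict) -> bool:
    """Pass if the longest-match route back to `source` points out of `in_iface`."""
    addr = ipaddress.ip_address(source)
    matches = [(net, iface) for net, iface in rib.items() if addr in net]
    if not matches:
        return False  # no route back to the source: the RPF check fails
    _, iface = max(matches, key=lambda m: m[0].prefixlen)
    return iface == in_iface


# Toy RIB of a router in another realm: GRB prefixes only (interfaces are made up).
rib = {
    ipaddress.ip_network("172.16.0.3/32"): "eth0",
    ipaddress.ip_network("172.16.0.5/32"): "eth1",
}

print(rpf_check("10.1.1.1", "eth0", rib))    # original source address
print(rpf_check("172.16.0.3", "eth0", rib))  # source rewritten to the RLOC by the mLSR
```

With the original source address there is nothing to match against outside the source realm; once the mLSR has put the RLOC in the source field, the check can be made against the GRB prefix.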
Multicast forwarding
[Figure: same six-realm topology, with one LSR also acting as RP; source 10.1.1.1 in the realm with RLOC 172.16.0.3; receiver 10.2.2.2; S:10.1.1.1, G:225.5.5.5]
Sent by the source: S:10.1.1.1 D:225.5.5.5
After the mLSR in the source realm: IPv4 header S:172.16.0.3 D:225.5.5.5, LISP header R:172.16.0.3 E:10.1.1.1
IPv4 API at the receiver: S:10.1.1.1 D:225.5.5.5
RTCP receiver reports
[Figure: same topology as for multicast forwarding; source 10.1.1.1, receiver 10.2.2.2, S:10.1.1.1, G:225.5.5.5]
Multicast stream seen at the API: S:10.1.1.1 D:225.5.5.5
Receiver report sent by the receiver: IPv4 header S:10.2.2.2 D:172.16.0.3, LISP header R:172.16.0.5 E:10.1.1.1
After the swap at the LSR in the source realm: IPv4 header S:172.16.0.5 D:10.1.1.1, LISP header R:172.16.0.3 E:10.2.2.2
Traffic Engineering considerations
Load balancing is influenced by the placement of LSRs within a RLOC realm; the LSRs provide a “nearest routing” schema
A service provider can have several RLOCs assigned; traffic engineering and filtering can be done based on RLOC addresses
If needed, a RLOC identifier based Traffic Engineering solution can perhaps be developed: establish explicit routing paths based on RLOC information, that is, create explicit paths that can be engineered via specific RLOC realms.
Path MTU Discovery considerations
Since the hIPv4 header is assembled at the host, the hIPv4 packet will use the current PMTUD mechanisms
The network will not see any difference between the sizes of an IPv4 and a hIPv4 datagram
SIP considerations
SIP uses the local IP address of the host in the messages
  - In SDP, as the target of the media
  - In the Contact of a REGISTER, as the target for incoming INVITEs
  - In the Via of a request, as the target for a response
Since SIP carries the IP addresses of hosts, it has caused a lot of problems in NAT environments; hIPv4 can mitigate the pain since it will reduce the need for NAT
SIP needs to be extended to support the hIPv4 framework, i.e. carry RLOC information in the SIP messages
  - A new SDP attribute is needed to provide the RLOC information to the remote UA
  - Add a RLOC Extension Header Field for SIP
[Figure: same six-realm topology, with one LSR also acting as RP; SIP Registrar sip.foo.com, 10.3.3.3, in the realm with RLOC 172.16.0.4; calling UA 10.1.1.1 in the realm with RLOC 172.16.0.3; called UA 10.2.2.2 in the realm with RLOC 172.16.0.5]
SIP considerations, INVITE
DNS query: sip.foo.com? Response: A-record 10.3.3.3, RLOC 172.16.0.4
Message carried at every step: INVITE: [email protected], SDP a=10.1.1.1, SDP m=45668 RTP, SDP l=172.16.0.3
Calling UA to registrar, as sent: IPv4 header S:10.1.1.1 D:172.16.0.4, LISP header R:172.16.0.3 E:10.3.3.3
Calling UA to registrar, after the LSR swap: IPv4 header S:172.16.0.3 D:10.3.3.3, LISP header R:172.16.0.4 E:10.1.1.1
Registrar lookup: [email protected];R=172.16.0.5
Registrar to called UA, as sent: IPv4 header S:10.3.3.3 D:172.16.0.5, LISP header R:172.16.0.4 E:10.2.2.2
Registrar to called UA, after the LSR swap: IPv4 header S:172.16.0.4 D:10.2.2.2, LISP header R:172.16.0.5 E:10.3.3.3
[Figure: same topology, SIP Registrar sip.foo.com, 10.3.3.3, calling UA 10.1.1.1 and called UA 10.2.2.2]
SIP considerations, 200 OK
Message carried at every step: 200 OK, SDP a=10.2.2.2, SDP m=35678 RTP, SDP l=172.16.0.5
Called UA to registrar, as sent: IPv4 header S:10.2.2.2 D:172.16.0.4, LISP header R:172.16.0.5 E:10.3.3.3
Called UA to registrar, after the LSR swap: IPv4 header S:172.16.0.5 D:10.3.3.3, LISP header R:172.16.0.4 E:10.2.2.2
Registrar to calling UA, as sent: IPv4 header S:10.3.3.3 D:172.16.0.3, LISP header R:172.16.0.4 E:10.1.1.1
Registrar to calling UA, after the LSR swap: IPv4 header S:172.16.0.4 D:10.1.1.1, LISP header R:172.16.0.3 E:10.3.3.3
[Figure: same topology, SIP Registrar sip.foo.com, 10.3.3.3, calling UA 10.1.1.1 and called UA 10.2.2.2]
SIP considerations, RTP
From the INVITE: [email protected], SDP a=10.1.1.1, SDP m=45668 RTP, SDP l=172.16.0.3
From the 200 OK: SDP a=10.2.2.2, SDP m=35678 RTP, SDP l=172.16.0.5
RTP sent by the calling UA: IPv4 header S:10.1.1.1 D:172.16.0.5, LISP header R:172.16.0.3 E:10.2.2.2
RTP sent by the called UA: IPv4 header S:10.2.2.2 D:172.16.0.3, LISP header R:172.16.0.5 E:10.1.1.1
Mobility considerations
Site mobility: a site wishes to change its attachment point to the Internet without changing its IP address block. The change of attachment point is possible when PI addresses are allocated to the site. Only the local RLOC identifier needs to be changed.
Host mobility: Alex C. Snoeren’s and Hari Balakrishnan’s “An End-to-End Approach to Host Mobility” is interesting. Since the IPv4 stack needs to be enhanced, studies should be carried out to see if the “TCP connection method” can be implemented in the hIPv4 stack.
Another interesting host mobility solution is the “Reliable Network Connections” paper by Victor C. Zandy and Barton P. Miller. Studies should be carried out to see whether rocks and racks can be integrated into the hIPv4 stack. http://pages.cs.wisc.edu/~zandy/rocks/
Transition considerations
Upgrades of host stacks, DNS and DHCP databases, security devices and network devices can be carried out in parallel without a change of topology or major network breaks
LSRs can be added to an AS or a service provider area when commercially available, in order to create a RLOC realm
When the hIPv4 framework is ready in a RLOC realm, the RLOC record can be added in the DNS for the hosts there, one by one.
Legacy IPv4 clients will still use the legacy IPv4 schema, but when a hIPv4 client receives a DNS response with a RLOC that does not match the local RLOC, it can use the hIPv4 framework to reach the server. Intra-RLOC realm connections (remote RLOC = local RLOC) will use legacy IPv4 connections; there is no added value in using the hIPv4 framework inside a RLOC realm.
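The client-side choice described above can be sketched as a small Python function. `choose_mode` is a hypothetical name, and `None` stands for "no RLOC available", either because the DNS response carries no RLOC record or because the client is a legacy IPv4 host.

```python
from typing import Optional


def choose_mode(dns_rloc: Optional[str], local_rloc: Optional[str]) -> str:
    """Return "hipv4" only for an upgraded client reaching a different RLOC realm."""
    if local_rloc is None:
        return "ipv4"   # legacy client: no hIPv4 stack, no local RLOC
    if dns_rloc is None:
        return "ipv4"   # server not yet published with a RLOC record
    if dns_rloc == local_rloc:
        return "ipv4"   # intra-realm connection: no added value in hIPv4
    return "hipv4"


print(choose_mode("172.16.0.5", "172.16.0.3"))  # different realms
print(choose_mode("172.16.0.3", "172.16.0.3"))  # same realm
print(choose_mode(None, "172.16.0.3"))          # no RLOC record yet
```

Because every fallback lands on plain IPv4, hosts and DNS records can be upgraded one at a time.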
When will the Internet migrate from a flat to a hierarchical topology?
  - Possible tipping point #1: when the RIB of the DFZ is getting close to the capabilities of current hardware. Who will pay for the upgrade? Or will the service provider only accept GRB prefixes from other providers and avoid capital expenses?
  - Possible tipping point #2: when the exhaustion of IPv4 addresses is causing enough problems for enterprises
Both customer and provider have a common interest in the Internet being available and affordable!
Security considerations
Hijacking of prefixes by longest match from another RLOC realm is no longer possible, since the source prefix is separated by a locator.
In order to execute a hijack of a certain prefix, the whole RLOC realm must be routed via a bogus RLOC realm. Studies should be carried out with the Secure Inter-Domain Routing (SIDR) working group on whether the RLOC identifiers can be protected from hijacking.
Summary
Carrots for Everyone, Long Term
Enterprises
  - No need to learn a new protocol; only the RLOC concept is introduced
  - Minimal porting of applications to a new protocol; the IPv4 socket API is extended
  - Get Provider Independent addresses without a multihoming requirement, i.e. achieve site mobility
  - When hosts are upgraded to support the hIPv4 framework, NAT solutions can be removed
Internet Service Providers
  - No need to learn new routing protocols
  - Remove IPv4 address constraints
  - Hierarchical BGP, smaller RIB for each RLOC realm
  - Internal prefix flaps are not seen in other RLOC realms; only GRB state changes are reflected globally, so “update churn” is reduced