DNS and CDNs (Content Distribution Networks) Paul Francis Cornell Computer Science

Upload: sheena-knight

Post on 24-Dec-2015

TRANSCRIPT

Page 1: DNS and CDNs (Content Distribution Networks) Paul Francis Cornell Computer Science

DNS and CDNs (Content Distribution Networks)

Paul Francis

Cornell Computer Science

Page 2

What do all of these have in common?

http://www.cnn.com/news/story.html HTTP (web)

mailto://[email protected]

sip://[email protected] (Session Initiation Protocol)

Page 3

They all have a DNS name somewhere

http://www.cnn.com/news/story.html HTTP (web)

[email protected]

sip://[email protected] (Session Initiation Protocol)
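A minimal sketch of the point above: each kind of identifier carries a DNS name that can be pulled out mechanically. The function name dns_name and the example.com / example.org addresses are assumptions for illustration; they are not from the slides.

```python
from urllib.parse import urlsplit


def dns_name(uri: str) -> str:
    """Return the DNS name embedded in an http, mailto, or sip URI."""
    parts = urlsplit(uri)
    if parts.hostname:
        # http://host/path style: the name sits in the authority part
        return parts.hostname
    if "@" not in parts.path:
        raise ValueError(f"no DNS name found in {uri!r}")
    # mailto:user@host and sip:user@host keep the name after the '@'
    return parts.path.rsplit("@", 1)[1].lower()


if __name__ == "__main__":
    for uri in ("http://www.cnn.com/news/story.html",
                "mailto:alice@example.com",   # hypothetical address
                "sip:bob@example.org"):       # hypothetical address
        print(uri, "->", dns_name(uri))
```

Whatever the application protocol, the name that comes out is what gets handed to DNS.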

Page 4

Why is DNS so important?

Names are easier to remember than IP addresses

• [email protected] ???

And in any event, IP addresses are not “dependable”

• They change often (dialup)

• They are not all unique

Page 5

DNS is the “core” of the Internet

So “we” (humans, and applications) like to deal with dependable, stable, friendly DNS names

The names get “mapped” into IP addresses by lower layers

• By the Domain Name System (DNS)

Then the learned IP address is put into packets, and IP routing gets the packets across the Internet

Page 6

Picture of DNS query/reply

Page 7

Why all these dots?

Why falcon.cs.cornell.edu?

• Why not “cornell-falcon” or something?

Page 8

It wasn’t always that way

Twenty years ago, this was a valid email address:

• george@isi

How did my computer learn the IP address of “isi”?

Page 9

The “host table” and DNS

Before DNS, there was the host table

• This was a complete list of all the hosts in the Internet!

• It was copied every night to every machine on the Internet!

At some point, this was perceived as a potential scaling bottleneck…

So a distributed directory called the “Domain Name System” (DNS) was invented

Page 10

The host table (historic)

Host Name    IP Address
mit-dlab     133.65.14.77
isi-mail     24.72.188.13
mit-lcs      133.65.29.1
…            …

Page 11

Distributed Directory

A primary goal of DNS was to have a distributed “host table”, so that each site could manage its own name-to-address mapping

But also, it should scale well!

Page 12

DNS is simple but powerful

Only one type of query

• Query(domain name, RR type)

• Resource Record (RR) type is like an attribute type

• Answer(values, additional RRs)

Limited number of RR types

Hard to make new RR types

• Not for technical reasons…

• Rather because each requires global agreement
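The single query shape, Query(domain name, RR type) returning Answer(values, additional RRs), can be sketched with a toy in-memory zone. This is only an illustration: the class names, the ZONE contents, the mail.cs.cornell.edu. host and its address are assumptions, and a real server does far more.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResourceRecord:
    name: str      # domain name that owns the record
    rr_type: str   # e.g. "A", "MX", "NS"
    value: str


@dataclass
class Answer:
    values: list
    additional: list = field(default_factory=list)


# Hypothetical zone data, for illustration only.
ZONE = [
    ResourceRecord("cs.cornell.edu.", "MX", "mail.cs.cornell.edu."),
    ResourceRecord("mail.cs.cornell.edu.", "A", "10.1.1.2"),
    ResourceRecord("foo.cs.cornell.edu.", "A", "10.1.1.1"),
]


def query(name: str, rr_type: str) -> Answer:
    """The one query shape: (domain name, RR type) -> Answer."""
    values = [rr.value for rr in ZONE
              if rr.name == name and rr.rr_type == rr_type]
    # Additional RRs: addresses for any names mentioned in the answer,
    # which saves the client a follow-up query.
    additional = [rr for rr in ZONE
                  if rr.rr_type == "A" and rr.name in values]
    return Answer(values, additional)
```

For example, query("cs.cornell.edu.", "MX") returns the mail server's name as the value and that server's A record as an additional RR.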

Page 13

DNS is the core of the Internet

Global name space

• Can be the core of a naming or identifying scheme

Global directory service

• Can resolve a name to nearly every computer on the planet

Page 14

Important DNS RR types

NS: Points to next Name Server down the tree

A: Contains the IP address

• AAAA for IPv6

MX: Contains the name of the mail server

Service-oriented RR types

• SRV: Contains the host names and ports of services on servers. One way to learn what port number to use

• NAPTR: Essentially a generalized mapping from one name space (e.g. phone numbers) to another (e.g. SIP URL)

Page 15

DNS tree structure

.
    edu.
        cornell.edu.
            cs.cornell.edu.
                foo.cs.cornell.edu A 10.1.1.1
                bar.cs.cornell.edu A 10.1.1.1
            eng.cornell.edu.
        cmu.edu.
        mit.edu.
    com.
    jp.
    us.

NS RR “pointers”
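The dots are what make the tree: reading a name from right to left walks from the root down. A small sketch, where the function name zone_chain is an assumption for illustration:

```python
def zone_chain(name: str) -> list:
    """List the tree nodes from the root down to `name`.

    Each step to the right-hand label's parent is one level up the tree,
    so reading labels from right to left walks from the root downward.
    """
    stripped = name.strip(".")
    if not stripped:
        return ["."]              # the root itself
    labels = stripped.split(".")
    chain = ["."]
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


if __name__ == "__main__":
    print(zone_chain("foo.cs.cornell.edu"))
```

Each entry in the returned list is a place where an NS RR could delegate to the level below.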

Page 16

Primary and secondary servers

[Diagram labels: cornell.edu., cs.cornell.edu.]

NS RRs point to both primary and secondary servers

RRs are initially configured into the primary server

Primary server replicates RRs onto secondary servers periodically (updates are incremental)

Page 17

Resolver structure and configuration

[Diagram: DNS tree with root (.), edu., com., jp., cornell.edu., cmu.edu., cs.cornell.edu., eng.cornell.edu.]

Static configuration of root servers

Stub resolver resides on client host, points to configured recursive server

Resolver manages DNS queries on behalf of stub resolvers

Page 18

Resolver structure and configuration

[Diagram: DNS tree with root (.), edu., com., jp., cornell.edu., cmu.edu., cs.cornell.edu., eng.cornell.edu.]

1. Stub resolver sends recursive query

2, 3, 4… Resolver makes iterative queries to servers

N. Resolver returns final answer to stub resolver (which also caches result)

Resolver caches results for efficiency
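The steps above can be sketched as a toy simulation: the resolver starts at the root, follows referrals one server at a time, and caches the final answer. The SERVERS table stands in for real name servers and is an assumption for illustration; no network traffic is involved.

```python
# Toy name servers keyed by zone. "NS" entries are referrals to child
# zones; "A" entries are final answers. All data is illustrative.
SERVERS = {
    ".": {"NS": ["edu."]},
    "edu.": {"NS": ["cornell.edu."]},
    "cornell.edu.": {"NS": ["cs.cornell.edu."]},
    "cs.cornell.edu.": {"A": {"foo.cs.cornell.edu.": "10.1.1.1"}},
}


def ask(server: str, name: str):
    """One iterative query: an answer, a referral, or a failure."""
    zone = SERVERS[server]
    if name in zone.get("A", {}):
        return "answer", zone["A"][name]
    for child in zone.get("NS", []):
        if name == child or name.endswith("." + child):
            return "referral", child
    return "nxdomain", None


class Resolver:
    """Recursive resolver that serves stub resolvers and caches results."""

    def __init__(self):
        self.cache = {}
        self.queries = 0   # iterative queries sent to servers

    def resolve(self, name: str):
        if name in self.cache:            # cache hit: no queries needed
            return self.cache[name]
        server = "."                      # statically configured root
        while True:
            self.queries += 1
            kind, data = ask(server, name)
            if kind == "answer":
                self.cache[name] = data
                return data
            if kind == "referral":
                server = data             # follow the NS pointer down
                continue
            return None
```

Resolving foo.cs.cornell.edu. costs four iterative queries the first time and none the second, which is the efficiency the last line of the slide refers to. TTL handling is left out of this sketch.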

Page 19

DNS cache management

All RRs have Time-to-live (TTL) values

When TTL expires, cache entries are removed

NS RRs tend to have long TTLs

• Cached for a long time

• Reduces load on higher level servers

A RRs may have very short TTLs

• Order one minute for some web services

• Order one day for typical hosts
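A minimal sketch of TTL-based cache management, assuming an injectable clock so that expiry can be shown without waiting. The class name DnsCache, the server name ns1.cornell.edu. and the TTL values are illustrative assumptions.

```python
import time


class DnsCache:
    """Cache of RRs that drops each entry once its TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # (name, rr_type) -> (value, expires_at)

    def put(self, name, rr_type, value, ttl_seconds):
        self._entries[(name, rr_type)] = (value, self._clock() + ttl_seconds)

    def get(self, name, rr_type):
        entry = self._entries.get((name, rr_type))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # TTL expired: remove the entry so the next lookup re-queries
            del self._entries[(name, rr_type)]
            return None
        return value


if __name__ == "__main__":
    now = [0.0]                                  # fake clock for the demo
    cache = DnsCache(clock=lambda: now[0])
    cache.put("cornell.edu.", "NS", "ns1.cornell.edu.", 86400)  # long TTL
    cache.put("www.cnn.com.", "A", "192.0.2.10", 60)            # short TTL
    now[0] = 61.0
    print(cache.get("www.cnn.com.", "A"))    # expired
    print(cache.get("cornell.edu.", "NS"))   # still cached
```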

Page 20

Caching is the key to performance

Without caching, the small number of machines at the top of the hierarchy would be overwhelmed

But what if you want to change the IP address of a host?

• How do you change all those cached entries around the world?

• You can’t… you wait until they time out on their own, then make your change

Page 21

Changing a DNS name

Say your TTL was set to one day

• This means that even if you change DNS now, some hosts will continue to use the old address for a day

So, give the host two IP addresses for a while (the old one and the new one)

• But DNS only answers with the new one

After a day, the old one is cleaned out of caches, and you can remove it from the host
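The timing can be worked through as simple arithmetic. In this sketch the function names, the dates, and the 192.0.2.x addresses are assumptions for illustration, and it assumes every cache honors the TTL.

```python
from datetime import datetime, timedelta


def old_address_retirement_time(dns_changed_at, ttl_seconds):
    """Earliest time at which TTL-honoring caches have dropped the old A RR."""
    return dns_changed_at + timedelta(seconds=ttl_seconds)


def addresses_host_must_accept(now, dns_changed_at, ttl_seconds, old, new):
    """Addresses the host should still answer on at time `now`."""
    if now < old_address_retirement_time(dns_changed_at, ttl_seconds):
        return [old, new]   # some caches may still hand out the old one
    return [new]


if __name__ == "__main__":
    changed = datetime(2024, 1, 1, 12, 0)
    one_day = 24 * 60 * 60
    print(old_address_retirement_time(changed, one_day))
    print(addresses_host_must_accept(changed + timedelta(hours=1),
                                     changed, one_day,
                                     "192.0.2.1", "192.0.2.2"))
```

With a one-day TTL and a change made at noon on 1 January, the old address can be removed from the host at noon on 2 January.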

Page 22

DNS Issues

DoS attacks on (13) root servers

• DoS = Denial of Service

Mis-configuration issues

But on the whole DNS is an incredible system, and in many important respects is the “core” of the Internet

• http://www.cnn.com/[email protected]

Page 23

Next, Content Distribution Networks

Idea here is to replicate a “web server” in many places over the Internet

• Latency to a single centralized web server farm may be too high

• A centralized web server farm may fail

Page 24

Content Routing Principle (a.k.a. Content Distribution Network)

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S)]

Page 25

Content Routing Principle (a.k.a. Content Distribution Network)

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS) and an origin server (OS)]

Content Origin here at Origin Server

Content Servers distributed throughout the Internet

Page 26

Content Routing Principle (a.k.a. Content Distribution Network)

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

Content is served from content servers nearer to the client

Page 27

Two basic types of CDN: cached and pushed

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

Page 28

Cached CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Client requests content.

Page 29

Cached CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Client requests content.

2. CS checks cache, if miss gets content from origin server.

Page 30

Cached CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Client requests content.

2. CS checks cache, if miss gets content from origin server.

3. CS caches content, delivers to client.

Page 31

Cached CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Client requests content.

2. CS checks cache, if miss gets content from origin server.

3. CS caches content, delivers to client.

4. Delivers content out of cache on subsequent requests.
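The four cached-CDN steps can be sketched as a pull-through cache. The class names and the /logo.png content are assumptions for illustration; expiry and invalidation are left out.

```python
class OriginServer:
    """Holds the authoritative copy of the content."""

    def __init__(self, content):
        self.content = dict(content)
        self.fetches = 0          # how many times a CS had to come here

    def fetch(self, path):
        self.fetches += 1
        return self.content[path]


class CachingContentServer:
    """Content server (CS) that fills its cache on demand."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        # Step 1: a client request arrives.
        if path not in self.cache:
            # Step 2: cache miss, so get the content from the origin.
            # Step 3: keep a copy before delivering it.
            self.cache[path] = self.origin.fetch(path)
        # Steps 3 and 4: deliver, from cache on every later request.
        return self.cache[path]


if __name__ == "__main__":
    origin = OriginServer({"/logo.png": b"image bytes"})
    cs = CachingContentServer(origin)
    cs.get("/logo.png")
    cs.get("/logo.png")
    print("origin fetches:", origin.fetches)
```

Only the first request for a given path reaches the origin server.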

Page 32

Pushed CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Origin Server pushes content out to all CSs.

Page 33

Pushed CDN

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and clients (C)]

1. Origin Server pushes content out to all CSs.

2. Request served from CSs.
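For contrast with the cached case, a pushed CDN can be sketched as the origin copying content to every CS before any client asks. The class names and the /index.html content are assumptions for illustration.

```python
class PushedContentServer:
    """Content server (CS) that only serves what was pushed to it."""

    def __init__(self):
        self.store = {}

    def serve(self, path):
        # Step 2: requests are served locally, never from the origin.
        return self.store[path]


class PushingOrigin:
    """Origin server that replicates content to all CSs up front."""

    def __init__(self, content_servers):
        self.content_servers = list(content_servers)

    def publish(self, path, body):
        # Step 1: push the content out to every content server.
        for cs in self.content_servers:
            cs.store[path] = body


if __name__ == "__main__":
    servers = [PushedContentServer() for _ in range(3)]
    origin = PushingOrigin(servers)
    origin.publish("/index.html", b"<html>hello</html>")
    print(all(cs.serve("/index.html") == b"<html>hello</html>"
              for cs in servers))
```

The cost moves from request time to publish time: every update has to reach every CS.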

Page 34

CDN benefits

Content served closer to client

• Less latency, better performance

Load spread over multiple distributed CSs

• More robust (to ISP failure as well as other failures)

• Handle flash crowds better (load spread over ISPs)

But well-connected, replicated Hosting Centers can do this too

Page 35

CDN costs and limitations

Cached CDNs can’t deal with dynamic/personalized content

• More and more content is dynamic

• “Classic” CDNs limited to images

Managing content distribution is non-trivial

• Tension between content lifetimes and cache performance

• Dynamic cache invalidation

• Keeping pushed content synchronized and current

Page 36

What if lots of clients try to access the same CS?

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and many clients (C) clustered near one CS]

Page 37

How can the CDN spread this load around?

[Diagram: Internet topology showing backbone ISPs, ISPs, exchange points (IX), hosting centers, and sites with servers (S), plus content servers (CS), an origin server (OS), and many clients (C) clustered near one CS]

Page 38

Guess what: DNS!

Smart DNS server monitors load on the content servers

When it answers a DNS request, it picks a server that is not overloaded (and near the client)

The DNS answer has a small TTL (30 seconds to one minute)

• Small TTL allows the DNS load balancer to make fine-grained load decisions

• Can quickly offload a busy or even crashed content server
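A minimal sketch of the selection logic such a DNS server might use. The Replica fields, the 0.8 load threshold, the latency numbers, and the 192.0.2.x addresses are all assumptions for illustration; real CDNs use their own measurements and policies.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    address: str
    distance_ms: float   # assumed latency estimate to the client
    load: float          # fraction of capacity in use, 0.0 to 1.0
    alive: bool = True


def pick_server(replicas, max_load=0.8, ttl_seconds=30):
    """Return (address, ttl) for the nearest live, non-overloaded replica."""
    live = [r for r in replicas if r.alive]
    candidates = [r for r in live if r.load < max_load]
    if not candidates:
        candidates = live        # everything is busy: fall back to any live one
    if not candidates:
        return None
    best = min(candidates, key=lambda r: r.distance_ms)
    # The short TTL means clients come back soon, so the next answer can
    # reflect new load or a crashed server.
    return best.address, ttl_seconds


if __name__ == "__main__":
    replicas = [
        Replica("192.0.2.1", distance_ms=5, load=0.95),   # near but busy
        Replica("192.0.2.2", distance_ms=40, load=0.30),  # farther, idle
    ]
    print(pick_server(replicas))
```

Because answers expire within about a minute, a change in load or a crash shows up in the next round of answers rather than a day later.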

Page 39

How well do CDNs work?

Hard to say…

• Some evidence suggests they are not so good at picking nearby servers

• Internet bandwidth is improving, so not as important to pick nearby servers

• Central hosting centers are easier to manage, and perform increasingly well

In fact, Akamai is beginning to find it difficult to justify its service!