CS193H: High Performance Web Sites
Lecture 14: Rule 11 – Avoid Redirects
Steve Souders
Google
announcements
midterm Friday 10/31 3:15-4:05pm
30-40 short answer questions
10/29 3:15pm (now) – check five Web 100 Performance Profile sites
11/3 – Doug Crockford from Yahoo! will be guest lecturer talking about "Ajax Performance"
3xx status codes
"further action needs to be taken by the user agent in order to fulfill the request"
• 300 Multiple Choices (based on Content-Type)
• 301 Moved Permanently
• 302 Moved Temporarily (aka, Found)
• 303 See Other (clarification of 302)
• 304 Not Modified
• 305 Use Proxy
• 306 (no longer used)
• 307 Temporary Redirect (clarification of 302)
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3
response for conditional GET request
most popular
HTTP/1.1
redirect example
go to the new location instead of the original one
why use redirects?
prettier URLs
track traffic
authentication
Request
GET / HTTP/1.1
Host: astrology.yahoo.com
Response
HTTP/1.1 301 Moved Permanently
Location: http://shine.yahoo.com/astrology/
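The exchange above can be sketched as the step a user agent takes when it sees a redirect. This is a minimal sketch, not any browser's actual logic: the function name `nextLocation` and the choice to treat only 301, 302, 303 and 307 as redirects are assumptions made for illustration.

```javascript
// Statuses this sketch treats as "go to the URL in Location".
// (An assumption for illustration; 304 and 305 work differently.)
const REDIRECT_STATUSES = new Set([301, 302, 303, 307]);

// Returns the absolute URL to request next, or null when the response
// is not a redirect. `requestUrl` is the URL that produced the response.
function nextLocation(requestUrl, status, headers) {
  if (!REDIRECT_STATUSES.has(status)) return null;
  const location = headers["Location"] || headers["location"];
  if (!location) return null;
  // Location may be relative, so resolve it against the request URL.
  return new URL(location, requestUrl).href;
}

// The request/response pair from the slide.
const target = nextLocation("http://astrology.yahoo.com/", 301, {
  Location: "http://shine.yahoo.com/astrology/",
});
console.log(target);
```

The extra round trip happens before any byte of the real HTML document arrives.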
worst blocker
inserting a redirect to the HTML document is worse than how stylesheets and scripts block
• all resources in the page are delayed
• the user gets very little feedback (nothing in the page)
• rendering, even the HTML text, is delayed
2nd worst – redirecting to a script
caching redirects
"Moved Permanently" – is it cached?
no
spec: "cacheable if indicated by a Cache-Control or Expires header field"
Request
GET / HTTP/1.1
Host: astrology.yahoo.com
Response
HTTP/1.1 301 Moved Permanently
Date: Tue, 28 Oct 2008 07:39:53 GMT
Location: http://shine.yahoo.com/astrology/
Cache-Control: private
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
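The response above has no Expires header and no max-age, which is the kind of signal the spec quote refers to. A minimal sketch of that check follows. The helper name `hasExplicitFreshness` and the plain-object header format are assumptions, and the function only looks for a future Expires date or a positive max-age; it does not model what any particular browser caches.

```javascript
// Returns true when a response carries explicit freshness information:
// a positive Cache-Control max-age or an Expires date later than `now`.
function hasExplicitFreshness(headers, now = new Date()) {
  // Header names are case-insensitive, so normalize them first.
  const lower = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = String(value);
  }
  const cacheControl = lower["cache-control"] || "";
  if (/\bno-store\b|\bno-cache\b/i.test(cacheControl)) return false;
  const maxAge = /\bmax-age=(\d+)/i.exec(cacheControl);
  if (maxAge) return Number(maxAge[1]) > 0;
  if (lower["expires"]) {
    const expires = new Date(lower["expires"]);
    return !Number.isNaN(expires.getTime()) && expires > now;
  }
  return false;
}

// The astrology.yahoo.com redirect from the slide: no Expires, no max-age.
const yahooRedirect = {
  Date: "Tue, 28 Oct 2008 07:39:53 GMT",
  Location: "http://shine.yahoo.com/astrology/",
  "Cache-Control": "private",
};
console.log(hasExplicitFreshness(yahooRedirect, new Date(yahooRedirect.Date)));
```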
caching: 301, 302, Expires
FF2 and Chrome – only browsers to cache redirects
| | past Expires | no Expires | future Expires |
|---|---|---|---|
| 301 Moved Permanently | don't cache: all | don't cache: IE, FF3, Safari, Opera; cache: FF2, Chrome | don't cache: IE, FF3*, Safari, Opera; cache: FF2, Chrome |
| 302 Moved Temporarily | don't cache: all | don't cache: IE, FF3, Safari, Opera, Chrome; cache: FF2 | don't cache: IE, FF3*, Safari, Opera; cache: FF2, Chrome |
*Firefox 3.1 fixes regression from FF2 to FF3
browsers tested: IE 7 & 8 (beta 2), Firefox 2.0 & 3.0, Safari 4, Opera 9.61, Chrome 0.2
redirect alternatives
JavaScript
document.location = "destination.php";
what if JavaScript is disabled or not present?
meta refresh – put in document's HEAD
<meta http-equiv="refresh" content="0; url=destination.php">
the "0" in content is the # of seconds
in IE, causes conditional GET requests for all resources (similar to Reload button)
cache workaround
<html><head>
<script type="text/javascript">
window.onload = function () { document.location = "destination.php"; }
</script>
<noscript><meta http-equiv="refresh" content="0; url=destination.php"></noscript>
</head>
need to let the page load (window.onload) so it can be cached
one last thing – make this document cacheable!
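"Make this document cacheable" means serving the workaround page with a future Expires header. A minimal sketch of building such a response follows. The names `buildRedirectPage` and `escapeHtml`, the one-day default lifetime, and the returned `{ status, headers, body }` shape are all assumptions for illustration, not part of the lecture.

```javascript
// Escape text for use inside a double-quoted HTML attribute.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds the JavaScript + meta refresh page from the slide, plus headers
// that give it a future Expires so the browser can cache it.
function buildRedirectPage(destination, now = new Date(), maxAgeSeconds = 86400) {
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  // JSON.stringify yields a valid JavaScript string literal; escaping "<"
  // keeps a stray "</script>" in the URL from ending the script early.
  const jsUrl = JSON.stringify(destination).replace(/</g, "\\u003c");
  const body = [
    "<html><head>",
    '<script type="text/javascript">',
    `window.onload = function () { document.location = ${jsUrl}; };`,
    "</script>",
    `<noscript><meta http-equiv="refresh" content="0; url=${escapeHtml(destination)}"></noscript>`,
    "</head><body></body></html>",
  ].join("\n");
  return {
    status: 200,
    headers: {
      "Content-Type": "text/html; charset=utf-8",
      Date: now.toUTCString(),
      Expires: expires.toUTCString(),
      "Cache-Control": `max-age=${maxAgeSeconds}`,
    },
    body,
  };
}

const page = buildRedirectPage(
  "destination.php",
  new Date("Tue, 28 Oct 2008 07:39:53 GMT")
);
console.log(page.headers.Expires);
```

A server would write `page.status`, `page.headers` and `page.body` to the response; on a repeat visit the page comes from cache and only the destination is requested.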
redirects in the top 10
mostly ads
| site | # redirects |
|---|---|
| www.aol.com | 5 |
| www.ebay.com | |
| www.facebook.com | |
| www.google.com/search | |
| search.live.com/results | |
| www.msn.com | 1 |
| www.myspace.com | |
| en.wikipedia.org/wiki | |
| www.yahoo.com | 2 |
| www.youtube.com | |
common uses
1. redirect from blah.com to www.blah.com
2. missing trailing slash
3. tracking internal traffic
4. tracking outbound traffic
5. prettier URLs, preserve old URLs
6. connecting web sites
7. ads
8. authentication
use 1: www
Request
GET / HTTP/1.1
Host: aol.com
Response
HTTP/1.x 301 Moved Permanently
Date: Tue, 28 Oct 2008 23:01:42 GMT
Expires: Tue, 28 Oct 2008 23:31:42 GMT
Location: http://www.aol.com/
why redirect from http://aol.com/ to http://www.aol.com/?
• set cookies on www domain – non-issue
• cache resources once regardless of which URL is used
http://aol.com/logo.gif
http://www.aol.com/logo.gif
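The aol.com response above pairs the 301 with an Expires 30 minutes after Date. A minimal sketch of producing that kind of response follows. The function name `wwwRedirect`, the 30-minute default, and the returned object shape are assumptions; a real server would also need to handle https and ports.

```javascript
// Returns a 301 to the www host with a future Expires, or null when the
// request already targets a www host and no redirect is needed.
function wwwRedirect(host, path, now = new Date(), maxAgeSeconds = 1800) {
  if (host.toLowerCase().startsWith("www.")) return null;
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    status: 301,
    headers: {
      Date: now.toUTCString(),
      Expires: expires.toUTCString(),
      Location: `http://www.${host}${path}`,
    },
  };
}

const aol = wwwRedirect("aol.com", "/", new Date("Tue, 28 Oct 2008 23:01:42 GMT"));
console.log(aol.headers.Location, aol.headers.Expires);
```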
use 1: www in the top 10
which top 10 sites redirect from blah.com to www.blah.com?
| site | status | Expires |
|---|---|---|
| aol.com | 301 | +30 mins |
| ebay.com | 301 | |
| facebook.com | 301 | July 1997 |
| google.com | 301 | +30 days |
| search.live.com | na | |
| msn.com | 301 | |
| myspace.com | 301 | |
| wikipedia.org | 200 | |
| yahoo.com | 301 | |
| youtube.com | 303 | Apr 1971 |
303 See Other – "MUST NOT be cached"
how is this possible?!
use 1: www & Wikipedia
all resources referenced via full URLs
easy, if you're doing
CDN
domain sharding
cookieless domain
another alternative
<base href="http://www.wikipedia.org">
use 2: missing trailing slash
Request
GET /msn HTTP/1.1
Host: astrocenter.astrology.msn.com
Response
HTTP/1.x 301 Moved Permanently
Location: http://astrocenter.astrology.msn.com/msn/
reasons to redirect for missing trailing slash:
autoindexing
workaround: don't use autoindexing
relative URLs for resources
workaround: use base href, full URLs, or URLs relative to root
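The relative-URL reason is easy to see by resolving the same resource against the page URL with and without the trailing slash. This is a minimal sketch using the standard `URL` constructor; the helper name `resolveResource` and the resource `logo.gif` under `/msn/` are hypothetical.

```javascript
// Resolve a resource reference the way a browser would for a page at pageUrl.
function resolveResource(pageUrl, resourceUrl) {
  return new URL(resourceUrl, pageUrl).href;
}

const noSlash = "http://astrocenter.astrology.msn.com/msn";
const withSlash = "http://astrocenter.astrology.msn.com/msn/";

// A relative URL resolves differently depending on the trailing slash.
console.log(resolveResource(noSlash, "logo.gif"));
console.log(resolveResource(withSlash, "logo.gif"));

// A URL relative to root resolves the same either way, so no redirect is needed.
console.log(resolveResource(noSlash, "/msn/logo.gif"));
console.log(resolveResource(withSlash, "/msn/logo.gif"));
```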
use 3: internal tracking
Request
GET /_ylt=Al…ume/**http%3A//tools.search.yahoo.com/about/forsearchers.html HTTP/1.1
Host: m.www.yahoo.com
Response
HTTP/1.x 302 Moved Temporarily
Location: http://tools.search.yahoo.com/about/forsearchers.html
"More" link on Yahoo! front page
workaround: track referer [sic] on internal servers
use 4: outbound tracking
Request
GET /url?…url=http%3A%2F%2Fwww.npr.org/… HTTP/1.1
Host: www.google.com
Response
HTTP/1.x 302 Found
Location: http://www.npr.org/
clicking on a Google search result
workarounds:
image beacon – race conditions
XMLHttpRequest readyState 2 – faster, more complex
HTML 5
<a ping="http://...">
<link rel=pingback href="http://...">
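The image beacon workaround can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `logOutboundClick`, the `/track` endpoint and its `url` parameter are hypothetical, and the image constructor is passed in so the function can also be exercised outside a browser.

```javascript
// Fire an image request to a logging endpoint instead of routing the click
// through a redirect. ImageCtor is the browser's Image in normal use.
function logOutboundClick(beaconUrl, targetUrl, ImageCtor = globalThis.Image) {
  const img = new ImageCtor();
  // Setting src starts the request; the response body is ignored.
  img.src = `${beaconUrl}?url=${encodeURIComponent(targetUrl)}`;
  return img;
}

// Stand-in for the browser's Image so the sketch runs anywhere.
class FakeImage {
  constructor() {
    this.src = "";
  }
}

const beacon = logOutboundClick("/track", "http://www.npr.org/", FakeImage);
console.log(beacon.src);
```

In a page this would be called from a link's click handler. The race condition noted on the slide is that the browser may navigate away before the beacon request is sent.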
use 5: prettier URLs
Request
GET / HTTP/1.1
Host: music.myspace.com
Response
HTTP/1.x 302 Moved
Location: http://profile.myspace.com/index.cfm?fuseaction=music
prettier URLs are easier to remember
also, preserve old URLs when code changes
workaround: mod_rewrite, cacheable redirects
use 6: connecting sites
http://toolbar.google.com/
http://toolbar.google.com/index.html
http://toolbar.google.com/T5/
http://toolbar.google.com/T5/intl/en/index.html
http://www.google.com/tools/firefox/toolbar/FT3/intl/en/index.html
redirects are an easy way to "integrate" separate teams (T4, T5), separate code bases (IE, FF), separate servers (toolbar, www)
workarounds: CNAMEs, mod_rewrite
use 7: ads
specifically, counting ad impressions
advertisers and publishers have a hard time reconciling the count
when do you count an ad impression?
• when a page containing an ad is served?
  what if the page never arrives?
• when a page containing an ad arrives at the client?
  what if the ad request fails, or the user stops the page?
• when the content of the ad (image, Flash) is requested from the advertiser?
  what if the user leaves the page before the content arrives?
• after the content arrives?
  is it the publisher's fault if the content is slow?
use 7: ads
how do you count an ad impression?
• when a page containing an ad is served?
  count it on the publisher's backend
• when a page containing an ad arrives at the client?
  send a beacon from the client
• when the content of the ad (image, Flash) is requested from the advertiser?
  count it on the advertiser's backend
• after the content arrives?
  send a beacon from the client
redirects can help count when content is served and reconcile the two parties
use 7: ads
http://ad.doubleclick.net/im…5|1;;cs=o%3fhttp://ad.doubleclick.net/dot.gif?258824979
http://ad.doubleclick.net/dot.gif?258824979
http://eatps.web.aol.com:9000/open_web_adhoc?subtype=40…458
http://www.aolcdn.com/pops_promo/pixel
http://ad.doubleclick.net/ad/N553.AEAOLService/B2775919.11;dcadv=1297440;sz=1x1;ord=4613012?
http://m1.2mdn.net/viewad/1297440/1x1.gif
from http://www.aol.com/
double logging?
use 7: ads
http://ads.cnn.com/event.ng/Type=count&Clie…ARd
http://i.cdn.turner.com/cnn/images/1.gif
http://ad.doubleclick.net/ad/N3880.SD146.3880/B3107454.25;dcove=o;sz=1x1;ord=dwgksue,beqpWcytARh?
http://m1.2mdn.net/viewad/1139835/67-1x1.gif
from http://www.cnn.com/
double logging?
use 8: authentication
cookies are used for authentication
cookies can only be set on the page's domain
how do you authenticate someone on domain A if they're currently on domain B?
redirects
authentication is often on https servers
how do you authenticate someone on https if they're currently on http?
redirects
use 8: authentication
https://www.google.com/accounts/ServiceLoginBoxAuth
https://www.google.com/accounts/CheckCookie?continue=http%3A%2F%2Fgroups.google.com%2Fgroups%2Fauth%3F_done… http://groups.google.com/groups/auth?_done…
http://groups.google.com/groups/auth?_done… http://groups.google.com/
one reason why redirects with Set-Cookie are sometimes not cached
avoid redirects
eliminate the need
• base href or full URLs for resources
• referer tracking
• HTML 5 – A ping and LINK pingback
• CNAMEs
• mod_rewrite
• no autoindex
make them cacheable
• 301 with future Expires
• JavaScript & meta refresh with future Expires
Homework
study for midterm – 10/31 3:15-4:05pm
11/7 11:59pm – rules 4-10 applied to your "Improving a Top Site" class project
Questions
What's the status text for 301 and 302?
What HTTP response header contains the URL the user is redirected to?
Why are redirects worse than stylesheets and scripts in terms of blocking?
If a redirect is "Moved Permanently", does that mean it's cached?
Which browsers today cache redirects?
What are two other techniques for doing redirects? How do they compare to the 301/302 status approach?