content negotiation and transcoding
1
Content Negotiation and Transcoding
Herng-Yow Chen
2
Outline
A single URL may need to correspond to different resources: multiple language support for different request language users.
HTTP provides content-negotiation methods that let clients and servers make this determination: a single URL can correspond to several different resources (for example, a French and an English version of a page), called variants.
3
Servers also can make other types of decisions about what content is best to send to a client for a particular URL.
Servers can even generate customized pages automatically, for instance by converting an HTML page into a WML page for your handheld device.
Dynamic content transformations of this kind are called transcodings.
4
Content-Negotiation Techniques
There are three distinct methods for deciding which page at a server is the right one for a client:
- Present the choice to the client
- Decide automatically at the server
- Ask an intermediary to select
5
Summary of content-negotiation techniques
Technique | How it works | Advantages | Drawbacks
Client-driven | The client makes a request, the server sends a list of choices to the client, and the client chooses. | Easiest to implement at the server side. The client can make the best choice. | Adds latency: at least two requests are needed to get the correct content.
Server-driven | The server examines the client’s request headers and decides what version to serve. | Quicker than client-driven negotiation. HTTP provides a q-value mechanism to allow servers to make approximate matches and a Vary header for servers to tell downstream devices how to evaluate requests. | If the decision is not obvious (the headers don’t match up), the server must guess.
Transparent | An intermediate device (usually a proxy cache) does the request negotiation on the client’s behalf. | Offloads the negotiation from the web server. Quicker than client-driven negotiation. | No formal specification for how to do transparent negotiation.
6
Client-Driven Negotiation
The client makes a request. The server sends a list of choices to the client. The client chooses.
Disadvantage: two requests are needed, one to get the list and a second to get the selected copy. This makes the process slow (increased latency) and tedious, because the decision is made manually at the client side in the browser.
7
Servers have two ways to present the choices, between which the user then decides manually:
- By sending back an HTML page with links to the different versions of the page and a description of each.
- By sending back an HTTP/1.1 response with the 300 Multiple Choices response code. The client browser may receive this response and display a page with the links, as in the first method, or it may pop up a dialog asking for a selection.
Another problem: multiple URLs are required, one for the main page and one for each specific page.
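The second option can be sketched in a few lines of Python. This is only an illustration, not how any particular server does it: the variant URLs, the labels, and the function name multiple_choices_response are assumptions made up for the example.

```python
from html import escape

# Hypothetical variant list: (URL, human-readable description).
VARIANTS = [
    ("/index.en.html", "English"),
    ("/index.fr.html", "French"),
]

def multiple_choices_response(variants):
    """Build a raw HTTP/1.1 300 response whose body lists the variants as links."""
    links = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for url, label in variants
    )
    body = f"<html><body><ul>\n{links}\n</ul></body></html>\n"
    payload = body.encode("utf-8")
    head = (
        "HTTP/1.1 300 Multiple Choices\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload

response = multiple_choices_response(VARIANTS)
```

Each link is a separate URL, which is exactly the "multiple URLs" cost noted above: the client still has to issue a second request for the version it picks.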
8
Server-Driven Negotiation
The client-driven approach has several drawbacks, as discussed previously; the most significant is the increased communication between client and server needed to decide on the best page.
Why not let the server decide which page to send back? For this to work, the client must send enough information about its preferences.
9
Two mechanisms to evaluate the proper response
- Examining the set of content-negotiation headers. The server looks at the client’s Accept headers and tries to match them with the corresponding response headers.
- Varying on other (non-content-negotiation) headers. For example, the server could send responses based on the client’s User-Agent header.
10
Content-Negotiation Headers
Header | Description
Accept | Tells the server what media types are okay to send
Accept-Language | Tells the server what languages are okay to send
Accept-Charset | Tells the server what charsets are okay to send
Accept-Encoding | Tells the server what encodings are okay to send
11
Accept and matching document headers
Accept header | Entity header
Accept | Content-Type
Accept-Language | Content-Language
Accept-Charset | Content-Type
Accept-Encoding | Content-Encoding
12
We discussed entity headers in Chapter 15; they are like a shipping label that describes the attributes of the message body.
Content-negotiation headers, on the other hand, are used by clients and servers to exchange preference information and to choose between different versions, so that the version matching the client’s preferences best, or most closely (as ranked by q values), is served.
13
Content-Negotiation Header Quality Values
For example, a client sends the following Accept-Language header:
Accept-Language: en;q=0.5, fr;q=0.0, nl; q=1.0, tr;q=0.0
The q values range from 0.0 to 1.0 (the highest preference). In this case, the client prefers to receive a Dutch (nl) version, but an English (en) version will do. Under no circumstances does the client want a French (fr) or Turkish (tr) version.
The order of the entries is not important.
Occasionally, the server may not have any document that matches any of the client’s preferences. In this case, the server may change, or transcode, the document to match the client’s preferences (discussed later).
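A minimal Python sketch of how a server might rank variants by q value follows. The function names parse_accept_language and choose_language are invented for this example, and the sketch deliberately ignores the "*" wildcard and language-prefix matching, so it is a simplification rather than a full implementation of the HTTP rules.

```python
def parse_accept_language(header):
    """Return a list of (language, q) pairs from an Accept-Language value."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: treat an unparsable q as "not acceptable"
        prefs.append((lang, q))
    return prefs

def choose_language(header, available):
    """Pick the available language with the highest nonzero q, or None."""
    best, best_q = None, 0.0
    for lang, q in parse_accept_language(header):
        if lang in available and q > best_q:
            best, best_q = lang, q
    return best

HEADER = "en;q=0.5, fr;q=0.0, nl; q=1.0, tr;q=0.0"
print(choose_language(HEADER, {"en", "fr"}))  # only English and French on hand
```

With only English and French versions available, the sketch selects English, because French carries q=0.0. When it returns None, the server has no acceptable variant and would have to guess, fail, or transcode.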
14
Varying on Other Headers
Servers can also attempt to match responses against other client request headers, such as User-Agent.
For example, a server may know that old versions of a browser, or certain browser types, do not support JavaScript, and may therefore send back a version without JavaScript.
In this case, there are no q values with which to find an approximate best match. The server either looks for an exact match or simply serves whatever it has.
15
Because caches must attempt to serve the correct “best” version of a cached document, HTTP defines a Vary header that the server sends in responses.
The Vary header tells caches (and clients, and any downstream proxies) which request headers the server is using to determine the best version of the response to send (discussed later).
16
Content Negotiation on Apache
Suppose a web site content provider, Joe for example, wants to provide different versions of his index page. Joe must put all his index page files in the appropriate directory of the Apache server. There are two ways to enable negotiation:
- In the web site directory, create a type-map file for each URI in the web site that has variants.
- Enable MultiViews, which causes Apache to create type-map files for the directory automatically.
17
Using type-map file
AddHandler type-map .var
Here is a sample type-map file:
URI: joes-hardware.html

URI: joes-hardware.en.html
Content-type: text/html
Content-language: en

URI: joes-hardware.fr.de.html
Content-type: text/html;charset=iso-8859-2
Content-language: fr, de
18
Using Multi-Views
Use the Options directive to enable MultiViews for the directory (inside a <Directory>, <Location>, or <Files> section), for example with Options +MultiViews.
The server looks for all files with “joes-hardware” in the name and creates a type map for them.
Based on the file names, the server guesses the content-negotiation headers to which the files correspond.
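The filename-based guessing can be illustrated with a short Python sketch. It is not Apache's actual algorithm: the extension tables and the function name guess_variant are assumptions for the example, and a real server takes its extension mappings from its own configuration.

```python
# Hypothetical extension tables; a real server reads these from its configuration.
LANGUAGES = {"en", "fr", "de", "es"}
MEDIA_TYPES = {"html": "text/html", "txt": "text/plain"}

def guess_variant(filename, base):
    """Guess content type and languages from the extensions that follow `base`."""
    if not filename.startswith(base + "."):
        return None  # not a variant of this resource
    info = {"uri": filename, "content_type": None, "languages": []}
    for ext in filename[len(base) + 1:].split("."):
        if ext in LANGUAGES:
            info["languages"].append(ext)
        elif ext in MEDIA_TYPES:
            info["content_type"] = MEDIA_TYPES[ext]
    return info

print(guess_variant("joes-hardware.en.html", "joes-hardware"))
```

The resulting dictionary carries the same information as one entry of a hand-written type-map file: the URI, the content type, and the content languages.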
Two other ways to implement content negotiation at the server are:
- A server-side extension, such as Microsoft’s Active Server Pages (ASP)
- Any CGI program, i.e., doing the negotiation yourself
19
Transparent Negotiation
Transparent negotiation seeks to move the load of server-driven negotiation away from the server, while minimizing message exchange with the client, by having an intermediary proxy negotiate on behalf of the client.
The proxy is assumed to know the client’s expectations and to be capable of performing the negotiation on its behalf.
20
Caching and Alternates
21
Caches use content-negotiation headers to send back correct responses to clients
[Figure: a French-speaking user sends the request below through a cache to the web server. The server holds three versions of the page; the cache holds the French copy (“Bonjour”).]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-agent: spiffy multimedia browser
Accept-language: fr;q=1.0
Versions on the web server:
Hi! Welcome to Joe’s Hardware Store.
Hola! Bienvenido a Joe’s Hardware Store.
Bonjour! Bienvenue a Joe’s Hardware Store.
22
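Caches use content-negotiation headers to send back correct responses to clients
[Figure: a Spanish-speaking user sends the request below through the cache to the web server. The cache holds the French (“Bonjour”) and Spanish (“Bienvenido”) copies.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-agent: spiffy multimedia browser
Accept-language: es;q=1.0
Version returned by the web server:
Hola! Bienvenido a Joe’s Hardware Store.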
23
The Vary Header
The huge number of different User-Agent and Cookie values could generate many variants when a response carries a header such as:
Vary: User-Agent, Cookie
24
Caches match request headers
[Figure, step 1: a French-speaking user sends the request below through the cache to the web server.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-agent: spiffy multimedia browser
Accept-language: fr;q=1.0
Web server: “I need to send her a French document. Since she has such a cool browser, I’ll send her a media-rich version of the page.”
HTTP/1.1 200 OK
Content-language: fr
Vary: User-agent

Bonjour […media-rich content]
25
Caches match request headers
[Figure, step 2: a second French-speaking user sends the request below through the same cache.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-agent: wimpy wireless device
Accept-language: fr;q=1.0
Cache: “He wants a French copy of the document and I have it in my cache, but I’d better not send it to him. The server said my cached copy was for a spiffy browser. This guy has a wimpy wireless one. I had better ask the server for a French version for the wireless browser.”
HTTP/1.1 200 OK
Content-language: fr
Vary: User-agent

Bonjour […simple text content]
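The cache's reasoning in these two steps can be sketched in Python as follows. The class name VaryCache and its methods are invented for this illustration. It compares header values as exact strings and does not handle "Vary: *", so it shows the idea rather than a complete caching implementation.

```python
class VaryCache:
    """Store several variants per URL, keyed by the request headers named in Vary."""

    def __init__(self):
        self._entries = {}  # url -> list of (vary_names, selecting_values, body)

    @staticmethod
    def _select(request_headers, vary_names):
        # Header names are case-insensitive, so compare them in lowercase.
        lowered = {k.lower(): v for k, v in request_headers.items()}
        return tuple(lowered.get(name) for name in vary_names)

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("Vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        key = self._select(request_headers, names)
        self._entries.setdefault(url, []).append((names, key, body))

    def lookup(self, url, request_headers):
        for names, key, body in self._entries.get(url, []):
            if self._select(request_headers, names) == key:
                return body
        return None  # miss: the cache must forward the request to the server

cache = VaryCache()
spiffy = {"User-Agent": "spiffy multimedia browser", "Accept-Language": "fr;q=1.0"}
wimpy = {"User-Agent": "wimpy wireless device", "Accept-Language": "fr;q=1.0"}
response_headers = {"Content-Language": "fr", "Vary": "User-Agent"}

cache.store("/", spiffy, response_headers, "Bonjour [media-rich content]")
first_try = cache.lookup("/", wimpy)  # None: the stored copy was for another browser
cache.store("/", wimpy, response_headers, "Bonjour [simple text content]")
second_try = cache.lookup("/", wimpy)
```

The first lookup for the wireless device misses even though a French copy is cached, because the stored response varied on User-Agent. Once the server's answer for the wireless device is stored as a second entry, the same lookup hits.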
26
Transcoding
We have discussed the mechanism by which clients and servers can choose between a set of documents for a URL and send the one that best matches the client’s needs.
What happens, however, when a server does not have a document that matches the client’s needs at all? The server could respond to the client with an error, but there is another solution: transcoding, that is, transforming the unsatisfactory document into something that the client can use.
27
Three categories of transcoding
- Format conversion: addresses compatibility problems and bandwidth issues
- Information synthesis: information summaries, advertisement removal
- Content injection (increasing the amount of content): automatic ad generators and user-tracking systems, which collect statistics about how the page is viewed and how clients surf the Web
28
Hypothetical transcoding
Before | After
HTML document | WML document
High-resolution image | Low-resolution image
Image in 64K colors | Black-and-white image
Complex page with frames | Simple text page without frames or images
HTML page with Java applets | HTML page without Java applets
Page with ads | Page with ads removed
29
Transcoding Versus Static Pregeneration
An alternative to transcoding is to build different copies of the web pages in advance at the web server: for example, one in HTML and one in WML, one with high-resolution images and one with low-resolution images.
Is this practical? It raises storage costs and management problems.
30
Content transformation or transcoding at a proxy cache
[Figure: a French-speaking user sends the request below to a proxy cache that contains a transmogrifier and sits in front of the web server.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-agent: wimpy wireless device
Accept-language: fr;q=1.0
Cache: “I have a French copy of the document that he wants, but my copy is very media-rich and he has a wimpy wireless browser. I will strip out all of the multimedia content and send it to him.”
HTTP/1.1 200 OK
Content-language: fr
Vary: User-agent

Bonjour […simple text content]
Cache, after transcoding: “Since I have transformed this document for a wireless device, I will store the transformed copy as an alternate in case someone else wants it as well.”
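As a rough illustration of the "strip out all of the multimedia content" step, the following Python sketch removes images, applets, and scripts from an HTML page using the standard html.parser module. The sets of element names and the names MediaStripper and strip_media are assumptions for this example. It also assumes well-formed input in which every dropped container element is properly closed, so it is a sketch, not a production transcoder.

```python
from html.parser import HTMLParser

# Elements dropped for the simple device; these sets are assumptions for the sketch.
VOID_MEDIA = {"img", "embed"}  # elements without a closing tag
BLOCK_MEDIA = {"applet", "object", "script", "video", "audio"}  # dropped with their content

class MediaStripper(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=False)  # keep entities as written
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped container element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_MEDIA:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag not in VOID_MEDIA:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if self.skip_depth == 0 and tag not in VOID_MEDIA and tag not in BLOCK_MEDIA:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCK_MEDIA:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag not in VOID_MEDIA:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")

def strip_media(html):
    """Return the HTML with media elements removed and other markup kept."""
    parser = MediaStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

page = (
    '<p>Bonjour! <img src="logo.png">Bienvenue &agrave; Joe&#39;s.</p>'
    '<applet code="A.class"><param name="x" value="1">fallback</applet>'
    "<script>var a = 1;</script>"
)
simple_page = strip_media(page)
```

A proxy could store the stripped page as an alternate next to the media-rich original, so that later requests from similar devices are served without repeating the transformation.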
31
For More Information
RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1
RFC 2295, Transparent Content Negotiation in HTTP
RFC 2296, HTTP Remote Variant Selection Algorithm -- RVSA/1.0
RFC 2936, HTTP MIME Type Handler Detection
http://www.imc.org/ietf-medfree/index.html, a link to the Content Negotiation (CONNEG) working group