1 HTTP messages Entities and Encoding Herng-Yow Chen

Upload: margaret-dean

Post on 26-Dec-2015



HTTP Messages: Entities and Encoding

Herng-Yow Chen


Outline

The format and behavior of HTTP message entities as HTTP containers

How HTTP describes the size of entity bodies, and what HTTP requires in the way of sizing

The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly


Reversible content encoding transforms data format to take up less space or be more secure

Transfer encoding modifies how HTTP ships data to enhance the communication of some kinds of data

Chunked encoding chops data into multiple pieces to deliver content of unknown length safely


An assortment of tags, labels, times, and checksums helps clients get the latest version of requested content

Ranges are useful for continuing aborted downloads where they left off

Delta encoding extensions allow clients to request just those parts of a web page that have actually changed since a previously viewed revision


Checksums of entity bodies are used to detect changes in entity content as it passes through proxies


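A message is made up of headers and a body

    HTTP/1.0 200 OK
    Server: Netscape_Enterprise/3.6
    Date: Sun, 17 Sep 2000 00:01:05 GMT
    Content-Type: text/plain
    Content-Length: 18

    Hi! I'm a message!

Entity headers: Content-Type and Content-Length

Entity body: "Hi! I'm a message!"

Entity: the entity headers together with the entity body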
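To make the header/body split concrete, here is a minimal Python sketch that parses the message above and checks the body against Content-Length. The function name parse_http_message is a hypothetical helper, and the sketch assumes CRLF line endings and a complete message already held in memory.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP message into start line, headers, and entity body."""
    # The first blank line (CRLF CRLF) separates the header block from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Server: Netscape_Enterprise/3.6\r\n"
    b"Date: Sun, 17 Sep 2000 00:01:05 GMT\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 18\r\n"
    b"\r\n"
    b"Hi! I'm a message!"
)
start_line, headers, body = parse_http_message(raw)
# The declared length should match the number of body bytes received.
length_matches = int(headers["content-length"]) == len(body)
```

A real parser would also have to handle repeated headers and bodies that arrive in several reads; this sketch ignores both.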


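HTTP/1.1 defines 10 entity headers:

Content-Type
Content-Length
Content-Language
Content-Encoding
Content-Location
Content-Range
Content-MD5
Last-Modified
Expires
Allow

Two related headers are usually discussed with them: ETag (formally a response header) and Cache-Control (formally a general header).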


Entity Bodies


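Why is Content-Length important?

Detecting truncation: without Content-Length, a client cannot tell a complete body from one cut short by a prematurely closed connection.

Incorrect Content-Length values cause problems, because receivers trust the header to find the end of the body.

On a persistent connection, Content-Length tells the receiver where one entity body ends and the next message begins.

Chunked encoding is an alternative: the data is sent as a series of chunks, each with a specified chunk size.

When content encoding is applied, Content-Length refers to the length of the encoded body, not the length of the original, unencoded body.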
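The role of Content-Length on a persistent connection can be sketched as follows. This is an illustrative Python example, not a production parser: split_messages is a hypothetical helper, every message is assumed to carry Content-Length, and the whole byte stream is assumed to be available at once.

```python
def split_messages(stream: bytes):
    """Split back-to-back responses on one connection using Content-Length."""
    messages = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = None
        for line in head.split(b"\r\n")[1:]:      # skip the start line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("no Content-Length: cannot find the end of the body")
        if len(rest) < length:
            # Fewer bytes than promised: the body was truncated.
            raise ValueError("body truncated")
        messages.append((head, rest[:length]))
        stream = rest[length:]                    # the next message starts here
    return messages


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nfirst"
    b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nsecond"
)
bodies = [body for _, body in split_messages(stream)]
```

The same length check is what lets a receiver notice truncation: if the connection closes before the promised number of bytes arrives, the body is known to be incomplete.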


Entity Digest

Content-MD5

Is used to check message integrity.

Can also be used as a key into a hash table to quickly locate documents and reduce duplicate storage of content.
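Both uses of the digest can be illustrated with a short Python sketch. Per RFC 1864, the Content-MD5 value is the base64 encoding of the 128-bit MD5 digest of the body; the helper names and the in-memory store dictionary below are assumptions made for the example.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the Content-MD5 header value for a body (base64 of the MD5 digest)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def verify(body: bytes, header_value: str) -> bool:
    """Integrity check: recompute the digest and compare it with the header."""
    return content_md5(body) == header_value


# Digest used as a hash-table key: identical bodies are stored only once.
store = {}
for body in (b"Hi! I'm a message!", b"other content", b"Hi! I'm a message!"):
    store.setdefault(content_md5(body), body)
```

MD5 detects accidental modification only; it offers no protection against deliberate tampering.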


Media Type and Charset

Content-Type refers to the original entity body type, before any encoding.

It supports optional parameters that further specify the content type.

Character encodings for text media: Content-Type: text/html; charset=iso-8859-4
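As an illustration, the media type and charset parameter can be pulled out of a Content-Type value with Python's standard email.message.Message class, which parses MIME-style headers. The decode_text_body helper is hypothetical, and falling back to iso-8859-1 when no charset is given is an assumption based on the HTTP/1.1 (RFC 2616) default for text types.

```python
from email.message import Message


def decode_text_body(content_type: str, body: bytes) -> str:
    """Decode a text entity body using the charset named in Content-Type."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Assumed fallback: RFC 2616 treats text/* without a charset as ISO-8859-1.
    charset = msg.get_content_charset() or "iso-8859-1"
    return body.decode(charset)


msg = Message()
msg["Content-Type"] = "text/html; charset=iso-8859-4"
media_type = msg.get_content_type()      # the type/subtype without parameters
charset = msg.get_content_charset()      # the charset parameter, lowercased
```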


Common media types

Media type                      Description
text/html                       Entity body is an HTML document
text/plain                      Entity body is a document in plain text
image/gif                       Entity body is an image of type GIF
image/jpeg                      Entity body is an image of type JPEG
audio/x-wav                     Entity body contains WAV sound data
model/vrml                      Entity body is a three-dimensional VRML model
application/vnd.ms-powerpoint   Entity body is a Microsoft PowerPoint presentation
multipart/byteranges            Entity body has multiple parts, each containing a different range (in bytes) of the full document
message/http                    Entity body contains a complete HTTP message (see TRACE)


Multipart Media Types

MIME “multipart” email messages contain multiple messages stuck together and sent as a single, complex message.

Each component is self-contained, with its own headers describing its contents; the different components are concatenated together and delimited by a string.

HTTP also supports multipart bodies; however, they are used in only two cases: fill-in form submissions and range responses carrying pieces of a document.


Multipart Form Submissions

    <form action="http://xxx/cgi" enctype="multipart/form-data" method="POST">
      <P>
        Your Name? <INPUT type="text" name="submit-name"><br>
        Your File to send? <INPUT type="file" name="files"><br>
        <INPUT type="submit" value="send"> <INPUT type="reset">
      </P>
    </form>


If the user enters "John" and selects the text file "hello.txt":

    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --AaBo3x--
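The layout of such a body can be sketched in Python. The build_form_data function below is a hypothetical helper written for this example, not a library API; it assumes field names and filenames need no escaping and that the boundary string does not occur inside any part.

```python
def build_form_data(fields, files, boundary):
    """Build a multipart/form-data body.

    fields: {name: text value}
    files:  {name: (filename, content type, file bytes)}
    """
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
            + value.encode() + b"\r\n"
        )
    for name, (filename, ctype, data) in files.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n".encode()
            + data + b"\r\n"
        )
    # The closing delimiter is the boundary followed by two extra hyphens.
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts)


body = build_form_data(
    {"submit-name": "John"},
    {"files": ("hello.txt", "text/plain", b"... contents of hello.txt ...")},
    boundary="AaBo3x",
)
content_type = "multipart/form-data; boundary=AaBo3x"
```

Each part carries its own headers, followed by a blank line and the part's content, which is what makes the components self-contained.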


If the user selects the text file "hello.txt" and a second file, the image "image.gif":

    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"
    Content-Type: multipart/mixed; boundary=BbC04y

    --BbC04y
    Content-Disposition: file; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --BbC04y
    Content-Disposition: file; filename="image.gif"
    Content-Type: image/gif
    Content-Transfer-Encoding: binary

    … contents of image.gif …
    --BbC04y--
    --AaBo3x--


Multipart Range Response

    HTTP/1.0 206 Partial Content
    Server: Microsoft-IIS/5.0
    Content-Location: http://xxx/hello.txt
    Content-Type: multipart/x-byteranges; boundary=--[abcdefghik…z]--

    ----[abcdefghik…z]--
    Content-Type: text/plain
    Content-Range: bytes 0-174/1441

    … Part I content …
    ----[abcdefghik…z]--
    Content-Type: text/plain
    Content-Range: bytes 1344-1440/1441

    … Part II content …
    ----[abcdefghik…z]----


Content-Encoding

HTTP applications sometimes want to encode content before sending it, to help lessen the time it takes to transmit the data.

Content-Type is the type of the original format, before encoding

Content-Length is the length of the encoded body


Content Encoding

Original content: Content-Type: text/html, Content-Length: 17571

The gzip content encoder produces the content-encoded content: Content-Type: text/html, Content-Length: 5746, Content-Encoding: gzip

The gzip content decoder restores the original content: Content-Type: text/html, Content-Length: 17571
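The header changes in this round trip can be reproduced with Python's standard gzip module. This is a minimal sketch: content_encode is a hypothetical helper, and the sample HTML is made up, so the byte counts will differ from the 17571 and 5746 shown above.

```python
import gzip


def content_encode(body: bytes, headers: dict):
    """Apply gzip content encoding and return the new body and headers."""
    encoded = gzip.compress(body)
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    # Content-Length now describes the encoded body, not the original.
    new_headers["Content-Length"] = str(len(encoded))
    # Content-Type is left alone: it still names the original format.
    return encoded, new_headers


html = b"<html><body>" + b"<p>hello, entity</p>" * 500 + b"</body></html>"
headers = {"Content-Type": "text/html", "Content-Length": str(len(html))}
encoded, encoded_headers = content_encode(html, headers)
decoded = gzip.decompress(encoded)   # the receiver reverses the encoding
```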


Content-encoding tokens

Content-Encoding value   Description
gzip                     Using the GNU zip encoding (RFC 1952)
compress                 Using the UNIX file compression program
deflate                  Using the zlib format (RFC 1950) for deflate compression (RFC 1951)
identity                 No encoding has been performed. When a Content-Encoding header is not present, this can be assumed.


Accept-Encoding Headers

Request message (client to server):

    GET /logo.gif HTTP/1.1
    Accept-Encoding: gzip
    […]

Response message (server to client):

    HTTP/1.1 200 OK
    Content-Type: image/gif
    Content-Encoding: gzip
    […]

The server compresses the image with gzip to transport a smaller file over the thin network connection between itself and the client. This saves network bandwidth and reduces the amount of time that the client waits for the transfer, though the client will have to spend time decompressing the image once it is served.


Clients can indicate preferred encodings by attaching Q values

    Accept-Encoding: compress, gzip
    Accept-Encoding:
    Accept-Encoding: *
    Accept-Encoding: compress;q=0.5, gzip;q=1.0
    Accept-Encoding: gzip;q=1.0, identity;q=0.5, *;q=0
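How a server might act on these Q values is sketched below in Python. Both function names and the supported tuple are assumptions for the example; the sketch handles only the q parameter and treats identity as acceptable unless the header excludes it.

```python
def parse_accept_encoding(value: str) -> dict:
    """Map each encoding token in an Accept-Encoding value to its Q value."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        token, _, params = item.partition(";")
        params = params.strip()
        # A missing q parameter means the default quality of 1.0.
        q = float(params[2:]) if params.startswith("q=") else 1.0
        prefs[token.strip().lower()] = q
    return prefs


def choose_encoding(value: str, supported=("gzip", "compress", "identity")):
    """Pick the supported encoding the client prefers, or None if none is acceptable."""
    prefs = parse_accept_encoding(value)

    def quality(enc):
        if enc in prefs:
            return prefs[enc]
        if "*" in prefs:
            return prefs["*"]          # "*" covers every token not listed
        return 1.0 if enc == "identity" else 0.0

    best = max(supported, key=quality)
    return best if quality(best) > 0 else None


choice = choose_encoding("compress;q=0.5, gzip;q=1.0")
```

A Q value of 0 means "not acceptable", which is why the last header above, ending in *;q=0, rules out every encoding except gzip and identity.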


Transfer Encoding

Content encodings transform the entity content itself, to save space or for security reasons, and are tightly associated with the content format.

In comparison, transfer encodings are applied for architectural reasons and are independent of the content format.


Content encoding vs. transfer encoding

Content-encoded response (normal header block, followed by a normal entity that is just encoded):

    HTTP/1.0 200 OK
    Content-Encoding: gzip
    Content-Type: text/html
    […]
    [encoded message]

Transfer-encoded response (basic header, followed by encoded blocks):

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked

    10
    abcdefghijk
    1
    a

A content-encoded message just encodes the entity section of the message. With transfer-encoded messages, the encoding is a function of the entire message, changing the structure of the message itself.


Transfer-Encoding Headers

TE: used in the request header to tell the server which extension transfer encodings are okay to use.

Transfer-Encoding: used in the response header to tell the receiver (client) what encoding has been performed on the message.


Example

Request:

    GET /1.html HTTP/1.1
    Host: www.csie.ncnu.edu.tw
    User-Agent: Mozilla/4.61
    TE: trailers, chunked

Response:

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Server: Apache 3.0


Chunked Encoding


Chunked Encoding (continued)

Chunking and Persistent connection

Trailers in chunked messages

Combining Content and Transfer Encoding
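The chunked wire format can be shown with a small Python sketch. Each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF; a zero-size chunk ends the body. The function names are hypothetical, and the decoder assumes the whole message is in memory and ignores any trailers.

```python
def encode_chunked(pieces) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for piece in pieces:
        if piece:  # skip empty pieces: a zero-size chunk would end the body early
            out += f"{len(piece):x}\r\n".encode() + piece + b"\r\n"
    return out + b"0\r\n\r\n"   # last chunk, followed by an empty trailer section


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from chunked data."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        # The size is hexadecimal; anything after ";" is a chunk extension.
        size = int(data[pos:eol].split(b";")[0], 16)
        pos = eol + 2
        if size == 0:
            return body          # trailers, if present, are ignored in this sketch
        body += data[pos:pos + size]
        pos += size + 2          # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"abcdefghijk", b"a"])
body = decode_chunked(wire)
```

Because the sender only needs to know the size of each piece, it can start transmitting before the total length is known, and the receiver can still find the end of the body on a persistent connection.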


Combining Content and Transfer Encodings

Original entity: Content-Type: text/html

After content encoding: Content-Type: text/html, Content-Encoding: gzip

After transfer encoding (chunking): Content-Type: text/html, Content-Encoding: gzip, Transfer-Encoding: chunked
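The order of the two encodings can be sketched in Python: the sender applies the content encoding first and the transfer encoding second, and the receiver undoes them in reverse. The chunk and unchunk helpers and the 16-byte chunk size are assumptions for the example.

```python
import gzip


def chunk(data: bytes, size: int) -> bytes:
    """Transfer-encode data as fixed-size chunks plus the terminating chunk."""
    out = b""
    for i in range(0, len(data), size):
        piece = data[i:i + size]
        out += f"{len(piece):x}\r\n".encode() + piece + b"\r\n"
    return out + b"0\r\n\r\n"


def unchunk(wire: bytes) -> bytes:
    """Remove the chunked transfer encoding (no extensions or trailers handled)."""
    body, pos = b"", 0
    while True:
        eol = wire.index(b"\r\n", pos)
        n = int(wire[pos:eol], 16)
        if n == 0:
            return body
        body += wire[eol + 2:eol + 2 + n]
        pos = eol + 2 + n + 2    # move past the data and its trailing CRLF


entity = b"<html>" + b"x" * 1000 + b"</html>"
sent = chunk(gzip.compress(entity), size=16)    # content-encode, then chunk
received = gzip.decompress(unchunk(sent))       # unchunk, then decode
```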


Time-Varying Instance

Web objects usually are not static.

The same URL can, over time, point to different versions of an object.

For example, consider the website of a media company such as CNN or the BBC.


Time-Varying Instances


Validators and Freshness

In the previous CNN example, the client got the initial resource V1 and can cache this copy, but for how long?

Once the document has "expired" at the client, the client must request a fresh copy from the server.

The client uses a "conditional request" to tell the server, by means of a validator, which version it currently has, and asks for a copy to be sent only if its current copy is no longer valid.


Cache-Control header directives

Directive          Message type
no-cache           Request
no-store           Request
max-age            Request
min-fresh          Request
no-transform       Request
only-if-cached     Request
public             Response
private            Response
no-cache           Response
no-store           Response
no-transform       Response
must-revalidate    Response
proxy-revalidate   Response
max-age            Response
s-maxage           Response


Conditional request types

Request type          Validator
If-Modified-Since     Last-Modified
If-Unmodified-Since   Last-Modified
If-Match              ETag
If-None-Match         ETag
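A server-side view of two of these conditional headers is sketched below in Python. The evaluate_conditional function is a hypothetical helper that covers only If-None-Match and If-Modified-Since, compares entity tags as plain strings, and assumes dates in the usual HTTP format with a GMT zone.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def evaluate_conditional(request_headers: dict, etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    if "If-None-Match" in request_headers:
        client_tags = [t.strip() for t in request_headers["If-None-Match"].split(",")]
        # The ETag validator wins when both kinds of validator are present.
        return 304 if "*" in client_tags or etag in client_tags else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        if last_modified <= since:
            return 304           # not modified since the client's copy
    return 200                   # send a full, fresh copy


last_modified = datetime(2000, 9, 17, 0, 1, 5, tzinfo=timezone.utc)
status = evaluate_conditional({"If-None-Match": '"v1"'}, '"v1"', last_modified)
```

A 304 Not Modified response carries no entity body, so revalidating an unchanged document costs only a header exchange.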


Range Request

HTTP allows clients to request just a part, or range, of a document.

Applications:

Requesting an RoI (Region of Interest)

Media indexing and access

Streaming applications


Range Requests

Request message (client to www.csie.ncnu.edu.tw):

    GET /bigfile.html HTTP/1.1
    […]

Response message:

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 65537
    Accept-Ranges: bytes
    […]

Range request message (client to www.csie.ncnu.edu.tw):

    GET /bigfile.html HTTP/1.1
    Range: bytes=20224-
    […]

Range response message:

    HTTP/1.1 206 Partial Content
    Content-Type: text/html
    Content-Range: bytes 20224-65536/65537
    Accept-Ranges: bytes
    […]

The client's original request was interrupted, but a second request for the part of the message that was not received allows the client to resume from the point of the interruption.
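How a server could answer the resumed request is sketched below in Python. The serve_range function is a hypothetical helper that handles only a single "bytes=first-last" or "bytes=first-" range; suffix ranges and multiple ranges are simply answered with the full document, which is an assumption of this sketch.

```python
def serve_range(document: bytes, range_header: str):
    """Return (status, headers, body) for a single byte-range request."""
    total = len(document)
    full = (200, {"Content-Length": str(total)}, document)
    unit, _, spec = range_header.partition("=")
    first, _, last = (s.strip() for s in spec.partition("-"))
    # Forms this sketch does not understand are ignored: send the whole document.
    if unit.strip() != "bytes" or "," in spec or not first.isdigit():
        return full
    if last and not last.isdigit():
        return full
    start = int(first)
    if last and int(last) < start:
        return full                                   # invalid range: ignore it
    if start >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    end = min(int(last), total - 1) if last else total - 1
    body = document[start:end + 1]                    # byte positions are inclusive
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


document = bytes(65537)   # stand-in for /bigfile.html
status, headers, body = serve_range(document, "bytes=20224-")
```

Byte positions start at zero, so the last byte of a 65537-byte document is byte 65536.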


Delta Encoding

An extension to the HTTP protocol that optimizes transfer by communicating changes instead of entire objects.

RFC 3229 describes delta encoding.


Delta Encoding


Delta Encoding


Delta-encoding headers

ETag

If-None-Match

A-IM

IM

Delta-Base


IANA registered types of instance manipulations

Type       Description
vcdiff     Delta using the vcdiff algorithm
diffe      Delta using the Unix diff -e command
gdiff      Delta using the gdiff algorithm
gzip       Compression using the gzip algorithm
deflate    Compression using the deflate algorithm
range      Used in a server response to indicate that the response is partial content as the result of a range selection
identity   Used in a client request's A-IM header to indicate that the client is willing to accept an identity instance manipulation


For More Information

http://www.ietf.org/rfc/rfc2616.txt
    Hypertext Transfer Protocol -- HTTP/1.1

http://www.ietf.org/rfc/rfc3229.txt
    Delta encoding in HTTP

http://www.ietf.org/rfc/rfc1521.txt
    MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies

http://www.ietf.org/rfc/rfc2045.txt
    Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies

http://www.ietf.org/rfc/rfc1864.txt
    The Content-MD5 Header Field

http://www.ietf.org/rfc/rfc3230.txt
    Instance Digests in HTTP