CP476 Internet Computing, Lecture 5: HTTP, WWW and URL

Upload: lorraine-conley

Post on 27-Dec-2015

Lecture 5. WWW, HTTP and URL

Objective:
• to review the concepts of WWW
• to understand how HTTP works
• to understand URLs

• World Wide Web (the Web): a collection of documents of information
– distributed over the Internet and connected by links
– based on TCP/IP, using HTTP
– served by Web server software and viewed with Web browsers
– has grown to involve many existing applications, such as email and file transfer

• Tim Berners-Lee at CERN proposed the World Wide Web application

• HyperText Transfer Protocol (HTTP) is the core of the Web. HTTP/1.0 was defined in RFC 1945; HTTP/1.1, the version in current use, was defined in RFC 2616 (since obsoleted by RFCs 7230-7235).

• The documents used in the Web can contain links; such documents are called hypertext. The idea of hypertext was also proposed and used by other researchers.

• HTML (Hypertext Markup Language) is a markup language used to create hypertext documents.

• HTTP is based on TCP/IP and is implemented in two programs: a client program (e.g., a Web browser) executing on a client machine and a server program (a Web server) executing on a server machine. They talk to each other by exchanging HTTP messages.

• A Web document (or Web page) is an object such as an HTML file, a JPEG image, a GIF image, a Java applet, etc. A Web page is stored on a Web server machine, has a location, and is accessible by its URL.

• Uniform Resource Locator (URL) is the global address of documents and other resources on the World Wide Web.
– The first part of the address indicates what protocol to use.
– The second part specifies the IP address or the domain name of the server host.
– The third part is the path of the object in the server directory.
– Example: http://physcomp1.wlu.ca/~hfan/cp476/cp476.php

• A browser is a user agent for the Web: it implements the client side of HTTP, displays the requested Web page, and provides many navigation and configuration functions.

• A Web server houses Web objects, each addressable by a URL, and implements the server side of HTTP.
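
The three URL parts listed above can be checked with a short sketch. This one uses Python's standard `urllib.parse.urlsplit` on the example URL from the slide. The variable name `parts` is only illustrative.

```python
from urllib.parse import urlsplit

# Split the example URL from the slide into its components.
parts = urlsplit("http://physcomp1.wlu.ca/~hfan/cp476/cp476.php")

print(parts.scheme)    # first part: the protocol, here "http"
print(parts.hostname)  # second part: the server host, here "physcomp1.wlu.ca"
print(parts.path)      # third part: the object path, "/~hfan/cp476/cp476.php"
print(parts.port)      # None: no explicit port, so HTTP's default (80) applies
```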

• HTTP functions like a combination of FTP and SMTP

– Like FTP, it transfers files from server to client, but it uses only one TCP connection (FTP uses separate control and data connections); the server listens on TCP port 80 by default.

– Like SMTP, the messages look like mail: the client sends a request message, and the server answers with a response message.

– The server is stateless, i.e., it does not keep records of previous requests.

• HTTP/1.0 is nonpersistent, i.e., the connection is closed after each request is completed.

• HTTP/1.1 is persistent by default, i.e., the connection is kept alive for a short period of time; another request arriving within that time can use the same connection.

[Figure: message exchange between a PC running a browser and a server running a server program. Initiate TCP connection: "TCP Open?" answered by "OK". Request and response: "HTTP GET …" answered by "Data". Finally: "TCP Close".]
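
The difference between nonpersistent and persistent connections can be observed with a small local experiment. The sketch below is an illustration only, not part of the lecture. It starts a throwaway server from Python's standard `http.server` on the loopback interface, sends two GET requests with `http.client`, and records the client TCP port the server saw for each request. The names `DemoHandler`, `fetch_twice`, and `seen_ports` are hypothetical helpers introduced for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_ports = []  # client TCP port observed by the server for each request


class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows the server to keep connections open
    timeout = 2                    # drop idle connections after 2 seconds

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = b"data"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_twice(persistent):
    """Send two GET requests and return the client ports the server saw."""
    seen_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        if persistent:
            # One TCP connection carries both requests.
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for path in ("/a", "/b"):
                conn.request("GET", path)
                conn.getresponse().read()
            conn.close()
        else:
            # A new TCP connection is opened and closed for every request.
            for path in ("/a", "/b"):
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", path, headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return list(seen_ports)


if __name__ == "__main__":
    print("persistent:   ", fetch_twice(True))
    print("nonpersistent:", fetch_twice(False))
```

In the persistent case both requests arrive from the same client port, because they share one TCP connection. In the nonpersistent case each request opens its own connection, so the ports normally differ.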

• A request message

– Request line: request type + URL + HTTP version

• Request types

– HTTP/1.0

• GET, POST, HEAD (HEAD asks the server to leave the requested object out of the response)

– HTTP/1.1

• GET, POST, HEAD

• PUT: uploads the file in the entity body to the path specified in the URL field

• DELETE: deletes the file specified in the URL field

• URL: scheme://host:port/path

– scheme: http, ftp, telnet, file, mailto, news.

[Figure: request message format, top to bottom: request line; general header; request header; entity header; blank line; body]

– General header: cache-control, connection, date, MIME-version, upgrade

– Request header: accept, accept-charset, accept-encoding, accept-language, authorization, from, host, referer (the header name is spelled this way in the specification), user-agent

– Entity header: allow, content-encoding, content-language, content-range, content-type, expires, last-modified, location

– Example

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
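
To make the message layout concrete, the following sketch assembles the example request above as text. The helper name `build_request` is hypothetical. It only shows the layout: a request line, header lines, and the blank line that ends the headers, with each line terminated by CRLF.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Assemble a request message: request line, header lines, blank line."""
    lines = [f"{method} {path} {version}"]
    lines.extend(f"{name}: {value}" for name, value in headers.items())
    # Every line ends with CRLF; the extra CRLF is the blank line that
    # separates the headers from the (here empty) body.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
print(request)
```

A GET request has no body, so the message ends right after the blank line.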

• Response message:

– Status line: HTTP version + status code + status phrase

• Status codes: 100 = continue, 101 = switching protocols, 200 = OK, 201 = created, 202 = accepted, 204 = no content, 300 = multiple choices, 301 = moved permanently, 302 = moved temporarily (named "Found" in HTTP/1.1), 304 = not modified, 400 = bad request, 401 = unauthorized, 403 = forbidden, 404 = not found, 405 = method not allowed, 406 = not acceptable, 500 = internal server error, 501 = not implemented, 503 = service unavailable

– General header

– Response header: accept-ranges, age, public, retry-after, server

[Figure: response message format, top to bottom: status line; general header; response header; entity header; blank line; body]
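
Status codes and their standard phrases can be looked up rather than memorized. The sketch below uses Python's standard `http.HTTPStatus` enumeration. The helper `status_class` is a hypothetical addition that groups a code by its first digit.

```python
from http import HTTPStatus


def status_class(code):
    """Return the general class of a status code from its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]


for code in (200, 301, 304, 404, 500):
    # HTTPStatus(code).phrase is the standard status phrase for the code.
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```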

– Example

HTTP/1.1 200 OK
Date: Thu, 27 Jun 2002 17:22:47 GMT
Server: Apache/1.3.22 (Unix) (Red-Hat/Linux)
Last-modified: Wed, 26 Jun 2002 18:12:29 GMT
Etag: "841fb-4b-3d1a0179"
Accept-ranges: bytes
Content-length: 75
Connection: close
Content-type: text/html

data data data data data ...
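
A response like the example above can be taken apart along the same lines as the message format: status line, header lines, blank line, body. The following is a minimal sketch with a hypothetical helper `parse_response`. It assumes the whole message is already in memory and ignores details such as repeated headers or chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP version, status code, status phrase.
    version, code, phrase = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(code), phrase, headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Server: Apache/1.3.22 (Unix) (Red-Hat/Linux)\r\n"
       b"Connection: close\r\n"
       b"Content-type: text/html\r\n"
       b"\r\n"
       b"data data data")

version, code, phrase, headers, body = parse_response(raw)
print(version, code, phrase)
print(headers["content-type"])
print(body)
```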

• Proxy server: a computer that keeps copies of responses to recent requests. In the presence of a proxy server, the HTTP client sends its request to the proxy server. If the proxy has no stored copy, it sends the request on to the corresponding server. Incoming responses are sent to the proxy server and stored for future requests from other clients.
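
The caching behaviour of a proxy can be sketched in a few lines. The class `CachingProxy` and the stand-in function `fetch_from_origin` below are hypothetical. Real proxies also handle expiry and cache-control rules, which this sketch leaves out.

```python
class CachingProxy:
    """Toy proxy: answers from its cache, contacts the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # function: URL -> response
        self._cache = {}                 # URL -> stored response

    def get(self, url):
        if url not in self._cache:
            # Cache miss: forward the request to the origin server
            # and keep a copy of the response.
            self._cache[url] = self._fetch(url)
        return self._cache[url]


origin_calls = []  # records every request that reaches the origin server


def fetch_from_origin(url):
    """Stand-in for a real origin server (assumption for this sketch)."""
    origin_calls.append(url)
    return f"response for {url}"


proxy = CachingProxy(fetch_from_origin)
first = proxy.get("http://www.someschool.edu/somedir/page.html")   # miss
second = proxy.get("http://www.someschool.edu/somedir/page.html")  # served from cache
print(len(origin_calls))  # the origin server was contacted only once
```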

• Cookies

– The server side is stateless. Cookies are stored on the client side to keep information about the user's access to a page

– Four components

1. cookie header line in the HTTP response message

2. cookie header line in HTTP request message

3. cookie file kept on user’s host and managed by user’s browser