CP476 Internet Computing Lecture 5 : HTTP, WWW and URL
Lecture 5. WWW, HTTP and URL
Objective:
• to review the concepts of WWW
• to understand how HTTP works
• URL
• World Wide Web (the Web): a collection of information documents
– distributed over the Internet and connected by links
– based on TCP/IP, using HTTP
– served by Web server software and viewed with Web browsers
– it has grown to involve many existing applications, such as email and file transfer
• Tim Berners-Lee at CERN proposed the World Wide Web application
• HyperText Transfer Protocol (HTTP) is the core of the Web. It was defined as HTTP/1.0 (RFC 1945), and the version currently in use is HTTP/1.1 (RFC 2616).
• The documents used in the Web can contain links; such documents are called hypertext. The idea of hypertext was also proposed and used by other researchers.
• HTML (Hypertext Markup Language) is a markup language for creating hypertext.
• HTTP is based on TCP/IP and is implemented in two programs: a client program (e.g., a Web browser) running on a client machine and a server program (a Web server) running on a server machine. They talk to each other by exchanging HTTP messages.
• A Web document (or Web page) is an object such as an HTML file, a JPEG image, a GIF image, a Java applet, etc. A Web page is stored on a Web server machine, has a location, and is accessible by its URL.
• Uniform Resource Locator (URL) is the global address of documents and other resources on the World Wide Web.
– The first part of the address indicates what protocol to use.
– The second part specifies the IP address or the domain name of the server host.
– The third part is the path of the object in the server directory.
– Example: http://physcomp1.wlu.ca/~hfan/cp476/cp476.php
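The three parts of a URL can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch using the example URL above; the fallback to port 80 is an assumption that applies only to the http scheme.

```python
from urllib.parse import urlsplit

url = "http://physcomp1.wlu.ca/~hfan/cp476/cp476.php"
parts = urlsplit(url)

print(parts.scheme)    # first part: the protocol to use
print(parts.hostname)  # second part: domain name (or IP address) of the server host
print(parts.path)      # third part: path of the object on the server

# The URL names no port, so parts.port is None; HTTP then defaults to port 80.
port = parts.port or 80
print(port)
```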
• A browser is a user agent for the Web. It implements the client side of HTTP, displays the requested Web page, and provides many navigation and configuration functions.
• A Web server houses Web objects, each addressable by a URL. The Web server implements the server side of HTTP.
• HTTP functions like a combination of FTP and SMTP
– Like FTP, it transfers files from server to client. But it uses only one TCP connection (to the well-known port 80), with no separate control connection.
– Like SMTP, a client sends a request that looks like a mail message. The server responds with a message that also looks like mail sent to the client.
– The server is stateless, i.e., it does not keep records of previous requests.
• HTTP/1.0 is nonpersistent, i.e., the connection is closed after each request is done.
• HTTP/1.1 is persistent, i.e., the connection is kept alive for a short period of time, and another request arriving within that time can use the same connection.
[Figure: a PC running a browser and a server running a server program. The client initiates the TCP connection (TCP Open? / OK), the request and response follow (HTTP GET … / Data), and the connection ends with TCP Close.]
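The exchange in the figure (TCP open, HTTP GET, data, TCP close) can be tried on one machine. The sketch below is only an illustration: HelloHandler, fetch, and the path /index.html are hypothetical names, and the server is Python's built-in http.server bound to the loopback address instead of a real Web server on port 80.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with one fixed page."""

    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch(host, port, path):
    """Open a TCP connection, send one GET, read until the server closes."""
    with socket.create_connection((host, port)) as sock:       # TCP open
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))                  # HTTP GET
        chunks = []
        while True:
            data = sock.recv(4096)                             # data
            if not data:                                       # server closed
                break
            chunks.append(data)
    return b"".join(chunks)                                    # TCP close

# Port 0 lets the operating system pick a free port for the demo server.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

reply = fetch("127.0.0.1", server.server_address[1], "/index.html")
server.shutdown()
server.server_close()

status_line = reply.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

Because the request is sent as HTTP/1.0, the server closes the connection after the response, which is the nonpersistent behaviour described above.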
• A request message
– Request line: request type + URL + HTTP version
• Request types
– HTTP/1.0
• GET, POST, HEAD (HEAD asks the server to leave the requested object out of the response)
– HTTP/1.1
• GET, POST, HEAD
• PUT: uploads the file in the entity body to the path specified in the URL field
• DELETE: deletes the file specified in the URL field
• URL: scheme://host:port/path
– scheme: http, ftp, telnet, file, mailto, news
Request message layout:
Request line
General header
Request header
Entity header
Blank line
Body
– General header: cache-control, connection, date, MIME-version, upgrade
– Request header: accept, accept-charset, accept-encoding, accept-language, authorization, from, host, referer, user-agent
– Entity header: allow, content-encoding, content-language, content-range, content-type, expires, last-modified, location
– Example
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
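To see how the request line and header lines fit together, the example request can be split back into its pieces. This is a minimal sketch; parse_request is a hypothetical helper written for this illustration, not part of any library, and it assumes lines end with CRLF and the header section ends with a blank line.

```python
raw_request = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "User-agent: Mozilla/4.0\r\n"
    "Connection: close\r\n"
    "Accept-language: fr\r\n"
    "\r\n"
)

def parse_request(text):
    """Split a request message into request line fields, headers, and body."""
    # The blank line separates the header section from the body.
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: request type + URL + HTTP version.
    method, url, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body

method, url, version, headers, body = parse_request(raw_request)
print(method, url, version)
print(headers["host"], headers["accept-language"])
```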
• Response message:
– Status line: HTTP version + status code + status phrase
• Status codes: 100 = continue, 101 = switching protocols, 200 = OK, 201 = created, 202 = accepted, 204 = no content, 300 = multiple choices, 301 = moved permanently, 302 = moved temporarily, 304 = not modified, 400 = bad request, 401 = unauthorized, 403 = forbidden, 404 = not found, 405 = method not allowed, 406 = not acceptable, 500 = internal server error, 501 = not implemented, 503 = service unavailable
– General header: same fields as in a request message
– Response header: accept-ranges, age, public, retry-after, server
Response message layout:
Status line
General header
Response header
Entity header
Blank line
Body
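Python's standard library carries the status codes and their phrases in http.HTTPStatus, which is handy for building a status line. In the sketch below, make_status_line and status_class are hypothetical helper names; the grouping by first digit (1xx to 5xx) is the usual HTTP convention and is not spelled out on the slide.

```python
from http import HTTPStatus

def make_status_line(code, version="HTTP/1.1"):
    """Build a status line: HTTP version + status code + status phrase."""
    status = HTTPStatus(code)
    return f"{version} {status.value} {status.phrase}"

def status_class(code):
    """Name the class of a status code from its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]

for code in (200, 301, 404, 503):
    print(make_status_line(code), "->", status_class(code))
```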
– Example
HTTP/1.1 200 OK
Date: Thu, 27 Jun 2002 17:22:47 GMT
Server: Apache/1.3.22 (Unix) (Red-Hat/Linux)
Last-modified: Wed, 26 Jun 2002 18:12:29 GMT
Etag: "841fb-4b-3d1a0179"
Accept-ranges: bytes
Content-length: 75
Connection: close
Content-type: text/html
data data data data data ...
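Since HTTP header lines use the same "Name: value" format as mail, Python's email parser can read them once the status line is taken off. The sketch below uses a subset of the headers from the example response; the variable names are assumptions made for this illustration.

```python
from email import message_from_string

raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.3.22 (Unix) (Red-Hat/Linux)\r\n"
    "Accept-ranges: bytes\r\n"
    "Content-length: 75\r\n"
    "Connection: close\r\n"
    "Content-type: text/html\r\n"
    "\r\n"
)

# The first line is the status line; everything after it is mail-like.
status_line, _, rest = raw_response.partition("\r\n")
version, code, phrase = status_line.split(" ", 2)

# The mail parser reads "Name: value" lines up to the blank line.
headers = message_from_string(rest)

print(version, code, phrase)
print(headers["Content-type"])    # header lookup is case-insensitive
print(headers["content-length"])
```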
• Proxy server: a computer that keeps copies of responses to recent requests. In the presence of a proxy server, the HTTP client sends its request to the proxy server. If the proxy has no stored copy, it sends the request to the corresponding server. Incoming responses are sent to the proxy server and stored for future requests from other clients.
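The caching behaviour of a proxy can be sketched without any networking. CachingProxy and fake_origin below are hypothetical names, and the origin server is simulated by a plain function; a real proxy would also have to handle expiry and cache-control headers, which this sketch ignores.

```python
class CachingProxy:
    """Toy proxy: keeps a copy of each response, keyed by URL."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # function that contacts the real server
        self._cache = {}                  # URL -> stored response

    def get(self, url):
        if url not in self._cache:
            # No stored copy: forward the request and keep the response.
            self._cache[url] = self._fetch(url)
        return self._cache[url]

origin_hits = []

def fake_origin(url):
    """Stand-in for the origin server; records every request it receives."""
    origin_hits.append(url)
    return f"response for {url}"

proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.someschool.edu/somedir/page.html")   # goes to the origin
second = proxy.get("http://www.someschool.edu/somedir/page.html")  # served from the copy

print(first == second, len(origin_hits))
```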
• Cookies
– The server side is stateless. Cookies are used on the client side to store access information for a page.
– Four components
1. cookie header line in the HTTP response message
2. cookie header line in HTTP request message
3. cookie file kept on user’s host and managed by user’s browser
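The first three components can be traced with Python's standard http.cookies module. This is a sketch under assumptions: the cookie name session_id and value 1678 are made up for illustration, and the browser's cookie file is simulated by a dictionary.

```python
from http.cookies import SimpleCookie

# 1. Server side: cookie header line in the HTTP response message.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "1678"          # hypothetical name and value
set_cookie_line = response_cookie.output()
print(set_cookie_line)

# 3. Browser side: the cookie is kept on the user's host
#    (a dictionary stands in for the cookie file here).
cookie_file = {name: morsel.value for name, morsel in response_cookie.items()}

# 2. Cookie header line in the next HTTP request message.
cookie_line = "Cookie: " + "; ".join(f"{k}={v}" for k, v in cookie_file.items())
print(cookie_line)

# Back on the server: parse the Cookie header to recognise the client.
received = SimpleCookie()
received.load(cookie_line.split(": ", 1)[1])
print(received["session_id"].value)
```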