The Client-Server Architecture of the WWW
Dr. Kuanchin Chen
Copyright (c) 2010, Dr. Kuanchin Chen
The Hypertext Transfer Protocol (HTTP)
HTTP governs the messages exchanged between a Web server and a Web client.
Who put together the HTTP standard? What is the current version of it? Links:
Main site for HTTP: http://www.w3.org/Protocols/
The HTTP Specification: http://www.w3.org/Protocols/rfc2616/rfc2616.html
WWW Client-Side Technologies
(X)HTML
CSS
JavaScript (VBScript, PerlScript, etc.)
XML/XSLT
Server-Side Technologies (I)
Active Server Pages (ASP)
  Languages: VBScript, JavaScript, …
  Platforms: mostly Windows
    ASP.NET is supported on more operating systems than ASP, but …
  Portability: Medium-to-low
  Execution: Interpreted (compiled in .NET)
  Technology: Proprietary
Java Server Pages (JSP)
  Language: Java
  Platforms: Windows, UNIX, Mac OS, …
  Portability: High
  Execution: Compiled
  Technology: Proprietary
Server-Side Technologies (II)
PHP: Hypertext Preprocessor (PHP)
  Language: PHP
  Platforms: Windows, UNIX, Mac OS, …
  Portability: High
  Execution: Interpreted (scripts are compiled to bytecode at run time)
  Technology: Open source
Common Gateway Interface (CGI)
  Language: any
  Platforms: Windows, UNIX, Mac OS, …
  Portability: High
  Execution: Interpreted and/or compiled
  Technology: Proprietary or open source
Others
What is a Port?
A port is a logical connection point through which client and server programs “talk” to each other.
Each server program or service is assigned a numeric port number.
This is like a main telephone number (the server computer’s physical network connection) with extensions (individual port numbers for services).
Well-known port numbers: FTP: port 21; HTTP: port 80; SMTP: port 25; IMAP: port 143
IANA (Internet Assigned Numbers Authority) port assignment
http://www.ibm.com vs. http://www.ibm.com:80 (equivalent, because port 80 is the default for HTTP)
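To illustrate why the two IBM URLs reach the same service, here is a minimal Python sketch. The helper name effective_port and the small DEFAULT_PORTS table are assumptions made for this illustration; they are not part of any standard API.

```python
from urllib.parse import urlsplit

# Default ports for two schemes, taken from the well-known port list above.
# (Illustrative table; a real client knows many more schemes.)
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name a port explicitly.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://www.ibm.com"))     # no port given, falls back to 80
print(effective_port("http://www.ibm.com:80"))  # port given explicitly
```

Both calls yield the same port number, which is why a browser treats the two addresses as the same site.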
WWW Message Exchange
[Diagram: a client (e.g., a browser) sends a REQUEST message for http://www.yahoo.com to a server (e.g., a Web server) over HTTP, and the server returns a RESPONSE message.]
Syntax of the REQUEST Message
Request = Request-Line                                                  (1)
          *(( general-header | request-header | entity-header ) CRLF)   (2)
          CRLF                                                          (3)
          [ message-body ]                                              (4)
Symbols used in the syntax:
1. | denotes OR
2. [ ] denotes OPTIONAL
3. * denotes ZERO OR MORE
4. CRLF: Carriage Return (ASCII 13) and Line Feed (ASCII 10)
Syntax of the Request Line
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Note:
1. Method: POST | GET | others
2. HTTP version: E.g., HTTP/1.1
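The Request-Line grammar can be sketched as a small Python parser. The function name parse_request_line is hypothetical, and the sketch checks only the overall shape (three fields separated by single spaces, ending in CRLF), not whether the method or version is valid.

```python
def parse_request_line(line):
    """Split a Request-Line into (method, request_uri, http_version)."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    # Drop the trailing CRLF, then split on the single spaces (SP).
    parts = line[:-2].split(" ")
    if len(parts) != 3:
        raise ValueError("expected Method SP Request-URI SP HTTP-Version")
    method, request_uri, http_version = parts
    return method, request_uri, http_version


print(parse_request_line("GET /library/ HTTP/1.1\r\n"))
```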
The REQUEST Message – A Graphical View
The REQUEST message:
  The request line:           GET / HTTP/1.1 CRLF
  The header section:         Host: www.yahoo.com CRLF
                              From: [email protected] CRLF
                              Accept: text/plain, text/html CRLF
  An empty line:              CRLF
  The optional message body:  (absent in this example)
Want More Information about the REQUEST message?
See the REQUEST section of the HTTP specification.
The Request Message - Examples
Example 1: http://www.yahoo.com
  GET / HTTP/1.1 CRLF
  Host: www.yahoo.com CRLF
  CRLF
Example 2: http://www.wmich.edu/library
  GET /library/ HTTP/1.1 CRLF
  Host: www.wmich.edu CRLF
  CRLF
Q: What are the equivalent URLs for the above two examples?
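The relationship between a URL and its request message can be sketched in Python as follows. The function name build_get_request is an assumption for this illustration, and the sketch sends only the Host header, as in the two examples.

```python
from urllib.parse import urlsplit

CRLF = "\r\n"


def build_get_request(url):
    """Build a minimal HTTP/1.1 GET request message for the given URL."""
    parts = urlsplit(url)
    # An empty path in the URL is requested as "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        "GET " + path + " HTTP/1.1" + CRLF   # the request line
        + "Host: " + parts.netloc + CRLF     # the header section
        + CRLF                               # the empty line; no message body
    )


print(repr(build_get_request("http://www.yahoo.com")))
```

The sketch sends the path exactly as typed, so http://www.wmich.edu/library would produce /library rather than the /library/ shown in Example 2.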
Syntax of the RESPONSE Message
Response = Status-Line                                                   (1)
           *(( general-header | response-header | entity-header ) CRLF)  (2)
           CRLF                                                          (3)
           [ message-body ]                                              (4)
Symbols used in the syntax:
1. | denotes OR
2. [ ] denotes OPTIONAL
3. * denotes ZERO OR MORE
4. CRLF: Carriage Return (ASCII 13) and Line Feed (ASCII 10)
Syntax of the Status Line
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
Note:
1. HTTP version: E.g., HTTP/1.1
2. Status-Code: See status code def.
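The Status-Line grammar can be sketched the same way in Python. The function name parse_status_line is hypothetical. Because a Reason-Phrase may itself contain spaces, the sketch splits on only the first two spaces.

```python
def parse_status_line(line):
    """Split a Status-Line into (http_version, status_code, reason_phrase)."""
    if not line.endswith("\r\n"):
        raise ValueError("Status-Line must end with CRLF")
    # Split at most twice so that a phrase like "Not Found" stays in one piece.
    parts = line[:-2].split(" ", 2)
    if len(parts) != 3:
        raise ValueError("expected HTTP-Version SP Status-Code SP Reason-Phrase")
    http_version, status_code, reason_phrase = parts
    return http_version, int(status_code), reason_phrase


print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```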
Common Status Codes
200 OK: the request is fulfilled and the requested document is attached in the MESSAGE BODY section of the response message.
404 Not Found: the requested document is not found.
500 Internal Server Error: the server cannot fulfill the request (mostly because of problems in server-side programs).
For more status code definitions, see the HTTP specification.
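As a quick way to look up reason phrases, Python's standard library ships an HTTPStatus enumeration. The sketch below pairs it with the status classes that the HTTP specification assigns by first digit; the describe helper and the STATUS_CLASSES table are names invented for this illustration.

```python
from http import HTTPStatus

# Status classes by the first digit of the code, as grouped in the HTTP specification.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return 'code phrase (class)' for a registered status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return "{} {} ({})".format(code, status.phrase, STATUS_CLASSES[code // 100])


for code in (200, 404, 500):
    print(describe(code))
```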
The RESPONSE Message – A Graphical View
The RESPONSE message:
  The status line:            HTTP/1.1 200 OK
  The header section:         Date: Wed, 25 Dec 2002 06:45:47 GMT
                              Content-Type: text/html
                              Content-Length: 7386
  An empty line:              CRLF
  The optional message body:  <html><head> <title>My Test Page</title></head>…
The Response Message – An Example
HTTP/1.1 200 OK CRLF
Date: Mon, 08 Sep 2003 04:04:24 GMT CRLF
Content-Type: text/html CRLF
CRLF
<html><head> <title>Yahoo!</title>…
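To show how a client might take such a response apart, here is a minimal Python sketch. The function name parse_response is an assumption, and the sketch ignores details a real client must handle (repeated headers, folded header lines, binary bodies).

```python
def parse_response(raw):
    """Split a response message into (status_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the head from the message body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only; header values such as dates contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 08 Sep 2003 04:04:24 GMT\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><head> <title>Yahoo!</title>"
)
status_line, headers, body = parse_response(raw)
print(status_line)
print(headers["content-type"])
```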
Server vs. Client?
Web Servers
  Apache
  Internet Information Services (IIS)
  …
Web Clients
  Browsers
  Programs simulating the functionality of browsers
Q: How do you know what server a site uses and how many days a site has been up?
Introduction to Web Servers
The Web Server market
  Market share
  User community
Installation & Configuration
  See a document on WebCT
Directories/Folders
  Server root
  Document root
Server Root vs. Document Root
[Figure: Internet Information Services (IIS) and Apache (actually WAMP, a Windows bundle that includes Apache), with the Document Root and Server Root labeled.]
Default File(s)
Default files
  The file to load when no file name is provided in a URL
  Apache and others: index.htm …
  IIS: default.htm, index.htm …
Locations of files and directories
  http://www.wmich.edu
  http://www.wmich.edu/library
  http://localhost
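How a server might combine the document root with default files can be sketched in Python. Everything named here is an assumption for illustration: the function candidate_files, the document root /var/www, and the default-file list. Real servers take these from their configuration and also guard against paths that escape the document root, which this sketch does not.

```python
import posixpath


def candidate_files(document_root, url_path, default_files):
    """List the files a server might try for url_path, in order of preference."""
    # Map the URL path onto the document root.
    target = posixpath.join(document_root, url_path.lstrip("/"))
    # No file name in the URL: fall back to the configured default files.
    if url_path == "" or url_path.endswith("/"):
        return [posixpath.join(target, name) for name in default_files]
    return [target]


defaults = ["index.htm", "default.htm"]  # hypothetical configuration
print(candidate_files("/var/www", "/library/", defaults))
print(candidate_files("/var/www", "/page.htm", defaults))
```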
Checking What Server-Side Technology is Used on a Web Site
By file extension
By checking modules installed on the Web server
By asking around
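The file-extension check can be sketched in Python as below. The EXTENSION_HINTS table and the guess_technology function are assumptions for this illustration. The mapping reflects common conventions only, and a site can hide or remap its extensions, so the result is a hint rather than proof.

```python
import posixpath
from urllib.parse import urlsplit

# Conventional file extensions for the server-side technologies covered earlier.
EXTENSION_HINTS = {
    ".asp": "ASP",
    ".aspx": "ASP.NET",
    ".jsp": "JSP",
    ".php": "PHP",
    ".cgi": "CGI",
}


def guess_technology(url):
    """Return a guess based on the URL's file extension, or None if there is no hint."""
    path = urlsplit(url).path  # drops the query string, if any
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_HINTS.get(extension)


print(guess_technology("http://www.mysite.com/search.aspx?term=%22SQL+Server%22"))
print(guess_technology("http://www.wmich.edu/library"))
```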
XHTML Form Methods
GET
  Used to retrieve an online document
  Also used to send HTML form data to a web server
  Data are sent along with the URL (therefore, they are visible in the address box)
  Limited URL length, therefore limited data volume
  Easier to bookmark dynamic pages
POST
  Data are sent in the message body section (not visible in the address box)
  Larger volume of data is possible
  More secure (compared with the GET method): the data stay out of the URL, although they are not encrypted unless HTTPS is used
Other form methods
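The difference between the two methods shows up in where the form data land in the request message. The Python sketch below builds both variants for the same form. The function name form_request, the host, the path, and the field values are assumptions for this illustration.

```python
from urllib.parse import urlencode

CRLF = "\r\n"


def form_request(method, host, path, fields):
    """Build a request message that submits form fields with GET or POST."""
    data = urlencode(fields)  # e.g. "term=SQL+Server"
    if method == "GET":
        # GET: the data travel in the request line, appended to the URL.
        lines = ["GET " + path + "?" + data + " HTTP/1.1", "Host: " + host, "", ""]
        return CRLF.join(lines)
    if method == "POST":
        # POST: the data travel in the message body, after the empty line.
        lines = [
            "POST " + path + " HTTP/1.1",
            "Host: " + host,
            "Content-Type: application/x-www-form-urlencoded",
            "Content-Length: " + str(len(data)),
            "",
            data,
        ]
        return CRLF.join(lines)
    raise ValueError("this sketch supports only GET and POST")


print(form_request("GET", "www.mysite.com", "/search.aspx", {"term": "SQL Server"}))
print(form_request("POST", "www.mysite.com", "/search.aspx", {"term": "SQL Server"}))
```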
URL Encoding (I)
An encoded URL: http://www.mysite.com/search.aspx?term=%22SQL+Server%22
Certain characters of form data are encoded
"...Only alphanumerics [0-9a-zA-Z], the special characters $-_.+!*'(), and reserved characters used for their reserved purposes may be used unencoded within a URL." (RFC 1738)
Reserved characters that have special meanings in a URL: Dollar ("$"), Ampersand ("&"), Plus ("+"), Comma (","), Forward slash/Virgule ("/"), Colon (":"), Semi-colon (";"), Equals ("="), Question mark ("?"), 'At' symbol ("@")
URL Encoding (II)
Unsafe characters that need to be encoded: "{", "}", "|", "\", "^", "~", "[", "]", and "`".
How encoding is done.
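As a rough illustration of how the encoding is done for form data: each byte that is not allowed as-is is replaced by a percent sign followed by its two-digit hexadecimal value, and a space becomes a plus sign. The Python sketch below is a simplified encoder under those assumptions. The names ALWAYS_SAFE and encode_form_value are hypothetical, and the sketch deliberately encodes more characters than RFC 1738 strictly requires.

```python
from urllib.parse import quote_plus

# Characters this sketch leaves unencoded (a conservative subset of what is allowed).
ALWAYS_SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_."
)


def encode_form_value(text):
    """Percent-encode a form value, writing spaces as '+'."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in ALWAYS_SAFE:
            out.append(ch)
        elif ch == " ":
            out.append("+")
        else:
            # '%' followed by the byte value as two upper-case hex digits.
            out.append("%{:02X}".format(byte))
    return "".join(out)


value = '"SQL Server"'
print(encode_form_value(value))
# The standard library's quote_plus gives the same result for this value.
print(encode_form_value(value) == quote_plus(value))
```

For the search term in the encoded URL shown earlier, the double quote (ASCII 34, hexadecimal 22) becomes %22 and the space becomes +.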