Copyright (c) 2010, Dr. Kuanchin Chen 1

The Client-Server Architecture of the WWW

Dr. Kuanchin Chen


The Hypertext Transfer Protocol (HTTP)

HTTP governs the messages exchanged between a Web server and a Web client.

Who put together the HTTP standard? What is the current version of it?

Links:

Main site for HTTP: http://www.w3.org/Protocols/

The HTTP Specification: http://www.w3.org/Protocols/rfc2616/rfc2616.html


WWW Client-Side Technologies

(X)HTML
CSS
JavaScript (VBScript, PerlScript, etc.)
XML/XSLT


Server-Side Technologies (I)

Active Server Pages (ASP)
    Languages: VBScript, JavaScript, …
    Platforms: mostly Windows
        ASP.NET is supported on more operating systems than ASP, but …
    Portability: Medium-to-low
    Execution: Interpreted (compiled in .NET)
    Technology: Proprietary

JavaServer Pages (JSP)
    Language: Java
    Platforms: Windows, UNIX, Mac OS, …
    Portability: High
    Execution: Compiled
    Technology: Proprietary


Server-Side Technologies (II)

PHP: Hypertext Preprocessor (PHP)
    Language: PHP
    Platforms: Windows, UNIX, Mac OS, …
    Portability: High
    Execution: Interpreted
    Technology: Open source

Common Gateway Interface (CGI)
    Language: any
    Platforms: Windows, UNIX, Mac OS, …
    Portability: High
    Execution: Interpreted and/or compiled
    Technology: Proprietary or open source

Others


What is a Port?

A port is a logical connection point through which client and server programs “talk” to each other.

Each server program or service is assigned a numeric port number.
    This is like a main telephone number (the server computer’s physical network connection) with extensions (the individual port numbers for services).

Well-known port numbers: FTP: port 21; HTTP: port 80; SMTP: port 25; IMAP: port 143

IANA (Internet Assigned Numbers Authority) port assignment

http://www.ibm.com vs. http://www.ibm.com:80
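To illustrate the last comparison, here is a minimal Python sketch showing that a URL without an explicit port falls back to the well-known port of its scheme. The `DEFAULT_PORTS` table and the `effective_port` helper are names invented for this example; the table holds only the ports listed above.

```python
from urllib.parse import urlsplit

# Well-known ports from the list above (illustrative subset, keyed by scheme).
DEFAULT_PORTS = {"ftp": 21, "smtp": 25, "http": 80, "imap": 143}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:        # an explicit ":port" was given
        return parts.port
    return DEFAULT_PORTS[parts.scheme]  # otherwise use the scheme's default


print(effective_port("http://www.ibm.com"))     # 80
print(effective_port("http://www.ibm.com:80"))  # 80
```

Both URLs therefore reach the same service on the same machine.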


WWW Message Exchange

[Figure: a client (e.g., a browser) sends a REQUEST message for http://www.yahoo.com to a server (e.g., a Web server), and the server returns a RESPONSE message; both messages travel over HTTP.]


Syntax of the REQUEST Message

Request = Request-Line                                                   (1)
          *(( general-header | request-header | entity-header ) CRLF)   (2)
          CRLF                                                           (3)
          [ message-body ]                                               (4)

Symbols used in the syntax:
1. | denotes OR
2. [ ] denotes OPTIONAL
3. * denotes ZERO OR MORE
4. CRLF: Carriage Return (ASCII 13) and Line Feed (ASCII 10)


Syntax of the Request Line

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

Note:
1. Method: POST | GET | others
2. HTTP version: e.g., HTTP/1.1
3. SP: a single space character
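The request grammar can be turned into a few lines of Python. This is only a sketch: `build_request` is a hypothetical helper, not part of any library, and it assembles the four numbered parts of the grammar in order.

```python
CRLF = "\r\n"  # carriage return (ASCII 13) followed by line feed (ASCII 10)


def build_request(method, uri, headers, body=""):
    """Assemble a request message from its four parts, in grammar order."""
    request_line = f"{method} {uri} HTTP/1.1{CRLF}"                  # part (1)
    header_lines = "".join(
        f"{name}: {value}{CRLF}" for name, value in headers.items()  # part (2)
    )
    return request_line + header_lines + CRLF + body                # parts (3), (4)


message = build_request("GET", "/", {"Host": "www.yahoo.com"})
print(repr(message))  # 'GET / HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n'
```

Printing with `repr` makes the otherwise invisible CRLF pairs show up as `\r\n`.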


The REQUEST Message – A Graphical View

The REQUEST Message:

The request line:            GET / HTTP/1.1 CRLF
The header section:          Host: www.yahoo.com CRLF
                             From: [email protected] CRLF
                             Accept: text/plain, text/html CRLF
An empty line:               CRLF
The optional message body:   (none in this example)


Want More Information about the REQUEST message?

See the REQUEST section of the HTTP specification.


The Request Message - Examples

Example 1: http://www.yahoo.com
    GET / HTTP/1.1 CRLF
    Host: www.yahoo.com CRLF
    CRLF

Example 2: http://www.wmich.edu/library
    GET /library/ HTTP/1.1 CRLF
    Host: www.wmich.edu CRLF
    CRLF

Q: What are the equivalent URLs for the above two examples?
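The mapping between a URL and its request message can be sketched in Python as follows. `request_for_url` is a made-up helper name; the sketch assumes a plain GET with only the Host header and does not add a trailing slash to paths such as /library.

```python
from urllib.parse import urlsplit

CRLF = "\r\n"


def request_for_url(url: str) -> str:
    """Build a minimal GET request message for a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"       # an empty path is requested as "/"
    if parts.query:
        path += "?" + parts.query  # the query string stays in the Request-URI
    return f"GET {path} HTTP/1.1{CRLF}Host: {parts.netloc}{CRLF}{CRLF}"


print(repr(request_for_url("http://www.yahoo.com")))
print(repr(request_for_url("http://www.wmich.edu/library/")))
```

The host name moves into the Host header, and everything after it becomes the Request-URI.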


Syntax of the RESPONSE Message

Response = Status-Line                                                    (1)
           *(( general-header | response-header | entity-header ) CRLF)  (2)
           CRLF                                                           (3)
           [ message-body ]                                               (4)

Symbols used in the syntax:
1. | denotes OR
2. [ ] denotes OPTIONAL
3. * denotes ZERO OR MORE
4. CRLF: Carriage Return (ASCII 13) and Line Feed (ASCII 10)


Syntax of the Status Line

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

Note:
1. HTTP version: e.g., HTTP/1.1
2. Status-Code: see the status code definitions.
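As a small illustration of the status-line syntax, the following Python sketch splits a status line into its three fields. `parse_status_line` is a hypothetical helper and does no validation beyond converting the code to an integer.

```python
def parse_status_line(line: str):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF' into parts."""
    line = line.rstrip("\r\n")  # drop the trailing CRLF
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
# ('HTTP/1.1', 404, 'Not Found')
```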


Common Status Codes

200 OK: the request is fulfilled and the requested document is attached in the MESSAGE BODY section of the response message.

404 Not Found: the requested document is not found.

500 Internal Server Error: the server cannot fulfill the request (mostly because of problems in server-side programs).

For more status code definitions, see the HTTP specification.
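Standard reason phrases can also be looked up programmatically. The sketch below uses `HTTPStatus` from Python's standard `http` module; `reason_phrase` is a helper name chosen for this example.

```python
from http import HTTPStatus


def reason_phrase(code: int) -> str:
    """Return the standard reason phrase for a status code."""
    return HTTPStatus(code).phrase


for code in (200, 404, 500):
    print(code, reason_phrase(code))
# 200 OK
# 404 Not Found
# 500 Internal Server Error
```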


The RESPONSE Message – A Graphical View

The RESPONSE Message:

The status line:             HTTP/1.1 200 OK CRLF
The header section:          Date: Wed, 25 Dec 2002 06:45:47 GMT CRLF
                             Content-Type: text/html CRLF
                             Content-Length: 7386 CRLF
An empty line:               CRLF
The optional message body:   <html><head> <title>My Test Page</title></head>…


The Response Message – An Example

HTTP/1.1 200 OK CRLF
Date: Mon, 08 Sep 2003 04:04:24 GMT CRLF
Content-Type: text/html CRLF
CRLF
<html><head> <title>Yahoo!</title>…
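A response like the one above can be taken apart with a few string operations. This is a simplified Python sketch: `parse_response` is a hypothetical helper that ignores details such as repeated or folded headers.

```python
def parse_response(raw: str):
    """Split a response into (status line, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # the empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")   # split at the first colon only
        headers[name.strip()] = value.strip()
    return status_line, headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 08 Sep 2003 04:04:24 GMT\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><head> <title>Yahoo!</title>"
)
status_line, headers, body = parse_response(raw)
print(status_line)              # HTTP/1.1 200 OK
print(headers["Content-Type"])  # text/html
```

Splitting each header at the first colon only keeps the time in the Date header intact.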


Exercise

“Browser-less” request and response messages
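One possible way to try the exercise is with a plain TCP socket in Python. This is a sketch under assumptions: `build_get` and `fetch_raw` are names made up here, the server is assumed to speak unencrypted HTTP on port 80, and the `Connection: close` header is added so that the server ends the connection after one response.

```python
import socket

CRLF = "\r\n"


def build_get(host: str, path: str = "/") -> str:
    """Return the text of a minimal GET request for the given host and path."""
    return (
        f"GET {path} HTTP/1.1{CRLF}"
        f"Host: {host}{CRLF}"
        f"Connection: close{CRLF}"  # ask the server to close when it is done
        f"{CRLF}"                   # the empty line that ends the headers
    )


def fetch_raw(host: str, path: str = "/", port: int = 80) -> str:
    """Send one GET request over TCP and return the raw response text."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get(host, path).encode("ascii"))
        while True:
            data = sock.recv(4096)
            if not data:            # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


# Example use (needs network access; the host is a placeholder):
# reply = fetch_raw("www.example.com")
# print(reply.split(CRLF)[0])      # the status line
```

The same exchange can be typed by hand in a terminal client such as telnet; the script only automates it.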


Server vs. Client?

Web Servers
    Apache
    Internet Information Services (IIS)
    …

Web Clients
    Browsers
    Programs simulating the functionality of browsers

Q: How do you know what server a site uses and how many days a site has been up?


Introduction to Web Servers

The Web Server market
    Market share
    User community

Installation & Configuration
    See a document on WebCT

Directories/Folders
    Server root
    Document root

Server Root vs. Document Root


[Figure: screenshots locating the Document Root and the Server Root in Internet Information Services (IIS) and in Apache (actually WAMP, a Windows package that bundles Apache).]


Default File(s)

Default files
    The file to load when no file name is provided in a URL
    Apache and others: index.htm …
    IIS: default.htm, index.htm …

Locations of files and directories
    http://www.wmich.edu
    http://www.wmich.edu/library
    http://localhost
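The lookup of a default file can be sketched in Python as follows. Everything configurable here is an assumption for illustration: the `DOCUMENT_ROOT` path, the `DEFAULT_FILES` order, and the `candidate_files` helper are not taken from any particular server, and real servers also handle redirects and security checks that are left out.

```python
from pathlib import PurePosixPath

# Hypothetical configuration: document root and default-file search order.
DOCUMENT_ROOT = PurePosixPath("/var/www/html")
DEFAULT_FILES = ["index.htm", "index.html"]


def candidate_files(url_path: str):
    """List the files a server might try for the path part of a URL."""
    relative = url_path.lstrip("/")
    target = DOCUMENT_ROOT / relative
    if relative == "" or url_path.endswith("/"):
        # No file name in the URL: try each default file in order.
        return [str(target / name) for name in DEFAULT_FILES]
    return [str(target)]


print(candidate_files("/"))
print(candidate_files("/library/"))
print(candidate_files("/library/hours.html"))
```

The URL path is always resolved relative to the document root, never the server root.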


Checking What Server-Side Technology is Used on a Web Site

By file extension
By checking modules installed on the Web server
By asking around
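The first approach, checking the file extension, could be automated roughly like this in Python. The `EXTENSION_HINTS` table and the `guess_technology` helper are illustrative assumptions, and the result is only a guess, since a site can hide or remap its extensions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Illustrative mapping only; extensions are a hint, not proof.
EXTENSION_HINTS = {
    ".asp": "ASP",
    ".aspx": "ASP.NET",
    ".jsp": "JSP",
    ".php": "PHP",
    ".cgi": "CGI",
}


def guess_technology(url: str) -> str:
    """Guess the server-side technology from the URL's file extension."""
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return EXTENSION_HINTS.get(suffix, "unknown")


print(guess_technology("http://www.mysite.com/search.aspx?term=x"))  # ASP.NET
print(guess_technology("http://www.wmich.edu/library"))              # unknown
```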


XHTML Form Methods

GET
    Used to retrieve an online document
    Also used to send HTML form data to a Web server
    Data are sent along with the URL (therefore, they are visible in the address box)
    Limited URL length, therefore limited data volume
    Easier to bookmark dynamic pages

POST
    Data are sent in the message body section (not visible in the address box)
    Larger volume of data is possible
    More secure than GET in that the data stay out of the URL, bookmarks, and browser history; the data are still unencrypted unless HTTPS is used

Other form methods
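The difference between the two methods shows up in where the form data land inside the request message. The Python sketch below builds both variants for the same form; `form_request` is a hypothetical helper, and the host, path, and field names are placeholders.

```python
from urllib.parse import urlencode

CRLF = "\r\n"


def form_request(method: str, host: str, path: str, fields: dict) -> str:
    """Build the request a browser might send when a form is submitted."""
    data = urlencode(fields)
    if method == "GET":
        # GET: the form data travel in the URL, after "?".
        return f"GET {path}?{data} HTTP/1.1{CRLF}Host: {host}{CRLF}{CRLF}"
    # POST: the form data travel in the message body.
    return (
        f"POST {path} HTTP/1.1{CRLF}"
        f"Host: {host}{CRLF}"
        f"Content-Type: application/x-www-form-urlencoded{CRLF}"
        f"Content-Length: {len(data)}{CRLF}"
        f"{CRLF}"
        f"{data}"
    )


fields = {"name": "John Doe", "phone": "237-8095"}
print(form_request("GET", "www.mysite.com", "/myasp.aspx", fields))
print(form_request("POST", "www.mysite.com", "/myasp.aspx", fields))
```

In the POST variant the request line carries only the path, which is why nothing shows up in the address box.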


URL Encoding (I)

An encoded URL: http://www.mysite.com/search.aspx?term=%22SQL+Server%22

Certain characters of form data are encoded
    "...Only alphanumerics [0-9a-zA-Z], the special characters $-_.+!*'(), and reserved characters used for their reserved purposes may be used unencoded within a URL." (RFC 1738)

Reserved characters that have special meanings in a URL:
    Dollar ("$")
    Ampersand ("&")
    Plus ("+")
    Comma (",")
    Forward slash/Virgule ("/")
    Colon (":")
    Semi-colon (";")
    Equals ("=")
    Question mark ("?")
    'At' symbol ("@")


URL Encoding (II)

Unsafe characters that need to be encoded: "{", "}", "|", "\", "^", "~", "[", "]", and "`".

How encoding is done.
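As one concrete illustration, percent-encoding replaces each character that must be encoded with "%" followed by its two-digit hexadecimal code, and form-style encoding also turns a space into "+". The Python sketch below uses the standard `urllib.parse` functions; the variable names are chosen for this example only.

```python
from urllib.parse import quote, quote_plus, urlencode

# Each encoded character becomes "%" plus its two-digit hex code.
encoded_unsafe = quote('{}|\\^[]`', safe="")
print(encoded_unsafe)  # %7B%7D%7C%5C%5E%5B%5D%60

# Form-style encoding additionally turns a space into "+".
encoded_term = quote_plus('"SQL Server"')
print(encoded_term)    # %22SQL+Server%22

# urlencode applies the same encoding to every name=value pair.
query = urlencode({"term": '"SQL Server"'})
print(query)           # term=%22SQL+Server%22
```

The last result matches the query string of the encoded search URL shown earlier.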


URL Encoding (III)

http://www.mysite.com/myasp.aspx?name=John+Doe&phone=237-8095

http://www.google.com/search?q=asp.net&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official
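Going the other way, the two URLs above can be decoded back into name/value pairs. This Python sketch uses the standard `urllib.parse` module; `form_fields` is a helper name invented here.

```python
from urllib.parse import parse_qsl, urlsplit


def form_fields(url: str):
    """Return the decoded (name, value) pairs from a URL's query string."""
    return parse_qsl(urlsplit(url).query)


url1 = "http://www.mysite.com/myasp.aspx?name=John+Doe&phone=237-8095"
url2 = (
    "http://www.google.com/search?q=asp.net&start=0&ie=utf-8&oe=utf-8"
    "&client=firefox-a&rls=org.mozilla:en-US:official"
)

print(form_fields(url1))  # [('name', 'John Doe'), ('phone', '237-8095')]
for name, value in form_fields(url2):
    print(name, "=", value)
```

The "+" in the first URL decodes back to a space, and "&" separates one pair from the next.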