Jan.2001 C.Watters 1

World Wide Web

Basics

Jan.2001 C.Watters 2

What is an internet anyway?

• 2 or more networks that can communicate

Jan.2001 C.Watters 3

Historical View: Internet

• 1969 - Telnet

• 1970 - 4 computers – Stanford, UCLA, UC Santa Barbara, U Utah

• 1971 - FTP

• 1983 - 562 computers on the internet

• 1993 - 1.2 million computers on the internet

• 1999 - ?? Kazillions

Jan.2001 C.Watters 4

What is the World Wide Web ?

• Hypertext connectivity of “documents”

Jan.2001 C.Watters 5

Size of the Internet

Jan.2001 C.Watters 6

Jan.2001 C.Watters 7

The web

• The Web is a set of protocols that uses the internet as its communication structure

• links documents stored on computers that communicate over the internet

• main authority: W3 Consortium (W3C), www.w3.org

Jan.2001 C.Watters 8

Historical View: WWW

• 1989 - Berners-Lee - web doc proposal

• 1990 - Berners-Lee - text browser (physicists)

• 1992 - public access to web docs at CERN

• 1993 - 60 web servers & Mosaic (graphics); 500 servers by year end

• 1995 - more Internet email than US post

• 1999 - x million docs & y million servers

Jan.2001 C.Watters 9

http://www.lexis-nexis.com

Jan.2001 C.Watters 10

Basics

• Web server - machine that services internet requests

• Web client - machine that initiates internet requests

• Browser - software to interact with internet data at the client

• TCP/IP - internet data protocol

• FTP - internet file transfer protocol

• HTTP - hypertext transfer protocol

• HTML - hypertext markup language

Jan.2001 C.Watters 11

Client-Server Model

Jan.2001 C.Watters 12

Looking in the Cloud

• /opt/sbin/traceroute

Jan.2001 C.Watters 13

CA*net2 layer 2 links

Jan.2001 C.Watters 14

CA*net3 Physical Links

Jan.2001 C.Watters 15

1. Client-Server & Web

• Cloud model

• TCP/IP

• HTTP and MIME types

• FTP

• protocol stacks

Jan.2001 C.Watters 16

Servers and Clients

• Servers - computer systems at the end of a network that store files and provide other services

• Clients - computer systems that are end points for users of the data

Jan.2001 C.Watters 17

Network Architectures

• ISO’s OSI model

• 1970’s

• International Organization for Standardization

• Open Systems Interconnection reference Model

• 7 layer architecture

Jan.2001 C.Watters 18

ISO - OSI Model

• Application layer – FTP, telnet, etc.

• Presentation layer – data compression, format

• Session layer – set up connections

• Transport layer – end-to-end transmission of packets

• Network layer – guide packets along links

• Data link layer – send packets between nodes

• Physical layer – deliver bits between nodes

Jan.2001 C.Watters 19

ISO OSI model

Jan.2001 C.Watters 20

INTERNET MODEL

• 4 layers

• Application layer
  – communication services (ftp, telnet, email)

• Transport layer
  – transmission of messages end-to-end

• Network layer
  – transmission of messages across a sequence of links

• Link layer
  – transmission of a packet across one link
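The four layers can be pictured as successive wrappers around the application's message. The sketch below is a toy illustration only: the bracketed header strings and the function names are invented placeholders, not real TCP, IP, or link-layer header formats.

```python
# Toy model of layering: each layer below the application wraps the data
# it receives with its own header. The headers are made-up labels.
LAYERS = ["transport", "network", "link"]  # order in which headers are added


def encapsulate(message: bytes) -> bytes:
    """Wrap an application message once per lower layer (link header outermost)."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]".encode("ascii") + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, as the receiving host would."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode("ascii")
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data


frame = encapsulate(b"GET / HTTP/1.0")
print(frame)               # b'[link][network][transport]GET / HTTP/1.0'
print(decapsulate(frame))  # b'GET / HTTP/1.0'
```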

Jan.2001 C.Watters 21

Internet layers

Jan.2001 C.Watters 22

Application Layer

• FTP

• HTTP

• SMTP

• telnet

• etc

Jan.2001 C.Watters 23

TCP/IP

• Suite of protocols adopted as the standard for the Internet

• facilitates communication between interconnected networks, whether heterogeneous or similar

• TCP itself is a reliable, connection-oriented, byte-stream protocol

Jan.2001 C.Watters 24

Transport layer: TCP and UDP

• TCP – transmission control protocol
  – full duplex byte stream
  – virtual path (connected)
  – error free
  – uses acknowledgements
  – 16 bit port addresses

• UDP – user datagram protocol
  – connectionless
  – no acknowledgements
  – no flow control
  – no resending of erroneous packets
  – some error detection
  – 16 bit port addresses
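To make the 16-bit port fields concrete, here is a minimal sketch that packs and unpacks a UDP header with Python's struct module. A UDP header consists of four 16-bit fields (source port, destination port, length, checksum); the helper name and the sample port numbers are assumptions chosen for illustration, and the checksum is left at 0 rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def pack_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build an 8-byte UDP header for the given payload (checksum not computed)."""
    length = UDP_HEADER.size + len(payload)  # header bytes plus data bytes
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


header = pack_udp_header(1025, 80, b"hello")
src, dst, length, checksum = UDP_HEADER.unpack(header)
print(src, dst, length, checksum)  # 1025 80 13 0
```

Because each port field is 16 bits wide, a port number above 65535 cannot be packed and struct raises an error.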

Jan.2001 C.Watters 25

TCP/IP

Transmission Control Protocol

Internet Protocol

Jan.2001 C.Watters 26

TCP and IP

Jan.2001 C.Watters 27

Network Layer: IP

• Delivers packets up to 64 kbytes, 1 at a time

• Each packet has a header
  – sending host and intended host network addresses
  – 32 bit addresses

• IP layer (like UDP)
  – unreliable
  – connectionless
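The 32-bit addresses in the IP header are usually written in dotted form. As a small sketch, Python's ipaddress module can show the underlying integer; the address used here is an assumption, taken from the range reserved for documentation examples.

```python
import ipaddress

# 192.0.2.1 is a documentation-only example address, not a real host.
addr = ipaddress.IPv4Address("192.0.2.1")

as_int = int(addr)                    # the same address as one 32-bit number
print(as_int)                         # 3221225985
print(as_int.to_bytes(4, "big"))      # the 4 bytes as they appear in the header
print(ipaddress.IPv4Address(as_int))  # 192.0.2.1
```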

Jan.2001 C.Watters 28

Link Layer: links

• Connect computer to Internet

• SLIP
  – serial line IP (asynchronous, 1 char at a time)
  – move IP packets to common link (phone line)

• PPP
  – point-to-point protocol
  – also synchronous transfer for packets

Jan.2001 C.Watters 29

Data encapsulation using TCP on Ethernet

Jan.2001 C.Watters 30

TCP/IP apps

• TCP/IP software usually includes:
  – remote terminal client using TELNET protocol for remote login
  – electronic mail client using SMTP protocol to transfer e-mail to remote system
  – file transfer client using FTP protocol to transfer files between 2 machines

Jan.2001 C.Watters 31

HTTP

HyperText Transfer Protocol

• Native protocol for WWW

• sits on top of internet’s TCP/IP protocol

• HTTP is a 4 step process per transaction

• uses a predefined set of document formats from MIME

Jan.2001 C.Watters 32

MIME

• MIME - multipurpose internet mail extensions
  – defines file formats (images, video, text, etc)
  – e.g. Content-type: text/html
  – Data type/subtype
    » text/html
    » text/plain
    » image/gif
    » video/mpeg
    » application/msword
    » etc.
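One way to see the type/subtype pairs in practice is to ask Python's standard mimetypes module which content type it would associate with a file name. This is only a sketch; the file names are made up for illustration, and a real server may use its own mapping table.

```python
import mimetypes

# Hypothetical file names; only the extension matters for the lookup.
for name in ["index.html", "notes.txt", "logo.gif"]:
    content_type, _encoding = mimetypes.guess_type(name)
    print(f"{name} -> Content-type: {content_type}")
```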

Jan.2001 C.Watters 33

HTTP Connection

• 1. Client
  – makes a TCP/IP connection
  – makes an HTTP request for a web page

• 2. Server accepts request
  – sends page as HTTP

• 3. Client downloads page

• 4. Server breaks the connection
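The four steps can be walked through in a few lines of Python. This sketch is a simulation under stated assumptions: socket.socketpair() stands in for a real TCP/IP connection to a web server, and the requested path and the page body are invented for illustration.

```python
import socket

# Step 1: a connected pair of sockets stands in for the TCP/IP connection,
# and the client sends its HTTP request over it.
client, server = socket.socketpair()
client.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")

# Step 2: the server accepts the request and sends the page as HTTP.
request = server.recv(4096)
body = b"<html>hello</html>"
response_head = (
    "HTTP/1.0 200 OK\r\n"
    "Content-type: text/html\r\n"
    f"Content-length: {len(body)}\r\n"
    "\r\n"
).encode("ascii")
server.sendall(response_head + body)

# Step 4: the server breaks the connection.
server.close()

# Step 3: the client downloads the page; the closed connection (an empty
# read) tells it that the download is complete.
chunks = []
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
client.close()

response = b"".join(chunks)
print(response.decode("ascii"))
```

Steps 3 and 4 overlap in practice: in HTTP/1.0 the client often learns that the page is complete only because the server has closed the connection.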

Jan.2001 C.Watters 34

HTTP is Stateless!!!!

• Each operation or transaction makes a new connection

• each operation is unaware of any other connection

• each click is a new connection

• So how do they do those shopping carts??

Jan.2001 C.Watters 35

What does it look like?

• Header + object file

• Header
  – plain text
  – info about the object (MIME etc)
  – methods allowed
  – etc
  – browser sends a header to server each time you ask for information
  – server sends a header and possibly content

Jan.2001 C.Watters 36

HTTP Transaction Example

GET /catalog/ip/ip.htm HTTP/1.0

Accept: text/plain

Accept: text/html

Referer: http://www.june.com/catalog.html

User-Agent: Mozilla/2.0

<CR/LF>

Jan.2001 C.Watters 37

HTTP REQUEST PROTOCOL

Request = Simple | Full

Simple = GET <URI> CRLF

Full = Method URI ProtVersion CRLF [<HTRQ Header>*] [CRLF <data>]

Method = GET | POST | HEAD | ….

<HTRQ Header> = <Fieldname>:<Value>CRLF

<data> = MIME conforming message

www.w3.org/Protocols/HTTP/
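The grammar above can be turned into a small parser. The following is a minimal sketch, not a complete HTTP implementation: the function name and the dictionary layout are assumptions, and headers are kept as a list of pairs because a field such as Accept may appear more than once.

```python
def parse_request(raw: str) -> dict:
    """Parse a Simple or Full request following the grammar above."""
    head, _, data = raw.partition("\r\n\r\n")  # optional data follows a blank line
    lines = head.split("\r\n")
    parts = lines[0].split()

    if len(parts) == 2:  # Simple = GET <URI> CRLF
        method, uri = parts
        if method != "GET":
            raise ValueError("a simple request only allows GET")
        return {"method": method, "uri": uri, "version": None,
                "headers": [], "data": ""}

    if len(parts) != 3:
        raise ValueError("malformed request line")
    method, uri, version = parts  # Full = Method URI ProtVersion CRLF

    headers = []
    for line in lines[1:]:
        if not line:
            continue
        name, sep, value = line.partition(":")  # <Fieldname>:<Value>
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        headers.append((name.strip(), value.strip()))

    return {"method": method, "uri": uri, "version": version,
            "headers": headers, "data": data}


full = parse_request(
    "GET /catalog/ip/ip.htm HTTP/1.0\r\n"
    "Accept: text/plain\r\n"
    "Accept: text/html\r\n"
    "User-Agent: Mozilla/2.0\r\n"
    "\r\n"
)
print(full["method"], full["uri"], full["version"])
print(full["headers"])
```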

Jan.2001 C.Watters 38

HTTP Header fields

• General-header fields
  – used for both requests and responses

• Request-header fields
  – used for requests
  – extra client information for use by server
  – optional

Jan.2001 C.Watters 39

General-header fields

• Date: Mon, 11 Jan 1999 08:14:32 GMT

• MIME-version: 1.0

• Pragma: no-cache
  – directives

Jan.2001 C.Watters 40

Request-header fields

• acceptable MIME types for response
  – Accept: text/html
  – Accept: */*

• client's reply to a 401 response
  – Authorization: Basic abcdef (base64-encoded username and password)

• From: client-email-addr
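In the Basic scheme, the value after "Basic" is the string username:password encoded with base64. A minimal sketch of how a client could build that header follows; the function name is an assumption, and the sample credentials are the well-known example pair from the HTTP authentication specification.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build an Authorization header line for the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


print(basic_authorization("Aladdin", "open sesame"))
# Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Note that base64 is an encoding, not encryption: anyone who sees the header can recover the password.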

Jan.2001 C.Watters 41

More Request-header fields

• If-Modified-Since: date
  – conditional GET

• source of current requested URL
  – Referer: URL

• robot/browser identification
  – User-Agent: Mozilla/2.0

Jan.2001 C.Watters 42

Looking at the HTTP Header Values

• In Perl
  – $ENV{'HTTP_FROM'} (a CGI script sees each request header as an HTTP_* environment variable)

• In Netscape
  – www.cs.dal.ca/~watters/cgi-bin/webcourse/env.html

Jan.2001 C.Watters 43

HTTP Methods

• Client requests either
  – simple request
  – full request

Request-line = method Request-URI HTTP-version CRLF

GET /catalog/ip.html HTTP/1.0

Jan.2001 C.Watters 44

Simple requests

• Only for HTTP 0.9

• only uses the GET method

• causes the server to locate and transfer the object specified

• client responsible for handling the object

GET <uri> CRLF

Jan.2001 C.Watters 45

Full Request

• Uses HTTP version and more methods

• method tells server what to do to the resource requested

• Methods
  – GET
  – POST
  – HEAD

Jan.2001 C.Watters 46

GET Method

• Request server to retrieve object specified

• conditional GET
  – request message includes If-Modified-Since in header
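On the server side, a conditional GET comes down to one date comparison. The sketch below assumes the server knows the object's last modification time; the function name is hypothetical, and the status codes used are 200 (send the object) and 304 (not modified, send no object).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 200 if the object changed after the given date, else 304."""
    since = parsedate_to_datetime(if_modified_since)  # parses the HTTP date format
    return 200 if last_modified > since else 304


header_value = "Mon, 11 Jan 1999 08:14:32 GMT"
older = datetime(1999, 1, 10, tzinfo=timezone.utc)
newer = datetime(1999, 1, 12, tzinfo=timezone.utc)

print(conditional_get_status(header_value, older))  # 304
print(conditional_get_status(header_value, newer))  # 200
```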

Jan.2001 C.Watters 47

HEAD Method

• Like GET but does not return the object

• returns a header about the resource requested (metainformation)

• good way to test link validity

Jan.2001 C.Watters 48

POST Method

• Include an object in the request

• server should use that object in processing the request

• must include a Content-Length in header
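A short sketch of a POST request shows where Content-Length fits: it gives the size of the included object in bytes, so the server knows how much data to read after the blank line. The URI, the form fields, and the default content type are assumptions made up for this example.

```python
def build_post(uri: str, body: bytes,
               content_type: str = "application/x-www-form-urlencoded") -> bytes:
    """Assemble an HTTP/1.0 POST request with the required Content-Length."""
    head = (
        f"POST {uri} HTTP/1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # size of the object in bytes
        "\r\n"
    )
    return head.encode("ascii") + body


body = b"item=ip-guide&qty=2"
message = build_post("/cgi-bin/order", body)
print(message.decode("ascii"))
```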

Jan.2001 C.Watters 49

HTTP Response Message

• HTTP protocol version

• 3 digit status code

• reason phrase

• CRLF

• optional header fields

• CRLF
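The first line of a response carries the first three items in this list. As a minimal sketch (the function name is an assumption), it can be split apart like this:

```python
def parse_status_line(line: str) -> tuple:
    """Split a response status line into (version, status code, reason phrase)."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    if not version.startswith("HTTP/"):
        raise ValueError("missing HTTP protocol version")
    if len(code) != 3 or not code.isdigit():
        raise ValueError("status code must be 3 digits")
    return version, int(code), reason


print(parse_status_line("HTTP/1.0 404 Not Found\r\n"))
# ('HTTP/1.0', 404, 'Not Found')
```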

Jan.2001 C.Watters 50

HTTP Response Header Fields

• Additional information about the server

• such as:
  – LOCATION: exact URI address
  – SERVER: server software (CERN/3.0)
  – WWW-AUTHENTICATE:
    • status 401 responses (unauthorized request)
    • server challenges client
    • client may use to send authorization info to server

Jan.2001 C.Watters 51

Understanding STATUS Codes

• 1xx - informational (reserved, not used in HTTP/1.0)

• 2xx - action successful

• 3xx - further action needed

• 4xx - client request error

• 5xx - server error
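Since the class of a status code is given by its first digit, a client can classify any response with integer division. A small sketch, with a hypothetical function name and labels taken from the list above:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "action successful",
    3: "further action needed",
    4: "client request error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a 3 digit status code to its class using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(200))  # action successful
print(status_class(404))  # client request error
```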

Jan.2001 C.Watters 52

HTTP Transaction

• 1. Client and server establish a connection

• 2. Client makes a request

• 3. Server makes a response

• 4. Server terminates connection

Jan.2001 C.Watters 53

• Step 1 establish connection
  – TCP/IP connection set up
  – uses a port number as application reference
  – usually port 80
  – ports < 1024 are privileged (1024 and above are open)

• Step 2 client request
  – HTTP message sent with a request line
  – request-line = method URL HTTP-version

Jan.2001 C.Watters 54

Web Port Assignments

• 21 FTP

• 23 Telnet

• 25 smtp (mail)

• 70 gopher

• 79 finger

• 80 HTTP

Jan.2001 C.Watters 55

• Step 3 server response
  – server sends HTTP message and optionally requested data
  – resp-message = HTTP-version status-code reason-phrase [optional stuff]

• Step 4 connection terminated
  – usually by the server
  – sometimes the client “stops” it
  – anything else, whoever notices terminates