World Wide Web Basics
What is an internet anyway? 2 or more...
Jan.2001 C.Watters 3
Historical View: Internet
• 1969 - Telnet
• 1970 - 4 computers – Stanford, UCLA, UC Santa Barbara, U Utah
• 1971 - FTP
• 1983 - 562 computers on the internet
• 1993 - 1.2 million computers on the internet
• 1999 - ?? Kazillions
The web
• The Web is a set of protocols that uses the Internet as its communication structure
• it links documents stored on computers that communicate over the Internet
• main authority: W3 Consortium (W3C)
– www.w3.org
Historical View: WWW
• 1989 - Berners-Lee - web doc proposal
• 1990 - Berners-Lee - text browser (physicists)
• 1992 - public access to web docs at CERN
• 1993 - 60 web servers & Mosaic (graphics)
– (500 servers by year end)
• 1995 - more Internet email than US post
• 1999 - x million docs & y million servers
Basics
• Web server - machine that services Internet requests
• Web client - machine that initiates Internet requests
• Browser - software to interact with internet data at the client
• TCP/IP - internet data protocol
• FTP - internet file transfer protocol
• HTTP - hypertext transfer protocol
• HTML - hypertext markup language
1. Client-Server & Web
• Cloud model
• TCP/IP
• HTTP and MIME types
• FTP
• protocol stacks
Servers and Clients
• Servers - computer systems at the end of a network that store files and provide other services
• Clients - computer systems that are end points for users of the data
Network Architectures
• ISO’s OSI model
• 1970’s
• International Organization for Standardization
• Open Systems Interconnection Reference Model
• 7-layer architecture
ISO - OSI Model
• Application layer
– ftp, telnet, etc.
• presentation layer
– data compression, format
• session layer
– set up connections
• transport layer
– end-to-end transmission of packets
• network layer
– guide packets along links
• data link layer
– send packet between nodes
• physical layer
– deliver bits between nodes
INTERNET MODEL
• 4 layers
• Application layer
– communication services (ftp, telnet, email)
• transport layer
– transmission of messages end-to-end
• network layer
– transmission of messages across a sequence of links
• link layer
– transmission of a packet across one link
TCP/IP
• Suite of protocols adopted as the standard for the Internet
• facilitates communication between heterogeneous and similar networks that are connected together
• TCP provides a reliable, connection-oriented, byte-stream service
Transport layer: TCP and UDP
• TCP– transmission control protocol
– full duplex byte stream
– virtual path (connected)
– error free
– uses acknowledgements
– 16-bit port addresses
• UDP– user datagram protocol
– connectionless
– no acknowledgements
– no flow control
– no resending of erroneous packets
– some error detection
– 16 bit port addresses
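The contrast between the two transport protocols can be sketched with Python's standard socket module. This is a minimal illustration on the loopback interface, not part of the original slides: the function names tcp_echo_once and udp_send_once are made up for the example, and port 0 simply asks the operating system for any free 16-bit port.

```python
import socket
import threading


def tcp_echo_once(payload: bytes) -> bytes:
    """Send payload over a TCP connection to a local echo server; return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: the OS picks a free 16-bit port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()   # connection set-up (the "virtual path")
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)      # full duplex: reply on the same connection

    worker = threading.Thread(target=serve)
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply


def udp_send_once(payload: bytes) -> bytes:
    """Send one datagram over UDP: no connection and no acknowledgement."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)        # a lost datagram is never resent, so give up
    port = receiver.getsockname()[1]
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(payload, ("127.0.0.1", port))
        data, _ = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(tcp_echo_once(b"hello"))
    print(udp_send_once(b"hello"))
```

The TCP half needs accept() and a connected socket before any data moves; the UDP half just addresses a datagram and sends it, which is why the receiver needs a timeout.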
Network Layer: IP
• Delivers packets of up to 64 kbytes, 1 at a time
• Each packet has a header
– sending host and intended host network addresses
– 32-bit addresses
• IP layer (like UDP)
– unreliable
– connectionless
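To make the 32-bit address idea concrete, here is a small sketch using Python's standard ipaddress and struct modules. It only shows how two dotted addresses become two 32-bit fields; it does not build a real IP header, and the helper names are illustrative.

```python
import ipaddress
import struct


def ipv4_to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))


def pack_addresses(source: str, destination: str) -> bytes:
    """Pack sender and receiver addresses as two 32-bit fields in network byte order."""
    return struct.pack("!II", ipv4_to_int(source), ipv4_to_int(destination))


if __name__ == "__main__":
    print(ipv4_to_int("192.168.0.1"))
    print(pack_addresses("192.168.0.1", "10.0.0.7").hex())
```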
Link Layer: links
• Connect computer to Internet
• SLIP
– serial line IP (asynchronous, 1 char at a time)
– move IP packets to common link (phone line)
• PPP
– point-to-point protocol
– also synchronous transfer for packets
TCP/IP apps
• TCP/IP software usually includes:
– remote terminal client using TELNET protocol for remote login
– electronic mail client using SMTP protocol to transfer e-mail to remote system
– file transfer client using FTP protocol to transfer files between 2 machines
HTTP: HyperText Transfer Protocol
• Native protocol for WWW
• sits on top of internet’s TCP/IP protocol
• HTTP is a 4 step process per transaction
• uses a predefined set of document formats from MIME
MIME
• MIME - multipurpose internet mail extensions
– defines file formats (images, video, text, etc)
– e.g. Content-type: text/html
– Data type/subtype
» text/html
» text/plain
» image/gif
» video/mpeg
» application/msword
» etc!!!
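The type/subtype structure is easy to see by splitting a header line. The following is a minimal sketch, assuming a single well-formed Content-type line; parse_content_type is a name invented for the example and it ignores any parameters after a semicolon.

```python
from typing import Tuple


def parse_content_type(header_line: str) -> Tuple[str, str]:
    """Split a 'Content-type: type/subtype' header line into (type, subtype)."""
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "content-type":
        raise ValueError(f"not a Content-type header: {header_line!r}")
    media = value.split(";", 1)[0].strip().lower()   # drop parameters such as charset
    main_type, slash, subtype = media.partition("/")
    if not (slash and main_type and subtype):
        raise ValueError(f"expected type/subtype, got {media!r}")
    return main_type, subtype


if __name__ == "__main__":
    for line in ("Content-type: text/html", "Content-type: image/gif"):
        print(parse_content_type(line))
```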
HTTP Connection
• 1. Client
– makes an HTTP request for a web page
– makes a TCP/IP connection
• 2. Server accepts request
– sends page as HTTP
• 3. Client downloads page
• 4. Server breaks the connection
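The four steps can be watched end to end with a throwaway server on the loopback interface. This is a sketch under stated assumptions: PageHandler and fetch_once are illustrative names, the page content is made up, and Python's http.server stands in for a real web server.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Tiny stand-in web server that returns one fixed page."""

    def do_GET(self) -> None:
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the demo quiet


def fetch_once(path: str = "/") -> bytes:
    """Run one complete HTTP/1.0 transaction and return the raw response bytes."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)  # serve exactly one request
    worker.start()
    # Step 1: the client opens a TCP/IP connection and sends its HTTP request.
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        # Steps 2 and 3: the server sends the page and the client downloads it.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # Step 4: the server has closed the connection
                break
            chunks.append(chunk)
    worker.join()
    server.server_close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(fetch_once("/").decode("iso-8859-1"))
```

The client never learns the length in advance here; it simply reads until the server breaks the connection, which is how the end of a response is signalled in this model.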
HTTP is Stateless!!!!
• Each operation or transaction makes a new connection
• each operation is unaware of any other connection
• each click is a new connection
• So how do they do those shopping carts??
What does it look like?
• Header + object file
• Header
– plain text
– info about the object (MIME etc)
– methods allowed
– etc
– browser sends a header to server each time you ask for information
– server sends a header and possibly content
HTTP Transaction Example
GET /catalog/ip/ip.htm HTTP/1.0
Accept: text/plain
Accept: text/html
Referer: http://www.june.com/catalog.html
User-Agent: Mozilla/2.0
<CR/LF>
HTTP REQUEST PROTOCOL
Request = Simple | Full
Simple = GET <URI> CRLF
Full = Method URI ProtVersion CRLF [<HTRQ Header>*] [CRLF <data>]
Method = GET | POST | HEAD | ….
<HTRQ Header> = <Fieldname>:<Value>CRLF
<data> = MIME conforming message
www.w3.org/Protocols/HTTP/
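The grammar above can be turned into a small parser. This is a rough sketch, not a complete implementation of the specification: parse_request and the shape of the returned dictionary are choices made for the example, and headers are kept as a list of pairs because a field such as Accept may appear more than once.

```python
from typing import Any, Dict, List, Tuple


def parse_request(raw: str) -> Dict[str, Any]:
    """Classify a request as Simple or Full and split it into its parts."""
    head, _, data = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    parts = lines[0].split()
    if len(parts) == 2:                       # Simple = GET <URI> CRLF
        method, uri = parts
        if method != "GET":
            raise ValueError("a simple request only allows GET")
        return {"kind": "simple", "method": method, "uri": uri,
                "version": None, "headers": [], "data": ""}
    if len(parts) != 3:                       # Full = Method URI ProtVersion CRLF ...
        raise ValueError(f"malformed request line: {lines[0]!r}")
    method, uri, version = parts
    headers: List[Tuple[str, str]] = []
    for line in lines[1:]:
        if not line:
            continue
        name, colon, value = line.partition(":")   # <Fieldname>:<Value>
        if not colon:
            raise ValueError(f"malformed header: {line!r}")
        headers.append((name.strip(), value.strip()))
    return {"kind": "full", "method": method, "uri": uri,
            "version": version, "headers": headers, "data": data}


if __name__ == "__main__":
    example = ("GET /catalog/ip/ip.htm HTTP/1.0\r\n"
               "Accept: text/plain\r\n"
               "Accept: text/html\r\n"
               "Referer: http://www.june.com/catalog.html\r\n"
               "User-Agent: Mozilla/2.0\r\n"
               "\r\n")
    print(parse_request(example))
    print(parse_request("GET /index.html\r\n"))
```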
HTTP Header fields
• General-header fields
– used for both requests and responses
• Request-header fields
– used for requests only
– extra client information for use by server
– optional
General-header fields
• Date: Mon, 11 Jan 1999 08:14:32 GMT
• MIME-version: 1.0
• Pragma: no-cache
– directives
Request-header fields
• acceptable MIME types for response
– Accept: text/html
– Accept: */*
• client's reply to a 401 response
– Authorization: Basic abcdef (base64-encoded username and password)
• From: client-email-addr
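The Basic scheme joins the username and password with a colon and encodes the result with base64, which is an encoding and not encryption. A short sketch with Python's standard base64 module follows; the credentials and the function names are placeholders for illustration.

```python
import base64
from typing import Tuple


def basic_authorization(username: str, password: str) -> str:
    """Build the Authorization header a client sends after a 401 challenge."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header_line: str) -> Tuple[str, str]:
    """Recover (username, password) from a Basic Authorization header line."""
    _, _, value = header_line.partition(":")
    scheme, _, token = value.strip().partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("only the Basic scheme is handled in this sketch")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_authorization("user", "pass")
    print(header)
    print(decode_basic(header))
```

Because decoding is this easy, anyone who can read the request can read the password.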
More Request-header fields
• If-Modified-Since: date
– conditional GET
• source of current requested URL
– Referer: URL
• robot/browser identification
– User-Agent: Mozilla/2.0
Looking at the HTTP Header Values
• In Perl
– $ENV{"HTTP_FROM"} (a CGI script sees each request header as an HTTP_* environment variable)
• In Netscape
– www.cs.dal.ca/~watters/cgi-bin/webcourse/env.html
HTTP Methods
• Client requests either
– simple request
– full request
Request-line = method Request-URI HTTP-version CRLF
GET /catalog/ip.html HTTP/1.0
Simple requests
• Only for HTTP 0.9
• only uses the GET method
• causes the server to locate and transfer the object specified
• client responsible for handling the object
GET <uri> CRLF
Full Request
• Uses HTTP version and more methods
• method tells server what to do to the resource requested
• Methods
– GET
– POST
– HEAD
GET Method
• Request server to retrieve object specified
• conditional GET
– request message includes If-Modified-Since in header
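On the server side, a conditional GET comes down to one date comparison: if the object has not changed since the date the client supplied, the server can answer 304 (Not Modified) with no body instead of 200 with the object. A minimal sketch, assuming the server already knows the object's last-modified time; conditional_get_status is an illustrative name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 200 if the object must be sent, 304 if the client's copy is current."""
    if if_modified_since is None:
        return 200                      # plain GET: always send the object
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                      # unreadable date: fall back to a plain GET
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    return 200 if last_modified > since else 304


if __name__ == "__main__":
    changed = datetime(1999, 1, 10, tzinfo=timezone.utc)
    print(conditional_get_status(changed, "Mon, 11 Jan 1999 08:14:32 GMT"))
```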
HEAD Method
• Like GET but does not return the object
• returns a header about the resource requested (metainformation)
• good way to test link validity
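A link checker built on HEAD might look like the sketch below. It uses Python's http.client against a throwaway local server so that it runs without network access; DemoHandler, its single page, and link_is_valid are assumptions made for the example, and treating every status below 400 as "valid" is one possible policy, not a rule from the slides.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional, Tuple


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server that knows exactly one page."""
    PAGES = {"/index.html": b"<html>ok</html>"}

    def _send_headers(self) -> Optional[bytes]:
        body = self.PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return None
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        return body

    def do_HEAD(self) -> None:
        self._send_headers()            # headers only, no object

    def do_GET(self) -> None:
        body = self._send_headers()
        if body is not None:
            self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass


def link_is_valid(host: str, port: int, path: str) -> bool:
    """Issue a HEAD request and report whether the resource appears to exist."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()                 # a HEAD response carries no body
        return 200 <= response.status < 400
    finally:
        conn.close()


def demo() -> Tuple[bool, bool]:
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return (link_is_valid("127.0.0.1", port, "/index.html"),
                link_is_valid("127.0.0.1", port, "/missing.html"))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```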
POST Method
• Include an object in the request
• server should use that object in processing the request
• must include a Content-Length in header
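The Content-Length requirement is easiest to see when the request is assembled by hand. A minimal sketch: build_post is an illustrative helper, and the path /cgi-bin/order and the form fields are invented for the example.

```python
def build_post(path: str, body: bytes,
               content_type: str = "application/x-www-form-urlencoded") -> bytes:
    """Assemble a full POST request whose Content-Length matches the enclosed object."""
    head = (f"POST {path} HTTP/1.0\r\n"
            f"Content-Type: {content_type}\r\n"
            f"Content-Length: {len(body)}\r\n"   # byte count of the object that follows
            "\r\n")
    return head.encode("ascii") + body


if __name__ == "__main__":
    # Hypothetical form submission; the path and field names are placeholders.
    print(build_post("/cgi-bin/order", b"item=42&qty=1").decode("ascii"))
```

The server relies on this count to know where the enclosed object ends, since the client keeps the connection open to wait for the response.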
HTTP Response Message
• HTTP protocol version
• 3 digit status code
• reason phrase
• CRLF
• optional header fields
• CRLF
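Read in order, those pieces give a status line, then header fields, then a blank line, with any returned object after it. A parsing sketch under that reading follows; parse_response is an illustrative name, and the sample response in the usage lines is made up.

```python
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[str, int, str, Dict[str, str], bytes]:
    """Split a raw response into (version, status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")    # blank line ends the header section
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)                # version, 3-digit code, reason phrase
    if len(parts) < 2 or not (len(parts[1]) == 3 and parts[1].isdigit()):
        raise ValueError(f"malformed status line: {lines[0]!r}")
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) == 3 else ""
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip()] = value.strip()
    return version, code, reason, headers, body


if __name__ == "__main__":
    sample = (b"HTTP/1.0 200 OK\r\n"
              b"Server: CERN/3.0\r\n"
              b"Content-Type: text/html\r\n"
              b"\r\n"
              b"<html></html>")
    print(parse_response(sample))
```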
HTTP Response Header Fields
• Additional information about the server
• such as:
– LOCATION: exact URI address
– SERVER: server software (CERN/3.0)
– WWW-AUTHENTICATE:
» status 401 responses (unauthorized request)
» server challenges client
» client may use to send authorization info to server
Understanding STATUS Codes
• 1xx - informational (not used in HTTP/1.0)
• 2xx - action successful
• 3xx - further action needed
• 4xx - client request error
• 5xx - server error
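Since only the first digit decides the class, a client can sort any status code with integer division. A small sketch, with the class descriptions taken from the list above and the function name chosen for the example:

```python
STATUS_CLASSES = {
    1: "informational (unused in HTTP/1.0)",
    2: "action successful",
    3: "further action needed",
    4: "client request error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a 3-digit status code to its class using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    return STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for code in (200, 304, 401, 404, 500):
        print(code, status_class(code))
```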
HTTP Transaction
• 1. Client and server establish a connection
• 2. Client makes a request
• 3. Server makes a response
• 4. Server terminates connection
• Step 1 establish connection
– TCP/IP connection set up
– uses a port number as application reference
– usually port 80
– ports < 1024 are privileged (1024 and above are open)
• Step 2 client request
– HTTP message sent with a request line
– request-line = method URL HTTP-version
Web Port Assignments
• 21 FTP
• 23 Telnet
• 25 smtp (mail)
• 70 gopher
• 79 finger
• 80 HTTP
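The assignments above fit naturally in a lookup table. The sketch below hard-codes the six entries from the list rather than querying the operating system's services database; the helper names are illustrative.

```python
WELL_KNOWN_PORTS = {
    "ftp": 21,
    "telnet": 23,
    "smtp": 25,
    "gopher": 70,
    "finger": 79,
    "http": 80,
}


def default_port(service: str) -> int:
    """Look up the conventional port for a service name (case-insensitive)."""
    return WELL_KNOWN_PORTS[service.lower()]


def is_privileged(port: int) -> bool:
    """Ports below 1024 are privileged; a port number must fit in 16 bits."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port < 1024


if __name__ == "__main__":
    print(default_port("HTTP"), is_privileged(default_port("HTTP")))
    print(is_privileged(8080))
```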
• Step 3 Server response
– server sends HTTP message and optionally requested data
– resp-message = HTTP-version status-code reason-phrase [optional stuff]
• Step 4 connection terminated
– usually the server
– sometimes the client “stops” it
– anything else, whoever notices terminates