Outline of Today’s Class

Web Servers
Static and Dynamic Web Pages
CGI Programming
What makes the CGI work?
FORM
GET and POST Methods
QUERY_STRING and CONTENT_LENGTH
SGML, HTML and XHTML
XML and DTD
XML Examples


Post on 29-Dec-2021



Web Servers How does a web server work?

You contact the web server and request a file. The server returns the file.

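[Figure: two clients, PC-1 and PC-2, send requests to a web server that holds the files /myDir/index.html, /myDir/foo.html and /myDir/bar.html. One request is GET foo.html and the server returns foo.html; the other is GET index.html and the server returns index.html.]

Web Servers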

Most web servers are very simple. They just return files to the PC that requests them.

The web browser does the hard work of translating a file into pretty pictures

See “View->Source” for the file actually returned

Web Servers

It would be a Bad Thing if anyone on the internet could retrieve any file on the web server.

The files are kept in a special directory — requests for files are relative to that directory.
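A minimal Python sketch of the idea above: requests are resolved relative to one special directory, and anything that tries to climb out of it is refused. The function name resolve_request is hypothetical, and /myDir is simply the directory from the earlier figure; a real server does considerably more checking than this.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed document root, taken from the /myDir example on the earlier slide.
DOCUMENT_ROOT = PurePosixPath("/myDir")


def resolve_request(url_path: str) -> Optional[PurePosixPath]:
    """Map a requested path to a file under DOCUMENT_ROOT.

    Returns None when the request tries to escape the document root.
    """
    kept = []
    for part in PurePosixPath(url_path).parts:
        if part in ("/", "."):
            continue  # the leading slash carries no file name
        if part == "..":
            if not kept:
                return None  # would climb above the document root
            kept.pop()
        else:
            kept.append(part)
    return DOCUMENT_ROOT.joinpath(*kept)
```

With this sketch, a request for /foo.html maps to /myDir/foo.html, while a request for /../etc/passwd is rejected.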

Static Web Pages

[Figure: the client requests a file; the server retrieves the file and sends it back.]

Dynamic Web Pages

[Figure: the client requests a service; the server does the computation, generates an HTML page with the results of the computation, and returns the dynamically generated HTML file.]
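As a rough illustration of "do computation, then generate an HTML page with the results", here is a small Python sketch. The function name render_result_page and the addition it performs are made up for the example; the output layout (a Content-Type header, a blank line, then the HTML) is the usual shape of what a CGI program writes to stdout.

```python
def render_result_page(a: int, b: int) -> str:
    """Do a small computation and wrap the result in a generated HTML page."""
    total = a + b  # the "computation" step; a stand-in for real work
    body = (
        "<html><head><title>Result</title></head>"
        f"<body><p>{a} + {b} = {total}</p></body></html>"
    )
    # Header, blank line, then the document body.
    return "Content-Type: text/html\n\n" + body
```

A script built this way would print the returned string, and the web server would pass it on to the browser.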

CGI and Web Forms

How to write the HTML that sends data to the server?

What does the server have to do to process this information?

The most common method to handle this is CGI -- Common Gateway Interface

CGI Programming

[Figure: the client talks to the HTTP server, and the HTTP server runs the CGI program. The server and the CGI program communicate through stdin, stdout and environment variables.]

CGI Programming

Important CGI Environment Variables

REQUEST_METHOD

QUERY_STRING

CONTENT_LENGTH

Request Method: Get

GET requests can include a query string as part of the URL:

GET /cgi-bin/finger?hollingd HTTP/1.0

Request Method: GET
Resource Name: /cgi-bin/finger
Delimiter: ?
Query String: hollingd
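The four parts labelled above can be pulled out of the request line with a few string operations. This is only a sketch; the function name parse_request_line and the dictionary keys are illustrative choices, not part of any standard API.

```python
def parse_request_line(line: str) -> dict:
    """Split an HTTP request line into method, resource name and query string."""
    method, target, version = line.split()
    # Everything before the first "?" is the resource; the rest is the query.
    resource, _, query = target.partition("?")
    return {
        "method": method,
        "resource": resource,
        "query_string": query,
        "version": version,
    }
```

Applied to the request line on this slide, the sketch yields the method GET, the resource /cgi-bin/finger and the query string hollingd.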

CGI URLs

There is a mapping between URLs and CGI programs provided by a web server. The exact mapping is not standardized (the web server admin can set it up).

Typically, requests that start with /CGI-BIN/, /cgi-bin/ or /cgi/, etc. refer to CGI programs (not to static documents).

CGI Programs

When the user hits the “submit” button the data is sent to the web server

The CGI program that handles it on the web server is specified in the HTML Form tag

<FORM method=post action="http://unix.aml.yorku.ca/cgi-bin/formProcessor.pl">

CGI Programs

Anything special about the program?

The web server has to have permissions set to allow the program to be executed. Typically this is only turned on in a few directories, e.g. /cgi-bin.

The program has to comply with the usual security rules for that system.

CGI Programs

What kind of program does it need to be?

Can be written in any language: C++, C, Perl, etc. It just has to be able to process the attribute-value pairs.

Perl is excellent for its pattern matching and text processing capabilities.

CGI Programs

The data is sent to the CGI program in a specific format of attribute-value pairs. The attribute is the name of the field in the HTML tag; the value is what the user inputs.

firstName=lee
middleName=harvey
lastName=oswald

First name: <input type="text" name="firstName">
Middle name: <input type="text" name="middleName"><br>
Last name: <input type="text" name="lastName"><br>
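On the wire, the attribute-value pairs are joined with & into a single URL-encoded string. A short Python sketch of how a program might split them apart again, using the standard library's urllib.parse.parse_qsl; the wrapper name parse_pairs is hypothetical.

```python
from urllib.parse import parse_qsl


def parse_pairs(encoded: str) -> dict:
    """Turn 'a=1&b=2' style form data into a dict of attribute -> value."""
    # keep_blank_values keeps fields the user left empty.
    return dict(parse_qsl(encoded, keep_blank_values=True))
```

For the three fields on this slide, the encoded string would be firstName=lee&middleName=harvey&lastName=oswald. Note that a plain dict keeps only the last value when a field name repeats.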

CGI Programs

Strengths:
- A simple method to send data to the server.
- Dynamically generates HTML pages.

Weaknesses:
- All the processing happens on the server.
- It takes time to launch the CGI process on the server.
- Each request is handled by a separate process instead of a thread.

Web Forms Overview of Web forms

HTML form components

GET & POST methods

Server-side processing with forms

CGI-based Web Application

[Figure: the Web browser uses HTML forms to invoke CGI scripts by sending an HTTP request to the Web server. The CGI scripts/applications get data from a database, which returns data. The output (HTML) goes back to the browser as an HTTP document.]

Form Interaction with CGI

[Figure: interaction between the Web browser, the Web server (network server) and the CGI program:
1. User requests form
2. Server returns form to client
3. User submits form
4. Server forwards to CGI program
5. CGI program returns results to server
6. Server returns results to client]

Forms

Forms work in a different and slightly more complex way than standard HTML pages.
Forms consist of a number of separate data entry components such as menus and text areas.
The user can select different options from the menus and enter text in the text entry fields.
A single form can contain many text entry fields and/or many menus.
To differentiate the menus and text areas from each other, each one is given a unique name, selected by the Web form designer.

HTML Forms

Each form includes a METHOD that determines which HTTP method is used to submit the request.

Each form includes an ACTION that determines where the request is made.

HTML Forms

HTML includes elements or tags for creating forms on Web pages.

There are three stages to creating a form:
- define the form data [a set of variables]
- design the form itself
- define the method for processing the form’s data on the server-side

When the Web page containing the form is loaded, the user can:
- enter data into the form
- then submit that data to the Web server [usually by clicking a submit button on the form]

HTML Form Variables

A variable has:
- a name
- a value

A form contains one or more variables. When the user fills in the form, values are assigned to these variables.

When the user clicks the submit button, the set of variable names & corresponding values are sent to the Web server in an HTTP request. The Web server can extract the set of variables & values from the HTTP request, and can do something with them...

Example for HTML Form

<html>
<head>
<title>Query Form</title>
</head>
<body>
<h2>Query Form</h2>
<form method="GET" action="doquery.php">
<p>Your name: <input name="name" type="text" size=30></p>
<p>Your ID: <input name="id" type="text" size=15></p>
<p><input type="submit" value="Submit your query"></p>
<p><input type="reset" value="Clear your query"></p>
</form>
</body>
</html>

Note that this form contains two variables: name & id

Example for HTML Form

<input name="name" type="text" size=30>

<input name="id" type="text" size=15>

<input type="submit" value="Submit your query"> <input type="reset" value="Clear your query">

Forms

<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title>Web Engineering - Feedback Form</title>
</head>
<body><h1>Feedback Form</h1>
<p>Any comments please.</p>
<form method = "post" action = "/cgi-bin/feedbackform"><p>
<input type = "hidden" name = "recipient" value = "[email protected]" />
<input type = "hidden" name = "subject" value = "Feedback Form" />
<input type = "hidden" name = "redirect" value = "main.html" />
</p>
</form>
<p>

Each form must begin and end with form tags.

The method attribute specifies how the form’s data is sent to the Web server. The post method appends form data to the browser request.

The value of the action attribute specifies the URL of a script on the Web server.

Input elements are used to send data to the script that processes the form.

A hidden value for the type attribute sends data that is not entered by the user.

Forms

<label>Name:
<input name = "name" type = "text" size = "25" maxlength = "30" />
</label></p>
<p><form>
<input type = "submit" value = "Submit comments" />
<input type = "reset" value = "Clear comments" />
</p>
</form></body></html>

The value attribute displays a name on the buttons created.

The maxlength attribute gives the maximum number of characters the user can input.

The size attribute gives the number of characters visible in the text box.

The label element describes the data the user needs to enter in the text box.

Forms

[Figure: the rendered form, with three callouts:
- Text box created using input element.
- Reset button created using input element.
- Submit button created using input element.]

Table & Form

<TABLE FRAME = none>
<TR><TD ALIGN = right>
Name:<BR>
Card number:<BR>
Expires:<BR>
Telephone:<BR>
<TD ALIGN=left><BR>
<FORM method="POST" action="/cgi-bin/myscript.cgi">
<INPUT NAME="name" SIZE=18><BR>
<INPUT NAME="cardnum" SIZE=18><BR>
<INPUT NAME="expires-month" SIZE=2>/
<INPUT NAME="expires-year" SIZE=2><BR>
<INPUT NAME="phone" SIZE=18>
</FORM>
</TABLE>

Form Methods

The method attribute on the form tag specifies how the Web browser should send the data to the Web server.

Two options:
- GET: pass the data in an HTTP GET request
- POST: pass the data in an HTTP POST request

In an HTTP GET request, the browser appends the form data to a URL. For example:

http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234

Note how the variable names & values are appended to the URL. Any spaces in a value are converted to +.
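The encoding a browser applies for a GET form can be reproduced with Python's urllib.parse.urlencode, which joins the pairs with & and turns spaces into +. The helper name build_get_url is hypothetical; the URL and field values are the ones from this slide.

```python
from urllib.parse import urlencode


def build_get_url(action: str, fields: dict) -> str:
    """Append URL-encoded form fields to the form's action URL."""
    return action + "?" + urlencode(fields)


url = build_get_url(
    "http://www.yorku.ca/jhuang/doquery.cgi",
    {"name": "joe bloggs", "id": "1234"},
)
```

Here url ends up identical to the example URL shown above.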

Form Actions

The action attribute on the form tag specifies what the Web server should do with the form data.

Common options:
- email the data to someone [the mailto action]
- pass the data to a script or program

The script will be passed the variables & values, and can then process them.

For example, the CGI script could use the name & id to look up student info in a database.

Form Actions

<form method="GET" action="mailto:[email protected]">

Until you can actually use scripts on the server, use the mailto action. It operates in the same way as the mailto that you have used in the HTML document.

When used in a form, the mailto action will send an email to the address specified. The mailto action is of limited use for complicated forms but works adequately for simple forms.

The email received contains all of the names and values in one long list.

What a CGI will get

The query (from the environment variable QUERY_STRING) will be a URL-encoded string containing the name, value pairs of all form fields.

The CGI must decode the query and separate the individual fields.

GET vs. POST

The GET method delivers data (the query) as part of the URL.

When using forms, it’s generally better to use POST:
- there are limits on the maximum size of a GET query string (environment variable)
- a POST query string doesn’t show up in the browser as part of the current URL

CGI reading POST

If REQUEST_METHOD is POST, the query arrives on STDIN.

The environment variable CONTENT_LENGTH tells us how much data to read.

CGI Method Summary

GET:
- REQUEST_METHOD is “GET”
- QUERY_STRING is the query

POST:
- REQUEST_METHOD is “POST”
- CONTENT_LENGTH is the size of the query (in bytes)
- query can be read from STDIN
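The summary above translates fairly directly into code. The following Python sketch takes the environment and the input stream as parameters so that it can be tried without a web server; the function name read_form_data is hypothetical. In a real CGI script one would typically pass os.environ and sys.stdin.buffer.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ: dict, stdin: io.BufferedIOBase) -> dict:
    """Return the decoded form fields for a GET or POST CGI request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH says how many bytes of query to read from stdin.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # For GET the query is already in the environment.
        raw = environ.get("QUERY_STRING", "")
    # parse_qs decodes + and %XX and maps each name to a list of values.
    return parse_qs(raw, keep_blank_values=True)
```

Reading exactly CONTENT_LENGTH bytes matters because the server is not required to close the stream after the query.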

HTTP Form Processing

1. User fills in form & clicks submit
2. Browser sends GET http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234 over the internet
3. Server runs the script doquery.cgi, passing form data to it
4. Server sends script results to browser
5. Browser displays the script results*

*The script results will usually be HTML text

A More Complex Form Example

[Figure: a form containing a text field, a password field, radio buttons, check boxes, a drop-down list, a text area and buttons.]

Form Processing & Results

The easiest way to deal with form data is to simply email it to an email address using a mailto form action:

<form method="POST" action="mailto:[email protected]">

More often, we want to process the data on the server-side, using a program or script.

The old way is to use a so-called CGI script, usually with a URL something like:

<form method="POST" action="/cgi-bin/myscript.cgi">

The newer way is to use a server-side technology such as Servlets, JSP, or ASP. We’ll look at how to use Servlets later in the course...

Alternatives for Generating Dynamic Pages

Java Servlets

Java Server Pages

Active Server Pages (ASP)

Can pages be generated dynamically in other ways?

Dynamic Web Pages

[Figure: a WWW client and a WWW server connected by HTTP, divided into a client side and a server side. Labelled components: CGI program, other program (application), API, Java servlet, Java applet, script (embedded in HTML), SSI, CGI.]

Some History

SGML

HTML XML and XHTML

SGML

Standard Generalized Markup Language
Developed by a committee!
Led by Charles Goldfarb, 1978-1986
A grammar to define the structure of documents
Rules define the construct or structure
Terminals are <tags> and strings

HTML & XML

HTML is an application of SGML with a shared DTD

HTMLDOC::=(<html> HEAD BODY </html>)

XML is a subset of SGML with many DTDs allowed

XML

Uses tags to identify semantics of data
Looks like HTML, but isn’t

<slide>
<title>Introduction</title>
<author>
<first>Jimmy</first>
<last>Huang</last>
</author>
<content>XML this and that</content>
</slide>

Is license free, platform-independent and well-supported

HTML

Hypertext Markup Language

Presents documents via WWW browsers
Specifies document layout and hyperlinks
Predefines a set of tags (i.e. a common DTD)

HTML: An Example

<HTML>
<TITLE>Statistics Canada</TITLE>
<BODY>
<H3>Welcome to Stats Canada</H3>
Statistics Canada ……. .
<p>
We like numbers…..
<img src="mapleleaf.gif">
<ul>What we do
<li><a href="census.html">Census</a>
<li><a href="special.html">Special surveys</a>
<li><a href="online.html">Online data</a>
</ul>
</BODY>
</HTML>

HTML

HTML - Advantages
- Simple - fixed set of tags
- Portable - used with all browsers
- Linking - within and to external documents

HTML - Disadvantages
- Limited tag set
- Can’t separate the presentation from content
- Can’t define structure of contents

XHTML (EXtensible Hyper-Text Markup Language)

XHTML Basics

Very few real changes from HTML, but more strict:
- All tags are in lowercase
- All tags must be closed (both empty tags and paired tags)

XHTML Document Structure

Overlap versus Nesting

XHTML tags

Start tags and end tags

Start tags - delimited by < and >

End tags - delimited by </ and >

<h1>This is a Large Heading</h1>

<br />This text starts on a new line.

Some start tags also include attributes which further define information about the element.
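The "all tags must be closed" and "nesting, not overlap" rules are XML well-formedness rules, so any XML parser can check them. A small Python sketch using the standard library's xml.etree.ElementTree; the function name is_well_formed is hypothetical, and the check covers well-formedness only, not the lowercase rule or validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True
```

For instance, <p>text<br /></p> passes, while <p>text<br></p> (unclosed empty tag) and <b><i>text</b></i> (overlapping tags) are rejected.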

!DOCTYPE

HTML 3.2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">

Netscape’s HTML standard
<!DOCTYPE HTML PUBLIC "-//WebTechs//DTD Mozilla HTML 2.0//EN">

Not strictly necessary for HTML, but highly recommended.
Future browsers can still attempt to display your older documents (written to previous HTML standards) in the way that was originally intended, even though the HTML language may have evolved.

XHTML
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

!DOCTYPE

<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Comments: name_of_webpage.html -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title> Web Engineering: XHTML I </title>
</head>
<body>
<p>Welcome to XHTML!</p>
</body>
</html>

Annotated parts: !DOCTYPE, title tags, body tags

Images

<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Pictures with XHTML -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title>Web Engineering - pictures</title>
</head>
<body>
<p><img src = "angelheart.jpg" height = "251" width = "367" alt = "An angel" />
<img src = "grail.jpg" height = "180" width = "130" alt = "A chalice" /></p>
</body>
</html>

The value of the src attribute of the image element is the location of the image file.

The value of the alt attribute gives a description of the image. This description is displayed if the image cannot be displayed.

The height and width attributes of the image element give the height and width of the image.

Colours

<BODY TEXT="aqua">
<BODY TEXT="#00FF00">
<FONT COLOR = "#rrggbb" | "colour name">text</FONT>

Colour names: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, yellow

000000  BLACK
00FF00  BRIGHT-GREEN
FFFFFF  WHITE
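The #rrggbb notation packs three two-digit hexadecimal numbers, one each for red, green and blue. A minimal Python sketch that unpacks them; the function name hex_to_rgb is hypothetical.

```python
def hex_to_rgb(colour: str) -> tuple:
    """Convert 'rrggbb' or '#rrggbb' into a (red, green, blue) tuple of 0-255 ints."""
    value = colour.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hex digits is one colour channel.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
```

So 00FF00 has no red, full green and no blue, which is why it shows up as bright green.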

Inline Styles

<h1 style="color:blue; font-style: italic">First Stylesheet Example</h1>
<p>The first example of stylesheets uses an inline style.</p>
<h1>Second Stylesheet Example</h1>
<p>The second example of stylesheets uses a document-level style.</p>
<h1>Third Stylesheet Example</h1>
<p>The third example of stylesheets uses an external stylesheet.</p>

Demonstration: inline_css.html

XML (EXtensible Markup Language)

XML Introduction

The Extensible Markup Language (XML) is a document processing standard proposed by the World Wide Web Consortium (W3C), which is related to Standard Generalized Markup Language (SGML).

Possible to search, sort, manipulate and render XML using Extensible Stylesheet Language (XSL).

Highly portable

Files end in the .xml extension.

Presenter notes: XML stands for Extensible Markup Language. All XML files end in the .xml extension.

XML & W3C • XML has been in development since the 1960s through its parent called SGML (Standard Generalized Markup Language) which is also the parent for HTML

• XML is a streamlined version of SGML designed for transmission of structured data over the Web by a working group in the World Wide Web Consortium (W3C) in 1996

• Passed as W3C standard in Feb 1998

- www.w3.org/xml - www.xml.com/axml/axml.html (annotated version)

XML-related Technologies

DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes

CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser

XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another

DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing

From HTML to XML.. • HTML major drawback – information loses its structure when translated into HTML

• HTML is a presentation-oriented markup language, so information embodied in it is difficult to process

• Information and knowledge servers are overloaded since we have to search information and perform format processing

• Servers often answer the same request many times if users request several views on the same data

• HTML:
- Lacks extensibility: can’t create tags or attributes to parameterise or semantically qualify data
- Lacks structure: does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies
- Lacks validation: does not support language specification that lets applications check imported data’s structural validity


XML Goals

As a portable, platform-independent data storage format, XML should:
• support a wide variety of applications,
• be easy to use across the Internet,
• be compatible with SGML,
• make it easy to create programs that process XML,
• be clear and legible (self-describing).
• XML documents should be easy to create
• XML designs should be quickly prepared, formal & concise etc.

XML..

• XML is not for displaying information but for managing information.
• A working group of the World Wide Web Consortium (W3C) created XML as a standard for creating markup languages.
• It was designed for distributing structured documents over the web.
• A kind of “light” SGML (Standard Generalized Markup Language) simplified to meet Web requirements
• Unlike HTML, XML lets users:
⇒ Extract data from a document
⇒ Define their own tags and attributes
⇒ Define data structures and nest document structures to any complexity level
⇒ Make applications that validate a document’s structure. Any XML document can contain an optional description of its grammar for use by applications that perform structural validation

XML..

The problem that XML helps us to solve is how to transfer data between servers, or between the client and the server.

It is a Markup language for describing structured data – content is separated from presentation.

XML documents contain only data Applications decide how to display the data

Language for creating markup languages Can create new tags

XML documents contain only data, not formatting instructions, so applications that process XML documents must decide how to display the document’s data.

For example a PDA (personal digital assistant) may render an XML document differently than a wireless phone or desktop computer would render that document.

HTML and XML

XML stands for eXtensible Markup Language HTML is used to mark up text so it can be displayed to users

XML is used to mark up data so it can be processed by computers

HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)

XML describes only content, or “meaning”

HTML uses a fixed, unchangeable set of tags

In XML, you make up your own tags

XML..

XML is a meta-language. With HTML, existing markup is static: <HEAD> and <BODY>, for example, are tightly integrated into the HTML standard and cannot be changed, or are extremely difficult to extend.


XML, on the other hand, allows you to create your own markup tags and configure each to your liking, for example:

<WebEngHeading>
<WebEngSummary>
<WebEngReallyWildFont>

Each of these elements can be defined through user-defined document type definitions (DTDs), and stylesheets can be applied to one or more XML documents.

There are no ‘correct’ tags for an XML document, except those defined by the author

Some Code

Schema

Entity: Passport Details
SubEntities: Last Name, First Name, Address

Entity: Address
SubEntities: Street, City, Town, State, Province, ……..

DTD

<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT phone_home (#PCDATA)>
<!ELEMENT email (#PCDATA)>

Internal DTD and Instance

<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT phone_home (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>

<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>[email protected]</email>
</address>
</passport_details>
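To see what the content models in this DTD mean (a comma is sequence, | is choice, + is one or more, ? is optional, * is zero or more), they can be rewritten by hand as regular expressions over child element names. The Python sketch below does that for passport_details and address only. It is an illustration of the idea, not a DTD validator; the names CONTENT_MODELS and matches_model are made up, and Python's standard xml.etree.ElementTree is used only to parse, since it does not validate against a DTD.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of two content models from the DTD above into regular
# expressions over child element names, each name followed by one space.
CONTENT_MODELS = {
    "passport_details": r"last_name (first_name )+address ",
    "address": (
        r"street (city |town )(state |province )(ZIP |postal_code )"
        r"country (contact_no )?(email )*"
    ),
}


def matches_model(element: ET.Element) -> bool:
    """Check an element's direct children against its content model."""
    children = "".join(child.tag + " " for child in element)
    return re.fullmatch(CONTENT_MODELS[element.tag], children) is not None
```

Under this sketch the instance above satisfies both models, whereas a passport_details element with no first_name would fail because first_name+ requires at least one.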

Shared DTD

The XML document specifies the DTD:

<?xml version='1.0'?>
<!DOCTYPE passport_details SYSTEM "PassportExt.dtd">
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>[email protected]</email>
</address>
</passport_details>

XML Examples

XML Source File http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml

XML Style language

http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xsl

Parsing and rendering XML with IE5+

http://www.yorku.ca/jhuang/xml/04.adhoc.topics_xsl.xml

XML Applications

XML permits document authors to create markup for virtually any type of information.

Authors can create entirely new markup languages for describing specific types of data, including mathematical formulas, chemical molecular structures, music, recipes etc.

- XHTML
- VoiceXML (for speech)
- MathML (for mathematics)
- SMIL (the Synchronized Multimedia Integration Language, for multimedia presentations)
- CML (Chemical Markup Language, for chemistry)
- XBRL (Extensible Business Reporting Language, for financial data exchange)

XML Parsers

Processing an XML document requires a software program called an XML parser (or processor). These are available at no charge in many languages (Java, Python, C++ etc.).

http://www.xml.com/programming/

Parsers check an XML document’s syntax and enable software programs to process marked-up data. XML parsers can support the Document Object Model (DOM) or the Simple API for XML (SAX).

DOM: Build a tree structure containing the XML document’s data

SAX: Process the document and generate events
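The difference between the two styles can be seen with the parsers in Python's standard library, xml.dom.minidom for DOM and xml.sax for SAX. This is a small sketch; the sample document is a shortened version of the slide example from earlier in the deck, and the names title_via_dom, TagCollector and tags_via_sax are hypothetical.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = (
    "<slide><title>Introduction</title>"
    "<author><first>Jimmy</first><last>Huang</last></author></slide>"
)


def title_via_dom(text: str) -> str:
    """DOM: build the whole tree first, then navigate it."""
    document = minidom.parseString(text)
    title = document.getElementsByTagName("title")[0]
    return title.firstChild.data


class TagCollector(xml.sax.ContentHandler):
    """SAX: receive an event for each start tag as the parser reads along."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def tags_via_sax(text: str) -> list:
    handler = TagCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.tags
```

DOM keeps the whole document in memory, which makes navigation easy; SAX never builds a tree, which suits large documents that only need one pass.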

Presenter notes: How do we process an XML document? We need a software program called an XML parser. You can get some useful information for XML programming by looking at this web site.

XML-related Vocabulary

SGML: Standard Generalized Markup Language
XML: Extensible Markup Language
DTD: Document Type Definition
element: a start and end tag, along with their contents
attribute: a name-value pair given in the start tag of an element
entity: a representation of a particular character or string
PI: a Processing Instruction, to possibly be used by a program that processes this XML
namespace: a unique string (typically a URI) that qualifies element and attribute names so they do not clash
well-formed XML: XML that follows the basic syntax rules
valid XML: well-formed XML that conforms to a DTD