web-dbms a quick reference

Post on 17-Dec-2014

2.415 Views

Category:

Education

1 Downloads

Preview:

Click to see full reader

DESCRIPTION

 

TRANSCRIPT

Web Technology & DBMS– A Quick Reference

2

Chapter - Objectives

• Basics of Internet, Web, HTTP, HTML, URLs.• Difference between two-tier and three-tier client-server

architecture.• Advantages and disadvantages of Web as a database platform.• Approaches for integrating databases into Web:

• The Common Gateway Interface (CGI).• Server-Side Includes.• HTTP Cookies.

3

Chapter - Objectives

• Extending the Web Server.• Java and JDBC, JSQL and JRB.• Scripting Languages (JavaScript and VBScript).• Microsoft Active Platform: Active Server Pages (ASP) and Active Data

Objects (ADO).• Oracle Network Computing Architecture.

• Security problems in Web environment.• Proposed HTTP/1.1 and XML.

4

Introduction

• World-Wide Web (Web, WWW, or W3) possibly most popular and powerful networked information system to date.

• As architecture of Web was designed to be platform-independent, can significantly lower deployment and training costs.

• Organizations using Web as strategic platform for innovative business solutions, in effect becoming Web-centric.

5

Introduction

• Many Web sites today are file-based where each Web document is stored in separate file.

• For large sites, this can lead to significant management problems.• Also many Web sites now contain more dynamic information,

such as product and pricing data. • Maintaining such data in both a database and in separate HTML

files is problematic. • Accessing database directly from Web would be a better

approach.

6

Internet

Worldwide collection of interconnected networks.

• Began in late ‘60s in ARPANET, a US DOD project, investigating how to build networks that could withstand partial outages.

• Starting with a few nodes, Internet estimated to have over 100 million users in 1997, and over 270 million users in over 100 countries in 1998, with one million new users joining each month.

• May be 199 million users of Web by year 2000.

7

Intranet and Extranet

• Intranet - A Web site or group of sites belonging to an organization, accessible only by the members of an organization.

• Extranet - An intranet that is partially accessible to authorized outsiders.

• Whereas intranet resides behind firewall and is accessible only to people who are members of same organization, extranet provides various levels of accessibility to outsiders.

8

The Web

Hypermedia-based system that provides a simple ‘point and click’ means of browsing information on the Internet using hyperlinks.

• Information presented on Web pages, which can contain text, graphics, pictures, sound, and video.

• Can also contain hyperlinks to other Web pages, which allow users to navigate in a non-sequential way through information.

• Web documents written using HTML.

9

The Web

• Web consists of network of computers that can act in two roles: • as servers, providing information; • as clients (browsers), requesting information.

• Protocol that governs exchange of information between Web server and browser is HTTP and locations within documents identified as a URL.

• Much of Web’s success is due to its simplicity and platform-independence.

10

Basic Components of Web Environment

11

HyperText Transfer Protocol (HTTP)

Protocol used to transfer Web pages through Internet.

• Based on request-response paradigm: Connection - Client establishes connection with Web

server.

Request - Client sends request to Web server.Response - Web server sends response (HTML

document) to client.

Close - Connection closed by Web server.

12

HyperText Transfer Protocol (HTTP)

• HTTP/1.0 is stateless protocol - each connection is closed once server provides response.

• This makes it difficult to support concept of a session that is essential to basic DBMS transactions.

13

HyperText Markup Language (HTML)

Document formatting language used to design most Web pages.

• A simple, yet powerful, platform-independent document language.

• HTML is an application of Standardized Generalized Markup Language (SGML), a system for defining structured document types and markup languages to represent instances of those document types.

14

HyperText Markup Language (HTML)

15

HyperText Markup Language (HTML)

16

Uniform Resource Locators (URLs)

String of alphanumeric characters that represents location or address of a resource on Internet and how that resource should be accessed.

• Defines uniquely where documents (resources) can be found. • Uniform Resource Identifiers (URIs) - generic set of all Internet

resource names/addresses. • Uniform Resource Names (URNs) - persistent, location-

independent name. Relies on name lookup services.

17

Uniform Resource Locators (URLs)

• URL consists of three basic parts: • protocol used for the connection, • host name, • path name on host where resource stored.

• Can optionally specify:• port through which connection to host should be made,• query string.

http://www.w3.org/WWW/MarkUp.html

18

Static and Dynamic Web Pages

• HTML document stored in file is static Web page.• Content of dynamic Web page is generated each time it is

accessed. • Thus, dynamic Web page can:

• respond to user input from browser. • be customized by and for each user.

• Requires hypertext to be generated by servers. • Need scripts that perform conversions from different data

formats into HTML ‘on-the-fly’.

19

Requirements for Web-DBMS Integration

• Ability to access valuable corporate data in a secure manner.• Data and vendor independent connectivity to allow freedom of

choice in DBMS selection.• Ability to interface to database independent of any proprietary

browser or Web server.• Connectivity solution that takes advantage of all the features of

an organization’s DBMS.

20

Requirements for Web-DBMS Integration

• Open-architecture to allow interoperability with a variety of systems and technologies. For example: • different Web servers; • Microsoft's (Distributed) Common Object Model (DCOM/COM);• CORBA/IIOP (Internet Inter-ORB protocol);• Java/Remote Method Invocation.

21

Requirements for Web-DBMS Integration

• Cost-effective solution that allows for scalability, growth, and changes in strategic directions, and helps reduce applications development costs.

• Support for transactions that span multiple HTTP requests.• Support for session- and application-based authentication.• Acceptable performance.

22

Requirements for Web-DBMS Integration

• Minimal administration overhead.• Set of high-level productivity tools to allow applications to be

developed, maintained, and deployed with relative ease and speed.

23

Two-Tier Client-Server Architecture

24

Three-Tier Client-Server Architecture

• Client side presented two problems preventing true scalability:• ‘Fat’ client, requiring considerable resources on client’s computer to

run effectively.• Significant client side administration overhead.

• By 1995, three layers proposed, each potentially running on a different platform.

25

Three-Tier Client-Server Architecture

• Advantages: • ‘Thin’ client, requiring less expensive hardware.• Application maintenance centralized. • Easier to modify or replace one tier without affecting others. • Separating business logic from database functions makes it easier to

implement load balancing. • Maps quite naturally to Web environment.

26

Three-Tier Client-Server Architecture

27

Advantages of Web-DBMS Approach

• DBMS advantages• Simplicity• Platform independence• Graphical User Interface• Standardization• Cross-platform support• Transparent network access• Scalable deployment• Innovation

28

Disadvantages of Web-DBMS Approach

• Reliability• Security• Cost• Scalability• Limited functionality of HTML• Statelessness• Bandwidth• Performance• Immaturity of development tools

29

Approaches to Integrating Web and DBMSs

• Common Gateway Interface (CGI).• Server-Side Includes.• HTTP Cookies.• Extending the Web Server.• Java and JDBC, JSQL and JRB.• Scripting Languages (JavaScript and VBScript).• Microsoft Active Platform: ASP and ADO.• Oracle Network Computing Architecture.

30

Common Gateway Interface (CGI)

Specification for transferring information between a Web server and a CGI program.

• Server only intelligent enough to send documents and to tell browser what kind of document it is.

• But server also knows how to launch other programs. • When server sees that URL points to a program (script), it

executes script and sends back script’s output to browser as if it were a file.

31

CGI - Environment

32

Common Gateway Interface (CGI)

• CGI defines how scripts communicate with Web servers. • A CGI script is any script designed to accept and return data that

conforms to the CGI specification. • Before server launches script, prepares number of environment

variables representing current state of the server, who is requesting the information, and so on.

• Script picks this up and reads STDIN.

33

Common Gateway Interface (CGI)

• Then performs necessary processing and writes its output to STDOUT.

• Script responsible for sending MIME header, which allows browser to differentiate between components.

• CGI scripts can be written in almost any language, provided it supports reading and writing of an operating system’s environment variables.

34

Common Gateway Interface (CGI)

• Four primary methods for passing information from browser to a CGI script:

• Passing parameters on the command line.• Passing environment variables to CGI programs.• Passing data to CGI programs via standard input.• Using extra path information.

35

CGI - Passing Parameters on Command Line

36

CGI - Advantages

• CGI is the de facto standard for interfacing Web servers with external applications.

• Possibly most commonly used method for interfacing Web applications to data sources.

• Advantages:• simplicity, • language independence, • Web server independence, • wide acceptance.

37

CGI - Disadvantages

• Communication between client and database server must always go through Web server.

• Lack of efficiency and transaction support, and difficulty validating user input inherited from statelessness of HTTP protocol.

• HTTP never intended for long exchanges or interactivity. • Server has to generate a new process or thread for each CGI script. • Security.

38

Server-Side Includes (SSI)

• Allows a program to be executed, like CGI, and to incorporate its output into the document.

• Generally, end result is a text document.• SSI is not governed by an Internet RFC or other standard. Each

server vendor is free to implement SSI on an ad-hoc basis, if at all.

• Most servers follow NCSA’s specification.• All SSI commands are embedded within regular HTML

comments, making the HTML portable. • Security risks of SSI are similar to those of CGI.

39

HTTP Cookies

• Cookies can make CGI scripts more interactive. • Cookies are small text files stored on Web client. • CGI script creates cookie and has Web server send it to client’s

browser to store on hard disk. • Later, when client revisits Web site and uses a CGI script that

requests this cookie, client’s browser sends information stored in the cookie.

• However, not all browsers support cookies.

40

Extending the Web Server

• To overcome limitations of CGI, many servers provide an API that adds functionality to server.

• Two of main APIs are Netscape’s NSAPI and Microsoft’s ISAPI. • Scripts are loaded in as part of the server, giving back-end

applications full access to all the I/O functions of server. • One copy of application is loaded and shared between multiple

requests to server.

41

Extending the Web Server

• Approach more complex than CGI, possibly requiring specialized programmers.

• Can provide very flexible and powerful solution. • API extensions can provide same functionality as a CGI program,

but as API runs as part of the server, API approach can perform significantly better than CGI.

• Extending Web server is potentially dangerous, since server executable is being changed.

42

Comparison of CGI and API

• CGI and API both extend capabilities of server. • CGI scripts run in environment created by Web server program.• Scripts only execute once Web server interprets request from

browser, then returns results back to the server. • API approach not nearly so limited in its ability to communicate. • API-based extensions are loaded into same address space as Web

server.

43

Java

• Proprietary language developed by Sun and currently marketed by JavaSoft.

• Originally intended to support environment of networked machines and embedded systems.

• Now, Java is rapidly becoming de facto language for Web computing.

• Interesting because of its potential for building Web applications (applets) and server applications (servlets).

44

Java

• Java is ‘a simple, object-oriented, distributed, interpreted, robust, secure, architecture neutral, portable, high-performance, multi-threaded and dynamic language’.

• Has a machine-independent target architecture, the Java Virtual Machine (JVM).

• Since almost every Web browser vendor has already licensed Java and implemented an embedded JVM, Java applications can currently be deployed on most end-user platforms.

45

Java

46

Java

• Before Java application can be executed, it must first be loaded into memory.

• Done by Class Loader, which takes ‘.class’ file(s) containing bytecodes and transfers it into memory.

• Class file can be loaded from local hard drive or downloaded from network.

• Finally, bytecodes must be verified to ensure that they are valid do not violate Java’s security restrictions.

47

Java

• Loosely speaking Java is a ‘safe’ C++. • Safety features include strong static type checking, automatic

garbage collection, and absence of machine pointers at language level.

• Safety is central design goal: ability to safely transmit Java code across Internet.

• Security is also integral part of Java’s design - sandbox ensures untrusted application cannot gain access to system resources.

48

JDBC

• Modeled after ODBC, JDBC API supports basic SQL functionality. • With JDBC, Java can be used as host language for writing database

applications. • On top of JDBC, higher-level APIs can be built. • Currently, two types of higher-level APIs:

• An embedded SQL for Java. • A direct mapping of relational database tables to Java classes.

49

JDBC

• JDBC API consists of two main interfaces: an API for application writers, and a lower-level driver API for driver writers.

• Applications and applets can access databases using:• JDBC API with pure Java JDBC drivers, • ODBC drivers and existing database client libraries.

50

JDBC - Advantages/Disadvantages

• Advantage of using ODBC drivers is that they are a de facto standard for PC database access, and are available for many DBMSs, for very low price.

• Disadvantages with this approach:• Non-pure JDBC driver will not necessarily work with a Web browser.• Currently downloaded applet can connect only to database located

on host machine.• Deployment costs increase.

51

JDBC

52

JSQL

• Another JDBC-based approach uses Java with static embedded SQL.

• JSQL comprises a set of clauses that extend Java to include SQL constructs as statements and expressions.

• JSQL translator transforms JSQL clauses into standard Java code that accesses database through a CLI.

53

Java Relational Binding (JRB)

• Middleware product that bridges from Java to RDBMSs. • Provides orthogonal persistence through three-stage process:

• database creation, • an import program, • JRB API.

54

Java Relational Binding (JRB)

55

Java Relational Binding (JRB)

• API is set of public classes used by programmer, which includes methods to connect to database server, open database, start/end transactions, create/update/read objects.

• JRB relies on security and integrity of underlying RDBMS:• references between objects implemented as FKs; • OIDs modeled as system generated PKs.

56

Java Relational Binding (JRB)

• JRB also provides notion of class extents. • Can associate predicate with class extent and retrieve objects

based on their contents (in a select-from-where style).• The runtime system is implemented on top of a JDBC-compliant

interface layer.

57

Scripting Languages (JavaScript and VBScript)

• Scripting languages can be used to extend browser and Web server with database functionality.

• As script code is embedded in HTML, it is downloaded every time page is accessed.

• Updating browser is simply a matter of changing Web document on server.

• Two popular scripting languages are: JavaScript and VBScript. • Both are interpreted languages, not compiled, making it easy to

create small applications.

58

Server-Side JavaScript for Database Access

59

Microsoft Active Platform

• Microsoft Active Platform is an ‘open, standards-based software architecture for delivering applications over the Internet and intranets’.

• Contains HTML, scripting languages, and components (Java, ActiveX).

• On client machine called an Active Desktop.• On Web server called an Active Server. • Active Platform is encompassing term given to these related

technologies.

60

Object Linking and Embedding for DataBases (OLE DB)

• Microsoft has defined set of data objects, collectively known as OLE DB.

• Allows OLE-oriented applications to share and manipulate sets of data as objects.

• OLE DB is an object-oriented specification based on C++ API. • Components can be treated as data consumers and data

providers. Consumers take data from OLE DB interfaces and providers expose OLE DB interfaces.

61

OLE DB

62

Active Server Pages (ASP)

• ASP is programming model that allows dynamic, interactive Web pages to be created on server.

• ASP provides flexibility of CGI, without performance overhead discussed previously.

• ASP runs in-process with the server, and is optimized to handle large volume of users.

• When an ‘.asp’ file is requested, Web server calls ASP, which reads requested file, executes any commands, and sends generated HTML page back to browser.

63

Active Server Pages (ASP)

64

Active Data Objects (ADO)

• Programming extension of ASP supported by Microsoft IIS for database connectivity.

• Supports following key features:• Independently-created objects.• Support for stored procedures.• Support for different cursor types.• Batch updating.• Support for limits on number of returned rows.

• Designed as an easy-to-use interface to OLE DB.

65

Microsoft Internet Database Connector (IDC)

• Similar approach to ASP, again specific to Microsoft Internet Information Server.

• IDC is an ISAPI that reads an ‘.idc’ file that contains SQL commands.

• IDC communicates with a DBMS’s ODBC driver to retrieve necessary data from database and format it using information in an ‘.htx’ file.

66

Microsoft Internet Database Connector (IDC)

67

Oracle Network Computing Architecture(NCA)

• NCA aimed at providing extensibility for distributed environments.

• It is three-tier architecture based on industry standards such as:• OMG’s CORBA 2.0 technology.• HTTP and HTML for Web enablement.• IIOP for object interoperability. • OMG’s IDL for language neutral interfaces.

68

Oracle NCA

69

Oracle NCA

70

Security

• All Internet traffic travels ‘in the clear’ and anyone who monitors traffic can read it.

• Need to ensure with communication that:• It is inaccessible to anyone but sender and receiver (privacy).• It has not been changed during transmission (integrity).• Receiver can be sure it came from sender (authenticity).

71

Security

• Sender can be sure receiver is genuine (non-fabrication).• Sender cannot deny he or she sent it (non-repudiation).

• Must also protect information once it has reached Web server.

72

Security

• Download may have executable content, which can perform following malicious actions:• Corrupt data or execution state of programs.• Reformat complete disks.• Perform a total system shutdown.• Collect and download confidential data.• Usurp identity and impersonate user.• Lock up resources.• Cause non-fatal but unwelcome effects.

73

Security

• Look at:

• Proxy Servers. - Firewalls• Message Digest - Digital

Algorithms Signatures.• Digital Certificates. - Kerberos.• SSL and S-HTTP. - SET and SST.• Java Security. - ActiveX Security.

74

Proxy Servers

• Proxy server is computer that sits between browser and Web server.

• It intercepts all requests to Web server to try to fulfil requests itself.

• Has two main purposes: • improve performance • filter requests.

75

Firewalls

• Designed to prevent unauthorized access to/from a private network.

• Can be implemented in both hardware and software, or a combination of both.

• Several types of firewall techniques: • Packet filter. • Application gateway.• Circuit-level gateway. • Proxy server.

76

Message Digest Algorithms

• Message digest algorithm takes an arbitrary-sized string (message) and generates fixed-length string (digest or hash).

• A digest has following characteristics:• It should be computationally infeasible to find another message that

will generate same digest.• Digest does not reveal anything about message.

77

Digital Signatures

• Digital signature consists of two parts: • string of bits computed from data being ‘signed’ • private key of individual or organization wishing the signature.

• Can be used to verify data comes from this individual or organization.

78

Digital Signatures

• Digital signature has many useful properties:• Authenticity can be verified, using public key.• Cannot be forged (assuming private key is kept secret). • Function of data signed and cannot be claimed to be signature for

any other data.• Signed data cannot be changed or signature will no longer verify

data as being authentic.

79

Digital Certificates

• Attachment to electronic message used for security purposes (e.g. verify user sending message), and provide receiver with means to encode reply.

• Sender applies for certificate from Certificate Authority (CA). • CA issues encrypted certificate containing applicant’s public key

and other identification information.

80

Digital Certificates

• CA makes its own public key readily available. • Recipient uses CA’s public key to decode certificate attached to

message, verifies it as issued by CA, and obtains sender’s public key and identification information held within certificate.

• With this information, recipient can send an encrypted reply. • CA’s role is critical, acting as go-between in relationship between

two parties.

81

Kerberos

• A server of secured user names and passwords.• Provides one centralized security server for all data and resources

on network. • Database access, login, authorization control, and other security

features are centralized on trusted Kerberos servers. • Has similar function to that of Certificate server: to identify and

validate a user.

82

Secure Sockets Layer (SSL)

• Encryption protocol for transmitting private documents • Designed to prevent eavesdropping, tampering, and message

forgery. • Works by using private key to encrypt data that is transferred over

SSL connection. • Layered between application-level protocols such as HTTP and

TCP/IP transport-level protocol.• Thus, may be used for other application-level protocols such as

FTP and NNTP.

83

Secure-HTTP (S-HTTP)

• Protocol for securely transmitting individual messages over Web.• Both SSL and S-HTTP use techniques such as encryption, and

digital signatures, and:• Allow browsers and servers to authenticate each other.• Allow controlled access to Web site. • Ensure data exchanged between browser and server is secure and

reliable.

84

Secure Electronic Transactions

• Open, interoperable standard for processing credit card transactions over Internet, in simple and secure way.

• Transaction is split in such a way that merchant has access to information about:• what is being purchased, • how much it costs, • whether payment is approved,

• but no information on what payment method customer is using.

85

SET

• Card issuer (e.g. Visa) has access to purchase price, but no information on type of merchandise involved.

• Certificates are heavily used by SET, both for certifying cardholder and for certifying that merchant has relationship with financial institution.

86

SET

87

Secure Transaction Technology (SST)

• Protocol designed to handle secure bank payments over Internet. • Uses DES encryption of information, RSA encryption of bankcard

information, and strong authentication of all parties involved in transaction.

88

Java Security

• Sandbox ensures untrusted application cannot gain access to system resources.

• Involves three components: • class loader, • bytecode verifier, • security manager.

• Safety features provided by language and JVM, and enforced by compiler and runtime system.

• Security is a policy built on top of safety layer.

89

Class Loader

• Allocates a (hierarchically structured) namespace for each class.• Never allows class from ‘less protected’ namespace to replace

class from more protected namespace. • Thus, I/O primitives, defined in local Java class, cannot be invoked

or overridden by classes from out with local machine. • As browsers and Java applications can provide their own class

loader, this may be viewed as weakness in security.

90

Bytecode Verifier

• JVM verifies bytecode instructions before allowing application/applet to run

• Typical checks include verifying:• Compiled code is correctly formatted.• Internal stacks will not overflow/underflow.• No ‘illegal’ data conversions will occur.• Bytecode instructions are appropriately typed.• All class member accesses are valid.

91

Security Manager

• Each Java application defines and implements its own security policy.

• A Java-enabled browser contains its own Security Manager, and any applets it downloads are subject to its policies.

• Generally, downloaded applets are prevented from:• Reading and writing files on client’s file system. • Making network connections to machines other than host. • Starting other programs on the client.

92

Security Manager

• Loading libraries.• Defining method calls.

• These restrictions apply to applets downloaded over Internet/intranet.

• Also do not apply to applets on client’s local disk and in directory on CLASSPATH.

• Local applets are loaded by file system loader and can read and write files, exit JVM, and are not passed through bytecode verifier.

93

ActiveX Security

• ActiveX security model places no restrictions on what a control can do.

• Instead, each ActiveX control can be digitally signed by its author using system called Authenticode™.

• Digital signatures are then certified by CA. • This security model places responsibility for the computer’s

security on the user.

94

HTTP/1.1

• Number of new features added. Look at two:• Persistent connections become default behavior. While open, client

can send synchronous or asynchronous messages, and server can respond to them in order.

• Digest authentication provided as replacement for basic authentication. Password remains secret between client and server. Client and server compute digest value using the MD5 and digest is sent across network.

95

XML (eXtensible Markup Language)

• XML is new standard that could preserve general application independence that makes HTML portable and powerful.

• Pared-down version of SGML, designed especially for Web documents.

• Designers can create their own customized tags to provide functionality not available with HTML.

96

XML (eXtensible Markup Language)

• SGML allows document to be logically separated into two: • Document Type Definition (DTD)• other containing the text itself.

• Useful features include:• Database Schema Definition. • Linking to relative objects or elements. • Support for bi-directional links.

• Simplicity may be lost with move to XML.

top related