document types


Upload: raya-anderson

Post on 30-Dec-2015



DOCUMENT TYPES

Digital Documents

Converting documents to an electronic format will preserve those documents, but how would such a process be organized?

And then, how could the electronic documents be distributed?

Building a digital library for books and articles by:

Digitizing books and articles

Storing them in an indexed database
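
The storage step can be sketched in a few lines. This is only an illustration, assuming an in-memory SQLite database; the table name, column names, index name and sample records are all hypothetical and do not come from the slides.

```python
import sqlite3


def build_library(records):
    """Store (title, author, body) records in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT, body TEXT)"
    )
    # The index lets lookups by title avoid scanning every row.
    conn.execute("CREATE INDEX idx_documents_title ON documents (title)")
    conn.executemany(
        "INSERT INTO documents (title, author, body) VALUES (?, ?, ?)", records
    )
    conn.commit()
    return conn


def find_by_title(conn, title):
    """Return (author, body) pairs for every document with the given title."""
    rows = conn.execute(
        "SELECT author, body FROM documents WHERE title = ?", (title,)
    )
    return rows.fetchall()


# Hypothetical digitized articles.
library = build_library([
    ("On Mark-up", "A. Author", "Digitized text of the first article."),
    ("Page Formats", "B. Writer", "Digitized text of the second article."),
])
print(find_by_title(library, "Page Formats"))
```

A real digital library would probably also need full-text search and file storage for page images, which this sketch leaves out.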

Mark-ups

Mark-up is everything in a document that is not content.

Procedural mark-up

Procedural mark-up consists of codes that contain information on how a specific application should process the document (example of a procedural mark-up format: Microsoft Word).

Presentational mark-up

Presentational mark-up consists of codes that describe how the document should be presented or laid out, either on a computer screen or on a printed page (example of a presentational mark-up language: HTML).

Descriptive mark-up

Descriptive mark-up consists of codes that describe the logical structure of the document, so that it can be processed by many different software applications (example of a descriptive mark-up meta-language: XML).

Competition

documents

Microsoft Word

Rich Text Format

Templates

To reduce the time needed to create documents of the same type or class, such as memos, letters, technical reports, research articles and invoices, document templates can help.

A template contains the style sheet that will be used to format this type of document, and a framework with elements such as a standard front page, headers and footers, a standard set of sections and headings, etc.
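
The idea of a reusable framework can be illustrated with plain text. The sketch below assumes Python's standard `string.Template`; the memo layout and the field names are hypothetical, and a word-processor template would also carry styles, which are not modeled here.

```python
from string import Template

# Hypothetical memo framework: fixed structure, variable fields.
MEMO_TEMPLATE = Template(
    "MEMO\n"
    "To: $recipient\n"
    "From: $sender\n"
    "Subject: $subject\n"
    "\n"
    "$body\n"
)


def make_memo(recipient, sender, subject, body):
    """Fill the memo framework with the content of one specific memo."""
    return MEMO_TEMPLATE.substitute(
        recipient=recipient, sender=sender, subject=subject, body=body
    )


memo = make_memo(
    "All staff",
    "Records Office",
    "Scanning schedule",
    "Scanning of the archive starts on Monday.",
)
print(memo)
```

Every memo produced this way has the same structure, so only the content has to be written each time.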

Word processing software uses the most common form of procedural mark-up.

A word processing format, such as Word, is useful when you have to create or edit a document.

The mark-up in a word processor serves to specify how the document should be laid out when printed, and to control the functions of the word processing application.

Using a word processor such as Microsoft Word, you can set the style sheet, apply templates and create a visual structure for your document.

Microsoft Word uses a proprietary, binary format: this causes problems in terms of standardization.

To resolve these problems, Microsoft created another procedural format, RTF, a plain text format used as an exchange format between word processing applications.
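
Because RTF is plain text, a small document can be written without a word processor. The following is a minimal sketch, not a complete RTF writer: the function name is hypothetical, the font choice is arbitrary, and only ASCII text is handled.

```python
def to_rtf(paragraphs):
    """Return a minimal RTF document containing the given ASCII paragraphs."""

    def escape(text):
        # Backslash and braces are RTF control characters and must be escaped.
        return (
            text.replace("\\", "\\\\")
            .replace("{", "\\{")
            .replace("}", "\\}")
        )

    # \par ends a paragraph; \f0 selects the first font; \fs24 means 12 points.
    body = "".join(escape(p) + "\\par\n" for p in paragraphs)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n"
    return header + "\\f0\\fs24 " + body + "}"


rtf = to_rtf(["First paragraph.", "Braces {like these} are escaped."])
print(rtf)
```

The control words (`\par`, `\fs24` and so on) tell the application how to process the text, which is what makes RTF a procedural format.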

HTML stands for Hypertext Markup Language. It is a language that can be transferred around the Internet and read by a Web browser.

Simple HTML documents can be created easily using any text editor.

All content is defined by the markup "tags" of HTML, which are containers for whatever you put in the document.
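
To make the "container" idea concrete, the sketch below parses a small page with Python's standard `html.parser` module and prints each tag indented by how deeply it is nested. The sample page and the `TagOutline` class are hypothetical, and the depth counting assumes every element in the page has an explicit end tag.

```python
from html.parser import HTMLParser

# Hypothetical page, the kind that can be typed in any text editor.
PAGE = """<html>
  <head><title>Document types</title></head>
  <body>
    <h1>Mark-up</h1>
    <p>Read the <a href="types.html">overview</a>.</p>
  </body>
</html>"""


class TagOutline(HTMLParser):
    """Record each opening tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


parser = TagOutline()
parser.feed(PAGE)
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

The printed outline shows `a` inside `p` inside `body` inside `html`: each tag contains whatever is placed between its start and end.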

Using HTML you can define the basic presentation of a document (headings, paragraphs, lists and tables), hyperlinks and multimedia information.

Conversions between formats: from Word (doc) to HTML/PDF, from Word (doc) to XML, and from XML to HTML/PDF.

Different renditions of the same document suit different purposes:

a rendition in a word processing format, such as Microsoft Word, is useful when creating or editing the document,

an HTML rendition is useful when viewing it on the Web, and

a page rendition as a bitmap graphic or in PDF format may be useful when a read-only page layout view is required.

Conversion can be carried out:

manually, when a person creates the rendition by re-keying the document content and inserting the necessary mark-up;

using one or more computer programs that automatically convert the document from one format to another.

Microsoft Word is often chosen as the original document creation application.

However, many organizations are beginning to use XML to hold the source documents because it is easy to transform to other renditions; moreover, its mark-up captures the logical meaning of the content, and it is an open standard, well defined by public specifications.

There are a number of tools available on the market which can plug in to Word to help make the transformation to XML.

They generally use Word styles to make the transformation and rely on users of the word processor applying Word styles in a consistent manner.

In this case it is necessary that users have created Word documents using styles and templates correctly. If not, it is quite difficult to make a fully automated transformation from Word to XML.
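
The dependence on consistent styles can be sketched as follows. This is not how any particular Word plug-in works: the paragraphs are given as hypothetical (style name, text) pairs instead of being read from a Word file, and the style-to-element mapping is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from paragraph style names to XML element names.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "heading",
    "Normal": "para",
}


def styled_paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs to an XML tree and list unmapped styles."""
    root = ET.Element("document")
    unmapped = []
    for style, text in paragraphs:
        name = STYLE_TO_ELEMENT.get(style)
        if name is None:
            # A style outside the template cannot be converted automatically.
            unmapped.append(style)
            continue
        ET.SubElement(root, name).text = text
    return root, unmapped


root, unmapped = styled_paragraphs_to_xml([
    ("Title", "Document Types"),
    ("Heading 1", "Mark-up"),
    ("Normal", "Mark-up is everything that is not content."),
    ("My Big Bold", "A manually formatted heading."),
])
print(ET.tostring(root, encoding="unicode"))
print("Needs manual review:", unmapped)
```

The paragraph with the ad hoc style "My Big Bold" is the case the text warns about: the converter cannot tell that it was meant as a heading, so a person has to step in.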

One of the great advantages of XML is that it is very easy to transform XML mark-up to another format. The Extensible Stylesheet Language Transformations (XSLT) offers a standard way to transform XML and there are many XSLT transformation processors available, both as open source and as commercial products.
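
The kind of mapping such a transformation performs can be shown without an XSLT processor. Note that the sketch below is not XSLT: it is a hand-written stand-in using Python's standard `xml.etree.ElementTree`, and the source document and the element mapping are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document in descriptive mark-up.
SOURCE = """<document>
  <title>Document Types</title>
  <para>Mark-up is everything that is not content.</para>
  <para>XML is a descriptive mark-up meta-language.</para>
</document>"""

# Hypothetical rules: source element name -> HTML element name.
RULES = {"title": "h1", "para": "p"}


def transform_to_html(xml_text):
    """Build an HTML rendition by mapping each source element to an HTML one."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for child in source:
        target = RULES.get(child.tag)
        if target is not None:
            ET.SubElement(body, target).text = child.text
    return ET.tostring(html, encoding="unicode")


html_text = transform_to_html(SOURCE)
print(html_text)
```

An XSLT style sheet expresses rules like `RULES` declaratively as templates, so the same source can be given several renditions by swapping the style sheet.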

There is also a standard way to transform XML into page-formatted renditions such as PDF, PostScript or RTF: XSL-FO.

XSL-FO (XSL Formatting Objects) is a set of XML elements that represent objects such as pages, text blocks, tables, lists, footnotes, etc.
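
A small XSL-FO document makes these objects visible. The sketch below builds one with Python's standard `xml.etree.ElementTree`; the page size, margin, master name and helper names are assumptions chosen for illustration, and a separate FO processor (Apache FOP is one example) would be needed to turn the result into PDF.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(name):
    """Return the namespace-qualified name of a formatting object."""
    return "{%s}%s" % (FO_NS, name)


def build_fo_document(paragraphs):
    """Build a one-column A4 XSL-FO tree with one text block per paragraph."""
    root = ET.Element(fo("root"))
    # The layout master set describes the page geometry.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4",
        "page-height": "29.7cm",
        "page-width": "21cm",
    })
    ET.SubElement(page, fo("region-body"), {"margin": "2cm"})
    # The page sequence holds the content that flows onto those pages.
    sequence = ET.SubElement(
        root, fo("page-sequence"), {"master-reference": "A4"}
    )
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    for text in paragraphs:
        ET.SubElement(flow, fo("block")).text = text
    return root


fo_root = build_fo_document(["First text block.", "Second text block."])
print(ET.tostring(fo_root, encoding="unicode"))
```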

GIF, JPG, PNG

The photograph or scanned image is sampled and mapped as a grid of dots or picture elements (pixels).
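
Sampling onto a pixel grid can be sketched as below. The "scene" is a hypothetical brightness function standing in for a photograph, and the output is written as plain-text PGM, a simple grayscale bitmap format chosen here only because it needs no library; it is not one of the formats named on the slide.

```python
def sample_to_bitmap(brightness, width, height):
    """Sample a brightness function over the unit square into a pixel grid.

    brightness(x, y) returns a value from 0.0 (black) to 1.0 (white);
    each pixel stores the sampled value scaled to the range 0-255.
    """
    grid = []
    for row in range(height):
        y = (row + 0.5) / height  # sample at the center of each pixel
        line = []
        for col in range(width):
            x = (col + 0.5) / width
            line.append(round(brightness(x, y) * 255))
        grid.append(line)
    return grid


def to_pgm(grid):
    """Serialize the grid as a plain-text PGM (portable graymap) image."""
    height, width = len(grid), len(grid[0])
    rows = [" ".join(str(value) for value in line) for line in grid]
    return "P2\n%d %d\n255\n%s\n" % (width, height, "\n".join(rows))


# Hypothetical scene: brightness increases from left to right.
bitmap = sample_to_bitmap(lambda x, y: x, width=4, height=2)
print(to_pgm(bitmap))
```

A finer grid (larger `width` and `height`) captures more detail at the cost of a larger file, which is the basic trade-off in choosing a scanning resolution.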


PDF (Portable Document Format) is a procedural mark-up language that allows page-formatted documents to be viewed and printed in their original format on almost any software platform.

PDF is an ideal format for scientific documents that contain unusual symbols, and for multilingual documents.

The compression and incremental loading features of PDF make it well suited for transmission of documents over the Internet.

Many software packages can be used to create PDF documents, and PDF viewers are available free of charge.

A PDF document contains a set of pages which are described by three main object types: path objects, image objects and text objects.

Embedded TIFFs are PDF documents where the entire pages are TIFF images.

XML, born as a profile of SGML, is an open standard for descriptive mark-up, used as an exchange format between applications.

An XML document is well formed if it follows the basic rules of XML syntax.
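
Well-formedness can be checked by simply trying to parse the document. The sketch below uses Python's standard `xml.etree.ElementTree`, whose parser rejects input that breaks the syntax rules; the function name and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<memo><to>Staff</to></memo>"))  # properly nested
print(is_well_formed("<memo><to>Staff</memo></to>"))  # overlapping tags
print(is_well_formed("<memo><to>Staff</to>"))         # missing end tag
```

Only the first document is well formed: every start tag has a matching end tag and the elements nest without overlapping.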

A Document Type Definition (DTD) and XML Schema are sets of rules which specify the logical structure that is allowable for a particular type of document.

An XML document is valid if it complies with the rules set out in a DTD or XML Schema with which it is associated.
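
The difference between well formed and valid can be illustrated with a toy rule set. The sketch below is not a DTD or XML Schema validator: it is a hand-rolled check in which each element name maps to the exact sequence of child elements it must contain, and the "memo" document type is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a "memo" document type: each element name maps to
# the exact sequence of child elements it must contain.
MEMO_RULES = {
    "memo": ["to", "from", "body"],
    "to": [],
    "from": [],
    "body": [],
}


def is_valid(xml_text, rules, root_name):
    """Return True if the document is well formed and follows the rules."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well formed
    if root.tag != root_name:
        return False

    def check(element):
        expected = rules.get(element.tag)
        if expected is None or [c.tag for c in element] != expected:
            return False
        return all(check(child) for child in element)

    return check(root)


complete = "<memo><to>Staff</to><from>Office</from><body>Hello</body></memo>"
incomplete = "<memo><to>Staff</to><body>Hello</body></memo>"
print(is_valid(complete, MEMO_RULES, "memo"))
print(is_valid(incomplete, MEMO_RULES, "memo"))
```

The second document is well formed but not valid under these rules, because the required `from` element is missing. Real DTDs and schemas can also express optional and repeated elements, attributes and data types, which this sketch does not attempt.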

A Cascading Style Sheet (CSS) is a separate style sheet which contains simple rendering instructions for an XML document.

Extensible Stylesheet Language Transformations (XSLT) is used to create style sheets which define transformations from XML to other XML or non-XML formats.