DOCUMENT TYPES

Upload: asa-perry

Post on 31-Mar-2015


TRANSCRIPT

Page 1: DOCUMENT TYPES. Digital Documents Converting documents to an electronic format will preserve those documents, but how would such a process be organized?

DOCUMENT TYPES

Page 2:

Digital Documents

Converting documents to an electronic format will preserve those documents, but how would such a process be organized?

And then, how could the electronic documents be distributed?

Building a digital library for books and articles by:

Digitizing books and articles

Storing them in an indexed database
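As a rough illustration of storing digitized documents in an indexed database, the sketch below uses Python's built-in sqlite3 module. The table layout, the column names and the sample records are assumptions made for this example only; the slides do not prescribe any particular database or schema.

```python
import sqlite3

# In-memory database for the example; a real library would use a file or a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, title TEXT, author TEXT, body TEXT)"
)
# The index lets lookups by author avoid scanning every row.
conn.execute("CREATE INDEX idx_documents_author ON documents (author)")


def add_document(title, author, body):
    """Store one digitized document and return its row id."""
    cur = conn.execute(
        "INSERT INTO documents (title, author, body) VALUES (?, ?, ?)",
        (title, author, body),
    )
    conn.commit()
    return cur.lastrowid


def titles_by_author(author):
    """Return the titles stored for one author, in alphabetical order."""
    rows = conn.execute(
        "SELECT title FROM documents WHERE author = ? ORDER BY title",
        (author,),
    )
    return [title for (title,) in rows]


# Hypothetical sample records.
add_document("Soil Survey Report", "A. Example", "Scanned text of the report")
add_document("Annual Review", "A. Example", "Scanned text of the review")
add_document("Field Notes", "B. Sample", "Scanned text of the notes")
print(titles_by_author("A. Example"))
```

The index is on the author column only because that is the lookup used here; a real library would index whatever fields its users search by.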

Page 3:

Mark-ups

Mark-up is everything in a document that is not content.

Page 4:

Procedural mark-up

Procedural mark-up consists of codes that contain information on how a specific application should process the document (example of a procedural mark-up format: Microsoft Word).

Page 5:

Presentational mark-up

Presentational mark-up consists of codes that describe how the document should be presented or laid out, either on a computer screen or on a printed page (example of a presentational mark-up language: HTML).

Page 6:

Descriptive mark-up

Descriptive mark-up consists of codes that describe the logical structure of a document so that it can be processed by many different software applications (example of a descriptive mark-up meta-language: XML).

Page 7:

Competition

Page 8:

documents

Microsoft Word

Rich Text Format

templates

Document templates can reduce the time needed to create documents of the same type or class, such as memos, letters, technical reports, research articles and invoices.

Page 9:

A template contains a style sheet that will be used to format this type of document, and a framework with elements such as a standard front page, headers and footers, a standard set of sections and headings, etc.
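The idea of a template as a fixed framework with variable content can be sketched with Python's string.Template. This is only an analogy for a word processor template: the memo layout and the placeholder names below are hypothetical, and a real template would also carry the style sheet.

```python
from string import Template

# Hypothetical memo framework: the header lines are fixed, the $-fields vary.
MEMO_TEMPLATE = Template(
    "MEMORANDUM\n"
    "To: $recipient\n"
    "From: $sender\n"
    "Subject: $subject\n"
    "\n"
    "$body\n"
)


def make_memo(recipient, sender, subject, body):
    """Fill the fixed memo framework with the content of one memo."""
    return MEMO_TEMPLATE.substitute(
        recipient=recipient, sender=sender, subject=subject, body=body
    )


print(make_memo("Staff", "Library", "Scanning schedule", "Scanning starts Monday."))
```

Every memo produced this way has the same structure, which is the time saving the slide describes.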

Page 10:

Word processing software uses the most common form of procedural mark-up.

A word processing format, such as Word, is useful when you have to create or edit a document.

The mark-up in a word processor serves to specify how the document should be laid out when printed, and to control the functions of the word processing application.

Page 11:

Using a word processor such as Microsoft Word, you can set the style sheet, apply templates and create a visual structure for your document.

Microsoft Word uses a proprietary, binary format: this causes problems in terms of standardization.

Page 12:

To resolve these problems, Microsoft created another procedural format, RTF (Rich Text Format), a plain text format used as an exchange format between word processing applications.
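Because RTF is plain text, a small RTF document can be produced with ordinary string handling. The sketch below wraps a piece of ASCII text in a minimal RTF envelope; the function name is made up for this example, the font choice is arbitrary, and non-ASCII characters, which RTF encodes separately, are not handled.

```python
def make_rtf(text):
    """Wrap plain ASCII text in a minimal RTF document (illustrative only)."""
    # Backslashes and braces are RTF control characters, so escape them first.
    escaped = text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}"
    # \f0 selects the font, \fs24 sets 12 pt (half-points), \par ends the paragraph.
    return header + "\\f0\\fs24 " + escaped + "\\par}"


print(make_rtf("Mark-up is everything in a document that is not content."))
```

The printed result contains only control words such as `\rtf1` and `\par` around the text, which is why any word processing application that understands RTF can read it.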

Page 13:

HTML stands for Hypertext Markup Language. It is a language that can be transferred around the Internet and read by a Web browser.

Page 14:

Simple HTML documents can be created easily using any text editor.

All content is defined by the mark-up "tags" of HTML, which are containers for whatever you put in the document.
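To make the "tags are containers" idea concrete, the following sketch uses Python's built-in html.parser to list each piece of text together with the tags that enclose it. The class name, the helper function and the sample page are invented for this illustration.

```python
from html.parser import HTMLParser


class TagOutline(HTMLParser):
    """Record each piece of text together with the tags that contain it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.outline = []    # (container path, text) pairs

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.outline.append(("/".join(self.open_tags), text))


def outline(html_text):
    """Return (container path, text) pairs for a small HTML document."""
    parser = TagOutline()
    parser.feed(html_text)
    parser.close()
    return parser.outline


# Hypothetical sample page.
PAGE = (
    "<html><body><h1>Document types</h1>"
    "<p>Mark-up is <b>not</b> content.</p></body></html>"
)
for path, text in outline(PAGE):
    print(path, "->", text)
```

Each printed path, such as html/body/h1, names the nested containers that hold one piece of content.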

Using HTML you can define the basic presentation of a document (headers, paragraphs, lists and tables), hyperlinks and multimedia information.

Conversions: from Word (doc) to HTML/PDF, from Word (doc) to XML, and from XML to HTML/PDF.

A rendition in a word processing format, such as Microsoft Word, is useful when creating or editing the document; an HTML rendition is useful when viewing it on the Web; and a page rendition as a bitmap graphic or in PDF format may be useful when a read-only page layout view is required.

Page 15:

Conversion can be carried out:

manually, when a person creates the rendition by re-keying the document content and inserting the necessary mark-up;

automatically, using one or more computer programs that convert the document from one format to another.

Page 16:

Microsoft Word is often chosen as the original document creation application.

Page 17:

However, many organizations are beginning to use XML to hold the source documents because it is easy to transform to other renditions; moreover, its mark-up captures the logical meaning of the content, and it is an open standard, well defined with public specifications.

There are a number of tools available on the market which can plug in to Word to help make the transformation to XML.

Page 18:

They generally use Word styles to make the transformation and rely on users of the word processor applying Word styles in a consistent manner.

In this case it is necessary that users have created Word documents using styles and templates correctly. If not, it is quite difficult to make a fully automated transformation from Word to XML.
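The dependence on consistent styles can be sketched as follows. The example assumes the paragraphs have already been extracted from a Word document as (style name, text) pairs, which is not shown here; the mapping table and the element names are hypothetical and do not describe any particular conversion tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to XML element names.
STYLE_TO_ELEMENT = {"Title": "title", "Heading 1": "heading", "Normal": "para"}


def paragraphs_to_xml(paragraphs):
    """Convert (style name, text) pairs into an XML string.

    Raises ValueError when a paragraph uses a style that has no mapping,
    which is the case where a fully automated conversion breaks down.
    """
    root = ET.Element("document")
    for style, text in paragraphs:
        if style not in STYLE_TO_ELEMENT:
            raise ValueError(f"no XML element mapped to style {style!r}")
        ET.SubElement(root, STYLE_TO_ELEMENT[style]).text = text
    return ET.tostring(root, encoding="unicode")


print(paragraphs_to_xml([
    ("Title", "Document Types"),
    ("Normal", "Mark-up is not content."),
]))
```

A paragraph formatted by hand, for example bold text in the default style instead of a heading style, would be mapped to an ordinary paragraph and its structural role would be lost.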

Page 19:

One of the great advantages of XML is that it is very easy to transform XML mark-up to another format. Extensible Stylesheet Language Transformations (XSLT) offers a standard way to transform XML, and there are many XSLT processors available, both as open source and as commercial products.
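Python's standard library does not include an XSLT processor, so the sketch below is not XSLT. It is a hand-written stand-in that shows the kind of rule-based mapping an XSLT style sheet expresses: each source element is rewritten as an HTML element. The element names and the rule table are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: which HTML tag each source element becomes.
RULES = {"document": "body", "title": "h1", "heading": "h2", "para": "p"}


def transform(node):
    """Recursively rebuild the tree, renaming elements according to RULES."""
    out = ET.Element(RULES.get(node.tag, "div"))  # unknown elements become <div>
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(transform(child))
    return out


def xml_to_html(xml_text):
    """Produce an HTML rendition of a small descriptive XML document."""
    html = ET.Element("html")
    html.append(transform(ET.fromstring(xml_text)))
    return ET.tostring(html, encoding="unicode", method="html")


SOURCE = (
    "<document><title>Document Types</title>"
    "<para>Mark-up is not content.</para></document>"
)
print(xml_to_html(SOURCE))
```

With a real XSLT processor the rule table would live in a separate style sheet, so the same XML source could be given several renditions without changing any program code.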

Page 20:

There is also a standard way to transform XML into page-formatted renditions such as PDF, PostScript or RTF: XSL-FO.

XSL-FO (XSL Formatting Objects) is a set of XML elements that represent objects such as pages, text blocks, tables, lists, footnotes, etc.

Page 21:

GIF, JPG, PNG

The photograph or scanned image is sampled and mapped as a grid of dots or picture elements (pixels).
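The sampling step can be sketched in a few lines of Python. The example samples a continuous "darkness" function on a grid and writes the result in plain PBM, a very simple text-based bitmap format used here only because it needs no image library; it stands in for GIF, JPG or PNG, which add compression and colour. The function names and the sample image are hypothetical.

```python
def sample_to_bitmap(darkness, width, height, threshold=0.5):
    """Sample an image on a width x height grid of pixels.

    darkness(x, y) takes coordinates in the range 0..1 and returns a value
    from 0 (white) to 1 (black). Each pixel is sampled at its centre.
    """
    grid = []
    for row in range(height):
        y = (row + 0.5) / height
        grid.append([
            1 if darkness((col + 0.5) / width, y) >= threshold else 0
            for col in range(width)
        ])
    return grid


def to_pbm(grid):
    """Serialize a grid of 0/1 pixels as plain PBM text (1 means black)."""
    lines = ["P1", f"{len(grid[0])} {len(grid)}"]
    lines += [" ".join(str(pixel) for pixel in row) for row in grid]
    return "\n".join(lines) + "\n"


# Hypothetical image that gets darker from left to right.
bitmap = sample_to_bitmap(lambda x, y: x, width=4, height=2)
print(to_pbm(bitmap))
```

A larger grid samples the same image more finely, which is the trade-off between resolution and file size in bitmap formats.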

Page 22:

PDF (Portable Document Format) is a procedural mark-up language that allows page-formatted documents to be viewed and printed in their original format on almost any software platform.

PDF is an ideal format for scientific documents that contain unusual symbols, and for multilingual documents.

Page 23:

The compression and incremental loading features of PDF make it well suited for transmission of documents over the Internet.

Many software packages can be used to create PDF documents, and PDF viewers are available free of charge.

Page 24:

A PDF document contains a set of pages which are described by three main object types: path objects, image objects and text objects.

Embedded TIFFs are PDF documents in which entire pages are TIFF images.

XML, born as a profile of SGML, is an open standard for descriptive mark-up, used as an exchange format between applications.

Page 25:

An XML document is well formed if it follows the basic rules of XML syntax.
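Well-formedness can be checked simply by trying to parse the document. The sketch below uses the parser in Python's standard library; the function name and the sample memo documents are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<memo><to>Staff</to></memo>"))  # tags properly nested
print(is_well_formed("<memo><to>Staff</memo></to>"))  # tags overlap
```

The first call prints True and the second prints False, because overlapping tags break the basic nesting rule of XML syntax.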

A Document Type Definition (DTD) or an XML Schema is a set of rules which specifies the logical structure that is allowable for a particular type of document.

An XML document is valid if it complies with the rules set out in the DTD or XML Schema with which it is associated.
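Python's standard library parses XML but does not validate it against a DTD or XML Schema, so the following is only a simplified stand-in that shows the difference between well formed and valid. The rule table imitates a single DTD declaration and is an assumption for this example; a real validator supports far richer rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules in the spirit of the DTD declaration
#   <!ELEMENT memo (to, from, body)>
# Each element maps to the exact sequence of child elements it must contain.
RULES = {"memo": ["to", "from", "body"], "to": [], "from": [], "body": []}


def _check(element):
    """Check one element and, recursively, its children against RULES."""
    expected = RULES.get(element.tag)
    if expected is None:
        return False  # element type not declared
    if [child.tag for child in element] != expected:
        return False  # wrong children or wrong order
    return all(_check(child) for child in element)


def is_valid(xml_text, root_name="memo"):
    """Return True if the document is well formed and follows RULES."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well formed
    return root.tag == root_name and _check(root)


GOOD = "<memo><to>Staff</to><from>Library</from><body>Hello</body></memo>"
INCOMPLETE = "<memo><to>Staff</to><from>Library</from></memo>"
print(is_valid(GOOD), is_valid(INCOMPLETE))
```

The second document is well formed but not valid, because the body element required by the rules is missing.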

Page 26:

A Cascading Style Sheet (CSS) is a separate style sheet which contains simple rendering instructions for an XML document.

Extensible Stylesheet Language Transformations (XSLT) is used to create style sheets which define transformations from XML to other XML or non-XML formats.