Post on 02-Apr-2015

Dr. Dinesh Katre
Chief Investigator, Centre of Excellence for Digital Preservation
Associate Director & HOD, Human-Centred Design & Computing Group
C-DAC, Pune, India

dinesh@cdac.in

Digital Preservation Requirements of Electronic Educational Content
(A Perspective of National Digital Preservation Programme)

Evolution and obsolescence of storage media

• Punch cards
• Punch tapes
• Selectron tubes
• Magnetic tapes
• Audio cassette
• Magnetic drum
• 8-inch floppy disk
• 5 1/4-inch floppy disk
• 3 1/2-inch floppy disk
• Compact disk
• Hard disk
• DVD

Discontinued Tools, Closed Formats and Outdated Storage Devices

Computer hardware
• Continued changes in CPU speed, memory, processing, etc.
• New hardware introducing new peripheral connections

Operating system
• Upgraded versions or a new OS do not run the old software
• Continued transition from 8-bit to 64-bit operating systems, and so on

Software
• Software upgrades do not support the former file formats
• Proprietary and closed-source software
• Discontinuation of software, lack of support

File formats
• Proprietary and closed format specification
• Change in the format specification
• Discontinuation of the required software
• Data corruption

Storage devices and media
• Continued reduction in size and cost of storage devices
• Continued increase in storage capacity and performance
• Polycarbonate media like CD and DVD have uncertain lifetimes (Cerf, 2010)
• Obsolete storage media and unavailable reading devices, e.g. 5 1/4-inch or 3 1/2-inch floppies
• New approaches like storage virtualization

Physical threats
• Improper storage environment (temperature, humidity, dust, light)
• Overuse and handling of media
• Natural disaster
• Infrastructure failure
• Human error
• Sabotage

Tangible versus non-tangible

Electronic Theses & Dissertations

Definition
Long Term Digital Preservation (LTDP) is a secure and trustworthy mechanism to ingest, process, store, manage, protect, find, access, and interpret digital information such that the same information can be used at some arbitrary point in the future in spite of obsolescence of everything: hardware, software, processes, format, people, etc.

What does “Long Term” mean?
How can we increase the likelihood that data generated in 2010 or earlier will still be accessible in useful form in 2020 and later? (Cerf, 2010)

“Data should normally be preserved and accessible for not less than 10 years for any projects, and for projects of clinical or major social, environmental or heritage importance, the data should be retained for up to 20 years” (Research Councils UK 2008:6)

An archive is expected to provide permanent, or indefinite Long Term, preservation of digital information. (OAIS, 2009)

Benefits
Digital preservation provides benefits such as legal protection, knowledge heritage for future work / future generations, trend analysis, reuse, etc.

What is Digital Preservation?

1 Terabyte = 1,000,000,000,000 bytes = 10^12 B

1 Petabyte = 1,000,000,000,000,000 bytes = 10^15 B

1 Exabyte = 1,000,000,000,000,000,000 bytes = 10^18 B

1 Zettabyte = 1,000,000,000,000,000,000,000 bytes = 10^21 B

Digital Universe / Digital Dark Age
As per the “Digital Universe Study Report” by International Data Corporation (IDC), 2010:

• Estimated size of the digital universe in 2010: 1.2 million Petabytes, or 1.2 Zettabytes

• Estimated size of the digital universe in 2020: 35 Zettabytes (as all major forms of media, i.e. voice, TV, radio and print, would have completed the journey from analog to digital)

• Estimated size of unprotected data needing protection in 2020: 18,000 Exabytes
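
The unit relationships behind these estimates can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the decimal (power-of-ten) unit definitions listed earlier, and the function and variable names are hypothetical, not part of the IDC report.

```python
from fractions import Fraction

# Decimal (SI) byte units, as defined on the earlier slide.
PETABYTE = 10**15
EXABYTE = 10**18
ZETTABYTE = 10**21


def to_zettabytes(amount, unit_in_bytes):
    """Express `amount` of the given unit in zettabytes, as an exact fraction."""
    return Fraction(amount * unit_in_bytes, ZETTABYTE)


# Figures quoted from the IDC estimates above.
universe_2010 = to_zettabytes(1_200_000, PETABYTE)  # 1.2 million PB
universe_2020 = Fraction(35)                        # 35 ZB
unprotected_2020 = to_zettabytes(18_000, EXABYTE)   # 18,000 EB

print("2010 size in ZB:", float(universe_2010))
print("Unprotected data in 2020, in ZB:", float(unprotected_2020))
print("Growth factor 2010 to 2020:", float(universe_2020 / universe_2010))
print("Unprotected share of 2020 total:", float(unprotected_2020 / universe_2020))
```

Under these definitions, 1.2 million Petabytes is exactly 1.2 Zettabytes, and 18,000 Exabytes is 18 Zettabytes, which is roughly half of the 35 Zettabytes projected for 2020.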

Sustainable Economics for a Digital Planet, Ensuring Long Term Access to Digital Information, February 2010 – A report prepared by Blue Ribbon Task Force

• What digital information should we preserve?
• Who will preserve it?
• Who will pay for it?

National Digital Information Infrastructure and Preservation Program (NDIIPP), USA – started in 2000

Network of Expertise in Long-term STOrage of Digital Resources (NESTOR), Germany – started in 2003

CASPAR - Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval, UK – started in 2006

International Trends of Digital Preservation

Planets: Preservation and Long-term Access through Networked Services

DigitalPreservationEurope (DPE)

Digital Curation Centre (DCC)

APARSEN: Network of Excellence

Alliance for Permanent Access (APA), EU project

Proceedings of the Indo-US Workshop

Panel recommendations for India’s National Digital Preservation Programme

Indo-US Workshop on International Trends in Digital Preservation March 23-24, 2009

Dr. A. K. Chakravarti, Chairman of Expert Group

Dr. Dinesh Katre, Principal Investigator (C-DAC)

Dr. Gautam Bose, National Informatics Centre

Ms. Renu Budhiraja, e-Governance Division, DIT

Mr. Sumnesh Joshi, Unique Identification Authority of India

Dr. Meena Gautam, National Archives of India

Ms. Debjani Nag, Controller of Certifying Authorities

Mr. Vakul Sharma, Supreme Court

Dr. Mukul Sinha, Expert Software Consultants

Dr. Kamalini Dutt, Doordarshan

Dr. Ramesh C. Gaur, Indira Gandhi National Centre for the Arts

Ms. Manju Mathur, All India Radio

Dr. S. B. Bhattacharyya, e-Health Consultant

Dr. Usha Munshi, Indian Institute of Public Administration

Dr. Vandana Sinha, American Institute of Indian Studies

Dr. A. Moorthy, Defence Scientific Information & Documentation Centre, DRDO

Mr. Ramchandra Budihal, WIPRO

Mr. Sanjeev Kumar Gupta, IBM

Mr. V.V.S. Nageswara Rao, National Remote Sensing Centre, Dept of Space

Mr. Zia Saquib, Centre for Development of Advanced Computing

Mr. Ashok Kapoor, Reserve Bank of India

Mr. Patrick Kishor, State Bank of India

Mr. Sukhdev Singh, National Informatics Centre

Mr. Vivek K. Srivastava, e-Governance Division, DIT

Ms. Seema Sridhar, Life Insurance Corporation of India

Mr. N. S. Mani, National Archives of India

Mr. V. H. Jadhav, National Film Archives of India

Dr. V. C. V. Rao, Centre for Development of Advanced Computing

Dr. Y.K. Somayajulu, National Institute of Oceanography

May 20-21, 2010

National Meet of Expert Group Members from 30 different stakeholder organizations held at C-DAC, Pune

National Study Report on Digital Preservation Requirements of India

Volume – I: Recommendations for National Digital Preservation Programme
• Chapter 1. Need of Digital Preservation
• Chapter 2. Scope of Digital Preservation in India
• Chapter 3. Recommendations for NDPP

Volume – II: Position Papers by National Expert Group Members
• 23 Status Reports
• Stakeholder Recommendations
• Short Term (3 Yrs) and Long Term (10 Yrs) Actions

A) Conduct research and development in digital preservation to produce the required tools, technologies, guidelines and best practices.

B) Develop pilot digital preservation repositories and, as a long-term goal, help nurture the network of Trustworthy Digital Repositories (National Digital Preservation Infrastructure).

C) As the nodal point for pan-India digital preservation initiatives, define the digital preservation standards by involving experts from stakeholder organizations, and consolidate and disseminate the digital preservation best practices generated through various projects under the National Digital Preservation Programme.

D) Provide inputs to the Department of Information Technology for the formation of the national digital preservation policy and strategy by identifying and selecting the activities for the National Digital Preservation Programme.

E) Spread awareness about the potential threats and risks due to digital obsolescence and the digital preservation best practices.

Objectives of Centre of Excellence (C-DAC Pune)
Project Duration: April 2011 to March 2014

Offering appropriate Delivery Information Package to Designated Users

Creating a complete Archival Information Package

Preparing for Valid Submission Information Package

Digital Preservation of Government Archives

Digital Preservation of Born Digital Records

Preservation of Cultural Digital Content

Digital Repository Portals for Access to Designated Users

Authenticity Management and Digital Preservation Audit

Portal for National Digital Preservation Program (NDPP)

Domain Specific Digital Preservation and Archival Systems

Digital Preservation Research & Development

Digital Preservation Best Practices and Standards for e-Governance

Archival Science

Library Science

Digital Repository 01

Digital Repository 02

Digital Repository 03

Scope of CoE-DP Project

Motivations for Management of Electronic Theses and Dissertations

• Archive and preserve valuable scholarly / academic resources
• Make them accessible to designated users
• Enable further research advancements
• Knowledge enhancement, problem solving
• Collective growth
• Comparative quality

Reformatted Digital Information: InformationObject, PhysicalObject, DigitalObject, BitSequence

Born Digital Information: InformationObject, DigitalObject, BitSequence

Deteriorating due to time, changing weather, handling

Less accessible

Digital Surrogate

Best capture of current condition

More accessible

Interlinking between both is needed for continuity

Analysis

Experimentation

Data Collection

Final Manuscript

Software

Artefacts

Data Formats

Raw data

Dissemination

Final Manuscript Dissemination

ISO 19005-1 PDF/A-1a

Key characteristics
1. Preserves the visual appearance of the document

2. Includes visible contents like text, raster images, vector graphics, fonts, color information

3. Documents the logical structure

4. 100% self-contained

5. Long-term reproducibility: internationally accepted as a standard for long-term electronic archiving

Prohibitions
6. Information from direct or indirect external sources

7. Transparency, and sound and movie actions

Databases

Statistics

Graphs

Images

Video

Audio

Hyperlinks to URLs

3D Models

Documents

Algorithms

Base Programs

2. DEFINITION AND CONTENT OF THESES

2.1 Definition

2.1.1 The Research Awards Rules (sr. 1.4(1)) define a thesis as 'original written work', which, for the PhD may have been published (thesis by publication), or may be comprised of video recordings, film or other works of visual or sonic arts, computer software, digital material or other non-written material.

2.1.2 'Written work' for a thesis includes video recordings, film or other works of visual or sonic arts submitted by a student for examination.

2.3.2 A thesis by publication may also include video recordings, film or other works of visual or sonic arts, computer software, digital material or other non-written material for which approval has been given for submission in alternative format.

University Repository of Electronic Educational Content

Research Resources
• E-theses and dissertations
• Research data
• Paper publications

Learning Resources
• Lecture notes
• PowerPoint slides
• Audio / Video / Animations
• E-learning content

State Level Repository of Electronic Educational Content
• Univ 1
• Univ 2
• Univ 3
• Univ 4
• Univ 5

Engineering Degrees Awarded (2008)
(R Banerjee, Engineering Education in India, 2008)

• Ph.D.s: 1,000
• Masters: 20,000
• Bachelor: 230,000

Annual projection of Science Ph.D.s in India in 2020: 20,000
(Education: The PhD factory, Nature, 20 April 2011)

Physical setup

• Location, building architecture
• Server hardware / racks / KVM
• Storage hardware (SAN, NAS)
• Backup device
• Network infrastructure (switches, routers, UTM)
• Raised flooring
• Connectivity and bandwidth
• Disaster recovery site
• Physical security
• Biometric security
• Network security (firewall)
• Motion detection for lighting
• CCTV digital recorders
• Humidity control
• Cooling
• Fire suppression
• Smoke detection
• Redundant power supply

Trusted Digital Repository of Electronic Educational Content

Electronic Educational Content

Key concerns in ETD and e-content preservation

Raw research data

Use of Indian languages for thesis writing

Specifications for learning objects

Version control

De-duplication

Copyright protection

Authentication

Type of data and file formats

Define the standard practices

Digital preservation strategy

Data and value sharing rules
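
Two of the concerns above, de-duplication and verifying that a stored file is unchanged, are commonly approached with content checksums. The sketch below is a minimal illustration under stated assumptions and not a description of the NDPP tools: the choice of SHA-256 and the function names `sha256_of`, `find_duplicates` and `verify_fixity` are hypothetical. Note also that a matching checksum only shows the bits are unchanged, which is one ingredient of authenticity and not the whole of it.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under `root` by digest; keep groups with more than one file."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}


def verify_fixity(path, expected_digest):
    """Check that a file still matches the digest recorded at ingest."""
    return sha256_of(path) == expected_digest
```

In a repository, the digest would typically be recorded when a thesis is ingested and re-checked periodically; files sharing a digest are byte-identical candidates for de-duplication, while different versions of the same thesis will have different digests and still need version control.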

India’s National Digital Preservation Programme
www.ndpp.in

Thank You
