Post on 02-Jan-2016
Archivists’ Toolkit
Preliminaries: Architecture, DB
Leslie Myrick
NYU
Possible Java Architecture
• JSP Model 2 Architecture
  – Servlet Controller
    • Handles requests, View selection, instantiates beans
  – JSPs update the View in the browser
  – JavaBeans used to represent the object in memory; access DB using JDBC
    • manage the Model
  – JDBC connection to the data source
Similar Use of Servlet/JSP Model in Digital Library Applications
• DSpace
• UC Berkeley’s GenX system
• CDL Preservation Repository
JSP Model 2
• Cleanest separation of presentation and content
  – Clear delineation of roles of developers and designers
• Takes advantage of strengths of servlets and JSPs for serving dynamic content
  – JSP for presentation layer
  – Servlets for performing process-intensive tasks
• Servlet as Controller in charge of request processing, creation of beans or objects used by JSPs, and choosing the JSP to which the request is forwarded
• No processing logic in JSPs -- simply responsible for retrieving objects or beans instantiated by servlets
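The division of labor described above can be sketched without the Servlet API. What follows is a minimal, dependency-free illustration and not the Toolkit's actual code: the names `ObjectBean`, `Controller`, `handle`, and the view names `object.jsp` and `error.jsp` are hypothetical, an in-memory `Map` stands in for the database, and plain maps stand in for the HTTP request. In a real Model 2 application the controller would be an `HttpServlet` that forwards to the chosen JSP through a `RequestDispatcher`.

```java
import java.util.HashMap;
import java.util.Map;

/** Model: a plain bean holding the state a view needs (hypothetical fields). */
class ObjectBean {
    private final int objectID;
    private final String title;

    ObjectBean(int objectID, String title) {
        this.objectID = objectID;
        this.title = title;
    }

    int getObjectID() { return objectID; }
    String getTitle() { return title; }
}

/** Controller: processes the request, creates the bean, selects the view. */
class Controller {
    // Stand-in for the database; a real controller would delegate to JDBC code.
    private final Map<Integer, String> titles = new HashMap<>();

    Controller() {
        titles.put(3, "Series I: Documentary Material");
    }

    /**
     * Handles one request. Parameters arrive as strings, request-scoped
     * attributes leave through the attributes map, and the return value
     * names the view that should render the response.
     */
    String handle(Map<String, String> params, Map<String, Object> attributes) {
        int id;
        try {
            id = Integer.parseInt(params.get("objectID"));
        } catch (NumberFormatException e) {
            attributes.put("error", "objectID must be an integer");
            return "error.jsp";
        }
        String title = titles.get(id);
        if (title == null) {
            attributes.put("error", "no object with ID " + id);
            return "error.jsp";
        }
        attributes.put("object", new ObjectBean(id, title));
        return "object.jsp"; // the view only reads the bean; it holds no logic
    }
}

class Model2Sketch {
    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("objectID", "3");
        Map<String, Object> attributes = new HashMap<>();
        String view = new Controller().handle(params, attributes);
        ObjectBean bean = (ObjectBean) attributes.get("object");
        System.out.println(view + " <- " + bean.getTitle());
    }
}
```

Because the controller returns only a view name and a set of attributes, the presentation layer can be redesigned without touching request handling, which is the separation of roles the slide argues for.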
JSP Model 2 Architecture
JSP Model 1
• Bulk of processing performed by JSP
  – Process requests and draw view
• Fine for simple applications
JSP Model 1 Architecture
MySQL vs. PostgreSQL
• Both ACID compliant (transaction safe); in MySQL this requires InnoDB tables
• Both support referential integrity (as of MySQL 4.x)
• MySQL faster; PostgreSQL more robust
• Finer-grained locking in PostgreSQL
  – Multiversion Concurrency Control (MVCC) in PostgreSQL
• Want triggers? Views? Inheritance? For now go with PostgreSQL
• MySQL has built-in full-text search capability
• Ease of installation and maintenance: MySQL, hands down.
The ACID test
• Atomicity - All elements of a given transaction take place or none do.
• Consistency - Each transaction transforms the database from one valid state to another valid state.
• Isolation - The effects of a transaction are not visible to other transactions in the system until it is complete.
• Durability - Once a transaction has been committed, its effects are permanent, even if the system crashes or a disk dies.
Proposed DB Schema: Archaeology / Genealogy
• Ultimately based on MOA II model
• With refinements to NYU’s zeroDB schema for digital object metadata
• Torqued to describe archival objects and their digital surrogates
• Same essential hook: pure Aristotelian hierarchy
It all comes down to object
• Pivotal entity is object nesting other objects
  – objectType can be fonds, collection, component
  – componentType can be series, file, item, accretion
• Object hierarchy maintained through:
  – objectID, parentID, nextSibID
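To illustrate how these three columns are enough to recover the full tree in document order, here is a minimal in-memory sketch. It rests on stated assumptions rather than the Toolkit's code: the classes `ObjectRow` and `Hierarchy` are hypothetical, a `null` `nextSibID` is taken to mark the last sibling, and the first sibling is found as the one child that no other child points to. The sample rows loosely follow the sample records later in the deck; the link from object 5 to object 6 and the parent of object 126 are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** One row of the object table, reduced to the three hierarchy columns. */
class ObjectRow {
    final int objectID;
    final Integer parentID;  // null for the root
    final Integer nextSibID; // null for the last sibling (assumed convention)

    ObjectRow(int objectID, Integer parentID, Integer nextSibID) {
        this.objectID = objectID;
        this.parentID = parentID;
        this.nextSibID = nextSibID;
    }
}

class Hierarchy {
    private final Map<Integer, ObjectRow> byId = new HashMap<>();

    Hierarchy(List<ObjectRow> rows) {
        for (ObjectRow r : rows) byId.put(r.objectID, r);
    }

    /** Children of parentID in sibling order: find the chain head, then follow nextSibID. */
    List<Integer> children(int parentID) {
        Set<Integer> pointedTo = new HashSet<>();
        List<ObjectRow> kids = new ArrayList<>();
        for (ObjectRow r : byId.values()) {
            if (r.parentID != null && r.parentID == parentID) {
                kids.add(r);
                if (r.nextSibID != null) pointedTo.add(r.nextSibID);
            }
        }
        List<Integer> ordered = new ArrayList<>();
        for (ObjectRow r : kids) {
            if (pointedTo.contains(r.objectID)) continue; // not the first sibling
            ObjectRow cur = r;
            // The size check guards against a malformed, cyclic sibling chain.
            while (cur != null && ordered.size() < kids.size()) {
                ordered.add(cur.objectID);
                cur = cur.nextSibID == null ? null : byId.get(cur.nextSibID);
            }
            break;
        }
        return ordered;
    }

    /** Depth-first, document-order walk of the subtree rooted at objectID. */
    List<Integer> documentOrder(int objectID) {
        List<Integer> out = new ArrayList<>();
        out.add(objectID);
        for (int child : children(objectID)) out.addAll(documentOrder(child));
        return out;
    }

    public static void main(String[] args) {
        Hierarchy h = new Hierarchy(Arrays.asList(
            new ObjectRow(2, null, null),
            new ObjectRow(3, 2, 126),
            new ObjectRow(126, 2, null),
            new ObjectRow(4, 3, null),
            new ObjectRow(5, 4, 6),
            new ObjectRow(6, 4, null)));
        System.out.println(h.documentOrder(2)); // prints [2, 3, 4, 5, 6, 126]
    }
}
```

In SQL the same walk would take one query per level (or a recursive query where the database supports it); the sketch only shows that parentID gives the nesting and nextSibID gives the order within each level.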
Object Table
object
PK objectID
objectTypeID
componentTypeID
parentID
nextSibID
hasChildren
FK1 rightsID
FK4 accessionID
FK5 provenanceID
FK2 physDescID
processFinal
FK3 physLocID
Accession Table
accession
PK accessionID
accessionTypeID
resourceID
recordCollectionTypeID
collectionSurvey
processingPlan
processingNote
acqinfo
accruals
appraisal
abstract
generalNote
scopecontent
arrangement
accessrestrict
preservationNote
conservationNote
otherfindaid
transferFinal
Provenance Table
provenance
PK provenanceID
bioghist
bibliography
custodhist
fileplan
donorNote
provenanceNote
Physical Location Tables
physLoc
PK physLocID
FK1 physLocLevelID
FK2 physLocTypeID
physLoc
isPublic
objectID
physLocLevel
PK physLocLevelID
physLocLevel
physLocType
PK physLocTypeID
physLocType
CREATE TABLE physLoc (
  physLocID int(11) NOT NULL auto_increment,
  physLocLevelID int(11) NOT NULL default '0',
  physLocTypeID int(11) NOT NULL default '0',
  physLoc varchar(128) NOT NULL default '',
  isPublic tinyint(1) unsigned NOT NULL default '0',
  PRIMARY KEY (physLocID)
);

--
-- Data for table physLocLevel
--
INSERT INTO physLocLevel (physLocLevel) VALUES ('repository');
INSERT INTO physLocLevel (physLocLevel) VALUES ('internal location');
INSERT INTO physLocLevel (physLocLevel) VALUES ('physical container');
--
-- Data for table 'physLocType'
--
INSERT INTO physLocType (physLocType) VALUES ('accession location');
INSERT INTO physLocType (physLocType) VALUES ('processing location');
INSERT INTO physLocType (physLocType) VALUES ('shelflist location');
INSERT INTO physLocType (physLocType) VALUES ('offsite location');
Ingest of Legacy Data from MARCXML
• Student Programmers’ Assignment
• Will probably involve JAXP/DOM
• Conversions already undertaken: records from the Innopac iiirecord DTD to the MARC21slim schema; tape .mrc files to MARCXML using MARC4J
Ingest of Legacy Data from EAD
• Testbed creation tool
• XSLT with Java Extensions using Xalan
  – Get nextID from database
  – Extensions instantiate and increment DBID, parentID, nextSibID for each component in <dsc>
  – Write out to .sql file to dump into DB
<xalan:component prefix="counter" elements="init incr" functions="read">
  <xalan:script lang="javaclass" src="xalan://MyCounter"/>
</xalan:component>

<xsl:template match="/">
  <counter:init name="index"/>
  <!-- ... -->
</xsl:template>

<xsl:template name="dsc">
  <xsl:for-each select="ead/archdesc/dsc">
    <xsl:variable name="dsc-parentID"><xsl:value-of select="counter:read('index')"/></xsl:variable>
    <counter:incr name="index"/>
    <xsl:for-each select="c01">
      DBID: <xsl:value-of select="counter:read('index')"/>
      PARENTID <xsl:value-of select="$dsc-parentID"/>
      Series: c01-<xsl:number/>
      Unittitle: <xsl:apply-templates select="did/unittitle"/>
      Abstract: <xsl:apply-templates select="did/abstract"/>
      <xsl:if test="./child::scopecontent">
        Scopecontent:
        <xsl:for-each select="scopecontent/p"><xsl:apply-templates select="."/></xsl:for-each>
      </xsl:if>
      <!-- ... -->
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
DBID: 3
PARENTID 2
Series: c01-1
Unittitle: Series I: Documentary Material

DBID: 4
PARENTID: 3
Subseries: c02-1
Unittitle: Subseries A: Subjects

DBID: 5
PARENTID: 4
Subseries: c03-1
Box: 1
Folder: 1
Unittitle: Advertising
Unitdate: undated

DBID: 6
PARENTID: 4
Subseries: c03-2
Box: 1
Folder: 2-6
Unittitle: Art & Collecting
Unitdate: undated
INSERT INTO OBJECT (objectID, parentID, nextSibID, hasChildren, componentTypeID)
VALUES (3, 2, 126, 1, 1);

INSERT INTO TITLE (titleID, titleTypeID, title, objectID)
VALUES (NULL, 1, 'Series I: Documentary Material', 3);
DBID: 3
PARENTID: 2
NEXTSIBID: 126
Series: c01-1
Unittitle: Series I: Documentary Material
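The numbering scheme in these sample records (IDs assigned in document order, with nextSibID pointing past a component's whole subtree) can be sketched in plain Java. This is an illustration under stated assumptions, not the XSLT/Xalan tool itself: the classes `Component` and `IngestSketch` are hypothetical, the tree is built by hand rather than parsed from EAD, "Series II" is an invented sibling, componentTypeID is left out for brevity, and a missing next sibling is written as NULL. Table and column names follow the INSERT statements above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A component (c01, c02, ...) reduced to a title and its children. */
class Component {
    final String title;
    final List<Component> children = new ArrayList<>();
    int id; // assigned during ingest

    Component(String title) { this.title = title; }

    Component add(Component child) { children.add(child); return this; }
}

class IngestSketch {
    private int nextID;
    private final List<String> sql = new ArrayList<>();

    /** firstFreeID plays the role of the nextID value read from the database. */
    IngestSketch(int firstFreeID) { this.nextID = firstFreeID; }

    /** Pass 1: number the subtree in document (pre-order) order. */
    private void assignIds(Component c) {
        c.id = nextID++;
        for (Component child : c.children) assignIds(child);
    }

    /** Pass 2: emit the INSERTs for one component, then recurse into its children. */
    private void emit(Component c, int parentID, Integer nextSibID) {
        sql.add("INSERT INTO OBJECT (objectID, parentID, nextSibID, hasChildren) VALUES ("
            + c.id + ", " + parentID + ", " + (nextSibID == null ? "NULL" : nextSibID)
            + ", " + (c.children.isEmpty() ? 0 : 1) + ");");
        // Single quotes are doubled so the title is a valid SQL string literal.
        sql.add("INSERT INTO TITLE (titleID, titleTypeID, title, objectID) VALUES (NULL, 1, '"
            + c.title.replace("'", "''") + "', " + c.id + ");");
        emitAll(c.children, c.id);
    }

    private void emitAll(List<Component> siblings, int parentID) {
        for (int i = 0; i < siblings.size(); i++) {
            Integer next = i + 1 < siblings.size() ? siblings.get(i + 1).id : null;
            emit(siblings.get(i), parentID, next);
        }
    }

    /** Returns the SQL statements for all components under the given parent. */
    List<String> ingest(List<Component> topLevel, int parentID) {
        for (Component c : topLevel) assignIds(c);
        emitAll(topLevel, parentID);
        return sql;
    }

    public static void main(String[] args) {
        Component seriesI = new Component("Series I: Documentary Material")
            .add(new Component("Subseries A: Subjects")
                .add(new Component("Advertising"))
                .add(new Component("Art & Collecting")));
        Component seriesII = new Component("Series II"); // hypothetical sibling
        for (String line : new IngestSketch(3).ingest(Arrays.asList(seriesI, seriesII), 2)) {
            System.out.println(line);
        }
    }
}
```

With this truncated tree, Series I receives nextSibID 7 rather than the 126 shown above, because its subtree here has only four members. When inserting directly over JDBC instead of writing a .sql file, a `PreparedStatement` with bound parameters would be the safer choice than string concatenation.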