Source: kslibassoc.org/2012conf/handouts/marceditsession_three.pdf

EDITING RECORDS WITH THE MARCEDITOR
Terry Reese
Gray Family Chair for Innovative Library Services
[email protected]


Key points

MarcEditor
What is it?
What do the properties mean?
Preview mode? Paging mode?

Editing Functions
Field Count
Task Automation
Validation
OAI Harvesting

Editing MARC

MarcEditor
Specialized TextPad designed specifically for MARC records.
Is UTF8 aware – can be used to generate records in MARC8 (through mnemonics) or UTF8 character sets.

MarcEditor Properties

Templates Fonts Encodings Preview Settings

MarcEdit Templates

Templates work much like Microsoft Word templates
Define a set of default data that will appear on a screen
Templates exist for all material formats
Can be customized to suit your needs.

MarcEdit’s Preview Mode

One of the most confusing features
Allows MarcEdit’s MarcEditor to address files over the allowed 2 GB Windows page file limit (though practical limits are closer to 300 MB)
Reads a small snippet of the file into the editor – but edits are done to the entire file.

Can be turned off.

Configuring Preview Mode

MarcEdit Preview Mode

Configuring New Paging

Set in the Options dialog

MarcEdit Paging

Paging Change Notes
The preview page functionality is still present, but full page now defaults to the new paging functionality.
Preview functionality – on load – the application reads the entire file to prep – this is where most of the loading time takes place. After that, pages are addressed directly.
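The "read once to prep, then address pages directly" idea can be sketched in Python. This is only an illustration, not MarcEdit's actual implementation: the function names `index_records` and `read_page` are hypothetical, and the sketch assumes a mnemonic file with `\n` line endings and one blank line between records.

```python
import io

def index_records(stream):
    """Scan the whole stream once; return the byte offset where each record starts."""
    stream.seek(0)
    offsets = []
    pos = 0
    at_record_start = True
    for line in stream:
        if line.strip() == b"":
            at_record_start = True      # a blank line separates records
        elif at_record_start:
            offsets.append(pos)
            at_record_start = False
        pos += len(line)
    return offsets

def read_page(stream, offsets, page, page_size):
    """Seek straight to a 0-based page and return its records as text blocks."""
    start = page * page_size
    if start >= len(offsets):
        return []
    end = min(start + page_size, len(offsets))
    stream.seek(offsets[start])
    if end < len(offsets):
        data = stream.read(offsets[end] - offsets[start])
    else:
        data = stream.read()
    blocks = data.decode("utf-8").split("\n\n")
    return [block.strip() for block in blocks if block.strip()]

# Hypothetical three-record file held in memory.
sample = (
    b"=LDR  00000nam\n=245  10$aFirst\n\n"
    b"=LDR  00000nam\n=245  10$aSecond\n\n"
    b"=LDR  00000nam\n=245  10$aThird\n"
)
stream = io.BytesIO(sample)
offsets = index_records(stream)          # the slow "prep" pass
page = read_page(stream, offsets, page=1, page_size=2)
```

The indexing pass is where the loading time goes; afterwards any page can be fetched with a single seek.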

Paging Example

Preview Mode Still Exists

Paging Example

If you load the full file, or turn the preview mode off

Editing MARC

MarcEditor Supports a number of global editing functions:

Find/Replace functionality
Globally Add/Delete MARC fields
Globally Edit Subfield data
Conditionally add/remove field data
Globally Edit Indicator data
Globally Swap field data
Record Deduplication
Record Sorting
Call Number Generator
Macros
Z39.50 Cataloging

Editing MARC – Find/Replace

Works like a normal Find/Replace in most Textpad utilities.
Unlike most Textpads, Replace supports UTF-8 (when working with UTF-8 files) and regular expressions.

Editing MARC – Find All

Find All function was designed for use with the Paging mode
Allows users to find any text across all pages
Generates a jump list that can be used to find individual records for edit
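A jump list can be pictured as a list of (page, record) positions for every hit. The sketch below is a hypothetical illustration in Python, not MarcEdit code; `build_jump_list` and the sample records are invented for the example, and pages and records are numbered from 1.

```python
def build_jump_list(records, text, page_size):
    """Return (page, record_number) pairs for every record containing `text`."""
    jumps = []
    for index, record in enumerate(records):
        if text in record:
            jumps.append((index // page_size + 1, index + 1))
    return jumps

# Hypothetical records, one mnemonic line each for brevity.
records = [
    "=245  10$aKansas maps",
    "=245  10$aBirds",
    "=245  10$aMaps of Ohio",
    "=245  14$aThe Kansas atlas",
]
jump_list = build_jump_list(records, "Kansas", page_size=2)
```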

Jump List

Find All

Jump List

Jump List Example

Jump List

When using the jump list:
Will jump to the page and record within the set
Will save (temporarily) any items modified or pages automatically (though to set saved items, you need to actually save the page)

Jump to

Jump to…record: Allows you to jump to any record

Jump to…page: Allows you to jump to any page

Editing MARC – Global Add/Delete Field

Globally add fields to all MARC records
Allows users to set insertion position.

Globally delete fields
Allows global delete
Allows conditional delete

Supports Regular Expressions
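As a rough picture of what a global add and a conditional delete do to one record in mnemonic format, here is a minimal Python sketch. The helpers `tag_of`, `add_field`, and `delete_field` are hypothetical, the insertion rule (keep tag order) is just one possible position setting, and Python's `re` module stands in for MarcEdit's .NET engine.

```python
import re

def tag_of(line):
    """Return the 3-character tag; the leader is treated as sorting first."""
    tag = line[1:4]
    return "000" if tag == "LDR" else tag

def add_field(record, field_line):
    """Insert field_line before the first field with a larger tag."""
    lines = record.split("\n")
    for i, line in enumerate(lines):
        if tag_of(line) > tag_of(field_line):
            lines.insert(i, field_line)
            break
    else:
        lines.append(field_line)
    return "\n".join(lines)

def delete_field(record, tag, condition=None):
    """Delete every `tag` field, or only those matching the `condition` regex."""
    kept = []
    for line in record.split("\n"):
        is_target = line.startswith("=" + tag)
        if is_target and (condition is None or re.search(condition, line)):
            continue
        kept.append(line)
    return "\n".join(kept)

# Hypothetical record; blank indicators appear as backslashes.
record = "\n".join([
    "=LDR  00000nam",
    "=245  10$aTitle",
    r"=500  \\$aLocal note",
    r"=500  \\$aGeneral note",
    r"=650  \0$aMaps",
])
with_590 = add_field(record, r"=590  \\$aAdded")
without_local = delete_field(record, "500", r"Local")
without_500 = delete_field(record, "500")
```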

Editing MARC – Modifying subfield data

Allows for the modification of variable MARC field subfield data (MARC fields 010 and above)
Allows for the modification of control field data by position or range of positions
Allows users to prepend and append data to subfields.
Allows users to change subfield tagging.
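Editing control field data "by position" amounts to overwriting a slice of a fixed-width string. A minimal Python sketch, assuming the mnemonic layout of tag plus two spaces before the data; `set_control_positions` and the sample 008 value are hypothetical.

```python
def set_control_positions(line, start, value):
    """Overwrite control field data beginning at 0-based position `start`."""
    prefix, data = line[:6], line[6:]    # "=008  " is the tag plus two spaces
    data = data[:start] + value + data[start + len(value):]
    return prefix + data

# Hypothetical, shortened 008; positions 15-17 hold the place of publication.
field = "=008  120401s2012    ksu"
updated = set_control_positions(field, 15, "xxu")
```

Because the replacement has the same length as the slice it overwrites, all later positions keep their meaning.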

Editing MARC – Modifying subfield data

Allows users to insert new subfields and define subfield placement.
Allows users to move field data from one field to another.
Supports:
UTF-8 with UTF-8 files
Regular Expressions
Adding new subfields.

Editing MARC – Modifying subfield data

Editing MARC – Swapping Fields

Swap parts of MARC fields or entire MARC fields
Define field, indicator and subfields to move.
Can move field data and delete the original field or clone the field data and move the clone to the new location.
Can add data to an existing field.

Character Conversions within the MarcEditor

MarcEditor allows users to convert character data between different character sets.

Fixing Boo-boos

MarcEdit’s Special Undo
Allows you to step back one global change.

Sorting Fields

MarcEdit provides multiple sorting types:
Control Number: Sorts record position within the file
Title: Sorts record position within the file
Author: Sorts record position within the file
Call Number: Sorts record position within the file
0xx Fields: Sorts the 0xx fields within individual records (does *not* change record position within a file)
All Fields: Sorts all fields within individual records (does *not* change record position within a file)
Custom Sort: Sorts all defined fields within individual records (does *not* change record position within a file)
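The difference between sorting records and sorting fields can be seen in a small Python sketch of an "All Fields" style sort, which reorders lines inside one record and leaves the record's place in the file alone. `sort_fields` is a hypothetical helper, and the rule of pinning the leader first is an assumption of this sketch.

```python
def sort_fields(record):
    """Sort one record's fields by tag, keeping the leader as the first line."""
    lines = record.split("\n")
    leader = [line for line in lines if line.startswith("=LDR")]
    fields = [line for line in lines if not line.startswith("=LDR")]
    fields.sort(key=lambda line: line[1:4])   # stable: repeated tags keep their order
    return "\n".join(leader + fields)

# Hypothetical record with fields out of order.
record = "\n".join([
    "=LDR  00000nam",
    "=245  10$aTitle",
    r"=020  \\$a123",
    r"=100  1\$aReese, Terry",
    r"=010  \\$a456",
])
sorted_record = sort_fields(record)
tags = [line[1:4] for line in sorted_record.split("\n")]
```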

Record Deduplication

MarcEdit provides a simple dedup tool that can:
Dedup on a defined control field (any field)
Dedup on a transaction field (or using an additional transaction field)

Output
Removes all duplications and saves the duplications to a file
Prints just unique items within the file (i.e., those without a duplicate pair)
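The two output modes can be sketched in Python as follows. This is an illustration under assumptions, not MarcEdit's algorithm: records are matched on the raw text of one field (001 by default), the first copy is the one kept, and `field_value`, `dedup`, and `only_unique` are hypothetical names.

```python
import re
from collections import Counter

def field_value(record, tag):
    """Return the data of the first `tag` field in a mnemonic record, or None."""
    match = re.search(r"^=" + re.escape(tag) + r"  (.*)$", record, re.MULTILINE)
    return match.group(1) if match else None

def dedup(records, tag="001"):
    """Keep the first record per key; return (kept, duplicates)."""
    seen, kept, duplicates = set(), [], []
    for record in records:
        key = field_value(record, tag)
        if key is not None and key in seen:
            duplicates.append(record)     # would be written to the duplicates file
        else:
            seen.add(key)
            kept.append(record)
    return kept, duplicates

def only_unique(records, tag="001"):
    """Return only the records whose key occurs exactly once."""
    counts = Counter(field_value(record, tag) for record in records)
    return [r for r in records
            if field_value(r, tag) is None or counts[field_value(r, tag)] == 1]

# Hypothetical records; the first and third share a control number.
records = [
    "=001  a1\n=245  10$aOne",
    "=001  b2\n=245  10$aTwo",
    "=001  a1\n=245  10$aOne again",
]
kept, duplicates = dedup(records)
unique = only_unique(records)
```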

Field Counts

Field Count
Provides a quick count of fields
Report of subfields used within a particular field
Detailed reports of all fields/subfields used within a file set.
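A field and subfield count over mnemonic records can be approximated with a few lines of Python. The helper names and sample data are hypothetical, and the sketch assumes subfields are introduced by `$` followed by a single letter or digit.

```python
import re
from collections import Counter

def count_fields(records):
    """Count how often each tag occurs across all records."""
    counts = Counter()
    for record in records:
        for line in record.split("\n"):
            if line.startswith("="):
                counts[line[1:4]] += 1
    return counts

def count_subfields(records, tag):
    """Count the subfield codes used within one tag."""
    counts = Counter()
    for record in records:
        for line in record.split("\n"):
            if line.startswith("=" + tag):
                # Skip "=TAG", two spaces and two indicators (8 characters).
                counts.update(re.findall(r"\$(\w)", line[8:]))
    return counts

# Hypothetical two-record file set.
records = [
    "\n".join(["=LDR  00000nam", "=245  10$aTitle /$cAuthor",
               r"=500  \\$aNote", r"=500  \\$aNote 2"]),
    "\n".join(["=LDR  00000nam", "=245  00$aOther"]),
]
field_counts = count_fields(records)
subfield_counts = count_subfields(records, "245")
```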

Material Type Report

Material Type Report
Reports number of records by material type
Breaks down material type by sub-types
Utilizes the Leader, 008 and GMD to determine format types

In-Line Validation

MarcValidator-lite
Can access MarcValidator for quick validation of data elements found in the file set
Validation can use any defined rules set.

Task Automation Tool

New to MarcEdit 5.2, Task Automations
Task automation provides a way for non-programmers to create defined task lists that can then be executed automatically
The difference between a task and a macro is that MarcEdit tasks essentially function as if the user were calling specific functions within MarcEdit.
Anything that you can do in the MarcEditor, you can automate as a task.

Task Automation

Managing Tasks
Task management works like macro management

You can:
Create new tasks
Clone tasks
Rename tasks
Delete tasks
Edit tasks

Task Automation Demo

Additional Information: YouTube:
Introduction to task automation: http://www.youtube.com/watch?v=gmqTGfTubU4
Introduction to new task automation functions: http://www.youtube.com/watch?v=fnorN0MFFN0

OCLC Classify Service

MarcEdit can leverage OCLC WorldCat to generate call numbers automatically for files
Fields used:
001
010$a$z
020$a$z
022$a$z
024$a$z
1xx$a
776$w$z

OCLC Classify Service

MarcEdit Regular Expression Support

When processing regular expressions with MarcEdit, MarcEdit makes entire fields or subfields available for processing
i.e., when processing a delete field function – all data from =[field number] are part of the field that can be queried.
MarcEdit’s regular expression by default deals with one field at a time (i.e., regular expressions do not allow you to find data across fields by default)
MarcEdit’s Regular Expression Support Pre-5.x was a custom regular expression engine.
MarcEdit’s Regular Expression Support 5.x+ is defined by Microsoft .NET’s Regular Expression object
This object uses a syntax that looks Perl-like, but has some differences.

MarcEdit Regular Expression Support

When working with regular expressions with the Replace Function, MarcEdit will remember the last 10 replacements. This should help with trial and error.
When dealing with Regular Expressions or any global replacements, MarcEdit has a Special Undo function that will undo your last global update.

Microsoft’s Regular Expression language

Concepts:
Character escapes
Anchors
Character classes
Grouping
Quantifiers
Substitutions

Let’s open the net_regular_expressions.htm file.

How we use Regular Expressions in MarcEdit

Your most important parts of the regular expression language are:
1. Character escapes: \d \r \n \$ \x##
2. Character Classes: [] & [^]
3. Grouping Elements: ()
4. Anchors: ^ $
5. Quantifiers: * ? + {#}
6. Substitutions: $#

Examples

Looking at example.mrk using the replace function:

Add a period to the 500 if it is missing

Add a $h of cartographic resources between the $a and $c.

Split the 856 into two fields, breaking on the $u.

Example 1

Add a period to the 500 if it is missing
Find What: (=500  ..)(.*[^.]$)
Replace With: $1$2.

Explanation:
(=500  ..)
Searches for the 500 field. We leave two blanks because there are always 2 blank characters after the tag in the mnemonic format. The two periods stand for any character (here, the two indicators). If you want to search for exact indicators, place those values rather than the periods.
(.*[^.]$)
Take any characters, and match on a field where the last character in the field isn’t a period.
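To try the add-a-period pattern outside MarcEdit, it can be approximated in Python. This is a sketch, not MarcEdit's engine: Python's `re` module stands in for .NET, so the substitution is written `\1\2.` instead of `$1$2.`, the pattern assumes two spaces between the tag and the indicators, and `add_period` and the sample fields are hypothetical.

```python
import re

# Tag, two spaces, two indicator characters, then data whose last character is not a period.
PATTERN = re.compile(r"(=500  ..)(.*[^.]$)")

def add_period(line):
    """Append a period to a 500 field that does not already end in one."""
    return PATTERN.sub(r"\1\2.", line)

# Hypothetical fields; blank indicators appear as backslashes in mnemonic format.
missing = add_period(r"=500  \\$aIncludes index")
present = add_period(r"=500  \\$aIncludes index.")
other = add_period(r"=245  10$aTitle")
```

A field that already ends in a period fails the `[^.]$` test, so it is left unchanged.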

Example 2

Add a $h of cartographic resources between the $a and $c.

Find What: (=245.{4})(\$a.*)(/\$c.*)
(=245.{4})
Match the 245 field with any value in the next 4 characters being valid.
(\$a.*)
Select everything within the subfield a
(/\$c.*)
Select the / value and the subfield c (and other data)

Replace With: $1$2$$h[cartographic resource] $3
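The same three-group idea can be checked in Python. This sketch uses the third group as given in the explanation, `(/\$c.*)`. Note the assumptions: Python's `re` replaces .NET here, so groups are written `\1`, `\2`, `\3` and a literal `$` in the replacement needs no doubling; `add_gmd` and the sample title are hypothetical.

```python
import re

# Group 1: tag plus 4 characters; group 2: subfield a; group 3: "/" and subfield c onward.
PATTERN = re.compile(r"(=245.{4})(\$a.*)(/\$c.*)")

def add_gmd(line):
    """Insert a $h between the $a and the /$c of a 245 field."""
    return PATTERN.sub(r"\1\2$h[cartographic resource] \3", line)

# Hypothetical 245 field.
before = "=245  10$aMap of Kansas /$cState Survey."
after = add_gmd(before)
no_subfield_c = add_gmd("=245  10$aMap of Kansas")
```

A 245 without a `/$c` does not match, so it passes through untouched.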

Example 3

Split the 856 into two fields, breaking on the $u.
Find What: (=856.{4})(\$u.*[^$])(\$u.*)
(=856.{4})
Matches the 856 field
(\$u.*[^$])
Match $u, but stop at the end of the subfield
(\$u.*)
Match remainder of field

Replace With: $1$2\n=856  41$3
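A Python approximation of the 856 split is shown here as a sketch under assumptions: `re` stands in for the .NET engine (so `\1` replaces `$1`), the new field is written with two spaces after the tag as the mnemonic format expects, and `split_856` and the example.org URLs are hypothetical.

```python
import re

# Group 2 is the first $u (ending on a non-$ character); group 3 is the second $u onward.
PATTERN = re.compile(r"(=856.{4})(\$u.*[^$])(\$u.*)")

def split_856(line):
    """Move the second $u of an 856 into a new 856 field with indicators 41."""
    return PATTERN.sub(r"\1\2\n=856  41\3", line)

# Hypothetical field with two URLs.
before = "=856  41$uhttp://example.org/one$uhttp://example.org/two"
after = split_856(before)
single = split_856("=856  41$uhttp://example.org/one")
```

The new field always gets indicators 41, whatever the original field had, because they are hard-coded in the replacement.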

Lcase/ucase

MarcEdit’s regular expression engine includes two extension functions for dealing with case switching of characters: lcase & ucase

Usage:
Find What: (=450.{4})(\$a.)(.*)
Replace With: $1$2lcase($3)

Example: Find the 500 with all upper case characters and convert the case of all values but the first letter in the sentence to lower case.

Example (Lcase)

Find the 500 with all upper case characters and convert the case of all values but the first letter in the sentence to lower case.

Find What: (=500.{4})(\$a.)([A-Z .]*)
Replace With: $1$2lcase($3)
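`lcase` is a MarcEdit extension, so other regex engines need a different mechanism. In Python, a replacement function can play the same role. The sketch below is only an analogue: `sentence_case` is a hypothetical name, and `str.lower()` stands in for `lcase($3)`.

```python
import re

# Group 2 keeps "$a" plus the first letter; group 3 is the upper-case run to lower.
PATTERN = re.compile(r"(=500.{4})(\$a.)([A-Z .]*)")

def sentence_case(line):
    """Lower-case everything after the first letter of an upper-case 500 $a."""
    return PATTERN.sub(
        lambda m: m.group(1) + m.group(2) + m.group(3).lower(),
        line,
    )

# Hypothetical upper-case note; blank indicators appear as backslashes.
after = sentence_case(r"=500  \\$aINCLUDES INDEX.")
```

Because group 3 only matches capitals, spaces, and periods, lowering stops at the first character outside that set.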

Multi-Field Replacements

By default, MarcEdit handles one field at a time when doing regular expressions.
However, when you need to do evaluations against multiple fields, you can do so by adding /m to the end of your Find What expression in the Replace function in the MarcEditor
This is a special function added to the MarcEdit regular expression engine

Example

Using test.mrk

The file has multiple 028 fields. The first field has a $a and $b, the second only a $a. Copy the $b to the second 028, but only if they are consecutive

Multi-Line Example

The file has multiple 028 fields. The first field has a $a and $b, the second only a $a. Copy the $b to the second 028, but only if they are consecutive

Find What: (=028.{4}\$a[^\$]+)(\$b[^\$]+)(\r?\n)(=028.{4}\$a[^\$\r\n]+)(\r?\n)/m

Replace With: $1$2$3$4$2$3
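The multi-field pattern can be exercised in Python by running it over the whole record text at once, which is roughly what the /m suffix asks MarcEdit to do. This is a sketch under assumptions: the /m suffix is MarcEdit-specific and is simply dropped, `re` stands in for .NET (so `\1` replaces `$1`), and `copy_028_b` and the sample records are hypothetical.

```python
import re

# An 028 with $a and $b, a line break, then an 028 with only $a, then a line break.
PATTERN = re.compile(
    r"(=028.{4}\$a[^\$]+)(\$b[^\$]+)(\r?\n)(=028.{4}\$a[^\$\r\n]+)(\r?\n)"
)

def copy_028_b(text):
    """Copy $b from an 028 to the 028 on the next line when that one has only $a."""
    return PATTERN.sub(r"\1\2\3\4\2\3", text)

# Hypothetical records: consecutive 028s, then 028s separated by another field.
consecutive = "=028  02$a1234$bLabel\n=028  02$a5678\n=245  10$aTitle\n"
separated = "=028  02$a1234$bLabel\n=245  10$aTitle\n=028  02$a5678\n"
fixed = copy_028_b(consecutive)
untouched = copy_028_b(separated)
```

The line break in group 3 must be followed immediately by `=028`, which is what restricts the copy to consecutive fields.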

Getting Regular Expression Help

The MarcEdit Listserv has a number of regular expression experts that provide a lot of help to users looking for it
http://metis3.gmu.edu/cgi-bin/wa?A0=MARCEDIT-L