digitization projects for small archives and museums
TRANSCRIPT
Basic Imaging: How to Do a Small Digitization Project
Anna Naruta-Moya, PhD
April 22, 2014
1
Anna Naruta-Moya, PhD
Formerly an archivist for the Hoover Institution Archives, Stanford University, and the US National Archives; consulting archivist (annanaruta.com) with credentials approved by the New Mexico Historical Records Advisory Board; archivist for State of New Mexico Department of Cultural Affairs (ARMS)
2
Workshop Objectives
• Understand basics of emerging practices in digitization projects
• Get familiar with guidelines emerging from the Federal Agencies Digitization Guidelines Initiative, Still Images Working Group (FADGI guidelines)
• Define a low-cost digitization project
• Identify photographic objects appropriate for a low-cost system
• Understand basics of color management, including creation of color profiles
• Learn to implement and operate the system to create archival digital scans
3
Emerging Practices (FADGI)
Federal Agencies Digitization Guidelines Initiative, Still Image Working Group
Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files (2010)
http://www.digitizationguidelines.gov/guidelines/digitize-technical.html
4
Large Scale Digitization Projects (i.e. not us!)
• “Mass Digitization”
• High-end equipment (example: Kirtas)
• Large staff
• Large volume
• Big budget
5
Low-Cost Digitization Project
In contrast to a large-scale digitization project:
• Doesn’t have a full-time project manager
• Tools cost ~$500 - $2500 (from low-end consumer scanner to Epson Expression 11000XL)
You’ll need:
• Relatively homogeneous format
• Well-defined project scope
Planning, selecting and evaluating material, and creating metadata are usually the most costly part of the project
• Also a big part of enabling success of the project!
6
Advantages of Digitization
• Increase access
• Facilitate new uses
• Increase preservation of original due to use of surrogate copy
7
Limitations
• Full-text searching: ideal vs. reality
  • Ability to successfully use Optical Character Recognition (OCR) software is heavily dependent on source material
• Computer tech requirements
  • Storage space
  • Access system
  • Software and hardware
8
Project Planning
• Stakeholders
• Goals for project
  • Create in-house image database researchers can use? Just for institution?
  • Public website?
• Timeline
• Resources (personnel, equipment, financial)
9
Selecting What to Digitize
Considerations:
• Homogeneity of format
• Contextualization -- Is descriptive information about the collection available?
• Availability of metadata
• Intellectual property issues
• Possible access restrictions
• Audience: researcher interest
• Whole series? Subseries?
• Item-level or folder-level description?
10
Rights and Permissions
• Is it in copyright?
  • Depends on type of material and what law was in effect when it was created
  • Refer to chart: http://copyright.cornell.edu/resources/publicdomain.cfm
• Other permission needed?
  • Check the deed of gift
• Other permissions or sensitivities that may need to be considered?
11
Metadata
• See notes from Meet Metadata, Your New BFF training by John Hyrum Martinez, State Records Administrator
• Plan which fields you need
• Edit or create metadata before digitizing
• Keep digitization workflow separate
12
File Names Should at Least
• Be unique
• Use lowercase letters of the Latin alphabet and the numerals 0-9
• Have no spaces between characters
• Avoid punctuation marks other than hyphens and underscores
• Have no more than 31 characters (the fewer the better)
• Have a single period between the file name and the three-letter extension
http://www.library.umass.edu/assets/aboutus/attachments/UMass-Amherst-Libraries-Best-Practice-Guidelines-for-Digitization-20110523-templated.pdf
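The checklist above can be turned into a quick automated check before a batch is scanned. The following is a minimal sketch, not part of the UMass guidelines: the function names are hypothetical, the 31-character limit is assumed to count the whole name including the extension, and the extension is assumed to allow digits (as in "jp2").

```python
import re

# Assumption: base name uses only a-z, 0-9, hyphen, underscore, followed by
# one period and a three-character lowercase extension.
_NAME_PATTERN = re.compile(r"[a-z0-9_-]+\.[a-z0-9]{3}")
MAX_LENGTH = 31  # assumed to include the extension


def filename_problems(name):
    """Return a list of checklist violations for one file name (empty if none)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("longer than 31 characters")
    if " " in name:
        problems.append("contains a space")
    if name != name.lower():
        problems.append("contains uppercase letters")
    if name.count(".") != 1:
        problems.append("needs exactly one period")
    if not _NAME_PATTERN.fullmatch(name):
        problems.append("disallowed characters or extension")
    return problems


def find_duplicates(names):
    """Return names that occur more than once, compared case-insensitively."""
    seen, dupes = set(), set()
    for name in names:
        key = name.lower()
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)
```

For example, filename_problems("nmsrca_0000001.tif") returns an empty list, while a name such as "My Photo.tiff" is flagged for the space, the uppercase letter, and the four-letter extension.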
13
To Keep in Mind when Designing Your File Naming Scheme
• Each file name must be unique
• Name for the long term: how will this name scale as you add digital material to your collections?
• File names should provide context: names could include codes for department or collection.
• Keep file names simple for readability
• Self-explanatory file names make it easier to understand the context of files as they make their way through digitization workflows
• The more complicated the file name, the higher the likelihood of human error when entering the name.
• Consider including the system’s unique digital object ID in the name of the individual files that make up that object
• File names are not metadata: let your metadata describe the digital object. Use file names to connect metadata to digital images
• File names will outlast the current project staff
UMass Amherst
14
Does Your Filename Follow a Numerical Scheme?
• Use leading zeroes to facilitate sorting
  0000001.tiff
  0000002.tiff
  0000010.tiff
• If the filename scheme involves a date, YYYYMMDD format facilitates sorting
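To see why padding matters, here is a small sketch that compares how unpadded and zero-padded names sort as plain text. The helper names, the seven-digit default width, the ".tif" extension, and the underscore separator in the dated name are illustrative assumptions, not part of the slide.

```python
from datetime import date


def sequence_name(number, width=7, extension="tif"):
    """Build a zero-padded sequential file name, e.g. 0000010.tif."""
    return f"{number:0{width}d}.{extension}"


def dated_name(day, number, width=4, extension="tif"):
    """Build a name that starts with the date in YYYYMMDD form."""
    return f"{day:%Y%m%d}_{number:0{width}d}.{extension}"


# File browsers usually sort names as text, so "10" lands before "2"
# unless the numbers are padded to the same width.
unpadded = sorted(["1.tif", "2.tif", "10.tif"])
padded = sorted(sequence_name(n) for n in (1, 2, 10))

print(unpadded)  # ['1.tif', '10.tif', '2.tif']
print(padded)    # ['0000001.tif', '0000002.tif', '0000010.tif']
print(dated_name(date(2014, 4, 22), 3))  # 20140422_0003.tif
```

Pick a width large enough for the biggest number the collection could ever reach, since widening it later means renaming files.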
15
Quality Control
• Plan for checking images and metadata as part of project
  • Presence/absence
  • Accuracy/quality
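The presence/absence part of quality control lends itself to a simple comparison of two lists: the file names recorded in the metadata and the file names actually found in the image folder. This is a minimal sketch with a hypothetical function name; how the two lists are gathered (spreadsheet export, folder listing) is left open.

```python
def presence_report(metadata_filenames, image_filenames):
    """Compare file names listed in metadata against image files on disk."""
    expected = set(metadata_filenames)
    found = set(image_filenames)
    return {
        # Described in the metadata but no image file exists
        "missing_images": sorted(expected - found),
        # Image file exists but has no metadata record
        "missing_metadata": sorted(found - expected),
    }


report = presence_report(
    ["0000001.tif", "0000002.tif", "0000003.tif"],
    ["0000001.tif", "0000003.tif", "0000004.tif"],
)
print(report)
```

Accuracy and quality still need a person to look at the images and read the records; a check like this only catches gaps.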
16
Storing and Accessing Your Images & Metadata
• Online (or in-house) digital database / image server
  • Some open-source options: Omeka, Mukurtu, DSpace, Islandora (Drupal + Fedora), Archivematica
• Simple spreadsheet and folder(s)
  • Can be imported into other systems later
  • MS Excel
  • LibreOffice (open-source, free)
17
Digital Preservation
• CDs, DVDs, external drive not recommended for preservation purposes
• RAID array of hard drives, with additional backup stored someplace else (non-colocated)
• Professionally maintained servers
  • Your internal IT system: check in about digital preservation plan
  • Hosted service (such as through DSpace, Omeka, etc.)
  • Preservation and rights-friendly server services
18
Services -- Things to watch for
• Does the service make any claims on your content? “unlimited, royalty-free sublicense…”
• Evaluation criteria: FADGI refers us to
  • Trusted Digital Repositories: Attributes and Responsibilities
    http://www.oclc.org/research/activities/past/rlg/trustedrep/repositories.pdf
  • Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist
    http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf
19
Digital Preservation
• All things decay -- How do you monitor for bitrot?
• Include a checksum as part of your metadata
  • MD5 hash or SHA hash
  • Automatically generated for you with many hosted services
  • Easy for you to make: Karen’s Directory Printer (free)
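A checksum can also be produced with a few lines of Python's standard hashlib module. The sketch below is one possible approach, not the method used by any particular hosted service; the function names and the SHA-256 default are assumptions for illustration.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks so large TIFFs fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Compare the file's current checksum with the one stored in the metadata."""
    return file_checksum(path, algorithm) == recorded_checksum.lower()
```

Record the checksum in the metadata when the master file is created, then rerun is_unchanged on a schedule; a False result means the stored file no longer matches the original and should be restored from a backup copy.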
20
Digital Imaging Equipment Options
• Scanner
  • Constant internal light source
  • Slower
• Camera, copy stand, and lights
  • Need to arrange constant light source (“studio lighting”)
  • Optimal: two lights set at a 45-degree angle
  • Unless fluorescent or LED, heat will be generated
  • More rapid than scanning
  • Additional supplies: white card, gray card, shutter release
21
Setting Imaging Standards
22
Setting Imaging Standards
Bit Depth
Resolution
File Format
NMSRCA
23
The number of bits by which each pixel is defined. Sets the range of tonal values by which an image can be represented.
Bit Depth
NMSRCA
24
1-Bit Depth (2 values)
NMSRCA
25
8-Bit Depth (256 values)
NMSRCA
26
8-Bit Depth: Highlights & Shadows
NMSRCA
27
24-Bit Color (16,777,216 values)
NMSRCA
28
24-Bit Color
Color information is broken down into three channels (Red, Green, and Blue), each recorded at 8-bit depth. In composite, these create 16,777,216 color values.
NMSRCA
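The value counts quoted on the bit-depth slides follow from one rule: each added bit doubles the number of values, and channels multiply together. A short sketch (the function name is hypothetical):

```python
def tonal_values(bits_per_channel, channels=1):
    """Number of distinct values for a given bit depth and channel count."""
    return (2 ** bits_per_channel) ** channels


print(tonal_values(1))     # 2        (1-bit, bi-tonal)
print(tonal_values(8))     # 256      (8-bit grayscale)
print(tonal_values(8, 3))  # 16777216 (24-bit RGB: three 8-bit channels)
```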
29
The number of pixels by which an image is represented, usually expressed by a sampling rate of pixels per inch (ppi), or by overall pixel dimensions.
Resolution
NMSRCA
30
Before establishing project standards, know your scanner’s specifications!
• Optical Resolution: Maximum number of samples per inch that the scanner can take from your source material, often much lower than the scanner’s advertised “maximum resolution”
• Interpolated Resolution: An inflated resolution created by adding pixels where no direct samples have been taken from the source material
• High Resolution Platen Area: A special feature of some scanners in which a higher optical resolution can be produced from within a designated area of the scanner platen
Resolution
NMSRCA
31
• Goals and needs of your institution
• How will the digital image be used?
• What are the resolution limitations of your output devices?
• How will you store your digital files?
Other Considerations in Setting Resolution Standards
NMSRCA
32
[Figure: 4”x5” black and white negative scanned at 72 ppi, 600 ppi, and 1200 ppi]
after NMSRCA
dpi = dots per inch: printing
ppi = pixels per inch: screen or file
33
[Figure: Detail of 4”x5” black and white negative scanned at 72 ppi, 600 ppi, and 1200 ppi]
NMSRCA
34
Example: 4 inches x 5 inches x 24-bit RGB color x 300 ppi x 300 ppi / 8 bits per byte / 1024 bytes per KB / 1024 KB per MB = 5.15 MB
File Size (MB) = (Height x Width x Bit depth x (Resolution in ppi)²) / 8 bits per byte / 1024 bytes per KB / 1024 KB per MB
Calculating File Size
Measure Height & Width in inches, to match pixels per inch
https://www.library.cornell.edu/preservation/tutorial/intro/intro-06.html
35
Example: a 4”x5” color print scanned at the scanner’s maximum resolution of 2000 ppi and 24-bit color
(4” x 5” x 24 x 2000 x 2000) / 8 bits per byte / 1024 bytes per KB / 1024 KB per MB = nearly 229 MB
after NMSRCA
Calculating File Size: Max Resolution Example
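The file size formula is easy to script when planning storage for a whole project. This sketch reproduces both worked examples from the slides; the function name is hypothetical, and the result applies to uncompressed files only.

```python
def uncompressed_size_mb(height_in, width_in, bit_depth, ppi):
    """Uncompressed file size in MB for an original measured in inches."""
    bits = height_in * width_in * bit_depth * ppi ** 2
    return bits / 8 / 1024 / 1024  # bits -> bytes -> KB -> MB


print(round(uncompressed_size_mb(4, 5, 24, 300), 2))   # 5.15
print(round(uncompressed_size_mb(4, 5, 24, 2000), 1))  # 228.9
```

Multiplying the per-image figure by the number of items, and again by the number of backup copies, gives a rough storage estimate for the project.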
36
The format you select determines how the image is stored, what programs you can use to open it, and to what degree the image can be manipulated once it is opened.
File Formats
NMSRCA
37
• Lossless (as opposed to lossy file formats like JPG)
• The most widely used, widely supported bitmapped file format
• Can support any dimensions, any resolution, and any bit depth
• Can encode bi-tonal, grayscale, RGB, and CMYK color modes
• Can be saved in compressed and uncompressed formats
Tagged Image File Format (TIFF)
after NMSRCA
38
• Tonal Depth: 8-bit grayscale/24-bit RGB color
• File Format: TIFF
• Scale: 100%
• Compression: Uncompressed
• Spatial Resolution: 4000 pixels across the long dimension
NMSRCA Imaging Project Standards for Masters
NMSRCA
39
FADGI Guidelines for Reflection Scanning
• Format: 8 x 10” or smaller: 4000 pixels along long edge
• Format: Larger than 8 x 10” up to 11 x 14”: 6000 pixels along long edge
• Format: Larger than 11 x 14”: 8000 pixels along long edge
40
Planning for Resolution (What ppi to Use)
What ppi should you use to achieve the spatial resolution of 4000 pixels across the long dimension?
Length in inches x Resolution in ppi = number of pixels across length
Resolution in ppi = number of pixels across length / length in inches
Example: 8 x 10” photo
Resolution in ppi = 4000 pixels / 10 inches
Resolution = 400 ppi
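The same division can be combined with the size thresholds from the reflection scanning slide (4000, 6000, or 8000 pixels along the long edge). The sketch below is one reading of those thresholds; the function names are hypothetical, and rounding up to a whole ppi is an assumption made so the pixel target is always met. In practice you would pick the next setting your scanner offers at or above the result.

```python
import math


def target_long_edge_pixels(side_a_in, side_b_in):
    """Pixel target along the long edge, by original size in inches."""
    short_in, long_in = sorted((side_a_in, side_b_in))
    if short_in <= 8 and long_in <= 10:
        return 4000
    if short_in <= 11 and long_in <= 14:
        return 6000
    return 8000


def required_ppi(side_a_in, side_b_in):
    """Smallest whole ppi that reaches the pixel target on the long edge."""
    long_in = max(side_a_in, side_b_in)
    return math.ceil(target_long_edge_pixels(side_a_in, side_b_in) / long_in)


print(required_ppi(8, 10))   # 400
print(required_ppi(4, 5))    # 800
print(required_ppi(11, 14))  # 429
```

Smaller originals need a higher ppi to reach the same pixel count, which is why a 4 x 5” print calls for twice the setting of an 8 x 10”.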
41
Access Images
NMSRCA
42
• Tonal Depth: 8-bit grayscale/24-bit RGB color
• File Format: JPEG (lossy, but display in browsers is supported)
• Compression: Medium
• Spatial Resolution: Maximum width of 640 pixels by a maximum height of 480 pixels
• Tonal Range: Adjust high and low input levels to encompass the information in the scan; adjust midpoint input level for best monitor display
NMSRCA Imaging Project Standards for Access Images
NMSRCA
43
Deriving Access Images
• Open the master image in an image editing program such as Adobe Photoshop
• Resize the image for monitor display at actual pixel size
• Adjust tonal range for monitor display
• Keep a record of all modifications made to the access image
• Save the file in a manageable file format suited for high-speed delivery
• Can use batch processing
after NMSRCA
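When resizing in batch, the target size for each access image can be worked out from the master's pixel dimensions. This sketch assumes the 640 by 480 limit in the access standard is meant as pixel dimensions, keeps the aspect ratio, and never enlarges a small master; the function name is hypothetical, and the actual resizing would be done in your image editor.

```python
def access_dimensions(width_px, height_px, max_width=640, max_height=480):
    """Scale pixel dimensions to fit inside the access-image limits."""
    # Use the tighter of the two limits; cap at 1.0 so images are never enlarged.
    scale = min(max_width / width_px, max_height / height_px, 1.0)
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


print(access_dimensions(4000, 3200))  # (600, 480)  landscape master
print(access_dimensions(3200, 4000))  # (384, 480)  portrait master
print(access_dimensions(320, 240))    # (320, 240)  already small enough
```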
44
Image Capture Equipment: Scanner
after NMSRCA
45
Copy Table
46
Photographic Lights, Daylight color temp 5000K - 5500K
http://archivehistory.jeksite.org/chapters/appendixd.htm
47
Camera Settings
• Using your “studio lighting” setup, set White Balance using photo of white card (or white sheet of paper)
• ISO (ASA) 100 (for less “grainy” appearance)
• Output: RAW (requires developing into TIFF) or TIFF
• Exposure: set using gray card
48
Other Formats: Scanning Slides with DSLR
http://www.scantips.com/es-1.html
49
Other Formats: Oversize
Low-Cost Tilt Top Vacuum Table for Digital Capture of Newspapers and Other Large Paper Objects
http://www.wilhelm-research.com/VacuumTable/WIR-CFI_Tilt-Top_%20Vacuum_Table_Guide_2013_05_22_v3.pdf
50
Scanning: Creating the Digital Image
• Allow the scanner to warm up
• Start each session with a clean platen; check the platen for dust or streaks between scans
• Remove dust from the source material
• Place the image square and securely on the glass
• Set the scanning software according to your image size and type, and your corresponding standards -- turn off all autocorrection
• Run a preview of the scan; if it appears satisfactory, scan the image as a TIFF file
after NMSRCA
51
Scanning: Adjusting Tonal Input Values
• When using your scanning software’s automatic tonal controls, be sure that no tonal information is lost in the scan
• Loss will most often be seen in shadows when scanning from positive images or in highlights when scanning from negative images
• If it is necessary to bypass automatic tonal controls and manually set tonal input values, aim for low contrast in your master image
• Keep a record of all automatic and manual settings
NMSRCA
52
Setting Black and White Points from Scanner Preview
Scantips.com
53
Image Scanned with Black and White Points Set
http://www.scantips.com/simple4.html
54
Exceptions in Setting Black and White Points
• When source image does not have full tonal range
• In this instance, resetting White Point would ‘wash out’ the image.
Scantips.com
55
Refer to histogram to adjust the White Point
Scantips.com
Levels Tool
56
Refer to histogram to adjust the Black Point
Scantips.com
Levels Tool
57
Refer to Histogram to Adjust Brightness
This provides a much better result than the editor tools named Brightness and Contrast
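To make the Levels adjustment concrete, here is a sketch of the arithmetic commonly used for it on a single 8-bit pixel value: values at or below the Black Point become 0, values at or above the White Point become 255, and the midpoint slider acts as a gamma curve in between. This is an illustration under those assumptions, with a hypothetical function name; individual editors and scanner programs may implement the midpoint differently.

```python
def apply_levels(value, black_point, white_point, midpoint_gamma=1.0, max_value=255):
    """Remap one pixel value given Black Point, White Point, and midpoint gamma."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    # Stretch the chosen input range to 0..1, clipping anything outside it.
    normalized = (value - black_point) / (white_point - black_point)
    normalized = min(1.0, max(0.0, normalized))
    # Gamma above 1.0 brightens midtones; below 1.0 darkens them.
    adjusted = normalized ** (1.0 / midpoint_gamma)
    return round(adjusted * max_value)


print(apply_levels(20, 20, 230))   # 0   (Black Point maps to pure black)
print(apply_levels(230, 20, 230))  # 255 (White Point maps to pure white)
print(apply_levels(10, 20, 230))   # 0   (darker than the Black Point: clipped)
```

The clipping in the last line is the reason to read the histogram first: any tones outside the chosen points are lost, which is the "wash out" risk noted for images without a full tonal range.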
58
Curve Tool Provides Even More Control than Levels
Once you are comfortable adjusting the levels, start experimenting with using the Curve tool instead. It can do the same things, but offers more control.
59
Curve Tool
Experiment during today’s hands-on portion
See the walk-through at http://www.scantips.com/curve.html and http://www.scantips.com/curve/
60
FADGI Recommendation to Aid Tone and Color Reproduction
• Also include reference target when scanning image
• Depending on scanner software, reference target can be used to pick Black Point and White Point
61
Targets for Tone and Color Reproduction
• FADGI recommends including reference target in preservation master
• This particular target was developed for them
62
Kodak Q-13 (8” long) or Q-14 (14”) Gray Scale
FADGI recommends these because they are printed on black & white photographic paper
63
Aimpoint for Photographic Gray Scales
• Reference targets usually cropped for access copies of images
• Color bars “as supplement” -- color is “not consistent” (FADGI)
• Note: Ruler on Kodak target “not very accurate” (FADGI)
64
Color Management -- ICC Color Profiles
65
Color Management
ICC Color Profiles
• International Standard by International Color Consortium (ICC)
• ICC profile = a set of data that characterizes a color input or output device
  • Describes your particular device, at this point in time (age of its parts), and in current environmental conditions (if seasonal fluctuations)
Step 1: Calibrate Monitor
Step 2: Profile Scanner (or digital camera)
66
Color Management Step 1: Calibrate Monitor
• Need: Calibration device (colorimeter) & manufacturer’s software (~$150-250)
  • Spyder (Datacolor Spyder4Pro)
  • X-Rite’s ColorMunki or i1Display
  • NEC Color Sensor
• Turn on monitor, let warm up at least 30 minutes
• Use light source you’ll use when working (Curtains drawn? Desk lamp?)
• Calibrate colorimeter to ambient light, then place over monitor in location indicated. Software displays known color values and uses colorimeter to measure monitor’s performance
• Resulting data saved as ICC profile in your system software to tell computer how to use monitor to accurately display image data
  • Proper location in system preferences chosen by default by your calibration software
• Repeat every 2-4 weeks
67
Spyder 4 colorimeter
From YouTube video by Kirk Norbury
68
FADGI: NARA Monitor Adjustment Target
FADGI guidelines recommend assessing the monitor visually after calibration
https://www.library.cornell.edu/preservation/tutorial/presentation/presentation-07.html
69
Color Management
Step 2: Profile Scanner (or digital camera)
• Need: IT8 target and IT8-enabled software
• IT8 = a set of American National Standards Institute (ANSI) standards for color control specifications
• IT8 targets are photographically printed in small batches to strict specifications, and then each color swatch is read with a spectrophotometer
• Spectrophotometer data is used to create a data file that is the exact color profile of that specific batch of targets
• IT8-enabled software will compare known color values to values read by your scanner and create an ICC color profile for your scanner, which you apply to your scanned images
70
IT8 Target
Keep target protected from light, dust, and temperature extremes
71
Obtaining an IT8 Target
• May come with scanner
• Vendors:
  • EGM http://www.egm.es/servicios/servicios_interior/18
  • Wolf Faust ($10 with shipping) http://www.targets.coloraid.de/
  • SilverFast (integrates with SilverFast software)
  • Kodak
• “Reflective” target for calibration for scanning prints
• “Transmissive” target for calibration for scanning film or transparencies
72
IT8 Target Batch Number
• Each manufacturer has a unique code to indicate which batch of targets this one belongs to
• Code corresponds to known, highly precise measurements of actual color of this batch
  • Dataset (aka reference file) comes with target, or can be downloaded
• The IT8-enabled software must be told which data set to use
  • Compare batch name on scanned image with reference file selected for software
  • Barcode batch number in automated system
73
Color Profiling Your Scanner
• Turn on scanner and let warm up for at least 30 minutes
• Disable any auto-correction features on scanner (e.g. White Balance, exposure, etc.)
• Scan IT8 target at 200 dpi and save as uncompressed TIFF
• Import TIFF into IT8-enabled software for processing against reference file to create ICC color profile for your scanner
74
IT8-Enabled Software
• ~$80-300 (sometimes bundled with scanner)
  • ExactScan Pro (Windows, Mac, Linux)
  • SilverFast (Windows, Mac)
  • VueScan (Windows, Mac, Linux)
  • Profile Prism (Windows) -- also supplier of 35mm IT8 target
• Open-source software -- free
  • CoCa, ICC Color Profiler for Digital Cameras and Scanners (Windows, Linux) - beta
  • Rough Profiler (enables CoCa for Mac)
75
CoCa
76
Digital Camera: Types of Targets Read by CoCa
Full list of targets at http://www.dohm.com.au/coca
Includes the IT8 target required for scanner calibration, and many types used for digital camera calibration
77
Profiling a Digital SLR Camera with an IT8 Target
• You’ll shoot the IT8 target in bright sunlight
  • Tape the target to thick cardboard; the target will bend as it heats in sunlight
• Use white card (or sheet of paper) to take image for internal white balance feature
• Take several shots of target in RAW mode, starting with normal exposure and increasing
• Fill about 3/4 of screen with target
• Use IT8-enabled software to create ICC color profile for your camera
78
Software for profiling cameras
• Any of the previously listed IT8-enabled software
• Additional target/software combinations listed at
  http://www.silverfast.com/show/dc-targets/en.html
  http://www.cmp-color.fr/eng%20digital%20target.html
  http://www.cmp-color.fr/E_CMP_Shop.html
• Step-by-step guide for profiling with IT8 target
  http://www.steves-digicams.com/knowledge-center/profiling-a-camera-with-an-it8-target.html
79
What to Do with the ICC Scanner Profile
• This is different from the monitor profile, which just sits in your system telling the computer how to use the monitor
• Some scanners let you apply the scanner profile to the scanner
  • Scanner will then automatically attach profile to output, and your images will display with true color
• If you can’t apply the profile to the scanner, you will need to attach it to the image in your image processing software
80
Applying the ICC Scanner Profile (in Photoshop)
http://www.booksmartstudio.com/color_tutorial/scanners.html
81
Then Convert to a Standard Color Space
• FADGI recommendation: Save in a standard color space for convenience and digital preservation
  • sRGB for color images
  • Another standard space is Adobe RGB (1998)
  • Gray Gamma 2.2 for grayscale images
82
Converting to a Standard Color Space
83
Assigning color profile using GraphicConverter
84
Raster Image (Photo) Editing Software Options
ICC-profile enabled options include:
• Adobe Photoshop Lightroom (Windows, Mac) ~$115-150
• GraphicConverter (Mac) $40
• GIMP (GNU Image Manipulation Program [formerly General Image Manipulation Program]) (Windows, Mac, etc.) open-source, free
85
Now You are a Master of Color Management Using ICC Color Profiles!
86
Digitization Hands-on Demo
87