The Sedona Conference Journal®

Volume 15, Fall 2014

In Memoriam: Richard G. Braman . . . Craig W. Weinlein

Responding to the Government’s Civil Investigations . . . David C. Shonka

What’s the Problem with Google? . . . Daniel R. Shulman

Technology-Assisted Review: The Judicial Pioneers . . . Paul E. Burns & Mindy M. Morton

The Sedona Conference Commentary on Patent Damages & Remedies . . . The Sedona Conference

The Sedona Conference Commentary on Information Governance . . . The Sedona Conference

The Sedona Conference Database Principles Addressing the Preservation & Production of Databases & Database Information in Civil Litigation . . . The Sedona Conference

The Sedona Conference Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery . . . The Sedona Conference

The Sedona Conference Commentary on Achieving Quality in the E-Discovery Process . . . The Sedona Conference

The Sedona Conference Glossary: E-Discovery & Digital Information Management (Fourth Edition) . . . The Sedona Conference

Articles

Antitrust Law, Complex Litigation and Intellectual Property Rights

The Sedona Conference Journal® (ISSN 1530-4981) is published on an annual basis, containing selections from the preceding year’s Conferences and Working Group efforts.

The Journal is available on a complimentary basis to courthouses and public law libraries and by subscription to others ($95; $45 for Conference participants and Working Group members).

Send us an email ([email protected]) or call (1-602-258-4910) to order or for further information. Check our website for further information about our Conferences, Working Groups, and publications: www.thesedonaconference.org.

Comments (strongly encouraged) and requests to reproduce all or portions of this issue should be directed to:

The Sedona Conference, 5150 North 16th Street, Suite A-215, Phoenix, AZ 85016 or call 1-602-258-4910; fax 602-258-2499; email [email protected].

The Sedona Conference Journal® designed by MargoBDesign.com – [email protected]

Cite items in this volume to “15 Sedona Conf. J. _____ (2014).”

Copyright 2014, The Sedona Conference. All Rights Reserved.

*Copyright 2014, The Sedona Conference. All Rights Reserved.

2014 THE SEDONA CONFERENCE JOURNAL 305

THE SEDONA CONFERENCE GLOSSARY: E-DISCOVERY & DIGITAL INFORMATION MANAGEMENT* (FOURTH EDITION)

A Project of The Sedona Conference Working Group on Electronic Document Retention & Production (WG1).

Editors:

Sherry B. Harris, Crowley Law Office

Paul H. McVoy, Milberg LLP

RFP+ eDiscovery Vendor Panel (see www.thesedonaconference.org for a listing of the RFP+ eDiscovery Vendor Panel members)

Contributing Editor: Matt K. Mulder, KPMG

Thanks go to all who participated in the dialogue that led to this Commentary. We thank all of our Working Group Series Sustaining and Annual Sponsors, whose support is essential to our ability to develop Working Group Series publications. For a listing of our sponsors just click on the “Sponsors” Navigation bar on the homepage of our website.

The opinions expressed in this publication, unless otherwise attributed, represent consensus views of the members of The Sedona Conference Working Group 1. They do not necessarily represent the views of any of the individual participants or their employers, clients, or any other organizations to which any of the participants belong, nor do they necessarily represent official positions of The Sedona Conference.*

PREFACE

The Sedona Conference Glossary is published as a tool to assist in the understanding and discussion of electronic discovery and electronic information management issues. It is not intended to be an all-encompassing replacement of existing technical glossaries published by ARMA International (www.arma.org), American National Standards Institute (www.ansi.org), International Organization for Standardization (www.iso.org), U.S. National Archives & Records Administration (www.archives.gov) and other professional organizations.

The Sedona Conference acknowledges the contributions of all the Working Group 1 and RFP+ eDiscovery Vendor Panel members who reviewed and commented on drafts of this Fourth Edition, and thanks all of our Working Group Series Sponsors, whose support is essential to the ability of The Sedona Conference to develop Working Group Series publications such as this. For a complete listing of our Sponsors, click on the “Sponsors” navigation bar on the home page of our website at http://www.thesedonaconference.org.

As with all of our publications, your comments are welcome. Please forward them to me at [email protected].

Craig Weinlein
Executive Director
The Sedona Conference
Phoenix, AZ
USA
April 2014

306 THE SEDONA CONFERENCE GLOSSARY VOL. XV

30(b)(6): A shorthand reference to Rule 30(b)(6) of the Federal Rules of Civil Procedure, under which a corporation, partnership, association or governmental agency is subject to the deposition process, and required to provide one or more witnesses to “testify as to matters known or reasonably available to the organization” on the topics requested by the deposition notice. Sometimes the 30(b)(6) topics concern the discovery process itself, including procedures for preservation, collection, chain of custody, processing, review, and production.

Ablate: To burn laser-readable “pits” into the recorded layer of optical disks, DVD-ROMs and CD-ROMs.

Ablative: Unalterable data. See Ablate.

Access Control List (ACL): A security group comprised of individual users or user groups that is used to grant similar permissions to a program, database or other security-controlled environment.

Active Data: Information residing on the direct access storage media (disk drives or servers) that is readily visible to the operating system and/or application software with which it was created. It is immediately accessible to users without restoration or reconstruction.

Active Records: Records related to current, ongoing or in-process activities referred to on a regular basis to respond to day-to-day operational requirements. See Inactive Record.

Address: A structured format for identifying the specific location or routing detail for information on a network or the Internet. These include email addresses (Simple Mail Transfer Protocol, or SMTP), Internet Protocol (IP) addresses and Uniform Resource Locators (URLs), commonly known as Web addresses.

Adware: See Spyware.

Agent: A program running on a computer that performs as instructed by a central control point to track file and operating system events and takes directed actions, such as transferring a file or deleting a local copy of a file, in response to such events.

AIIM: The Association for Information and Image Management (http://www.aiim.org), which focuses on Enterprise Content Management (ECM).

Algorithm: With regard to electronic discovery, a computer script that is designed to analyze data patterns using mathematical formulas, and is commonly used to group or find similar documents based on common mathematical scores.

Alphanumeric: Characters composed of letters, numbers (and sometimes non-control characters, such as @, #, $). Excludes control characters.

Ambient Data: See Data.

American National Standards Institute (ANSI): http://www.ansi.org – a private, non-profit organization that administers and coordinates the U.S. voluntary standardization and conformity assessment system.

American Standard Code for Information Interchange (ASCII, pronounced “ask-ee”): A non-proprietary text format built on a set of 128 (or 256 for extended ASCII) alphanumeric and control characters. Documents in ASCII format consist of only text with no formatting and can be read by most computer systems.
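
As a minimal sketch of the 128-character set described above, the following Python lines look up the numeric code assigned to each character of a short string. The sample string is hypothetical and chosen only for illustration.

```python
# Illustrative only: inspect the ASCII codes behind a short, made-up string.
text = "Bates-001"

# ord() returns the numeric code assigned to each character.
codes = [ord(ch) for ch in text]

# Standard (non-extended) ASCII uses codes 0 through 127.
is_plain_ascii = all(code < 128 for code in codes)

# Encoding as ASCII raises UnicodeEncodeError if any character falls outside the set.
encoded = text.encode("ascii")
```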

Analog: Data in an analog format is represented by continuously variable, measurable, physical quantities such as voltage, amplitude or frequency. Analog is the opposite of digital.

Annotation: The changes, additions or editorial comments made or applicable to a document – usually an electronic image file – using electronic sticky notes, highlighter or other electronic tools. Annotations should be overlaid and not change the original document.

Aperture Card: An IBM punch card with a window that holds a 35mm frame of microfilm. Indexing information is punched in the card.

Applet: A program designed as an add-on to another program, allowing greater functionality for a specific purpose other than that for which the original program was designed.

Appliance: A prepackaged piece of hardware and software designed to perform a specific function on a computer network, for example, a firewall.

Application: Software that is programmed for one or more specific uses or purposes. The term is commonly used in place of “program” or “software.” Applications may be designed for individual users, for example, a word processing program, or for multiple users, as in an accounting application used by many users at the same time.

Application Programming Interface (API): The specifications designed into a program that allow interaction with other programs. See Mail Application Programming Interface (MAPI).

Application Service Provider (ASP): An Internet-based organization hosting applications on its own servers within its own facilities. Customers license the application and access it through a browser over the Internet or via some other network. See Software as a Service (SaaS).

Architecture: Refers to the hardware, software or combination of hardware and software comprising a computer system or network. “Open architecture” describes computer and network components that are more readily interconnected and interoperable. “Closed architecture” describes components that are less readily interconnected and interoperable.

Archival Data: Information an organization maintains for long-term storage and recordkeeping purposes but which is not immediately accessible to the user of a computer system. Archival data may be written to removable media such as a CD, magneto-optical media, tape or other electronic storage device or may be maintained on system hard drives. Some systems allow users to retrieve archival data directly while other systems require the intervention of an IT professional.

Archive, Electronic: Long-term repositories for the storage of records. Electronic archives preserve the content, prevent or track alterations, and control access to electronic records.

ARMA International: A not-for-profit association and recognized authority on managing records and information, both paper and electronic, http://www.arma.org.

Artificial Intelligence (AI): A subfield of computer science focused on the development of intelligence in machines so that the machines can react and adapt to their environment and the unknown. AI is the capability of a device to perform functions that are normally associated with human intelligence, such as reasoning and optimization through experience. It attempts to approximate the results of human reasoning by organizing and manipulating factual and heuristic knowledge. Areas of AI activity include expert systems, natural language understanding, speech recognition, vision, and robotics.

Aspect Ratio: The relationship of the height to the width of any image. The aspect ratio of an image must be maintained to prevent distortion.

Association for Computing Machinery (ACM): Professional association for computer professionals with a number of resources, including a special interest group on search and retrieval. See http://www.acm.org.

Attachment: A record or file associated with another record for the purpose of retention, transfer, processing, review, production and routine records management. There may be multiple attachments associated with a single “parent” or “master” record. In many records and information management programs or in a litigation context, the attachments and associated record(s) may be managed and processed as a single unit. In common use, this term often refers to a file (or files) associated with an email for retention and storage as a single Message Unit. See Document (or Document Family), Message Unit and Unitization.

Attribute: A specific property of a file such as location, size or type. The term attribute is sometimes used synonymously with “data element” or “property.”

Audio-Video Interleave (AVI): A Microsoft standard for Windows animation files that interleaves audio and video to provide medium quality multimedia.

Audit Log or Audit Trail: An automated or manual set of chronological records of system activities that may enable the reconstruction and examination of a sequence of events and/or changes in an event.

Author or Originator: The person, office or designated position responsible for an item’s creation or issuance. In the case of a document in the form of a letter, the author or originator is usually indicated on the letterhead or by signature. In some cases, a software application producing a document may capture the author’s identity and associate it with the document. For records management purposes, the author or originator may be designated as a person, official title, office symbol or code.

Auto-Delete: The use of technology to run predefined rules at scheduled intervals to delete or otherwise manage electronically stored information. May also be referred to as a janitor program or system cleanup.

Avatar: A graphical representation of a user in a shared virtual reality, such as Web forums or chat rooms.

Backbone: The top level of a hierarchical network. It is the main channel along which data is transferred.

Backup: The process of creating a copy of active data as a precaution against the loss or damage of the original data. The process is usually automated on a regular schedule, which can include the automatic expiration of older versions. The term is also used to refer to the Electronically Stored Information itself, as in, “a backup of the email server exists.” Backups can be made to any type of storage, including portable media, CDs, DVDs, data tapes or hard drives – also known as a full backup. See Differential Backup and Incremental Backup.

Backup Data: A copy of Electronically Stored Information that serves as a source for recovery in the event of a system problem or disaster. See Backup.

Backup Tape: Magnetic tape used to store copies of Electronically Stored Information, for use when restoration or recovery is required. The creation of backup tapes is made using any of a number of specific software programs and usually involves varying degrees of compression.

Backup Tape Rotation or Recycling: The process whereby an organization’s backups are overwritten with new data, usually on an automated schedule which should be determined by IT in consultation with records management and legal. For example, the use of nightly backup tapes for each day of the week – with the daily backup tape for a particular day being overwritten on the same day the following week.

Bandwidth: The amount of data a network connection can accommodate in a given period of time. Bandwidth is usually stated in kilobits per second (kbps), megabits per second (Mbps) or gigabits per second (Gbps).
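
Because bandwidth is stated in bits while file sizes are usually stated in bytes, a conversion by a factor of eight is needed when estimating transfer times. The sketch below is an idealized calculation under stated assumptions: the function name, the 1 GB collection size and the 100 Mbps link are hypothetical, and real transfers are slower because of protocol overhead and latency.

```python
def transfer_seconds(size_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Idealized transfer time; ignores protocol overhead and latency."""
    size_bits = size_bytes * 8  # there are eight bits in a byte
    return size_bits / bandwidth_bits_per_second


# Hypothetical example: 1 GB (10**9 bytes) sent over a 100 Mbps connection.
seconds = transfer_seconds(10**9, 100 * 10**6)
```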

Bar Code: A small pattern of vertical lines or dots that can be read by a laser or an optical scanner. In records management and electronic discovery, bar codes may be affixed to specific records for indexing, tracking and retrieval purposes.

Basic Input Output System (BIOS): The set of user-independent computer instructions stored in a computer’s ROM, immediately available to the computer when the computer is turned on. BIOS information provides the code necessary to control the keyboard, display screen, disk drives and communication ports in addition to handling certain miscellaneous functions.

Batch File: A set of commands written for a specific program to complete a discrete series of actions, for example, renaming a series of files en masse.

Batch Processing: The processing of multiple sets of Electronically Stored Information at one time. See Processing Data.

Bates Number: Sequential numbering system used to identify individual pages of documents where each page or file is assigned a unique number. Often used in conjunction with a suffix or prefix to identify a producing party, the litigation or other relevant information. See Beginning Document Number and Production Number.
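
A minimal sketch of sequential numbering with a prefix follows. The function name, the prefix "ABC" and the seven-digit zero padding are assumptions made for illustration; actual numbering schemes are set by the parties or the production protocol.

```python
def bates_numbers(prefix: str, start: int, page_count: int, width: int = 7) -> list:
    """Return one zero-padded, prefixed label per page (hypothetical scheme)."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]


# A three-page document produced by a party using the made-up prefix "ABC".
labels = bates_numbers("ABC", 1, 3)

# The first label would serve as the document's beginning number (BegDoc#).
beg_doc = labels[0]
```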

Bayesian Search: An advanced search that utilizes the statistical approach developed by Thomas Bayes, an 18th century mathematician and clergyman. Bayes published a theorem that describes how to calculate conditional probabilities from the combinations of observed events and prior probabilities. Many information retrieval systems implicitly or explicitly use Bayes’ probability rules to compute the likelihood that a document is relevant to a query. For a more thorough discussion, see The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery, available for download at https://thesedonaconference.org/download-pub/3669.
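
The conditional-probability calculation behind such systems can be sketched with Bayes' theorem applied to a single search term. This is only an illustration, not a description of any particular retrieval product: the function name and all three probabilities are invented for the example.

```python
def posterior_relevance(prior: float,
                        p_term_given_relevant: float,
                        p_term_given_not_relevant: float) -> float:
    """Bayes' theorem: P(relevant | term appears in the document)."""
    # Total probability that the term appears in any document.
    evidence = (p_term_given_relevant * prior
                + p_term_given_not_relevant * (1 - prior))
    return p_term_given_relevant * prior / evidence


# Made-up figures: 10% of documents are relevant; the term appears in 60% of
# relevant documents and in 5% of non-relevant documents.
p = posterior_relevance(0.10, 0.60, 0.05)
```

Under these invented figures, seeing the term raises the estimated probability of relevance from 10 percent to a little over 57 percent.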

Beginning Document Number or BegDoc#: A unique number identifying the first page of a document or a number assigned to identify a native file.

Big Data: Enormous volumes of data, often distributed and loosely structured, that may be challenging to process with traditional technology solutions.

Bibliographic Coding: Manually recording objective information from documents such as date, authors, recipients, carbon copies, blind copies, and associating the information with a specific document. See Indexing and Coding.

Binary: The Base 2 numbering system used in digital computing that represents all numbers using combinations of zero and one.

Bit: Binary digit – the smallest unit of computer data. A bit consists of either 0 or 1. There are eight bits in a byte. See Byte.
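
As a brief illustration of the Binary and Bit entries, the lines below write one number as the eight bits of a single byte and convert it back. The value 77 is arbitrary.

```python
value = 77

# Format the number as eight binary digits (one byte), padding with zeros.
bits = format(value, "08b")

# Interpret the string of zeros and ones as a Base 2 number again.
round_trip = int(bits, 2)
```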

Bitmap: A file format that contains information on the placement and color of individual bits used to convey images composed of individual bits (pixels), for which the system file extension is .bmp.

Bit Stream Backup: A sector-by-sector/bit-by-bit copy of a hard drive; an exact copy of a hard drive, preserving all latent data in addition to the files and directory structures. See Forensic Copy.

Cited: Nucor Corp. v. Bell, 2008 WL 4442571, D.S.C., January 11, 2008 (C/A No. 2:06-CV-02972-DCN.), quoting The Sedona Conference® Glossary: E-Discovery & Digital Information Management, May 2005 First Edition (“A Bit Stream Back-up is an exact copy of a hard drive, preserving all latent data in addition to the files and directory structures. Bit Stream Back-up may be created using applications such as EnCase.”).

Cited: U.S. v. Saboonchi, 2014 WL 1364765 (D. Md. Apr. 7, 2014).

Bitonal: A bitonal image uses only black and white.

Bits Per Inch (BPI): A unit of measure of data densities in disk and magnetic tape systems.

Bits Per Second (BPS): A measurement of the rate of data transfer. See Bandwidth.

Blowback: The term for printing Electronically Stored Information to hardcopy.

BMP: See Bitmap.

Bookmark: A link to another location, either within the current file or location or to an external location, like a specific address on the internet.

Boolean Search: Boolean searches use keywords and logical operators such as “and,” “or,” and “not” to include or exclude terms from a search, and thus produce broader or narrower search results. See Natural Language Search.
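
The narrowing and broadening effect of the operators can be sketched over a tiny, invented document set. The document identifiers, texts and the `terms` helper are hypothetical; real search tools also handle stemming, phrases and proximity, which this sketch does not.

```python
# Three made-up documents keyed by a made-up identifier.
documents = {
    "DOC-1": "merger agreement draft sent to counsel",
    "DOC-2": "lunch menu for the merger celebration",
    "DOC-3": "draft agreement for office lease",
}


def terms(text: str) -> set:
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())


# Narrower query: merger AND agreement NOT lunch
narrow = [doc_id for doc_id, text in documents.items()
          if {"merger", "agreement"} <= terms(text) and "lunch" not in terms(text)]

# Broader query: merger OR lease
broad = [doc_id for doc_id, text in documents.items()
         if terms(text) & {"merger", "lease"}]
```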

Boot Sector/Record: See Master Boot Sector/Record and Volume Boot Sector/Record.

Broadband: Commonly used in the context of high bandwidth Internet access made available through a variety of quickly evolving technologies.

Brontobyte: 1,024 yottabytes. See Byte.

Browser: An application used to view and navigate the World Wide Web and other Internet resources.

Bulletin Board System (BBS): A computer system or service that users access to participate in electronic discussion groups, post messages and/or download files.

Burn: The process of moving or copying data to portable media such as a CD or DVD.

Bus: A parallel circuit that connects the major components of a computer, allowing the transfer of electric impulses from one connected component to any other.

Byte (Binary Term): A basic measurement of most computer data consisting of 8 bits. Computer storage capacity is generally measured in bytes. Although characters are stored in bytes, a few bytes are of little use for storing a large amount of data. Therefore, storage is measured in larger increments of bytes. See Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte, Yottabyte, Brontobyte and Geopbyte (listed here in order of increasing volume).
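
A minimal sketch of moving up this ladder of units follows. It assumes that each named unit is 1,024 times the one before it, as the Brontobyte entry (1,024 yottabytes) suggests; the function name and sample sizes are made up for illustration.

```python
# Units in order of increasing volume, as listed in the Byte entry.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes", "yottabytes",
         "brontobytes", "geopbytes"]


def describe(size_in_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(size_in_bytes)
    unit = 0
    # Assumption: every step up the list is a factor of 1,024.
    while value >= 1024 and unit < len(UNITS) - 1:
        value /= 1024
        unit += 1
    return f"{value:.2f} {UNITS[unit]}"


small = describe(1536)
large = describe(1024 ** 3)
```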

Cache: A dedicated, temporary, high speed storage location that can be used to store frequently-used data for quick user access, allowing applications to run more quickly.

Case De-Duplication: Eliminates duplicates to retain only one copy of each document per case. For example, if an identical document resides with three custodians, only the first custodian’s copy will be saved. Also known as Cross Custodial De-Duplication, Global De-Duplication or Horizontal De-Duplication.
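
One common way to detect identical documents across custodians is to compare hash values of their content. The sketch below assumes SHA-256 hashing and uses invented custodian names and file contents; it is not a description of any specific processing tool, and real workflows typically hash selected metadata fields as well.

```python
import hashlib

# Made-up collection: (custodian, file content) pairs in collection order.
collected = [
    ("Custodian A", b"Q3 forecast v2"),
    ("Custodian B", b"Q3 forecast v2"),   # identical to Custodian A's copy
    ("Custodian C", b"Board minutes"),
]

# Map each content hash to the first custodian whose copy was seen.
retained = {}
for custodian, content in collected:
    digest = hashlib.sha256(content).hexdigest()
    retained.setdefault(digest, custodian)  # later duplicates are not saved

unique_count = len(retained)
kept_custodians = list(retained.values())
```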

Catalog: See Index.

CCITT Group 4: A lossless compression technique/format that reduces the size of a file, generally about 5:1 over RLE and 40:1 over bitmap. CCITT Group 4 compression may only be used for bitonal images.

CD-R, CD+R (Compact Disk Recordable): See Compact Disk.

CD-RW (Compact Disk Re-Writable): See Compact Disk.

CD-ROM (Compact Disk Read-Only Memory): See Compact Disk.

Cellular Digital Packet Data (CDPD): A data communication standard utilizing the unused capacity of cellular voice providers to transfer data.

Central Processing Unit (CPU): The primary silicon chip that runs a computer’s operating system and application software. It performs a computer’s essential mathematical functions and controls essential operations.

Certificate: An electronic affidavit vouching for the identity of the transmitter. See Digital Certificate, Digital Signature, Public Key Infrastructure (PKI) Digital Signature.

Chain of Custody: Documentation regarding the possession, movement, handling and location of evidence from the time it is identified to the time it is presented in court or otherwise transferred or submitted; necessary both to establish admissibility and authenticity; important to help mitigate risk of spoliation claims.

Checksum: A value calculated on a set of data as a means of verifying its authenticity against a copy of the same set of data, usually used to ensure data was not corrupted during storage or transmission.
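
The comparison described above can be sketched as follows. The example uses SHA-256 from Python's standard hashlib module as the checksum function, which is one choice among several; the sample data and the `checksum` helper name are invented for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"documents produced on the agreed date"
faithful_copy = bytes(original)
altered_copy = original.replace(b"produced", b"Produced")  # one character changed

# Matching values indicate the copy was not corrupted or altered.
copy_matches = checksum(faithful_copy) == checksum(original)
alteration_detected = checksum(altered_copy) != checksum(original)
```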

Child: See Document.

Clawback Agreement: An agreement outlining procedures to be followed if documents or Electronically Stored Information are inadvertently produced; typically used to protect against the waiver of privilege.

Client: (1) In a network, a computer that can obtain information and access applications on a server; (2) an application on a hard drive that relies on a server to perform some operations. See Thin Client.

Client Server: An architecture whereby a computer system consists of one or more server computers and numerous client computers (workstations). The system is functionally distributed across several nodes on a network and is typified by a high degree of parallel processing across distributed nodes. With client-server architecture, CPU-intensive processes (such as searching and indexing) are completed on the server, while image viewing and OCR occur on the client. This dramatically reduces network data traffic and insulates the database from workstation interruptions.

Clipboard: A holding area in a computer’s memory that temporarily stores information copied or cut from a document or file.

Cloud Computing: “[A] model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” http://csrc.nist.gov/publications/PubsSPs.html#800-145 (last visited January 29, 2013). For further discussion see the cited NIST publication SP800-145.pdf.

Cluster (File): The smallest unit of storage space that can be allocated to store a file on operating systems. Windows and DOS organize hard disks based on clusters (also known as allocation units), which consist of one or more contiguous sectors. Disks using smaller cluster sizes waste less space and store information more efficiently.

Cluster (System): A collection of individual computers that appear as a single logical unit. Also referred to as matrix or grid systems.

Cluster Bitmap: Used in NTFS (New Technology File System) to keep track of the status (free or used) of clusters on the hard drive. See NTFS.

Clustering: See Data Categorization.

Coding: Automated or human process by which specific information is captured from documents. Coding may be structured (limited to the selection of one of a finite number of choices) or unstructured (a narrative comment about a document). See Indexing, Verbatim Coding, Bibliographic Coding, Level Coding and Subjective Coding.

Color Graphics Adapter (CGA): See Video Graphics Adapter (VGA).

Comma Separated Value (CSV): A text file used for the transmission of fielded data that separates data fields with a comma and typically encloses data in quotation marks.
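
The quotation marks matter because field values can themselves contain commas. The sketch below uses Python's standard csv module to write and re-read two rows; the field names and values are hypothetical.

```python
import csv
import io

# Made-up fielded data: a header row and one record whose values contain
# a comma and quotation marks.
rows = [
    ["BegDoc", "Custodian", "Subject"],
    ["ABC0000001", "Smith, Jane", 'Re: "Final" draft'],
]

# Write every field enclosed in quotation marks.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
text = buffer.getvalue()
header_line = text.splitlines()[0]

# Reading the text back recovers the original fields, embedded comma included.
parsed = list(csv.reader(io.StringIO(text)))
```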

Commercial Off-the-Shelf (COTS): Hardware or software products that are commercially manufactured, ready-made and available for use by the general public without the need for customization.

Compact Disk (CD): A type of optical disk storage media; compact disks come in a variety of formats. These formats include CD-ROM (CD Read-Only Memory) – read-only; CD-R or CD+R (CD Recordable) – can be written to once and are then read-only; and CD-RW (CD Re-Writable) – can be written to multiple times.

Compliance Search: The identification of and search for relevant terms and/or parties in response to a discovery request.

Compound Document: A file that contains multiple files, often from different applications, by embedding objects or linked data; multiple elements may be included, such as images, text, animation or hypertext. See Container File and OLE.

Compression: The reduction in the size of a source file or files with the use of a variety of algorithms, depending on the software being used. Algorithms approach the task in a variety of ways, generally by eliminating redundant information or by predicting where changes are likely to occur.

Compression Ratio: The ratio of the size of an uncompressed file to that of the compressed file. For example, with a 10:1 compression ratio, a 10 KB file can be compressed to 1 KB.
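
A compression ratio can be measured directly, as in the following sketch. It assumes the zlib algorithm from Python's standard library and uses deliberately repetitive, invented sample text; the ratio actually achieved depends on the algorithm and on how redundant the data is.

```python
import zlib

# Highly repetitive sample text compresses well because it is redundant.
original = b"privileged and confidential " * 200

compressed = zlib.compress(original)

# Ratio of uncompressed size to compressed size (for example, 10.0 means 10:1).
ratio = len(original) / len(compressed)

# Lossless compression restores the original bytes exactly.
restored = zlib.decompress(compressed)
```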

Computer: Includes but is not limited to network servers, desktops, laptops, notebook computers, mainframes and PDAs (personal digital assistants).

Computer Aided Design (CAD): The use of a wide range of computer-based tools that assist engineers, architects and other design professionals in their design activities.

Computer Aided or Assisted Review: See Technology-Assisted Review.

Computer Client or Client: A computer or program that requests a service of another computer system. A workstation requesting the contents of a file from a file server is a client of the file server. Also commonly used as synonymous with an email application, by reference to the Email Client. See Thin Client.

Computer Forensics: The use of specialized techniques for recovery, authentication and analysis of electronic data when an investigation or litigation involves issues relating to reconstruction of computer usage, examination of residual data, authentication of data by technical analysis or explanation of technical features of data and computer usage. Computer forensics requires specialized expertise that goes beyond normal data collection and preservation techniques available to end-users or system support personnel and generally requires strict adherence to chain-of-custody protocols. See Forensics and Forensic Copy.

Computer Output to Laser Disk (COLD): A computer programming process that outputs electronic records and printed reports to laser disk instead of a printer.

Computer Output to Microfilm (COM): A process that outputs electronic records and computer-generated reports to microfilm.

Concatenate: Generally, to add by linking or joining to form a chain or series; the process of linking two or more databases of similar structure to enable the user to search, use, or reference them as if they were a single database.

Concept Search: The method of search that uses word meanings and ideas, without the presence of a particular word or phrase, to locate Electronically Stored Information related to a desired concept. Word meanings can be derived from any of a number of sources, including dictionaries, thesauri, taxonomies and ontologies, or computed mathematically from the context in which the words occur.

Conceptual Analytics: Using one or more of a number of mathematical algorithms or linguistic methodologies to analyze unstructured data by themes and ideas contained within the documents, enabling the grouping or searching of documents or other unstructured data by their common themes or ideas.

Container File: A compressed file containing multiple files; used to minimize the size of the original files for storage and/or transporting. Examples include .zip, .pst and .nsf files. The file must be ripped or decompressed to determine volume, size, record count, etc. and to be processed for litigation review and production. Also see Decompression and Rip.

Cited: Country Vintner of North Carolina, LLC v. E. & J. Gallo Winery, Inc., 718 F.3d 249 (4th Cir. Apr. 29, 2013).

Content Comparison: A method of de-duplication that compares file content or output (to image or paper) and ignores metadata. See De-Duplication.

Contextual Search: Using one of a number of mathematical algorithms or linguistic methodologies to enlarge search results to include not only exact term matches but also matches where terms are considered in context of how and where they frequently occur in a specific document collection or more general taxonomy. For example, a search for the term “diamond” may bring back documents related to baseball but with no reference to the word diamond because they frequently occur within the same documents and therefore have a logical association.

Cookie: A text file containing tracking information such as dates and times of website visits, deposited by a website onto a user’s computer or mobile device. The text file is accessed each time the website is visited by a specific user and updated with browsing and other information. The main purpose of cookies is to identify users and possibly prepare customized websites for them, including the personalization of advertising appearing on the websites.

Coordinated Universal Time (UTC): A high precision atomic time standard with uniform seconds defined by International Atomic Time (TAI) and leap seconds announced at regular intervals to compensate for the earth’s slowing rotation and other discrepancies. Leap seconds allow UTC to closely track Universal time, a time standard based not on the uniform passage of seconds but on the earth’s angular rotation. Time zones around the world are expressed as positive or negative offsets from UTC. Local time is UTC plus the time zone offset for that location plus an offset (typically +1) for daylight saving time, if in effect. For example, 3:00 a.m. Mountain Standard Time = 10:00 UTC – 7. As the zero point reference, UTC is also referred to as Zulu time (Z). See Normalization.
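
The offset arithmetic in the example above can be checked with a short sketch using Python's standard datetime module. The calendar date is a placeholder chosen for illustration, and Mountain Standard Time is modeled as a fixed offset of UTC – 7 with no daylight saving adjustment.

```python
from datetime import datetime, timedelta, timezone

# Mountain Standard Time modeled as a fixed UTC-7 offset (assumption:
# no daylight saving adjustment is in effect).
MST = timezone(timedelta(hours=-7), name="MST")

# Hypothetical date; only the 3:00 a.m. local time comes from the entry.
local = datetime(2014, 1, 15, 3, 0, tzinfo=MST)

# Normalizing to UTC removes the -7 hour offset.
as_utc = local.astimezone(timezone.utc)
print(local.isoformat(), "->", as_utc.isoformat())
```

Both values describe the same instant; only the offset used to express it differs, which is why date and time values are commonly normalized to UTC before comparison.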

Corrupted File: A file that has become damaged in some way, such as by a virus or by software or hardware failure, so that it is partially or completely unreadable by a computer.

Characters Per Inch (CPI): A description of the number of characters that are contained in an inch of backup tape.

Cross-Custodian De-Duplication: The suppression or removal of exact copies of files across multiple custodians for the purposes of minimizing the amount of data for review and/or production. See Case De-Duplication and De-Duplication.

Cryptography: Technique to scramble data to preserve confidentiality or authenticity.

Cull (verb): To remove a document from a collection to be reviewed or produced, or to suppress it from viewing. See Data Filtering and Harvesting.

Custodian: See Record Custodian and Record Owner.

Custodian De-Duplication: The removal or suppression of exact copies of a file found within a single custodian’s data for the purposes of minimizing the amount of data for review and/or production. Also known as Vertical De-duplication. See De-Duplication.

Customer Relationship Management (CRM) Application: Computer program that helps manage communications with clients and contains contact information.

Cyclical Redundancy Checking (CRC): Used in data communications to create a checksum character at the end of a data block to ensure integrity of data transmission and receipt. See Checksum.
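
A minimal sketch of the idea, assuming the CRC-32 variant exposed by Python's standard zlib module (the entry does not name a specific polynomial or checksum length): the sender appends the checksum to the data block, and the receiver recomputes it to detect corruption. The function names frame and verify are illustrative only.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the end of the data block."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(block: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it with the trailer."""
    payload, trailer = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


block = frame(b"hello, world")
# Simulate a transmission error by flipping one bit in the first byte.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]
print(verify(block), verify(corrupted))
```

A CRC detects accidental changes in transit; it is not designed to resist deliberate tampering, which is one reason hash algorithms are used for data verification instead.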

Cylinder: The set of tracks on both sides of each platter in a hard drive that is located at the same head position. See Platter.

Data: Any information stored on a computer, whether created automatically by the computer, such as log files, or created by a user, such as the information entered on a spreadsheet. See Active Data and Latent Data.

Data Categorization: The categorization and sorting of Electronically Stored Information – such as foldering by “concept,” content, subject, taxonomy, etc. – through the use of technology – such as search and retrieval software or artificial intelligence – to facilitate review and analysis.

Data Cell: An individual field of an individual record. For example, in a table containing information about all of a company’s employees, information about employee Joe Smith is stored in a single record, and information about his social security number is stored in an individual Data Cell. See Field.

Data Collection: See Harvesting.

Data Controller (as used in the EU Data Protection Act): The natural or legal person who alone or jointly with others determines the purposes for which and the manner in which any Personal Data are to be processed.

Data Element: A combination of characters or bytes referring to one separate piece of information, such as name, address or age.

Data Encryption Standard (DES): A form of private key encryption developed by IBM in the late 1970’s.

Data Extraction: The process of parsing data, including the text of the file, from any electronic documents into separate metadata fields such as Date Created and Date Last Accessed.

Data Field: See Field.

Data Filtering: The process of identifying data based on specified parameters, such as date range, author(s), and/or keyword search terms, often used to segregate data for inclusion or exclusion in the document culling or review workflow.
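
The parameters named above can be combined as in the following sketch. The document records, field names (date, author, text) and the function filter_documents are hypothetical and stand in for whatever metadata a processing tool actually exposes; keyword matching here is a simple case-insensitive substring test, not a full search syntax.

```python
from datetime import date


def filter_documents(documents, start, end, authors=None, keywords=None):
    """Return documents inside the date range that match the optional criteria."""
    selected = []
    for doc in documents:
        if not (start <= doc["date"] <= end):
            continue  # outside the date range
        if authors and doc["author"] not in authors:
            continue  # author not on the list
        text = doc["text"].lower()
        if keywords and not any(term.lower() in text for term in keywords):
            continue  # no keyword hit
        selected.append(doc)
    return selected


# Hypothetical collection used only for illustration.
documents = [
    {"id": 1, "date": date(2013, 3, 1), "author": "jsmith", "text": "Pricing proposal attached"},
    {"id": 2, "date": date(2013, 7, 9), "author": "jsmith", "text": "Lunch on Friday?"},
    {"id": 3, "date": date(2011, 1, 5), "author": "mlee", "text": "Old pricing sheet"},
]
hits = filter_documents(documents, date(2013, 1, 1), date(2013, 12, 31), keywords=["pricing"])
print([doc["id"] for doc in hits])
```

The same function can express exclusion by keeping the documents that are not returned.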

Data Formats: The organization of information for display, storage or printing. Data is sometimes maintained in certain common formats so that it can be used by various programs, which may only work with data in a particular format, e.g., PDF, HTML. Also used by parties to refer to production specifications during the exchange of data during discovery.

Data Harvesting: See Harvesting.

Data Map: A document or visual representation that records the physical or network location and format of an organization’s data. Information about the data can include where the data is stored, physically and virtually, in what format it is stored, backup procedures in place, how the Electronically Stored Information moves and is used throughout the organization, information about accessibility of the Electronically Stored Information, retention and lifecycle management practices and policies, and identity of records custodians.

Data Mining: The process of knowledge discovery in databases (structured data); often techniques for extracting information, summaries or reports from databases and data sets. In the context of electronic discovery, this term often refers to the processes used to analyze a collection of Electronically Stored Information to extract evidence for production or presentation in an investigation or in litigation. See Text Mining.

Data Processor (as used in the EU Data Protection Act): A natural or legal person (other than an employee of the Data Controller) who processes Personal Data on behalf of the Data Controller.

Data Set: A named or defined collection of data. See Production Data Set and Privilege Data Set.

Data Subject (as used in the EU Data Protection Act): An individual who is the subject of Personal Data.

Data Verification: Assessment of data to ensure it has not been modified from a prior version. The most common method of verification is hash coding by using industry-accepted algorithms such as MD5, SHA1 or SHA2. See Digital Fingerprint, File Level Binary Comparison, and Hash Coding.
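
The hash-based verification described above can be sketched with Python's standard hashlib module. SHA-256 (a member of the SHA2 family) is used as the default here; the helper names digest_of and is_unmodified and the sample content are assumptions made for the example.

```python
import hashlib


def digest_of(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hexadecimal hash value of the data."""
    return hashlib.new(algorithm, data).hexdigest()


def is_unmodified(data: bytes, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Compare the current hash value with the one recorded earlier."""
    return digest_of(data, algorithm) == recorded_digest.lower()


# Hypothetical content; in practice the bytes would be read from the file.
original = b"Quarterly report, final version"
recorded = digest_of(original)

print(is_unmodified(original, recorded))         # same content
print(is_unmodified(original + b" ", recorded))  # one added byte
```

Because any change to the content, however small, produces a different hash value, a matching value is treated as evidence that the data has not been altered since the hash was recorded.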

Database: A set of data elements consisting of at least one file or of a group of integrated files, usually stored in one location and made available to several users. The collection of information is organized into a predefined formatted structure and usually organized into fields of data that comprise individual records which are further grouped into data tables. Databases are sometimes classified according to their organizational approach, with the most prevalent approach being the relational database – a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. Another popular organizational structure is the distributed database, which can be dispersed or replicated among different points in a network. Computer databases typically contain aggregations of data records or files, such as sales transactions, product catalogs and inventories, and customer profiles. For further discussion, see The Sedona Conference Database Principles, available for download at http://www.thesedonaconference.org/download-pub/426.

Database Management System (DBMS): A software system used to access and retrieve data stored in a database.

Date/Time Normalization: See Normalization.

Date Created: A common metadata field that contains the date a file was created on or moved to the media where it currently resides.

Date Last Accessed: A common metadata field that contains the date a file was last accessed, meaning last opened or moved or even copied depending on the technology used to copy.

Date Last Modified: A common metadata field that contains the date a file was last changed, whether by a modification to the content or format, by printing, or by the automatic running of any macros that are executed upon the file being opened. The date last modified field does not normally reflect a change to a file’s storage location or when the file was opened and read, and is thus often used as an electronic file date control field for discovery purposes.

Date Sent: A common metadata field that contains the date on which an email was sent.

Date Received: A common metadata field that contains the date on which an email was received.

Daubert or Daubert Challenge: Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579, at 593-94 (1993), addresses the admission of scientific expert testimony to ensure that the testimony is reliable before considered for admission pursuant to Rule 702. The court assesses the testimony by analyzing the methodology and applicability of the expert’s approach. Faced with a proffer of expert scientific testimony, the trial judge must determine first, pursuant to Rule 104(a), whether the expert is proposing to testify to (1) scientific knowledge that (2) will assist the trier of fact to understand or determine a fact at issue. This involves preliminary assessment of whether the reasoning or methodology is scientifically valid and whether it can be applied to the facts at issue. Daubert suggests an open approach and provides a list of four potential factors: (1) whether the theory can be or has been tested; (2) whether the theory has been subjected to peer review or publication; (3) known or potential rate of error of that particular technique and the existence and maintenance of standards controlling the technique’s operation; and (4) consideration of general acceptance within the scientific community.

Decompression: To expand or restore compressed data back to its original size and format. See Compression.

Decryption: Transformation of encrypted (or scrambled) data back to original form.

De-Duplication (“de-dupe”): The process of comparing electronic files or records based on their characteristics and removing, suppressing or marking exact duplicate files or records within the data set for the purposes of minimizing the amount of data for review and production. De-duplication is typically achieved by calculating a file or record’s hash value using a mathematical algorithm. De-duplication can be selective, depending on the agreed-upon criteria. See Case De-Duplication, Content Comparison, Cross-Custodian De-Duplication, Custodian De-Duplication, Data Verification, Digital Fingerprint, File Level Binary Comparison, Hash Coding, Horizontal De-Duplication, Metadata Comparison, Near Duplicates.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).
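
Hash-based de-duplication, and the difference between the custodian-level and cross-custodian scopes defined in this glossary, can be sketched as follows. The function deduplicate, its across_custodians flag and the sample files are hypothetical; SHA-256 over file content is one possible choice of hash, and real tools may hash selected metadata fields as well.

```python
import hashlib


def deduplicate(files, across_custodians=True):
    """Split (custodian, name, content) tuples into kept and suppressed lists.

    across_custodians=True  -> one copy kept for the whole data set
    across_custodians=False -> one copy kept per custodian
    """
    seen = set()
    kept, suppressed = [], []
    for custodian, name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        key = digest if across_custodians else (custodian, digest)
        if key in seen:
            suppressed.append((custodian, name))  # exact duplicate
        else:
            seen.add(key)
            kept.append((custodian, name))
    return kept, suppressed


# Hypothetical collection: three exact copies of one memo plus one unique file.
files = [
    ("alice", "memo.docx", b"Q3 forecast"),
    ("alice", "memo-copy.docx", b"Q3 forecast"),
    ("bob", "fwd-memo.docx", b"Q3 forecast"),
    ("bob", "notes.txt", b"call at noon"),
]
print(deduplicate(files, across_custodians=True))
print(deduplicate(files, across_custodians=False))
```

In this sketch duplicates are suppressed rather than deleted, so the list of suppressed copies remains available if the parties later need to know which custodians held a given file.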

Defragment (“defrag”): Use of a computer utility to reorganize files so they are more physically contiguous on a hard drive or other storage medium, when the files or parts thereof have become fragmented and scattered in various locations within the storage medium in the course of normal computer operations. Used to optimize the operation of the computer, it will overwrite information in unallocated space. See Fragmentation.

Deleted Data: Information that is no longer readily accessible to a computer user due to the intentional or automatic deletion of the data. Deleted data may remain on storage media in whole or in part until overwritten or wiped. Even after the data itself has been wiped, directory entries, pointers or other information relating to the deleted data may remain on the computer. Soft deletions are data marked as deleted (and not generally available to the end-user after such marking) but not yet physically removed or overwritten. Soft-deleted data can be restored with complete integrity.

Deletion: The process whereby data is removed from active files and other data storage structures on computers and rendered more inaccessible except through the use of special data recovery tools designed to recover deleted data. Deletion occurs on several levels in modern computer systems: (a) File level deletion renders the file inaccessible to the operating system and normal application programs and marks the storage space occupied by the file’s directory entry and contents as free and available to reuse for data storage; (b) Record level deletion occurs when a record is rendered inaccessible to a database management system (DBMS) (usually marking the record storage space as available for re-use by the DBMS, although in some cases the space is never reused until the database is compacted) and is also characteristic of many email systems; and (c) Byte level deletion occurs when text or other information is deleted from the file content (such as the deletion of text from a word processing file); such deletion may render the deleted data inaccessible to the application intended to be used in processing the file, but may not actually remove the data from the file’s content until a process such as compaction or rewriting of the file causes the deleted data to be overwritten.

De-NIST: The use of an automated filter program that screens files against the NIST list in order to remove files that are generally accepted to be system generated and have no substantive value in most instances. See NIST List.

De-shading: Removing shaded areas to render images more easily recognizable by OCR. De-shading software typically searches for areas with a regular pattern of tiny dots.

De-skewing: The process of straightening skewed (tilted) images. De-skewing is one of the image enhancements that can improve OCR accuracy. Documents often become skewed when scanned or faxed.

Desktop: Generally refers to the working area of the display on an individual PC.

Desktop Publishing (DTP): PC applications used to prepare direct print output or output suitable for printing presses.

De-speckling: Removing isolated speckles from an image file. Speckles often develop when a document is scanned or faxed. See Speckle.

Differential Backup: A method of backing up only the data that is new or has been changed since the last full backup.
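
The selection rule behind a differential backup can be sketched as a comparison of each file's last-modified time against the time of the last full backup. The function name, file paths and timestamps below are hypothetical; real backup software may rely on archive flags or change journals rather than timestamps.

```python
from datetime import datetime


def differential_selection(modified_times, last_full_backup):
    """Return the paths created or changed after the last full backup."""
    return sorted(
        path
        for path, modified in modified_times.items()
        if modified > last_full_backup
    )


# Hypothetical full-backup time and file modification times.
last_full = datetime(2014, 9, 1, 2, 0)
modified_times = {
    "contracts/msa.docx": datetime(2014, 8, 20, 16, 30),
    "contracts/sow.docx": datetime(2014, 9, 3, 9, 15),
    "notes/call.txt": datetime(2014, 9, 5, 11, 0),
}
print(differential_selection(modified_times, last_full))
```

Because the comparison is always against the last full backup, each differential backup contains every change made since that point, so a restore needs only the full backup and the most recent differential.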

Digital: Information stored as a string of ones and zeros (numeric). Opposite of analog.

Digital Audio Disk (DAD): Another term for compact disk.

Digital Audio Tape (DAT): A magnetic tape generally used to record audio but can hold up to 40 gigabytes (or 60 CDs) of data if used for data storage. Has the disadvantage of being a serial access device. Often used for backup.

Digital Certificate: Electronic records that contain unique secure values used to decrypt information, especially information sent over a public network like the Internet. See Certificate, Digital Signature and Public Key Infrastructure (PKI) Digital Signature.

Digital Evidence Bag (DEB): A container file format used for electronic evidence to preserve and transfer evidence in an encrypted or protected form that prevents deliberate or accidental alteration. The secure wrapper provides metadata concerning the collection process and context for the contained data.

Digital Fingerprint: A fixed-length hash code that uniquely represents the binary content of a file. See Data Verification, File Level Binary Comparison, Hash Coding.

Digital Linear Tape (DLT): A type of magnetic computer tape used to copy data from an active system for purposes of archiving or disaster recovery.

Digital Millennium Copyright Act (DMCA): United States copyright law enacted to protect against copyright infringement of data, address rights and obligations of owners of copyrighted material, and the rights and obligations of Internet service providers on whose systems the infringing material may reside.

Digital Rights Management (DRM): A program that controls access to, movement or duplication of protected data.

Digital Signature: A way to ensure the identity of the sender, utilizing public key cryptography and working in conjunction with certificates. See Certificate, Digital Certificate and Public Key Infrastructure (PKI) Digital Signature.

Digital to Analog Converter (DAC): Converts digital data to analog data.

Digital Video Disk or Digital Versatile Disk (DVD): A plastic disk, like a CD, on which data can be optically written and read. DVDs can hold more information and can support more data formats than CDs. Formats include: DVD-R or DVD+R (DVD Recordable) – can be written to once and are then read-only; and DVD-RW (DVD Re-Writable) – can be written to multiple times.

Digitize: The process of converting an analog value into a digital (numeric) representation. See Analog.

Directory: The organizational structure of a computer’s file storage, usually arranged in a hierarchical series of folders and subfolders. Often simulated as a file folder tree.

Disaster Recovery Tapes: Portable magnetic storage media used to store data for backup purposes. See Backup Data, Backup Tape.

Discovery: The process of identifying, locating, preserving, securing, collecting, preparing, reviewing and producing facts, information and materials for the purpose of producing/obtaining evidence for utilization in the legal process. There are several ways to conduct discovery, the most common of which are interrogatories, requests for production of documents and depositions. See Electronic Discovery.

Disk: Round, flat storage media with layers of material that enable the recording of data.

Disk Mirroring: The ongoing process of making an exact copy of information from one location to another in real time and often used to protect data from a catastrophic hard disk failure or for long term data storage. See Mirror Image and Mirroring.

Disk Partition: A discrete section of a computer’s hard drive that has been virtually separated from one or more other partitions on the same drive.

Diskwipe: Utility that overwrites existing data. Various utilities exist with varying degrees of efficiency – some wipe only named files or unallocated space of residual data, thus unsophisticated users who try to wipe evidence may leave behind files of which they are unaware.

Disposition: The final business action carried out on a record. This action generally is to destroy or archive the record. Electronic record disposition can include “soft deletions” (see Deletion), “hard deletions,” “hard deletions with overwrites,” “archive to long-term store,” “forward to organization,” and “copy to another media or format and delete (hard or soft).”

Distributed Data: Information belonging to an organization that resides on portable media and non-local devices such as remote offices, home computers, laptop computers, personal digital assistants (PDAs), wireless communication devices (e.g., Blackberry) and Internet repositories (including email hosted by Internet service providers or portals and websites). Distributed data also includes data held by third parties such as application service providers and business partners. Note: Information Technology organizations may define distributed data differently (for example, in some organizations distributed data includes any non-server-based data, including workstation disk drives).

Document (or Document Family): A collection of pages or files produced manually or by a software application, constituting a logical single communication of information, but consisting of more than a single stand-alone record. Examples include a fax cover, the faxed letter, and an attachment to the letter – the fax cover being the “Parent,” and the letter and attachment being a “Child.” See Attachment, Load File, Message Unit, and Unitization – Physical and Logical.

Cited: Abu Dhabi Commercial Bank v. Morgan Stanley & Co. Inc., 2011 WL 3738979 (S.D.N.Y., August 18, 2011).

Document Date: Generally, the term used to describe the date the document was last modified or put in final form; applies equally to paper and electronic files. See Date Last Modified, Date Created, Date Last Accessed, Date Sent, Date Received.

Document Imaging Programs: Software used to scan paper documents and to store, manage, retrieve and distribute documents quickly and easily.

Document Type or Doc Type: A bibliographic coding field that captures the general classification of a document, e.g., whether the document is correspondence, a memo, a report, an article or another type.

DoD 5015: Department of Defense standard addressing records management.

Domain: A group of servers and computers connected via a network and administered centrally with common rules and permissions.

DOS: See Microsoft-Disk Operating System (MS-DOS).

Double Byte Language: See Unicode.

Download: To move data from a remote location to a local computer or network, usually over a network or the Internet; also used to indicate that data is being transmitted from one location to another. See Upload.

Draft Record: A preliminary version of a record before it has been completed, finalized, accepted, validated or filed. Such records include working files and notes. Records and information management policies may provide for the destruction of draft records upon finalization, acceptance, validation or filing of the final or official version of the record. However, draft records generally must be retained if: (1) they are deemed to be subject to a legal hold; or (2) a specific law or regulation mandates their retention. Policies should recognize such exceptions.

Drag-and-Drop: The movement of files by dragging them with the mouse and dropping them in another place.

Drive Geometry: A computer hard drive is made up of a number of rapidly rotating platters that have a set of read/write heads on both sides of each platter. Each platter is divided into a series of concentric rings called tracks. Each track is further divided into sections called sectors, and each sector is subdivided into bytes. Drive geometry refers to the number and positions of each of these structures.
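
Under the classic cylinder/head/sector view of these structures, a drive's capacity is the product of the counts at each level. The sketch below assumes one track per head per cylinder and a sector size of 512 bytes; the function name and the geometry values are illustrative, and many drives report a logical geometry that does not match their physical layout.

```python
def drive_capacity_bytes(cylinders, heads, sectors_per_track, bytes_per_sector=512):
    """Capacity = cylinders x heads x sectors per track x bytes per sector."""
    return cylinders * heads * sectors_per_track * bytes_per_sector


# Hypothetical geometry used only to illustrate the arithmetic.
capacity = drive_capacity_bytes(cylinders=16383, heads=16, sectors_per_track=63)
print(f"{capacity:,} bytes")
```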

Driver: A computer program that controls various hardware devices such as the keyboard, mouse or monitor and makes them operable with the computer.

Drop-Down Menu: A menu window that opens on-screen to display context-related options. Also called pop-up menu or pull-down menu.

Dynamic Data Exchange (DDE): A form of interprocess communications used by Microsoft Windows to support the exchange of commands and data between two simultaneously running applications.

Dynamic Random Access Memory (DRAM): A memory technology that is periodically refreshed or updated – as opposed to static RAM chips that do not require refreshing. The term is often used to refer to the memory chips themselves.

Early Data Assessment: The process of separating possibly relevant Electronically Stored Information from non-relevant Electronically Stored Information using both computer techniques, such as date filtering or advanced analytics, and human assisted logical determinations at the beginning of a case. This process may be used to reduce the volume of data collected for processing and review. Also known as Early Case Assessment (ECA).

Electronic Data Interchange (EDI): Eliminating forms altogether by encoding the data as close as possible to the point of the transaction; automated business information exchange.

Electronic Discovery (E-Discovery): The process of identifying, locating, preserving, collecting, preparing, reviewing, and producing Electronically Stored Information (ESI) in the context of the legal process. See Discovery.

Cited: Gordon v. Kaleida Health, 2013 WL 2250579 (W.D.N.Y. May 21, 2013).

Cited: Hinterberger v. Catholic Health System, Inc., 2013 WL 2250603 (W.D.N.Y. May 21, 2013).

Electronic Document Management: The process of using a computer program to manage individual unstructured files, either those created electronically or scanned to digital form from paper. See Information Lifecycle Management.

Electronic Document Management System (EDMS): A system to electronically manage documents during all life cycles. See Electronic Document Management.

Electronic File Processing: See Processing Data.

Electronic Image: An individual page or pages of an electronic document that has been converted into a static format, for example PDF or TIFF. See PDF and TIFF.

Electronic Record: Information recorded in a form that requires a computer or other machine to process it.

Electronically Stored Information (ESI): As referenced in the United States Federal Rules of Civil Procedure, information that is stored electronically, regardless of the media or whether it is in the original format in which it was created, as opposed to stored in hard copy (i.e., on paper).

Email (Electronic Mail): An electronic means for sending, receiving and managing communications via a multitude of different structured data applications (email client software), such as Outlook or Lotus Notes or those often known as “webmail,” such as Gmail or Yahoo Mail. See Email Message.

Email Address: Unique value given to individual user accounts on a domain, used to route email messages to the correct email recipient and most often formatted as follows: user-ID@domain-name. See Email Message.
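
The user-ID@domain-name format can be illustrated with a short sketch that splits an address at its last “@” sign. The function name and the sample address are hypothetical, and the check is deliberately simple; it does not validate addresses against the full rules of Internet mail standards.

```python
def split_email_address(address: str):
    """Split an address into its (user_id, domain_name) parts."""
    user_id, separator, domain_name = address.rpartition("@")
    if not separator or not user_id or not domain_name:
        raise ValueError(f"not in user-ID@domain-name form: {address!r}")
    return user_id, domain_name


# Hypothetical address used only for illustration.
user_id, domain_name = split_email_address("jsmith@example.com")
print(user_id, domain_name)
```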

Email Archiving: A systematic approach to retaining and indexing email messages to provide centralized search and retrieval capabilities. See Journaling.

Email Client: See Email (Electronic Mail).

Email Message: A file created or received via an electronic mail system. Any attachments that may be transmitted with the email message are not part of the email message but are part of the Message Unit and Document Family.

Email String: An electronic conversation between two or more parties via email. Also referred to as an email thread. See Thread.

Email Store: File or database containing individual email messages. See Container File, Message Unit, OST, PST, and NSF.

Embedded Object: A file or piece of a file that is copied into another file, often retaining the utility of the original file’s application; for example, a part of a spreadsheet embedded into a word processing document that still allows for editing and calculations after being embedded. See Compound Document.

EML: File extension of a generic email message file.

Encapsulated PostScript (EPS): Uncompressed files for images, text and objects. Can only be printed on printers with PostScript drivers.

Encoding: To change or translate into code; to convert information into digital format. For software, encoding is used for video and audio references, like encoding analog format into digital or raw digital data into compressed format.

Encryption: A procedure that renders the contents of a message or file unreadable to anyone not authorized to read it; used to protect Electronically Stored Information being stored or transferred from one location to another.

Encryption Key: A data value that is used to encrypt and decrypt data. The number of bits in the encryption key is a rough measure of the encryption strength; generally, the more bits in the encryption key, the more difficult it is to break.
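
The relationship between key length and strength can be made concrete by counting possible keys. The sketch assumes a cipher in which every bit pattern is a valid key and in which trying each key in turn is the only available attack, which is a simplification; the function name and the bit lengths shown are chosen for illustration.

```python
def keyspace_size(key_bits: int) -> int:
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** key_bits


# Each additional bit doubles the number of keys an exhaustive search must try.
for bits in (56, 128, 256):
    print(f"{bits}-bit key: {keyspace_size(bits):.3e} possible keys")
```

Key length is only a rough measure, as the entry notes: weaknesses in an algorithm or in how keys are generated and stored can make encryption easier to break than the key length alone suggests.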

End Document Number or End Doc#: A common metadata field that contains the Bates number of the last page of a document.

End of File (EOF): A distinctive code that uniquely marks the end of a data file.

Enhanced Parallel Port (EPP): See Port.

Enhanced Small Device Interface (ESDI): A defined, common electronic interface for transferring data between computers and peripherals, particularly disk drives.

Enhanced Titles: A bibliographic coding field that captures a meaningful/descriptive title for a document based on a reading of the document as opposed to a verbatim title lifted as it appears on the face of the document. See Verbatim Coding.

Enterprise Architecture: Framework of information systems and processes integrated across an organization. See Information Technology Infrastructure.

Enterprise Content Management (ECM): Management of an organization’s unstructured Electronically Stored Information, regardless of where it exists, throughout the entire lifecycle of the Electronically Stored Information.

Ephemeral Data: Data that exists for a very brief, temporary period and is transitory in nature, such as data stored in RAM.

Erasable Optical Disk: A type of optical disk that can be erased and new Electronically Stored Information added; most optical disks are read only.

Ethernet: A common way of networking PCs to create a Local Area Network (LAN).

Evidentiary Image or Copy: See Forensic Copy.

Exabyte: 1,024 petabytes (approximately one billion gigabytes). See Byte.
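
The arithmetic behind the parenthetical can be checked with a short Python sketch. It assumes the binary (1,024-based) convention used for the byte units in this glossary; the list and function names are illustrative only.

```python
# Binary (1,024-based) storage units, smallest to largest.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte", "exabyte"]

def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the named units."""
    return 1024 ** (UNITS.index(unit) + 1)

# One exabyte expressed in gigabytes: 1,024 cubed, roughly one billion.
gigabytes_per_exabyte = bytes_in("exabyte") // bytes_in("gigabyte")
print(f"{gigabytes_per_exabyte:,}")
```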

Exchange Server: A server running Microsoft Exchange messaging and collaboration software. It is widely used by enterprises using Microsoft infrastructure solutions. Among other things, Microsoft Exchange manages email, shared calendars, and tasks.

Expanded Data: See Decompression.

Export: The process of saving data or a subset of data in a format that can be used or imported by another system.

Extended Graphics Adapter (EGA): See Video Graphics Adapter (VGA).

Extended Partitions: If a computer hard drive has been divided into more than four partitions, extended partitions are created. Under such circumstances each extended partition contains a partition table in the first sector that describes how it is further subdivided. See Disk Partition.

Extensible Markup Language (XML): A software coding language specification developed by the W3C (World Wide Web Consortium – the Web development standards board). XML is a pared-down version of SGML, designed especially for Web documents. It allows designers to create their own customized tags, enabling the definition, transmission, validation, and interpretation of data between applications and between organizations.

Extranet: The portion of an intranet site that is accessible by users outside of a company or organization hosting the intranet. This type of access is often utilized in cases of joint defense, joint venture, and vendor client relationships.

False Negative: A result from a search that is not correct because it fails to indicate a match or hit where one exists.

False Positive: A result from a search that is not correct because it indicates a match or hit where there is none.
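
The two error types can be illustrated with a minimal Python sketch that compares what a search returned against a set of documents already known to be relevant. The document identifiers and the function name are hypothetical.

```python
def classify_hits(returned: set[str], relevant: set[str]) -> dict[str, set[str]]:
    """Split search errors into false positives and false negatives."""
    return {
        # Returned as hits, but not actually matches.
        "false_positives": returned - relevant,
        # Actual matches that the search failed to return.
        "false_negatives": relevant - returned,
    }

returned = {"DOC-001", "DOC-002", "DOC-004"}   # what the search reported
relevant = {"DOC-001", "DOC-003", "DOC-004"}   # what a reviewer confirmed
result = classify_hits(returned, relevant)
print(result)
```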

Fast Mode Parallel Port: See Port.

Federal Information Processing Standards (FIPS): Issued by the National Institute of Standards and Technology after approval by the Secretary of Commerce pursuant to Section 111(d) of the Federal Property and Administrative Services Act of 1949, as amended by the Computer Security Act of 1987, Public Law 100-235.

Fiber Optics: Transmitting information by sending light pulses over cables made from thin strands of glass.

Field (or Data Field): A defined area of a file or data table used to record an individual piece of standardized data, such as the author of a document, a recipient or the date of a document.

Field Mapping: The process of normalizing data to the structure of an existing database so that, once the data type has been validated as the same, the data loads to the correct field; for example, mapping the data from a field called Date to an existing field in a database named DocDate.
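
A minimal Python sketch of the Date-to-DocDate example follows. The mapping table, the second field pair (From to Author), and the assumed ISO date format are illustrative assumptions, not part of any particular review platform.

```python
from datetime import date, datetime

# Hypothetical mapping from source field names to target database fields.
FIELD_MAP = {"Date": "DocDate", "From": "Author"}

def map_record(source: dict[str, str]) -> dict[str, object]:
    """Rename source fields to target fields, validating the date type."""
    target: dict[str, object] = {}
    for src_name, value in source.items():
        dest_name = FIELD_MAP.get(src_name)
        if dest_name is None:
            continue  # no target field defined; skip rather than guess
        if dest_name == "DocDate":
            # Raises ValueError if the value is not a YYYY-MM-DD date.
            value = datetime.strptime(value, "%Y-%m-%d").date()
        target[dest_name] = value
    return target

loaded = map_record({"Date": "2014-09-30", "From": "J. Smith", "Misc": "x"})
print(loaded)
```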

Field Separator or Field Delimiter: A character in a text delimited file that separates the fields in an individual record. For example, the CSV format uses a comma as the field separator. See Text Delimited File.
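
As a brief illustration, Python's standard csv module splits each record on the field separator. The sample record is invented; note that a comma inside a quoted value is treated as data, not as a separator.

```python
import csv
import io

# Two records: a header row and one document row with a quoted comma.
text = 'DocID,Author,Title\nDOC-001,"Smith, J.",Quarterly Report\n'

# The delimiter argument names the field separator (a comma for CSV).
rows = list(csv.reader(io.StringIO(text), delimiter=","))
for row in rows:
    print(row)
```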

File: A collection of related data or information stored as a unit under a specified name on a storage medium.

File Allocation Table (FAT): An internal data table on hard drives that keeps track of where the files are stored. If a FAT is corrupt, a drive may be unusable, yet the data may be retrievable with forensics. See Cluster (File).

File Compression: See Compression.

File Extension: Many systems, including DOS and UNIX, allow a filename extension that consists of one or more characters following the proper filename. For example, image files are usually stored as .bmp, .gif, .jpg or .tiff. Audio files are often stored as .aud or .wav. There are a multitude of file extensions identifying file formats. The filename extension should indicate what type of file it is; however, users may change filename extensions to evade firewall restrictions or for other reasons. Therefore, file types should be identified at a binary level rather than relying on file extensions. To research file types, see http://www.filext.com. Different applications can often recognize only a predetermined selection of file types. See Format (noun).
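
Identification at a binary level typically means inspecting the leading bytes of the file rather than its name. The Python sketch below checks a handful of well-known leading byte sequences; the table is deliberately tiny and the function name is hypothetical, so it should be read as an illustration rather than a complete identifier.

```python
# A few well-known leading byte sequences and the formats they indicate.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\xff\xd8\xff": "jpeg",
    b"PK\x03\x04": "zip",
}

def sniff_type(data: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring its name."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return "unknown"

# A file renamed to "notes.txt" still begins with the PDF marker.
renamed_content = b"%PDF-1.4\n(remaining bytes omitted)"
print(sniff_type(renamed_content))
```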

File Format: The organization or characteristics of a file that determine with which software programs it can be used. See Format (noun).

File Header: See Header.

File Level Binary Comparison: Method of de-duplication using the digital fingerprint (hash) of a file to compare the individual content and location of bytes in one file against those of another file. See Data Verification, De-Duplication, Digital Fingerprint, Hash Coding.

File Plan: A document containing the identifying number, title, description, and disposition authority of files held or used in an office.

File Server: A computer that serves as a storage location for files on a network. File servers may be employed to store Electronically Stored Information, such as email, financial data or word processing information or to back up the network. See Server.

File Sharing: Providing access to files or programs to multiple users on a network.

File Signature: See Digital Signature.

File Slack: See Slack/Slack Space.

File System: The means by which an operating system or program organizes and keeps track of Electronically Stored Information in terms of logical structures and software routines to control access to stored Electronically Stored Information, including the structure in which the files are named, stored, and organized. The file system also tracks data when a user copies, moves or deletes a file or subdirectory.

File Table: See Master File Table (MFT).

File Transfer: The process of moving or transmitting a copy of a file from one location to another, as between two programs or from one computer to another.

File Transfer Protocol (FTP): An Internet protocol that governs the transfer of files between computers over a network or the Internet. The terms FTP server or FTP site are commonly used to refer to a location to upload/download and exchange data, particularly in large volume.

Filename: The name used to identify a specific file in order to differentiate it from other files, typically comprised of a series of characters, a dot, and a file extension (e.g., sample.doc). See File Extension, Full Path.

Firewall: A set of related security programs and/or hardware that protect the resources of a private network from unauthorized access by users outside of an organization or user group. A firewall filters information to determine whether to forward the information toward its destination.

Filter (verb): See Data Filtering.

Flash Drive: A small removable data storage device that uses flash memory and connects via a USB port. Can be imaged and may contain residual data. Metadata detail may not be the equivalent of Electronically Stored Information maintained in more robust storage media.

Flash Memory: A type of computer memory used for storage of data to a physical disk by electrical impulses.

Flat File: A non-relational, text-based file (i.e., a word processing document).

Floppy Disk: A thin magnetic film disk housed in a protective sleeve used to copy and transport relatively small amounts of data.

Folder: See Directory.

Forensic Copy: An exact copy of an entire physical storage media (hard drive, CD-ROM, DVD-ROM, tape, etc.), including all active and residual data and unallocated or slack space on the media. Forensic copies are often called images or imaged copies. See Bit Stream Backup, Mirror Image.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

Forensics: The scientific examination and analysis of data held on, or retrieved from, a computer in such a way that the information can be used as evidence in a court of law. It may include the secure collection of computer data; the examination of suspect data to determine details such as origin and content; the presentation of computer based information to courts of law; and the application of a country’s laws to computer practice. Forensics may involve recreating deleted or missing files from hard drives, validating dates and logged in authors/editors of documents, and certifying key elements of documents and/or hardware for legal purposes.

Form of Production: The specifications for the exchange of documents and/or data between parties during a legal dispute. Used to refer both to file format (e.g., native vs. imaged format with agreed-upon metadata and extracted text in a load file) and the media on which the documents are produced (paper vs. electronic). See Load File, Native Format.

Format (noun): The internal structure of a file, which defines the way it is stored and used. Specific applications may define unique formats for their data (e.g., “MS Word document file format”). Many files may only be viewed or printed using their originating application or an application designed to work with compatible formats. There are several common email formats, such as Outlook and Lotus Notes. Computer storage systems commonly identify files by a naming convention that denotes the format (and therefore the probable originating application). For example, DOC for Microsoft Word document files; XLS for Microsoft Excel spreadsheet files; TXT for text files; HTM for HyperText Markup Language (HTML) files such as web pages; PPT for Microsoft PowerPoint files; TIF for TIFF images; PDF for Adobe images; etc. Users may choose alternate naming conventions, but this will likely affect how the files are treated by applications.

Format (verb): To make a drive ready to store data within a particular operating system. Erroneously thought to “wipe” drive. Typically, only overwrites the File Allocation Table, but not the actual files on the drive.

Forms Processing: A specialized imaging application designed for handling pre-printed forms. Forms processing systems often use high-end (or multiple) OCR engines and elaborate data validation routines to extract handwritten or poor quality print from forms that go into a database.

Fragmentation: The process by which parts of files are separately stored in different areas on a hard drive or removable disk in order to utilize available space. See Defragment.

Full Duplex: Data communications devices that allow full speed transmission between computers in both directions at the same time.

Full Path: A file location description that includes the drive, starting or root directory, all attached subdirectories and ending with the file or object name. Often referred to as the Path Name.
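
The components named in this definition can be pulled apart with Python's standard pathlib module. The path below is a made-up Windows-style example; PureWindowsPath only parses the text and does not touch any file system.

```python
from pathlib import PureWindowsPath

# Hypothetical full path: drive, root, subdirectories, then the file name.
full_path = PureWindowsPath(r"C:\Cases\Smith\Exhibits\memo.doc")

drive = full_path.drive            # the drive letter portion
subdirs = full_path.parts[1:-1]    # subdirectories between root and file
file_name = full_path.name         # the file or object name

print(drive, subdirs, file_name)
```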

Full-Text Indexing: The extraction and compilation of text from a collection of ESI. Text is gathered both from the body of the data and selected metadata fields. See Index.

Full-Text Search: The ability to search an index of all the words in a collection of Electronically Stored Information for specific characters, words, numbers and/or combinations or patterns thereof in varying degrees of complexity.

Fuzzy Search: The method of searching an index that allows for one or more characters in the original search terms to be replaced by wild card characters so that a broader range of data hits will be returned. For example, a fuzzy search for “fell” could return “tell,” “fall,” or “felt.”
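
The “fell” example can be reproduced with a small Python sketch that replaces each character of the search term, one at a time, with a single-character wild card. This is only one simple reading of the definition (commercial search engines use their own, often more elaborate, matching rules), and the word list is invented.

```python
import re

def fuzzy_matches(term: str, index: list[str]) -> list[str]:
    """Return indexed words that match `term` with any one character wild-carded."""
    patterns = [
        # "." stands in for the character at position i.
        re.compile(re.escape(term[:i]) + "." + re.escape(term[i + 1:]))
        for i in range(len(term))
    ]
    return [word for word in index if any(p.fullmatch(word) for p in patterns)]

index = ["fell", "tell", "fall", "felt", "feel", "felled", "ball"]
hits = fuzzy_matches("fell", index)
print(hits)
```

Words that differ in length or in more than one position (here “felled” and “ball”) are not returned under this one-character rule.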

Geopbyte: 1,024 brontobytes. See Byte.

Ghost: See Bit Stream Backup.

GIF (Graphics Interchange Format): A common file format for storing images first originated by CompuServe, an Internet service provider, in 1987. Limited to 256 colors.

Gigabyte (GB): 1,024 megabytes. See Byte.

Global Address List (GAL): A Microsoft Outlook directory of all Microsoft Exchange users and distribution lists to which messages can be addressed. The global address list may also contain public folder names. Entries from this list can be added to a user’s personal address book (PAB).

Global De-Duplication: See Case De-Duplication.

Global Positioning System (GPS): A technology used to track the location of ground-based objects using three or more orbiting satellites.

GMT Timestamp: Identification of a file using Greenwich Mean Time as the central time authentication method. See Normalization.

GPS Generated Timestamp: Timestamp identifying time as a function of its relationship to Greenwich Mean Time.

Graphical User Interface (GUI, pronounced “gooey”): An interface to a computer or device comprised of pictures and icons, rather than words and numbers, by which users can interact with the device.

Grayscale: See Scale-to-Gray.

Groupware: Software designed to operate on a network and allow several people to work together on the same documents and files.

Hacker: Someone who breaks into a computer system in order to access, steal, change or destroy information.

Half Duplex: Transmission systems that can send and receive data between computers but not at the same time.

Handshake: A transmission that occurs at the beginning of a communication session between computers to establish the technical format of the communication.

Handwriting Recognition Software (HRS): Software that interprets handwriting into machine readable form.

Hard Drive: A storage device consisting of one or more magnetic media platters on which digital data can be written and erased. See Platter.

Harvesting: The process of retrieving or collecting Electronically Stored Information from any media; an e-discovery vendor or specialist “harvests” Electronically Stored Information from computer hard drives, file servers, CDs, backup tapes, portable devices, and other sources for processing and loading to storage media or a database management system.

Hash Coding: A mathematical algorithm that calculates a unique value for a given set of data, similar to a digital fingerprint, representing the binary content of the data to assist in subsequently ensuring that data has not been modified. Common hash algorithms include MD5 and SHA. See Data Verification, Digital Fingerprint, File Level Binary Comparison.
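
A minimal sketch using Python's standard hashlib module shows how a hash value supports this kind of verification: identical content produces identical values, while a one-character change produces different ones. The sample byte strings and the function name are illustrative.

```python
import hashlib

def hash_values(data: bytes) -> dict[str, str]:
    """Compute MD5 and SHA-256 hex digests for a block of data."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

original = b"Quarterly report, final version."
exact_copy = bytes(original)                      # byte-for-byte copy
altered = b"Quarterly report, final version!"     # one character changed

print(hash_values(original) == hash_values(exact_copy))  # same content, same values
print(hash_values(original) == hash_values(altered))     # changed content, different values
```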

Head: Devices which ride very closely to the surface of the platter on a hard drive and allow information to be read from and written to the platter.

Header: Data placed at the beginning of a file or section of data that in part identifies the file and some of its attributes. A header can consist of multiple fields, each containing its own value. See Message Header.

Hexadecimal: A number system with a base of 16. The digits are 0-9 and A-F, where F equals the decimal value of 15.
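
The conversions can be checked with Python's built-in functions; the particular values below are arbitrary examples.

```python
# Convert hexadecimal text to decimal integers (base 16).
f_value = int("F", 16)         # the single digit F
two_digit = int("1F", 16)      # 1 * 16 + 15

# Convert a decimal integer back to uppercase hexadecimal text.
hex_text = format(255, "X")

print(f_value, two_digit, hex_text)
```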

Hidden Files or Data: Files or data not readily visible to the user of a computer. Some operating system files are hidden to prevent inexperienced users from inadvertently deleting or changing these essential files. See Steganography.

Hierarchical Storage Management (HSM): Software that automatically migrates files from on-line to less expensive near-line storage, usually on the basis of the age or frequency of use of the files.

High Technology Crime Investigation Association (HTCIA): Computer forensics nonprofit association; resources include educational programs and Listservs.

Hold: See Legal Hold.

Horizontal De-Duplication: A way to identify Electronically Stored Information duplicated across multiple custodians or other production data sets, normally by comparing hash values to identify duplicates and then removing or suppressing those duplicates. See Case De-Duplication, De-Duplication.
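
The approach can be sketched in Python as follows. The custodian names, file names, and choice of SHA-256 are assumptions made for illustration; real processing tools apply their own rules about which copy is kept and what is hashed.

```python
import hashlib

def deduplicate_across_custodians(
    collection: dict[str, dict[str, bytes]],
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Keep the first copy of each file seen; suppress later identical copies."""
    seen: set[str] = set()
    kept: list[tuple[str, str]] = []
    suppressed: list[tuple[str, str]] = []
    for custodian, files in collection.items():
        for name, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest in seen:
                suppressed.append((custodian, name))  # duplicate of an earlier file
            else:
                seen.add(digest)
                kept.append((custodian, name))
    return kept, suppressed

collection = {
    "Custodian A": {"memo.doc": b"Budget memo", "notes.txt": b"Call notes"},
    "Custodian B": {"memo_copy.doc": b"Budget memo", "plan.txt": b"Project plan"},
}
kept, suppressed = deduplicate_across_custodians(collection)
print(kept)
print(suppressed)
```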

Host: In a network, the central computer that controls the remote computers and holds the central databases.

Hub: A network device that connects multiple computers and peripherals together allowing them to share network connectivity. A central unit that repeats and/or amplifies data signals being sent across a network.

Hyperlink: A pointer in a hypertext document – usually appearing as an underlined or highlighted word or picture – that, upon selection, sends a user to another location either within the current document or to another location accessible on the network or Internet.

HyperText: Text that includes hyperlinks or shortcuts to other documents or views, allowing the reader to easily jump from one view to a related view in a non-linear fashion.

HyperText Markup Language (HTML): Developed by CERN of Geneva, Switzerland; the most common programming language format used on the Internet. HTML+ adds support for multi-media. The tag-based ASCII language used to create pages on the World Wide Web uses tags to tell a Web browser to display text and images. HTML is a markup or “presentation” language, not a programming language. Programming code can be embedded in an HTML page to make it interactive. See Java.

HyperText Transfer Protocol (HTTP): The underlying protocol used by the World Wide Web. HTTP defines how messages are formatted and transmitted, and what actions servers and browsers should take in response to various commands. For example, when you enter a URL in your browser, this actually sends an HTTP command to the Web server directing it to fetch and transmit the requested site. HTTPS adds a layer of encryption to the protocol to protect the information that is being transmitted and is often used by application service providers to protect the data being viewed over the Web.

Icon: In a GUI, a picture or drawing that is activated by clicking a mouse to command the computer program to perform a predefined series of actions.

Image: (1) To make an identical copy of a storage device, including empty sectors. Also known as creating a mirror image or mirroring the drive. See Bit Stream Backup, Forensic Copy, Mirror Image. (2) An electronic or digital picture of a document (e.g., TIFF, PDF, etc.). See Image Processing, Processing Data, Render Images.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

Image Copy or Imaged Copy: See Forensic Copy.

Image Enabling: A software function that creates links between existing applications and stored images.

Image File Format: See File Format, Format (noun).

Image Key: The name of an image and cross reference to the image’s file in a document load file, often the Bates number of the page.

Image Processing: To convert data from its current/native format to a fixed image for the purposes of preserving the format of a document and facilitating the transfer between parties, typically with the addition of a Bates number to the face of each image. See Form of Production, Native Format, Processing Data, Render Images.

Image Processing Card (IPC): A board mounted in a computer, scanner or printer that facilitates the acquisition and display of images. The primary function of most IPCs is the rapid compression and decompression of image files.

Import: The process of bringing data into an environment or application that has been exported from another environment or application.

Inactive Record: Records related to closed, completed or concluded activities. Inactive records are no longer routinely referenced but must be retained in order to fulfill reporting requirements or for purposes of audit or analysis. Inactive records generally reside in a long-term storage format remaining accessible for purposes of business processing only with restrictions on alteration. In some business circumstances, inactive records may be reactivated.

Incremental Backup: A method of backing up only data that is new or has been changed since the last backup of any kind, be it a full backup or the last incremental backup.
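
The selection rule can be sketched in a few lines of Python. The file names and timestamps are invented, and real backup software may rely on archive flags or change journals rather than modification times; this is only an illustration of the “changed since the last backup” test.

```python
def select_for_incremental(
    modified_times: dict[str, float], last_backup_time: float
) -> list[str]:
    """Return files modified after the most recent backup of any kind."""
    return sorted(
        name for name, mtime in modified_times.items() if mtime > last_backup_time
    )

# Hypothetical modification times, in seconds since an arbitrary epoch.
modified_times = {"budget.xls": 1000.0, "memo.doc": 2500.0, "new_plan.doc": 3100.0}
last_backup_time = 2000.0  # time of the last full or incremental backup

to_back_up = select_for_incremental(modified_times, last_backup_time)
print(to_back_up)
```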

Index/Coding Fields: Database fields used to categorize and organize records. Often user-defined, these fields can be used for searching for and retrieving records. See Coding.

Index: A searchable catalog of information created to maximize storage efficiency and allow for improved search. Also called catalog. See Full-Text Indexing.

Indexing: (1) The process of organizing data in a database to maximize storage efficiency and optimize searching; (2) Objective coding of documents to create a list similar to a table of contents. See Coding.

Information: For the purposes of this document, information is used to mean hard copy documents and ESI.

Information Governance: Information governance is the comprehensive, inter-disciplinary framework of policies, procedures, and controls used by mature organizations to maximize the value of an organization’s information while minimizing associated risks by incorporating the requirements of: (1) e-discovery, (2) records & information management and (3) privacy/security into the process of making decisions about information.

Information Lifecycle Management (ILM): A phrase used to discuss the policies and procedures governing the management of data within an organization, from creation through destruction. See Electronic Document Management.

Information Systems (IS) or Information Technology (IT): Usually refers to the department of an entity which designs, maintains, and assists users with regard to the computer infrastructure.

Information Technology (IT) Infrastructure: The overall makeup of business-wide technology operations, including mainframe operations, standalone systems, email, networks (WAN and LAN), Internet access, customer databases, enterprise systems, application support, regardless of whether managed, utilized or provided locally, regionally, globally, etc., or whether performed or located internally or by outside providers (outsourced to vendors). The IT infrastructure also includes applicable standard practices and procedures, such as backup procedures, versioning, resource sharing, retention practices, system cleanup, and the like. See Enterprise Architecture.

Infrastructure as a Service (IaaS): A form of Cloud Computing whereby a third party service provider offers, on-demand, a part of its computer infrastructure remotely. Specific services may include servers, software or network equipment resources that can be provided on an as-needed basis without the purchase of the devices or the resources needed to support them. See Cloud Computing.

Input device: Any peripheral that allows a user to communicate with a computer by entering information or issuing commands (e.g., keyboard).

Instant Messaging (IM): A form of electronic communication involving immediate correspondence between two or more online users. Instant messages differ from email in their limited metadata and in that messages are often transitory and not stored past the messaging session.

Institute of Electrical and Electronics Engineers (IEEE): An international association that advocates the advancement of technology as it relates to electricity. IEEE sponsors meetings, publishes a number of journals, and establishes standards.

Integrated Drive Electronics (IDE): An engineering standard for interfacing computers and hard disks.

Intelligent Character Recognition (ICR): The conversion of scanned images (bar codes or patterns of bits) to computer recognizable codes (ASCII characters and files) by means of software/programs that define the rules of and algorithms for conversion, helpful for interpreting handwritten text. See Handwriting Recognition Software (HRS), OCR.

Inter-Partition Space: Unused sectors on a track located between the start of the partition and the partition boot record of a hard drive. This space is important because it is possible for a user to hide information here. See Partition, Track.

Interlaced: To refresh a display every other line once per refresh cycle. Since only half the information displayed is updated each cycle, interlaced displays are less expensive than non-interlaced. However, interlaced displays are subject to jitters. The human eye/brain can usually detect displayed images that are completely refreshed less than 30 times per second.

Interleave: To arrange data in a noncontiguous way to increase performance. When used to describe disk drives, it refers to the way sectors on a disk are organized. In one-to-one interleaving, the sectors are placed sequentially around each track. In two-to-one interleaving, sectors are staggered so that consecutively numbered sectors are separated by an intervening sector. The purpose of interleaving is to make the disk drive more efficient. The disk drive can access only one sector at a time, and the disk is constantly spinning beneath.

International Organization for Standardization (ISO): A worldwide federation of national standards bodies; www.iso.org.

International Telecommunication Union (ITU): An international organization under the UN, headquartered in Geneva, concerned with telecommunications that develops international data communications standards; known as CCITT prior to March 1, 1993. See http://www.itu.int.

Internet: A worldwide interconnected system of networks that all use the TCP/IP communications protocol and share a common address space. The Internet supports services such as email, the World Wide Web, file transfer (FTP), and Internet Relay Chat (IRC). Also known as “the net,” “the information superhighway,” and “cyberspace.”

Internet Protocol (IP): The principal communications protocol for data communications across the Internet.

Internet Protocol (IP) Address: A unique name that identifies the physical location of a server on a network, expressed by a numerical value (e.g., 128.24.62.1). See Transmission Control Protocol/Internet Protocol (TCP/IP).

Internet Publishing Software: Specialized software that allows materials to be published to the Internet. The term Internet Publishing is sometimes used to refer to the industry of online digital publication as a whole.

Internet Relay Chat (IRC): System allowing Internet users to chat in real time.

Internet Service Provider (ISP): A business that provides access to the Internet, usually for a fee.

Integrated Services Digital Network (ISDN): An all-digital network that can carry data, video, and voice.

Intranet: A secure private network that uses Internet-related technologies to provide services within an organization or defined infrastructure.

ISO 8859-1: Also called Latin-1. A standard character encoding of the Latin alphabet used for most Western European languages. ISO 8859-1 is considered a legacy encoding in relation to Unicode, yet it is nonetheless still in common use today. The ISO 8859-1 standard consists of 191 printable characters from the Latin script. It is essentially a superset of the ASCII character encoding and a subset of the Windows-1252 character encoding. See ASCII, Windows-1252.
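
The superset and subset relationships can be observed with Python's built-in codecs. The sample strings are arbitrary; "iso-8859-1" and "cp1252" are Python's codec names for ISO 8859-1 and Windows-1252.

```python
# Accented Latin letters fit in one byte each under ISO 8859-1.
latin1_bytes = "résumé".encode("iso-8859-1")

# Plain ASCII text encodes to the same bytes under both encodings.
ascii_compatible = "resume".encode("ascii") == "resume".encode("iso-8859-1")

# The euro sign exists in Windows-1252 but not in ISO 8859-1.
euro_cp1252 = "€".encode("cp1252")
try:
    "€".encode("iso-8859-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False

print(latin1_bytes, ascii_compatible, euro_cp1252, euro_in_latin1)
```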

ISO 9660 CD Format: The ISO format for creating CD-ROMs that can be read worldwide.

ISO 15489-1: The ISO standard addressing standardization of international best practices in records management.

Jailbreak: A process of bypassing security restrictions of an operating system to take full control of a device.

Java: A platform-independent programming language for adding animation and other actions to websites.

Joint Photographic Experts Group (JPEG): A compression algorithm for still images that is commonly used on the Web.

Journal: A chronological record of data processing operations that may be used to reconstruct a previous or an updated version of a file. In database management systems, it is the record of all stored data items that have values changed as a result of processing and manipulation of the data.

Journaling: A function of electronic communication systems (such as Microsoft Exchange and Lotus Notes) that copies items that are sent and received into a second information store for retention or preservation. Because Journaling takes place at the information store (server) level when the items are sent or received, rather than at the mailbox (client) level, some message-related metadata, such as user foldering (what folder the item is stored in within the recipient’s mailbox) and the status of the “read” flag, is not retained in the journaled copy. The Journaling function stores items in the system’s native format, unlike email archiving solutions, that use proprietary storage formats designed to reduce the amount of storage space required. Journaling systems may also lack the sophisticated search and retrieval capabilities available with many email archiving solutions. See Email Archiving.

Jukebox: A mass storage device that holds optical disks and automatically loads them into a drive.

Jump Drive: See Flash Drive.

Kerning: Adjusting the spacing between two letters.

Key Drive: See Flash Drive.

Key Field: See Primary Key

Keyword: Any specified word, or combination of words, used in a search, with the intentof locating certain results.

Kilobyte (KB): A unit of 1,024 bytes. See Byte.

Kofax Board: The generic term for a series of image processing boards manufactured by Kofax Imaging Processing. These are used between the scanner and the computer and perform real-time image compression and decompression for faster image viewing, image enhancement, and corrections to the input to account for conditions such as document misalignment.

Landscape Mode: A page orientation or display such that the width exceeds the height (Horizontal).

Laser Disk: Same as an optical CD, except 12” in diameter.

Laser Printing: A printing process by which a beam of light hits an electrically charged drum and causes a discharge at that point. Toner is then applied, which sticks to the non-charged areas. Paper is pressed against the drum to form the image and is then heated to dry the toner.

Latency: The time it takes to read a disk (or jukebox), including the time to physically position the media under the read/write head, seek the correct address and transfer it.

Latent Data: Deleted files and other Electronically Stored Information that are inaccessible without specialized forensic tools and techniques. Until overwritten, these data reside on media such as a hard drive in unused space and other areas available for data storage. Also known as ambient data. See Residual Data.

Latent Semantic Indexing and Analysis: A method of processing data that identifies relationships between data sets by analyzing terms and term frequency. Common applications include grouping documents together based on the documents’ concepts and meanings instead of by simple searching.

Latin-1: See ISO 8859-1.

Leading: The amount of space between lines of printed text.

Legacy Data, Legacy System: ESI that can only be accessed via software and/or hardware that has become obsolete or been replaced. Legacy data may be costly to restore or reconstruct when required for investigation or litigation analysis or discovery.

Legal Hold: A legal hold is a communication issued as a result of current or reasonably anticipated litigation, audit, government investigation or other such matter that suspends the normal disposition or processing of records. Legal holds may encompass procedures affecting data that is accessible as well as data that is not reasonably accessible. The specific communication to business or IT organizations may also be called a hold, preservation order, suspension order, freeze notice, hold order, litigation hold, or hold notice. See, The Sedona Conference Commentary on Legal Holds, September 2010, available for download at http://www.thesedonaconference.org/download-pub/470.

Lempel-Ziv & Welch (LZW): A common, lossless compression standard for computer graphics, used for most TIFF files. Typical compression ratios are 4/1.
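
The dictionary-building idea behind LZW, and the lossless round trip it provides, can be sketched in Python as follows. This is a teaching sketch under simplifying assumptions: codes are kept as a plain list of integers, whereas TIFF's LZW variant packs variable-width codes and uses clear and end-of-information codes, none of which is modeled here. The function names are illustrative.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of dictionary codes (textbook LZW)."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # emit code for longest match
            table[candidate] = len(table)        # learn the new sequence
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from the list of codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):                 # code defined by this very step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
round_trip = lzw_decompress(codes)
```

Because the decoder rebuilds the same dictionary the encoder built, `round_trip` is byte-for-byte identical to `sample`, which is what "lossless" means in practice. The ratio actually achieved depends heavily on how repetitive the input is.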

Level Coding: Used in bibliographical coding to facilitate different treatment, such as prioritization or more thorough extraction of data, for different categories of documents, such as by type or source. See Coding.

LFP: IPRO Tech Inc.’s image cross reference file; an ASCII delimited text file that cross references an image’s Bates number to its location and file name.

Lifecycle: A record’s lifecycle is the life span of a record from its creation or receipt to its final disposition. Usually described in three stages: (1) creation, (2) maintenance and use, and (3) archive to final disposition. See Information Lifecycle Management.

Linear and Non-Linear Review: Performed by humans. Linear review workflow begins at the beginning of a collection and addresses information in order until a full review of all information is complete. Non-linear review workflow prepares only certain portions for review, based on the results of criteria such as search terms, computer assisted review results or some other method, to isolate only information likely to be responsive. See Review.

Linear Tape-Open (LTO): A type of magnetic backup tape that can hold as much as 800 GB of data, or 1200 CDs depending on the data file format.

Link: See Hyperlink.

Liquid Crystal Display (LCD): Two polarizing transparent panels with a liquid crystal surface between; application of voltage to certain areas causes the crystal to turn dark, and a light source behind the panel transmits through crystals not darkened.

Litigation Hold: See Legal Hold.

Load File: A file that relates to a set of scanned images or electronically processed files that indicates where individual pages or files belong together as documents, to include attachments, and where each document begins and ends. A load file may also contain data relevant to the individual documents, such as selected metadata, coded data, and extracted text. Load files should be obtained and provided in prearranged or standardized formats to ensure transfer of accurate and usable images and data.

Cited: Aguilar v. Immigration and Customs Enforcement Division of the United States Department of Homeland Security, 255 F.R.D. 350, 353 (S.D.N.Y. 2008).

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

Cited: E.E.O.C. v. SVT, LLC, 2014 U.S. Dist. LEXIS 50114 (N.D. Ind. Apr. 10, 2014).

Cited: Nat’l Day Laborer Org. Network v. United States Immigration & Customs Enforcement Agency, 2011 U.S. Dist. LEXIS 11655 (S.D.N.Y. February 7, 2011).
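
To illustrate how a load file marks where each document begins and ends, the following Python sketch groups page-level rows into documents. The column names (`page_id`, `image_path`, `doc_break`) and the sample rows are hypothetical; real load file formats differ by vendor and should be taken from the agreed production specification.

```python
import csv
import io

# Hypothetical page-level load file: "Y" in doc_break marks the first page of a document.
SAMPLE = """page_id,image_path,doc_break
ABC0001,IMG/ABC0001.tif,Y
ABC0002,IMG/ABC0002.tif,
ABC0003,IMG/ABC0003.tif,Y
"""


def group_pages(load_file_text: str) -> list[list[str]]:
    """Return one list of page IDs per document, in load file order."""
    documents: list[list[str]] = []
    for row in csv.DictReader(io.StringIO(load_file_text)):
        if row["doc_break"] == "Y" or not documents:
            documents.append([])          # a new document starts on this page
        documents[-1].append(row["page_id"])
    return documents


documents = group_pages(SAMPLE)
```

With the sample rows, pages ABC0001 and ABC0002 form one document and ABC0003 forms a second.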

Local Area Network (LAN): A group of computers at a single location (usually an office or home) that are connected by phone lines, coaxial cable or wireless transmission. See Network.

Log File: A text file created by an electronic device or application to record activity of a server, website, computer or software program.

Logical Entities: An abstraction of a real-world object or concept that is both independent and unique. Conceptually, a logical entity is a noun, and its relationships to other entities are verbs. In a relational database, a logical entity is represented as a table. Attributes of the entity are in columns of the table and instances of the entity are in rows of the table. Examples of logical entities are employees of a company, products in a store’s catalog, and patients’ medical histories.
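
A minimal sketch of the table, column and row mapping, using Python's built-in `sqlite3` module. The `employee` table, its columns and the sample names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# The entity "employee" becomes a table; its attributes become columns.
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL, department TEXT)"
)

# Each instance of the entity becomes one row.
conn.executemany(
    "INSERT INTO employee (name, department) VALUES (?, ?)",
    [("Ada Park", "Legal"), ("Ben Ortiz", "IT")],
)

rows = conn.execute(
    "SELECT employee_id, name, department FROM employee ORDER BY employee_id"
).fetchall()
conn.close()
```

Each tuple in `rows` is one instance of the entity, and each position in the tuple corresponds to one attribute.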

Logical File Space: The actual amount of space occupied by a file on a hard drive. The amount of logical file space differs from the physical file space because when a file is created on a computer, a sufficient number of clusters (physical file space) are assigned to contain the file. If the file (logical file space) is not large enough to completely fill the assigned clusters (physical file space), then some unused space will exist within the physical file space.
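
The relationship between logical and physical file space is simple arithmetic. The sketch below assumes a 4,096-byte cluster, which is a common default but varies by file system and volume; the function names are illustrative.

```python
def physical_file_space(logical_size: int, cluster_size: int = 4096) -> int:
    """Bytes reserved on disk: the whole clusters needed to hold the file."""
    clusters = (logical_size + cluster_size - 1) // cluster_size  # round up
    return clusters * cluster_size


def unused_space(logical_size: int, cluster_size: int = 4096) -> int:
    """Bytes inside the assigned clusters that the file does not fill."""
    return physical_file_space(logical_size, cluster_size) - logical_size


logical = 10_000                          # a 10,000-byte file
print(physical_file_space(logical))       # 12288 (three 4,096-byte clusters)
print(unused_space(logical))              # 2288
```

A file whose size is an exact multiple of the cluster size leaves no unused space in its last cluster.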

Logical Unitization: See Unitization – Physical and Logical.

Logical Volume: An area on the hard drive that has been formatted for file storage. A hard drive may contain a single or multiple volumes.

Lossless Compression: A method of compressing an image file, bit by bit, that results in no loss of information either during compression or extraction.

Lossy Compression: A method of image compression whereby the storage size of an image is reduced by decreasing the resolution and color fidelity while maintaining a minimum acceptable standard for general use. A lossy image is one where the image after compression is different from the original image due to lost information. The differences may or may not be noticeable, but a lossy conversion process does not retain all the original information. JPEG is an example of a lossy compression method.

Lotus Domino: An IBM server product providing enterprise-level email, collaboration capabilities and custom application platform; began life as Lotus Notes Server, the server component of Lotus Development Corporation’s client-server messaging technology. Can be used as an application server for Lotus Notes applications and/or as a Web server. Has a built-in database system in the format of .nsf.

Magnetic/Optical Storage Media: The physical piece of material that receives data that has been recorded using a number of different magnetic recording processes. Examples include hard drives, backup tapes, CD-ROMs, DVD-ROMs, Jaz and Zip Drives.

Magnetic Disk Emulation (MDE): Software that makes a jukebox look and operate like a hard drive such that it will respond to all the input/output (I/O) commands ordinarily sent to a hard drive.

Magneto-Optical Drive: A drive that combines laser and magnetic technology to create high-capacity erasable storage.

Mail Application Programming Interface (MAPI): A Windows-based software standard that enables a program to send and receive email by connecting the program to selected email servers. See API.

Mailbox: A term used to describe all email associated with an individual email account, whether located physically together on one server, across a server array or stored in cloud-based storage.

Make-Available Production: Process by which a generally large universe of potentially responsive documents is made available to a requestor; the requestor selects or tags desired documents, and the producing party produces only the selected documents. See Quick Peek.

Malware: Any type of malicious software program, typically installed illicitly, including viruses, Trojans, worms, key loggers, spyware, adware and others.

Management Information Systems (MIS): A phrase used to describe the resources, people and technology, used to manage the information of an organization.

MAPI Mail Near-Line: Documents stored on optical disks or compact disks that are housed in a jukebox or CD changer and can be retrieved without human intervention.

Marginalia: Handwritten notes on documents.

Master Boot Sector/Record: The sector on a hard drive that contains the computer code (boot strap loader) necessary for the computer to start up and the partition table describing the organization of the hard drive.

Master File Table (MFT): The primary record of file storage locations on a Microsoft Windows based computer employing NTFS filing systems.

Mastering: Making many copies of a disk from a single master disk.

MBOX: The format in which email is stored on traditional UNIX email systems.

Media: An object or device, such as a disk, tape or other device, on which data is stored.

Megabyte (MB): 1,024 kilobytes. See Byte.

Memory: Data storage in the form of chips, or the actual chips used to hold data; storage is used to describe memory that exists on tapes, disks, CDs, DVDs, flash drives and hard drives. See Random Access Memory (RAM), Read Only Memory (ROM).

Menu: A list of options, each of which performs a desired action such as choosing a command or applying a particular format to a part of a document.

Message-Digest Algorithm 5 (MD5): A hash algorithm used to give a numeric value to a digital file or piece of data. Commonly used in electronic discovery to find duplicates in a data collection. See Hash Coding.
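
As a brief illustration of hash-based duplicate detection, the Python sketch below groups items whose content yields the same MD5 value, using the standard `hashlib` module. The helper names and sample file names are illustrative, and a production workflow would typically hash files read from disk rather than in-memory byte strings.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 value of the data as a 32-character hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def find_duplicates(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each MD5 value shared by two or more files to the file names."""
    groups: dict[str, list[str]] = {}
    for name, content in files.items():
        groups.setdefault(md5_hex(content), []).append(name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}


collection = {
    "report.docx": b"quarterly figures",
    "report_copy.docx": b"quarterly figures",   # identical content, different name
    "notes.txt": b"meeting notes",
}
duplicates = find_duplicates(collection)
```

Only the content is hashed, so a renamed copy still matches its original, while a single changed byte produces a different value.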

Message Header: The text portion of an email that contains routing information of the email and may include author, recipient and server information, which tracks the path of the email from its origin server to its destination mailbox.
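
For illustration, the header fields of a message can be read with Python's standard `email` package. The message below is fabricated for the example, and all addresses and server names are placeholders.

```python
from email import message_from_string

# Fabricated sample message: header lines, a blank line, then the body.
raw = (
    "Received: from mail.example.org by mx.example.com; Mon, 6 Jan 2014 10:15:02 -0500\n"
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "Date: Mon, 6 Jan 2014 10:15:00 -0500\n"
    "\n"
    "Body text.\n"
)

msg = message_from_string(raw)
author = msg["From"]
recipient = msg["To"]
hops = msg.get_all("Received")   # one entry per server that handled the message
```

Each server that relays a message normally adds its own `Received` line, so reading those entries in sequence traces the path the message took.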

Message Unit: An email and any attachments associated with it.

Metadata: The generic term used to describe the structural information of a file that contains data about the file, as opposed to describing the content of a file. See System Generated Metadata and User Created Metadata. For a more thorough discussion, see The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age (Second Edition), https://thesedonaconference.org/download-pub/74, and The Sedona Conference Commentary on Ethics & Metadata, https://thesedonaconference.org/download-pub/3111.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

Metadata Comparison: A comparison of specified metadata as the basis for de-duplication without regard to content. See De-Duplication.

Microfiche: Sheet microfilm (4” by 6”) containing reduced images of 270 pages or more in a grid pattern.

Microprocessor: See CPU.

Microsoft-Disk Operating System (MS-DOS): Used in Windows-based personal computers as the control system prior to the introduction of 32-bit operating systems.

Microsoft Outlook: A personal information manager from Microsoft, part of the Microsoft Office suite. Although often used mainly as an email application, it also provides calendar, task and contact management, note taking, a journal and Web browsing. Can be used as a stand-alone application, or operate in conjunction with Microsoft Exchange Server to provide enhanced functions for multiple users in an organization, such as shared mailboxes and calendars, public folders and meeting time allocation.

Microsoft Outlook Express: A scaled down version of Microsoft Outlook.

MiFi: A portable wireless hub that allows users with the correct credentials to access the Internet.

Migrated Data: Electronically Stored Information that has been moved from one database or format to another.

Migration: Moving Electronically Stored Information from one computer application or platform to another; may require conversion to a different format.

Mirror Image: A bit by bit copy of any storage media. Often used to copy the configuration of one computer to another computer or when creating a preservation copy. See Forensic Copy and Image.

Cited: White v. Graceland College Center for Professional Development & Lifelong Learning, Inc., 2009 WL 722056 at *6 (D. Kan. March 18, 2009).

Mirroring: The duplication of Electronically Stored Information for purposes of backup or to distribute Internet or network traffic among several servers with identical ESI. See Bit Stream Backup, Disk Mirroring, Image.

Modem (Modulator-Demodulator): A device that can encode digital information into an analog signal (modulate) or decode the received analog signal to extract the digital information (demodulate).

Mount or Mounting: The process of making off-line Electronically Stored Information available for online processing. For example, placing a magnetic tape in a drive and setting up the software to recognize or read that tape. The terms load and loading are often used in conjunction with, or synonymously with, mount and mounting (as in “mount and load a tape”). Load may also refer to the process of transferring Electronically Stored Information from mounted media to another media or to an online system.

MPEG-1, -2, -3 and -4: Different standards for full motion video to digital compression/decompression techniques advanced by the Moving Picture Experts Group.

MSG: A common file format in which emails can be saved, often associated with the Microsoft Outlook email program, which preserves both the format and any associated attachment information.

Multimedia: The combined use of different media; integrated video, audio, text and data graphics in digital form.

Multimedia Messaging Service (MMS): A protocol of messaging that allows for the transmission of multimedia content such as pictures, video or sound over mobile networks. See Text Message.

Multisync: Analog video monitors that can receive a wide range of display resolutions, usually including TV (NTSC). Color analog monitors accept separate red, green & blue (RGB) signals.

National Institute of Standards and Technology (NIST): A federal technology agency that works with industry to develop and apply technology measurements and standards. See NIST List.

Native Format: Electronic documents have an associated file structure defined by the original creating application. This file structure is referred to as the native format of the document. Because viewing or searching documents in the native format may require the original application (for example, viewing a Microsoft Word document may require the Microsoft Word application), documents may be converted to a neutral format as part of the record acquisition or archive process. Static formats (often called imaged formats), such as TIFF or PDF, are designed to retain an image of the document as it would look viewed in the original creating application but do not allow metadata to be viewed or the document information to be manipulated unless agreed-upon metadata and extracted text are preserved. In the conversion to static format, some metadata can be processed, preserved and electronically associated with the static format file. However, with technology advancements, tools and applications are increasingly available to allow viewing and searching of documents in their native format while still preserving pertinent metadata. It should be noted that not all ESI may be conducive to production in either the Native Format or imaged format, and some other form of production may be necessary. Databases, for example, often present such issues. See Form of Production, Load File.

Cited: Akanthos Capital Management, LLC v. CompuCredit Holdings Corp., 2014 WL 896743 (N.D. Ga. Mar. 7, 2014).

Cited: Covad Communications Co. v. Revonet, Inc., 254 F.R.D. 147 (D.D.C. 2008).

Cited: Palar v. Blackhawk Bancorporation Inc., 2013 WL 1704302 (C.D. Ill. Mar. 19, 2013).

Cited: Race Tires America, Inc. v. Hoosier Racing Tire Corp., 2012 WL 887593 (3d Cir. March 16, 2012).

Native Format Review: Review of Electronically Stored Information in its native format using either a third party viewer application capable of rendering native files in close approximation to their original application or the actual original application in which the Electronically Stored Information was created. See Review.

Natural Language Search: A manner of searching that permits the use of plain language without special connectors or precise terminology, such as “Where can I find information on William Shakespeare?” as opposed to formulating a search statement (such as “information” and “William Shakespeare”). See Boolean Search.

Near Duplicates: (1) Two or more files that are similar to a certain percentage, for example, files that are 90% similar may be identified as near duplicates; used for review to locate similar documents and review all near duplicates at one time; (2) The longest email in an email conversation where the subparts are identified and suppressed in an email collection to reduce review volume.
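
The percentage-similarity sense of the term can be sketched with Python's standard `difflib` module. The 90% threshold follows the example in the definition; the use of `SequenceMatcher` is an assumption made for illustration, since commercial review tools use their own, usually undisclosed, similarity measures.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()


def near_duplicates(docs: dict[str, str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return every pair of document names whose texts meet the threshold."""
    pairs = []
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(docs.items()), 2):
        if similarity(text_a, text_b) >= threshold:
            pairs.append((name_a, name_b))
    return pairs


docs = {
    "memo_v1": "The meeting is scheduled for Monday at 10am in room 4.",
    "memo_v2": "The meeting is scheduled for Monday at 11am in room 4.",
    "invoice": "Invoice 1182 is due within thirty days of receipt.",
}
pairs = near_duplicates(docs)
```

The two memo versions differ by one character and are paired; the invoice is not. Comparing every pair is slow on large collections, so this approach is suited only to small sets.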

Near-Line Data Storage: A term used to refer to a data storage system where data is not actively available to users, but is available through an automated system that enables the robotic retrieval of removable storage media or tapes. Data in near-line storage is often stored on servers that do not have as high performance as active servers. Making near-line data available will not require human intervention (as opposed to off-line data which can only be made available through human actions).

Network: A group of two or more computers and other devices connected together (“networked”) for the exchange and sharing of resources. See Local-Area Network (LAN) and Wide-Area Network (WAN).

Network Operating System (NOS): See Operating System.

Network Operations Center (NOC): The location where a network or computer array is monitored and maintained.

Neural Network: Neural networks are made up of interconnected processing elements called units, which respond in parallel to a set of input signals given to each.

New Technology File System (NTFS): A high-performance and self-healing file system proprietary to Microsoft, used in Windows NT, Windows 2000, Windows XP and Windows Vista Operating Systems, that supports file-level security, compression and auditing. It also supports large volumes and powerful storage solutions such as Redundant Array of Inexpensive Disks (RAID). An important feature of NTFS is the ability to encrypt files and folders to protect sensitive data.

NIST List: A hash database of computer files developed by NIST to identify files that are system generated and generally accepted to have no substantive value in most instances.

Node: Any device connected to a network. PCs, servers and printers are all nodes on the network.

Non-Interlace: When each line of a video image is scanned separately. Older CRT computer monitors use non-interlaced video.

Normalization: The process of reformatting data so that it is stored in a standardized form, such as setting the date and time stamp of a specific volume of Electronically Stored Information to a specific zone, often GMT, to permit advanced processing of the ESI, such as de-duplication. See Coordinated Universal Time.
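
Time zone normalization can be sketched with Python's standard `datetime` module. The fixed offsets and sample timestamps below are assumptions for illustration; real collections also need daylight saving rules and a policy for timestamps that carry no zone information at all.

```python
from datetime import datetime, timedelta, timezone


def normalize_to_utc(stamp: datetime) -> datetime:
    """Convert a zone-aware timestamp to Coordinated Universal Time."""
    if stamp.tzinfo is None:
        raise ValueError("timestamp has no zone offset, so it cannot be normalized")
    return stamp.astimezone(timezone.utc)


eastern = timezone(timedelta(hours=-5))   # assumed fixed offset, no daylight saving
pacific = timezone(timedelta(hours=-8))   # assumed fixed offset, no daylight saving

sent = datetime(2014, 3, 7, 9, 30, tzinfo=eastern)      # 9:30 a.m. Eastern
received = datetime(2014, 3, 7, 6, 30, tzinfo=pacific)  # 6:30 a.m. Pacific

print(normalize_to_utc(sent).isoformat())       # 2014-03-07T14:30:00+00:00
print(normalize_to_utc(received).isoformat())   # 2014-03-07T14:30:00+00:00
```

Once both copies carry the same normalized value, a de-duplication step that compares date and time fields can recognize them as the same moment.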

Notes Server: See Lotus Domino.

NSF: Lotus Notes container file (e.g., database.nsf); can be either an email database or the traditional type of fielded database. See Lotus Domino.

Object: In personal computing, an object is a representation of something that a user can work with to perform a task and can appear as text or an icon. In a high-level method of programming called object-oriented programming (OOP), an object is a freestanding block of code that defines the properties of something.

Object Linking and Embedding (OLE): A feature in Microsoft’s Windows that allows the linking of different files, or parts of files, together into one file without forfeiting any of the original file’s attributes or functionality. See Compound Document.

Objective Coding: See Coding.

OCR (Optical Character Recognition): A technology process that captures text from an image for the purpose of creating a parallel text file that can be associated with the image and searched in a database. OCR software evaluates scanned data for shapes it recognizes as letters or numerals. See Handwriting Recognition Software (HRS), Intelligent Character Recognition (ICR).

Cited (Glossary Third Ed.): Country Vintner of North Carolina, LLC v. E. & J. Gallo Winery, Inc., 718 F.3d 249 (4th Cir. Apr. 29, 2013).

Official Record Owner: See Record Owner.

Offline Data: The storage of Electronically Stored Information outside the network in daily use (e.g., on backup tapes) that is only accessible through the offline storage system, not the network.

Offline Storage: Electronically Stored Information stored on removable disk (optical, compact, etc.) or magnetic tape and not accessible by the active software or server. Often used for making disaster-recovery copies of records for which retrieval is unlikely. Accessibility to offline media usually requires restoring the data back to the active server.

Online: Connected to a network or the Internet.

Online Review: The review of data on a computer, either locally on a network or via the Internet. See Review.

Online Storage: The storage of Electronically Stored Information as fully accessible information in daily use on the network or elsewhere.

Ontology: A collection of categories and their relationships to other categories and to words. An ontology is one of the methods used to find related documents when given a specific query.

Operating System (OS): Provides the software platform that directs the overall activity of a computer, network or system and on which all other software programs and applications run. In many ways, choice of an operating system will affect which applications can be run. Operating systems perform basic tasks, such as recognizing input from the keyboard, sending output to the display screen, keeping track of files and directories on the disk and controlling peripheral devices such as disk drives and printers. For large systems, the operating system has even greater responsibilities and powers – becoming a traffic cop to make sure different programs and users running at the same time do not interfere with each other. The operating system is also responsible for security, ensuring that unauthorized users do not access the system. Examples of computer operating systems are UNIX, DOS, Microsoft Windows, LINUX, Mac OS and IBM z/OS. Examples of portable device operating systems are iOS, Android, Microsoft Windows and BlackBerry. Operating systems can be classified in a number of ways, including: multi-user (allows two or more users to run programs at the same time – some operating systems permit hundreds or even thousands of concurrent users); multiprocessing (supports running a program on more than one CPU); multitasking (allows more than one program to run concurrently); multithreading (allows different parts of a single program to run concurrently); and real time (instantly responds to input – general-purpose operating systems, such as DOS and UNIX, are not real-time).

Optical Disks: Computer media similar to a compact disk that cannot be rewritten. An optical drive uses a laser to read the ESI.

Optical Jukebox: See Jukebox.

OST: A Microsoft Outlook information store that is used to save folder information that can be accessed offline.

Cited: White v. Graceland College Center for Professional Development & Lifelong Learning, Inc., 2009 WL 722056 at *5 (D. Kan. March 18, 2009).

Outlook: See Microsoft Outlook.

Over-inclusive: When referring to data sets returned by some method of query, search, filter or cull, results that are overly broad.

Overwrite: To record or copy new data over existing data, as in when a file or directory is updated.

Packet: A unit of data sent across a network that may contain identity and routing information. When a large block of data is to be sent over a network, it is broken up into several packets, sent and then reassembled at the other end. The exact layout of an individual packet is determined by the protocol being used.
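
The break-up and reassembly described above can be sketched in Python. The `Packet` layout here (a sequence number, a total count and a payload) is a made-up minimal layout for illustration; as the definition notes, real packet layouts are set by the protocol in use.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # position of this packet in the original data
    total: int      # how many packets make up the whole block
    payload: bytes  # this packet's slice of the data


def split_into_packets(data: bytes, size: int) -> list[Packet]:
    """Break a block of data into packets carrying at most `size` bytes each."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [Packet(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> bytes:
    """Put packets back in sequence order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


message = b"abcdefghij"
packets = split_into_packets(message, 4)        # payloads: abcd, efgh, ij
rebuilt = reassemble(list(reversed(packets)))   # arrival order does not matter
```

The sequence number is what allows the receiving end to restore the original order even when packets arrive out of order.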

Page File/Paging File: A method to temporarily store data outside of the main memory but quickly retrievable. This information is left in the swap file after the programs are terminated and may be retrieved using forensic techniques. See Swap File.

Parallel Port: See Port.

Parent: See Document.

Parsing: In electronic discovery, the process by which a file is broken apart into its individual components for indexing, processing or to prepare for loading into a review database.

Partition: An individual section of computer storage media such as a hard drive. For example, a single hard drive may be divided into several partitions in order that each partition can be managed separately for security or maintenance purposes. When a hard drive is divided into partitions, each partition is designated by a separate drive letter, i.e., C, D, etc.

Partition Table: Indicates each logical volume contained on a disk and its location.

Partition Waste Space: After the boot sector of each volume or partition is written to a track, it is customary for the system to skip the rest of that track and begin the actual useable area of the volume on the next track. This results in unused or wasted space on that track where information can be hidden. This wasted space can only be viewed with a low level disk viewer. However, forensic techniques can be used to search these wasted space areas for hidden information.

Password: A text or alphanumeric string that is used to authenticate a specific user’s access to a secure program, network or part of a network.

Path: (1) The hierarchical description of where a directory, folder or file is located on a computer or network; (2) A transmission channel, the path between two nodes of a network that a data communication follows, and the physical cabling that connects the nodes on a network.

Pattern Matching: A generic term that describes any process that compares one file’s content with another file’s content.

Pattern Recognition: Technology that searches Electronically Stored Information for like patterns and flags, and extracts the pertinent data, usually utilizing an algorithm. For instance, in looking for addresses, alpha characters followed by a comma and a space, followed by two capital alpha characters, followed by a space, followed by five or more digits, are usually the city, state and zip code. By programming the application to look for a pattern, the information can be electronically identified, extracted, or otherwise utilized or manipulated.
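
The city, state and zip code pattern described above can be approximated with a regular expression. The expression below is one possible rendering written for this sketch, not a pattern taken from any product: it assumes city names are one or more capitalized words and accepts an optional four-digit zip extension.

```python
import re

# City (capitalized words), comma and space, two capital letters, space, zip code.
CITY_STATE_ZIP = re.compile(
    r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*), ([A-Z]{2}) (\d{5}(?:-\d{4})?)\b"
)

text = "Offices in Sedona, AZ 86336 and New York, NY 10001."
matches = CITY_STATE_ZIP.findall(text)
print(matches)   # [('Sedona', 'AZ', '86336'), ('New York', 'NY', '10001')]
```

A pattern this simple will miss city names with punctuation (such as "St. Louis") and will match any capitalized phrase that happens to fit the shape, so results still need review.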

Peer-to-Peer or P2P: A form of network organization that uses portions of each user’s resources, like storage space or processing power, for use by others on the network. Notorious examples include the storage sharing of Napster or BitTorrent.

Peripheral: Any accessory device attached to a computer, such as a disk drive, printer, modem or joystick.

Peripheral Component Interconnect or Interface (PCI): A high-speed interconnect local bus used to support multimedia devices.

Personal Address Book (PAB): A file type to describe a Microsoft Outlook list of contacts created and maintained by an individual user for personal use.

Personal Computer (PC): Computer based on a microprocessor and designed to be used by one person at a time.

Personal Computer Memory Card International Association (PCMCIA): Plug-in cards for computers (usually portables) that extend the storage and/or functionality.

Personal Data (as used in the EU Data Protection Act): Data which relate to a natural person who can be identified from the data, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his or her physical, physiological, mental, economic, cultural or social identity. Also referred to as PII (Personally Identifiable Information).

Personal Digital Assistant (PDA): A portable device used to perform communication and organizational tasks.

Personal Filing Cabinet (PFC): The AOL proprietary email storage container file used for the local storage of emails, contacts, calendar events and other personal information.

Petabyte (PB): 1,024 terabytes (approximately one million gigabytes). See Byte.
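
The "approximately one million gigabytes" figure follows from the factor of 1,024 between adjacent units, as this short Python check shows. The `UNITS` list and `to_bytes` helper are illustrative names.

```python
# Each unit is 1,024 times the one before it.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def to_bytes(value: int, unit: str) -> int:
    """Convert a value in the given unit to bytes."""
    return value * 1024 ** UNITS.index(unit)


gigabytes_per_petabyte = to_bytes(1, "PB") // to_bytes(1, "GB")
print(gigabytes_per_petabyte)   # 1048576, i.e. 1,024 x 1,024
```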

Physical Disk: An actual piece of computer media, such as the hard disk or drive, floppy disks, CD-ROM disks, zip drive, etc.

Physical File Storage: When a file is created on a computer, a sufficient number of clusters are assigned to contain the file. If the file is not large enough to completely fill the assigned clusters then some unused space will exist within the physical file space. This unused space is referred to as file slack and can contain unused space, previously deleted/overwritten files or fragments thereof.
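
As a rough illustration of file slack, the sketch below computes how many bytes are left unused in the last cluster assigned to a file. The 4,096-byte cluster size is an assumption for the example (actual cluster sizes depend on the file system), and the function name is hypothetical.

```python
def file_slack(file_size: int, cluster_size: int = 4096) -> int:
    """Return the unused bytes between the end of the file and the end of its last cluster.

    cluster_size defaults to 4096 bytes, an assumed value for illustration.
    """
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters = -(-file_size // cluster_size)  # ceiling division: clusters assigned
    return clusters * cluster_size - file_size


print(file_slack(10_000))  # 2288: three clusters (12,288 bytes) hold a 10,000-byte file
```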

Physical Unitization: See Unitization – Physical and Logical.

Picture Element: The smallest addressable unit on a display screen. The higher the resolution (the more rows and columns), the more information can be displayed.

Ping: Executable command, used as a test for checking network connectivity.

Pitch: Characters (or dots) per inch, measured horizontally.

Pixel: A single unit of a raster image that allows a picture to be displayed on an electronic screen or computer monitor.

Plaintext or Plain Text: The least formatted and therefore most portable form of text for computerized documents.

Plasma Display: A type of flat panel display commonly used for large televisions in which many tiny cells are located between two panels of glass holding an inert mixture of gases that are then electronically charged to produce light.

Platform as a Service (PaaS): A form of Cloud computing which describes the outsourcing of the computer platform upon which development and other workflows can be performed without the costs of hardware, software and personnel. See Cloud Computing.

Platter: One of several components that make up a computer hard drive. Platters are thin, rapidly rotating disks that have a set of read/write heads on both sides of each platter. Each platter is divided into a series of concentric rings called tracks. Each track is further divided into sections called sectors, and each sector is sub-divided into bytes.

Plug and Play (PNP): A method by which new hardware may be detected, configured, and used by existing systems upon connection with little or no user intervention.

Pointer: An index entry in the directory of a disk (or other storage medium) that identifies the space on the disk in which an electronic document or piece of electronic data resides, thereby preventing that space from being overwritten by other data. In most cases, when an electronic document is deleted, the pointer is deleted, allowing the document to be overwritten, but the document is not actually erased until overwritten.

Port: An interface between a computer and other computers or devices. Can be divided into two primary groups based on signal transfer: Serial ports send and receive one bit at a time via a single wire pair, while parallel ports send multiple bits at the same time over several sets of wires. See Universal Serial Bus (USB) Port. Software ports are virtual data connections used by programs to exchange data directly instead of going through a file or other temporary storage locations; the most common types are Transmission Control Protocol/Internet Protocol (TCP/IP) and User Datagram Protocol (UDP).

Portable Document Format (PDF): A file format technology developed by Adobe Systems to facilitate the exchange of documents between platforms regardless of originating application by preserving the format and content.

Cited: Country Vintner of North Carolina, LLC v. E. & J. Gallo Winery, Inc., 718 F.3d 249 (4th Cir. Apr. 29, 2013).

Cited: Saliga v. Chemtura Corp., 2013 WL 6182227 (D. Conn. Nov. 25, 2013).

Portable Volumes: A feature that facilitates the moving of large volumes of documents without requiring copying multiple files. Portable volumes enable individual CDs to be easily regrouped, detached and reattached to different databases for a broader information exchange.

Portrait Mode: A page orientation or display such that the height exceeds the width (vertical).

Precision: When describing search results, precision is the number of relevant documents retrieved from a search divided by the total number of documents returned. For example, in a search for documents relevant to a document request, it is the percentage of documents returned from a search that are actually relevant to the request. See The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery, available for download at https://thesedonaconference.org/download-pub/3669.
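
A minimal sketch of the precision calculation, assuming documents are represented by identifiers held in Python sets. The function name and sample identifiers are illustrative only.

```python
def precision(retrieved: set, relevant: set) -> float:
    """Share of the returned documents that are actually relevant."""
    if not retrieved:
        return 0.0  # convention chosen here for an empty result set
    return len(retrieved & relevant) / len(retrieved)


returned_by_search = {"DOC1", "DOC2", "DOC3", "DOC4"}
actually_relevant = {"DOC2", "DOC4", "DOC5", "DOC6"}
print(precision(returned_by_search, actually_relevant))  # 0.5: two of four returned documents are relevant
```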

Predictive Coding/Ranking: See Technology-Assisted Review.

Preservation: The process of retaining documents and ESI, including document metadata, for legal purposes, including suspension of normal document destruction policies and procedures. See Spoliation.

Preservation Notice, Preservation Order: See Legal Hold.

Primary Key: A unique value stored in a field or fields of a database record that is used to identify the record and, in a relational database, to link multiple tables together.
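
The following sketch shows a primary key identifying a record and linking two tables, using Python's built-in sqlite3 module and an in-memory database. The table and column names (custodian, document, custodian_id, doc_id) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE custodian (custodian_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE document ("
    " doc_id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " custodian_id INTEGER NOT NULL REFERENCES custodian(custodian_id))"
)
conn.execute("INSERT INTO custodian VALUES (1, 'A. Smith')")
conn.execute("INSERT INTO document VALUES (100, 'Q3 forecast', 1)")

# The custodian's primary key links each document row to its custodian row.
rows = conn.execute(
    "SELECT d.title, c.name FROM document d"
    " JOIN custodian c ON c.custodian_id = d.custodian_id"
).fetchall()
print(rows)  # [('Q3 forecast', 'A. Smith')]
```

Inserting a second custodian with the same custodian_id would be rejected by the database, which is what makes the key a reliable identifier.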

Print On Demand (POD): A term referring to document images stored in electronic format and available to be quickly printed.

Printout: Printed data, also known as hard copy.

Private Network: A network that is connected to the Internet but is isolated from the Internet with security measures allowing use of the network only by persons within the private network.

Privilege Data Set: The universe of documents identified as responsive and/or relevant but withheld from production on the grounds of legal privilege, a log of which is usually required to notify of withheld documents and the grounds on which they were withheld (e.g., work product, attorney-client privilege).

Process/processing (as used in the EU Data Protection Act): Any operation or set of operations which is performed upon Personal Data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction.

Processing Data: The automated ingestion of Electronically Stored Information into a program for the purpose of extracting metadata and text; and in some cases, the creation of a static image of the source Electronically Stored Information files according to a predetermined set of specifications, in anticipation of loading to a database. Specifications can include the de-duplication of ESI, or filtering based on metadata contents such as date or email domain and specific metadata fields to be included in the final product.
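
Two of the specifications mentioned above, de-duplication and date filtering, can be sketched as follows. This is a simplified illustration, not a description of any particular processing tool: the record layout (name, content, sent) and the function name are assumptions, and duplicates are identified here by a SHA-256 hash of the content.

```python
import hashlib
from datetime import date


def process_files(files, start: date, end: date):
    """Keep files sent within [start, end], dropping exact-content duplicates."""
    seen_hashes = set()
    kept = []
    for f in files:
        if not (start <= f["sent"] <= end):
            continue  # filtered out by the date specification
        digest = hashlib.sha256(f["content"]).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a file already kept
        seen_hashes.add(digest)
        kept.append({"name": f["name"], "sha256": digest, "sent": f["sent"]})
    return kept


files = [
    {"name": "a.msg", "content": b"hello", "sent": date(2014, 1, 5)},
    {"name": "b.msg", "content": b"hello", "sent": date(2014, 1, 6)},  # duplicate content
    {"name": "c.msg", "content": b"other", "sent": date(2013, 1, 1)},  # outside date range
]
kept = process_files(files, date(2014, 1, 1), date(2014, 12, 31))
print([k["name"] for k in kept])  # ['a.msg']
```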

Production: The process of delivering to another party, or making available for that party’s review, documents and/or Electronically Stored Information deemed responsive to a discovery request.

Production Data Set: The universe of documents and/or Electronically Stored Information identified as responsive to document requests and not withheld on the grounds of privilege.

Production Number: Often referred to as the Bates number. A sequential number assigned to every page of a production for fixed image production formats, or to every file in a native file production, used for tracking and reference purposes. Often used in conjunction with a suffix or prefix to identify the producing party, the litigation, or other relevant information. See Bates Number, Beginning Document Number.
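
A minimal sketch of sequential production numbering with a party prefix. The prefix "ABC" and the seven-digit zero padding are assumptions for illustration; actual formats are set by the parties or the production specification.

```python
def production_numbers(prefix: str, start: int, count: int, width: int = 7):
    """Return `count` sequential, zero-padded production numbers starting at `start`."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]


print(production_numbers("ABC", 1, 3))
# ['ABC0000001', 'ABC0000002', 'ABC0000003']
```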

Program: See Application and Software.

Properties: File level metadata describing attributes of the physical file, e.g., size, creation date and author. See Metadata.

Protocol: Defines a common series of rules, signals and conventions that allow different kinds of computers and applications to communicate over a network. One of the most common protocols for networks is called Transmission Control Protocol/Internet Protocol (TCP/IP).

Protodigital: Primitive or first-generation digital. Applied as an adjective to systems, software, documents, or ways of thinking. The term was first used in music to refer to early computer synthesizers that attempted to mimic the sound of traditional musical instruments and to early jazz compositions written on computers with that instrumentation in mind. In electronic discovery, this term is most often applied to systems or ways of thinking that – on the surface – appear to embrace digital technology, but attempt to equate Electronically Stored Information to paper records, ignoring the unique attributes of ESI. When someone says, “What’s the big deal with e-discovery? Sure we have a lot of email. You just print it all out and produce it like you used to”; that is an example of protodigital thinking. When someone says, “We embrace electronic discovery. We scan everything to .PDF before we produce it”; that person is engaged in protodigital thinking – attempting to fit Electronically Stored Information into the paper discovery paradigm.

Proximity Search: A search syntax written to find two or more words within a specified distance from each other.
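
One way to picture a two-term proximity search is the sketch below, which measures distance in whole words. Real search programs each have their own syntax and tokenization rules, so the function name and the word-splitting rule here are assumptions.

```python
import re


def within(text: str, term_a: str, term_b: str, distance: int) -> bool:
    """True if term_a and term_b occur within `distance` words of each other."""
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= distance for i in positions_a for j in positions_b)


sentence = "The contract was signed after the merger closed"
print(within(sentence, "contract", "merger", 5))  # True: the words are five positions apart
print(within(sentence, "contract", "merger", 3))  # False
```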

PST: A Microsoft Outlook email storage file containing archived email messages in a compressed format.

Cited: White v. Graceland College Center for Professional Development & Lifelong Learning, Inc., 2009 WL 722056 at *5 (D. Kan. March 18, 2009).

Public Key Infrastructure (PKI) Digital Signature: A system, including hardware, software and policies, designed to manage digital certificates and match those certificates to specific users so that data can be validated as authentic. See Certificate, Digital Certificate and Digital Signature.

Public Network: A network that is part of the public Internet.

Quality Control (QC): Steps taken to ensure that results of a given task, product or service are of sufficiently high quality; the operational techniques and activities that are used to fulfill requirements for quality. In document handling and management processes, this includes image quality (resolution, skew, speckle, legibility, etc.), and data quality (correct information in appropriate fields, validated data for dates, addresses, names/issues lists, etc.).

Quarter Inch Cartridge (QIC): Digital recording tape, 2000 feet long, with an uncompressed capacity of 5 GB.

Query: An electronic search request for specific information from a database or other ESI.

Query By Image Content (QBIC): An IBM search system for stored images that allows the user to sketch an image and then search the image files to find those which most closely match. The user can specify color and texture – such as sandy beaches or clouds.

Queue: A sequence of items such as packets or print jobs waiting to be processed. For example, a print queue holds files that are waiting to be printed.

Quick Peek: An initial production whereby documents and/or Electronically Stored Information are made available for review or inspection before being reviewed for responsiveness, relevance, privilege, confidentiality or privacy. See Make-Available Production.

Quick Response (QR) Code: A small, square matrix pattern that can be read by an optical scanner or mobile phone camera; can store thousands of alphanumeric characters and may be affixed to business cards, advertising, product parts or other objects in order to convey information, commonly an internet URL.

Random Access Memory (RAM): Hardware inside a computer that retains memory on a short-term basis and stores information while the computer is in use. It is the working memory of the computer into which the operating system, startup applications and drivers are loaded when a computer is turned on, or where a program subsequently started up is loaded, and where thereafter, these applications are executed. RAM can be read or written in any section with one instruction sequence. It helps to have more of this working space installed when running advanced operating systems and applications to increase operating efficiency. RAM content is erased each time a computer is turned off. See Dynamic Random Access Memory (DRAM).

Raster/Rasterized (Raster or Bitmap Drawing): A method of representing an image with a grid (or map) of dots. Typical raster file formats are GIF, JPEG, TIFF, PCX, BMP, etc. and they typically have jagged edges.

Read Only Memory (ROM): Random memory that can be read but not written or changed. Also, hardware, usually a chip, within a computer containing programming necessary for starting the computer and essential system programs that neither the user nor the computer can alter or erase. Information in the computer’s ROM is permanently maintained even when the computer is turned off.

Recall: When describing search results, recall is the number of responsive documents retrieved from a search divided by all of the responsive documents in a collection. For example, in a search for documents relevant to a document request, it is the percentage of relevant documents returned from a search compared against all documents that should have been returned and exist in the data set. See The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery, available for download at https://thesedonaconference.org/download-pub/3669.
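
A minimal sketch of the recall calculation, assuming documents are represented by identifiers held in Python sets. The function name and sample identifiers are illustrative only.

```python
def recall(retrieved: set, relevant: set) -> float:
    """Share of all relevant documents in the collection that the search returned."""
    if not relevant:
        return 0.0  # convention chosen here when nothing in the collection is relevant
    return len(retrieved & relevant) / len(relevant)


returned_by_search = {"DOC1", "DOC2", "DOC3", "DOC4"}
should_have_been_returned = {"DOC2", "DOC4", "DOC5", "DOC6"}
print(recall(returned_by_search, should_have_been_returned))  # 0.5: two of four relevant documents were found
```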

Record: (1) Information, regardless of medium or format, that has value to an organization. (2) A single row of information or subset of data elements in a database.

Record Custodian: An individual responsible for the physical storage of records throughout their retention period. In the context of electronic records, custodianship may not be a direct part of the records management function in all organizations. For example, some organizations may place this responsibility within their Information Technology Department, or they may assign responsibility for retaining and preserving records with individual employees. See Record Owner.

Record Lifecycle: The time period from when a record is created until it is disposed of. See Information Lifecycle Management.

Record Owner: The physical custodian or subject matter expert on the contents of the record and responsible for the lifecycle management of the record. This may be, but is not necessarily, the author of the record. See Record Custodian.

Record Series: A description of a particular set of records within a file plan. Each category has retention and disposition data associated with it, applied to all record folders and records within the category. See DoD 5015.

Record Submitter: The person who enters a record in an application or system. This may be, but is not necessarily, the author or the record owner.

Records Archive: See Repository for Electronic Records.

Records Hold: See Legal Hold.

Records Management: The planning, controlling, directing, organizing, training, promoting and other managerial activities involving the life-cycle of information, including creation, maintenance (use, storage, retrieval) and disposition, regardless of media. See Information Lifecycle Management.

Records Manager: The person responsible for the implementation of a records management program in keeping with the policies and procedures that govern that program, including the identification, classification, handling and disposition of the organization’s records throughout their retention life-cycle. The physical storage and protection of records may be a component of this individual’s functions, but it may also be delegated to someone else. See Record Custodian.

Records Retention Period: The length of time a given record series must be kept, expressed as either a time period (e.g., four years), an event or action (e.g., audit), or a combination (e.g., six months after audit).

Records Retention Schedule: A plan for the management of records listing types of records and how long they should be kept; the purpose is to provide continuing authority to dispose of or transfer records to historical archives. See Information Lifecycle Management.

Records Store: See Repository for Electronic Records.

Recover, Recovery: See Restore.

Redaction: A portion of an image or document is intentionally obscured or removed to prevent disclosure of a specific portion. Done to protect privileged or irrelevant portions, including highly confidential, sensitive or proprietary information.

Redundant Array of Independent Disks (RAID): A method of storing data on servers that usually combines multiple hard drives into one logical unit, thereby increasing capacity, reliability and backup capability. RAID systems may vary in levels of redundancy, with no redundancy being a single, non-mirrored disk as level 0, two disks that mirror each other as level 1, on up, with level 5 being one of the most common. RAID systems are more complicated to restore and copy.

Refresh Rate: The number of times per second a computer display (such as on a CRT or TV) is updated.

Region (of an image): An area of an image file that is selected for specialized processing. Also called a zone.

Registration: (1) In document coding, it is the process of lining up an image of a form to determine the location of specific fields. See Coding; (2) entering pages into a scanner such that they are correctly read.

Relative Path: The electronic path on a network or computer to an individual file from a common point on the network.
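
The idea of a path expressed from a common point can be sketched with Python's standard pathlib module. The folder and file names below are hypothetical.

```python
from pathlib import PurePosixPath

# The common point (base) and the full path to one file beneath it.
base = PurePosixPath("/shares/legal/matter_001")
full_path = PurePosixPath("/shares/legal/matter_001/production/vol1/doc0001.tif")

relative = full_path.relative_to(base)
print(relative)  # production/vol1/doc0001.tif
```

Joining the base and the relative path again (`base / relative`) yields the original full path.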

Remote Access: The ability to access and use digital information from a location off-site from where the information is physically located; e.g., to use a computer, modem and some remote access software to connect to a network from a distant location.

Render Images: To take a native format electronic file and convert it to an image that appears as if the original format file were printed to paper. See Image Processing.

Replication: See Disk Mirroring.

Report: Formatted output of a system providing specific information.

Repository for Electronic Records: A direct access device on which the electronic records and associated metadata are stored. Sometimes called a records store or records archive.

Residual Data: Sometimes also referred to as Ambient Data; data that is not active on a computer system as the result of being deleted or moved to another location and is unintentionally left behind. Residual data includes: (1) data found on media free space; (2) data found in file slack space; and (3) data within files that has functionally been deleted in that it is not visible using the application with which the file was created, without use of undelete or special data recovery techniques. May contain copies of deleted files, Internet files and file fragments. See Latent Data.

Resolution: Refers to the sharpness and clarity of an image. The term is most often used to describe monitors, printers and graphic images.

Restore: To transfer data from a backup medium (such as tapes) to an active system, often for the purpose of recovery from a problem, failure or disaster. Restoration of archival media is the transfer of data from an archival store to an active system for the purposes of processing (such as query, analysis, extraction or disposition of that data). Archival restoration of systems may require not only data restoration but also replication of the original hardware and software operating environment. Restoration of systems is often called recovery.

Retention Schedule: See Records Retention Schedule.

Reverse Engineering: The process of analyzing a system or piece of software to identify how it was created in order to recreate it in a new or different form. Reverse engineering is usually undertaken in order to redesign the system for better maintainability or to produce a copy of a system without utilizing the design from which it was originally produced. For example, one might take the executable code of a computer program, run it to study how it behaved with different input, and then attempt to write a program that behaved the same or better.

Review: The process of reading or otherwise analyzing documents to make a determination as to the document’s applicability to some objective or subjective standard. Often used to describe the examination of documents in a legal context for their responsiveness or relevance to specific issues in a matter. See Native Format Review, Online Review.

Rewriteable Technology: Storage devices where the data may be written more than once – typically hard drives, floppies and optical disks.

RFC822: Standard that specifies a syntax for text messages that are sent between one or more computer users, within the framework of email.
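
The message syntax consists of header lines, a blank line, and then the body. As an illustrative sketch, Python's standard email package can parse a message written in this style; the addresses and content below are invented for the example.

```python
from email import message_from_string

raw_message = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Meeting notes\n"
    "Date: Mon, 03 Mar 2014 09:30:00 -0500\n"
    "\n"  # a blank line separates the headers from the body
    "Notes attached.\n"
)

msg = message_from_string(raw_message)
print(msg["From"])     # alice@example.com
print(msg["Subject"])  # Meeting notes
print(msg.get_payload().strip())  # Notes attached.
```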

Rich Text Format (RTF): A standard text file format that preserves minimal stylistic formatting of document files for ease in exchange between various parties with different software.

Rip: To extract Electronically Stored Information from container files, for example to unbundle email collections into individual emails, during the e-discovery process while preserving metadata, authenticity and ownership. Also used to describe the extraction or copying of data to or from an external storage device.

RIM: Records and information management. [Also the acronym of the company that developed and sells BlackBerry devices, Research In Motion.]

RLE (Run Length Encoded): Compressed image format; supports only 256 colors; most effective on images with large areas of black or white.
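
The general run-length technique, which explains why the format works best on large uniform areas, can be sketched as follows. This illustrates the encoding idea only, not the byte layout of any particular RLE image file; the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


row = [0, 0, 0, 0, 1, 1, 0]  # 0 = white, 1 = black (assumed convention)
encoded = rle_encode(row)
print(encoded)  # [(0, 4), (1, 2), (0, 1)]
```

A row of 1,000 white pixels would encode as the single pair (0, 1000), whereas a row that alternates on every pixel gains nothing from the encoding.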

Root Directory: The top level in a hierarchical file system. For example on a PC, the root directory of the hard drive, usually C:, contains all the second-level subdirectories on that drive.

Router: A device that forwards data packets along networks. A router is connected to at least two networks, commonly two LANs or WANs or a LAN and its ISP’s network. Routers are located at gateways, the places where two or more networks connect. See Wireless Router.

SaaS (Software as a Service): Software application delivery model where a software vendor develops a Web-native software application and hosts and operates (either independently or through a third-party) the application for use by its customers over the Internet. Customers pay, not for owning the software itself, but for using it. See Application Service Provider (ASP), Cloud Computing.

Sampling: Sampling usually refers to the process of testing a database or a large volume of Electronically Stored Information for the existence or frequency of relevant information. It can be a useful technique in addressing a number of issues relating to litigation, including decisions about what repositories of data are appropriate to search in a particular litigation, and determinations of the validity and effectiveness of searches or other data extraction procedures.
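
One simple form of sampling is to draw a random subset of documents and measure how often relevant material appears in it. The sketch below assumes a simple random sample and a caller-supplied relevance test; the function and parameter names are hypothetical, and real sampling protocols also address sample size and confidence levels, which are not shown here.

```python
import random


def estimate_prevalence(population, is_relevant, sample_size: int, seed: int = 0) -> float:
    """Estimate the fraction of relevant documents from a simple random sample."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(population, sample_size)
    hits = sum(1 for doc in sample if is_relevant(doc))
    return hits / sample_size


# Toy population: document IDs 0..999, where every tenth document is "relevant".
documents = list(range(1000))
estimate = estimate_prevalence(documents, lambda doc_id: doc_id % 10 == 0, sample_size=200)
print(f"Estimated prevalence: {estimate:.1%}")
```

The estimate from a 200-document sample will be close to, but not necessarily equal to, the true 10% rate in this toy population.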

SAN (Storage Area Network): A high-speed sub-network of shared storage devices. A storage device is a machine that contains nothing but a disk or disks for storing data. A SAN’s architecture works in a way that makes all storage devices available to all servers on a LAN or WAN. As more storage devices are added to a SAN, they too will be accessible from any server in the larger network. The server merely acts as a pathway between the end user and the stored data. Because stored data does not reside directly on any of a network’s servers, server power is utilized for business applications and network capacity is released to the end user. See Network.

SAS-70: (Statement on Auditing Standards No. 70, Service Organizations): An auditing standard developed by the American Institute of Certified Public Accountants (AICPA), which includes an examination of an entity’s controls over information technology, security and related processes. There are two types of examinations: Type I examines the policies and procedures in place for their effectiveness to the stated objective; Type II reports on how the systems were actually used during the period of review. The SAS-70 Type II assessment is often used by hosting vendors and storage co-locations as a testament to their internal controls.

Scalability: The capacity of a system to expand without requiring major reconfiguration or re-entry of data. For example, multiple servers or additional storage can be easily added.

Scale-to-Gray: An option to display a black and white image file in an enhanced mode, making it easier to view. A scale-to-gray display uses gray shading to fill in gaps or jumps (known as aliasing) that occur when displaying an image file on a computer screen. Also known as grayscale.

Schema: A set of rules or conceptual model for data structure and content, such as a description of the data content and relationships in a database.

Scroll Bar: The bar on the side or bottom of a window that allows the user to scroll up and down through the window’s contents. Scroll bars have scroll arrows at both ends and a scroll box, all of which can be used to scroll around the window.

Search: See Bayesian Search, Boolean Search, Compliance Search, Concept Search, Contextual Search, Full-Text Search, Fuzzy Search, Index, Keyword, Pattern Recognition, Proximity Search, Query By Image Content (QBIC), Sampling, Search Engine and Search Syntax.

Search Engine: A program that enables a search for keywords or phrases, such as on Web pages throughout the World Wide Web, e.g., Google, Bing, etc.

Search Syntax: The grammatical formatting of a search string, which is particular to the search program. Includes formatting for proximity searches, phrase searches or any other options that are supported by the search program.

Sector: A sector is normally the smallest individually addressable unit of information stored on a hard drive platter and usually holds 512 bytes of information. Sectors are numbered sequentially starting with 1 on each individual track. Thus, Track 0, Sector 1 and Track 5, Sector 1 refer to different sectors on the same hard drive. The first PC hard disks typically held 17 sectors per track.
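
The track and sector numbering can be illustrated with a simplified calculation. This sketch assumes a single platter surface with 17 sectors per track and 512 bytes per sector (the figures cited in the entry) and ignores multiple heads and platters; the function name is hypothetical.

```python
BYTES_PER_SECTOR = 512   # typical sector size cited in the entry
SECTORS_PER_TRACK = 17   # early PC hard disk figure cited in the entry


def sector_offset(track: int, sector: int) -> int:
    """Byte offset of a sector on one platter surface (simplified model).

    Tracks are numbered from 0; sectors are numbered from 1 within each track.
    """
    if track < 0 or not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("track must be >= 0 and sector must be in 1..17")
    return (track * SECTORS_PER_TRACK + (sector - 1)) * BYTES_PER_SECTOR


print(sector_offset(0, 1))  # 0
print(sector_offset(5, 1))  # 43520: a different sector despite the same sector number
```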

Secure Hash Algorithm (SHA-1 and SHA-2): Algorithms used for computing a condensed representation of a message or a data file, specified by FIPS PUB 180. See Hash Coding.
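
A short sketch using Python's standard hashlib module shows the condensed representation in practice. SHA-256 is used here as one member of the SHA-2 family; the function name and sample data are illustrative only.

```python
import hashlib


def digests(data: bytes) -> dict:
    """Return the SHA-1 and SHA-256 (a SHA-2 variant) hex digests of `data`."""
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


original = digests(b"Electronically Stored Information")
altered = digests(b"Electronically Stored Information.")  # one extra character

print(len(original["sha1"]), len(original["sha256"]))  # 40 64 (hex characters)
print(original["sha256"] == altered["sha256"])         # False
```

The digest has a fixed length regardless of the input size, and even a one-character change to the input produces a different digest, which is why hash values are used to identify and compare files.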

Serial Line Internet Protocol (SLIP): A connection to the Internet in which the interface software runs in the local computer, rather than the Internet’s.

Serial Port: See Port.

Server: Any central computer on a network that contains Electronically Stored Information or applications shared by multiple users of the network on their client computers; servers provide information to client machines. For example, there are Web servers that send out Web pages, mail servers that deliver email, list servers that administer mailing lists, FTP servers that hold FTP sites and deliver Electronically Stored Information to requesting users, and name servers that provide information about Internet host names. See File Server.

Server Farm: A cluster of servers.

Service-Level Agreement: A contract that defines the technical support or business parameters that a service provider or outsourcing firm will provide its clients. The agreement typically spells out measures for performance and consequences for failure.

Session: A lasting connection, usually involving the exchange of many packets between a user or host and a server, typically implemented as a layer in a network protocol, such as Telnet or File Transfer Protocol (FTP).

SGML/HyTime: A multimedia extension to SGML, sponsored by DoD.

Short Message Service (SMS): The most common data application for text messaging communication, allows users to send text messages to phones and other mobile communication devices. See Text Message.

Signature: See Certificate.

Simplex: One-sided page(s).

Single, In-Line Memory Module (SIMM): A mechanical package (with “legs”) used to attach memory chips to printed circuit boards.

Single Instance Storage: The method of de-duplication that is undertaken on a storage device to maximize space by eliminating multiple copies of a single file by retaining only one copy. This system of storage can occur either on a file level, or on a field level, where individual components of files are disassembled so that only unique parts are retained across an entire population and the reassembly of the original files is managed upon demand.
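
File-level single instance storage can be sketched as a store that keeps one copy of each distinct content and lets any number of file names point to it. This is a toy in-memory illustration under the assumption that identical content is detected by a SHA-256 hash; the class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy file-level store: identical content is kept only once."""

    def __init__(self):
        self._content_by_hash = {}  # hash -> the single retained copy
        self._hash_by_name = {}     # file name -> hash of its content

    def put(self, name: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        self._content_by_hash.setdefault(digest, content)  # keep the first copy only
        self._hash_by_name[name] = digest

    def get(self, name: str) -> bytes:
        return self._content_by_hash[self._hash_by_name[name]]

    def stored_copies(self) -> int:
        return len(self._content_by_hash)


store = SingleInstanceStore()
store.put("inbox/report.docx", b"quarterly report")
store.put("sent/report.docx", b"quarterly report")  # same content, second location
store.put("inbox/memo.docx", b"memo")
print(store.stored_copies())  # 2: three file names, two retained copies
```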

Siri: A personal assistant application for the Apple iPhone mobile device that allows users to search for the phone applications and Web services through voice recognition and natural language processing software.

Skewed: Tilted images. See De-skewing.

Slack/Slack Space: The unused space that exists on a hard drive when the logical file space is less than the physical file space. Also known as file slack. A form of residual data, the amount of on-disk file space from the end of the logical record information to the end of the physical disk record. Slack space can contain information soft-deleted from the record, information from prior records stored at the same physical location as current records, metadata fragments and other information useful for forensic analysis of computer systems. See Cluster Bitmap and Cluster (File).

Small Computer System Interface (SCSI, pronounced “skuzzy”): A common, industry standard connection type between computers and peripherals, such as hard disks, CD-ROM drives and scanners. SCSI allows for up to 7 devices to be attached in a chain via cables.

SDLT (Super DLT): A type of backup tape that can hold up to 300 GB or 450 CDs, depending on the data file format. See Digital Linear Tape (DLT).

Smart Card: A credit card size device that contains a microprocessor, memory and a battery.

SMTP (Simple Mail Transfer Protocol): The protocol widely implemented on the Internet for exchanging email messages.

Snapshot: See Bit Stream Backup.

Social Network: A group of people that use the Internet to share and communicate, either professionally or personally, in a public setting typically based on a specific theme or interest. For example, Facebook is a popular social network that allows people to connect to friends and acquaintances anywhere in the world in order to share personal updates, pictures and experiences, and is used by entities as a public-facing presence.

Social Media: Internet applications which permit individuals or organizations to interactively share and communicate.

Software: Any set of coded instructions (programs) stored on computer-readable media that control what a computer does or can do. Includes operating systems and software applications.

Software application: See Application, Software.

Speckle: Imperfections in an image as a result of scanning paper documents that do not appear on the original. See De-speckling.

Spoliation: Spoliation is the destruction of records or properties, such as metadata, that may be relevant to ongoing or anticipated litigation, government investigation or audit. Courts differ in their interpretation of the level of intent required before sanctions may be warranted.

Cited: Victor Stanley, Inc. v. Creative Pipe, Inc., 269 F.R.D. 497 (D. Md., 2010).

Cited: Rimkus Consulting Group, Inc. v. Cammarata, 688 F. Supp. 2d 598 (S.D. Tex., 2010).

Cited: Quantlab Technologies Ltd. (BGI) v. Godlevsky, 2014 WL 651944 (S.D. Tex. Feb. 19, 2014).

Spyware: A data collection program that secretly gathers information about the user and relays it to advertisers or other interested parties. Adware usually displays banners or unwanted pop-up windows, but often includes spyware as well. See Malware.

Stand-Alone Computer: A personal computer that is not connected to any other computer or network.

Standard Generalized Markup Language (SGML): An informal industry standard for open systems document management that specifies the data encoding of a document’s format and content. Has been virtually replaced by Extensible Markup Language (XML).

Standard Parallel Port (SPP): See Port.

Steganography: The hiding of information within a more obvious kind of communication. Although not widely used, digital steganography involves the hiding of data inside a sound or image file. Steganalysis is the process of detecting steganography by looking at variances between bit patterns and unusually large file sizes.

Storage Device: A device capable of storing ESI.

Storage Media: See Magnetic/Optical Storage Media.

Streaming Indexing: Real-time or near real-time indexing of data as it is being moved from one storage medium to another.

Structured Data: Data stored in a structured format, such as databases or data sets according to specific form and content rules as defined by each field of the database. Contrast to Unstructured Data.

Structured Query Language (SQL): A database computer language used to manage the data in relational databases. A standard fourth generation programming language (4GL – a programming language that is closer to natural language and easier to work with than a high-level language).
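
To make the idea of managing relational data with SQL concrete, the following is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table name, column names and sample rows are hypothetical and are not drawn from any particular review platform.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "doc_id INTEGER PRIMARY KEY, custodian TEXT, file_type TEXT)"
)
conn.executemany(
    "INSERT INTO documents (custodian, file_type) VALUES (?, ?)",
    [("Smith", "email"), ("Smith", "spreadsheet"), ("Jones", "email")],
)

# Count emails per custodian; the query reads close to plain English.
rows = conn.execute(
    "SELECT custodian, COUNT(*) FROM documents "
    "WHERE file_type = ? GROUP BY custodian ORDER BY custodian",
    ("email",),
).fetchall()
conn.close()

print(rows)
```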

Subjective Coding: Recording the judgments of a reviewer as to a document’s relevancy, privilege or importance with regard to factual or legal issues in a legal matter. See Coding.

Suspension Notice or Suspension Order: See Legal Hold.

Swap File: A file used to temporarily store code and data for programs that are currently running. This information is left in the swap file after the programs are terminated, and may be retrieved using forensic techniques. Also referred to as a page file or paging file.

System: (1) A collection of people, machines and methods organized to perform specific functions; (2) An integrated whole composed of diverse, interacting, specialized structures and sub-functions; and/or (3) A group of sub-systems united by some interaction or interdependence, performing many duties, but functioning as a single unit.

System Administrator (sysadmin, or sysop): The person responsible for and/or in charge of keeping a network or enterprise resource, such as a large database, operational.

System Files: Files allowing computer systems to run; non-user-created files.

System-Generated Metadata: Information about a file that is created and applied to a file by a computer process or application. Information could include the date a file was saved, printed or edited, and can include where a file was stored and how many times it has been edited. See Metadata.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

T1: A high speed, high bandwidth leased line connection to the Internet. T1 connections deliver information at 1.544 megabits per second.

T3: A high speed, high bandwidth leased line connection to the Internet. T3 connections deliver information at 44.736 megabits per second.

Tape Drive: A hardware device used to store or back up Electronically Stored Information on a magnetic tape. Tape drives are sometimes used to back up large quantities of Electronically Stored Information due to their large capacity and low cost relative to other storage options.

Taxonomy: The science of categorization, or classification, of things based on a predetermined system. In reference to Websites and portals, a site’s taxonomy is the way it organizes its Electronically Stored Information into categories and subcategories, sometimes displayed in a site map. Used in information retrieval to find documents related to a query by identifying other documents in the same category.

Technology-Assisted Review (TAR)*: A process for prioritizing or coding a collection of Electronically Stored Information using a computerized system that harnesses human judgments of subject matter expert(s) on a smaller set of documents and then extrapolates those judgments to the remaining documents in the collection. Some TAR methods use algorithms that determine how similar (or dissimilar) each of the remaining documents is to those coded as relevant (or non-relevant, respectively) by the subject matter expert(s), while other TAR methods derive systematic rules that emulate the expert(s)’ decision-making process. TAR systems generally incorporate statistical models and/or sampling techniques to guide the process and to measure overall system effectiveness.
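
The similarity-based approach described above can be sketched in a few lines. This is a toy illustration only: it assumes a simple bag-of-words cosine similarity, and the seed documents, document identifiers and scoring rule are all hypothetical. It is not the method of any particular TAR product, which would typically rely on far richer statistical models and sampling.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a document as word counts (a bag of words)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical expert-coded seed set.
relevant_seed = ["pricing agreement with competitor", "competitor pricing call"]
non_relevant_seed = ["lunch menu for friday", "office holiday party"]

# Hypothetical uncoded documents awaiting review.
unreviewed = {
    "DOC-1": "notes from pricing call with competitor",
    "DOC-2": "friday lunch order",
    "DOC-3": "holiday schedule",
}


def score(text):
    """Similarity to the nearest relevant seed minus the nearest non-relevant seed."""
    v = vectorize(text)
    rel = max(cosine(v, vectorize(s)) for s in relevant_seed)
    non = max(cosine(v, vectorize(s)) for s in non_relevant_seed)
    return rel - non


# Highest-scoring documents are prioritized for human review.
ranked = sorted(unreviewed, key=lambda d: score(unreviewed[d]), reverse=True)
print(ranked)
```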

* Maura R. Grossman & Gordon V. Cormack, The Grossman-Cormack Glossary of Technology-Assisted Review with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013), http://www.fclr.org/fclr/articles/html/2010/grossman.pdf.

Telnet (Telecommunications Network): A protocol for logging onto remote computers from anywhere on the Internet.

Template: Sets of index fields for documents, providing framework for preparation.

Temporary (Temp) File: Contemporaneous files created by applications and stored on a computer for temporary use only, created to enable the processor of the computer to quickly pull back and assemble data for currently active files.

Terabyte: 1,024 gigabytes (approximately one trillion bytes). See Byte.
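
The binary units used in this glossary each step up by a factor of 1,024. As a small worked example (the list of unit names and the helper function are illustrative assumptions, not part of any standard library), the exact byte counts can be computed as follows.

```python
# Each unit is 1,024 times the one before it.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte", "yottabyte",
]


def bytes_in(unit):
    """Number of bytes in one of the named binary units."""
    return 1024 ** UNITS.index(unit)


# One terabyte is 1,024 gigabytes, slightly more than one trillion bytes.
print(f"{bytes_in('terabyte'):,}")
```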

Text Delimited File: A common format for structured data exchange whereby a text file contains fielded data where the fields are separated by a specific ASCII character, and usually also contains a header line that defines the fields contained in the file. See Field Separator or Field Delimiter.
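
As a minimal sketch of how such a file is read, the example below parses a small pipe-delimited sample with Python's csv module. The field names, the sample records and the choice of the pipe character as delimiter are assumptions made for illustration; real load files use whatever delimiter the parties agree on.

```python
import csv
import io

# Hypothetical delimited text: a header line followed by two records.
sample = (
    "DocID|Custodian|DateSent\n"
    "ABC0001|Smith|2014-03-07\n"
    "ABC0002|Jones|2014-04-10\n"
)

# DictReader uses the header line to name the fields in each record.
reader = csv.DictReader(io.StringIO(sample), delimiter="|")
records = list(reader)
field_names = reader.fieldnames

for record in records:
    print(record["DocID"], record["Custodian"], record["DateSent"])
```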

Text Message: A written message, typically restricted to 160 characters in length, that is sent among users with mobile devices. Text can be sent via the Short Messaging Service (SMS), and images, video and other multimedia can be sent using the Multimedia Messaging Service (MMS).

Text Mining: The application of data mining (knowledge discovery in databases) to unstructured textual data. Text mining usually involves structuring the input text (often parsing, along with application of some derived linguistic features and removal of others, and ultimate insertion into a database), deriving patterns within the data, and evaluating and interpreting the output, providing such ranking results as relevance, novelty, and interestingness. Also referred to as “Text Data Mining.” See Data Mining.

Text Retrieval Conference (TREC): An ongoing series of workshops co-sponsored by NIST and the U.S. Department of Defense.

TGA: Targa format. A scanned format – widely used for color-scanned materials (24-bit) as well as by various paint and desktop publishing packages.

Thin Client: A computer or software program which relies on a central server for processing and application resources, and Electronically Stored Information storage in a central area instead of locally; used mainly for output and input of user information or commands. See Client.

Thread: A series of technologically related communications, usually on a particular topic. Threads can be a series of bulletin board messages (for example, when someone posts a question and others reply with answers or additional queries on the same topic). A thread can also apply to emails or chats, where multiple conversation threads may exist simultaneously. See Email String.

Thumb Drive: See Flash Drive.

Thumbnail: A miniature representation of a page or item for quick overviews to provide a general idea of the structure, content and appearance of a document. A thumbnail program may be a standalone or part of a desktop publishing or graphics program. Thumbnails provide a convenient way to browse through multiple images before retrieving the one needed. Programs often allow clicking on the thumbnail to retrieve it.

TIFF (Tagged Image File Format): A widely used and supported graphic file format for storing bit-mapped images, with many different compression formats and resolutions. File name has .TIF extension. Can be black and white, gray-scaled or color. Images are stored in tagged fields, and programs use the tags to accept or ignore fields, depending on the application. The format originated in the early 1980s.

Cited: Akanthos Capital Management, LLC v. CompuCredit Holdings Corp., 2014 WL 896743 (N.D. Ga. Mar. 7, 2014).

Cited: Country Vintner of North Carolina, LLC v. E. & J. Gallo Winery, Inc., 718 F.3d 249 (4th Cir. Apr. 29, 2013).

Cited: E.E.O.C. v. SVT, LLC, 2014 U.S. Dist. LEXIS 50114 (N.D. Ind. Apr. 10, 2014).

Cited: Race Tires America, Inc. v. Hoosier Racing Tire Corp., 2012 WL 887593 (3d Cir. Mar. 16, 2012).

Cited: Saliga v. Chemtura Corp., 2013 WL 6182227 (D. Conn. Nov. 25, 2013).

Cited: In re Seroquel Products Liability Litigation, 244 F.R.D. 650 (M.D. Fla., Aug. 21, 2007).

Cited: Williams v. Sprint/United Management Company, 230 F.R.D. 640 (D. Kan. 2005).

TIFF Group III: A one-dimensional compression format for storing black and white images that is utilized by many fax machines. See TIFF.

TIFF Group IV: A two-dimensional compression format for storing black and white images. Typically compresses at a 20-to-1 ratio for standard business documents. See TIFF.

Time Zone Normalization: See Normalization.

Toggle: A switch (which may be physical or virtualized on a screen) that is either on or off, and reverses to the opposite when selected.

Tone Arm: A device in a computer that reads from and writes to a hard drive.

Tool Kit Without An Interesting Name (TWAIN): A universal toolkit with standard hardware/software drivers for multi-media peripheral devices. Often used as a protocol between a computer and scanners or image capture equipment.

Toolbar: The row of graphical or text buttons that perform special functions quickly and easily.

Topology: The geometric arrangement of a computer system. Common topologies include a bus (network topology in which nodes are connected to a single cable with terminators at each end), star (LAN designed in the shape of a star, where all end points are connected to one central switching device, or hub), and ring (network topology in which nodes are connected in a closed loop; no terminators are required because there are no unconnected ends). Star networks are easier to manage than ring topology.

Track: Each of the series of concentric rings contained on a hard drive platter.

Transmission Control Protocol/Internet Protocol (TCP/IP): The first two networking protocols defined; enable the transfer of data upon which the basic workings of the features of the Internet operate. See Internet Protocol, Port.

Trojan: A malware program that contains another hidden program embedded inside it for the purpose of discreetly delivering the second program to a computer or network without the knowledge of the user or administrator. See Malware.

True Resolution: The true optical resolution of a scanner is the number of pixels per inch (without any software enhancements).

Twiki: A WikiWiki – Enables simple form-based Web applications without programming, and granular access control (though it can also operate in the classic ‘no authentication’ mode). Other enhancements include configuration variables, embedded searches, server-side includes, file attachments, and a plug-in API that has spawned over 150 plug-ins to link into databases, create charts, sort tables, write spreadsheets, make drawings, and so on.

Typeface: A specific size and style of type within a family. There are many thousands of typefaces available for computers, ranging from modern to decorative.

User Datagram Protocol (UDP): A protocol allowing computers to send short messages to one another. See Port.

Ultrafiche: Microfiche that can hold 1,000 documents/sheet as opposed to the normal 270.

Unallocated Space: The area of computer media, such as a hard drive, that does not contain readily accessible data. Unallocated space is usually the result of a file being deleted. When a file is deleted, it is not actually erased but is simply no longer accessible through normal means. The space that it occupied becomes unallocated space, i.e., space on the drive that can be reused to store new information. Until portions of the unallocated space are used for new data storage, in most instances, the old data remains and can be retrieved using forensic techniques.

Under-Inclusive: When referring to data sets returned by some method of query, search, filter or cull, results that are returned incomplete or too narrow. See False Negative.

Unicode: A 16-bit ISO 10646 character set accommodating many more characters than the ASCII character set. Created as a standard for the uniform representation of character sets from all languages. Unicode supports characters 2 bytes wide. Sometimes referred to as “double byte language.” See www.unicode.org for more information.

Uniform Resource Locators (URL): The addressing system used in the World Wide Web and other Internet resources. The URL contains information about the method of access, the server to be accessed and the path of any file to be accessed. Although there are many different formats, a URL might look like this: http://thesedonaconference.org/publications_html. See Address.
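
The three parts of a URL named above (method of access, server and path) can be pulled apart with Python's standard urllib.parse module. The sketch below applies it to the example URL from this entry; the variable names are illustrative.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://thesedonaconference.org/publications_html")

method_of_access = parts.scheme  # the protocol used to reach the resource
server = parts.netloc            # the server to be accessed
path = parts.path                # the path of the file on that server

print(method_of_access, server, path)
```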

Unitization – Physical and Logical: The assembly of individually scanned pages into documents. Physical Unitization utilizes actual objects such as staples, paper clips and folders to determine pages that belong together as documents for archival and retrieval purposes. Logical unitization is the process of human review of each individual page in an image collection using logical cues to determine pages that belong together as documents. Such cues can be consecutive page numbering, report titles, similar headers and footers and other logical indicators. This process should also capture document relationships, such as parent and child attachments. See Attachment, Document or Document Family, Load File, Message Unit.

Cited: Race Tires America, Inc. v. Hoosier Racing Tire Corp., 2012 WL 887593 (3d Cir. Mar. 16, 2012).

Universal Serial Bus (USB) Port: A port on a computer or peripheral device into which a USB cable or device can be inserted – quickly replacing the use or need for serial and parallel ports as it provides a single, standardized way to easily connect many different devices. See Flash Drive and Port.

UNIX: A software operating system designed to be used by many people at the same time (multi-user) capable of performing multiple tasks or operations at the same time (multi-tasking); common operating system for Internet servers.

Unstructured Data: Refers to free-form data which either does not have a data structure or has a data structure not easily readable by a computer without the use of a specific program designed to interpret the data; created without limitations on formatting or content by the program with which it is being created. Examples include word processing documents or slide presentations.

Upgrade: A newer version of hardware, software or application.

Upload: To move data from one location to another in any manner, such as via modem, network, serial cable, internet connection or wireless signals; indicates that data is being transmitted to a location from a location. See Download.

User Created Metadata: Information about a file that is created and applied to a file by a user. Information includes the addressees of an email, annotations to a document and objective coding information. See Metadata.

Cited: CBT Flint Partners, LLC v. Return Path, Inc., 737 F.3d 1320 (Fed. Cir. Dec. 13, 2013).

UTC: See Coordinated Universal Time.

UTF-8: A character encoding form of Unicode that represents Unicode code points with sequences of one, two, three or four bytes. UTF-8 can encode any Unicode character. It is the most common Unicode encoding on the Web and the default encoding of XML. An important advantage of UTF-8 is that it is backward compatible with the ASCII encoding, which includes the basic Latin characters. Consequently, all electronic text in the ASCII encoding is conveniently also Unicode. This backward compatibility was a primary reason for the invention of UTF-8. See ASCII, Unicode, UTF-16.
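
The variable-length encoding and ASCII compatibility described above can be observed directly with Python's built-in string encoding. The four sample characters below are chosen for illustration as one example each of a one-, two-, three- and four-byte sequence.

```python
# One sample character for each UTF-8 sequence length.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji

utf8_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    utf8_lengths.append(len(encoded))
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# ASCII text yields identical bytes whether encoded as ASCII or as UTF-8.
ascii_text = "Sedona"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)
```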

UTF-16: A character encoding form of Unicode that represents Unicode code points with sequences of one or two 16-bit code units. UTF-16 can encode any Unicode character. It is much less often used for data interchange than the UTF-8 encoding form. UTF-16 is commonly used in computer programming languages and programming APIs and is the encoding used internally for file names by Microsoft Windows and NTFS. See Unicode, UTF-8.
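
As a brief illustration of the "one or two 16-bit code units" rule, the sketch below counts the code units Python produces when encoding a character as UTF-16. The helper function name is hypothetical; the little-endian variant is used only so that no byte-order mark is added to the output.

```python
def utf16_code_units(ch):
    """Number of 16-bit code units needed to encode a single character."""
    # "utf-16-le" omits the byte-order mark; each code unit is two bytes.
    return len(ch.encode("utf-16-le")) // 2


print(utf16_code_units("A"))           # a basic Latin letter
print(utf16_code_units("\u20ac"))      # the euro sign
print(utf16_code_units("\U0001F600"))  # an emoji outside the 16-bit range
```

Characters outside the 16-bit range are the ones that need a pair of code units.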

Validate: In the context of this document, to confirm or ensure well-grounded logic, and true and accurate determinations.

Vector: Representation of graphic images by mathematical formulas. For instance, a circle is defined by a specific position and radius. Vector images are typically smoother than raster images.

Verbatim Coding: Manually extracting information from documents exactly as it appears in the documents. See Coding.

Version, Record Version: A particular form or variation of an earlier or original record. For electronic records the variations may include changes to file format, metadata or content.

Vertical De-Duplication: A process through which duplicate Electronically Stored Information, as determined by matching hash values, is eliminated within a single custodian’s data set. See Content Comparison, File level Binary Comparison, Horizontal De-Duplication, Metadata Comparison, Near Duplicates.
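
A minimal sketch of the idea follows, assuming SHA-256 as the hash function and a hypothetical in-memory collection keyed by custodian; production tools differ in which hash they use and in what content they hash. Duplicates are removed only within each custodian's set, so a file held by two custodians survives once for each.

```python
import hashlib


def vertical_dedupe(files_by_custodian):
    """Keep the first file for each distinct hash value, per custodian."""
    kept = {}
    for custodian, files in files_by_custodian.items():
        seen = set()  # hash values already encountered for this custodian only
        kept[custodian] = []
        for name, content in files:
            digest = hashlib.sha256(content).hexdigest()
            if digest not in seen:
                seen.add(digest)
                kept[custodian].append(name)
    return kept


# Hypothetical collection: (file name, file content) pairs per custodian.
collection = {
    "Smith": [("a.txt", b"memo"), ("copy.txt", b"memo"), ("b.txt", b"other")],
    "Jones": [("a.txt", b"memo")],
}
deduped = vertical_dedupe(collection)
print(deduped)
```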

Video Display Terminal (VDT): Generic name for all display terminals.

Video Electronics Standards Association (VESA): Concentrates on computer video standards.

Video Graphics Adapter (VGA): A computer industry standard, first introduced by IBM in 1987, for color video displays. The minimum dot (pixel) display is 640 by 480 by 16 colors. Then “Super VGA” was introduced at 800 x 600 x 16, then 256 colors. VGA can extend to 1024 by 768 by 256 colors. Replaces EGA, an earlier standard and the even older CGA. Newer standard displays can range up to 1600 by 1280.

Video Scanner Interface: A type of device used to connect scanners with computers. Scanners with this interface require a scanner control board designed by Kofax, Xionics or Dunord.

Virtual Private Network (VPN): A secure network that is constructed by using public wires to securely connect nodes. For example, there are a number of systems that enable creation of networks using the Internet as the medium for transporting data. These systems use encryption and other security mechanisms to ensure that only authorized users can access the network and that the data cannot be intercepted.

Virtualization: Partitioning a server into multiple virtual servers, each capable of running an independent operating system and associated software applications as though it were a separate computer. Virtualization is particularly useful for centralized IT infrastructures to manage multiple computing environments with the same set of hardware, and for cloud computing providers to provide customized interfaces to clients without investing in separate machines, each with its own operating system.

Virus: A self-replicating program that spreads on a computer or network by inserting copies of itself into other executable code or documents. A program into which a virus has inserted itself is said to be infected, and the infected file (or executable code that is not part of a file) is a host. Viruses are a kind of malware that range from harmless to destructive and damage computers by either destroying data or overwhelming the computer’s resources. See Malware.

Vital Record: A record that is essential to the organization’s operation or to the re-establishment of the organization after a disaster.

Visualization: The process of graphically representing data.

Voice over Internet Protocol (VoIP): Telephonic capability across an Internet connection.

Volume: A specific amount of storage space on computer storage media such as hard drives, floppy disks, CD-ROM disks, etc. In some instances, computer media may contain more than one volume, while in others a single volume may be contained on more than one disk.

Volume Boot Sector/Record: When a partition is formatted to create a volume of data, a volume boot sector is created to store information about the volume. One volume contains the operating system and its volume boot sector contains code used to load the operating system when the computer is booted up. See Partition.

WAV: File extension name for Windows sound files.

Webmail: Email service that is provided through a website. See Email.

Website: A collection of Uniform Resource Identifiers (URIs), including Uniform Resource Locators (URLs), in the control of one administrative entity. May include different types of URIs (e.g., FTP, telnet or Internet sites). See URI and URL.

What You See Is What You Get (WYSIWYG): Display and software technology that shows on the computer screen exactly what will print.

Wide Area Network (WAN): Refers generally to a network of PCs or other devices, remote to each other, connected by electronic means, such as transmission lines. See Network.

WiFi (Wireless Fidelity): Wireless networking technology that allows electronic devices to connect to one another and the Internet from a shared network access point.

Wiki: A collaborative website that allows visitors to add, remove, and edit content.

Wildcard Operator: A character used in text-based searching that assumes the value of any alphanumeric character, characters, or in some cases, words. Used to expand search terms and enable the retrieval of a wider range of hits.
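
To illustrate how a wildcard expands a search term, the sketch below uses Python's fnmatch module, in which `*` stands for any run of characters and `?` for exactly one. That syntax is an assumption of this example; search and review platforms each define their own wildcard characters and rules, and the word lists are hypothetical.

```python
from fnmatch import fnmatchcase

terms = ["discover", "discovery", "discoverable", "disclose", "recover"]

# "*" matches zero or more characters, widening the range of hits.
multi_hits = [t for t in terms if fnmatchcase(t, "discover*")]

# "?" matches exactly one character.
single_hits = [t for t in ["woman", "women", "womens"] if fnmatchcase(t, "wom?n")]

print(multi_hits)
print(single_hits)
```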

Windows-1252: Also called ANSI, Western European and CP1252 (Microsoft code page 1252). A character encoding of the Latin alphabet used for most Western European languages. Windows-1252 is a superset of the ASCII and ISO 8859-1 standard character encodings. The characters that are included in Windows-1252, but that are not included in ISO 8859-1, are often the source of character interpretation and display problems in text on the Web and in electronic mail. Similar problems sometimes occur when text in the Windows-1252 encoding is converted to the UTF-8 encoding form of Unicode because UTF-8 is not wholly backward compatible with Windows-1252. The name ANSI is a misnomer resulting from historical happenstance, but it is not incorrect to use it in contexts where its meaning is readily understood. See ASCII, ISO 8859-1.
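
The interpretation problems mentioned above can be reproduced with a single byte. In the sketch below, the byte 0x80 is read as the euro sign under Windows-1252, as an invisible control character under ISO 8859-1, and is rejected outright as UTF-8. The codec names "cp1252" and "latin-1" are Python's names for these two encodings; the variable names are illustrative.

```python
raw = b"\x80"  # a single byte from the range where the encodings differ

as_cp1252 = raw.decode("cp1252")    # Windows-1252 reading: the euro sign
as_latin1 = raw.decode("latin-1")   # ISO 8859-1 reading: a control character

# The same byte on its own is not valid UTF-8.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Converting properly means decoding as Windows-1252, then re-encoding.
converted = raw.decode("cp1252").encode("utf-8")

print(repr(as_cp1252), repr(as_latin1), valid_utf8, converted)
```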

Wireless Router: A hardware device that opens access to an Internet connection or network to a secured or unsecured connection via a receiver on a computer or other piece of hardware such as a printer permitting wireless transmission. See WiFi.

Workflow: The automation of a business process, in whole or part, during which Electronically Stored Information or tasks are passed from one participant to another for action according to a set of procedural rules.

Workflow, Ad Hoc: A simple manual process by which documents can be moved around a multi-user review system on an as-needed basis.

Workflow, Rule-Based: A programmed series of automated steps that route documents to various users on a multi-user review system.

Workgroup: A group of computer users connected to share individual talents and resources as well as computer hardware and software – often to accomplish a team goal.

Worm: A self-replicating computer program, sending copies of itself, possibly without any user intervention. See Malware.

World Wide Web (WWW): A massive collection of hypertext documents accessed via the Internet using a browser. The documents, also known as Web Pages, can contain formatted text, audio and video files, and multimedia programs.

Write Once Read Many Disks (WORM Disks): A popular archival storage medium during the 1980s. Acknowledged as the first optical disks, they are primarily used to store archives of data that cannot be altered. WORM disks are created by standalone PCs and cannot be used on the network, unlike CD-ROM disks.

X.25: A standard protocol for data communications, which has largely been replaced by less complex protocols including the Internet Protocol (IP).

XML, XRML: See Extensible Markup Language.

Yottabyte: 1,024 zettabytes. See Byte.

Zettabyte: 1,024 exabytes. See Byte.

ZIP: A common file compression format that allows quick and easy storage for transmission or archiving one or several files.
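
As a small sketch of the format in use, the example below packs two files into a ZIP archive and reads them back with Python's standard zipfile module. The archive is built in memory, and the file names and contents are hypothetical.

```python
import io
import zipfile

buffer = io.BytesIO()  # in-memory stand-in for a .zip file on disk

# Write two files into one compressed archive.
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("memo.txt", "privileged " * 1000)
    archive.writestr("notes.txt", "draft notes")

# Reopen the archive, list its contents and compare sizes.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    info = archive.getinfo("memo.txt")
    original_size = info.file_size
    compressed_size = info.compress_size
    restored = archive.read("notes.txt").decode("utf-8")

print(names, original_size, compressed_size, restored)
```

Repetitive content such as the first file compresses to a small fraction of its original size, and extraction restores the original bytes exactly.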

Zip Drive: A removable disk storage device developed by Iomega with disk capacities of 100, 250 and 750 megabytes.

Zone OCR: An add-on feature of imaging software that populates data fields by reading certain regions or zones of a document and then placing the recognized text into the specified field.
