Different File Formats in Document Review
January 2015
Ediscovery Document Review
● Ediscovery productions can be requested and provided in various file formats
● That is why it’s important to know their comparative strengths and weaknesses
Different File Formats
● When reviewing documents in an online review platform, the experience can differ significantly based on the format of the file you’re reviewing
● The 3 most common file types are:
○ Image files
○ Text files
○ Native files
Image Files
● Most document review platforms default to an image viewer
● An image viewer simply loads an image for each page of a document
● Generally, those images will be in TIFF, PDF, or PNG image format
Benefits of Image Files
1. They load quickly! Reviewing millions of pages takes long enough: you don’t want to be waiting for documents to load
2. Bates stamping on image files simplifies referencing documents and preparing for trial
3. This is the standard format, so it is the one most consistently produced or requested
Text Files
● A text viewer loads the text contents of a file
● Generally, converted into UTF-8 text
● Not a true visualization of a document, as it is missing images, etc.
Benefits of Text Files
1. They load quickly, like image files
2. Allow for keyword highlighting, which makes finding key phrases and documents much easier
○ Can automatically highlight across an entire case, or personalize highlights during a review session
○ Can search for keywords in a collection en masse
3. In the Everlaw tool, allows for use of machine translation to automatically translate text in foreign languages
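To make the keyword benefits of text files concrete, here is a minimal Python sketch of highlighting and collection-wide search over UTF-8 text. It is an illustration only, not how Everlaw or any other review platform implements these features. The function names, the `**` marker, and the document IDs are all hypothetical.

```python
import re


def build_pattern(keywords):
    """Compile one case-insensitive, whole-word pattern for all keywords."""
    # Longest keywords first, so a longer phrase wins over a shorter prefix.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def highlight(text, keywords, marker="**"):
    """Wrap every keyword match in `marker` (a stand-in for on-screen highlighting)."""
    if not keywords:
        return text
    return build_pattern(keywords).sub(
        lambda m: marker + m.group(0) + marker, text
    )


def search_collection(documents, keywords):
    """Return the IDs of documents whose extracted text contains any keyword.

    `documents` maps a hypothetical document ID to raw bytes of extracted text.
    """
    if not keywords:
        return []
    pattern = build_pattern(keywords)
    hits = []
    for doc_id, raw in documents.items():
        # Extracted text is assumed to be UTF-8; undecodable bytes are replaced.
        text = raw.decode("utf-8", errors="replace")
        if pattern.search(text):
            hits.append(doc_id)
    return hits
```

The same keyword list can drive both case-wide highlighting and a mass search. Whole-word matching with `\b` assumes keywords begin and end with letters or digits.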
Native Files
● The original file, in the file type of the program that created it
● Sample files include .doc, .xls, .msg, etc.
● This is the most technically complex format to support, as there is a theoretically limitless number of different file types. The Everlaw tool supports more than 300 different file types
● Slower than image or text files
Benefits of Native Files
1. You can view the actual original file, with no alterations made to conform to another file format
2. Usually allow for highlighting and translation of the file’s text
3. Makes it possible to retrieve missing metadata
4. Many file formats (like Excel) don’t make much sense broken out into images or text.
○ The Everlaw review platform includes our own custom spreadsheet viewer in the native view
File Combinations
● File collections are never in just one file format: you always receive or provide a combination
● Though the Everlaw review platform can support almost any production protocol, format still matters: getting the right protocol can save you review time and money
● A few common combinations you might receive or produce:
1) Image, Text and Native
2) Image and Text
3) Native and Text
Common File Combinations: Image, Text, and Native
● One of the easiest solutions is to produce everything
● Benefit: Gives the reviewer the choice of using whatever tool is most appropriate for each document (e.g. can view PDFs in the image viewer, emails in the text viewer, and spreadsheets in the native viewer)
● Drawback: Hosting all the formats means an increase in hosting costs
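The "right tool for each document" idea can be sketched as a lookup from file extension to viewer. This is a hypothetical illustration that follows the slide's examples (PDFs in the image viewer, emails in the text viewer, spreadsheets in the native viewer). The mapping, the function name, and the fallback rule are assumptions, not any platform's actual behavior, and `.xlsx` is added here for illustration.

```python
from pathlib import PurePath

# Hypothetical mapping based on the examples in the slide above.
VIEWER_BY_EXTENSION = {
    ".pdf": "image",
    ".tif": "image",
    ".tiff": "image",
    ".png": "image",
    ".msg": "text",     # emails
    ".xls": "native",   # spreadsheets
    ".xlsx": "native",  # assumption: newer Excel files treated the same way
}


def preferred_viewer(filename, available=("image", "text", "native"), default="image"):
    """Pick a viewer for `filename` from the formats that were actually produced."""
    ext = PurePath(filename).suffix.lower()
    choice = VIEWER_BY_EXTENSION.get(ext, default)
    if choice in available:
        return choice
    # The preferred format was not produced: fall back to the default,
    # then to whatever format is available.
    if default in available:
        return default
    return available[0] if available else None
```

For example, `preferred_viewer("budget.XLSX")` picks the native viewer when all three formats were produced, and falls back to the image viewer when only images and text were.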
Common File Combinations: Image and Text
● Most of the time, native files won’t be necessary. However, if you are leaning toward images and text, make exceptions for certain files, like spreadsheets and videos
● Benefit: Allows the reviewer to avoid the slower load speed of native files
● Drawback: This approach rarely saves much on cost, as images tend to make up the bulk of the cost
Common File Combinations: Native and Text
● Benefit: Most useful as a cost-saving measure. You can use native files for initial review before producing documents in their final protocol. This cuts down on the cost of hosting non-responsive documents; you can simply image the documents when you need to produce them.
● Drawback: May take time or effort to locate files in different formats in initial review vs. production.
Did We Miss Anything?
…If you want to see how Everlaw deals with these different file types, don’t hesitate to ask: [email protected]!