Demystifying Advanced Technologies to Find Solutions that Work
DESCRIPTION
A Corporate Counsel headline from late last year asked, “Can Predictive Coding Save The World?” A better, albeit more modest question is, can it save you money? This panel addresses that loaded question and the related issues of:
• Deploying advanced technologies across enterprise data
• Measuring the effectiveness of advanced technologies
• Vetting and selecting appropriate service providers
• Validating the results of predictive coding
In this panel, IT and legal experts survey the technology horizon, giving you insights and best practices for finding the solutions that work best for you.
TRANSCRIPT
![Page 1: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/1.jpg)
Demystifying Advanced Technologies to Find Solutions that Work
Friday, Oct. 11 | 9:45 – 10:45
Presented by
![Page 2: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/2.jpg)
Peter Oesterling
Assistant General Counsel | Nationwide
![Page 3: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/3.jpg)
Alex Ponce de Leon
Discovery Counsel | Intel
![Page 4: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/4.jpg)
J. William Speros
Evidence Consulting Attorney | Speros & Associates
![Page 5: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/5.jpg)
“Technology-Assisted Review,” called by its nickname “Predictive Coding,” describes a process whereby computers are programmed to search a large amount of data to find quickly and efficiently the data that meet a particular requirement. Computer science and the sciences of statistics and psychology inform its use. While it bruises the human ego, scientists…determined that …[i]t is now indubitable that technology-assisted review is an appreciably better and more accurate means of searching a set of data.”
THE GROSSMAN-CORMACK GLOSSARY OF TECHNOLOGY-ASSISTED REVIEW
FEDERAL COURTS LAW REVIEW Volume 7, Issue 1 (2013) Foreword by John M. Facciola, U.S. Magistrate Judge
![Page 6: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/6.jpg)
Process: a series of actions that produce something or that lead to a particular result
![Page 7: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/7.jpg)
“Now, the methodology of the use of technology-assisted review may itself be in dispute, with the parties controverted to each other’s use of a particular method or tool. Those controversies have already led to judicial decisions that have to grapple with a wholly new way of searching and with scientific principles derived from the science of statistics or other disciplines.”
THE GROSSMAN-CORMACK GLOSSARY OF TECHNOLOGY-ASSISTED REVIEW
FEDERAL COURTS LAW REVIEW Volume 7, Issue 1 (2013) Foreword by John M. Facciola, U.S. Magistrate Judge
![Page 8: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/8.jpg)
Methodology: a set of methods, rules, or ideas that are important in a science or art: a particular procedure or set of procedures
![Page 9: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/9.jpg)
THE GROSSMAN-CORMACK GLOSSARY OF TECHNOLOGY-ASSISTED REVIEW
FEDERAL COURTS LAW REVIEW Volume 7, Issue 1 (2013)
Predictive Coding: An industry-specific term generally used to describe a Technology-Assisted Review process involving the use of a Machine Learning Algorithm to distinguish Relevant from Non-Relevant Documents, based on Subject Matter Expert(s)’ Coding of a Training Set of Documents.
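As a rough illustration of the process this definition describes, the toy sketch below trains a small naive Bayes classifier on a handful of reviewer-coded documents and then labels unseen ones. Everything in it is an assumption made for illustration: the sample documents, the two label names, the function names `train` and `classify`, and the choice of naive Bayes itself. The glossary does not prescribe a particular algorithm, and commercial tools may work quite differently.

```python
import math
from collections import Counter

# Hypothetical label names; training data must use exactly these two.
LABELS = ("relevant", "non-relevant")


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(training_set):
    """Count words per label from (text, label) pairs coded by a reviewer."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in training_set:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the higher naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        # Smoothed log prior, so a label with no training documents is still scorable.
        score = math.log((doc_counts[label] + 1) / (total_docs + len(LABELS)))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing of the per-label word likelihood.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training set standing in for a Subject Matter Expert's coding.
training_set = [
    ("merger pricing agreement draft", "relevant"),
    ("pricing terms for the merger", "relevant"),
    ("lunch menu for friday", "non-relevant"),
    ("office holiday party schedule", "non-relevant"),
]
model = train(training_set)
prediction = classify(model, "revised merger pricing")
```

The point of the sketch is only that the classifier's output depends entirely on what the training set contains and how it was coded, which is why the composition of that set matters so much in practice.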
![Page 11: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/11.jpg)
“A word is not a crystal, transparent and unchanged, it is the skin of a living thought and may vary greatly in color and content according to the circumstances and the time in which it is used.”
Justice Oliver Wendell Holmes Jr., Towne v. Eisner, 245 U.S. 418, 425 (1918)
THE GROSSMAN-CORMACK GLOSSARY OF TECHNOLOGY-ASSISTED REVIEW
FEDERAL COURTS LAW REVIEW Volume 7, Issue 1 (2013) Foreword by John M. Facciola, U.S. Magistrate Judge
![Page 13: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/13.jpg)
![Page 14: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/14.jpg)
![Page 15: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/15.jpg)
“I think you should be more explicit here in step two.”
![Page 16: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/16.jpg)
Published as guest contributor to Ralph Losey’s E-Discovery Team Blog Site:
http://e-discoveryteam.com/2013/04/28/predictive-codings-erroneous-zones-are-emerging-junk-science/?shareadraft=517d80048f827
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 17: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/17.jpg)
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
• “PBS’ Frontline’s Forensic Tools: What’s Reliable and What’s Not-So-Scientific dispelled the infallibility, and in some instances, the validity, of analytical techniques long relied upon by our legal profession.”
• “Even if those techniques were not botched or biased, their validity ranges from bought-and-paid-for infomercials to, at best, an approximation.”
• “Back then attorneys and judges (and experts and vendors) did with those junk sciences just what we are doing now with respect to predictive coding: allowing claims, however unjustified and erroneous, to form the basis of our practices, to influence our precedent and to accrue authority.”
![Page 18: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/18.jpg)
“[T]hose of us who trust the scientific and adversarial process recognize that erroneous claims don’t naturally defeat truth. They suppress truth, distract from truth and sometimes persist so long that we forget to inquire into the truth. Oftentimes, weak interests seek to dispel erroneous claims which are promoted by strong commercial interests. With respect to predictive coding my sense is that we are neither deluded nor deceptive — well, not too much anyway — but we just have not yet thought it through.”
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 20: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/20.jpg)
Erroneous Practice #1
Using a full-text search to identify prospectively responsive documents and then employing predictive coding to eliminate those that are not responsive.
Erroneous Practice #2
Pulling a random sample of documents to train the initial seed set.
Erroneous Practice #3
Identifying “magic numbers” of minimum:
• “Iterations”
• Responsive documents within a randomly accumulated set
Erroneous Practice #4
Asserting that Predictive Coding software is the “gold standard” for document retrieval in complex matters.
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 21: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/21.jpg)
Erroneous Practice #4
Asserting that Predictive Coding software is the “gold standard” for document retrieval in complex matters.
Is Erroneous Because
It asserts that predictive coding is a standard, when in practice predictive coding:
• Shares some commonly understood characteristics but no precise attributes
• Involves some general methodologies but no clear rules
• Is associated with general aspirations but no comprehensively defined operations
Example: All advertisements or orders for “predictive coding”
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 23: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/23.jpg)
Gold Standard vs “Standard”
![Page 24: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/24.jpg)
Erroneous Practice #2
Pulling a random sample of documents to train the initial seed set.
Is Erroneous Because
A. It looks for relevance in all the wrong places: thoughtful researchers don’t try to learn about relevant docs by examining irrelevant ones.
B. It turns a blind eye to what is staring you in the eye: it denies that attorneys know what they are paid to know, namely where to look and what to find.
C. It measures the wrong stuff:
• Constrained and circular “like” definition
• Prevalence vs Relevance vs Probativeness
Example: Global Aerospace v. Landow Aviation (settled without court ruling re strategy)
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
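The prevalence point can be made concrete with simple arithmetic. The sketch below assumes that each randomly drawn document is independently responsive with probability equal to the collection's prevalence (a binomial approximation), and estimates how many responsive documents a random seed set would contain. The function names, the 500-document sample size, and the 1% prevalence figure are hypothetical, chosen only for illustration.

```python
def expected_responsive(sample_size: int, prevalence: float) -> float:
    """Expected number of responsive documents in a random sample."""
    return sample_size * prevalence


def prob_no_responsive(sample_size: int, prevalence: float) -> float:
    """Chance that a random sample contains no responsive documents at all."""
    return (1.0 - prevalence) ** sample_size


# Hypothetical scenario: 500 randomly sampled documents, 1% prevalence.
expected = expected_responsive(500, 0.01)
chance_of_none = prob_no_responsive(500, 0.01)
```

Under these assumed figures, a 500-document random sample is expected to contain about five responsive documents, so nearly all of what the reviewer codes is irrelevant material. Lower prevalence makes the shortfall worse, and the chance of drawing no responsive documents at all rises accordingly.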
![Page 27: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/27.jpg)
Erroneous Practice #1
Using a full-text search to identify prospectively responsive documents and then employing predictive coding to eliminate those that are not responsive.
Is Erroneous Because
A. Over-relies and under-delivers: presumed arrogance or clairvoyance
B. It arbitrarily places documents out-of-sight and, therefore, out-of-mind: likelihood that responsive documents will ever be produced but dumbing-down the predictive coding intelligence
Example: In re: Biomet M2a Magnum Hip Implant Prods. Liab. Litig. (endorsed by court)
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 30: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/30.jpg)
Erroneous Practice #3
Identifying “magic numbers” of minimum:
• “Iterations”
• Responsive documents within a randomly accumulated set
Is Erroneous Because
A. You may not be able to get there from here: Don’t know starting point or ending point
B. You don’t know what isn’t yet known: Cannot predict alternative paths
C. Consider low frequency, high probativeness
D. Who’s the witness?
Example:
• “This [iteration] process shall be repeated for a total of seven iterations… [Requesting party pays] costs and fees… [for] more 40,000 documents.” (DaSilva Moore)
• Vendors’ affidavits in various matters
“Predictive Coding’s Erroneous Zones Are Emerging Junk Science”
![Page 32: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/32.jpg)
May not be able to get there even with a “Magic” number of steps…
![Page 34: Demystifying Advanced Technologies to Find Solutions that Work](https://reader036.vdocument.in/reader036/viewer/2022062513/55632f46d8b42a57348b53bc/html5/thumbnails/34.jpg)
Search Mechanisms’ Inferences
[Chart. Vertical axis: Inferences (risk) re recall. Horizontal axis: Search Mechanism. Labeled items: Databases; Files, Folders (in place); End-user tags; Files, Folders (per user); Duplicates; “Technology Assisted Review” via Machine Learning; E-mail threading and “Near” Duplicates; Key words; Random Sampling; Similarity/Clusters Sorting; Similarity; Clustering]