Post on 21-Aug-2020
WHY I LOVE PREDICTIVE CODING
Making document review fun again with Mr. EDR and Predictive Coding 4.0
Ralph Losey* e-Discovery Team
Many lawyers and technologists like predictive coding and recommend it to their colleagues. They have good reasons. It has worked for them. It has allowed them to do e-discovery reviews in an effective, cost-efficient manner, especially the big projects. That is true for me too, but that is not why I love predictive coding. My feelings come from the excitement, fun, and amazement that often arise from seeing it in action, firsthand. I love watching the predictive coding features in my software find documents that I could never have found on my own. I love the way the AI in the software helps me to do the impossible. I really love how it makes me far smarter and more skilled than I really am.
I have been getting those kinds of positive feelings consistently by using the latest Predictive Coding 4.0 methodology (shown right) and KrolLDiscovery’s latest eDiscovery.com Review software (“EDR”). So too have my e-Discovery Team members who helped me to participate in TREC 2015 and 2016 (the great science experiment for the latest text search techniques sponsored by the National Institute of Standards and Technology). During our grueling forty-five days of experiments in 2015, and again for sixty days in 2016, we came to admire the intelligence of the new EDR software so much that we decided to personalize the AI as a robot. We named him Mr. EDR out of respect. He even has his own website now, MrEDR.com, where he explains how he helped my e-Discovery Team in the 2015 and 2016 TREC Total Recall Track experiments.
*This is an edited reprint of the author’s personal blog, e-discoveryteam.com, and contains his personal opinions and not those of his law firm or its clients. Copyright Ralph Losey 2015, 2017. Reference to any products should not be construed as a commercial endorsement.
The bottom line for us from this research was to prove and improve our methods. Our latest version 4.0 of Predictive Coding, the Hybrid Multimodal IST Method, is the result. We have even open-sourced this method, well most of it, and teach it in a free seventeen-class online program: TARcourse.com. Aside from testing and improving our methods, another, perhaps even more important result of TREC for us was our rediscovery that with good teamwork, and good software like Mr. EDR at your side, document review need never be boring again. The documents themselves may well be boring as hell, that's another matter, but the search for them need not be.
How and Why Predictive Coding is Fun
Steps Four, Five and Six of the standard eight-step workflow for Predictive Coding 4.0 are where we work with the active machine-learning features of Mr. EDR. These are its predictive coding features, a type of artificial intelligence. We train the computer on our conception of relevance by showing it relevant and irrelevant documents that we have found. The software is designed to then go out and find all other relevant documents in the total dataset. One of the skills we learn is when we have taught enough and can stop the training and complete the document review. At TREC we call that the Stop decision. Getting it right is important to keep down the costs of document review.
We use a multimodal approach to find training documents, meaning we use all of the other search features of Mr. EDR to find relevant ESI, such as keyword searches, similarity and concept. We iterate the training with sample documents, both relevant and irrelevant, until the computer starts to understand the scope of relevance we have in mind. It is a training exercise to make our AI smart, to get it to understand the basic ideas of relevance for that case. It usually takes multiple rounds of training for Mr. EDR to understand what we have in mind. But he is a fast learner, and by using the latest hybrid multimodal IST ("intelligently spaced learning") techniques, we can usually complete his training in a few days. At TREC, where we were moving fast after hours with the Ã-Team, we completed some of the training experiments in just a few hours.
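The cycle described above (train on judged documents, let the software rank the rest, judge the top of the ranking, retrain) can be illustrated with a toy sketch. This is not Mr. EDR's algorithm, which this essay does not describe. The word-weight scorer, the function names `train`, `score` and `review_loop`, the batch size, and the stopping rule (stop when a whole batch comes back irrelevant) are all assumptions made for illustration, and the `truth` set merely stands in for the human reviewer.

```python
import math
from collections import Counter

def train(labeled):
    """Learn a log-odds weight per word from (text, is_relevant) pairs."""
    rel, irr = Counter(), Counter()
    n_rel = n_irr = 0
    for text, is_relevant in labeled:
        words = set(text.lower().split())
        if is_relevant:
            rel.update(words)
            n_rel += 1
        else:
            irr.update(words)
            n_irr += 1
    # Add-one smoothing so a word seen on only one side gets a finite weight.
    return {
        w: math.log((rel[w] + 1) / (n_rel + 2)) - math.log((irr[w] + 1) / (n_irr + 2))
        for w in set(rel) | set(irr)
    }

def score(weights, text):
    """Sum the weights of the distinct words in a document."""
    return sum(weights.get(w, 0.0) for w in set(text.lower().split()))

def review_loop(collection, seed_labels, reviewer, batch_size=2, max_rounds=20):
    """Retrain after every batch; stop when a batch yields nothing relevant."""
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        weights = train([(collection[d], r) for d, r in labels.items()])
        unreviewed = [d for d in collection if d not in labels]
        ranked = sorted(unreviewed, key=lambda d: score(weights, collection[d]), reverse=True)
        batch = ranked[:batch_size]
        for d in batch:
            labels[d] = reviewer(d)  # human judgment on the top-ranked documents
        if not any(labels[d] for d in batch):  # hypothetical Stop decision
            break
    return labels

# Hypothetical nine-document collection, invented for this example.
collection = {
    1: "merger price fixing memo",
    2: "lunch menu friday",
    3: "price fixing call notes",
    4: "fixing prices with competitor",
    5: "holiday party lunch",
    6: "parking garage notice",
    7: "competitor price agreement memo",
    8: "gym schedule update",
    9: "cafeteria lunch menu",
}
truth = {1, 3, 4, 7}          # what the reviewer would call relevant
seed = {1: True, 2: False}    # seed documents found by another search, e.g. keywords
labels = review_loop(collection, seed, reviewer=lambda d: d in truth)
found = {d for d, is_relevant in labels.items() if is_relevant}
```

In this toy run the loop finds all four relevant documents and stops after a batch with no relevant hits, leaving one irrelevant document unreviewed. The point is the shape of the cycle, not the savings, which only show up on collections far larger than nine documents.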
After a while Mr. EDR starts to “get it,” he starts to really understand what we are after, what we think is relevant in the case. That is when a happy shock and awe type moment can happen. That is when Mr. EDR’s intelligence and search abilities start to exceed our own. Yes. It happens. The pupil then starts to evolve beyond his teachers. The smart algorithms start to see patterns and find evidence invisible to us. At that point we sometimes even let him train himself by automatically accepting his top-ranked predicted relevant documents without even looking at them. Our main role then is to determine a good range for the automatic acceptance and do some spot-checking. We are, in effect, allowing Mr. EDR to take over the review. Oh what a feeling to then watch what happens, to see him keep finding new relevant documents and keep getting smarter and smarter by his own self-programming. That is the special AI-high that makes it so much fun to work with Predictive Coding 4.0 and Mr. EDR.
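One way to picture the automatic acceptance range and the spot-checking is the minimal sketch below. It assumes the software exposes a predicted probability of relevance per document; the function name `auto_accept`, the 0.95 cutoff, the 10% spot-check rate and the document ids are all hypothetical, not settings taken from Mr. EDR.

```python
import math
import random

def auto_accept(scores, cutoff=0.95, spot_check_rate=0.1, seed=0):
    """Split predictions into auto-accepted documents and a spot-check sample.

    scores maps a document id to the predicted probability of relevance.
    """
    # Everything at or above the cutoff is coded relevant without human review.
    accepted = sorted(d for d, p in scores.items() if p >= cutoff)
    # A small random sample of the accepted set goes back to a human for checking.
    n_check = math.ceil(len(accepted) * spot_check_rate)
    spot_check = random.Random(seed).sample(accepted, n_check)
    return accepted, spot_check

scores = {"doc-a": 0.99, "doc-b": 0.97, "doc-c": 0.96, "doc-d": 0.60, "doc-e": 0.10}
accepted, spot_check = auto_accept(scores)
```

If the spot-check turns up irrelevant documents, the natural response is to raise the cutoff and go back to reviewing the top-ranked documents by hand.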
It does not happen in every project, but with the new Predictive Coding 4.0 methods and the latest Mr. EDR, we are seeing this kind of transformation happen more and more often. It is a tipping point in the review when we see Mr. EDR go beyond us. He starts to unearth relevant documents that my team would never even have thought to look for. The relevant documents he finds are sometimes completely dissimilar to any others we found before. They do not have the same keywords, or even the same known concepts. Still, Mr. EDR sees patterns in these documents that we do not. He can find the hidden gems of relevance, even outliers and black swans, if they exist. When he starts to train himself, that is the point in the review when we think of Mr. EDR as going into superhero mode. At least, that is the way my young e-Discovery Team members like to talk about him.
By the end of many projects the algorithmic functions of Mr. EDR have attained a higher intelligence and skill level than our own (at least on the task of finding the relevant evidence in the document collection). He is always lightning fast and inexhaustible, even untrained, but by the end of his training, he becomes a search genius. Watching Mr. EDR in that kind of superhero mode is what makes Predictive Coding 4.0 a pleasure.
The Empowerment of AI Augmented Search
It is hard to describe the combination of pride and excitement you feel when Mr. EDR, your student, takes your training and then goes beyond you. More than that, the super-AI you created then empowers you to do things that would have been impossible before, absurd even. That feels pretty good too. You may not be Iron Man, or look like Robert Downey, but you will be capable of remarkable feats of legal search strength.
For instance, using Mr. EDR as our Iron Man-like suits, my e-discovery Ã-Team of three attorneys was able to do thirty different review projects and classify 17,014,085 documents in 45 days. See 2015 TREC experiment summary at Mr. EDR. We did these projects mostly at night, and on weekends, while holding down our regular jobs. What makes this crazy impossible is that we were able to accomplish this by only personally reviewing 32,916 documents. That is less than 0.2% of the total collection. That means we relied on predictive coding to do 99.8% of our review work. Incredible, but true.
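The percentages follow directly from the two document counts given above, as this short check shows. The variable names are only illustrative.

```python
total_classified = 17_014_085   # documents classified across the thirty projects
human_reviewed = 32_916         # documents the team personally reviewed

human_share = human_reviewed / total_classified
machine_share = 1 - human_share

print(f"human-reviewed share: {human_share:.2%}")         # prints 0.19%
print(f"left to predictive coding: {machine_share:.1%}")  # prints 99.8%
```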
Using traditional linear review methods it would have taken us 45 years to review that many documents! Instead, we did it in 45 days. Plus our recall and precision rates were insanely good. We even scored 100% precision and 100% recall in one TREC project in 2015 and two more in 2016. You read that right. Perfection. Many of our other projects attained scores in the high and mid nineties. We are not saying you will get results like that. Every project is different, and some are much more difficult than others. But we are saying that this kind of AI-enhanced review is not only fast and efficient, it is effective.
Yes, it’s pretty cool when your little AI creation does all the work for you and makes you look good. Still, no robot could do this without your training and supervision. We are a team, which is why we call it hybrid multimodal, man and machine.
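For readers new to the two measures, here is a minimal sketch using the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The function name and the document ids are hypothetical, and this essay does not say how the TREC adjudication system computed its scores.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# A perfect result: everything retrieved is relevant and nothing relevant is missed.
perfect = precision_recall({1, 2, 3}, {1, 2, 3})      # (1.0, 1.0)
# A weaker result: two false hits lower precision, one miss lowers recall.
partial = precision_recall({1, 2, 3, 4}, {1, 2, 5})   # precision 0.5, recall 2/3
```

A score of 100% on both therefore means the review returned every relevant document and no irrelevant ones.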
Having Fun with Scientific Research at TREC 2015 and 2016
During the 2015 TREC Total Recall Track experiments my team would sometimes get totally lost on a few of the really hard Topics. We were not given legal issues to search, as we usually are. They were arcane technical hacker issues, political issues, or local news stories. Not only were we in new fields, the scope of relevance of the thirty Topics was never really explained. (We were given one- to three-word explanations in 2015; in 2016 we got a whole sentence!) We had to figure out intended relevance during the project based on feedback from the automated TREC document adjudication system. We would have some limited understanding of relevance based on our suppositions of the initial keyword hints, and so we could begin to train Mr. EDR with that. But, in several Topics, we never had any real understanding of exactly what TREC thought was relevant.
This was a very frustrating situation at first, but, and here is the cool thing, even though we did not know, Mr. EDR knew. That’s right. He saw the TREC patterns of relevance hidden to us mere mortals. In many of the thirty Topics we would just sit back and let him do all of the driving, like a Google car. We would often just cheer him on (and each other) as the TREC systems kept saying Mr. EDR was right, the documents he selected were relevant. The truth is, during much of the 45 days of TREC we were like kids in a candy store having a great time. That is when we decided to give Mr. EDR a cape and superhero status. He never let us down. It is a great feeling to create an AI with greater intelligence than your own and then see it augment and improve your legal work. It is truly a hybrid human-machine partnership at its best.
I hope you get the opportunity to experience this for yourself someday. The TREC experiments in 2015 and 2016 on recall in predictive coding are over, but the search for truth and justice goes on in lawsuits across the country. Try it on your next document review project.
Do What You Love and Love What You Do
Mr. EDR, and other good predictive coding software like it, can augment our own abilities and make us incredibly productive. This is why I love predictive coding and would not trade it for any other legal activity I have ever done (although I have had similar highs from oral arguments that went great, or the rush that comes from winning a big case).
The excitement of predictive coding comes through clearly when Mr. EDR is fully trained and able to carry on without you. It is a kind of Kurzweilian mini-singularity event. It usually happens near the end of the project, but can happen earlier when your computer catches on to what you want and starts to find the hidden gems you missed. I suggest you give Predictive Coding 4.0 and Mr. EDR a try. To make it easier I open-sourced our latest method and created an online course: TARcourse.com. It will teach anyone our method, if they have the right software. Learn the method, get the software and then you too can have fun with evidence search. You too can love what you do. Document review need never be boring again.
Caution
One note of caution: most e-discovery vendors, including the largest, do not have active machine learning features built into their document review software. Even the few that have active machine learning do not necessarily follow the Hybrid Multimodal IST Predictive Coding 4.0 approach that we used to attain these results. They instead rely entirely on machine-selected documents for training, or even worse, rely entirely on randomly selected documents to train the software, or have elaborate unnecessary secret control sets.
The algorithms used by some vendors who say they have "predictive coding" or "artificial intelligence" are not very good. Scientists tell me that some are only dressed-up concept search or unsupervised document clustering. Only bona fide active machine learning algorithms create the kind of AI experience that I am talking about. Software for document review that does not have any active machine learning features may be cheap, and may be popular, but it lacks the power that I love. Without active machine learning, which is fundamentally different from just "analytics," it is not possible to boost your intelligence with AI. So beware of software that just says it has advanced analytics. Ask whether it has "active machine learning."
It is impossible to do the things described in this essay unless the software you are using has active machine learning features. This is clearly the way of the future. It is what makes document review enjoyable and why I love to do big projects. It turns scary into fun.
So, if you tried "predictive coding" or "advanced analytics" before, and it did not work for you, it could well be the software’s fault, not yours. Or it could be the poor method you were following. The method that we developed in Da Silva Moore, where my firm represented the defense, was a version 1.0 method. Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 183 (S.D.N.Y. 2012). We have come a long way since then. We have eliminated unnecessary random control sets and gone to continuous training, instead of train then review. This is spelled out at TARcourse.com, which teaches our latest version 4.0 techniques.
The new 4.0 methods are not hard to follow. TARcourse.com puts our methods online and even teaches the theory and practice. And the 4.0 methods certainly will work. We have proven that at TREC, but only if you have good software. With just a little training, and some help at first from consultants (most vendors with bona fide active machine learning features will have good ones to help), you can have the kind of success and excitement that I am talking about.
Do not give up if it does not work for you the first time, especially in a complex project. Try another vendor instead, one that may have better software and better consultants. Also, be sure that your consultants are Predictive Coding 4.0 experts, and that you follow their advice. Finally, remember that the cheapest software is almost never the best, and, in the long run, will cost you a small fortune in wasted time and frustration.
Conclusion
Love what you do. It is a great feeling and a surefire way to job satisfaction and success. With these new predictive coding technologies it is easier than ever to love e-discovery. Try them out. Treat yourself to the AI high that comes from using smart machine learning software and fast computers. There is nothing else like it. If you switch to the 4.0 methods and software, you too can know that thrill. You can watch an advanced intelligence, which you helped create, exceed your own abilities, exceed anyone’s abilities. You can sit back and watch Mr. EDR complete your search for you. You can watch him do so in record time and with record results. It is amazing to see good software find documents that you know you would never have found on your own.
Predictive coding AI in superhero mode can be exciting to watch. Why deprive yourself of that? Who says document review has to be slow and boring? Start making the practice of law fun again.
__________
The author can be reached at Ralph.Losey@gmail.com or at work at Ralph.Losey@JacksonLewis.com. Consultations by the author related to predictive coding, e-discovery or any other for-pay services are provided exclusively to current clients of the author’s law firm, Jackson Lewis P.C.