Predictive Coding Gets A Chance

by Dechert LLP

Bexis attended the annual PLAC spring meeting last week.  PLAC meetings are almost always good for at least one blog post.  This is it.

In the high-tech morass that is ediscovery, parties have tried various ways to do something about the disparity between cost and benefit.  One approach is to use new technology to fix – or at least ameliorate – the problems created by the explosion in electronic information that existing technology produced.

One such proposed technological fix is called “predictive coding.”  Googling that phrase yields far more technical information than we could possibly provide (or maybe even understand), so in the nutshell of a very small nut, predictive coding takes advantage of artificial intelligence software that enables a computer to learn from its mistakes and adjust its processes accordingly.  The need for attorneys to review produced edocuments is a major aspect of excessive ediscovery cost.

Predictive coding can reduce that cost by using computers to extrapolate actual attorney review of a small subset (a “seed set”) of edocuments over the entire proposed production of documents.  The attorneys review the seed set – then the computer codes a similar set of documents based upon the attorney coding.  The attorneys review that set and correct errors.  The computer codes another set, having incorporated the attorneys’ revisions.  That review process is repeated as many times as necessary, until everyone is satisfied the error rate (both false positive and false negative) is acceptable.  The vendors claim predictive coding ultimately makes fewer mistakes than review by actual human attorneys.  Take those financially interested claims with however many grains of salt you believe they deserve.
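For the technically inclined, the review loop just described can be sketched in a few lines of Python.  This is only a toy illustration under our own assumptions, not any vendor's actual software:  the word-counting "model," the function names (train, predict, review_loop), the batch sizes, and the stopping threshold are all hypothetical, and the attorney is simulated by a simple function.

```python
from collections import Counter

def train(labeled):
    """Toy model (an assumption, not a real product): each word gains a point
    for every relevant document it appears in and loses one for every
    irrelevant document."""
    weights = Counter()
    for text, relevant in labeled:
        for word in set(text.lower().split()):
            weights[word] += 1 if relevant else -1
    return weights

def predict(weights, text):
    """Code a document as relevant when its summed word weights are positive."""
    return sum(weights[w] for w in set(text.lower().split())) > 0

def review_loop(documents, attorney_codes, seed_size=4, batch_size=2,
                max_error=0.05, max_rounds=7):
    """Attorneys code a seed set; the computer codes the next batch; attorneys
    correct it; repeat until the batch error rate is acceptable."""
    pool = list(documents)
    labeled = [(doc, attorney_codes(doc)) for doc in pool[:seed_size]]
    pool = pool[seed_size:]
    history = []  # one (round, false positives, false negatives, error rate) per round
    for round_no in range(1, max_rounds + 1):
        if not pool:
            break
        weights = train(labeled)
        batch, pool = pool[:batch_size], pool[batch_size:]
        false_pos = false_neg = 0
        for doc in batch:
            guess = predict(weights, doc)
            truth = attorney_codes(doc)  # the attorney reviews and corrects
            if guess and not truth:
                false_pos += 1
            elif truth and not guess:
                false_neg += 1
            labeled.append((doc, truth))  # corrections feed the next round
        error = (false_pos + false_neg) / len(batch)
        history.append((round_no, false_pos, false_neg, error))
        if error <= max_error:
            break
    return train(labeled), history

# Hypothetical documents; the simulated attorney treats "contract" as the issue.
docs = [
    "contract renewal pricing terms", "lunch menu friday",
    "contract dispute pricing", "office party friday",
    "pricing terms for contract", "lunch order for party",
    "renewal of contract terms", "friday office lunch",
]
model, history = review_loop(docs, lambda d: "contract" in d.split())
for round_no, fp, fn, err in history:
    print(f"round {round_no}: {fp} false positives, {fn} false negatives, error {err:.0%}")
```

Real systems use far more sophisticated statistics, but the shape is the same:  human coding goes in, machine coding comes out, and the humans keep correcting until the false positives and false negatives fall to a level everyone can live with.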

But until recently, no court anywhere had authorized the use of predictive coding in actual ediscovery.  Now that’s changed.  A presentation we heard at the PLAC spring meeting last week (by David Cohen of Reed Smith) mentioned four decisions in three cases where predictive coding had been judicially authorized as an ediscovery tool.

The oldest of them was decided less than three months ago.  In Moore v. Publicis Groupe, ___ F. Supp.2d ___, 2012 WL 607412 (Mag. S.D.N.Y. Feb. 24, 2012), a magistrate judge declared “that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”  Id. at *1.  In Moore the parties had initially “agreed” to use predictive coding, but disputes then (predictably?) arose, requiring judicial resolution.  Perhaps not so coincidentally, the magistrate before whom that agreement was reached had personally written an article on the benefits of predictive coding, id. at *2, and lawyers sure pay attention when their judge starts quoting his/her own articles.

Be that as it may, the first Moore decision (yes, there’s Moore to follow) established some guidelines for the use of predictive coding:

  • “[P]roportionality requires consideration of results as well as costs. And if stopping at 40,000 [out of a universe of 3 million documents] is going to leave a tremendous number of likely highly responsive documents unproduced, [the defendant’s] proposed cutoff doesn't work.”  Id. at *3.
  • A seed set of 2400 documents would be culled and reviewed until the predictive coding process reached a level of 95% confidence that the documents generated by the program were responsive.  Id. at *5.
  • “[A]ll of the documents [in] the seed set, whether . . . ultimately coded relevant or irrelevant, aside from privilege, will be turned over to” plaintiffs.  Id.  Both sides could code documents in the seed set. Id.
  • The seed set coding itself involved two processes:  a keyword code and “judgmental sampling” – the latter to be performed by “senior attorneys” who would not otherwise be conducting a manual document review.  Id.
  • The number of training iterations for the computer was initially set at seven, with the possibility of more if the results had not stabilized.  Id. at *6.
  • Predictive coding was accurate enough to support a certification that a disclosure was “complete and correct” under Fed. R. Civ. P. 26(g)(1)(A).  Id. at *7.
  • Daubert requirements do not apply to a determination of the validity of an ediscovery method.  Id.
  • Accuracy concerns would be addressed “down the road” by reviewing documents from that seed set that the predictive coding system had judged irrelevant.  Id. at *8.  If the system was deeming “hot documents” to be “irrelevant,” then the software would have to be “retrained” or “some other search method employed.”  Id.
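For anyone wondering where numbers like 2400 documents and 95% confidence come from, here is a back-of-the-envelope sketch using the standard sample-size formula for estimating a proportion.  The plus-or-minus 2% margin of error is our assumption (the bullets above do not state one), as are the function name and the worst-case 50% proportion, so treat this as one plausible reading rather than a description of the actual protocol.

```python
import math
from statistics import NormalDist

def sample_size(confidence=0.95, margin=0.02, proportion=0.5):
    """Documents to sample so an estimated rate is within +/- margin at the
    given confidence level (simple random sampling, large population).
    The defaults are illustrative assumptions, not figures from the opinion."""
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)

print(sample_size())             # 95% confidence, +/- 2% margin
print(sample_size(margin=0.05))  # a looser margin needs far fewer documents
```

Under those assumptions the formula lands at roughly 2,400 documents, which is in the same neighborhood as the figure in the opinion.  Notice that the size of the overall universe (three million emails in Moore) barely matters; the confidence level and the margin of error drive the number.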

The first Moore opinion stated that it was dealing with the “easy” case, where both sides agreed to the use of predictive coding.  In dictum the magistrate addressed the criteria that a “harder” case – where one side did not want predictive coding at all – would entail:

"The question to ask in that situation is what methodology would the requesting party suggest instead?  Linear manual review is simply too expensive where, as here, there are over three million emails to review.  Moreover, while some lawyers still consider manual review to be the “gold standard,” that is a myth, as statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review. . . .  [O]n every measure, the performance of [predictive coding] was at least as accurate (measured against the original review) as that of human re-review."

2012 WL 607412, at *9 (citation and quotation marks omitted).

The first Moore opinion closed with four “lessons for the future”:  (1) judicial approval of predictive coding is necessarily tentative in any given case because how well such processes work can only be determined by their results; (2) for predictive coding to work requires staged discovery; (3) counsel need to get relevant information ahead of time from their clients’ knowledge of the producing party’s records; and (4) ediscovery vendors should participate in hearings concerning predictive coding.  Id. at *12.

As the first Moore opinion was by a magistrate, it carried with it a right of appeal to the relevant federal district court.  Fed. R. Civ. P. 72(a).  Rather than go through with predictive coding, the plaintiffs in Moore took such an appeal – indeed, once away from the magistrate (and his published article) they appeared to revoke their consent to predictive coding altogether.  In Moore v. Publicis Groupe SA, 2012 WL 1446534 (S.D.N.Y. April 26, 2012), the district court affirmed – just in time for the PLAC spring meeting.

The court affirmed the magistrate in all respects – except that it didn’t matter whether the plaintiffs agreed to predictive coding or not, since the procedures adequately protected their rights:

"[T]he confusion [over plaintiff’s consent] is immaterial because the ESI protocol contains standards for measuring the reliability of the process and the protocol builds in levels of participation by Plaintiffs.  It provides that the search methods will be carefully crafted and tested for quality assurance, with Plaintiffs participating in their implementation. . . .  If there is a concern with the relevance of the culled documents, the parties may raise the issue before [the magistrate] before the final production.  Further, upon the receipt of the production, if Plaintiffs determine that they are missing relevant documents, they may revisit the issue of whether the software is the best method."

Moore II, 2012 WL 1446534, at *2.  The reliability of predictive coding can only be determined by looking at its results.  Thus, it is “premature” and “speculative” to raise reliability concerns before the system is tested in practice.  If problems arise, then “the parties are allowed to reconsider their methods.”  Id.

In the end, the court in Moore II cautioned that perfection cannot be allowed to become the enemy of the good.  Proportionality has a role to play in determining how ediscovery is to be conducted:

"There simply is no review tool that guarantees perfection. . . .  [T]here are risks inherent in any method of reviewing electronic documents.  Manual review with keyword searches is costly, though appropriate in certain situations.  However, even if all parties here were willing to entertain the notion of manually reviewing the documents, such review is prone to human error and marred with inconsistencies from the various attorneys’ determination of whether a document is responsive."

2012 WL 1446534, at *3.

The PLAC presentation also indicated that predictive coding had been approved in the case of Kleen Products LLC et al v. Packaging Corporation of America, 1:10-cv-05711 (N.D. Ill.), earlier that very week (that is to say, last week, now).  We have a PACER account, and we’re not afraid to use it, so we looked up the docket for that case.  Unfortunately, we can’t confirm or deny approval of predictive coding in Kleen.  That’s because no order appears on PACER.  PACER does, however, indicate that a discovery hearing was held on April 20, 2012, so it’s likely that an oral decision occurred, and the parties are still working out the terms of the order.  There’s also a “transcript” entry in the docket, but it was not accessible through PACER, so all we can say at this point is that it exists.

Finally, a state court, in Virginia, has recently authorized predictive coding.  See Global Aerospace Inc. v. Landow Aviation, 2012 WL 1431215 (Va. Cir. Loudoun Co. April 23, 2012) (“it is hereby ordered Defendants shall be allowed to proceed with the use of predictive coding for purposes of the processing and production of electronically stored information, with processing to be completed with 60 days and production to follow as soon as practicable and in no more than 60 days”).  That’s it, however – no rationale is provided.

As far as we know (we ran a search just to be sure) these cases represent the sum total of all judicial opinions that have ever discussed predictive coding.  But then, several months ago, there were exactly zero.  It’s a fast-moving field. Stay tuned.

Finally, why should we care?

Two reasons – from a defense perspective.  Number one, it promises to be a hell of a lot cheaper than manual review.  High costs = high nuisance value = higher settlements = more incentive for the other side to bring more meritless lawsuits.  Not to mention that we should, as a general principle, try to save our clients money whenever we can.  Number two, ediscovery’s incredibly complicated, and things can go awry.  If things go awry, defendants (and their lawyers) tend to get blamed, because we’re almost always the producing party.  But if a court orders predictive coding and things go awry….  At that point it’s harder to blame (and/or sanction) us because we’re doing exactly what the court ordered us to do.  Thus, plaintiffs are less able to litigate ediscovery and instead must contend with (see number one) the merits of their lawsuits. The bottom line (we hope) is the same, that being fewer meritless lawsuits.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations.

© Dechert LLP | Attorney Advertising
