Predictive Coding Slowly Becoming a Game Changer

by Maynard Nexsen

Two Years After Da Silva Moore

In 2012, Magistrate Judge Andrew Peck of the Southern District of New York approved the use of predictive coding (also called “technology assisted review” or “TAR”) in Da Silva Moore v. Publicis Groupe to search for relevant information. Predictive coding enables attorneys to categorize large volumes of information more efficiently than traditional linear document review.

Since Da Silva Moore, predictive coding has been slowly gaining traction as a discovery tool that can help “secure the just, speedy, and inexpensive determination of every action and proceeding.”

What is Predictive Coding?

Predictive coding involves using computer algorithms to determine which documents are relevant based on a review of test documents by attorneys. The attorneys review and code an initial subset of documents (commonly referred to as the “seed set”) to “train” the computer. The computer “learns” what is relevant from the attorney coding and applies it to the remaining documents. The attorneys then manually review sample responsive and non-responsive results to determine whether the computer review reached a predetermined confidence level.
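The train, apply, and validate loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a toy naive Bayes word-count model, and every name in it (train, predict, sample_agreement, the sample documents) is hypothetical. Commercial TAR platforms rely on their own, far more sophisticated algorithms and statistical validation protocols.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a document into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(seed_set):
    """Learn word counts from attorney-coded (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, is_relevant in seed_set:
        counts[is_relevant].update(tokenize(text))
        docs[is_relevant] += 1
    vocab = set(counts[True]) | set(counts[False])
    return {"counts": counts, "docs": docs, "vocab": vocab}


def relevance_score(model, text):
    """Log-odds that a document is relevant (naive Bayes, add-one smoothing)."""
    counts, docs, vocab = model["counts"], model["docs"], model["vocab"]
    total_rel = sum(counts[True].values()) + len(vocab)
    total_not = sum(counts[False].values()) + len(vocab)
    score = math.log((docs[True] + 1) / (docs[False] + 1))
    for word in tokenize(text):
        if word in vocab:  # words never seen in the seed set are ignored
            score += math.log((counts[True][word] + 1) / total_rel)
            score -= math.log((counts[False][word] + 1) / total_not)
    return score


def predict(model, text):
    """Code a document as relevant when the log-odds are positive."""
    return relevance_score(model, text) > 0


def sample_agreement(model, attorney_checked_sample):
    """Fraction of a manually reviewed sample where the model matched the attorney."""
    matches = sum(
        predict(model, text) == label for text, label in attorney_checked_sample
    )
    return matches / len(attorney_checked_sample)


# Step 1: attorneys code a small seed set (hypothetical documents).
seed_set = [
    ("merger pricing agreement with distributor", True),
    ("pricing discussion before the merger vote", True),
    ("lunch menu for friday", False),
    ("office parking reminder for friday", False),
]
model = train(seed_set)

# Step 2: the model codes the remaining, unreviewed documents.
unreviewed = ["draft merger agreement pricing terms", "friday parking lot closed"]
predictions = [predict(model, doc) for doc in unreviewed]

# Step 3: attorneys manually check a sample and compare it with the model's calls.
checked_sample = list(zip(unreviewed, [True, False]))
agreement = sample_agreement(model, checked_sample)
```

In practice the agreement measured in the final step would be compared against whatever confidence threshold the parties or the court set in advance, and the seed set would be expanded and the model retrained if the threshold were not met.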

Attorneys are using predictive coding (1) to replace manual review typically performed by contract attorneys; (2) to cull down the volume of documents for manual review; and/or (3) to prioritize finding key documents in third party document productions.

Tangible Cost Savings and Efficiencies

Using predictive coding to review a large volume of documents can potentially result in substantial cost savings.

It is estimated that seventy cents of every dollar spent during discovery goes to manually reviewing potentially relevant information. Predictive coding, however, may decrease those costs by as much as seventy-five percent.
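To see what those two estimates imply together, the short Python sketch below combines them. It assumes, purely for illustration, that the seventy-five percent reduction applies to the entire manual review share and that no other discovery cost changes; the function name and the one-million-dollar budget are hypothetical.

```python
def discovery_savings(total_discovery_spend, review_share=0.70, review_reduction=0.75):
    """Estimate savings when only the manual review share of discovery shrinks.

    review_share: fraction of discovery spend that goes to manual review.
    review_reduction: best-case fraction by which predictive coding cuts review cost.
    Returns (dollars saved, savings as a fraction of total discovery spend).
    """
    review_cost = total_discovery_spend * review_share
    savings = review_cost * review_reduction
    return savings, savings / total_discovery_spend


# Hypothetical matter with a $1,000,000 discovery budget.
savings, fraction_of_total = discovery_savings(1_000_000)
```

Under these best-case assumptions, roughly half of total discovery spend (0.70 × 0.75, or about 52.5 percent) would be saved. Actual results depend on software, training, and validation costs that this sketch ignores.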

Technology assisted review shrinks the amount of data requiring human manual review. In 2012, the Department of Justice (“DOJ”) investigated the proposed Anheuser-Busch InBev NV – Grupo Modelo SAB merger. According to a Wall Street Journal article, the DOJ approved the parties’ utilization of predictive coding to conduct the review of over a million documents. The parties estimated that they saved fifty percent by using predictive coding over traditional manual review.

Technology assisted review (“TAR”) allows parties to review and produce relevant information more efficiently. It usually takes months for attorneys to review hundreds of thousands of documents. Using TAR, one major law firm reviewed 570,000 documents and identified 3,070 relevant documents in just three days to comply with a court order. Shortening the discovery period can ultimately help trim discovery costs, and in addition, the parties can focus on the merits much earlier in the litigation.

TAR can enable smaller firms to compete in litigation with massive discovery. Smaller firms often lack the financial and human resources to manually review large volumes of documents—but TAR is creating a more even playing field. For example, a California firm with twenty-five attorneys installed predictive coding software in-house to review eleven million pages of records in an environmental case with a large national law firm representing the opposing side. The firm now uses predictive coding in a quarter of its cases.

Some contend that predictive coding—when used appropriately—is also more accurate than manual document review. Earlier this year, Judge Denise Cote of the Southern District of New York recognized the reliability of predictive coding:

I think there’s every reason to believe that, if [predictive coding is] done correctly, it may be more reliable — not just as reliable but more reliable than manual review, and certainly more cost effective — cost effective for the plaintiff and the defendants.

Studies show that there is great variability in human subjective relevance judgments. In addition, human fatigue or distraction leads to errors during manual review. Predictive coding, in comparison, offers more consistency.

In light of the perceived benefits associated with TAR, the Sedona Best Practices stated last year:

A consensus is forming in the legal community that human review of documents in discovery is expensive, time consuming, and error-prone. There is also a growing awareness that, used correctly, linguistic and mathematically-based content analysis, embodied in new forms of search and retrieval technologies, tools, techniques, and processes in support of the review function, can effectively reduce litigation cost, time, and error rates.

Acceptance by Courts and Government Agencies

Federal or state courts in New York, Tennessee, Virginia, Georgia, and Delaware have approved predictive coding to search for relevant electronic discovery since Da Silva Moore. In fact, some courts have encouraged or even compelled the application of predictive coding during discovery. As a result, parties are more frequently employing predictive coding to retrieve relevant information.

Government agencies have also accepted predictive coding as an appropriate means for parties to search for relevant information requested during a government investigation. The DOJ confirmed earlier this year that it had negotiated or was in the process of negotiating approximately a dozen TAR protocols. The DOJ is also expected to release a “model TAR protocol” in the near future. The Securities & Exchange Commission (“SEC”) is now deploying Recommind’s Axcelerate® Review & Analysis, which has predictive coding capabilities. TAR may be allowed by the SEC subject to the approval of its staff. The Federal Trade Commission (“FTC”) also approves predictive coding on a “case-by-case basis.”

The Cowen Group released a report last year on the use of predictive coding by AmLaw 200 law firms. Sixty-two percent of the firms surveyed used predictive coding during the previous year while eighty-two percent expected to employ predictive coding within the next six months. In addition, eighty-one percent of firms experienced increased client requests for predictive coding.

Attorneys still waver on TAR because of their unfamiliarity with the technology or a perceived lack of court guidance. However, many attorneys might also be surprised just how many courts and government agencies have endorsed or accepted predictive coding in the last couple of years.

Practical Considerations for Firms Considering TAR Tools

Predictive coding “is not a magic, Staples-Easy-Button, solution appropriate for all cases.” The technology is expensive and the coding process requires attorneys to commit to training the computer on relevancy. Because of the upfront expense and time requirements, predictive coding is most likely to provide real cost savings and efficiencies when used for large-volume document reviews.

TAR will inevitably result in the inadvertent production of privileged documents, not unlike manual review performed by humans. Although studies claim TAR is as reliable as (or more reliable than) manual review, attorneys are still reluctant to produce documents without at least one round of attorney review. Federal Rule of Evidence 502 affords protections for inadvertently produced documents. Rule 502(d) in particular allows a court to enter an order that provides “for return of the documents without waiver irrespective of the care taken by the disclosing party,” according to the Advisory Committee Explanatory Note to Rule 502. Attorneys should consider Rule 502 protections to alleviate their privilege concerns.

TAR is more than just a cost-savings tool. Forty-seven percent of the firms surveyed by the Cowen Group used predictive coding to accelerate attorney learning of cases. Specifically, TAR can identify and rank the most relevant documents so that attorneys can focus on those documents. Attorneys also apply TAR to assist with prioritizing important third party documents. Firms might rely on TAR in their early case assessment to help estimate the risks and costs of pursuing different legal strategies.

There is some debate as to the extent to which parties must communicate with the other side about TAR strategies and whether that transparency infringes on attorney work product. For example, are parties obligated to inform the opposing side that they are using predictive coding? Is approval from the opposing side necessary? Is the opposing side entitled to review the seed set? The Northern District of Illinois observed that under the Sedona Principles, “[r]esponding parties are best situated to evaluate the procedures, methodologies, and techniques appropriate for preserving and producing their own electronically stored information.” But other courts have emphasized transparency and cooperation when using TAR. For producing parties, obtaining early agreement on the parameters for the predictive coding process can provide comfort that their methodology will not be challenged.