Court Permits Combination of Predictive Coding and Keyword Search

by Morgan Lewis

Focusing on precision rather than recall, district court finds that process complies with discovery obligations.

On April 18, 2013, the U.S. District Court for the Northern District of Indiana issued a discovery order in In re Biomet M2a Magnum Hip Implant Products Liability Litigation,[1] finding that defendant Biomet's discovery process, which included the combined use of keyword search and predictive coding, fulfilled its discovery obligations. However, the court accepted Biomet's reliance on precision measurements, rather than recall measurements, leading to a potentially substantial underestimation of what proportion of relevant documents Biomet produced.


In response to the plaintiffs' discovery demands, Biomet collected 6 terabytes of data and filtered the resulting 19.5 million documents with keyword searches to identify approximately 3 million documents for review.[2] Biomet then performed a predictive coding review on these 3 million records to identify documents for production. The plaintiffs objected to this approach, arguing that Biomet should have applied predictive coding to all 19.5 million documents and should be required to do so to find any remaining relevant documents. The plaintiffs alleged that the use of keywords before applying predictive coding polluted the results of the process. The plaintiffs also argued that Biomet should have allowed them to participate in a joint review of the documents used to train the predictive coding software. Biomet did offer the plaintiffs the opportunity to propose additional keyword searches and invited the plaintiffs to review samples of the output of the predictive coding system.

Court's Opinion and Biomet's Statistical Claim

The court rejected the plaintiffs' arguments, focusing its analysis on whether Biomet had satisfied its obligations under Federal Rules of Civil Procedure 26(b) and 34(b)(2) and the Seventh Circuit Principles Relating to the Discovery of Electronically Stored Information. The court found nothing in the duty of cooperation that requires the parties to jointly review data. It also dismissed the plaintiffs' argument that limiting the document population with keywords prior to applying predictive coding necessarily diluted the value of the latter process. The court further focused on the cost of reviewing all 19.5 million documents, as the plaintiffs proposed, finding that the costs were not proportional to the "comparatively modest" increase in the relevant documents that would be found, according to the statistical testing performed by Biomet.[3]

Biomet's brief in support of its process was the source of the statistical claim that only 0.94% of documents not hit by its keyword searches were relevant. Its expert characterized this as a "very low number of potentially responsive documents" missed compared with the 16% relevance of the keyword search results, a characterization the court echoed in its order. While the 0.94% figure is comparatively small when measured against the 16% relevance of the keyword search results, it represents a much larger number of actual documents than the percentages seem to indicate. Biomet's measurement showing 0.94% relevance equates to approximately 86,000–210,000 missed responsive documents. Compared with the approximately 180,000–230,000 relevant documents the keywords did retrieve, the keyword searches potentially excluded nearly as many responsive documents as they retrieved, and possibly more.
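The arithmetic behind this comparison can be sketched in a few lines of Python. This is an illustration only, not Biomet's or the court's methodology. It uses the approximate figures quoted above and makes two assumptions: that the set of documents not hit by keywords is simply the 19.5 million collected minus the roughly 3 million selected, and that the low and high ends of the two ranges can be paired to bound recall. The variable and function names are hypothetical.

```python
# Approximate figures quoted in the article.
TOTAL_DOCS = 19_500_000                  # documents collected
KEYWORD_HITS = 3_000_000                 # documents selected by keyword search
NON_HIT_RELEVANCE_RATE = 0.0094          # 0.94% of non-hit documents were relevant
RETRIEVED_RELEVANT = (180_000, 230_000)  # relevant documents the keywords retrieved
MISSED_RELEVANT = (86_000, 210_000)      # relevant documents the keywords missed


def recall(retrieved: int, missed: int) -> float:
    """Recall = relevant documents retrieved / all relevant documents."""
    return retrieved / (retrieved + missed)


# Assumption: the non-hit set is the collection minus the keyword hits.
non_hits = TOTAL_DOCS - KEYWORD_HITS
missed_point_estimate = round(NON_HIT_RELEVANCE_RATE * non_hits)

# Assumption: pair the extremes of the two ranges to bound recall.
worst_case = recall(RETRIEVED_RELEVANT[0], MISSED_RELEVANT[1])
best_case = recall(RETRIEVED_RELEVANT[1], MISSED_RELEVANT[0])

print(f"Missed documents (point estimate): {missed_point_estimate:,}")
print(f"Keyword recall: {worst_case:.0%} to {best_case:.0%}")
```

Under these assumptions, a small percentage applied to a very large set of unreviewed documents yields a missed-document count of the same order as the number of relevant documents retrieved, which is why a comparison of 0.94% against 16% says little about recall by itself.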


Courts continue to issue orders and opinions allowing (and occasionally requiring) the use of predictive coding as a means of reducing the cost of discovery. The court in Biomet accepted the notion that predictive coding is a reasonable method by which a party may meet its discovery obligations and that cost shifting can be an appropriate means of addressing proportionality concerns. It made clear that cooperation does not require complying with the requesting party's demand for a specific process, and it was also not convinced that keyword search and predictive coding cannot be used together, as the plaintiffs argued.

It is clear, however, that the court did not base its reasonableness assessment on a measure of the level of recall[4] of Biomet's process. Instead, it focused on comparative costs and Biomet's assertions that the keyword search results had a greater proportion of relevant documents than the documents that were not hit by the keyword searches. This focus on precision rather than recall led the court to approve Biomet's process, which may well have left behind more relevant documents than it found.

It is critical to remember that the standards for discovery are reasonableness and proportionality, not perfection. Courts' rules do not require 100% recall of relevant documents, but producing parties should not rely solely on the type of comparative precision measurements that the court accepted in Biomet. They should instead focus on achieving reasonable recall rates while defensibly managing costs and risks given the specifics of each case. Strategies to achieve this may include limiting the scope of collection, applying keyword searches, using predictive coding, and employing other methods depending on the matter.


If you have any questions or would like more information on the issues discussed in this LawFlash, please contact any of the following Morgan Lewis eData attorneys and technologists:


Stephanie A. "Tess" Blair
Scott A. Milner
Jacquelyn A. Caridad
Tara S. Lawler

New York
Denise E. Backhouse

San Francisco
Lorraine M. Casto

Washington, D.C.
Graham B. Rollins

Jennifer Mott Williams


New York
L. Keven Hayworth

James B. Vinson

San Francisco
Wayne R. Feagley

George E. Phillips

Washington, D.C.
Jessica A. Robinson

[1]. In re Biomet M2a Magnum Hip Implant Prods. Liab. Litig., No. 3:12-MD-2391 (N.D. Ind. Apr. 18, 2013) (order regarding discovery of ESI).

[2]. Biomet also used de-duplication to reduce the number of documents for review.

[3]. Biomet order, supra note 1, at 5.

[4]. Recall is the proportion of all relevant documents in the population being searched that a given search actually retrieves. A related measure, precision, is the proportion of documents retrieved by a given search that ultimately prove to be relevant.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations.

© Morgan Lewis | Attorney Advertising
