Recent experience has convinced me that familiarity breeds complacency when it comes to responding to Antitrust Second Requests. In the name of the “unique nature” of a Second Request review, experienced practitioners often maintain that only traditional protocols will work, and unhesitatingly accept a production set that may be upwards of 50% nonresponsive.
Certainly, there are constraints on review techniques when it comes to responding to a Second Request. Both the DOJ and the FTC require advance written notice when “using software or technology… to identify or eliminate documents, data, or information potentially responsive to [the] Request.”[i] Both likewise demand specific information when the review process relies on either search terms or technology-assisted review. The specific reference to “seed set[s] and training rounds” in the Model Second Requests used by each agency exhibits a selective, nearly exclusionary, preference for TAR 1.0. And the concomitant prohibition in the DOJ’s Predictive Coding Model Agreement against any responsiveness review following the “predictive coding process” virtually guarantees the substantial, unnecessary production of nonresponsive documents.[ii]
But those constraints should never inhibit a constant quest for better techniques – techniques that make the review more efficient or result in the production of fewer nonresponsive documents, or both. Much like the technology titans chronicled in Always Day One, we cannot slip into Day Two and focus on fiercely defending tradition rather than inventing the future.[iii] As Jeff Bezos (Amazon) observed in a 2016 letter to shareholders:[iv]
Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.
Instead, we need to strive to continuously live Day One, and prioritize focused innovation over tradition – particularly when we cling to tradition only for tradition’s sake.
And, as discussed below, there is indeed a better technique for responding to Antitrust Second Requests, a more efficient technique that effectively focuses on the exclusive production of responsive documents. There is a better way and, frankly, no rational reason to stagnate in tradition.
Tradition is Stasis
Practitioners typically use one of two techniques when responding to Second Requests: either search terms followed by linear review, or TAR 1.0. Neither is particularly efficient or effective. And both provide, at best, limited insight into the documents produced to the agency. Even worse, a TAR 1.0 approach can be exceedingly harmful as a consequence of the unavoidable production of nonresponsive documents.
One tangential observation before moving to the reality of both techniques: the fact that practitioners who focus on Second Requests rely on antiquated eDiscovery review techniques undercuts the supposed “unique nature” of an Antitrust Second Request review. It is, after all (assuming proper training by counsel), just a document review – responsiveness and privilege are far from foreign concepts to an experienced document review team. And the magnitude and compressed production deadlines associated with Second Requests have become almost commonplace for experienced eDiscovery vendors. One marketing piece for Second Request capabilities highlights the ability to process 45 million documents in 106 days. Okay… but I have seen eDiscovery vendors capable of processing upwards of 30 million documents (~33TB) in 30 days. There is simply nothing truly “unique” about responding to a Second Request.
And if the eDiscovery realm has taught us anything, it is that search terms followed by linear review is an ineffective, inefficient document review technique. The Blair and Maron study tells us that search terms often retrieve only on the order of twenty percent (20%) of the responsive documents from a document collection.[v] And while naysayers often seek to discredit the study – largely on the basis of advances in computerized search technology – I have personally seen knowledgeable merits counsel struggle to find thirty percent (30%) of the responsive documents using search terms with even the most modern search technology. Since 20% or 30% recall will undoubtedly be inadequate for a Second Request, practitioners will need to spend countless hours refining the search terms to improve recall. And with increased recall comes decreased precision – a direct consequence of the precision-recall tradeoff.[vi] Practically, that means that every percentage-point increase in recall will tend to decrease the precision of the search term review set by some amount. And, since the Blair and Maron study put observed precision at roughly 80% when recall was only about 20%, much more than 20% of the review set will be nonresponsive at higher recall levels. That directly increases review effort and decreases efficiency.
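To make the arithmetic behind that tradeoff concrete, here is a minimal Python sketch. The function name `review_set_profile`, the 100,000-document responsive population, and the 30% precision at 80% recall are illustrative assumptions, not figures from any study or matter; only the pairing of roughly 20% recall with roughly 80% precision comes from the discussion above.

```python
def review_set_profile(responsive_total: int, recall: float, precision: float) -> dict:
    """Size and makeup of a search-term review set at a given recall and precision.

    recall    = responsive documents retrieved / all responsive documents
    precision = responsive documents retrieved / all documents retrieved
    """
    if not (0 < recall <= 1 and 0 < precision <= 1):
        raise ValueError("recall and precision must be in (0, 1]")
    found = responsive_total * recall   # responsive documents the terms retrieve
    hits = found / precision            # every document the terms retrieve
    return {
        "responsive_found": round(found),
        "review_set_size": round(hits),
        "nonresponsive_reviewed": round(hits - found),
    }


# Hypothetical collection containing 100,000 responsive documents.
low_recall = review_set_profile(100_000, recall=0.20, precision=0.80)
high_recall = review_set_profile(100_000, recall=0.80, precision=0.30)  # assumed precision

print(low_recall)   # 25,000 documents reviewed, 5,000 of them nonresponsive
print(high_recall)
```

Under these assumed numbers, quadrupling recall multiplies the review set by more than ten, and most of the added documents are nonresponsive.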
Qualitatively, the search term approach is even more concerning. Typical Second Request document volumes and deadlines necessitate a substantial number of reviewers. Marketing materials for one large case tout the engagement of more than 300 reviewers for a single Second Request response. That presents two practical problems. First, more reviewers simply mean more inconsistency. This can be particularly disconcerting when privilege calls are missed. Second, with documents being spread indiscriminately among so many reviewers, there is no opportunity for gaining any real insight into the nature of the documents that are being produced to the agency. In the context of a fast-paced Second Request, this can mean the difference between preparation and naked reaction during negotiations.
A TAR 1.0 approach, on the other hand, may be more efficient (in terms of the number of documents reviewed to achieve production), but will undoubtedly be less effective and less protective. At a reasonable recall level, TAR 1.0 is not particularly precise. It is not at all uncommon to see precision levels at less than 50%. And marketing materials for one large case suggest that precision for some collections may well be less than 30% at recall levels of only 75%. That means that upwards of half of a production to the agency will be nonresponsive, particularly since any subsequent responsiveness review is prohibited.
Qualitatively, a TAR 1.0 approach can be even worse than a search term approach. Poor precision in a TAR 1.0 review doesn’t increase review – it increases exposure. Nonresponsive documents are never reviewed; they are produced directly to the agency. Consequently, every issue contained in every nonresponsive document is directly exposed to the agency. And, since only a small fraction of the collection is reviewed to develop the TAR model, there is even less insight into the substance of the production.
Ultimately, traditional review techniques do not serve the spectrum of objectives attendant to Second Request reviews well. Neither technique adequately optimizes efficiency, effectiveness and protection.
Day One, Innovating a Better Way
To say merely that there is a better way to respond to a Second Request is an understatement. An innovative rapid analytic investigative review (RAIR), which combines refined workflows with the adept application of analytics and integrated, targeted sampling, will optimize for all three objectives – efficiency, effectiveness and, perhaps most importantly (particularly given the deficiencies of traditional approaches), protection.
So, how does a RAIR review work?
The backbone of a RAIR review is a small, sophisticated team, typically fewer than ten members, dedicated to the investigatory, analytic, statistical assessment of substantively similar tranches of documents. More than one team may be necessary, depending on the volume and homogeneity of the collection. But, given that one team can typically assess a few hundred thousand documents in the span of just a week, seldom will any review require more than three teams.
It nearly goes without saying that the concentrated and focused character of the RAIR team will improve consistency, particularly over massive 300-person reviews. And an ingrained practice of constant communication and collaboration within and among the teams further promotes not only consistency but also better decision-making – drawing on the collective wisdom of the team(s), as opposed to the isolated determinations of a single reviewer.
Each tranche of documents is derived using all available analytics, essentially by aggregating sets of documents that are substantively similar from the perspective of responsiveness. This approach often results in the creation of document sets that combine thousands, or even tens of thousands, of similar documents for a single decision – responsive or nonresponsive. (I have personally seen one situation where more than 2 million virtually identical documents were aggregated.) This aggregation process continues until the entire collection has been evaluated. And the basis for aggregating each document set is recorded to support the defensibility of the process.
Throughout the aggregation process, this focused assessment yields valuable insights into the context and substance of the documents in the collection, far more than any single individual might garner using traditional review techniques. This continuous and timely knowledge of the collection, and particularly of the documents being produced to the agency, can be critical to advance planning and to fully protecting the client’s interests in negotiations with the agency.
As document sets are aggregated, random samples of each set are generated to provide the basis for a bulk responsiveness assessment. At least one sample having a confidence level of 95% and a margin of error of ±5% should be drawn from each set. More than one sample may be drawn from larger or more diverse (less homogeneous) document sets, and small document sets may be reviewed in their entirety.
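For a sense of how large such a sample is, the sketch below applies the standard sample-size formula for estimating a proportion, with a finite population correction. It is an illustration under stated assumptions, not a description of any particular platform's sampler: the function name `sample_size` is hypothetical, and the worst-case proportion of 0.5 is a conventional conservative choice.

```python
import math
from statistics import NormalDist


def sample_size(population: int, confidence: float = 0.95,
                margin: float = 0.05, proportion: float = 0.5) -> int:
    """Documents to sample from a set of `population` documents.

    Uses n0 = z^2 * p * (1 - p) / e^2, then the finite population
    correction n = n0 / (1 + (n0 - 1) / N), rounded up.
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    # Never ask for more documents than the set contains.
    return min(population, math.ceil(n))


for docs in (200, 10_000, 2_000_000):
    print(docs, sample_size(docs))
```

The required sample barely grows with the size of the set: a few hundred documents suffice whether the set holds ten thousand documents or two million, which is what makes a single decision on a very large aggregated set practical.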
These representative samples are then reviewed for responsiveness. If the entire sample is consistent and correct, the entire document set is coded accordingly. If the sample is not wholly consistent, the original document set will be reassessed and, if necessary, further separated into responsive and nonresponsive sets. New samples of both will then be drawn and reviewed for a subsequent iteration. Throughout this sample review process, each document may also be evaluated for privilege.
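The sample-and-decide step described above might be sketched as follows. Everything named here is hypothetical: `assess_set`, the `review` callback (standing in for a human reviewer's responsiveness call), and the returned dictionary are illustrative choices, and the rule that a single disagreement sends the set back for reassessment is one reading of "wholly consistent."

```python
import random
from typing import Callable, Sequence


def assess_set(doc_ids: Sequence[str], expected: bool,
               review: Callable[[str], bool],
               sample_size: int, seed: int = 0) -> dict:
    """Sample one aggregated set and decide whether it can be bulk coded.

    expected: the responsiveness call proposed for the whole set.
    review:   stand-in for the reviewer's decision on one document.
    """
    rng = random.Random(seed)               # fixed seed so the draw can be reproduced
    k = min(sample_size, len(doc_ids))      # small sets are reviewed in full
    sample = rng.sample(list(doc_ids), k)   # simple random sample without replacement
    mismatches = [d for d in sample if review(d) != expected]
    if not mismatches:
        # Wholly consistent sample: apply the proposed call to the entire set.
        return {"action": "bulk_code", "label": expected, "sampled": k}
    # Otherwise the set is reassessed and, if necessary, split and resampled.
    return {"action": "reassess", "mismatches": mismatches, "sampled": k}


docs = [f"DOC-{i:06d}" for i in range(5_000)]
result = assess_set(docs, expected=True, review=lambda d: True, sample_size=385)
print(result["action"], result["sampled"])
```

Recording the seed, the sample, and the outcome for each set is one way to document the basis for each bulk decision.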
The statistical implications of this approach for the responsiveness review are noteworthy. The responsiveness decision for every aggregated document set is essentially validated with a 95/5 sample. As a result, a RAIR review drives superior levels of recall and precision, often reaching greater than 90% on both. That means that the agency will get virtually everything it might be entitled to under the Second Request (responsive), and nothing that does not otherwise need to be produced (nonresponsive).
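One way to quantify what a clean sample supports is the standard zero-defect binomial bound: if every one of n randomly sampled documents agrees with the set's coding, then with confidence c the true rate of miscoded documents in the set is at most 1 - (1 - c)^(1/n). The sketch below is offered only as an illustration of that bound, assuming simple random sampling from a large set; it is not drawn from the agencies' model documents, and the function name is hypothetical.

```python
def max_error_rate_given_clean_sample(sample_size: int, confidence: float = 0.95) -> float:
    """Upper bound on the miscoding rate when a random sample shows zero errors.

    Solves (1 - p)^n = 1 - confidence for p, the largest error rate that
    would still produce an all-clean sample with probability 1 - confidence.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


bound = max_error_rate_given_clean_sample(385)
print(f"{bound:.2%}")  # a bit under 1%
```

How these per-set bounds translate into overall recall and precision depends on the size and mix of the sets, so the figure should be read as support for each bulk decision and not as a collection-wide guarantee.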
Finally, in addition to the preliminary privilege assessment during the responsiveness sample review, privilege is actually subject to much closer scrutiny. Documents are not simply reviewed independently; analytics are used to identify the characteristics (subject matter, timing, people, etc.) underlying contextual privileges, and investigatory techniques are then used to find the privileged documents. This approach ensures the utmost consistency and coverage, particularly given the symbiotic operation of the RAIR team and the intensive, focused analysis underlying the investigation for privileged documents.
Once all of the aggregated document sets have been assessed, and the privilege assessment is complete, a clean, concise set of documents almost exclusively responsive to the Second Request can be produced to the agency.
Abandon Tradition for Innovation
The relative benefits of a RAIR review over traditional review techniques for responding to a Second Request are straightforward. Fewer documents are reviewed. More responsive and fewer nonresponsive documents are produced. Privilege protection is greater. And a RAIR review provides insights into the substance of the production that would otherwise never be available. “That is why it [should always be] Day 1.”
[i] See https://www.justice.gov/atr/file/706636/download (DOJ Model Second Request, Instruction 4) and https://www.ftc.gov/system/files/attachments/hsr-resources/model_second_request_-_final_-_october_2021.pdf (FTC Model Second Request, Instruction 15).
[ii] See https://www.justice.gov/file/1096096/download (DOJ Predictive Coding Model Agreement).
[iii] Alex Kantrowitz, Always Day One: How the Tech Titans Plan to Stay on Top Forever.
[iv] See https://www.vox.com/2017/4/12/15274220/jeff-bezos-amazon-shareholders-letter-day-2-disagree-and-commit.
[vi] See, e.g., https://datascience-george.medium.com/the-precision-recall-trade-off-aa295faba140.