In any document search, allowing defendants’ custodians to conduct their own searches is much like allowing the fox to guard the henhouse. Even focused and disciplined custodial collections can be fraught with problems.


Email sent during product development:

“Jim – I am writing to you because of the problems I have found during my evaluation of project 249, preliminarily named Mustang. I really think you should talk to Carla Basher about the detailed engineering aspects to the flux capacitor (at least that is what we are calling it) that governs the voltage running through what we already know are stressed leads (I know you are talking about replacing them).

Best, Brad”

The header shows the email going to and the sender as, with a blind copy to


Although the above is a limited example, this email is a good reason to consider how best to obtain as much information as possible at the custodial level when crafting eDiscovery requests. I suggest this email illustrates why predictive coding can be a flawed culling solution unless the parties agree to, or the court orders, collaboration in developing the predictive coding test batches. It also shows a reason for considering keyword searches before implementing predictive coding.

What do we know from this email, presumably sent early in the development of this fictitious project?

  1. Jim is involved in the evaluation.
  2. Brad is coordinating reviews.
  3. There are problems with the project (at least Brad thinks so).
  4. The project is number 249.
  5. The project already has a nickname, “Mustang”.
  6. Carla Basher is an additional custodian.
  7. There may be problems with the flux capacitor.
  8. Electrical leads are showing stress during testing, and replacing them is possible.
  9. Someone with the email prefix of cmb289c6 is involved with the project.

This simple five-line email raises what may be significant issues in the development of this project. If it was sent early in the project and certain things change going forward, it may never be found. For example:

- Jim leaves.

- Brad transfers to the Alaskan office.

- The project number changes to Project 385-c.

- The project nickname changes to “Snow Leopard”.

- Carla Basher looks at the flux capacitor, but reports her findings to the new project manager, Sally Approval.

In this case, self-collection by custodians will not work, even if we ignore the problems inherent in custodian self-collection. Jim is gone and is never again involved in the project, and the project number and nickname both changed early in the product development process.

How would predictive coding do in this collection? Keyword search? Hash values? Concept search?

None of the typically lauded technology solutions would help with a custodial collection in this (highly oversimplified) situation without some information about the custodians.
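To make the keyword problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two-message corpus, the `search` helper, and the term lists are hypothetical, and only the first message is drawn from the example email above. It shows a search built from the project's later vocabulary missing the early email, and the same search finding it once terms learned from the custodians are added.

```python
import re

# Hypothetical mini-corpus. Message 1 paraphrases the example email;
# message 2 is invented to represent later project correspondence.
emails = [
    {"id": 1, "text": "Problems I have found during my evaluation of project 249, "
                      "preliminarily named Mustang. Talk to Carla Basher about the "
                      "flux capacitor and the stressed leads."},
    {"id": 2, "text": "Findings on Project 385-c (Snow Leopard) sent to Sally Approval."},
]

def search(emails, terms):
    """Return ids of emails containing any term (case-insensitive substring match)."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [e["id"] for e in emails if pattern.search(e["text"])]

# Terms a requesting party might choose knowing only the later project names.
late_terms = ["Project 385-c", "Snow Leopard"]
# Terms that only surface after talking to custodians about the early project.
early_terms = ["project 249", "Mustang", "flux capacitor"]

print(search(emails, late_terms))                # the early email is missed
print(search(emails, late_terms + early_terms))  # both emails are found
```

The search tool is identical in both runs. The only difference is the information about the project's history that went into the term list.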

Judge Shira Scheindlin faced a similar situation in National Day Laborer Organizing Network et al. v. United States Immigration and Customs Enforcement Agency, et al., 2012 U.S. Dist. LEXIS 97863 (SDNY, July 13, 2012). This dispute focused on the plaintiffs’ desire to obtain information from several government agencies and their doubts about how the record custodians’ searches were conducted. The plaintiffs argued that the custodial searches were patently inadequate; the defendants argued that the searches met or exceeded what the law requires.

Preliminarily, in granting some of the motions and denying others, Judge Scheindlin observed:

“Generalizations about the quality of defendants’ searches are difficult because some of the searches appear to have been extremely rigorous, some woefully inadequate, and many simply documented with detail insufficient to permit proper evaluation.”

In ruling on the motions, the court made the following findings and conclusions:

ICE Deputy Director. ICE initially failed to search this custodian but corrected the problem, finding 56 responsive documents.

Homeland Security. This agency conducted no search for records, claiming it had no involvement with the subject matter. Documents produced by an ICE employee demonstrated that HSI played a role.

Office of State, Local, and Tribal Coordination (OSLTC). Two custodians searched their records and ICE argued they were the only custodians most likely to have responsive documents. The court found that ICE should not have limited its search to “…only the files of the two custodians who are ‘most likely’ to have responsive records; it must also search other locations that are reasonably likely to contain records. Because the OSLTC Chief of Staff was circulating memoranda regarding the office’s outreach efforts relating to opt-out, his or her files likely contain responsive records and should have been searched.”

FBI/The Director’s Office. The court ruled that the Director’s Office should have been searched, but only because the plaintiffs produced Senate testimony by the Director related to the matter, along with email evidence obtained through discovery from other custodians that pointed to the Director’s Office as a likely custodian.

Office of General Counsel (OGC). In response to the requests, OGC submitted an affidavit stating that it had no relevant documents. The court found it absurd that “they may satisfy their obligation by submitting a sworn declaration… asserting that he has requested that an office perform a search, has received no response from the office, and therefore assumes that a proper search was performed and no documents were found.”

Several other agencies either did or did not conduct a sufficient search, but it is the “Interoperability Initiatives Unit (IIU)” that is most interesting. The vast majority of responsive documents were found in this office. The search was conducted by allowing individual custodians to review their shared drives and run their own searches of their email. However, apparently everyone forgot about the seven employees who had worked on the matter and had since left the IIU.

The search terms employed by the agencies varied widely, and in some cases the agencies simply relied on employees’ memories. The court found fault with both the custodian-led searches and the agencies’ failure to carefully define uniform search criteria. Sadly for the plaintiffs, the court also had to weigh those inadequacies against the enormous cost of ordering new searches over old territory. Judge Scheindlin ordered the parties to work cooperatively to craft more targeted searches and to pare down the custodians who conducted no searches (relying on their memory) or who conducted inadequate searches. Finally, the court set forth:

“The parties will need to agree on search terms and protocols – and, if necessary, testing to evaluate and refine those terms. If they wish to and can, then they may agree on predictive coding techniques and other more innovative ways to search.”

“Defendant agencies, in turn, will need to cooperate fully with plaintiffs. As in the past, the Court will supervise this process and provide a variety of mechanisms for resolving any disputes. Disagreements will be resolved early, before they lead to inadequate (or wasteful) searches.”
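One hedged illustration of what “testing to evaluate and refine those terms” might look like in practice: the parties agree on a small sample of documents that a human reviewer has already coded as responsive or not, then measure how well a proposed term list performs against it. The sketch below is an assumption of mine, not anything the court prescribed; the `evaluate_terms` helper and the four sample documents are hypothetical.

```python
def evaluate_terms(sample, terms):
    """Score a term list against a hand-coded sample.

    sample: list of (text, is_responsive) pairs coded by a human reviewer.
    Returns (precision, recall) for a case-insensitive substring search.
    """
    lowered = [t.lower() for t in terms]
    tp = fp = fn = 0
    for text, is_responsive in sample:
        hit = any(t in text.lower() for t in lowered)
        if hit and is_responsive:
            tp += 1   # found and responsive
        elif hit:
            fp += 1   # found but not responsive
        elif is_responsive:
            fn += 1   # responsive but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical reviewed sample using the fictitious project from earlier.
sample = [
    ("Project 249 leads are stressed", True),
    ("Mustang flux capacitor voltage review", True),
    ("Snow Leopard schedule update", True),
    ("Lunch order for the Snow Leopard launch party", False),
]

initial = ["snow leopard"]
refined = ["snow leopard", "project 249", "mustang"]

print(evaluate_terms(sample, initial))  # low recall: two responsive documents missed
print(evaluate_terms(sample, refined))  # every responsive document is found
```

A low recall figure on an agreed sample gives both sides something concrete to discuss before anyone pays for a full search.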

What is the answer?

Express mutual cooperation. Be willing to be candid and open about cooperating with the opposing party as much as is possible and in the best interests of your client. Remember the best interests of your client are not served by running up fees and costs to fight over discovery and data production that is reasonable.

Talk. Discuss with opposing counsel what information can be made available about which custodians were involved in a given project. Ask whether there is a project leader who participated in the product development and might provide an overview of the custodians involved. Discuss the possibility of interviewing certain custodians to get a better idea of how to focus discovery requests. Talk about how to handle former employees who are likely data custodians.

Cooperate, communicate, and collaborate. Leave the trial warrior for the courtroom.