A Brief History of Ediscovery—and a Glimpse of What’s to Come With Collaborative Data



In the beginning, there was discovery.

Just plain old discovery, not the “e” variety. Lawyers shuffled reams of pages and hefted boxes of paper and went nearly blind scanning those individual sheets during document review. It was slow, it was labor-intensive, and it was not very environmentally sound, but it was straightforward and easy—if not to manage, then at least to conceptualize.

Of course, all things must come to an end. Eventually, people started corresponding more over this newfangled thing called “email” and less over traditional mail. Email was dramatically faster and simpler, which led us to disparagingly call the original mail “snail mail.”

But while email was faster, it wasn’t immediately obvious how to technically wrangle it or fit it into established discovery processes. In those early days, lawyers bypassed the issue by agreeing not to include email or other electronic data in discovery. For a while, email was essentially a black box in the world of discovery.

Things have obviously changed since then—and there’s an even bigger change lurking on the horizon. Let’s take a look at where we’ve been, where we’re going, and what we’re going to need to do to get there.


When legal and IT professionals first started incorporating email into discovery—putting the “e” in ediscovery, as it were—there was a tendency to treat email like paper. This was easy enough to do, since email is an electronic version of a classic letter. While it’s somewhat more complex and sophisticated than paper and pen, email is still, in many regards, paper-like.

With email, it’s easy to see who is involved in a conversation. If an individual sends or receives an email, that person is a custodian of that record.

It’s easy—again, at least conceptually, if not technically—to put the email accounts of those identified custodians on a legal hold.

It’s straightforward to discern the context of a conversation within an email chain. Emails are self-contained conversations; scrolling through the history of the message reveals its full context. There may be internal shorthand and gaps in information, but the gist, generally speaking, is all there.

Because an email chain carries its own context, it’s easy to scope discovery. Messages that include the search keywords are likely relevant; those that don’t likely aren’t.

None of this is to say that email was easy for ediscovery practitioners to figure out. We had to think about discovery and discoverable information in new ways. We had to learn to work with electronic data and master aspects of technology that we’d never imagined. We had to figure out how to use metadata and load files. We had to deal with attachments, parse the family relationships between emails and their attachments, and learn what “threading” was all about (hint: it has nothing to do with sewing).

But, as they say, hindsight is 20/20. Now that we’re staring down the barrel of collaborative data, the challenges we faced with email appear almost quaint. We’re no longer talking about a form of communication that is a close cousin to paper documents; collaborative data is an entirely new, and rather exotic, beast.


We’re talking here about the chats, messages, comments, reactions, attachments, and other data associated with communication tools like Slack, project management tools like Asana, ticket and issue management tools like Jira, and even document management platforms like Google Docs. If multiple people can use a tool, app, or platform to instantly communicate with team members regardless of each other’s physical location, that technology probably falls within what we consider the category of collaboration apps—and the data that these tools generate has barely any relationship to email, much less to the familiar world of paper.

Wait, you say: the mere fact that this data is being generated doesn’t make it important. After all, people could be using those collaboration apps to chat about what they’re having for lunch or what they’re doing after work. That’s probably not going to be relevant to any litigation matter, so it will never fall within the scope of discovery. Who cares what’s happening on Slack or any other highfalutin collaboration tool?

It’s true that trivial, non-work-related conversations do happen on Slack and other collaboration tools. But it’s even more true that work conversations—and real work—happen through collaborative technology. Lots and lots of work conversations, as it turns out. People are tracking their projects, commenting on each other’s work product, sharing customer concerns, answering questions, debugging code, reporting expenses, sharing research, and more.

You’d better believe those conversations are relevant to all kinds of litigation matters. Allegations of workplace discrimination? Proof of knowledge about a faulty or dangerous product design? Awareness of a security flaw? You name it, it’s probably been discussed on a collaboration app.

Which means that data is discoverable. Now we just have to figure out how to work with it.


Conversation streams within collaboration apps are not structured or self-contained in the way that emails are. Instead, they’re more chaotic and fluid. People may join a channel or leave a channel; edit or delete comments; use words to communicate or pin on an emoji reaction instead. The whole flow of conversation is different, with short, one-word responses that may not make any sense in isolation. A single line or snippet of conversation, standing alone, is likely to be devoid of meaning or context.

Who, then, are the custodians of a particular message? When a message is publicly posted on what amounts to a digital bulletin board, how can you tell who’s seen it and who hasn’t? When potential custodians extend well beyond those individuals who wrote or reacted to a particular message, how can you identify them?
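One rough starting point, sketched here in Python, is to treat everyone who belonged to a channel when a message was posted as a potential custodian. The `Membership` record and the `potential_custodians` function are hypothetical names invented for this illustration, not part of any collaboration app's API, and the sample data is made up. Membership also says nothing about who actually read the message, and anyone who joined later and scrolled back would be missed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Membership:
    """One stretch of time a user spent in a channel (hypothetical record)."""
    user: str
    joined: datetime
    left: Optional[datetime] = None  # None means the user is still a member


def potential_custodians(memberships, posted_at):
    """Return the users who were channel members when the message was posted."""
    return sorted({
        m.user
        for m in memberships
        if m.joined <= posted_at and (m.left is None or posted_at < m.left)
    })


# Made-up membership history for a single channel.
memberships = [
    Membership("alice", datetime(2020, 1, 6)),
    Membership("bob", datetime(2020, 1, 6), left=datetime(2020, 3, 1)),
    Membership("carol", datetime(2020, 4, 15)),
]

print(potential_custodians(memberships, datetime(2020, 2, 10)))
# prints ['alice', 'bob']
```

Even this simple version shows how far the custodian list can extend beyond the people who wrote or reacted to a message.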

In the absence of clear-cut custodians, how can you implement a legal hold on collaborative data? Do you have to keep everything forever? Can you? Where does that data even reside, and how long will you have access to it?

Then there are issues of context and scope. How many messages—or screens of information—do you need to capture to understand the context of a one-word response? Is a one-line message that just says “yeah” relevant to the scope of discovery in a matter? How can you tell which conversation that response is related to?
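To make the context problem concrete, here is a minimal Python sketch that captures a window of neighboring messages around every keyword hit. The `Message` record, the `context_windows` function, and the sample channel are all assumptions made up for this illustration. The window sizes are arbitrary, and choosing them well is exactly the open question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A single channel message (hypothetical record)."""
    ts: int    # posting time, in seconds from an arbitrary start
    user: str
    text: str


def context_windows(messages, keyword, before=2, after=2):
    """For each message containing the keyword, return it with its neighbors."""
    ordered = sorted(messages, key=lambda m: m.ts)
    windows = []
    for i, msg in enumerate(ordered):
        if keyword.lower() in msg.text.lower():
            start = max(0, i - before)
            windows.append(ordered[start:i + after + 1])
    return windows


# Made-up conversation: the one-word replies never match a keyword search.
channel = [
    Message(1, "alice", "Lunch anyone?"),
    Message(2, "bob", "Did QA sign off on the brake firmware?"),
    Message(3, "carol", "yeah"),
    Message(4, "bob", "Even with the open defect?"),
    Message(5, "carol", "yeah"),
    Message(6, "alice", "Tacos it is"),
]

for window in context_windows(channel, "defect", before=2, after=1):
    for m in window:
        print(m.ts, m.user, m.text)
```

A search for “defect” matches only one message, yet the window around it pulls in both “yeah” replies, which are arguably the most important lines in the exchange.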

Nor is the information within collaboration apps limited to words. What does it mean when a user adds the “eyes” emoji reaction to a post? Or when someone responds to a Slack message with a check mark? How about a raccoon? Odds are that each organization has its own Slack shorthand, but emojis and reactions are forms of communication that haven’t existed before and have no paper counterparts. That means we’re going to have to reinvent the way we think about data, communication, and even language.


It’s time for ediscovery professionals to take a step back and think about data in a new way, as we did when email was new. We have to revisit our tools, processes, and approaches to ensure that they will work for new data—and if they won’t, we have to devise new means of handling those data types.

We have to continually reevaluate how we define custodians, context, and the scope of discovery.

We have to contemplate what it means for data to be “on hold.” Where is that data stored? How accessible is it? When we manage to get to it, how readily searchable is it?

Last but not least, we must be prepared for the continuing evolution of our communication style and the changes in data types that will inevitably follow. We can’t risk thinking of these new data types as “basically like email”; they are far less like email than email was like paper.

Written by: Hanzo