Expectation Versus Reality: Collecting Data From Google Workspace



Our expectations don’t always match reality. You probably have your own examples: you’ve seen a gorgeous photo of a meal or a DIY project and thought, hey, I could do that! Sometimes it works out, and other times—well, you end up making a meme about expectation versus reality.

The same thing can happen with collecting data from Google Workspace. Businesses that use Google Workspace to manage and collaborate on their documents appreciate it as a convenient and helpful solution, especially for remote teams who need to work on documents together despite being in dispersed physical locations. But suppose those organizations haven’t yet incorporated the data from Google Workspace into their broader information governance pipeline. In that case, they may have unrealistic expectations about how challenging it will be to search, preserve, and manage that data. 

So, what happens when the legal team gets notice of a potential litigation matter—and learns that all of the responsive data is likely to be in Google Workspace? If you're not prepared, you may be met with a very costly and time-consuming ediscovery experience.

What You Expect: Collecting Data From Google Workspace

It's no secret that data volumes across all cloud providers have exploded in recent years. According to The Seagate Rethink Data Survey by IDC -2020, corporate data will grow by more than 42% by 2022, with a lot of that data stored on cloud services such as Google Workspace. Given that Google provides enterprise-grade clients with terabytes of data storage, it’s not uncommon for companies to let huge data volumes accumulate.

For example, let’s say you need to collect information from Jason in marketing about the campaign pitch for Product X. So, while you know you’re going to have a lot of data to sort through in Google Workspace, you expect that you can use Google Vault to help sift through that data. Never mind that you’re not familiar with the overall data structure Jason uses on his Drive—surely you can navigate around his folders and pick and choose what you want to collect, right? You’re just going to find the relevant marketing files for Product X and download those files.

Then reality strikes. 

What Actually Happens: Over Collecting a Mess of Disorganized Data

What you weren’t aware of is that Jason has a lot of movies, sound clips, and other files that he uses as inspiration and source material for designing new marketing campaigns, which means his Drive folder is massive.

Additionally, as you’re trying to navigate to the files you need, you realize that you don’t know where anything is in Jason’s Drive, and Google Vault doesn’t provide granular insight into the data structure. Try as you might, you realize you simply cannot get to only the content you need. Your options seem to be limited to collecting Jason’s Drive or not collecting it. 

Try as you might, you realize you simply cannot get to only the content you need. Your options seem to be limited to collecting Jason’s Drive or not collecting it. 

This scenario is common and creates a burdensome overcollection workflow where you just download everything from every custodian. However, given that the cost of ediscovery is directly related to the volume of data involved, and the fact that privacy policy may dictate the limits of what can be copied and stored,  it’s important to avoid collecting unrelated files. 

Perhaps you can use an external tool to do a more detailed search or eliminate that irrelevant source material from your download? Even then, though, you realize that this isn’t a great solution—you’re still going to have to collect Jason’s Drive data, transfer it into a processing engine for indexing, and spend the time and money to cull the data down to what you needed in the first place. In addition, if this matter involves multiple custodians, you may quickly realize that you don’t have the resources to collect every single file and then sort through them.

[View source.]

Written by:


Hanzo on:

Reporters on Deadline

"My best business intelligence, in one easy email…"

Your first step to building a free, personalized, morning email brief covering pertinent authors and topics on JD Supra:
*By using the service, you signify your acceptance of JD Supra's Privacy Policy.
Custom Email Digest
- hide
- hide

This website uses cookies to improve user experience, track anonymous site usage, store authorization tokens and permit sharing on social media networks. By continuing to browse this website you accept the use of cookies. Click here to read more about how we use cookies.