# Determining Appropriate Sample Size to Test Your Search: eDiscovery Throwback Thursdays

CloudNine

If you missed it last week, we started a new series – Throwback Thursdays – here on the blog, where we are revisiting some of the eDiscovery best practice posts we have covered over the years and discussing whether any of those recommended best practices have changed since we originally covered them.

This post was originally published on April 1, 2011 – no fooling!  It was part of a three-post series that we will revisit over the next three weeks – we have continued to touch on this topic over the years, including our webcast just last month.  One of our best!

One part of searching best practices is to test your search results (both the result set and the files not retrieved) to determine whether the search you performed is effective at maximizing both precision and recall to the extent possible, so that you retrieve as many responsive files as possible without having to review too many non-responsive files.  One question I often get is: how many files do you need to review to test the search?

If you remember from statistics class in high school or college, statistical sampling is choosing a portion of the results population at random for inspection to gather information about the population as a whole.  This saves considerable time, effort and cost over reviewing every item in the results population and enables you to obtain a “confidence level” that the characteristics of your sample reflect those of the population.  Statistical sampling is used for everything from exit polls that predict elections to marketing surveys that gauge brand popularity among customers, and it is a generally accepted method of drawing conclusions about an overall results population.  You can sample a small portion of a large set to obtain a 95% or 99% confidence level in your findings (with a margin of error, of course).

So, does that mean you have to find your old statistics book and dust off your calculator or (gasp!) slide rule?  Thankfully, no.

There are several sites that provide sample size calculators to help you determine an appropriate sample size, including this one.  Many eDiscovery platforms do so as well.  You’ll simply need to identify a desired confidence level (typically 95% to 99%), an acceptable margin of error (typically 5% or less) and the population size.

So, if you perform a search that retrieves 100,000 files and you want a sample size that provides a 99% confidence level with a margin of error of 5%, you’ll need to review 660 of the retrieved files to achieve that level of confidence in your sample (only 383 files if a 95% confidence level will do).  Here’s an illustration of that using the site referenced above.

If 1,000,000 files were not retrieved, you would only need to review 664 of the not retrieved files to achieve that same level of confidence (99%, with a 5% margin of error) in your sample – only four more files to review than the previous sample, even though the collection is 900,000 files larger!  Don’t believe me?  See for yourself here.
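If you would rather check those numbers yourself than take a calculator's word for it, here is a minimal Python sketch. It assumes the calculator uses the standard sample size formula for a proportion with a finite population correction and a worst-case expected proportion of 50%. That is an assumption on my part, as is the function name `sample_size`, but it does reproduce the three figures quoted above.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Estimate how many files to review for a given confidence level.

    Assumes the standard formula for estimating a proportion, with a
    finite population correction. p=0.5 is the most conservative choice
    (it yields the largest sample).
    """
    # z-score for a two-sided confidence interval (about 2.576 for 99%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinitely large population
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction shrinks the sample for smaller sets
    n = n0 / (1 + (n0 - 1) / population)
    # Round up: you cannot review a fraction of a file
    return math.ceil(n)


print(sample_size(100_000, confidence=0.99))    # 660
print(sample_size(100_000, confidence=0.95))    # 383
print(sample_size(1_000_000, confidence=0.99))  # 664
```

Tightening the margin of error moves the number far more than growing the population does, so it is worth trying a few `margin` values before you settle on a review plan.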

As you can see, the sample size doesn’t need to increase much when the population gets really large, so you can review a relatively small subset to understand your collection and defend your search methodology to the court.
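For the curious, here is one way to see why the sample size levels off. This assumes the usual sample size formula for a proportion with a finite population correction (an assumption, since sample size calculators do not always state their formula), where $z$ is the z-score for the confidence level, $p$ is the expected proportion (0.5 is the conservative choice), $e$ is the margin of error and $N$ is the population size:

$$
n_0 = \frac{z^2 \, p(1-p)}{e^2}, \qquad n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
$$

As $N$ grows, the fraction $(n_0 - 1)/N$ shrinks toward zero, so $n$ approaches $n_0$ and never exceeds it. At a 99% confidence level ($z \approx 2.576$) with a 5% margin of error, $n_0 \approx 663.5$, so under this formula no collection, however large, would call for more than 664 files at those settings.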

Next week, we will talk about how to randomly select the files to review for your sample.

So, what do you think?  Do you use sampling to test your search results?

