Tips for Paralegals and Litigation Support Professionals – January 2023

Association of Certified E-Discovery Specialists (ACEDS)

January 1, 2023—Searching Inside an Excel Cell for One of Multiple Strings

You can use an Excel formula to check whether any one of multiple values appears in the contents of a cell.

=SUMPRODUCT(--ISNUMBER(SEARCH($D$2:$D$7,A2)))>0

This formula will check inside cell A2 for the values listed in cells D2 to D7.

When you want to run the search against a range of cells, enter the list of strings you’re looking for as an absolute reference using dollar signs, so the list range stays fixed when the formula is copied. In this example, we search the addresses in column A for the state capitals listed in column F, so the list range in the formula becomes $F$2:$F$51. Enter the formula in cell B2 and fill it down to the cells below using CTRL + D. The formula returns ‘TRUE’ when one of the values from cells F2 to F51 appears in the adjacent cell in column A.
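For readers who want to sanity-check the formula's logic outside Excel, here is a minimal Python sketch of the same test. It assumes plain case-insensitive substring matching, which is how SEARCH treats ordinary text, and it does not reproduce SEARCH's wildcard characters (? and *). The function name and the sample data are hypothetical.

```python
def contains_any(cell_text, search_terms):
    """Return True when at least one search term appears in cell_text.

    Mirrors =SUMPRODUCT(--ISNUMBER(SEARCH(terms, cell)))>0, assuming plain
    case-insensitive substring matching (no wildcard support).
    """
    haystack = cell_text.lower()
    # Count the hits first, as SUMPRODUCT does, then compare the count to zero.
    hits = sum(1 for term in search_terms if term.lower() in haystack)
    return hits > 0


# Hypothetical sample data standing in for columns A and F.
addresses = [
    "100 Main St, Albany, NY 12207",
    "42 Ocean Ave, Miami, FL 33139",
]
capitals = ["Albany", "Tallahassee", "Sacramento"]

results = [contains_any(address, capitals) for address in addresses]
print(results)
```

Each entry in results corresponds to one row of column B in the spreadsheet.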

January 7, 2023—Regex to Find Where Consecutive Lines End With the Same Text

Tonight’s tip features a regular expression created by The fourth bird of the Netherlands on Stack Overflow. I posted a question looking for a regular expression that would find text on a line that repeats the prior line, after a time code at the beginning of both lines that might differ. See this example:

(11:12:21) [Tom]: Hello this is Tom. Who is it?

(11:14:08) [Tom]: Hello this is Tom. Who is it?

The goal was to find when consecutive lines were the same after the first 10 characters. The fourth bird came up with a solution that finds when parts of two lines match. In a text editor like Notepad++, run this find and replace search with the search mode set to regular expression:

FIND: ^(\([^][]*\))(.*)(?:\r?\n\([^][]*\)\2)+

REPLACE: $1$2

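^(\([^][]*\)) finds the first part of the string, the time code in parentheses. The caret ^ matches the beginning of the line, \( and \) match the literal parentheses, and [^][]* matches any run of characters other than square brackets. The outer parentheses capture this as group 1.

(.*) captures the rest of the line after the parenthetical time code as group 2.

(?:\r?\n opens a non-capturing group and matches the line break leading to the next line.

\([^][]*\) matches the parenthetical time code at the beginning of that next line, which may differ from the first one.

\2)+ requires the same text captured in group 2 to follow, then closes the group. The + lets the group repeat, so a run of several duplicate lines is matched at once.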

A find and replace in the text editor with this expression removes the duplicate lines, keeping the first line of each run along with its time code.
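The same find and replace can be tried outside a text editor. The following is a minimal Python sketch that reuses the expression unchanged. The function name and the third sample line are hypothetical, and Python writes the replacement as \1\2 rather than $1$2.

```python
import re

# Same pattern as the FIND expression; MULTILINE lets ^ match at each line start.
DUPLICATE_RUN = re.compile(
    r"^(\([^][]*\))(.*)(?:\r?\n\([^][]*\)\2)+", re.MULTILINE
)


def collapse_repeats(text):
    """Keep the first line of each run of lines that repeat after the time code."""
    # \1\2 is Python's spelling of the $1$2 replacement used in the editor.
    return DUPLICATE_RUN.sub(r"\1\2", text)


sample = (
    "(11:12:21) [Tom]: Hello this is Tom. Who is it?\n"
    "(11:14:08) [Tom]: Hello this is Tom. Who is it?\n"
    "(11:15:30) [Ann]: It's Ann.\n"
)
cleaned = collapse_repeats(sample)
print(cleaned)
```

The 11:14:08 line is dropped because everything after its time code repeats the line before it, while the 11:15:30 line is left alone.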

January 14, 2023—Removing Hyperlinks From PDFs for Court Filings

Some courts will not allow PDFs that contain hyperlinks to be filed electronically. This can become a real problem if you have PDFs with many pages and a URL has been added to the footer of each page, as is common with PDFs created from SEC filings found on EDGAR.

Printing out the PDF as a hard copy and scanning it is too much of a hassle, and there may not be time to go back and recreate the PDFs without the footers containing the URLs. URLs could also be scattered in the body of the document. Removing links in EDIT mode may also be too time consuming.

Flattening a PDF by writing it to a new PDF will not necessarily remove hyperlinks in the PDF, and hyperlinks can still be present when a conversion to PDF/A format is performed.

The Acrobat tool that removes web links will not take out all hyperlinked web addresses.

Going under Preferences in Acrobat and unselecting ‘Create links from URLs’ in the Documents category will only deactivate hyperlinks for the current user of the PDF – not recipients you forward it to.

Instead, try converting the PDF to a multipage TIFF image using an editor like Foxit, which has an option to export a PDF in that format. (Acrobat will not convert a PDF to the multipage TIFF format.) When saving the file, click on Settings and set the conversion type to ‘Single File’.

You can then open the TIFF in Acrobat, which will convert it back to a PDF. The formatting will be retained but the links will no longer be active.
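Before filing, it can help to spot-check whether any link actions survive in the finished file. The following Python sketch is a rough heuristic, not a PDF parser: it scans the raw bytes, plus any streams it can inflate with zlib, for /URI link actions written as literal strings. It will miss encrypted files, streams using other filters, and URLs that a viewer turns into links on its own from page text, so an empty result is not proof that a file is link free. The function name and file name are hypothetical.

```python
import re
import zlib

STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
URI_TARGET = re.compile(rb"/URI\s*\(([^)]*)\)")


def find_uri_targets(pdf_bytes):
    """Return URL strings from /URI link actions in raw or Flate-compressed data."""
    chunks = [pdf_bytes]
    for match in STREAM.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the line break that precedes "endstream".
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a Flate stream (image data, other filters, etc.)
    targets = []
    for chunk in chunks:
        for found in URI_TARGET.finditer(chunk):
            targets.append(found.group(1).decode("latin-1"))
    return targets


# Usage (hypothetical file name):
# with open("filing.pdf", "rb") as handle:
#     print(find_uri_targets(handle.read()))
```

Running it on both the original PDF and the PDF rebuilt from the TIFF gives a quick before-and-after comparison.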

January 21, 2023—MS Exchange 10 Email Deletion Limit

Microsoft Exchange limits to 10 the number of emails that a single command can delete from one mailbox at a time. While it’s possible to delete one email from thousands of mailboxes, the same command used to search for and delete messages across user accounts has a built-in limit on how many emails it will delete from each mailbox at once.
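To see what the limit means in practice, here is a small Python sketch of the arithmetic. It assumes each purge run removes the full 10 items from the mailbox until none are left. The function name and the item count of 137 are hypothetical.

```python
import math

PURGE_LIMIT_PER_RUN = 10  # per-mailbox limit described above


def purge_runs_needed(matching_items, limit=PURGE_LIMIT_PER_RUN):
    """Number of purge runs needed to clear matching_items from one mailbox."""
    if matching_items < 0:
        raise ValueError("matching_items must be non-negative")
    # Round up: a final partial batch still needs its own run.
    return math.ceil(matching_items / limit)


runs = purge_runs_needed(137)
print(runs)  # 14
```

A mailbox holding 137 responsive emails would need 14 separate runs, which is why remediation work quickly becomes impractical to do by hand.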

New-ComplianceSearchAction is a cmdlet (a PowerShell command) that an admin can run on Exchange against the results of a saved compliance search, which can look through the subject line, body, and metadata fields for certain key terms.

The cmdlet is designed to respond to cyber security breaches (such as purging a phishing email), not to help businesses comply with the need to remediate data pursuant to a protective order.

See confirmation on this limitation for the purge switch here: https://learn.microsoft.com/en-us/powershell/module/exchange/new-compliancesearchaction?view=exchange-ps

The PowerShell script to delete emails will specify a named search:

New-ComplianceSearchAction -SearchName "Remove Phishing Message" -Purge -PurgeType SoftDelete

Not every Exchange admin can run this cmdlet. The admin needs to have Discovery Management rights.

The cmdlet transfers emails to the Deletions folder of a user account’s Recoverable Items. (When an Outlook user holds down the SHIFT key and presses DEL, the message is not gone forever; it goes to Recoverable Items.) On Exchange, users’ Outlook data is divided between Recoverable Items, Interpersonal Messaging (IPM – all of the data visible to the user), and non-IPM data (operational data). Recoverable Items contains additional folders:

  • Purges – items hard deleted when a litigation hold is in place. Emails sent here won’t be deleted until the retention period ends, and even then only once the mailbox has been processed by the Managed Folder Assistant, which can be set to run once every 1 to 7 days.
  • Audits – logs activity in the account.
  • DiscoveryHolds – items subject to an Office 365 litigation hold that are hard deleted.
  • Versions – this folder will retain multiple versions of Outlook items that have been modified.

Note that when rerun, the search will still include messages in the Recoverable Items folder in its results, so the total count in the results will not change as successive searches are run.

Microsoft has posted a PowerShell script which can be used to delete multiple batches of 10 emails automatically:

https://answers.microsoft.com/en-us/msoffice/forum/all/delete-more-than-10-items-from-a-mailbox-using/f28efa60-3766-4f50-af2d-e1f9be588931

. . . but it notes that this is not a supported script, and it cannot be run on multiple mailboxes.

Exchange will retain deleted items by default for up to 14 days but this period can be increased to 30 days.
