“And That’s Where Things Got Weird:” Pondering Using AI to Interpret Legal Documents

Harris Beach Murtha PLLC

In an otherwise unassuming case concerning an insurance company’s obligation to cover an insured, Judge Kevin C. Newsom of the Eleventh Circuit engaged in a fascinating discussion of the merits of using large language model (“LLM”)-based generative AI programs like ChatGPT to interpret and define terms. After weighing the positives and negatives of such usage, Judge Newsom provided general guidance for the potential use of generative AI in this manner.

The case in question is Snell v. United Specialty Insurance Company, 102 F.4th 1208 (11th Cir. 2024), and one of the issues on appeal was how the term “landscaping” in an insurance contract should be interpreted. More specifically, an insured landscaper brought an action against his insurance company after it declined to defend and indemnify him. The landscaper had installed, of all things, an in-ground trampoline. Someone was hurt using the trampoline and sued the landscaper. The insurance company denied his claim on the grounds that installing an in-ground trampoline did not fall within the definition of “landscaping” set forth in his insurance policy. The lower court focused on whether the term “landscaping” included installing an in-ground trampoline and ultimately concluded that it did not. On appeal, the Eleventh Circuit affirmed, but did so on grounds unrelated to the interpretation of “landscaping.”

Judge Newsom fully joined in the majority opinion, but took the opportunity to draft a concurring opinion investigating a unique question: could generative AI assist in the judicial process of interpreting the ordinary meaning of terms? In a well-written opinion with dashes of humor and humility in equal measure, Judge Newsom examined the pros and cons.

But before doing so, he put OpenAI’s ChatGPT and Google’s Bard (now Gemini) through their paces. He asked each the definitional question, “What is the ordinary meaning of ‘landscaping’?”, and the ultimate question, “Is installing an in-ground trampoline ‘landscaping’?” Each program returned answers, but the answers themselves are not what makes the opinion interesting. What is interesting is Judge Newsom’s examination of the pros and cons of using LLMs to define or interpret terms, and his suggestions for how to improve that use.

Pros of Using LLMs for Finding Ordinary Meanings

Judge Newsom referenced the following pros of using LLMs:

  1. LLMs are typically trained on massive amounts of text culled from the internet and other sources, including things like Wikipedia entries, books, articles, and online comments. Judge Newsom’s premise was that “[b]ecause they cast their nets so widely, LLMs can provide useful statistical predictions about how, in the main, ordinary people ordinarily use words and phrases in ordinary life.”
  2. LLMs are accessible to ordinary citizens unlike, for example, Westlaw and Lexis.
  3. LLM training is “relatively transparent,” as compared to, for example, dictionaries. While we often don’t know what datasets were used to train particular LLMs, Judge Newsom notes LLMs are generally trained using “tons and tons of internet data.”
  4. LLMs are superior to surveys and corpus linguistics both from a practical standpoint and for purposes of avoiding input bias.

Cons of Using LLMs for Finding Ordinary Meanings

Judge Newsom also catalogued several disadvantages of using LLMs for legal interpretation:

  1. LLMs are known to hallucinate, i.e., to generate incorrect answers, sometimes with great confidence. Judge Newsom notes, though, that this problem isn’t limited to generative AI: “Flesh-and-blood lawyers hallucinate too.”
  2. LLMs are not trained on offline speech. Consequently, LLMs cannot truly capture a comprehensive spectrum of everyday usage of words. “People living in poorer communities (perhaps disproportionately minorities and those in rural areas) are less likely to have ready internet access and thus may be less likely to contribute to the sources from which LLMs draw in crafting their responses to inquiries.”
  3. LLMs could theoretically be influenced by people seeking to manipulate results.
  4. LLMs could lead us into dystopia. Judge Newsom ponders whether overreliance on LLMs could put us on a “dystopian path toward ‘robo judges’ algorithmically resolv[ing] human disputes.” He is categorically opposed to the mechanistic application of LLM results.

Judge Newsom’s Recommendations for the Use of LLMs

As part of his final analysis, Judge Newsom developed some recommendations for the potential use of LLMs in interpretative work:

  1. The LLM’s “highest and best use” in the context of judicial opinions is determining the ordinary meaning of words, as opposed to asking the ultimate question in the case.
  2. Don’t query an LLM just once. Use a range of prompts and different LLM models to get more “robust” results.
  3. Ask the LLM to state a confidence level along with its answer: for example, “Please provide the ordinary meaning of ‘landscaping’ and indicate the level of confidence you have in the answer.” (A sketch of recommendations 2 and 3 appears after this list.)
  4. Judge Newsom is an originalist, and under that framework terms must be given the ordinary meaning they had at the time they were written. Consequently, he pondered whether LLMs could be instructed to consider only training materials written as of a certain date.
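
To make recommendations 2 and 3 concrete, here is a minimal sketch of what such a workflow might look like in Python, assuming the OpenAI Python SDK; the model names, prompt wordings, and confidence instruction are illustrative assumptions, not anything Judge Newsom actually ran.

```python
# A hypothetical sketch of recommendations 2 and 3, assuming the
# OpenAI Python SDK ("pip install openai") and an OPENAI_API_KEY in
# the environment. Model names and prompt wordings are illustrative.
from openai import OpenAI

client = OpenAI()

MODELS = ["gpt-4o", "gpt-4o-mini"]  # recommendation 2: more than one model

PROMPTS = [  # recommendation 2: more than one phrasing of the question
    'What is the ordinary meaning of "landscaping"?',
    'How would an ordinary English speaker define "landscaping"?',
    'In everyday speech, what does "landscaping" mean?',
]

# Recommendation 3: ask the model to state a confidence level too.
CONFIDENCE_ASK = " Also state, from 0 to 100, how confident you are in your answer."


def collect_responses() -> list[dict]:
    """Query every model with every prompt variant and gather the answers."""
    results = []
    for model in MODELS:
        for prompt in PROMPTS:
            reply = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt + CONFIDENCE_ASK}],
            )
            results.append(
                {
                    "model": model,
                    "prompt": prompt,
                    "answer": reply.choices[0].message.content,
                }
            )
    return results
```

Weighing the resulting spread of answers, and the models’ self-reported confidence, would remain a human task, consistent with Judge Newsom’s opposition to any mechanistic application of LLM output.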

Ultimately, Judge Newsom stopped short of a wholehearted endorsement of LLMs for legal interpretation, but concluded that LLMs “have promise,” and that “it no longer strikes [him] as ridiculous to think that an LLM like ChatGPT might have something useful to say about the common, everyday meaning of the words and phrases used in legal texts.”

Additional Areas for Exploration

One additional area where LLMs could be of potential use in the future, one not discussed by Judge Newsom, is patent litigation, namely Markman hearings. For the uninitiated, before an infringement determination can be made, the court first holds a Markman hearing to interpret and define certain terms used in the patent claims. Markman hearings are important because the question of infringement can often turn on how the claim terms are interpreted.

Claim terms in patents are interpreted from the perspective of a person of ordinary skill in the art (“POSITA”): for example, a person with a degree in electrical engineering and years of experience working in the field of medical devices. Now imagine a hypothetical future in which a party or a judge could ask an LLM customized to have the education and experience of the POSITA at issue how a particular claim term would be interpreted. As Judge Newsom points out, this would be just one tool available for purposes of interpretation, not the be-all and end-all; but it’s an exciting notion.
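
Purely as a thought experiment, one way such a POSITA-customized query might be approximated today is with a persona-style system prompt. The sketch below again assumes the OpenAI Python SDK; the persona description, model choice, and claim language are invented for illustration and are not drawn from the opinion or any actual case.

```python
# A purely speculative sketch: approximating a POSITA with a
# persona-style system prompt. The persona, model choice, and claim
# language are invented examples, not drawn from any actual case.
from openai import OpenAI

client = OpenAI()

POSITA_PERSONA = (
    "You are a person of ordinary skill in the art: an electrical "
    "engineer with a bachelor's degree and five years of experience "
    "designing implantable medical devices."
)


def construe_term(term: str, claim_language: str, filing_date: str) -> str:
    """Ask the persona-primed model how it would read a claim term."""
    question = (
        f'In the claim language "{claim_language}", how would you have '
        f'understood the term "{term}" as of {filing_date}?'
    )
    reply = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": POSITA_PERSONA},
            {"role": "user", "content": question},
        ],
    )
    return reply.choices[0].message.content
```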

We’ve only begun to scratch the surface of how LLMs and generative AI could be used in the legal field. Judge Newsom’s thoughtful analysis in Snell provides the beginnings of a framework.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations. Attorney Advertising.

© Harris Beach Murtha PLLC
