March 11, 2025

Machine Unlearning – A Potential Option for Remedy

Farella Braun + Martel LLP

Machine unlearning (MU) is a concept that is likely more interesting to lawyers than to data scientists, as the latter typically focus on collecting, maintaining, and mining as much data as possible—whether in the context of traditional data analytics or Generative AI (GenAI) model training. Unless the data is outrageously inaccurate or biased—so much so that it presents unfixable statistical distortions and pollutes the generative model to the extent that the entire model needs to be retrained—data scientists are generally not thinking about erasing, deleting, or forgetting data. For lawyers, however, erasing, forgetting, and unlearning hold significance for at least two reasons.

First, privacy laws and regulations include a right to erasure or deletion (e.g., GDPR, Art. 17, and CCPA §1798.105), under which individuals may request that a business delete the personal data it holds about them and, in certain circumstances, notify third parties that received the data to do the same. In the context of GenAI, this could require generative models to “forget” the data they have learned.

Second, one of the fundamental principles of legal remedies is restoring something to its original state, known as “restitution.” For example, in cases of real property trespass, restitution involves vacating the land; in larceny, it requires returning stolen property to its rightful owner. Similarly, in a privacy intrusion case involving generative models trained on private data, restitution might require the removal of that data from the model’s knowledge base—a significantly more challenging task than one might assume.

As explained in a DarkReading.com article, unlearning is far more complex than merely deleting or removing training data: “Under GDPR, deleting personal data is like picking carrots out of a salad. But deleting data from a trained LLM is more like trying to retrieve a whole strawberry from a smoothie.” According to recent research, asking a machine to forget training data is even harder than extracting a strawberry from a smoothie. A blender, at least, cannot create new strawberries in its next task, whether or not the old strawberry has been removed. A generative model, however, could still produce outputs that resemble the original training data even after that data has been removed, because knowledge of the deleted data could be revived through user prompts or related tasks. For more on this, see a research paper by leading academics: https://arxiv.org/pdf/2412.06966.

The cleanest method of unlearning for generative models appears to be retraining the model from scratch on a dataset that excludes the data to be forgotten, but this process is prohibitively expensive and nearly impossible for large language models (LLMs). Scientists are actively exploring more feasible methods to achieve unlearning without requiring full retraining. For recent developments, see an article published in Nature: https://www.nature.com/articles/s42256-025-00985-0.
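For readers who want to see the distinction between deleting data and unlearning it, the following is a minimal sketch using a toy two-parameter model rather than an LLM. The function name `train`, the sample records, and the hyperparameters are all hypothetical and chosen only for illustration. It shows that deleting a record from the stored dataset leaves the already-trained parameters untouched, and that only retraining on the remaining data removes the record's influence.

```python
def train(data, epochs=5000, lr=0.01):
    """Fit y ~ w*x + b by plain gradient descent and return (w, b)."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training records; the last one is the record to be "forgotten".
records = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 20.0)]
to_forget = (4.0, 20.0)

# The model is trained once on all records, including the one to forget.
model = train(records)

# Step 1: delete the record from the stored dataset (the "carrots" case).
remaining = [r for r in records if r != to_forget]
# The trained parameters in `model` are unchanged by this deletion;
# they still reflect the forgotten record.

# Step 2: retrain from scratch on the remaining data (exact unlearning).
retrained = train(remaining)

print("trained on all records:   w=%.2f b=%.2f" % model)
print("retrained without record: w=%.2f b=%.2f" % retrained)
```

In this toy case retraining takes a fraction of a second. For an LLM, the equivalent of Step 2 means repeating the entire training run, which is why retraining is described above as prohibitively expensive.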

As computer scientists continue to advance machine unlearning technology, the legal community should closely monitor its development. MU as a remedy has not yet appeared in legal opinions, but the right to erasure under privacy law and the right to restitution in property law could lead to its adoption, particularly if unlearning becomes affordable and practical.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations. Attorney Advertising.