Synthetic Data Presents Opportunities In Privacy Laws

Fox Rothschild LLP

Synthetic data, defined as artificial data having the same statistical properties as real data, has gained much attention recently as a privacy-enhancing technology. If done properly, the artificial data acts as a proxy for the real data, is anonymous and de-identified, and cannot be connected to the original data. Synthetic data can provide badly needed access to data for research, and it also offers a potential remedy to privacy concerns.

Synthetic data is created from original individual data. A synthetic data engine and algorithms process this “real” data, learning correlations, trends, and individual behaviors. As the algorithm learns how customers behave, it generates new artificial individuals with the same correlations, patterns, and trends as the original data set, but no connection to actual individuals. The result, if done properly, is synthetic data that cannot be re-identified.
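To make the generation step concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular synthetic data engine: the "real" records are made-up (age, annual spend) pairs, the function names are hypothetical, and the "model" is a simple two-column Gaussian fit, which is far cruder than what commercial tools use. The point is only that the generator keeps the learned statistics (averages, spread, correlation) and then draws brand-new records from them.

```python
import math
import random

def fit_gaussian(rows):
    """Learn the means, standard deviations, and correlation of two numeric columns."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    var_x = sum((r[0] - mean_x) ** 2 for r in rows) / (n - 1)
    var_y = sum((r[1] - mean_y) ** 2 for r in rows) / (n - 1)
    cov = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows) / (n - 1)
    rho = cov / math.sqrt(var_x * var_y)
    return mean_x, mean_y, math.sqrt(var_x), math.sqrt(var_y), rho

def sample_synthetic(model, count, rng):
    """Draw new artificial records that follow the learned statistics."""
    mean_x, mean_y, sd_x, sd_y, rho = model
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    records = []
    for _ in range(count):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        # Mixing z1 into y is what carries the learned correlation over.
        y = mean_y + sd_y * (rho * z1 + residual * z2)
        records.append((x, y))
    return records

rng = random.Random(0)

# Hypothetical "real" customers: age, and annual spend that rises with age.
real = []
for _ in range(2000):
    age = rng.gauss(40, 10)
    real.append((age, 20 * age + rng.gauss(0, 150)))

model = fit_gaussian(real)                      # learn from the original data
synthetic = sample_synthetic(model, 1000, rng)  # generate artificial individuals

print("correlation in real data:     ", round(model[4], 2))
print("correlation in synthetic data:", round(fit_gaussian(synthetic)[4], 2))
```

The two printed correlations should be close, even though no synthetic record is copied from a real one. A Gaussian fit like this one cannot capture individual behaviors or unusual records, so it should be read as a toy model of the idea rather than a usable generator.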

Original data carries with it use limitations. Original data is considered personal information if it identifies, relates to, is capable of being associated with, or could reasonably be linked with an individual. Original data carries legal obligations to obtain individual consent, implement security controls, and protect privacy rights. Once the synthetic data is produced, however, even broad definitions of personal information seem to exclude synthetic data, as it cannot reasonably be said to be linked with a particular individual. Thus, the use of synthetic data may provide a viable option, with less privacy risk, for entities operating in an over-regulated privacy industry.

Synthetic data, however, is not without limitations, and there are factors that may cause the synthetic data not to be truly anonymized. For example, consider outlier information. If the original data contains unique outliers captured by a synthetic data engine, the synthetic data will unavoidably reproduce these outliers and, depending on how unique the data set is, could identify an individual. In addition, there should be strong privacy provisions in agreements between the business and the vendors who generate the synthetic data. Provisions should address the appropriate handling of the original data, including prohibitions against re-identifying the data, so as not to defeat the benefits of synthetic data.
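The outlier concern described above can be illustrated with a simple check. The sketch below is one possible screening approach, assuming made-up (age, income in thousands) records, hypothetical function names, and arbitrary thresholds; it is not a legal test for re-identification. It flags any real record that sits far from every other real record and also has a synthetic record very close to it.

```python
import math

def nearest_distance(point, others):
    """Distance from one record to its closest neighbor in a list of records."""
    return min(math.dist(point, other) for other in others)

def risky_matches(real, synthetic, isolation, closeness):
    """Pair each isolated real record with any synthetic record that sits close to it."""
    flagged = []
    for i, record in enumerate(real):
        rest = real[:i] + real[i + 1:]
        if nearest_distance(record, rest) < isolation:
            continue  # the record blends in with other real individuals
        for candidate in synthetic:
            if math.dist(record, candidate) <= closeness:
                flagged.append((record, candidate))
    return flagged

# Hypothetical records: (age, income in thousands). The last real record is an outlier.
real = [(34, 52), (36, 55), (35, 50), (33, 53), (71, 940)]
synthetic = [(35, 54), (34, 51), (70, 935)]

# Thresholds are assumptions chosen for this toy data set.
flagged = risky_matches(real, synthetic, isolation=50, closeness=10)
print(flagged)  # [((71, 940), (70, 935))]
```

Here the synthetic record (70, 935) closely mirrors the one high-income individual in the original data, so anyone who knows that person exists could plausibly link the synthetic record back to them. A production check would also need to put the columns on comparable scales before measuring distance, which this sketch skips.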

Synthetic data is a relatively new technology. When done right, with no one-to-one correspondence to the original data, it appears to give companies an avenue to use, share, and perhaps monetize the resulting data.


DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations.

© Fox Rothschild LLP | Attorney Advertising
