July 28, 2023

Don’t Be Lazy: Lessons in Licensing Large Language Models

Ballard Spahr LLP

Llama? Vicuña? Alpaca? You might be asking yourself, “what do these camelids have to do with licensing LLM artificial intelligence?” The answer is, “a lot.”

LLaMA, Vicuña, and Alpaca are the names of three recently developed large language models (LLMs). LLMs are a type of artificial intelligence (AI) that uses deep learning techniques and large data sets to understand, summarize, generate, and predict content (e.g., text). These and other LLMs are the brains behind the generative chatbots showing up in our daily lives, grabbing headlines, and sparking debate about generative artificial intelligence. The LLaMA model was developed by Meta (the parent company of Facebook). Vicuña is the result of a collaboration between UC Berkeley, Stanford University, UC San Diego, and Carnegie Mellon University. And Alpaca was developed by a team at Stanford. LLaMA was released in February 2023; Alpaca was released on March 13, 2023; and Vicuña followed about two weeks later, on March 30, 2023.

LLMs like these are powerful tools and present attractive opportunities for businesses and researchers alike. Potential applications are broad; typical examples include customer service interfaces, content generation (both literary and visual), content editing, and text summarization.

While powerful, these tools present risks. Different models have different technical strengths and weaknesses. For example, the team that developed Vicuña recognizes “it is not good at tasks involving reasoning or mathematics, and it may have limitations in accurately identifying itself or ensuring the factual accuracy of its outputs.” Thus, Vicuña might not be the best choice for a virtual math tutor. Moreover, the underlying architecture matters. Recurrent neural networks (RNNs), an older architecture for language modeling, are well-suited to sequential data but suffer from something called the “vanishing gradient problem” (i.e., as more layers, or more time steps, using certain activation functions are added to a neural network, the gradients of the loss function approach zero, making the network hard to train). Meanwhile, transformers (the “T” in GPT), the architecture behind most current LLMs, handle long-range dependencies well, which helps with translation-style tasks, but are limited in their ability to perform complex compositional reasoning.
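To make the vanishing gradient problem concrete, the following is a minimal sketch in Python. It is not drawn from any of the models discussed here. It assumes a toy "network" made of a chain of single-value sigmoid layers that all share one fixed weight; the function names, the weight, and the input value are illustrative assumptions only.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; its maximum value is 0.25 (at x = 0)."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_through_layers(num_layers: int, weight: float = 1.0, x: float = 0.5) -> float:
    """Gradient of the final output with respect to the input for a chain of
    scalar sigmoid layers (hypothetical toy setup, one shared weight)."""
    grad = 1.0
    activation = x
    for _ in range(num_layers):
        pre_activation = weight * activation
        # Chain rule: each layer multiplies the gradient by weight * sigmoid'(z).
        grad *= weight * sigmoid_grad(pre_activation)
        activation = sigmoid(pre_activation)
    return grad


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        print(f"layers={depth:2d}  gradient={gradient_through_layers(depth):.3e}")
```

Because the sigmoid's derivative never exceeds 0.25, each added layer in this sketch multiplies the gradient by at most a quarter, so the signal available for training the earliest layers shrinks geometrically with depth.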

Beyond understanding such technical differences, businesses must understand that using these tools may create legal liabilities. Decision makers must understand the differences in the terms of use (including licensing terms) under which various LLMs (and/or associated chatbots) are made available. For example, the terms of use of GPT-3 (by OpenAI), LaMDA (by Google), and LLaMA are all different. Some terms may overlap or be similar, but the organizations developing the models may have different objectives or motives and therefore may place different restrictions on the use of the models.

For example, Meta believes that “[b]y sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating [] problems in large language models,” and thus Meta released LLaMA “under a noncommercial license focused on research use cases,” where “[a]ccess to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world.” Thus, generally speaking, LLaMA is available for non-commercial purposes (e.g., research). Similarly, Vicuña, which is a fine-tuned LLaMA model that was trained on approximately 70,000 user-shared conversations from ChatGPT, is also available for non-commercial uses. On the other hand, OpenAI’s GPT terms of service tell users “you can use Content (e.g., the inputs of users and outputs generated by the system) for any purpose, including commercial purposes such as sale or publication…” Meanwhile, the terms of use of Google’s Bard (which relies on the LaMDA model developed by Google), as laid out in the “Generative AI Additional Terms of Service,” make it plain that users “may not use the Services to develop machine learning models or related technology.” As is standard in the industry, any misuse of the service gives the LLM’s owner and operator the right to terminate the user’s access, and likely exposes the user to civil liability under contract law and related theories.

The waters are muddied further when these large corporations start sharing access to LLMs with each other. There are indications that Meta is opening up access to its LLaMA model beyond the world of academia, as reports surface about partnerships with Amazon and Microsoft. For example, Meta’s LLaMA large language model is now available to Microsoft Azure users.

Thus, in selecting LLMs for various purposes, users must weigh the technical advantages and drawbacks of the different models (e.g., network architecture, weights and biases, performance parameters, computing budget, and the actual data on which the model was trained) against the legal liabilities that may arise from using them. Critically, before investing significant time or resources in a product or service that makes use of an LLM, business leaders must review the terms associated with the model to understand the scope of legally permissible use, and take steps to comply with those terms so as to avoid liability.
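One way a product team might keep such a review in front of engineers is a simple pre-launch checklist in code. The Python sketch below is purely illustrative: the `TermsSummary` class, the `TERMS` table, and the `flag_issues` function are hypothetical names, and the table is populated only from the statements quoted in this article, not from the current text of any license. Where the article is silent on a point, the entry is left as `None` and the check tells the reader to consult the actual terms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TermsSummary:
    # None means this article does not address the point.
    commercial_use_allowed: Optional[bool]
    model_development_allowed: Optional[bool]


# Hypothetical summary table, based solely on the terms quoted in the article.
TERMS = {
    "LLaMA": TermsSummary(commercial_use_allowed=False, model_development_allowed=None),
    "Vicuña": TermsSummary(commercial_use_allowed=False, model_development_allowed=None),
    "OpenAI GPT": TermsSummary(commercial_use_allowed=True, model_development_allowed=None),
    "Google Bard": TermsSummary(commercial_use_allowed=None, model_development_allowed=False),
}


def flag_issues(model: str, commercial: bool, develops_models: bool) -> List[str]:
    """Return the points to raise with counsel for a planned use of `model`."""
    terms = TERMS[model]
    checks = [
        ("commercial use", commercial, terms.commercial_use_allowed),
        ("developing machine learning models", develops_models, terms.model_development_allowed),
    ]
    issues = []
    for label, planned, allowed in checks:
        if not planned:
            continue  # The planned use does not touch this restriction.
        if allowed is False:
            issues.append(f"{model}: {label} appears to be restricted")
        elif allowed is None:
            issues.append(f"{model}: {label} not covered by this summary; review the terms")
    return issues


if __name__ == "__main__":
    for issue in flag_issues("LLaMA", commercial=True, develops_models=True):
        print(issue)
```

A checklist like this only flags questions; it does not answer them. Its value is in forcing the planned use (commercial or not, model training or not) to be stated explicitly before development begins.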

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations.

Written by:

Ballard Spahr LLP

Jonathan Hummel

Jonathon Talcott

Published In:

Algorithms; Artificial Intelligence; Google; Innovative Technology; Licenses; Licensing Rules; Machine Learning; Microsoft; Consumer Protection; Privacy; Science, Computers & Technology
