While artificial intelligence promises to be useful in responding to the coronavirus (COVID-19) pandemic, companies should be aware of potential copyright considerations.
The sudden and severe impact of COVID-19 has created many immediate needs for the medical industry, federal and state governments, and the public. Artificial intelligence (AI)—broadly speaking, computing technologies that perform tasks that involve human-like perception and learning based on analysis of large amounts of data (i.e., training data)—could be a powerful tool in addressing these needs and fighting this global pandemic. For example, there have been a number of news stories about whether AI may prove to be helpful in developing vaccines and finding treatments for COVID-19. In the rush to use AI, however, institutions should be aware that using training data or other inputs that constitute third-party intellectual property raises potential copyright concerns.
Copyright protects original works of authorship fixed in a tangible medium of expression. “Original,” under US copyright law, currently means sufficiently creative, independent, human authorship. Accordingly, copyright does not protect facts or discoveries—but it does protect nonfiction works describing facts and discoveries, such as medical research articles and, in certain instances, related research datasets.
The assumption (which may prove to be true) that the use of AI will lead to significant public health benefits does not create a per se exception allowing the use of copyrightable material in a machine-learning database without permission. The strength of an infringement claim in this context may turn, in part, on the availability or lack thereof of a licensing scheme to use the relevant materials for that purpose.
EXCEPTIONS TO COPYRIGHT PROTECTION ARE EVALUATED CASE BY CASE
By statute, copyright law does excuse certain uses that would otherwise be infringing because they are deemed to be “fair use.” Fair use is a defense to infringement, meaning that whether a particular unlicensed training use constitutes copyright infringement or is excused pursuant to fair use has to be determined on a case-by-case basis in the courts—unless Congress acts to create a specific exemption.
Fair use has often been described as “one of the most unsettled areas of the law . . . [a] doctrine . . . ‘so flexible as virtually to defy definition.’” Or, in the words of the US Supreme Court, “the extent of permissible copying varies with the purpose and character of the use.” There is no bright-line rule to apply to determine when AI training data needs to be licensed.
One relevant line of cases from the non-AI context examined the wholesale, unlicensed digitization of books to provide full-text search functionality. In those cases, it was held that where “it [is] reasonably necessary for the [user] to make use of the entirety of the works in order to enable the full-text search function, . . . the copying [is not] excessive.” These courts reasoned that the search function is “a transformative use, which augments public knowledge by making available information about [copyrighted works] without providing the public with a substantial substitute for the original works or derivatives of them.” Parties may argue that such cases support finding that unlicensed use of various types of copyrightable materials to train AI is similarly transformative.
At the same time, courts have held that “a particular unauthorized use should be considered ‘more fair’ when there is no ready market or means to pay for the use, while such an unauthorized use should be considered ‘less fair’ when there is a ready market or means to pay for the use.” Some content owners have taken steps to standardize their content into more useable formats to license for AI training. The development of this licensing market may impact a fair use analysis—particularly given that courts place significant weight on “(1) the extent of the market harm caused by the particular actions of the alleged infringer, and (2) whether unrestricted and widespread conduct of the sort engaged in by the [alleged infringer] would result in a substantially adverse impact on the potential market [for the original work].”
In some jurisdictions, copyright law has been amended to expressly allow research uses of lawfully accessed copyrighted materials to train AI systems. For example, Japan recently amended its copyright law to expressly permit use of copyrightable content as AI training data, provided the intended use is nonconsumptive, such as use for “comparison, classification, or other statistical analysis of language, sound, or image data.” In 2019, the European Union also adopted the Directive on Copyright and Related Rights in the Digital Single Market, which includes a mandate for member countries to implement limitations or exemptions authorizing (1) “scientific research, text and data mining of works or other subject matter to which [research organizations] have lawful access” and (2) “reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining.” No such statutory right exists in the United States.
CURRENT US GOVERNMENT REVIEWS OF AI AND COPYRIGHT
Given these and many other important open questions, in late 2019, the US Patent and Trademark Office (USPTO) sought comments on 13 questions related to the impact of AI on copyright, trademarks, trade secret, and other IP rights. Several of these questions focused on the core issues relevant to AI and copyright:
- Does current law adequately address the legality (e.g., the fair use doctrine) of an AI using large volumes of copyrighted material in its operation?
- How does AI impact the need to protect databases, and are current laws adequate?
- What can we learn from the legal systems of other countries?
The USPTO received dozens of written comments from individuals and organizations. While there were many competing views on these topics, some of the most disparate opinions related to whether AI’s use of large volumes of copyrighted materials as training data should always, sometimes, or never be considered fair use. Whether and when the output of AI tools may infringe any input is another outstanding question that almost certainly will be addressed on a case-by-case basis. Predicting or determining when copyrightable works such as music and software infringe preexisting works is already challenging, and it may become even more complex when AI is involved.
The US Copyright Office (USCO) and the World Intellectual Property Organization are also reviewing many issues related to AI and copyright, including the historical underpinnings of human authorship and copyright, considerations for using copyright-protected works as part of machine learning, whether a sui generis approach to AI and copyright should be considered, and the future of AI and copyright policy in general.
POTENTIAL OVERLAP WITH PATENT CONSIDERATIONS
Notably, the USPTO recently addressed whether AI, described as a “creativity machine,” could be named as the inventor on a patent application. The USPTO determined that interpreting the term “inventor” to include AI would contradict the plain reading of the patent statutes, which refer to inventors as natural persons. It also reasoned that “conception” (the touchstone of inventorship) represents “the completion of the mental part of invention” and must be performed by natural persons.
For copyright, the Supreme Court has repeatedly confirmed that the Constitution's use of the word “authors” means “he to whom anything owes its origin” and that “only such [works] as are original, and are founded in the creative powers of the mind” are eligible for copyright. The USCO has construed this guidance to require human authorship.
Thus, the ability to protect AI output as copyrightable subject matter and the ability to use copyrightable material as input without consent are both highly fact-specific issues.
The proliferation of COVID-19-related AI uses may prompt Congress or the courts to address sooner some of these questions concerning what may require a license and what may be fair use. At this time, there are also many open questions in the United States as to when works created with AI will be protected by copyright. The amount of human guidance or collaboration with AI, and how such human authorship is documented and marketed, may make a significant difference in the ability to protect works created with AI.
 Princeton Univ. Press v. Michigan Document Servs., Inc., 99 F.3d 1381, 1392 (6th Cir. 1996).
 Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 586-87 (1994).
 Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014); see also A.V. ex rel. Vanderhye v. iParadigms, LLC, 562 F.3d 630, 639 (4th Cir. 2009).
 Authors Guild v. Google, Inc., 804 F.3d 202, 207 (2d Cir. 2015) (finding fair use for “[c]omplete unchanged copying . . . when the copying was reasonably appropriate” in light of the purpose “to provide a search function”).
 Am. Geophysical Union v. Texaco Inc., 60 F.3d 913, 931 (2d Cir. 1994).
 See Copyright Clearance Center, RightFind® XML for Mining Solution (noting XML-formatted content optimized for data mining).
 Peter Letterese & Assocs., Inc. v. World Inst. of Scientology Enters., 533 F.3d 1287, 1315 (11th Cir. 2008); see also Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 590 (1994) (Courts must “consider not only the extent of market harm caused by the particular actions of the alleged infringer, but also whether unrestricted and widespread conduct of the sort engaged in by the defendant ... would result in a substantially adverse impact on the potential market for the original . . . [and] also of harm to the market for derivative works.” (internal quotes and citations omitted)).
 Feist Pubs., Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 346–47 (1991) (internal citations omitted).
 US Copyright Office, Compendium of US Copyright Office Practices § 306 (3d ed. 2017) (“Because copyright law is limited to “original intellectual conceptions of the author,” the Office will refuse to register a claim if it determines that a human being did not create the work.”).