Yesterday, popular authors including John Grisham, Jonathan Franzen, George R.R. Martin, Jodi Picoult, and George Saunders joined the Authors Guild in suing OpenAI, alleging that training the company's large language models (LLMs) that power AI tools like ChatGPT on pirated versions of their books violates copyright law and amounts to "systematic theft on a mass scale."
"Generative AI is a vast new field for Silicon Valley's longstanding exploitation of content providers," Franzen said in a statement provided to Ars. "Authors should have the right to decide when their works are used to 'train' AI. If they choose to opt in, they should be appropriately compensated."
OpenAI has previously argued, in response to two similar lawsuits filed by authors earlier this year, that the authors suing "misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence."
This latest complaint argued that OpenAI's "LLMs endanger fiction writers' ability to make a living, in that the LLMs allow anyone to generate, automatically and freely (or very cheaply), texts that they would otherwise pay writers to create."
Authors are also concerned that the LLMs fuel AI tools that "can spit out derivative works: material that is based on, mimics, summarizes, or paraphrases" their works, allegedly turning their works into "engines of" authors' "own destruction" by harming the book market for them. Even worse, the complaint alleged, businesses are being built around opportunities to create allegedly derivative works:
Businesses are sprouting up to sell prompts that allow users to enter the world of an author's books and create derivative stories within that world. For example, a business called Socialdraft offers long prompts that lead ChatGPT to engage in "conversations" with popular fiction authors like Plaintiff Grisham, Plaintiff Martin, Margaret Atwood, Dan Brown, and others about their works, as well as prompts that promise to help customers "Craft Bestselling Books with AI."
They claimed that OpenAI could have trained its LLMs exclusively on works in the public domain or paid authors "a reasonable licensing fee" but chose not to. Authors feel that without their copyrighted works, OpenAI "would have no commercial product with which to damage, if not usurp, the market for these professional authors' works."
"There is nothing fair about this," the authors' complaint said.
Their complaint noted that OpenAI chief executive Sam Altman claims to share their concerns, telling Congress that "creators deserve control over how their creations are used" and deserve to "benefit from this technology." But so far, the complaint adds, Altman and OpenAI, which the authors allege "intend to earn billions of dollars" from their LLMs, have "proved unwilling to turn these words into actions."
Saunders said that the lawsuit (a proposed class action estimated to include tens of thousands of authors, some with multiple works at issue, in which OpenAI could owe $150,000 per infringed work) was an "effort to nudge the tech world to make good on its frequent declarations that it is on the side of creativity." He also said the stakes went beyond protecting authors' works.
"Writers should be fairly compensated for their work," Saunders said. "Fair compensation means that a person's work is valued, plain and simple. This, in turn, tells the culture what to think of that work and the people who do it. And the work of the writer, the human imagination struggling with reality, trying to discern virtue and responsibility within it, is essential to a functioning democracy."
The authors' complaint said that as more writers have reported being replaced by AI content-writing tools, more authors feel entitled to compensation from OpenAI. The Authors Guild told the court that 90 percent of authors responding to an internal survey from March 2023 "believe that writers should be compensated for the use of their work in 'training' AI." On top of this, there are other threats, their complaint said, including that "ChatGPT is being used to generate low-quality ebooks, impersonating authors, and displacing human-authored books."
Authors claimed that despite Altman's public support for creators, OpenAI is intentionally harming them, noting that OpenAI has admitted to training LLMs on copyrighted works and claiming that there is evidence that OpenAI's LLMs "ingested" their books "in their entireties."
"Until very recently, ChatGPT could be prompted to return quotations of text from copyrighted books with a good degree of accuracy," the complaint said. "Now, however, ChatGPT generally responds to such prompts with the statement, 'I can't provide verbatim excerpts from copyrighted texts.'"
To authors, this suggests that OpenAI is exercising more caution in the face of authors' growing complaints, perhaps because authors have alleged that the LLMs were trained on pirated copies of their books. They've accused OpenAI of being "opaque" and refusing to discuss the sources of its LLMs' data sets.
Authors have demanded a jury trial and asked a US district court in New York for a permanent injunction to prevent OpenAI's alleged copyright infringement, claiming that if OpenAI's LLMs continue to illegally leverage their works, they will lose licensing opportunities and risk being usurped in the book market.
Ars could not immediately reach OpenAI for comment. [Update: OpenAI's spokesperson told Ars that "creative professionals around the world use ChatGPT as a part of their creative process. We respect the rights of writers and authors, and believe they should benefit from AI technology. We're having productive conversations with many creators around the world, including the Authors Guild, and have been working cooperatively to understand and discuss their concerns about AI. We're optimistic we will continue to find mutually beneficial ways to work together to help people utilize new technology in a rich content ecosystem."]
Rachel Geman, a partner with Lieff Cabraser and co-counsel for the authors, said that OpenAI's "decision to copy authors' works, done without offering any choices or providing any compensation, threatens the role and livelihood of writers as a whole." She told Ars that "this is in no way a case against technology. This is a case against a corporation to vindicate the important rights of writers."
If enough countries join in, there will be a barrier to actually making money off of it. Even if you become the leader in AI, the money won't be there if your method is simply banned in other countries.
That being said, it doesn't seem like it's going to get very far anyway.
Given how models are trained (content and math), AI is not practical to ban or legislate away. While the public applications of AI are content generation and NLP, as @Rinox alluded to, the military applications are where we are going to see the most focus from the government. As an example, the LANTIRN targeting pod uses SVMs to profile aircraft from afar, and it took enormous engineering to get it accurate; comparable object detection functionality can now be obtained with NNs and off-the-shelf GPUs. Countries like China already have "differing philosophies" when it comes to intellectual property rights, so we can remove the largest manufacturing market from the list of those who would blanket-ban AI. Ditto for any possibility of their military forgoing AI.
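To illustrate that last point, here is a minimal sketch of my own (not anything from the LANTIRN program or the article above): with a pretrained detector from torchvision, basic object detection runs in a handful of lines on a consumer GPU. The model choice, the "aircraft.jpg" input file, and the 0.8 confidence threshold are all arbitrary assumptions for the example.

# Minimal object-detection sketch using a pretrained Faster R-CNN from torchvision.
# Assumes torch, torchvision, and Pillow are installed (torchvision >= 0.13 for weights="DEFAULT").
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a detector pretrained on COCO; no bespoke engineering required.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Use a consumer GPU if one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# "aircraft.jpg" is a hypothetical input image, purely for illustration.
image = Image.open("aircraft.jpg").convert("RGB")
with torch.no_grad():
    prediction = model([to_tensor(image).to(device)])[0]

# Print the detections the model is reasonably confident about.
for box, label, score in zip(prediction["boxes"], prediction["labels"], prediction["scores"]):
    if score.item() > 0.8:
        print(f"class {label.item()} at {[round(v, 1) for v in box.tolist()]} (score {score.item():.2f})")

The point being: the part that once took enormous engineering is now essentially a library call, and the remaining work is data and integration, which is why banning the underlying technique is impractical.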
The real problem here is copyright law, which has extended protections far beyond any reasonable length of time. Had terms been, say, 35 years, anything published before the late 1980s would already be free to train on, and we could simply use that older material.
It's not about making money, or at least not only that. The other reason the US, China, and other countries are obsessed with AI, LLMs, NNs, and ML is that they may prove decisive for their militaries.
The thinking in most militaries today is that we are at a turning point of sorts: the next generation of weapons will be powered by AI in some capacity, and the further we go, the more AI will be involved (target acquisition, reaction, drone and missile guidance, APS, AA, etc.).