A group of authors who sued Nvidia over alleged copyright violations has now claimed that the chipmaker contacted the shadow library search engine Anna’s Archive to illegally use its pirated data.
While the case is almost two years old, a first consolidated amended complaint filed on January 16 provides details of alleged interactions between Nvidia and Anna’s Archive.
The lawsuit was filed in March 2024. In the January 16 filing, authors Abdi Nazemian, Brian Keene, Stewart O’Nan, Andre Dubus III, and Susan Orlean alleged that Nvidia copied their copyrighted works multiple times to train its language models and sourced data from “known pirated libraries”. They further alleged that Nvidia willingly moved to use Anna’s Archive.
Launched in 2022, Anna’s Archive is a search engine for both legal and illegal libraries. It compiles text and e-books from across the internet to make them searchable in one location. The books can be downloaded for free, with donors getting faster speeds than non-paying users, and Anna’s Archive records tens of thousands of downloads per day, according to the website.
While Anna’s Archive is illegal to use due to its piracy-based model, the platform considers itself a non-profit dedicated to preserving the wealth of human knowledge and making it accessible to all.
The plaintiffs in the Nvidia copyright case claimed that sometime after 2023, Nvidia contacted Anna’s Archive to learn about acquiring its pirated collections and its data set for LLMs. After some back-and-forth communication with Anna’s Archive, Nvidia was ready to move ahead, the authors alleged in the lawsuit.
“Within a week of contacting Anna’s Archive, and days after being warned by Anna’s Archive of the illegal nature of their collections, NVIDIA management gave ‘the green light’ to proceed with the piracy. Anna’s Archive offered NVIDIA millions of pirated copyrighted books,” stated the filing.
Multiple ongoing lawsuits filed by authors against Big Tech firms accuse them of scraping copyrighted works for AI training data or getting them through illegal “shadow libraries”.
“Virtually every one of the major LLM developers—including OpenAI, Meta, and Anthropic—pirated books from Library Genesis, Z-Library, Sci-Hub, and/or Pirate Library Mirror. NVIDIA followed this industry-wide practice and pirated troves of books from shadow libraries,” claimed the authors in their lawsuit, later adding, “Internal documents show competitive pressures drove NVIDIA to piracy”.
In late December, Anna’s Archive took responsibility for scraping Spotify’s data, claiming that it had 300TB of music and metadata to back up. Early in 2026, however, the platform reported that its .org web address appeared to have been suspended.
Published – January 20, 2026 12:18 pm IST