Definition
An unauthorised digital repository that aggregates and distributes copyrighted books, academic papers, and other written works without the permission of rights holders. Prominent examples include Library Genesis (LibGen), Z-Library, Sci-Hub, and Bibliotik. Shadow libraries have become central to AI copyright litigation because AI developers used collections sourced from these repositories, notably the Books3 dataset derived from Bibliotik, to train large language models. Claimants in cases such as Bartz v. Anthropic, Tremblay v. OpenAI, and Carreyrou v. Anthropic allege that downloading and using copyrighted works from shadow libraries to build commercial AI systems constitutes infringement regardless of how the works are subsequently processed.