In Authors Guild v. Google, the Second Circuit ruled in 2015 that Google Books' practice of digitizing texts and showing excerpts to users was fair use, and the Supreme Court declined to review that decision in 2016. This means the law could be interpreted to permit the use of books to train software, like A.I. applications.
They'd have to prove the texts were even used in training, and then prove that use amounts to a copyright violation.
Neural nets are almost never trained on the whole of something at once. Instead, the work is divided into segments before the network sees it. Latent diffusion models like Stable Diffusion are trained on images resized and cropped to 512x512 pixels; Transformer-based LLMs like ChatGPT are trained on fixed-length segments of tokenized text.
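To make that concrete, here's a minimal sketch of the text-chunking step. The whitespace "tokenizer" and the context length are illustrative stand-ins, not any vendor's actual pipeline; real systems use subword tokenizers like BPE, but the segmentation idea is the same:

```python
# Toy illustration: split a long text into fixed-length token segments,
# the way training data is chunked before an LLM ever sees it.

CONTEXT_LENGTH = 2048  # hypothetical context window, in tokens

def chunk_text(text: str, context_length: int = CONTEXT_LENGTH) -> list[list[str]]:
    """Split text into consecutive segments of at most context_length tokens."""
    tokens = text.split()  # toy whitespace "tokenizer" for illustration only
    return [tokens[i:i + context_length]
            for i in range(0, len(tokens), context_length)]

if __name__ == "__main__":
    book = "the quick brown fox " * 5000  # stand-in for a full book's text
    segments = chunk_text(book)
    print(f"{len(segments)} training segments of up to {CONTEXT_LENGTH} tokens each")
```

The point is that the model is fed these windows one at a time during training; no single training example contains the whole book.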
I'm just curious about the leap in logic it takes to go from previews-to-encourage-buying to give-AI-the-whole-book.