Neural nets are almost never trained on the whole of something at once. Instead, the data is divided into segments before the network ever sees it. In the case of latent diffusion models like Stable Diffusion, the model is trained on 512x512-pixel crops of images. A Transformer-based LLM like ChatGPT is trained on fixed-length segments of tokenized text.
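To make the idea concrete, here is a minimal Python sketch of both kinds of segmentation. The whitespace "tokenizer", the segment length, and the tile size are illustrative assumptions, not the preprocessing any of these models actually uses.

```python
# Illustrative segmentation only: real pipelines use subword tokenizers
# and resizing/cropping logic, not the toy choices below.

def segment_text(text: str, segment_len: int = 8) -> list[list[str]]:
    """Split text into fixed-length runs of tokens, the way an LLM
    training pipeline chunks a corpus before the model sees it."""
    tokens = text.split()  # stand-in for a real subword tokenizer
    return [tokens[i:i + segment_len]
            for i in range(0, len(tokens), segment_len)]

def segment_image(image: list[list[int]], tile: int = 512):
    """Yield tile x tile crops of a 2-D pixel grid, mirroring how an
    image might be cut into 512x512 pieces for training."""
    height, width = len(image), len(image[0])
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield [row[left:left + tile]
                   for row in image[top:top + tile]]

if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 3
    for chunk in segment_text(sample):
        print(chunk)
```

In both cases the model only ever trains on these fixed-size pieces; it never sees the full corpus or the full-resolution original in one step.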