But for these images with multiple distinct subjects, I do it in parts. I generated a bunch of geese with presents on a flat background, picked the ones I liked, cut and pasted them into a shed with Zelda, then used that composite as the input for a few image-to-image passes, adjusting the CFG value and rolling the dice until I got one that was okay. That's how I did the Pokémon birds too. As far as I know, the odds of it generating four completely different Pokémon correctly in a single pass, without the features of one leaking into the others, would be fantastically low.
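For anyone who wants to script the "adjust CFG and reroll" part, here is a rough sketch of that loop in Python. It is only an illustration: `run_img2img` is a hypothetical stand-in for whatever backend actually does the image-to-image pass, and `roll_candidates`, `Candidate`, and `rolls_per_cfg` are names made up for this example, not part of any real tool.

```python
import random
from typing import Any, Callable, List, NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    cfg: float   # CFG value used for this pass
    seed: int    # random seed used for this pass
    image: Any   # whatever the (hypothetical) backend returns


def roll_candidates(
    composite: Any,
    run_img2img: Callable[..., Any],
    cfg_values: Sequence[float],
    rolls_per_cfg: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Candidate]:
    """Run several image-to-image passes over one pasted-together composite.

    For each CFG value, roll a few random seeds and keep every result so
    the acceptable ones can be picked out by eye afterwards.
    """
    if rng is None:
        rng = random.Random()
    candidates = []
    for cfg in cfg_values:
        for _ in range(rolls_per_cfg):
            seed = rng.randrange(2**32)
            # Assumption: the backend accepts the input image plus cfg and seed.
            image = run_img2img(composite, cfg=cfg, seed=seed)
            candidates.append(Candidate(cfg, seed, image))
    return candidates
```

Storing the seed and CFG value next to each result means a pass that came out okay can be reproduced or nudged later instead of rerolled from scratch.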
This one