Like, it literally creates answers out of thin air, then sells them as if they're correct. It doesn't even try to get it right. What sort of redundancy is there for checking whether the answer is correct before spewing it out? I thought LLMs were supposed to work out the best answer to what was said to them based on their training, yet they'll give answers that don't come from any training data. It's not like it learned the wrong answer from a Reddit post and just repeated what Reddit said. It legit makes up wrong answers, then cites them as if they were correct. It just outright gets it wrong, almost on purpose.
Anyone understand why LLMs fail so much?
I understand they run correlations, but how does the model decide that a wrong answer is the best match for the prompt instead of the actual correct answer?
An LLM is essentially autocomplete on steroids. Its output is an answer to this question: "Given the input I've received, what is the most probable response based on my training?" Some randomization is thrown in so you don't get the exact same response for the exact same input every time, but that's the basic idea. Nothing in that process checks whether the response is true. It's no different from, say, a weather model outputting nonsensical results when you give it inputs far outside the range of the data it was trained on.
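To make the "most probable response plus some randomization" idea concrete, here's a minimal sketch in Python. The candidate words and their scores are made up for illustration, and the function names are hypothetical. A real model scores tens of thousands of tokens with a neural network, but the sampling step looks roughly like this:

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(candidates, scores, temperature=1.0, rng=random):
    """Pick one candidate at random, weighted by its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical scores for the word after "The capital of France is"
candidates = ["Paris", "Lyon", "Berlin"]
scores = [4.0, 2.0, 1.0]

probs = softmax(scores)
for word, p in zip(candidates, probs):
    print(f"{word}: {p:.3f}")

print(sample_next_token(candidates, scores, temperature=1.0))
```

Notice there's no step anywhere that asks "is this true?". The only question is "how likely is this word to come next?". "Berlin" still gets a nonzero probability, so every so often it gets picked.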
That's because a correct answer takes that form, so the model returns something that looks like it. The content of the answer is made up: the probabilities the model computes produce something nonsensical, typically because the model's training data doesn't include the correct answer.
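Here's a toy version of that "right form, made-up content" effect. This is a tiny word-pair counter with a hypothetical three-sentence corpus, nowhere near a real LLM, but the failure has the same shape. The model has never seen anything about Peru, yet it still produces a confident, correctly shaped answer:

```python
import random
from collections import Counter, defaultdict

# Hypothetical "training data": the model has only ever seen these sentences.
corpus = [
    "the capital of france is paris",
    "the capital of spain is madrid",
    "the capital of italy is rome",
]

# Count which word follows which (a bigram model).
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def complete(prompt, rng):
    """Sample the next word based only on the last word of the prompt."""
    options = follows.get(prompt.split()[-1])
    if not options:
        return None  # the last word never appeared in training
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)
answer = complete("the capital of peru is", rng)
print("the capital of peru is", answer)
```

Whatever it prints will be one of paris, madrid, or rome. It looks exactly like a correct answer because "a city name after 'is'" is the pattern it learned. It can't say "lima" because it never saw it, and it has no way to know that it doesn't know.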