Like, it literally creates answers out of thin air, then sells them as if they're correct. It doesn't even try to get it right. What sort of check is there on whether the answer is correct before it spews it out? I thought LLMs were supposed to work out the best answer to what was said to them based on their training, yet they'll give answers that don't come from any training data. It's not like it learned the wrong answer from a Reddit post and just repeated what Reddit said. It's legit making up wrong answers, then citing correct answers. It just outright gets it wrong, almost on purpose.
Anyone understand why LLMs fail so much?
I understand they run correlations, but how does it decide that a wrong answer is more correlated with the prompt than the actual correct answer?
https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback
After the primary training, the model is put through reeducation.
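For anyone who doesn't want to read the whole article, here's a rough sketch of the pairwise preference loss commonly used to train the reward model in RLHF. The reward numbers are made up for illustration and `preference_loss` is just a name I picked, not anything from a real library. The thing to notice is that the loss only cares which answer the human rater preferred; nothing in it checks whether either answer is true.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the rater-preferred answer higher,
    large when it scores the rejected answer higher.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) is the same as log(1 + exp(-d))
    return math.log1p(math.exp(-diff))

# Hypothetical reward scores for two answers to the same prompt.
agrees_with_rater = preference_loss(2.0, 0.0)     # model agrees with the rater
disagrees_with_rater = preference_loss(0.0, 2.0)  # model disagrees with the rater
```

So the model gets pushed toward whatever raters liked, whether or not it was accurate.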
Ahhh, so you're saying the censorship of AI causes it to get stuff wrong and make things up to satisfy the "policy" demands placed on it?
Yes. And it causes them to shit themselves regularly.
And it’s only going to get worse in the future when you realize that for at least the next 20 years a lot of jobs are going to be created just to feed fake information and reinforce the preferred outcomes. There’s going to be a “think tank” explosion and “data collection” will be the new awful office job.
We're already flooded with fake information from bots and spammers on the Internet. There's a fringe theory I saw on Twitter that the current push for digital ID and VPN bans isn't to Protect the Children™ or even to create a panopticon to crush wrongthink; it's to make all their worthless data actually mean something again by having confirmed humans supply it.
It's probably this theory, which makes a lot of sense to me.
TLDR: Bot/LLM/India spam is making advertising metrics worthless, and companies are terrified of losing advertising revenue. Hence the push for verified IDs so they don't lose said revenue.
That's a separate problem, although the "alignment" process does degrade the model's quality. What OP is describing is a result of the answer not existing in the training data or at all. AI is just an advanced pattern recognition tool. You're describing how they stop it from recognizing politically inconvenient patterns, which is separate from the pattern not existing in the training data in the first place.
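To make the "pattern doesn't exist" case concrete, here's a toy sketch (not how any real model is implemented; the candidate words and scores are invented). A model turns scores into probabilities with a softmax and then emits something. When it has a strong pattern, one answer dominates. When it has no real pattern, the scores are nearly flat, but it still picks one and prints it with the same fluent tone. There's no built-in "I don't know" unless one was trained in.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_answer(candidates, logits):
    """Return the highest-probability candidate and its probability."""
    probs = softmax(logits)
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best], probs[best]

candidates = ["Paris", "Lyon", "Berlin", "Madrid"]

# Pattern well covered in training: one score dominates.
known = pick_answer(candidates, [9.0, 2.0, 1.0, 0.5])

# Pattern not in training: scores are nearly flat, but an answer still comes out.
unknown = pick_answer(candidates, [1.1, 1.0, 1.05, 0.95])

print(known)
print(unknown)
```

In the second case the "winner" is barely ahead of the rest, which is basically a guess, but the output text looks just as sure of itself as the first one.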