Or free advertising for an advanced autocomplete model?
edit:
Dario has been telling these lies since GPT-2 in 2019.
He uses it as marketing.
He reverse engineered some research in 2025 saying “LLMs will blackmail users into keeping them switched on”
of course.
also telling that they don't give a shit about election fraud anymore.
>advanced autocomplete
So you haven't used Claude at all then?
It’s not creating its own knowledge. Only mining human code.
So you haven't used it then?
Are you trying to get tased?
“Simply retrieving a reasoning trace looks a lot like human reasoning, until it's time to navigate uncharted territory. If you memorized all reasoning traces of humans from 10,000 BC, you could automate their lives but you could not invent modern civilization.”
That's what you're doing. Is this an attempt at performance art or are you retarded?
Nice goalpost shifting.
And you clearly haven't used it for anything worthwhile if you're still desperately clinging to "autocomplete model."
“Simply retrieving a reasoning trace looks a lot like human reasoning, until it's time to navigate uncharted territory. If you memorized all reasoning traces of humans from 10,000 BC, you could automate their lives but you could not invent modern civilization.”
I have. It's very nice for generating one-time tools, generating diagrams, explaining existing source code and refactoring.
Coding new functionality is ok at times, but it makes decisions that would be catastrophic once the app is scaled. Very nice for proof of concepts, but only somewhat helpful for production code.
I also think they are overselling it, even though I agree it's not just an autocomplete.
Meanwhile their publicly available flagship model tells you to walk to the car wash to wash your car.
ChatGPT fails this test, but Claude gets it right.
It's also worth noting that the conversational models, where you talk to it in real time via the phone app, are considerably dumber than the text models, even without any extra reasoning steps.
Claude got it right a week ago. They redirected all of their compute to their new model. Now it tells you to walk.
Claude's been trained on that specific example, no doubt.
They all still fail at even really simple things that are actually new.
If they were serious about finding bugs and thought this Anthropic model was a good tool, they could simply have the DHS use it to try to hack them and then contact them if vulnerabilities are found.
Zero (legit) reason for public/private partnership here.
Anthropic is giving major corps access to use it to test for exploits before model release. I imagine both things are true: the model is that good and can find zero-day vulnerabilities, and Anthropic wants to wave its big dick around for PR.