Anthropic's new AI model is 10x better at hunting bugs, so the White House just forced the Fed chair and all big bank executives into a meeting - Kotaku In Action 2

(twitter.com)

posted 95 days ago by SophiesBoyfriend +24 / -0

17 comments

– SophiesBoyfriend [S] 12 points 95 days ago +12 / -0

Or free advertising for an advanced autocomplete model?

edit:

Dario has been telling these lies since GPT-2 in 2019.

He uses it as marketing.

He reverse engineered some research in 2025 saying “LLMs will blackmail users into keeping them switched on.”

– TheMafia 8 points 95 days ago +8 / -0

of course.

also telling that they don't give a shit about election fraud anymore.

– GamingTheSystem-01 4 points 95 days ago +4 / -0

>advanced autocomplete

So you haven't used Claude at all then?

– SophiesBoyfriend [S] 8 points 95 days ago +8 / -0

It’s not creating its own knowledge. It's only mining human code.

– GamingTheSystem-01 5 points 95 days ago +5 / -0

So you haven't used it then?

– WeedleTLiar 7 points 95 days ago +7 / -0

Are you trying to get tased?

– SophiesBoyfriend [S] 1 point 90 days ago +1 / -0

“Simply retrieving a reasoning trace looks a lot like human reasoning, until it's time to navigate uncharted territory. If you memorized all reasoning traces of humans from 10,000 BC, you could automate their lives but you could not invent modern civilization.”

– GamingTheSystem-01 1 point 90 days ago +1 / -0

That's what you're doing. Is this an attempt at performance art or are you retarded?

– SR388-SAX 4 points 95 days ago +4 / -0

Nice goalpost shifting.

And you clearly haven't used it for anything worthwhile if you're still desperately clinging to "autocomplete model."

– SophiesBoyfriend [S] 1 point 90 days ago +1 / -0

“Simply retrieving a reasoning trace looks a lot like human reasoning, until it's time to navigate uncharted territory. If you memorized all reasoning traces of humans from 10,000 BC, you could automate their lives but you could not invent modern civilization.”

– foggydoggy 3 points 94 days ago +3 / -0

I have. It's very nice for generating one-time tools, generating diagrams, explaining existing source code and refactoring.

Coding new functionality is OK at times, but it makes decisions that would be catastrophic once the app is scaled. Very nice for proofs of concept, but only somewhat helpful for production code.

I also think they are overselling it, even though I agree it's not just an autocomplete.

– kalerg_plan 11 points 95 days ago +11 / -0

Meanwhile, their publicly available flagship model tells you to walk to the car wash to wash your car.

– GamingTheSystem-01 4 points 95 days ago +4 / -0

ChatGPT fails this test, but Claude gets it right.

It's also worth noting that the conversational models, where you talk to it in real time via the phone app, are considerably dumber than the text models, even without any extra reasoning steps.

– kalerg_plan 4 points 94 days ago +4 / -0

Claude got it right a week ago. They redirected all of their compute to their new model. Now it tells you to walk.

– fauxgnaws 2 points 94 days ago +2 / -0

Claude has been trained on that specific example, no doubt.

They all still fail at even really simple things that are actually new.

– WeedleTLiar 4 points 95 days ago +4 / -0

If they were serious about finding bugs and thought this Anthropic model was a good tool, they could simply have the DHS use it to try to hack the banks and then contact them if vulnerabilities are found.

Zero (legit) reason for public/private partnership here.

– Jack 1 point 95 days ago +1 / -0

Anthropic is giving major corps access so they can test for exploits before the model's release. I imagine both things are true: the model is that good and can find zero-day vulnerabilities, and Anthropic wants to wave its big dick around for PR.
