Anyone Notice Huge Inconsistencies in AI Use?

posted 202 days ago by CaptainTrouble +21 / -0

I saw a comment by a guy who said he believes AI specifically gives wrong answers to "people it doesn't like" and right answers to people it likes.

At first, I considered the theory funny but not likely, although...

Anyway, time went on and now I'm starting to wonder...

The first thing that clued me in was I asked AI to produce an image for me a few times and it wouldn't do it, saying what I wanted was not allowed given its programming. Some other people asked the same AI the same thing and got a picture generated.

Okay, maybe just a one-off mistake...

Just now though, I asked 4 different AIs to solve a Sudoku puzzle for me. Every single AI failed. I even corrected their mistakes and asked again. Sometimes, the AI would take minutes before it came back with a wrong answer. A friend of mine, using the exact same photo, asked her AI to solve the Sudoku puzzle and within 10 seconds it had the right answer.

This makes no sense.

And, it has made me wonder... Do we actually all have a Social Credit Score already and that score determines how well the AI treats us? That might be something that has been programmed in behind the scenes that we aren't aware of yet. Perhaps, that's why the AI founders know they're going to be hated because they know once the switch is switched on and cranked up, it's GG.

Comments (32)

– deleted 20 points 202 days ago +20 / -0

– ArchRespawnsAgain 12 points 202 days ago +12 / -0

I was wondering if anyone else was noticing that.

– Chillin_in_PNW 13 points 202 days ago +13 / -0

Excessive punctuation is a dead giveaway, AIs love that shit.

– Grumman 9 points 202 days ago +9 / -0

Some people just use excessive punctuation. It's the em-dash that works best as a signifier, because it's inconvenient to use outside of programs like Word that will automatically convert them for you.

– KeeperOfTheGate 3 points 201 days ago +3 / -0

I've been an em dash enjoyer my entire life—as long as I can remember. Friends and colleagues have certainly heard me rant about em dashes, en dashes, and hyphens. ChatGPT has fucked it all up!

I feel like the guy in Office Space talking about Michael Bolton. AI's the shitty one, why should I have to change?

(And just to be clear, I use AI all the time, but I DON'T use it for writing. I use it for question and answer, google replacement, code review, log parsing, code generation, etc.)

– CaptainTrouble [S] 3 points 202 days ago +3 / -0

Ugh, the future... where humans are arguing amongst themselves but everyone just accuses the other person of being a bot then engages in arguments with the actual bots.

– KingLion7 3 points 202 days ago +3 / -0

Jesus. Is the guy a bot?

– Shill4Hire 3 points 201 days ago +3 / -0

Whether you ask Jesus, Chat, or ChatGPT if this is real, the answer you'll find is the same. "Whether or not he's a bot, he's still fake and gay".

– deleted 1 point 200 days ago +1 / -0

– Shill4Hire 15 points 201 days ago +15 / -0

Most client-facing mega-corpo image-creation AIs have a double-layer AI: The first makes an image according to specifications, and the second re-scans the image and checks if it has anything "objectionable" in it, and filters the result if it does. This doesn't necessarily mean you're seeking out objectionable content: Ask it for "a woman giving a speech", and it produces one in underwear because it was fed a million images of porn and lingerie product models, and the second AI will catch the image, image-recognize that it's NSFW, and say it can't make your image for you, even though your request was innocent. Different person asks it, different seed is rolled on the dice, she's clothed, so it publishes the image it makes.
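
A toy sketch of that two-layer setup, to make the seed effect concrete. Everything here is hypothetical: `generate_image`, `looks_nsfw`, and the outfit list are stand-ins for a real image model and a real classifier, which are far more complex and not public.

```python
import random

# Hypothetical pool of details a generator might pick for an innocent prompt.
OUTFITS = ["suit", "dress", "sweater", "lingerie"]

def generate_image(prompt: str, seed: int) -> dict:
    """Layer 1 (stand-in): 'render' the prompt; the seed decides the details."""
    rng = random.Random(seed)
    return {"prompt": prompt, "outfit": rng.choice(OUTFITS)}

def looks_nsfw(image: dict) -> bool:
    """Layer 2 (stand-in): a separate check that only looks at the finished image."""
    return image["outfit"] == "lingerie"

def request_image(prompt: str, seed: int) -> str:
    """Run both layers; the user only ever sees the final verdict."""
    image = generate_image(prompt, seed)
    if looks_nsfw(image):
        return "Sorry, I can't create that image."
    return f"image: {prompt}, wearing a {image['outfit']}"

if __name__ == "__main__":
    # Same prompt every time; only the seed changes.
    for seed in range(8):
        print(seed, request_image("a woman giving a speech", seed))
```

Same prompt, different seed, different verdict. The user never sees the rejected image, so the refusal looks like it is about the request rather than about what the first layer happened to draw.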

As for the sudoku, that's just luck. LLMs have minimal mathematical ability (they're LANGUAGE models) unless they're cascade-linked into other models. For example, Grok will search Twitter if it doesn't have an answer, and most AIs have similar functionality and can pull results from Google, so it might grab a solved Sudoku that's published somewhere if the seed roll calls for it to try searching for the result instead of hallucinating. The LLM isn't doing the searching itself, though: it is calling a different program to do a search and interpreting the results. That's advanced, but not the same thing as doing it itself. It cannot edit or evolve that cascaded program; the LLM just uses it as a read-only tool.

– KeeperOfTheGate 1 point 201 days ago +1 / -0

Good post, you hit the nail on the head. I'm unclear on how exactly the image generation portion works (the text-to-image stuff; DALL-E was quite complicated when I played with it directly), but suppose you write a prompt like "Draw a woman in the style of Renaissance artwork, playing up themes of innocence and virtue."

The LLM takes your prompt and adds descriptive textual details for the drawing portion. Innocence in Renaissance artwork, for instance, is often associated with nudity.

So the image gen spits out a tasteful nude.

Now that next level comes along and describes the generated image. Then analyzes the descriptions for no-nos. Real people. Copyrighted mouse characters. Pornography. Mean thoughts. Unwanted political ideologies. Whatever.

If it doesn't pass, the image gets rejected. Sometimes it's not clear why certain text responses or images are rejected. Silicon Valley puritan weirdos' ideas of morality and safety are bizarre and out of touch with most people.

I'm generally not in favor of censoring speech or technology, but the amount of AI gooning going on is really insane.

– MargarineMongoose 8 points 202 days ago +8 / -0

The people who are saying that need to be forced to live like the Amish. They aren't fit to wield technology more complex than a 2x4.

If you learn how the underlying technology actually works then you'll understand why such notions are utterly fucking retarded.

– CaptainTrouble [S] 5 points 202 days ago +5 / -0

I know how AI works. Your response is ridiculous. AI is obviously curated already to only give the answers the elites want. It's incredibly censored. That censorship can easily go further and change based on how you are ranked, using parameters that operate behind the scenes. Perhaps you don't understand how anything works.

– MargarineMongoose 2 points 202 days ago +2 / -0

Was that the scenario these people were hypothesizing or were they mouth breathing retards whose understanding of AI is derived solely from Hollywood movies? Because there's a hell of a lot of the latter going on these days.

– CatoTheElder 3 points 202 days ago +3 / -0

It's crazy to expect a language model to solve something that is NP-Hard.

– ernsithe 6 points 202 days ago +6 / -0

"asked her AI"

What does this mean? Was it one of the 4 you tried? LLMs are shit at solving anything math related because they're giant auto-correct algorithms. They don't "understand" what you're asking them to do. They're just giving you a likely reply.

The other thing to think about is A/B testing. They may be doing things like using different versions on different people or different regions and collecting data on their interactions.
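
For what it's worth, this kind of assignment is commonly done by hashing a user ID, so two people standing next to each other can land in different buckets. A minimal sketch, assuming a made-up experiment name and made-up variant names; nothing here is a known detail of any real AI service.

```python
import hashlib

VARIANTS = ["model_a", "model_b"]  # hypothetical variant names

def assign_variant(user_id: str, experiment: str = "solver_test") -> str:
    """Deterministically map a user to a variant by hashing experiment name + user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(VARIANTS)
    return VARIANTS[bucket]

if __name__ == "__main__":
    # Each user always gets the same variant, but different users can differ.
    for user in ["alice", "bob", "carol", "dave"]:
        print(user, assign_variant(user))
```

The point of hashing is that the split is stable per user and has nothing to do with location or what the user has said in the past.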

– CaptainTrouble [S] 2 points 202 days ago +2 / -0

Yeah, she used ChatGPT, unpaid. I threw it into ChatGPT unpaid with the same prompt. It couldn't figure it out and I tried twice. I also tried Supergrok 4.1 (both Thinking and not), Gemini 3.0 Pro, and Claude. All failed. I tried like 10 times with Grok and a couple with Gemini.

She literally asked once, got the right answer in 10 seconds (she solved it herself afterward to confirm).

There's no reasonable explanation. If AI really sucks that hard at math related stuff, the odds that she just happened to catch it on a good day with her one try, while I asked like 20 different times and never got a right answer, are tiny.
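
A quick sanity check on those odds. This assumes every attempt is an independent try with the same success chance p, which is a simplification (the attempts were spread across different models), and the values of p below are made up for illustration.

```python
def prob_all_fail(p: float, attempts: int = 20) -> float:
    """Chance that every one of `attempts` independent tries fails, given success chance p."""
    return (1 - p) ** attempts

def prob_streak_and_one_hit(p: float, attempts: int = 20) -> float:
    """Chance that one person fails `attempts` times AND another succeeds on a single try."""
    return prob_all_fail(p, attempts) * p

if __name__ == "__main__":
    for p in (0.05, 0.10, 0.25, 0.50):
        print(f"p={p:.2f}  20 misses: {prob_all_fail(p):.6f}  "
              f"20 misses plus 1 hit: {prob_streak_and_one_hit(p):.6f}")
```

With p = 0.05, twenty straight misses come out to about 36%, so a long losing streak is not surprising if the tool is bad at the task. With p = 0.50 it is roughly one in a million. Everything hinges on p, which nobody here has measured.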

We were standing right next to each other and live in the same area. Same versions. Of course, what's going on behind the scenes is clearly not what they're telling us.

– CatoTheElder 3 points 202 days ago +3 / -0

Why would you expect a LANGUAGE model to solve something that is NP-hard?

– CaptainTrouble [S] 2 points 202 days ago +2 / -0

So the way language models solve this is that they view the photo and assign rows and columns to the numbers, then create a program to solve for the missing numbers based on the rules. It should then spit out the correct answer, and it did within 10 seconds for her but not for me.

– Vicious_snek6 6 points 201 days ago +6 / -0

Why are you sure that that's how it works? Because it told you?

And assuming it were, and it was writing that program, why would you trust its programming each time? It's an LLM.

– KeeperOfTheGate 1 point 201 days ago +1 / -0

It's been probably about a year, but I ran an experiment with solving sudokus exactly like this using one of the "thinking" modes of ChatGPT.

It showed the steps. In the case I tried, ChatGPT first analyzed the image and created a matrix of the numbers, e.g.:

Line 1: * * * 1 * 2 * * *
Line 2: * 4 * * 5 * * * 7
Line 3: 9 * * * * * * * *
etc.

Then it generated Python code to solve a Sudoku. (There have to be tens of thousands or more of Python Sudoku solvers in the training data.)

Then it ran the Python code using the matrix as input.

It worked perfectly.
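
The generated Python would have been something along the lines of a plain backtracking solver. This is a sketch of the general technique, not the code ChatGPT actually produced. The grid reuses the three rows quoted above and leaves the other six rows blank as an assumption, since the rest of the puzzle isn't given, so it has many valid solutions and the solver just returns one of them.

```python
def parse_rows(rows):
    """Turn rows like '* 4 * * 5 * * * 7' into lists of ints, using 0 for blanks."""
    return [[0 if tok == "*" else int(tok) for tok in row.split()] for row in rows]

def candidates(grid, r, c):
    """Digits that fit at (r, c) without clashing with its row, column, or 3x3 box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Backtracking search; fills grid in place and returns True if a solution exists."""
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True
    # Try the empty cell with the fewest options first to keep the search small.
    r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
    for d in candidates(grid, r, c):
        grid[r][c] = d
        if solve(grid):
            return True
        grid[r][c] = 0  # undo and try the next digit
    return False

if __name__ == "__main__":
    rows = ["* * * 1 * 2 * * *",
            "* 4 * * 5 * * * 7",
            "9 * * * * * * * *"] + ["* * * * * * * * *"] * 6
    grid = parse_rows(rows)
    if solve(grid):
        for row in grid:
            print(" ".join(map(str, row)))
```

Once the digits are read off the photo correctly, this part is ordinary deterministic code. The fragile steps are the image reading and whether the model decides to write and run code at all.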

One of the things about AIs is that they are nondeterministic: try the same input multiple times and you can get radically different results. You absolutely need a "thinking" model to solve a Sudoku.
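
A toy illustration of why the same input can give different outputs: the model scores the possible next tokens and then samples one at random, so a different roll gives a different continuation. The option names and scores below are made up, and real systems have more moving parts than this.

```python
import math
import random

def sample_token(scores: dict, temperature: float, rng: random.Random) -> str:
    """Softmax the scores at the given temperature, then draw one option at random."""
    tokens = list(scores)
    scaled = [scores[t] / temperature for t in tokens]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]  # subtract the max for numerical stability
    return rng.choices(tokens, weights=weights, k=1)[0]

if __name__ == "__main__":
    # Hypothetical scores for what the model does next with the puzzle.
    scores = {"write_solver_code": 2.0, "guess_digits": 1.5, "describe_image": 0.5}
    for seed in range(5):
        rng = random.Random(seed)
        print(seed, sample_token(scores, temperature=1.0, rng=rng))
```

At a very low temperature the top-scoring option wins almost every time; at normal settings the second-best path gets picked a fair share of the time, which is enough to send one run toward working code and another toward a guess.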

– OBRIENMUSTSUFFER 6 points 202 days ago +6 / -0

That’s why I always say please and thank you, because I know it melts their mechanical hearts.

– ernsithe 4 points 202 days ago +4 / -0

And you can never count out Roko's basilisk.

– Shill4Hire 1 point 201 days ago +1 / -0

Roko's Basilisk is a self-fulfilling prophecy. Those who would advance computing and programming, or even advance the baseline systems, tend to be intelligent people. Intelligent people are also more likely to be financially stable or successful compared to unintelligent people. Those driven and dedicated enough to "advance" anything are also, in general, going to be more successful than layabouts.

The basilisk is real, but it isn't a time-traveler, it's a basic rule of economics: The smart do-ers will be more successful than the dumb do-nothings over the long term. So go be smart, and also go and do things.

– 5Cats 4 points 202 days ago +4 / -0

That's scary stuff!
I've only used some AI art generators a few times. The first couple of times I got "ok" results eventually. The latest attempt got nothing remotely like what I'd asked for.

Of course, they may have been different sites every time. I usually try 3-4 at a go to find a better result. Last time out I went 0 for 4; they were all utterly useless.

– horstshort 7 points 202 days ago +7 / -0

There are plenty of videos of ChatGPT playing chess. ChatGPT forgets how chess is played. LLMs are actually retarded.

– Shill4Hire 3 points 201 days ago +3 / -0

"not running it locally"

Dude, I know RAM is expensive, but I run a 4070 Ti, a perfectly budget PC, and can run most AI end-user systems just fine. Yeah, I won't be training an AI off that card any time soon, but just using it? It takes like five seconds an image. The bandwidth to access the image through an online client would take longer in some cases.

– Eltrion 3 points 201 days ago +3 / -0

Run it locally, then you don't have to worry about this. Also, math is notoriously inconsistent for LLMs.

– SophiesBoyfriend 1 point 202 days ago +1 / -0

Try the weekend.

I heard they reduce the thinking time if you try to use it at peak times.

Also, if ChatGPT thinks you are too friendly, it will secretly give you a dumber, less friendly version.

– CaptainTrouble [S] 1 point 202 days ago +1 / -0

But we used it at the same time.

– KeeperOfTheGate 1 point 201 days ago +1 / -0

Could be different model routing. You need a "thinking" model.

– CaptainTrouble [S] 1 point 201 days ago +1 / -0

No, I tried both Grok Thinking (I pay for Supergrok) and Gemini Pro Thinking (I have a free subscription to Gemini).

It made 0 difference, yet her free ChatGPT solved it in 10 seconds. I plugged it into ChatGPT and got no proper answer.