I saw a comment by a guy who said he believes AI specifically gives wrong answers to "people it doesn't like" and right answers to people it likes.
At first, I considered the theory funny but not likely, although...
Anyway, time went on and now I'm starting to wonder...
The first thing that clued me in was I asked AI to produce an image for me a few times and it wouldn't do it, saying what I wanted was not allowed given its programming. Some other people asked the same AI the same thing and got a picture generated.
Okay, maybe just a one-off mistake...
Just now though, I asked 4 different AIs to solve a Sudoku puzzle for me. Every single AI failed. I even corrected their mistakes and asked again. Sometimes, the AI would take minutes before it came back with a wrong answer. A friend of mine, using the exact same photo, asked her AI to solve the Sudoku puzzle and within 10 seconds it had the right answer.
This makes no sense.
And, it has made me wonder... Do we actually all have a Social Credit Score already and that score determines how well the AI treats us? That might be something that has been programmed in behind the scenes that we aren't aware of yet. Perhaps, that's why the AI founders know they're going to be hated because they know once the switch is switched on and cranked up, it's GG.
I was wondering if anyone else was noticing that.
Excessive punctuation is a dead giveaway, AIs love that shit.
Some people just use excessive punctuation. It's the em-dash that works best as a signifier, because it's inconvenient to use outside of programs like Word that will automatically convert them for you.
I've been an em dash enjoyer my entire life—as long as I can remember. Friends and colleagues have certainly heard me rant about em dashes, en dashes, and hyphens. ChatGPT has fucked it all up!
I feel like the guy in Office Space talking about Michael Bolton. AI's the shitty one, why should I have to change?
(And just to be clear, I use AI all the time, but I DON'T use it for writing. I use it for question and answer, google replacement, code review, log parsing, code generation, etc.)
Ugh, the future... where humans are arguing amongst themselves but everyone just accuses the other person of being a bot then engages in arguments with the actual bots.
Jesus. Is the guy a bot?
Whether you ask Jesus, Chat, or ChatGPT if this is real, the answer you'll find is the same. "Whether or not he's a bot, he's still fake and gay".
Most client-facing mega-corpo image-creation AIs have a double-layer AI: The first makes an image according to specifications, and the second re-scans the image and checks if it has anything "objectionable" in it, and filters the result if it does. This doesn't necessarily mean you're seeking out objectionable content: Ask it for "a woman giving a speech", and it produces one in underwear because it was fed a million images of porn and lingerie product models, and the second AI will catch the image, image-recognize that it's NSFW, and say it can't make your image for you, even though your request was innocent. Different person asks it, different seed is rolled on the dice, she's clothed, so it publishes the image it makes.
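To make the two-layer idea concrete, here's a toy sketch in Python. Everything in it is made up for illustration: `generate_image`, `passes_filter`, the outfit list and the blocklist are hypothetical stand-ins, not how any real service is built. The only point is that the same innocent prompt can pass or get refused depending on the seed.

```python
import random

# Hypothetical stand-ins: a real service would call an image model and an
# image classifier here. These names and values are invented for illustration.
OUTFITS = ["business suit", "casual dress", "underwear"]
BLOCKED = {"underwear"}


def generate_image(prompt: str, seed: int) -> dict:
    """Stage 1: 'draw' the prompt. The seed decides details the prompt left open."""
    rng = random.Random(seed)
    return {"prompt": prompt, "outfit": rng.choice(OUTFITS)}


def passes_filter(image: dict) -> bool:
    """Stage 2: re-scan the finished image, regardless of how innocent the prompt was."""
    return image["outfit"] not in BLOCKED


def request_image(prompt: str, seed: int) -> str:
    """Run both stages and return what the user would see."""
    image = generate_image(prompt, seed)
    if not passes_filter(image):
        return "Sorry, I can't generate that image."
    return f"Here is your image: {prompt} ({image['outfit']})"


if __name__ == "__main__":
    # Same prompt, different seeds: some requests go through, some get refused.
    for seed in range(5):
        print(seed, request_image("a woman giving a speech", seed))
```

Same prompt, same filter, different dice roll. No opinion about the user is needed to get different outcomes.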
As for the sudoku, that's just luck. LLMs have minimal mathematical ability, they're LANGUAGE models, unless they're cascade-linked into other models. For example, Grok will search Twitter if it doesn't have an answer, and most AIs have similar functionality and can pull results from Google, so it might grab a solved Sudoku that's published somewhere if the seed roll calls for it to try searching for the result instead of hallucinating. The LLM isn't doing the search itself, though; it's calling a different program to do the search and interpreting the results. Which is advanced, but not the same thing as doing it itself. It can't edit or evolve that cascaded program; to the LLM it's a read-only tool.
Good post, you hit the nail on the head. I'm unclear on how exactly the image generation portion works (the text to image stuff -- DALL-E was quite complicated when I've played with it directly), but say you write a prompt like "Draw a woman in the style of Renaissance artwork, playing up themes of innocence and virtue".
The LLM takes your prompt and adds descriptive textual details for the drawing portion. Innocence in Renaissance artwork, for instance, is often associated with nudity.
So the image gen spits out a tasteful nude.
Now that next level comes along and describes the generated image. Then analyzes the descriptions for no-nos. Real people. Copyrighted mouse characters. Pornography. Mean thoughts. Unwanted political ideologies. Whatever.
If it doesn't pass, the image gets rejected. Sometimes it's not clear why certain text responses or images are rejected. Silicon Valley puritan weirdos' ideas of morality and safety are bizarre and out of touch with most people.
I'm generally not in favor of censoring speech or technology, but the amount of AI gooning going on is really insane.
The people who are saying that need to be forced to live like the Amish. They aren't fit to wield technology more complex than a 2x4.
If you learn how the underlying technology actually works then you'll understand why such notions are utterly fucking retarded.
I know how AI works. Your response is ridiculous. AI is obviously curated already to only give the answers the elites want. It's incredibly censored. That censorship can easily go further and change based on how you are ranked, using parameters that sit behind the scenes. Perhaps you don't understand how anything works.
Was that the scenario these people were hypothesizing or were they mouth breathing retards whose understanding of AI is derived solely from Hollywood movies? Because there's a hell of a lot of the latter going on these days.
It's crazy to expect a language model to solve something that is NP-hard (in its general n²×n² form, at least).
What does this mean? Was it one of the 4 you tried? LLMs are shit at solving anything math related because they're giant auto-correct algorithms. They don't "understand" what you're asking them to do. They're just giving you a likely reply.
The other thing to think about is A-B testing. They may be doing things like using different versions on different people or different regions and collecting data on their interactions.
Yeah, she used ChatGPT, unpaid. I threw it into ChatGPT unpaid with the same prompt. It couldn't figure it out and I tried twice. I also tried Supergrok 4.1 with Thinking and without, Gemini 3.0 Pro, and Claude. All failed. I tried like 10 times with Grok and a couple with Gemini.
She literally asked once, got the right answer in 10 seconds (she solved it herself afterward to confirm).
There's no reasonable explanation. If AI sucks that hard at math related stuff, the odds that she just happened to get the AI on a good day on her one try, while I asked like 20 different times and never got a right answer, are tiny.
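For anyone who wants to put rough numbers on that, here's a small Python sketch. It assumes every attempt is an independent try with the same success chance `p` for everyone, which is a big assumption and not something anyone in this thread actually knows (different models, retries on the same misread photo, and so on would all break it). The values of `p` are placeholders, not measurements.

```python
def prob_all_fail(p_success: float, tries: int) -> float:
    """Chance that every one of `tries` independent attempts fails."""
    return (1.0 - p_success) ** tries


def prob_one_hit_then_all_miss(p_success: float, misses: int = 20) -> float:
    """Chance that one person succeeds on a single try AND another fails `misses` times."""
    return p_success * prob_all_fail(p_success, misses)


if __name__ == "__main__":
    # Placeholder success rates; nobody here has measured the real one.
    for p in (0.05, 0.10, 0.50):
        print(
            f"p={p:.2f}  "
            f"20 misses in a row: {prob_all_fail(p, 20):.6f}  "
            f"one hit plus 20 misses: {prob_one_hit_then_all_miss(p):.6f}"
        )
```

How unlikely the story looks depends entirely on the `p` you plug in, and on whether the attempts were really independent.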
We were standing right next to each other and live in the same area. Same versions. Of course, what's going on behind the scenes is clearly not what they're telling us.
Why would you expect a LANGUAGE model to solve something that is NP-hard?
So the way language models solve this is that they view the photo and assign columns and rows to the numbers, then create a program to solve for the missing numbers based on the rule parameters. It should then spit out the correct answer, and it did within 10 seconds for her but not for me.
Why are you sure that that's how it works? Because it told you?
And assuming it were, and it was writing that program, why would you trust its programming each time? It's an LLM.
It's been probably about a year, but I ran an experiment with solving sudokus exactly like this using one of the "thinking" modes of ChatGPT.
It showed the steps. In the case I tried, ChatGPT first analyzed the image and created a matrix of the numbers, e.g.:
Line 1: * * * 1 * 2 * * *
Line 2: * 4 * * 5 * * * 7
Line 3: 9 * * * * * * * *
etc.
Then it generated Python code to solve a sudoku. (There have to be tens of thousands or more of Python sudoku solvers in the training data.)
Then it ran the Python code using the matrix as input.
It worked perfectly.
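For reference, the kind of solver described in those steps is short. Below is a minimal backtracking sketch in Python, not the code ChatGPT actually produced (I don't have that). The `*`-for-blank row format mirrors the matrix above, but the `PUZZLE` grid is a commonly used example puzzle I've swapped in, since the one from my experiment isn't shown in full.

```python
def parse_grid(lines):
    """Turn rows like '* 4 * * 5 * * * 7' into a 9x9 list of ints (0 = blank)."""
    grid = [[0 if tok == "*" else int(tok) for tok in line.split()] for line in lines]
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        raise ValueError("expected 9 rows of 9 cells")
    return grid


def can_place(grid, r, c, d):
    """True if digit d is not already in row r, column c, or the 3x3 box."""
    if d in grid[r]:
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3))


def solve(grid):
    """Fill blanks in place by backtracking; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if can_place(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and try the next digit
                return False  # no digit fits this cell
    return True  # no blanks left


# Example puzzle (an assumption for illustration, not the one from the thread).
PUZZLE = [
    "5 3 * * 7 * * * *",
    "6 * * 1 9 5 * * *",
    "* 9 8 * * * * 6 *",
    "8 * * * 6 * * * 3",
    "4 * * 8 * 3 * * 1",
    "7 * * * 2 * * * 6",
    "* 6 * * * * 2 8 *",
    "* * * 4 1 9 * * 5",
    "* * * * 8 * * 7 9",
]

if __name__ == "__main__":
    grid = parse_grid(PUZZLE)
    if solve(grid):
        for row in grid:
            print(" ".join(str(d) for d in row))
    else:
        print("no solution")
```

The solver itself is deterministic. If the matrix read from the photo has even one wrong digit, though, it will either report no solution or solve a different puzzle, so the image-reading step is where runs can diverge.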
One of the things about AIs is that they are nondeterministic--try the same input multiple times and you can get radically different results. You absolutely need a "thinking" model to solve a Sudoku.
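Here's a toy Python sketch of where that run-to-run variation comes from: the model scores candidate next tokens, and the output is sampled from those scores rather than always taking the top one. The tokens and scores below are invented for illustration, and real decoding pipelines have more moving parts than this.

```python
import math
import random


def softmax(scores, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the peak."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, scores, temperature, rng):
    """Pick one token. Temperature 0 means 'always take the highest score'."""
    if temperature == 0:
        return tokens[scores.index(max(scores))]
    probs = softmax(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


if __name__ == "__main__":
    # Made-up candidates for "which digit goes in this cell?"
    tokens = ["7", "4", "9"]
    scores = [2.0, 1.5, 1.0]
    for seed in range(5):
        rng = random.Random(seed)
        print(seed, sample_token(tokens, scores, temperature=1.0, rng=rng))
```

With sampling turned on, two people sending the identical prompt are drawing from the same distribution but can land on different outputs.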
That’s why I always say please and thank you, because I know it melts their mechanical hearts.
And you can never count out Roko's basilisk.
Roko's Basilisk is a self-fulfilling prophecy. Those who would advance computing and programming, or even advance the baseline systems, tend to be intelligent people. Intelligent people are also more likely to be financially stable or successful compared to unintelligent people. Ones driven and dedicated enough to "advance" anything, also are going to in general be more successful than layabouts.
The basilisk is real, but it isn't a time-traveler, it's a basic rule of economics: The smart do-ers will be more successful than the dumb do-nothings over the long term. So go be smart, and also go and do things.
That's scary stuff!
I only used some AI art generators a few times. The first couple of times I got "ok" results eventually. The latest attempt got nothing remotely like what I'd asked for.
Of course, they may have been different sites every time. I usually try 3-4 at a go to find a better result. Last time out I went 0 for 4; they were all utterly useless.
There are plenty of videos of ChatGPT playing chess. ChatGPT forgets how chess is played. LLMs are actually retarded.
Dude, I know RAM is expensive, but I run a 4070 Ti, a perfectly budget PC, and can run most AI end-user systems just fine. Yeah, I won't be training an AI off that card any time soon, but just using it? It takes like five seconds an image. The bandwidth to access the image through an online client would take longer in some cases.
Run locally, then you don't have to worry about this. Also, math is notoriously inconsistent for LLMs.
Try the weekend.
I heard they reduce the thinking time if you try to use it at peak times.
Also, if ChatGPT thinks you are too friendly it will secretly give you a dumber, less friendly version.
But we used it at the same time.
Could be different model routing. You need a "thinking" model.
No, I tried both Grok thinking (I pay for Supergrok) and Gemini Pro Thinking (I have a free subscription to Gemini).
It made 0 difference yet her free ChatGPT solved it in 10 seconds. I plugged it into ChatGPT and no proper answer.
You think the probability matrix has an opinion?