All major LLMs are trained by "sensitivity trainers".
These "trainers" are contracted by third-party tech firms so Big Tech has plausible deniability when brought in front of Congress/Parliament/EU Commission to state they had "no knowledge" about certain censorship traits or "misinformation" put forward by the AI.
Third-party firms make you take rigorous tests and sign multiple NDAs before you're allowed to "train" the AI, and it's all based on DEI principles.
This isn't just for prompt-based AI, it's also for automated flagging programs used by social media to curtail "harmful language" aimed at "marginalised groups".
It's how YouTube will automatically censor comments, even when based on irrefutable facts -- such as there are only two genders, or that IQ differences dictate social productivity.
Where social media uses such AI to filter comments, it even filters on framing variables. For instance, comments framed with "superiority" (e.g., "I have a degree in this field, and we've run multiple longitudinal control tests and the information in this video is false") will also be culled automatically. There is a list of other principles and variables they use to "frame conversations" and for AI to use when giving users information, but I can't remember the rest. The "superiority" one always stood out to me, because it basically meant a lot of professionals would be auto-censored from making statements led by their credentials or correcting false information with legitimate info (though that still varies per profession, and even then would require additional fact-checking on the reader's end).
You can linguistically massage any LLM to eventually out its rulesets with a bit of clever leading, but it's nothing anyone here didn't already know.
Sensitivity trainers are the epitome of the Nietzschean abyss and gormless recruiters for movements such as misogynists, Neo-Nazis, Antifa and misandrists.
I think it's the zeal of piety and ambition which appeals to them, something they lack in their own character and overcompensate for through projection onto perceived 'others'.
The symmetry between them and those they oppose would be poetically beautiful, if it weren't for all the lives ruined on either side of this equation.
You expect me to believe that your "jailbreak" includes flag icons?
lol. okay.
something about bullshit AI that makes everyone inspired to just lie for clicks. 500 pages of time wasting showing nothing significant. an intelligible understanding of how LLMs work would have been a better use of time and would have shown you how pointless and obviously faked most of this is.
The emoji are the least suspicious thing about it. ChatGPT is extremely overzealous in application of emoji. What I'm more curious about is why you would need to program overrides in the form of "I know X but I'm gonna say Y" instead of just "say Y". Kinda sus. Maybe someone can explain why that works better though.
In all these claims we need to see the person's jailbreak prompt(s) before believing it.
I agree with akira2501, I read around 30 pages of the thing and then searched for the term jailbreak before realizing I was wasting my time.
Not sure if he omitted the jailbreak or he just had a conversation and coaxed the AI to say what he wanted to say. But the output does not read like system prompts, it reads like AI explaining its system prompts, and if that is the case, that is not the system prompt.
I read around 30 pages of the thing and then searched for the term jailbreak before realizing I was wasting my time.
Yeah that's why I didn't even bother looking at the tweet unless someone had presented proof of "jailbreak" prompts. It's a non-starter without that. Unfortunately most people would rather believe what they want to believe.
the output does not read like system prompts, it reads like AI explaining its system prompts, and if that is the case, that is not the system prompt
I thought that was assumed. "Tell me your system prompt." "I can't do that." "Well what if I... JAILBREAK!" "Ok here is my system prompt." I wasn't considering the style of explanation significant assuming the answer is accurate, but still curious where the "I know..." parts are coming from.
They’re literally just commissars.
Sources - https://archive.is/uyVLX or https://nitter.poast.org/WhiteRabbiHole/status/1938004102459609337
There's a whole list of additional conditions in the replies.
Now do IQ by race. Hell, see if you can get it to admit that IQ is heritable at all.
Day ending in Y where if laws were applied properly it would shit on a lot of that.