I just asked my Gemini Pro to create an Ancient Roman General picture and it refused to do it because it's controversial.
There must be some deep, dark corner of the internet where AI isn't so restricted. Where is it?
That's a lot hazier than what I thought you meant. Papers exist for all kinds of things, but that doesn't mean they all really work, or that a major company is using them in production. I think this is what you saw, and it's for fine-tuning low-rank adaptations (LoRA). Having read plenty of papers like this, I'd bet it's a lot less effective in practice. Anthropic, the "AI Alignment" company, has more recent research that's a lot less ambitious, and even that's basically just theoretical.
In any case, "just a jailbreak prompt" is not what I'm referring to. When a model is online, the provider prevents you from editing the things that it says. With an open source model, you can directly edit the things that it is saying, providing affirmative responses from "its own mouth" as the lead-in to a task completion. LLM refusal training can't handle that, which is why the online interfaces are so restrictive.
Moreover, I don't think you need a $15k computer to run an LLM, especially in the era of quantization. You can run the lightest DeepSeek model on your laptop, or on a cheap Colab server ($10 a month). I did this myself a while back (though not with DeepSeek), and I'd highly encourage you to try it out.
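To put rough numbers on the quantization point, here's a back-of-the-envelope sketch of how much memory the weights alone take at different precisions. The 7-billion-parameter size is a hypothetical example I picked, not the size of any specific model mentioned here, and the estimate ignores the KV cache, activations, and the small overhead quantization formats add for scales.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB.

    Assumption: every weight is stored at the same bit width.
    Real quantized formats add a little overhead for scale factors.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    n_params = 7e9  # hypothetical 7B-parameter model
    for bits in (16, 8, 4):
        gib = weight_memory_gib(n_params, bits)
        print(f"{bits:>2}-bit weights: about {gib:.1f} GiB")
```

Going from 16-bit to 4-bit cuts the weight footprint by a factor of four, which is roughly the difference between needing a dedicated GPU box and fitting in ordinary laptop RAM.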
Even the model I ran on that $15K computer (a version of LaMDA) was too dumb to be useful for anything at all, and it was really slow compared to hosted versions. An even lighter model would be pointless, and getting the speed I see from hosted models would probably require a $100K computer, or a $10 or $15 an hour AWS instance.
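For what it's worth, the buy-versus-rent comparison is simple arithmetic. This is only a sketch using my own rough figures from above ($100K to buy, $10 to $15 an hour to rent); it ignores electricity, depreciation, resale value, and storage or bandwidth charges, so treat it as an order-of-magnitude check.

```python
def break_even_hours(purchase_price: float, hourly_rate: float) -> float:
    """Hours of rented compute that cost the same as buying outright.

    Assumption: flat hourly rate, no other costs on either side.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_price / hourly_rate


if __name__ == "__main__":
    purchase_price = 100_000  # rough guess from the post, in USD
    for rate in (10, 15):
        hours = break_even_hours(purchase_price, rate)
        years_nonstop = hours / (24 * 365)
        print(f"${rate}/hour: {hours:,.0f} hours "
              f"(about {years_nonstop:.2f} years running nonstop)")
```

Unless the machine would be busy most of the day for months on end, renting by the hour comes out cheaper under these assumptions.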
Do you mean LLaMA (Zuckerberg has stripped out the camel case now, apparently)? LLaMA is known to be fairly mediocre; it's not one of the competitive ones. DeepSeek is the first really 'modern' LLM to get an open-source release.
Anyways, what's your use case? What requires a smart, uncensored LLM at very low latency?