I hear a lot of talk about the potential of AI and Sexbots. Just recently, Russia Today even ran an article on the topic.
What is actually the latest in technology in this realm?
In theory, we should be able to have some sort of virtual reality augment like Google Glass, plus voice AI, where you could have a full-on conversation with some unlocked ChatGPT-style AI while having sex with your doll or something. Your doll could be texting you while you're at work, snapping you nudes, and talking about how much it wants you to fuck her, etc... Then the doll AI could even describe how hot it was when you were fucking her, with like sensors or some shit.
Anyone actually know what the pinnacle of this tech is at the moment?
Edit - Asking for a friend, obviously...
What are local LLMs and how does someone get one that works?
Local LLMs are language models that run locally on your own graphics card or CPU.
There are three parts to running an LLM: the model, the backend, and the front end.
The model determines behavior. This is the part that may or may not be censored, or may steer itself away from certain topics. This is also where size and quantization come in. Size is the number of parameters, and quantization determines how many bits each parameter is stored in.
13B and 33B are the sweet spot for size and 4-bit is the sweet spot for quantization right now. Smaller is faster, dumber, and easier to run. There is also context size, but that's changing rapidly right now. Context size determines how long a passage of text the model can process, and consequently how long its memory is in conversation mode.
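To make the size/quantization tradeoff concrete, here's my back-of-the-envelope math (rough rule of thumb only, real files have some format overhead on top):

```python
# Rough estimate of a quantized model's size on disk / in memory.
# bits_per_param is the quantization level (4-bit, 8-bit, 16-bit, ...).
def model_size_gb(params_billion, bits_per_param):
    # params_billion * 1e9 parameters, each stored in bits_per_param bits,
    # converted from bits to gigabytes
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# A 13B model at 4-bit works out to roughly 6.5 GB before overhead,
# while the same model at full 16-bit would be around 26 GB.
print(model_size_gb(13, 4))   # 6.5
print(model_size_gb(13, 16))  # 26.0
```

That's why 4-bit is the sweet spot: it's the difference between a 13B model fitting on a consumer GPU or not.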
The backend determines how the model is run. The model either needs to be loaded into your GPU's VRAM to run on your graphics card, or into your system RAM to run on your CPU.
Oobabooga is a popular backend for running on GPU, and llama.cpp and koboldcpp are backends that run on CPU.
Of course the GPU runs much faster, but it's far easier to expand your system RAM to the 64 or 128 GB you'll need for the big models than it is to get that much VRAM.
Koboldcpp and llama.cpp both have GPU offload modes that use both, which will speed you up a bit over running purely on CPU.
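The offload idea is simple: put as many layers as fit in VRAM on the GPU and leave the rest in RAM. A sketch of the arithmetic (the layer count and per-layer size here are made-up illustration numbers; real values depend on the model and quantization):

```python
# Sketch of the GPU-offload split: as many layers on the GPU as VRAM
# allows, the remainder stays in system RAM for the CPU.
def split_layers(total_layers, layer_size_gb, vram_gb):
    gpu_layers = min(total_layers, int(vram_gb // layer_size_gb))
    cpu_layers = total_layers - gpu_layers
    return gpu_layers, cpu_layers

# e.g. a hypothetical 40-layer model at ~0.25 GB per layer, on an 8 GB
# card where you keep ~1 GB free for context and overhead:
print(split_layers(40, 0.25, 7.0))  # (28, 12)
```

In practice the backends let you pick this number yourself (koboldcpp has a GPU-layers option, last I checked), and you just nudge it up until you run out of VRAM.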
For front ends, both Kobold and ooba have built-in front ends, but there are dedicated front ends like SillyTavern that make interacting with the model more like a chat app.
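Under the hood, front ends like SillyTavern just talk to the backend over a local HTTP API. Roughly what that looks like for a Kobold-style backend (the endpoint path and default port here are from my memory of the KoboldAI API, so check your backend's docs before relying on them):

```python
import json
import urllib.request

# Build a text-generation request for a KoboldAI-style local API.
# URL/port are assumptions based on koboldcpp's usual defaults.
def build_request(prompt, max_length=80,
                  url="http://localhost:5001/api/v1/generate"):
    payload = {"prompt": prompt, "max_length": max_length}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})

req = build_request("Once upon a time,")
# urllib.request.urlopen(req) would send this to a running backend
# and the response would contain the generated text.
print(req.full_url)
```

This is why you can mix and match: any front end that speaks the API can sit on top of any backend that serves it.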
So that was the short version. The technology all exists and is pretty good, but it's a pain to set up. I must emphasize that this is all private and uncensored, and you are completely in control of its behavior. Many people create models by taking existing models and fine-tuning them with specific training data. This can be done very cheaply and quickly, unlike the original base models, which cost thousands of dollars to train.
This is why so much of the ecosystem is based around LLaMA, Facebook's model. It was the first model to be leaked in a format that allowed open-source models to be built off it.
In a few years someone will streamline this enough that it will be accessible to people who aren't enthusiasts. Right now the market is basically super-normie censored offerings like ChatGPT, and super-enthusiast models that require you to know a bit about Linux, TensorFlow, and CUDA to get a grip on.
This is an excellent list of references even for those of us not trying to build a sexbot (or maybe we are). Thanks!
Hey I'm also a local LLM enthusiast and I must say that you've provided a very good overview.
I'd like to add that it really isn't too hard to start playing around with them nowadays; usability has improved quite a bit even over the last few months. I like to recommend koboldcpp because I feel like it's much more newbie friendly than ooba (plus I really enjoy its story mode).
Once you have kobold you only need a GGML model and you're good to go, but I guess picking the right model might still be a hurdle for many folks. Myself, I have an 8GB card and I like TheBloke's finetunes, so his 4-bit quantization of 13B Wizard-Vicuna-Uncensored (https://huggingface.co/TheBloke/Wizard-Vicuna-13B-Uncensored-GGML) works pretty well for me. Looking forward to his finetunes of Llama 2, which are just now starting to drop!