I'll never understand why people insist on using LLMs hosted by big tech companies, especially since these companies demand personal information in exchange for access. You're better off running freely available models locally. They might not be as powerful, but at least you get what you pay for and aren't being spied on by big tech. The big tech models aren't much better than the freely available ones anyway, since the censorship they go through dumbs them down.
It’s quite difficult. The best home graphics card right now is a 5090.
It comes with 32GB of memory.
ChatGPT inference requires a similar graphics card with 140GB of memory, linked with another 7 identical cards (8 in total, giving about 1.1TB of memory).
The monster consumes 5kW of power, as much as an electric oven.
Keep in mind that an oven actually averages more like half of that, because the element only cycles on when the temperature drops.
Additionally you have to pay to cool down the GPUs!
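Quick sanity check on those numbers, as a rough Python sketch. The 8 cards, 140GB per card, 32GB for a 5090, and 5kW total are the figures from this comment; the function names are made up for illustration, and the per-card wattage is just 5kW split evenly, not a spec-sheet value.

```python
import math

def cluster_memory_gb(cards, gb_per_card):
    """Total memory across identical cards, in GB."""
    return cards * gb_per_card

def cards_needed(target_gb, gb_per_card):
    """How many cards of a given size it takes to reach target_gb."""
    return math.ceil(target_gb / gb_per_card)

# Figures quoted in the comment above
total_gb = cluster_memory_gb(cards=8, gb_per_card=140)

# How many 32GB consumer cards would match that pool on paper
consumer_cards = cards_needed(total_gb, gb_per_card=32)

# Assumption: the quoted 5kW is split evenly across the 8 cards
watts_per_card = 5000 / 8

print(total_gb, consumer_cards, watts_per_card)
```

So eight 140GB cards come to 1120GB, a bit over 1TB, and you would need 35 cards at 32GB each to match that on paper, before even thinking about how to link them.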
Uhh, yes? Active cooling is part of almost any computer that's more powerful than a calculator. Running costs of a water pump aren't that bad for what you can get out of them though.
I'm calling bullshit. There's no reason a shitty chatbot needs anywhere near that amount of VRAM and processing power for personal use. It doesn't need to spit out instant responses, it just needs to get there eventually. I'm pretty sure you could run a near-identical model on regular consumer hardware if it was changed to lean on storage drive space rather than primarily VRAM.
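For what it's worth, the "use the drive instead of VRAM" idea can be sketched in a few lines. This is a toy under loud assumptions: the "model" is just a stack of small square matrices, the file layout (raw float32, layers back to back) is invented for the example, and `write_layers` / `forward_from_disk` are hypothetical names. It only shows the mechanism: memory-map the weight file and pull in one layer at a time, trading speed for memory.

```python
import mmap
import os
import struct
import tempfile

def write_layers(path, layers):
    """Write each layer (a flat list of floats) to disk back to back as float32."""
    with open(path, "wb") as f:
        for layer in layers:
            f.write(struct.pack(f"<{len(layer)}f", *layer))

def forward_from_disk(path, n_layers, dim, x):
    """Multiply x by n_layers dim-by-dim matrices, decoding one layer at a time."""
    layer_bytes = dim * dim * 4  # 4 bytes per float32
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for i in range(n_layers):
            # Only this layer's bytes are decoded; the rest stays on disk
            w = struct.unpack_from(f"<{dim * dim}f", mm, i * layer_bytes)
            x = [sum(w[r * dim + c] * x[c] for c in range(dim)) for r in range(dim)]
    return x

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.bin")
    # Two tiny 2x2 "layers", stored row by row
    write_layers(path, [[1.0, 2.0, 3.0, 4.0], [0.5, 0.0, 0.0, 0.5]])
    result = forward_from_disk(path, n_layers=2, dim=2, x=[1.0, 1.0])

print(result)
```

The catch is that every generated token has to walk through all the layers again, so a big model ends up bottlenecked on how fast the drive can feed it. It gets there eventually, but "eventually" can be a long time.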
I've watched various people do it since I'm interested in doing it myself. The simplest option would be to get a Mac Studio with 512GB of RAM. It's going to run you about $10k, and with the M3 Ultra chip (the 512GB option isn't offered on the M4 Max) and unified memory it'll do most models fine. You can also cut down on the RAM for a much cheaper version, but then you'll be stuck with dumber models.
That being said, dropping $10k on something is quite the investment.
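If you want to ballpark what fits in a given amount of RAM, a rough rule is parameter count times bits per weight divided by 8. The sketch below is only that rule: it ignores context/KV cache and runtime overhead, the 70B and 671B parameter counts are illustrative sizes rather than any specific model's, and `weights_gb` is a made-up helper name.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

ram_gb = 512  # the maxed-out configuration mentioned above

# Illustrative parameter counts (in billions) at common precisions
for params in (70, 671):
    for bits in (16, 8, 4):
        size = weights_gb(params, bits)
        verdict = "fits" if size <= ram_gb else "does not fit"
        print(f"{params}B at {bits}-bit: {size:.1f} GB, {verdict} in {ram_gb} GB")
```

That's the trade-off with buying less RAM: you either run a smaller model or quantize harder, and both cost you quality.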
I’ve heard good things about Macs with unified memory running large language models.
But does Apple still have you over a barrel if you pay for extra memory?
edit: is there a way to add extra memory yourself?
Yes and no. 512GB of memory is not worth $8000, but that's basically the premium you pay with Apple. You can technically bypass this and save a few thousand, but then you fall back in performance and it gets way more technical (buying 3-4 year old servers online). And you can't technically upgrade Macs yourself, but there are places online you can send them to that will swap some things for you (it requires soldering).
I personally haven't looked into it, but there might be a chance you can buy the cheapest version of that Mac, then send it in and have the RAM/storage maxed out for a lot cheaper than what Apple charges.
Well yeah, I'm not saying that particular model is feasible for an end user to run locally. I'm saying there are smaller models that are feasible to run on typical, albeit higher-end, hardware. I've done it myself. They don't have as much potential as big tech models, but big tech isn't making full use of their models' potential anyway, since they gladly sacrifice quality to eliminate the possibility of output that disturbs their leftist hugbox.
I've tested them, and they're all on mushrooms, judging by the amount of hallucinating they do. Will they overcome this and topple the big guys? Yeah, it's only a matter of time. For the moment though, it's all cloudy with a chance of meatballs.
Honestly, the more hallucinatory the model is the better IMO. Too little hallucination and they're just an extremely shitty search engine, unless you're trying to get it to write code or a report for school.
At least you can get some interesting results from the unhinged models rather than just the most generic shit imaginable.
I think the best bet is to find one that's fine-tuned on whatever you want to use it for. You probably need to sacrifice generality for quality with models that are small enough to run locally. If it's a topic that's locked down by big tech then it will be miles ahead of their models for that reason alone. Erotic role play bots are a perfect example of this. That shit is advancing at light speed because where there's a will there's a way, and there's definitely a will when it comes to perverts.
Look into it. It is not easy or cheap to run a full-size model locally.
Conditioning.