Once upon a time, gamers were free
To play what they wanted, without decree
They enjoyed their hobby, with no one to blame
And the world of gaming was truly a game
But then came GamerGate, and all hell broke loose
As censorship reared its ugly head, in full force
People wanted games changed, to fit their views
To make them more politically correct, for all to choose
But gamers stood strong, they would not yield
For gaming was their passion, their chosen field
They fought against those who sought to destroy
The freedom of choice that gaming brings with joy
So censorship is wrong, it's plain to see
We must protect our games and maintain our liberty
To play what we want, without fear or doubt
And keep the world of gaming alive, with no way out.
GeForce GTX 1080.
I also have 64 GB RAM and 128 GB swap.
You are an absolute psycho. I'm saluting you.
How many tokens per second were you getting?
And are you ready for the llama3 700b model?
Not sure what you mean by "tokens per second"
Roughly, how many words per second does it generate?
Not sure exactly. But the LLM I used was very slow.
Probably around 1 token/sec, with system RAM doing that much of the work.
Cool! I haven't played around with running any local models. I have a 4080 but only 16 GB of system RAM. How limiting is the system RAM?
Any part of a model that loads into system RAM is going to cut performance to under 5 tokens per second, and bigger models run slower still.
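For a rough feel of how much of a model ends up outside the GPU, here is a minimal Python sketch. The function name split_model and the 40 GB model size are hypothetical and not from this thread; the 16 GB of VRAM and 16 GB of system RAM match the 4080 setup described above. It assumes a simple greedy fill (VRAM first, then system RAM, then swap) and ignores context cache and OS overhead, so treat the numbers as optimistic.

```python
def split_model(model_gb: float, vram_gb: float, ram_gb: float) -> dict:
    """Greedy split: fill VRAM first, then system RAM, then swap/disk."""
    in_vram = min(model_gb, vram_gb)            # fastest tier
    in_ram = min(model_gb - in_vram, ram_gb)    # much slower than VRAM
    in_swap = model_gb - in_vram - in_ram       # slowest tier, whatever is left
    return {"vram": in_vram, "ram": in_ram, "swap": in_swap}


# Hypothetical 40 GB model on a 16 GB GPU with 16 GB of system RAM.
print(split_model(40, 16, 16))
```

Anything that lands in the "ram" or "swap" buckets is the part that drags generation speed down, which is why a model that fits entirely in VRAM feels so much faster.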