KotakuInAction2 The Official Gamergate Forum
Google's new breakthrough algorithm reduces LLM memory requirements by AT LEAST 6 times. Memory price collapse soon? (www.tomshardware.com)
posted 88 days ago by Mathdebate +7 / -0
Comments (6)
– BandageBandolier 4 points 87 days ago +4 / -0

Dunno about the memory prices yet. The AI players were already demanding more than there likely was capacity to produce. This may just mean they get to build more of the infrastructure they wanted rather than there being a surplus of memory yet.

– SophiesBoyfriend 1 point 87 days ago +1 / -0

You’re a mod. Why was this post not visible until it was about a day old?

– BandageBandolier 1 point 87 days ago +1 / -0

I approved it as soon as I saw it, at ~12 hours. It's been a busy month. Handshakes need manual approval for top-level posts.

– SophiesBoyfriend 1 point 87 days ago +1 / -0

👍

– Mathdebate [S] 1 point 88 days ago +1 / -0

More from the paper itself:

TurboQuant proved it can quantize the key-value cache to just 3 bits without requiring training or fine-tuning and without causing any compromise in model accuracy, all while achieving a faster runtime than the original LLMs (Gemma and Mistral). It is exceptionally efficient to implement and incurs negligible runtime overhead. The following plot illustrates the speedup in computing attention logits using TurboQuant: specifically, 4-bit TurboQuant achieves up to 8x performance increase over 32-bit unquantized keys on H100 GPU accelerators.

https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
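
For a rough sense of scale, here is a back-of-the-envelope sketch of how key-value cache size changes with bit width. This is not TurboQuant itself: it only counts raw bits per stored value and ignores any scale or metadata overhead a real quantizer might add. The function name and the model shape (32 layers, 8 KV heads, head dimension 128, 32,768 tokens) are hypothetical, picked only for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bits_per_value):
    """Raw KV-cache size in bytes, ignoring any quantizer metadata."""
    # One key vector and one value vector per layer, per KV head, per token.
    num_values = 2 * num_layers * num_kv_heads * head_dim * seq_len
    return num_values * bits_per_value / 8


# Hypothetical model shape, for illustration only (not taken from the paper).
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=32_768)

sizes = {bits: kv_cache_bytes(bits_per_value=bits, **shape) for bits in (32, 16, 4, 3)}

for bits, size in sizes.items():
    print(f"{bits:>2}-bit cache: {size / 2**30:.2f} GiB")

# Compression ratio depends on which baseline precision you compare against.
ratio_vs_16 = sizes[16] / sizes[3]
ratio_vs_32 = sizes[32] / sizes[3]
print(f"3-bit vs 16-bit: {ratio_vs_16:.2f}x smaller")
print(f"3-bit vs 32-bit: {ratio_vs_32:.2f}x smaller")
```

Under these assumptions a 3-bit cache is about 5.3x smaller than a 16-bit one and about 10.7x smaller than a 32-bit one, so any headline ratio depends on the baseline precision being compared against.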

– SarcasticRidley 2 points 87 days ago +2 / -0

TurboQuant

This is my quant, my math specialist. Look at his face. Look at his eyes. He's asian!


Original 8chan Links to Gamer Gate:

The main GG discussion is on the videogames board: https://8chan.moe/v/

GamerGate archive is at https://8chan.moe/gamergatehq/

GamerGate Wiki: https://ggwiki.deepfreeze.it/index.php/Main_Page

Rules:

ONE: Do not advocate for illegal violence or post other illegal activity. (Be aware of your local laws.)

TWO: Don't threaten, harass, or impersonate users. Also: don't be a psycho. New users will be held to a higher standard.

THREE: Do not post porn.

FOUR: NSFW/NSFL content must be flaired NSFW.

FIVE: No vote manipulation. Do not break communities.win's features.

SIX: No spam or reposts. Do not make more than 5 threads a day.

SEVEN: Do not post falsehoods and hoaxes that are obvious to an uncontroversial degree.

Moderation Logs:

(Two different versions: Scored has more features and is cleaner, but .win lets you see a few more details in certain instances.)

  • Scored
  • .win

Moderators

  • DomitiusOfMassilia
  • C
  • BandageBandolier
  • CarmenOfSandiego
  • The_Shadow_of_Intent
  • SocraticMethod1
  • Kienan
  • Smith1980