Mathdebate - Kotaku In Action 2 - The Official Gamergate Forum

Google's new breakthrough algorithm reduces LLM memory requirements by AT LEAST 6 times. Memory price collapse soon? by Mathdebate

Mathdebate 1 point 110 days ago +1 / -0

More from the paper itself:

TurboQuant proved it can quantize the key-value cache to just 3 bits without requiring training or fine-tuning and without causing any compromise in model accuracy, all while achieving a faster runtime than the original LLMs (Gemma and Mistral). It is exceptionally efficient to implement and incurs negligible runtime overhead. The following plot illustrates the speedup in computing attention logits using TurboQuant: specifically, 4-bit TurboQuant achieves up to an 8x performance increase over 32-bit unquantized keys on H100 GPU accelerators.

https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
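
For anyone wondering what "quantize to 3 bits" means in practice, here is a rough sketch. To be clear, this is NOT TurboQuant's algorithm: it is plain uniform min/max scalar quantization, swapped in only to show the general idea of storing small integer codes instead of floats. The key vector and the function names (`quantize`, `dequantize`, `compression_ratio`) are made up for illustration.

```python
# Illustration only: plain uniform (min/max) scalar quantization.
# This is an assumed stand-in, not the method from the TurboQuant post.

def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] over the vector's min/max range."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant vector, where hi == lo would give a zero step.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

def compression_ratio(orig_bits, quant_bits):
    """Raw per-value storage ratio; ignores the small cost of storing lo and scale."""
    return orig_bits / quant_bits

# Hypothetical key vector (made-up numbers, not real model data).
key = [0.12, -0.87, 0.45, 0.99, -0.33, 0.05, -0.61, 0.78]

codes, lo, scale = quantize(key, bits=3)
restored = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(key, restored))

print(codes)                     # integer codes, each between 0 and 7
print(max_err <= scale / 2 + 1e-9)
print(compression_ratio(32, 4))  # 8.0
```

Each value ends up at most half a quantization step away from the original. A naive scheme like this generally loses more accuracy at low bit widths than the post reports for TurboQuant, which is what makes the claim interesting.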

We're bombing Iran again by TheOutlaw

Mathdebate 27 points 137 days ago +27 / -0

With TikTok gone, Americans are only allowed to ask 'how high' when Israel says 'Jump'

Tech expert who highlighted problems with UK's 'online safety act' was called names by Mathdebate

deleted 1 point 334 days ago +1 / -0

Obese Latina in the new Star Wars by Webspawner3

deleted 1 point 334 days ago +1 / -0

Police stabbed in Ireland by a 2nd generation muslim migrant (described by media, police and politicians as “irish born and bred”) by SophiesBoyfriend

deleted 1 point 348 days ago +1 / -0

TIL - Europe’s Rape Gangs are actually “groups of cousins” 1:40 by SophiesBoyfriend

deleted 1 point 356 days ago +1 / -0

Controversial 'racist and sickening' bonfire with effigy of migrants in life jackets and a 'stop the boats' sign at the top is set alight in Northern Ireland by LastRights

deleted 3 points 1 year ago +3 / -0

based lady lmao by Telia

deleted 1 point 1 year ago +1 / -0