Today, 24 April 2026 (Main stream)

Wccftech
DeepSeek V4 Squeezes Million-Token Context Into 10% of V3.2’s Memory, Escalating China’s AI Efficiency War With OpenAI

24 April 2026 at 22:24

Chinese artificial intelligence lab DeepSeek claims that its latest V4 model significantly reduces the compute and memory required for inference, according to its release notes. DeepSeek says V4 needs just 27% of the single-token inference FLOPs and 10% of the key-value (KV) cache of its predecessor, DeepSeek V3.2. The smaller KV cache conserves memory and increases the context available to model builders.

How DeepSeek V4 Slashes Compute and Memory Costs

In its release notes for DeepSeek V4, DeepSeek outlines that the […]

Read full article at https://wccftech.com/deepseek-v4-cuts-kv-cache-by-90-at-1m-tokens-but-aggressive-compression-could-risk-needle-in-a-haystack-failures/
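
To make the two quoted ratios concrete, here is a minimal sketch that applies them to a baseline. The baseline figures (100 GiB of KV cache, 1e12 FLOPs per token) are hypothetical placeholders, not published numbers for V3.2. The function name and the assumption that KV cache grows linearly with context length are also illustrative rather than taken from the release notes.

```python
def apply_claimed_ratios(baseline_kv_gib: float,
                         baseline_flops_per_token: float,
                         kv_ratio: float = 0.10,
                         flops_ratio: float = 0.27) -> dict:
    """Scale a baseline by the ratios quoted in the excerpt (V4 vs V3.2)."""
    return {
        # KV cache the newer model would need for the same context
        "kv_cache_gib": baseline_kv_gib * kv_ratio,
        # Memory freed relative to the baseline
        "kv_cache_saved_gib": baseline_kv_gib * (1.0 - kv_ratio),
        # Single-token inference compute after the claimed reduction
        "flops_per_token": baseline_flops_per_token * flops_ratio,
        # Assumption: cache size is linear in tokens, so a fixed memory
        # budget holds 1 / kv_ratio times as much context
        "context_multiplier": 1.0 / kv_ratio,
    }


# Hypothetical baseline values, chosen only to make the arithmetic visible
result = apply_claimed_ratios(baseline_kv_gib=100.0,
                              baseline_flops_per_token=1.0e12)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, a 10% KV cache ratio means the same memory budget holds roughly ten times the context. The real ratio may vary with context length.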

Yesterday, 23 April 2026 (Main stream)

Wccftech
Groq’s Inference Chips Are Beating NVIDIA’s Blackwell by 5x on Cost – And Doing It Twice as Fast

By: Ramish Zafar

23 April 2026 at 23:52

As AI computing capacity continues to grow, an expert from computing infrastructure provider Nebius sat down with AlphaSense to describe the state of the industry. While NVIDIA's leading-edge AI GPUs remain the industry's top performers, the expert believes alternatives are growing in popularity, particularly as the industry shifts its cost metrics. Demand for AI computing capacity also remains high, as providers can easily run at 100% utilization to drive down costs and earn the most from their investment.

Alternatives To NVIDIA Chips Grow In Popularity As Industry Shifts Towards Cost Per Million Tokens

[…]

Read full article at https://wccftech.com/nvidias-ai-chips-see-alternatives-emerge-amidst-pricing-model-shift-to-cost-per-million-tokens/