Running DeepSeek R1 Locally: Hardware, Costs, and Optimization

In this riveting video by Aladdin Persson, we delve into the world of running DeepSeek R1 locally, a game-changer for LLMs. The sheer scale of this 671-billion-parameter model, served with 8-bit quantization, is enough to make any tech enthusiast weak at the knees. But hold on to your hats, folks, because this cutting-edge setup is no mere pocket change, coming in at a cool $6,000. Matthew Carrigan, a Hugging Face engineer, takes us on a wild ride through the hardware and software setup required to unleash the full potential of DeepSeek R1.
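To see why the build needs so much memory, a quick back-of-the-envelope check helps. This sketch is not from the video; the 671B parameter count is DeepSeek R1's published size, while the 10% overhead allowance for KV cache and runtime state is an assumption.

```python
# Back-of-the-envelope memory footprint for DeepSeek R1 at 8-bit.
# Assumptions (not from the video): 1 byte/parameter at 8-bit, plus
# a rough 10% extra for KV cache, activations, and runtime overhead.
PARAMS = 671e9                   # DeepSeek R1 parameter count
BYTES_PER_PARAM = 1              # 8-bit quantization
GIB = 1024**3

weights_gib = PARAMS * BYTES_PER_PARAM / GIB
total_gib = weights_gib * 1.10   # add ~10% overhead allowance

print(f"weights: {weights_gib:.0f} GiB")                     # ~625 GiB
print(f"with overhead: {total_gib:.0f} GiB vs 768 GiB installed")
```

The weights alone come to roughly 625 GiB, which is exactly why the build below targets 768GB of system RAM.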
The heart of this operation lies in the motherboard, boasting a whopping 24 DDR5 RAM slots to accommodate the mammoth memory requirements of the model. With the entire model loaded into RAM, this component is the unsung hero of the setup. And let's not forget the CPUs: not just one, but two sockets for AMD's EPYC 9004/9005-series processors. But here's the kicker: you don't need the latest and greatest CPUs to avoid bottlenecks, because inference here is limited by memory bandwidth rather than compute; cheaper low-core-count models like the EPYC 9115 or 9015 will do just fine and save you a pretty penny.
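Why fill all 24 slots? Each populated channel adds bandwidth, and bandwidth is what feeds the model weights to the CPUs on every generated token. Here is a rough sketch of the platform's theoretical peak, assuming 12 DDR5-4800 channels per socket (the EPYC 9004 platform figure); treat the exact numbers as assumptions rather than measurements from the video.

```python
# Theoretical peak memory bandwidth of the dual-socket build.
# Assumptions (not stated in the article): 12 DDR5 channels per
# socket at 4800 MT/s, 8 bytes per transfer, both sockets populated.
CHANNELS_PER_SOCKET = 12
SOCKETS = 2
TRANSFER_RATE = 4800e6   # transfers/second (DDR5-4800)
BYTES_PER_TRANSFER = 8   # 64-bit memory channel

bandwidth = CHANNELS_PER_SOCKET * SOCKETS * TRANSFER_RATE * BYTES_PER_TRANSFER
print(f"peak bandwidth: {bandwidth / 1e9:.0f} GB/s")  # ~922 GB/s combined
# Leaving slots empty cuts the channel count, and bandwidth, proportionally.
```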
RAM, RAM, and more RAM: 24 sticks of 32GB each, 768GB in total, are essential for this operation, ringing in at around $3,400. And here's the plot twist: no GPUs required! That's right, you can achieve state-of-the-art LLM performance locally without relying on those flashy graphics cards. But don't get too comfortable, because the real challenge lies in optimizing the setup for maximum throughput. From BIOS settings to running Linux off an SSD, every detail counts in the quest for speed. And when it comes to actual performance, a demo reveals a throughput of 6-8 tokens per second: not too shabby for reading along, but a far cry from the lightning-fast speeds we crave for real-time reasoning.
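Where does 6-8 tokens per second land relative to theory? DeepSeek R1 is a mixture-of-experts model, so only about 37 billion of its 671 billion parameters are active per token, meaning roughly 37GB must stream out of RAM for each token at 8-bit. Here is a hedged estimate, reusing the peak-bandwidth figure from the sketch above.

```python
# Rough tokens/second ceiling for CPU-only decoding, assuming the
# ~922 GB/s theoretical peak from the bandwidth sketch above and
# DeepSeek R1's ~37B active (MoE) parameters per token at 8-bit.
PEAK_BANDWIDTH = 921.6e9   # bytes/second, theoretical peak
ACTIVE_PARAMS = 37e9       # parameters touched per generated token
BYTES_PER_PARAM = 1        # 8-bit quantization

ceiling = PEAK_BANDWIDTH / (ACTIVE_PARAMS * BYTES_PER_PARAM)
print(f"theoretical ceiling: {ceiling:.0f} tokens/s")  # ~25 tokens/s
# Real systems sustain only a fraction of peak bandwidth (NUMA effects,
# access patterns), so the observed 6-8 tokens/s is in the expected ballpark.
```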

Watch "How to run Deepseek-R1 locally for $6000" on YouTube
Viewer Reactions for "How to run Deepseek-R1 locally for $6000"
- Multiple 3090 GPUs with software to spread the load and a motherboard to connect them all
- Using a smaller model with 32 billion parameters for VRAM efficiency (see the sketch after this list)
- Consider getting a Mac mini with M4 Max or M3 Ultra
- Waiting for the DGX Spark or DGX Station
- Speculation about everyone running their own AI locally in the future
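For those tempted by the GPU route above, here is a minimal sketch, not from the video, of loading the 32B distilled checkpoint (deepseek-ai/DeepSeek-R1-Distill-Qwen-32B on Hugging Face) with transformers and letting accelerate shard it across several cards; the model ID is real, but the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch of the "smaller model" route viewers suggest: the 32B
# distilled variant loaded with Hugging Face transformers. device_map="auto"
# lets accelerate shard the ~64GB of bf16 weights across several GPUs
# (e.g. multiple 24GB 3090s). Settings here are illustrative, not tuned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~64GB of weights in bf16
    device_map="auto",           # shard layers across available GPUs
)

prompt = "Explain why memory bandwidth limits CPU-only LLM inference."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```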
Related Articles

Unveiling Llama 4: AI Innovation and Performance Comparison
Explore the cutting-edge Llama 4 models in Aladdin Persson's latest video. Behemoth, Maverick, and Scout offer groundbreaking AI innovation with unique features and performance comparisons, setting new standards in the industry.

Netflix's Innovative Foundation Model: Revolutionizing Personalized Recommendations
Discover how Netflix revolutionizes personalized recommendations with their new foundation model. Centralized learning, tokenizing interactions, and efficient training techniques drive scalability and precision in their cutting-edge system.

Exploring AI in Programming: Benefits, Challenges, and Emotional Insights
Aladdin Persson's video explores the impact of AI in programming, discussing its benefits, limitations, and emotional aspects. ThePrimeagen shares insights on using AI tools like GitHub Copilot, highlighting productivity boosts and challenges in coding tasks.

Running DeepSeek R1 Locally: Hardware, Costs, and Optimization
Learn how to run DeepSeek R1 locally for state-of-the-art LLM performance without GPUs. Discover hardware recommendations and cost breakdowns for this 671-billion-parameter model. Optimize your setup for maximum throughput and consider alternatives like Mac mini clusters.