We ran a giant AI model, DeepSeek-R1 671B at FP16, on an AMD EPYC 9965 server to see if a CPU server could handle what many GPU servers cannot. We also ran the model alongside traditional virtualization workloads to see how that impacted performance. There is even a how-to so you can get started running AI in your virtualization cluster using Open WebUI and Ollama.

STH Main Site Article: https://www.servethehome.com/running-the-deepseek-r1-671b-model-at-fp16-fidelity-alongside-amd-virtualized-workloads/
Substack: https://axautikgroupllc.substack.com/
STH Top 5 Weekly Newsletter: https://eepurl.com/dryM09

----------------------------------------------------------------------
Become a STH YT Member and Support Us
----------------------------------------------------------------------
Join STH YouTube membership to support the channel: https://www.youtube.com/channel/UCv6J_jJa8GJqFwQNgNrMuww/join
Professional Users Substack: https://axautikgroupllc.substack.com/

----------------------------------------------------------------------
Where to Find STH
----------------------------------------------------------------------
STH Forums: https://forums.servethehome.com
Follow on Twitter: https://twitter.com/ServeTheHome
Follow on LinkedIn: https://www.linkedin.com/company/servethehome-com/
Follow on Facebook: https://www.facebook.com/ServeTheHome/
Follow on Instagram: https://www.instagram.com/servethehome/

----------------------------------------------------------------------
Other STH Content Mentioned in this Video
----------------------------------------------------------------------
- Inside a 100K GPU AI Cluster: https://youtu.be/Jf8EPSBZU7Y
- AMD EPYC 9005 Turin: https://youtu.be/sM_lWr6iRds
- Buying the most popular server on Newegg: https://youtu.be/XUsNMSyVQU8
- 128GB AI mini PC: https://youtu.be/8_pw7mKmaLw
- Most unique 8x GPU AI server: https://youtu.be/sUkZ5XBX_pQ
- Supermicro NVIDIA H200: https://youtu.be/dmOCYFVLi2M
- Aivres NVIDIA H200: https://youtu.be/RjWRXNiz50c

----------------------------------------------------------------------
Timestamps
----------------------------------------------------------------------
00:00 Introduction
01:43 Why we are running AI on a virtualization server
04:07 Taking a look at the AMD Volcano 2P AMD EPYC 9965 System
06:10 How to Set Up Open WebUI and Ollama with DeepSeek-R1 671B at FP16
10:00 AMD EPYC 9965 Performance on DeepSeek-R1 671B FP16 virtualized and bare metal
14:21 Tip for 1P Operation and Not Hurting Performance
16:43 Key Lessons Learned Running LLMs on a Virtualization Host
