Session 13: DeepSeek
Reinforcement Learning (RL) · Open-source AI · Cost-effective LLM

Presenters

Masih Moloodian, Yasin Fakhar, Mohammad Amin Dadgar

DeepSeek

During the AI Talks meeting, we discussed DeepSeek-R1, a newly introduced large language model (LLM) developed by the DeepSeek AI lab, a subsidiary of High-Flyer, a Chinese hedge fund. DeepSeek-R1 gained popularity rapidly, surpassing ChatGPT in app rankings and downloads, thanks to its open-source release and its unconventional approach to AI development. Unlike most commercial models, DeepSeek originated as a non-commercial AI research initiative, drawing on expertise from leading Chinese universities and operating under China's AI policies. Built on the Fire-Flyer supercomputer infrastructure, the model reflects a distinctly Chinese approach to LLM development, emphasizing transparency, accessibility, and research-driven innovation.

One of DeepSeek-R1's standout innovations is its use of pure Reinforcement Learning (RL) rather than the standard Supervised Fine-Tuning (SFT) approach, allowing it to autonomously correct and refine its reasoning without human intervention. This self-improving capability lets the model detect logical errors and retry solutions, making it highly efficient despite having fewer parameters than its competitors. Performance benchmarks show that DeepSeek-R1 competes with leading models from OpenAI and Meta, achieving 97.3% on MATH-500 and placing in the 96.3rd percentile in Codeforces programming contests, while a distilled variant with just 14 billion parameters outperforms larger models such as LLaMA 3. Furthermore, DeepSeek's open-source nature sets it apart from OpenAI's closed-source models and makes it a cost-effective alternative: $2.19 per million output tokens, compared to $60 per million for OpenAI's o1.
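To make the pure-RL idea concrete, below is a minimal, hypothetical sketch of the kind of rule-based reward such training can optimize: completions are scored automatically on answer correctness and on keeping the chain of thought inside a designated tag, with no human-written demonstrations in the loop. The function name, tags, and weights are illustrative assumptions, not DeepSeek's actual code.

```python
# Hypothetical rule-based reward for RL training of a reasoning model.
# Two signals: a format reward (reasoning wrapped in <think> tags) and an
# accuracy reward (the final answer matches a reference). No human labels
# beyond the reference answer are needed.
import re

def reward(completion: str, reference_answer: str) -> float:
    score = 0.0
    # Format reward: chain of thought must appear inside <think>...</think>.
    if re.search(r"<think>.*?</think>", completion, flags=re.DOTALL):
        score += 0.5
    # Accuracy reward: the text after the reasoning must contain the answer.
    final = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL)
    if reference_answer.strip() in final:
        score += 1.0
    return score

# A completion that makes a mistake mid-reasoning, catches it, and still
# answers correctly earns full reward -- self-correction is never penalized.
print(reward("<think>2 + 2 = 5... wait, that is wrong; 2 + 2 = 4.</think> 4", "4"))  # 1.5
```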

We also covered the model's development costs and infrastructure, highlighting its affordable training: roughly $6 million and 2.9 million GPU hours, a fraction of what OpenAI's top models demand. The model runs efficiently on high-end GPUs such as the NVIDIA A100, with both full-precision and quantized versions available for different VRAM budgets. Additionally, the discussion explored how DeepSeek can be run locally with Ollama via its Python client, making it more accessible to developers (a short example follows below). Toward the end, we touched on recent news, including DeepSeek's vulnerability to jailbreaking attempts and Nvidia's involvement in deploying the model, indicating its growing impact on the AI landscape.
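For reference, here is a minimal sketch of running a distilled DeepSeek-R1 model locally through Ollama's official Python client (`pip install ollama`). The model tag `deepseek-r1:14b` and the prompt are illustrative assumptions; the model must be downloaded first with `ollama pull deepseek-r1:14b`.

```python
# Minimal local inference sketch using Ollama's Python client.
# Assumes the Ollama server is running and the deepseek-r1:14b model
# (a distilled 14B variant) has already been pulled.
import ollama

response = ollama.chat(
    model="deepseek-r1:14b",  # illustrative model tag, not from the talk
    messages=[
        {"role": "user", "content": "Prove that the sum of two even numbers is even."},
    ],
)

# R1-style models emit their chain of thought before the final answer.
print(response["message"]["content"])
```

Quantized variants of the same model trade some precision for a much smaller VRAM footprint, which is what makes running it on a single consumer or data-center GPU practical.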

Slides link: https://www.canva.com/design/DAGb3wjPLgk/uVzAXYgsx5-MXZk1btWDOA/view?utm_content=DAGb3wjPLgk&utm_campaign=designshare&utm_medium=link2&utm_source=uniquelinks&utlId=h9053f5328e