DeepSeek, a Chinese AI startup, has drawn global attention in recent weeks with the launch of its AI models DeepSeek-V3 and DeepSeek-R1, the latter a reasoning model. These models are seen as competitors to OpenAI's advanced o1 and o3 models, a feat DeepSeek has achieved at a fraction of the cost.
| Feature | DeepSeek | Global LLMs (e.g., GPT-4 by OpenAI) |
| --- | --- | --- |
| Training Cost | $6 million | ~$100 million |
| Training Hardware | NVIDIA H800 chips (older generation) | Advanced GPUs (likely newer models) |
| Efficiency & Cost | 20 to 50 times more affordable than OpenAI's o1 model | Higher cost, lower efficiency for comparable tasks |
| Model Performance | Comparable to o1 (OpenAI) in many metrics | o1 and o3 models available with varying capabilities |
| Advancement Level | Not as advanced as OpenAI's o3, but on par with o1 in many areas | High-performing, with cutting-edge tech and large datasets |
| Training Approach | Utilizes reinforcement learning for self-improvement and feedback loops | Primarily supervised learning with large labeled datasets |
| Model Scalability | Scalable with smaller, faster models (SLMs) | Less resource-efficient, though scalable through advanced infrastructure |
| Subscription Cost | $0.50 per month | $20 per month (ChatGPT) |
This comparison highlights DeepSeek’s affordability, efficient use of hardware, and innovative self-improvement methods, while showing the higher costs and more traditional training approaches of global LLMs like GPT-4.
Global Impact:
DeepSeek’s AI models have been enhanced with a Mixture-of-Experts (MoE) architecture, Multi-Head Latent Attention, and advanced machine-learning techniques such as reinforcement learning and distillation; a brief sketch of the MoE idea appears below.
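To make the Mixture-of-Experts idea concrete, here is a minimal, illustrative sketch in PyTorch of a gated MoE layer: a small router scores the experts for each token, and only the top-k experts are run and their outputs combined. All names and sizes here (TinyMoE, dim=64, 8 experts, top_k=2) are assumptions for the sketch, not DeepSeek's published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Illustrative Mixture-of-Experts layer (not DeepSeek's actual design):
    a gating network routes each token to its top-k experts, and their
    outputs are combined weighted by the (normalized) gate scores."""
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(n_experts)]
        )
        self.gate = nn.Linear(dim, n_experts)  # router: one score per expert, per token
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, dim)
        scores = self.gate(x)                   # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e           # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```

The key property this sketch shows is why MoE aids efficiency: total parameter count grows with the number of experts, but each token only pays the compute cost of its top-k experts.

The following open-source AI models have been developed by DeepSeek: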