Chinese startup DeepSeek’s launch of its latest AI models, which it says are on a par with or better than industry-leading models in the U.S. at a fraction of the cost, is threatening to upset the technology world order.
The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips.
DeepSeek’s AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple’s App Store in the U.S.
This has raised doubts about the reasoning behind some U.S. tech companies’ decision to pledge billions of dollars in AI investment, and shares of several big tech companies, including Nvidia, have been hit.
Below are some facts about the company shaking up the AI sector worldwide.
The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, which rushed to create their own chatbots powered by artificial intelligence.
But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu (9888.HK), there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms.
The quality and cost efficiency of DeepSeek’s models have turned this narrative on its head.
The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta’s most advanced models, the Chinese startup has said.
They are also cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, according to a post on DeepSeek’s official WeChat account.
But some have publicly expressed scepticism about DeepSeek’s success story.
Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation.
Bernstein analysts on Monday highlighted in a research note that DeepSeek’s total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.
DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.