Last week, lively discussions about DeepSeek R1, a new reasoning model, broke out across the Internet. To name one example, prominent investor Marc Andreessen called it "one of the most amazing breakthroughs". According to CNBC, citing Alexander Wang, DeepSeek's R1 model isn't just keeping up with industry leaders; in some cases, it's outperforming major models like Llama, Gemini and OpenAI's latest offerings. These reactions are just the tip of the iceberg.
To better understand the implications of this language model and its potential impact on AI development, we talked with three ELEKS experts: Sergii Bataiev, Director of Architecture and Technology; Volodymyr Getmanskyi, Head of Data Science Office; and Mykhailo Skurativskyi, Research and Development Engineer.
First of all, everyone should understand that DeepSeek’s sudden success didn’t happen overnight. Over the last two years, they have built and released many models, for example, the DeepSeek-V3 model. Those releases simply did not gain the same media attention until now.
— Volodymyr Getmanskyi
When faced with U.S. restrictions on advanced chips, this Chinese AI company found a way to do more with less. Limited resources and older technologies pushed them to look for other ways to create an AI model that could keep up with ChatGPT.
— Sergii Bataiev
DeepSeek's R1 signifies a shift in AI development. Its success demonstrates that cutting-edge AI innovation is no longer confined to Silicon Valley. For tech leaders, this means increased global competition and the need for agile strategies. Open-source models like R1 offer a potential advantage for businesses that can rapidly leverage them.
— Mykhailo Skurativskyi
There's no huge difference between R1 and other AI models: it offers better token vocabulary optimisation, additional reinforcement learning (RL), and more optimised infrastructure, while the model's reasoning approach remains the same as OpenAI's. The multiple tuning iterations with RL and supervised fine-tuning (SFT), at least two, for R1-Zero are also worth mentioning.
In addition, OpenAI has raised concerns about DeepSeek using distillation of its models to develop competing products. An investigation is currently underway to determine whether DeepSeek accessed OpenAI's data through a distillation process. Moreover, last week Wiz Research reported that it had found a vulnerability in the software. DeepSeek left a critical database exposed, giving anyone who found it access to over one million records, including user data, system logs, API keys, and even prompt submissions.
— Volodymyr Getmanskyi
DeepSeek R1 stands out by breaking tasks down into logical steps, which makes its output easier to interpret and validate. It leverages reinforcement learning for robust reasoning with less reliance on labelled data. Finally, DeepSeek has used training methods that don't require top-tier GPUs, lowering the barrier to entry.
— Mykhailo Skurativskyi
Indeed, it has changed the approach to model training. They conducted a verification process via distillation of already existing LLMs. Based on this "improvement", they limited the involvement of human resources. What’s more interesting is that DeepSeek’s model is released under an open-source licence, which puts commercial competitors at a disadvantage. It is as if a company could install SAP without paying anything for it.
— Sergii Bataiev
Yes, R1 is changing AI development by demonstrating that superior performance can be achieved through resourceful, transparent, and collaborative open-source approaches, challenging the dominance of proprietary models and large budgets. It empowers businesses to control their AI strategy by offering greater flexibility with data, budget, and innovation.
— Mykhailo Skurativskyi
It's adding competitive pressure on big players while speeding up NAS (neural architecture search) and topology research and introducing a more efficient approach to training, along with illegal usage of other products, from the Llama to the GPT families. But all these benefits come with some challenges, such as biases: for example, R1 refuses to answer questions about the events on Tiananmen Square. And, of course, making it open-source leads to new security challenges.
— Volodymyr Getmanskyi
Releasing DeepSeek R1 as open-source has significant implications for enterprises. It provides control over data privacy by enabling on-premise or private cloud deployment and customisation of security protocols. It lowers the cost of innovation by illegal usage of other products, freeing up resources for development and talent.
— Mykhailo Skurativskyi