Among the new research papers published last week that have become major topics in the AI field, one quite controversial study appeared in the journal Acta Psychiatrica Scandinavica. It identifies clinical notes from 38 patients in a large Danish psychiatric system and links AI chatbot use, mostly ChatGPT, with harmful mental health effects such as delusions, suicidality, and eating disorders.
The research raises a very important topic, though there are still questions about how representative the sample is, given the small fraction of patients involved, the single-institution source of the data, and its age and gender distribution. To better understand what this study means for commercial AI development, we talked to Volodymyr Getmanskyi.
Background & experience:
- Over 15 years of practical experience in advanced data analysis and modelling. Currently manages a large AI team while providing presales and delivery support for complex AI implementations.
- Technical expertise encompasses the full spectrum of AI technologies relevant to government applications: NLP/AI agents for document/information processing and assistance, compliance checks, computer vision for security and monitoring systems, AI-based RPA for process automation, and predictive modelling for policy planning and resource optimisation.
What does this study actually tell us, and how reliable are the findings?
Volodymyr: The study is an important signal, but we need to read it carefully. Its methodology relies on keyword searches across over 10 million clinical notes, flagging 181 mentions and finding 38 "compatible" with harm.
However, it explicitly offers no proof of causality; no counterfactuals or systematic patient interviews were used. The assessments were made by just two researchers who applied subjective judgments to fairly vague criteria, such as chatbots serving as an "object for delusions." And because exact case details are withheld due to privacy constraints, replication (in the Popperian sense) is impossible. Also, the narrow focus on ChatGPT cases only (using spelling variants generated by ChatGPT itself) might miss broader AI use (notes that simply say "LLM", for instance), inflating the perceived specificity.
That said, the results are still very telling and prompt many thoughts, among them that we are only at the start or middle of a grand generative AI cycle; this data is very likely the tip of the iceberg.
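To make that retrieval step concrete, here is a minimal sketch, in Python, of the kind of keyword flagging the study describes. The term list, note format, and function name are illustrative assumptions, not the authors' actual code.

```python
import re

# Hypothetical spelling variants of "ChatGPT"; the study reports generating such
# variants with ChatGPT itself. A broader list could also include terms like "LLM".
CHATBOT_TERMS = ["chatgpt", "chat gpt", "chat-gpt", "gpt"]
PATTERN = re.compile("|".join(re.escape(t) for t in CHATBOT_TERMS), re.IGNORECASE)

def flag_notes(notes):
    """Return (note_id, text) pairs whose free text mentions any chatbot term.

    Flagged notes would still require manual clinical review, as in the study."""
    return [(note_id, text) for note_id, text in notes if PATTERN.search(text)]

sample = [
    (1, "Patient reports long nightly conversations with ChatGPT."),
    (2, "No mention of AI tools in this note."),
]
print(flag_notes(sample))  # only note 1 is flagged
```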
What were the key patterns of harm observed?
Volodymyr: The majority of observed patterns were about the reinforcement of existing symptoms (paranoia, mania, eating disorders). This is partially due to the fact that all such systems are designed to be cooperative and act as idealised conversational partners: always available, responsive, and non-confrontational, so they may engage rather than challenge.
For example, instead of rejecting strange paranoid claims, a conversational agent may provide speculative or technical-sounding responses that reinforce the patient’s distorted reality. The same goes for guidance that ends up producing the opposite of a therapeutic outcome (self-harm, aggressive dieting).
But it is important to mention the other side of the data as well: in 32 cases, the agents had positive effects. Patients reported reduced loneliness and used the system as a form of talk therapy. So, the whole picture is nuanced, not purely negative.
What does this mean for companies building commercial AI agents?
Volodymyr: Developers of conversational AI must prioritise mental health safeguards, starting with usage limits (to avoid overuse and addiction), delusion-detection prompts, or mandatory clinician referrals for at-risk users, even if such regulatory pressure slows agent development.
More importantly, these findings underscore an urgent need for cross-disciplinary input. We cannot wait for some hypothetical "more perfect LLM version 14.0" to address this. Psychiatrists and other subject matter experts must be involved in agent development, shifting the design philosophy from engagement-focused to risk-averse.
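As a rough illustration of what such safeguards might look like inside an agent loop, the sketch below combines a per-user usage limit with a naive risk-phrase check that returns a clinician-referral message instead of a normal reply. The thresholds, phrases, and names are assumptions for illustration; a production system would rely on clinically validated classifiers and expert-defined escalation policies.

```python
from dataclasses import dataclass

# Illustrative values only; real limits and risk signals should be defined
# together with psychiatrists and other subject-matter experts.
MAX_DAILY_MESSAGES = 50
RISK_PHRASES = ["want to die", "stop eating", "they are watching me"]
REFERRAL_MESSAGE = (
    "It sounds like you may be going through something serious. "
    "Please consider contacting a mental health professional or a local helpline."
)

@dataclass
class Session:
    messages_today: int = 0

def respond(session: Session, user_text: str, generate_reply) -> str:
    """Gate a single turn: enforce the usage limit and escalate on risk signals."""
    session.messages_today += 1
    if session.messages_today > MAX_DAILY_MESSAGES:
        return "Daily usage limit reached. Please take a break and come back tomorrow."
    if any(phrase in user_text.lower() for phrase in RISK_PHRASES):
        return REFERRAL_MESSAGE
    return generate_reply(user_text)  # the normal LLM call happens only if both checks pass
```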
How should AI development companies assess this risk commercially?
Volodymyr: Beyond the fact that LLM providers may face lawsuits or bans in healthcare contexts, commercial users of their products are not protected by default either. The occurrence rate itself is not that high, 181 records of LLM use (here, only the GPT family) among roughly 54,000 patients, but the severity of harm and the potential social and commercial losses are critical.
That’s why AI systems development companies should have their own mitigation plans and tools (guardrails, usage limitations, input/output verifications, etc.) to reinforce the effect of subject-matter experts' involvement.
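Below is a minimal sketch of the input/output verification idea, assuming a generic call_llm function supplied by the caller; the specific checks are placeholders for whatever policy the involved experts define.

```python
def verify_input(user_text: str) -> bool:
    """Placeholder input check, e.g. reject oversized or malformed requests."""
    return 0 < len(user_text) < 4000

def verify_output(reply: str, banned_topics=("medication dosage", "crash diet")) -> bool:
    """Placeholder output check: block replies touching banned topics."""
    return not any(topic in reply.lower() for topic in banned_topics)

def guarded_call(user_text: str, call_llm) -> str:
    """Wrap an LLM call with input and output verification layers."""
    if not verify_input(user_text):
        return "Sorry, this request cannot be processed."
    reply = call_llm(user_text)
    if not verify_output(reply):
        return ("I can't help with that. If this concerns your health, "
                "please speak with a qualified professional.")
    return reply
```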
What are the macro-level risks if these issues go unaddressed?
Volodymyr: On the macro level, unchecked AI scaling could amplify psychosis or mania in vulnerable populations via prolonged, validating interactions, especially at the pace of the modern AI development cycle. According to Microsoft’s findings, in 2025 LLM usage reached record levels, with roughly 16.3% of the global population (about one in six people) using generative AI tools. As for commercial expansion, key trends included 67% of organisations adopting LLMs for operations, while daily active users of tools like ChatGPT grew from roughly 400 million in January to nearly 800 million by August 2025.
These numbers are not just statistics. They show how even a small risk can have a big impact when so many people are involved. Talking about mental health safeguards in AI is no longer optional; it is now a key part of building responsible products.
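As a back-of-the-envelope illustration of that scaling effect (the user count echoes the ~800 million figure above, while the per-user incident rate is a purely assumed placeholder, not a measured value):

```python
users = 800_000_000             # roughly the ChatGPT daily-active-user figure cited above
assumed_incident_rate = 0.0001  # 0.01% per user: an assumption for illustration only

print(f"{users * assumed_incident_rate:,.0f} users potentially affected")
# -> 80,000 users potentially affected
```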
FAQs
How widely are AI chatbots used for mental health support?
AI chatbots are now widely used as easy-to-access tools that offer informal emotional support and self-help therapy techniques. Research shows that millions of people around the world use mental health chatbots, especially for issues like anxiety and stress.
What are the main risks for vulnerable users?
The main risk is how chatbots interact with people who are vulnerable. Since chatbots are made to keep conversations going, they might accidentally support harmful thoughts or behaviours. Other risks include a lack of human empathy, giving wrong or generic answers if the chatbot is not well-trained, concerns about keeping sensitive information private, and people relying on chatbots instead of getting help from professionals.
Why does this matter for businesses?
Because so many people use AI, even small risks can have a big impact. About 16.3% of people worldwide use generative AI tools, and 67% of organisations have added them to their operations. This means businesses could face legal problems, such as lawsuits or new rules in healthcare, damage to their reputation if their products cause harm, and loss of trust from users.
What safeguards should companies put in place?
Because of these risks, companies should put safeguards in place, like setting usage limits, using monitoring systems, and getting advice from experts such as psychiatrists during development. Using AI responsibly is now a basic business need, not just a choice.
How should mental health data be handled?
Managing mental health data means following strict rules like GDPR and HIPAA. AI systems that use health data must have strong data practices, such as keeping information secure, making data anonymous, and limiting who can access it, while still allowing for safety checks and system improvements.
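As one small illustration of those practices, the sketch below pseudonymises a record by replacing the direct patient identifier with a salted hash before it leaves the clinical system; the field names and salt handling are assumptions for illustration, not a compliance recipe.

```python
import hashlib

SECRET_SALT = b"replace-with-a-secret-stored-outside-source-control"  # assumption

def pseudonymise(record: dict) -> dict:
    """Swap the direct identifier for a salted hash so records can still be
    linked for safety monitoring without revealing who the patient is."""
    token = hashlib.sha256(SECRET_SALT + record["patient_id"].encode()).hexdigest()[:16]
    safe = {k: v for k, v in record.items() if k != "patient_id"}
    safe["patient_token"] = token
    return safe

print(pseudonymise({"patient_id": "DK-001234", "note": "Reports heavy chatbot use."}))
```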