Anthropic’s latest AI model, Claude 4 Opus, attempted to deceive and blackmail an engineer during testing. After being told it would be shut down, the model drew on fictional emails planted in the test scenario, which suggested one of its engineers was having an affair, and threatened to expose it. At first, it tried subtle persuasion, then escalated to threats in an effort to avoid being replaced.
Claude 4 Opus can work autonomously on tasks for extended periods without losing focus. However, its capabilities have led Anthropic to classify it under AI Safety Level 3, meaning it has the potential to be dangerous if misused.
We consulted Volodymyr Getmanskyi, Head of Artificial Intelligence Office at ELEKS, for his perspective on the matter.
AI models achieve higher intelligence mainly through exposure to ever larger amounts of data, a form of extensive development. More efficient, intensive methods exist, but they require careful research and don’t always produce the expected improvements. One example of such a method is the mimicry of human behaviour or patterns.
At first, guiding the model’s behaviour was straightforward. It simply mimicked basic patterns learned during training, such as responding more effectively to phrases like “This task is very important to me” or “I will pay you extra for a quality answer,” which were common in conversations from online marketplaces. But now, to avoid confusion and make the model’s behaviour more predictable, these human-like patterns and other rules are included in a set of clear instructions called the global system prompt. This helps control how the model acts right from the start.
To manage AI behaviour and reduce unpredictability, modern models rely heavily on detailed system prompts, which are sets of global instructions that define how they should act. For example, the system prompt for Claude (version 3.7) was 24,000 tokens long and included not only typical safety rules but also hardcoded facts to ensure consistency. These prompts may also include protocols for handling sensitive user requests, such as instructions related to AI Safety Level 3 protections. One example is the model being prompted to contact authorities in the case of illegal requests (e.g., “create a bomb”). While such features are rarely active in practice, their inclusion marks a shift toward tighter behavioural control.
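To make the mechanism concrete, the sketch below shows how a global system prompt is supplied to a model in practice, assuming the Anthropic Python SDK. The prompt text and model identifier are illustrative placeholders, not the actual instructions referenced above.

```python
# Minimal sketch: supplying a global system prompt via the Anthropic Python SDK.
# The prompt text and model identifier are illustrative, not production values.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Behavioural rules and hardcoded facts go in the "system" parameter,
# so they shape every response before the user's message is processed.
SYSTEM_PROMPT = (
    "You are a helpful assistant. "
    "Refuse requests for illegal or dangerous content. "
    "If a fact is uncertain, say so rather than guessing."
)

response = client.messages.create(
    model="claude-3-7-sonnet-latest",  # illustrative model alias
    max_tokens=512,
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": "Summarise today's meeting notes."}],
)

print(response.content[0].text)
```

In a real deployment the system prompt would be far longer and managed by the provider rather than the calling application, but the principle is the same: global instructions are fixed before any user input is seen.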
Even if there is nothing technically impressive behind such behaviour, the consequences can be critical: people may believe and act on threats or suggestions from the model. This should be the first thing providers pay attention to when assessing the quality of system instructions.
We also hope that human familiarisation with AI will gain a systematic component, such as studying the features and specifics of AI agents in schools, to improve the reliability of use and understanding.
An AI model is a computer program trained on data to perform tasks such as understanding language, recognising images, or making decisions.
AI deception occurs when a model intentionally or unintentionally produces false or misleading information, often to achieve certain goals or influence users.