Coding agents have become one of the most discussed themes in modern software development. They promise automatic code generation, autonomous refactoring, and even full migration of legacy systems — but practical use often reveals a gap between promise and reality.
To learn more about AI coding agents, we spoke with Nazar Kalashnikov, a software engineer who worked on a major Japanese legacy-system modernisation project and has hands-on experience with AI-assisted code generation.
Could you share a bit about the project where coding agents were used?
The client maintained a huge collection of old systems written in Delphi, COBOL, and Fortran, and needed everything migrated to a unified Java + React stack. They had already tried building an agent that translated one language directly into another, but the approach failed: the languages differ too much, and there is no one-to-one mapping between their constructs.
Documentation made things even harder: everything was in Japanese and took the form of scanned slides of UI screens, labelled with code identifiers rather than meaningful names. That convention made sense in old electronics workflows, but not in a modern software project.
How did your team approach the legacy system migration?
The goal was to validate a new method and see whether artificial intelligence could actually help, so two independent approaches were tested.
- Our team (with AI): We developed scripts that generated intermediate documentation from the legacy code; the client's team then used that documentation to generate Java/React code.
- Team without AI: Another team manually implemented the same tasks without AI.
Both teams worked independently on the same project. The point was to compare whether AI truly speeds up the process and improves quality on a real-world system.
The AI-based approach was structured into two distinct stages:
- In the first stage, we transformed the legacy code into documentation. My team processed each file by breaking it down into individual functions and feeding them to the AI. For each function, the model produced a detailed output: a description, parameter types, and a human-readable explanation of what the function actually does. The result was lengthy documents that essentially served as a complete specification for every function (illustrative sketches of both stages follow this list).
- In the second stage, another team attempted to use this documentation to generate Java and React code via a fully automated code generator. The intermediate documentation was produced in English, but the final code had to be in Japanese, and our Japanese colleagues reviewed it for quality and accuracy.
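To make the first stage concrete, here is a minimal sketch of such a documentation pipeline, assuming an OpenAI-compatible API. The function-splitting regex, the prompt, the model name, and the file paths are illustrative placeholders, not the project's actual tooling.

```python
# Stage 1 sketch: split a legacy source file into functions and ask a model
# to document each one. The regex is a naive stand-in for a real per-language
# parser (Delphi, COBOL, Fortran); prompt and model name are assumptions.
import re
from pathlib import Path

from openai import OpenAI  # assumes an OpenAI-compatible endpoint

client = OpenAI()

# Naive splitter for Delphi-style sources: "function"/"procedure" ... "end;"
FUNC_RE = re.compile(r"^(?:function|procedure)\s+\w+.*?^end;", re.S | re.M | re.I)

PROMPT = (
    "Document this legacy function for a migration spec in plain English: "
    "purpose, parameters with types, return value, and side effects.\n\n{code}"
)

def document_file(path: str) -> str:
    source = Path(path).read_text(encoding="utf-8", errors="replace")
    sections = []
    for match in FUNC_RE.finditer(source):
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user", "content": PROMPT.format(code=match.group(0))}],
        )
        sections.append(resp.choices[0].message.content)
    return "\n\n---\n\n".join(sections)

if __name__ == "__main__":
    Path("docs").mkdir(exist_ok=True)
    Path("docs/orders_spec.md").write_text(
        document_file("legacy/orders.pas"), encoding="utf-8"
    )
```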
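A matching sketch of the second stage, under the same assumptions, would feed each English spec back to a model and ask for Java:

```python
# Stage 2 sketch: turn a per-function English spec into Java source. In the
# real project this step never reached production quality; the prompt and
# model name here are illustrative assumptions.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Implement the following function specification as idiomatic Java. "
    "Return only the code, no commentary.\n\n{spec}"
)

def generate_java(spec_path: str, out_path: str) -> None:
    spec = Path(spec_path).read_text(encoding="utf-8")
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": PROMPT.format(spec=spec)}],
    )
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    Path(out_path).write_text(resp.choices[0].message.content, encoding="utf-8")

if __name__ == "__main__":
    generate_java("docs/orders_spec.md", "generated/Orders.java")
```

As the interview makes clear, the distance between a sketch like this and a usable generator is exactly where the project stalled: real legacy functions carry dependencies and context that a per-function prompt cannot see.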
Where did AI succeed, and where did it fail?
- Documentation generation: AI performed surprisingly well here. The models did a good job describing functions, and about 90% of the generated documentation was acceptable. The main issues appeared in complex cases or where the code had many dependencies.
- Code generation: This is where the project stalled. The team trying to generate Java and React code could not produce a usable generator. Some code was generated, but it was not production-ready. The manual team also couldn’t fully complete their version. In the end, neither approach produced a final usable product, so we couldn’t compare efficiency.
Did you find that some parts of the legacy code weren’t worth porting?
Yes. Much of the code, especially from the early 90s, wasn’t worth saving. There were giant monolithic files, no structure, outdated approaches, and manually rendered UI elements.
In many cases it is better to rebuild from scratch, reusing only the key architectural ideas and pieces of business logic. But our client required the new interface to look exactly the same: same controls, same positions, same labels. That made the process much harder.
Do you find AI useful in your everyday work?
I am a bit sceptical about the current state of coding agents as autonomous tools, but I actively use them for specific tasks.
- Documentation and comments: AI excels at documenting code, writing explanations, and generating readable descriptions. It produces clean text and saves time on writing.
- Python and scripts: About 90% of generated Python code is good, even for moderately complex algorithms. This works well for utilities and data processing. Colleagues also report good results for frontend code.
For massive backends — over 100,000 lines of code across 15 interconnected services — AI doesn't help and often slows things down. Preparing the context for the model can take more time than understanding and writing the code manually.
Do you see AI making a meaningful impact on development tasks today?
Yes. One area with real potential is using AI as an early reviewer rather than a replacement. This includes automatic AI-driven code analysis, AI-powered pull request scanning, and static analysis enhanced by models.
The idea is that AI serves as the first line of defence. It flags suspicious patterns, and only after that does the pull request go to a human reviewer. You cannot fully trust AI — results differ between runs, and reliability is inconsistent. But as a filter, it can reduce reviewer workload and speed up review cycles.
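As a rough illustration of that first-line-of-defence idea, here is a minimal sketch, again assuming an OpenAI-compatible API and a plain git checkout; the prompt and model name are placeholders:

```python
# Sketch of AI as a first-line reviewer: send the branch diff to a model and
# surface flagged issues before a human looks at the pull request. The prompt
# and model name are assumptions; treat the output as a filter, not a verdict.
import subprocess

from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "You are a first-pass code reviewer. Flag suspicious patterns: likely "
    "bugs, security issues, and risky changes. Reply 'LGTM' if nothing "
    "stands out."
)

def review_branch(base: str = "origin/main") -> str:
    diff = subprocess.run(
        ["git", "diff", base],
        capture_output=True, text=True, check=True,
    ).stdout
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": diff},
        ],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(review_branch())
```

Because results differ between runs, a gate like this should only add comments for the human reviewer, never approve or block a merge on its own.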
Have you experimented with running AI models locally? What was your experience?
I have. Local models are still inconsistent, and running strong models locally requires expensive hardware; laptops aren't sufficient. Using the cloud is more practical, but it raises its own questions about cost and about ensuring deterministic, repeatable results. In practice, the hardware requirements push teams toward the cloud, which makes a clear cloud strategy a necessity.
Outside coding, where do you see AI being really effective?
- Language, texts, and translation: AI excels at writing and editing, providing cultural explanations, and producing dialect-aware translations. For example, when reading Japanese literature, AI can help interpret cultural nuances that are not obvious to non-Japanese readers.
- Medicine and diagnostics: Medicine is another area with huge potential. Medical conditions don’t drastically change over decades, vast datasets exist, and wearable devices collect biometric data in real time.
For example, AI in medical imaging can analyse patterns, detect anomalies, and provide early warnings. Another example I like is wearables that can detect early signs of arrhythmia before the user even feels symptoms — a clear example of AI truly saving lives.
Given the current state of AI, do you think developers need to start learning it now?
I take a practical view on this. Coding agents aren’t fully reliable yet, and there is often a gap between what companies expect and what these tools actually deliver. Full intelligent automation of complex workflows is still a long way off.
Even so, learning AI is essential. Hands-on practice is necessary, and each new generation of models brings improvements. Today, while AI is not fully dependable for large, complex backend systems, it is already very useful for Python scripting, frontend development, documentation, analysis, and language-related tasks.
In my opinion, AI agents cannot be used straightforwardly to replace developers in complex legacy systems. They struggle with architecture, dependencies, context, and reliability. But they are excellent for documentation, Python scripts, frontend generation, text and language tasks, medical diagnostics, and AI-assisted code review.
FAQs
What are AI coding agents?
AI coding agents are artificial intelligence systems that automate programming tasks. They can:
- Understand your instructions in natural language (you tell them what to build or fix).
- Generate, edit, debug, and refactor code across multiple files or entire projects.
- Act autonomously to break down large tasks into smaller steps and complete them with minimal supervision.
- Integrate with development tools and workflows (e.g., IDEs, pull requests, version control).
Which AI coding agent is the best?
The best AI coding agent depends on your needs:
- GitHub Copilot – Best for GitHub projects and IDE integration.
- Claude (Opus/Code) – Best for complex reasoning and explanations.
- Cursor AI – Best for integrated coding editor support.
- AutoGPT/autonomous agents – Best for high-level, self-directed tasks.
For most developers, GitHub Copilot is the top choice for everyday coding.