With these rapid changes in delivery practices, it is important to examine where estimation methods are falling short. This article examines why traditional estimation fails in AI-native delivery and proposes a model that matches actual work patterns.
- AI-native delivery has changed how we build software, but estimation methods have not kept up. Most estimates still rely on spreadsheet-based thinking.
- Don’t estimate features in a straight line. AI work is front-loaded: it takes more effort at the beginning and becomes much faster over time. A simple per-feature model hides this reality.
- Early setup matters most. Investing in the AI working environment (context, rules, templates, and instructions) has the biggest impact. Without it, every feature takes significantly more effort.
- Effort and delivery time are not the same. You need to estimate both engineering effort and real calendar time, which is often limited by client feedback cycles rather than development speed.
- Real-world example: traditional methods estimated 5,900 hours. AI-native delivery shipped the same scope in under 1,400 hours, as each feature sped up the next.
Why traditional estimation breaks in AI-native delivery
The software development industry has reimagined roles and redefined what "done" means. But when it's time to answer, "How long will this take?" — the same spreadsheets appear. Decompose into components. Get specialist opinions. Add a risk buffer. Sum it all up.
This estimation approach assumes that each feature takes a predictable amount of effort, that components are mostly independent, and that risk is evenly spread. AI-native delivery breaks all three assumptions, and estimates will keep diverging from real project outcomes until the method changes.
How AI rewrites the rules of software delivery
1. Effort is no longer tied to feature size
Story points work when similar features require similar effort. That's the entire premise of calibration. In AI-native delivery, the same feature can cost 8 hours or 30 minutes — depending on when you do it, who does it, and what tools they're working with.
A senior engineer who has mastered AI partnership will deliver in a fraction of the time a traditionally skilled engineer needs. A more capable model with better context handling changes the equation again. And even the same person with the same model gets dramatically faster as the project matures, because accumulated context makes every subsequent task cheaper.
The relationship between work and effort isn't stable anymore. It's a curve formed by at least three variables.
2. Sequence now changes the total cost
In traditional development, features are treated as independent. Feature A takes 40 hours, Feature B takes 20 hours, and the total is simply 60 hours—order doesn’t matter.
In AI-native delivery, order matters a lot. If the user management module is built first, it helps the AI learn the system’s patterns and conventions. That knowledge then speeds up all later work.
As a result, the same type of module can take two days the first time, but only a couple of hours once the system is established.
3. Risk is no longer evenly distributed
The classic “+20% buffer” assumes that risks are evenly distributed across the project.
AI-native delivery has a different risk pattern. When AI makes an error, fixing it can take anywhere from minutes to several hours. This uncertainty is highest early in the project, when work is still being shaped, and context is limited. As the project progresses and context builds up, mistakes become easier and faster to fix. A flat risk percentage doesn’t capture this changing reality.
How to estimate AI-native delivery: a model that fits reality
The previous article introduced AI-Native SDLC — three integrated modes: Intent, Build, and Operate. That structure doesn't just describe how to deliver. It gives us a natural estimation model. Not a formula, but a way of thinking about how AI-native delivery behaves.
If you're coming to this fresh, here's a quick orientation.
- Intent mode is where problems get defined — through conversation, demonstration, and rapid prototyping rather than lengthy documentation.
- Build mode is where the solution takes shape, iteratively and rapidly, with AI as a collaborative partner. Each completed feature makes the next one cheaper and faster.
- Operate mode is where the solution lives. Real usage reveals what works, what doesn't, and what comes next.
Each mode behaves differently. Each needs to be estimated differently.
Step 1: Estimate Intent separately
Intent is where problems become clear through dialogue, demonstration, and rapid prototyping. Here’s the truth: this mode doesn’t compress by 3-4x. It barely compresses at all.
Client conversations take as long as they take. Stakeholder alignment doesn’t speed up because your engineers are faster. Decision-making has its characteristic rhythm. AI helps — you can prototype in hours instead of weeks, which means faster validation of understanding. But don’t promise miracles here. Expect modest compression – driven primarily by prototypes replacing specification documents. The human-paced activities don’t compress at all. In our experience, Intent is where teams most often overestimate the AI impact.
Step 2: Estimate Build as three separate investments
Stop estimating Build as a “sum of features.” Treat it as three distinct investments.
The first investment is context. Before any feature work starts, the AI working environment must be set up: the project knowledge base, coding standards, architectural patterns, cursor rules, boilerplate templates, and AI agent instructions. This is a new category of work in AI-native delivery.
It scales sublinearly — a 50-feature project does not require 5× the effort of a 10-feature project. Most of the work sits in shared foundations (architecture, standards, instructions), while additional modules add only incremental cost. It is also predictable after a few projects, typically taking days rather than weeks. But if skipped, every feature becomes more expensive because the AI repeatedly has to rebuild missing context.
The second investment is the feature trajectory. Do not estimate features individually; estimate the trajectory. Early features are expensive because the AI is still learning the system and patterns have not yet been established. As context accumulates, each new feature becomes cheaper, often shifting from hours to minutes.
The curve depends on team maturity, tooling, and the quality of Context Investment. A practical approach is to estimate the first few features near traditional effort, then compress the rest and calibrate using real data from the first project.
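As an illustrative sketch of that calibration step (the feature times and the geometric-decay model are assumptions for illustration, not figures from a real project), you might estimate a per-feature compression ratio from the first few delivered features and project the rest:

```python
from statistics import geometric_mean

def project_feature_hours(observed, remaining):
    """Estimate a per-feature compression ratio from observed feature
    times and project the hours for the remaining features.

    observed  -- hours for the first few delivered features (assumed data)
    remaining -- how many features are left to estimate
    """
    # Ratio between consecutive features, averaged geometrically
    ratios = [later / earlier for earlier, later in zip(observed, observed[1:])]
    r = geometric_mean(ratios)
    last = observed[-1]
    return [last * r ** (k + 1) for k in range(remaining)]

# Hypothetical data: the first four features took 14, 10, 8, and 6 hours
projection = project_feature_hours([14, 10, 8, 6], remaining=6)
```

The point of the sketch is the shape, not the numbers: the first project supplies the observed values, and the projection is recalibrated as more features land.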
This is not theoretical. In a recent production project, a traditional estimate of ~5,900 hours was delivered in under 1,400 hours with the same scope and quality. Compression varied (some modules ~10× faster, others ~2×), but the downward curve was consistent. Context Investment paid for itself within the first sprint through early savings.
For a simplified example: 10 features at 16 hours each equals 160 hours traditionally. In an AI-native setup, 16 hours go into context first, then features drop from 12 → 6 → 3 → 1–2 → <1 hour. Total ends up around 70 hours instead of 160.
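That arithmetic can be sketched as a tiny model, assuming a geometric decay in per-feature cost (the 0.8 decay factor and 12-hour first feature are illustrative assumptions chosen to roughly reproduce the numbers above):

```python
TRADITIONAL_PER_FEATURE = 16   # hours, flat per-feature estimate
N_FEATURES = 10
CONTEXT_INVESTMENT = 16        # hours spent on context before feature work

# Assumed trajectory: first feature costs 12h, each later one ~20% less
first_feature = 12
decay = 0.8

traditional_total = TRADITIONAL_PER_FEATURE * N_FEATURES
feature_hours = [first_feature * decay ** k for k in range(N_FEATURES)]
ai_native_total = CONTEXT_INVESTMENT + sum(feature_hours)

print(f"traditional: {traditional_total}h, AI-native: {ai_native_total:.0f}h")
```

With these assumed parameters the total comes out near 70 hours against 160 traditionally, which is the pattern the example describes: the context investment is paid once, and the per-feature cost falls from there.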
The third investment is correction overhead. Do not use a flat percentage; instead, model two trajectories. In a well-prepared setup (strong context and a mature team), acceleration appears early and correction overhead stays around 10–15%. In weaker setups (new team, weak context, unclear domain), early work tends to run close to traditional estimates, and correction can effectively double early effort.
Context Investment determines which trajectory you land in — it is the key lever that shifts work toward optimistic outcomes.
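The two trajectories can be sketched as follows; every parameter here is an illustrative assumption, not a calibrated figure:

```python
def build_total(first_feature_h, decay, correction_overhead, n_features):
    """Total Build effort for a geometric trajectory, with a correction
    overhead applied on top of the raw feature hours."""
    raw = sum(first_feature_h * decay ** k for k in range(n_features))
    return raw * (1 + correction_overhead)

# Well-prepared setup: acceleration appears early, overhead around 10-15%
strong = build_total(first_feature_h=10, decay=0.80,
                     correction_overhead=0.12, n_features=10)

# Weak setup: early features near traditional effort, slower decay,
# and a much heavier correction burden
weak = build_total(first_feature_h=16, decay=0.92,
                   correction_overhead=0.40, n_features=10)
```

The gap between the two totals is the point: the same feature list lands in very different places depending on which trajectory the project is on.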
Step 3: Estimate Operate with confidence
By the time you reach Operate, context is fully built, and your acceleration curve has plateaued at its fastest. This is where AI-native delivery becomes more predictable than traditional, because the cost of change is low and stable. Bug fixes, enhancements, and configuration changes — these benefit from everything you invested in Build.
Estimate Operate as a steady-state rate derived from the tail of your Build curve. If your last features in Build took 20% of the traditional effort, Operate tasks will fall into the same range.
The key insight: Operate estimation is the one place where your historical Build data directly transfers. Track it, and you’ll have your most reliable numbers within the first month.
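Deriving that steady-state rate can be sketched in a few lines (the 20% tail ratio and 8-hour task are hypothetical numbers, echoing the range mentioned above):

```python
def operate_estimate(traditional_task_hours, build_tail_ratio):
    """Scale a traditional task estimate by the ratio observed at the
    tail of Build (tail feature hours / traditional feature hours)."""
    return traditional_task_hours * build_tail_ratio

# Hypothetical: the last Build features ran at ~20% of traditional effort,
# so an enhancement estimated at 8h traditionally lands near 1.6h
estimate = operate_estimate(8, build_tail_ratio=0.2)
```

The ratio itself is the tracked quantity: once the Build tail is measured, Operate tasks are priced by multiplication rather than re-estimated from scratch.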
Where the AI-native estimation model breaks down
This framework has real limitations, and it is better to make them explicit upfront than discover them during a project.
First, it requires calibration data that may not be available initially. In a first AI-native project, the team’s acceleration curve is unknown. The appropriate approach is to estimate conservatively, track performance closely, and treat the first project as a calibration phase rather than as model validation.
Second, it assumes a team operating in an AI-native way. An engineer using AI mainly as autocomplete will not follow the same productivity curve as someone treating AI as a delivery partner. If a team is still in an AI-augmented stage, traditional estimation with cautious adjustments is more accurate than assuming full curve effects.
Third, the model is best suited for greenfield development. Legacy systems, complex integrations, and regulated environments behave differently: the acceleration curve is flatter, and context investment is higher. Greenfield assumptions should not be applied directly to brownfield projects.
Even when these factors are considered, there is still one additional gap the model does not address on its own.
The hidden factor most estimates miss
You can get the model above perfectly right — and still miss your timeline by months. Because everything above estimates effort. But effort isn’t delivery.
What sits between them is client engagement. In traditional delivery, team pace and client pace were roughly in sync. AI-native delivery breaks that sync. Your team prototypes in four hours. The client sees it at next week’s call. Three iterations ship in a week. The client needs a month to choose a direction.
This means: you need two numbers, not one.
- Number 1 → Effort. How many hours of work does this project require from your team? This is what the model above gives you.
- Number 2 → Calendar time. How long will delivery actually take, given the client’s decision cadence and feedback rhythm? This is what most estimates miss entirely.
In practice, present both numbers side by side: “Our team needs 400 hours of work. Given weekly feedback cycles, the calendar timeline is 8-10 weeks. With biweekly deep reviews, we can compress to 5-6 weeks.” Not as blame, but as shared reality.
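The two-number framing can be expressed as a small model in which calendar time is gated by whichever is slower, raw team capacity or the client's decision cadence. The team size, cadence, and decision counts below are assumptions chosen to mirror the example above:

```python
import math

def calendar_weeks(effort_hours, team_hours_per_week,
                   decision_points, weeks_per_decision):
    """Calendar time is bounded by the slower of two clocks: the team's
    raw capacity and the client's feedback rhythm."""
    capacity_weeks = math.ceil(effort_hours / team_hours_per_week)
    feedback_weeks = decision_points * weeks_per_decision
    return max(capacity_weeks, feedback_weeks)

# 400h of effort, two engineers at 40h/week, eight decisions on a weekly cycle
weekly = calendar_weeks(400, 80, decision_points=8, weeks_per_decision=1)

# Same effort, but decisions batched into three biweekly deep reviews
biweekly = calendar_weeks(400, 80, decision_points=3, weeks_per_decision=2)
```

Under these assumptions the weekly cadence yields an 8-week timeline and the biweekly reviews compress it to 6 weeks, even though the effort number never changes; that is exactly the gap the second number exists to expose.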
Rethinking estimation in the age of AI-native delivery
This isn’t just about more accurate numbers. It’s about honesty. Every estimate is a promise. And a promise built on a model that doesn’t match reality isn’t just inaccurate — it erodes trust before the first line of code is written. Clients who receive inflated estimates lose confidence. Clients who receive understated ones lose patience. Neither outcome is easily recoverable.
The model in this article won’t give you perfect numbers. No model will — AI-native delivery is still too young and too dependent on team maturity. But it will give you something more valuable: a structure that shows how the work actually behaves.
The next question becomes inevitable: if the unit of work has changed this much, should the unit of pricing change too? That’s where this series goes next, from estimating effort to selling outcomes.
If your estimation model hasn’t changed since you started using AI, it’s time to rebuild it. Not because the old one was wrong. Because the world it describes is gone.
FAQs
How is AI used across the software development lifecycle?
AI is used across all stages of the SDLC. For example, natural language processing models can analyse stakeholder documentation to generate requirements. Machine learning is used to suggest software architectures or detect code smells. Generative AI tools assist in writing and reviewing code. Automated testing frameworks use AI to create and execute test cases, while bug-detection tools leverage pattern recognition to identify and fix issues. In deployment and operations, AI-driven monitoring systems, such as anomaly detection and predictive incident response, help optimise system performance.
What is AI-native delivery?
AI-native delivery is an approach to building software in which AI is treated as a collaborative partner throughout the entire delivery process, rather than a tool used to speed up individual steps such as coding or testing. Instead of optimising each phase separately, it rethinks what software delivery looks like when AI actively participates from planning through deployment.
This shift requires reimagining phases, roles, relationships, and how value is created. Work becomes more continuous and integrated, with AI contributing across the lifecycle while humans focus more on decision-making.