I have spent the last six months using an agentic coding assistant (mostly Claude Code, occasionally Cursor, and briefly a couple of others I'd rather not name) as a genuine, load-bearing part of my daily workflow. Not as a toy. Not as a demo. As the thing I reach for before I open a new file. This is a field report.
Short version: the productivity gains are real, the limits are also real, and most of the online discourse is cartoonishly wrong in both directions at once.
What has genuinely changed
Three things, in descending order of impact.
1. The cost of exploratory work has collapsed
"What would this feature look like if I built it with library X instead of library Y?" used to be a half-day of work. Now it's thirty minutes. You end up with a throwaway implementation that is good enough to form an opinion on, which is all exploration needs. I make decisions now that I would previously have deferred because the cost of the decision-input was too high. The second-order effect is larger than the first: the quality of my architectural choices has improved because I am choosing from more options.
2. Boilerplate is approximately a solved problem
Migrations, API client generation, test scaffolding, plumbing between services. The unglamorous tax on any real engineering task. An assistant that has read your codebase and can produce a correct, idiomatic draft in thirty seconds is a step-change, and the most valuable thing about it is that it frees your attention for the parts of the task that are actually interesting.
3. Reading code is faster
Dropping a large file into a conversation and asking "what does this do, and where are the non-obvious bits?" is a better first pass than I would do manually. I still read the code myself afterwards — you cannot trust the summary — but the summary is a useful orientation.
What hasn't changed
1. The hard part is still the hard part
The assistant is not going to work out what the product should do, which data model to pick, how to negotiate with the team downstream, or what the real constraint on the system is. The genuinely difficult work of engineering is the same work it was two years ago. What has gone faster is the work around it.
2. Debugging is harder, not easier
This is the point most AI coding advocates skate over. When code you wrote yourself breaks, you can step through it with a model in your head of why each line is there. When code you generated breaks, you have to build that model after the fact, which is slower. I now insist on reading every line of every non-trivial change before I commit it. The speed-up from generation is real but the speed-up from "don't read it" is a trap.
3. The assistant is confident when it shouldn't be
The single biggest operational hazard. The assistant will, with absolute calm, produce a plausible-looking function that imports a method that does not exist in the library you are actually using. The output looks right. Tests pass if you forget to run them. You only find out when it lands in production.
The protection against this is old-fashioned: CI, tests, and the discipline to trust the tools you can verify over the tools you can't.
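One cheap guard worth having in that CI pipeline — a sketch of my own, not a prescription: a smoke test that imports every submodule of your package, so a hallucinated import blows up in CI rather than in production. The package name is whatever yours is; the stdlib's `json` stands in here.

```python
# Hypothetical import smoke test: walking and importing every submodule
# of a package surfaces references to modules that don't actually exist.
import importlib
import pkgutil

def import_failures(package_name: str):
    """Try to import every submodule of a package; return what broke."""
    pkg = importlib.import_module(package_name)
    failures = []
    for info in pkgutil.walk_packages(pkg.__path__, pkg.__name__ + "."):
        try:
            importlib.import_module(info.name)
        except Exception as exc:  # ImportError, AttributeError, etc.
            failures.append((info.name, repr(exc)))
    return failures

# Run against the stdlib's json package as a stand-in for your own:
assert import_failures("json") == []
```

It will not catch a call to a method that doesn't exist on an object, but it catches the whole class of "imports a thing that isn't there" for the cost of one fast test.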
Concrete workflow changes I've made
- Every repo I work on has a `CLAUDE.md` or equivalent at the root that tells the assistant the conventions, the commands it should run, and the pitfalls specific to the project. The difference between an assistant that has read that file and one that hasn't is about 2x in quality.
- I treat the assistant like a sharp but context-free junior engineer. I write tickets for it with explicit acceptance criteria. "Make it work" is a bad brief. "Add an idempotent POST endpoint for /payments, returning 201 on create and 200 on idempotent replay, using the existing HMAC middleware" is a good one.
- I commit every change the assistant makes separately. Atomic commits are a force-multiplier: if a change is wrong, `git revert` is cheap.
- I do not let the assistant run arbitrary shell commands without a confirmation prompt. The convenience is not worth the class of error where it silently reformats a file or runs a destructive migration.
The hype cycle is wrong, twice
The maximalists ("no engineer will have a job in eighteen months") are wrong. The minimalists ("this is glorified autocomplete and you'll regret using it") are also wrong. Both positions are comforting, which is why people hold them.
The accurate position, as far as I can tell: LLMs have made certain engineering tasks faster, by a factor that varies wildly by task. They have not made judgement faster. They have not made taste faster. They have not made deciding-what-to-build faster. The engineers whose jobs are in trouble are the ones whose work was overwhelmingly the kind that has been accelerated. The engineers who will thrive are the ones whose work is overwhelmingly judgement.
I am not sure yet where the line is. I am sure it is not where most confident people think it is.
— Nivaan