I’ve been puttering away on my Quamina project since 2023. In the last few weeks GenAI has intervened. Quamina + Claude, Case 1 describes a series of Claude-generated human-curated PRs, most of which I’ve now approved and merged. Quamina + Claude, Case 2 considers quamina-rs, a largely-Claude-driven port from Go to Rust. Both of these stories seem to have happy endings and negligible downsides. So empirically, I can apply LLM technology usefully to software development. But should I?

Conclusions 1: Burn it with fire? · Let me be clear: In the big GenAI picture, I’m a contra. Why? I’ll pass the mike to Baldur Bjarnason, my favorite among GenAI’s blood enemies: “AI” is a dick move. His tl;dr is something like “GenAI is environmentally devastating and has the goal of throwing millions of knowledge workers onto the street and is being sold by the worst people and is used for horrible applications and will increase society’s already-intolerable level of inequality!” To which I reply “Yes, yes, yes, yes, and yes.”

At the end of the day, the business goal of GenAI is to boost monopolist profits by eliminating decent jobs, and damn the consequences. This is a horrifying prospect (although I’m somewhat comforted by my belief that it basically won’t work and most of the investment capital is heading straight down the toilet).

But. All that granted, there’s a plausible case, specifically in software development, for exempting LLMs from this loathing.

First of all, size. JetBrains thinks that the world has 21 million or so software developers, i.e. less than 1% of the earth’s working population. Vanishingly small in the context of the lunatic tsunami of LLM overinvestment. Training and operating the models required for a market this small is rounding error measured on the Great GenAI Overbuild scale. There aren’t enough geeks to create a detectable bump in the global carbon load.

Another odious aspect of LLMs is RLHF, “Reinforcement Learning from Human Feedback”, which relies on underpaying Third-Worlders to polish the models’ outputs. Presumably this is one of the tools Elon uses on Grok to enable Nazi propaganda and revenge porn. My guess is that much less is required for code-oriented LLMs. The combination of the compiler and your unit tests provides good starter guardrails. Then skilled professional intervention is required to deal with the remaining misfires, as with those Quamina PRs.

Finally, it seems making billionaires into multibillionaires is intrinsic to GenAI dreams. But software-development tools won’t do that. Once again, the market is just too small. But even if it weren’t, consider this from Steve Yegge:

For this blog post, “Claude Code” means “Claude Code and all its identical-looking competitors”, i.e. Codex, Gemini CLI, Amp, Amazon Q-developer CLI, blah blah, because that’s what they are. Clones.

(GenAI, overbuilding wherever you look.) None of these products have moats and the chance that any of them becomes an extractive monopoly is about zilch. Nobody’s ever built a major cash-cow on developer tooling.

One reason is (*gasp*) Open Source. Does anybody doubt that in the near future, there will be entirely open-source versions of what Yegge means by “Claude”?

So, if you want to condemn the use of GenAI in software development, I think you need arguments other than the fact that it’s also being promoted for societally-toxic business purposes.

I have a few. But stand by, let me push that on the stack and turn to technology for a bit.

Conclusions 2: Engineering sanity? · Question: Can LLMs even participate in quality software engineering? Baldur doesn’t think so: “The gigantic, impossible to review, pull requests. Commits that are all over the place. Tests that don’t test anything. Dependencies that import literal malware. Undergraduate-level security issues. Incredibly verbose documentation completely disconnected from reality.”

I’m not saying that these pathologies can’t or don’t happen. But in my personal experience with Quamina, they didn’t. (Mind you, it’s a hobby project.)

And when they do happen, I would assume that mature open-source projects will use a network of trust, as big operations like Linux already do. PRs that don’t have the imprimatur of someone known to be clueful will be ignored. When I saw the first of those incoming Quamina PRs, I took the time for a serious look because I knew Rob and had seen evidence that he was technically competent. If I see an incoming PR that’s nontrivial and from some rando and doesn’t pass a 120-second sanity check, it’s unlikely to get any more attention.

In fact, some essentials don’t change. If you’re not requiring that PRs be clean and test coverage be good and code reviews not be skipped and dependencies be curated, you’re going to get a lousy result whether the upstream code is coming from a human or an LLM.

But it’d be naive to think that a big change in the shape of that upstream isn’t going to affect the profession.

Bottlenecks · Speaking from personal experience, reviewing the PRs from Claude & Rob was neither faster nor slower, easier nor harder, than what I’m used to pre-GenAI. The number of my disagreements with the diffs, and the amount of arguing it took to resolve them, was also about as usual. Which creates a big problem. Because if we can generate code a whole lot faster but review doesn’t speed up, all we’ve done is move the bottleneck in the system.

Speaking of which, Armin Ronacher offers The Final Bottleneck, from which: “When one part of the pipeline becomes dramatically faster, you need to throttle input.” Think about that.

Burnout · Meanwhile, evidence is piling up that LLM-based software development is driving developers to overwork and burnout. Here’s a cool-eyed take from Harvard Business Review. Then there’s Steve Yegge’s frantic, overly-long The AI Vampire. But my favorite, and I think a must-read, is Siddhant Khare’s AI fatigue is real and nobody talks about it. From which: “AI reduces the cost of production but increases the cost of coordination, review, and decision-making. And those costs fall entirely on the human.”

The argument we’re hearing is that GenAI makes development more efficient. And more efficient is better. Until it’s not.

I’m not sure the profession I joined last century would attract me today. And on Mastodon, @GordWait said “At our office, we are noticing a huge drop in Comp Sci co-op applications. The next generation is convinced there’s no future in programming thanks to AI hype.”

Can and should · Here’s another conundrum. Suppose we can build a whole lot more stuff, faster. Should we? I don’t know about you, but I am regularly enraged at tools that work just fine popping up “wonderful new features” modals in front of what I’m trying to get accomplished. Also at damaging UI churn, driven by product managers trying to get promoted. It’s just not obvious that speeding up software development is, in the big picture, a good thing.

And I can’t help noting that every attempt to measure the productivity boost due to GenAI has shown zero (or worse) improvement. Of course, Claude’s cheering section will point out that those studies date to 2024, which is the stone age. Maybe they’re right.

Vampires · (In which I once again go all class-reductionist.) The real problem here is late-stage capitalism, and I think is best addressed in Yegge’s AI Vampires piece, from which I quote: “…dollar-signs appear in their [employers’] eyeballs, like cartoon bosses. I know that look. There’s no reasoning with the dollar-eyeball stare.” Yeah.

Thus the ancient question: cui bono? Assuming GenAI genuinely boosts productivity, who gets the benefits? Because the ownership class sure doesn’t think they should go to their newly-more-efficient employees.

But, what do I know? · I know that you gotta have test coverage or your software is an unmaintainable tangle of festering tech debt. I know you gotta have code review or your quality is on an inexorable downhill drift. I don’t know how to build LLMs into a sane, sustainable software engineering culture. Nor what to do about capitalism’s AI Vampires.

And I absolutely do not believe the wild-eyed claims of 10× productivity gains, assuming we demand (as we should) that they’re sustainable at scale.

So, would I advise executives to tell software engineering shops to discard their culture in favor of vibe coding in the expectation of monstrous productivity wins? Nope. Vibe engineering, maybe. Centaurs, not reverse centaurs? Indeed.

But would I say “Stay away, don’t even look”? Nope. I’d probably suggest pointing the LLM at well-delimited non-strategic issues and optimizations, and emphasize no shortcuts on reviewing or CI/CD standards.

Also note that the GenAI apostles are at one in saying that this year’s tools are so much better than last year’s, and next year’s are guaranteed to be qualitatively still better! So why would you rush in and risk getting locked into soon-to-be outmoded tooling?

Rob Sayre wrote “I would never bother to type out these patches by hand. But I read them all.” I probably wouldn’t have either and I read them too. And now Quamina is roughly twice as fast. Which is to say, I got good results on a hobby project. That’s not nothing.

But, also not conclusive. Once the AI bubble pops and we’ve recovered from the systemic damage, I think there’ll probably be a place for open-source LLM automation in developer toolkits.

But maybe not. Wouldn’t surprise me much, either way.

