AI as a Chance - Opinion
Use AI. Use it more and better. If you are not yet equipped to use it well - that is fine, learning takes time - but please do not inhibit those in the community who are.
That is the whole argument. The rest of this piece is why I think it is correct, and why I think the current register of the Perl community around this topic is costing us something specific and avoidable.
Who is saying this, and why that matters
PetaMem, an AI company, was founded in 2001. I ran it - still do. My MSc is from an AI department. I was doing neural networks in the 1990s, when the field was still in one of its winters and the word "AI" was something one did not put on a grant application. I have not stopped working on this for thirty years.
Which makes me exactly the kind of person whose opinion on AI should be suspected of motivated reasoning. Someone who has built a career on a technology has an obvious incentive to tell you it is good. You are safe, because I have spent most of my adult life living with a particular internal watchdog that distrusts convenient conclusions - the INTJ tendency to run every thought through an "is this actually right?" pass before letting it speak (and even then it does not stop). For the record: I don't share the fear of AI that many seem to feel, and I won't pretend I do.
In 2012 I gave a keynote at the London Perl Workshop about "Perl Strategy". You know: how to make Perl great again. It resulted in the formation of Propaganda.pm, an effort aimed at improving Perl's public standing and institutional visibility. I stopped working on Propaganda.pm around 2017, for lack of enthusiasm from the community. No worries: I am not restarting it - this piece is not a campaign. It is one text, offered once, because something specific has changed since 2017 - the arrival of a technology that alters the arithmetic of what a small group of people can do - and the implications matter for Perl in particular.
What has changed
Capability in general-purpose AI has improved over the last eighteen months faster than most people have updated their priors. Scepticism acquired six months ago is already out of date. Worse, that calcified scepticism biases your judgment in the present: the model that frustrated you then has been replaced - possibly more than once - by something that behaves differently, fails differently, succeeds differently, and wants different things from its user. You are evaluating today's capability through a lens shaped by a version that no longer exists. Updating in the face of rapid change is the only honest response to it.
"AI slop" is real. Bad AI-generated code exists and is ubiquitous. But if you look carefully at where the slop comes from, the pattern is almost always the same: the failure is not in the model. The failure is in the interface between the model and the task - what was asked, how it was asked, how the context was prepared, how the output was reviewed, what was done with it afterward. A skilled operator and an unskilled operator, using the same model on the same task, produce radically different output. We have all the evidence we need that this is true; we just do not often name it, because naming it is uncomfortable. "Slop" is, in the great majority of cases, a description of a process rather than the product of this technology.
The scale nobody is quite talking about
Here is an observation I find more interesting than any benchmark. When I ask an AI to estimate how long a task will take, the estimates it produces are clearly calibrated on training data from human developer estimates - "three weeks", "a month", that sort of thing. I have found, repeatedly and consistently, that those estimates are off by a factor that lives somewhere in the ballpark of a thousand. Give or take. The AI is, in a literal sense, unable to see its own impact. It is estimating how long it would take a human without AI to do the work, because that is what its training data contains.
I am not sure people have fully internalised what this means. The models cannot currently estimate the productivity of humans using them, because the training data for that does not yet exist at scale. Which means that even the people who use AI every day are operating with intuitions that were formed before the current capability existed, and which the AI itself cannot correct. The feeling of "this took me three days" and the reality of "this represents three months of pre-AI work" can coexist in the same person without the person noticing the gap.
What this adds up to, for someone who has learned to use the tool well and who picks the right problems to aim it at: the output of a single individual can easily exceed what was previously the output of a medium-sized development team. My current conservative estimate of my own output relative to a standard human-only development team is a factor of fifty to one hundred. This is not a claim about me specifically. It is a claim about what the ceiling is, for any individual developer who applies themselves to learning the craft of AI-assisted work. The ceiling was not this high eighteen months ago. It is this high now.
You do not have to take my word for this. The 575-commits-in-a-week event that has been discussed recently in the Perl community is an instance of the same phenomenon. The community has seen the output. The response has mostly been to focus on fifteen or twenty commits out of the 575 that were weak. I want to come back to that response below.
A different job, not a faster one
It is tempting to describe what changes with AI as "becoming a faster developer". That is the wrong frame. The metaphor I keep coming back to is a naval one.
In the old ecosystem, programmers were sailors. You made sure the bulkheads were closed, the galley was filled, the deck was swabbed. You were good or bad at your job depending on how well you pulled the ropes. With AI used well, you stop pulling ropes. The ropes get pulled. You become, if you choose to, an admiral - someone who orders a fleet to take a strategic position around a group of isles, who makes directional choices, who reviews outcomes, who integrates. You are not doing the old job faster. You are doing a different job.
This has implications that are worth sitting with. Experience and judgment matter more in the admiral's job than in the sailor's job, not less. The admiral's chief skill is knowing what ought to be done, what a good result looks like, and when the fleet's output is off-course. That is exactly the skill a senior developer has accumulated. The replacement narrative - "AI will take our jobs" - gets the transition backwards. The people whose jobs are most at risk are the ones whose work consists of pulling ropes that AI can now pull. The people whose value goes up are the ones with enough experience to direct the fleet. If you have been in this profession for ten or twenty years, the capability that has just arrived is not your threat. It is your lever.
The fear, named accurately
I want to talk about what is happening in the community around AI, because I think we are having the wrong debate.
Quality concerns about AI-generated code are legitimate. There is bad AI-generated code. There are real code-review cases where AI-produced output made a codebase worse. People evaluating individual contributions on their merits and finding them wanting are doing the work that code review is supposed to do. None of what follows contradicts any of that.
But the debate in the community is not, structurally, a debate about quality. It has a specific shape. Out of 575 commits, fifteen or twenty bad ones get selected and made representative of the whole. That selection is not a quality evaluation. No dispassionate evaluator, trying to characterise a body of work, would pick from the worst 3% and treat it as the norm. The consistency of this pattern - across many individuals, many contexts, always selecting in the same direction - tells us that something other than quality evaluation is happening.
Consider how we handle this in the case of human developers. We have all encountered good developers and bad developers. Some human-produced code is excellent; some is a disgrace. We do not, on the basis of the disgraceful code, conclude that human developers as a class should be treated with suspicion. We evaluate individuals on their individual output. We extend to other humans, by default, the courtesy of being judged on their specific work in its specific context.
The same courtesy is not being extended to AI-assisted contributions in the current community register. A class-wide judgment is being drawn from a selected subset of output, in a way we would recognise as unfair if applied to any other category of contributor. The inconsistency is not subtle. And the consistency of the inconsistency - across many people, many threads, many contexts, always in the same direction - suggests that what is being expressed is not an evaluation. It is something else.
The honest name for that something else, in most cases, is fear of obsolescence. It is not shameful. It is one of the most predictable human responses to rapid capability change. But fear dressed as a quality argument is not an argument, and treating it as one means we are debating a simulation of the real disagreement.
The cost
A community that spends its AI debate in this form is a community not spending that time learning the tool. The members who are already skilled at using it do not pay the cost - they adapted, they moved on, they are building things. The cost is paid by the members who are not yet skilled and who are not receiving social permission from their peers to become skilled in public. In a community where the dominant register around AI is suspicion, experimenting with AI publicly is socially expensive. That expense is a tax on exactly the learning the community most needs to be doing.
A management mentor of mine once told me: "I have never seen a company go bust because it had to. All the companies I have seen go bust did so because they committed suicide." At the time I thought he was wrong - surely external circumstances, markets, competition. He was not wrong. Every declining community I have watched up close has declined by choice, one quiet decision at a time, each decision looking reasonable in isolation, the collective arithmetic only visible in retrospect. The companies my mentor described did not vote to commit suicide. They just kept making small reasonable-seeming choices that added up to it.
The Perl ecosystem has been on a cooling trajectory for a long time. The exponential drop is far behind us; we are in the long tail now, and the long tail may persist for a long time. Like a white dwarf, still radiating, nowhere near its former brightness. This is not a eulogy. White dwarfs are stars. They persist. What they do not do is get brighter on their own, without something changing the equations they are operating under.
AI is a change in the equations. It is the largest change in the equations of software development in the lifetime of anyone reading this. It is also a change that rewards exactly the kind of judgment that an experienced community has accumulated. The ecosystem problem of Perl - and I say this with affection, having worked on it for decades - is fundamentally a problem of hands. There are not enough of us to maintain what exists, let alone to expand it.
A community that arrives at this change and chooses to spend its collective time on the 3% that is bad, rather than on the 97% that demonstrates what is now possible, is a community making the small reasonable-seeming choice that my mentor described. And if the pattern continues, the arithmetic will do what it does.
The tools are here. The question is whether we embrace them.
My answer is hereby on record.
- Richard C. Jelinek, PetaMem s.r.o.
AI coding is here and it's not going away.
It is to coding what combine harvesters are to farming. A factory is perhaps a better metaphor - hand-made cars simply cannot compete with automated production lines.
So now that code is generated by robots, much as cars are, the game is to manage quality whilst maintaining throughput.
Teams can't rely on manual code review and tribal knowledge any more - they simply cannot keep up.
Tools like perltidy and perlcritic (even perlimports), with comprehensive and thorough policies, don't solve everything - they never did with humans either - but unlike humans, they scale.
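As a sketch of what "comprehensive and thorough policies" can mean in practice - the specific policies and layout settings below are illustrative starting points, not a recommendation:

    # .perlcriticrc -- a severity-3 baseline, tightened where it matters most
    severity = 3
    verbose  = %f:%l:%c %m (%p)\n

    [TestingAndDebugging::RequireUseStrict]
    severity = 5

    [TestingAndDebugging::RequireUseWarnings]
    severity = 5

    # .perltidyrc -- one canonical layout, applied mechanically
    # maximum line length 100, 4-space indents, 4-space continuation indent
    -l=100
    -i=4
    -ci=4

Once files like these exist in the repository, "run perlcritic and perltidy before handing the code back" becomes part of the agent's loop rather than something a human has to remember.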
Similarly, thorough and comprehensive unit testing is now easily achievable. It's trivial to teach an agent how to use Devel::Cover to measure the coverage of the unit tests it's writing. Does that make code bug-free? Absolutely not. But, as with humans, unit tests provide the guard rails that invite greater success - and again, they scale.
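Teaching that loop amounts to showing the agent a couple of commands; something along these lines (the t/ path and report format are illustrative) is usually all it takes:

    # run the test suite with coverage collection enabled
    HARNESS_PERL_SWITCHES=-MDevel::Cover prove -lr t/

    # summarise what the tests touched; an agent can read this report and
    # go back to write tests for whatever is still uncovered
    cover -report text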
Comprehensive Pod helps AI agents just as it does humans - perhaps more so, since agents have limited context windows. Agents can write the Pod alongside the code as they author it, and that scales easily.
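A sketch of what that looks like - the module and sub here are invented purely for illustration:

    package Acme::Example;   # made-up name, purely for illustration

    use strict;
    use warnings;

    =head2 normalise_name

    Trims leading and trailing whitespace from a raw name string and
    collapses internal runs of whitespace to single spaces. Returns the
    cleaned string.

    =cut

    sub normalise_name {
        my ($name) = @_;
        $name =~ s/^\s+|\s+$//g;    # trim leading/trailing whitespace
        $name =~ s/\s+/ /g;         # collapse internal whitespace
        return $name;
    }

    1;

The documentation is written in the same pass as the code, so it never lags behind it - and it is exactly the kind of local context an agent can load without blowing its window.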
Code release processes then move into the spotlight. After the unit test pipelines we add integration test pipelines, canary testing in production, and feature flags to rapidly toggle new code on and off. Error budgets and comprehensive health metrics allow new code to go live safely, with a controlled blast radius.
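Feature flags in particular need very little machinery to earn their keep. A minimal sketch follows - the flag is read from an environment variable purely for illustration; a real system would typically read it from configuration or a data store so it can be flipped at runtime:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # Minimal feature-flag guard; the environment-variable lookup is
    # illustrative only - substitute your configuration source of choice.
    sub feature_enabled {
        my ($flag) = @_;
        return $ENV{ 'FEATURE_' . uc $flag } ? 1 : 0;
    }

    if ( feature_enabled('new_renderer') ) {
        print "new code path: freshly generated, behind the flag\n";
    }
    else {
        print "old code path: the known-good fallback\n";
    }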
This was all helpful for humans too, but so often the volume of code being produced was low enough that "be really careful" was perceived as a good-enough release process - despite how risky it was and how much it slowed development down.
Unsurprisingly, all the things that actually helped humans develop better also help AI agents develop better. And that includes a clear plan with clear success criteria!
I disagree with the comparison to automation in agriculture or factories. That automation is designed to reliably perform a defined task and simply does it much better than a human can, and so the downside is confined to economic labor impacts, which our society must simply adjust to.
LLM "agents" are probabilistic models that will never be reliable at completing the task because the technology is unreliable by nature, and because the task is not defined but instead has an artistic nature with sprawling, unclear outcomes. So it will continue being refined in usefulness but never reach the ability to replace human intelligence and responsibility unless it is reinvented with logic models instead of LLMs (which I will add was a promising machine learning exploration before we became LLM-obsessed).
Perhaps a better analogy is self driving cars: they are becoming better at achieving the ostensible task, but are unable to be reliable at navigating streets full of humans, and thus full autonomy will forever carry unacceptable risk.
I disagree with your notion of LLMs being "probabilistic models". You may be confusing LLMs with SNLP. LLMs can be configured to produce deterministic output, but in practice this never happens and there is no value in making it happen. If you need that, write algorithms.
Also, LLM does not equal LLM. There are vast differences in architecture (the inference pipeline), so throwing them all into the same pot is ... let's say unscientific at best.
The use of the word "forever" on the topic of self-driving cars looks like one of two things: a) a calcified opinion formed some time back, or b) being a taxi or Uber driver. Putting humans behind the wheel will forever carry unacceptable risk - how does that sound?
You may be right that the technology of "LLM" is unreliable in nature. I put LLM in quotes because that term itself is outdated. How would you explain the fact that you can give a "Large Language Model" a picture for it to analyze? So now they are Large Language & Picture Models? LLPMs? No. They are neural networks with some arbitrary modality vectors pointed at them.
They are unreliable in nature. They must be, because they emulate human cognition surprisingly well. They omit, they fake, they hallucinate. You do too. By extension, humans are unreliable in nature. But we knew that (see your judgement in automation tasks) and don't seem to have a problem with it.
Be happy that what you term "LLMs" is effectively made to be a human-like cognitive workforce. Deal with it, embrace it.
I thought long about whether to add this paragraph, but I think it is necessary to explain my exhortation to "be happy" from above. At PetaMem we also have AIs that are NOT a human-like cognitive workforce and you are not ready for that. Your cognitive processes are mostly built on language (and perceptions of the physical world - see, feel, hear, smell,...). They are not built on math, which these AIs are.
Our language itself is inconsistent and unreliable in nature. Being the basis for our cognitive processes, it is - in fact - a very underperforming vehicle. Your (and basically everyone else's) notion of the concept of "opposite" is a delusion, for starters. When this hits the streets, you had better have made peace with the human-like AIs.
I will not, and they are not humanoid in any sense. Thank you for your opinion.