AI agents: We're in for a wild ride
For the first time since ChatGPT, something has shifted.
When I think about moments in AI that made me think, “holy 💩,” there have only been two so far.
The first was in November 2022, when ChatGPT came out. I probably don’t need to explain why. The second was just a few months later, when GPT-4 arrived, fixing nearly all the obvious flaws of GPT-3.5, the model behind the original ChatGPT.
What was astonishing about GPT-4 was how quickly it followed ChatGPT. That rapid turnaround suggested ChatGPT wasn’t just some once-in-a-decade surprise, but the beginning of a larger trend.
Since then, the pace has stayed incredibly high, but we’ve grown used to it. Nothing felt like a true breakthrough anymore. At least, not until now.
Enter Claude Code
Claude Code and OpenAI’s Codex are agentic coding tools. Simply put, you give them access to a code base and then discuss with the tool (i.e. the agent) what kind of changes or additions you’d like to make to the code. The tool passes your request and the relevant files to an AI model, carries out the steps the model proposes (reading files, editing code, running commands), and tries its best to fulfil your requests.
Described like this, it might sound rather mundane. But two aspects make it remarkable. First, giving an AI model direct access to relevant files is extremely powerful, and anyone who’s ever uploaded a file to a chatbot has experienced this. Now, however, the AI can potentially access hundreds of relevant files at once.
Second, the models have become exceptionally good at acting as intelligent agents. Claude Opus 4.5 is particularly strong in this respect, as is GPT-5.2. These two models have fundamentally changed the game for agentic coding tools like Claude Code or Codex.
It’s now entirely possible to build reasonably complex applications just by talking to these agents. Basic prototype apps are straightforward to create. More advanced apps still require some expertise, but if you have that expertise, you can now move incredibly fast.
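To make the description above a bit more concrete, here is a minimal sketch of the kind of loop such a tool might run: ask the model for the next action, carry it out on the files, feed the result back, and repeat. This is not how Claude Code or Codex are actually implemented. The names Action, scripted_model and run_agent are hypothetical, and the "model" is a hard-coded stand-in so the example runs without any API access.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Action:
    """One step proposed by the model (hypothetical format for this sketch)."""
    kind: str          # "read", "write", or "done"
    path: str = ""     # file path relative to the project folder
    content: str = ""  # new file content, or the final summary for "done"


def scripted_model(history):
    """Hard-coded stand-in for a real model call (assumption, not a real API).

    It reads greet.py, rewrites it with "Hello" replaced by "Hi", then stops.
    """
    observations = [text for role, text in history if role == "tool"]
    if len(observations) == 0:
        return Action("read", "greet.py")
    if len(observations) == 1:
        return Action("write", "greet.py", observations[0].replace("Hello", "Hi"))
    return Action("done", content="Changed the greeting to 'Hi'.")


def run_agent(request, workdir, model, max_steps=10):
    """Ask the model for actions and execute them until it reports "done"."""
    root = Path(workdir).resolve()
    history = [("user", request)]
    for _ in range(max_steps):
        action = model(history)
        if action.kind == "done":
            return action.content
        target = (root / action.path).resolve()
        if root not in target.parents:
            # Keep the agent inside the project folder.
            observation = "refused: path is outside the project folder"
        elif action.kind == "read":
            observation = target.read_text()
        elif action.kind == "write":
            target.write_text(action.content)
            observation = f"wrote {action.path}"
        else:
            observation = f"unknown action: {action.kind}"
        # The result of each step goes back to the model as context.
        history.append(("tool", observation))
    return "stopped: step limit reached"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        project = Path(folder)
        (project / "greet.py").write_text('print("Hello")\n')
        print(run_agent("Change the greeting to Hi", project, scripted_model))
        print((project / "greet.py").read_text())
```

In a real tool, the scripted function would be replaced by a call to an actual model, and the set of actions would be much richer (running commands, searching the code base, and so on). The basic shape of the loop is the point here.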
Since I might not be able to fully capture just how powerful these tools have become, let me instead quote Andrej Karpathy, one of the most prolific developers and thinkers in AI. Remember, this is someone who taught Deep Learning at Stanford and led AI at Tesla:
Not stopping at code
Having used these tools myself for coding, I am positively shocked at how good they are now. Last year, people laughed when Dario Amodei, CEO and cofounder of Anthropic, said that AI could be “writing essentially all of the code” within 12 months. Since he said that in mid-March 2025, we still have two months to go, but I think it’s fair to say he wasn’t far off. Of course we’ll keep writing some code by hand. But increasingly, the idea that you would type code manually will feel as remote as typing assembly code, or punching holes in cards.
What’s more remarkable to me is that this development doesn’t seem to be stopping at code. Anthropic recently released Cowork, which is essentially Claude Code but for non-development tasks, and runs on your desktop without needing a terminal. Now, there are important security implications that will come back to haunt us. This is, after all, an agent with internet access, and some of the files it reads may contain hidden instructions planted to hijack it (so-called prompt injection). But it’s easy to see how powerful this will be.
Instead of using Cowork, I’ve been using Claude Code directly in a folder where I keep all the documents related to a project. It’s fair to say that at least for me, project management will never be the same. I’m a scientist, so my projects have management folders with grant applications, contracts, agreements, and so on. For collaborative projects, I sync all the files using GitHub, where I also maintain a wiki that keeps everything documented in real time. All I have to do is open Claude Code in the morning, ask for a status update, describe what I plan to do that day, and off we go.
I’ve been experimenting with this setup for just a few days now, and I’m already 🤯. Importantly, this is where I’m starting to think: “Hm, I was considering hiring a project manager for this project - is that really necessary now?” I’m not going to make sweeping labor market projections based on a few days of personal experience, but it’s one of those moments where you sense things are about to get quite wild.
The flip side is that if you learn to use these tools productively today, while the majority of people aren’t paying attention, you gain a massive competitive advantage.
Overall, I can’t help but think this is the first time since the launch of ChatGPT that something has shifted for me. It’s not just a GPT-4 moment where I realize everything might be happening faster than generally appreciated. Instead, it’s one of those moments where I feel that the impact is going to be much broader, and much sooner, than anticipated.

I agree. I use Cursor CLI with Claude Opus 4.5, and it's bloody amazing. And we have a ~2 million LOC codebase. Super tool. What is the next step?
Thank you for putting into words what I feel. The moment I opened Claude Code in my terminal for the first time, I knew that working on digital tasks would never feel the same again. Still, it's eager, and its approaches often need to be tamed... that's when the tight integration comes in handy ;)