
The collapse of flat rates and the rise of autonomous agents
INSIGHT #16


4/5/2026 · 8 min read
TL;DR

"This week marked a brutal turning point in the AI market, signaling the end of free testing and unlimited compute. We have entered an era of heavy orchestration, where architectural efficiency and autonomous agents dictate the new rules of corporate survival."


This week marked a brutal turning point in the artificial intelligence market. I have spent the last few days analyzing a series of announcements that, taken individually, seem like normal product evolutions, but when combined reveal a very clear picture: the era of free testing and unlimited compute is over. We have entered the phase of heavy orchestration, where computing costs are offloaded onto users and architectural efficiency becomes a matter of corporate survival.

I have seen business models fall that seemed untouchable and development paradigms born that will make the workflows we used until a month ago obsolete. Make yourself comfortable, because there is a lot to unpack.

The unsustainable weight of agents and the end of flat rates

Anthropic made a decision that shook developer communities: it closed access to third-party frameworks like OpenClaw for its Pro and Max subscribers. This is not a simple change to the terms of service, but a declaration of infrastructural surrender. The business model based on flat-rate subscriptions fails miserably in the face of the greedy nature of agentic frameworks.

When you leave an agent in a loop to solve a complex task, it generates continuous cycles of API calls. It checks the code, fails, rewrites, searches the internet, tries again. This process burns compute resources at a speed that is unsustainable for providers. AI companies are realizing the core problem: servers struggle to handle the load of continuous automations. Switching to pay-as-you-go billing shifts the risk directly onto our shoulders.

I need to immediately update my local development stacks. This change of course forces me to heavily optimize prompt design and agent memory management. Switching to pay-as-you-go carries an obvious risk: an infinite loop caused by a trivial bug now directly empties the credit card. The solution I am working on involves exploiting smaller local models for intermediate computations, calling the Claude or GPT-4 APIs exclusively for final validation. It is the end of the wrapper era and the dawn of autonomous development agents, but this time we have to pay the true infrastructural price for it.

OpenAI's 852 billion move and the death of Sora

While Anthropic tries to stem costs, OpenAI has formalized a devastating 122 billion dollar capital increase, reaching the astronomical valuation of 852 billion. Giants like Amazon, Nvidia, and SoftBank are pumping liquidity for a single purpose: to boost the computational infrastructure.

But the real news isn't the money, it's the product. They released the ChatGPT Super App, merging web search, the Codex coding agent, and agentic capabilities into a single unified interface. This is a centralization designed to convert 900 million weekly active users into potential corporate clients.

To make room for this B2B monster, OpenAI has decided to officially kill Sora. Generating realistic videos looked great on social media, but producing agents capable of orchestrating code brings real business value. I prefer a unified dashboard capable of executing complex actions a thousand times over a GPU-intensive clip generator. Using almost a billion consumer users as a Trojan horse to accustom people to the "agent-first" interface and then selling it to companies is an absolute strategic masterpiece.

The end of the traditional IDE: welcome Cursor 3

If there is a release that will physically change the way I spend my days, it is Cursor 3. The team has decided to rewrite the rules of software development by eliminating the classic IDE structure we have been used to for decades.

The focus shifts from manual code editing to managing actual fleets of AI agents working in parallel. The interface is now "agent-first", optimizing the shared context between different processes. I have spent the last year fitting prompts into chat windows that were too small, waiting for the model to finish one line at a time. Now I launch three refactoring tasks in the background and in the meantime I write the logic in the main module.

We have moved from glorified copilots to true orchestration tools. Those who continue to use traditional editors will lose a huge competitive advantage within a few months. This evolution perfectly matches the logic of "AI leaves the browser and takes control of the terminal", where automation is no longer a passive suggestion, but a direct action on the operating system.
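That fan-out workflow can be reproduced with nothing but the standard library. The agent runner below is a hypothetical stand-in for an LLM-driven refactoring pass, not Cursor's actual API; what matters is the shape: launch the background tasks, keep the main thread free, collect results.

```python
from concurrent.futures import ThreadPoolExecutor

def run_refactor_agent(task: str) -> str:
    # Hypothetical stand-in for an LLM agent refactoring one module.
    return f"{task}: done"

tasks = [
    "rename symbols in auth.py",
    "extract helper in db.py",
    "add type hints in api.py",
]

# Fan out the three refactoring tasks in parallel; the main thread
# stays free for interactive work until the results are collected.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_refactor_agent, tasks))
```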

Technical Insight

The Anthropic leak and Microsoft's cross-validation

Curiously, while Cursor innovates the interface, the secrets of the underlying engines are starting to leak. Anthropic accidentally published crucial parts of the Claude Code source code online. By analyzing the repositories that emerged, I was able to study the technical details on how the model gathers information about the user's system and manages the autonomous execution of tasks.

Seeing the telemetry logic in plaintext forces me to reflect on the security of the agents running on our machines. I must constantly monitor the data read locally by these intelligent executables. The forced transparency of this incident accelerates my practical understanding of how to build true operational agents, providing me with an exact map of the system prompt structures used to avoid hallucinations.

And speaking of hallucinations, Microsoft has started releasing Copilot Cowork. The approach is brilliant: the framework assigns tasks to multiple AI models in parallel and forces them to verify their respective outputs before presenting them to the user. I have been using similar patterns in my local scripts for months. Tasking a secondary model to act as a ruthless reviewer for the primary model increases the reliability of the output exponentially. Seeing this logic integrated directly into enterprise products marks a decisive turning point.

Open source gets aggressive with Gemma 4

In the midst of this infrastructure war, Google made a massive move by releasing the Gemma 4 family under the Apache 2.0 license. We are talking about models that scale from edge devices up to high-end workstations with the 31B dense model.

I consider the move to the Apache 2.0 license the real news of the month. I can finally integrate a high-level Google model into commercial enterprise products, avoiding the continuous legal constraints of old licenses. I have already started testing the 2B parameter version locally and it is perfectly suited for lightweight RAG pipelines on peripheral devices. AI moves to the edge: the pragmatic revolution I was waiting for in automation is finally finding the right models to scale without constantly depending on cloud servers.
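The retrieval step of such a lightweight pipeline fits in a few lines. A real edge deployment would replace the bag-of-words similarity below with embeddings from a small model (pairing it with the 2B Gemma 4 variant is my assumption); the stdlib version just makes the flow runnable anywhere.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    # Bag-of-words stand-in for a small embedding model on the edge.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    qv = vectorize(query)
    return sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)[:k]

docs = [
    "invoice processing runs nightly on the edge gateway",
    "the coffee machine is on the second floor",
]
top = retrieve("when does invoice processing run", docs)
```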

Transforming manual processes into scalable and measurable flows is the only way to survive the shockwave of autonomous agents.

The Sequoia report: 60 billion reasons to automate

To understand where we are going, just read the latest analysis by the Sequoia fund on legal services. They estimate that about 60 billion dollars of outsourced work will be absorbed by artificial intelligence-based "autopilots".

The division outlined by the report is surgical: on one side "intelligence" (complex tasks, but based on scalable rules), on the other "judgment" (the domain of human exceptions). The addressable market starts with already outsourced services: companies already have allocated budgets and buy a final result. The client does not care in the slightest whether an NDA was drafted by a junior lawyer at three in the morning or by an agent orchestrated via API in three seconds.

Every day I see companies wasting enormous amounts of time on document workflows that can be entirely resolved by autonomous AI agents. This is the definitive transition from passive copilots to operational autopilots. Those who continue to sell man-hours on repetitive tasks will be wiped out by new AI-native models.

The tools of the week

As always, I have tested dozens of repositories and new launches. Here are the ones that truly deserve space in your tech stack:

| Tool | What it does | Why use it today |
| --- | --- | --- |
| LiteLLM | Proxy gateway to centralize API calls to multiple LLM providers. | Fundamental for managing rate limits and load balancing across OpenAI, Anthropic, and local models. |
| Holo3 | Model to delegate complete tasks to the AI directly on the screen. | Overcomes the limits of traditional APIs, acting directly on the operating system's GUI. |
| GLM-5V-Turbo | Multimodal model by Zhipu AI to convert mockups into executable code. | Monstrously accelerates the transition from frontend design to React/Vue implementation. |
| Copilot CLI /fleet | Runs agents in parallel from the terminal by declaring dependencies. | Perfect for orchestrating massive refactoring without blocking your development machine. |
| Pinecone Assistant | Managed knowledge layer for AI applications in production. | Solves the persistent memory problem for agents without having to manage complex vector databases. |
| Netflix VOID | Open-source framework to remove objects from videos. | Incredible tool for post-production; dynamically reconstructs the physical interactions of the scene. |
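The gateway pattern LiteLLM implements can be illustrated with a stdlib sketch. The provider functions and the exception below are hypothetical stand-ins, not LiteLLM's real API: the point is a single entry point that walks a priority list and falls back when a provider is throttled.

```python
class RateLimited(Exception):
    """Raised by a provider stub to simulate a 429 response."""

# Hypothetical provider stand-ins, not LiteLLM's real API.
def call_openai(prompt: str) -> str:
    raise RateLimited("simulated 429 from provider A")

def call_anthropic(prompt: str) -> str:
    return f"anthropic: {prompt}"

def call_local(prompt: str) -> str:
    return f"local: {prompt}"

PROVIDERS = [call_openai, call_anthropic, call_local]

def completion(prompt: str) -> str:
    """Single entry point: try providers in priority order with fallback."""
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except RateLimited:
            continue  # provider throttled: fall through to the next one
    raise RuntimeError("all providers exhausted")
```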

Weak signals from the market

Besides the big news, there are underground movements that deserve attention. Mistral AI has raised 830 million in debt to build a European super cluster, trying to maintain the computational independence of the old continent. Meanwhile, Nvidia consolidates its hardware monopoly by investing 2 billion in Marvell to dominate silicon photonics.

On the alternative hardware front, Deepseek v4 will run exclusively on Huawei chips, marking an increasingly clear split between the Western and Asian ecosystems. Alibaba's Qwen team also continues to amaze, developing an algorithm capable of doubling reasoning throughput with the same compute.

Everything points in the same direction: infrastructure is becoming the real bottleneck. The models are ready, the agents know what to do, but the silicon struggles to keep up with them. Optimizing calls and mastering local orchestration is no longer a quirk for geeks, it is the only skill that will guarantee survival in this market.

Found it useful? I have more like this.

Every week I pick the most interesting and high-impact AI news and share them in an email recap. Subscribe so you don't miss the next one.


Lavora Meglio con l'Intelligenza Artificiale

My practical AI guide focused on real everyday work tasks: emails, reports, slides, data, and automation. Practical examples and ready-to-use prompts to save time and work better right away.

Discover the book

Before you go, I recommend you also read these insights.

The collapse of Sora and the dawn of true operational agents

This week I witnessed one of the sharpest contrasts in recent AI history: the sudden shutdown of Sora and the silent explosion of autonomous background tools.

The end of the wrapper era and the dawn of autonomous development agents

The artificial intelligence market is undergoing a genetic mutation, shifting away from lightweight API wrappers toward autonomous, open-source agents. Here is how local execution and enterprise infrastructure are radically changing the way I write code.

The fall of chaotic agents and the dawn of deterministic infrastructure

This week marked a clear watershed in how we think about and build artificial intelligence systems. I spent the last few days reorganizing my work pipelines because the news from major research labs literally wiped away months of widespread industry beliefs.


Fabrizio Mazzei

AI Solutions Architect

As an AI Solutions Architect I design digital ecosystems and autonomous workflows. Almost 10 years in digital marketing, today I integrate AI into business processes: from Next.js and RAG systems to GEO strategies and dedicated training. I like to talk about AI and automation, but that's not all: I've also written a book, "Work Better with AI", a practical handbook with 12 chapters and over 200 ready-to-use prompts for those who want to use ChatGPT and AI without programming. My superpower? Looking at a manual process and already seeing the automated architecture that will replace it.
