
The end of copy and paste and the triumph of orchestration
INSIGHT #20
SundAI Blog

5/3/2026 · 7 min read
TL;DR

"This week I noticed a clear common thread among the major market releases: AI is evolving from a text-based interlocutor into a silent executor. Models are now skipping intermediate steps to generate final outputs directly and adopting enterprise-grade orchestration infrastructures."


This week I noticed a clear common thread among the main releases on the market: artificial intelligence is ceasing to be a textual interlocutor and becoming a silent executor. Models are starting to skip intermediate steps, generating final outputs directly and, above all, finally equipping themselves with orchestration infrastructures worthy of enterprise production environments.

It is no longer a question of whether a model responds well to a prompt, but of how much it costs to run it autonomously for hours and how stable its execution is. Following the thread of the launches of the last seven days, from Google to Mistral, passing through Xiaomi and Nvidia, the message is unequivocal: the prototype phase is over.

Farewell to copy and paste and native files

Google has radically transformed Gemini, turning it from a simple conversational assistant into a real document creation engine. The new feature lets you generate ready-to-use files directly from the chat interface: all it requires is a text prompt describing the desired content and the destination format. The aspect that struck me most is the total absence of pre-loaded templates: the model autonomously manages the structure and formatting of the data.

This is the end of tactical copy and paste from chatbots into corporate documents. I find the ability to generate Excel files or paginated presentations directly from a text request extremely useful. It changes my daily workflow and drastically cuts operational times. Previously, I had to generate the raw content, extract it, format it, and paginate it manually. Now I close the entire cycle inside Gemini.

Having support for technical formats like Markdown and LaTeX is a godsend for technical documentation tasks. I can already imagine automating the generation of structured reports from raw logs and then handing them to the development teams. This is the practical implementation I always look for in my projects: zero conceptual fluff, immediate impact on production times, and total integration with legacy corporate formats.
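That logs-to-report loop can be sketched in a few lines. This is a minimal, hypothetical pattern, not Gemini's actual API: `ask_model` is a stub standing in for whatever client you use, and the report content it returns is canned.

```python
# Hypothetical sketch of the raw-logs-to-Markdown-report pattern.
# `ask_model` is a stub: swap in a real LLM client (Gemini or other)
# once you wire up actual file generation.

def ask_model(prompt: str) -> str:
    """Stub standing in for a real model call; returns canned Markdown."""
    return "# Error Report\n\n| level | count |\n|-------|-------|\n| ERROR | 2 |\n"

def logs_to_report(log_lines: list[str]) -> str:
    """Summarize raw log lines into a Markdown report via one model call."""
    errors = [line for line in log_lines if "ERROR" in line]
    prompt = (
        "Generate a Markdown error report for the development team "
        f"from these {len(errors)} error lines:\n" + "\n".join(errors)
    )
    return ask_model(prompt)

logs = ["INFO boot ok", "ERROR db timeout", "ERROR disk full"]
report = logs_to_report(logs)
print(report.splitlines()[0])  # prints "# Error Report"
```

From here, passing the Markdown on to a paginated format is exactly the manual step the new native file generation removes.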

Along the same lines, xAI has just released Grok 4.3. The strategy focuses entirely on the price-performance ratio: input costs drop by 40%, while output costs collapse by 60%. But the real news is that Grok now also includes native capabilities for generating complex files like PDF and Excel. Direct generation eliminates the need for intermediate libraries in automation workflows. I want to thoroughly test the new "Agent Mode" for creative production, because collapsing multiple API calls into a single cohesive workspace represents the true evolution of practical prompt engineering, a theme I explored in depth in my book on AI.

Code becomes cheap and unified

In the development world, fragmentation is giving way to unification. OpenAI has decided to pull the plug on the dedicated Codex model for the second time, fully integrating it into GPT-5.5. I finally see a sensible approach to model management: keeping a separate model for code makes little conceptual or practical sense today. Programming logic is the core of modern AI's general reasoning.

I have already tested the new endpoint and the operational difference is immediately noticeable. The model handles complex refactoring across entire repositories without losing context. Of course, we will have to completely rewrite our system prompts to adapt, because the old instructions heavily limit the performance of the new engine. The token savings are real and will change how we structure API calls. Startups that built their wrappers on the old Codex will have to update their entire architecture quickly to remain competitive, a trend I had already analyzed in The dawn of fluid interfaces and agents that rewrite software.

But the real shock for developers comes from China. Xiaomi has released MiMo-V2.5-Pro, a new open-weight model specifically designed for complex autonomous programming tasks. It nearly matches the performance of Claude Opus 4.6 on major benchmarks, but the real magic lies in context optimization. It manages to consume between 40% and 60% fewer tokens while executing scripts that require hours of processing.

I test dozens of models every week to automate code writing, and Xiaomi's approach hits the weak point of proprietary models head on: the prohibitive cost per token in long-running tasks. Having an open model capable of handling hours of autonomous coding without burning a huge budget changes the rules of the game. I plan to integrate MiMo into my local agents tonight to test real context retention on legacy repositories. This is exactly the scenario I described in Code that works at night and the illusion of corporate cuts.

Orchestration becomes enterprise

Mistral AI has released the preview version of Workflows, a new orchestration layer designed to transform artificial intelligence-based processes into production-ready systems. The real strength lies in the underlying engine: it runs on Temporal, the same open-source technology that manages the critical processes of giants like Netflix and Stripe.

This is the move I had been waiting for for months. Seeing Mistral adopt Temporal as an orchestration engine is a very clear signal: we are finally moving from toys to real enterprise applications. Temporal guarantees state durability, definitively solving the problem of timeouts and sudden crashes in complex workflows.

I love the feature that allows you to insert the so-called "human-in-the-loop" with a single line of code. In my daily work, managing pauses waiting for human approval is always an architectural nightmare. Mistral made this operation trivial. The division between the logic managed on the cloud and the data processed locally solves privacy doubts at their root. This tool changes the rules of the game for those designing agentic systems in production, confirming what I wrote in The fall of chaotic agents and the dawn of deterministic infrastructure.
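Mistral's actual one-line syntax isn't reproduced here, but the human-in-the-loop pattern that Temporal-style engines make trivial looks roughly like this. The sketch uses plain asyncio purely to illustrate the control flow; a durable engine persists the paused state across crashes, which an `asyncio.Event` does not. All names are illustrative.

```python
import asyncio

# Illustration of the human-in-the-loop pattern: the workflow pauses
# until an external approval signal arrives, then resumes. Engines like
# Temporal make this pause durable; asyncio.Event only mimics the flow.

class Workflow:
    def __init__(self):
        self.approval = asyncio.Event()
        self.steps = []

    async def run(self):
        self.steps.append("draft_generated")
        await self.approval.wait()   # the "one line" pause point
        self.steps.append("published")

    def approve(self):               # called by a human reviewer
        self.approval.set()

async def main():
    wf = Workflow()
    task = asyncio.create_task(wf.run())
    await asyncio.sleep(0)           # let the workflow reach the pause
    wf.approve()
    await task
    return wf.steps

steps = asyncio.run(main())
print(steps)  # prints ['draft_generated', 'published']
```

The architectural nightmare is everything this toy version ignores: surviving a process restart mid-pause, timeouts on the approval, retries. That is what delegating to Temporal buys you.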

Technical Insight

Still on the automation front, Google has launched Deep Research Max. The system relies on the new Gemini 3.1 Pro model and lets you configure a complete autonomous agent through a single API call. The agent plans the research phases, extracts information, reasons over the data, and generates a structured report. Replacing dozens of lines of RAG and orchestration code with a single call enormously simplifies development.
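What that single call replaces is roughly this hand-rolled loop. Every function below is a stub with names of my own invention; in a real setup each step would call a search API and an LLM.

```python
# Hand-rolled version of what an autonomous research agent collapses
# into one call: plan phases, gather per phase, synthesize a report.
# All three steps are stubs standing in for search/LLM calls.

def plan(question: str) -> list[str]:
    return [f"background on {question}", f"recent data on {question}"]

def gather(phase: str) -> str:
    return f"findings for '{phase}'"

def synthesize(question: str, notes: list[str]) -> str:
    body = "\n".join(f"- {n}" for n in notes)
    return f"# Report: {question}\n{body}"

def research(question: str) -> str:
    phases = plan(question)
    notes = [gather(p) for p in phases]
    return synthesize(question, notes)

print(research("open-weight coding models"))
```

Maintaining the plumbing between these steps yourself, plus retries and state, is exactly the code a managed agent endpoint deletes.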

Artificial intelligence has stopped talking to us and has finally started working invisibly behind the scenes.

The return of local power

Nvidia is pushing hard on local generative AI by releasing the GenAI Creator Toolkit, a set of ready-to-use workflows for ComfyUI designed to run entirely on RTX GPUs. The move lets creative teams automate complex operations with no dependence on cloud services, while ensuring maximum privacy for the data stored on their workstations.

The package includes flows to break an image down into separate layers with perfect alpha masks, handle advanced object removal via inpainting, and turn a photo into a textured 3D model. The hardware requirements are steep: at least 24 GB of VRAM. That cuts out consumer machines but fully validates the node-based approach for professional production teams.

I will definitely install these modules on our on-premise servers. Graphic automations based on cloud APIs are becoming too expensive, and their policies change unpredictably. Moving everything to corporate machines with replicable JSON workflows is the real way to scale creative processes without draining IT budgets.
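Making those JSON workflows replicable in practice means templating them: export once, then patch node inputs per job. The `{node_id: {"inputs": ...}}` shape below matches ComfyUI's API-format export as I understand it, but treat it as an assumption and check against your own exported file; node ids and keys here are illustrative.

```python
import copy
import json

# Sketch of a replicable node-based workflow: load the exported JSON
# once as a template, then patch node inputs for each job. The node
# ids ("3", "6") and input keys are illustrative placeholders.

TEMPLATE = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 20}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "placeholder"}},
}

def build_job(template: dict, prompt: str, seed: int) -> dict:
    """Return a per-job copy of the workflow with prompt and seed set."""
    job = copy.deepcopy(template)
    job["6"]["inputs"]["text"] = prompt
    job["3"]["inputs"]["seed"] = seed
    return job

job = build_job(TEMPLATE, "product shot, white background", seed=42)
print(json.dumps(job, indent=2))
```

The deep copy matters: patching the template in place would leak one job's parameters into the next, which is exactly the kind of bug that kills replicability.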

The tools of the week

As always, in addition to the big news, I keep track of the most interesting tools that emerged during the week. Here are the ones that deserve to end up immediately in our arsenal:

  • Claude Code: a unified AI agent for the terminal capable of navigating projects, executing commands, and iterating on code autonomously. Anthropic is betting everything on low-level integration, and this tool demonstrates how powerful it is to let AI operate directly in our development environment.
  • Faster-Whisper: a high-performance implementation of OpenAI's audio transcription model. It runs locally with minimal resource consumption. I tested it to transcribe long meetings and the speed is impressive, while guaranteeing total privacy of the audio data.
  • Mistral Workflows: not just a news item, but a tool I advise you to test immediately. If you are building complex automations in Python, the Temporal-based orchestration layer will save you endless hours of debugging API timeouts.
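For the Faster-Whisper entry above, local transcription takes only a few lines. The `WhisperModel` and `transcribe` calls follow the library's documented API; the audio path and model size are placeholders, and the timestamp formatting helper is my own.

```python
# Local transcription with faster-whisper (pip install faster-whisper).
# The audio path and model size below are placeholders.

def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcript segment with start/end timestamps."""
    return f"[{start:7.2f} -> {end:7.2f}] {text.strip()}"

def transcribe_meeting(audio_path: str, model_size: str = "small") -> list[str]:
    """Transcribe an audio file entirely on the local machine."""
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path)
    print(f"Detected language: {info.language}")
    return [format_segment(s.start, s.end, s.text) for s in segments]

# Example: for line in transcribe_meeting("meeting.wav"): print(line)
```

Nothing ever leaves the workstation, which is the whole point for meeting audio.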

The market direction is set. Context efficiency, the direct generation of native files, and robust orchestration are the new pillars on which we will build the software of the coming years. See you next Sunday.

Author

Fabrizio Mazzei

AI Solutions Architect

As an AI Solutions Architect I design digital ecosystems and autonomous workflows. After almost 10 years in digital marketing, today I integrate AI into business processes: from Next.js and RAG systems to GEO strategies and dedicated training. I like to talk about AI and automation, but that's not all: I've also written a book, "Work Better with AI", a practical handbook with 12 chapters and over 200 ready-to-use prompts for those who want to use ChatGPT and AI without programming. My superpower? Looking at a manual process and already seeing the automated architecture that will replace it.
