
Google Cloud NEXT '26 Shipped a Full Agentic Stack. One Layer Is Missing.

DEV Community
OnChainAIIntel

This is a submission for the Google Cloud NEXT Writing Challenge.

Google Cloud NEXT '26 kicked off April 22 in Las Vegas, and the story Thomas Kurian told from the keynote stage was one of the most coherent a hyperscaler has put together in years. The pitch: a unified stack. Silicon built for the models, models grounded in your data, agents running on those models, all of it secured by the infrastructure underneath. It is the same stack Google runs for Search, YouTube, Chrome, and Android, now pointed at your enterprise.

The announcements are heavy. Gemini Enterprise Agent Platform landed as the Vertex AI successor, with Agent Studio, Agent-to-Agent Orchestration, Agent Registry, Agent Identity, Agent Gateway, and Agent Observability. Google unveiled 8th-generation TPUs split across two chips: TPU 8t for training (scaling to 9,600 chips and 2 petabytes of shared memory in a single superpod) and TPU 8i tuned for inference. Underneath, a new megascale fabric called Virgo Network was introduced to power the AI Hypercomputer. Agentic Data Cloud brought a cross-cloud Lakehouse and Knowledge Catalog. Agentic Defense folded Google Threat Intelligence, Security Operations, and the recently acquired Wiz into an AI Application Protection Platform. The Gemini Enterprise app got an Agent Designer, an Inbox for managing agent activity, long-running agents, Skills, Projects (which give agents permanent memory), Deep Think, and Microsoft 365 interoperability.

Sundar Pichai dropped the stat that has been making the rounds all week: roughly 75% of all new Google code is now AI-generated and reviewed by engineers, up from about half last fall. First-party model traffic is running at 16 billion tokens per minute. Just over half of Alphabet's machine learning compute investment in 2026 is earmarked for the Cloud business. This is not a slide deck. It is a serious bet on what SiliconANGLE correctly called the control plane of the agent era.
And there is exactly one layer missing from the stack. Nobody on stage named it. Nobody shipped a product for it. And it is the layer that will decide whether any of this actually works in production.

Look at the Agent Platform feature list one more time: Studio, Orchestration, Registry, Identity, Gateway, Observability. That is a respectable control plane for agents. You can build them, connect them, catalog them, authenticate them, route them, and watch them. What you cannot do, based on anything announced this week, is score the quality of what goes into them.

Every agent runs on inputs. Prompts from humans. Prompts from other agents. Tool-call payloads. Context injected from a knowledge catalog or a RAG step. In a world of multi-agent workflows, those inputs are not just the user's problem anymore. Agent A writes a prompt for Agent B. Agent B interprets that prompt, calls a tool, receives a response, and drafts its own prompt for Agent C. The Agent-to-Agent protocol that Google is now pushing at 150 organizations means those chains are about to get longer and more autonomous. The quality of every link in that chain is, right now, unmeasured.

Observability tells you what happened. It does not tell you whether the input that caused it was any good to begin with. You see that Agent B failed. You do not see that Agent A handed it an ambiguous, under-specified, or context-poisoned prompt. You end up debugging the failure as a model problem, a tool problem, or a routing problem. It was an input problem.

This is what I have been calling the AI input quality problem. It is not a prompt engineering problem. Prompt engineering is a craft humans do at a keyboard. The AI input quality problem is what happens when LLMs write prompts for other LLMs, at scale, with no human in the loop, and nobody is scoring the quality of the handoff. Two things Google announced at NEXT '26 will make this problem worse, not better: long-running agents, and A2A at scale.
Vertical integration buys you a lot. Google's stack means your TPU, your model, your runtime, your data layer, and your governance all speak the same language. What vertical integration does not buy you is a quality signal on the content flowing through that stack. The input is still just text. Text is still ambiguous. Ambiguity compounds through agent chains the same way floating-point errors compound through a long numerical pipeline.

If you are building on Gemini Enterprise Agent Platform, or planning to, here is the operator's version of the argument.

Treat agent-to-agent handoffs as a product surface, not plumbing. They are the place where most of your production issues will originate. Log the prompts Agent A sends to Agent B. Store them. Review a sampled slice weekly. The first time you do this you will be a little shocked at what your agents are saying to each other.

Add a pre-flight check, not just a post-hoc trace. Observability after the fact tells you the agent failed. A pre-flight quality check on the prompt before it enters the next agent tells you whether the failure was even preventable. This is the difference between a crash log and a linter. Both are useful. Only one gets you home at a reasonable hour.

Assume the input is the bug until proven otherwise. When your agent chain breaks, the most common cause in the next generation of these workflows will not be the model or the tool. It will be an input that was too vague, too verbose, or too contaminated with irrelevant context. Debug the prompt first, the model second.

Score your prompts the way you score your code. Code has coverage, complexity, and lint. Prompts have nothing, yet. Define quality dimensions (clarity, specificity, context sufficiency, safety, retrievability), score them, gate on them. "It felt like a good prompt" stops being an acceptable quality signal the moment an agent is writing prompts on your behalf.
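To make the pre-flight idea concrete, here is a minimal sketch of a gate on an agent-to-agent handoff: log the prompt, score it on a few dimensions, block it if any score falls below a threshold. This is an illustration, not PQS or any shipped product; the function names (`score_prompt`, `preflight_gate`), heuristics, and thresholds are all placeholders you would replace with real scoring logic.

```python
import logging
import re

def score_prompt(prompt: str) -> dict:
    """Score a prompt on a few illustrative quality dimensions (0.0-1.0).

    These heuristics are toy stand-ins for real quality scoring.
    """
    words = prompt.split()
    # Specificity: penalize very short prompts and vague filler words.
    vague = sum(1 for w in words if w.lower() in {"something", "stuff", "things", "etc"})
    specificity = min(1.0, len(words) / 20) * (1 - vague / max(len(words), 1))
    # Verbosity: penalize prompts so long they likely drown the instruction.
    verbosity = 1.0 if len(words) <= 300 else max(0.0, 1 - (len(words) - 300) / 700)
    # Context sufficiency: does the prompt state an action verb at all?
    has_goal = bool(re.search(r"\b(summarize|extract|write|classify|answer|generate)\b",
                              prompt, re.I))
    context = 1.0 if has_goal else 0.3
    return {"specificity": round(specificity, 2),
            "verbosity": round(verbosity, 2),
            "context": round(context, 2)}

def preflight_gate(sender: str, receiver: str, prompt: str,
                   threshold: float = 0.5) -> bool:
    """Log the handoff, score it, and block it if any dimension is below threshold."""
    scores = score_prompt(prompt)
    logging.info("handoff %s -> %s scores=%s", sender, receiver, scores)
    failing = {dim: s for dim, s in scores.items() if s < threshold}
    if failing:
        logging.warning("handoff %s -> %s blocked: %s", sender, receiver, failing)
        return False
    return True

# A specific, goal-stating prompt passes; a vague one is blocked.
preflight_gate("agent_a", "agent_b",
               "Summarize the Q3 incident report, focusing on root causes.")
preflight_gate("agent_a", "agent_b", "do stuff with things")
```

The point is not these particular heuristics. It is the shape: the check runs before the prompt enters the next agent, every handoff is logged for the weekly review, and a bad input fails loudly instead of silently degrading the chain.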
Kurian is probably right that enterprises adopting AI agents at scale will tend to choose the platform where model, runtime, silicon, governance, and productivity all come from one company. Vertical integration is a real advantage at this layer, and the economics of 8th-gen TPUs plus Virgo Network suggest Google is preparing to compete on inference pricing that Nvidia-dependent competitors will struggle to match.

But the bet has a weak link, and it is not Google's alone. It is the industry's. We have assembled a full agentic stack, top to bottom, without a quality layer for the one thing flowing through all of it. The control plane without a quality plane is the same shape as the early web with HTTP and no SSL, or early databases before ACID. It works until it really, really does not. Eventually somebody names the layer. Somebody ships the tool. Somebody writes the analysis pointing out that the emperor's stack is beautiful and well-governed and serving inference at record speed, and also, every prompt in it is being trusted on vibes.

The AI input quality problem is real. It is getting louder. NEXT '26 was the week the stack caught up to the agent era and the quality layer fell one more beat behind.

Disclosure: I build in this space. I run PQS (Prompt Quality Score) under OnChainIntel, a pre-flight quality score for prompts and agent inputs. I wrote this because NEXT '26 is a real moment for the agent era, and the layer I work on every day is the one nobody put on a slide this week. If you are building on Gemini Enterprise Agent Platform, pay attention to what goes into your agents, not just what comes out.