
AI News Hub

Should you build or buy an MCP runtime for enterprise AI agents in 2026?

DEV Community
Manveer Chawla

The engineering bottleneck for enterprise AI has shifted. Your team has built agents. They work in single-user environments on LangChain or Mastra. The wall hits when you try to wire those agents into secure enterprise systems for thousands of employees without creating new security exposure or a permanent maintenance load.

In 2026, engineering directors face a real architectural decision, and it isn't whether to write custom Model Context Protocol (MCP) servers. Custom MCP servers are how you connect agents to proprietary internal systems, regardless of which path you choose. The actual decision is whether you also build the runtime layer that wraps those servers: OAuth lifecycle, credential vaulting, multi-user auth, permission intersection logic, audit pipeline, policy enforcement, and observability. Build that layer yourself on top of LangChain or Mastra, or buy an MCP runtime that delivers it off the shelf.

The right answer depends on your deployment profile. Once multi-user authorization, audit-grade governance, or asynchronous tool-call observability enters the picture, the build path incurs increasing costs and a growing risk surface. Maintaining your own auth, credential vaulting, and audit pipeline puts every agent action inside your security blast radius. The decision favors buying a runtime.

TL;DR: Build vs. buy MCP runtime

An MCP runtime handles the work most teams have no business writing themselves: agent authorization, OAuth token rotation, audit logging, and policy enforcement. The runtime is the execution, authorization, and governance layer where your agent's tools (MCP servers) run.

If you build your own runtime. Three narrow profiles fit this path: single-user scope, agent infrastructure as your core product, or all-internal API pipelines. You retain full control and assume responsibility for the OAuth lifecycle, credential vaulting, audit logging, and policy enforcement.
Each integration becomes a permanent line item on your engineering roadmap; auth and policy maintenance never go to zero.

If you buy a runtime. This is the default for multi-user production. You get centralized lifecycle governance that maps to your existing policies, multi-user authorization with full OAuth lifecycle management, tool execution, and a path to build proprietary tools without rebuilding the runtime layer.

Four tipping points force a transition from a self-built runtime layer to a vendor-provided one:

- Crossing the three-integration threshold, where API maintenance starts consuming dedicated sprints.
- Introducing user-delegated actions, requiring agents to execute tool calls on behalf of specific human users with distinct permissions.
- Moving from synchronous read-only tasks to asynchronous, long-running read/write operations that break standard LLM timeouts.
- Needing OpenTelemetry-compatible audit logs to satisfy compliance and security teams.

The state of MCP infrastructure: Config hell vs. the buy tradeoff

The Model Context Protocol has standardized how AI applications consume context and execute tools, replacing the bespoke API wrappers teams used to write for every LLM feature. Adopting MCP introduces architectural challenges of its own. Enterprise platform teams choose between two operational burdens: the DIY trap of "config hell," or the buy-side tradeoff of vendor cadence and ecosystem dependency.

Config hell happens when you scale bespoke MCP servers. Platform engineers spend their time editing JSON configurations to re-map tool schemas every time an upstream SaaS provider deprecates an endpoint, chasing token rotation drift when an OAuth refresh expires and the custom retry logic doesn't handle the edge case, and handling the manual work that SOC 2 and GDPR compliance requires (immutable schema registries, signed tool manifests, middleware to redact PII from tool outputs).
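The token-rotation drift described above is a concrete slice of that work. Here is a minimal sketch of the refresh bookkeeping a DIY team ends up owning; the `refresh_fn` provider call, the skew window, and the field names are illustrative assumptions, and a production flow would add single-flight locking, vault-backed storage, and provider-specific error handling:

```python
import time

REFRESH_SKEW_S = 300  # refresh 5 minutes before expiry to absorb clock drift


class TokenEntry:
    """One user's credentials for one service (illustrative shape)."""

    def __init__(self, access_token, refresh_token, expires_at):
        self.access_token = access_token
        self.refresh_token = refresh_token
        self.expires_at = expires_at  # epoch seconds


def get_valid_token(entry, refresh_fn, now=None):
    """Return a usable access token, refreshing ahead of expiry.

    `refresh_fn(refresh_token)` stands in for a provider-specific OAuth
    refresh call and returns (access_token, refresh_token, expires_in).
    """
    now = time.time() if now is None else now
    if now < entry.expires_at - REFRESH_SKEW_S:
        return entry.access_token
    access, refresh, expires_in = refresh_fn(entry.refresh_token)
    entry.access_token = access
    entry.refresh_token = refresh  # some providers rotate it on every use
    entry.expires_at = now + expires_in
    return entry.access_token
```

Every provider varies the details (rotation policy, error codes, expiry semantics), which is why this logic multiplies per integration rather than being written once.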
When you build your own infrastructure, you own every broken connection, every expired token, and every security patch.

The runtime is not an additional proxy in front of your tools. In an agentic architecture, the agent is already the proxy: it mediates between the user and downstream systems, reasons about which tools to call, and orchestrates multi-step workflows. The runtime is the execution layer where the chosen action actually runs. It is where credentials are resolved, policy is enforced, and the call is made on behalf of a specific user. That execution layer, not an extra network hop, is where gateway-style controls belong.

The real buy-side tradeoffs are different. You accept the runtime's policy primitives and observability format as a form of lock-in. You take on overhead from per-tool authorization checks and just-in-time token resolution, which is a fraction of LLM inference and downstream API latency.

The real choice in 2026 is risk, not cost. Build your own runtime layer, and your security blast radius scales with every integration, user, and policy change. Buying a runtime moves that work to a vendor that has already been audited for it. For enterprise deployments, that is the safer side of the tradeoff.

When to build your own runtime

Building your own runtime layer is the right call in a narrow set of scenarios. The open-source ecosystem has matured enough that deep platform engineering teams can stand up their own orchestration layer on top of the official Model Context Protocol Python or TypeScript SDKs. The SDKs implement the MCP specification over JSON-RPC 2.0 and support both stdio for local process communication and Streamable HTTP for remote execution. Teams wrap MCP servers in adapters provided by frameworks like LangChain or Mastra so agents can invoke them directly, then deploy on Kubernetes using custom Helm charts.

The MCP servers themselves then become the easy part. The runtime layer that wraps them is the actual work, and the cases where building that layer in-house makes sense are narrow.
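To make that DIY starting point concrete, here is a minimal sketch of the JSON-RPC 2.0-over-stdio framing those SDKs implement, using only the Python standard library. The `echo` tool and the direct method dispatch are illustrative assumptions, not the protocol's actual surface (the real MCP spec defines its own method names, such as `tools/call`):

```python
import io
import json
import sys

# Hypothetical tool table -- stands in for what an MCP server exposes.
TOOLS = {
    "echo": lambda args: {"text": args.get("text", "")},
}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and build the response envelope."""
    tool = TOOLS.get(request.get("method"))
    if tool is None:
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "result": tool(request.get("params", {}))}


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """The stdio transport loop: newline-delimited requests in, responses out."""
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(handle(json.loads(line))) + "\n")
            stdout.flush()


# Demo: drive the loop with in-memory streams instead of a real pipe.
inp = io.StringIO('{"jsonrpc": "2.0", "id": 1, "method": "echo", "params": {"text": "hi"}}\n')
out = io.StringIO()
serve(inp, out)
```

This framing is the easy part, which is exactly the article's point: everything around it (auth, vaulting, audit, policy) is the actual work.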
Build your own runtime if you have a single-user scope. Per-user OAuth, token vaulting, and permission intersection are the hardest parts of the runtime layer, and they matter only once more than one human is involved. A solo developer connecting their own credentials to a single agent does not need them.

Build your own runtime if the agent infrastructure is your core product. A startup whose entire product is a smart scheduling agent for end users must control every layer of the stack. The engineers should be deep into this work because it is the company.

Build your own runtime if you own every API in the pipeline. If your agents act only on systems and data sources you control, with no third-party SaaS connections, you bypass the OAuth-lifecycle problem entirely, and the case for buying weakens.

Air-gapped deployments are not a build trigger. They are a deployment-mode question. Self-hosted runtimes run the vendor's runtime layer entirely inside your infrastructure, satisfying the air-gap while inheriting auth, audit, and governance from the runtime. Build your own runtime layer only when the deployment also prohibits third-party vendor software, which typically applies to highly classified environments.

Outside those three cases, building your own runtime is a misallocation of senior engineering time. Beyond the MCP servers themselves, you build secure token vaults to manage OAuth refresh lifecycles for each user and service. You handle provider-specific rate limits and pagination. You architect state machines for asynchronous debugging when a tool call takes ten minutes to execute. You patch custom servers every time an upstream API changes its schema. Skip that work, and you get agent hallucination and silent failures.

Auth and policy carry their own ongoing burden, separate from API drift. People join and leave the company. Roles change. Permissions get revoked. Policies tighten after an incident.
Each event has to flow through your custom auth layer in real time. This is a permanent FTE cost, not a build-once, leave-alone problem, and it never decreases as the deployment grows.

When to buy a runtime

An MCP runtime shifts engineering effort from infrastructure to product. Your team operates on top of an execution layer that already handles auth, vaults, audit, and policy, instead of building each one. A runtime gives you four things off the shelf:

- Centralized lifecycle governance. The runtime is the enforcement point for the policies your organization has already defined elsewhere (in your IdPs, your sales tools, your security systems). It maps to those existing policies and enforces them at the agent layer. It does not ask you to recreate access policies inside a new tool. Administrators get a single control plane to manage agent behavior, audit tool execution, and roll out updates safely across the organization.

- Multi-user post-prompt authorization. Every tool call executes using the credentials and permissions of the human user requesting the action. The runtime handles the OAuth token lifecycle (secure vaulting, refresh, rotation) without exposing credentials to the LLM.

- A catalog of pre-built, version-controlled MCP tools, so your agents reach thousands of enterprise systems on day one.

- A path for proprietary tools that doesn't require rebuilding the runtime layer. When you need custom MCP servers for internal systems, you write them on the runtime's open-source MCP framework and inherit auth, audit, and governance for free. If you already have custom MCP servers built without the framework, you can connect them to the runtime and still get the same auth, audit, governance, and pre/post-call policy hooks without rewriting them.

Platform engineers shift from writing brittle integration scripts and debugging broken OAuth flows to managing high-level access policies.
Your team defines which agents can access which tools, sets up visibility filtering so specific teams only see permitted integrations, and monitors OpenTelemetry-compatible dashboards to track agent reasoning and tool execution latency. You spend time on the agent's logic, not the plumbing.

Enterprise MCP scorecard: Decision criteria for build vs. buy

Eight dimensions separate a local prototype from a production deployment. The matrix scores each lane against them.

| Evaluation dimension | DIY runtime layer (open-source SDKs) | Vendor MCP runtime |
| --- | --- | --- |
| Control & customization | Absolute. Full control over transport layers, custom memory state, and bespoke hardware isolation. | High. Standardized tool execution with hooks for custom policies, but limited underlying infrastructure access. |
| Setup speed | Weeks to months. Requires building auth layers, token vaults, and infrastructure deployment pipelines. | Hours to days. Drop-in integration with existing IdPs and immediate access to pre-built tool catalogs. |
| Maintenance burden | Severe. Team owns all API schema updates, deprecations, token rotation logic, and security patches. The work compounds with every integration and every policy change. | Minimal. The vendor absorbs API drift, token lifecycle work, and security patching. Your team manages access policies and visibility, not infrastructure. |
| Multi-user authorization | Manual implementation. High risk of prompt injection and credential leakage if built incorrectly. | Built-in. Automated just-in-time token issuance, scoped per user and isolated from the LLM. |
| Lifecycle governance | Fragmented. Requires custom logging middleware, disparate SIEM integrations, and manual version control. | Centralized. Unified control plane, OpenTelemetry-native audit logs, and shadow MCP prevention. |
| Async task handling | Complex. Requires building external polling, dead-letter queues, and durable state machines for timeouts. | Native. Parallelized execution, automatic failover, intelligent retries, and decoupled result fetching. |
| Deployment options | Nearly unlimited. Deploy anywhere, including fully air-gapped, offline environments. | Cloud, self-hosted on-prem or in cloud (vendor enterprise tier), hybrid, or fully air-gapped. Cloud requires network egress to the vendor control plane; self-hosted runs the runtime entirely in your own infrastructure. |
| Best-fit team profile | Single-user scope, agent infrastructure as your core product, or ownership of every API in the pipeline. | Multi-user production, mixed proprietary plus SaaS requirements, teams optimizing for time-to-value and audit-grade governance. |

Multi-user authorization in production

Multi-user authorization is where most enterprise agent projects stall before production. A developer testing locally passes their personal API keys to the system. In production, an agent serves thousands of users with different permission scopes. If your runtime layer relies on a shared service account or forwards a user's full-scope bearer token to the LLM context, you've created an attack vector: a prompt injection attack can instruct the agent to use those inherited permissions to exfiltrate data or delete repositories. Shared service accounts also break audit-trail requirements, because downstream systems can't tell an autonomous-agent action apart from a human-directed one.

A runtime solves this with multi-user, post-prompt authorization. The runtime enforces a permission intersection at execution time:

Agent Permissions ∩ User Permissions = Effective Action Scope

The agent can only execute an action if both the agent's role policy and the user's native SaaS permissions allow it. Every other combination is denied.

For example, an HR agent scoped to recruiting tasks is invoked by an employee with admin privileges in Workday, including access to global payroll data. When the agent attempts to read payroll, the runtime evaluates the intersection at call time and denies the request.
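The intersection rule is simple enough to sketch as a set operation. The scope strings below are illustrative, not any vendor's actual policy model:

```python
def effective_scope(agent_scopes: set, user_scopes: set) -> set:
    """Effective Action Scope = Agent Permissions ∩ User Permissions."""
    return agent_scopes & user_scopes


def authorize(action: str, agent_scopes: set, user_scopes: set) -> bool:
    """An action runs only if BOTH the agent's policy and the user allow it."""
    return action in effective_scope(agent_scopes, user_scopes)


# The Workday example: an admin user invokes a recruiting-scoped agent.
recruiting_agent = {"candidates.read", "interviews.write"}
admin_user = {"candidates.read", "interviews.write", "payroll.read"}

payroll_ok = authorize("payroll.read", recruiting_agent, admin_user)      # False: agent lacks it
candidates_ok = authorize("candidates.read", recruiting_agent, admin_user)  # True: both allow it
```

In a real runtime, the user side is derived from just-in-time token scopes and the agent side from role policy; the intersection is the invariant that holds regardless of how either set is sourced.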
The user has the authority; the agent's restricted scope blocks the action. For allowed actions, the runtime acquires a tightly scoped, just-in-time token to execute on behalf of the user. The credentials never reach the LLM client, which removes prompt injection as a direct credential-theft vector.

Lifecycle governance and audit

Without centralized governance, enterprise agent deployments turn into shadow IT. Developers spin up rogue MCP servers on local machines or unauthorized cloud instances, connecting LLMs to internal databases without oversight.

A runtime acts as the central enforcement point for the policies your organization has already defined elsewhere. It maps to your IdPs, your sales tools, and your security systems, and enforces what's there. It does not ask you to recreate access policies inside a new tool. Think of the runtime as the bouncer: it enforces, it doesn't author.

All tools and servers are registered in a single catalog. Visibility filtering ensures that an HR agent sees only HR-related tools, while a coding agent sees only repository tools.

Beyond enforcing what's already defined, the runtime exposes pre- and post-tool-call hooks for custom logic. Compliance teams drop in their own variables (workflow state, time windows, request volume, contextual data on the user or session), and the runtime treats those as first-class enforcement primitives alongside standard policies. Organization-specific conditions get wired in without forking the runtime.

The runtime generates fine-grained, OpenTelemetry-compatible audit logs. Every action is tracked: which user prompted the agent, which LLM model generated the tool call, what parameters were passed, and what the downstream API returned. That visibility is a prerequisite for passing security reviews in regulated industries.

Async and long-running tasks

Standard LLM architectures are synchronous: inference endpoints time out within minutes.
Enterprise agent actions, such as triggering CI/CD pipeline builds, provisioning cloud infrastructure, or querying large data warehouses, can run for tens of minutes or hours. In a DIY runtime, platform engineers build the asynchronous scaffolding themselves: job queues, external-memory state synchronization, polling mechanisms, and dead-letter queues for failed operations.

A runtime handles this work. It supports the latest MCP Tasks specification, so agents trigger a long-running process, receive a task identifier immediately, and poll for the result asynchronously. The runtime handles parallelized execution, failover routing when an endpoint drops, and backoff retries. The agent workflow stays durable without the application layer managing state.

Observability: end-to-end OpenTelemetry traces

The hidden cost of DIY MCP stacks is debugging. When an agent fails a tool call at 3 a.m., platform engineers stitch together traces from the agent run, each MCP server's logs, each provider SDK's retry logs, and each target SaaS API's status page. There is no correlated view. Investigating one failed async action means grepping across three systems in parallel and reconstructing the sequence by hand.

A runtime emits a single OpenTelemetry trace that carries the full chain.
An example span tree for one agent action ("schedule a follow-up meeting and send the recap"):

```
agent.run (root)               user_id, session_id, agent_id
├─ llm.infer                   model, prompt_tokens, completion_tokens
├─ mcp.tool_call               tool=google_calendar.create_event
│  ├─ mcp.authz                policy_result=allow, user_scope=calendar.events.write
│  ├─ mcp.oauth.refresh        token_id, refresh_outcome=ok
│  └─ mcp.http.execute         target_host, status=200, latency_ms=412
├─ mcp.tool_call               tool=gmail.send
│  ├─ mcp.authz                policy_result=allow
│  ├─ mcp.retry                attempt=2, reason=rate_limited
│  └─ mcp.http.execute         status=202, latency_ms=890
└─ llm.infer                   result synthesis
```

Export that trace to Honeycomb, Datadog, or your SIEM, and you can answer "which user, agent, tool, policy, token, or retry caused the failure?" in one view. DIY gets you there only if you build the trace-correlation layer yourself and maintain it as SDKs, provider log formats, and policy engines evolve. That maintenance is a direct cost on your DIY stack, and it goes away when you adopt a runtime that emits agent-to-tool traces natively.

Operational burden of building

The operational burden of a DIY runtime layer compounds with every integration and every policy change. Initial development is the smallest part of the work. Most of the engineering effort lands after launch: API deprecations, schema changes, OAuth token rotation, security patching, and the auth and policy churn that grows with every user, every role change, and every revoked permission.

A production post-mortem of custom MCP servers documents the typical failure chain: auth drift, orphaned session state, brittle retries, silent tool hallucinations. Each failure costs senior engineering capacity to diagnose and remediate, on a timeline that doesn't compress.

Senior engineers building a DIY runtime spend their time on OAuth refresh scripts and incident-response patches. Senior engineers using a runtime spend their time on proprietary agent logic and domain-specific workflows.
The differences compound across every team and every quarter.

How to evaluate MCP runtime vendors in 2026

Buying a runtime starts with picking the right vendor. The MCP infrastructure market has segmented into three rough categories: gateways route MCP traffic, registries catalog MCP servers, and runtimes handle execution, authorization, and governance. Different vendors cover different layers. Most cover one. Some bundle two. The breakdown of MCP gateways, runtimes, and registries shows where specific vendors stack up across the three categories.

Within the runtime category, evaluate vendors against four capabilities:

- Centralized lifecycle governance. Does the runtime enforce the policies your organization has already defined elsewhere (IdPs, sales tools, security systems), or does it ask you to recreate them in a new tool? Look for one control plane with audit logs, version control, and visibility filtering across every agent and tool.

- Multi-user post-prompt authorization. Does the runtime evaluate per-user, per-action permissions at execution time, or does it pass through a shared service account? Per-user OAuth, with credentials isolated from the LLM, is the bar.

- Agent-optimized tools, plus a path for proprietary ones. Are the tools intent-translating, or are they raw API wrappers that make the agent fill in object IDs and enums? Does the vendor offer an open-source framework that lets you build custom MCP servers for internal systems and inherit auth, audit, and governance without rebuilding the runtime layer?

- Custom policy hooks for contextual access. Can your compliance team add organization-specific logic (workflow state, time windows, request volume, contextual data on the user or session) as first-class enforcement primitives, without forking the runtime?

How Arcade delivers on each

Arcade is the MCP runtime: it delivers all four capabilities in a single layer for multi-user AI agents at scale.

Agent lifecycle governance.
Arcade is the central enforcement point for the policies your organization has already defined. It maps to and enforces policies from your IdPs, sales tools, and security systems; it does not ask you to recreate access policies inside a new tool. You get one control plane for every tool, agent, and auth provider; version control to safely roll out tool upgrades; a shared registry that prevents teams from rebuilding what already exists; visibility filtering so agents only see tools their user is permitted to invoke; and fine-grained audit logs, OpenTelemetry-exportable to your SIEM, that track every agent action per user and per service. Arcade's SOC 2 Type 2 certification validates these controls through an independent audit.

Agent authorization. Every MCP request in Arcade carries two identity layers: a project-level key (which application is making the request) and a user-level identity (on whose behalf the action is taken). Arcade evaluates the intersection of agent and user permissions dynamically at runtime to prevent privilege escalation. It handles the full OAuth lifecycle (refresh, rotation, mismatch) with credentials isolated from the LLM, and hooks into existing enterprise identity governance systems like Okta, Entra, and SailPoint to enforce policies the enterprise has already defined rather than duplicating them. That is the layer that removes prompt injection as a direct credential-theft vector.

Agent-optimized tools. Arcade's catalog of over 8,000 agent-optimized MCP tools is not a set of API wrappers. The tools translate natural-language intent into structured API calls, so an agent asked to "send this to Finance" does not have to hallucinate the target recipient_user_id. The token cost shows up in benchmarks: for identical CRM queries, intent-level tooling produced 100x fewer response tokens than a raw API-passthrough approach, with token output equivalent to 3.7% of a 200K context window versus 373%.
At scale, that overhead translates to context-window overflow in multi-step workflows and degraded agent accuracy. The runtime handles parallelized tool execution, failover, and retries. The Arcade MCP Framework lets you build custom proprietary tools that federate into the same control plane with the same auth and governance wrapping.

Contextual access and custom policies. Beyond enforcing policies your organization has already defined elsewhere, Arcade exposes pre- and post-tool-call hooks for custom logic. Compliance teams drop in their own variables (workflow state, time windows, request volume, contextual data on the user or session), and the runtime treats those as first-class enforcement primitives. Organization-specific conditions get wired in without forking the runtime.

For enterprises with mixed requirements (proprietary-internal systems plus SaaS breadth, multi-user auth plus governance, fast shipping plus safety), Arcade covers the full set without forcing ecosystem lock-in.

Final recommendation

For most enterprise deployments in 2026, buy an MCP runtime. The deployment profile shapes how the runtime gets deployed, not whether to deploy it.

Proprietary-internal-only. Sensitive data is the strongest buy signal, not a build trigger. Legacy systems holding proprietary data are precisely where Arcade gets pulled in: that is where the operational pain peaks and where security and compliance officers carry the most direct accountability. A custom OAuth pipeline maintained by a small team is a position no security leader wants to defend in a regulated audit. An audited, SOC 2 Type 2 runtime that has already cleared third-party scrutiny is much easier to defend. Recommended pattern: build custom MCP servers using the Arcade MCP Framework, run them inside your VPC or on-prem, and create an MCP gateway in the runtime to connect them to the Arcade control plane.
For environments where even the control plane must stay in customer infrastructure, run the runtime self-hosted. The data stays inside your boundary, and the runtime still handles auth, on-behalf-of (OBO) flows, vaulted credentials, audit logs, and governance.

For fully air-gapped deployments with no external network egress, run a self-hosted runtime entirely inside your infrastructure. The runtime layer is identical to the cloud version; only the deployment mode changes. Build your own runtime only when the deployment also prohibits third-party vendor software.

SaaS-heavy. Once your agentic workflow needs to touch Google Workspace, Microsoft, Salesforce, GitHub, or Slack, you buy. The runtime handles the OAuth lifecycle, schema drift, and tool maintenance for hundreds of SaaS APIs your team would otherwise rebuild. The security gap is largest in this profile. So is the operational gap.

Mixed (most enterprises). Agents query proprietary internal databases, synthesize that data, and act in public SaaS applications. Mixed-requirement teams do not have to choose between proprietary security and SaaS breadth. Adopt an MCP runtime, such as Arcade.dev, for SaaS coverage, then create an MCP gateway in the runtime to connect internal MCP servers (or custom servers built with the Arcade MCP Framework) to the same control plane. Both surfaces inherit the same security and audit controls, with multi-user authorization wrapping every action.

If you have already built MCP servers without the Arcade Framework, you do not have to rewrite them. Connecting an existing custom server to Arcade still gives you the runtime's auth, audit, governance, and pre- and post-call policy hooks on top of what you already have.

Summary

| Deployment profile | Recommendation | Pattern |
| --- | --- | --- |
| Proprietary-internal-only | Buy an MCP runtime | Build custom MCP servers on the Arcade MCP Framework, run them inside your VPC or on-prem, and create an MCP gateway in the runtime to reach them. Self-host for environments where the control plane must stay in customer infrastructure. |
| Fully air-gapped (no external egress) | Buy an MCP runtime, self-hosted | Run a vendor's self-hosted runtime entirely inside your infrastructure. Build your own only when the deployment also prohibits third-party vendor software. |
| SaaS-heavy | Buy an MCP runtime | Adopt the runtime directly. It handles the OAuth lifecycle, schema drift, and tool maintenance for hundreds of SaaS APIs. |
| Mixed proprietary plus SaaS | Buy an MCP runtime | Arcade for SaaS coverage. An MCP gateway created in the runtime connects internal MCP servers (built with or without the Arcade Framework) to the same control plane. |

Decision checklist

Run your deployment plan against these five questions:

1. Will your agents serve more than one human user with different permission scopes?
2. Do you need audit-grade logs that tie every tool call to a specific human, agent, and target system?
3. Do any of your agents take asynchronous actions that exceed standard LLM request timeouts?
4. Are you connecting to five or more external SaaS APIs across the organization?
5. Are your regulatory constraints so severe that no external network egress is permitted, even through a gateway running inside your own network?

How to read the answers: a "yes" on any one of the five questions means buy an MCP runtime. Otherwise, confirm fit against the three build cases in "When to build your own runtime" before committing to DIY.

Conclusion

The deployments that stall in 2026 fail on risk: auth that can't be audited, credentials sitting inside an LLM context window, and a security blast radius no one in the room can scope. Sensitive data raises that bar, which is why proprietary scenarios are a buy trigger, not a build trigger. Rebuilding OAuth pipelines and schema registries is a poor use of senior engineering time, and whatever advantage the build path offers stops compounding the moment a second user or a regulated audit enters the picture.
Arcade.dev's MCP runtime provides agent lifecycle governance, agent authorization, and an agent-optimized tool catalog in a single layer.

Next step: book a 30-minute technical discovery call with Arcade's team to walk through the multi-user authorization architecture and the deployment options for your environment. Or start in the Arcade playground: connect one tool, run one user-scoped action, and see how the runtime handles OAuth, policy, and audit in a single trace.

Frequently Asked Questions

What is the difference between an MCP server and an MCP runtime?
An MCP server is a single endpoint that exposes tools. An MCP runtime is the execution layer that hosts, secures, and governs those servers. The runtime handles production complexities like multi-user authorization, load balancing, and audit logging that individual servers lack.

How do MCP runtimes handle rate limits and long-running tasks?
They use the asynchronous MCP Tasks specification, returning a task ID immediately while managing the long-running job in the background. The runtime handles vendor-specific API rate limits, backoff retries, and connection failovers. Your agent polls for the final result without managing execution state.

Why is multi-user authorization so difficult for custom AI agents?
Multi-user authorization requires dynamic, just-in-time credential management to prevent prompt-injection attacks that compromise a user's full account. Custom builds must securely orchestrate complex on-behalf-of token flows, vault credentials out of the LLM context window, manage strict refresh token rotation, and enforce granular access policies at execution time.

Can you mix custom MCP servers with an MCP runtime?
Yes. Custom MCP servers and runtimes are not alternatives: you build custom MCP servers for proprietary internal systems in either path. The question is whether you also build the runtime layer wrapping them. Runtimes support hybrid architectures, where custom servers running proprietary tools inside your VPC connect to the runtime's control plane via a gateway or a secure tunnel, governing public SaaS and custom internal tools through a single control plane. Servers built on the runtime's open-source framework inherit auth and audit automatically. Existing servers built without the framework connect to the runtime and still get its auth, audit, and policy hooks without being rewritten.

When should we build our own runtime layer instead of buying an MCP runtime?
Build your own runtime if you have a single-user scope with no multi-user requirement, if the agent infrastructure is itself your core product, or if you own every API in the pipeline (no third-party SaaS). Sensitive data on its own is not a build trigger, and air-gapped deployments are handled by self-hosted runtimes from vendors that offer them. Buy a runtime in every other case.

When does it become cheaper to buy an MCP runtime?
Once you support multiple integrations and multi-user OAuth. Maintenance and security work exceed the runtime's usage cost beyond roughly three integrations.

Do MCP runtimes expose OAuth tokens or credentials to the LLM?
No. The runtime keeps credentials in a vault and issues tightly scoped, just-in-time tokens for tool execution without placing secrets in the model context.

What security and compliance features should an enterprise MCP runtime include?
Post-prompt authorization, least-privilege policy enforcement, immutable audit logs (OpenTelemetry-friendly), secret vaulting and rotation, and admin controls for tool access and visibility.

What is "post-prompt" (on-behalf-of) authorization for AI agents?
Post-prompt authorization means the runtime authorizes and executes each tool call using the requesting user's permissions at execution time, rather than using a shared service account or passing user tokens into prompts.

How much latency does an MCP runtime add?
A small overhead from per-tool authorization checks and just-in-time token resolution. The overhead is a fraction of LLM inference and downstream SaaS API latency.

Can an MCP runtime work in a private VPC or hybrid environment?
Yes. The runtime's MCP gateway lets internal MCP servers run inside your VPC while governance and routing stay centralized. Self-hosted deployment runs the runtime entirely in your own infrastructure.

How do MCP runtimes help with audit logging and incident response?
They record who requested the action, which tool was called, the parameters, results, and timing, all exportable to a SIEM via OpenTelemetry for compliance and investigations.

How do MCP runtimes handle SaaS API changes and version drift?
The vendor maintains tool schemas and centrally updates integrations. This reduces breakage from deprecations and keeps tool definitions consistent across agents.

Can we start DIY and migrate to a runtime later?
Yes. Teams begin with DIY for prototypes and migrate to a runtime when multi-user auth, governance, and operational load become production requirements.