Cloudflare’s remote MCP turns the edge into an agent backend

The edge just became an agent backend

For years, most agent demos looked magical on a developer laptop and then stalled on the way to production. Local tool servers hid behind home routers, secrets lived in plain text, and every crash wiped the agent’s memory. That story changes when the tool server moves to the network edge with built‑in identity, durable state, and a backbone for long‑running work.

Cloudflare now lets you host a remote Model Context Protocol server on its global network so your agent tools are reachable from anywhere and fast for everyone. If you only read one announcement, make it the official write‑up on remote MCP servers on Cloudflare’s network. The rest of this guide explains what shipped, how the pieces fit together, and how to build your first edge‑native agent backend.

What shipped and why it matters

Three platform capabilities landed in 2025 that make agents practical beyond the lab:

- Remote MCP servers at the edge, so tools are public by default behind managed TLS and routing, not trapped on localhost.
- Durable state via Durable Objects, so memory and locks survive process restarts without bolting on a separate lock service.
- A production‑grade orchestrator in Workflows, which reached general availability in April 2025 and brings retries, timers, event waiting, and observability.

The effect is simple to describe and powerful in practice. Agents need reachability, state, and durability all at once. Delivering those at the edge means your users get low latency without you juggling regions or warming VMs. It also means security teams can reason about identity and permissions with the same mental model they already use for SaaS.

Remote MCP vs local MCP vs VM backends

Think of agent backends like kitchens:

- Local MCP is your home kitchen. Perfect for tinkering, not for a shared team dinner. Secrets in files, NAT between you and everyone else, and uptime tied to your laptop.
- Traditional VM backends are a rented commercial kitchen. You invite guests, but you handle fire safety and staffing. That means patching OS images, managing certificates, and planning capacity.
- Remote MCP at the edge is a kitchen inside a global food hall. Compliance checks are done, the ovens are everywhere, and you pay when you cook. You bring recipes and ingredients and focus on the menu, not the building.

Concretely, here is how the three approaches compare:

Reachability
- Local MCP: bound to localhost or a private LAN. External access requires tunnels or custom networking.
- VM: public with manual TLS, ingress hardening, and DDoS considerations.
- Remote MCP: public on a managed edge with automatic TLS, routing, and protections.
State and coordination
- Local MCP: process memory or flat files that vanish on restart.
- VM: you assemble a database, caches, and a lock service.
- Remote MCP: per‑entity state, storage, and locks using Durable Objects colocated with compute.
Long‑running work
- Local MCP: scripts plus cron plus hope.
- VM: you piece together queues, schedulers, and backoffs.
- Remote MCP: Workflows give retries, timers, wait‑for‑event, and auditability in one place.
Security and permissions
- Local MCP: ad hoc prompts and secrets sprinkled into environment variables.
- VM: roll your own OAuth and policy layers.
- Remote MCP: first‑class OAuth patterns, scoped tokens, and easy integration with enterprise identity providers.
Latency and scale
- Local MCP: great for one user near the machine, poor for global teams.
- VM: depends on chosen regions and autoscaling.
- Remote MCP: tens of milliseconds to most users, scale to zero when idle, and burst without warmup playbooks.

Built‑in auth, scoped permissions, and human checks

Two questions slow agent rollouts in every company: How do we authorize tools with least privilege, and how do we keep a human in the approval loop without gluing together a custom app per action?

At the edge, you can run an OAuth provider or integrate with your identity provider and mint short‑lived tokens scoped to the minimum required privileges. That enables delegation that fits the task instead of blanket access. For example, allow an agent to read a calendar and write to a single document database, but not to file uploads or repository settings.
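To make the least‑privilege idea concrete, here is a minimal sketch of a short‑lived, scoped token check. The `ScopedToken` shape, the helper names, and the scope strings are assumptions made for illustration; they are not part of any Cloudflare API.

```typescript
// Hypothetical token shape: field names and scope strings are illustrative only.
interface ScopedToken {
  subject: string    // user the token was minted for
  scopes: string[]   // privileges granted, e.g. "calendar:read"
  expiresAt: number  // expiry as epoch milliseconds
}

// Mint a short-lived token that carries only the scopes the task needs.
function mintToken(subject: string, scopes: string[], ttlMs: number, now: number): ScopedToken {
  return { subject, scopes: scopes.slice(), expiresAt: now + ttlMs }
}

// A call is allowed only if the token is unexpired and holds every required scope.
function isAllowed(token: ScopedToken, required: string[], now: number): boolean {
  if (now >= token.expiresAt) return false
  return required.every(scope => token.scopes.indexOf(scope) !== -1)
}
```

Passing the clock in as `now` keeps the check deterministic, which makes expiry behavior easy to test.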

The September 2025 Agents SDK update adds tool confirmation detection so your UI can pause before a sensitive call. When the model proposes an action, the interface surfaces the tool name and parameters and asks the user to approve. Approval becomes data the system records rather than a vibe inferred from silence.

If you are tracking the broader shift toward agent platforms, this aligns with trends we covered in AgentKit and ChatGPT Apps as a native platform and in WebexOne 2025 turning collaboration into an agent surface. The difference here is that Cloudflare is handling the backend plumbing so your team can spend time on product behavior.

Stateful memory at global edge latency

Durable Objects assign each logical thing a single‑threaded stateful actor with its own storage and mailbox. That sounds academic until you apply it to agents. Give every user, document, or robot an addressable object that holds long‑term notes, the latest resource identifiers, and encrypted OAuth tokens. Because compute and storage sit together, you get strong ordering without a separate distributed lock.

This is exactly what agent memory needs. One object holds long‑term profile data and keeps tokens in safe storage. Another mediates a shared resource, like a rate‑limited API. The agent restores context on the next turn by addressing the right object, not by scraping the transcript.
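The ordering guarantee is easier to see in a toy model. The sketch below is an in‑process analogy only, not how the platform implements Durable Objects: a hypothetical `Mailbox` runs tasks one at a time, so a read‑modify‑write on per‑entity state cannot interleave with another.

```typescript
// Toy analogy for a single-threaded actor; not the platform's implementation.
class Mailbox {
  private tail: Promise<void> = Promise.resolve()

  // Run tasks strictly one after another, in arrival order.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => task())
    // Keep the chain alive even if a task rejects.
    this.tail = result.then(() => undefined, () => undefined)
    return result
  }
}

// Per-entity state guarded by its own mailbox, loosely like one object per user.
class Counter {
  private value = 0
  private mailbox = new Mailbox()

  increment(): Promise<number> {
    return this.mailbox.enqueue(async () => {
      const current = this.value
      await Promise.resolve() // simulated async gap where a race could occur
      this.value = current + 1
      return this.value
    })
  }
}
```

Without the mailbox, three concurrent increments could all read the same starting value and lose updates.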

Durable execution and observability

Workflows give you a backbone for multi‑step agent behavior. A workflow can call a tool, pause for an external event, retry with exponential backoff after a transient failure, and continue hours or days later. With general availability in April 2025, Workflows shipped higher concurrency, wait‑for‑event primitives, and better metrics so you can see where time and budget go. That is the difference between a demo that collapses on a network blip and a production agent that recovers gracefully.
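Workflows applies the retry policy for you, so you would not normally write this by hand. As an illustration of what an exponential backoff schedule looks like, here is a small sketch; the function names and the cap parameter are assumptions, not platform APIs.

```typescript
// Delay before retry attempt n (1-based): base * 2^(n-1), capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt - 1))
}

// Full schedule for a bounded number of retries.
function backoffSchedule(retries: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = []
  for (let attempt = 1; attempt <= retries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseMs, maxMs))
  }
  return delays
}
```

With a 1 second base, a 10 second cap, and five retries, the delays double from 1 second to 8 seconds and the fifth is held at the cap.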

Hands‑on: build a remote MCP server with OAuth, memory, retries, and observability

The following blueprint assumes you have the Cloudflare developer tooling installed. It yields an internet‑reachable MCP server, OAuth‑scoped access to a SaaS API, a Durable Object for memory, and a Workflow for retries and long runs. The sample uses TypeScript; the same shape should carry over to other languages Workers supports, though SDK and binding support varies by language.

1) Scaffold the project

Create a new Workers project and add the packages you need.

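npm create cloudflare@latest my-agent-backend
cd my-agent-backend
# At the time of writing the Agents SDK is published on npm as "agents"; confirm the name in the current docs.
npm install agents
npm install --save-dev @cloudflare/workers-types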

2) Define the Durable Object for memory and tokens

A simple object keeps long‑term notes per user, plus encrypted token material. The surface below is intentionally minimal: it covers notes only, so token storage is left for you to add as you extend it to fit your needs.

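// src/memory.ts
// One instance per user (addressed with idFromName(userId)) holds that user's notes.
export class MemoryDO {
  state: DurableObjectState
  constructor(state: DurableObjectState) {
    this.state = state
  }
  async fetch(req: Request) {
    const url = new URL(req.url)
    // POST /remember: store notes for a user.
    if (req.method === 'POST' && url.pathname === '/remember') {
      const body = await req.json<{ userId?: string; notes?: unknown }>()
      if (!body.userId || body.notes === undefined) {
        return new Response('userId and notes are required', { status: 400 })
      }
      await this.state.storage.put(`notes:${body.userId}`, body.notes)
      return new Response('ok')
    }
    // GET /remember?userId=...: read the notes back (null if nothing is stored).
    if (req.method === 'GET' && url.pathname === '/remember') {
      const userId = url.searchParams.get('userId')
      if (!userId) return new Response('userId is required', { status: 400 })
      const notes = await this.state.storage.get(`notes:${userId}`)
      return Response.json({ notes: notes ?? null })
    }
    return new Response('not found', { status: 404 })
  }
}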

3) Configure OAuth at the edge

Use Cloudflare’s OAuth provider or integrate with an identity provider. Create an application with narrow scopes, for example read‑write to a single Notion database or limited repository scopes on GitHub. Store client identifiers and secrets as encrypted environment variables. Map issued access tokens to user identifiers in your Durable Object.

4) Implement the remote MCP server

Expose tools such as calendar.search, doc.create, and repo.open_pull_request. Each tool should verify scopes and log usage. The conceptual import below represents your MCP server implementation.

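// src/mcp.ts
// Conceptual sketch: McpServer, its options, and server.authorize are illustrative,
// not the published SDK surface. Check the Agents SDK docs for the real imports.
import { McpServer } from '@cloudflare/agents' // conceptual import

// Scope each tool requires. The scope strings are hypothetical.
const REQUIRED_SCOPES: Record<string, string> = {
  'doc.create': 'docs:write',
}

// Env is your Worker's bindings type (here: MEMORY and DOCS_API_BASE).
export default function createServer(env: Env) {
  const server = new McpServer({
    transport: 'remote',
    // Look up the user's token in their Durable Object and check the tool's scope.
    authorize: async (userId: string, toolName: string) => {
      const obj = env.MEMORY.get(env.MEMORY.idFromName(userId))
      // Assumes the Durable Object also serves GET /tokens (not shown in step 2).
      const res = await obj.fetch('https://do/tokens')
      if (!res.ok) throw new Error('token lookup failed')
      const { token, scopes } = await res.json<{ token: string; scopes: string[] }>()
      const required = REQUIRED_SCOPES[toolName]
      if (!required || !scopes.includes(required)) throw new Error('forbidden')
      return token
    },
  })

  server.tool('doc.create', async ({ userId, title, body }: { userId: string; title: string; body: string }) => {
    const token = await server.authorize(userId, 'doc.create')
    const res = await fetch(env.DOCS_API_BASE + '/docs', {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ title, body }),
    })
    if (!res.ok) throw new Error(`doc.create failed: ${res.status}`)
    const data = await res.json<{ id: string; url: string }>()
    return { id: data.id, url: data.url }
  })

  return server
}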

5) Add human‑in‑the‑loop confirmations

With the Agents SDK, enable tool confirmation detection. Your frontend can intercept proposed tool calls, render the tool and parameters, and gate execution on explicit approval.

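// src/ui.tsx
// Conceptual sketch: the hook option and fields below (confirmTools, propose,
// approve) are illustrative; check the Agents SDK docs for the real names.
import { useAgent } from '@cloudflare/agents/react' // conceptual import

function Chat() {
  const { messages, propose, approve } = useAgent({ confirmTools: true })
  // when propose fires, show a dialog with tool + params
  // if the user clicks Approve, call approve() to continue
  return null // render messages and the confirmation dialog here
}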

This gives you consistent approvals across tools without writing custom glue for each integration. If you are building richer client behavior, see how computer use shows up in the browser in our look at Gemini 2.5 computer use.

6) Wrap long‑running tasks in a Workflow

Use a Workflow for anything that needs retries, timers, or multi‑step orchestration.

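// src/workflow.ts
import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from 'cloudflare:workers'
// searchFlights, notifyUser, and purchase are hypothetical helpers you supply.
import { searchFlights, notifyUser, purchase } from './travel'
type Params = { userId: string; city: string }

export class PlanTrip extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    const { userId, city } = event.payload
    const flights = await step.do('searchFlights', () => searchFlights(city))
    await step.do('notifyUser', () => notifyUser(userId, flights))
    // Pause until the chosen flight arrives as an event; the type name is illustrative.
    const choice = await step.waitForEvent('chooseFlight', { type: 'choose-flight', timeout: '24 hours' })
    return step.do('purchase', () => purchase(userId, choice.payload))
  }
}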

When a tool call fails transiently, the workflow retries with backoff and records the attempt. If a human must choose an option, waitForEvent pauses without consuming compute. Observability helps you pinpoint slow steps and heavy CPU time.

7) Instrument for observability

Log tool usage with a request identifier, user identifier, tool name, and duration. Emit counters for success and error paths. Use platform metrics and Workflows dashboards to identify hotspots and costs. Pay special attention to the ratio of tool latency to model latency so you know where to optimize first.
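One way to get those fields on every call is to wrap each tool handler. The sketch below is a minimal, assumption‑laden example: the `ToolLog` shape, the `withToolLog` name, and the sink callback are all hypothetical, and you would point the sink at whatever logging pipeline you use.

```typescript
// Hypothetical record shape; adapt the field names to your logging pipeline.
interface ToolLog {
  requestId: string
  userId: string
  tool: string
  durationMs: number
  outcome: 'success' | 'error'
}

type ToolLogSink = (entry: ToolLog) => void

// Wrap a tool handler so every call emits exactly one structured record.
async function withToolLog<T>(
  meta: { requestId: string; userId: string; tool: string },
  sink: ToolLogSink,
  handler: () => Promise<T>,
): Promise<T> {
  const started = Date.now()
  let outcome: 'success' | 'error' = 'error'
  try {
    const result = await handler()
    outcome = 'success'
    return result
  } finally {
    // Runs on both paths; a thrown error still propagates to the caller.
    sink({ ...meta, durationMs: Date.now() - started, outcome })
  }
}
```

Because the error is rethrown after logging, retries and callers still see the failure. Counters for success and error paths can be derived from the `outcome` field.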

8) Deploy to the edge

Bind your Durable Object and environment variables in your configuration file, then deploy with a single command.

npx wrangler deploy
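The bindings this step refers to might look roughly like the fragment below. Treat it as a sketch under assumptions: the binding names MEMORY and PLAN_TRIP, the class name PlanTrip, the workflow name, the compatibility date, and the DOCS_API_BASE value are placeholders, and your Worker's entry module is assumed to export the Durable Object and Workflow classes.

```toml
# wrangler.toml (illustrative values; adjust names to your project)
name = "my-agent-backend"
main = "src/index.ts"
compatibility_date = "2025-09-01"

# Durable Object binding, read in code as env.MEMORY
[[durable_objects.bindings]]
name = "MEMORY"
class_name = "MemoryDO"

# Durable Object classes must be declared in a migration before the first deploy
[[migrations]]
tag = "v1"
new_sqlite_classes = ["MemoryDO"]

# Workflow binding (assumes a Workflow class exported as PlanTrip)
[[workflows]]
name = "plan-trip"
binding = "PLAN_TRIP"
class_name = "PlanTrip"

# Non-secret configuration only
[vars]
DOCS_API_BASE = "https://docs.example.com"
```

OAuth client secrets are left out on purpose. They are usually set with `wrangler secret put` rather than committed to the configuration file.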

Once deployed, your MCP server has a stable, public endpoint. Any compatible client can connect remotely, and users around the world see low latency without you spinning up regions. The end result aligns with how teams already expect SaaS to behave, just now for your agent’s tools.

Security model in practice

Start with least‑privilege scopes. If your agent writes weekly status notes, grant it write access only to a single folder or database. Store refresh tokens in the user’s Durable Object, not in a global environment variable. Encrypt at rest, rotate on a schedule, and restrict tool endpoints by tenant. Rate limit per user or per object instance to contain blast radius.
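Per‑user rate limiting can be as small as a token bucket held in each user's object. The sketch below is a plain TypeScript illustration, with a hypothetical `TokenBucket` class and the clock passed in so the behavior is deterministic; persisting the bucket state is left out.

```typescript
// Minimal token bucket; the caller supplies the current time in milliseconds.
class TokenBucket {
  private capacity: number
  private refillPerSecond: number
  private tokens: number
  private lastRefill: number

  constructor(capacity: number, refillPerSecond: number, now: number) {
    this.capacity = capacity
    this.refillPerSecond = refillPerSecond
    this.tokens = capacity // start full so the first burst is allowed
    this.lastRefill = now
  }

  // Returns true and spends one token if the caller is within budget.
  tryAcquire(now: number): boolean {
    const elapsedSeconds = (now - this.lastRefill) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond)
    this.lastRefill = now
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }
}
```

One bucket per user or per object instance means a runaway agent exhausts only its own budget.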

Human control belongs on the critical path. Treat approval like a database transaction. Block the tool call until a decision arrives, persist the approval result, and never infer consent from silence. The confirmation signals added to the Agents SDK help you standardize this pattern, and Workflows supply the waiting room.
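The "never infer consent from silence" rule can be encoded directly in the data model. This is a minimal sketch with hypothetical names (`ApprovalRecord`, `recordDecision`, `mayExecute`); it is not the Agents SDK's confirmation API.

```typescript
type Decision = 'approved' | 'rejected'

// Hypothetical record persisted for every proposed sensitive call.
interface ApprovalRecord {
  tool: string
  params: Record<string, unknown>
  decision?: Decision  // undefined means nobody has answered yet
  decidedBy?: string
}

// Only an explicit, attributed approval lets the call proceed; silence blocks it.
function mayExecute(record: ApprovalRecord): boolean {
  return record.decision === 'approved' && record.decidedBy !== undefined
}

// Record a decision exactly once, returning a new record to persist.
function recordDecision(record: ApprovalRecord, decision: Decision, decidedBy: string): ApprovalRecord {
  if (record.decision !== undefined) throw new Error('decision already recorded')
  return { ...record, decision, decidedBy }
}
```

A pending record and a rejected record both block execution, so a timeout or a closed tab can never turn into an approval.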

Cost and scale notes

Two properties matter in production: scaling down and cold starts. The Workers platform scales to zero when idle and bursts on demand, which means you pay for use rather than reserving instances. Workers run in V8 isolates rather than containers or VMs, so cold starts are short and bursty agent traffic does not need warmup playbooks. Workflows let you run for minutes, hours, or days without building a scheduler. Durable Objects give you many small stateful processes rather than one hot database, which improves isolation and simplifies coordination.

A good mental model is that you are buying higher baseline reliability and a simpler operator story in exchange for giving up low‑level OS control. If you need kernel modules or exotic dependencies, keep a VM in the mix for that part of the system and let your remote MCP orchestrate the interaction.

Local and VM backends still have roles

You will still use local MCP for rapid prototyping and offline tests. It is fast to iterate and easy to debug. Virtual machines still shine when you need custom kernels, unusual dependencies, or colocation with data that must stay in a specific region. The edge backend pattern is strongest when reachability, low latency, and managed durability matter more than operating system control.

For many teams, the new pattern pairs well with other surfaces. Voice and telephony can become agent platforms as we explored in phone lines as an agent platform. The browser, the chat client, and the meeting app are all front doors. The edge is the shared kitchen behind those doors.

A concrete demo to replicate this week

Goal: a remote MCP server that drafts a customer update, creates a calendar invite, and opens a pull request. All actions require user approval, use OAuth scopes, and record memory of what changed.

Suggested steps:

- Pick three integrations and create OAuth apps with minimal scopes, for example Notion write access to a single database, Google Calendar event write, and GitHub pull request write.
- Store client identifiers and secrets in environment variables and wire up the OAuth callback route in your Worker.
- Define one Durable Object class and address one instance per user. Store notes, the latest document identifiers, and encrypted refresh tokens.
- Implement tools doc.create_summary, calendar.create_event, and repo.open_pr. Each should read tokens from the user’s Durable Object and verify required scopes.
- Wrap the multi‑step flow in a Workflow that drafts the doc, asks for approval, creates the invite, then proposes the pull request. On transient failure, retry. On permission failure, prompt the user to reauthorize.
- Enable tool confirmation detection in the Agents SDK and render confirmation prompts in your chat interface.
- Add logging for tool calls with success and error status, and set up alerts for repeated retries.
- Deploy, then test from your agent client using the remote transport.
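The retry versus reauthorize split in the steps above can be sketched as a small classifier over upstream HTTP status codes. The mapping below is an assumption about how typical SaaS APIs signal these cases; check each integration's documentation before relying on it.

```typescript
type NextAction = 'retry' | 'reauthorize' | 'fail'

// Map an upstream HTTP status to the workflow's next move.
function classifyFailure(httpStatus: number): NextAction {
  // Missing or insufficient credentials: ask the user to grant access again.
  if (httpStatus === 401 || httpStatus === 403) return 'reauthorize'
  // Timeouts, rate limits, and server errors are usually transient.
  if (httpStatus === 408 || httpStatus === 429 || httpStatus >= 500) return 'retry'
  // Other client errors will not improve on retry.
  return 'fail'
}
```

Keeping this decision in one function makes it easy to alert on repeated `retry` results for the same step.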

By the end, your agent will be reachable on the internet, remember what happened, and ask before it acts. That checks the boxes that security teams care about and gives product teams confidence to move toward production.

What this unlocks next

When agents have a secure, stateful, internet‑reachable backend, patterns that were awkward become natural:

- Approval queues for sensitive actions like spend, deploys, or data exports.
- Long‑tail integrations where a small community needs a reliable tool server with shared memory.
- Event‑driven agents that sleep until a webhook or message arrives, then wake, act, and record outcomes.
- Multi‑tenant agent platforms where every tenant has its own Durable Objects and policies.

Each is smaller than a monolith and larger than a script, which is exactly the sweet spot for Workflows and Durable Objects.

The bottom line

Agents become real when you solve four problems at once: reachability, state, durability, and consent. Cloudflare’s remote MCP server gives reachability on a global network. Durable Objects provide state for memory and coordination without the overhead of a database cluster. Workflows add a spine for long‑running behavior with retries and observability. The September 2025 Agents SDK update brings human‑in‑the‑loop confirmations that keep people in charge of sensitive actions.

If you want the one link that started the shift, read the Cloudflare announcement on remote MCP servers. Then wire up a minimal memory object, turn on confirmation detection, and ship your first edge‑native agent backend. Your laptop demo can finally graduate to production.