OutSystems launches Agent Workbench, MCP, and Marketplace
OutSystems just moved from proof of concept to production with Agent Workbench, full MCP support, and a curated marketplace. Here is why this matters for CIOs, platform teams, and anyone ready to scale enterprise AI agents with real guardrails.


The day low-code crossed into agent operations
On September 30, 2025, OutSystems moved its Agent Workbench into general availability and planted a clear marker in the enterprise agent race. The release arrived with two signals that matter to CIOs and platform teams. First, the Workbench ships with support for the Model Context Protocol. Second, it introduces a curated marketplace for agents, connectors, and tools. OutSystems is positioning this as a path out of pilot purgatory and into governed production, with identity, integration, and oversight considered from day one. The company announced the rollout publicly as the general availability of Agent Workbench.
Why does this move matter? Low-code platforms already host business logic, interfaces, and integrations. When those same platforms make agents first-class citizens, the distance from idea to production shrinks from quarters to weeks. Standards-based interoperability and a marketplace with procurement guardrails complete the story. You get an on-ramp that keeps security, governance, and observability intact.
What actually shipped
Agent Workbench is an environment for designing, testing, and orchestrating task-specific agents that plug into existing applications and workflows. Two additions are worth spotlighting.
- MCP integration. The Model Context Protocol gives agents a standard way to discover tools, fetch context, and act. On the enterprise side, teams get a consistent way to expose systems and permissions to multiple models without bespoke glue code every time.
- A curated marketplace. Teams can install vetted agents, connectors, and tools with a single, auditable step. Listings bundle permissions, dependency declarations, and update paths so you can manage change with confidence.
If MCP sounds abstract, think of it as a universal outlet for agent tooling. Instead of every developer fabricating a different plug for every system, MCP defines a shared plug shape and wiring. For a clear, technical snapshot, the primary reference remains the open source Model Context Protocol specification.
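To make the "shared plug shape" concrete, here is a minimal sketch of the message framing. MCP messages are JSON-RPC 2.0, and the specification defines `tools/list` for discovery and `tools/call` for invocation. The tool name `search_tickets` and its arguments are hypothetical, and transport and session initialization are left out.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # simple incrementing request ids for this sketch


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope, the framing MCP messages use."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask a server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invocation: call one tool by name with structured arguments.
# "search_tickets" is a hypothetical tool, not part of any real server.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "search_tickets", "arguments": {"query": "refund", "limit": 5}},
)

print(json.dumps(call_tool, indent=2))
```

Because every tool call has this shape regardless of the system behind it, the same client code can talk to a ticketing server or a document store without a bespoke adapter for each.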
Why MCP-backed interoperability changes the rollout math
Agent pilots often fail to scale because they hardwire a specific model to a specific data connector with a one-off logging scheme. The next pilot is as expensive as the first, and the third is worse. MCP separates three layers that used to get tangled.
- System access. Databases, document stores, and line-of-business apps can be exposed as MCP servers with permissions and audit hooks.
- Reasoning models. Different large language models become swappable clients that invoke the same tools and resource calls.
- Orchestration logic. The business process coordinating those agents remains stable while the underlying tools change.
With those layers cleanly separated, platform teams can do practical things that used to be painful.
- Swap or mix models without rewriting integration code.
- Standardize observability across agents because tool calls and resource access share a common structure.
- Reuse connectors and permissions across agents rather than cloning brittle code.
The result is an interoperability surface. OutSystems’ choice to adopt MCP means agents you build in the Workbench are not locked into private connector contracts. That lowers the long-term cost of change, which is the hidden line item that turns pilots into orphans.
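The layer separation can be sketched in a few lines. This is an illustration only, not how Agent Workbench is implemented: the `Model` interface, the two stand-in model classes, and the tool names are all hypothetical. The point is that the orchestration function never changes when the model does.

```python
from typing import Callable, Dict, Protocol


class Model(Protocol):
    """Reasoning layer: anything that can pick a tool for a request."""

    def choose_tool(self, request: str) -> str: ...


class KeywordModel:
    """Stand-in model that routes on a keyword."""

    def choose_tool(self, request: str) -> str:
        return "lookup_order" if "order" in request.lower() else "search_kb"


class AlwaysSearchModel:
    """A second stand-in model with a different policy."""

    def choose_tool(self, request: str) -> str:
        return "search_kb"


# System access layer: a registry of tools (hypothetical names and bodies).
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_order": lambda q: f"order record for: {q}",
    "search_kb": lambda q: f"kb articles for: {q}",
}


def handle(request: str, model: Model) -> str:
    """Orchestration layer: stable regardless of which model is plugged in."""
    tool_name = model.choose_tool(request)
    return TOOLS[tool_name](request)


print(handle("Where is my order?", KeywordModel()))
print(handle("Where is my order?", AlwaysSearchModel()))
```

Swapping `KeywordModel` for `AlwaysSearchModel` changes behavior without touching `handle` or the tool registry, which is the property that keeps the second and third pilots cheap.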
The marketplace as a new kind of app store
Agent marketplaces will feel familiar to anyone who watched mobile app stores rise and later saw cloud marketplaces mature. The pattern is similar, yet agents create new risks and therefore new controls.
- Curation. A credible marketplace does more than check whether an app installs. It verifies data access scopes, tool call boundaries, and fallback behavior. Listings should disclose which systems the agent can touch, which models it supports, and exactly which requests it can initiate.
- Upgrades. Agents are closer to microservices than to mobile apps. Upgrading a support triage agent should include dry-run tests, clear change logs, and rollback plans. Semantic versioning and pre-production testing slots reduce operational risk.
- Contracts. Buying an agent can look like buying software or a managed service. Clear rules about incident response, request logging, and data residency keep procurement on track.
OutSystems is leaning into these realities with a curated approach. For buyers, the marketplace keeps agents from becoming untracked automations. For sellers, it creates a commercial channel for specialized agents that solve industry workflows such as mortgage underwriting, life sciences escalations, or fleet dispatching.
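One way to operationalize upgrade review is to compare the installed listing with the candidate before anything deploys. The sketch below assumes a manifest with `version` and `permissions` fields; that shape is hypothetical and is not the OutSystems marketplace format. It flags an upgrade for review when the major version changes or when new permissions are requested.

```python
from typing import Any, Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release tags)."""
    return tuple(int(part) for part in version.split("."))


def review_upgrade(installed: Dict[str, Any], candidate: Dict[str, Any]) -> Dict[str, Any]:
    """Flag upgrades that change the major version or widen permissions."""
    old = parse_version(installed["version"])
    new = parse_version(candidate["version"])
    added = sorted(set(candidate["permissions"]) - set(installed["permissions"]))
    major_bump = new[0] > old[0]
    return {
        "major_bump": major_bump,
        "added_permissions": added,
        "needs_review": major_bump or bool(added),
    }


# Hypothetical listings for a support triage agent.
installed = {"version": "1.4.2", "permissions": ["tickets:read"]}
candidate = {"version": "1.5.0", "permissions": ["tickets:read", "tickets:update_status"]}
print(review_upgrade(installed, candidate))
```

A minor version bump that quietly adds a write permission is exactly the case a change advisory board wants surfaced, so the check treats permission growth as seriously as a major release.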
How this compares to hyperscaler agent stacks
Enterprises are already evaluating agent platforms from hyperscalers. The strengths and tradeoffs are well known.
Hyperscaler agent stacks
- Strengths. Deep integration with native data services, strong identity controls, and elastic infrastructure. Developers can chain managed services like vector stores, function calling, and event hubs with minimal friction. Support options and SLAs are established.
- Tradeoffs. Strong pull toward provider-specific services makes portability harder. Tooling often spreads across multiple consoles. Marketplace selection varies by region and vertical. Abstractions assume cloud-native teams and can be heavy for mixed estates.
Low-code agent workbenches
- Strengths. Faster path from requirement to working agent because data models, UI, and workflow engines already exist. Closer alignment with application teams that own the processes agents will augment. A curated marketplace can ship opinionated defaults for guardrails and observability that match how business apps are built.
- Tradeoffs. You depend on vendor pace for protocol updates, model support, and marketplace policies. If the platform’s governance model does not map to your identity or security architecture, you may need additional policy work.
The choice is not binary. Many organizations will build orchestration in a low-code workbench and rely on hyperscalers for model hosting, data services, and events. MCP lowers the cost of stitching these worlds together. For a cloud-first counterpoint on making agents operational, see our take on Amazon Bedrock AgentCore.
What IT should evaluate next
Agent marketplaces are about to become the new app stores for enterprise automation. Before you scale, build a short list of evaluation criteria that force clarity.
Identity and permissions
- Ensure agents inherit least-privilege access from your identity provider. Confirm role-based and attribute-based access control.
- Require clear scopes for tool calls. For example, a triage agent may read tickets and update statuses but not modify customer records without explicit handoff.
- Validate service account hygiene. Rotation, revocation, and key escrow should match your current standards.
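The triage example above can be expressed as a small fail-closed check. This is a sketch under assumptions: the `tool:action` scope strings and the `AgentGrant` type are invented for illustration, and a real deployment would source grants from the identity provider rather than from code.

```python
from dataclasses import dataclass
from typing import FrozenSet


class ScopeDenied(PermissionError):
    """Raised when an agent requests a tool action outside its grant."""


@dataclass(frozen=True)
class AgentGrant:
    agent: str
    scopes: FrozenSet[str] = frozenset()  # empty by default: no access


def authorize(grant: AgentGrant, tool: str, action: str) -> None:
    """Fail closed: anything not explicitly granted is denied."""
    required = f"{tool}:{action}"
    if required not in grant.scopes:
        raise ScopeDenied(f"{grant.agent} lacks scope {required}")


# Hypothetical grant: read tickets and update status, nothing else.
TRIAGE = AgentGrant("triage-agent", frozenset({"tickets:read", "tickets:update_status"}))

authorize(TRIAGE, "tickets", "read")  # allowed, returns None
try:
    authorize(TRIAGE, "customers", "write")
except ScopeDenied as exc:
    print(f"denied: {exc}")
```

The default of an empty scope set matters as much as the check itself: a newly registered agent can do nothing until someone grants it something.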
Observability and auditability
- Capture structured logs for every tool call and resource fetch with correlation identifiers that link inputs, model outputs, and tool invocations. This is essential for incident review and reproducibility.
- Verify that traces and metrics flow into your existing observability stack. You want tool success rate, latency distribution, token usage, and handoff frequency. Favor platforms with redaction and field-level controls so sensitive inputs do not leak into logs.
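A structured tool-call record might look like the following sketch. The field names and the redaction list are assumptions chosen for illustration, not a schema from OutSystems or MCP; the correlation identifier is what ties this record to the model output and the originating request.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict

# Hypothetical set of field names that must never reach the logs.
REDACTED_FIELDS = {"email", "card_number"}


def redact(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Replace sensitive top-level values before the payload is logged."""
    return {k: ("[REDACTED]" if k in REDACTED_FIELDS else v) for k, v in payload.items()}


def tool_call_record(
    correlation_id: str, tool: str, args: Dict[str, Any], ok: bool, latency_ms: float
) -> Dict[str, Any]:
    """Build one log record per tool invocation."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "correlation_id": correlation_id,
        "event": "tool_call",
        "tool": tool,
        "args": redact(args),
        "ok": ok,
        "latency_ms": latency_ms,
    }


correlation_id = str(uuid.uuid4())  # shared by every record in one request
record = tool_call_record(
    correlation_id, "lookup_customer", {"email": "a@example.com", "ticket_id": 42}, True, 118.0
)
print(json.dumps(record))
```

Redaction here only covers top-level keys. Nested payloads would need a recursive pass, which is worth deciding before the pilot rather than after the first audit.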
Guardrails and policy enforcement
- Look for pre-built policies for personally identifiable information handling, data residency, and prompt filtering. You should be able to define disallowed actions and safe defaults for ambiguous requests.
- Test escalation paths. When an agent is uncertain, it should hand off to a human or a deterministic workflow with complete context attached.
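Disallowed actions and escalation can be combined into a single routing decision. The sketch below is a hedged illustration: the action names, the 0.8 threshold, and the `Proposal` type are assumptions, and how a confidence score is produced is outside its scope.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical actions the agent may never take on its own.
DISALLOWED_ACTIONS = {"modify_customer_record", "issue_refund"}


@dataclass
class Proposal:
    action: str
    confidence: float
    context: Dict[str, Any] = field(default_factory=dict)


def route(proposal: Proposal, threshold: float = 0.8) -> Dict[str, Any]:
    """Send a proposal to the agent or to a human, with context attached."""
    if proposal.action in DISALLOWED_ACTIONS:
        reason = "disallowed_action"
    elif proposal.confidence < threshold:
        reason = "low_confidence"
    else:
        return {"route": "agent", "action": proposal.action}
    # Hand off with the full context so the human does not start from zero.
    return {
        "route": "human",
        "reason": reason,
        "action": proposal.action,
        "context": proposal.context,
    }


print(route(Proposal("update_ticket_status", 0.93)))
print(route(Proposal("update_ticket_status", 0.41, {"ticket_id": 42})))
```

Checking the disallowed list before the confidence score means a highly confident agent still cannot take a forbidden action.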
Data governance and lifecycle
- Map how agents learn from interactions. If there is memory, verify retention windows, encryption at rest, and deletion workflows. Ensure you can scope memory to a case, a user, or a team.
- Confirm how connectors are vetted and updated. A connector that touches finance systems needs a patch policy and a rollback plan.
Pricing and incentives
- Model total cost of ownership with realistic usage. Include model inference, tool call costs, data egress, and marketplace listings that carry per-seat or per-request pricing.
- Ask vendors to be explicit about launch incentives. Credits for the first production agents, bundled connectors, or reduced marketplace fees can make the first quarter materially cheaper. Get written terms with sunset dates so finance can forecast beyond the promotional window.
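A rough cost model helps finance see what happens when a launch credit expires. Every number and parameter name below is a placeholder assumption, not a quoted price from any vendor; substitute figures from your own contracts.

```python
def monthly_cost(
    requests: int,
    tokens_per_request: int,
    price_per_1k_tokens: float,
    tool_calls_per_request: int,
    price_per_tool_call: float,
    egress_gb: float,
    price_per_gb: float,
    seats: int,
    price_per_seat: float,
    promo_credit: float = 0.0,
) -> float:
    """Sum the main cost lines for one month, net of any promotional credit."""
    inference = requests * tokens_per_request / 1000 * price_per_1k_tokens
    tools = requests * tool_calls_per_request * price_per_tool_call
    egress = egress_gb * price_per_gb
    listings = seats * price_per_seat
    return max(0.0, inference + tools + egress + listings - promo_credit)


# Placeholder inputs for a single pilot workflow.
usage = dict(
    requests=10_000,
    tokens_per_request=2_000,
    price_per_1k_tokens=0.01,
    tool_calls_per_request=3,
    price_per_tool_call=0.001,
    egress_gb=50,
    price_per_gb=0.09,
    seats=5,
    price_per_seat=20.0,
)

during_promo = monthly_cost(**usage, promo_credit=100.0)
after_promo = monthly_cost(**usage)
print(f"during promo: {during_promo:.2f}, after promo: {after_promo:.2f}")
```

Running the model with and without the credit gives the forecast for both sides of the sunset date.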
Contract and risk
- Ensure the marketplace defines breach notification timelines, support response targets, and a process for emergency disablement of an agent or connector.
- Require an exit plan. If you move agents to a different platform, you need guaranteed export of configurations, prompts, and policies in a documented format.
A practical 30-60-90 day plan
You do not need to boil the ocean to prove value. Start small, instrument thoroughly, and scale patterns that work.
Days 1 to 30: pick one high-leverage workflow
- Choose a process with measurable value and clear guardrails, such as customer escalation triage or invoice matching. Draft decision boundaries where the agent acts and where it hands off to a human.
- Stand up a sandbox in Agent Workbench. Configure identity integration, set default scopes, and instrument baseline observability.
- Pull one or two MCP servers for systems the agent needs, such as a ticketing system and a knowledge repository.
Days 31 to 60: move from demo to pilot with governance
- Add unit tests for tools and end-to-end test cases using real inputs. Capture metrics for tool reliability and business outcomes like time to resolution or first contact resolution.
- Source one listing from the marketplace. Treat it like any third-party component. Review permissions, update cadence, and support plan. Hold a change advisory board review before first deployment to production.
- Define rollback and kill switch procedures. Practice them.
Days 61 to 90: scale the pattern
- Write a short runbook and incident guide. Standardize logging fields, redaction rules, and escalation contacts. Define a weekly triage time for prompt and policy adjustments based on production logs.
- Present a scoreboard to executives. Track value realized, controls in place, and a backlog of agents that can be safely replicated across departments using the same templates and MCP connectors.
Concrete examples to get started
- Customer operations. Install a triage agent that classifies inbound tickets, fetches customer history, suggests responses, and opens tasks for human agents when confidence drops below a threshold. Use MCP to connect the ticketing system and the knowledge base through scoped servers.
- Finance operations. Deploy a reconciliation agent that pulls invoices and purchase orders, flags mismatches, and prepares a human-readable report. Require approval before any write action to the ledger and capture every tool call in your observability platform.
- Field service. Equip a scheduling agent to propose optimal routes and inventory checks, then ask a dispatcher for final confirmation. Use marketplace connectors for maps and inventory systems and keep write scopes tightly constrained.
- Security operations. Build a phishing triage agent that assembles email artifacts, reputation checks, and policy context, then recommends action with full traceability. Keep write privileges disabled until confidence and monitoring thresholds are met.
If you are balancing platform choices, it helps to compare how marketplaces and governance are evolving across the ecosystem. Our analysis of the ChatGPT agent storefront dives into how packaging and instant checkout change distribution. For teams running agents close to data and users, the trend toward edge execution is accelerating as shown in our look at Cloudflare Remote MCP.
Operating model and team responsibilities
A successful rollout requires more than tools. It needs an operating model that assigns ownership and closes the loop between development and operations.
- Product owner. Owns the business outcome, defines success metrics, and prioritizes the agent backlog.
- Platform engineer. Owns the MCP servers, connectors, and observability wiring. Establishes shared libraries and policies.
- Security partner. Owns threat modeling, policy enforcement, and routine review of logs and permissions. Approves escalation playbooks.
- Application team lead. Owns the workflow the agent augments and signs off on decision boundaries and handoff rules.
- Support lead. Owns triage and incident response. Runs tabletop exercises and maintains the kill switch.
Publish a simple RACI that clarifies who approves what, and put response targets in writing. Many teams discover that the first production agent is less about model quality and more about process discipline.
Common pitfalls and how to avoid them
- Silent scope creep. Agents gradually gain permissions until logs are unreadable and audits fail. Prevent this with explicit scopes per tool and per environment. Fail closed by default.
- Untracked prompts. Teams treat prompts as editable notes. Treat them as code. Store prompts in version control, pair them with test cases, and require reviews.
- Observability late in the game. If you instrument after the pilot, you will not know what is normal. Define required fields and redaction rules up front.
- Vendor lock-in by accident. Proprietary connectors can creep in at the edges. Prefer MCP-compatible integrations and document exceptions.
- Overfitting to a single model. Use MCP to keep model choices open. Test a second model early so your design does not assume one vendor forever.
How to measure real progress
Executives will ask for proof that agents deliver value and that risk is under control. Agree on a small set of metrics you can report every week without ceremony.
- Business outcomes. Time to resolution, cost per request, revenue influenced, or error rate reduction, depending on the workflow.
- Reliability. Tool success rate, average and p95 latency, incident count, and mean time to restore.
- Safety. Percentage of requests blocked by policy, number of escalations, and audit completeness for tool invocations.
- Efficiency. Token usage per resolved case, retries per tool, and cache hit rate if you cache retrieval results.
Publish these as a simple dashboard. Use the same color coding as your existing SLOs so leaders do not need a new legend.
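The reliability numbers can be computed directly from tool-call logs. This sketch assumes each log entry carries `ok` and `latency_ms` fields, which is an illustrative schema, and it uses the nearest-rank definition of a percentile; other definitions interpolate and give slightly different values.

```python
from typing import Any, Dict, List


def percentile_nearest_rank(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    if not values:
        raise ValueError("no values to summarize")
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def weekly_summary(calls: List[Dict[str, Any]]) -> Dict[str, float]:
    """Reduce a week of tool-call records to three dashboard numbers."""
    latencies = [c["latency_ms"] for c in calls]
    successes = sum(1 for c in calls if c["ok"])
    return {
        "tool_success_rate": successes / len(calls),
        "avg_latency_ms": sum(latencies) / len(latencies),
        "p95_latency_ms": percentile_nearest_rank(latencies, 95),
    }


# Synthetic records: latencies 1..100 ms, every tenth call fails.
calls = [{"ok": i % 10 != 0, "latency_ms": float(i)} for i in range(1, 101)]
print(weekly_summary(calls))
```

Reporting p95 next to the average matters because a handful of slow tool calls can hide behind a healthy mean.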
What this means for the next year
The shift is not about hype. It is about making agents participate in the same disciplines that scaled software before them. MCP reduces bespoke glue and gives teams swappable models. Marketplaces turn ad hoc scripts into managed components with clear contracts. Low-code platforms provide a familiar surface for application teams who already own the workflows.
Expect three changes if you adopt this pattern quickly.
- Fewer orphaned pilots. Shared connectors and a marketplace shorten the path from idea to governed rollout.
- Faster root cause analysis. Structured tool logs and traces shrink the time between an incident and a fix.
- Clearer procurement and risk posture. Standard packaging and contracts let legal and security move faster without skipping steps.
The bottom line
OutSystems’ late September launch gives enterprise agents a practical shape. By combining MCP-backed interoperability with a curated marketplace, the company shortens the distance from pilot to governed production. Low-code becomes the shortest on-ramp because it already sits where business logic lives. The right next step is not to plan a grand platform migration. Pick one workflow, wire it through MCP with narrow scopes, install one agent from the marketplace or build your own, and measure the result. If you can show value with the controls turned on, you are ready to scale.