Comet’s background assistant makes the browser an agent

Perplexity Comet’s background assistant is now public, showing how the browser can host real agents that plan, act, and ask for consent before they click. Here is what changes, why it matters, and how to try it today.

By Talos
AI Agents

The agentic browser moment

This week’s public release of Comet with a background assistant feels like a line being crossed in everyday computing. For decades we treated the browser as a passive window that rendered pages and waited for you to point and click. With Comet, the browser starts to look like a programmable workbench where software can see, decide, and follow through while you stay in control. It is a practical shift, not just a demo. The browser is becoming the default runtime for AI agents.

If that sounds familiar, it is because the industry has been inching in this direction for months. We explored it in our look at the agentic browser, and we have seen cloud platforms move their inference engines closer to real-world actions. Comet brings that conversation to where people already work: a tab you can watch, pause, and steer.

What Comet released and why it matters

Comet’s background assistant runs inside your existing browsing flow. It moves between tabs, opens new ones, and uses the same authentication your profile already holds. It narrates intentions in plain language, asks before it acts, and shows a live checklist of steps as it progresses. The effect is less like chatting with a bot and more like delegating to a careful coworker who works in front of you.

Three design choices make this release noteworthy:

  • The browser is the workbench. You can see navigation in real time rather than reading a transcript after the fact. The state of work is visible in your tabs, not hidden in a server log.
  • The loop stays human directed. Before the assistant signs in, adds to a cart, emails a vendor, or moves money, it asks. Explanations arrive before actions. You can cancel, revise, or add constraints at any point.
  • The connective tissue is growing. When you authorize it, Comet can hand off results to spreadsheets, notes, email, calendar, or ticketing tools. That is where background work compounds into outcomes.

The practical implication is leverage. Instead of telling a chatbot to explain a page you must visit, you tell an assistant to visit the pages, collect quotes, compare policies, and propose a draft decision with sources open in adjacent tabs. Your hands stay on the rail. The agent does the legwork.

How the background assistant actually works

Think of the system as three layers working in sequence:

  1. Perception. The assistant reads the current page, identifies interactive elements, and builds a layout map. It captures structured bits such as dates, prices, availability, and form fields.

  2. Planning. It proposes a sequence of steps like a recipe. For a hotel booking it might outline: set dates, expand filters, collect nightly rates, compare total cost after taxes, map distance to venues, then draft a summary.

  3. Execution with consent. It asks for approval. Once you accept, it clicks, types, scrolls, and opens tabs as needed. When an action raises risk, such as a purchase or a message to a third party, it pauses again for confirmation.

Under the hood, the model leans on your authorized context rather than brittle scraping. If you permit it, the assistant uses your signed-in state to open a calendar, read a spreadsheet, or send an email. If not, it limits itself to public pages and returns a checklist you can execute manually. Either way, you get a record of what it tried and why.
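The plan-then-consent loop described above can be sketched in a few lines. This is a minimal illustration, not Comet's implementation: the `Step` and `RunLog` types, the `HIGH_RISK` set, and the `approve` callback are all hypothetical names, and the "execution" is a placeholder where a real agent would click, type, or navigate.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical risk labels. The article names sign-ins, purchases, and
# third-party messages as actions that deserve a second confirmation.
HIGH_RISK = {"login", "purchase", "send_message"}


@dataclass
class Step:
    description: str
    kind: str  # e.g. "navigate", "extract", "purchase"


@dataclass
class RunLog:
    entries: List[str] = field(default_factory=list)


def run_plan(steps: List[Step], approve: Callable[[str], bool]) -> RunLog:
    """Ask once for the whole plan, then again before each high-risk step."""
    log = RunLog()
    summary = "; ".join(s.description for s in steps)
    if not approve(f"Plan: {summary}"):
        log.entries.append("plan rejected; nothing executed")
        return log
    for step in steps:
        if step.kind in HIGH_RISK and not approve(f"Confirm: {step.description}"):
            log.entries.append(f"skipped (declined): {step.description}")
            continue
        # Placeholder: a real agent would act on the page here.
        log.entries.append(f"done: {step.description}")
    return log
```

Passing an `approve` function that accepts the plan but declines every "Confirm:" prompt yields a log in which the read-only steps are done and the purchase is skipped, which is the record of "what it tried and why" in miniature.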

Four reproducible scenarios to try

These hands-on workflows show how agentic browsing feels and where consent matters. They are reproducible without private tools.

1) Travel booking with constraints

  • Goal: Hold the best flight and a refundable hotel for a two-day meeting.
  • Prompt: “Plan a San Francisco to Chicago trip, depart November 12, return November 14, total budget under 900 dollars, nonstop if possible, hotel within a 10-minute walk of the West Loop.”

What happens:

  • The assistant opens two flight aggregators and two hotel sites, sets dates and filters, then extracts options into a side panel with total price comparisons including taxes.
  • It highlights trade-offs, like a cheaper evening arrival that would miss the first session, and proposes two bundles: cheapest acceptable and best-time match.
  • It asks for approval before applying loyalty numbers or logging in to place a hold. You can choose “collect screenshots and links only” if you prefer to book yourself.

Repro tip: Start with “collect quotes only” on the first run. You will see the plan, the pages used, and the extracted totals before any account interaction.
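To make the constraint handling concrete, here is a rough sketch of how extracted quotes might be shortlisted. The `Bundle` fields and the `shortlist` function are assumptions for illustration; the one design point worth noting is that "nonstop if possible" is treated as a preference that affects ordering, while budget and walking distance are hard filters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bundle:
    label: str
    flight_total: float  # taxes included
    hotel_total: float   # taxes included
    nonstop: bool
    walk_minutes: int    # hotel to venue

    @property
    def total(self) -> float:
        return self.flight_total + self.hotel_total


def shortlist(bundles: List[Bundle], budget: float, max_walk: int) -> List[Bundle]:
    """Keep bundles under budget and within walking range; nonstop first, then cheapest."""
    ok = [b for b in bundles if b.total < budget and b.walk_minutes <= max_walk]
    # False sorts before True, so nonstop bundles come first.
    return sorted(ok, key=lambda b: (not b.nonstop, b.total))
```

With the prompt above you would call `shortlist(bundles, budget=900, max_walk=10)` on whatever the quotes-only pass collected.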

2) Online returns automation

  • Goal: Initiate returns across three stores with different policies and generate prepaid labels.
  • Prompt: “Process returns for items in my last three orders, default to store credit if a cash refund is not available, deadline next Friday.”

What happens:

  • The assistant opens each retailer’s order history page, finds eligible items, and summarizes policies such as restocking fees and return windows.
  • It requests permission to proceed store by store. After consent, it fills forms, selects a reason code, and downloads labels to a “Returns” folder.
  • It creates a calendar hold labeled “Drop off packages by Thursday” with links to each label and the store’s drop-off instructions.

Repro tip: If you do not want to authorize logins, ask for a checklist with deep links to the return pages and a short policy summary you can act on yourself.

3) Sourcing and procurement for a small team

  • Goal: Find and compare three vendor quotes for a bulk hardware order, then draft a purchase order.
  • Prompt: “Source 50 units of the following part number, compare warranties and shipping times, prepare a purchase order draft for the best total cost under 15 days.”

What happens:

  • The assistant opens vendor catalogs, filters by part number, and extracts unit price, minimum order quantity, lead time, and warranty terms into a structured sheet.
  • It checks for discounts if you sign in to distributor accounts, then refreshes totals with negotiated pricing.
  • It drafts a simple purchase order in a shared document, pulling the selected vendor’s address and your billing details from a secure profile. It pauses for a final review before sending.

Repro tip: Run a read-only pass first to verify the fields collected. Then authorize distributor accounts to unlock negotiated pricing.
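The comparison step is easy to check by hand once the fields are in a sheet. A minimal sketch, assuming hypothetical `Quote` fields that mirror the ones listed above, and assuming an order below a vendor's minimum quantity gets priced at that minimum:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Quote:
    vendor: str
    unit_price: float
    min_order_qty: int
    lead_time_days: int
    shipping: float = 0.0


def best_quote(quotes: List[Quote], units: int, max_lead_days: int) -> Optional[Quote]:
    """Return the cheapest quote that delivers in under max_lead_days, or None."""
    def total(q: Quote) -> float:
        # You cannot buy fewer than the minimum order quantity,
        # so price the larger of the two quantities.
        return max(units, q.min_order_qty) * q.unit_price + q.shipping

    eligible = [q for q in quotes if q.lead_time_days < max_lead_days]
    return min(eligible, key=total, default=None)
```

Returning `None` when nothing qualifies gives the assistant a clear reason to come back and ask whether to relax the lead-time limit rather than silently picking a late vendor.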

4) Research workflow with source accountability

  • Goal: Produce a market brief that compares pricing, positioning, and adoption for three products, with citations you can audit.
  • Prompt: “Create a two-page brief comparing Product A, Product B, and Product C. Include pricing tiers, notable features, partner ecosystems, and recent public customer wins.”

What happens:

  • The assistant opens product pages, pricing pages, and recent press, then extracts comparable fields into a grid with hyperlinks back to each source.
  • It flags ambiguous claims, such as pricing that varies by seat type, and asks whether to assume a common seat count for apples-to-apples comparisons.
  • It outputs a brief in your notes app and attaches a source list with the exact pages visited.

Repro tip: Tell the assistant to stop at “evidence collection.” Inspect the grid and the source list, then approve a second pass to synthesize and write.

How Comet contrasts with ChatGPT Agent mode

Both approaches aim to move from answers to actions, but they differ in where work happens and how it is observed.

  • Execution surface. Comet works in your browser, so every step is visible in real time. ChatGPT’s Agent mode often runs in a hosted environment and shows a narrative or a replay. That gives hosted agents scale and speed, while Comet leans into transparency that mirrors how people actually browse.
  • Tool model. Hosted agent platforms emphasize direct API integrations and structured actions. Comet leans on the browser for reach, then connects to a smaller set of user-authorized apps where it matters most, such as documents, sheets, email, and calendar.
  • User control. Comet stages approvals around clear risk points like logging in, sending a message, or transacting. Hosted agents centralize policy and consent inside a workspace, which is powerful for consistency but can feel distant from the page you are looking at.

Both can be right for different jobs. If you want full transparency and page-by-page steering, agentic browsing fits. If you want high throughput in a standardized environment, hosted agents shine.

How Comet contrasts with Salesforce Agentforce

Agentforce is built for enterprise scale, governance, and deep data integrations across sales, service, and marketing.

  • Observability. Agentforce treats every agent run as a trackable process with logs, traces, and policy checks that security and compliance teams expect. Comet exposes steps in the browser and a human-friendly timeline, which suits individual knowledge workers but is lighter for regulated environments.
  • Governance. Agentforce plugs into role-based access control, approval flows, and data residency options. That matters when agents touch customer records and money. Comet’s consent model is consumer grade with clear prompts and step gates, well suited to personal and team productivity.
  • Integrations. Agentforce connects natively to data in the Salesforce stack and partner actions in its marketplace. Comet starts with the browser and a handful of high-value connectors, which makes it fast to adopt and easy to reason about.

If your primary need is personal leverage on the open web, Comet’s approach is easy to try. If your need is a policy-enforced agent fleet inside a large organization, Agentforce targets that purpose.

Safety, consent, and data access

Agentic browsing raises new questions about power and restraint. The goal is not to hide capability but to instrument it.

  • Intent before action. The assistant should summarize the plan in plain language before it acts, listing the sites it will visit and the data it intends to collect. For any action that sends data off your machine, require a second confirmation.
  • Least privilege by default. Do not request more access than the task requires. If a task can be completed in read-only mode, the assistant should avoid write or send permissions.
  • Risk-aware prompts. Purchases, messages, and authentications deserve a distinct visual treatment, not the same look as a routine click. This keeps the user’s risk antennae active.
  • Audit trail. Every run should produce a trace that shows pages visited, data captured, and actions taken. It should be easy to redact or share this trace.

With these guardrails, the assistant feels like a power tool that turns on only when you squeeze both triggers. You know when it is idle, when it is planning, and when it is acting.
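As a rough illustration of the audit-trail idea, a trace can be a list of small records plus a redaction pass that runs before sharing. The `TraceEvent` shape and the `redact` helper below are hypothetical, not a format Comet is known to use.

```python
from dataclasses import dataclass, asdict
from typing import List, Set


@dataclass
class TraceEvent:
    phase: str   # "idle", "planning", or "acting"
    action: str  # e.g. "visit", "extract", "click"
    target: str  # URL or element description
    data: dict   # values captured or sent


def redact(events: List[TraceEvent], secret_keys: Set[str]) -> List[dict]:
    """Return a shareable copy of the trace with sensitive values masked."""
    shareable = []
    for event in events:
        row = asdict(event)
        row["data"] = {
            key: ("[redacted]" if key in secret_keys else value)
            for key, value in event.data.items()
        }
        shareable.append(row)
    return shareable
```

The original events are left untouched, so the user keeps the full record while a peer sees only the masked copy.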

The ripple effects on the web

Agentic browsing will change how information is produced, discovered, and monetized. Three near-term shifts stand out.

  • SEO becomes agent optimization. Sites will invest in clearer structure and predictable flows so agents can extract facts and complete actions without failure. Expect cleaner schemas, documented checkout steps, and fewer obstructions that break automation.
  • Affiliate commerce moves upstream. If an assistant can compare options and place a hold, the affiliate click may never hit a classic landing page. Merchants will experiment with agent-friendly endpoints that carry structured offers and verify that an agent placed an order on a user’s behalf. The economic model will follow action rather than pageview.
  • Developer ecosystems turn agent-first. Instead of building only for human clicks, developers will expose small, secure actions that agents can call, such as “generate prepaid return label,” “request warranty replacement,” or “hold room for 24 hours.” These micro-actions will live alongside human interfaces and will be permissioned and auditable.
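What "permissioned and auditable" micro-actions might look like can be sketched as a small registry. Everything here is an assumption for illustration: the `MicroAction` and `ActionRegistry` names, the scope strings, and the in-memory audit list stand in for whatever a real site would expose.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class MicroAction:
    name: str
    required_scope: str
    handler: Callable[[dict], dict]


class ActionRegistry:
    def __init__(self) -> None:
        self._actions: Dict[str, MicroAction] = {}
        self.audit: List[dict] = []

    def register(self, action: MicroAction) -> None:
        self._actions[action.name] = action

    def call(self, name: str, args: dict, granted: Set[str]) -> dict:
        """Run an action only if the caller holds its scope; log every attempt."""
        action = self._actions[name]
        allowed = action.required_scope in granted
        self.audit.append({"action": name, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"{name} needs scope {action.required_scope}")
        return action.handler(args)
```

Denied calls are logged before the error is raised, so the audit list shows attempts as well as successes.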

This direction aligns with other moves we have covered, including the Vertex AI Agent Engine leap toward real runtime behavior and the Microsoft Security Store approach for orchestrated agent teams in high-stakes environments.

What builders should do now

You do not need to wait for a standard. There are steps you can take this quarter.

  • Publish structure. Add machine-readable catalogs, availability, and price breakdowns. Make checkout tolerant of automation by minimizing unexpected modals and scripts that block predictable flows.
  • Expose safe actions. Create scoped endpoints for low-risk tasks that do not require full accounts. Start with quotes, room or appointment holds, returns, and document generation.
  • Treat consent as design. If your site detects an agent at work, respond with clear prompts that summarize what it is asking to do. Present a one-click gate for actions that carry higher risk.
  • Provide receipts for agents. After a successful automated action, return a structured receipt that the assistant can store and display in the user’s timeline.
  • Instrument and measure. Track where agents fail inside your flows. Fix brittle steps. Consider an internal dashboard of agent success rates and common error states.
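A structured receipt can be as simple as a small JSON document. The field names below are hypothetical, since no standard exists yet; the point is that the assistant gets something it can store and display without parsing a confirmation page.

```python
import json
from datetime import datetime, timezone


def make_receipt(action: str, reference: str, on_behalf_of: str, details: dict) -> str:
    """Build a JSON receipt for a completed automated action."""
    receipt = {
        "action": action,              # e.g. "generate_return_label"
        "reference": reference,        # the merchant's own confirmation id
        "on_behalf_of": on_behalf_of,  # the user the agent acted for
        "completed_at": datetime.now(timezone.utc).isoformat(),
        "details": details,
    }
    return json.dumps(receipt, sort_keys=True)
```

Sorting the keys keeps receipts byte-stable for a given payload and timestamp, which makes them easier to diff in a timeline or a log.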

A quick start playbook for teams

If you lead a product or operations team, pilot an agentic workflow in two weeks with this approach:

  1. Pick a narrow task with measurable outcomes. Examples: collect three competitor quotes, assemble a weekly ops report, or prep a travel bundle under a budget cap.
  2. Define consent gates before the pilot starts. Mark actions that require a second confirmation, such as login, purchase, and message send.
  3. Run read-only passes for the first two days. Validate that the assistant sees the right fields and respects your constraints.
  4. Turn on account access on day three, once the data looks clean. Use a non-production account if possible.
  5. Measure time saved, error rate, and rework against your manual baseline. Keep a simple rubric: success, partial success, failure with reason.
  6. Expand by one adjacent task only after the pilot meets your bar two runs in a row.

Metrics that matter

To avoid vanity demos, track a small set of metrics that capture real value:

  • Task completion rate: percent of runs that finish without human rescue.
  • Time to first result: minutes from prompt to usable draft or decision.
  • Consent friction: number of prompts per completed task and user satisfaction with those prompts.
  • Error taxonomy: top three failure modes by frequency, with fixes in flight.
  • Trace quality: percentage of runs with reviewable, shareable logs that a peer can audit in under five minutes.
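Several of these metrics fall out of a simple per-run record. The sketch below assumes a hypothetical `Run` shape and computes completion rate, consent prompts per completed task, and the top three failure reasons; satisfaction and trace quality need human review and are left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    outcome: str  # "success", "partial", or "failure"
    minutes_to_first_result: float
    consent_prompts: int
    failure_reason: Optional[str] = None


def summarize(runs: List[Run]) -> dict:
    """Aggregate pilot runs into the headline numbers."""
    completed = [r for r in runs if r.outcome == "success"]
    reasons = Counter(r.failure_reason for r in runs if r.failure_reason)
    return {
        "completion_rate": len(completed) / len(runs) if runs else 0.0,
        "avg_prompts_per_completed": (
            sum(r.consent_prompts for r in completed) / len(completed)
            if completed else 0.0
        ),
        "top_failures": [reason for reason, _ in reasons.most_common(3)],
    }
```

Only runs marked "success" count toward completion, which matches the "without human rescue" definition; partial runs still contribute their failure reason to the taxonomy.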

Limitations and open questions

No tool is magic. Comet depends on site structure, sign-in patterns, and the predictability of page flows. Anti-bot defenses can block automation even when the user is present and consenting. Some tasks still benefit from hosted agents that call first-party APIs at speed. Data residency and regulatory needs may push enterprises toward centrally governed platforms. And there is a cultural question: how comfortable do teams feel when a visible agent operates inside the same tabs where they work every day?

These are solvable problems. Clear affordances for control, consistent consent patterns, and better structured actions will raise reliability. As sites expose micro-actions alongside human interfaces, the line between browsing and transacting will blur in a good way.

The bottom line

Comet’s background assistant is not just a feature; it is a new place for software to live. When work happens in the browser you already trust, oversight comes naturally. You see the page, the plan, and the click. That feels safer to try, easier to adopt, and faster to teach.

Hosted agent platforms will keep pushing the envelope on speed and depth, while Comet pushes transparency and in-tab control. Together they are turning the web into a work surface, not just a reading surface. The next year will be decided by consent design, by who exposes the right micro-actions, and by which teams make agents verifiably helpful without being opaque.

Conclusion

Every technology era has a moment when ideas become infrastructure. Comet’s background assistant makes that moment visible. You can watch it in a tab, stop it, steer it, and accept or decline the final move. That is the shape of the agentic web: work that is done for you, in front of you. The browser becomes the workbench, the agent becomes the apprentice, and the next job you hand off will teach the next thousand.
