Zoomtopia 2025: AI Companion 3.0 makes meetings agent hubs
At Zoomtopia 2025, Zoom moved AI Companion 3.0 from a chat sidebar to an in-meeting operator. Learn how voice becomes the control plane, agents execute across apps, and leaders can pilot real workflows now.

The microphone becomes the command line
Zoom used Zoomtopia 2025 to make a simple argument to the enterprise. Voice and video are not just how people talk. They are the new control plane for software. With AI Companion 3.0, Zoom is moving artificial intelligence from a sidebar assistant into the center of live work. If the most important decisions are already made in meetings and calls, then agents should act inside those moments rather than waiting for a ticket or a follow up nudge.
Zoom paired that claim with product detail. The company announced an agentic work surface, outcome focused prompts, and support for third party tools and protocols that let agents execute tasks across systems, not just summarize them. The official Zoom newsroom post lays out the feature set and the intent behind it. You can read the full description in the announcement titled “Zoom unveils AI Companion 3.0 at Zoomtopia 2025.” Read the primary announcement.
This is more than a new assistant. It is a change in where automation lives. Instead of opening a separate bot, people speak or type in the flow of a live conversation and the agent handles the heavy lifting across calendars, documents, tickets, and records. Think of it as moving from sticky notes after the meeting to a production line that starts while people are still talking.
We have been moving toward this shift for a while. Chat first workflows showed how text can become a console for work, as teams discovered that chat becomes the enterprise command line. The next step is obvious. If the conversation is where intent is created, then the microphone should be a command line too.
What AI Companion 3.0 actually does
AI Companion 3.0 presents a single work surface across the Zoom Workplace desktop app and the browser. The surface pulls context from meetings, chats, documents, and calls. It accepts outcome requests such as prepare me for this meeting, free up my time, or chase these follow ups. The language matters. These are not chat prompts that return paragraphs. These are intents that enlist tools.
In practice, the agent can schedule, send materials, update a pipeline, file a ticket, or kick off a change while participants are still in the meeting. The meeting becomes a cockpit. The agent has dials and switches connected to other systems and it can operate them in real time.
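To make the distinction between prompts and intents concrete, here is a minimal sketch of how an outcome intent might be routed to tool calls rather than to generated prose. All names here, including `IntentRouter` and `prepare_for_meeting`, are illustrative assumptions, not part of any published Zoom API.

```python
# Hypothetical sketch: routing an outcome intent to tool actions
# instead of returning paragraphs. Names are assumptions for illustration.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Intent:
    name: str                                      # outcome verb, e.g. "prepare_for_meeting"
    context: dict = field(default_factory=dict)    # transcript, attendees, records, etc.


class IntentRouter:
    """Maps outcome intents to registered tool handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Intent], List[str]]] = {}

    def register(self, name: str, handler: Callable[[Intent], List[str]]) -> None:
        self._handlers[name] = handler

    def dispatch(self, intent: Intent) -> List[str]:
        handler = self._handlers.get(intent.name)
        if handler is None:
            return [f"no tool registered for intent '{intent.name}'"]
        return handler(intent)


def prepare_for_meeting(intent: Intent) -> List[str]:
    # A real handler would call calendar, CRM, and document tools.
    attendees = intent.context.get("attendees", [])
    return [f"pulled notes for {name}" for name in attendees]


router = IntentRouter()
router.register("prepare_for_meeting", prepare_for_meeting)
actions = router.dispatch(
    Intent("prepare_for_meeting", {"attendees": ["Ana", "Ben"]})
)
```

The point of the sketch is the shape of the interface: the spoken outcome resolves to a named intent, and the intent resolves to actions, with unknown intents failing loudly rather than guessing.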
Two design choices stand out for enterprises:
- The agent is cross modal. You can talk to it in a meeting, type to it in chat, and watch it act across apps. That reduces context loss between channels.
- The agent is cross platform. Zoom describes support for tools and protocols that let customers connect their own systems so the agent can act where your data and policies live.
The headline for leaders is straightforward. With AI Companion 3.0, the conversation surface becomes the operations surface. The payoff is not better meeting summaries. It is faster, more compliant execution of routine work triggered by speech.
From chat bots to live workflows
Most corporate bots began life in chat. They answered questions or filled forms in a message thread. That was fine for basic tasks, but it forced users to leave the moment where the need arose. AI Companion 3.0 drags the agent into the moment itself.
Consider a customer escalation call that starts at 9 a.m. The project lead says, Let us move the shipment to next Tuesday, add two onsite technicians, and send a revised statement of work to legal. A chat bot would log these and hand them to someone after the call. Zoom’s design aims for a different result. While the call is ongoing, the agent checks technician calendars, opens a change order in the project tool, generates the revised statement of work with the legal template, and drafts the customer email for the account manager to approve before the call ends. Meeting time becomes execution time.
Another example from internal operations. A hiring manager says in a panel debrief, Let us advance candidate three, schedule a loop, and prepare an offer band analysis. The agent pulls the interview notes, selects panelists based on availability and role, schedules the loop, and populates a compensation worksheet using the company’s policy document. No handoffs. No delay. Less error.
These scenarios sound simple, but they depend on plumbing that is often missing. Enterprises need identity, permissions, routing, and well defined action interfaces across many systems. Zoom’s announcements point toward those layers, including support for configuration of tools for customer and virtual agents, along with connective tissue that lets one agent talk to another and act on their behalf. The less glamorous parts of automation are being brought into the meeting room.
Zoom Business Services and Zoom Virtual Agent grow up
Zoom Business Services is Zoom’s suite that covers customer experience, revenue workflows, and events. At Zoomtopia, the company highlighted upgrades to this stack and to Zoom Virtual Agent. The interesting twist is what happens when those capabilities sit in the same place where your sales calls, onboarding sessions, and support reviews already run.
- Customer experience leaders can pair live agents with virtual ones that have automated quality checks, real time guidance, and the ability to close tasks during the interaction, not after it. A virtual agent can start a return, verify identity, and pre fill forms while the human agent answers higher value questions.
- Revenue teams get agent led prospecting that can pick up signals from events, generate outreach, and book first meetings. When the first call happens, the same intelligence is already in the meeting, ready to pull content, update the record, and set follow ups.
- Events and webinars get an attendee panel that lets late joiners catch up instantly without interrupting the presenter. That cuts friction that usually leads to missed details and follow up emails.
Zoom details these Business Services upgrades in a companion newsroom post. It is the second source worth reading if you are evaluating the full stack beyond meetings. See Business Services innovations.
Bring your own voice is about brand control, not novelty
Voice is not just sound. For a business, it is a brand asset. Zoom’s bring your own voice capability for Zoom Virtual Agent lets companies use a voice that matches their identity. In practice, this is a governance feature in creative clothing.
Here is why. When your automated receptionist speaks to a customer, a mismatch between tone and brand is jarring. Bring your own voice lets enterprises license or supply a voice model that matches expectations and apply it across touchpoints. That reduces the risk of inconsistent experiences across phone, chat, and meeting follow ups. It also makes it more likely that stakeholders approve voice first automation because marketing and legal teams can sign off on a single voice profile rather than chase down dozens of ad hoc decisions.
Bring your own voice also has a workplace safety angle. In industries that require clear call recordings and human like interactions, a consistent voice model helps quality managers review interactions reliably. It also makes it easier to train human agents with shadow sessions that pair them with a virtual agent that sounds like the company, not like a demo.
How this compares to broader agent trends
The shift Zoom describes aligns with a larger pattern. Tools are moving closer to the moment of intent. We have seen browser based agents that can operate across applications, as explored in our review of agents in your browser. We have also seen platform teams consolidate the components of an agent stack into reusable kits, such as a standard stack for enterprise agents. Zoom’s move brings those ideas into the meeting itself, where much of the world’s work already happens.
What this unlocks in 2026
If this year is about moving agents into live workflows, next year is about adoption at scale. Several things change when voice and video become the control plane.
- Adoption. Users do not have to learn a new place to ask for help. They speak in the tools they already use. That removes a common barrier to automation projects, which is change management fatigue.
- Data quality. Actions taken during live conversations are grounded in the freshest context. Notes, decisions, and task assignments captured while people are present are less prone to error. That pays off across customer experience metrics and internal productivity.
- Speed. When the agent sits in the meeting or the call, the loop between request and action collapses. That shift alone can reclaim hours per person per week in coordination time.
- Compliance. By centralizing agent actions inside a governed platform, enterprises can log who asked for what, what the agent did, and what approvals were applied. That makes audits simpler than trying to piece together bot trails scattered across departments.
The shape of automation also changes. Instead of dozens of departmental bots that operate in silos, a few shared agents can act across departments because everyone meets in the same place. That consolidation reduces integration spend and lowers the cognitive load for employees who would otherwise juggle multiple bot interfaces.
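The compliance point above hinges on one record shape: who asked for what, what the agent did, and which approvals applied. A minimal sketch of such an audit record, assuming JSON-style logging (the field names are assumptions, not a Zoom schema):

```python
# Illustrative audit record for agent actions; field names are assumptions.
import json
from datetime import datetime, timezone


def audit_record(requester: str, intent: str,
                 actions: list, approvals: list) -> str:
    """Serialize one agent action into an auditable log line."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "requester": requester,     # who asked
        "intent": intent,           # what they asked for
        "actions": actions,         # what the agent did
        "approvals": approvals,     # which sign-offs were applied
    })


record = audit_record(
    requester="ana@example.com",
    intent="reschedule the on site deployment",
    actions=["updated calendar", "notified field team"],
    approvals=["ops-manager"],
)
```

Because every action flows through one governed platform, a single record format like this replaces the scattered bot trails the audit would otherwise have to reconstruct.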
A practical evaluation playbook for leaders
If you lead technology, operations, or customer experience, start with a short plan to make sense of this shift.
- Map decisions that happen live. Inventory the recurring meetings and calls where decisions translate to tasks. Examples include weekly sales pipeline reviews, incident triage calls, product launch standups, and partner escalations. These are high value targets because the payoff from same day execution compounds.
- Define outcome verbs, not prompts. For each meeting type, list two or three outcome phrases that agents should understand, such as prepare me for this meeting, reschedule the on site deployment, or generate and route the updated agreement. This improves reliability because you are designing for the enterprise’s recurring actions instead of free form chat.
- Connect the first three systems. Pick the systems that show up in the most decisions. For many companies this is calendar, ticketing, and customer relationship management. Connect those to the agent’s tool layer so the first wave of actions is end to end.
- Decide who approves what. Create a simple approval map. For example, the agent can schedule and send internal docs without approval, but it must request human sign off before sending external emails or moving a sales stage. Keep the rules short so people learn them.
- Pilot in one recurring meeting. Start with a team that meets often and owns clear outcomes. Introduce the agent as part of the agenda. Assign someone to watch for false positives and to refine the outcome phrases.
- Measure speed to action. Do not measure only sentiment or usage. Track the time from request to completion for the top three tasks. This is the most honest indicator of value.
- Add identity and recording policies. Decide which meetings allow the agent to listen and act, and what it is allowed to store. Make the policy readable and visible before each meeting so trust is built on purpose, not by accident.
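The approval map in the playbook can literally be a short table. Here is a minimal sketch, assuming three or four action classes; the action names and the auto-versus-approval split are examples, not a prescribed policy:

```python
# Hypothetical approval map; action names and rules are illustrative.
APPROVAL_RULES = {
    "schedule_internal_meeting": "auto",       # agent acts without sign-off
    "send_internal_doc": "auto",
    "send_external_email": "human_approval",   # requires sign-off first
    "move_sales_stage": "human_approval",
}


def requires_approval(action: str) -> bool:
    """Unknown actions default to requiring approval (fail closed)."""
    return APPROVAL_RULES.get(action, "human_approval") == "human_approval"
```

Keeping the map this small is the point: a rule set people can recite is a rule set people will actually follow.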
Architecture, in human terms
A useful mental model for this stack has three layers.
- Context layer. Meeting audio, transcripts, chat, shared documents, and relevant system records. This is where the agent learns what is happening. Preserving context is what makes it competent during live conversations.
- Tool layer. Connectors to apps that do work. Calendars, project systems, customer records, identity checks. This is where the agent stops being a writer and becomes a doer. Zoom has signaled support for standards and protocols that make these tools configurable by customers, which matters because every enterprise has a unique mix.
- Policy layer. Identity, permissions, approvals, logging, and data retention. This is where the enterprise makes the agent safe. Without this layer, adoption will stall.
The art is in keeping the policy layer predictable while allowing the tool layer to evolve. That means favoring systems with clear action interfaces and using approval patterns that are easy to teach. A good litmus test is whether a new hire can explain, in their first week, what the agent can or cannot do in a given meeting.
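The three layers can be sketched as a small set of typed records, with the policy layer wrapping every tool call so permissions and logging cannot be bypassed. This is a mental model only, under assumed names; it is not Zoom's implementation.

```python
# Illustrative three-layer model: context, tools, policy. Names are assumptions.
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Context:
    """Context layer: what the agent knows right now."""
    transcript: str = ""
    records: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Tool:
    """Tool layer: a named action with a clear interface."""
    name: str
    run: Callable[[Context], Any]


@dataclass
class Policy:
    """Policy layer: permissions and logging wrapped around every action."""
    allowed: frozenset
    log: list = field(default_factory=list)

    def execute(self, tool: Tool, ctx: Context):
        if tool.name not in self.allowed:
            self.log.append(("denied", tool.name))
            return None
        self.log.append(("ran", tool.name))
        return tool.run(ctx)


schedule = Tool("schedule", lambda ctx: "meeting booked")
send_email = Tool("send_external_email", lambda ctx: "sent")
policy = Policy(allowed=frozenset({"schedule"}))
ctx = Context(transcript="move the review to Tuesday")
```

Note the asymmetry: tools can be added freely, but nothing runs without passing through `Policy.execute`. That is what keeps the policy layer predictable while the tool layer evolves.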
Risks and how to govern them
No platform shift comes without risk. The right response is to name the risks and set explicit controls.
- Leaky context. If the agent hears too much, it could act on the wrong trigger or leak sensitive details into a follow up email. Control this by limiting listening to specific meeting types and by requiring a visible confirmation before the agent sends anything to an external recipient.
- Over eager actions. Agents that are allowed to change records without oversight can create expensive mistakes. Use a simple rule. The agent can prepare and stage actions during a meeting, but it needs human approval to commit anything that touches a customer or a contract.
- Tool sprawl. Every new connector expands the attack surface. Create a registry of approved tools. New connections require a data owner, a security review, and a fallback plan if the tool is down.
- Hidden labor. If the agent becomes a silent teammate, people may not notice how much it is doing until it fails. Build visible feedback into meetings. When the agent completes a task, have it announce the result to the group and post a summary in the channel.
These controls are not novel, but applying them to voice first automation is new for many organizations. The reward for doing this well is more reliable automation where it is needed most, inside the moments when decisions are made.
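The tool-sprawl control above implies a concrete artifact: a registry where every connector carries an owner, a review status, and a fallback. A minimal sketch, with field names that are assumptions rather than any vendor's schema:

```python
# Hypothetical connector registry per the controls above; fields are assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorEntry:
    name: str
    data_owner: str          # accountable person or team
    security_reviewed: bool  # has the connection passed review?
    fallback: str            # what happens if the tool is down


class ConnectorRegistry:
    """Only reviewed connectors may be approved for agent use."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def approve(self, entry: ConnectorEntry) -> None:
        if not entry.security_reviewed:
            raise ValueError(f"{entry.name}: security review required")
        self._entries[entry.name] = entry

    def is_approved(self, name: str) -> bool:
        return name in self._entries


registry = ConnectorRegistry()
registry.approve(ConnectorEntry(
    name="crm",
    data_owner="sales-ops",
    security_reviewed=True,
    fallback="queue writes for replay",
))
```

The registry does not make the review happen; it makes skipping the review impossible to hide.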
What to watch next
A few signals will tell you how fast this shift is moving in 2026.
- First, watch for native outcome catalogs inside meeting templates. If the meeting owner can pre select outcomes for the agent before a call starts, adoption will jump.
- Second, look for cross vendor cooperation on tool configuration. Standards that let enterprises bring their own skills and security policies will decide how deep these agents can go.
- Third, expect more voice options. Bring your own voice will not stop at branded tones. It will include role based voices for legal, finance, or support. That will reduce confusion in recordings and help teams scan transcripts more quickly.
- Fourth, track integrations that let agents work across other conferencing platforms while respecting consent. Many companies already run mixed stacks. The winners will honor that reality.
The bottom line
Zoom started as the venue for the talk. It is becoming the venue for the work. AI Companion 3.0 aims to turn the live conversation into the first and best place to automate the next step. When the microphone becomes the command line, meetings stop being a cost center and start behaving like a factory for outcomes. That is what will make or break adoption next year. The companies that articulate their outcomes, wire the right tools, and set simple, human policies will find themselves with a new advantage. They will not just meet faster. They will finish faster.
