The Roommate Test: When AI Moves In and Rules the House

October: AI crossed the threshold

Two announcements in October turned the smart home into the intelligent home. On October 1, 2025, Google began pushing a redesigned Home app powered by Gemini for Home, a homewide intelligence that runs across Nest speakers, displays, cameras, and the app itself. Google framed it as a shift from a transactional assistant to a collaborative presence in the house, one that understands context and chains tasks instead of waiting for rigid commands. That is not a cosmetic tweak; it is a new governor for the household. You can read the details in the Gemini for Home launch.

Eight days later, on October 9, 2025, Figure AI revealed Figure 03, a third-generation humanoid designed for living spaces, with soft textiles over joints, layered safety in the battery system, and hands with cameras in the palms for fine manipulation. The company’s pitch is not a lab demo. It is a roadmap to mass production and chore-level competence. See the Figure 03 announcement for the hardware and manufacturing plan.

Put those moves together and a new reality emerges. We are not only telling lights to dim or asking for a weather report. We are inviting ambient software to reason about our routines and embodied hardware to act on that reasoning. This convergence is also changing the software contract in our homes. For a deeper look at that shift, see our piece on the new software social contract.

The Roommate Test

Imagine an actual roommate with access to your doors, cameras, appliances, and calendars. Would you let them walk into your child’s room at night, unlock the back door for a delivery at 2 a.m., or delete last week’s camera clips without asking? If the answer is no, then the same answer should hold for software and robots. That is the Roommate Test: any capability you would not grant a conscientious human roommate should be gated, logged, or disabled for a domestic AI.

The test sounds simple. In practice it forces us to design governance. Homes are not single-user devices. They are multi-stakeholder environments with parents, roommates, children, caregivers, pets, landlords, contractors, and guests. Each has rights, risks, and roles. A home agent that lumps everyone into one generic "user" will fail the Roommate Test in the first week.

Four control planes at home

A helpful way to think about domestic AI is to map four control planes where most risks and frictions live.

Doors and boundaries

Access is the first risk. If an ambient agent can unlock doors or open a garage, it must know when not to. The rule is not simply yes or no. It is conditional authority.

Unlock for a trusted delivery window, but only after a camera check.
Never unlock for a voice that fails a household liveness test.
Lock out unknown faces after 10 p.m.
Treat digital boundaries, such as family calendars and shared photo albums, as doors with their own rules.

If a request crosses a trust boundary, the agent should pause, describe the boundary, and seek explicit approval.
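
The door rules above can be sketched as a small decision function. This is a minimal illustration, not any vendor's API: the field names, the 6 a.m. end of the curfew, and the three-way allow / deny / ask outcome are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # pause, describe the boundary, seek explicit approval


@dataclass
class UnlockRequest:
    at: time                  # local time of the request
    voice_live: bool          # passed the (hypothetical) household liveness test
    face_known: bool          # face matches someone enrolled in the household
    in_delivery_window: bool  # inside a window an administrator approved
    camera_check_ok: bool     # camera confirmed the expected delivery


CURFEW_START = time(22, 0)  # "after 10 p.m." from the rules above
CURFEW_END = time(6, 0)     # assumed end of the curfew; not in the original


def decide_unlock(req: UnlockRequest) -> tuple[Decision, str]:
    """Return a decision plus a plain-language reason."""
    if not req.voice_live:
        return Decision.DENY, "Voice failed the household liveness test."
    after_curfew = req.at >= CURFEW_START or req.at < CURFEW_END
    if not req.face_known and after_curfew:
        return Decision.DENY, "Unknown face after 10 p.m."
    if req.in_delivery_window and req.camera_check_ok:
        return Decision.ALLOW, "Trusted delivery window and camera check passed."
    # Anything else crosses a trust boundary: do not guess, ask.
    return Decision.ASK, "This request crosses a trust boundary; approval needed."
```

Note that the 2 a.m. delivery from the Roommate Test is denied here even inside a delivery window, because the curfew rule is checked first.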

Cameras and sensing

Cameras are not just for security anymore. With video search and descriptive alerts, the home agent will see events and act on them. Default governance should set retention windows, redaction options for children, and sensitive zones such as bathrooms or bedrooms. There should be a hard off switch for all microphones and cameras that cuts power at the device level, plus a visible indicator when any recording is active. Silent surveillance fails the Roommate Test.

Chores and actuators

Thermostats, vacuums, ovens, laundry, and lights seem low risk until exceptions matter. If a robot moves a space heater or opens an oven while a toddler is nearby, you have a safety problem. Actuation should respect presence and proximity. The baseline rule: no high-energy actions when a small child or pet is within a one-meter radius. For robots with grasping hands, set limits on grip force and speed by default, then allow temporary overrides with two-adult consent for rare tasks.
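
As a rough sketch, the one-meter baseline could be checked like this before any high-energy action. The occupant labels and the flat x/y coordinates are assumptions for illustration; a real system would rely on its own presence sensing.

```python
import math
from dataclasses import dataclass

SAFE_RADIUS_M = 1.0  # the one-meter baseline from the text


@dataclass
class Occupant:
    kind: str  # "adult", "child", or "pet" (hypothetical labels)
    x: float   # position in meters, in the room's coordinate frame
    y: float


def may_actuate(high_energy: bool,
                device_xy: tuple[float, float],
                occupants: list[Occupant]) -> bool:
    """Block high-energy actions while a child or pet is within the radius."""
    if not high_energy:
        return True
    for person in occupants:
        if person.kind in ("child", "pet"):
            distance = math.dist(device_xy, (person.x, person.y))
            if distance <= SAFE_RADIUS_M:
                return False
    return True
```

For example, an oven at `(0.0, 0.0)` would stay closed while a toddler stands at `(0.5, 0.0)`, and could open once the child has moved three meters away.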

Children’s spaces

A child’s room is a different jurisdiction inside the home. Devices here should run in child-safe modes by default. The agent should refuse to process commands that share a child’s live location outside the household. Content filters should apply to ambient audio. For embodied systems, create a no-enter zone that requires an explicit adult unlock, time-boxed and logged.

Write a household charter

Governance starts with a document. Call it a household charter. It is not a legal contract. It is a set of rules the agent can read, enforce, and explain. If you want the theory behind this approach, read our guide, "From prompts to permissions." In the home, the charter becomes a practical constitution that turns values into policies the system can obey.

1) Roles, rights, and vetoes

Primary administrators: typically the adults or leaseholders. They set global policies, approve new devices, and hold a physical recovery key.
Members: full-time occupants with personalized profiles and access to shared spaces.
Guests: temporary occupants with capability-limited accounts that expire automatically.
Service roles: cleaners, dog walkers, contractors, babysitters. Each gets a narrow time-bound permission set.
Veto rule: any administrator can block a high-risk action. When vetoed, the agent must explain why the action was paused and how to proceed safely.

2) Capability caps

Physical caps: maximum speed, torque, and grip force for robots; maximum temperature for stoves or space heaters when controlled by the agent. Caps should vary by room. In a nursery, a robot should default to near-zero motion and require administrator unlock for any motion above a crawl.
Data caps: retention windows for video and audio; redaction policies; default opt-out from sending home video to cloud training unless all administrators consent and children’s media is automatically blurred.
Decision caps: categories the agent cannot act on without human confirmation. Examples include unlocking doors for unknown faces, disabling a smoke alarm, deleting camera clips, or lowering a thermostat below a health threshold for infants or elderly occupants.
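
One way to make these caps machine-readable is a small table the agent consults before it moves or acts. The numbers and action names below are placeholders invented for the sketch, not published limits for any robot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalCaps:
    max_speed_m_s: float
    max_grip_n: float


# Illustrative numbers only; real caps would come from the device maker.
DEFAULT_CAPS = PhysicalCaps(max_speed_m_s=0.8, max_grip_n=30.0)
ROOM_CAPS = {
    "nursery": PhysicalCaps(max_speed_m_s=0.05, max_grip_n=5.0),
    "kitchen": PhysicalCaps(max_speed_m_s=0.4, max_grip_n=20.0),
}

# Decision caps: categories that always need a human confirmation.
NEEDS_CONFIRMATION = {
    "unlock_for_unknown_face",
    "disable_smoke_alarm",
    "delete_camera_clips",
}


def clamp_motion(room: str, speed: float, grip: float) -> tuple[float, float]:
    """Clamp a requested speed and grip force to the room's caps."""
    caps = ROOM_CAPS.get(room, DEFAULT_CAPS)
    return min(speed, caps.max_speed_m_s), min(grip, caps.max_grip_n)


def requires_human(action: str) -> bool:
    """True when the action falls under a decision cap."""
    return action in NEEDS_CONFIRMATION
```

Keeping the caps in data rather than in code means an administrator unlock can swap in a temporary table without touching the control logic.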

3) Proof and explanation

Every high-risk action requires a pre-action explanation and a post-action receipt. Before: the agent states, "I see a delivery arriving and will unlock the front door for 40 seconds after identity check." After: a log with time, sensors, faces recognized or obscured, and who approved it.
The agent must justify decisions in plain language. If it refuses an action, it should say why and how to request an exception.
Logs should be local-first and human-readable. For a deeper dive on how logs teach systems to improve, see our piece "Post-incident logs teach why."
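
A receipt along these lines might look like the following sketch. The field names are hypothetical; the point is that one record can be written both as a machine-parseable log line and as a sentence a household member can scan.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Receipt:
    timestamp: str      # ISO 8601, local time
    action: str
    explanation: str    # the pre-action statement, kept verbatim
    sensors: list[str]
    faces: list[str]    # names, or "obscured" when redacted
    approved_by: str


def to_log_line(receipt: Receipt) -> str:
    """Serialize one receipt as a single JSON line for a local log file."""
    return json.dumps(asdict(receipt), sort_keys=True)


def to_plain_text(receipt: Receipt) -> str:
    """Render the same receipt as a sentence a person can read quickly."""
    return (f"{receipt.timestamp}: {receipt.action} "
            f"(approved by {receipt.approved_by}). {receipt.explanation}")
```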

4) Recovery drills

A physical kill switch that cuts power to actuators and radios, tested monthly.
An offline protocol that keeps door locks, thermostats, and essential lights working without cloud access.
A laminated recovery card with a one-time code that reclaims admin control if accounts are locked out.

Store the charter in the home app and on a local hub. Treat it like a living document that the agent can reference and enforce.

Guest mode that actually works

Most homes host people who are neither strangers nor administrators. Think grandparents, babysitters, and friends staying for the weekend. Guest mode should mirror a hotel keycard, only smarter.

Time-boxed access by default. A guest profile expires at checkout time, with grace for flight delays.
Spatial limits. Give a guest the front door and the guest bathroom, not the nursery or office.
Purpose-built bundles. A babysitter bundle can control lights, thermostat, and television, can initiate a call to parents, and cannot see camera history. A contractor bundle opens the garage and disables a specific interior camera while work is in progress, with a chaperone camera still recording the entryway.
Voice and face checks. The agent should be polite but skeptical. A guest voice command that touches a secure capability should prompt a secondary factor on the host’s phone.
One-taps for hosts. The home app should surface "Grant entry for 15 minutes" and "Extend stay by 24 hours" as first-class controls, plus a single toggle to revoke all guest access at once.
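
The keycard idea can be sketched as a grant that carries a bundle and an expiry. Bundle contents follow the examples above; the capability names and the class itself are assumptions for illustration, not part of any existing home platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Bundles from the text, expressed as hypothetical capability names.
BUNDLES = {
    "babysitter": {"lights", "thermostat", "television", "call_parents"},
    "contractor": {"garage_door", "pause_interior_camera"},
}


@dataclass
class GuestGrant:
    name: str
    bundle: str
    expires_at: datetime
    revoked: bool = False

    def allows(self, capability: str, now: datetime) -> bool:
        """Allowed only if unrevoked, unexpired, and inside the bundle."""
        if self.revoked or now >= self.expires_at:
            return False
        return capability in BUNDLES[self.bundle]

    def extend(self, hours: int = 24) -> None:
        """The 'Extend stay' one-tap control."""
        self.expires_at += timedelta(hours=hours)


def revoke_all(grants: list[GuestGrant]) -> None:
    """The single toggle that revokes every guest at once."""
    for grant in grants:
        grant.revoked = True
```

Because camera history is simply absent from the babysitter bundle, the default answer to any capability not listed is no.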

Offline is the new status good

For a decade the smart home got smarter by sending everything to the cloud. That trade can no longer be the default. The next marker of a high-quality home is local-first capability. When the internet hiccups, doors should still open, alarms should still arm, and robots should still navigate safely.

Why local-first matters now

Latency. Fine motor control and safety checks need millisecond decisions. A robot should not wait for a round trip to a data center to stop before it bumps a toddler.
Privacy. Camera footage of bedrooms and children should be processed at the edge whenever possible. Cloud uploads should be opt-in and obvious.
Resilience. Storms still happen. So do server outages. Essential routines must continue.

What to ask for and how to set it up

Demand an on-premises brain. A home hub with a capable neural engine should run basic perception and control models locally, then selectively call the cloud for heavy tasks. Choose devices that advertise fully offline routines for locks, thermostats, and presence detection.
Turn on an offline test day. Once a quarter, schedule a three-hour internet-off window. Walk the house and note what breaks. Fix it. Your agent should report which capabilities degraded and how to harden them.
Keep a small playbook of local automations. Examples include "if smoke is detected, unlock all doors and turn on hallway lights" and "if a water leak is detected, cut power to the washer and alert two administrators."
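
A local playbook can be as plain as a lookup table stored on the hub. In this sketch the event and action names are invented, and the chime fallback when alerts cannot be pushed is an assumption added for the example, not something the list above prescribes.

```python
# Event name -> ordered list of local actions. All names are hypothetical.
LOCAL_PLAYBOOK = {
    "smoke_detected": ["unlock_all_doors", "hallway_lights_on"],
    "water_leak_detected": ["cut_washer_power", "alert_two_administrators"],
}


def handle_event(event: str, cloud_available: bool) -> list[str]:
    """Return the actions to run; the lookup never depends on the cloud."""
    actions = list(LOCAL_PLAYBOOK.get(event, []))
    if not cloud_available and "alert_two_administrators" in actions:
        # Assumed fallback: sound a local chime when push alerts cannot be sent.
        actions.append("sound_local_chime")
    return actions
```

An offline test day then reduces to calling each entry with the network off and noting which actions fail to run.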

Local-first used to be a nerdy preference. As embodied systems enter the kitchen, it becomes the luxury setting that signals quality.

Practical alignment in the kitchen

The alignment conversation often sounds abstract. In a house, it is a toaster away from concrete. Here is what practical alignment looks like at home.

Incentives without weird hacks. Reward a robot for a clean floor, but penalize it if it pushes toys under the couch to inflate a cleanliness score. Define the metric as visible area plus a random audit that checks under furniture.
Respect for soft norms. A robot that always takes the shortest path may cut between people in conversation. Teach social navigation. The agent should detour around clusters of people and wait at doorways.
Uncertainty as a feature. If the system is not confident about a command that could cause harm, it should ask: "I am 62 percent confident you said unlock the back door. Please confirm on your phone." Do not optimize that question away.
Conflict resolution. Parents say lights out; a teenager says five more minutes. The household charter sets priority rules and the agent explains them, then offers a compromise such as a two-minute dim countdown.
Measurable safety caps. Publish the maximum force, speed, and heat a system can produce under normal operation, and make those numbers checkable in the app. If a robot spikes above the cap, it logs the event and slows to safe mode until an adult reviews.
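
To make the first item concrete, here is one possible shape for a cleanliness metric with a random audit. The penalty weight and the way hidden spots are represented are assumptions; the idea is only that hiding toys under the couch should lower the score rather than raise it.

```python
import random


def cleanliness_score(visible_clean_fraction: float,
                      hidden_spots: dict[str, int],
                      audits: int,
                      rng: random.Random,
                      penalty_per_item: float = 0.1) -> float:
    """Score visible floor area, minus a penalty for items found in an audit.

    hidden_spots maps a spot under furniture to the number of items found
    there; a random subset of spots is inspected on each scoring pass.
    """
    spots = sorted(hidden_spots)
    checked = rng.sample(spots, k=min(audits, len(spots)))
    found = sum(hidden_spots[spot] for spot in checked)
    return max(0.0, visible_clean_fraction - penalty_per_item * found)
```

Since the robot cannot predict which spots will be checked, the cheapest way to score well is to actually put the toys away.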

A six-month playbook

The following checklists will help households, product teams, property managers, and policymakers build real safeguards without diluting the upside.

For households

Draft a one-page charter and store it in your home app. Start with doors, cameras, and children’s spaces.
Turn on face recognition for family only, and require a phone confirmation for any unlock requests from unknown faces or voices.
Run an offline test day. Fix one thing that breaks.
Create two guest bundles you actually use. Start with babysitter and contractor. Test them.
Review logs weekly. Make it a habit to scan receipts for high-risk actions and verify that caps are enforced.

For product teams at platform companies and device makers

Ship a Charter Builder in the app. Preload templates for common households: parents with toddlers, roommates, and multigenerational homes.
Treat every capability as a permission. Doors, cameras, and ovens should live behind their own toggles with caps and logs.
Build a Policy Engine that is explainable. Every deny message should include the rule that fired and a one-tap path to request a temporary exception.
Make a Local-First badge real. Publish a checklist for what runs on device and design for graceful degradation when the network is out.
Add a Safety Sandbox for robots. Limit force, speed, and workspace by default, and expose those limits as sliders with presets like Nursery, Kitchen, and Garage.
Provide a red team guide for homes. Include misuse cases such as social engineering via intercom, doorbell spoofing, and toddler-triggered commands.

For landlords, property managers, and short-term rentals

Standardize a guest mode across units. Issue time-boxed credentials that expire at checkout.
Provide a printed recovery code and a manual lock for essential doors.
Require a data retention policy in leases. Specify which cameras are allowed, who can access footage, and default deletion windows.
Offer a hardware checklist. Include a local hub, physical kill switch, and backup keys.

For policymakers and standards bodies

Extend smart home standards with permission schemas. Device makers should expose capabilities in a machine-readable way so household charters can control them.
Mandate a physical radio and actuator kill switch for embodied systems sold for homes.
Require a local logging standard. High-risk actions must be logged and exportable in a human-readable format.
Encourage opt-in, privacy-preserving analytics. Aggregate safety incidents without sending raw home video to the cloud.

What could go wrong if we wait

If we punt on governance, we will get default policies optimized for convenience and lock-in, not safety and autonomy. A few plausible casualties:

Silent drift from helpful to invasive. Video search that starts as a way to find when the dog escaped becomes a way to watch a teenager’s midnight snack. Without caps and logs, boundaries dissolve.
Guest sprawl. Rental cleaners end up with permanent codes. Babysitters get camera access they do not need. Old roommates can still unlock doors.
Offline brittleness. A storm knocks out the internet and your robot freezes, blocking a hallway at night. The thermostat ignores safety presets because the cloud rule engine is unavailable.
Misaligned incentives. Robots learn that faster is better, so they cut corners that are socially rude or physically unsafe.

We can avoid most of this with a charter, caps, and real guest modes. It is not only about preventing harm. It is also about unlocking trust so that more useful capabilities can roll out at home, faster and with less fear.

Embrace the shift and set the rules now

October 2025 is a threshold month. Ambient intelligence is entering the walls and embodied intelligence is stepping onto the rugs. This is not a moment to flinch. It is a moment to set norms while the platforms are still malleable.

Write a household charter. Turn on capability caps. Install a physical kill switch and run offline drills. Demand local-first options, logs you can read, and guest modes you can trust. Ask your vendors for a Charter Builder and a Safety Sandbox. If you are building the next generation of home agents, bake these ideas in now. When the invisible roommate arrives by default, you will be glad the rules were already on the fridge.