Compute Non‑Alignment: OpenAI’s Poly‑Cloud Breakout
OpenAI’s new poly-cloud posture signals a break from single-provider loyalty. Compute becomes a liquid market where jobs move for price, capacity, and safety. Here is why it matters, how it works, and what to do next.

The week the cloud stopped being a place
On November 3, 2025, OpenAI announced a seven-year, $38 billion partnership with Amazon Web Services that grants rapid access to large pools of compute for both training and serving. The headline number matters less than the posture. OpenAI is no longer treating any single provider as home base. It is buying compute like electricity, wherever supply is abundant and priced to move. That is the essence of compute non-alignment: a strategic choice to avoid single-cloud lock-in and to treat providers as interchangeable suppliers for agentic workloads. See the official note in which AWS and OpenAI announce the partnership.
At the same time, OpenAI is accelerating its Stargate program with Oracle and SoftBank. New sites, new gear, and new interconnects are being staged across the United States so that massive training runs and wide-scale inference can be scheduled across more than one vendor. This is not a side project. It is a parallel build-out designed to turn infrastructure into a liquid market, not a static address. SoftBank’s release confirms that Stargate is expanding with new sites.
Compute non-alignment is not neutrality. It is the freedom to align per workload and per moment with whichever supplier meets price, performance, safety, and policy constraints. If you have been tracking our perspective on architectural shifts, it builds on the protocol pivot thesis, where value moves up the stack toward open orchestration rather than sealed platforms.
From platform loyalty to compute liquidity
For a decade, the default enterprise strategy was platform loyalty. Choose a primary cloud, adopt its managed services, and optimize for its primitives. That approach made sense when the main goal was speed to market for web-scale applications. It breaks down when the constraint becomes access to specialized accelerators, region-specific network topologies, and predictable cost at the scale of frontier-model training and agentic inference.
Compute liquidity is the alternative. Think of shipping containers. Containers made global trade modular because a box could ride any truck, train, or ship. Compute liquidity aims for the same effect with models, data pipelines, and agent swarms. Workloads are packaged so they can be scheduled wherever capacity is available at the right unit price and reliability.
In practice this means building with portable layers. Containers make runtime environments repeatable. Infrastructure as code keeps stacks declarative. Distributed compute frameworks coordinate tasks across clusters without tying the job to a single vendor’s scheduler. Object storage, model checkpoints, and intermediate artifacts get versioned so a job can resume without redoing mountains of work. The result is a workload that can move like a container, not a workload welded to one dock.
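To make that concrete, here is a minimal sketch of what a provider-neutral job description might look like. The field names are hypothetical rather than any vendor's API; the point is that everything the job needs is addressed by content and by declared requirements, never by a provider-specific path.

```python
from dataclasses import dataclass, field

@dataclass
class PortableJob:
    """A workload packaged so any qualified provider can run it (illustrative only)."""
    image_digest: str                  # container image pinned by digest, not a mutable tag
    entrypoint: list[str]              # command to run inside the container
    checkpoint_uri: str                # last versioned checkpoint, mirrored across providers
    data_shards: list[str]             # content-addressed shards pulled on demand near the job
    accelerator: str                   # requirement stated abstractly, not as a vendor instance type
    eligible_providers: list[str] = field(default_factory=list)

# Hypothetical values throughout; nothing here names a single provider's storage path.
job = PortableJob(
    image_digest="sha256:placeholder-digest",
    entrypoint=["python", "train.py", "--resume"],
    checkpoint_uri="artifact://run-42/step-001200",
    data_shards=["artifact://corpus/shard-0007", "artifact://corpus/shard-0008"],
    accelerator="8x 80GB-class GPU",
    eligible_providers=["cloud-a", "cloud-b", "cloud-c"],
)
print(job.eligible_providers)
```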
Why labs are doing this now
There are three immediate drivers.
- Scarcity and spikes. Training runs for frontier-scale models can require hundreds of thousands of accelerators. Inference for agentic systems can swing by orders of magnitude when a product surges. A single provider rarely has the right mix of capacity, topology, and price at the exact moment you need it. Multiple suppliers reduce the probability of stalled science or throttled launches.
- Bargaining power. The moment a buyer can credibly move workloads, pricing becomes negotiable. This is procurement math. Software companies learned it with open source databases and with alternative content delivery networks. Labs are now applying it to compute. The less it matters where a job runs, the more providers must compete on unit price, interconnect fees, and service-level assurances.
- Safety through redundancy. Alignment does not live only inside the model. It lives in the system around the model. If a region drops, a backbone saturates, or a firmware issue cripples a generation of accelerators, a lab that can reroute across clouds is safer than a lab that cannot. Portability becomes a safety feature, not only a cost feature.
If you are exploring the operational side of agentic systems, cross-check this with our view that an Agent OS turns the firm into a runtime. When agents plan, act, and coordinate, the substrate they ride on must be mobile, observable, and price-aware.
The new center of standardization
For most of the cloud era, the API was the standard. If two storage services had similar interfaces, a developer could swap one for the other with modest friction. That is still relevant, but in a poly-cloud world the decisive standards are the interconnects that let jobs move at speed without intolerable penalties.
Three kinds matter most.
- Data gravity bridges. Training data is large and sticky. Labs are building repeatable pipelines to compress, shard, and pre-index data so partial sets can be teleported rather than copied whole. The gold standard is to move the least amount of data required for progress, then reconcile deltas.
- Model artifact portability. Checkpoints are the saved states of long training runs. They must be stored in formats that can resume across providers. That means consistent naming, metadata, and encryption schemes so a job restarts on a different vendor’s hardware without manual patching; a sketch of what that metadata might look like follows this list.
- High-speed fabrics and routing. Cross-cloud scheduling only works if inter-region links and backbone routes are predictable. Teams negotiate dedicated links for the sensitive paths, reserve the public internet for the least sensitive traffic, and model egress as part of every plan. In effect, the standard is not a document. It is the throughput and latency numbers your system can reliably hit.
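A minimal sketch of that checkpoint metadata, under assumed conventions. The field names and mirror URIs are hypothetical; the idea is simply that the manifest records schema, content hash, and encryption scheme in provider-neutral terms so a resume on different hardware needs no manual patching.

```python
import hashlib
import json

def write_manifest(checkpoint_bytes: bytes, step: int, schema_version: str) -> dict:
    """Describe a checkpoint in provider-neutral terms (hypothetical fields)."""
    return {
        "step": step,
        "schema_version": schema_version,  # lets the loader on the next cloud verify compatibility
        "sha256": hashlib.sha256(checkpoint_bytes).hexdigest(),  # content address, not a path
        "encryption": "aes-256-gcm",       # scheme named explicitly; keys are held separately
        "mirrors": [                       # hypothetical mirror URIs on two different providers
            "s3://lab-checkpoints/run-42/step-001200.ckpt",
            "gs://lab-checkpoints/run-42/step-001200.ckpt",
        ],
    }

manifest = write_manifest(b"stand-in for real checkpoint bytes", step=1200, schema_version="v3")
print(json.dumps(manifest, indent=2))
```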
The emergent market around liquidity
If compute turns liquid, a market forms to price it. Expect three categories to move from niche to normal through 2026 and 2027.
- Compute brokers. These intermediaries hold forward contracts with multiple providers and resell slices of guaranteed capacity. Buyers gain predictability without negotiating ten separate agreements. Sellers gain higher utilization and smoother demand.
- Cross-cloud schedulers. Today many teams glue together homegrown scripts with orchestration tools. The next step is job planners that understand queue depth, accelerator type, data egress, and failure domains across providers. A scheduler may place step one of a pipeline in a Midwest site that has bandwidth and step two in a Virginia site with spot availability. A toy version of that placement logic is sketched after this list.
- Agent-level spot pricing. Spot instances let you buy spare capacity at a discount with the risk of interruption. In an agentic system, each agent can accept a fluctuating price for its next action. An agent that summarizes text might bid low and run in an older region. An agent that steers a robot arm might bid high to guarantee latency. Instead of one price per job, you get many micro-prices per step.
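Here is a toy placement score, with made-up numbers and names rather than any vendor's API. It shows the shape of the decision a cross-cloud scheduler makes: weigh compute price, queue depth, egress, and interruption risk per offer, then place each step where the expected cost is lowest.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    region: str
    price_per_gpu_hour: float
    queue_minutes: float      # expected wait before the step can start
    egress_cost: float        # cost to move this step's inputs into the region
    interrupt_risk: float     # rough probability of losing spot capacity mid-step

def placement_cost(o: Offer, gpu_hours: float, redo_penalty: float) -> float:
    """Expected cost of running one pipeline step on this offer."""
    compute = o.price_per_gpu_hour * gpu_hours
    waiting = (o.queue_minutes / 60) * o.price_per_gpu_hour   # crude price on idle time
    risk = o.interrupt_risk * redo_penalty                    # expected rework if preempted
    return compute + waiting + o.egress_cost + risk

offers = [
    Offer("cloud-a", "us-midwest-1", 2.10, queue_minutes=15, egress_cost=40.0, interrupt_risk=0.05),
    Offer("cloud-b", "us-east-2",    1.60, queue_minutes=90, egress_cost=5.0,  interrupt_risk=0.30),
]
best = min(offers, key=lambda o: placement_cost(o, gpu_hours=64, redo_penalty=120.0))
print(f"place step on {best.provider}/{best.region}")
```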
This is parallel to electricity markets. Fixed contracts arrive first, then day-ahead auctions, then real-time pricing. Compute has an advantage that power does not. A job can be checkpointed. If a supplier pulls capacity, the job pauses, then resumes elsewhere.
Safety as a first-class outcome
Portability and redundancy add two very practical safety levers.
- Failure isolation. If a bug in a network card driver triggers errors in one provider, you can fence off that zone and continue in another. In practice that requires every job to ship with a runbook that names a secondary and tertiary site, plus automated rehydration from the last clean checkpoint; a sketch of that failover path appears below.
- Operational transparency. When you exercise your exit ramp every quarter, you find hidden assumptions. Your logging may rely on a provider-specific feature. Your identity system may depend on an access token format that a different cloud cannot mint. Drills expose these choices so you can refactor before an outage forces your hand.
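A minimal failover sketch, assuming hypothetical site names and a placeholder restore helper that does not correspond to any real SDK. The point is the structure: an ordered fallback list shipped with the job, and automated rehydration from the last clean checkpoint once a site is fenced off.

```python
# Ordered fallback list shipped with the job; site names are hypothetical.
FALLBACK_SITES = ["cloud-a/us-west-2", "cloud-b/us-east-1", "cloud-c/eu-central-1"]

def restore_from_checkpoint(site: str, manifest_uri: str) -> bool:
    """Placeholder for pulling the nearest checkpoint mirror and resuming at `site`."""
    print(f"rehydrating {manifest_uri} at {site}")
    return True  # a real implementation would report success or failure

def evacuate(failed_site: str, manifest_uri: str) -> str:
    """Fence off the failed site and walk the fallback list until a resume succeeds."""
    for site in FALLBACK_SITES:
        if site == failed_site:
            continue
        if restore_from_checkpoint(site, manifest_uri):
            return site
    raise RuntimeError("no healthy site available")

print(evacuate("cloud-a/us-west-2", "artifact://run-42/manifest.json"))
```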
Teams that design for movement tend to have cleaner dependency graphs, smaller blast radii, and better observability because they need all three to make mobility real.
What this means for providers
The large providers will compete on four dimensions.
- Price and financial engineering. Expect longer prepays, more creative credits, and convertible agreements that let buyers shift across instance families or even across regions without penalties.
- Specialized accelerators and network topologies. If a provider ships a chip that pairs unusually well with a class of models, or a fabric that reduces synchronization time in large training jobs, that becomes a reason to bring that stage of the pipeline to that cloud. The rest can run elsewhere.
- Egress and ingress pricing. Transparent, predictable egress pricing is the difference between a healthy market and soft lock-in. Providers that reduce egress penalties will see more inbound jobs because customers are not trapped.
- Second-party ecosystems. Broker-friendly and scheduler-friendly APIs will matter as much as developer tooling once brokers and schedulers sit between buyers and sellers.
If you track the physical constraints beneath all this, connect it to our analysis that sovereign AI goes industrial. Capacity, siting, and regulatory posture will influence where this market clears.
A pragmatic playbook for enterprises
You do not need a frontier-scale budget to benefit from compute non-alignment. Here is a concrete checklist you can run this quarter.
- Package your workloads. Use containers and infrastructure as code so environments are reproducible. Make every build artifact addressable by a content hash and store it in a registry that mirrors to at least two providers; a sketch of that content-addressing step follows this checklist.
- Establish checkpoint discipline. For training and long-lived inference, checkpoint frequently, record schema versions in metadata, and test recovery on a different provider every month.
- Keep data close to the job. Preprocess large data sets into shards that can be pulled on demand and cached in the same region as the job. Track the egress cost for every pipeline in the pipeline’s dashboard so the team cannot ignore it.
- Design a mobility budget. Set a percentage of compute spend that you are willing to allocate to running the same job in two places for one hour a week. This proves your systems can move and gives you a credible lever in price negotiations.
- Contract for exit. Negotiate commitments that include capacity portability, data egress caps, and the right to burst into a partner region when needed. The clause you will need is the clause you write before the crisis.
- Increase observability. Capture end-to-end traces that include provider identity, region, instance type, and key performance counters. If your logs cannot tell you where a job ran and what it paid, you cannot price or move that job intelligently.
- Map compliance once, run many. Create a compliance layer that maps your controls to each provider. The team should not redo audits when a job moves. They should check that the mapped controls are equivalent and documented.
- Test failure isolation. Practice evacuating a region, failing over a network, and recovering from a compromised artifact. Treat these drills as standing operations, not one-time events.
- Incentivize neutrality in tooling. Favor tools that treat providers evenly. Where you accept provider-specific dependencies, document the exit path and the premium you are paid for taking on lock-in risk.
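A minimal sketch of the content-addressing step from the first item, assuming hypothetical registry hostnames. Naming an artifact by its hash and publishing the same address to at least two registries means nothing downstream refers to a single provider's path.

```python
import hashlib

# Hypothetical registry hostnames; in practice these are your two mirrored registries.
MIRRORS = ["registry.cloud-a.example/artifacts", "registry.cloud-b.example/artifacts"]

def content_address(artifact_bytes: bytes) -> str:
    """Name the artifact by what it contains, not by where it happens to sit."""
    return "sha256:" + hashlib.sha256(artifact_bytes).hexdigest()

def mirror_targets(artifact_bytes: bytes) -> list[str]:
    """The same address is published to at least two providers' registries."""
    addr = content_address(artifact_bytes)
    return [f"{mirror}/{addr}" for mirror in MIRRORS]

build_output = b"stand-in for a real build artifact"
for target in mirror_targets(build_output):
    print(target)
```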
The new gatekeepers
As brokers and schedulers sit between buyers and sellers, a new layer of gatekeepers will emerge. This layer will decide who gets scarce accelerators at 2 a.m. on a surge night. That power is real, and it will be contested. Expect three checks on that power.
- Transparency norms. Buyers will demand that brokers disclose inventory sources and any conflicts of interest with specific providers.
- Audit trails for allocation decisions. Logs that show why a scheduler placed a job in one location rather than another will be necessary for cost control and for ethics reviews when models touch sensitive domains. A sketch of what such a record might contain follows this list.
- Open formats for job definitions. If brokers or schedulers try to lock customers in with proprietary job descriptions, open formats will arise, because the alternative is being stuck during an outage.
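A minimal sketch of one allocation record, with hypothetical field names. The useful property is that the decision, and the reasons the alternatives lost, are serialized in a provider-neutral format that both cost reviews and ethics reviews can read.

```python
import datetime
import json

def allocation_record(job_id: str, chosen: str, rejected: dict[str, str]) -> str:
    """Serialize one placement decision together with why the alternatives lost."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "job_id": job_id,
        "placed_on": chosen,
        "rejected": rejected,  # provider/region -> reason it was not chosen
    }
    return json.dumps(record)

print(allocation_record(
    "run-42-step-7",
    "cloud-a/us-midwest-1",
    {
        "cloud-b/us-east-2": "queue depth over threshold",
        "cloud-c/eu-west-1": "egress cost exceeded budget",
    },
))
```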
When compute becomes a right, not a place
The deeper question beneath all this is sovereignty. When compute is liquid, what does infrastructure sovereignty mean for firms and nations?
- For firms, sovereignty becomes the ability to enforce your rules wherever a job runs. That implies strong key management, independent identity and access control, and portable security controls. It also implies a negotiating stance that treats compute as a right you can exercise with any qualified supplier.
- For nations, sovereignty becomes the ability to guarantee that critical workloads can run inside borders at a fair price even during global stress. That points to investment in national interconnects, incentives for multiple providers to operate domestically, and export control regimes that consider not just chips but also cross-cloud scheduling software.
These themes align with the industrial turn in policy and infrastructure. If you want the macro view, revisit how the protocol pivot thesis reframes where control points sit when networks, not platforms, define the center of gravity.
What to watch in 2026 and 2027
- Price discovery for agent steps. Expect firms to publish internal reference prices for common agent actions such as retrieval, summarization, and tool-augmented planning. That is the prerequisite for agent-level spot pricing.
- Cross-cloud inference at scale. Today most inference is anchored to one provider to minimize latency. As interconnects improve and model sharding matures, production systems will answer millions of queries per day across two or more clouds without users noticing.
- Frontier labs as wholesalers. Labs that build or reserve capacity under programs like Stargate will resell compute to partners when demand dips. That changes bargaining power again, because labs can be sellers as well as buyers.
- Public sector procurement. Governments will begin to demand mobility clauses and exit drills from their suppliers. The ones that do will gain leverage and resilience, and they will catalyze healthier markets for everyone else.
- Power and siting constraints. Expect more deals that hinge on substation access, water rights, and regional energy mix. Compute liquidity is bounded by physics and policy even when the software is portable.
The bottom line
The OpenAI and AWS partnership, paired with Stargate’s expansion alongside Oracle and SoftBank, marks a break from single-cloud thinking. Compute is turning into a market, not a location. That shift changes bargaining power, elevates portability to a safety feature, and sets the stage for brokers, schedulers, and granular pricing that looks more like energy markets than software licensing.
If you are building agents or deploying large models, design for movement. Put your artifacts in containers, keep your checkpoints lean, and write contracts that keep your exit open. Treat providers as suppliers, not landlords. When compute becomes negotiable, sovereignty belongs to the builders who can walk.