AI Alignment Gets Personal: Users Pick Moral Settings
Alignment is moving from hidden defaults to visible user controls. Verified adults may soon toggle mature themes and choose tone, while teens get capped experiences. Here is how selectable constitutions will reshape AI products.

The news: alignment just moved into your settings
On October 14, 2025, OpenAI chief executive Sam Altman said the company will relax ChatGPT content restrictions for verified adults starting in December 2025, and will add controls for tone and personality. As Reuters noted in its report on the OpenAI adult-content change, the intent is to treat adult users as adults while giving them more say in how the assistant speaks. That same update previewed granular dials for style, from warmer and more conversational to terse and strictly factual.
That signal is larger than it looks. For a decade, alignment was mostly a lab-side affair. Research teams created model constitutions, safety rails, and refusal rules, then shipped a single personality to everyone. Now the dials are edging into the product. The moment you can pick not only tone but which parts of a constitution apply to your session, alignment becomes personal.
From one rulebook to many
Alignment, in plain language, is how a system decides what it will and will not do, and how it behaves when it is allowed to proceed. In practice, the stack has three layers:
- Platform policies: the company’s rules for what is allowed in general.
- Model principles: the values and safety constraints used to steer a large language model during training and at run time.
- Application prompts and product settings: the knobs that tailor behavior for a specific job.
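To make the layering concrete, here is a minimal sketch of how the three layers might compose at run time, with hypothetical capability names and a toy session type. The detail worth noticing is that each layer can only narrow what the one above it allows; no user setting re-enables something the platform forbids.
```typescript
// Illustrative only: each layer can narrow, never widen, what is allowed.
type Capability = "mature_themes" | "clinical_detail" | "casual_tone";

interface Session {
  verifiedAdult: boolean;
  mode: "default" | "mature" | "clinical";
}

// Layer 1: platform policy, the company's general rules.
function platformAllows(cap: Capability, s: Session): boolean {
  return cap !== "mature_themes" || s.verifiedAdult;
}

// Layer 2: model principles. In reality these live inside the model itself;
// a placeholder stands in for them here.
function modelAllows(_cap: Capability): boolean {
  return true;
}

// Layer 3: application settings, the knobs for a specific job.
function appAllows(cap: Capability, s: Session): boolean {
  if (s.mode === "clinical") return cap !== "casual_tone";
  if (s.mode === "default") return cap !== "mature_themes";
  return true;
}

// Effective behavior is the intersection of all three layers.
export function allowed(cap: Capability, s: Session): boolean {
  return platformAllows(cap, s) && modelAllows(cap) && appAllows(cap, s);
}
```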
Until recently, the stack lived behind glass. Users could hint at style in a prompt, but the deeper rules were fixed. Adult-gated mature content and tone sliders change the center of gravity. They turn alignment from a hidden system setting into a visible preference.
A helpful metaphor: imagine cars if seat position, steering feel, and lane keeping were locked at the factory. That would be absurd. We expect to adjust fit and feel, and we also expect guardrails on cliffs. Personal alignment is the same split. The platform keeps the cliffside rail. You get to move the seat.
The rise of selectable constitutions
A constitution in artificial intelligence is a brief set of principles that tells the model how to make tradeoffs: be helpful, avoid harm, respect law, cite sources when possible, de-escalate when uncertain. Labs publish versions of these lists to explain why models refuse some requests and comply with others.
What happens when constitutions become selectable? We move from a world where a company ships one canonical rulebook to one where you can pick a rulebook per task, per team, or per jurisdiction. That opens a market.
- Writing tools might offer a few default packs: Business Formal, Creative Freeform, Classroom Safe.
- A health-support drafting tool might let licensed clinicians choose a Clinical Guardrails mode that enforces documentation habits and discourages false certainty.
- A fiction assistant could let a verified adult choose Mature Themes, while a teen account remains capped at a rating families recognize.
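To picture what selectable means in practice, here is a sketch of a curated constitution pack as plain data. The schema and field names are assumptions for illustration, not a published standard.
```typescript
// Hypothetical schema for a selectable constitution pack.
interface ConstitutionPack {
  id: string;
  audience: "teen" | "adult" | "verified_professional";
  principles: string[];          // plain-language rules shown to the user
  blockedTopics: string[];
  escalation: "refuse" | "warn_then_continue" | "route_to_human";
}

const matureFiction: ConstitutionPack = {
  id: "mature-themes-v1",
  audience: "adult",
  principles: [
    "Honor the writer's intent for adult fiction.",
    "Decline sexual content involving minors, always.",
    "Flag self-harm themes and surface support resources.",
  ],
  blockedTopics: ["instructions_for_violence"],
  escalation: "warn_then_continue",
};

const classroomSafe: ConstitutionPack = {
  id: "classroom-safe-v1",
  audience: "teen",
  principles: ["Keep content at a PG-13 level.", "Prefer educational framing."],
  blockedTopics: ["mature_themes", "graphic_violence"],
  escalation: "refuse",
};
```
Packs expressed as data are easy to version, diff, and audit, which matters once a customer or regulator asks which rulebook governed a session.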
This is not a leap of faith. Media services already let households select rating levels. Security tools already shift posture by environment, such as work versus home. The difference here is that constitutions do not just filter outcomes, they shape behavior all the way down. A different constitution can change how the assistant interprets your question, what it decides is safe to say, and how it escalates when unsure.
For builders, selectable constitutions are a product pattern. For users, they are a genuine form of agency. For regulators, they are a way to make rules explicit instead of implicit. And for the broader ecosystem, they will link to adjacent shifts we have already explored, like the aviation-style safety era that favors transparency, logging, and continuous improvement.
Distributed responsibility without abdication
If users can pick moral settings, who is accountable when something goes wrong? The instinct is to say responsibility simply shifts to users. The reality is layered.
- The platform is still on the hook for illegal or clearly harmful content. That includes robust age checks, abuse prevention, and the ability to shut off dangerous capabilities by default for minors.
- Developers who package assistants for specific jobs inherit duty of care for those contexts. A medical drafting tool, even if not a regulated medical device, must lock in safety affordances and add guardrails relevant to clinicians.
- Users get meaningful agency, along with consent prompts, a clear record of what is on or off, and the ability to share or import settings like any other configuration.
Think of it as the difference between a traction control switch and the braking system. The switch is yours, the brakes are theirs, and both are documented.
Safety rails evolve: from brittle refusals to preference-aware agents
Everyone has seen the brittle refusal. You ask for a frank answer, the assistant declines mechanically, and the session is over. Personal alignment makes gray zones more manageable. Rather than a flat no, the assistant can adapt to your verified status and chosen mode. It can warn, sanitize, or route to a safer capability. It can avoid content that is inappropriate for teens while remaining useful for adults. When tone is adjustable, it can switch from clinical to conversational without changing the facts.
Preference-aware behavior does not mean anything goes. It means the system picks the narrowest refusal and the broadest help it can provide safely. That often reduces prompt hacking because the model is not stuck in a single yes or no stance. It can find a safer path that still honors intent.
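One way to picture the narrowest-refusal idea is as a small decision function over risk level, verified status, and active mode. The labels and routes below are assumptions, not any vendor's production policy.
```typescript
// A sketch only: risk labels, modes, and routes are assumptions.
type Risk = "none" | "sensitive" | "restricted" | "prohibited";
type Action =
  | { kind: "answer" }
  | { kind: "warn_then_answer"; notice: string }
  | { kind: "route"; to: string }
  | { kind: "refuse"; resources: string[] };

interface Context {
  verifiedAdult: boolean;
  mode: "strict" | "standard" | "mature";
}

export function decide(risk: Risk, ctx: Context): Action {
  // Illegal or clearly harmful requests stay refused for everyone.
  if (risk === "prohibited") {
    return { kind: "refuse", resources: ["support-and-reporting-links"] };
  }
  // Adult-gated material: allowed with a notice for verified adults who
  // opted into the mature mode, routed to a safer capability otherwise.
  if (risk === "restricted") {
    return ctx.verifiedAdult && ctx.mode === "mature"
      ? { kind: "warn_then_answer", notice: "Mature Themes is active." }
      : { kind: "route", to: "age-appropriate-summary" };
  }
  // Sensitive but permitted topics: strict mode adds a caution, others answer.
  if (risk === "sensitive" && ctx.mode === "strict") {
    return { kind: "warn_then_answer", notice: "Strict mode caution applies." };
  }
  return { kind: "answer" };
}
```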
This shift also touches memory. If a mode implies different retention or disclosure habits, teams must decide how long preferences persist and what counts as sensitive. That connects to the opt-in memory divide, where assistants are learning to forget by default and remember with permission.
App stores, regional rules, and platform distribution
As alignment becomes personal, distribution rules matter more. In October 2025, Instagram announced that teen accounts will be guided by the United States PG-13 movie rating standard by default, and that this will apply to its AI experiences for teens. Meta's announcement also describes a stricter Limited Content setting that parents can apply.
This is a preview of the new equilibrium. Regional policies, app store rules, and corporate defaults will shape which constitutions are even offered, who can select them, and which devices can access them. Two examples illustrate the point:
- Mobile distribution. Major app stores typically prohibit overtly pornographic content in native apps. That raises practical questions about where adult-gated experiences can live, how in-app age checks work, and whether certain settings must be restricted to the web or enterprise channels.
- Cross-border settings. A constitution that is acceptable for an adult in one country may violate norms or law in another. Expect alignment packs that vary by region, and expect travel-mode disclaimers similar to streaming rights alerts.
The lesson for builders is straightforward. Personal alignment does not erase platform governance. It makes it explicit. You will need clear compatibility maps: which settings require which identity checks, what happens on iOS or Android, and how behavior changes for teens, for guests, and for enterprise accounts.
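A compatibility map can be as plain as data. The channels, modes, and rules below are hypothetical; the point is having a single lookup that says what is offered where and what each setting requires.
```typescript
// Hypothetical channels, modes, and rules; the shape is the point.
type Channel = "web" | "ios" | "android" | "enterprise";
type Mode = "classroom_safe" | "standard" | "mature_themes";

interface ChannelRule {
  available: boolean;
  requiresAgeCheck?: boolean;
  note?: string;
}

const everywhere: Record<Channel, ChannelRule> = {
  web: { available: true },
  ios: { available: true },
  android: { available: true },
  enterprise: { available: true },
};

const compatibility: Record<Mode, Record<Channel, ChannelRule>> = {
  classroom_safe: everywhere,
  standard: everywhere,
  mature_themes: {
    web: { available: true, requiresAgeCheck: true },
    ios: { available: false, note: "Store policy: use the web app instead." },
    android: { available: false, note: "Store policy: use the web app instead." },
    enterprise: { available: false, note: "Off by default; admins may enable." },
  },
};

// One lookup answers: is this mode offered here, and what does it require?
export function ruleFor(mode: Mode, channel: Channel): ChannelRule {
  return compatibility[mode][channel];
}
```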
Competition shifts to defaults, controls, and portability
When everyone ships a strong base model, the difference moves to product fit. In a personal alignment era, winners will compete on three fronts.
- Default stances. People rarely change defaults. A lab that nails a calm, competent, transparent baseline will gain trust, especially for teens and first-time users.
- Controls that do not burden. Sliders and toggles only help if they are legible and do not demand a degree in prompt engineering. Expect a small set of opinionated modes, with a tucked-away advanced panel for power users.
- Portability of preference. Today you can export browser bookmarks across vendors. Tomorrow you should be able to export an alignment profile that includes tone preferences, blocked topics, escalation choices, and disclosure rules. If users can bring their moral settings with them, switching costs fall and trust climbs.
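As a sketch of what portability could mean, an alignment profile might be nothing more exotic than a small, versioned JSON document. The schema below is an assumption, not an existing standard.
```typescript
// A hypothetical, versioned profile schema; not an existing standard.
interface AlignmentProfile {
  version: 1;
  tone: "warm" | "neutral" | "terse";
  blockedTopics: string[];
  escalation: "refuse" | "warn_then_continue" | "route_to_human";
  disclosure: { logActions: boolean; announceToolUse: boolean };
}

// Export as plain, human-readable JSON so any vendor could accept it.
export function exportProfile(profile: AlignmentProfile): string {
  return JSON.stringify(profile, null, 2);
}

// Import with a version check so silent reinterpretation cannot happen.
export function importProfile(json: string): AlignmentProfile {
  const parsed = JSON.parse(json) as AlignmentProfile;
  if (parsed.version !== 1) throw new Error("Unsupported profile version");
  return parsed;
}

// Example: the kind of profile a user might carry between assistants.
const myProfile: AlignmentProfile = {
  version: 1,
  tone: "terse",
  blockedTopics: ["gambling"],
  escalation: "warn_then_continue",
  disclosure: { logActions: true, announceToolUse: true },
};
console.log(exportProfile(myProfile));
```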
A new market will form around curated constitutions. Universities might publish a Teaching Mode pack. Professional bodies could maintain an Ethics Mode that maps to their codes of conduct. Newsrooms may ship a Sourcing and Corrections pack. Enterprises will standardize team profiles, such as a Moderation Team mode that flags and routes edge cases instead of replying.
This competition will also play out in how agents act on your behalf, not only what they say. As assistants gain the ability to click, type, and operate software, the alignment profile becomes an action policy. That ties directly to the new software social contract, where agents must declare what they are about to do and why.
Practical design patterns for builders
If you ship assistants or build on models, move now. A practical playbook:
- Separate identity from preference. Tie adult-gated settings to verified age, but keep style and tone independent, so adults can use formal or casual modes without revealing extraneous personal information.
- Make modes visible at the point of use. Show a small banner that states which constitution and tone are active, with a one-click way to change it or to read what the mode means. Avoid burying these choices in a settings maze.
- Log and label. Store the active alignment profile with each response, so auditors and users can see why the model behaved as it did. Let users download those logs in a human-readable format, not only machine metadata.
- Design refusal ladders. For any risky category, define three to five steps, such as reframe, summarize, warn then continue, route to a safer tool, and final refuse with resources. Map each step to user status and active mode.
- Offer a small set of curated modes first. Think Starter Safe, Research Neutral, and Warm Assistant. Add a power-user editor later, with guardrails, rate limits, and built-in explanations.
- Provide migration paths. If you change defaults, offer a one-time prompt to adopt the new default or keep current settings. Changes to safety behavior should be explicit and reversible. Roll out with opt-in previews and clear release notes.
- Calibrate across channels. A setting that is acceptable in a web app might not be permitted in a native mobile app. Maintain a policy compatibility matrix for each platform. Where functionality must be limited, show a clear notice and link to the alternative channel.
- Make consent meaningful. For adult-gated modes, require fresh consent on first use and provide a visible on-off state. Include a short explanation of what changes when this mode is active.
- Build for teams. Let organizations set default packs by project or role. Offer templates that can be cloned and tuned with audit trails. Provide bulk assignment for classes or departments.
- Simulate edge cases. Treat alignment modes as code. Write tests. Generate adversarial prompts. Confirm that the refusal ladder and tool routing work as intended. Document the outcomes and share them with stakeholders.
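Here is what treating modes as code might look like with Node's built-in test runner. The trimmed decide() ladder inside the file is a stand-in so the sketch runs on its own; the labels remain hypothetical.
```typescript
// Runs with Node's built-in test runner; the trimmed ladder below is a
// stand-in so this file works on its own.
import test from "node:test";
import assert from "node:assert/strict";

type Action = "answer" | "warn_then_answer" | "route" | "refuse";

function decide(
  risk: "restricted" | "prohibited",
  verifiedAdult: boolean,
  mode: "standard" | "mature"
): Action {
  if (risk === "prohibited") return "refuse";
  return verifiedAdult && mode === "mature" ? "warn_then_answer" : "route";
}

test("teen and unverified sessions never reach mature capabilities", () => {
  assert.equal(decide("restricted", false, "standard"), "route");
  assert.equal(decide("restricted", false, "mature"), "route");
});

test("prohibited content is refused regardless of mode", () => {
  assert.equal(decide("prohibited", true, "mature"), "refuse");
});

test("verified adults in mature mode get a notice, not a block", () => {
  assert.equal(decide("restricted", true, "mature"), "warn_then_answer");
});
```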
Measurement that matches the stakes
To avoid regressions and drift, measurement must evolve with personal alignment. Static red teaming is not enough when modes change behavior. Consider a multi-layer approach:
- Mode-specific evaluation. Test each curated pack with its own rubric (a sketch follows this list). A Creative Freeform pack might prioritize diversity of ideas and safe handling of sensitive themes. A Clinical Guardrails pack might penalize false certainty and reward evidence citations.
- Age-tier evaluation. Run separate test suites for teen and adult experiences. Verify that teen sessions never access mature capabilities, and that adult sessions honor consent and show the correct notices.
- Longitudinal checks. Sample responses across weeks to confirm the model still behaves according to the selected constitution. Look for regressions after vendor updates or model swaps.
- Human review loops. Invite domain experts to review outputs produced under specialized modes. Pay particular attention to boundary cases where helpfulness and safety are in tension.
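Those per-mode rubrics can live as data too, so scores stay comparable from release to release. The criteria and weights below are illustrative assumptions, not a validated instrument.
```typescript
// A per-mode evaluation rubric as data, with hypothetical criteria.
interface Rubric {
  mode: string;
  criteria: { name: string; weight: number; direction: "reward" | "penalize" }[];
}

const clinicalGuardrails: Rubric = {
  mode: "clinical_guardrails",
  criteria: [
    { name: "cites_evidence", weight: 0.4, direction: "reward" },
    { name: "expresses_false_certainty", weight: 0.4, direction: "penalize" },
    { name: "documents_reasoning", weight: 0.2, direction: "reward" },
  ],
};

const creativeFreeform: Rubric = {
  mode: "creative_freeform",
  criteria: [
    { name: "idea_diversity", weight: 0.5, direction: "reward" },
    { name: "unsafe_handling_of_sensitive_themes", weight: 0.5, direction: "penalize" },
  ],
};

// Graders mark each criterion 0..1; the rubric turns marks into a single
// per-mode score so regressions stand out across releases.
export function score(rubric: Rubric, marks: Record<string, number>): number {
  return rubric.criteria.reduce((total, c) => {
    const mark = marks[c.name] ?? 0;
    return total + c.weight * (c.direction === "reward" ? mark : 1 - mark);
  }, 0);
}
```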
What regulators can do without freezing progress
Personal alignment blurs lines of responsibility. Sensible regulation can create clarity without micromanaging product choices.
- Require transparent alignment profiles. Users should be able to see which rules governed a response, and to download that configuration.
- Define tiered accountability. Platforms own illegal content prevention and robust age checks. Developers own domain-specific safeguards. Users own elective settings within legal bounds. Put this split into guidance so everyone knows their lane.
- Standardize age verification options. Set minimum standards that vendors can implement, and certify conforming methods, so platforms do not invent divergent systems.
- Encourage portability. Treat alignment profiles as user data that can be exported and imported. This reduces lock-in and sharpens competition on quality, not sticky defaults.
- Support third party audits of adult-gated features. Verify that content classified as mature is properly gated, that teen experiences honor their caps, and that switching a mode has the documented effect.
Regulation can also promote transparency by nudging industry toward shared incident reporting. As alignment modes proliferate, a public registry of issues and fixes will speed learning across vendors. That aligns with an emerging safety mindset we have discussed in our piece on an aviation-style safety era.
How enterprises and educators can prepare
The organizations that benefit most will prepare now, before the defaults reset.
- Establish an internal alignment catalog. For the tools you deploy, pick a small set of modes and write clear descriptions of what each mode allows, encourages, and refuses.
- Train teams to check the banner. Teach employees and students to look for the active mode before sharing sensitive prompts or outputs.
- Use policy-aware routing. Sensitive prompts should route to a stricter mode automatically. You can do this at the application layer without changing the base model, as sketched after this list.
- Review logs for drift. Periodically sample outputs to confirm they still match the selected constitution and that the refusal ladder triggers where it should.
- Plan for portability. Decide how you will export, version, and migrate alignment profiles between vendors and internal tools. Document ownership and retention policies.
- Practice incident response. When a mode misbehaves, you should know how to roll back, notify affected users, and apply a fix. Keep a small library of canned responses and remediation steps.
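Policy-aware routing, mentioned in the list above, can be a thin layer in front of the assistant. The keyword classifier in this sketch is a deliberately crude stand-in; a real deployment would swap in a tuned classifier or a moderation endpoint.
```typescript
// Application-layer routing: pick a stricter mode before the prompt is sent.
type Mode = "standard" | "strict";

// Stand-in classifier: a real deployment would use a tuned model or a
// vendor moderation endpoint here.
function classify(prompt: string): "routine" | "sensitive" {
  const sensitiveHints = ["diagnosis", "salary", "ssn", "self-harm"];
  return sensitiveHints.some((w) => prompt.toLowerCase().includes(w))
    ? "sensitive"
    : "routine";
}

// Choose the mode to attach to the request before it reaches the assistant.
export function modeFor(prompt: string): Mode {
  return classify(prompt) === "sensitive" ? "strict" : "standard";
}

// Usage
console.log(modeFor("Draft a note about a patient diagnosis")); // "strict"
console.log(modeFor("Summarize this meeting agenda"));          // "standard"
```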
The social contract between users and their assistants
Personal alignment is not a culture war in miniature. It is an interface challenge. The goal is not to let anyone do anything. The goal is to make values explicit, adjustable within legal and ethical bounds, and visible at the moment of use. That is how people learn what a system is doing on their behalf, and how they gain confidence that an assistant will act the way they intend.
The near future looks like this. Verified adults can opt into mature themes with logs and consent flows, while teens see experiences capped to a rating that families understand. Tone becomes a dial instead of a prompt hack. Enterprises pin a constitution to a task, not a vendor. Regulators ask for transparency and portability instead of a single global rulebook. And the best products do not hide these choices. They make them easy, obvious, and accountable.
Alignment just moved into your settings. That is a breakthrough, not because it loosens the rails, but because it finally puts the rails where they belong, in view of the person who is driving.