· Bits and Bobs 7/21/25
  • I'm intrigued by a new paper on AI alignment that argues everyone is framing the problem wrong.
    • It proposes we stop searching for universal human values and instead build systems that help diverse communities manage their inevitable disagreements.
    • It argues we should be tailors stitching together a "polychrome quilt" of different contexts rather than astronomers seeking "something true and deep."
    • The entire AI alignment community is built on a probably false "Axiom of Rational Convergence": the idea that, given enough time and information, everyone would converge on the same values.
    • Empirically, people have persistent disagreements that don't go away with more education or time to think.
    • The paper proposes an "appropriateness framework" instead of alignment: AI should learn context-specific norms, not follow a universal rulebook (see the first sketch after this outline).
      • A comedy bot and a tech support bot need totally different senses of appropriate behavior.
      • Context collapse, trying to be safe for every audience at once, makes AI bland and useless for all of them.
    • The real danger isn't misaligned AI but concentrated power—whether in humans, AI, or human-AI coalitions.
    • This reframes AI safety from "find the perfect objective function" to "prevent any entity from dominating the whole system."
    • The paper's four principles map to intentional tech (see the second sketch after this outline):
      • 1) contextual grounding (know your situation),
      • 2) community customization (different groups need different things),
      • 3) continual adaptation (coactive evolution), and
      • 4) polycentric governance (no single point of control).
    • Society as an emergent system, not a top-down design.
    • You need the right philosophical foundations to think about AI's impact on society.
    • Those foundations must acknowledge the emergent character of meaning.
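
To make the "context-specific norms" idea concrete, here is a minimal Python sketch. It is my illustration, not code from the paper: the `Context` class, the `Norm` predicate type, and the example norms are all hypothetical. What it shows is that the same reply can be appropriate in one context and inappropriate in another, with no universal rulebook in sight.

```python
# Hypothetical sketch of context-specific appropriateness (not from the
# paper): each deployment context carries its own norms, and a candidate
# reply is judged only against the norms of the context it lands in.

from dataclasses import dataclass, field
from typing import Callable

# A norm is modeled as a simple predicate over a candidate response.
Norm = Callable[[str], bool]

@dataclass
class Context:
    name: str
    norms: list[Norm] = field(default_factory=list)

    def is_appropriate(self, response: str) -> bool:
        # "Appropriate" means violating none of THIS context's norms;
        # there is no global rulebook shared across contexts.
        return all(norm(response) for norm in self.norms)

# Comedy bot: irreverence is fine, but keep it punchy.
comedy = Context("comedy", norms=[
    lambda r: len(r) <= 280,
])

# Tech support bot: jokes are out of place, answers must be actionable.
support = Context("tech_support", norms=[
    lambda r: "lol" not in r.lower(),   # no joking around
    lambda r: "restart" in r.lower() or "step" in r.lower(),
])

reply = "lol just restart it and see what happens"
print(comedy.is_appropriate(reply))    # True: fine as a bit
print(support.is_appropriate(reply))   # False: tone violates support norms
```

The same string passes the comedy context and fails tech support, which is exactly the argument against context collapse: a single filter safe for both audiences would have to reject material each context actually wants.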
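
A second sketch for the four principles. Again, this is a hypothetical illustration rather than the paper's design: `PolycentricRegistry` and its methods are names I made up to show how the principles read as ordinary engineering constraints.

```python
# Hypothetical sketch mapping the four principles onto a toy registry.
from collections import defaultdict

class PolycentricRegistry:
    """Each community governs its own norms; no global authority exists."""

    def __init__(self):
        # community -> context -> list of norm descriptions
        self._norms = defaultdict(lambda: defaultdict(list))

    def adopt(self, community: str, context: str, norm: str) -> None:
        # 1) contextual grounding and 2) community customization:
        # every norm is indexed by who holds it and where it applies.
        self._norms[community][context].append(norm)

    def revise(self, community: str, context: str, old: str, new: str) -> None:
        # 3) continual adaptation: communities revise norms over time.
        norms = self._norms[community][context]
        norms[norms.index(old)] = new

    def norms_for(self, community: str, context: str) -> list[str]:
        # 4) polycentric governance: lookups consult only the community's
        # own entry, so no single point can dominate the whole system.
        return list(self._norms[community][context])

registry = PolycentricRegistry()
registry.adopt("makerspace", "forum", "cite sources for safety claims")
registry.adopt("gaming_guild", "forum", "trash talk stays in-game")
registry.revise("gaming_guild", "forum",
                "trash talk stays in-game", "trash talk stays in voice chat")
print(registry.norms_for("makerspace", "forum"))
print(registry.norms_for("gaming_guild", "forum"))
```

Note that nothing in the registry is an objective function to get right; the safety property is structural, which matches the reframing from "find the perfect objective function" to "prevent any entity from dominating the whole system."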
