· Bits and Bobs 7/21/25
  • I'm intrigued by a new paper on AI alignment that argues everyone is framing the problem wrong.
    • It proposes we stop searching for universal human values and instead build systems that help diverse communities manage their inevitable disagreements.
    • It argues we should be tailors stitching together a "polychrome quilt" of different contexts rather than astronomers seeking "something true and deep."
    • The entire AI alignment community is built on a probably false "Axiom of Rational Convergence": the idea that, given enough time and information, everyone would converge on the same values.
    • Empirically, people have persistent disagreements that don't go away with more education or time to think.
    • The paper proposes an "appropriateness framework" instead of alignment: AI should learn context-specific norms, not follow a universal rulebook (see the first sketch after this outline).
      • A comedy bot and a tech support bot need totally different senses of appropriate behavior.
      • Context collapse, trying to be safe for every audience at once, makes AI bland and useless for all of them.
    • The real danger isn't misaligned AI but concentrated power—whether in humans, AI, or human-AI coalitions.
    • This reframes AI safety from "find the perfect objective function" to "prevent any entity from dominating the whole system."
    • The paper's four principles map to intentional tech (see the second sketch after this outline):
      • 1) contextual grounding (know your situation),
      • 2) community customization (different groups need different things),
      • 3) continual adaptation (coactive evolution), and
      • 4) polycentric governance (no single point of control).
    • Society as an emergent system, not a top-down design.
    • You need the right philosophical foundations to think about AI's impact on society.
    • Those foundations must acknowledge the emergent character of meaning.
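
To make the "context-specific norms" idea concrete, here is a minimal Python sketch. It is my illustration, not code from the paper: the `Context` class, the `Norm` predicate type, and the example norms are all hypothetical. What it shows is that the same reply can be appropriate in one context and inappropriate in another, with no universal rulebook in sight.

```python
# Hypothetical sketch of context-specific appropriateness (not from the
# paper): each deployment context carries its own norms, and a candidate
# reply is judged only against the norms of the context it lands in.

from dataclasses import dataclass, field
from typing import Callable

# A norm is modeled as a simple predicate over a candidate response.
Norm = Callable[[str], bool]

@dataclass
class Context:
    name: str
    norms: list[Norm] = field(default_factory=list)

    def is_appropriate(self, response: str) -> bool:
        # "Appropriate" means violating none of THIS context's norms;
        # there is no global rulebook shared across contexts.
        return all(norm(response) for norm in self.norms)

# Comedy bot: irreverence is fine, but keep it punchy.
comedy = Context("comedy", norms=[
    lambda r: len(r) <= 280,
])

# Tech support bot: jokes are out of place, answers must be actionable.
support = Context("tech_support", norms=[
    lambda r: "lol" not in r.lower(),   # no joking around
    lambda r: "restart" in r.lower() or "step" in r.lower(),
])

reply = "lol just restart it and see what happens"
print(comedy.is_appropriate(reply))    # True: fine as a bit
print(support.is_appropriate(reply))   # False: tone violates support norms
```

The same string passes the comedy context and fails tech support, which is exactly the argument against context collapse: a single filter safe for both audiences would have to reject material each context actually wants.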
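
A second sketch for the four principles. Again, this is a hypothetical illustration rather than the paper's design: `PolycentricRegistry` and its methods are names I made up to show how the principles read as ordinary engineering constraints.

```python
# Hypothetical sketch mapping the four principles onto a toy registry.
from collections import defaultdict

class PolycentricRegistry:
    """Each community governs its own norms; no global authority exists."""

    def __init__(self):
        # community -> context -> list of norm descriptions
        self._norms = defaultdict(lambda: defaultdict(list))

    def adopt(self, community: str, context: str, norm: str) -> None:
        # 1) contextual grounding and 2) community customization:
        # every norm is indexed by who holds it and where it applies.
        self._norms[community][context].append(norm)

    def revise(self, community: str, context: str, old: str, new: str) -> None:
        # 3) continual adaptation: communities revise norms over time.
        norms = self._norms[community][context]
        norms[norms.index(old)] = new

    def norms_for(self, community: str, context: str) -> list[str]:
        # 4) polycentric governance: lookups consult only the community's
        # own entry, so no single point can dominate the whole system.
        return list(self._norms[community][context])

registry = PolycentricRegistry()
registry.adopt("makerspace", "forum", "cite sources for safety claims")
registry.adopt("gaming_guild", "forum", "trash talk stays in-game")
registry.revise("gaming_guild", "forum",
                "trash talk stays in-game", "trash talk stays in voice chat")
print(registry.norms_for("makerspace", "forum"))
print(registry.norms_for("gaming_guild", "forum"))
```

Note that nothing in the registry is an objective function to get right; the safety property is structural, which matches the reframing from "find the perfect objective function" to "prevent any entity from dominating the whole system."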
