Artificial‑intelligence alignment cannot be solved by focusing on single systems in a vacuum. Even perfectly intent‑aligned models will go awry when embedded inside misaligned economic, political, or social institutions. We argue that two dominant paradigms—(i) preference/utility maximisation inherited from the Standard Institution Design Toolkit (SIDT) and (ii) prompt‑ or self‑critique–based “text alignment”—are structurally incapable of delivering robust socio‑technical alignment. Instead we propose Full‑Stack Alignment (FSA): a new toolkit centred on explicit, structured representations of human norms and values. After analysing the shortcomings of existing toolkits, we sketch four concrete representation techniques and illustrate their promise through five case studies ranging from AI negotiation to democratic regulation. We close with a research and deployment roadmap toward institutions and AI systems that co‑evolve for global human flourishing.
The growing field of socio‑technical alignment argues that beneficial AI outcomes require more than aligning individual systems with operators’ intentions. Even perfectly intent‑aligned AI systems will become misaligned if deployed within broader institutions—such as profit‑driven corporations, competitive nation‑states, or inadequately regulated markets—that conflict with global human flourishing.
Once we agree that it is important to co‑align artificial intelligence and institutions, the key question becomes how.
Historically, researchers have treated the problem as one of game theory and social choice, modelling both humans and AI agents as rational expected‑utility maximisers. More recently, practical work has shifted toward text‑based paradigms such as RLHF, Constitutional AI, or model specifications that encode values in natural‑language strings. While each paradigm improved on its predecessor, neither meets the demands of high‑stakes alignment. Sections 2.1 and 2.2 analyse these shortcomings; Section 3 introduces a richer toolkit, and Sections 4–5 lay out evidence and a roadmap.
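To make the first paradigm concrete, the sketch below shows the agent model it presupposes: each actor is reduced to a rational expected-utility maximiser choosing among lotteries over outcomes. All names and the toy payoffs are illustrative, not drawn from the paper; the point is only to show how thin this representation of an agent's values is.

```python
# Illustrative sketch (hypothetical names and payoffs): the SIDT-style
# agent model reduces a human or AI actor to an expected-utility maximiser.

def expected_utility(action, outcomes, utility):
    """Probability-weighted sum of utilities for one action's outcome lottery."""
    return sum(p * utility(o) for o, p in outcomes[action].items())

def choose(actions, outcomes, utility):
    """A 'rational agent' in the SIDT sense: argmax of expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Toy decision: two actions, each a lottery mapping payoff -> probability.
outcomes = {
    "cooperate": {3: 0.8, 0: 0.2},  # EU = 0.8*3 + 0.2*0 = 2.4
    "defect":    {5: 0.5, 1: 0.5},  # EU = 0.5*5 + 0.5*1 = 3.0
}
utility = lambda o: o  # utility collapses to raw payoff in this toy model

best = choose(outcomes.keys(), outcomes, utility)
print(best)  # the maximiser picks "defect" despite its social cost
```

Everything about the agent's norms and values must be squeezed into the single scalar `utility` returns, which is precisely the representational poverty the richer toolkit in Section 3 is meant to overcome.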
To address socio‑technical alignment, we must often redesign institutional structures, yet the 20th‑century toolkit—micro‑economics, game theory, mechanism design, welfare economics, and social‑choice theory—is inadequate for the task. We label this package the Standard Institution Design Toolkit (SIDT) and highlight six core limitations: