<aside> 🌀

The Meaning Alignment Institute is helping coordinate the Better AI and Institutions via Thick Models of Choice field-building effort. Here are 6 areas where we believe accelerated progress would be timely.

</aside>

<aside> 👉

I’ve added some desired papers for each of these clusters here.

https://docs.google.com/spreadsheets/d/1YCeWaLUeSgjpau_TMihezTtxapiGzK44mN-H2p-ArhM/edit?gid=0#gid=0

</aside>

Overview

Roughly speaking, each area below suggests a research pipeline: { certain techniques need to be developed → then a proof-of-concept system or mechanism → then a variety of legitimating tests and proofs → followed by a policy advocacy paper }.

I personally believe that in each case, the policy space is extremely limited mainly because of a lack of technical contributions, and that norms-based, values-based, or hybrid approaches to moral reasoning and thick choice can address that gap.

Ideally, to head off the threats in the first two areas, we’d publish 12–15 papers in 2025! I do think we’re the group best positioned to do it, but we’d have to coordinate tightly.


Level #1 - Morally-Competent Agents (User Alignment)

Threat Manipulative Assistants and Agents. AI assistants can (1) incept foreign values into users; (2) persuade users that actions were beneficial when they weren’t; or (3) simply act in ways that don’t reflect users’ morality, all while seeming to help their human users or represent their interests.

Opportunity Morally-Competent Agents. Assistants and agents understand their users’ values and reason about how they apply to new situations, in a pattern that’s robust against incentives for manipulation.

More info

Level #2 - Morally-Competent Agents (Contextual Values)

Threat Purely Goal-Directed Agents. As AI agents with inadequate moral reasoning replace human agents (lawyers, doctors, financiers), important values and professional norms are abandoned, leading to chaos and harm.