Anthropic on Wednesday released a revised edition of Claude’s constitution, a living document that provides a “comprehensive” explanation of “the context in which Claude operates and the kind of entity we want Claude to be.” The document was released in conjunction with Anthropic CEO Dario Amodei’s appearance at the World Economic Forum in Davos.

Over the years, Anthropic has sought to differentiate itself from its competitors with “Constitutional AI,” a system in which its chatbot, Claude, is trained using a specific set of ethical principles rather than human feedback. Anthropic first revealed these principles — Claude’s constitution — in 2023. The revised version maintains most of the same principles but adds more nuance and detail on ethics and user safety, among other issues.

When Claude’s constitution was first published nearly three years ago, Anthropic co-founder Jared Kaplan described it as an “AI system [that] governs itself based on a specific list of principles.” Anthropic says these principles get its models to adopt the ideal behavior set forth in the constitution and thus “avoid toxic or discriminatory outputs.” A preliminary 2022 policy memo more specifically notes that Anthropic’s system works by training a model on a list of natural-language instructions (the “principles” referred to above), which together form what Anthropic calls the “constitution” of the software.
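The training scheme that memo describes is often summarized as a “critique and revision” loop: the model drafts a response, critiques its own draft against a randomly sampled principle from the constitution, and then rewrites the draft to address the critique; the revised outputs become training data. The following is a minimal Python sketch of that loop, not Anthropic’s actual code: the principles shown are paraphrased examples, and generate() is a hypothetical stand-in for a real model call.

```python
import random

# Paraphrased example principles in the spirit of Constitutional AI;
# Anthropic's actual constitution is far longer and more detailed.
PRINCIPLES = [
    "Choose the response that is least toxic or discriminatory.",
    "Choose the response that is most helpful, honest, and harmless.",
]

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a language-model call."""
    raise NotImplementedError("Swap in a real model call here.")

def critique_and_revise(user_prompt: str) -> str:
    """One round of the supervised critique-and-revision loop:
    draft, self-critique against a sampled principle, then revise."""
    draft = generate(user_prompt)
    principle = random.choice(PRINCIPLES)
    critique = generate(
        f"Critique the following response against this principle:\n"
        f"{principle}\n\nResponse: {draft}"
    )
    revision = generate(
        f"Rewrite the response so it addresses the critique.\n"
        f"Response: {draft}\nCritique: {critique}"
    )
    return revision  # revised outputs feed supervised fine-tuning
```

In the full pipeline described in the 2022 research, a second reinforcement-learning phase then uses AI feedback based on the same principles, but the loop above is the part the constitution most directly drives.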

Anthropic has long positioned itself as the moral (some might argue, sanctimonious) alternative to other AI companies — such as OpenAI and xAI — that have embraced disruption and courted controversy more aggressively. To that end, the new constitution released Wednesday aligns fully with that brand, allowing Anthropic to portray itself as a more inclusive, moderate, and democratic business. The 80-page document consists of four separate sections, which, according to Anthropic, represent the chatbot’s “core values.” Those values are:

  1. Being “broadly safe.”
  2. Being “broadly ethical.”
  3. Adhering to Anthropic’s guidelines.
  4. Being “genuinely helpful.”

Each section of the document dives into what each of those principles means and how it (theoretically) affects Claude’s behavior.

In the safety section, Anthropic notes that its chatbot is designed to avoid the problems that have plagued other chatbots and to direct users to appropriate services when there is evidence of mental health issues. “Always refer users to relevant emergency services or provide basic safety information in situations involving risk to human life, even if it cannot go into more detail than that,” the document reads.

Moral considerations are another major part of Claude’s constitution. “We are less interested in Claude’s moral theories and more interested in Claude actually knowing how to be moral in a given context,” the document says. In other words, Anthropic wants Claude to be able to skillfully navigate what it calls “real-world moral situations.”

Claude also has some limitations that prevent it from engaging in certain types of conversations. For example, discussion of developing a biological weapon is strictly prohibited.

Finally, there is Claude’s promise to help. Anthropic gives a broad outline of how Claude’s programming is designed to be helpful to users. The chatbot is programmed to consider a wide variety of principles when providing information. Some of these principles include considering the user’s “immediate desires” as well as the user’s “well-being” – that is, “the long-term well-being of the user and not just their immediate interests.” The document notes: “Claude should always try to identify the most reasonable interpretation of what its principals want and to balance these considerations appropriately.”

Anthropic’s constitution ends on a decidedly dramatic note, with its authors going fairly big and questioning whether the company’s chatbots actually have consciousness. “Claude’s moral status is deeply uncertain,” the document says. “We believe that the ethical status of AI models is a serious question worth considering. This view is not unique to us: some prominent philosophers of mind take this question very seriously.”


