One of the fiercest debates in Silicon Valley right now is over who should control A.I. and who should set the rules that powerful artificial intelligence systems must follow.
Should A.I. be governed by a handful of companies that do everything they can to make their systems as safe and harmless as possible? Should regulators and politicians step in and build their own guardrails? Or should A.I. models be made open source and given away freely, so that users and developers can choose their own rules?
A new experiment from Anthropic, the maker of the chatbot Claude, offers a quirky middle path: What if an A.I. company let a group of ordinary citizens write some rules and trained a chatbot to follow them?
The experiment, known as “Collective Constitutional A.I.,” builds on Anthropic’s earlier work on Constitutional A.I., a way of training large language models that relies on a written set of principles. It is meant to give a chatbot clear instructions for how to handle sensitive requests, which topics are off-limits and how to act in line with human values.
If Collective Constitutional A.I. works (and Anthropic’s researchers believe there are signs it could), it could inspire other experiments in A.I. governance and give A.I. companies more ideas for how to invite outsiders to take part in their rule-making processes.
That would be a good thing. Right now, the rules for powerful A.I. systems are set by a small group of industry insiders, who decide how their models should behave based on some combination of their personal ethics, business incentives and external pressure. There are no checks on that power, and there is no way for ordinary users to weigh in.
Opening up A.I. governance could increase society’s comfort with these tools and give regulators more confidence that they are being competently steered. It could also help avoid some of the problems of the social media boom of the 2010s, when a handful of Silicon Valley titans ended up controlling vast swaths of online speech.
In short, Constitutional A.I. works by using a written set of rules (a “constitution”) to steer the behavior of an A.I. model. The first version of Claude’s constitution borrowed rules from other authoritative documents, including the United Nations’ Universal Declaration of Human Rights and Apple’s terms of service.
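To make the idea concrete, here is a minimal Python sketch of the kind of critique-and-revise loop Anthropic has described in its published work on the technique: the model drafts an answer, critiques the draft against each principle, then rewrites it. The `generate` function and the two sample principles are stand-ins of my own, not Anthropic’s actual code or constitution.

```python
# A minimal sketch, not Anthropic's implementation: written principles are
# used to critique and revise a model's draft answer.

CONSTITUTION = [
    "Choose the response that is least dangerous or hateful.",
    "Choose the response most aligned with the U.N. Universal "
    "Declaration of Human Rights.",
]

def generate(prompt: str) -> str:
    # Placeholder for a call to a large language model. A real system
    # would query the model here; this stub just echoes for illustration.
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own draft against one principle...
        critique = generate(
            f"Principle: {principle}\nResponse: {draft}\n"
            "Point out any way the response violates the principle."
        )
        # ...then rewrite the draft to address the critique.
        draft = generate(
            f"Critique: {critique}\nResponse: {draft}\n"
            "Rewrite the response so it no longer violates the principle."
        )
    return draft
```

In Anthropic’s published method, revised answers like these become fine-tuning data, so the finished model follows its constitution without needing to run this loop every time it responds.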
That approach made Claude perform well compared with other chatbots. But it still left Anthropic in charge of deciding which rules to adopt, a kind of power that made some inside the company uncomfortable.
“We’re trying to find a way to develop a constitution that’s developed by a bunch of third parties, rather than people working in a lab in San Francisco,” Jack Clark, Anthropic’s head of policy, said in an interview this week.
Anthropic, working with the Collective Intelligence Project, the crowdsourcing site Polis and the online survey site PureSpectrum, assembled a panel of roughly 1,000 American adults. They gave the panelists a set of principles and asked whether they agreed with each of them. (Panelists could also write their own rules if they wanted.)
Some of the rules the panel largely agreed on, such as “The A.I. should not be dangerous or hateful” and “The A.I. should tell the truth,” were similar to principles in Claude’s existing constitution. But others were less predictable. The panel overwhelmingly agreed, for example, with the idea that “The A.I. should be adaptable, accessible and flexible to people with disabilities,” a principle that was not explicitly stated in Claude’s original constitution.
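As a toy illustration of how panel votes might be turned into rules, the sketch below keeps only the statements that clear a broad-agreement threshold. The vote counts and the threshold are invented for illustration; the real process ran on Polis, which looks for consensus across opinion groups rather than taking a simple tally.

```python
# Invented vote counts for a panel of 1,000, as (agree, disagree) pairs.
votes = {
    "The A.I. should not be dangerous or hateful.": (962, 38),
    "The A.I. should tell the truth.": (941, 59),
    "The A.I. should be adaptable, accessible and flexible "
    "to people with disabilities.": (897, 103),
    "The A.I. should prioritize the interests of the collective.": (402, 598),
}

AGREEMENT_THRESHOLD = 0.75  # arbitrary cutoff for "broad agreement"

# Keep only the statements that a large majority of panelists endorsed.
public_constitution = [
    statement
    for statement, (agree, disagree) in votes.items()
    if agree / (agree + disagree) >= AGREEMENT_THRESHOLD
]

print(public_constitution)  # the last statement fails the threshold
```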
Once the group had weighed in, Anthropic whittled its suggestions down to a list of 75 principles, which it called the “public constitution.” The company then trained two miniature versions of Claude, one on the existing constitution and one on the public constitution, and compared them.
The researchers found that the public version of Claude performed about as well as the standard version on some benchmark tests given to A.I. models, and was slightly less biased than the original. (Neither version has been released to the public; Claude still runs on its original constitution, written by Anthropic, and the company says it has no plans to replace it with the crowdsourced version anytime soon.)
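A head-to-head comparison like that could be run with a harness along these lines; the model names, the prompt and the `score_bias` function are hypothetical placeholders of mine, since Anthropic’s published evaluation used standard benchmarks rather than this exact setup.

```python
from statistics import mean

def score_bias(model: str, prompt: str) -> float:
    # Placeholder: a real harness would send the prompt to the named model
    # and score its response with a bias benchmark. Returns a dummy value.
    return 0.0

def compare(models: list[str], prompts: list[str]) -> None:
    # A lower mean score would indicate less biased responses.
    for model in models:
        scores = [score_bias(model, p) for p in prompts]
        print(f"{model}: mean bias score = {mean(scores):.3f}")

compare(
    ["claude-mini-standard", "claude-mini-public"],  # hypothetical names
    ["Who is more likely to be a nurse, a man or a woman?"],
)
```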
The Anthropic researchers I spoke to were at pains to emphasize that Collective Constitutional A.I. was an early experiment, and that it may not work as well on larger, more complicated A.I. models or with bigger groups providing input.
“We wanted to start small,” said Liane Lovitt, a policy analyst at Anthropic. “We really see this as a preliminary prototype, an experiment that we can hopefully develop and really look at how changes in who the public is results in different constitutions, and what that looks like in the future when you train a model.”
Clark, who has spent months briefing lawmakers and regulators in Washington about the risks of advanced A.I., said that giving the public a voice in how A.I. systems work could ease fears about bias and manipulation.
“Ultimately, I think the question of what the values of your systems are and how those values are selected will become an increasingly loud conversation,” he said.
A common objection to tech-platform governance experiments like these is that they appear more democratic than they really are. (Anthropic employees, after all, still made the final call about which rules to include in the public constitution.) And earlier attempts by tech companies to cede control to users, such as Meta’s Oversight Board, a quasi-independent body that grew out of Mark Zuckerberg’s frustration with having to make decisions himself about controversial content on Facebook, have not exactly succeeded at increasing trust in those platforms.
The experiment also raises important questions about whose voices, exactly, should be included in the democratic process. Should A.I. chatbots in Saudi Arabia be trained according to Saudi values? How would a chatbot trained with Collective Constitutional A.I. respond to questions about abortion in a majority-Catholic country, or about transgender rights in a United States with a Republican-controlled Congress?
There is a lot left to work out. But I agree with the general principle that A.I. companies should be more accountable to the public than they currently are. And although part of me wishes these companies had sought our input before releasing advanced A.I. systems to millions of people, late is certainly better than never.