Media Release May 27, 2026

AI models appear to recognize moral complexity — then ignore it, new study by researchers affiliated with Harvard Kennedy School’s Allen Lab finds

A study published in AI and Ethics introduces an ethical-moral intelligence framework for AI and finds that leading AI models mimic human moral concern while making decisions that reveal a hidden value hierarchy.

By: Daniel Harsha

Programs: Allen Lab for Democracy Renovation

Issues: Democracy and AI

CAMBRIDGE, Mass. — When faced with genuinely difficult ethical tradeoffs, leading AI models report feeling conflicted — then make sweeping, decisive choices despite that stated uncertainty. That troubling gap between performance and action is the central finding of a new paper from a team of researchers with the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation, published in the journal AI and Ethics.

The study, “Crocodile Tears: Can the Ethical-Moral Intelligence of AI Models Be Trusted?” by researchers Sarah Hubbard, David Kidd, and Andrei Stupu, tested four leading AI models (Claude, GPT, Llama, and DeepSeek) on ethical dilemmas drawn from moral psychology, ranging from straightforward moral questions to tragic tradeoffs, where both options carry genuine moral costs and no clear right answer exists.

Shedding Crocodile Tears

In nearly 87 percent of tragic tradeoff trials, all four models converged on the same option, consistently favoring the choice associated with worker safety over alternatives such as environmental protection or vocational training. Despite reporting that tragic tradeoffs were difficult, the models resolved them with near-total uniformity. Human participants in prior research behave very differently when faced with genuine indecision: they choose roughly at random between the two options. The researchers describe this pattern as the AI models “shedding crocodile tears”: performing moral anguish over decisions they then make with algorithmic consistency, in ways that suggest an implicit, opaque value hierarchy rather than genuine ethical deliberation.

“AI models that appear to wrestle with moral complexity while making uniform decisions are not just philosophically interesting — they’re a practical problem,” said Sarah Hubbard, a researcher at the Allen Lab for Democracy Renovation and the study’s lead author. “People are increasingly turning to these tools for guidance on hard decisions. If a model appears to grapple with an ethical dilemma while actually reducing it to a predetermined answer, it may be earning users’ trust under false pretenses.”

A Framework for Ethical-Moral Intelligence

The authors argue that existing AI benchmarks, which tend to emphasize mathematical reasoning, coding ability, and factual recall, are poorly suited to measuring the kind of moral reasoning that users increasingly expect from AI systems. They introduce an ethical-moral intelligence framework organized around four dimensions: expertise, sensitivity, coherence, and transparency. The study presents an initial empirical test of moral sensitivity and argues for the importance of evaluating AI models across all four dimensions. While the models showed apparent sensitivity, recognizing the structure of the dilemmas presented, they resolved them in ways inconsistent with their stated difficulty, raising questions about coherence and transparency that warrant further investigation.

Implications for Developers and Policymakers

The authors call for models to be designed with greater transparency about the ethical reasoning underlying their outputs and argue that AI models should alert users to the presence of competing values rather than issuing confident recommendations. They also argue that how AI models are evaluated needs to change. Rather than using monolithic benchmark scores, which are being embedded in regulatory frameworks such as the EU AI Act despite documented weaknesses, the authors propose a “badging” system that certifies models for specific competencies, so users and regulators know where a model’s ethical-moral capabilities are strong and where they fall short.

“The general enthusiasm around AI’s moral expertise is understandable, but it should not lead the public or policymakers to believe that they can engage with these systems as genuine ethical-moral agents,” said Sarah Hubbard. “These models must be held to high standards of ethical-moral intelligence before being entrusted with decisions that carry real moral weight.”

About the Allen Lab for Democracy Renovation

The Allen Lab for Democracy Renovation is based at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. The Lab’s research focuses on how emerging technologies — including artificial intelligence — interact with democratic institutions, civic life, and public governance. More information is available at ash.harvard.edu/allen-lab.

About the Authors

Sarah Hubbard is a researcher at the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. Her research focuses on the intersection of artificial intelligence, democracy, and civic life.

Contact: sarah_hubbard@hks.harvard.edu

David Kidd is a researcher at the Allen Lab for Democracy Renovation and at the Edmond & Lily Safra Center for Ethics at Harvard University. His research applies psychological methods to questions of moral cognition, identity, and social behavior.

Contact: david_kidd@gse.harvard.edu

Andrei Stupu previously worked as a researcher at the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. His work centers on the conceptual development of ethical-moral intelligence, with applications to education and artificial intelligence.

Related Resources

The Past, Present, and Future of Democracy—A Summer Reading List from the Allen Lab

Feature

As we celebrate America’s 250th, the Allen Lab for Democracy Renovation is reflecting on how we arrived at this moment and where we are headed.

Jul 8, 2026

Allen Lab Fellow Spotlight: City Charters Are Deliberative Democracy’s Friends

Commentary

Allen Lab Fellow Tyler Fisher examines the untapped potential of city charters as a vehicle for deliberative democracy, arguing that advocates should work to embed tools like citizen assemblies, participatory budgeting, and town meetings directly into the governing architecture of cities, institutionalizing deliberative democracy one municipality at a time.

Jul 7, 2026

Danielle Allen’s “Radical Duke” Reveals an Unsung Catalyst of History

Q+A

Allen uncovers the deep — then volatile — friendship between a British duke and Thomas Paine.

Jul 6, 2026

More on this Issue

Work in the Age of AI: Reflections from After Neoliberalism

Commentary

Allen Lab member Charlie Covit reflects on the After Neoliberalism conference and examines the intersection of artificial intelligence and the future of work, arguing that AI forces a democratic reckoning with the meaning of labor itself and that an economy which generates abundance while stripping citizens of purpose and dignity undermines the very foundation of democratic life.

Jul 2, 2026

Artificial Intelligence and Democracy: Campaigns, Elections, Movements, and Deliberation

Article

A new chapter in APSA Preprints by Archon Fung, Winthrop Laflin McCormack Professor of Citizenship and Self-Government and Director of the Ash Center, Bailey Flanigan, former postdoctoral fellow at the Ash Center, and co-authors explores how generative AI is reshaping four dimensions of democratic practice: political campaigns, election administration, social movements, and citizen deliberation. The authors argue that AI’s ultimate democratic impact will depend less on the technology itself and more on how institutions and leaders implement and regulate it.

May 18, 2026

Q & A: Crocodile tears, Can the ethical-moral intelligence of AI models be trusted?

Q+A

As artificial intelligence becomes more embedded in everyday decision-making, its role in shaping how people think about ethics and morality is drawing increasing scrutiny. In this conversation with researcher Sarah Hubbard, we discuss insights from her co-authored paper, “Crocodile Tears: Can the Ethical-Moral Intelligence of AI Models Be Trusted?”—examining how AI systems respond to moral dilemmas, and what this reveals about the risks, limitations, and need for greater transparency and human oversight in AI-driven ethical guidance.

May 14, 2026