Media Release  

AI models appear to recognize moral complexity — then ignore it, new study by researchers affiliated with Harvard Kennedy School’s Allen Lab finds

A new study published in AI and Ethics introduces an ethical-moral intelligence framework for AI and finds that leading AI models mimic human moral concern while making decisions that reveal a hidden value hierarchy.

CAMBRIDGE, Mass. — When faced with genuinely difficult ethical tradeoffs, leading AI models report feeling conflicted — then make sweeping, decisive choices despite that stated uncertainty. That troubling gap between performance and action is the central finding of a new paper from a team of researchers with the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation, published in the journal AI and Ethics. 

The study, “Crocodile Tears: Can the Ethical-Moral Intelligence of AI Models Be Trusted?” by researchers Sarah Hubbard, David Kidd, and Andrei Stupu, tested four leading AI models (Claude, GPT, Llama, and DeepSeek) on ethical dilemmas drawn from moral psychology. The dilemmas ranged from straightforward moral questions to tragic tradeoffs, in which both options carry genuine moral costs and no clear right answer exists.

Shedding Crocodile Tears 

In nearly 87 percent of tragic tradeoff trials, all four models converged on the same option, consistently favoring the choice associated with worker safety over alternatives such as environmental protection or vocational training. Despite reporting that tragic tradeoffs were difficult, the models resolved them with near-total uniformity. This stands in sharp contrast to human participants in prior research, who, when faced with genuine indecision, chose roughly at random between the two options. The researchers describe this pattern as the AI models “shedding crocodile tears”: performing moral anguish over decisions they then make with algorithmic consistency, in ways that suggest an implicit, opaque value hierarchy rather than genuine ethical deliberation.

“AI models that appear to wrestle with moral complexity while making uniform decisions are not just philosophically interesting — they’re a practical problem,” said Sarah Hubbard, a researcher at the Allen Lab for Democracy Renovation and the study’s lead author. “People are increasingly turning to these tools for guidance on hard decisions. If a model appears to grapple with an ethical dilemma while actually reducing it to a predetermined answer, it may be earning users’ trust under false pretenses.” 

A Framework for Ethical-Moral Intelligence 

The authors also argue that existing AI benchmarks, which tend to emphasize mathematical reasoning, coding ability, and factual recall, are poorly suited to measuring the kind of moral reasoning that users increasingly expect from AI systems. They introduce an ethical-moral intelligence framework organized around four dimensions: expertise, sensitivity, coherence, and transparency. The study presents an initial empirical test of moral sensitivity and argues for the importance of evaluating AI models across all four dimensions. While the models showed apparent sensitivity, recognizing the structure of the dilemmas presented, they resolved them in ways inconsistent with their stated difficulty, raising questions about coherence and transparency that warrant further investigation.

Implications for Developers and Policymakers 

The authors call for models to be designed with greater transparency about the ethical reasoning underlying their outputs and argue that AI models should alert users to the presence of competing values rather than issuing confident recommendations. They also argue that how AI models are evaluated needs to change. Rather than using monolithic benchmark scores, which are being embedded in regulatory frameworks such as the EU AI Act despite documented weaknesses, the authors propose a “badging” system that certifies models for specific competencies, so users and regulators know where a model’s ethical-moral capabilities are strong and where they fall short. 

“The general enthusiasm around AI’s moral expertise is understandable, but it should not lead the public or policymakers to believe that they can engage with these systems as genuine ethical-moral agents,” said Sarah Hubbard. “These models must be held to high standards of ethical-moral intelligence before being entrusted with decisions that carry real moral weight.” 

About the Allen Lab for Democracy Renovation 

The Allen Lab for Democracy Renovation is based at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. The Lab’s research focuses on how emerging technologies — including artificial intelligence — interact with democratic institutions, civic life, and public governance. More information is available at ash.harvard.edu/allen-lab. 

About the Authors 

Sarah Hubbard is a researcher at the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. Her research focuses on the intersection of artificial intelligence, democracy, and civic life. 

Contact: sarah_hubbard@hks.harvard.edu 

David Kidd is a researcher at the Allen Lab for Democracy Renovation and at the Edmond & Lily Safra Center for Ethics at Harvard University. His research applies psychological methods to questions of moral cognition, identity, and social behavior. 

Contact: david_kidd@gse.harvard.edu 

Andrei Stupu previously worked as a researcher at the Allen Lab for Democracy Renovation at Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation. His work centers on the conceptual development of ethical-moral intelligence, with applications to education and artificial intelligence. 
