Below you will find the Plurality Public Comment on U.S. AI Safety Institute Guidance. Public commenting is an important opportunity to have a voice on the topic at hand and is essential to providing input into the development of effective rules and regulations that serve the community.
The U.S. AI Safety Institute’s guidelines for managing AI misuse risks are commendable, especially their focus on mitigating risks before deployment. The principles in Objectives 5 and 6, which emphasize both pre-deployment safety and ongoing post-deployment monitoring, are particularly strong. The recommendations for independent third-party safety research, external safety testing, and internal reporting protections are also welcomed. Overall, the draft guidance offers a solid foundation for ensuring the safety of dual-use AI models for the public.
We commend the leadership of the U.S. AI Safety Institute for establishing guidelines for managing misuse risk and, importantly, for its bedrock principle that risk be properly managed and mitigated before AI deployment. This is a positive step forward for AI safety and a promising direction for the development of standards in the field. In particular, we felt the recommendations in Objectives 5 and 6 were especially strong. We appreciated the recognition that safety spans the full lifecycle of AI – not only the pre-deployment stages of development, but also post-deployment monitoring and response. We agree with the need to provide safe harbors for independent third-party safety research, the need to establish a robust regime of both external safety testing of models and protections for internal reporting of safety concerns, and the creation of other internal processes and norms that will set an organization up for success. We believe that this draft guidance is a strong starting point for the guidelines needed to ensure that dual-use foundation models are as safe as possible for the public.
Below, we offer a few suggestions for your consideration in revision which we believe will further strengthen the guidance:
Consider open-source models.
Many leading companies today are releasing open-source AI models (e.g. Meta’s Llama 3), which have already seen rapid adoption. This guidance does not seem to adequately address the open-source approach to model deployment. For example, mentions of model theft (Objective 3) would not be as relevant. If open-source development is purposely out-of-scope, we believe it would be helpful to add more information addressing this in the Scope and Key Challenges sections.
Too much is left up to developers’ own risk thresholds and determinations.
While the guidance is understandably written to be flexible, we worry that far too much is deferred to developers’ own risk thresholds without clear guidance to standardize or categorize the degree of risk, such as in Objective 2 Practice 2.1, Objective 3 Practice 3.2, and Objective 5 Practice 5.3. An organization’s own interpretation of the risks of its AI models is highly subjective, and history suggests that market incentives will drive many developers to underrate the safety risks their AI systems pose to the public. To address this, we recommend that the guidance further define acceptable risk thresholds for developers to follow.
The guidance seems to assume a high degree of internal capacity and expertise among model developers to understand and parse dual-use risks to society. How will such developers – especially lesser-resourced ones, but even the largest companies and labs – acquire and leverage multifaceted risk expertise to ascertain what kinds and degrees of risk their models pose to individuals and society? If they are making use of external expertise, how can the public trust and validate that expertise? This concern applies to most of the objectives in the draft guidance. We recommend that the guidance advise model developers on how best to find, requisition, and make use of threat and risk expertise to pressure-test their models.
Create clear, transparent, and public deployment criteria to guide decision making.
Defined criteria and thresholds, established in the early phases of a project, should be used to make decisions that either justify deployment or serve as a “tripwire” that can block or roll back a deployment. While this type of consideration is mentioned in Objective 5 Practice 5.3, we believe it should be strengthened, and that clear, transparent, and public criteria – aligned with risk thresholds and mitigation processes – should be established by AI developers well in advance to guide deployment decisions.
Perhaps a flowchart-style decision tree (e.g., the Frontier AI Regulation Blueprint) for these objectives would be helpful, so that “stop, go back” steps can be included when deployment criteria are not met.
Cultivate internal incentives, cultures, and norms for anticipating, reporting, and mitigating safety risks upstream in the development process.
We appreciate the inclusion of recommendations for creating incentive systems, such as in Objective 6 Practice 6.5, but believe the draft guidance would benefit from additional measures and a strong overall call for developers to institute robust internal incentive systems and cultures for addressing safety risks. Beyond strengthening and expanding the bounty program, these could include rewards for employees who identify and report safety issues, channels of direct communication to company leaders, and other mechanisms that encourage raising concerns, so that all actors are aligned around identifying and mitigating risks early and often.
We support Objective 6 Practice 6.3 establishing protections for internal reporting, but would recommend moving this to the pre-deployment stage, as well as including documentation about how the whistleblower protections have been communicated to staff. It is important that employees clearly understand their protections, and documenting communications encourages companies to be more transparent and forthright with their employees about such protections.
Include rapid incident reporting to relevant authorities.
As this is such a quickly evolving space, information sharing will be critical to refining the ability to identify and respond to threats. In Objective 7, if a misuse issue does occur, we recommend adding reporting, escalation, or notification steps to relevant authorities and partners, as well as to the U.S. AI Safety Institute. We would also encourage sharing findings back with the developer of the proxy model used. More generally, additional guidance on how information should flow, and to whom, would be incredibly helpful in strengthening this iterative process.
Add direct references to help operationalize this guidance where possible.
We think this guidance could be strengthened by a few direct references to common benchmarks (Objective 1 Practice 1.3), proxy models one might use (Objective 4), and case studies – perhaps using a referenced proxy model – of the suggested threat models and impact assessments.
The comments provided are from members of our research community. For any additional information, please reach out to Sarah Hubbard (sarah_hubbard@hks.harvard.edu).
The year 2024 was dubbed “the largest election year in global history,” with half the world’s population voting in national elections. Earlier this year, we hosted an event on AI and the 2024 Elections where scholars spoke about the potential influence of artificial intelligence on the election cycle – from misinformation to threats on election infrastructure. This webinar offered a reflection and exploration of the impacts of technology on the 2024 election landscape.
Earlier this year, the Allen Lab for Democracy Renovation hosted a convening on the Political Economy of AI. This collection of essays from leading scholars and experts raises critical questions surrounding power, governance, and democracy as they consider how technology can better serve the public interest.
As a part of the Allen Lab’s Political Economy of AI Essay Collection, David Gray Widder and Mar Hicks draw on the history of tech hype cycles to warn against the harmful effects of the current generative AI bubble.