5 AI RMF Core
The AI RMF Core provides outcomes and actions that enable dialogue, understanding, and activities to manage AI risks and responsibly develop trustworthy AI systems. As illustrated in Figure 5, the Core is composed of four functions: govern, map, measure, and manage. Each of these high-level functions is broken down into categories and subcategories. Categories and subcategories are subdivided into specific actions and outcomes. Actions do not constitute a checklist, nor are they necessarily an ordered set of steps.
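The hierarchy described above (functions, then categories, then subcategories) can be pictured as a small nested data structure. The following is a minimal sketch only: the class and field names (`Function`, `Category`, `Subcategory`, `find`) are illustrative assumptions rather than part of the Framework, and the single populated entry is abbreviated from Table 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Subcategory:
    ident: str    # e.g. "Govern 1.1"
    outcome: str  # outcome statement taken from the Framework tables


@dataclass
class Category:
    ident: str    # e.g. "Govern 1"
    outcome: str
    subcategories: List[Subcategory] = field(default_factory=list)


@dataclass
class Function:
    name: str     # one of: govern, map, measure, manage
    categories: List[Category] = field(default_factory=list)

    def find(self, ident: str) -> Optional[Subcategory]:
        """Return the subcategory with the given identifier, or None."""
        for category in self.categories:
            for sub in category.subcategories:
                if sub.ident == ident:
                    return sub
        return None


# The four Core functions; only one govern entry is filled in here,
# and the category outcome text is abbreviated for illustration.
core = {name: Function(name) for name in ("govern", "map", "measure", "manage")}
core["govern"].categories.append(
    Category(
        ident="Govern 1",
        outcome="Policies, processes, procedures, and practices ... are in place.",
        subcategories=[
            Subcategory(
                ident="Govern 1.1",
                outcome="Legal and regulatory requirements involving AI are "
                        "understood, managed, and documented.",
            )
        ],
    )
)

print(core["govern"].find("Govern 1.1").outcome)
```

Because the actions are neither a checklist nor an ordered set of steps, the sketch deliberately stores no ordering or completion state; it only supports looking up an outcome by its identifier.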
Risk management should be continuous, timely, and performed across every dimension of the AI system lifecycle. AI RMF Core functions should be carried out in a way that reflects diverse and multidisciplinary perspectives, potentially including the views of AI actors outside the organization. Having a diverse team contributes to more open sharing of ideas and assumptions about purposes and functions of the technology being designed, developed, deployed, or evaluated – which can create opportunities to surface problems and identify existing and emergent risks.
An online companion resource to the AI RMF, the NIST AI RMF Playbook, is available to help organizations navigate the AI RMF and achieve its outcomes through suggested tactical actions they can apply within their own contexts. Like the AI RMF, the Playbook is voluntary and organizations can utilize the suggestions according to their needs and interests. Playbook users can create tailored guidance selected from suggested material for their own use and contribute their suggestions for sharing with the broader community. Along with the AI RMF, the Playbook is part of the NIST Trustworthy and Responsible AI Resource Center.
5.1 Govern
The govern function:
- cultivates and implements a culture of risk management within organizations designing, developing, deploying, evaluating, or acquiring AI systems;
- outlines processes, documents, and organizational schemes that anticipate, identify, and manage the risks a system can pose, including to users and others across society – and procedures to achieve those outcomes;
- incorporates processes to assess potential impacts;
- provides a structure by which AI risk management functions can align with organizational principles, policies, and strategic priorities;
- connects technical aspects of AI system design and development to organizational values and principles, and enables organizational practices and competencies for the individuals involved in acquiring, training, deploying, and monitoring such systems; and
- addresses full product lifecycle and associated processes, including legal and other issues concerning use of third-party software or hardware systems and data.
The govern function is cross-cutting: it is infused throughout AI risk management and enables the other functions of the process. Aspects of govern, especially those related to compliance or evaluation, should be integrated into each of the other functions. Attention to governance is a continual and intrinsic requirement for effective AI risk management over an AI system’s lifespan and across the organization’s hierarchy.
Strong governance can drive and enhance internal practices and norms to facilitate organizational risk culture. Governing authorities can determine the overarching policies that direct an organization’s mission, goals, values, culture, and risk tolerance. Senior leadership sets the tone for risk management within an organization, and with it, organizational culture. Management aligns the technical aspects of AI risk management to policies and operations. Documentation can enhance transparency, improve human review processes, and bolster accountability in AI system teams.
After putting in place the structures, systems, processes, and teams described in the govern function, organizations should benefit from a purpose-driven culture focused on risk understanding and management. It is incumbent on Framework users to continue to execute the govern function as knowledge, cultures, and needs or expectations from AI actors evolve over time.
Practices related to governing AI risks are described in the NIST AI RMF Playbook. Table 1 lists the govern function’s categories and subcategories.
| Categories | Subcategories |
|---|---|
| Govern 1: Policies, processes, procedures, and practices across the organization related to the mapping, measuring, and managing of AI risks are in place, transparent, and implemented effectively. | Govern 1.1: Legal and regulatory requirements involving AI are understood, managed, and documented. |
| | Govern 1.2: The characteristics of trustworthy AI are integrated into organizational policies, processes, procedures, and practices. |
| | Govern 1.3: Processes, procedures, and practices are in place to determine the needed level of risk management activities based on the organization's risk tolerance. |
| | Govern 1.4: The risk management process and its outcomes are established through transparent policies, procedures, and other controls based on organizational risk priorities. |
| | Govern 1.5: Ongoing monitoring and periodic review of the risk management process and its outcomes are planned and organizational roles and responsibilities clearly defined, including determining the frequency of periodic review. |
| | Govern 1.6: Mechanisms are in place to inventory AI systems and are resourced according to organizational risk priorities. |
| | Govern 1.7: Processes and procedures are in place for decommissioning and phasing out AI systems safely and in a manner that does not increase risks or decrease the organization’s trustworthiness. |
| Govern 2: Accountability structures are in place so that the appropriate teams and individuals are empowered, responsible, and trained for mapping, measuring, and managing AI risks. | Govern 2.1: Roles and responsibilities and lines of communication related to mapping, measuring, and managing AI risks are documented and are clear to individuals and teams throughout the organization. |
| | Govern 2.2: The organization’s personnel and partners receive AI risk management training to enable them to perform their duties and responsibilities consistent with related policies, procedures, and agreements. |
| | Govern 2.3: Executive leadership of the organization takes responsibility for decisions about risks associated with AI system development and deployment. |
| Govern 3: Workforce diversity, equity, inclusion, and accessibility processes are prioritized in the mapping, measuring, and managing of AI risks throughout the lifecycle. | Govern 3.1: Decision-making related to mapping, measuring, and managing AI risks throughout the lifecycle is informed by a diverse team (e.g., diversity of demographics, disciplines, experience, expertise, and backgrounds). |
| | Govern 3.2: Policies and procedures are in place to define and differentiate roles and responsibilities for human-AI configurations and oversight of AI systems. |
| Govern 4: Organizational teams are committed to a culture that considers and communicates AI risk. | Govern 4.1: Organizational policies and practices are in place to foster a critical thinking and safety-first mindset in the design, development, deployment, and uses of AI systems to minimize potential negative impacts. |
| | Govern 4.2: Organizational teams document the risks and potential impacts of the AI technology they design, develop, deploy, evaluate, and use, and they communicate about the impacts more broadly. |
| | Govern 4.3: Organizational practices are in place to enable AI testing, identification of incidents, and information sharing. |
| Govern 5: Processes are in place for robust engagement with relevant AI actors. | Govern 5.1: Organizational policies and practices are in place to collect, consider, prioritize, and integrate feedback from those external to the team that developed or deployed the AI system regarding the potential individual and societal impacts related to AI risks. |
| | Govern 5.2: Mechanisms are established to enable the team that developed or deployed AI systems to regularly incorporate adjudicated feedback from relevant AI actors into system design and implementation. |
| Govern 6: Policies and procedures are in place to address AI risks and benefits arising from third-party software and data and other supply chain issues. | Govern 6.1: Policies and procedures are in place that address AI risks associated with third-party entities, including risks of infringement of a third-party’s intellectual property or other rights. |
| | Govern 6.2: Contingency processes are in place to handle failures or incidents in third-party data or AI systems deemed to be high-risk. |
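Govern 1.5 and Govern 1.6 together imply an inventory of AI systems with a defined review frequency. One possible shape for such an inventory is sketched below. Everything in it is a hypothetical illustration: the record fields, the three risk tiers, and the review intervals are assumptions that an organization would replace with values drawn from its own risk priorities.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class InventoryEntry:
    system_name: str
    owner: str          # accountable team or individual (cf. Govern 2.1)
    risk_tier: str      # "high", "medium", or "low" (assumed tiers)
    last_review: date


# Assumed review intervals per tier, in days; not prescribed by the Framework.
REVIEW_INTERVAL_DAYS = {"high": 90, "medium": 180, "low": 365}


def overdue_for_review(inventory: List[InventoryEntry], today: date) -> List[str]:
    """Return the names of systems whose periodic review is overdue."""
    overdue = []
    for entry in inventory:
        interval = timedelta(days=REVIEW_INTERVAL_DAYS[entry.risk_tier])
        if today - entry.last_review > interval:
            overdue.append(entry.system_name)
    return overdue


inventory = [
    InventoryEntry("resume-screener", "hr-analytics", "high", date(2024, 1, 15)),
    InventoryEntry("log-summarizer", "platform-ops", "low", date(2024, 1, 15)),
]

print(overdue_for_review(inventory, today=date(2024, 7, 1)))
```

Tying the interval to the risk tier is one way to resource the inventory "according to organizational risk priorities"; a flat interval for every system would be simpler but would ignore that prioritization.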
5.2 Map
The map function establishes the context to frame risks related to an AI system. The AI lifecycle consists of many interdependent activities involving a diverse set of actors (see Figure 3). In practice, AI actors in charge of one part of the process often do not have full visibility or control over other parts and their associated contexts. The interdependencies between these activities, and among the relevant AI actors, can make it difficult to reliably anticipate impacts of AI systems. For example, early decisions in identifying purposes and objectives of an AI system can alter its behavior and capabilities, and the dynamics of the deployment setting (such as end users or impacted individuals) can shape the impacts of AI system decisions. As a result, the best intentions within one dimension of the AI lifecycle can be undermined via interactions with decisions and conditions in other, later activities. This complexity and varying levels of visibility can introduce uncertainty into risk management practices. Anticipating, assessing, and otherwise addressing potential sources of negative risk can mitigate this uncertainty and enhance the integrity of the decision process.
The information gathered while carrying out the map function enables negative risk prevention and informs decisions for processes such as model management, as well as an initial decision about appropriateness or the need for an AI solution. Outcomes in the map function are the basis for the measure and manage functions. Without contextual knowledge, and awareness of risks within the identified contexts, risk management is difficult to perform. The map function is intended to enhance an organization’s ability to identify risks and broader contributing factors.
Implementation of this function is enhanced by incorporating perspectives from a diverse internal team and engagement with those external to the team that developed or deployed the AI system. Engagement with external collaborators, end users, potentially impacted communities, and others may vary based on the risk level of a particular AI system, the makeup of the internal team, and organizational policies. Gathering such broad perspectives can help organizations proactively prevent negative risks and develop more trustworthy AI systems by:
- improving their capacity for understanding contexts;
- checking their assumptions about context of use;
- enabling recognition of when systems are not functional within or outside their intended context;
- identifying positive and beneficial uses of their existing AI systems;
- improving understanding of limitations in AI and ML processes;
- identifying constraints in real-world applications that may lead to negative impacts;
- identifying known and foreseeable negative impacts related to intended use of AI systems; and
- anticipating risks of the use of AI systems beyond intended use.
After completing the map function, Framework users should have sufficient contextual knowledge about AI system impacts to inform an initial go/no-go decision about whether to design, develop, or deploy an AI system. If a decision is made to proceed, organizations should utilize the measure and manage functions along with policies and procedures put into place in the govern function to assist in AI risk management efforts. It is incumbent on Framework users to continue applying the map function to AI systems as context, capabilities, risks, benefits, and potential impacts evolve over time.
Practices related to mapping AI risks are described in the NIST AI RMF Playbook. Table 2 lists the map function’s categories and subcategories.
| Categories | Subcategories |
|---|---|
| Map 1: Context is established and understood. | Map 1.1: Intended purposes, potentially beneficial uses, context-specific laws, norms and expectations, and prospective settings in which the AI system will be deployed are understood and documented. Considerations include: the specific set or types of users along with their expectations; potential positive and negative impacts of system uses to individuals, communities, organizations, society, and the planet; assumptions and related limitations about AI system purposes, uses, and risks across the development or product AI lifecycle; and related TEVV and system metrics. |
| | Map 1.2: Interdisciplinary AI actors, competencies, skills, and capacities for establishing context reflect demographic diversity and broad domain and user experience expertise, and their participation is documented. Opportunities for interdisciplinary collaboration are prioritized. |
| | Map 1.3: The organization’s mission and relevant goals for AI technology are understood and documented. |
| | Map 1.4: The business value or context of business use has been clearly defined or – in the case of assessing existing AI systems – re-evaluated. |
| | Map 1.5: Organizational risk tolerances are determined and documented. |
| | Map 1.6: System requirements (e.g., “the system shall respect the privacy of its users”) are elicited from and understood by relevant AI actors. Design decisions take socio-technical implications into account to address AI risks. |
| Map 2: Categorization of the AI system is performed. | Map 2.1: The specific tasks and methods used to implement the tasks that the AI system will support are defined (e.g., classifiers, generative models, recommenders). |
| | Map 2.2: Information about the AI system’s knowledge limits and how system output may be utilized and overseen by humans is documented. Documentation provides sufficient information to assist relevant AI actors when making decisions and taking subsequent actions. |
| | Map 2.3: Scientific integrity and TEVV considerations are identified and documented, including those related to experimental design, data collection and selection (e.g., availability, representativeness, suitability), system trustworthiness, and construct validation. |
| Map 3: AI capabilities, targeted usage, goals, and expected benefits and costs compared with appropriate benchmarks are understood. | Map 3.1: Potential benefits of intended AI system functionality and performance are examined and documented. |
| | Map 3.2: Potential costs, including non-monetary costs, which result from expected or realized AI errors or system functionality and trustworthiness – as connected to organizational risk tolerance – are examined and documented. |
| | Map 3.3: Targeted application scope is specified and documented based on the system’s capability, established context, and AI system categorization. |
| | Map 3.4: Processes for operator and practitioner proficiency with AI system performance and trustworthiness – and relevant technical standards and certifications – are defined, assessed, and documented. |
| | Map 3.5: Processes for human oversight are defined, assessed, and documented in accordance with organizational policies from the govern function. |
| Map 4: Risks and benefits are mapped for all components of the AI system including third-party software and data. | Map 4.1: Approaches for mapping AI technology and legal risks of its components – including the use of third-party data or software – are in place, followed, and documented, as are risks of infringement of a third party’s intellectual property or other rights. |
| | Map 4.2: Internal risk controls for components of the AI system, including third-party AI technologies, are identified and documented. |
| Map 5: Impacts to individuals, groups, communities, organizations, and society are characterized. | Map 5.1: Likelihood and magnitude of each identified impact (both potentially beneficial and harmful) based on expected use, past uses of AI systems in similar contexts, public incident reports, feedback from those external to the team that developed or deployed the AI system, or other data are identified and documented. |
| | Map 5.2: Practices and personnel for supporting regular engagement with relevant AI actors and integrating feedback about positive, negative, and unanticipated impacts are in place and documented. |
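Map 5.1 asks for the likelihood and magnitude of each identified impact, and Map 1.5 for a documented risk tolerance. A minimal sketch of how the two might be recorded and compared follows. The scales, the likelihood-times-magnitude score, the tolerance value, and the example impacts are all assumptions made for illustration; the Framework does not prescribe a scoring formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Impact:
    description: str
    likelihood: float  # estimated probability in [0, 1] (assumed scale)
    magnitude: int     # 1 (minor) to 5 (severe) (assumed scale)
    beneficial: bool   # Map 5.1 covers beneficial and harmful impacts alike


def harmful_over_tolerance(impacts: List[Impact], tolerance: float) -> List[Impact]:
    """Return harmful impacts whose likelihood * magnitude exceeds the tolerance."""
    return [
        impact
        for impact in impacts
        if not impact.beneficial and impact.likelihood * impact.magnitude > tolerance
    ]


# Hypothetical impacts for a hypothetical system.
impacts = [
    Impact("faster triage of support tickets", 0.8, 3, beneficial=True),
    Impact("qualified applicants wrongly rejected", 0.3, 4, beneficial=False),
    Impact("service outage during retraining", 0.05, 5, beneficial=False),
]

for impact in harmful_over_tolerance(impacts, tolerance=1.0):
    print(impact.description)
```

The flagged items are an input to the initial go/no-go discussion, not a replacement for it: a single number cannot capture the contextual knowledge the map function is meant to gather.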
5.3 Measure
The measure function employs quantitative, qualitative, or mixed-method tools, techniques, and methodologies to analyze, assess, benchmark, and monitor AI risk and related impacts. It uses knowledge relevant to AI risks identified in the map function and informs the manage function. AI systems should be tested before their deployment and regularly while in operation. AI risk measurements include documenting aspects of systems’ functionality and trustworthiness.
Measuring AI risks includes tracking metrics for trustworthy characteristics, social impact, and human-AI configurations. Processes developed or adopted in the measure function should include rigorous software testing and performance assessment methodologies with associated measures of uncertainty, comparisons to performance benchmarks, and formalized reporting and documentation of results. Processes for independent review can improve the effectiveness of testing and can mitigate internal biases and potential conflicts of interest.
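As one illustration of a performance assessment "with associated measures of uncertainty" and a comparison to a benchmark, the sketch below computes a percentile bootstrap confidence interval for accuracy. The bootstrap is only one of several possible methods and is not named by the Framework; the test outcomes and the benchmark value are made-up placeholders.

```python
import random
from typing import List, Tuple


def bootstrap_accuracy_ci(
    correct: List[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    `correct` holds 1 for each correct prediction and 0 for each error.
    """
    rng = random.Random(seed)  # fixed seed so the report is repeatable
    n = len(correct)
    # Resample the test set with replacement and recompute accuracy each time.
    resampled = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    lower = resampled[int((alpha / 2) * n_resamples)]
    upper = resampled[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(correct) / n, lower, upper


# Placeholder evaluation: 85 correct predictions out of 100 test cases.
outcomes = [1] * 85 + [0] * 15
BENCHMARK_ACCURACY = 0.70  # assumed benchmark, for illustration only

accuracy, lower, upper = bootstrap_accuracy_ci(outcomes)
print(f"accuracy={accuracy:.2f}, 95% CI=[{lower:.2f}, {upper:.2f}]")
print("lower bound above benchmark:", lower > BENCHMARK_ACCURACY)
```

Reporting the interval alongside the point estimate, and comparing the lower bound rather than the point estimate to the benchmark, is a conservative design choice for documentation purposes.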
Where tradeoffs among the trustworthy characteristics arise, measurement provides a traceable basis to inform management decisions. Options may include recalibration, impact mitigation, or removal of the system from design, development, production, or use, as well as a range of compensating, detective, deterrent, directive, and recovery controls.
After completing the measure function, objective, repeatable, or scalable test, evaluation, verification, and validation (TEVV) processes including metrics, methods, and methodologies are in place, followed, and documented. Metrics and measurement methodologies should adhere to scientific, legal, and ethical norms and be carried out in an open and transparent process. New types of measurement, qualitative and quantitative, may need to be developed. The degree to which each measurement type provides unique and meaningful information to the assessment of AI risks should be considered. Framework users will enhance their capacity to comprehensively evaluate system trustworthiness, identify and track existing and emergent risks, and verify efficacy of the metrics. Measurement outcomes will be utilized in the manage function to assist risk monitoring and response efforts. It is incumbent on Framework users to continue applying the measure function to AI systems as knowledge, methodologies, risks, and impacts evolve over time.
Practices related to measuring AI risks are described in the NIST AI RMF Playbook. Table 3 lists the measure function’s categories and subcategories.
| Categories | Subcategories |
|---|---|
| Measure 1: Appropriate methods and metrics are identified and applied. | Measure 1.1: Approaches and metrics for measurement of AI risks enumerated during the map function are selected for implementation starting with the most significant AI risks. The risks or trustworthiness characteristics that will not – or cannot – be measured are properly documented. |
| | Measure 1.2: Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated, including reports of errors and potential impacts on affected communities. |
| | Measure 1.3: Internal experts who did not serve as front-line developers for the system and/or independent assessors are involved in regular assessments and updates. Domain experts, users, AI actors external to the team that developed or deployed the AI system, and affected communities are consulted in support of assessments as necessary per organizational risk tolerance. |
| Measure 2: AI systems are evaluated for trustworthy characteristics. | Measure 2.1: Test sets, metrics, and details about the tools used during TEVV are documented. |
| | Measure 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population. |
| | Measure 2.3: AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment setting(s). Measures are documented. |
| | Measure 2.4: The functionality and behavior of the AI system and its components – as identified in the map function – are monitored when in production. |
| | Measure 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of the generalizability beyond the conditions under which the technology was developed are documented. |
| | Measure 2.6: The AI system is evaluated regularly for safety risks – as identified in the map function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and it can fail safely, particularly if made to operate beyond its knowledge limits. Safety metrics reflect system reliability and robustness, real-time monitoring, and response times for AI system failures. |
| | Measure 2.7: AI system security and resilience – as identified in the map function – are evaluated and documented. |
| | Measure 2.8: Risks associated with transparency and accountability – as identified in the map function – are examined and documented. |
| | Measure 2.9: The AI model is explained, validated, and documented, and AI system output is interpreted within its context – as identified in the map function – to inform responsible use and governance. |
| | Measure 2.10: Privacy risk of the AI system – as identified in the map function – is examined and documented. |
| | Measure 2.11: Fairness and bias – as identified in the map function – are evaluated and results are documented. |
| | Measure 2.12: Environmental impact and sustainability of AI model training and management activities – as identified in the map function – are assessed and documented. |
| | Measure 2.13: Effectiveness of the employed TEVV metrics and processes in the measure function are evaluated and documented. |
| Measure 3: Mechanisms for tracking identified AI risks over time are in place. | Measure 3.1: Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts. |
| | Measure 3.2: Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available. |
| | Measure 3.3: Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated into AI system evaluation metrics. |
| Measure 4: Feedback about efficacy of measurement is gathered and assessed. | Measure 4.1: Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users. Approaches are documented. |
| | Measure 4.2: Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended. Results are documented. |
| | Measure 4.3: Measurable performance improvements or declines based on consultations with relevant AI actors, including affected communities, and field data about context-relevant risks and trustworthiness characteristics are identified and documented. |
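Measure 2.4 and Measure 3.1 call for monitoring in production and for tracking intended versus actual performance. A very small sketch of such a check is shown below, under the assumption that a single scalar metric is recorded per reporting period. The baseline, the allowed drop, and the weekly figures are hypothetical values, and real monitoring would typically cover many metrics and trustworthiness characteristics.

```python
from typing import Dict, List, Tuple


def performance_alerts(
    baseline: float,
    observed: Dict[str, float],
    max_drop: float,
) -> List[Tuple[str, float]]:
    """Return (period, value) pairs where the metric fell more than
    `max_drop` below the pre-deployment baseline."""
    threshold = baseline - max_drop
    return [(period, value) for period, value in observed.items() if value < threshold]


# Hypothetical weekly accuracy measured on labelled production samples.
weekly_accuracy = {
    "week 1": 0.89,
    "week 2": 0.88,
    "week 3": 0.81,
    "week 4": 0.86,
}

alerts = performance_alerts(baseline=0.90, observed=weekly_accuracy, max_drop=0.05)
print(alerts)
```

Each alert would then be documented and passed to the manage function; the threshold itself is a measurement choice whose appropriateness should be reassessed over time, in line with Measure 1.2.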
5.4 Manage
The manage function entails allocating risk resources to mapped and measured risks on a regular basis and as defined by the govern function. Risk treatment comprises plans to respond to, recover from, and communicate about incidents or events.
Contextual information gleaned from expert consultation and input from relevant AI actors – established in govern and carried out in map – is utilized in this function to decrease the likelihood of system failures and negative impacts. Systematic documentation practices established in govern and utilized in map and measure bolster AI risk management efforts and increase transparency and accountability. Processes for assessing emergent risks are in place, along with mechanisms for continual improvement.
After completing the manage function, plans for prioritizing risk and regular monitoring and improvement will be in place. Framework users will have enhanced capacity to manage the risks of deployed AI systems and to allocate risk management resources based on assessed and prioritized risks. It is incumbent on Framework users to continue to apply the manage function to deployed AI systems as methods, contexts, risks, and needs or expectations from relevant AI actors evolve over time.
Practices related to managing AI risks are described in the NIST AI RMF Playbook. Table 4 lists the manage function’s categories and subcategories.
| Categories | Subcategories |
|---|---|
| Manage 1: AI risks based on assessments and other analytical output from the map and measure functions are prioritized, responded to, and managed. | Manage 1.1: A determination is made as to whether the AI system achieves its intended purposes and stated objectives and whether its development or deployment should proceed. |
| | Manage 1.2: Treatment of documented AI risks is prioritized based on impact, likelihood, and available resources or methods. |
| | Manage 1.3: Responses to the AI risks deemed high priority, as identified by the map function, are developed, planned, and documented. Risk response options can include mitigating, transferring, avoiding, or accepting. |
| | Manage 1.4: Negative residual risks (defined as the sum of all unmitigated risks) to both downstream acquirers of AI systems and end users are documented. |
| Manage 2: Strategies to maximize AI benefits and minimize negative impacts are planned, prepared, implemented, documented, and informed by input from relevant AI actors. | Manage 2.1: Resources required to manage AI risks are taken into account – along with viable non-AI alternative systems, approaches, or methods – to reduce the magnitude or likelihood of potential impacts. |
| | Manage 2.2: Mechanisms are in place and applied to sustain the value of deployed AI systems. |
| | Manage 2.3: Procedures are followed to respond to and recover from a previously unknown risk when it is identified. |
| | Manage 2.4: Mechanisms are in place and applied, and responsibilities are assigned and understood, to supersede, disengage, or deactivate AI systems that demonstrate performance or outcomes inconsistent with intended use. |
| Manage 3: AI risks and benefits from third-party entities are managed. | Manage 3.1: AI risks and benefits from third-party resources are regularly monitored, and risk controls are applied and documented. |
| | Manage 3.2: Pre-trained models which are used for development are monitored as part of AI system regular monitoring and maintenance. |
| Manage 4: Risk treatments, including response and recovery, and communication plans for the identified and measured AI risks are documented and monitored regularly. | Manage 4.1: Post-deployment AI system monitoring plans are implemented, including mechanisms for capturing and evaluating input from users and other relevant AI actors, appeal and override, decommissioning, incident response, recovery, and change management. |
| | Manage 4.2: Measurable activities for continual improvements are integrated into AI system updates and include regular engagement with interested parties, including relevant AI actors. |
| | Manage 4.3: Incidents and errors are communicated to relevant AI actors, including affected communities. Processes for tracking, responding to, and recovering from incidents and errors are followed and documented. |
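Manage 1.2 through Manage 1.4 can be read together as a small risk register: risks are prioritized by impact and likelihood, each receives one of four response options, and residual risk is the sum of what remains unmitigated. The sketch below is one hypothetical rendering. The 1-5 scales, the product score, the example risks, and the simplifying assumption that any response other than acceptance fully treats a risk are illustrative only; real treatments rarely remove a risk entirely.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Response(Enum):
    """Risk response options listed in Manage 1.3."""
    MITIGATE = "mitigate"
    TRANSFER = "transfer"
    AVOID = "avoid"
    ACCEPT = "accept"


@dataclass
class Risk:
    name: str
    likelihood: int                      # assumed 1-5 ordinal scale
    impact: int                          # assumed 1-5 ordinal scale
    response: Optional[Response] = None  # None: no response planned yet

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def prioritize(risks: List[Risk]) -> List[Risk]:
    """Order risks from highest to lowest score (cf. Manage 1.2)."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


def residual_score(risks: List[Risk]) -> int:
    """Sum the scores of risks that remain unmitigated (cf. Manage 1.4).

    Simplification: accepted risks and risks with no planned response count
    in full; mitigated, transferred, or avoided risks count as zero.
    """
    return sum(r.score for r in risks if r.response in (None, Response.ACCEPT))


# Hypothetical register entries.
register = [
    Risk("vendor model outage", 2, 3, Response.ACCEPT),
    Risk("biased ranking output", 4, 5, Response.MITIGATE),
    Risk("licensing dispute over data", 1, 5, Response.TRANSFER),
    Risk("stale training data", 3, 3),
]

print([r.name for r in prioritize(register)])
print("residual score:", residual_score(register))
```

Available resources and methods, which Manage 1.2 also names as prioritization inputs, are left out of the score here and would need to be weighed separately.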