What does trustworthy AI look like?
A quick overview of NIST's AI Risk Management Framework
AI risk management is central to the responsible development of AI systems. NIST's AI Risk Management Framework (AI RMF), published in January 2023, offers a practical guide for identifying and managing AI risks and for developing trustworthy AI:
…the goal of the AI RMF is to offer a resource to the organizations designing, developing, deploying, or using AI systems to help manage the many risks of AI and promote trustworthy and responsible development and use of AI systems.1
The framework provided by NIST sets out the characteristics that make up a trustworthy AI system.2 These characteristics include:
Valid and reliable
Safe
Secure and resilient
Accountable and transparent
Explainable and interpretable
Privacy-enhanced
Fair with harmful bias managed
Enhancing the trustworthiness of an AI system reduces its associated risks. This applies to actors across the AI value chain, including providers, deployers and users of AI systems.
It is also important to note that these characteristics are not independent of each other. Organisations will need to use the framework to carefully balance these characteristics depending on the relevant circumstances:
Trustworthiness characteristics explained in this document influence each other. Highly secure but unfair systems, accurate but opaque and uninterpretable systems, and inaccurate but secure, privacy-enhanced, and transparent systems are all undesirable. A comprehensive approach to risk management calls for balancing tradeoffs among the trustworthiness characteristics.3
Below is a more detailed description of NIST's trustworthy AI characteristics along with the relevant risks4 that can be addressed by implementing these characteristics.
Valid and reliable
Description: Validation means using objective evidence to confirm that the requirements for the system have been fulfilled, for example that the system generalises to data outside its training environment. Reliability means the system performs as required, without failure, over its entire lifetime. Accuracy refers to how close the system's outputs are to the true values or to values accepted as true. Robustness means the system maintains its level of performance under different circumstances. These elements are typically assessed post-deployment through ongoing testing and monitoring.
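To make this concrete, below is a minimal sketch of what ongoing post-deployment testing and monitoring might look like in practice. The model outputs, feature values and alert thresholds are all hypothetical, and real monitoring would use purpose-built tooling and statistically rigorous drift tests.

```python
from statistics import mean, stdev

def accuracy(y_true, y_pred):
    """Fraction of outputs matching values accepted as true."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def standardised_mean_shift(train_values, live_values):
    """Crude drift signal: distance of the live feature mean from the
    training mean, measured in training standard deviations."""
    return abs(mean(live_values) - mean(train_values)) / stdev(train_values)

# Hypothetical training-time reference data and a live monitoring batch.
train_feature = [0.90, 1.10, 1.00, 0.95, 1.05, 1.00, 0.98]
live_feature = [1.60, 1.70, 1.50, 1.65, 1.55, 1.60, 1.70]

y_true = [1, 0, 1, 1, 0]          # labels collected after deployment
y_pred = [1, 0, 0, 1, 0]          # outputs from the deployed model

ACCURACY_FLOOR = 0.85             # illustrative thresholds, set per use case
DRIFT_CEILING = 3.0

if accuracy(y_true, y_pred) < ACCURACY_FLOOR:
    print("ALERT: accuracy below agreed floor; review and retrain")
if standardised_mean_shift(train_feature, live_feature) > DRIFT_CEILING:
    print("ALERT: input distribution has drifted from the training data")
```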
Relevant Risks:
Poorly generalized AI systems
AI systems requiring more frequent maintenance due to data, model or concept drift
Underdeveloped software testing standards
Difficulty in performing regular AI-based software testing
Intentional or unintentional changes during training impacting performance
Safe
Description: Safety means the system should not, under defined conditions, lead to a state that endangers human life, health, property or the environment. Safety can be improved through: (i) responsible practices for design, development and deployment; (ii) clear information to deployers on responsible use; (iii) responsible decision-making by deployers and end users; and (iv) explanation and documentation of risks based on empirical evidence of incidents.
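One common engineering expression of practice (iii) is to avoid acting autonomously when the system is uncertain in a high-stakes context. The sketch below is illustrative only; the confidence threshold and decision labels are hypothetical and would need to be set with domain experts.

```python
CONFIDENCE_FLOOR = 0.90   # hypothetical threshold, set with domain experts

def decide(prediction: str, confidence: float) -> str:
    """Fail safe rather than fail silent: act only on confident outputs,
    otherwise escalate to a human reviewer."""
    if confidence >= CONFIDENCE_FLOOR:
        return f"automated decision: {prediction}"
    return "deferred to human review (confidence too low)"

print(decide("approve", 0.97))   # automated decision: approve
print(decide("approve", 0.62))   # deferred to human review (confidence too low)
```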
Relevant Risks:
Greater difficulty in predicting failure modes for emergent properties of large-scale pre-trained models
Inability to predict or detect the side effects of AI-based systems beyond statistical measures
Computational costs of AI development and their impact on the environment
Secure and resilient
Description: Resilience means the system can withstand adverse events or unexpected changes in its environment. Security means the system can maintain confidentiality, integrity and availability. Accordingly, resilience is about the system returning to normal function after an adverse event, whereas security is about preventing adverse events from taking place and having protocols in place to respond to them if they do materialise.
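As a simple illustration of resilience (returning to normal function after an adverse event), the sketch below wraps a primary model with a conservative fallback. All components are hypothetical; a real system would also log the incident and alert operators as part of a security response protocol.

```python
def primary_model(features: dict) -> str:
    # Hypothetical primary path; here it simulates an adverse event.
    raise TimeoutError("upstream feature store unavailable")

def rule_based_fallback(features: dict) -> str:
    """Conservative, well-understood fallback used only during incidents."""
    return "manual-review"

def predict_with_fallback(features: dict) -> str:
    try:
        return primary_model(features)
    except Exception as exc:
        # Degrade gracefully instead of failing outright; record for response.
        print(f"primary path failed ({exc}); using fallback")
        return rule_based_fallback(features)

print(predict_with_fallback({"amount": 120.0}))
```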
Relevant Risks:
AI system scale and complexity
Inability to predict or detect the side effects of AI-based systems beyond statistical measures
Accountable and transparent
Description: Transparency is about the extent to which information about the system is available to those interacting with it. This includes information about design choices, training data, model training, model architecture, intended use cases and the modalities of deployment. Accountability presupposes transparency: responsibility for a system's outcomes can only meaningfully be assigned when this information is available.
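A lightweight way to operationalise this is to publish a structured record of exactly the items NIST lists alongside the system, in the spirit of a model card. The fields and values below are hypothetical examples.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class ModelRecord:
    """Structured disclosure covering the items listed above;
    all values here are hypothetical."""
    model_name: str
    model_architecture: str
    training_data: str
    training_procedure: str
    intended_use: str
    deployment_modality: str

record = ModelRecord(
    model_name="credit-prescreen-v3",
    model_architecture="gradient-boosted trees, 400 estimators",
    training_data="2019-2023 loan applications, EU region, deduplicated",
    training_procedure="5-fold cross-validation, early stopping on AUC",
    intended_use="pre-screening only; final decisions made by a human",
    deployment_modality="nightly batch scoring",
)

# Publish with the system so those interacting with it can inspect it.
print(json.dumps(asdict(record), indent=2))
```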
Relevant Risks:
Increased opacity and concerns about reproducibility
AI system scale and complexity
Explainable and interpretable
Description: Explainability is about being able to represent the mechanisms by which the system produces its outputs, whereas interpretability is about being able to explain the meaning of the system's outputs in the context of its designed functional purpose.
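One widely used, model-agnostic technique for probing these mechanisms is permutation importance: shuffle one input feature and observe how much the system's behaviour degrades. The toy model and data below are hypothetical, chosen so the result is easy to verify by eye.

```python
import random

def model(row: dict) -> int:
    """Toy 'black box': income drives the output, postcode should not."""
    return 1 if row["income"] > 50 else 0

data = [{"income": random.uniform(0, 100), "postcode": random.random()}
        for _ in range(200)]
labels = [model(row) for row in data]   # reference outputs on unshuffled data

def accuracy(rows):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(feature: str) -> float:
    """Accuracy drop when one feature is shuffled across rows: a large
    drop indicates the model relies on that feature."""
    shuffled_values = [r[feature] for r in data]
    random.shuffle(shuffled_values)
    shuffled = [dict(r, **{feature: v}) for r, v in zip(data, shuffled_values)]
    return accuracy(data) - accuracy(shuffled)

for feature in ("income", "postcode"):
    print(f"{feature}: importance ~ {permutation_importance(feature):.2f}")
```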
Relevant Risks:
Increased opacity and concerns about reproducibility
Inability to predict or detect the side effects of AI-based systems beyond statistical measures
Privacy-enhanced
Description: Privacy refers to the norms and practices that help safeguard human autonomy, identity and dignity, such as freedom from intrusion, limiting observation, and individuals' agency to consent to the disclosure or control of facets of their identity.
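Privacy-enhancing technologies give these norms a technical expression. The sketch below shows one such technique, a Laplace mechanism for a differentially private counting query; the dataset and epsilon value are hypothetical.

```python
import random

def laplace_noise(scale: float) -> float:
    """Laplace(0, scale) noise, drawn as the difference of two
    independent exponentials with rate 1/scale."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

records = [{"age": a} for a in (34, 29, 41, 56, 38, 62, 47)]  # hypothetical

def private_count(predicate, epsilon: float) -> float:
    """Counting queries have sensitivity 1, so the Laplace scale is
    1 / epsilon; smaller epsilon = stronger privacy, noisier answer."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

print(round(private_count(lambda r: r["age"] > 40, epsilon=0.5)))
```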
Relevant Risks:
Enhanced data aggregation capabilities of AI systems
AI system dependency and reliance on data
Fair with harmful bias managed
Description: Fairness is about ensuring that the system, in terms of its design, development and deployment, respects equality and equity, including by addressing harmful bias and discrimination. NIST identifies three categories of AI bias: (i) systemic bias; (ii) computational and statistical bias; and (iii) human-cognitive bias.
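One narrow but common statistical check on computational bias is to compare selection rates across groups (a demographic parity difference). The groups and decisions below are hypothetical, and this is a single metric, not a substitute for a full bias audit covering systemic and human-cognitive bias.

```python
from collections import defaultdict

# (group, model_decision) pairs from a hypothetical evaluation set.
outcomes = [("A", 1), ("A", 1), ("A", 0), ("A", 1),
            ("B", 0), ("B", 1), ("B", 0), ("B", 0)]

totals = defaultdict(int)
positives = defaultdict(int)
for group, decision in outcomes:
    totals[group] += 1
    positives[group] += decision

rates = {g: positives[g] / totals[g] for g in totals}
print("selection rates:", rates)                       # A: 0.75, B: 0.25
print("parity difference:", max(rates.values()) - min(rates.values()))
```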
Relevant Risks:
Training data may not be a true or appropriate representation of the context or intended use of the AI system, potentially leading to harmful bias or other data quality issues
Use of pre-trained models for research and improved performance can increase statistical uncertainty and cause issues with bias management, scientific validity and reproducibility
1 NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0) (January 2023), p. 2.
2 NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0) (January 2023), pp. 13-18.
3 NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0) (January 2023), p. 13.
4 These risks are taken from 'Appendix B: How AI Risks Differ from Traditional Software Risks' of the NIST AI RMF.