Teams know how to ship. They know how to iterate. But they’ve rarely been asked to prove that the technology they’re using and the systems they’re building are safe. This isn’t about security; it’s about proving fairness, accountability, and explainability.
And with what responsible AI means changing from 2025 onwards, thinking about how you work now sets you up to work with AI, not against it.
For AI delivery teams, expectations are changing at rapid speed. Where Responsible AI considerations used to be ‘handballed’ to legal and governance teams, they are now being passed right back as an active part of the delivery process itself.
Enter the NIST AI Risk Management Framework (AI RMF) and ISO/IEC 23894:2023: the key frameworks that set out how AI risk should be mapped, measured, and managed by delivery teams, inside sprints and in real time.
Both the NIST AI RMF and ISO/IEC 23894 focus on systems. They expect visibility into how risk is handled across the lifecycle of an AI system, from early design decisions to post-launch monitoring.
What they’re really describing is familiar to delivery teams. Feedback loops. Version control. Decision logs. Risk flags. These are usually managed by project managers or legal, but really they’re standard parts of what modern delivery work looks like.
For organisations that already embody agile ways of working, this shift is less a disruption and more an iteration. AI risk governance can work the same way modern delivery does: lightweight, traceable, testable.
Frameworks that Support the Work
These two frameworks are not competing; they reinforce each other:
| Function / Clause | NIST AI RMF | ISO/IEC 23894 | Purpose |
| --- | --- | --- | --- |
| Govern | Policies, roles, and risk posture | Clause 4 (Principles), Clause 5 (Leadership) | Establishes accountability baseline |
| Map | Context, stakeholders, data, intended use | Clause 6.2 (Risk Identification & Analysis) | Identifies who might be affected and in what way |
| Measure | Risk quantification and socio-technical evaluation | Clause 6.3 (Risk Evaluation) | Translates principles into observable metrics |
| Manage | Response, monitoring, mitigation | Clause 6.4 (Treatment), Clause 7 (Monitoring) | Keeps risk within defined tolerance over time |
Together, they offer a clear blueprint, especially for teams already running ISO 9001 or 27001, where the structures (policies, roles, artefacts, logs) are familiar.
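To make that concrete, here is a minimal Python sketch of a risk register entry that cross-references an RMF function with an ISO/IEC 23894 clause. The field names and example values are illustrative assumptions, not something either standard prescribes.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RiskRegisterEntry:
    """One traceable risk record, cross-referenced to both frameworks."""
    risk_id: str
    description: str
    nist_rmf_function: str   # e.g. "GOVERN", "MAP", "MEASURE", "MANAGE"
    iso_23894_clause: str    # e.g. "6.2 Risk identification & analysis"
    owner: str               # accountable role, not an individual's to-do list
    mitigation: str
    status: str = "open"
    raised_on: date = field(default_factory=date.today)


# Example: a risk identified during sprint planning (Map / Clause 6.2)
entry = RiskRegisterEntry(
    risk_id="RAI-042",
    description="Training data under-represents regional users",
    nist_rmf_function="MAP",
    iso_23894_clause="6.2 Risk identification & analysis",
    owner="Product owner",
    mitigation="Augment dataset; re-run representation checks before Measure",
)
print(entry)
```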
Where Risk Lives in the Sprint
Embedding AI governance requires only a few new rituals, ones that show up inside the structures that already exist.
| Day | Ceremony | Framework Emphasis | Common RAI Activities |
| --- | --- | --- | --- |
| Day 0 | Sprint Planning | Govern / Map | Reconfirm risk posture; update stakeholder context; validate data sourcing |
| Day 4 | Mid-Sprint Checkpoint | Measure | Run bias and robustness tests; log metric shifts |
| Day 9 | Sprint Review | Manage | Check mitigation effectiveness; record open risk issues |
| Day 10 | Retrospective | Govern (learning integration) | Document learnings; update training needs or policy library |
The result: governance evolves continuously with the work, and every sprint leaves behind real, auditable evidence.
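One lightweight way to leave that evidence behind is to append one record per ceremony to a simple log. The sketch below assumes a JSON-lines file and illustrative field names; any issue tracker or governance tool could play the same role.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

EVIDENCE_LOG = Path("rai_evidence.jsonl")  # illustrative location


def log_ceremony_evidence(sprint: str, ceremony: str,
                          rmf_emphasis: str, activities: list[str]) -> None:
    """Append one auditable record per ceremony to a JSON-lines log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sprint": sprint,
        "ceremony": ceremony,
        "rmf_emphasis": rmf_emphasis,
        "activities": activities,
    }
    with EVIDENCE_LOG.open("a") as fh:
        fh.write(json.dumps(record) + "\n")


# Day 4 checkpoint: bias and robustness tests logged as Measure evidence
log_ceremony_evidence(
    sprint="2025-S14",
    ceremony="Mid-Sprint Checkpoint",
    rmf_emphasis="Measure",
    activities=["Ran bias tests on candidate model", "Logged drift metrics to dashboard"],
)
```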
When “Done” Includes Risk Artefacts
To embed Responsible AI, teams need to treat risk outcomes like product outcomes. Below are examples of what that looks like:
| Epic | RMF Outcome | Definition of Done |
| --- | --- | --- |
| Train a model | MEASURE 2.2 – Risks quantified | Bias test results logged; model card created |
| Prepare public-facing API | MANAGE 3.1 – Mitigation actions deployed | Risk mitigations verified; failure modes documented |
| Submit release candidate | GOVERN 1.4 – Accountability mechanisms in place | AI sign-off recorded; RACI updated; data access approvals archived |
This is where the shift becomes visible: when risk accountability shows up inside commits, releases, and demo day conversations.
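As a rough illustration, a Definition of Done that includes risk artefacts can be checked in code. The sketch below refuses to close a story until the artefacts from the table above exist; the epic names and file paths are assumptions for illustration only.

```python
from pathlib import Path

# Illustrative mapping of epics to the artefacts their Definition of Done requires
REQUIRED_ARTEFACTS = {
    "train-model": ["reports/bias_tests.json", "docs/model_card.md"],
    "public-api": ["reports/mitigation_verification.md", "docs/failure_modes.md"],
    "release-candidate": ["approvals/ai_signoff.pdf", "governance/raci.md"],
}


def definition_of_done_met(epic: str, repo_root: Path = Path(".")) -> bool:
    """Return True only if every required risk artefact exists for the epic."""
    missing = [a for a in REQUIRED_ARTEFACTS.get(epic, [])
               if not (repo_root / a).exists()]
    for artefact in missing:
        print(f"[DoD] missing risk artefact for '{epic}': {artefact}")
    return not missing


# Block closure of the story if the risk artefacts are incomplete
if not definition_of_done_met("train-model"):
    raise SystemExit("Story cannot be closed: risk artefacts incomplete")
```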
Four Tactics That Make It Work
These patterns make Responsible AI practical and a part of everyday delivery:
- CI/CD Risk Gates: fail builds if fairness, drift, or security metrics breach thresholds, and store the results for audit (see the sketch after this list).
- Model Cards as Code: generate explainability documents at train time; version and expose them internally like any other artefact (also sketched below).
- Risk Logs via Microservice: maintain a central register of risks with API endpoints, linked directly to user stories and sprint boards.
- Incident Playbooks in Workflow: embed ISO Clause 7 directly into alerting and runbooks, turning incidents into feedback loops.
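Here is a minimal sketch of the first tactic, a CI risk gate: it reads a metrics file produced by the test stage and fails the build if a fairness or drift metric breaches its threshold. The metric names, thresholds, and file paths are assumptions about a hypothetical pipeline, not part of either framework.

```python
import json
import sys
from pathlib import Path

# Illustrative thresholds; a real team would agree these with its risk owners
THRESHOLDS = {
    "demographic_parity_gap": 0.10,  # maximum acceptable fairness gap
    "psi_drift": 0.20,               # population stability index ceiling
}


def run_risk_gate(metrics_path: str = "reports/metrics.json") -> int:
    """Return a non-zero exit code if any metric breaches its threshold."""
    metrics = json.loads(Path(metrics_path).read_text())
    breaches = {
        name: value
        for name, value in metrics.items()
        if name in THRESHOLDS and value > THRESHOLDS[name]
    }
    for name, value in breaches.items():
        print(f"[risk-gate] {name}={value:.3f} exceeds threshold {THRESHOLDS[name]:.3f}")
    # Persist the gate result alongside other build artefacts for audit
    Path("reports/risk_gate_result.json").write_text(json.dumps({"breaches": breaches}))
    return 1 if breaches else 0


if __name__ == "__main__":
    sys.exit(run_risk_gate())
```

Wired into a pipeline step, the non-zero exit code blocks the merge, while the stored result file becomes part of the audit trail.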
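And a sketch of the second tactic, model cards as code: the card is generated at train time and written alongside the code so it is versioned like any other artefact. The schema here is an illustrative assumption rather than a formal model-card standard.

```python
import json
from datetime import date
from pathlib import Path


def write_model_card(model_name: str, version: str, metrics: dict,
                     out_dir: str = "docs") -> Path:
    """Emit a model card as JSON at train time so it is versioned with the code."""
    card = {
        "model": model_name,
        "version": version,
        "trained_on": date.today().isoformat(),
        "intended_use": "Internal decision support (illustrative)",
        "evaluation_metrics": metrics,
        "known_limitations": ["Not validated for out-of-region users"],
    }
    out_path = Path(out_dir) / f"model_card_{model_name}_{version}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(card, indent=2))
    return out_path


# Called at the end of a training run, alongside saving the model weights
write_model_card("churn-classifier", "1.3.0",
                 {"auc": 0.87, "demographic_parity_gap": 0.06})
```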
The result is more clarity, alignment with other essential risk management activities, and ultimately fewer surprises post go-live.
Objections You’ll Hear
As with any change, objections are only natural, but with change comes the opportunity for growth. Here are some of the objections you’ll likely hear through this transition, and what the reality actually is.
“This will just slow us down.”
In reality, the effort is simply front-loaded: embedding AI governance early keeps quality consistent and reduces backlog churn later.
“The metrics aren’t going to be exact.”
The key is that they don’t need to be. Track what’s useful, not perfect. Trends and deltas are more actionable than absolutes.
“This isn’t mandatory yet.”
That window is closing fast, and readiness now becomes advantage later.
The Principle Now Ships With the Product
NIST and ISO ask teams to show how they’re maintaining trust as they do the work. And that’s why we’re so passionate about this moment: a moment to make Responsible AI observable, explainable, and operational as part of the work.
For organisations building real systems, it’s about putting structure in place and being able to show the proof, every sprint.
THE HUMAN WHY:
Humans have a deeply felt need to know that we are protected, understood, and treated fairly in and by the systems we engage with, and ultimately that we are safe.
A system that fosters growth, performance, and shared trust gives people what they ultimately seek: a place to become more of who they are, contribute meaningfully, and rely on others and the system itself.
Download our Responsible AI delivery map