AI Risk Assessment for Regulated Industries: A Working Template
A generic AI risk assessment template will not survive scrutiny in a regulated context. The reason is that the regulator is not actually asking "is this risky?" The regulator is asking "have you demonstrated, with evidence, that you understood the risks specific to this system, made justified decisions, and built the controls that those decisions imply?" That is a much narrower question, and the assessment has to be structured to answer it.
This is the working template we use for AI risk assessments in financial services, healthcare, and regulated industrial settings. It maps to ISO 42001, the EU AI Act high-risk requirements, NIST AI RMF, and the sectoral guidance that regulators in those industries publish. The template itself is simple. The discipline is in completing it with operational evidence rather than aspirational language.
The seven sections every regulator-grade assessment needs
A working AI risk assessment has seven sections. None of them are optional in a regulated context. The order matters because each section depends on what the previous one established.
1. System description and intended use
A factual description of what the system does, what it does not do, and the boundary between them. The intended-use statement carries unusual weight in regulated contexts because it bounds liability: actions outside intended use are off-label, which is a different conversation from actions within intended use that produced unintended outcomes.
The failure mode here is vagueness. "An AI assistant for financial advisors" does not bound anything. "A natural-language interface that produces document summaries from a defined set of internal investment memos, intended for use by licensed advisors during client preparation, not intended for use as a recommendation engine or as a substitute for the advisor's own analysis" is the kind of specificity that an audit can verify and that leaves the regulator little room for interpretation.
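One way to make an intended-use statement auditable is to keep a structured record alongside the prose, so the boundary can be checked at integration points rather than rediscovered during an audit. A minimal sketch in Python; the field names and example values are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntendedUse:
    """Structured intended-use statement; fields mirror the prose description above."""
    system_name: str
    purpose: str                  # what the system does
    permitted_users: list[str]    # who may use it, and in what setting
    permitted_inputs: list[str]   # the defined input sources
    excluded_uses: list[str]      # explicitly out-of-scope uses (the off-label boundary)
    decision_role: str            # "informs a licensed professional" vs "decides"

memo_summariser = IntendedUse(
    system_name="memo-summariser",
    purpose="Produce document summaries from a defined set of internal investment memos",
    permitted_users=["licensed advisors during client preparation"],
    permitted_inputs=["internal investment memo repository"],
    excluded_uses=["recommendation engine", "substitute for the advisor's own analysis"],
    decision_role="informs a licensed professional",
)
```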
2. Data lifecycle map
For every dataset that touches the system: source, ownership, lawful basis (GDPR, sectoral data laws), provenance documentation, quality measurements, retention policy, and the boundary between personal and non-personal data. Same level of specificity as the intended use.
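Keeping each lifecycle entry as a structured record per dataset makes gaps machine-checkable rather than dependent on a reviewer's patience. A minimal sketch; the field names are illustrative assumptions rather than a mandated schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DatasetRecord:
    """One row of the data lifecycle map, per dataset that touches the system."""
    name: str
    source: str                        # originating system or vendor
    owner: str                         # accountable data owner (named role)
    lawful_basis: str                  # e.g. GDPR Art. 6(1)(b), or the sectoral basis
    provenance_doc: Optional[str]      # link to provenance documentation
    quality_metrics: dict[str, float] = field(default_factory=dict)  # measured, not asserted
    retention_days: int = 0
    contains_personal_data: bool = True

    def gaps(self) -> list[str]:
        """Return the fields a reviewer would flag as missing evidence."""
        missing = []
        if not self.provenance_doc:
            missing.append("provenance_doc")
        if not self.quality_metrics:
            missing.append("quality_metrics")
        if self.retention_days <= 0:
            missing.append("retention_days")
        return missing
```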
In healthcare and financial services, the data lifecycle map is the section that takes the most time because the underlying records are usually scattered. The exercise of producing the map often surfaces compliance issues that exist for the entire organisation, not just the AI system.
3. Risk identification, by category
A structured walk through risk categories, with a documented determination per category. The categories we use:
- Safety and physical harm (where applicable; mandatory in medical, industrial, automotive contexts)
- Fundamental rights and discrimination (mandatory under EU AI Act; usually applicable in financial decisions, hiring, public services)
- Privacy and data protection (always applicable when personal data touches the system)
- Cybersecurity (always applicable; integration point with the cyber programme)
- Operational and business continuity
- Model performance under distribution shift
- Misuse and adversarial use
- Third-party and supply-chain risk (especially for systems built on foundation models)
For each category, three things must appear: identified risks specific to this system, severity-likelihood determination with justification, and a decision (mitigate, accept, transfer, avoid). "Not applicable" is a determination, but it must be justified.
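A consistent structure for the per-category determination makes it hard to skip the justification and gives every "mitigate" decision a place for controls to attach. A minimal sketch; the enum values and field names are illustrative assumptions rather than a mandated schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    MITIGATE = "mitigate"
    ACCEPT = "accept"
    TRANSFER = "transfer"
    AVOID = "avoid"
    NOT_APPLICABLE = "not_applicable"

@dataclass
class RiskDetermination:
    """One category from the structured walk, with its determination and decision."""
    category: str                  # e.g. "privacy", "distribution_shift"
    identified_risks: list[str]    # specific to this system, not generic
    severity: str                  # e.g. "high" / "medium" / "low"
    likelihood: str
    justification: str             # why this severity-likelihood pairing, or why N/A
    decision: Decision
    control_ids: list[str] = field(default_factory=list)  # populated when MITIGATE

    def validate(self) -> list[str]:
        """Flag the omissions a regulator would flag."""
        issues = []
        if not self.justification:
            issues.append(f"{self.category}: determination lacks justification")
        if self.decision is Decision.MITIGATE and not self.control_ids:
            issues.append(f"{self.category}: 'mitigate' with no mapped control")
        return issues
```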
4. Control mapping
For every risk where the determination is "mitigate," the specific control that mitigates it. Controls must be operational, not aspirational. "Continuous monitoring" is not a control; "drift detection on three named metrics, evaluated daily, with thresholds defined and a runbook for breaches" is.
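To make the contrast concrete, here is a minimal sketch of the drift-detection control described above: named metrics, explicit thresholds, and a pointer to the runbook that governs the response. The metric names, thresholds, and runbook paths are assumptions for illustration, and the drift statistic shown (a population stability index) is one common choice, not a prescribed method.

```python
import math

# Named drift metrics, each with an explicit threshold and a runbook for breaches.
# Metric names, thresholds, and runbook paths are illustrative assumptions.
DRIFT_CHECKS = {
    "input_length_psi":   {"threshold": 0.20, "runbook": "runbooks/drift-input-length.md"},
    "topic_mix_psi":      {"threshold": 0.20, "runbook": "runbooks/drift-topic-mix.md"},
    "refusal_rate_delta": {"threshold": 0.05, "runbook": "runbooks/drift-refusal-rate.md"},
}

def population_stability_index(expected: list[float], actual: list[float]) -> float:
    """PSI over pre-binned distributions; inputs are bin proportions summing to 1."""
    eps = 1e-6
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected, actual)
    )

def evaluate_drift(metric: str, value: float) -> dict:
    """Daily evaluation of one named metric against its documented threshold."""
    check = DRIFT_CHECKS[metric]
    breached = value > check["threshold"]
    return {
        "metric": metric,
        "value": value,
        "threshold": check["threshold"],
        "breached": breached,
        # On breach, the response is the runbook, not an ad-hoc decision.
        "runbook": check["runbook"] if breached else None,
    }
```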
This section is where the assessment overlaps with engineering work. Most failed assessments fail here, not at the risk-identification stage. The risks were identified accurately; the controls were aspirational; the evidence demonstrated the controls were not operating.
5. Performance and validation
The actual performance characteristics of the system, on the actual data it will see in deployment, evaluated against the actual decisions it will inform. Generic accuracy numbers from a benchmark dataset are not enough. The relevant question is: under realistic conditions, how does the system perform on the dimensions that matter for the intended use?
Two specific elements regulators look for:
- Subgroup performance. Performance broken down by demographic subgroups where applicable, with documented disparities and the decisions made about whether they are acceptable (a sketch follows this list).
- Stress testing. Performance under deliberately adversarial conditions, distribution shift, and edge cases the deployment will encounter. "We tested on the holdout set" is not sufficient.
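A sketch of the subgroup performance element follows. It assumes labelled evaluation records carrying a subgroup attribute; the metric (plain accuracy) and the disparity tolerance are illustrative assumptions. The point is that disparities are computed, recorded, and compared against a documented tolerance rather than asserted.

```python
from collections import defaultdict

def subgroup_accuracy(records: list[dict]) -> dict[str, float]:
    """Accuracy per subgroup; each record has 'subgroup', 'label', and 'prediction'."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [correct, count]
    for r in records:
        correct, count = totals[r["subgroup"]]
        totals[r["subgroup"]] = [correct + int(r["label"] == r["prediction"]), count + 1]
    return {group: correct / count for group, (correct, count) in totals.items()}

def disparity_report(per_group: dict[str, float], max_gap: float = 0.05) -> dict:
    """Document the gap between best- and worst-performing subgroups."""
    best, worst = max(per_group.values()), min(per_group.values())
    return {
        "per_group": per_group,
        "gap": best - worst,
        # The tolerance itself is a documented decision, not a default.
        "within_tolerance": (best - worst) <= max_gap,
    }
```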
6. Monitoring plan
What gets measured continuously after deployment, with what cadence, against what thresholds, and what happens when thresholds are breached. The monitoring plan must include both performance monitoring (model still working as intended) and risk monitoring (categories from section 3 still bounded).
The hardest part of this section is the "what happens" question. Most monitoring plans we see specify the metrics and the thresholds and stop. The plan needs to specify the response: who is notified, who decides whether to retrain, who decides whether to pause the system, what evidence the decision is based on.
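A minimal sketch of a monitoring plan entry that answers the "what happens" question in the plan itself: the metric, its cadence and threshold, the risk it maps back to, and the named response path. The role names, cadence, threshold, and metric are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class MonitoringEntry:
    """One row of the monitoring plan: the measurement plus the response it triggers."""
    metric: str
    cadence: str                 # e.g. "daily", "hourly"
    threshold: float
    maps_to_risk: str            # risk category from section 3
    notify: list[str]            # named roles, not "the team"
    decision_owner: str          # who decides whether to retrain or pause
    evidence_required: str       # what the decision must be based on

PLAN = [
    MonitoringEntry(
        metric="summary_factuality_score",
        cadence="daily",
        threshold=0.92,
        maps_to_risk="model performance under distribution shift",
        notify=["ML platform lead", "model risk officer"],
        decision_owner="model risk officer",
        evidence_required="last 7 days of metric history plus sampled failure cases",
    ),
]

def on_breach(entry: MonitoringEntry, value: float) -> str:
    """The response is specified in the plan, not improvised at incident time."""
    return (
        f"{entry.metric}={value} breached threshold {entry.threshold}: "
        f"notify {', '.join(entry.notify)}; decision by {entry.decision_owner} "
        f"based on {entry.evidence_required}."
    )
```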
7. Governance and decision authority
Who decided to deploy this system, on what evidence, with what dissent, and who has authority to change the decision. Recorded as a specific list of named roles with documented decisions, not a generic governance committee description.
In regulated contexts the governance section is often where personal liability is established. This is not an artefact to delegate; it is the documented chain of accountability that the regulator will examine if something goes wrong.
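The chain of accountability can be kept as a structured decision record naming the roles, the evidence considered, any dissent, and who holds change authority, so it survives staff turnover. A minimal sketch; the field names are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DeploymentDecision:
    """A documented deployment decision: named roles, evidence, dissent, change authority."""
    decision: str                  # e.g. "approve deployment to production"
    decided_on: date
    decided_by: list[str]          # named roles, not a committee label
    evidence_reviewed: list[str]   # links to assessment sections and supporting artefacts
    dissent: list[str]             # recorded objections, if any
    change_authority: str          # who can amend or revoke the decision
```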
What turns this into evidence rather than documentation
The template above is straightforward. The work is in completing it with material that holds up under scrutiny. Three disciplines distinguish operational assessments from theatrical ones.
Every claim has a source. "We monitor for drift" is a claim. "We monitor drift on these three metrics; the dashboard is at this URL; the runbook for breaches is at this URL; the last drift incident was logged here on this date" is evidence. Auditors who have done a few of these can tell within ten minutes whether they are looking at evidence or claims.
Cross-references between sections are checked. A risk identified in section 3 should appear as a control in section 4 or as a justified acceptance with a documented decision-maker. A monitoring metric in section 6 should map to a risk in section 3. Internal consistency is what the audit verifies first; gaps are immediately visible.
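If the determinations, controls, and monitoring entries are kept as structured records along the lines of the earlier sketches, the cross-reference check can be run mechanically before the auditor does. A minimal illustration, assuming the field names used in those sketches.

```python
def check_consistency(risks, controls, monitoring) -> list[str]:
    """Verify the cross-references an auditor checks first.

    risks: list of RiskDetermination; controls: dict of control_id -> control record;
    monitoring: list of MonitoringEntry. Types follow the earlier sketches.
    """
    findings = []
    for risk in risks:
        if risk.decision.value == "mitigate":
            # Every mitigated risk must point at a control that actually exists.
            for cid in risk.control_ids:
                if cid not in controls:
                    findings.append(f"{risk.category}: control '{cid}' is named but not defined")
        elif risk.decision.value == "accept" and not risk.justification:
            findings.append(f"{risk.category}: accepted without documented justification")
    # Every monitoring metric must map back to an identified risk category.
    risk_categories = {r.category for r in risks}
    for entry in monitoring:
        if entry.maps_to_risk not in risk_categories:
            findings.append(f"monitoring metric '{entry.metric}' maps to no identified risk")
    return findings
```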
The assessment is dated and versioned. AI systems change. The assessment must show its history: when sections were last updated, what triggered each update, who approved it. A current snapshot without history is treated by auditors as a prepared document; a versioned history is treated as a working artefact.
Sectoral additions worth knowing
Different regulated sectors add specific requirements on top of the seven-section base.
Financial services (EBA, CFPB, MAS guidance). Specific scrutiny on adverse-impact testing for credit and lending decisions, model governance roles aligned with three-lines-of-defence, board reporting frequency tied to risk classification.
Healthcare (FDA, EU MDR, MHRA). When the AI is part of a medical device, the assessment must integrate with IEC 62304 software lifecycle and ISO 14971 risk management; clinical evaluation evidence is required.
EU AI Act high-risk systems. Conformity assessment per Annex VI or VII, registration in the EU AI database, post-market monitoring per Article 72, and incident notification under Article 73. The AI risk assessment becomes one input into a broader Annex IV technical file.
The seven-section template applies in all of these; the sectoral additions extend it rather than replace it.
Where this connects to our practice
Pelican Tech's AI Solutions practice builds these assessments alongside the engineering and governance work, so the artefacts are real evidence rather than retrospective documentation. We work with our MedTech team for medical-device contexts, and with our risk management team when the AI risk assessment integrates with broader enterprise risk reporting.
If your AI system is deployed and the existing risk assessment would not survive a competent regulator's first interview, that is the conversation to have with us before the regulator arrives.