When Agents Decide

Agentic AI in Credit & KYC and the Governance Gap That Follows

Abstract

Agentic AI systems can now execute end-to-end credit decisioning and know-your-customer (KYC) workflows with minimal human input: retrieving documents, querying credit bureaus, assessing risk, and routing outcomes in real time. Yet the governance frameworks that regulate these decisions were designed for a fundamentally different technology: static, periodic models validated offline and monitored retrospectively. This paper argues that a critical governance gap exists between what agentic systems do and what current model risk management frameworks, chiefly SR 11-7 in the United States and emerging equivalents across Africa, require of them. It identifies four specific structural gaps in validation, documentation, explainability, and accountability, and proposes a practical governance framework for financial institutions deploying autonomous agents in regulated credit and compliance environments. The African fintech context receives particular attention, given the convergence of rapid agentic adoption with nascent but active regulatory reform.

1. Introduction: The New Credit Stack

There is a version of SME credit onboarding that takes three days. A human reviews the application, pulls bureau data, checks documents, applies a scorecard, escalates edge cases, and eventually reaches a decision. The process is defensible, auditable, and slow.

There is another version that takes eleven minutes. An orchestration layer receives an application, spins up a set of specialized sub-agents, and each one works in parallel: one verifies identity against a national registry, one queries a credit bureau, one extracts financials from uploaded statements using a document AI model, one applies a risk scoring model, one checks sanctions and PEP lists. The orchestrator synthesizes results, applies a confidence threshold, and either issues a decision or escalates to a human reviewer. Nothing waits.

This is not a prototype. It is the direction that every serious fintech in East and West Africa is moving toward, and several are already there. The question is not whether agentic systems will be used in credit and KYC decisions. They already are. The question is whether the governance infrastructure surrounding those decisions has kept pace.

It has not.

The model risk management frameworks that exist, anchored by the U.S. Federal Reserve's SR 11-7 and, in Africa, by an evolving patchwork of central bank guidance, were written for a prior era of AI: an era of scorecards, logistic regression models, and batch-processed credit engines that produced a number, which a human then used to make a decision. Agentic systems do not produce a number. They produce an action. And that distinction has profound consequences for how we govern them.

This paper explores those consequences, identifies where the gap is most acute, and proposes a governance layer appropriate to the technology we are actually deploying, not the technology these frameworks originally imagined.

2. What Agentic Systems Actually Do in Credit & KYC

To understand where governance frameworks fall short, it helps to be specific about what agentic systems are doing in practice, rather than what industry marketing claims they will someday do.

A modern agentic credit or KYC pipeline is not a single model. It is an orchestrated network of specialized agents, each with a defined scope, invocation conditions, and output contract. The orchestrator, typically an LLM-based reasoning layer, manages sequencing, handles failures, and routes decisions based on intermediate outputs. Sub-agents may be rule-based, ML-based, or LLM-based, depending on task complexity.

In a credit underwriting context, this agentic mesh typically executes the following tasks concurrently or sequentially:

Identity verification: document parsing, biometric matching against national ID databases
Bureau interrogation: querying one or more credit bureaus and normalizing the returned schema
Financial extraction: parsing uploaded bank statements, payslips, or tax returns using document AI
Risk scoring: passing normalized features through a trained ML model
Sanctions and PEP screening: cross-referencing applicant against watchlists
Confidence routing: escalating to a human underwriter if aggregate confidence falls below threshold

Each step involves a model making a decision. The orchestrator itself makes decisions. A human may only enter the picture at the end, and only for a fraction of applications. In production systems, that fraction may be very small.
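
The fan-out pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: the agent names, the stubbed confidence values, the choice of the minimum as the aggregate confidence, and the 0.80 threshold are all hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    agent: str
    confidence: float  # 0.0 to 1.0, as reported by the sub-agent
    passed: bool


async def run_agent(name: str, confidence: float, passed: bool = True) -> AgentResult:
    # Stand-in for a real sub-agent call (registry lookup, bureau query, and so on).
    await asyncio.sleep(0)
    return AgentResult(name, confidence, passed)


async def orchestrate(application_id: str, threshold: float = 0.80) -> dict:
    # Fan out to all sub-agents concurrently; gather returns results in call order.
    results = await asyncio.gather(
        run_agent("identity", 0.97),
        run_agent("bureau", 0.91),
        run_agent("financials", 0.74),
        run_agent("risk_score", 0.88),
        run_agent("sanctions_pep", 0.99),
    )
    # Assumption: aggregate confidence is the weakest link across agents.
    aggregate = min(r.confidence for r in results)
    if not all(r.passed for r in results) or aggregate < threshold:
        route = "human_review"
    else:
        route = "auto_decision"
    return {"application_id": application_id, "aggregate": aggregate, "route": route}


decision = asyncio.run(orchestrate("APP-001"))
print(decision["route"])  # human_review: the financials agent is below threshold
```

Even in this toy version, the routing outcome depends on a design choice (how confidence is aggregated) that no individual sub-agent owns, which is where the governance questions discussed below begin.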

"An agentic mesh of multiple expert agents can carry out identity verification, document retrieval and validation, metrics evaluation, credit bureau checks, and psychometric analysis simultaneously, with each agent operating under defined objectives, success metrics, and escalation protocols; making the process faster with increased accuracy." [1]

One global bank applied agentic AI to its KYC process by deploying ten agent squads, each coordinating four to five AI agents, shifting from periodic reviews to continuous, event-driven customer due diligence. [2]

The operational implications are significant. Agentic systems do not just automate tasks; they execute decisions that previously required human judgment. They invoke external APIs. They trigger transactions. Their outputs are not only "quantitative estimates", in the language of SR 11-7, but also actions. Governance frameworks built around the former are structurally unprepared for the latter.

3. The Regulatory Framework Was Written for a Different World

3.1 SR 11-7 and Its Assumptions

Supervisory Guidance on Model Risk Management (SR 11-7), jointly issued by the Federal Reserve and the Office of the Comptroller of the Currency in April 2011, remains the most influential model governance framework in financial services globally. It has since been adopted or referenced by regulators in the UK, EU, Singapore, and across Africa. Its three foundational requirements are independent validation, ongoing monitoring, and documentation sufficient to support effective challenge.

These requirements reflect a specific set of assumptions about what a model is: a relatively static quantitative system with parameters calibrated offline, deployed at a known point in time, producing numerical outputs that inform human decisions. The model sits between data and a decision-maker. The decision-maker is a person.

"SR 11-7 was conceived in an era dominated by static, largely deterministic models with parameters calibrated periodically and deployed unchanged between reviews. By contrast, many modern AI systems are dynamic rather than static, probabilistic rather than deterministic, and in some cases agentic; capable of initiating actions in real time." [3]

Agentic systems violate every one of these assumptions. They are dynamic. Their effective behavior changes based on context, tool outputs, and intermediate reasoning steps that are invisible to any downstream monitoring system. They are not deployed at a point in time; they are continuously active. And critically, they do not inform a human decision. They make the decision, or a significant sub-component of it.

3.2 Where the Framework Strains

The structural gaps are not peripheral. They go to the core of what SR 11-7 was designed to protect against.

Independent validation, under SR 11-7, typically means a model validation team reviews the model's conceptual soundness, tests its performance on holdout data, and documents the findings. This process assumes a bounded, stable system that can be interrogated offline. Agentic systems are not bounded or stable. Their behavior in production is a function of live context, which means validation findings from six months ago may tell you very little about what the system will do today.

"SR 11-7 and SS1/23 emphasize conceptual soundness and outcome testing, but they do not address how to validate behavior in action sequences that unfold over time. Existing controls do not anticipate autonomy or tool use. Traditional models are not expected to invoke APIs or trigger transactions without human intervention, yet these capabilities are central to many agentic deployments." [4]

Documentation requirements suffer a similar structural mismatch. A model card that describes a credit scoring model is a reasonable artifact: it captures inputs, outputs, training methodology, and known limitations. A documentation artifact for an agentic orchestration system would need to capture not just the model but also the tool-calling behavior, the escalation logic, the inter-agent communication protocol, and how all of these interact under real-world conditions. Static descriptions cannot fully capture systems whose behavior depends on live context and external, evolving data. [4]

Explainability, too, changes character. Under SR 11-7, explainability means being able to articulate why a model produced a given output. For a credit scoring model, this maps well to feature importance, SHAP values, or counterfactual explanations. For an agentic system, the question is not just why the final score was 680; it is why the document agent flagged an anomaly, how the orchestrator weighted that flag against the bureau result, and which rule in the escalation policy caused the application to route to human review rather than auto-approve. This requires step-level decision traceability, not just output-level justification.

4. The African Dimension: Regulatory Convergence and a Critical Window

The African fintech context is not a footnote to this conversation. It is, arguably, its most important theater.

Across East and West Africa, credit infrastructure is being built from scratch or rebuilt from low bases. There is no decades-old incumbent scorecard architecture to protect. The population of creditworthy SMEs without formal credit histories is enormous. The pressure to onboard at scale and at speed is acute. These conditions make agentic AI not just attractive, but practically inevitable for any fintech operating at meaningful volume.

The regulatory environment is simultaneously active and underspecified, which creates both risk and opportunity.

4.1 Kenya

The Central Bank of Kenya (CBK) has moved aggressively on digital credit regulation. Following the Central Bank of Kenya (Amendment) Act 2021, it introduced Digital Credit Provider (DCP) regulations in 2022, and in 2025 expanded its remit further via the Business Laws (Amendment) Act 2024, extending oversight to all Non-Deposit Taking Credit Providers (NDTCPs). As of mid-2025, the CBK had received over 700 licensing applications and licensed 126 DCPs. [5]

The Draft NDTCP Regulations 2025 require NDTCPs to adopt robust corporate governance standards, establish risk management and compliance frameworks, and notably prohibit the outsourcing of core functions such as loan approval and portfolio management without preserving the CBK's oversight rights. [6] This last provision has direct implications for agentic deployments: if an autonomous agent is effectively making loan approval decisions, the regulatory question of whether that constitutes outsourcing, or something else entirely, is open.

Meanwhile, the CBK's own survey of AI in the banking sector found that while 65% of AI-using lenders apply the technology to credit risk scoring, few have embedded mechanisms for bias detection, explainability, or customer redress. The conclusion drawn by the Group CIDO of Diamond Trust Bank, commenting on this survey, is direct:

"A system that cannot give justifications to its decisions is not only ethically problematic but legally vulnerable and reputationally dangerous." [7]

Kenya also launched its National AI Strategy in 2025, with explicit values around inclusivity, ethics, and human oversight, values that, as the CBK survey makes clear, are not yet consistently embedded in practice.

4.2 Nigeria

Nigeria's regulatory environment is denser and more contested. The Central Bank of Nigeria (CBN) issued draft standards for Automated AML Solutions in May 2025, specifying requirements for real-time alerts, dynamic rule engines using AI for anomaly detection, onboarding integration, stress testing, and audit trails. [8] These requirements go further than most comparable frameworks in requiring behavioral testing rather than purely structural compliance.

Nigeria's placement on the FATF grey list in February 2023 intensified regulatory pressure. New rules introduced in 2025 by the Federal Competition and Consumer Protection Commission (FCCPC) include fines of up to ₦100 million for violations, prohibit data scraping, and require detailed disclosures from lenders. [9] The direction of travel is clear: more oversight, not less.

4.3 The Opportunity in the Gap

What makes the African context distinct is the timing. Regulatory frameworks are being written now, while the technology is still being deployed. This is the inverse of the U.S. situation, where SR 11-7 predates large language models by more than a decade and is being strained retroactively.

African regulators have the opportunity to write governance frameworks that are native to agentic AI, that assume autonomous action rather than treating it as an edge case. The fintech companies, credit bureaus, and lenders that engage with regulators during this window will shape the frameworks that govern their industry for the next decade.

The Afrimark Parallel

A KYC decisioning engine built to reduce SME onboarding time faces exactly the governance questions this paper describes: when the engine queries multiple data sources, resolves conflicts between them, and produces an onboarding decision, which component is the model? Who validates it? What constitutes a material change that triggers revalidation? These are not hypothetical questions. They are operational ones, and the absence of clear regulatory guidance leaves practitioners building bespoke answers where common standards should exist.

5. Four Structural Governance Gaps

Based on the analysis above, four structural gaps can be identified between what current model risk frameworks require and what agentic credit and KYC systems demand.

Gap 1: The Validation Gap

Traditional model validation is periodic and pre-deployment. A validation team reviews the model before it goes live, and thereafter on a risk-based schedule, typically annually for high-risk models. The model's behavior between validation cycles is assumed to be stable.

Agentic systems are not stable between cycles. An LLM-based orchestrator will behave differently as its context changes, as tool outputs change, and as the distribution of incoming applications shifts. A sub-agent that queries a credit bureau will produce different outputs if the bureau's API returns a schema change, even if the agent itself has not changed. Periodic validation cannot capture this.

What is needed is continuous behavioral monitoring, not just tracking output distributions, but tracking action sequences: which tools are being invoked, in what order, with what inputs, and with what downstream effects. This requires a logging infrastructure that most current ML monitoring platforms do not provide.
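
One way to make action-sequence tracking concrete is to compare the tool-call sequences seen in production against those observed at validation time. The sketch below assumes each application run is available as an ordered list of tool-call events; the event shape and the tool names are hypothetical.

```python
from collections import Counter


def signature(events):
    """Ordered tuple of tool names invoked while processing one application."""
    return tuple(event["tool"] for event in events)


def unseen_sequence_report(baseline_runs, live_runs):
    """Compare live tool-call sequences against those observed at validation time."""
    known = {signature(run) for run in baseline_runs}
    live = Counter(signature(run) for run in live_runs)
    unseen = {sig: n for sig, n in live.items() if sig not in known}
    total = sum(live.values())
    return {
        "unseen_sequences": unseen,
        "unseen_share": sum(unseen.values()) / total if total else 0.0,
    }


# Hypothetical runs: each run is the list of tool calls for one application.
normal = [{"tool": "id_registry"}, {"tool": "credit_bureau"}, {"tool": "risk_model"}]
skipped_bureau = [{"tool": "id_registry"}, {"tool": "risk_model"}]

report = unseen_sequence_report(
    baseline_runs=[normal, normal],
    live_runs=[normal, normal, normal, skipped_bureau],
)
print(report["unseen_share"])  # 0.25
```

A run that scores an applicant without ever querying the bureau may leave the output distribution looking normal, which is why the sequence itself, not just the score, has to be monitored.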

Gap 2: The Documentation Gap

SR 11-7 requires documentation sufficient to support effective challenge. For a static model, this means a model card: purpose, methodology, data sources, assumptions, limitations, validation results. For an agentic system, a model card for the orchestrator alone is insufficient, because the orchestrator's behavior is a function of its sub-agents' behaviors, which are themselves functions of live tool outputs.

The appropriate documentation artifact for an agentic system is a dynamic decision log: an immutable, append-only record of every agent action, capturing which agent was invoked, what its input was, what it returned, and how the orchestrator used that output in subsequent steps. This is closer to an audit trail than a model card. It is also the artifact that makes post-hoc investigation possible when a decision is challenged.

Gap 3: The Explainability Gap

The obligation to explain adverse credit decisions is well-established in most jurisdictions: adverse action notices in the U.S., the transparency provisions of the GDPR in Europe (often described as a right to explanation), and emerging equivalents in Kenya and Nigeria. For a traditional credit scoring model, this obligation is met by identifying the top factors that drove a low score.

For an agentic system, this is structurally harder. The final credit decision may be a function of the identity agent's confidence score, the document AI's extraction accuracy, the bureau's returned data, and the orchestrator's reasoning about how to weight conflicting signals. Any one of these could be the proximate cause of an adverse decision. The chain of causation runs through multiple systems, potentially including steps that are probabilistic and non-deterministic.

Financial institutions deploying agentic systems in credit must invest in step-level decision tracing: the ability to reconstruct, for any given application, the full sequence of agent actions and the reasoning that connected them to the final outcome. This is technically achievable, but it requires deliberate architecture, not retrospective instrumentation.
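
A minimal sketch of what reconstruction could look like, assuming every agent step has already been recorded with an application ID, a sequence number, and a short rationale. The record shape, the field names, and the rule that the earliest adverse step counts as the proximate cause are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    application_id: str
    seq: int          # position of the step within the application's workflow
    agent: str
    output: str
    rationale: str    # why the orchestrator acted on this output as it did
    adverse: bool = False


def reconstruct_trace(log, application_id):
    """Return one application's steps in execution order."""
    return sorted(
        (s for s in log if s.application_id == application_id),
        key=lambda s: s.seq,
    )


def first_adverse_step(trace):
    """Assumption: the earliest adverse step is treated as the proximate cause."""
    return next((s for s in trace if s.adverse), None)


# Hypothetical log with interleaved, out-of-order entries from two applications.
log = [
    Step("APP-2", 1, "identity", "match", "registry record found"),
    Step("APP-1", 2, "document_ai", "anomaly flagged",
         "statement totals inconsistent", adverse=True),
    Step("APP-1", 1, "identity", "match", "registry record found"),
    Step("APP-1", 3, "orchestrator", "route to human review",
         "anomaly outweighs clean bureau result", adverse=True),
]

trace = reconstruct_trace(log, "APP-1")
for step in trace:
    print(f"{step.seq}. {step.agent}: {step.output} ({step.rationale})")
cause = first_adverse_step(trace)
```

The point of the sketch is that the explanation is only as good as what was captured at each step: if the rationale field was never written at decision time, no later tooling can recover it.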

Gap 4: The Accountability Gap

When a traditional credit model produces a wrong decision, accountability is relatively clear: the model development team, the model owner, and the validation function all bear defined responsibilities under SR 11-7's governance structure.

When an agentic system produces a wrong decision, accountability is diffuse. Was the error caused by the orchestrator's reasoning? By a sub-agent's incorrect output? By a tool that returned stale data? By a training data bias in the document AI? By an edge case the escalation policy didn't anticipate? Each of these implicates a different team, a different vendor, and potentially a different regulatory regime.

As institutions increasingly rely on third-party foundation models as their orchestration layer, the question of who is responsible for an agentic system's decision becomes a question about accountability for a system whose core reasoning component was developed by an entity with no regulatory relationship to the deploying institution. Existing SR 11-7 guidance on vendor model risk is a starting point, but it was not written with foundation LLMs in mind.

6. A Practical Governance Framework for Agentic Credit Systems

The four gaps described above are not insurmountable. They require a shift in how governance is conceived, from validating a model at a point in time, to governing a system's behavior continuously.

The following framework proposes four governance components appropriate to agentic credit and KYC deployments, designed to be compatible with SR 11-7's intent and implementable within the CBK and CBN regulatory contexts.

Governance Layer	Traditional MRM (SR 11-7)	Agentic AI Requirement
Validation	Periodic, pre-deployment	Continuous behavioral monitoring
Documentation	Static model card	Dynamic action log per agent
Explainability	Output-level justification	Step-level decision trace
Human oversight	Model owner review cycle	Tiered authorization thresholds
Accountability	Model developer / validator	Orchestrator + sub-agent lineage

Component 1: Behavioral Monitoring Over Periodic Validation

Rather than validating the agentic system as a whole at periodic intervals, institutions should implement continuous behavioral monitoring at the agent level, tracking input distribution drift, output distribution drift, tool call failure rates, escalation rates, and the correlation between agent outputs and downstream decision outcomes.

When behavioral drift is detected (for example, when the document AI agent's extraction confidence drops across a class of document types), this should trigger a targeted revalidation of that agent, not a full system revalidation. This modular approach is both more efficient and more responsive to the actual sources of risk.
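
As an illustration of agent-level drift detection, the sketch below computes a Population Stability Index (PSI) over each agent's confidence scores and flags only the agents that have drifted. PSI is one common drift statistic, swapped in here for concreteness; the paper does not prescribe it. The 0.25 cut-off is a widely used rule of thumb rather than a regulatory figure, and the agent names and sample values are hypothetical. The function assumes non-empty samples with values in [0, 1].

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two samples of values in [0, 1]."""
    def shares(values):
        counts = [0] * bins
        for v in values:
            counts[min(int(v * bins), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def agents_to_revalidate(baseline, live, threshold=0.25):
    """Return only the agents whose confidence distribution has shifted."""
    return [
        agent for agent in baseline
        if psi(baseline[agent], live[agent]) > threshold
    ]


# Hypothetical per-agent confidence samples.
baseline = {
    "document_ai": [0.90, 0.92, 0.95, 0.91, 0.93, 0.94],
    "bureau": [0.80, 0.85, 0.90, 0.82, 0.88, 0.86],
}
live = {
    "document_ai": [0.60, 0.62, 0.65, 0.90, 0.61, 0.63],  # extraction confidence fell
    "bureau": [0.80, 0.85, 0.90, 0.82, 0.88, 0.86],       # unchanged
}

print(agents_to_revalidate(baseline, live))  # ['document_ai']
```

Only the drifting agent is returned, which is the modular behavior described above: the bureau agent is left alone because its distribution has not moved.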

Component 2: Tiered Authorization for Agent Actions

Not all agent actions carry the same risk profile. An agent that queries a publicly available business registry is categorically different from an agent that issues a loan offer or flags an application for rejection. Governance frameworks should establish explicit authorization tiers:

Autonomous tier: low-stakes, easily reversible actions (e.g., requesting additional documentation)
Supervised tier: actions above defined confidence or value thresholds, requiring human confirmation
Prohibited tier: actions categorically outside agent scope regardless of confidence
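
The three tiers can be expressed as a small, deny-by-default authorization check. The action names and their tier assignments below are hypothetical, and the sketch simplifies the supervised tier to a static per-action assignment rather than a confidence or value threshold.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"
    SUPERVISED = "supervised"
    PROHIBITED = "prohibited"


# Hypothetical action catalogue; a real one would be owned by risk governance.
ACTION_TIERS = {
    "request_documents": Tier.AUTONOMOUS,
    "query_business_registry": Tier.AUTONOMOUS,
    "issue_loan_offer": Tier.SUPERVISED,
    "reject_application": Tier.SUPERVISED,
    "modify_escalation_policy": Tier.PROHIBITED,
}


def authorize(action: str, human_approved: bool = False) -> bool:
    """Decide whether an agent may execute an action."""
    # Actions missing from the catalogue are treated as prohibited.
    tier = ACTION_TIERS.get(action, Tier.PROHIBITED)
    if tier is Tier.AUTONOMOUS:
        return True
    if tier is Tier.SUPERVISED:
        return human_approved
    return False


print(authorize("request_documents"))                      # True
print(authorize("issue_loan_offer"))                       # False
print(authorize("issue_loan_offer", human_approved=True))  # True
print(authorize("wire_funds", human_approved=True))        # False
```

Treating unlisted actions as prohibited means a newly added tool cannot be used by an agent until someone has explicitly assigned it a tier.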

Component 3: Immutable Decision Logs

Every agent action in a credit or KYC workflow should be logged in an immutable, timestamped record: the agent ID, the input it received, the output it produced, the tool(s) it called, and the orchestrator's decision about how to use that output. These logs are the audit infrastructure that makes post-hoc investigation and regulatory examination possible.

This is an architecture decision that must be made at design time, not retrofitted after deployment. Institutions building or procuring agentic credit systems should require this logging capability as a contractual specification.
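
One common way to approximate immutability in software is a hash chain, in which each entry commits to the one before it. The sketch below is a minimal in-memory illustration with hypothetical field names. It makes tampering detectable rather than impossible, so a real deployment would still need write-once storage and access controls around it.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    # Canonical JSON so the same entry always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class DecisionLog:
    """Append-only, hash-chained record of agent actions (in-memory sketch)."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    def append(self, agent_id, input_data, output_data, tools, orchestrator_use):
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "input": input_data,
            "output": output_data,
            "tools": tools,
            "orchestrator_use": orchestrator_use,
            "prev_hash": prev_hash,
        }
        self._entries.append({**body, "hash": _digest(body)})
        return self._entries[-1]["hash"]

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to its predecessor."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


audit_log = DecisionLog()
audit_log.append("identity", {"id_number": "redacted"}, {"match": True},
                 ["national_registry"], "proceed to bureau check")
audit_log.append("bureau", {"applicant": "APP-001"}, {"score": 680},
                 ["credit_bureau_api"], "pass score to risk model")
print(audit_log.verify())  # True
```

If any stored entry is later altered, its recomputed hash no longer matches and `verify()` returns False, which gives an examiner a simple integrity check over the full decision history.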

Component 4: Human-in-the-Loop Triggers Based on Confidence and Reversibility

The appropriate role for human oversight is not to review every decision, but to be reliably inserted when the system's confidence is low or the decision's consequences are difficult to reverse. Escalation policies should be explicit and codified:

Aggregate confidence below a defined threshold → route to human underwriter
Decision value above a defined threshold → human confirmation required
Identity verification conflict unresolvable → application stops

These policies should themselves be documented, validated, and version-controlled. They define the boundary between autonomous action and human oversight, and are therefore the governance layer's most consequential artifact.
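
Codifying the policy can be as simple as a versioned, immutable configuration object and a pure routing function. The thresholds, the version label, the route names, and the precedence order (identity conflict first, then confidence, then value) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationPolicy:
    version: str           # recorded with every decision for later audit
    min_confidence: float  # below this, a human underwriter decides
    max_auto_value: float  # above this, a human must confirm


def route(policy: EscalationPolicy, confidence: float, value: float,
          identity_conflict: bool) -> str:
    """Apply the escalation rules in a fixed order of precedence."""
    if identity_conflict:
        return "stop"
    if confidence < policy.min_confidence:
        return "human_underwriter"
    if value > policy.max_auto_value:
        return "human_confirmation"
    return "autonomous"


# Hypothetical policy values.
policy_v1 = EscalationPolicy(version="2025-01", min_confidence=0.80,
                             max_auto_value=500_000)

print(route(policy_v1, confidence=0.92, value=100_000, identity_conflict=False))  # autonomous
print(route(policy_v1, confidence=0.65, value=100_000, identity_conflict=False))  # human_underwriter
print(route(policy_v1, confidence=0.92, value=900_000, identity_conflict=False))  # human_confirmation
print(route(policy_v1, confidence=0.99, value=100_000, identity_conflict=True))   # stop
```

Because the policy is a frozen value with an explicit version, any change to a threshold produces a new object that can be reviewed, validated, and tied to the decisions made under it.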

7. Conclusion: Build the Governance Layer Now

The adoption curve is not waiting for governance frameworks to catch up. Research from S&P Global indicates that 54% of financial services firms had deployed AI initiatives by early 2025, up from 40% a year prior. [10] In Africa, the pace of agentic deployment in credit and KYC is accelerating precisely because the operational case is overwhelming: faster onboarding, lower cost per decision, and the ability to serve segments that manual underwriting cannot reach profitably.

The problem is not that institutions are moving fast. The problem is that they are moving fast without a shared language for what governance of these systems should look like. SR 11-7 provides a starting point, but it is a framework designed for a technology that no longer represents the frontier. African regulatory bodies are closer to the starting line, and therefore have the opportunity to write frameworks that are native to what these systems actually are.

The argument of this paper is not that agentic AI in credit and KYC should be slowed down. It is that the governance infrastructure should be built in parallel with the deployment, not retrofitted when something goes wrong. The cost of building a behavioral monitoring system, an immutable decision log, and a tiered authorization policy at the outset of an agentic deployment is marginal. The cost of reconstructing accountability after an adverse regulatory finding or a systematic credit error is not.

The practitioners building these systems right now, at credit bureaus, at digital lenders, at KYC platforms across East and West Africa, are making architectural decisions today that will define how these questions get answered. The frameworks this paper proposes are not regulatory compliance artifacts. They are engineering decisions about what kind of system gets built. And those decisions belong to the people building them.

References

[1] FinTech Weekly. "Agentic AI Enabled Credit Evaluation Process: A Strategic Blueprint." January 2026. https://www.fintechweekly.com/magazine/articles/agentic-ai-credit-evaluation-strategic-blueprint

[2] Neontri. "Agentic AI in Banking: Strategy Guide for C-Level Leaders." November 2025. https://neontri.com/blog/agentic-ai-banking/

[3] Sharma, Krishan. "SR 11-7 in the Age of Agentic AI: Where the Framework Holds – and Where It Strains." GARP Risk Intelligence, March 2026. https://www.garp.org/risk-intelligence/operational/sr-11-7-age-agentic-ai-260227

[4] ValidMind. "From Static Models to Autonomous Agents: How AI Risk Management Must Evolve." December 2025. https://validmind.com/blog/how-ai-risk-management-must-evolve/

[5] Fintech News Africa. "CBK Licenses 41 More Digital Credit Providers." June 2025. https://fintechnews.africa/45405/fintech-kenya/cbk-licenses-41-digital-credit-providers-2025/

[6] Afriwise. "The Draft Central Bank of Kenya (Non-Deposit Taking Credit Providers) Regulations, 2025." https://www.afriwise.com/blog/the-draft-central-bank-of-kenya-non-deposit-taking-credit-providers-regulations-2025

[7] Mwangi, Jimmie. "The AI Governance Kenya Needs For Fair Credit Access." CIO Africa, August 2025. https://cioafrica.co/the-ai-governance-kenya-needs-for-fair-credit-access/

[8] VOVE ID. "AML Compliance in Nigeria: A 2025 Guide for Fintechs and Regulated Businesses." July 2025. https://blog.voveid.com/aml-compliance-in-nigeria-a-2025-guide-for-fintechs-and-regulated-businesses/

[9] TechCabal. "These African Countries Passed Major Tech Laws in 2025." December 2025. https://techcabal.com/2025/12/05/these-african-countries-passed-major-tech-laws-in-2025/

[10] Fintech Global. "Agentic AI Drives Next Phase of AML Innovation." February 2026. https://fintech.global/2026/02/16/agentic-ai-drives-next-phase-of-aml-innovation/

[11] Federal Reserve / OCC. "Supervisory Guidance on Model Risk Management (SR 11-7)." April 2011. https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm

[12] Hogan Lovells. "Agentic AI in Financial Services: Regulatory and Legal Considerations." 2025. https://www.hoganlovells.com/en/publications/agentic-ai-in-financial-services-regulatory-and-legal-considerations

[13] Deloitte Insights. "How Banks Can Supercharge Intelligent Automation with Agentic AI." December 2025. https://www.deloitte.com/us/en/insights/industry/financial-services/agentic-ai-banking.html

[14] ScienceDirect. "AI Agents in Finance and Fintech: A Scientific Review." November 2025. https://www.sciencedirect.com/org/science/article/pii/S1546221825010938
