AAM Magazine
November 2025

Governed autonomy

By Triwit Ariyathugun, Joseph Cherian, and Bhaskar Kaushik*   
A framework for human-AI collaboration in decision-making

The rapid integration of artificial intelligence into financial services presents complex challenges and unprecedented opportunities. Institutions must critically examine where AI systems – from algorithmic trading to robo advising – excel, where they fall short, and how human oversight can optimise outcomes as adoption of the technology accelerates.

We have developed a framework, grounded in delegated management theory and real-world cases, to examine why individuals delegate decisions to AI and to analyse the domains where the technology excels versus where it falls short. We also propose a multi-pronged approach – encompassing bounded autonomy, theoretical anchoring, dynamic oversight, and personalised risk calibration – for effective implementation and risk management in human-AI financial collaboration.

Delegated management theory sheds light on how individuals delegate decision-making to artificial agents. Evidence from industry adoption and academic research reveals a preference for assigning high-stakes, loss-averse tasks to AI, even when its performance merely matches that of human experts. The technology’s reputation for impartiality alleviates concerns about irrational decision-making, fostering greater trust in its reliability within financial systems.

Trust in AI rests more on perceived objectivity than on process transparency. While humans earn additional delegation when they explain their reasoning, AI earns it through consistently reliable outcomes. These behavioural patterns help explain why investors often migrate to robo advisers during periods of extreme market turbulence: the technology’s perceived impartiality mitigates worries about emotional bias, panic, and the other cognitive biases that humans exhibit when markets dislocate.

Domains of excellence

AI has found a strong foothold in finance, where tasks are data-dense, rules-based, and readily measurable. Research by Goldman Sachs in 2023 estimated that generative AI could automate 44% of legal work and 46% of administrative work. Real-world applications already demonstrate the technology’s dominance in three key areas.

  • Algorithmic trading: Machine-learning engines ingest voluminous trade records, order flow information, news sentiment, and other market signals in milliseconds, then execute strategies at speeds no human trader can match. For example, one of us wrote in this journal in September 2019 (Vol 24, No. 9) that blockchain technology can broaden access to the best alternative investment opportunities.
  • Document processing: JPMorgan’s Contract Intelligence system reviews in seconds the roughly 12,000 commercial loan agreements a year that previously consumed some 360,000 hours of lawyers’ time.
  • Fraud detection and credit underwriting: Credit card networks deploy anomaly detection models to screen millions of transactions every day. In consumer lending, Ant Group, a leading Chinese financial technology company, uses a system that evaluates 3,000 variables per borrower and makes credit decisions in 3.2 seconds with a default rate well below the industry average.
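
The anomaly screening described in the fraud-detection example can be illustrated with a minimal sketch, assuming made-up transaction features and scikit-learn’s IsolationForest; the feature set, data, and thresholds are illustrative assumptions, not any card network’s actual production pipeline.

```python
# Hypothetical transaction anomaly screen using an isolation forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Simulated history of legitimate transactions: amount, hour of day, merchant risk score.
history = np.column_stack([
    rng.lognormal(3.5, 0.6, 10_000),   # typical purchase amounts
    rng.integers(8, 22, 10_000),       # mostly daytime activity
    rng.uniform(0.0, 0.3, 10_000),     # low merchant-risk scores
])

model = IsolationForest(contamination=0.01, random_state=0).fit(history)

# New transactions to screen: one ordinary, one unusual (large amount, 3am, risky merchant).
new_txns = np.array([[40.0, 14, 0.1],
                     [9_500.0, 3, 0.9]])
flags = model.predict(new_txns)        # -1 = anomalous, 1 = looks normal

for txn, flag in zip(new_txns, flags):
    action = "escalate for review" if flag == -1 else "auto-approve"
    print(f"amount={txn[0]:>8.2f}  hour={int(txn[1]):>2d}  ->  {action}")
```

In practice such a model would be retrained continuously and its flags routed into the escalation logic discussed later, rather than acting as the sole decision-maker.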

Limitations

AI’s impressive performance in finance masks four structural weaknesses. First, large language models can hallucinate: they can fabricate facts, citations, or even entire data series with perfect grammatical polish. For instance, in Mata v. Avianca in 2023, lawyers submitted a legal brief generated by ChatGPT that cited six non-existent court opinions.

Second, data-driven systems tend to amplify biases already present in their training sets. Amazon’s resume-screening pilot in 2015 systematically – and unsurprisingly – down-ranked female applicants for software-engineering jobs. Academic research identifies “fairness” techniques, including reweighting and adversarial debiasing, that can mitigate such biases in financial applications.
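
As a concrete illustration of the reweighting technique, the sketch below computes sample weights that equalise the influence of each group-outcome cell in a small, hypothetical loan-application training set; the column names and figures are our own, and production systems would more likely use a dedicated fairness toolkit.

```python
# Reweighting sketch: weight each (group, outcome) cell so that an
# under-represented combination is not drowned out during training.
import pandas as pd

# Hypothetical loan-application training data.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B"],
    "approved": [ 1,   1,   1,   0,   1,   0 ],
})

n = len(df)
cell_counts = df.groupby(["group", "approved"]).size()
group_share = df["group"].value_counts(normalize=True)
label_share = df["approved"].value_counts(normalize=True)

def weight(row):
    # Expected cell share if group and outcome were independent, over the observed share.
    expected = group_share[row["group"]] * label_share[row["approved"]]
    observed = cell_counts[(row["group"], row["approved"])] / n
    return expected / observed

df["weight"] = df.apply(weight, axis=1)
print(df)
# The resulting weights can be passed to most classifiers via a `sample_weight` argument.
```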

Third, AI may pass statistical performance tests but still violate bedrock principles of finance. Academic research has documented cases where AI trading agents achieve excellent Sharpe ratios while trading on stale prices, violating no-arbitrage principles, or ignoring steep transaction costs. The academic literature attributes such failures to “under-specification”, whereby models learn spurious correlations rather than meaningful economic relationships between variables.

Fourth, linguistic and cultural diversity introduces risks. AI trained on English data may overlook non-Western contexts, for example, Japan’s zaibatsu or family-run monopolies evolving into keiretsu networks. Systems trained on Western data might misinterpret such structures, leading to exclusion or flawed risk assessments in civil law jurisdictions with distinct ownership norms.

Mathematical approach

Modern AI systems make probabilistic, non-deterministic decisions: they can yield different outputs for identical inputs, and they can be confidently wrong or uncertainly right.

We propose statistical thresholds that escalate low-confidence decisions, combined with context-aware escalation that mirrors human expert protocols, so that review is triggered whenever uncertainty exceeds risk tolerance.

We can express the problem as:

Prob (correct | confidence = c) ≠ c

In other words, a model’s stated confidence does not, in general, match its realised accuracy.

For financial applications, we derive context-specific confidence thresholds, τi, using:
τi = min { c: Prob (correct | confidence ≥ c) > αi }
where αi represents the required reliability level for context i.

For example, in a credit approval system, an AI model may report 80% confidence in a loan recommendation but achieve only 65% accuracy at that level. To ensure 90% reliability (αi = 0.9), we set a confidence threshold (τi) at 92%. Decisions below 92% confidence are escalated for human review, filtering out overconfident errors while preserving autonomy for high-certainty cases.
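
A minimal sketch of how such a threshold could be estimated from a labelled validation set follows; the function name, the simulated data, and the degree of miscalibration are all illustrative assumptions. The code simply scans candidate confidence cut-offs and returns the smallest one whose conditional accuracy exceeds the required reliability αi, mirroring the formula above.

```python
# Sketch: derive a context-specific confidence threshold tau_i from validation data.
# tau_i = min { c : Prob(correct | confidence >= c) > alpha_i }
import numpy as np

def confidence_threshold(confidences, correct, alpha):
    """Smallest confidence cut-off whose conditional accuracy exceeds alpha."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    for c in np.sort(np.unique(confidences)):
        mask = confidences >= c
        if correct[mask].mean() > alpha:
            return c
    return None  # no cut-off meets the reliability target

# Hypothetical validation set for a credit-approval model.
rng = np.random.default_rng(1)
conf = rng.uniform(0.5, 1.0, 5_000)
# Simulate a miscalibrated model: realised accuracy lags stated confidence.
correct = rng.uniform(size=conf.size) < np.clip(conf - 0.08, 0.0, 1.0)

tau = confidence_threshold(conf, correct, alpha=0.90)
if tau is None:
    print("no confidence level meets the reliability target; escalate everything")
else:
    print(f"escalate to a human whenever confidence < {tau:.2f}")
```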

Just as a doctor orders additional tests when diagnostic confidence falls below 90%, AI systems should escalate decisions to human reviewers when uncertainty exceeds personalised thresholds. These thresholds should vary by context – more permissive for low-stakes decisions like document categorisation, and tighter for critical functions like fraud detection.

The framework

An effective implementation and risk management framework rests on four mutually reinforcing dimensions:

  • Bounded autonomy
    Draw clear operational limits for each AI system. Trading algorithms operate only within pre-set risk limits, for example, a maximum position size of 2% of portfolio value; document systems process only standardised templates; and credit models decide only when data coverage meets minimum thresholds. When boundaries are breached, the AI workflow automatically forks to human review (a minimal sketch follows below).
  • Theoretical anchoring
    Embed first principles as hard constraints. “Hard-code” core economic rules, such as no-arbitrage and budget constraints, into the objective function. For example, portfolio allocation systems should incorporate lifecycle investment rules that dynamically adjust asset allocation based on job stability, the covariation of wage income (human capital) with the market, and risk profile.
  • Dynamic oversight
    Implement continuous rather than episodic monitoring. Each model output carries an uncertainty score; when confidence deteriorates, such as when market volatility spikes, decisions are automatically throttled for human clearance. Some central banks stress-test systems with adversarial scenarios to uncover hidden failure modes.
  • Personalised risk calibration
    Adjust thresholds to match individual risk capacity. For example, a broker with substantial existing market exposure may be limited to low-risk trades, e.g., less than 2% portfolio risk, while a tenured professor with a stable salary can be given a higher auto-approval limit, e.g., up to 5% risk.

Safeguards adapt to financial stability and risk exposure, ensuring fairness by linking autonomy to loss absorption capacity.
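
To make the bounded autonomy and personalised risk calibration dimensions concrete, the minimal sketch below (referenced in the first bullet above) routes a hypothetical trade proposal either to automatic execution or to human review, based on a caller-specific risk limit and the confidence threshold derived earlier. The class, function names, and limits are illustrative assumptions, not a prescribed implementation.

```python
# Governed-autonomy sketch: route an AI trade proposal to automatic execution
# or to human review, based on pre-set risk limits and model confidence.
from dataclasses import dataclass

@dataclass
class TradeProposal:
    symbol: str
    position_pct: float   # proposed position as a % of portfolio value
    confidence: float     # model's stated confidence in the recommendation

def govern(proposal: TradeProposal, max_position_pct: float, tau: float) -> str:
    """Bounded autonomy: act only inside pre-set limits; otherwise escalate."""
    if proposal.position_pct > max_position_pct:
        return "ESCALATE: position exceeds pre-set risk limit"
    if proposal.confidence < tau:
        return "ESCALATE: confidence below context threshold"
    return "AUTO-EXECUTE"

# Personalised risk calibration: a tighter limit for an already-exposed broker,
# a looser one for an investor with stable income and low existing exposure.
print(govern(TradeProposal("ABC", 1.5, 0.95), max_position_pct=2.0, tau=0.92))
print(govern(TradeProposal("XYZ", 4.0, 0.97), max_position_pct=2.0, tau=0.92))
print(govern(TradeProposal("XYZ", 4.0, 0.97), max_position_pct=5.0, tau=0.92))
```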

Yin and yang of decision-making

Modern decision-making no longer pits human judgement against machine intelligence – it prospers when the two operate in concert. Our four-dimensional risk governance framework offers a practical blueprint for that partnership.

Organisations can let algorithms deliver speed and scale by carving out clear domains of automated action, embedding first-principles constraints, subjecting every model to real-time validation, and customising risk thresholds to individual circumstances. At the same time, humans supply strategic context and ultimate accountability.

The payoffs are tangible:

  • Amplified expertise: vast data streams become actionable insights for subject matter experts.
  • Reduced blind spots: human reviewers catch contextual nuances and theoretical violations.
  • Personalised risk management: governance calibrated to individual circumstances.

While we developed this framework for financial settings, its principles extend naturally to other domains:

In healthcare, a radiology study demonstrated that an AI algorithm, tested across multiple hospitals, detected abnormal chest X-rays with 99.1% sensitivity, automatically clearing high-confidence normals and flagging lower-confidence cases for radiologist review.

Law firms now deploy AI to autonomously review standard contracts such as leases and non-disclosure agreements, and escalate unique clauses to attorneys. They encode legal precedents as constraints, enabling systems to flag potential violations automatically.

Adaptive learning platforms adjust content difficulty based on each student’s performance. If the system detects gaps in understanding, it can personalise the learning pace or schedule periodic repetition to match each student’s demonstrated level of mastery.

Across applications, the framework maintains the essential balance between automation efficiency and appropriate human oversight. Many financial regulators are converging on governance approaches that blend AI efficiency with human accountability. Together, these requirements keep humans in control of AI outputs, so society enjoys the efficiency gains of automation while human judgement remains the final check.

The relationship between artificial intelligence and agency – the ability to delegate decision-making to AI tools that operate independently – is a delicate and complex one. Proceed with caution: informed judgement is essential.

* Dr. Triwit Ariyathugun is an alumnus of the University of Chicago’s economics department; Dr. Joseph Cherian is chief executive officer and distinguished professor at the Asia School of Business in Kuala Lumpur; and Bhaskar Kaushik is a Ph.D. candidate in finance at the University of Oklahoma’s Price College of Business.


Readers interested in the working papers underpinning this article may refer to the following website:
https://docs.google.com/document/d/1grc6HImM5Dd5_aPCG7Ott-1vBhPAFnup/edit?usp=sharing&ouid=108760237008285180318&rtpof=true&sd=true