On the twenty-fourth of February 2026, Jamie Dimon addressed an investor conference and offered a remark that deserved rather more analytical attention than the financial press, absorbed as it was by his commentary on interest rates and the approaching final rule on capital, elected to give it. "We have displaced people from AI," he said, and then, in a tone suggesting he was describing a thing that had already been settled, "and we offer them other jobs." He added, with an emphasis that implied he had anticipated this question, that JPMorgan Chase has "huge redeployment plans" for its affected employees [1]. The observation was remarkable not because it was alarming, but because it was unremarkable to the man making it. Dimon was not announcing a crisis, or a workforce restructuring, or a change in strategic direction. He was describing a state of affairs that had arrived, as major transformations in banking infrastructure tend to arrive, without a formal announcement and without the attention it warranted.
The language of "redeployment" is the language of managed transition, and JPMorgan's operational data supports the claim that the transition is genuinely managed rather than merely euphemised. The bank's overall headcount of 318,512 has remained approximately stable over the past year; operations staff fell by four per cent and support staff by two per cent, while client-facing and revenue-generating roles rose by four per cent [1]. Operations teams now process six per cent more accounts per employee than a year ago. Fraud-related costs per unit have declined by eleven per cent. Software engineer productivity, measured by the bank's internal metrics, has risen by ten per cent. These are not the numbers of a bank in the early stages of an AI experiment. They are the numbers of an institution in which AI-assisted processes have achieved sufficient scale to move aggregate efficiency figures.
This article is about that scale, and about what it means to have achieved it. The deployment of large language models into the core operational workflows of the major global banks has proceeded, over the past two years, with a speed and a degree of institutional discretion that stands in notable contrast to the more publicised debates about artificial intelligence in finance. Those debates have largely concerned the visible, user-facing layer: assistants that surface relevant case law for compliance lawyers, tools that help analysts draft research notes, copilots that autocomplete routine correspondence. These applications exist, and they matter; but they are not the story that Dimon's comment reveals. The story Dimon's comment reveals is an infrastructure story, and infrastructure stories, by their nature, do not make good headlines until something goes wrong.
The Precedent: COBOL and the Invisible Transformation
There is a historical precedent that illuminates the present situation better than any of the comparisons that have been offered in the popular commentary. It is not the spreadsheet, which supplemented rather than replaced the accountant's judgment, and it is not the ATM, whose labour-substitution dynamics I have addressed elsewhere in these pages. The relevant precedent is the mainframe and the programming language that ran on it.
COBOL, the Common Business-Oriented Language, was designed between 1959 and 1961 by a committee convened under the auspices of the United States Department of Defense, with Grace Hopper, whose FLOW-MATIC language was its most significant antecedent, serving as a senior technical adviser. Its stated purpose was to create a language for business data processing that could be read and understood by managers who were not programmers, and that would run portably across different hardware manufacturers' machines. In both ambitions it partially succeeded and partially failed, but its adoption by the financial services industry was rapid and, in retrospect, remarkable for how little attention it received from those outside the computing rooms of the major banks. By the mid-1970s, every significant clearing bank in Britain and every money-centre bank in the United States had migrated its core processing, its settlement systems, and its general ledger functions onto COBOL-based mainframe architectures. The trade press of the period noted these deployments as productivity improvements; the Bank of England and the Federal Reserve considered them primarily as operational matters. There was no systemic-risk assessment of the kind that the Financial Stability Board would later learn to commission.
The consequence of this quiet transformation is well understood, though rarely connected to the present moment in discussions of AI in finance. An estimated 95 billion lines of COBOL code remain in active production today [2], running a disproportionate share of the world's daily financial transactions: clearing, settlement, ledger-posting, and the dozens of subsidiary processes that keep the apparatus of money in motion. This code was written between roughly 1960 and 1990 by people who are now largely retired or deceased, in a language that fewer than a hundred thousand active programmers can read with fluency. It is the most critical and least visible infrastructure in global finance, maintained primarily because the cost and risk of replacing it exceed, for most institutions, the cost of continuing to employ the specialists who understand it. The Y2K remediation of 1997 to 1999 was, in its operational essence, a belated reckoning with the consequences of infrastructure decisions that had been made without adequate regard for their long-term implications.
I raise this history not to suggest that LLMs will prove as durable as COBOL, a proposition that would require a degree of technological prediction I do not intend to offer. I raise it because the structural dynamics are similar: a transformative technology is being embedded in operational infrastructure at a speed that outpaces the development of the governance frameworks appropriate to it, and the entities doing the embedding are, quite rationally, more focused on the productivity benefits of deployment than on the long-term implications of dependency.
The Scale of What Has Been Deployed
The figures for JPMorgan Chase are, by some margin, the most extensively documented among the major institutions, in part because the bank's leadership has been willing to publish them and in part because the scale is sufficiently large that it is difficult to overstate. As the accompanying chart illustrates, the bank's annual technology investment has grown from approximately nine and a half billion dollars in 2017 to eighteen billion dollars in 2025, a compound increase that accelerated materially from 2022 onwards, precisely as the generative AI era began [3]. Of the 2024 technology budget, approximately 1.3 billion dollars was specifically allocated to AI capabilities, a figure projected to generate between 1.5 and 2 billion dollars in annual business value [4].
The operational deployment behind these figures is, if anything, more striking than the investment itself. JPMorgan's LLM Suite, a model-agnostic platform that routes queries across multiple foundation models depending on task characteristics, had reached all two hundred thousand of its employees globally as of 2025, with users reporting several hours per week of time recovered from lower-value tasks [4]. More significant, from an infrastructure perspective, than the employee-facing tool is the bank's parallel deployment of AI into operational processes that are not employee-facing at all. Its Know Your Customer processing provides the clearest illustration: in 2022, JPMorgan processed 155,000 KYC files with a workforce of 3,000 dedicated staff. By the following year, the projected capacity was 230,000 files, a nearly fifty per cent volume increase, handled by approximately 2,400 staff, a twenty per cent headcount reduction, representing in aggregate a productivity improvement of roughly eighty-five per cent per file processed [4]. This is not the efficiency gain of a tool that helps an analyst work faster. This is the efficiency gain of a process whose fundamental architecture has been altered.
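The arithmetic behind that claim can be checked with nothing beyond the figures just cited. The short sketch below uses only those published numbers; the per-file productivity gain falls out directly.

```python
# Back-of-the-envelope check on JPMorgan's published KYC figures [4].
# All inputs below are the numbers quoted in the text; nothing else is assumed.

files_2022, staff_2022 = 155_000, 3_000  # KYC files processed and dedicated staff, 2022
files_2023, staff_2023 = 230_000, 2_400  # projected files and staff, the following year

per_head_2022 = files_2022 / staff_2022  # ~51.7 files per employee
per_head_2023 = files_2023 / staff_2023  # ~95.8 files per employee

volume_growth = files_2023 / files_2022 - 1            # ~+48% more files
headcount_change = staff_2023 / staff_2022 - 1         # ~-20% staff
productivity_gain = per_head_2023 / per_head_2022 - 1  # ~+85% per file processed

print(f"volume growth:     {volume_growth:+.0%}")
print(f"headcount change:  {headcount_change:+.0%}")
print(f"productivity gain: {productivity_gain:+.0%}")
```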
BNY Mellon, whose operational footprint as the world's largest custodian bank spans more daily transaction volume than most central banks process in a year, has deployed a comparable internal platform under the name Eliza. Built in 2023 on Microsoft Azure infrastructure and drawing on a combination of OpenAI's GPT-4, Google's Gemini, and Meta's Llama models, Eliza was extended in early 2025 through a multi-year agreement with OpenAI that enhanced its reasoning capabilities [5]. By the close of 2025, more than half of BNY Mellon's employees were using Eliza regularly, a figure that would be noteworthy in isolation; the figure that is genuinely consequential is that 20,000 of those employees had become proficient enough to build their own agents on the platform, with 125 distinct use cases now operating in production [6]. When the users of an internal AI platform have become its developers, the platform has crossed a threshold from tool to infrastructure.
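Neither bank has published the routing logic of these platforms, so the following sketch should be read as an illustration of the model-agnostic pattern both describe, not as a description of either system; every task category, model name, and routing rule in it is hypothetical.

```python
# A minimal sketch of model-agnostic routing, the pattern LLM Suite and Eliza
# are both described as implementing. The task categories, model names, and
# routing table below are hypothetical; the banks' actual logic is not public.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    category: str  # e.g. "summarisation", "extraction", "reasoning"
    payload: str

# Hypothetical routing table: task characteristics -> backing foundation model.
ROUTES: dict[str, str] = {
    "summarisation": "gpt-4",        # long-context condensation
    "extraction":    "llama-3-70b",  # structured field extraction, run in-perimeter
    "reasoning":     "gemini-pro",   # multi-step analysis
}

def route(task: Task, call_model: Callable[[str, str], str]) -> str:
    """Dispatch a task to the model configured for its category,
    falling back to a default when the category is unrecognised."""
    model = ROUTES.get(task.category, "gpt-4")
    return call_model(model, task.payload)

# Usage with a stub standing in for the real model-serving layer:
if __name__ == "__main__":
    stub = lambda model, text: f"[{model}] processed {len(text)} chars"
    print(route(Task("extraction", "KYC file #1042 ..."), stub))
```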
The specific operational domain where BNY Mellon's AI deployment is most acutely needed illustrates the point. On an average day, between sixty and seventy billion dollars' worth of United States Treasury transactions fail to settle, a failure rate of approximately two per cent of daily volume that has persisted, with varying severity, since the structure of the Treasury market was established in the 1980s [7]. BNY Mellon, as the pre-eminent clearing bank for US government securities, bears a disproportionate share of the operational burden created by these failures. Working initially with Google Cloud, the bank developed a machine-learning system capable of predicting approximately forty per cent of potential settlement failures in Fed-eligible securities with ninety per cent accuracy, enabling pre-emptive intervention before the failure materialises [7]. This is not an assistant; it is an operational control system embedded in the clearing infrastructure of the world's most liquid sovereign bond market.
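Taking the cited ranges at face value, and with the simplifying assumption that "ninety per cent accuracy" translates directly into a hit rate on predicted failures, the scale of the opportunity is easy to make concrete:

```python
# What the cited figures imply about preventable settlement failures [7].
# Inputs are the ranges quoted in the text; reading "ninety per cent
# accuracy" as a simple hit rate is an assumption made for illustration.

daily_fails_usd = (60e9, 70e9)  # Treasury transactions failing to settle per day
predicted_share = 0.40          # share of potential failures the model can predict
hit_rate = 0.90                 # accuracy of those predictions (simplified)

for fails in daily_fails_usd:
    addressable = fails * predicted_share * hit_rate
    print(f"of ${fails / 1e9:.0f}bn in daily fails, ~${addressable / 1e9:.1f}bn "
          f"is flagged early enough for pre-emptive intervention")
# ~$21.6bn to $25.2bn per day, on these assumptions.
```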
Goldman Sachs occupies a somewhat different position in this deployment landscape, having proceeded through a succession of publicly described phases that permit a clearer view of the trajectory than most institutions offer. In January 2025, the firm extended access to its GS AI Assistant, a proprietary platform operating behind the firm's security perimeter and drawing on both OpenAI and Anthropic models, to approximately ten thousand analysts and associates [8]. By June 2025, the tool had been made available to all 46,500 of the firm's knowledge workers, with adoption exceeding fifty per cent within weeks of the broad rollout [9]. By February 2026, as this publication reported, Goldman had moved beyond the assistant layer entirely and was deploying autonomous agents, built in collaboration with embedded Anthropic engineers, into trade accounting and KYC compliance workflows [10], the same processes that JPMorgan's data shows are generating its most significant productivity improvements. The progression from assistant to agent is not a quantitative change in capability; it is a qualitative change in the nature of the bank's relationship with the AI system.
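That qualitative change can be stated in structural terms: in the assistant pattern a human commits every action, while in the agent pattern the system commits actions and the human reviews a log. The sketch below illustrates the contrast; the function names and workflow steps are hypothetical, since the actual implementations are not public.

```python
# The structural difference between the assistant and agent patterns.
# Function names and workflow steps are hypothetical illustrations;
# the systems described in the text are not publicly documented.

def assistant_pattern(case: dict, draft_fn, human_approves) -> bool:
    """Assistant: the model produces a draft; a human retains the
    authority and the accountability to commit it."""
    draft = draft_fn(case)
    return human_approves(draft)  # nothing happens without sign-off

def agent_pattern(case: dict, plan_fn, execute_fn, audit_log: list) -> None:
    """Agent: the model plans and executes; human involvement moves
    from per-action approval to after-the-fact review of the log."""
    for step in plan_fn(case):
        result = execute_fn(step)  # the system commits the action itself
        audit_log.append((step, result))

# Usage with stub workflow steps:
if __name__ == "__main__":
    log: list = []
    agent_pattern({"case": 1},
                  plan_fn=lambda c: ["post ledger entry", "reconcile"],
                  execute_fn=lambda step: f"done: {step}",
                  audit_log=log)
    print(log)
```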
Why This Is an Infrastructure Story, Not an Assistant Story
The distinction between an assistant and infrastructure deserves more careful treatment than the popular discussion has given it. An assistant, in the sense relevant here, is a tool that amplifies the productivity of a human who retains both the authority and the accountability for the work being done. A legal research assistant that surfaces relevant precedent makes the lawyer more efficient; it does not replace the lawyer's judgment, and the professional responsibility for the advice given remains entirely with the lawyer. The AI assistant deployed to an investment bank analyst to help draft research notes is, in this sense, a productivity tool, and its deployment does not raise qualitatively different governance questions from those raised by earlier generations of productivity software.
Infrastructure is different. Infrastructure is what the process runs on, and when the process fails, it fails at the infrastructure level. JPMorgan's KYC system, which now processes files at a rate that would require roughly 2,000 additional human staff to replicate without AI assistance, is not amplifying the productivity of individual analysts who could, if the system were removed, perform the same work more slowly. It is the system through which the work is done. The bank's operations staff fell by four per cent last year not because four per cent of them became less productive but because the infrastructure was doing work that they had previously done. This is the distinction that Dimon's "redeployment" language was accurately if compactly describing.
The infrastructure designation matters because it changes the relevant questions. The question one asks of an assistant is: how does this tool affect the human who uses it? The question one asks of infrastructure is: what happens when it fails, who is responsible, and how are the dependencies managed? The governance frameworks developed for AI assistants in financial services, by and large, ask the first question. The governance frameworks adequate to AI infrastructure need to ask the second.
The Herding Problem and the FSB's Warning
On the tenth of October 2025, the Financial Stability Board published a report on the monitoring of artificial intelligence adoption in the financial sector [11]. The report is an honest document, and its honesty is most evident in its acknowledgment that financial supervisory authorities are, as a group, still in the early stages of developing monitoring frameworks adequate to the scale of deployment they are observing. The report's principal concern is not any individual institution's AI deployment but a systemic property that emerges from the aggregate: what the FSB terms model homogenisation.
The concern is precise and analytically sound. When multiple major financial institutions build their AI infrastructure on a small number of foundation models, provided by a small number of vendors, trained on largely overlapping data, and deployed on a small number of cloud platforms, the resulting system acquires a property that no individual institution's risk management framework can fully address: correlated failure modes. If the conditions that cause one institution's AI-assisted fraud detection system to misclassify a transaction type are embedded in the model itself, rather than in the institution-specific deployment, then every institution using that model will misclassify the same transaction type at the same moment. The FSB drew the comparison to the homogenised Value-at-Risk models that characterised major bank risk management in the years preceding 2008 [11]. The parallel is not decorative; it is analytical. VaR models did not fail individually and at different times; they failed simultaneously, in the same direction, because they encoded the same assumptions about tail-risk correlations that the crisis revealed to be wrong.
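The force of the comparison can be exhibited with a toy simulation. Every number in the sketch below is invented for illustration; the point is the shape of the distribution of simultaneous failures, not the particular rates.

```python
# Toy illustration of the FSB's model-homogenisation concern [11].
# All rates and counts are invented; the point is how failures coincide,
# not the specific numbers.
import random

N_BANKS, N_DAYS, ERROR_RATE = 10, 100_000, 0.001
random.seed(0)

# Case 1: each bank runs its own model; errors are independent draws.
independent_worst = max(
    sum(random.random() < ERROR_RATE for _ in range(N_BANKS))
    for _ in range(N_DAYS)
)

# Case 2: all banks run the same foundation model; an error embedded in
# the shared model hits every institution on the same day.
shared_worst = max(
    N_BANKS if random.random() < ERROR_RATE else 0
    for _ in range(N_DAYS)
)

print(f"worst day, independent models: {independent_worst} banks affected")
print(f"worst day, shared model:       {shared_worst} banks affected")
# Independent errors almost never coincide; the shared model fails rarely,
# but when it does, all ten banks fail together and in the same direction.
```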
The FSB's report does not suggest that the current AI deployment has reached this threshold. It suggests that the monitoring frameworks necessary to assess whether it is approaching this threshold do not yet exist, and that developing them is urgent. The point bears emphasis because the gap between the pace of deployment and the pace of regulatory response is not a new observation, but the consequences of a significant gap between the two are more severe at the infrastructure layer than at the assistant layer. When the assistant fails, the human falls back on their own judgment; when the infrastructure fails, the process stops.
The Regulatory Frame That Does Not Quite Fit
The primary existing regulatory framework for the governance of AI models in US banking is the Federal Reserve's SR 11-7, a supervisory letter issued in 2011 that established guidance on model risk management [12]. SR 11-7 requires institutions to maintain a model inventory, to conduct independent validation of models before deployment, to monitor model performance on an ongoing basis, and to ensure that model limitations are understood by those who rely on model outputs. It is a well-constructed framework, and it is technically applicable to LLMs deployed in credit decisions, compliance determinations, and risk assessments. The difficulty is that it was designed for statistical models whose input-output relationships are, in principle, explicable: a credit scoring model can be interrogated to understand why it assigned a particular score, and an independent validator can assess whether the scoring methodology is sound.
Large language models are not, in the same sense, explicable. Their reasoning processes are distributed across billions of parameters in ways that do not reduce to a set of interpretable rules, and their behaviour in edge cases cannot be fully characterised by the kind of structured testing that SR 11-7's validation requirements contemplate. Applying SR 11-7 to a large language model deployed in a KYC determination involves a degree of conceptual stretching that the framework's authors did not intend, and that risks producing compliance activities that satisfy the letter of the guidance while leaving the substantive governance questions unaddressed. Regulators in the United States are aware of this tension, and the Office of the Comptroller of the Currency, the Federal Reserve, and the FDIC have issued joint guidance acknowledging that existing model risk management frameworks require adaptation for AI; but the adaptation has not yet produced a framework with the specificity that the scale of current deployment requires [12].
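One way to see the mismatch is to imagine the inventory record that SR 11-7's guidance contemplates and to ask which of its fields an LLM deployment can honestly populate. The schema below is a hypothetical simplification of my own, not the Federal Reserve's format.

```python
# A hypothetical, simplified model-inventory record of the kind SR 11-7's
# guidance contemplates [12]. This is not the Federal Reserve's format; it
# illustrates which fields strain when the "model" is a large language model.
from dataclasses import dataclass, field

@dataclass
class ModelInventoryRecord:
    model_id: str
    purpose: str                      # e.g. "KYC adverse-media screening"
    owner: str
    # Straightforward for a credit-scoring model; problematic for an LLM:
    input_variables: list[str]        # an LLM's "inputs" are open-ended text
    documented_methodology: str       # billions of parameters reduce to no rule set
    validation_test_suite: list[str]  # edge-case behaviour resists enumeration
    known_limitations: list[str] = field(default_factory=list)

record = ModelInventoryRecord(
    model_id="llm-kyc-screening-v3",            # hypothetical identifier
    purpose="KYC adverse-media screening",
    owner="Compliance Technology",
    input_variables=["free-text customer file"],  # not a fixed variable set
    documented_methodology="foundation model + retrieval; weights not interpretable",
    validation_test_suite=["benchmark set A", "red-team prompt set B"],  # necessarily partial
)
print(record.model_id)
```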
The European Union's AI Act, which began applying its general-purpose model provisions in August 2025 and will extend to high-risk systems, including those used in credit scoring and financial compliance decisions, in August 2026, offers a more structured regulatory response, though one that introduces its own complications for institutions operating across jurisdictions [13]. The high-risk classification under the AI Act imposes requirements for human oversight, technical documentation, and conformity assessment that are, in their substance, more appropriate to AI infrastructure than the assistant-layer provisions; but their implementation requires the development of audit methodologies that do not yet exist at scale, and the August 2026 deadline creates a compliance pressure that may generate documentation rather than genuine oversight.
What to Watch
The question that the current deployment trajectory poses most acutely is not whether major banks will continue to embed LLMs in core operations; the deployment is already at a scale that makes reversal implausible and the efficiency gains are sufficiently documented to ensure that institutions which have not yet deployed will face competitive pressure to do so. The question is whether the governance frameworks, supervisory capacities, and institutional accountabilities will develop at a pace that is commensurate with the infrastructure role the technology is assuming.
Three variables seem to me most worth monitoring. The first is whether the domain of work retained under human oversight continues to contract as agent capabilities develop. BNY Mellon's 20,000 agent-builders are, in effect, an in-house development organisation extending the reach of the AI infrastructure into processes that were previously manual. Each new production use case narrows the frontier at which human judgment is the primary mechanism of control; and while this is a rational response to demonstrated productivity gains, it is also a systematic reduction in the resilience of the institution's operations against the kinds of failures that AI systems produce, which tend to be different in character from the failures that human judgment produces.
The second variable is whether a significant AI-assisted compliance failure, of the kind that would establish legal and regulatory precedent for accountability in AI-mediated processes, occurs before or after the regulatory framework has been adequately developed to address it. The COBOL parallel is again instructive: the consequences of the 1960s and 1970s infrastructure decisions were not fully visible until the 1990s, by which point they had become structural rather than contingent. A compliance failure traceable to an AI model operating in the KYC or AML layer of a major institution would produce a regulatory response; the question is whether that response would be designed to address the governance challenge posed by AI infrastructure, or would instead be designed to address the specific failure in the specific institution, leaving the broader structural question unresolved.
The third variable is whether the concentration of foundation model provision among a small number of vendors, which the FSB has identified as the proximate mechanism through which model homogenisation produces systemic risk, is addressed through regulatory intervention or through the natural development of a more diverse vendor ecosystem. The current landscape, in which OpenAI, Anthropic, Google, and Meta account for the overwhelming majority of the foundation models deployed across major financial institutions, is the landscape that the FSB's herding-effect warning describes. Whether it changes, and whether the change is driven by competitive dynamics or supervisory pressure, is a question that bears watching with the same attention that the deployment itself has not yet received.
When the COBOL transition was complete, nobody outside the computing rooms of the major banks fully understood what had happened or what it would mean for the decades that followed. The people who wrote the code moved on; the people who inherited it maintained what they could understand; and the institutions that depended on it managed the dependency with varying degrees of sophistication and varying degrees of acknowledgment that it was a dependency at all. The LLM transition is occurring more quickly, and its outputs are even less legible to those who would audit them than COBOL's ever were. Whether the governance of this transition proves more adequate than the governance of the last one is the question that Dimon's casual remark about redeployment does not, and was not intended to, answer.
1. CNBC. "Jamie Dimon says AI is already reshaping JPMorgan Chase's workforce as bank plans 'huge redeployment'." CNBC. 24 February 2026. cnbc.com
2. Reuters Institute; Micro Focus. Estimates of active COBOL code in production. Various sources; the figure of 95 billion lines is widely cited in the industry literature and in testimony before the US Senate Banking Committee.
3. JPMorgan Chase. Annual Reports and Investor Day Presentations, 2017–2025. See also: Yahoo Finance. "JPMorgan's $18 Billion Tech Budget Draws Comparisons to Nvidia's AI Dominance." 2025. finance.yahoo.com
4. JPMorgan Chase. 2024 Investor Day Transcript. "Firm Overview." 20 May 2024. jpmorganchase.com
5. BNY; OpenAI. "BNY builds 'AI for everyone, everywhere' with OpenAI." OpenAI Case Studies. 2025. openai.com
6. Yahoo Finance. "Exclusive: How BNY's new AI tool Eliza is minting an army of disposable assistants." Yahoo Finance. 2025. finance.yahoo.com
7. BNY Mellon; Google Cloud. "BNY Mellon and Google Cloud Collaborate to Help Transform U.S. Treasury Market Settlement and Clearance Process." Press release. 2021. bny.com
8. CNBC. "Goldman Sachs launches AI assistant." CNBC. 21 January 2025. cnbc.com
9. Fortune. "Goldman Sachs rolls out an internal AI assistant firm-wide." Fortune. 24 June 2025. fortune.com
10. CNBC. "Goldman Sachs Taps Anthropic's Claude to Automate Accounting, Compliance Roles." CNBC. 6 February 2026. cnbc.com
11. Financial Stability Board. "Monitoring Adoption of Artificial Intelligence and Related Vulnerabilities in the Financial Sector." FSB Report. 10 October 2025. fsb.org
12. Board of Governors of the Federal Reserve System; OCC; FDIC. "Supervisory Guidance on Model Risk Management." SR 11-7. April 2011. See also joint agency statements on AI applicability, 2023–2025.
13. European Parliament and Council. Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act). Official Journal of the European Union. 12 July 2024. High-risk system provisions applicable from August 2026.