The problem is as old as machine learning in financial services, and as persistently unsolved: the data that would most improve a fraud detection model is precisely the data that cannot, under the laws of most jurisdictions, be freely shared with the people who would use it to build that model. Customer transaction records are personal data. They are subject to the General Data Protection Regulation in Europe, to state-level privacy laws in the United States, to the Personal Data Protection Act in Singapore, and to an expanding array of analogous frameworks in markets across Asia and Latin America. Cross-border transfers of this data are, at best, administratively burdensome and, at worst, legally impermissible.
The paper published this week by Citigroup's AI research division and the Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory proposes a solution that its authors acknowledge is not new in principle but is, by their account, new in its demonstrated performance: synthetic transaction data, generated by a diffusion model trained on real transaction history and then stripped of any direct correspondence to actual customer records, can serve as a substitute for real data in training fraud detection systems. The key finding is the margin: 2.1 per cent below performance parity. That number, if it can be replicated across a range of deployment contexts, changes the economics of financial AI development in a material way.
The Architecture
The synthesis process described in the paper operates in two stages. In the first stage, a diffusion model learns the statistical structure of real Citigroup transaction data, including the joint distributions of transaction size, merchant category, time of day, geographic location, and dozens of other features that characterise legitimate and fraudulent activity differently. In the second stage, the model generates a synthetic dataset that reproduces those statistical relationships without retaining any specific transaction that could be linked to an identifiable individual.
2.1 per cent below performance parity: a gap small enough to dissolve the legal obstacle to cross-border AI collaboration in financial services.
The Regulatory Significance
The practical significance of the 2.1 per cent gap is not primarily technical. It is regulatory and commercial. A fraud detection model trained on synthetic data that performs within roughly 2 per cent of one trained on real data is, in the judgment of the paper's authors, good enough to be deployed in jurisdictions where real data cannot be transferred. For a global bank operating across thirty or more legal jurisdictions, each with distinct data residency requirements, the ability to share synthetic training data across borders without triggering regulatory obligations is not a marginal convenience. It is the difference between building a unified global model and maintaining a patchwork of locally trained systems that cannot learn from each other.
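The arithmetic behind the headline figure is worth making explicit. "Below performance parity" here is read as the relative shortfall of the synthetic-data model's evaluation metric against the real-data model's; the AUC values below are illustrative assumptions, not figures from the paper, which does not publish its raw metrics in this form.

```python
def parity_gap_pct(real_metric: float, synthetic_metric: float) -> float:
    """Relative shortfall of the synthetic-trained model, in per cent."""
    return 100.0 * (real_metric - synthetic_metric) / real_metric

# Illustrative numbers only: fraud-detection AUCs of 0.950 (model trained
# on real data) and 0.930 (model trained on synthetic data).
gap = parity_gap_pct(0.950, 0.930)
print(round(gap, 1))  # prints 2.1 -- a gap on the order the paper reports
```

On this reading, a 2.1 per cent gap on a 0.95 baseline corresponds to giving up about two hundredths of a point of AUC, which is the trade the authors argue is acceptable in exchange for cross-border shareability.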
Citigroup has not yet announced plans to deploy the synthetic data pipeline in production, and the paper is explicit that the 2.1 per cent gap represents an average across the test conditions examined, with variance across transaction types and geographies that the authors say warrants further investigation. The methodology has not yet been independently replicated, which is a standard caveat for work of this kind but a more consequential one when the stakes include the fraud detection capabilities protecting tens of millions of customers across dozens of markets.
What the paper represents, with appropriate caution, is a credible technical path around one of the most persistent obstacles to collaborative AI development in regulated financial services. Whether the path is walkable in production conditions, at the scale that global banks require, remains to be demonstrated. The 2.1 per cent, for now, is an aspiration as much as a result.