Bloomberg AI Researchers Advance Text-to-SQL Accuracy with Multi-Agent PExA Framework
November 25, 2025
A team of AI researchers at Bloomberg has developed PExA, an agentic framework that achieved 70.2% execution accuracy on Spider 2.0 (Snow), one of the most rigorous public benchmarks for complex text-to-SQL generation. The result shares one of the top positions on the benchmark's leaderboard.
When it was originally submitted in late September 2025, PExA established a new performance record while maintaining wall-clock latency comparable to prior systems, defining a stronger Pareto frontier between speed and accuracy. The Spider 2.0 (Snow) benchmark, widely regarded as the most challenging test of executable SQL synthesis, evaluates real-world database generalization across hundreds of intricate schemas, thousands of columns, and highly ambiguous multi-hop natural language questions. All leaderboard metrics were independently computed by the official Spider 2.0 (Snow) evaluation team.
“Achieving a top position on Spider 2.0 Snow established PExA as a state-of-the-art system for end-to-end, executable text-to-SQL,” explains Srivas Prasad, head of Bloomberg’s AI Engineering group’s Code Generation team. “PExA’s breakthrough performance, which balances speed and accuracy, is enabled by a software-testing-inspired multi-agent architecture and an efficient parallel exploration strategy that diverges from traditional monolithic LLM prompting.”
PExA integrates three coordinated components: (1) a Planner that transforms the user query and generates semantically meaningful test cases, (2) a Test Case Generator that executes these cases to probe the database and gather targeted evidence through a structured multi-path search, and (3) a SQL Proposer that synthesizes and verifies the final SQL program using the accumulated test-case results. This architecture delivers broader semantic coverage, higher reliability, and fast parallel search – highlighting a promising new direction for scalable, trustworthy natural-language interfaces to structured data.
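To make the three-stage flow concrete, here is a minimal Python sketch of a plan, probe, and propose loop. It is not PExA's implementation, whose details have not yet been published. Every name in it (`plan_probes`, `gather_evidence`, `propose_sql`, the `orders` table, the hard-coded probes and candidates) is a hypothetical stand-in, the language-model calls are replaced by fixed stubs, and SQLite takes the place of a real data warehouse.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical probe used by the stub planner and read back by the proposer.
REGION_PROBE = "SELECT DISTINCT region FROM orders"


def build_demo_db(path):
    """Create a tiny SQLite database to stand in for a real warehouse."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
        INSERT INTO orders (region, amount) VALUES
            ('EMEA', 100.0), ('EMEA', 50.0), ('APAC', 70.0);
        """
    )
    conn.commit()
    conn.close()


def plan_probes(question):
    """Stage 1 (stub): return probe queries; a real planner would use an LLM."""
    return [REGION_PROBE, "SELECT COUNT(*) FROM orders"]


def run_probe(db_path, sql):
    """Run one query on its own connection so probes can run in parallel."""
    conn = sqlite3.connect(db_path)
    try:
        return sql, conn.execute(sql).fetchall()
    finally:
        conn.close()


def gather_evidence(db_path, probes):
    """Stage 2: execute all probes concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(lambda sql: run_probe(db_path, sql), probes))


def propose_sql(db_path, evidence):
    """Stage 3 (stub): pick the first candidate consistent with the evidence."""
    known_regions = {row[0] for row in evidence[REGION_PROBE]}
    candidates = [
        ("Europe", "SELECT SUM(amount) FROM orders WHERE region = 'Europe'"),
        ("EMEA", "SELECT SUM(amount) FROM orders WHERE region = 'EMEA'"),
    ]
    for literal, sql in candidates:
        if literal not in known_regions:
            continue  # the probe showed this value never occurs in the data
        _, rows = run_probe(db_path, sql)
        if rows and rows[0][0] is not None:
            return sql, rows  # verified: executes and returns a real value
    return None, []


def demo():
    """Run the three stages end to end on the demo database."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        build_demo_db(db_path)
        probes = plan_probes("What is the total order amount in EMEA?")
        evidence = gather_evidence(db_path, probes)
        return propose_sql(db_path, evidence)
```

Calling `demo()` returns the chosen SQL string together with its result rows. In this toy setup the probe results rule out the `'Europe'` candidate before it is ever executed, which illustrates how evidence gathered from the database can narrow the choice of a final query.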
The work was carried out by Bloomberg Data Science Ph.D. Fellow Tanmay Parekh of the UCLA Samueli School of Engineering’s Computer Science department and the UCLA NLP Group, who worked on it during his internship this past summer, together with Ella Hofmann-Coyle, Shuyi Wang, Sachith Sri Ram Kothur, Srivas Prasad and Yunmo Chen of our AI Engineering – Code Generation team, which operates across the firm’s Toronto and New York offices.
The full technical details of this achievement will be shared in a forthcoming preprint paper.