
Zoom AI: Redefining what’s possible beyond the hardest challenges

Zoom is closing 2025 with major AI breakthroughs. Its federated architecture research now outperforms leading frontier models on complex reasoning and search benchmarks, proving that orchestrating multiple systems can achieve deeper accuracy and reliability across industries.

Updated on December 29, 2025

Published on December 29, 2025

Xuedong Huang
Chief Technology Officer

Xuedong Huang is Zoom’s Chief Technology Officer (CTO). Prior to Zoom, he was at Microsoft, where he served as Azure AI CTO and Technical Fellow. He has had a long career in AI: he started Microsoft’s speech technology group in 1993 and led Microsoft’s AI teams to achieve several of the industry’s first human parity milestones in speech recognition, machine translation, natural language understanding, and computer vision. He is an IEEE and ACM Fellow and an elected member of the National Academy of Engineering and the American Academy of Arts and Sciences.

Xuedong received his Ph.D. in EE from the University of Edinburgh in 1989 (sponsored by the British ORS and Edinburgh University Scholarship), his MS in CS from Tsinghua University in 1984, and BS in CS from Hunan University in 1982.

As 2025 draws to a close, I’m proud to reflect on the progress our team has made in advancing Zoom’s AI capabilities. What began earlier with our work on the Humanity's Last Exam (HLE) benchmark, a rigorous evaluation designed to test reasoning and expert-level understanding in AI, has extended into broader performance gains across multiple evaluations. These results reinforce that our proprietary agentic federated AI can deliver significant improvements beyond the limits of any single frontier model.

DeepSearchQA: Surpassing previous state-of-the-art

We applied Zoom's federated AI approach in our research environment to Google’s new DeepSearchQA benchmark: an evaluation of AI agents on complex, multi-step information-seeking tasks across 17 fields. Released on Dec 11, 2025 along with the new Gemini Deep Research, this benchmark challenges AI systems beyond single-answer retrieval or broad-spectrum factuality.
 
Instead, DeepSearchQA features a dataset of challenging, hand-crafted tasks designed to evaluate an agent’s ability to execute complex search plans to generate exhaustive answer lists. Zoom’s federated AI achieved 76.3% accuracy in testing, surpassing the previous state‑of‑the‑art of 66.1%.
 
This improvement revealed a key insight: what matters is how AI is built and applied as a system, not only which model sits underneath it. In internal tests, we orchestrated OpenAI GPT‑5 and Gemini 3 Pro Preview through the "explore–verify–federate" workflow of our proprietary agentic federated framework, with the aim of delivering deeper reasoning coverage and more reliable factual synthesis than a single model can achieve.
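To make the "explore–verify–federate" idea more concrete, here is a minimal Python sketch of one way such a workflow could be structured. It is an illustration only and does not describe Zoom's proprietary framework: the function names, the voting rule, and the toy stand-in "models" (`model_a`, `model_b`, `check_a`, `check_b`) are all assumptions made for this example, and a real system would call actual model APIs instead of the hard-coded stubs.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases: an explorer proposes answers, a verifier checks one.
Explorer = Callable[[str], List[str]]
Verifier = Callable[[str, str], bool]


def explore(question: str, explorers: Dict[str, Explorer]) -> Dict[str, List[str]]:
    """Explore: ask every model independently for candidate answers."""
    return {name: fn(question) for name, fn in explorers.items()}


def verify(question: str, candidates: Dict[str, List[str]],
           verifiers: Dict[str, Verifier]) -> Dict[str, int]:
    """Verify: count how many models other than the proposer accept each candidate."""
    votes: Dict[str, int] = {}
    for proposer, answers in candidates.items():
        for answer in answers:
            accepted = sum(
                1 for name, check in verifiers.items()
                if name != proposer and check(question, answer)
            )
            votes[answer] = max(votes.get(answer, 0), accepted)
    return votes


def federate(votes: Dict[str, int], min_votes: int = 1) -> List[str]:
    """Federate: keep only candidates with enough cross-model support."""
    return sorted(a for a, v in votes.items() if v >= min_votes)


# Toy stand-ins for two models (assumptions for this sketch, not real APIs).
KNOWN_A = {"Paris", "Lyon"}
KNOWN_B = {"Paris", "Lyon", "Nice"}


def model_a(question: str) -> List[str]:
    return ["Paris", "Lyon", "Atlantis"]  # includes one unsupported answer


def model_b(question: str) -> List[str]:
    return ["Paris", "Nice"]


def check_a(question: str, answer: str) -> bool:
    return answer in KNOWN_A


def check_b(question: str, answer: str) -> bool:
    return answer in KNOWN_B


def answer(question: str) -> List[str]:
    candidates = explore(question, {"a": model_a, "b": model_b})
    votes = verify(question, candidates, {"a": check_a, "b": check_b})
    return federate(votes)


if __name__ == "__main__":
    print(answer("Name large cities in France"))
```

In this toy setup, an answer survives only if a model other than the one that proposed it accepts it, so the unsupported "Atlantis" is dropped. The threshold `min_votes` is one hypothetical knob for trading coverage against reliability.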

 

| Model/System | DeepSearchQA Accuracy |
| --- | --- |
| Zoom Federated AI (GPT-5 + Gemini 3 Pro Preview) | 76.3% |
| Google Gemini Deep Research Agent | 66.1% |
| OpenAI GPT‑5 Pro | 65.2% |
| OpenAI GPT-5 | 59.4% |
| Google Gemini 3 Pro Preview | 56.6% |
| Anthropic Claude Opus 4.5 (thinking) | 24.0% |

Third-party benchmarking results last updated December 10, 2025
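As a quick check on the margins implied by the DeepSearchQA figures above, the short Python snippet below computes the gap in percentage points between the federated result and each other system. The dictionary keys are shortened labels chosen for this example; the numbers are copied from the reported results, and nothing here is part of the benchmark's own tooling.

```python
# DeepSearchQA accuracy (%) as reported above; keys are shortened labels.
scores = {
    "Zoom Federated AI": 76.3,
    "Gemini Deep Research Agent": 66.1,
    "GPT-5 Pro": 65.2,
    "GPT-5": 59.4,
    "Gemini 3 Pro Preview": 56.6,
    "Claude Opus 4.5 (thinking)": 24.0,
}


def margin_points(system: str, baseline: str) -> float:
    """Absolute gap between two systems, in percentage points."""
    return round(scores[system] - scores[baseline], 1)


def relative_gain_percent(system: str, baseline: str) -> float:
    """Gap expressed relative to the baseline's accuracy, in percent."""
    gap = scores[system] - scores[baseline]
    return round(100 * gap / scores[baseline], 1)


if __name__ == "__main__":
    for name in scores:
        if name != "Zoom Federated AI":
            print(name, margin_points("Zoom Federated AI", name))
```

Against the previous state of the art (66.1%), the gap is 10.2 percentage points, or roughly a 15.4% relative gain.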
 
This leap underscores that Zoom's federation is not just an ordinary ensemble: it is a scalable reasoning system capable of adapting across difficulty levels and domains. The principles that drove our Humanity's Last Exam breakthrough also prove effective for agentic enterprise applications where reliability and reasoning depth matter more than latency.
 
Our innovation lies not in building another monolithic model, but in connecting the best models into a system that emphasizes improved reliability and orchestration over raw scale.

HLE: Scaling with new frontier models

As new frontier AI models emerge, the architecture continues to scale, delivering smarter and more human‑centered intelligence across tasks, industries, and applications. With the release of OpenAI's new GPT-5.2, Zoom's federated AI research has improved HLE full-set accuracy from 48.1% to 53.0%, again outperforming all individual frontier models.
 
| Model/System | HLE Full Set Accuracy |
| --- | --- |
| Zoom Federated AI (GPT-5.2 + Gemini 3 Pro Preview) | 53.0% |
| OpenAI GPT‑5.2 Pro | |
| Zoom Federated AI (GPT-5 + Gemini 3 Pro Preview) | 48.1% |
| Google Gemini Deep Research Agent | |
| Google Gemini 3 Pro Preview | |
| OpenAI GPT‑5.2 | |
| OpenAI GPT-5 Pro | |
| Anthropic Claude Opus 4.5 | |
| OpenAI GPT-5 | |

Implications for Agentic AI

These groundbreaking results demonstrate that Zoom's federated AI approach represents a paradigm shift in orchestrating the world's most advanced models. This approach transcends traditional single-model limitations, creating a robust framework with profound implications for agentic AI and the evolution of Zoom AI Companion and Zoom Virtual Agent.
 
Zoom's proprietary agentic federation has laid the groundwork for a new generation of AI agents capable of tackling humanity's most complex challenges. While these benchmark achievements mark a significant milestone, our continued focus on optimizing latency will help this federated architecture deliver transformative value across diverse real-world applications.


Note on Benchmarking Results: These metrics reflect Zoom's ongoing research with frontier AI models. Referenced models may still be in testing for integration in Zoom's federated AI deployment for customers.
