Alibaba's open-source AI agent rivals OpenAI's Deep Research

Alibaba Group Holding has unveiled a “leading open-source deep research” artificial intelligence agent that it says matches the performance of OpenAI’s flagship Deep Research tool while being more efficient. The agent has been integrated into Alibaba’s maps app, Amap, and its AI-powered legal research tool, Tongyi FaRui, according to Alibaba’s AI search development team, Tongyi Lab. Amap users can use the agent’s web retrieval capabilities to plan multi-day trips, while Tongyi FaRui has been updated with the agent’s research functions, improving its ability to retrieve case law with verified citations, according to Alibaba. The agent is the latest addition to Alibaba’s rapidly expanding AI initiatives. In recent weeks, the company has launched its first trillion-parameter base model, Qwen3-Max-Preview, along with Qwen3-Next-80B-A3B – a smaller yet more powerful model, according to benchmarking firm Artificial Analysis.

Deep research agents are AI tools designed to perform complex, multi-step web retrieval tasks. One of the first such agents, OpenAI’s Deep Research, was launched and integrated into ChatGPT in February. Other major U.S. tech companies, including Google DeepMind, have also introduced similar tools. Alibaba said its deep research agent showed “incredible efficiency” compared with U.S. proprietary tools, as it had only 30 billion parameters – significantly fewer than the estimated parameter counts of the models driving U.S. deep research agents. Parameters are the variables that encode an AI model’s “intelligence” and are adjusted during the training process. Generally, a model with more parameters is more powerful, but it also requires more computational resources to train and operate.

A graphic released by Alibaba showed that its new agent achieved industry-leading scores across various advanced benchmarks, including Humanity’s Last Exam – a challenging set of academic questions known to test the limits of existing AI systems. Alibaba said its agent scored 32.9% on this benchmark, surpassing the 26.6% scored by OpenAI’s Deep Research.

Adina Yakefu, a machine learning community manager at open-source platform Hugging Face, described Alibaba’s self-reported benchmark scores as “amazing”. After it was open-sourced, the agent quickly gained traction on the platform, where developers worldwide can download and build upon it. The strength and efficiency of Alibaba’s agent stemmed from its innovative data curation pipeline, which produced “very high-quality” synthetic training data, said Tan Sijun, an AI researcher at the Sky Computing Lab of the University of California, Berkeley. Synthetic training data is generated by AI systems rather than sourced from the real world. As real-world data becomes increasingly scarce, AI companies are turning to synthetic data to train new systems, the South China Morning Post reports.