AI platform for drug discovery cuts screening time from years to seconds

China has unveiled an artificial intelligence platform for drug discovery that can screen a vast library of chemical compounds, cutting the initial drug screening phase from months or years down to tens of seconds. Developers said they expected the system to provide a novel method for identifying lead molecules to treat tumors, neurodegenerative conditions, rare diseases and emerging infectious diseases, and possibly to speed up drug research during public health crises. Called GalaxyVS, the platform is powered by China’s new generation of Tianhe supercomputers. Its daily throughput in predicting how two or more molecules interact is six orders of magnitude higher – or a million times faster – than the existing world record for supercomputing molecular docking. The breakthrough was announced by the National Supercomputing Center in Tianjin, which developed the platform in collaboration with a team from Tsinghua University’s Institute for AI Industry Research.

The Tsinghua team contributed DrugCLIP, its ultra-fast virtual screening method, which was described in the peer-reviewed journal Science in January. For the new platform, the supercomputing team optimized the DrugCLIP model to exploit the massively parallel processing capability of the new-generation Tianhe supercomputer. According to Li Peishun, a researcher at the National Supercomputing Center in Tianjin, “GalaxyVS is not simply an amplification of existing models”. Instead, it is “a complete platform that reconstructs a chemical space of nearly 100 billion elements, integrating AI models, supercomputing, high-performance retrieval and medicinal chemistry”, Li explained. “This platform effectively addresses challenges in traditional drug discovery, including a scarcity of active molecules, an insufficient screening space and the homogenization of candidate molecules.”

Innovative drug research and development typically takes more than 10 years to yield results and can require investments in the billions of U.S. dollars. A critical initial phase in drug development involves sorting through immense volumes of compounds to pinpoint active molecules capable of binding to specific biological targets. But traditional experimental screening is costly and time-consuming, and conventional molecular docking methods are limited in their efficiency and have a high rate of false positives. As the library of synthesizable compounds grows to a scale of hundreds of billions or even trillions, existing virtual screening technologies face severe constraints in algorithms, computing power, storage and engineering capabilities. In real-world tests, GalaxyVS completed a single retrieval operation across a 100-billion-molecule library in less than a minute. Its daily throughput reaches 16 trillion molecular dockings, the South China Morning Post reports.
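
To put the reported figures on a common scale, the short Python sketch below converts them into per-second rates. It is back-of-the-envelope arithmetic only: the constants come from the figures quoted above, while the function names are illustrative, and the calculation assumes the daily throughput is sustained evenly over 24 hours and that "less than a minute" is treated as a 60-second upper bound. Under those assumptions, 16 trillion dockings a day works out to roughly 185 million per second, and scanning 100 billion molecules within a minute implies at least about 1.7 billion molecules per second. The two figures describe different operations as reported, so they are not expected to match.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Figures as reported in the article.
DAILY_DOCKINGS = 16 * 10**12    # 16 trillion molecular dockings per day
LIBRARY_SIZE = 100 * 10**9      # 100-billion-molecule library
RETRIEVAL_SECONDS_MAX = 60      # "less than a minute", used as an upper bound


def sustained_rate_per_second(daily_total: int) -> float:
    """Average per-second rate, assuming the daily total is spread evenly."""
    return daily_total / SECONDS_PER_DAY


def minimum_scan_rate(library_size: int, max_seconds: int) -> float:
    """Lower bound on molecules scanned per second for one full-library retrieval."""
    return library_size / max_seconds


if __name__ == "__main__":
    docking_rate = sustained_rate_per_second(DAILY_DOCKINGS)
    scan_rate = minimum_scan_rate(LIBRARY_SIZE, RETRIEVAL_SECONDS_MAX)
    print(f"Average docking rate: {docking_rate:,.0f} per second")
    print(f"Minimum retrieval scan rate: {scan_rate:,.0f} molecules per second")
```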