From Structure to Cell Permeability: Advancing Peptide Drug Discovery with AI

Cyclic peptides are molecules composed of amino acids linked in a ring. This configuration enhances their structural stability and enables them to target intracellular protein–protein interactions, a class of targets often considered “undruggable” with conventional small-molecule drugs. These interactions play a critical role in cancer and other diseases but are difficult to target because of their size and structural complexity.

However, a key challenge lies in their ability to cross cell membranes efficiently. To do so, cyclic peptides must exhibit a “chameleon-like” behaviour — dynamically changing their conformation to shield polar atoms when moving into membrane environments. Capturing this behaviour requires large-scale molecular dynamics (MD) simulations and advanced AI models that go beyond simple 2D molecular representations.

To address this, researchers simulated how these molecules behave in both water and hexane — an organic solvent commonly used to mimic the cell membrane. By observing how the peptides adapt their conformation across these environments, scientists are able to assess their ability to cross cell membranes.
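The comparison across solvents can be illustrated with a minimal sketch. It assumes that each simulation has already been reduced to a per-frame exposed polar surface area; that descriptor, the function name `chameleonicity` and the toy numbers are illustrative assumptions, not the descriptors or values used in the study.

```python
import numpy as np

def chameleonicity(psa_water, psa_hexane):
    """Return mean exposed polar surface area per solvent and their difference.

    A positive difference means the peptide exposes less polar surface in
    hexane than in water, i.e. it shields polar atoms in the membrane-like
    solvent.
    """
    mean_water = float(np.mean(psa_water))
    mean_hexane = float(np.mean(psa_hexane))
    return mean_water, mean_hexane, mean_water - mean_hexane

# Toy per-frame values in square angstroms (illustrative only,
# not taken from CycPeptMPDB-4D).
psa_water = np.array([210.0, 205.0, 215.0, 208.0])
psa_hexane = np.array([150.0, 155.0, 148.0, 151.0])

mean_w, mean_h, shielded = chameleonicity(psa_water, psa_hexane)
print(f"water: {mean_w:.1f} A^2, hexane: {mean_h:.1f} A^2, shielded: {shielded:.1f} A^2")
```

A larger positive difference would suggest stronger shielding in the membrane-like solvent; how such a score relates to measured permeability is a modelling question this sketch does not answer.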

Using NSCC Singapore’s ASPIRE 2A+ supercomputer, researchers from the Bioinformatics Institute (BII) of the Agency for Science, Technology and Research (A*STAR) created CycPeptMPDB-4D, a large-scale dataset detailing the movements and behaviours of over 5,000 cyclic peptides in water and membrane-like environments. The team also developed the Local Update Function Network (LUFnet), an AI model that enhances simulation efficiency by enabling larger time integration steps while preserving physical accuracy.

Together, these advances establish a computational framework for predicting and optimising cyclic peptide permeability, supporting more targeted experimental design and improving the efficiency of drug discovery.

The Research

Building on the potential of cyclic peptides to target “undruggable” intracellular protein–protein interactions, the research focuses on overcoming a critical limitation: poor membrane permeability.

Traditional experimental methods to assess permeability are both time-consuming and resource-intensive, while existing computational approaches often lack sufficient predictive accuracy. This creates a bottleneck in identifying viable drug candidates.

By integrating large-scale MD simulations with machine learning models, researchers are able to better understand how cyclic peptides behave. This allows them to identify promising candidates earlier and prioritise them for experimental validation, improving the efficiency of the drug discovery process.

This work was enabled by NSCC Singapore’s ASPIRE 2A+ supercomputer, which provided the computational scale required for data generation, simulation and AI model development.

The research was carried out through close collaboration across industry, academia and high-performance computing (HPC). WuXi AppTec, a Contract Research, Development and Manufacturing Organisation (CRDMO), synthesised the cyclic peptides and measured their membrane permeability using the Parallel Artificial Membrane Permeability Assay (PAMPA), providing experimental data to validate computational predictions. Chosun University (Republic of Korea) contributed specialised expertise in computational physics, supporting the development of the LUFnet model.

The Technology

At the core of this project is the integration of high-performance computing with artificial intelligence (AI) to model molecular behaviour at scale. Using NSCC Singapore’s ASPIRE 2A+ supercomputer, researchers modelled the dynamic behaviours of over 5,000 cyclic peptides across different environments.

Simulations were conducted in both water and hexane to capture how these molecules adapt their structure across environments, allowing researchers to evaluate their likelihood of crossing cell membranes.

To efficiently process the large volume of data, the team developed LUFnet, a scalable AI model designed to enhance molecular dynamics (MD) simulations. By enabling larger time integration steps while preserving key physical constraints, LUFnet allows molecular motion to be simulated over longer timescales without compromising accuracy.
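To see why the integration step matters, the sketch below integrates a one-dimensional harmonic oscillator with velocity Verlet, a standard MD integrator, and reports how far the total energy strays from its starting value. This is not LUFnet and does not reproduce its method; the function name, the toy system and the step sizes are illustrative assumptions chosen to show that a conventional integrator loses accuracy, and eventually stability, as the step grows.

```python
def velocity_verlet_energy_error(dt, n_steps, k=1.0, m=1.0):
    """Integrate a 1D harmonic oscillator with velocity Verlet.

    Returns the largest relative deviation of the total energy from its
    initial value over the run.
    """
    x, v = 1.0, 0.0                       # start at rest, displaced by 1
    e0 = 0.5 * m * v**2 + 0.5 * k * x**2  # initial total energy
    a = -k * x / m
    max_err = 0.0
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt**2  # position update
        a_new = -k * x / m                # force at the new position
        v = v + 0.5 * (a + a_new) * dt    # velocity update with averaged force
        a = a_new
        e = 0.5 * m * v**2 + 0.5 * k * x**2
        max_err = max(max_err, abs(e - e0) / e0)
    return max_err

# Step sizes are in reduced units where the oscillator frequency is 1.
small_step_error = velocity_verlet_energy_error(dt=0.05, n_steps=2000)
large_step_error = velocity_verlet_energy_error(dt=2.5, n_steps=50)
print(f"dt=0.05: max relative energy error {small_step_error:.2e}")
print(f"dt=2.5:  max relative energy error {large_step_error:.2e}")
```

With the small step the energy stays close to its initial value, while the large step exceeds the integrator's stability limit and the energy grows without bound. Approaches that allow larger steps therefore have to preserve the underlying physics by some other means.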

Through the close integration of HPC and AI, the project delivered several major innovations: the LUFnet architecture, a first-of-its-kind dataset called CycPeptMPDB-4D, and the first rigorous benchmark evaluating 13 different AI models for permeability prediction.
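A benchmark of this kind can be pictured as a loop that scores every model on the same held-out data. The sketch below is a minimal illustration under stated assumptions: the two models (a mean baseline and a least-squares linear fit), the random splits, the mean-absolute-error metric and the synthetic data are all hypothetical stand-ins, not the 13 models, splits, metrics or data used in the study.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return float(np.mean(np.abs(y_true - y_pred)))

def fit_mean_baseline(X_train, y_train):
    """Hypothetical baseline: always predict the training mean."""
    mean = float(np.mean(y_train))
    return lambda X: np.full(len(X), mean)

def fit_linear(X_train, y_train):
    """Hypothetical model: ordinary least squares with an intercept."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return lambda X: np.column_stack([X, np.ones(len(X))]) @ coef

def benchmark(models, X, y, n_splits=5, test_fraction=0.2, seed=0):
    """Score every model on the same random train/test splits."""
    rng = np.random.default_rng(seed)
    n_test = int(len(y) * test_fraction)
    scores = {name: [] for name in models}
    for _ in range(n_splits):
        order = rng.permutation(len(y))
        test, train = order[:n_test], order[n_test:]
        for name, fit in models.items():
            predict = fit(X[train], y[train])
            scores[name].append(mae(y[test], predict(X[test])))
    # Mean and standard deviation of the error across splits.
    return {name: (float(np.mean(s)), float(np.std(s))) for name, s in scores.items()}

# Synthetic descriptors and targets, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.8, -0.5, 0.2]) + rng.normal(scale=0.1, size=200)

results = benchmark({"mean_baseline": fit_mean_baseline, "linear": fit_linear}, X, y)
for name, (mean_mae, std_mae) in results.items():
    print(f"{name}: MAE {mean_mae:.3f} +/- {std_mae:.3f}")
```

Reusing identical splits for every model keeps the comparison fair, and reporting the spread across splits shows how sensitive each score is to the choice of held-out data.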

Benefits of HPC

NSCC Singapore’s ASPIRE 2A+ supercomputer was critical in enabling this research, providing the scale and the performance required to integrate large-scale simulations with AI model development.

- Large-scale MD simulations: Over 10,000 independent GROMACS simulations were executed on ASPIRE 2A+ to generate the CycPeptMPDB-4D dataset
- Deep learning model training: ASPIRE 2A+’s GPU nodes enabled the intensive training and benchmarking of 13 AI models across diverse data splits and hyperparameter configurations
- Accelerated simulation development: Parallel computing capabilities of ASPIRE 2A+ made it feasible to generate high-fidelity reference trajectories and train the LUFnet transformer across multiple thermodynamic states and system sizes. Simulations ran five times faster than with conventional research methods, reducing half-year workloads to about one month.
- Streamlined research workflow: Simulation, data generation and AI model training were conducted within a unified environment, eliminating the need for cross-system data transfer.

These capabilities allowed researchers to explore thousands of molecular candidates, capture complex molecular behaviour with greater fidelity, and conduct analyses that would not have been feasible using conventional computing resources.

The Impact

The research introduces a more targeted and computationally driven approach to discovering cell-permeable cyclic peptides — addressing a key challenge in developing therapies for diseases such as cancer that involve intracellular protein–protein interactions.

By combining AI models, scalable simulation methods and large-scale datasets, the project establishes an integrated framework that extends beyond cyclic peptides. The same approach can be used to predict other critical molecular properties, such as oral bioavailability and metabolic stability, which are key factors in the successful development of new therapeutics.

The outputs of this research, including the CycPeptMPDB-4D dataset, the LUFnet simulation framework and the AI benchmarking approaches, are openly available. This supports broader adoption across the global research and pharmaceutical community and enables more efficient, data-driven drug discovery. Based on this dataset, the team is also developing its own AI model.

Beyond the research community, these advances contribute to ongoing efforts to develop new treatment strategies for diseases that are currently difficult to target, while improving the efficiency of early-stage drug discovery.

Looking ahead, the availability of these datasets, benchmarks and tools is expected to support near-term adoption across the research and pharmaceutical landscape. Over time, this foundation can be extended through generative AI models capable of designing cyclic peptides with optimised permeability and target specificity.

In the longer term, the project contributes towards a more integrated computational pipeline for cyclic peptide drug discovery — from initial target identification through to the selection of viable clinical candidates.

Supported by a strong collaborative network across industry and academia, the team is extending these AI-driven tools into new therapeutic areas, establishing a more resource-efficient pathway for drug development.

“Cyclic peptides change conformation as they cross membranes, so capturing that structural flexibility is key to predicting permeability. This involves modelling thousands of molecular dynamics trajectories and training AI models on large-scale datasets — tasks that go far beyond the capabilities of desktop computing. NSCC Singapore’s ASPIRE 2A+ supercomputing resources were instrumental in enabling this work, providing the computational power needed to perform these simulations efficiently and at scale.”

Liu Wei, Senior Scientist II, A*STAR Bioinformatics Institute (BII)
