As artificial intelligence (AI) prediction produces an explosion of three-dimensional protein structure data, interpretation is failing to keep pace, creating a growing bottleneck. Against this backdrop, a Korea-based research team has developed multiple alignment software that can rapidly and accurately align and compare hundreds of thousands of protein structures at once.
The Ministry of Science and ICT said on the 30th that a research team led by Martin Steinegger, a professor in the School of Biological Sciences at Seoul National University, has developed FoldMason, an ultrafast, high-accuracy multiple alignment tool for large-scale protein structure data. The results were published the same day in the journal Science.
Proteins carry out the functions of life through three-dimensional structures that form when chains of amino acids fold in intricate ways. Diverse functions related to disease and aging, such as enzyme activity, immune responses, and cell signaling, depend on protein structure. Understanding how protein structures have evolved is therefore important for revealing the causes of disease and identifying new therapeutic strategies. AI-based protein structure prediction has advanced quickly in recent years, and vast amounts of protein structure data have accumulated as a result.
However, some note that while the data have multiplied, comparison and interpretation have lagged behind. In particular, protein comparison has long faced a "twilight zone" in which amino acid sequence similarity is so low that evolutionary relationships are difficult to determine with conventional methods.
To overcome this limitation, the team developed FoldMason, software that analyzes three-dimensional protein structures and amino acid sequences together while boosting speed. It aligns many proteins at once and helps infer evolutionary processes by finding core structures that are conserved across the proteins being compared.
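As a rough illustration of what finding a conserved core in a multiple alignment can mean, the Python sketch below marks the alignment columns in which no sequence has a gap. It is a toy example on made-up aligned strings, not FoldMason's algorithm. The function name `conserved_core_columns`, the `max_gap_fraction` parameter, and the sample data are all assumptions introduced here for illustration.

```python
def conserved_core_columns(alignment, max_gap_fraction=0.0):
    """Return indices of alignment columns whose gap fraction is at most
    max_gap_fraction.

    `alignment` is a list of equal-length strings that use '-' for gaps.
    This is an illustrative helper, not part of any real tool's API.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    core = []
    for col in range(length):
        # Count how many sequences have a gap in this column.
        gaps = sum(1 for row in alignment if row[col] == "-")
        if gaps / len(alignment) <= max_gap_fraction:
            core.append(col)
    return core


# Made-up toy alignment (hypothetical data, not real proteins).
toy_alignment = [
    "MK-LVAG--T",
    "MKALV-GQ-T",
    "MR-LVAG--S",
]

if __name__ == "__main__":
    print(conserved_core_columns(toy_alignment))
```

Raising `max_gap_fraction` relaxes the definition so that columns missing from a minority of sequences still count as part of the core.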
FoldMason ran about 100 to 1,000 times faster than existing approaches while maintaining high accuracy, allowing it to compare and align hundreds of thousands of protein structures simultaneously. The team said it can be used to analyze almost all proteins, including those in the challenging twilight zone.
Steinegger told ChosunBiz, "The most difficult part of the research was achieving accuracy and scalability at the same time," adding, "Proteins can move flexibly, undergo insertions and deletions, or vary in length. Aligning such targets precisely yet quickly was a challenge on a completely different level."
In fact, using FoldMason, the team found clues that the basic design of key proteins that fight viruses has remained largely unchanged for billions of years, even between organisms that are evolutionarily distant, such as humans and bacteria. This could offer important hints for understanding the processes through which the immune system in our bodies formed.
Steinegger said, "By using FoldMason, we can analyze the evolutionary relationships of vast numbers of proteins and, by linking structural differences to disease mechanisms, help identify potential drug targets," adding, "To make it useful for biologists, we will strengthen functions that interpret structural variants and connect the results to evolutionary analyses and disease-related variant information."
References
Science (2026), DOI: https://doi.org/10.1126/science.ads6733