A new machine learning framework called MULTI-evolve dramatically speeds up the process of designing high-performance proteins, eliminating the need for countless trial-and-error experiments. This breakthrough promises to accelerate advancements in medicine, biofuels, and various industrial applications.
The Challenge of Protein Design
Proteins are central to many modern technologies—from life-saving drugs to more efficient enzymes in laundry detergents. Improving protein performance often requires making multiple, coordinated changes to their amino acid sequences. However, swapping one amino acid can affect how future swaps impact the protein’s function, creating a complex search problem that traditionally demands many iterative rounds of laboratory testing. As bioengineer Patrick Hsu notes, this process has historically been “guess and check.”
How MULTI-evolve Works
MULTI-evolve tackles this challenge by integrating machine learning with targeted laboratory experiments. The process involves three key steps:
- Predicting Single Mutations: The system uses existing data or machine learning models to forecast how individual amino acid changes affect protein function.
- Mapping Interactions: Researchers create and test proteins with paired mutations to determine how those changes interact. This generates crucial data on combined effects.
- Training the Model: The laboratory results feed into a machine learning model, which then predicts the performance of proteins with five or more mutations in a single round.
Real-World Performance
The team at the University of California, Berkeley, and the Arc Institute tested MULTI-evolve on three proteins, including an antibody for autoimmune diseases and a protein used in CRISPR gene editing. In all cases, the model identified combinations of mutations that significantly outperformed the original proteins in laboratory tests. This confirms that the system can reliably select effective swaps.
Implications and Future Applications
MULTI-evolve has the potential to revolutionize how proteins are engineered. Hsu highlights two promising applications: tracking protein movement within cells and developing more effective gene therapies for enzyme deficiencies. The tool is expected to change how scientists approach protein design, making the process faster, more precise, and ultimately, more productive.
“We’re excited about this work,” Hsu says. “I think there’s tremendous interest in how this actually changes the practice of science.”
This new machine learning framework represents a major step forward in biotechnology, offering a more efficient and accurate way to design proteins for a wide range of critical applications.
