A diffusion model for protein design

A team led by Baker Lab scientists Joseph Watson, David Juergens, Nate Bennett, Brian Trippe, and Jason Yim has created a powerful new way to design proteins by combining structure prediction networks and generative diffusion models. The team demonstrated extremely high computational success rates and tested hundreds of A.I.-generated proteins in the lab, finding that many may be useful as medications, vaccines, or even new nanomaterials. This research is available as a preprint on bioRxiv titled “Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models.”

The software tool DALL-E produces high-quality images that have never existed before using something called a diffusion model, which is a machine-learning algorithm that specializes in adding and removing noise. Diffusion models for image generation begin with grainy bits of static and gradually remove noise until a clear picture is formed. Additional pieces of software guide this de-noising process so that the new images end up matching what was asked for.
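
To make the idea of adding and removing noise concrete, here is a minimal toy sketch in Python. It is not code from DALL-E or RFdiffusion. It assumes a simple linear noise schedule, works on a short list of numbers instead of an image or a protein, and replaces the trained neural network with a hypothetical "oracle" that already knows the clean answer. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean signal with Gaussian static."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
        for x, n in zip(x0, noise)
    ]
    return xt, noise


def oracle_noise_estimate(xt, x0, alpha_bar):
    """Hypothetical stand-in for a trained network.

    A real model must guess the noise from xt alone; this toy cheats by
    using the clean signal x0 so the example stays self-contained.
    """
    scale = math.sqrt(1.0 - alpha_bar)
    return [(x - math.sqrt(alpha_bar) * c) / scale for x, c in zip(xt, x0)]


def denoise(xt, x0, alpha_bars):
    """Reverse process: deterministic (DDIM-style) steps from noisy to clean."""
    x = list(xt)
    for t in range(len(alpha_bars) - 1, -1, -1):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = oracle_noise_estimate(x, x0, ab_t)
        # Estimate the clean signal, then re-noise it to the previous, lower level.
        x0_hat = [
            (xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
            for xi, e in zip(x, eps)
        ]
        x = [
            math.sqrt(ab_prev) * c + math.sqrt(1.0 - ab_prev) * e
            for c, e in zip(x0_hat, eps)
        ]
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [1.0, -0.5, 0.25]
    schedule = make_alpha_bars(1000)
    noisy, _ = add_noise(clean, schedule[-1], rng)
    print("noisy:   ", noisy)
    print("denoised:", denoise(noisy, clean, schedule))
```

In a real diffusion model the oracle is replaced by a neural network trained to estimate the noise from the noisy input alone, and the guidance described above steers each denoising step toward what was asked for.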

We have developed a guided diffusion model for generating new proteins. With prior design methods, tens of thousands of molecules may have to be tested before finding a single one that performs as intended. Using the new design method, dubbed RFdiffusion, the team had to test as few as one per design challenge. RFdiffusion outperforms existing protein design methods across a broad range of problems, including topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding, and symmetric motif scaffolding for therapeutic and metal-binding protein design. Highlights include a picomolar binder generated through pure computation and a series of novel symmetric assemblies experimentally confirmed by electron microscopy.

“These works reveal just how powerful diffusion models can be for protein design,” said Watson. “It’s extremely exciting,” added Juergens, “and it’s really just the beginning.”

RFdiffusion can generate novel proteins that bind to target molecules.

Here RFdiffusion generates a novel protein that binds to the insulin receptor.