A team of researchers at the University of Washington has introduced LigandMPNN, a deep learning-based protein sequence design method aimed at enzyme design and at small-molecule sensor and binder design. Existing physics-based approaches, such as Rosetta, and deep learning-based models, such as ProteinMPNN, cannot explicitly model non-protein atoms and molecules, which makes it difficult to accurately design protein sequences that interact with small molecules, nucleotides, and metals.
Explicit treatment of this non-protein context is crucial for designing enzymes, protein-DNA/RNA interactions, and protein-small molecule and protein-metal binders. LigandMPNN builds on the ProteinMPNN architecture but explicitly incorporates the full non-protein atomic context. It adds protein-ligand graphs and uses neural networks to model interactions and encode ligand atom geometries. With this modification, LigandMPNN can generate sequences and side chain conformations tailored to a specific non-protein context.
LigandMPNN takes a graph-based approach: protein residues are treated as nodes, and nearest-neighbor edges are defined by Cα-Cα distances. The model adds protein-ligand graphs to capture interactions, with protein residues and ligand atoms as nodes and edges that represent the geometric relationships between them. A ligand graph over the ligand atoms enriches the information that is passed to the protein along the ligand-protein edges.
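To make the graph construction concrete, the following is a minimal sketch of how nearest-neighbor edges could be derived from coordinates. It is not the LigandMPNN implementation: the function names `knn_edges` and `protein_ligand_edges`, the neighbor counts `k` and `m`, and the toy coordinates are all assumptions chosen for illustration, and the real model encodes far richer geometric features on each edge.

```python
import math


def knn_edges(coords, k):
    """Return directed edges (i, j) where j is one of the k nearest neighbors of i."""
    edges = []
    for i, ci in enumerate(coords):
        # Distance from residue i to every other residue, paired with its index.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def protein_ligand_edges(ca_coords, ligand_coords, m):
    """Connect each residue to its m closest ligand atoms (hypothetical helper)."""
    edges = []
    for i, ci in enumerate(ca_coords):
        dists = sorted(
            (math.dist(ci, atom), j) for j, atom in enumerate(ligand_coords)
        )
        edges.extend((i, j) for _, j in dists[:m])
    return edges


# Toy coordinates in angstroms, invented for this example.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
ligand = [(4.0, 2.0, 0.0), (10.0, 2.0, 0.0)]

residue_edges = knn_edges(ca, k=2)                 # residue-residue graph
pl_edges = protein_ligand_edges(ca, ligand, m=1)   # residue-ligand graph
print(residue_edges)
print(pl_edges)
```

A brute-force distance sort like this is quadratic in the number of residues, which is fine for a toy example; a practical implementation would typically use batched tensor operations instead.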
In the reported experiments, LigandMPNN and its side chain packing module outperformed Rosetta and ProteinMPNN, achieving higher sequence recovery for residues that interact with small molecules, nucleotides, and metals, with a reported improvement of 20-30%, which demonstrates its effectiveness in detailed structural design. LigandMPNN also outperforms existing methods in speed and efficiency: it is about 250 times faster than Rosetta.
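Sequence recovery, the metric referred to above, is commonly computed as the fraction of positions where the designed residue matches the native one, optionally restricted to residues near the ligand. The sketch below illustrates that idea under stated assumptions: the function name `sequence_recovery`, the sequences, and the `near_ligand` mask are hypothetical, and how the evaluation decides which residues count as ligand-interacting is not specified here.

```python
def sequence_recovery(native, designed, mask=None):
    """Fraction of (optionally masked) positions where designed matches native."""
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if mask is not None and len(mask) != len(native):
        raise ValueError("mask must have the same length as the sequences")

    # Use every position unless a mask selects a subset (e.g. ligand-contacting residues).
    positions = [i for i in range(len(native)) if mask is None or mask[i]]
    if not positions:
        raise ValueError("no positions selected")

    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)


# Invented sequences and mask, for illustration only.
native = "MKTAYIAK"
designed = "MKSAYLAK"
near_ligand = [False, False, True, True, False, True, True, False]

overall = sequence_recovery(native, designed)                 # 6 of 8 match
interface = sequence_recovery(native, designed, near_ligand)  # 2 of 4 match
print(overall, interface)
```

Reporting the masked value separately matters because residues far from the ligand are often easy to recover, so an overall average can hide poor performance at the binding site.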
In conclusion, LigandMPNN fills a critical gap in existing protein sequence design methods by explicitly including non-protein atoms and molecules. Its graph-based approach yields a notable improvement in performance, with higher sequence recovery and more accurate side chain packing around small molecules, nucleotides, and metals. LigandMPNN also performed well in designing small-molecule-binding and DNA-binding proteins with high affinity and specificity, which should be of great help in protein engineering.
Check out the paper. All credit for this research goes to the researchers of this project.
Pragati Jhunjhunwala is a Consulting Intern at MarktechPost. She is currently pursuing a B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a technology enthusiast with a keen interest in data science software and applications, and she follows advancements across different fields of AI and ML.