2018 |
Lu, Peilong ; Min, Duyoung ; DiMaio, Frank ; Wei, Kathy Y; Vahey, Michael D; Boyken, Scott E; Chen, Zibo ; Fallas, Jorge A; Ueda, George ; Sheffler, William ; Mulligan, Vikram Khipple ; Xu, Wenqing ; Bowie, James U; Baker, David Accurate computational design of multipass transmembrane proteins Journal Article Science, 359 (6379), pp. 1042–1046, 2018, ISSN: 0036-8075. @article{Lu1042, title = {Accurate computational design of multipass transmembrane proteins}, author = {Lu, Peilong and Min, Duyoung and DiMaio, Frank and Wei, Kathy Y. and Vahey, Michael D. and Boyken, Scott E. and Chen, Zibo and Fallas, Jorge A. and Ueda, George and Sheffler, William and Mulligan, Vikram Khipple and Xu, Wenqing and Bowie, James U. and Baker, David}, url = {http://science.sciencemag.org/content/359/6379/1042 https://www.bakerlab.org/wp-content/uploads/2018/03/Lu_Science_2018.pdf}, doi = {10.1126/science.aaq1739}, issn = {0036-8075}, year = {2018}, date = {2018-03-02}, journal = {Science}, volume = {359}, number = {6379}, pages = {1042--1046}, abstract = {In recent years, soluble protein design has achieved successes such as artificial enzymes and large protein cages. Membrane proteins present a considerable design challenge, but here too there have been advances, including the design of a zinc-transporting tetramer. Lu et al. report the design of stable transmembrane monomers, homodimers, trimers, and tetramers with up to eight membrane-spanning regions in an oligomer. The designed proteins adopted the target oligomerization state and localized to the predicted cellular membranes, and crystal structures of the designed dimer and tetramer reflected the design models. Science, this issue p. 1042. The computational design of transmembrane proteins with more than one membrane-spanning region remains a major challenge. 
We report the design of transmembrane monomers, homodimers, trimers, and tetramers with 76 to 215 residue subunits containing two to four membrane-spanning regions and up to 860 total residues that adopt the target oligomerization state in detergent solution. The designed proteins localize to the plasma membrane in bacteria and in mammalian cells, and magnetic tweezer unfolding experiments in the membrane indicate that they are very stable. Crystal structures of the designed dimer and tetramer{\textemdash}a rocket-shaped structure with a wide cytoplasmic base that funnels into eight transmembrane helices{\textemdash}are very close to the design models. Our results pave the way for the design of multispan membrane proteins with new functions.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Silva, Daniel-Adriano ; Stewart, Lance ; Lam, Kwok-Ho ; Jin, Rongsheng ; Baker, David Structures and disulfide cross‐linking of de novo designed therapeutic mini‐proteins Journal Article FEBS Journal, 285 (10), pp. 1783-1785, 2018. @article{Silva2018, title = {Structures and disulfide cross‐linking of de novo designed therapeutic mini‐proteins}, author = {Silva, Daniel-Adriano and Stewart, Lance and Lam, Kwok-Ho and Jin, Rongsheng and Baker, David}, url = {https://febs.onlinelibrary.wiley.com/doi/abs/10.1111/febs.14394}, doi = {10.1111/febs.14394}, year = {2018}, date = {2018-02-01}, journal = {FEBS Journal}, volume = {285}, number = {10}, pages = {1783-1785}, abstract = {Recent advances in computational protein design now enable the massively parallel de novo design and experimental characterization of small hyperstable binding proteins with potential therapeutic activity. By providing experimental feedback on tens of thousands of designed proteins, the design-build-test-learn pipeline provides a unique opportunity to systematically improve our understanding of protein folding and binding. Here, we review the structures of mini-protein binders in complex with Influenza hemagglutinin and Bot toxin, and illustrate in the case of disulfide bond placement how analysis of the large datasets of computational models and experimental data can be used to identify determinants of folding and binding.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2017 |
Hosseinzadeh, Parisa* ; Bhardwaj, Gaurav* ; Mulligan, Vikram Khipple* ; Shortridge, Matthew D; Craven, Timothy W; Pardo-Avila, Fátima ; Rettie, Stephen A; Kim, David E; Silva, Daniel-Adriano ; Ibrahim, Yehia M; Webb, Ian K; Cort, John R; Adkins, Joshua N; Varani, Gabriele ; Baker, David Comprehensive computational design of ordered peptide macrocycles Journal Article Science, 358 (6369), pp. 1461-1466, 2017, ISSN: 0036-8075. @article{Hosseinzadeh2017, title = {Comprehensive computational design of ordered peptide macrocycles}, author = {Hosseinzadeh, Parisa* and Bhardwaj, Gaurav* and Mulligan, Vikram Khipple* and Shortridge, Matthew D. and Craven, Timothy W. and Pardo-Avila, F{\'a}tima and Rettie, Stephen A. and Kim, David E. and Silva, Daniel-Adriano and Ibrahim, Yehia M. and Webb, Ian K. and Cort, John R. and Adkins, Joshua N. and Varani, Gabriele and Baker, David}, url = {http://science.sciencemag.org/content/358/6369/1461 https://www.bakerlab.org/wp-content/uploads/2017/12/Science_Hosseinzadeh_et_al_2017.pdf}, doi = {10.1126/science.aap7577}, issn = {0036-8075}, year = {2017}, date = {2017-12-15}, journal = {Science}, volume = {358}, number = {6369}, pages = {1461-1466}, abstract = {Mixed-chirality peptide macrocycles such as cyclosporine are among the most potent therapeutics identified to date, but there is currently no way to systematically search the structural space spanned by such compounds. Natural proteins do not provide a useful guide: Peptide macrocycles lack regular secondary structures and hydrophobic cores, and can contain local structures not accessible with L-amino acids. Here, we enumerate the stable structures that can be adopted by macrocyclic peptides composed of L- and D-amino acids by near-exhaustive backbone sampling followed by sequence design and energy landscape calculations. 
We identify more than 200 designs predicted to fold into single stable structures, many times more than the number of currently available unbound peptide macrocycle structures. Nuclear magnetic resonance structures of 9 of 12 designed 7- to 10-residue macrocycles, and three 11- to 14-residue bicyclic designs, are close to the computational models. Our results provide a nearly complete coverage of the rich space of structures possible for short peptide macrocycles and vastly increase the available starting scaffolds for both rational drug design and library selection methods.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Butterfield, Gabriel L *; Lajoie, Marc J *; Gustafson, Heather H; Sellers, Drew L; Nattermann, Una ; Ellis, Daniel ; Bale, Jacob B; Ke, Sharon ; Lenz, Garreck H; Yehdego, Angelica ; Ravichandran, Rashmi ; Pun, Suzie H; King, Neil P; Baker, David Evolution of a designed protein assembly encapsulating its own RNA genome Journal Article Nature, 2017, ISSN: 1476-4687. @article{Butterfield2017, title = {Evolution of a designed protein assembly encapsulating its own RNA genome}, author = {Butterfield, Gabriel L.* and Lajoie, Marc J.* and Gustafson, Heather H. and Sellers, Drew L. and Nattermann, Una and Ellis, Daniel and Bale, Jacob B. and Ke, Sharon and Lenz, Garreck H. and Yehdego, Angelica and Ravichandran, Rashmi and Pun, Suzie H. and King, Neil P. and Baker, David}, url = {http://dx.doi.org/10.1038/nature25157 https://www.bakerlab.org/wp-content/uploads/2017/12/Nature_Butterfield_et_al_2017.pdf}, doi = {10.1038/nature25157}, issn = {1476-4687}, year = {2017}, date = {2017-12-13}, journal = {Nature}, abstract = {The challenges of evolution in a complex biochemical environment, coupling genotype to phenotype and protecting the genetic material, are solved elegantly in biological systems by the encapsulation of nucleic acids. In the simplest examples, viruses use capsids to surround their genomes. Although these naturally occurring systems have been modified to change their tropism and to display proteins or peptides, billions of years of evolution have favoured efficiency at the expense of modularity, making viral capsids difficult to engineer. Synthetic systems composed of non-viral proteins could provide a ‘blank slate’ to evolve desired properties for drug delivery and other biomedical applications, while avoiding the safety risks and engineering challenges associated with viruses. 
Here we create synthetic nucleocapsids, which are computationally designed icosahedral protein assemblies with positively charged inner surfaces that can package their own full-length mRNA genomes. We explore the ability of these nucleocapsids to evolve virus-like properties by generating diversified populations using Escherichia coli as an expression host. Several generations of evolution resulted in markedly improved genome packaging (more than 133-fold), stability in blood (from less than 3.7% to 71% of packaged RNA protected after 6 hours of treatment), and in vivo circulation time (from less than 5 minutes to approximately 4.5 hours). The resulting synthetic nucleocapsids package one full length RNA genome for every 11 icosahedral assemblies, similar to the best recombinant adeno-associated virus vectors. Our results show that there are simple evolutionary paths through which protein assemblies can acquire virus-like genome packaging and protection. Considerable effort has been directed at ‘top-down’ modification of viruses to be safe and effective for drug delivery and vaccine applications; the ability to design synthetic nanomaterials computationally and to optimize them through evolution now enables a complementary ‘bottom-up’ approach with considerable advantages in programmability and control.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Dou, Jiayi; Doyle, Lindsey; Greisen, Per; Schena, Alberto; Park, Hahnbeom; Johnsson, Kai; Stoddard, Barry; Baker, David Sampling and energy evaluation challenges in ligand binding protein design Journal Article Protein Science, 26 , pp. 2426-2437, 2017, ISSN: 1469-896. @article{1000b, title = {Sampling and energy evaluation challenges in ligand binding protein design}, author = {Jiayi Dou and Lindsey Doyle and Per Greisen and Alberto Schena and Hahnbeom Park and Kai Johnsson and Barry Stoddard and David Baker}, url = {http://onlinelibrary.wiley.com/doi/10.1002/pro.3317/abstract https://www.bakerlab.org/wp-content/uploads/2017/12/Dou_et_al-2017-Protein_Science.pdf}, doi = {10.1002/pro.3317}, issn = {1469-896}, year = {2017}, date = {2017-10-30}, journal = {Protein Science}, volume = {26}, pages = {2426-2437}, abstract = {The steroid hormone 17α-hydroxylprogesterone (17-OHP) is a biomarker for congenital adrenal hyperplasia and hence there is considerable interest in development of sensors for this compound. We used computational protein design to generate protein models with binding sites for 17-OHP containing an extended, nonpolar, shape-complementary binding pocket for the four-ring core of the compound, and hydrogen bonding residues at the base of the pocket to interact with carbonyl and hydroxyl groups at the more polar end of the ligand. Eight of 16 designed proteins experimentally tested bind 17-OHP with micromolar affinity. A co-crystal structure of one of the designs revealed that 17-OHP is rotated 180° around a pseudo-two-fold axis in the compound and displays multiple binding modes within the pocket, while still interacting with all of the designed residues in the engineered site. Subsequent rounds of mutagenesis and binding selection improved the ligand affinity to nanomolar range, while appearing to constrain the ligand to a single bound conformation that maintains the same “flipped” orientation relative to the original design. 
We trace the discrepancy in the design calculations to two sources: first, a failure to model subtle backbone changes which alter the distribution of sidechain rotameric states and second, an underestimation of the energetic cost of desolvating the carbonyl and hydroxyl groups of the ligand. The difference between design model and crystal structure thus arises from both sampling limitations and energy function inaccuracies that are exacerbated by the near two-fold symmetry of the molecule.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Chevalier*, Aaron; Silva*, Daniel-Adriano; Rocklin*, Gabriel J; Hicks, Derrick R; Vergara, Renan; Murapa, Patience; Bernard, Steffen M; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T; Koday, Merika T; Jenkins, Cody M; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M; Fernández-Velasco, Alejandro D; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A; Fuller, Deborah H; Baker, David Massively parallel de novo protein design for targeted therapeutics Journal Article Nature, 550 (7674), pp. 74-79, 2017, ISSN: 0028-0836. @article{Chevalier2017, title = {Massively parallel de novo protein design for targeted therapeutics}, author = {Aaron Chevalier* and Daniel-Adriano Silva* and Gabriel J. Rocklin* and Derrick R. Hicks and Renan Vergara and Patience Murapa and Steffen M. Bernard and Lu Zhang and Kwok-Ho Lam and Guorui Yao and Christopher D. Bahl and Shin-Ichiro Miyashita and Inna Goreshnik and James T. Fuller and Merika T. Koday and Cody M. Jenkins and Tom Colvin and Lauren Carter and Alan Bohn and Cassie M. Bryan and D. Alejandro Fernández-Velasco and Lance Stewart and Min Dong and Xuhui Huang and Rongsheng Jin and Ian A. Wilson and Deborah H. Fuller and David Baker }, url = {https://www.nature.com/nature/journal/v550/n7674/full/nature23912.html https://www.bakerlab.org/wp-content/uploads/2017/12/Nature_Chevalier_etal_2017.pdf}, doi = {10.1038/nature23912}, issn = {0028-0836}, year = {2017}, date = {2017-10-05}, journal = {Nature}, volume = {550}, number = {7674}, pages = {74-79}, abstract = {De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. 
We designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Ovchinnikov, Sergey; Park, Hahnbeom; Kim, David E; DiMaio, Frank; Baker, David Protein structure prediction using Rosetta in CASP12 Journal Article Proteins, 2017. @article{Ovchinnikov2017, title = {Protein structure prediction using Rosetta in CASP12}, author = {Sergey Ovchinnikov and Hahnbeom Park and David E. Kim and Frank DiMaio and David Baker}, url = {https://onlinelibrary.wiley.com/doi/epdf/10.1002/prot.25390 https://www.bakerlab.org/wp-content/uploads/2019/10/Ovchinnikov_et_al-2018-Proteins__Structure_Function_and_Bioinformatics.pdf}, doi = {10.1002/prot.25390}, year = {2017}, date = {2017-09-22}, journal = {Proteins}, abstract = {We describe several notable aspects of our structure predictions using Rosetta in CASP12 in the free modeling (FM) and refinement (TR) categories. First, we had previously generated (and published) models for most large protein families lacking experimentally determined structures using Rosetta guided by co-evolution based contact predictions, and for several targets these models proved better starting points for comparative modeling than any known crystal structure—our model database thus starts to fulfill one of the goals of the original protein structure initiative. Second, while our “human” group simply submitted ROBETTA models for most targets, for six targets expert intervention improved predictions considerably; the largest improvement was for T0886 where we correctly parsed two discontinuous domains guided by predicted contact maps to accurately identify a structural homolog of the same fold. Third, Rosetta all atom refinement followed by MD simulations led to consistent but small improvements when starting models were close to the native structure, and larger but less consistent improvements when starting models were further away.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bick, Matthew J*; Greisen, Per J*; Morey, Kevin J; Antunes, Mauricio S; La, David ; Sankaran, Banumathi ; Reymond, Luc ; Johnsson, Kai ; Medford, June I; Baker, David Computational design of environmental sensors for the potent opioid fentanyl Journal Article eLife, 6 , pp. e28909, 2017, ISSN: 2050-084X. @article{Bick2017, title = {Computational design of environmental sensors for the potent opioid fentanyl}, author = {Bick, Matthew J* and Greisen, Per J* and Morey, Kevin J and Antunes, Mauricio S and La, David and Sankaran, Banumathi and Reymond, Luc and Johnsson, Kai and Medford, June I and Baker, David}, editor = {Cravatt, Benjamin F}, url = {https://elifesciences.org/articles/28909 https://www.bakerlab.org/wp-content/uploads/2018/06/elife-28909-v2-1.pdf}, doi = {10.7554/eLife.28909}, issn = {2050-084X}, year = {2017}, date = {2017-09-19}, journal = {eLife}, volume = {6}, pages = {e28909}, abstract = {We describe the computational design of proteins that bind the potent analgesic fentanyl. Our approach employs a fast docking algorithm to find shape complementary ligand placement in protein scaffolds, followed by design of the surrounding residues to optimize binding affinity. Co-crystal structures of the highest affinity binder reveal a highly preorganized binding site, and an overall architecture and ligand placement in close agreement with the design model. We use the designs to generate plant sensors for fentanyl by coupling ligand binding to design stability. The method should be generally useful for detecting toxic hydrophobic compounds in the environment.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Anishchenko, I; Ovchinnikov, S; Kamisetty, H; Baker, D Origins of coevolution between residues distant in protein 3D structures Journal Article Proceedings of the National Academy of Sciences, 114 (34), pp. 9122-9127, 2017. @article{1000, title = {Origins of coevolution between residues distant in protein 3D structures}, author = {I Anishchenko and S Ovchinnikov and H Kamisetty and D Baker}, url = {http://www.pnas.org/content/114/34/9122 https://www.bakerlab.org/wp-content/uploads/2018/08/9122.full1_.pdf}, doi = {10.1073/pnas.1702664114}, year = {2017}, date = {2017-08-22}, journal = {Proceedings of the National Academy of Sciences}, volume = {114}, number = {34}, pages = {9122-9127}, abstract = {Residue pairs that directly coevolve in protein families are generally close in protein 3D structures. Here we study the exceptions to this general trend—directly coevolving residue pairs that are distant in protein structures—to determine the origins of evolutionary pressure on spatially distant residues and to understand the sources of error in contact-based structure prediction. Over a set of 4,000 protein families, we find that 25% of directly coevolving residue pairs are separated by more than 5 Å in protein structures and 3% by more than 15 Å. The majority (91%) of directly coevolving residue pairs in the 5–15 Å range are found to be in contact in at least one homologous structure—these exceptions arise from structural variation in the family in the region containing the residues. Thirty-five percent of the exceptions greater than 15 Å are at homo-oligomeric interfaces, 19% arise from family structural variation, and 27% are in repeat proteins likely reflecting alignment errors. Of the remaining long-range exceptions (<1% of the total number of coupled pairs), many can be attributed to close interactions in an oligomeric state. 
Overall, the results suggest that directly coevolving residue pairs not in repeat proteins are spatially proximal in at least one biologically relevant protein conformation within the family; we find little evidence for direct coupling between residues at spatially separated allosteric and functional sites or for increased direct coupling between residue pairs on putative allosteric pathways connecting them.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lin, Yu-Ru; Koga, Nobuyasu; Vorobiev, Sergey M; Baker, David Cyclic oligomer design with de novo αβ-proteins Journal Article Protein Science, 2017. @article{Lin2017, title = {Cyclic oligomer design with de novo αβ-proteins}, author = {Yu-Ru Lin and Nobuyasu Koga and Sergey M. Vorobiev and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2018/06/Lin_et_al-2017-Protein_Science.pdf http://onlinelibrary.wiley.com/doi/10.1002/pro.3270/full}, doi = {10.1002/pro.3270}, year = {2017}, date = {2017-08-12}, journal = {Protein Science}, abstract = {We have previously shown that monomeric globular αβ- proteins can be designed de novo with considerable control over topology, size and shape. In this paper, we investigate the design of cyclic homo-oligomers from these starting points. We experimented with both keeping the original monomer backbones fixed during the cyclic docking and design process, and allowing the backbone of the monomer to conform to that of adjacent subunits in the homo-oligomer. The latter flexible backbone protocol generated designs with shape complementarity approaching that of native homo-oligomers, but experimental characterization showed that the fixed backbone designs were more stable and less aggregation prone. C2 homo-oligomers with β- strand backbone interactions were designed using both fixed and flexible backbone protocols. Designed C2 oligomers were structurally confirmed through x-ray crystallography and small-angle X-ray scattering (SAXS). In contrast, C3-C5 designed homo-oligomers with primarily nonpolar residues at interfaces all formed a range of oligomeric states. Taken together, our results suggest that for homo-oligomers formed from globular building blocks, improved structural specificity will be better achieved using monomers with increased shape complementarity and with more polar interfaces. This article is protected by copyright. 
All rights reserved.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Rocklin, GJ; Chidyausiku, TM; Goreshnik, I; Ford, A; Houliston, S; Lemak, A; Carter, L; Ravichandran, R; Mulligan, VK; Chevalier, A; Arrowsmith, CH; Baker, D Global analysis of protein folding using massively parallel design, synthesis, and testing Journal Article Science, 357 , pp. 168-175, 2017. @article{433b, title = {Global analysis of protein folding using massively parallel design, synthesis, and testing}, author = {GJ Rocklin and TM Chidyausiku and I Goreshnik and A Ford and S Houliston and A Lemak and L Carter and R Ravichandran and VK Mulligan and A Chevalier and CH Arrowsmith and D Baker}, url = {http://science.sciencemag.org/content/357/6347/168.full?ijkey=/u00BDqfiTTGY&keytype=ref&siteid=sci https://www.bakerlab.org/wp-content/uploads/2017/12/Science_Rocklin_etal_2017.pdf}, doi = {10.1126/science.aan0693}, year = {2017}, date = {2017-07-14}, journal = {Science}, volume = {357}, pages = {168-175}, abstract = {Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Although these forces are “encoded” in the thousands of known protein structures, “decoding” them is challenging because of the complexity of natural proteins that have evolved for function, not stability. We combined computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for more than 15,000 de novo designed miniproteins, 1000 natural proteins, 10,000 point mutants, and 30,000 negative control sequences. This analysis identified more than 2500 stable designed proteins in four basic folds—a number sufficient to enable us to systematically examine how sequence determines folding and stability in uncharted protein space. 
Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment and has the potential to transform computational protein design into a data-driven science.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Strauch, Eva-Maria ; Bernard, Steffen M; La, David ; Bohn, Alan J; Lee, Peter S; Anderson, Caitlin E; Nieusma, Travis ; Holstein, Carly A; Garcia, Natalie K; Hooper, Kathryn A; Ravichandran, Rashmi ; Nelson, Jorgen W; Sheffler, William ; Bloom, Jesse D; Lee, Kelly K; Ward, Andrew B; Yager, Paul ; Fuller, Deborah H; Wilson, Ian A; Baker, David Computational design of trimeric influenza-neutralizing proteins targeting the hemagglutinin receptor binding site Journal Article Nature Biotechnology, [Epub ahead of print] , 2017, ISSN: 1546-1696. @article{Strauch2017, title = {Computational design of trimeric influenza-neutralizing proteins targeting the hemagglutinin receptor binding site}, author = {Strauch, Eva-Maria and Bernard, Steffen M and La, David and Bohn, Alan J and Lee, Peter S and Anderson, Caitlin E and Nieusma, Travis and Holstein, Carly A and Garcia, Natalie K and Hooper, Kathryn A and Ravichandran, Rashmi and Nelson, Jorgen W and Sheffler, William and Bloom, Jesse D and Lee, Kelly K and Ward, Andrew B and Yager, Paul and Fuller, Deborah H and Wilson, Ian A and Baker, David}, url = {https://www.bakerlab.org/wp-content/uploads/2017/06/Strauch_NatureBiotech_2017.pdf https://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3907.html}, doi = {10.1038/nbt.3907}, issn = {1546-1696}, year = {2017}, date = {2017-06-12}, journal = {Nature Biotechnology}, volume = {[Epub ahead of print]}, abstract = {Many viral surface glycoproteins and cell surface receptors are homo-oligomers, and thus can potentially be targeted by geometrically matched homo-oligomers that engage all subunits simultaneously to attain high avidity and/or lock subunits together. The adaptive immune system cannot generally employ this strategy since the individual antibody binding sites are not arranged with appropriate geometry to simultaneously engage multiple sites in a single target homo-oligomer. 
We describe a general strategy for the computational design of homo-oligomeric protein assemblies with binding functionality precisely matched to homo-oligomeric target sites. In the first step, a small protein is designed that binds a single site on the target. In the second step, the designed protein is assembled into a homo-oligomer such that the designed binding sites are aligned with the target sites. We use this approach to design high-avidity trimeric proteins that bind influenza A hemagglutinin (HA) at its conserved receptor binding site. The designed trimers capture and detect HA in a paper-based diagnostic format, neutralize influenza in cell culture, and completely protect mice when given as a single dose 24 h before or after challenge with influenza.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Janda, CY; Dang, LT; You, C; Chang, J; de Lau, W; Zhong, ZA; Yan, KS; Marecic, O; Siepe, D; Li, X; Moody, JD; Williams, BO; Clevers, H; Piehler, J; Baker, D; Kuo, CJ; Garcia, KC Surrogate Wnt agonists that phenocopy canonical Wnt and β-catenin signalling. Journal Article Nature, 545 (7653), pp. 234-237, 2017. @article{1001, title = {Surrogate Wnt agonists that phenocopy canonical Wnt and β-catenin signalling.}, author = {Janda, CY and Dang, LT and You, C and Chang, J and de Lau, W and Zhong, ZA and Yan, KS and Marecic, O and Siepe, D and Li, X and Moody, JD and Williams, BO and Clevers, H and Piehler, J and Baker, D and Kuo, CJ and Garcia, KC}, url = {https://www.bakerlab.org/wp-content/uploads/2018/06/nature22306.pdf http://www.nature.com/nature/journal/v545/n7653/abs/nature22306.html?foxtrotcallback=true}, doi = {10.1038/nature22306}, year = {2017}, date = {2017-05-11}, journal = {Nature}, volume = {545}, number = {7653}, pages = {234-237}, abstract = {Wnt proteins modulate cell proliferation and differentiation and the self-renewal of stem cells by inducing β-catenin-dependent signalling through the Wnt receptor frizzled (FZD) and the co-receptors LRP5 and LRP6 to regulate cell fate decisions and the growth and repair of several tissues. The 19 mammalian Wnt proteins are cross-reactive with the 10 FZD receptors, and this has complicated the attribution of distinct biological functions to specific FZD and Wnt subtype interactions. Furthermore, Wnt proteins are modified post-translationally by palmitoylation, which is essential for their secretion, function and interaction with FZD receptors. As a result of their acylation, Wnt proteins are very hydrophobic and require detergents for purification, which presents major obstacles to the preparation and application of recombinant Wnt proteins.
This hydrophobicity has hindered the determination of the molecular mechanisms of Wnt signalling activation and the functional importance of FZD subtypes, and the use of Wnt proteins as therapeutic agents. Here we develop surrogate Wnt agonists, water-soluble FZD–LRP5/LRP6 heterodimerizers, with FZD5/FZD8-specific and broadly FZD-reactive binding domains. Similar to WNT3A, these Wnt agonists elicit a characteristic β-catenin signalling response in a FZD-selective fashion, enhance the osteogenic lineage commitment of primary mouse and human mesenchymal stem cells, and support the growth of a broad range of primary human organoid cultures. In addition, the surrogates can be systemically expressed and exhibit Wnt activity in vivo in the mouse liver, regulating metabolic liver zonation and promoting hepatocyte proliferation, resulting in hepatomegaly. These surrogates demonstrate that canonical Wnt signalling can be activated by bi-specific ligands that induce receptor heterodimerization. Furthermore, these easily produced, non-lipidated Wnt surrogate agonists facilitate functional studies of Wnt signalling and the exploration of Wnt agonists for translational applications in regenerative medicine.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Marcos, Enrique* ; Basanta, Benjamin* ; Chidyausiku, Tamuka M; Tang, Yuefeng ; Oberdorfer, Gustav ; Liu, Gaohua ; Swapna, G V T; Guan, Rongjin ; Silva, Daniel-Adriano ; Dou, Jiayi ; Pereira, Jose Henrique ; Xiao, Rong ; Sankaran, Banumathi ; Zwart, Peter H; Montelione, Gaetano T; Baker, David Principles for designing proteins with cavities formed by curved β sheets Journal Article Science, 355 (6321), pp. 201–206, 2017, ISSN: 0036-8075. @article{Marcos2017, title = {Principles for designing proteins with cavities formed by curved β sheets}, author = {Marcos, Enrique* and Basanta, Benjamin* and Chidyausiku, Tamuka M. and Tang, Yuefeng and Oberdorfer, Gustav and Liu, Gaohua and Swapna, G. V. T. and Guan, Rongjin and Silva, Daniel-Adriano and Dou, Jiayi and Pereira, Jose Henrique and Xiao, Rong and Sankaran, Banumathi and Zwart, Peter H. and Montelione, Gaetano T. and Baker, David}, url = {https://www.bakerlab.org/wp-content/uploads/2017/01/Marcos_Science_2017.pdf http://science.sciencemag.org/content/355/6321/201}, doi = {10.1126/science.aah7389}, issn = {0036-8075}, year = {2017}, date = {2017-01-01}, journal = {Science}, volume = {355}, number = {6321}, pages = {201--206}, publisher = {American Association for the Advancement of Science}, abstract = {In de novo protein design, creating custom-tailored binding sites is a particular challenge because these sites often involve nonideal backbone structures. For example, curved β sheets are a common ligand binding motif. Marcos et al. investigated the principles that drive β-sheet curvature by studying the geometry of β sheets in natural proteins and folding simulations. In a step toward custom design of enzyme catalysts, they used these principles to control β-sheet geometry and design proteins with differently shaped cavities. Science, this issue p. 201. Active sites and ligand-binding cavities in native proteins are often formed by curved β sheets, and the ability to control β-sheet curvature would allow design of binding proteins with cavities customized to specific ligands. Toward this end, we investigated the mechanisms controlling β-sheet curvature by studying the geometry of β sheets in naturally occurring protein structures and folding simulations. The principles emerging from this analysis were used to design, de novo, a series of proteins with curved β sheets topped with α helices. Nuclear magnetic resonance and crystal structures of the designs closely match the computational models, showing that β-sheet curvature can be controlled with atomic-level accuracy. Our approach enables the design of proteins with cavities and provides a route to custom design ligand-binding and catalytic sites.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A; Kim, David E; Kamisetty, Hetunandan; Kyrpides, Nikos C; Baker, David Protein structure determination using metagenome sequence data Journal Article Science, 355 (6322), pp. 294–298, 2017, ISSN: 0036-8075. @article{Ovchinnikov294, title = {Protein structure determination using metagenome sequence data}, author = {Sergey Ovchinnikov and Hahnbeom Park and Neha Varghese and Po-Ssu Huang and Georgios A. Pavlopoulos and David E. Kim and Hetunandan Kamisetty and Nikos C. Kyrpides and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2017/01/ovchinnikov_science_2017.pdf http://science.sciencemag.org/content/355/6322/294}, doi = {10.1126/science.aah4043}, issn = {0036-8075}, year = {2017}, date = {2017-01-01}, journal = {Science}, volume = {355}, number = {6322}, pages = {294--298}, publisher = {American Association for the Advancement of Science}, abstract = {Fewer than a third of the 14,849 known protein families have at least one member with an experimentally determined structure. This leaves more than 5000 protein families with no structural information. Protein modeling using residue-residue contacts inferred from evolutionary data has been successful in modeling unknown structures, but it requires large numbers of aligned sequences. Ovchinnikov et al. augmented such sequence alignments with metagenome sequence data (see the Perspective by S{\"o}ding). They determined the number of sequences required to allow modeling, developed criteria for model quality, and, where possible, improved modeling by matching predicted contacts to known structures. Their method predicted quality structural models for 614 protein families, of which about 140 represent newly discovered protein folds. Science, this issue p. 294; see also p. 248. Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families and that metagenome sequence data more than triple the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the Protein Data Bank. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2016 |
Mills, Jeremy H; Sheffler, William; Ener, Maraia E; Almhjell, Patrick J; Oberdorfer, Gustav; Pereira, José Henrique; Parmeggiani, Fabio; Sankaran, Banumathi; Zwart, Peter H; Baker, David Computational design of a homotrimeric metalloprotein with a trisbipyridyl core Journal Article PNAS, 113 (52), pp. 15012-15017, 2016. @article{1300, title = {Computational design of a homotrimeric metalloprotein with a trisbipyridyl core}, author = {Jeremy H. Mills and William Sheffler and Maraia E. Ener and Patrick J. Almhjell and Gustav Oberdorfer and José Henrique Pereira and Fabio Parmeggiani and Banumathi Sankaran and Peter H. Zwart and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2018/06/15012.full_.pdf http://www.pnas.org/content/113/52/15012.abstract }, doi = {10.1073/pnas.1600188113}, year = {2016}, date = {2016-12-08}, journal = {PNAS}, volume = {113}, number = {52}, pages = {15012-15017}, abstract = {Metal-chelating heteroaryl small molecules have found widespread use as building blocks for coordination-driven, self-assembling nanostructures. The metal-chelating noncanonical amino acid (2,2′-bipyridin-5yl)alanine (Bpy-ala) could, in principle, be used to nucleate specific metalloprotein assemblies if introduced into proteins such that one assembly had much lower free energy than all alternatives. Here we describe the use of the Rosetta computational methodology to design a self-assembling homotrimeric protein with [Fe(Bpy-ala)3]2+ complexes at the interface between monomers. X-ray crystallographic analysis of the homotrimer showed that the design process had near-atomic-level accuracy: The all-atom rmsd between the design model and crystal structure for the residues at the protein interface is ∼1.4 Å. 
These results demonstrate that computational protein design together with genetically encoded noncanonical amino acids can be used to drive formation of precisely specified metal-mediated protein assemblies that could find use in a wide range of photophysical applications.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Fallas, JA; Ueda, G; Sheffler, W; Nguyen, V; McNamara, DE; Sankaran, B; Pereira, JH; Parmeggiani, F; Brunette, TJ; Cascio, D; Yeates, TR; Zwart, P; Baker, D Computational design of self-assembling cyclic protein homo-oligomers Journal Article Nature Chemistry, 9 , pp. 353–360, 2016. @article{Fallas2016, title = {Computational design of self-assembling cyclic protein homo-oligomers}, author = {Fallas, JA and Ueda, G and Sheffler, W and Nguyen, V and McNamara, DE and Sankaran, B and Pereira, JH and Parmeggiani, F and Brunette, TJ and Cascio, D and Yeates, TR and Zwart, P and Baker, D}, url = {https://www.nature.com/articles/nchem.2673 https://www.bakerlab.org/wp-content/uploads/2020/10/Fassas-et-al-2016-Homooligomers.pdf}, doi = {10.1038/nchem.2673}, year = {2016}, date = {2016-12-05}, journal = {Nature Chemistry}, volume = {9}, pages = {353--360}, abstract = {Self-assembling cyclic protein homo-oligomers play important roles in biology, and the ability to generate custom homo-oligomeric structures could enable new approaches to probe biological function. Here we report a general approach to design cyclic homo-oligomers that employs a new residue-pair-transform method to assess the designability of a protein–protein interface. This method is sufficiently rapid to enable the systematic enumeration of cyclically docked arrangements of a monomer followed by sequence design of the newly formed interfaces. We use this method to design interfaces onto idealized repeat proteins that direct their assembly into complexes that possess cyclic symmetry. Of 96 designs that were characterized experimentally, 21 were found to form stable monodisperse homo-oligomers in solution, and 15 (four homodimers, six homotrimers, six homotetramers and one homopentamer) had solution small-angle X-ray scattering data consistent with the design models. X-ray crystal structures were obtained for five of the designs and each is very close to its corresponding computational model.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Berger, Stephanie; Procko, Erik; Margineantu, Daciana; Lee, Erinna F; Shen, Betty W; Zelter, Alex; Silva, Daniel-Adriano; and Chawla, Kusum; Herold, Marco J; Garnier, Jean-Marc; Johnson, Richard; MacCoss, Michael J; Lessene, Guillaume; Davis, Trisha N; Stayton, Patrick S; Stoddard, Barry L; Fairlie, Douglas W; Hockenbery, David M; Baker, David Computationally designed high specificity inhibitors delineate the roles of BCL2 family proteins in cancer Journal Article Elife, 2016. @article{S2016, title = {Computationally designed high specificity inhibitors delineate the roles of BCL2 family proteins in cancer}, author = {Stephanie Berger and Erik Procko and Daciana Margineantu and Erinna F Lee and Betty W Shen and Alex Zelter and Daniel-Adriano Silva and and Kusum Chawla and Marco J Herold and Jean-Marc Garnier and Richard Johnson and Michael J MacCoss and Guillaume Lessene and Trisha N Davis and Patrick S Stayton and Barry L Stoddard and W Douglas Fairlie and David M Hockenbery and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2017/01/Berger_elife_2016.pdf https://elifesciences.org/articles/20352}, doi = {10.7554/eLife.20352}, year = {2016}, date = {2016-11-02}, journal = {Elife}, abstract = {Many cancers overexpress one or more of the six human pro-survival BCL2 family proteins to evade apoptosis. To determine which BCL2 protein or proteins block apoptosis in different cancers, we computationally designed three-helix bundle protein inhibitors specific for each BCL2 pro-survival protein. Following in vitro optimization, each inhibitor binds its target with high picomolar to low nanomolar affinity and at least 300-fold specificity. Expression of the designed inhibitors in human cancer cell lines revealed unique dependencies on BCL2 proteins for survival which could not be inferred from other BCL2 profiling methods. 
Our results show that designed inhibitors can be generated for each member of a closely-knit protein family to probe the importance of specific protein-protein interactions in complex biological processes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Huang, Po-Ssu; Boyken, Scott E; Baker, David The coming of age of de novo protein design Journal Article Nature, 537, pp. 320-327, 2016. @article{Huang2016, title = {The coming of age of de novo protein design}, author = {Po-Ssu Huang and Scott E. Boyken and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/09/HuangBoyken_DeNovoDesign_Nature2016.pdf}, doi = {10.1038/nature19946}, year = {2016}, date = {2016-09-15}, journal = {Nature}, volume = {537}, pages = {320-327}, abstract = {There are 20$^{200}$ possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bhardwaj*, Gaurav; Mulligan*, Vikram Khipple; Bahl*, Christopher D; Gilmore, Jason M; Harvey, Peta J; Cheneval, Olivier; Buchko, Garry W; Pulavarti, Surya V S R K; Kaas, Quentin; Eletsky, Alexander; Huang, Po-Ssu; Johnsen, William A; Greisen, Per Jr; Rocklin, Gabriel J; Song, Yifan; Linsky, Thomas W; Watkins, Andrew; Rettie, Stephen A; Xu, Xianzhong; Carter, Lauren P; Bonneau, Richard; Olson, James M; Coutsias, Evangelos; Correnti, Colin E; Szyperski, Thomas; Craik, David J; Baker, David Accurate de novo design of hyperstable constrained peptides Journal Article Nature, 2016. @article{Bhardwaj2016, title = {Accurate de novo design of hyperstable constrained peptides}, author = {Gaurav Bhardwaj* and Vikram Khipple Mulligan* and Christopher D. Bahl* and Jason M. Gilmore and Peta J. Harvey and Olivier Cheneval and Garry W. Buchko and Surya V. S. R. K. Pulavarti and Quentin Kaas and Alexander Eletsky and Po-Ssu Huang and William A. Johnsen and Per Jr Greisen and Gabriel J. Rocklin and Yifan Song and Thomas W. Linsky and Andrew Watkins and Stephen A. Rettie and Xianzhong Xu and Lauren P. Carter and Richard Bonneau and James M. Olson and Evangelos Coutsias and Colin E. Correnti and Thomas Szyperski and David J. Craik and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/09/Bhardwaj_Nature_2016.pdf}, doi = {10.1038/nature19791}, year = {2016}, date = {2016-09-14}, journal = {Nature}, abstract = {Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. 
Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18–47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N–C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bale, Jacob B; Gonen, Shane; Liu, Yuxi; Sheffler, William; Ellis, Daniel; Thomas, Chantz; Cascio, Duilio; Yeates, Todd O; Gonen, Tamir; King, Neil P; Baker, David Accurate design of megadalton-scale two-component icosahedral protein complexes Journal Article Science, 353 (6297), pp. 389-394, 2016. @article{Bale2016, title = {Accurate design of megadalton-scale two-component icosahedral protein complexes}, author = {Jacob B. Bale and Shane Gonen and Yuxi Liu and William Sheffler and Daniel Ellis and Chantz Thomas and Duilio Cascio and Todd O. Yeates and Tamir Gonen and Neil P. King and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/07/Bale_Science_2016.pdf}, doi = {10.1126/science.aaf8818}, year = {2016}, date = {2016-07-22}, journal = {Science}, volume = {353}, number = {6297}, pages = {389-394}, abstract = {Nature provides many examples of self- and co-assembling protein-based molecular machines, including icosahedral protein cages that serve as scaffolds, enzymes, and compartments for essential biochemical reactions and icosahedral virus capsids, which encapsidate and protect viral genomes and mediate entry into host cells. Inspired by these natural materials, we report the computational design and experimental characterization of co-assembling, two-component, 120-subunit icosahedral protein nanostructures with molecular weights (1.8 to 2.8 megadaltons) and dimensions (24 to 40 nanometers in diameter) comparable to those of small viral capsids. Electron microscopy, small-angle x-ray scattering, and x-ray crystallography show that 10 designs spanning three distinct icosahedral architectures form materials closely matching the design models. In vitro assembly of icosahedral complexes from independently purified components occurs rapidly, at rates comparable to those of viral capsids, and enables controlled packaging of molecular cargo through charge complementarity. 
The ability to design megadalton-scale materials with atomic-level accuracy and controllable assembly opens the door to a new generation of genetically programmable protein-based molecular machines.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Hsia*, Yang; Bale*, Jacob B; Gonen, Shane; Shi, Dan; Sheffler, William; Fong, Kimberly K; Nattermann, ; Xu, Chunfu; Huang, Po-Ssu; Ravichandran, Rashmi; Yi, Sue; Davis, Trisha N; Gonen, Tamir; King, Neil P; Baker, David Design of a hyperstable 60-subunit protein icosahedron Journal Article Nature, 2016. @article{Hsia2016, title = {Design of a hyperstable 60-subunit protein icosahedron}, author = { Yang Hsia* and Jacob B. Bale* and Shane Gonen and Dan Shi and William Sheffler and Kimberly K. Fong and Nattermann and Chunfu Xu and Po-Ssu Huang and Rashmi Ravichandran and Sue Yi and Trisha N. Davis and Tamir Gonen and Neil P. King and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/06/Hsia_Nature_2016.pdf}, doi = {10.1038/nature18010}, year = {2016}, date = {2016-06-15}, journal = {Nature}, abstract = {The icosahedron is the largest of the Platonic solids, and icosahedral protein structures are widely used in biological systems for packaging and transport. There has been considerable interest in repurposing such structures for applications ranging from targeted delivery to multivalent immunogen presentation. The ability to design proteins that self-assemble into precisely specified, highly ordered icosahedral structures would open the door to a new generation of protein containers with properties custom-tailored to specific applications. Here we describe the computational design of a 25-nanometre icosahedral nanocage that self-assembles from trimeric protein building blocks. The designed protein was produced in Escherichia coli, and found by electron microscopy to assemble into a homogenous population of icosahedral particles nearly identical to the design model. The particles are stable in 6.7 molar guanidine hydrochloride at up to 80 degrees Celsius, and undergo extremely abrupt, but reversible, disassembly between 2 molar and 2.25 molar guanidinium thiocyanate. 
The icosahedron is robust to genetic fusions: one or two copies of green fluorescent protein (GFP) can be fused to each of the 60 subunits to create highly fluorescent ‘standard candles’ for use in light microscopy, and a designed protein pentamer can be placed in the centre of each of the 20 pentameric faces to modulate the size of the entrance/exit channels of the cage. Such robust and customizable nanocages should have considerable utility in targeted drug delivery, vaccine design and synthetic biology.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Klein, J C; Lajoie, M J; Schwartz, J J; Strauch, E-M; Nelson, J; Baker, D; Shendure, J Multiplex pairwise assembly of array-derived DNA oligonucleotides Journal Article Nucleic Acids Research, 44 (5), pp. e43, 2016. @article{Klein2016, title = {Multiplex pairwise assembly of array-derived DNA oligonucleotides}, author = {Klein, J. C. and Lajoie, M. J. and Schwartz, J. J. and Strauch, E.-M. and Nelson, J. and Baker, D. and Shendure, J.}, url = {https://www.bakerlab.org/wp-content/uploads/2016/05/gkv1177.pdf}, doi = {10.1093/nar/gkv1177}, year = {2016}, date = {2016-03-18}, journal = {Nucleic Acids Research}, volume = {44}, number = {5}, pages = {e43}, abstract = {While the cost of DNA sequencing has dropped by five orders of magnitude in the past decade, DNA synthesis remains expensive for many applications. Although DNA microarrays have decreased the cost of oligonucleotide synthesis, the use of array-synthesized oligos in practice is limited by short synthesis lengths, high synthesis error rates, low yield and the challenges of assembling long constructs from complex pools. Toward addressing these issues, we developed a protocol for multiplex pairwise assembly of oligos from array-synthesized oligonucleotide pools. To evaluate the method, we attempted to assemble up to 2271 targets ranging in length from 192–252 bases using pairs of array-synthesized oligos. Within sets of complexity ranging from 131–250 targets, we observed error-free assemblies for 90.5% of all targets. When all 2271 targets were assembled in one reaction, we observed error-free constructs for 70.6%. While the assembly method intrinsically increased accuracy to a small degree, we further increased accuracy by using a high throughput ‘Dial-Out PCR’ protocol, which combines Illumina sequencing with an in-house set of unique PCR tags to selectively amplify perfect assemblies from complex synthetic pools. 
This approach has broad applicability to DNA assembly and high-throughput functional screens.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Taylor, N D; Garruss, A S; Moretti, R; Chan, S; Arbing, M A; Cascio, D; Rogers, J K; Isaacs, F J; Kosuri, S; Baker, D; Fields, S; Church, G M; Raman, S Engineering an allosteric transcription factor to respond to new ligands Journal Article Nature Methods, 13 (2), pp. 177-83, 2016. @article{ND2016, title = {Engineering an allosteric transcription factor to respond to new ligands}, author = {Taylor, N. D. and Garruss, A. S. and Moretti, R. and Chan, S. and Arbing, M. A. and Cascio, D. and Rogers, J. K. and Isaacs, F. J. and Kosuri, S. and Baker, D. and Fields, S. and Church, G. M. and Raman, S.}, url = {https://www.bakerlab.org/wp-content/uploads/2016/05/nmeth.36961.pdf}, doi = {10.1038/nmeth.3696}, year = {2016}, date = {2016-02-01}, journal = {Nature Methods}, volume = {13}, number = {2}, pages = {177-83}, abstract = {Genetic regulatory proteins inducible by small molecules are useful synthetic biology tools as sensors and switches. Bacterial allosteric transcription factors (aTFs) are a major class of regulatory proteins, but few aTFs have been redesigned to respond to new effectors beyond natural aTF-inducer pairs. Altering inducer specificity in these proteins is difficult because substitutions that affect inducer binding may also disrupt allostery. We engineered an aTF, the Escherichia coli lac repressor, LacI, to respond to one of four new inducer molecules: fucose, gentiobiose, lactitol and sucralose. Using computational protein design, single-residue saturation mutagenesis or random mutagenesis, along with multiplex assembly, we identified new variants comparable in specificity and induction to wild-type LacI with its inducer, isopropyl β-D-1-thiogalactopyranoside (IPTG). The ability to create designer aTFs will enable applications including dynamic control of cell metabolism, cell biology and synthetic gene circuits.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Boyken, Scott E; Chen, Zibo; Groves, Benjamin; Langan, Robert A; Oberdorfer, Gustav; Ford, Alex; Gilmore, Jason M; Xu, Chunfu; DiMaio, Frank; Pereira, Jose Henrique; Sankaran, Banumathi; Seelig, Georg; Zwart, Peter H; Baker, David De novo design of protein homo-oligomers with modular hydrogen-bond network–mediated specificity Journal Article Science, 352 (6286), pp. 680–687, 2016, ISSN: 0036-8075. @article{Boyken680, title = {De novo design of protein homo-oligomers with modular hydrogen-bond network–mediated specificity}, author = {Scott E. Boyken and Zibo Chen and Benjamin Groves and Robert A. Langan and Gustav Oberdorfer and Alex Ford and Jason M. Gilmore and Chunfu Xu and Frank DiMaio and Jose Henrique Pereira and Banumathi Sankaran and Georg Seelig and Peter H. Zwart and David Baker}, url = {http://science.sciencemag.org/content/352/6286/680 https://www.bakerlab.org/wp-content/uploads/2016/05/680.full_.pdf}, doi = {10.1126/science.aad8865}, issn = {0036-8075}, year = {2016}, date = {2016-01-01}, journal = {Science}, volume = {352}, number = {6286}, pages = {680--687}, publisher = {American Association for the Advancement of Science}, abstract = {General design principles for protein interaction specificity are challenging to extract. DNA nanotechnology, on the other hand, has harnessed the limited set of hydrogen-bonding interactions from Watson-Crick base-pairing to design and build a wide range of shapes. Protein-based materials have the potential for even greater geometric and chemical diversity, including additional functionality. Boyken et al. designed a class of protein oligomers that have interaction specificity determined by modular arrays of extensive hydrogen bond networks (see the Perspective by Netzer and Fleishman). They use the approach, which could one day become programmable, to build novel topologies with two concentric rings of helices. Science, this issue p. 680; see also p. 657. In nature, structural specificity in DNA and proteins is encoded differently: In DNA, specificity arises from modular hydrogen bonds in the core of the double helix, whereas in proteins, specificity arises largely from buried hydrophobic packing complemented by irregular peripheral polar interactions. Here, we describe a general approach for designing a wide range of protein homo-oligomers with specificity determined by modular arrays of central hydrogen-bond networks. We use the approach to design dimers, trimers, and tetramers consisting of two concentric rings of helices, including previously not seen triangular, square, and supercoiled topologies. X-ray crystallography confirms that the structures overall, and the hydrogen-bond networks in particular, are nearly identical to the design models, and the networks confer interaction specificity in vivo. The ability to design extensive hydrogen-bond networks with atomic accuracy enables the programming of protein interaction specificity for a broad range of synthetic biology applications; more generally, our results demonstrate that, even with the tremendous diversity observed in nature, there are fundamentally new modes of interaction to be discovered in proteins.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Ovchinnikov, Sergey ; Park, Hahnbeom ; Kim, David E; Liu, Yuan ; Wang, Ray Yu-Ruei ; Baker, David Structure prediction using sparse simulated NOE restraints with Rosetta in CASP11 Journal Article Proteins: Structure, Function, and Bioinformatics, pp. n/a–n/a, 2016, ISSN: 1097-0134. @article{PROT:PROT25006, title = {Structure prediction using sparse simulated NOE restraints with Rosetta in CASP11}, author = {Ovchinnikov, Sergey and Park, Hahnbeom and Kim, David E. and Liu, Yuan and Wang, Ray Yu-Ruei and Baker, David}, url = {http://dx.doi.org/10.1002/prot.25006 https://www.bakerlab.org/wp-content/uploads/2016/05/Ovchinnikov_et_al-2016-Proteins__Structure_Function_and_Bioinformatics.pdf}, doi = {10.1002/prot.25006}, issn = {1097-0134}, year = {2016}, date = {2016-01-01}, journal = {Proteins: Structure, Function, and Bioinformatics}, pages = {n/a--n/a}, abstract = {In CASP11 we generated protein structure models using simulated ambiguous and unambiguous nuclear Overhauser effect (NOE) restraints with a two stage protocol. Low resolution models were generated guided by the unambiguous restraints using continuous chain folding for alpha and alpha-beta proteins, and iterative annealing for all beta proteins to take advantage of the strand pairing information implicit in the restraints. The Rosetta fragment/model hybridization protocol was then used to recombine and regularize these models, and refine them in the Rosetta full atom energy function guided by both the unambiguous and the ambiguous restraints. Fifteen out of 19 targets were modeled with GDT-TS quality scores greater than 60 for Model 1, significantly improving upon the non-assisted predictions. Our results suggest that atomic level accuracy is achievable using sparse NOE data when there is at least one correctly assigned NOE for every residue. Proteins 2016. 
© 2016 Wiley Periodicals, Inc.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Basanta, Benjamin; Chan, Kui K; Barth, Patrick; King, Tiffany; Sosnick, Tobin R; Hinshaw, James R; Liu, Gaohua; Everett, John K; Xiao, Rong; Montelione, Gaetano T; Baker, David Introduction of a polar core into the de novo designed protein Top7 Journal Article Protein Science, pp. n/a–n/a, 2016, ISSN: 1469-896X. @article{PRO:PRO2899, title = {Introduction of a polar core into the de novo designed protein Top7}, author = {Benjamin Basanta and Kui K. Chan and Patrick Barth and Tiffany King and Tobin R. Sosnick and James R. Hinshaw and Gaohua Liu and John K. Everett and Rong Xiao and Gaetano T. Montelione and David Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/05/Basanta_et_al-2016-Protein_Science.pdf http://dx.doi.org/10.1002/pro.2899}, doi = {10.1002/pro.2899}, issn = {1469-896X}, year = {2016}, date = {2016-01-01}, journal = {Protein Science}, pages = {n/a--n/a}, abstract = {Design of polar interactions is a current challenge for protein design. The de novo designed protein Top7, like almost all designed proteins, has an entirely nonpolar core. Here we describe the replacing of a sizable fraction (5 residues) of this core with a designed polar hydrogen bond network. The polar core design is expressed at high levels in E. coli, has a folding free energy of 10 kcal/mol, and retains the multiphasic folding kinetics of the original Top7. The NMR structure of the design shows that conformations of three of the five residues, and the designed hydrogen bonds between them, are very close to those in the design model. The remaining two residues, which are more solvent exposed, sample a wide range of conformations in the NMR ensemble. These results show that hydrogen bond networks can be designed in protein cores, but also highlight challenges that need to be overcome when there is competition with solvent.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Koday, Merika Treants; Nelson, Jorgen; Chevalier, Aaron; Koday, Michael; Kalinoski, Hannah; Stewart, Lance; Carter, Lauren; Nieusma, Travis; Lee, Peter S; Ward, Andrew B; Wilson, Ian A; Dagley, Ashley; Smee, Donald F; Baker, David; Fuller, Deborah Heydenburg A Computationally Designed Hemagglutinin Stem-Binding Protein Provides In Vivo Protection from Influenza Independent of a Host Immune Response Journal Article PLoS Pathog, 12 (2), pp. 1-23, 2016. @article{10.1371/journal.ppat.1005409, title = {A Computationally Designed Hemagglutinin Stem-Binding Protein Provides In Vivo Protection from Influenza Independent of a Host Immune Response}, author = {Koday, Merika Treants and Nelson, Jorgen and Chevalier, Aaron and Koday, Michael and Kalinoski, Hannah and Stewart, Lance and Carter, Lauren and Nieusma, Travis and Lee, Peter S. and Ward, Andrew B. and Wilson, Ian A. and Dagley, Ashley and Smee, Donald F. and Baker, David and Fuller, Deborah Heydenburg}, url = {http://dx.doi.org/10.1371%2Fjournal.ppat.1005409 https://www.bakerlab.org/wp-content/uploads/2016/05/journal.ppat_.1005409.pdf}, doi = {10.1371/journal.ppat.1005409}, year = {2016}, date = {2016-01-01}, journal = {PLoS Pathog}, volume = {12}, number = {2}, pages = {1-23}, publisher = {Public Library of Science}, abstract = { Influenza is a major public health threat, and pandemics, such as the 2009 H1N1 outbreak, are inevitable. Due to low efficacy of seasonal flu vaccines and the increase in drug-resistant strains of influenza viruses, there is a crucial need to develop new antivirals to protect from seasonal and pandemic influenza. Recently, several broadly neutralizing antibodies have been characterized that bind to a highly conserved site on the viral hemagglutinin (HA) stem region. These antibodies are protective against a wide range of diverse influenza viruses, but their efficacy depends on a host immune effector response through the antibody Fc region (ADCC). 
Here we show that a small engineered protein computationally designed to bind to the same region of the HA stem as broadly neutralizing antibodies mediated protection against diverse strains of influenza in mice by a distinct mechanism that is independent of a host immune response. Protection was superior to that afforded by oseltamivir, a lead marketed antiviral. Furthermore, combination therapy with low doses of the engineered protein and oseltamivir resulted in enhanced and synergistic protection from lethal challenge. Thus, through computational protein engineering, we have designed a new antiviral with strong biopotency in vivo that targets a neutralizing epitope on the hemagglutinin of influenza virus and inhibits its fusion activity. These results have significant implications for the use of computational modeling to design new antivirals against influenza and other viral diseases.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Influenza is a major public health threat, and pandemics, such as the 2009 H1N1 outbreak, are inevitable. Due to low efficacy of seasonal flu vaccines and the increase in drug-resistant strains of influenza viruses, there is a crucial need to develop new antivirals to protect from seasonal and pandemic influenza. Recently, several broadly neutralizing antibodies have been characterized that bind to a highly conserved site on the viral hemagglutinin (HA) stem region. These antibodies are protective against a wide range of diverse influenza viruses, but their efficacy depends on a host immune effector response through the antibody Fc region (ADCC). Here we show that a small engineered protein computationally designed to bind to the same region of the HA stem as broadly neutralizing antibodies mediated protection against diverse strains of influenza in mice by a distinct mechanism that is independent of a host immune response. Protection was superior to that afforded by oseltamivir, a lead marketed antiviral. Furthermore, combination therapy with low doses of the engineered protein and oseltamivir resulted in enhanced and synergistic protection from lethal challenge. 
Thus, through computational protein engineering, we have designed a new antiviral with strong biopotency in vivo that targets a neutralizing epitope on the hemagglutinin of influenza virus and inhibits its fusion activity. These results have significant implications for the use of computational modeling to design new antivirals against influenza and other viral diseases. |
Garcia, Kristen E; Babanova, Sofia; Scheffler, William; Hans, Mansij; Baker, David; Atanassov, Plamen; Banta, Scott Designed protein aggregates entrapping carbon nanotubes for bioelectrochemical oxygen reduction Journal Article Biotechnology and Bioengineering, pp. n/a–n/a, 2016, ISSN: 1097-0290. @article{BIT:BIT25996, title = {Designed protein aggregates entrapping carbon nanotubes for bioelectrochemical oxygen reduction}, author = { Kristen E Garcia and Sofia Babanova and William Scheffler and Mansij Hans and David Baker and Plamen Atanassov and Scott Banta}, url = {http://dx.doi.org/10.1002/bit.25996 https://www.bakerlab.org/wp-content/uploads/2016/05/Garcia_et_al-2016-Biotechnology_and_Bioengineering.pdf}, doi = {10.1002/bit.25996}, issn = {1097-0290}, year = {2016}, date = {2016-01-01}, journal = {Biotechnology and Bioengineering}, pages = {n/a--n/a}, abstract = {The engineering of robust protein/nanomaterial interfaces is critical in the development of bioelectrocatalytic systems. We have used computational protein design to identify two amino acid mutations in the small laccase protein (SLAC) from Streptomyces coelicolor to introduce new inter-protein disulfide bonds. The new dimeric interface introduced by these disulfide bonds, in combination with the natural trimeric structure, drives the self-assembly of SLAC into functional aggregates. The mutations had a minimal effect on kinetic parameters, and the enzymatic assemblies exhibited an increased resistance to irreversible thermal denaturation. The SLAC assemblies were combined with single-walled carbon nanotubes (SWNTs), and explored for use in oxygen reduction electrodes. The incorporation of SWNTs into the SLAC aggregates enabled operation at an elevated temperature and reduced the reaction overpotential. A current density of 1.1 mA/cm² at 0 V vs. Ag/AgCl was achieved in an air-breathing cathode system.}, keywords = {}, pubstate = {published}, tppubtype = {article} } 
The engineering of robust protein/nanomaterial interfaces is critical in the development of bioelectrocatalytic systems. We have used computational protein design to identify two amino acid mutations in the small laccase protein (SLAC) from Streptomyces coelicolor to introduce new inter-protein disulfide bonds. The new dimeric interface introduced by these disulfide bonds, in combination with the natural trimeric structure, drives the self-assembly of SLAC into functional aggregates. The mutations had a minimal effect on kinetic parameters, and the enzymatic assemblies exhibited an increased resistance to irreversible thermal denaturation. The SLAC assemblies were combined with single-walled carbon nanotubes (SWNTs), and explored for use in oxygen reduction electrodes. The incorporation of SWNTs into the SLAC aggregates enabled operation at an elevated temperature and reduced the reaction overpotential. A current density of 1.1 mA/cm² at 0 V vs. Ag/AgCl was achieved in an air-breathing cathode system. |
2015 |
Feng, J; Jester, BW; Tinberg, CE; Mandell, DJ; Antunes, MS; Chari, R; Morey, KJ; Rios, X; Medford, JI; Church, GM; Fields, S; Baker, D A general strategy to construct small molecule biosensors in eukaryotes Journal Article Elife, 2015. @article{J2015, title = {A general strategy to construct small molecule biosensors in eukaryotes}, author = {J Feng and BW Jester and CE Tinberg and DJ Mandell and MS Antunes and R Chari and KJ Morey and X Rios and JI Medford and GM Church and S Fields and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/04/elife-10606-v3-download.pdf}, doi = {10.7554/eLife.10606}, year = {2015}, date = {2015-12-29}, journal = {Elife}, abstract = {Biosensors for small molecules can be used in applications that range from metabolic engineering to orthogonal control of transcription. Here, we produce biosensors based on a ligand-binding domain (LBD) by using a method that, in principle, can be applied to any target molecule. The LBD is fused to either a fluorescent protein or a transcriptional activator and is destabilized by mutation such that the fusion accumulates only in cells containing the target ligand. We illustrate the power of this method by developing biosensors for digoxin and progesterone. Addition of ligand to yeast, mammalian or plant cells expressing a biosensor activates transcription with a dynamic range of up to ~100-fold. We use the biosensors to improve the biotransformation of pregnenolone to progesterone in yeast and to regulate CRISPR activity in mammalian cells. This work provides a general methodology to develop biosensors for a broad range of molecules in eukaryotes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Biosensors for small molecules can be used in applications that range from metabolic engineering to orthogonal control of transcription. Here, we produce biosensors based on a ligand-binding domain (LBD) by using a method that, in principle, can be applied to any target molecule. 
The LBD is fused to either a fluorescent protein or a transcriptional activator and is destabilized by mutation such that the fusion accumulates only in cells containing the target ligand. We illustrate the power of this method by developing biosensors for digoxin and progesterone. Addition of ligand to yeast, mammalian or plant cells expressing a biosensor activates transcription with a dynamic range of up to ~100-fold. We use the biosensors to improve the biotransformation of pregnenolone to progesterone in yeast and to regulate CRISPR activity in mammalian cells. This work provides a general methodology to develop biosensors for a broad range of molecules in eukaryotes. |
Doyle, L; Hallinan, J; Bolduc, J; Parmeggiani, F; Baker, D; Stoddard, BL; Bradley, P Rational design of α-helical tandem repeat proteins with closed architectures Journal Article Nature, 528 (7583), pp. 585–588, 2015. @article{L2015, title = {Rational design of α-helical tandem repeat proteins with closed architectures}, author = {L Doyle and J Hallinan and J Bolduc and F Parmeggiani and D Baker and BL Stoddard and P Bradley}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Doyle_Nature_2015.pdf}, doi = {10.1038/nature16191}, year = {2015}, date = {2015-12-24}, journal = {Nature}, volume = {528}, number = {7583}, pages = {585--588}, abstract = {Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks. The overall architecture of tandem repeat protein structures--which is dictated by the internal geometry and local packing of the repeat building blocks--is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners, to closed and compact conformations with internal cavities suitable for small molecule binding and catalysis. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. 
We have applied these methods to design a series of closed α-solenoid repeat structures (α-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the amino (N) and carboxy (C) termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering, our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed α-solenoid repeats with a left-handed helical architecture that--to our knowledge--is not yet present in the protein structure database.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks. The overall architecture of tandem repeat protein structures--which is dictated by the internal geometry and local packing of the repeat building blocks--is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners, to closed and compact conformations with internal cavities suitable for small molecule binding and catalysis. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. 
We have applied these methods to design a series of closed α-solenoid repeat structures (α-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the amino (N) and carboxy (C) termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering, our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed α-solenoid repeats with a left-handed helical architecture that--to our knowledge--is not yet present in the protein structure database. |
Brunette, TJ; Parmeggiani, F; Huang, PS; Bhabha, G; Ekiert, DC; Tsutakawa, SE; Hura, GL; Tainer, JA; Baker, D Exploring the repeat protein universe through computational protein design Journal Article Nature, 528 (7583), pp. 580–584, 2015. @article{TJ2015, title = {Exploring the repeat protein universe through computational protein design}, author = {TJ Brunette and F Parmeggiani and PS Huang and G Bhabha and DC Ekiert and SE Tsutakawa and GL Hura and JA Tainer and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Brunette_Nature_2015.pdf}, doi = {10.1038/nature16162}, year = {2015}, date = {2015-12-24}, journal = {Nature}, volume = {528}, number = {7583}, pages = {580--584}, abstract = {A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix-loop-helix-loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. 
Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering. }, keywords = {}, pubstate = {published}, tppubtype = {article} } A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix-loop-helix-loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering. |
Taylor, ND; Garruss, AS; Moretti, R; Chan, S; Arbing, MA; Cascio, D; Rogers, JK; Isaacs, FJ; Kosuri, S; Baker, D; Fields, S; Church, GM; Raman, S Engineering an allosteric transcription factor to respond to new ligands Journal Article Nature Methods, 2015. @article{ND2015, title = {Engineering an allosteric transcription factor to respond to new ligands}, author = {ND Taylor and AS Garruss and R Moretti and S Chan and MA Arbing and D Cascio and JK Rogers and FJ Isaacs and S Kosuri and D Baker and S Fields and GM Church and S Raman}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Taylor_NatMeth_2015.pdf}, doi = {10.1038/nmeth.3696}, year = {2015}, date = {2015-12-21}, journal = {Nature Methods}, abstract = {Genetic regulatory proteins inducible by small molecules are useful synthetic biology tools as sensors and switches. Bacterial allosteric transcription factors (aTFs) are a major class of regulatory proteins, but few aTFs have been redesigned to respond to new effectors beyond natural aTF-inducer pairs. Altering inducer specificity in these proteins is difficult because substitutions that affect inducer binding may also disrupt allostery. We engineered an aTF, the Escherichia coli lac repressor, LacI, to respond to one of four new inducer molecules: fucose, gentiobiose, lactitol and sucralose. Using computational protein design, single-residue saturation mutagenesis or random mutagenesis, along with multiplex assembly, we identified new variants comparable in specificity and induction to wild-type LacI with its inducer, isopropyl β-D-1-thiogalactopyranoside (IPTG). The ability to create designer aTFs will enable applications including dynamic control of cell metabolism, cell biology and synthetic gene circuits.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Genetic regulatory proteins inducible by small molecules are useful synthetic biology tools as sensors and switches. 
Bacterial allosteric transcription factors (aTFs) are a major class of regulatory proteins, but few aTFs have been redesigned to respond to new effectors beyond natural aTF-inducer pairs. Altering inducer specificity in these proteins is difficult because substitutions that affect inducer binding may also disrupt allostery. We engineered an aTF, the Escherichia coli lac repressor, LacI, to respond to one of four new inducer molecules: fucose, gentiobiose, lactitol and sucralose. Using computational protein design, single-residue saturation mutagenesis or random mutagenesis, along with multiplex assembly, we identified new variants comparable in specificity and induction to wild-type LacI with its inducer, isopropyl β-D-1-thiogalactopyranoside (IPTG). The ability to create designer aTFs will enable applications including dynamic control of cell metabolism, cell biology and synthetic gene circuits. |
Ovchinnikov, S; Kim, DE; Wang, RY; Liu, Y; DiMaio, F; Baker, D Improved de novo structure prediction in CASP11 by incorporating co-evolution information into Rosetta Journal Article Proteins, 2015. @article{S2015, title = {Improved de novo structure prediction in CASP11 by incorporating co-evolution information into Rosetta}, author = {S Ovchinnikov and DE Kim and RY Wang and Y Liu and F DiMaio and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Ovchinnikov_Proteins_2015.pdf}, doi = {10.1002/prot.24974}, year = {2015}, date = {2015-12-17}, journal = {Proteins}, abstract = {We describe CASP11 de novo blind structure predictions made using the Rosetta structure prediction methodology with both automatic and human assisted protocols. Model accuracy was generally improved using co-evolution derived residue-residue contact information as restraints during Rosetta conformational sampling and refinement, particularly when the number of sequences in the family was more than three times the length of the protein. The highlight was the human assisted prediction of T0806, a large and topologically complex target with no homologs of known structure, which had unprecedented accuracy: <3.0 Å root-mean-square deviation (RMSD) from the crystal structure over 223 residues. For this target, we increased the amount of conformational sampling over our fully automated method by employing an iterative hybridization protocol. Our results clearly demonstrate, in a blind prediction scenario, that co-evolution derived contacts can considerably increase the accuracy of template-free structure modeling.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We describe CASP11 de novo blind structure predictions made using the Rosetta structure prediction methodology with both automatic and human assisted protocols. 
Model accuracy was generally improved using co-evolution derived residue-residue contact information as restraints during Rosetta conformational sampling and refinement, particularly when the number of sequences in the family was more than three times the length of the protein. The highlight was the human assisted prediction of T0806, a large and topologically complex target with no homologs of known structure, which had unprecedented accuracy: <3.0 Å root-mean-square deviation (RMSD) from the crystal structure over 223 residues. For this target, we increased the amount of conformational sampling over our fully automated method by employing an iterative hybridization protocol. Our results clearly demonstrate, in a blind prediction scenario, that co-evolution derived contacts can considerably increase the accuracy of template-free structure modeling. |
King, IC; Gleixner, J; Doyle, L; Kuzin, A; Hunt, JF; Xiao, R; Montelione, GT; Stoddard, BL; DiMaio, F; Baker, D Precise assembly of complex beta sheet topologies from de novo designed building blocks Journal Article Elife, 2015. @article{IC2015, title = {Precise assembly of complex beta sheet topologies from de novo designed building blocks}, author = {IC King and J Gleixner and L Doyle and A Kuzin and JF Hunt and R Xiao and GT Montelione and BL Stoddard and F DiMaio and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/King_elife_2015.pdf}, doi = {10.7554/eLife.11012}, year = {2015}, date = {2015-12-09}, journal = {Elife}, abstract = {Design of complex alpha-beta protein topologies poses a challenge because of the large number of alternative packing arrangements. A similar challenge presumably limited the emergence of large and complex protein topologies in evolution. Here we demonstrate that protein topologies with six and seven-stranded beta sheets can be designed by insertion of one de novo designed beta sheet containing protein into another such that the two beta sheets are merged to form a single extended sheet, followed by amino acid sequence optimization at the newly formed strand-strand, strand-helix, and helix-helix interfaces. Crystal structures of two such designs closely match the computational design models. Searches for similar structures in the SCOP protein domain database yield only weak matches with different beta sheet connectivities. A similar beta sheet fusion mechanism may have contributed to the emergence of complex beta sheets during natural protein evolution.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Design of complex alpha-beta protein topologies poses a challenge because of the large number of alternative packing arrangements. A similar challenge presumably limited the emergence of large and complex protein topologies in evolution. 
Here we demonstrate that protein topologies with six and seven-stranded beta sheets can be designed by insertion of one de novo designed beta sheet containing protein into another such that the two beta sheets are merged to form a single extended sheet, followed by amino acid sequence optimization at the newly formed strand-strand, strand-helix, and helix-helix interfaces. Crystal structures of two such designs closely match the computational design models. Searches for similar structures in the SCOP protein domain database yield only weak matches with different beta sheet connectivities. A similar beta sheet fusion mechanism may have contributed to the emergence of complex beta sheets during natural protein evolution. |
Goldsmith, M; Eckstein, S; Ashani, Y; Greisen, Jr P; Leader, H; Sussman, JL; Aggarwal, N; Ovchinnikov, S; Tawfik, DS; Baker, D; Thiermann, H; Worek, F Catalytic efficiencies of directly evolved phosphotriesterase variants with structurally different organophosphorus compounds in vitro Journal Article Archives of Toxicology, 2015. @article{M2015, title = {Catalytic efficiencies of directly evolved phosphotriesterase variants with structurally different organophosphorus compounds in vitro}, author = {M Goldsmith and S Eckstein and Y Ashani and P Jr Greisen and H Leader and JL Sussman and N Aggarwal and S Ovchinnikov and DS Tawfik and D Baker and H Thiermann and F Worek}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Goldsmith_ArchToxicol_2015.pdf}, doi = {10.1007/s00204-015-1626-2}, year = {2015}, date = {2015-11-26}, journal = {Archives of Toxicology}, abstract = {The nearly 200,000 fatalities following exposure to organophosphorus (OP) pesticides each year and the omnipresent danger of a terroristic attack with OP nerve agents emphasize the demand for the development of effective OP antidotes. Standard treatments for intoxicated patients with a combination of atropine and an oxime are limited in their efficacy. Thus, research focuses on developing catalytic bioscavengers as an alternative approach using OP-hydrolyzing enzymes such as Brevundimonas diminuta phosphotriesterase (PTE). Recently, a PTE mutant dubbed C23 was engineered, exhibiting reversed stereoselectivity and high catalytic efficiency (kcat/KM) for the hydrolysis of the toxic enantiomers of VX, CVX, and VR. Additionally, C23's ability to prevent systemic toxicity of VX using a low protein dose has been shown in vivo. In this study, the catalytic efficiencies of V-agent hydrolysis by two newly selected PTE variants were determined. 
Moreover, in order to establish trends in sequence-activity relationships along the pathway of PTE's laboratory evolution, we examined kcat/KM values of several variants with a number of V-type and G-type nerve agents as well as with different OP pesticides. Although none of the new PTE variants exhibited kcat/KM values >10⁷ M⁻¹ min⁻¹ with V-type nerve agents, which is required for effective prophylaxis, they were improved with VR relative to previously evolved variants. The new variants detoxify a broad spectrum of OPs and provide insight into OP hydrolysis and sequence-activity relationships.}, keywords = {}, pubstate = {published}, tppubtype = {article} } The nearly 200,000 fatalities following exposure to organophosphorus (OP) pesticides each year and the omnipresent danger of a terroristic attack with OP nerve agents emphasize the demand for the development of effective OP antidotes. Standard treatments for intoxicated patients with a combination of atropine and an oxime are limited in their efficacy. Thus, research focuses on developing catalytic bioscavengers as an alternative approach using OP-hydrolyzing enzymes such as Brevundimonas diminuta phosphotriesterase (PTE). Recently, a PTE mutant dubbed C23 was engineered, exhibiting reversed stereoselectivity and high catalytic efficiency (kcat/KM) for the hydrolysis of the toxic enantiomers of VX, CVX, and VR. Additionally, C23's ability to prevent systemic toxicity of VX using a low protein dose has been shown in vivo. In this study, the catalytic efficiencies of V-agent hydrolysis by two newly selected PTE variants were determined. Moreover, in order to establish trends in sequence-activity relationships along the pathway of PTE's laboratory evolution, we examined kcat/KM values of several variants with a number of V-type and G-type nerve agents as well as with different OP pesticides. 
Although none of the new PTE variants exhibited kcat/KM values >10⁷ M⁻¹ min⁻¹ with V-type nerve agents, which is required for effective prophylaxis, they were improved with VR relative to previously evolved variants. The new variants detoxify a broad spectrum of OPs and provide insight into OP hydrolysis and sequence-activity relationships. |
Huang, PS; Feldmeier, K; Parmeggiani, F; Velasco, DA Fernandez; Höcker, B; Baker, D De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy Journal Article Nature Chemical Biology, 12 (1), pp. 29–34, 2015. @article{PS2015, title = {De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy}, author = {PS Huang and K Feldmeier and F Parmeggiani and DA Fernandez Velasco and B Höcker and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2015/12/Huang_NatChemBio_2015.pdf}, doi = {10.1038/nchembio.1966}, year = {2015}, date = {2015-11-23}, journal = {Nature Chemical Biology}, volume = {12}, number = {1}, pages = {29--34}, abstract = {Despite efforts for over 25 years, de novo protein design has not succeeded in achieving the TIM-barrel fold. Here we describe the computational design of four-fold symmetrical (β/α)₈ barrels guided by geometrical and chemical principles. Experimental characterization of 33 designs revealed the importance of side chain-backbone hydrogen bonds for defining the strand register between repeat units. The X-ray crystal structure of a designed thermostable 184-residue protein is nearly identical to that of the designed TIM-barrel model. PSI-BLAST searches do not identify sequence similarities to known TIM-barrel proteins, and sensitive profile-profile searches indicate that the design sequence is distant from other naturally occurring TIM-barrel superfamilies, suggesting that Nature has sampled only a subset of the sequence space available to the TIM-barrel fold. The ability to design TIM barrels de novo opens new possibilities for custom-made enzymes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Despite efforts for over 25 years, de novo protein design has not succeeded in achieving the TIM-barrel fold. Here we describe the computational design of four-fold symmetrical (β/α)₈ barrels guided by geometrical and chemical principles. 
Experimental characterization of 33 designs revealed the importance of side chain-backbone hydrogen bonds for defining the strand register between repeat units. The X-ray crystal structure of a designed thermostable 184-residue protein is nearly identical to that of the designed TIM-barrel model. PSI-BLAST searches do not identify sequence similarities to known TIM-barrel proteins, and sensitive profile-profile searches indicate that the design sequence is distant from other naturally occurring TIM-barrel superfamilies, suggesting that Nature has sampled only a subset of the sequence space available to the TIM-barrel fold. The ability to design TIM barrels de novo opens new possibilities for custom-made enzymes. |
Lin, YR; Koga, N; Tatsumi-Koga, R; Liu, G; Clouser, AF; Montelione, GT; Baker, D Control over overall shape and size in de novo designed proteins Journal Article Proc Natl Acad Sci U S A., pp. E5478–E5485, 2015. @article{YR2015, title = {Control over overall shape and size in de novo designed proteins}, author = {YR Lin and N Koga and R Tatsumi-Koga and G Liu and AF Clouser and GT Montelione and D Baker}, url = {https://www.bakerlab.org/wp-content/uploads/2016/04/PNAS-2015-Lin-E5478-85.pdf}, doi = {10.1073/pnas.1509508112}, year = {2015}, date = {2015-10-06}, journal = {Proc Natl Acad Sci U S A.}, pages = {E5478--E5485}, abstract = {We recently described general principles for designing ideal protein structures stabilized by completely consistent local and nonlocal interactions. The principles relate secondary structure patterns to tertiary packing motifs and enable design of different protein topologies. To achieve fine control over protein shape and size within a particular topology, we have extended the design rules by systematically analyzing the codependencies between the lengths and packing geometry of successive secondary structure elements and the backbone torsion angles of the loop linking them. We demonstrate the control afforded by the resulting extended rule set by designing a series of proteins with the same fold but considerable variation in secondary structure length, loop geometry, β-strand registry, and overall shape. Solution NMR structures of four designed proteins for two different folds show that protein shape and size can be precisely controlled within a given protein fold. These extended design principles provide the foundation for custom design of protein structures performing desired functions.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We recently described general principles for designing ideal protein structures stabilized by completely consistent local and nonlocal interactions. 
The principles relate secondary structure patterns to tertiary packing motifs and enable design of different protein topologies. To achieve fine control over protein shape and size within a particular topology, we have extended the design rules by systematically analyzing the codependencies between the lengths and packing geometry of successive secondary structure elements and the backbone torsion angles of the loop linking them. We demonstrate the control afforded by the resulting extended rule set by designing a series of proteins with the same fold but considerable variation in secondary structure length, loop geometry, β-strand registry, and overall shape. Solution NMR structures of four designed proteins for two different folds show that protein shape and size can be precisely controlled within a given protein fold. These extended design principles provide the foundation for custom design of protein structures performing desired functions. |
Holstein, Carly A; Chevalier, Aaron; Bennett, Steven; Anderson, Caitlin E; Keniston, Karen; Olsen, Cathryn; Li, Bing; Bales, Brian; Moore, David R; Fu, Elain; Baker, David; Yager, Paul Immobilizing affinity proteins to nitrocellulose: a toolbox for paper-based assay developers. Journal Article Analytical and bioanalytical chemistry, 2015, ISSN: 1618-2650. @article{626, title = {Immobilizing affinity proteins to nitrocellulose: a toolbox for paper-based assay developers.}, author = { Carly A Holstein and Aaron Chevalier and Steven Bennett and Caitlin E Anderson and Karen Keniston and Cathryn Olsen and Bing Li and Brian Bales and David R Moore and Elain Fu and David Baker and Paul Yager}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Holstien_Anal_Bioanal_Chem_2015.pdf}, doi = {10.1007/s00216-015-9052-0}, issn = {1618-2650}, year = {2015}, date = {2015-10-01}, journal = {Analytical and bioanalytical chemistry}, abstract = {To enable enhanced paper-based diagnostics with improved detection capabilities, new methods are needed to immobilize affinity reagents to porous substrates, especially for capture molecules other than IgG. To this end, we have developed and characterized three novel methods for immobilizing protein-based affinity reagents to nitrocellulose membranes. We have demonstrated these methods using recombinant affinity proteins for the influenza surface protein hemagglutinin, leveraging the customizability of these recombinant "flu binders" for the design of features for immobilization. The three approaches shown are: (1) covalent attachment of thiolated affinity protein to an epoxide-functionalized nitrocellulose membrane, (2) attachment of biotinylated affinity protein through a nitrocellulose-binding streptavidin anchor protein, and (3) fusion of affinity protein to a novel nitrocellulose-binding anchor protein for direct coupling and immobilization. 
We also characterized the use of direct adsorption for the flu binders, as a point of comparison and motivation for these novel methods. Finally, we demonstrated that these novel methods can provide improved performance to an influenza hemagglutinin assay, compared to a traditional antibody-based capture system. Taken together, this work advances the toolkit available for the development of next-generation paper-based diagnostics.}, keywords = {}, pubstate = {published}, tppubtype = {article} } To enable enhanced paper-based diagnostics with improved detection capabilities, new methods are needed to immobilize affinity reagents to porous substrates, especially for capture molecules other than IgG. To this end, we have developed and characterized three novel methods for immobilizing protein-based affinity reagents to nitrocellulose membranes. We have demonstrated these methods using recombinant affinity proteins for the influenza surface protein hemagglutinin, leveraging the customizability of these recombinant "flu binders" for the design of features for immobilization. The three approaches shown are: (1) covalent attachment of thiolated affinity protein to an epoxide-functionalized nitrocellulose membrane, (2) attachment of biotinylated affinity protein through a nitrocellulose-binding streptavidin anchor protein, and (3) fusion of affinity protein to a novel nitrocellulose-binding anchor protein for direct coupling and immobilization. We also characterized the use of direct adsorption for the flu binders, as a point of comparison and motivation for these novel methods. Finally, we demonstrated that these novel methods can provide improved performance to an influenza hemagglutinin assay, compared to a traditional antibody-based capture system. Taken together, this work advances the toolkit available for the development of next-generation paper-based diagnostics. |
S Ovchinnikov, L Kinch, H Park, Y Liao, J Pei, DE Kim, H Kamisetty, NV Grishin, D Baker Large-scale determination of previously unsolved protein structures using evolutionary information Journal Article eLife, 2015. @article{S2015b, title = {Large-scale determination of previously unsolved protein structures using evolutionary information}, author = {Ovchinnikov, S and Kinch, L and Park, H and Liao, Y and Pei, J and Kim, DE and Kamisetty, H and Grishin, NV and Baker, D}, url = {https://www.bakerlab.org/wp-content/uploads/2016/01/Ovchinnikov_eLife_2015.pdf}, doi = {10.7554/eLife.09248}, year = {2015}, date = {2015-09-03}, journal = {eLife}, abstract = {The prediction of the structures of proteins without detectable sequence similarity to any protein of known structure remains an outstanding scientific challenge. Here we report significant progress in this area. We first describe de novo blind structure predictions of unprecedented accuracy we made for two proteins in large families in the recent CASP11 blind test of protein structure prediction methods by incorporating residue-residue co-evolution information in the Rosetta structure prediction program. We then describe the use of this method to generate structure models for 58 of the 121 large protein families in prokaryotes for which three-dimensional structures are not available. These models, which are posted online for public access, provide structural information for the over 400,000 proteins belonging to the 58 families and suggest hypotheses about mechanism for the subset for which the function is known, and hypotheses about function for the remainder.}, keywords = {}, pubstate = {published}, tppubtype = {article} } The prediction of the structures of proteins without detectable sequence similarity to any protein of known structure remains an outstanding scientific challenge. Here we report significant progress in this area.
We first describe de novo blind structure predictions of unprecedented accuracy we made for two proteins in large families in the recent CASP11 blind test of protein structure prediction methods by incorporating residue-residue co-evolution information in the Rosetta structure prediction program. We then describe the use of this method to generate structure models for 58 of the 121 large protein families in prokaryotes for which three-dimensional structures are not available. These models, which are posted online for public access, provide structural information for the over 400,000 proteins belonging to the 58 families and suggest hypotheses about mechanism for the subset for which the function is known, and hypotheses about function for the remainder. |
Wolf, Clancey; Siegel, Justin B; Tinberg, Christine; Camarca, Alessandra; Gianfrani, Carmen; Paski, Shirley; Guan, Rongjin; Montelione, Gaetano T; Baker, David; Pultz, Ingrid S Engineering of Kuma030: a gliadin peptidase that rapidly degrades immunogenic gliadin peptides in gastric conditions. Journal Article Journal of the American Chemical Society, 2015, ISSN: 1520-5126. @article{617, title = {Engineering of Kuma030: a gliadin peptidase that rapidly degrades immunogenic gliadin peptides in gastric conditions.}, author = { Clancey Wolf and Justin B Siegel and Christine Tinberg and Alessandra Camarca and Carmen Gianfrani and Shirley Paski and Rongjin Guan and Gaetano T Montelione and David Baker and Ingrid S Pultz}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Wolf_JACS_2015.pdf}, doi = {10.1021/jacs.5b08325}, issn = {1520-5126}, year = {2015}, date = {2015-09-01}, journal = {Journal of the American Chemical Society}, abstract = {Celiac disease is characterized by intestinal inflammation triggered by gliadin, a component of dietary gluten. Oral administration of proteases that can rapidly degrade gliadin in the gastric compartment has been proposed as a treatment for celiac disease; however, no protease has been shown to specifically reduce the immunogenic gliadin content, in gastric conditions, to below the threshold shown to be toxic for celiac patients. Here, we used the Rosetta Molecular Modeling Suite to redesign the active site of the acid-active gliadin endopeptidase KumaMax. The resulting protease, Kuma030, specifically recognizes tripeptide sequences that are found throughout the immunogenic regions of gliadin, as well as in homologous proteins in barley and rye. Indeed, treatment of gliadin with Kuma030 eliminates the ability of gliadin to stimulate a T cell response. 
Kuma030 is capable of degrading >99% of the immunogenic gliadin fraction in laboratory-simulated gastric digestions within minutes, to a level below the toxic threshold for celiac patients, suggesting great potential for this enzyme as an oral therapeutic for celiac disease.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Celiac disease is characterized by intestinal inflammation triggered by gliadin, a component of dietary gluten. Oral administration of proteases that can rapidly degrade gliadin in the gastric compartment has been proposed as a treatment for celiac disease; however, no protease has been shown to specifically reduce the immunogenic gliadin content, in gastric conditions, to below the threshold shown to be toxic for celiac patients. Here, we used the Rosetta Molecular Modeling Suite to redesign the active site of the acid-active gliadin endopeptidase KumaMax. The resulting protease, Kuma030, specifically recognizes tripeptide sequences that are found throughout the immunogenic regions of gliadin, as well as in homologous proteins in barley and rye. Indeed, treatment of gliadin with Kuma030 eliminates the ability of gliadin to stimulate a T cell response. Kuma030 is capable of degrading >99% of the immunogenic gliadin fraction in laboratory-simulated gastric digestions within minutes, to a level below the toxic threshold for celiac patients, suggesting great potential for this enzyme as an oral therapeutic for celiac disease. |
Bale, Jacob B; Park, Rachel U; Liu, Yuxi; Gonen, Shane; Gonen, Tamir; Cascio, Duilio; King, Neil P; Yeates, Todd O; Baker, David Structure of a designed tetrahedral protein assembly variant engineered to have improved soluble expression Journal Article Protein science : a publication of the Protein Society, 2015, ISSN: 1469-896X. @article{616, title = {Structure of a designed tetrahedral protein assembly variant engineered to have improved soluble expression}, author = { Jacob B Bale and Rachel U Park and Yuxi Liu and Shane Gonen and Tamir Gonen and Duilio Cascio and Neil P. King and Todd O. Yeates and David Baker}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Bale_designed_tetrahedral_ProteinSci2015.pdf}, doi = {10.1002/pro.2748}, issn = {1469-896X}, year = {2015}, date = {2015-07-01}, journal = {Protein science : a publication of the Protein Society}, abstract = {We recently reported the development of a computational method for the design of coassembling multicomponent protein nanomaterials. While four such materials were validated at high-resolution by X-ray crystallography, low yield of soluble protein prevented X-ray structure determination of a fifth designed material, T33-09. Here we report the design and crystal structure of T33-31, a variant of T33-09 with improved soluble yield resulting from redesign efforts focused on mutating solvent-exposed side chains to charged amino acids. The structure is found to match the computational design model with atomic-level accuracy, providing further validation of the design approach and demonstrating a simple and potentially general means of improving the yield of designed protein nanomaterials.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We recently reported the development of a computational method for the design of coassembling multicomponent protein nanomaterials. 
While four such materials were validated at high-resolution by X-ray crystallography, low yield of soluble protein prevented X-ray structure determination of a fifth designed material, T33-09. Here we report the design and crystal structure of T33-31, a variant of T33-09 with improved soluble yield resulting from redesign efforts focused on mutating solvent-exposed side chains to charged amino acids. The structure is found to match the computational design model with atomic-level accuracy, providing further validation of the design approach and demonstrating a simple and potentially general means of improving the yield of designed protein nanomaterials. |
Gonen, Shane; DiMaio, Frank; Gonen, Tamir; Baker, David Design of ordered two-dimensional arrays mediated by noncovalent protein-protein interfaces Journal Article Science (New York, N.Y.), 348 , pp. 1365-8, 2015, ISSN: 1095-9203. @article{613, title = {Design of ordered two-dimensional arrays mediated by noncovalent protein-protein interfaces}, author = { Shane Gonen and Frank DiMaio and Tamir Gonen and David Baker}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Gonen_2DArrays_Baker2015.pdf}, doi = {10.1126/science.aaa9897}, issn = {1095-9203}, year = {2015}, date = {2015-06-01}, journal = {Science (New York, N.Y.)}, volume = {348}, pages = {1365-8}, abstract = {We describe a general approach to designing two-dimensional (2D) protein arrays mediated by noncovalent protein-protein interfaces. Protein homo-oligomers are placed into one of the seventeen 2D layer groups, the degrees of freedom of the lattice are sampled to identify configurations with shape-complementary interacting surfaces, and the interaction energy is minimized using sequence design calculations. We used the method to design proteins that self-assemble into layer groups P 3 2 1, P 4 2(1) 2, and P 6. Projection maps of micrometer-scale arrays, assembled both in vitro and in vivo, are consistent with the design models and display the target layer group symmetry. Such programmable 2D protein lattices should enable new approaches to structure determination, sensing, and nanomaterial engineering.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We describe a general approach to designing two-dimensional (2D) protein arrays mediated by noncovalent protein-protein interfaces. Protein homo-oligomers are placed into one of the seventeen 2D layer groups, the degrees of freedom of the lattice are sampled to identify configurations with shape-complementary interacting surfaces, and the interaction energy is minimized using sequence design calculations. 
We used the method to design proteins that self-assemble into layer groups P 3 2 1, P 4 2(1) 2, and P 6. Projection maps of micrometer-scale arrays, assembled both in vitro and in vivo, are consistent with the design models and display the target layer group symmetry. Such programmable 2D protein lattices should enable new approaches to structure determination, sensing, and nanomaterial engineering. |
Park, Hahnbeom; DiMaio, Frank; Baker, David The origin of consistent protein structure refinement from structural averaging. Journal Article Structure (London, England : 1993), 23 , pp. 1123-8, 2015, ISSN: 1878-4186. @article{615, title = {The origin of consistent protein structure refinement from structural averaging.}, author = { Hahnbeom Park and Frank DiMaio and David Baker}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Park_Structure_2015.pdf http://www.ncbi.nlm.nih.gov/pubmed/?term=The+Origin+of+Consistent+Protein+Structure+Refinement+from+Structural+Averaging}, doi = {10.1016/j.str.2015.03.022}, issn = {1878-4186}, year = {2015}, date = {2015-06-01}, journal = {Structure (London, England : 1993)}, volume = {23}, pages = {1123-8}, abstract = {Recent studies have shown that explicit solvent molecular dynamics (MD) simulation followed by structural averaging can consistently improve protein structure models. We find that improvement upon averaging is not limited to explicit water MD simulation, as consistent improvements are also observed for more efficient implicit solvent MD or Monte Carlo minimization simulations. To determine the origin of these improvements, we examine the changes in model accuracy brought about by averaging at the individual residue level. We find that the improvement in model quality from averaging results from the superposition of two effects: a dampening of deviations from the correct structure in the least well modeled regions, and a reinforcement of consistent movements towards the correct structure in better modeled regions. 
These observations are consistent with an energy landscape model in which the magnitude of the energy gradient toward the native structure decreases with increasing distance from the native state.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Recent studies have shown that explicit solvent molecular dynamics (MD) simulation followed by structural averaging can consistently improve protein structure models. We find that improvement upon averaging is not limited to explicit water MD simulation, as consistent improvements are also observed for more efficient implicit solvent MD or Monte Carlo minimization simulations. To determine the origin of these improvements, we examine the changes in model accuracy brought about by averaging at the individual residue level. We find that the improvement in model quality from averaging results from the superposition of two effects: a dampening of deviations from the correct structure in the least well modeled regions, and a reinforcement of consistent movements towards the correct structure in better modeled regions. These observations are consistent with an energy landscape model in which the magnitude of the energy gradient toward the native structure decreases with increasing distance from the native state. |
Siegel, Justin B; Smith, Amanda Lee; Poust, Sean; Wargacki, Adam J; Bar-Even, Arren; Louw, Catherine; Shen, Betty W; Eiben, Christopher B; Tran, Huu M; Noor, Elad; Gallaher, Jasmine L; Bale, Jacob; Yoshikuni, Yasuo; Gelb, Michael H; Keasling, Jay D; Stoddard, Barry L; Lidstrom, Mary E; Baker, David Computational protein design enables a novel one-carbon assimilation pathway Journal Article Proceedings of the National Academy of Sciences of the United States of America, 2015, ISSN: 1091-6490. @article{565, title = {Computational protein design enables a novel one-carbon assimilation pathway}, author = { Justin B Siegel and Amanda Lee Smith and Sean Poust and Adam J Wargacki and Arren Bar-Even and Catherine Louw and Betty W Shen and Christopher B Eiben and Huu M Tran and Elad Noor and Jasmine L Gallaher and Jacob Bale and Yasuo Yoshikuni and Michael H Gelb and Jay D Keasling and Barry L Stoddard and Mary E Lidstrom and David Baker}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/siegel15A.pdf}, doi = {10.1073/pnas.1500545112}, issn = {1091-6490}, year = {2015}, date = {2015-03-01}, journal = {Proceedings of the National Academy of Sciences of the United States of America}, abstract = {We describe a computationally designed enzyme, formolase (FLS), which catalyzes the carboligation of three one-carbon formaldehyde molecules into one three-carbon dihydroxyacetone molecule. The existence of FLS enables the design of a new carbon fixation pathway, the formolase pathway, consisting of a small number of thermodynamically favorable chemical transformations that convert formate into a three-carbon sugar in central metabolism. The formolase pathway is predicted to use carbon more efficiently and with less backward flux than any naturally occurring one-carbon assimilation pathway. When supplemented with enzymes carrying out the other steps in the pathway, FLS converts formate into dihydroxyacetone phosphate and other central metabolites in vitro. 
These results demonstrate how modern protein engineering and design tools can facilitate the construction of a completely new biosynthetic pathway.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We describe a computationally designed enzyme, formolase (FLS), which catalyzes the carboligation of three one-carbon formaldehyde molecules into one three-carbon dihydroxyacetone molecule. The existence of FLS enables the design of a new carbon fixation pathway, the formolase pathway, consisting of a small number of thermodynamically favorable chemical transformations that convert formate into a three-carbon sugar in central metabolism. The formolase pathway is predicted to use carbon more efficiently and with less backward flux than any naturally occurring one-carbon assimilation pathway. When supplemented with enzymes carrying out the other steps in the pathway, FLS converts formate into dihydroxyacetone phosphate and other central metabolites in vitro. These results demonstrate how modern protein engineering and design tools can facilitate the construction of a completely new biosynthetic pathway. |
DiMaio, Frank; Song, Yifan; Li, Xueming; Brunner, Matthias J; Xu, Chunfu; Conticello, Vincent; Egelman, Edward; Marlovits, Thomas C; Cheng, Yifan; Baker, David Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement. Journal Article Nature methods, 2015, ISSN: 1548-7105. @article{560, title = {Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement.}, author = { Frank DiMaio and Yifan Song and Xueming Li and Matthias J Brunner and Chunfu Xu and Vincent Conticello and Edward Egelman and Thomas C Marlovits and Yifan Cheng and David Baker}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/DiMaio_NatMethods_2015.pdf}, doi = {10.1038/nmeth.3286}, issn = {1548-7105}, year = {2015}, date = {2015-02-01}, journal = {Nature methods}, abstract = {We describe a general approach for refining protein structure models on the basis of cryo-electron microscopy maps with near-atomic resolution. The method integrates Monte Carlo sampling with local density-guided optimization, Rosetta all-atom refinement and real-space B-factor fitting. In tests on experimental maps of three different systems with 4.5-Å resolution or better, the method consistently produced models with atomic-level accuracy largely independently of starting-model quality, and it outperformed the molecular dynamics-based MDFF method. Cross-validated model quality statistics correlated with model accuracy over the three test systems.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We describe a general approach for refining protein structure models on the basis of cryo-electron microscopy maps with near-atomic resolution. The method integrates Monte Carlo sampling with local density-guided optimization, Rosetta all-atom refinement and real-space B-factor fitting.
In tests on experimental maps of three different systems with 4.5-Å resolution or better, the method consistently produced models with atomic-level accuracy largely independently of starting-model quality, and it outperformed the molecular dynamics-based MDFF method. Cross-validated model quality statistics correlated with model accuracy over the three test systems. |
Wang, Ray Yu-Ruei; Kudryashev, Mikhail; Li, Xueming; Egelman, Edward H; Basler, Marek; Cheng, Yifan; Baker, David; DiMaio, Frank De novo protein structure determination from near-atomic-resolution cryo-EM maps. Journal Article Nature methods, 2015, ISSN: 1548-7105. @article{559, title = {De novo protein structure determination from near-atomic-resolution cryo-EM maps.}, author = { Ray Yu-Ruei Wang and Mikhail Kudryashev and Xueming Li and Edward H Egelman and Marek Basler and Yifan Cheng and David Baker and Frank DiMaio}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Wang_NatMethods_2015.pdf}, doi = {10.1038/nmeth.3287}, issn = {1548-7105}, year = {2015}, date = {2015-02-01}, journal = {Nature methods}, abstract = {We present a de novo model-building approach that combines predicted backbone conformations with side-chain fit to density to accurately assign sequence into density maps. This method yielded accurate models for six of nine experimental maps at 3.3- to 4.8-Å resolution and produced a nearly complete model for an unsolved map containing a 660-residue heterodimeric protein. This method should enable rapid and reliable protein structure determination from near-atomic-resolution cryo-electron microscopy (cryo-EM) maps.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We present a de novo model-building approach that combines predicted backbone conformations with side-chain fit to density to accurately assign sequence into density maps. This method yielded accurate models for six of nine experimental maps at 3.3- to 4.8-Å resolution and produced a nearly complete model for an unsolved map containing a 660-residue heterodimeric protein. This method should enable rapid and reliable protein structure determination from near-atomic-resolution cryo-electron microscopy (cryo-EM) maps. |
Rossi, Paolo; Shi, Lei; Liu, Gaohua; Barbieri, Christopher M; Lee, Hsiau-Wei; Grant, Thomas D; Luft, Joseph R; Xiao, Rong; Acton, Thomas B; Snell, Edward H; Montelione, Gaetano T; Baker, David; Lange, Oliver F; Sgourakis, Nikolaos G A hybrid NMR/SAXS-based approach for discriminating oligomeric protein interfaces using Rosetta Journal Article Proteins, 83 , pp. 309-17, 2015, ISSN: 1097-0134. @article{611, title = {A hybrid NMR/SAXS-based approach for discriminating oligomeric protein interfaces using Rosetta}, author = { Paolo Rossi and Lei Shi and Gaohua Liu and Christopher M Barbieri and Hsiau-Wei Lee and Thomas D Grant and Joseph R Luft and Rong Xiao and Thomas B Acton and Edward H Snell and Gaetano T Montelione and David Baker and Oliver F Lange and Nikolaos G Sgourakis}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/ahybridnmrsaxsbased_Baker2015.pdf}, doi = {10.1002/prot.24719}, issn = {1097-0134}, year = {2015}, date = {2015-02-01}, journal = {Proteins}, volume = {83}, pages = {309-17}, abstract = {Oligomeric proteins are important targets for structure determination in solution. While in most cases the fold of individual subunits can be determined experimentally, or predicted by homology-based methods, protein-protein interfaces are challenging to determine de novo using conventional NMR structure determination protocols. Here we focus on a member of the bet-V1 superfamily, Aha1 from Colwellia psychrerythraea. This family displays a broad range of crystallographic interfaces none of which can be reconciled with the NMR and SAXS data collected for Aha1. Unlike conventional methods relying on a dense network of experimental restraints, the sparse data are used to limit conformational search during optimization of a physically realistic energy function. 
This work highlights a new approach for studying minor conformational changes due to structural plasticity within a single dimeric interface in solution.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Oligomeric proteins are important targets for structure determination in solution. While in most cases the fold of individual subunits can be determined experimentally, or predicted by homology-based methods, protein-protein interfaces are challenging to determine de novo using conventional NMR structure determination protocols. Here we focus on a member of the bet-V1 superfamily, Aha1 from Colwellia psychrerythraea. This family displays a broad range of crystallographic interfaces none of which can be reconciled with the NMR and SAXS data collected for Aha1. Unlike conventional methods relying on a dense network of experimental restraints, the sparse data are used to limit conformational search during optimization of a physically realistic energy function. This work highlights a new approach for studying minor conformational changes due to structural plasticity within a single dimeric interface in solution. |
Pearson, Aaron D; Mills, Jeremy H; Song, Yifan; Nasertorabi, Fariborz; Han, Gye Won; Baker, David; Stevens, Raymond C; Schultz, Peter G Transition states. Trapping a transition state in a computationally designed protein bottle. Journal Article Science (New York, N.Y.), 347 , pp. 863-7, 2015, ISSN: 1095-9203. @article{561, title = {Transition states. Trapping a transition state in a computationally designed protein bottle.}, author = { Aaron D Pearson and Jeremy H Mills and Yifan Song and Fariborz Nasertorabi and Gye Won Han and David Baker and Raymond C Stevens and Peter G Schultz}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/Mills_Science_2015A.pdf}, doi = {10.1126/science.aaa2424}, issn = {1095-9203}, year = {2015}, date = {2015-02-01}, journal = {Science (New York, N.Y.)}, volume = {347}, pages = {863-7}, abstract = {The fleeting lifetimes of the transition states (TSs) of chemical reactions make determination of their three-dimensional structures by diffraction methods a challenge. Here, we used packing interactions within the core of a protein to stabilize the planar TS conformation for rotation around the central carbon-carbon bond of biphenyl so that it could be directly observed by x-ray crystallography. The computational protein design software Rosetta was used to design a pocket within threonyl-transfer RNA synthetase from the thermophile Pyrococcus abyssi that forms complementary van der Waals interactions with a planar biphenyl. This latter moiety was introduced biosynthetically as the side chain of the noncanonical amino acid p-biphenylalanine. 
Through iterative rounds of computational design and structural analysis, we identified a protein in which the side chain of p-biphenylalanine is trapped in the energetically disfavored, coplanar conformation of the TS of the bond rotation reaction.}, keywords = {}, pubstate = {published}, tppubtype = {article} } The fleeting lifetimes of the transition states (TSs) of chemical reactions make determination of their three-dimensional structures by diffraction methods a challenge. Here, we used packing interactions within the core of a protein to stabilize the planar TS conformation for rotation around the central carbon-carbon bond of biphenyl so that it could be directly observed by x-ray crystallography. The computational protein design software Rosetta was used to design a pocket within threonyl-transfer RNA synthetase from the thermophile Pyrococcus abyssi that forms complementary van der Waals interactions with a planar biphenyl. This latter moiety was introduced biosynthetically as the side chain of the noncanonical amino acid p-biphenylalanine. Through iterative rounds of computational design and structural analysis, we identified a protein in which the side chain of p-biphenylalanine is trapped in the energetically disfavored, coplanar conformation of the TS of the bond rotation reaction. |
Egelman, E H; Xu, C; DiMaio, F; Magnotti, E; Modlin, C; Yu, X; Wright, E; Baker, D; Conticello, V P Structural plasticity of helical nanotubes based on coiled-coil assemblies. Journal Article Structure (London, England : 1993), 23 , pp. 280-9, 2015, ISSN: 1878-4186. @article{609, title = {Structural plasticity of helical nanotubes based on coiled-coil assemblies.}, author = { E H Egelman and C. Xu and F. DiMaio and E Magnotti and C Modlin and X Yu and E Wright and D Baker and V P Conticello}, url = {http://www.bakerlab.org/wp-content/uploads/2015/12/structuralplasticity_Baker2015.pdf}, doi = {10.1016/j.str.2014.12.008}, issn = {1878-4186}, year = {2015}, date = {2015-02-01}, journal = {Structure (London, England : 1993)}, volume = {23}, pages = {280-9}, abstract = {Numerous instances can be seen in evolution in which protein quaternary structures have diverged while the sequences of the building blocks have remained fairly conserved. However, the path through which such divergence has taken place is usually not known. We have designed two synthetic 29-residue α-helical peptides, based on the coiled-coil structural motif, that spontaneously self-assemble into helical nanotubes in vitro. Using electron cryomicroscopy with a newly available direct electron detection capability, we can achieve near-atomic resolution of these thin structures. We show how conservative changes of only one or two amino acids result in dramatic changes in quaternary structure, in which the assemblies can be switched between two very different forms. 
This system provides a framework for understanding how small sequence changes in evolution can translate into very large changes in supramolecular structure, a phenomenon that may have significant implications for the de novo design of synthetic peptide assemblies.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Numerous instances can be seen in evolution in which protein quaternary structures have diverged while the sequences of the building blocks have remained fairly conserved. However, the path through which such divergence has taken place is usually not known. We have designed two synthetic 29-residue α-helical peptides, based on the coiled-coil structural motif, that spontaneously self-assemble into helical nanotubes in vitro. Using electron cryomicroscopy with a newly available direct electron detection capability, we can achieve near-atomic resolution of these thin structures. We show how conservative changes of only one or two amino acids result in dramatic changes in quaternary structure, in which the assemblies can be switched between two very different forms. This system provides a framework for understanding how small sequence changes in evolution can translate into very large changes in supramolecular structure, a phenomenon that may have significant implications for the de novo design of synthetic peptide assemblies. |