Nature 450, 259-64.
A longstanding problem in computational biology is refining low-resolution protein structure models into structures that are accurate at the atomic level. A related challenge is refining low-resolution NMR models to the quality of high-resolution structures. NMR is a valuable tool for determining protein structures, particularly because it does not require crystals. However, some NMR structures, especially those determined from insufficient restraints or misinterpreted data, can be incorrect. In addition, the core of an NMR structure is often under-packed, possibly because of overlapping spectra. To tackle both challenges, comparative model refinement and NMR structure refinement, we have been developing the Rosetta high-resolution refinement protocol. This protocol focuses sampling on the regions of the structure that are most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field.
A stringent test of the accuracy of a protein structure model is molecular replacement. Molecular replacement solves the crystallographic phase problem by estimating the phases from a model. To succeed, however, the model has to be very close to the structure being solved (typically < 1.5 Å). Comparative models used successfully for molecular replacement generally come from templates that share > 50% sequence identity with the native sequence. We have shown that models made by the Rosetta high-resolution refinement protocols, starting from comparative models (< 20% sequence identity) and from NMR structures, consistently provide good molecular replacement solutions.
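The two quantities quoted above, sequence identity and distance from the solved structure, can be illustrated with a minimal sketch. This is not part of Rosetta or of any crystallographic package. It assumes that closeness is measured as a coordinate RMSD over matched atoms that have already been superimposed, and that identity is counted over aligned, ungapped columns. The function names and the `MR_RMSD_CUTOFF` constant are hypothetical, chosen only for illustration.

```python
import math

# Illustrative cutoff in angstroms, taken from the "typically < 1.5 Å" figure in the text.
MR_RMSD_CUTOFF = 1.5


def sequence_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: other definitions of the denominator exist; this is only one of them.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) tuples.

    Assumption: the two coordinate sets are already optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def close_enough_for_mr(model, native, cutoff=MR_RMSD_CUTOFF):
    """Rough screen only: True if the model lies within the cutoff of the native."""
    return rmsd(model, native) < cutoff


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(sequence_identity("ACDE-F", "ACGEHF"))
    print(rmsd(model, native), close_enough_for_mr(model, native))
```

A check like this is only a rough screen: whether molecular replacement actually succeeds depends on the diffraction data and the search software, not on an RMSD threshold alone.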