Identifying near-native multi-fragment sequence alignments in protein structure prediction (Fabian Salomon)
To date, commonly used fragment libraries contain only relatively
small, independent fragments. Consequently, these libraries can only
model the (sequentially) local context, but cannot model structurally
conserved regions that are sequentially discontiguous. We therefore
developed a library of so-called “building blocks”. A building
block is a set of structurally contiguous, sequentially discontiguous
fragments found in two or more proteins.
Existing methods of scoring a sequence alignment can only score each fragment independently or as a contiguous sequence (including the in-between parts). They therefore do not make full use of the additional information that building blocks provide.
Description of Work
We explore how knowledge of the dependencies between building block fragments can be exploited for a more specific scoring scheme. To achieve this, we examine a number of different features that allow a coarse distinction between structural matches and false positives. Ultimately, we evaluate the combined discriminative power of these features through the lens of three different machine learning algorithms.
Judging from the performance on CASP9 targets, the proposed setup works well
for template-based modeling targets. We were able to cover 61 targets
(15 more than the control) with 100 % near-native building block
matches. Both the proposed setup and the control covered roughly the
same number of residues with near-native matches.