Identifying near-native multi-fragment sequence alignments in protein structure prediction (Fabian Salomon)
To date, commonly used fragment libraries contain only relatively small, independent fragments. Consequently, these libraries can model only the sequentially local context and cannot capture structurally conserved regions that are sequentially discontiguous. We therefore developed a library of so-called "building blocks". A building block is a set of structurally contiguous, sequentially discontiguous fragments found in two or more proteins.
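To make the definition concrete, the following is a minimal sketch of how a building block could be represented. All names (Fragment, BuildingBlock, fragments_in_contact) and the 8 Å C-alpha contact cutoff are illustrative assumptions, not the data model or the contiguity criterion used in this work.

```python
from dataclasses import dataclass
from math import dist
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class Fragment:
    """A run of consecutive residues, given by inclusive residue indices."""
    start: int
    end: int

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end must not precede start")


@dataclass(frozen=True)
class BuildingBlock:
    """A set of fragments that belong together (hypothetical representation)."""
    fragments: Tuple[Fragment, ...]

    def is_sequentially_discontiguous(self) -> bool:
        # True if at least one residue lies between every pair of
        # neighbouring fragments along the sequence.
        ordered = sorted(self.fragments, key=lambda f: f.start)
        return len(ordered) >= 2 and all(
            b.start - a.end > 1 for a, b in zip(ordered, ordered[1:])
        )


def fragments_in_contact(a: Fragment, b: Fragment,
                         ca_coords: Sequence[Coord],
                         cutoff: float = 8.0) -> bool:
    # Assumed proxy for structural contiguity: some pair of C-alpha atoms,
    # one from each fragment, lies within `cutoff` angstroms.
    return any(
        dist(ca_coords[i], ca_coords[j]) <= cutoff
        for i in range(a.start, a.end + 1)
        for j in range(b.start, b.end + 1)
    )
```

In this sketch, a block made of residues 0-2 and 10-12 is sequentially discontiguous, and it would count as structurally contiguous only if the two fragments pass the assumed contact check in the folded structure.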
Existing methods for scoring a sequence alignment can score each fragment only independently or as one contiguous sequence that includes the residues between the fragments. They therefore do not make optimal use of the additional information that building blocks provide.
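The two existing options can be illustrated with a toy example. The sketch below assumes an ungapped alignment and a simple identity score (+1 for a match, -1 for a mismatch); the function names and the scoring scheme are hypothetical and only show which residues each option takes into account.

```python
def match_score(a: str, b: str) -> int:
    # Toy identity scoring (assumption): +1 per identical pair, -1 otherwise.
    return sum(1 if x == y else -1 for x, y in zip(a, b))


def score_independently(query: str, template: str, fragments):
    # Each fragment (inclusive start/end indices) gets its own score;
    # nothing links the scores of fragments from the same building block.
    return [match_score(query[s:e + 1], template[s:e + 1])
            for s, e in fragments]


def score_contiguously(query: str, template: str, fragments) -> int:
    # The whole span from the first to the last fragment is scored,
    # so the residues between the fragments contribute as well.
    start = min(s for s, _ in fragments)
    end = max(e for _, e in fragments)
    return match_score(query[start:end + 1], template[start:end + 1])


query = "ACDEFGHIKL"
template = "ACDXXXXIKL"
fragments = [(0, 2), (7, 9)]

independent = score_independently(query, template, fragments)
contiguous = score_contiguously(query, template, fragments)
```

Here both fragments match perfectly when scored independently, while the contiguous score is lowered by the four unrelated residues in between. Neither option scores the two fragments jointly as parts of one building block.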
Description of Work
We explore how knowledge of the dependency between the fragments of a building block can be exploited for a more specific scoring scheme. To this end, we examine a number of features that allow a coarse distinction between structural matches and false positives. Finally, we evaluate the combined discriminative power of these features using three different machine learning algorithms.
Judging from the performance on CASP9 targets, the proposed setup works well for template-based modeling targets. We were able to cover 61 targets (15 more than the control) with 100% near-native building block matches. The proposed setup and the control covered roughly the same number of residues with near-native matches.