Proteins are among the most abundant molecules in living organisms. They carry out a variety of crucial functions, such as transporting molecules (e.g. hemoglobin), catalyzing reactions (e.g. enzymes), replicating DNA, identifying and neutralizing foreign bacteria and viruses (e.g. antibodies), and many more. Predicting protein structures can help us understand how proteins function, create new drugs that target them, and understand how mutations affect them.
We are continuously developing novel structure prediction methods and integrating them into a fully automated protein structure prediction pipeline, RBO Aleph.
A web interface to RBO Aleph is available here: http://compbio.robotics.tu-berlin.de/rbo_aleph/
Predicting protein contacts by combining information from sequence and physicochemistry (EPSILON-CP)
Contact prediction is an intermediate step towards
solving the protein structure prediction problem. Contact prediction
methods identify residue pairs that are close in space in the native
structure. Knowledge of the contact map can then be used to guide ab
initio methods and to reconstruct the 3D structure of a protein. Due
to the size of the search space, contact prediction remains a hard
problem. To make the problem tractable, information is given as
priors to the model to constrain the search space. Currently, many
different information sources are used in contact prediction. We want
to exploit the different profiles to alleviate potential
We developed a novel contact prediction method (EPSILON-CP) that combines evolutionary, sequence-based and physicochemical information. The physicochemical information stems from EPC-map (see compbio.robotics.tu-berlin.de/epc-map/), a method developed by Michael Schneider as part of his PhD that ranked 2nd for long+medium range contacts and 5th for long-range contacts in CASP11. EPSILON-CP uses a deep neural network to combine these information sources effectively. A key contribution is the refined feature set with drastically reduced dimensionality. EPSILON-CP ranked 5th in the final ranking of the CASP12 contact prediction assessment (group name RBO-EPSILON).
The web server can be found here. EPSILON-CP is also part of the structure prediction pipeline RBO Aleph, which is continuously evaluated in CAMEO.
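To make the notion of a contact concrete, here is a minimal Python sketch that derives a contact map from known 3D coordinates. It is not part of EPSILON-CP, which predicts contacts from sequence-derived features. The 8 Angstrom cutoff, the minimum sequence separation of 6, and the short/medium/long bins (6-11, 12-23, 24 and above) are commonly used conventions assumed here, and the function names and the toy chain are hypothetical.

```python
import math

# Assumption: two residues count as "in contact" when their representative
# atoms (one 3D point per residue here) lie within 8 Angstroms of each other.
CONTACT_CUTOFF = 8.0


def contact_map(coords, cutoff=CONTACT_CUTOFF, min_separation=6):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    Pairs closer than `min_separation` along the sequence are skipped,
    because they are near each other in space almost by construction.
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def contact_range(i, j):
    """Classify a pair by sequence separation (assumed bin boundaries)."""
    separation = abs(i - j)
    if separation >= 24:
        return "long"
    if separation >= 12:
        return "medium"
    if separation >= 6:
        return "short"
    return "local"


# Hypothetical toy chain: ten residues folded into a U shape, so that the
# two ends of the chain come close together in space.
hairpin = [(i * 3.8, 0.0, 0.0) for i in range(5)] + [
    ((9 - i) * 3.8, 5.0, 0.0) for i in range(5, 10)
]

for i, j in sorted(contact_map(hairpin)):
    print(i, j, contact_range(i, j))
```

A predictor works in the opposite direction: it estimates this set of pairs without access to the coordinates, and the predicted pairs then serve as constraints during structure search.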
Contact: Kolja Stahl 
Topology-Based Search for Protein Structure Prediction
The major challenge of ab initio protein structure prediction is the huge conformational space of large proteins, which has to be sampled in order to find the native structure. Due to the size of this space, the probability of sampling from the vicinity of the native conformation is low. But is it really necessary to consider all possible conformations while searching?
Despite having diverse shapes and functions, proteins only populate a tiny part of the space of possible conformations. Our goal is to leverage our knowledge about these populated topologies to guide the search. We strongly believe that using this information during sampling will alleviate many of the problems arising from the size of the conformational space. This in turn should allow us to predict the structures of many proteins that ab initio methods have traditionally been unable to solve.
Contact: Mahmoud Mabrouk 
Model-Based Search for Protein Structure Prediction
Model-based search (MBS) is
our basic method for efficient conformational search in structure
prediction. Typically, conformational search in structure prediction
is uninformed and proceeds by executing many Monte Carlo simulations,
pooling the results and selecting the best solutions. In contrast, MBS
tries to gain information about the underlying energy landscape during
search. Each conformation that is sampled by MBS is considered as a
sample on the energy landscape. MBS analyzes the quality and the
distribution of the samples to identify "funnels" in the
energy landscape, regions that are likely to contain the native state.
As MBS progresses, it gradually refines its model of the energy
landscape and allocates computational resources to regions that are
MBS forms the basic algorithm of all our structure prediction efforts. We used a new implementation of the algorithm that was first introduced in CASP8. The algorithm is most suitable for "free modeling", which is the modeling of protein structures that cannot be modeled by exploiting the sequence-structure similarities to other proteins.
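The difference between uninformed sampling and a search that learns about the energy landscape can be illustrated with a toy Python sketch. This is not the MBS implementation. It assumes a hypothetical one-dimensional energy function with one deep and one shallow funnel, uses fixed-width regions as a crude stand-in for the model of the landscape, and reallocates samples with a Boltzmann-style weighting. All names and parameters are illustrative.

```python
import math
import random


def energy(x):
    """Hypothetical 1D landscape: deep funnel near x = 7, shallow one near x = 2."""
    return -3.0 * math.exp(-((x - 7.0) ** 2)) - 1.0 * math.exp(-((x - 2.0) ** 2))


def toy_model_based_search(n_rounds=5, samples_per_round=200, n_regions=10,
                           lo=0.0, hi=10.0, temperature=0.5, seed=0):
    """Sample, update a per-region model, and focus on promising regions."""
    rng = random.Random(seed)
    width = (hi - lo) / n_regions
    best_in_region = [float("inf")] * n_regions  # the "model" of the landscape
    weights = [1.0] * n_regions                  # first round is uninformed
    best_x, best_e = None, float("inf")

    for _ in range(n_rounds):
        # Allocate this round's samples to regions according to the model.
        chosen = rng.choices(range(n_regions), weights=weights, k=samples_per_round)
        for region in chosen:
            x = lo + (region + rng.random()) * width
            e = energy(x)
            best_in_region[region] = min(best_in_region[region], e)
            if e < best_e:
                best_x, best_e = x, e
        # Refine the model: low-energy regions receive more weight next round.
        # Regions that were never sampled keep a neutral weight of 1.0.
        weights = [
            math.exp(-b / temperature) if b != float("inf") else 1.0
            for b in best_in_region
        ]
    return best_x, best_e, weights


best_x, best_e, weights = toy_model_based_search()
print(f"best x = {best_x:.2f}, energy = {best_e:.2f}")
```

After the first round, most samples are drawn from the regions around the deep funnel, so the same sampling budget explores it far more densely than pooled independent runs would.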
In CASP10, our server RBO-MBS ranked 10th out of 68 automatic servers that participated in the free-modeling category.
In CASP11, our server RBO-Aleph ranked 3rd out of 44 automatic servers that participated in the free-modeling category.
Contact: Mahmoud Mabrouk , Kolja Stahl 
Protein Structure Determination using Cross-linking/Mass Spectrometry and Computational Biology
We are developing novel ways to determine protein structure using a combination of chemistry and computation. In this project, we developed a highly reactive photochemistry that increases the number of cross-links 17-fold over earlier, low-resolution cross-linking approaches. We combine these data with conformational space search algorithms in a "hybrid" approach to determine protein structure.
We demonstrated the potential of this method by determining the structure of human serum albumin domains in the context of human blood serum. This demonstrates the possibility of determining the structure of proteins in the complex biological contexts in which they function and which they may require for correct folding.
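One simple way cross-link data can inform a conformational search is as distance restraints: a cross-link between two residues suggests that they lie within some maximum distance in the folded protein. The Python sketch below scores candidate models by the fraction of cross-links they satisfy. It is an illustration under assumptions, not the scoring used in this project. The 25 Angstrom bound is a placeholder that depends on the cross-linker chemistry, and the function names, models and cross-links are hypothetical.

```python
import math

# Assumption: placeholder upper bound (in Angstroms) on the distance between
# two cross-linked residues; the appropriate value depends on the chemistry.
MAX_CROSSLINK_DISTANCE = 25.0


def satisfied_fraction(coords, cross_links, max_distance=MAX_CROSSLINK_DISTANCE):
    """Fraction of cross-linked residue pairs that lie within max_distance."""
    if not cross_links:
        return 0.0
    satisfied = sum(
        1 for i, j in cross_links
        if math.dist(coords[i], coords[j]) <= max_distance
    )
    return satisfied / len(cross_links)


def rank_models(models, cross_links, max_distance=MAX_CROSSLINK_DISTANCE):
    """Return model names sorted from most to least consistent with the data."""
    return sorted(
        models,
        key=lambda name: satisfied_fraction(models[name], cross_links, max_distance),
        reverse=True,
    )


# Hypothetical candidate models for a 20-residue chain (one point per residue).
extended = [(i * 3.8, 0.0, 0.0) for i in range(20)]
compact = [(i * 3.8, 0.0, 0.0) for i in range(10)] + [
    ((19 - i) * 3.8, 6.0, 0.0) for i in range(10, 20)
]
models = {"extended": extended, "compact": compact}

# Hypothetical cross-links, given as pairs of residue indices.
cross_links = [(0, 19), (2, 17), (5, 9)]

for name in rank_models(models, cross_links):
    print(name, round(satisfied_fraction(models[name], cross_links), 2))
```

In a hybrid approach, a term of this kind would be combined with an energy function during search rather than used only to rank finished models.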
Contact: Mahmoud Mabrouk , Kolja Stahl 
- Alexander von Humboldt professorship - awarded by the Alexander von Humboldt Foundation and funded through the Ministry of Education and Research (BMBF), July 2009 - June 2014
- Predicting Protein Structure with Guided Conformation Space Search - funded by the National Institutes of Health (NIH), award number 5R01 GM076706, August 2006 - May 2013