ARCH_Pred at bioinsilico \dot\ org

ARCH_Pred server

Information

ARCH_Pred is a web application that provides an interface to the knowledge-based loop structure prediction method described in this publication. Given a query loop of unknown structure, ARCH_Pred identifies the most suitable loops from a library of protein loop structures. The prediction algorithm shown below includes 3 major steps: Selection, Filtering and Ranking.

Selection

Filtering

Ranking


Submission form

Prediction parameters

The first section of the submission form allows users to upload the coordinates of the protein structure containing the missing loop(s). The atomic coordinates must be in standard PDB format. Users have to define the starting residue of the missing loop and the chain ID.

The sequence of the missing loop is also needed, as a string of amino acids in single-letter code, e.g. HDAD for a missing loop of length 4 with the sequence His-Asp-Ala-Asp. The type of flanking secondary structures, i.e. the type of regular secondary structure that flanks the missing loop on each side, is also required. The possibilities are:

From left to right: alpha-loop-alpha; alpha-loop-beta; beta-loop-alpha; beta-loop-beta (hairpin); beta-loop-beta (link). Note that both hairpins and links are considered beta-loop-beta motifs.

Finally, users select the number of loop structures to be returned by the prediction (10 maximum) and the Z-score cutoff.
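The form fields described above can be checked before submission. The following is a minimal Python sketch, not part of the server: the names describe_loop, check_request, FLANK_TYPES and MAX_MODELS are hypothetical, and the flanking-type labels are simply the five possibilities listed above written as strings. The Z-score cutoff is left out because its accepted range is not stated here.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Hypothetical labels for the five flanking types listed in the form.
FLANK_TYPES = (
    "alpha-loop-alpha",
    "alpha-loop-beta",
    "beta-loop-alpha",
    "beta-loop-beta (hairpin)",
    "beta-loop-beta (link)",
)

MAX_MODELS = 10  # the form returns at most 10 loop structures


def describe_loop(sequence):
    """Return (length, three-letter sequence) for a single-letter loop string."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("loop sequence is empty")
    unknown = sorted(set(seq) - set(ONE_TO_THREE))
    if unknown:
        raise ValueError("unknown residue code(s): " + ", ".join(unknown))
    return len(seq), "-".join(ONE_TO_THREE[aa] for aa in seq)


def check_request(sequence, flank_type, n_models):
    """Validate the loop sequence, flanking type and number of models."""
    length, three_letter = describe_loop(sequence)
    if flank_type not in FLANK_TYPES:
        raise ValueError("unknown flanking type: " + flank_type)
    if not 1 <= n_models <= MAX_MODELS:
        raise ValueError("number of models must be between 1 and 10")
    return {
        "length": length,
        "sequence": three_letter,
        "flanks": flank_type,
        "models": n_models,
    }


print(describe_loop("HDAD"))  # (4, 'His-Asp-Ala-Asp')
```

The loop length is derived from the sequence string, so it does not need to be entered separately.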

IMPORTANT The numbering of the uploaded coordinates must be consistent and in agreement with the missing loop. For instance, if predicting a missing loop of length 14 that starts at residue 17 of chain A, then the PDB file should look like this:

...
ATOM    826  N   PHE A  14      49.206  19.282  50.320  1.00 54.69           N  
ATOM    827  CA  PHE A  14      48.606  19.107  51.603  1.00 55.79           C  
ATOM    828  C   PHE A  14      47.836  17.809  51.723  1.00 56.22           C  
ATOM    829  O   PHE A  14      48.078  17.051  52.638  1.00 55.30           O  
ATOM    830  CB  PHE A  14      47.690  20.257  51.882  1.00 56.82           C 
ATOM    837  N   LEU A  15      46.893  17.551  50.827  1.00 57.29           N  
ATOM    838  CA  LEU A  15      46.182  16.280  50.875  1.00 59.26           C  
ATOM    839  C   LEU A  15      47.047  15.013  50.740  1.00 59.98           C  
ATOM    840  O   LEU A  15      46.727  13.971  51.280  1.00 60.70           O  
ATOM    845  N   GLU A  16      48.121  15.088  49.997  1.00 60.70           N 
ATOM    846  CA  GLU A  16      48.980  13.935  49.837  1.00 61.99           C  
ATOM    847  C   GLU A  16      49.482  13.456  51.186  1.00 62.42           C  
ATOM    848  O   GLU A  16      49.617  12.256  51.424  1.00 62.45           O  
ATOM    946  N   VAL A  31      25.144  19.607  51.305  1.00 51.18           N  
ATOM    947  CA  VAL A  31      25.010  20.708  52.246  1.00 52.08           C  
ATOM    948  C   VAL A  31      23.603  20.835  52.761  1.00 52.94           C  
ATOM    949  O   VAL A  31      22.705  21.156  52.037  1.00 54.54           O  
ATOM    953  N   THR A  32      23.403  20.559  54.023  1.00 54.13           N  
ATOM    954  CA  THR A  32      22.075  20.604  54.610  1.00 55.49           C  
ATOM    955  C   THR A  32      21.767  21.988  55.149  1.00 56.13           C  
ATOM    956  O   THR A  32      20.606  22.329  55.356  1.00 56.55           O  
...

If the numbering is not consistent, coordinates will be deleted when the PDB file is parsed.
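The numbering requirement can be checked locally before uploading. The sketch below is an illustration under stated assumptions, not the server's own parser: the function names are hypothetical, and it assumes that "consistent" means the loop residue numbers are absent from the chain while the residues immediately before and after the gap are present. It reads the chain ID and residue number from the fixed columns of standard PDB ATOM records.

```python
def residue_numbers(pdb_lines, chain_id):
    """Collect the residue numbers found in ATOM records of one chain."""
    numbers = set()
    for line in pdb_lines:
        # Standard PDB columns: chain ID in column 22, residue number in 23-26.
        if line.startswith("ATOM") and len(line) >= 26 and line[21] == chain_id:
            numbers.add(int(line[22:26]))
    return numbers


def loop_gap_is_consistent(pdb_lines, chain_id, loop_start, loop_length):
    """Check that the numbering leaves a gap matching the missing loop.

    Assumption: the loop residues must be absent and both flanking
    residues (loop_start - 1 and loop_start + loop_length) must be present.
    """
    present = residue_numbers(pdb_lines, chain_id)
    loop = range(loop_start, loop_start + loop_length)
    gap_is_empty = not any(number in present for number in loop)
    stems_present = (loop_start - 1) in present and (loop_start + loop_length) in present
    return gap_is_empty and stems_present
```

For the example above (loop of length 14 starting at residue 17 of chain A), the check passes when residues 17 to 30 are missing and residues 16 and 31 are present.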

Post-prediction optimization

ARCH_Pred allows post-prediction optimization of the predicted loops. This includes grafting each loop into the protein structure and refining it using MODELLER. In this case, the server returns individual protein structures with grafted loops. If this option is unchecked, the server returns the coordinates of the protein and the candidate loops in an NMR-style PDB file, where MODEL 1 is the frame protein and MODEL 2 and above are the candidate loop structures (main-chain traces).
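An NMR-style file of this kind can be separated into the frame protein and the candidate loops by reading the MODEL and ENDMDL records. The following is a minimal sketch with hypothetical function names; it assumes the file uses standard MODEL/ENDMDL records and keeps only ATOM and HETATM lines.

```python
def split_models(pdb_text):
    """Map each MODEL serial number to the list of its coordinate lines."""
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            # The model serial number follows the record name.
            current = int(line[6:].split()[0])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif current is not None and record in ("ATOM", "HETATM"):
            models[current].append(line)
    return models


def frame_and_loops(pdb_text):
    """Return (frame protein lines, list of candidate loop line lists)."""
    models = split_models(pdb_text)
    frame = models.get(1, [])  # MODEL 1 is the frame protein
    loops = [models[n] for n in sorted(models) if n >= 2]  # MODEL 2 and above
    return frame, loops
```

Each returned loop is a list of coordinate lines that can be written to its own file or passed to a structure viewer.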

Results web page

If the prediction is successful, the web server returns a web page similar to the one shown below. Should an error occur during the prediction, the web page shows a specific error message and a link to the prediction log file, which can be downloaded for inspection.

The top part of the page shows the prediction parameters selected by the user, including the Job ID, loop location (start, chain) and sequence, the number of generated loop models, and whether the loops were grafted into the protein using MODELLER.

The second part is the list of generated models ranked by Z-score. The table also shows the template used for loop modeling and a link to the coordinate files.

Finally, there is a section with links to a number of files: the file uploaded by the user, an NMR-style PDB file containing all template loops that passed the RMSD filtering step superposed onto the stem residues of the query structure, and a prediction log file.