

ClusPro Help

The ClusPro Web Server software on this system is operational. It is currently being used to participate in the CAPRI competition as the first fully automated Protein-Protein Docking web server.

The ClusPro Algorithm:
  • The user can input the PDB codes of the crystal structures of their choice in the receptor and ligand fields, as well as any chain identifiers that they would like to use.

    ***Note that some crystal structures have more than one crystal in the unit cell. In these cases, it is imperative to specify the chain(s) of only one of the crystals***

    The user also has the option to upload the PDB files from a local machine.

  • Once the PDB files have been uploaded to the server, they are processed into the input files required by DOT or ZDOCK and minimized with CHARMM for 100 steps with a constrained backbone. The minimized PDB files are then transferred to a supercomputer, where DOT/ZDOCK docking, filtering, and clustering are run.


    DOCKING
    Docking with DOT is currently done using DOT 1.0 with a 1 Angstrom grid spacing (which is adjusted automatically to 2 Angstroms for very large proteins).
    During docking with DOT, only the shape complementarity function is used. We do not incorporate any electrostatics into the docking step. Here, the top 20,000 structures are retained.
    Docking with ZDOCK is done using ZDOCK v.2.3, and the scoring function is Pairwise Shape Complementarity + Desolvation + Electrostatics. The top 2,000 conformations are retained in a ZDOCK run.

    ENERGY + FILTERING
    The complexes generated in the docking step are then rapidly screened: the desolvation and electrostatic energies of each complex are calculated, and ClusPro retains the top 2,000 structures, regardless of the docking software used. By default, ClusPro keeps the 1,500 conformations with the best electrostatic energies and the 500 conformations with the best desolvation energies.
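
    The default filter described above can be sketched as follows. This is a minimal illustration, not the server's code. The dictionary keys (id, elec, desolv), the function name, the convention that a lower energy is more favorable, and the rule that a decoy already selected on electrostatics is skipped in the desolvation pass are all assumptions made for the example.

```python
def filter_decoys(decoys, n_elec=1500, n_desolv=500):
    """Keep the best decoys by electrostatics, then the best by desolvation.

    Each decoy is a dict with hypothetical keys 'id', 'elec', and 'desolv'.
    Lower energy is assumed to be better.
    """
    # Best electrostatic conformations first.
    by_elec = sorted(decoys, key=lambda d: d["elec"])[:n_elec]
    chosen = {d["id"] for d in by_elec}

    # Best desolvation conformations among those not already selected
    # (assumption: duplicates are skipped so that the total stays fixed).
    by_desolv = [
        d for d in sorted(decoys, key=lambda d: d["desolv"])
        if d["id"] not in chosen
    ][:n_desolv]

    return by_elec + by_desolv
```

    With the defaults, a pool of 20,000 DOT decoys or 2,000 ZDOCK decoys is reduced to at most 2,000 structures.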

    PAIRWISE BINDING SITE RMSD CALCULATION
    For each of the 2,000 ligand conformations retained in the filtering stage, we identify the residues of that ligand that have at least one atom within 10 Angstroms of any receptor atom. These residues are considered the binding site of that ligand. Then, the RMSD of that binding site is calculated between that particular conformation and each of the other conformations. This yields a 2000x2000 matrix R, and it is important to note that, in general, Rij != Rji.

    For example, suppose the binding site of ligand.1 is defined by residues 20-50 and 100-135. We extract only the C-alpha atoms of these residues from each of the 2000 ligands and calculate the RMSD against ligand.1. The value obtained for ligand.2 is put into R12. If ligand.2 has its binding site defined by residues 40-95, we do the same using those residues, and the value obtained for ligand.1 is put into R21. As you can see, Rij does not equal Rji, because different residues are taken into account.
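
    The construction of R can be sketched as below. This is an illustration under stated assumptions, not the server's implementation: each conformation is a dict with a hypothetical 'atoms' list of (residue id, xyz) pairs and a 'ca' mapping from residue id to C-alpha coordinates, all conformations are assumed to share the receptor's coordinate frame so that no superposition is performed, and a conformation with an empty binding site is given an infinite RMSD.

```python
import math


def binding_site(ligand_atoms, receptor_atoms, cutoff=10.0):
    """Residue ids with at least one atom within `cutoff` of any receptor atom."""
    site = set()
    for res_id, xyz in ligand_atoms:
        if any(math.dist(xyz, r) <= cutoff for r in receptor_atoms):
            site.add(res_id)
    return sorted(site)


def site_rmsd(site, ca_i, ca_j):
    """RMSD over the C-alpha atoms of the residues in `site` (no superposition)."""
    squared = [math.dist(ca_i[res], ca_j[res]) ** 2 for res in site]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_matrix(conformations, receptor_atoms, cutoff=10.0):
    """R[i][j] uses the binding site of conformation i, so R[i][j] != R[j][i] in general."""
    n = len(conformations)
    R = [[0.0] * n for _ in range(n)]
    for i, conf_i in enumerate(conformations):
        site = binding_site(conf_i["atoms"], receptor_atoms, cutoff)
        for j, conf_j in enumerate(conformations):
            if i == j:
                continue
            # Assumption: no contact residues means the pair is never a neighbor.
            R[i][j] = site_rmsd(site, conf_i["ca"], conf_j["ca"]) if site else math.inf
    return R
```

    Row i of the matrix is always computed with the residues selected for conformation i, which is where the asymmetry comes from.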

    CLUSTERING
    The Clustering is done using a greedy algorithm. Here, we assume that the native binding site is a broad, deep free energy well, while there are local minima scattered throughout the free energy landscape. The number of structures in each energy well should, therefore, be proportional to the size of the well, validating our use of a greedy clustering algorithm.

    First, in our 2000x2000 RMSD matrix, we find the structure that has the most neighbors under the clustering radius, and call this the first cluster center. We then extract all of those neighbors from the RMSD matrix, and place them in the first cluster. We then go back to the RMSD matrix, and find the structure with the next highest number of neighbors, and call that the second cluster center, and so on.

    The models returned by ClusPro are ranked according to the size of the cluster based on our assumption of how the free energy wells are populated.
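
    A minimal sketch of the greedy procedure described above is given here. It is not the server's code. The function name and the max_clusters parameter are hypothetical, the clustering radius is left as an argument because its value is not stated here, and neighbor counts are assumed to be recomputed among the structures that remain after each cluster is removed.

```python
def greedy_cluster(R, radius, max_clusters=None):
    """Greedy clustering on a (possibly asymmetric) RMSD matrix R.

    Returns a list of (center, members) tuples in the order they were found,
    which is also the order of non-increasing cluster size.
    """
    remaining = set(range(len(R)))
    clusters = []
    while remaining and (max_clusters is None or len(clusters) < max_clusters):
        best, best_neighbors = None, []
        # Visit candidates in index order so that ties resolve to the lowest index.
        for i in sorted(remaining):
            neighbors = [j for j in remaining if j != i and R[i][j] < radius]
            if best is None or len(neighbors) > len(best_neighbors):
                best, best_neighbors = i, neighbors
        clusters.append((best, sorted(best_neighbors)))
        # Remove the center and its neighbors before the next round.
        remaining -= {best, *best_neighbors}
    return clusters
```

    Because every later center is chosen from a smaller pool, the clusters come out already ordered by size, which matches the ranking of the returned models.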


    HOMO-MULTIMERIC DOCKING
    The same docking and filtering steps are used as in the standard version of ClusPro. However, for each of the decoys generated, we determine whether or not it fits the criteria for being symmetric. That is, we calculate the rotation and translation matrix between the receptor and the docked ligand conformation. We then apply the same transformation to the docked ligand to generate a third structure. This process continues until N+1 structures are generated. A ligand is considered symmetric if the 1st and (N+1)th structures have an RMSD of less than 8 Angstroms and there are no significant overlaps between the 1st and Nth structures.
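
    The closure part of this test can be sketched as follows. This is an illustration only: the function names are hypothetical, the rotation matrix and translation vector that map the receptor onto the docked ligand are assumed to be already known, and the overlap check between the 1st and Nth structures is left out.

```python
import math


def apply_transform(coords, rot, trans):
    """Apply x -> rot * x + trans to every point (rot is 3x3, trans has length 3)."""
    return [
        tuple(sum(rot[r][c] * p[c] for c in range(3)) + trans[r] for r in range(3))
        for p in coords
    ]


def rmsd(a, b):
    """RMSD between two equally ordered coordinate lists, without superposition."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def closure_rmsd(receptor, rot, trans, n):
    """RMSD between structure 1 (the receptor) and structure N+1.

    Structure 2 is the docked ligand (one application of the transform),
    so structure N+1 is reached after n applications.
    """
    coords = receptor
    for _ in range(n):
        coords = apply_transform(coords, rot, trans)
    return rmsd(receptor, coords)


def is_symmetric(receptor, rot, trans, n, cutoff=8.0):
    """True if the N+1 structures close back onto the first within `cutoff`."""
    return closure_rmsd(receptor, rot, trans, n) < cutoff
```

    For an exact N-fold arrangement, N applications of the transform return the receptor to its starting position, so the closure RMSD is close to zero.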

    The top symmetric structures are then clustered amongst themselves to eliminate any redundant structures. The structure with the tightest symmetry (i.e., the lowest RMSD between structures 1 and N+1) is chosen as the cluster center.

    The cluster centers are then clustered against the top 2000 energetically favorable conformations. The number of energetically favorable neighbors is then used as the score for the symmetric cluster center. The cluster centers are then ranked according to these scores.

  • Special Cases: Dimer of Dimers, Dimer of Trimers, and Trimer of Dimers
    ClusPro is able to discriminate between these special cases of tetramers and hexamers and the regular N-mer formations. For a detailed description of the algorithm, please view our paper in the Journal of Structural Biology (currently In Press).

    Methodology developed by CJ Camacho, DW Gatchell, and S Vajda.
    Server developed by SR Comeau.

    For Methodology, please cite:

  • CJ Camacho, DW Gatchell, SR Kimura, S Vajda. Scoring docked conformations generated by rigid body docking, Proteins: Structure, Function, and Genetics, 40, 525-537 (2000)

  • CJ Camacho, DW Gatchell. Successful Discrimination of protein interactions, Proteins: Structure, Function, and Genetics, 52(1), 92-97 (2003)

  • SR Comeau, DW Gatchell, S Vajda, CJ Camacho. ClusPro: an automated docking and discrimination method for the prediction of protein complexes. Bioinformatics, 20, 45-50 (2004)


    For Software References, please cite:

  • B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations, J. Comp. Chem. 4, 187-217 (1983)

  • A. D. MacKerell, Jr., B. Brooks, C. L. Brooks, III, L. Nilsson, B. Roux, Y. Won, and M. Karplus, CHARMM: The Energy Function and Its Parameterization with an Overview of the Program, The Encyclopedia of Computational Chemistry, 1, 271-277, P. v. R. Schleyer et al., editors (John Wiley & Sons: Chichester, 1998)

  • Chen R., Li L. and Weng Z., ZDOCK: An Initial-stage Protein Docking Algorithm, Proteins 2003 (in press)

  • Chen R. & Weng Z., Docking Unbound Proteins Using Shape Complementarity, Desolvation, and Electrostatics, Proteins 2002;47:281-294

  • Ten Eyck LF, Mandell J, Roberts VA, Pique ME. Surveying molecular interactions with DOT. In: Proceedings of the 1995 ACM/IEEE Supercomputing Conference. (ed. Hayes, A.& Simmons, M.) (ACM Press, New York, 1995)

  • Vakser, IA. Protein docking for low-resolution structures, Protein Eng. 1995;8:371-377

    Thank you for using ClusPro
