About PoSSuM Database


Today, a vast number of protein-small molecule binding sites can be found in the Protein Data Bank (PDB). Comparing them exhaustively is computationally demanding, but useful for predicting protein function and for drug discovery. We proposed a very fast algorithm called "SketchSort" that enables the enumeration of similar pairs among a huge number of protein-ligand binding sites. Using this method, we conducted all-pair similarity searches for 3.4 million known and potential binding sites and discovered over 24 million similar pairs of binding sites. We present the results as a relational database, Pocket Similarity Search using Multiple-Sketches (PoSSuM), which includes all the discovered pairs with annotations of various types (e.g., CATH, SCOP, EC number, Gene Ontology). PoSSuM enables rapid exploration of similar binding sites among structures with different global folds as well as similar ones. Moreover, PoSSuM is useful for predicting the binding ligand for unbound structures. Users can search for similar binding pockets using two search modes:

i) "Search K" finds binding sites similar to a known ligand-binding site. Submit a known ligand-binding site in the PDB (a pair of "PDB ID" and "HET code"), and PoSSuM will search for sites similar to the query site.

ii) "Search P" predicts ligands that potentially bind to a structure of interest. Submit a known protein structure in the PDB (PDB ID), and PoSSuM will search for known ligand-binding sites similar to sites in the query structure.
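
To give a feel for how an all-pair similarity search over bit "sketches" can avoid comparing every pair, here is a minimal Python sketch. It is an illustration under stated assumptions, not the SketchSort implementation: the function names (make_sketch, hamming, similar_pairs), the use of random-hyperplane sign bits, and the parameters num_blocks and max_dist are all hypothetical choices made for this example. The idea shown is a pigeonhole filter: if two sketches differ in at most max_dist bits and each sketch is cut into more than max_dist blocks, at least one block must be identical, so only sketches sharing a block need to be compared.

```python
from collections import defaultdict
from itertools import combinations


def make_sketch(vector, hyperplanes):
    """Turn a real-valued feature vector into a bit tuple (assumed scheme):
    one bit per hyperplane, set when the dot product is non-negative."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions at which two equal-length sketches differ."""
    return sum(x != y for x, y in zip(a, b))


def similar_pairs(sketches, num_blocks, max_dist):
    """Return sorted index pairs (i, j), i < j, whose sketches differ in at
    most max_dist bits. Requires num_blocks > max_dist (pigeonhole)."""
    if num_blocks <= max_dist:
        raise ValueError("num_blocks must exceed max_dist")
    length = len(sketches[0])
    # Block boundaries that split each sketch into num_blocks contiguous parts.
    bounds = [length * i // num_blocks for i in range(num_blocks + 1)]
    found = set()
    for b in range(num_blocks):
        # Group sketches that agree exactly on block b.
        buckets = defaultdict(list)
        for idx, sketch in enumerate(sketches):
            buckets[sketch[bounds[b]:bounds[b + 1]]].append(idx)
        # Verify candidates within each bucket using the full sketch.
        for members in buckets.values():
            for i, j in combinations(members, 2):
                if hamming(sketches[i], sketches[j]) <= max_dist:
                    found.add((i, j))
    return sorted(found)


# Toy data: five hand-written 8-bit sketches (not real binding-site data).
example_sketches = [
    (0, 0, 0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 0, 0, 1),
    (1, 1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 0, 1, 1, 1),
    (0, 1, 0, 1, 0, 1, 0, 1),
]
example_pairs = similar_pairs(example_sketches, num_blocks=2, max_dist=1)
print(example_pairs)  # [(0, 1), (2, 3)]
```

The filter is exact with respect to Hamming distance on the sketches. How well sketch distance reflects similarity of the underlying binding sites depends on the sketching scheme, which this example does not attempt to reproduce.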



Latest release

28-Mar-2012
The latest version of the dataset has been released. It includes up-to-date PDB entries (ver Jan-2012).

22-Dec-2011
The latest version of the dataset has been released. Almost all PDB entries (67,212) were used to generate putative pockets in the current version, whereas the older version included only 29,779 entries (95% sequence similarity cut-off).

1-Nov-2011
Online calculation mode (beta version) has been released. Users can upload a protein structure of interest if it has not been deposited in the PDB.




The PoSSuM database is freely available to all researchers. Please cite the following:

  1. Ito J, Tabei Y, Shimizu K, Tsuda K, and Tomii K. PoSSuM: a database of similar protein-ligand binding and putative pockets. Nucleic Acids Res DB issue 2012;40:D541-8.

  2. Ito J, Tabei Y, Shimizu K, Tomii K, and Tsuda K. PDB-Scale analysis of known and putative ligand binding sites with structural sketches. Proteins 2011;80:747-63.

  3. Tabei Y, Uno T, Sugiyama M, Tsuda K. Single Versus Multiple Sorting for All Pairs Similarity Search. The 2nd Asian Conference on Machine Learning (ACML2010) 2010.

Contact: