DomBpred is a sequence-based method for protein domain boundary prediction. The input sequence is first classified as either a single-domain or a multi-domain protein using an effective sequence metric computed against a single-domain sequence library, which is constructed from the CATH and SCOPe databases. For a multi-domain protein, a domain-residue clustering algorithm inspired by the Ising model clusters spatially close residues according to the inter-residue distances. Unclassified residues and residues at the edges of clusters are then adjusted according to the secondary structure to form potential cut points. Finally, a domain boundary scoring function recursively evaluates the potential cut points to determine the domain boundaries.
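
The final step, recursive evaluation of potential cut points, can be illustrated with a small sketch. This is not DomBpred's scoring function, which is not specified here. It assumes a hypothetical toy score (inter-segment contacts divided by the smaller intra-segment contact count, plus one), a hypothetical 8.0 contact cutoff, and hypothetical names `cut_score`, `find_boundaries`, `max_score`, and `min_len`. The distance matrix stands in for predicted inter-residue distances.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def count_contacts(dist: Matrix, rows: range, cols: range, cutoff: float) -> int:
    """Count residue pairs (i, j) with i in rows, j in cols, i < j, closer than cutoff."""
    return sum(1 for i in rows for j in cols if i < j and dist[i][j] < cutoff)


def cut_score(dist: Matrix, start: int, end: int, cut: int, cutoff: float = 8.0) -> float:
    """Toy score (an assumption, not the published function): lower means a cleaner split.

    The segment [start, end) is split into [start, cut) and [cut, end).
    """
    left, right = range(start, cut), range(cut, end)
    inter = count_contacts(dist, left, right, cutoff)
    intra = min(
        count_contacts(dist, left, left, cutoff),
        count_contacts(dist, right, right, cutoff),
    )
    return inter / (intra + 1)


def find_boundaries(
    dist: Matrix,
    candidates: Sequence[int],
    start: int = 0,
    end: int = -1,
    max_score: float = 0.2,  # hypothetical acceptance threshold
    min_len: int = 3,        # hypothetical minimum segment length
    cutoff: float = 8.0,
) -> List[int]:
    """Recursively accept the best-scoring candidate cut in [start, end)."""
    if end < 0:
        end = len(dist)
    # Keep only candidates that leave at least min_len residues on each side.
    valid = [c for c in candidates if start + min_len <= c <= end - min_len]
    if not valid:
        return []
    best = min(valid, key=lambda c: cut_score(dist, start, end, c, cutoff))
    if cut_score(dist, start, end, best, cutoff) > max_score:
        return []  # no acceptable cut: treat the segment as one domain
    left = find_boundaries(dist, candidates, start, best, max_score, min_len, cutoff)
    right = find_boundaries(dist, candidates, best, end, max_score, min_len, cutoff)
    return left + [best] + right


# Toy input: 8 residues forming two compact blocks of 4 that are far apart.
two_domain = [
    [0.0 if i == j else (5.0 if i // 4 == j // 4 else 20.0) for j in range(8)]
    for i in range(8)
]
print(find_boundaries(two_domain, [2, 4, 6], min_len=2))  # prints [4]
```

In this toy case, only the cut at residue index 4 separates the two blocks without breaking contacts, so it is the only candidate accepted. The recursion then rejects the remaining candidates inside each block.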

  • Zhongze Yu, Chunxiang Peng, Jun Liu, Biao Zhang, Xiaogen Zhou, and Guijun Zhang*. DomBpred: protein domain boundary predictor using inter-residue distance and domain-residue level clustering. bioRxiv, doi: 10.1101/2021.11.19.469204.