DeepSCFold is designed for high-accuracy protein complex structure modeling. It builds paired MSAs for the complex by identifying potential interaction relationships between monomeric sequences. The key components of the paired-MSA construction are two sequence-based deep learning models that predict protein-protein structural similarity (pSS-score) and interaction probability (pIA-score). For the input protein complex sequences, DeepSCFold first generates monomeric multiple sequence alignments (MSAs) from multiple sequence databases (such as UniRef30, UniRef90, UniProt, BFD, MGnify, and the ColabFold DB). Next, the predicted pSS-score between the input sequence and each aligned sequence in the monomer MSAs is used to rank and select sequences in the monomer MSAs. The second deep learning model then predicts pIA-scores for aligned sequences drawn from different subunit MSAs, and these scores are used to construct paired MSAs. In addition, information from multiple sources, such as species annotations, UniProt accession numbers, and protein complexes from the Protein Data Bank (PDB), is used to construct extra paired MSAs. Finally, DeepSCFold runs complex structure prediction with AlphaFold-Multimer on the series of paired MSAs constructed above, and the top-1 model is selected with our in-house complex model quality assessment method.
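The two score-driven steps described above (selecting monomer MSA sequences by pSS-score, then pairing sequences across subunits by pIA-score) can be illustrated with a minimal sketch. This is not the DeepSCFold implementation: the function names `select_by_pss` and `pair_by_pia`, the `top_k` and `min_pia` parameters, and the greedy one-to-one pairing rule are all assumptions made for illustration, and the scores are passed in as plain dictionaries rather than predicted by the deep learning models.

```python
from typing import Dict, List, Tuple


def select_by_pss(pss_scores: Dict[str, float], top_k: int) -> List[str]:
    """Rank the sequences of one monomer MSA by pSS-score and keep the top_k.

    Hypothetical helper: the real selection rule is not given in the text.
    """
    ranked = sorted(pss_scores, key=lambda seq_id: pss_scores[seq_id], reverse=True)
    return ranked[:top_k]


def pair_by_pia(
    msa_a: List[str],
    msa_b: List[str],
    pia_scores: Dict[Tuple[str, str], float],
    min_pia: float = 0.5,  # assumed cutoff, chosen only for this example
) -> List[Tuple[str, str]]:
    """Greedily pair sequences from two subunit MSAs by descending pIA-score.

    Assumption: each sequence is used at most once, and pairs scoring
    below min_pia are dropped. Missing pairs are treated as score 0.0.
    """
    candidates = [
        (pia_scores.get((a, b), 0.0), a, b) for a in msa_a for b in msa_b
    ]
    candidates.sort(key=lambda c: c[0], reverse=True)

    used_a, used_b = set(), set()
    pairs: List[Tuple[str, str]] = []
    for score, a, b in candidates:
        if score < min_pia:
            break  # remaining candidates score even lower
        if a in used_a or b in used_b:
            continue  # one of the two sequences is already paired
        pairs.append((a, b))
        used_a.add(a)
        used_b.add(b)
    return pairs


# Toy example with made-up sequence IDs and scores.
pss_a = {"a1": 0.9, "a2": 0.4, "a3": 0.7}
pss_b = {"b1": 0.8, "b2": 0.6, "b3": 0.2}
pia = {
    ("a1", "b1"): 0.3,
    ("a1", "b2"): 0.95,
    ("a3", "b1"): 0.8,
    ("a3", "b2"): 0.9,
}

kept_a = select_by_pss(pss_a, top_k=2)
kept_b = select_by_pss(pss_b, top_k=2)
paired = pair_by_pia(kept_a, kept_b, pia)
print(kept_a, kept_b, paired)
```

In this toy run, the rows of the paired MSA would be built from the sequence pairs in `paired`; in practice several paired MSAs built with different criteria would each be passed to the structure predictor.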



News
  • 2024/11/30: Ranking of DeepSCFold (group names: GuijunLab-Complex and GuijunLab-Assembly) in the CASP16 blind test.
    Protein Domains: GuijunLab-Complex ranked 11th out of 111 groups, based on models with the best scores (Z-score > -2.0).
    Protein Multimers: GuijunLab-Assembly ranked 14th out of 86 groups and 2nd for easy and medium targets, based on models with the best scores (Z-score > -2.0).
  • 2024/06/29: DeepSCFold (group name: ZJUT-DeepSHFold) was ranked 1st for protein structure prediction in the three-month CAMEO blind test.

Download
  • DeepSCFold data download:
    Model training and testing datasets: protein-protein interaction probability prediction and structure similarity prediction.
    Complex structure prediction datasets: CASP15 protein complex dataset and antibody-antigen protein complex dataset.
  • DeepSCFold code download:
    DeepSCFold (v1.0) standalone package download.

References
  • Minghua Hou, Yuhao Xia, Pengcheng Wang, Zexin Lv, Dongliang Hou, Jianyang Zeng and Guijun Zhang. High-accuracy protein complex structure modeling based on sequence-derived structure complementarity. In press.