SPrCY: Structure Prediction Comparisons for Yeast

This database was constructed by Nathan Baker and Todd Dolinsky in collaboration with Kevin Karplus and Peter Burgers.

Structure Prediction Databases Analyzed:

SAM-T02: A collection of SAM-T02 server predictions for all (translated) S. cerevisiae ORFs.
3D-PSSM: A collection of 3D-PSSM server predictions for all (translated) S. cerevisiae ORFs with more than 30 and fewer than 800 amino acids.
mGenTHREADER: A collection of mGenTHREADER server predictions for all (translated) S. cerevisiae ORFs with more than 30 and fewer than 800 amino acids.
PDBAA: A collection of BLAST searches against the PDBAA database for all (translated) S. cerevisiae ORFs.
Pfam_ls: A collection of HMMER searches with standard options (E-value cutoff of 10.0) against the Pfam_ls database for all (translated) S. cerevisiae ORFs.

Structure Prediction Analysis Scripts:

SCOP Analysis: Returns SCOP superfamily predictions and statistics for one database versus another.
Gene Ontology Analysis: Returns Gene Ontology (GO) predictions and statistics for one database versus another. Also contains SGD GO predictions for reference.
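
As a rough illustration of what comparing one database versus another can involve, the sketch below counts how often two prediction sets assign the same SCOP superfamily to an ORF. It is not the SCOP Analysis script itself: the function name, the dictionary layout (ORF name mapped to one superfamily identifier), and the four statistics reported are assumptions made for this example, and the ORF names and assignments are placeholders.

```python
from typing import Dict


def compare_predictions(db_a: Dict[str, str], db_b: Dict[str, str]) -> Dict[str, int]:
    """Count agreement between two ORF -> SCOP superfamily mappings.

    Hypothetical helper: the statistics reported by the real scripts
    are not specified here, so these four counts are an assumption.
    """
    stats = {"both_agree": 0, "both_disagree": 0, "only_a": 0, "only_b": 0}
    for orf in set(db_a) | set(db_b):
        a = db_a.get(orf)
        b = db_b.get(orf)
        if a is None:
            stats["only_b"] += 1        # predicted only in the second database
        elif b is None:
            stats["only_a"] += 1        # predicted only in the first database
        elif a == b:
            stats["both_agree"] += 1    # same superfamily in both
        else:
            stats["both_disagree"] += 1  # both predict, but differently
    return stats


# Placeholder inputs, not real predictions.
example_a = {"ORF1": "c.37.1", "ORF2": "a.4.5", "ORF3": "b.1.1"}
example_b = {"ORF1": "c.37.1", "ORF2": "c.2.1", "ORF4": "d.58.7"}
example_stats = compare_predictions(example_a, example_b)
```

Keeping the "predicted in only one database" cases separate from outright disagreements makes it easier to see whether two methods differ in coverage or in their actual assignments.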

Other Available Scripts:

Get SCOP Tree: Creates a SCOP-based hierarchical tree for an ORF by using the structure prediction method's e-values for that ORF.
Browse SCOP: Browses the SCOP hierarchy. Obtains ORF predictions at any given branch of the hierarchy. Searchable.
Get ORF: Returns all predictions for a particular ORF.
Get PDBs: Searches for all ORFs whose predictions include particular PDB files.
Get New Information: Searches for all ORFs with significant predictions in sequence-structure methods but low similarity to existing PDB structures (as evaluated by PDBAA BLAST) by thresholding e-values.
Get Significant ORFs: Searches for all ORFs with significant predictions by thresholding e-values.
Get All ORFs: Returns a list of all processed ORFs in a particular database.
Get ORF Information: Returns information about ORFs, including Systematic and Common Names, Aliases, SGD, Genbank, and Entrez IDs, and more. Searchable.
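
The e-value thresholding behind Get Significant ORFs and Get New Information can be sketched as follows. This is a minimal illustration under stated assumptions, not the scripts' actual implementation: the function names, the data layout (ORF name mapped to its best e-value), and both cutoff values are hypothetical, and the ORF names and e-values are placeholders.

```python
from typing import Dict, List

# Hypothetical cutoffs chosen only for this example.
SIGNIFICANT_EVALUE = 1e-5   # method hit counts as significant at or below this
PDB_SIMILAR_EVALUE = 1e-3   # BLAST hit counts as "similar to a PDB entry" at or below this


def significant_orfs(best_evalue: Dict[str, float],
                     cutoff: float = SIGNIFICANT_EVALUE) -> List[str]:
    """Return ORFs whose best e-value is at or below the cutoff, sorted by name."""
    return sorted(orf for orf, e in best_evalue.items() if e <= cutoff)


def new_information_orfs(method_evalue: Dict[str, float],
                         pdbaa_evalue: Dict[str, float],
                         method_cutoff: float = SIGNIFICANT_EVALUE,
                         pdbaa_cutoff: float = PDB_SIMILAR_EVALUE) -> List[str]:
    """Return ORFs that are significant for a sequence-structure method
    but have no PDBAA BLAST hit at or below pdbaa_cutoff."""
    hits = []
    for orf in significant_orfs(method_evalue, method_cutoff):
        # An ORF absent from the BLAST results is treated as having no hit.
        blast_e = pdbaa_evalue.get(orf, float("inf"))
        if blast_e > pdbaa_cutoff:
            hits.append(orf)
    return hits


# Placeholder inputs, not real predictions.
method_hits = {"ORF1": 1e-20, "ORF2": 1e-8, "ORF3": 0.5}
blast_hits = {"ORF1": 1e-30, "ORF2": 2.0}
significant = significant_orfs(method_hits)
novel = new_information_orfs(method_hits, blast_hits)
```

In this example ORF1 and ORF2 pass the significance cutoff, but only ORF2 is reported as new information, because ORF1 already has a strong BLAST match in PDBAA.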

References

SCOP. SCOP hierarchies were generated by using the SCOP parseable files available from the SCOP web page. References include:

  1. Lo Conte L, Brenner SE, Hubbard TJ, Chothia C, Murzin AG. SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res. 30 264-267, 2002.
  2. Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247 536-540, 1995.
SAM-T02. Calculations were performed by Kevin Karplus and co-workers who generously provided the data for use in this database. References include:
  1. Karchin R, Cline M, Mandel-Gutfreund Y, Karplus K. Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry. Proteins, in press.
  2. Karplus K, Karchin R, Barrett C, Tu S, Cline M, Diekhans M, Grate L, Casper J, Hughey R. What is the value added by human intervention in protein structure prediction? Proteins 45 (S5) 86-91, 2001.
3D-PSSM. Calculations were performed by submission to the 3D-PSSM web portal. References include:
  1. Kelley LA, MacCallum RM, Sternberg MJE. Enhanced Genome Annotation using Structural Profiles in the Program 3D-PSSM. J. Mol. Biol. 299 499-520, 2000.
  2. Fischer D, Barret C, Bryson K, Elofsson A, Godzik A, Jones D, Karplus KJ, Kelley LA, MacCallum RM, Pawlowski K, Rost B, Rychlewski L, Sternberg MJ. CAFASP-1: Critical Assessment of Fully Automated Structure Prediction Methods. Proteins Suppl 3 209-217, 1999.
  3. Kelley LA, MacCallum R, Sternberg MJE. Recognition of Remote Protein Homologies Using Three-Dimensional Information to Generate a Position Specific Scoring Matrix in the program 3D-PSSM. RECOMB 99, Proceedings of the Third Annual Conference on Computational Molecular Biology. Istrail S, Pevzner P, Waterman M, Eds. The Association for Computing Machinery: New York, 1999. Pages 218-225.
mGenTHREADER. Calculations were performed by submission to the mGenTHREADER web site. References include:
  1. Jones DT. GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287 797-815, 1999.
NCBI BLAST. Searches were performed locally against the PDBAA database. NCBI BLAST references include:
  1. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25 3389-3402, 1997.
Pfam_ls. Searches were performed locally. Pfam_ls references include:
  1. Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL. The Pfam protein families database. Nucleic Acids Res. 30 276-280, 2002.
  2. Eddy SR. Profile hidden Markov models. Bioinformatics 14 755-763, 1998.
  3. Eddy SR. Hidden Markov models. Curr. Opin. Struct. Biol. 6 361-365, 1996.
SGD. References include:
  1. Issel-Tarver L, Christie KR, Dolinski K, Andrada R, Balakrishnan R, Ball CA, Binkley G, Dong S, Dwight SS, Fisk DG, Harris M, Schroeder M, Sethuraman A, Tse K, Weng S, Botstein D, Cherry JM. Saccharomyces Genome Database. Methods Enzymol. 350 329-346, 2002.
  2. Dwight SS, Harris MA, Dolinski K, Ball CA, Binkley G, Christie KR, Fisk DG, Issel-Tarver L, Schroeder M, Sherlock G, Sethuraman A, Weng S, Botstein D, Cherry JM. Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res. 30 69-72, 2002.