Sensitivity Datasets - Leveraging Implicit Knowledge in Neural Networks for Functional Dissection and Engineering of Proteins
DOI: 10.5281/zenodo.2577920 · Zenodo: 2577920 · MaRDI QID: Q6718388
Dataset published at Zenodo repository.
Author name not available
Publication date: 24 August 2018
Copyright license: No records found.
The Sensitivity datasets cover more than 800 proteins and are structured as follows. The sensitivity values are the mean of four DeeProtein replicates. The dataset is uploaded as a tar.gz archive and contains one directory. File names contain the PDB [1] identifier and the respective chain identifier.

The sequences and secondary structure information were downloaded from the RCSB Protein Databank and are available here: https://cdn.rcsb.org/etl/kabschSander/ss_dis.txt.gz
This URL can be found with some explanation at http://www.rcsb.org/pdb/static.do?p=download/http/index.html
The secondary structure annotation relies on the DSSP algorithm by Kabsch and Sander [2].

The files are tab-separated and contain the following columns:

Pos         Position in the sequence, starting from zero
AA          Amino acid in that position
sec         Secondary structure as annotated in the RCSB Protein Databank
dis         Indicates whether a region has not been experimentally observed (this sometimes explains mismatches with crystal structures)
GO:_______  Sensitivity for the GO term

References

[1] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne. The Protein Data Bank. Nucleic Acids Research 28, 235-242 (2000). doi:10.1093/nar/28.1.235
[2] W. Kabsch, C. Sander. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577-2637 (1983). doi:10.1002/bip.360221211
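The column layout described in this record can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset itself. It assumes that each file starts with a header row naming the columns (Pos, AA, sec, dis, and one or more GO:... columns), which the description does not state explicitly. The function names, the placeholder term GO:0000000, and all values in SAMPLE are made up for illustration, and ranking by the raw sensitivity value is just one possible way to look at the data.

```python
import csv
import io


def read_sensitivity_table(handle):
    """Parse one tab-separated sensitivity file into a list of row dicts.

    Assumption: the first line is a header with the column names
    Pos, AA, sec, dis and one or more columns starting with "GO:".
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for record in reader:
        row = {
            "Pos": int(record["Pos"]),  # zero-based sequence position
            "AA": record["AA"],         # amino acid at that position
            "sec": record["sec"],       # secondary structure annotation
            "dis": record["dis"],       # kept as text; encoding not specified
        }
        # Every remaining "GO:" column holds a sensitivity value.
        for name, value in record.items():
            if name.startswith("GO:"):
                row[name] = float(value)
        rows.append(row)
    return rows


def most_sensitive_positions(rows, go_term, top_n=3):
    """Return (Pos, AA) pairs with the highest sensitivity for one GO term."""
    ranked = sorted(rows, key=lambda r: r[go_term], reverse=True)
    return [(r["Pos"], r["AA"]) for r in ranked[:top_n]]


# Made-up example content; the GO identifier and all values are placeholders.
SAMPLE = (
    "Pos\tAA\tsec\tdis\tGO:0000000\n"
    "0\tM\t-\tX\t0.10\n"
    "1\tK\tH\t-\t0.75\n"
    "2\tL\tH\t-\t0.90\n"
    "3\tG\tE\t-\t-0.20\n"
)

if __name__ == "__main__":
    table = read_sensitivity_table(io.StringIO(SAMPLE))
    print(most_sensitive_positions(table, "GO:0000000", top_n=2))
```

For a real file from the archive, the same function could be called on an opened file handle, for example `open(path, newline="")`, where `path` points to one of the extracted files. If the files turn out to have no header row, the column names would need to be supplied through the `fieldnames` argument of `csv.DictReader`.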