physiochemical_protein - MaRDI portal

physiochemical_protein

From MaRDI portal
Dataset:6037796



OpenML: 44963, MaRDI QID: Q6037796

OpenML dataset with id 44963

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22111827/physiochemical_protein.arff

Upload date: 22 December 2022
Copyright license: Creative Commons Attribution 4.0 International



Dataset Characteristics

Number of classes: 0
Number of features: 10 (numeric: 10, symbolic: 0, binary: 0)
Number of instances: 45,730
Number of instances with missing values: 0
Number of missing values: 0

Data Description

This data set contains physicochemical properties of protein tertiary structure, taken from CASP 5-9. There are 45,730 decoys, with sizes ranging from 0 to 21 angstroms.

The goal is to predict the size of the residue for a protein tertiary structure (the 3D structure of a protein). Once linked in the protein chain, an individual amino acid is called a residue. The target feature, RMSD, is the root-mean-square deviation of the residue.
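To make the target concrete: RMSD conventionally denotes the root-mean-square deviation between corresponding atom positions of a decoy and a reference structure. The sketch below computes it for two coordinate lists that are assumed to be already superimposed. The function name `rmsd` and the toy coordinates are illustrative assumptions, not part of the dataset, and the dataset's own values may have been produced with a different alignment procedure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    # Sum of squared Euclidean distances between paired points.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates, made up for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
decoy = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
print(rmsd(reference, decoy))
```

An RMSD of 0 means the two point sets coincide; larger values indicate a decoy that is further from the reference.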

Attribute Description

1. *RMSD* - size of the residue
2. *F1* - total surface area
3. *F2* - non polar exposed area
4. *F3* - fractional area of exposed non polar residue
5. *F4* - fractional area of exposed non polar part of residue
6. *F5* - molecular mass weighted exposed area
7. *F6* - average deviation from standard exposed area of residue
8. *F7* - Euclidean distance
9. *F8* - secondary structure penalty
10. *F9* - Spatial Distribution constraints (N,K Value)
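Given the column layout above, a regression setup would treat RMSD as the target and F1 to F9 as the inputs. The following is a minimal sketch, assuming the rows have already been read into dictionaries keyed by attribute name. The helper name `split_features_target` and the sample values are hypothetical and are not taken from the dataset.

```python
FEATURES = [f"F{i}" for i in range(1, 10)]  # F1 .. F9
TARGET = "RMSD"


def split_features_target(rows):
    """Split rows (dicts keyed by attribute name) into a feature matrix X and target list y."""
    X = [[float(row[name]) for name in FEATURES] for row in rows]
    y = [float(row[TARGET]) for row in rows]
    return X, y


# Placeholder rows with made-up values, only to show the expected shape.
sample_rows = [
    {"RMSD": 2.0, **{name: float(i) for i, name in enumerate(FEATURES, start=1)}},
    {"RMSD": 5.5, **{name: float(i) * 10 for i, name in enumerate(FEATURES, start=1)}},
]

X, y = split_features_target(sample_rows)
print(len(X), len(X[0]), y)
```

Keeping the feature order fixed in `FEATURES` means each row of `X` always lists F1 through F9 in the same positions, regardless of the key order in the input dictionaries.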




