hiva_agnostic
OpenML dataset with id 1039
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/53922/hiva_agnostic.arff
Upload date: 6 October 2014
Dataset Characteristics
Number of classes: 2
Number of features: 1,618 (numeric: 1,617; symbolic: 1; binary in total: 1)
Number of instances: 4,229
Number of instances with missing values: 0
Number of missing values: 0
Author: not available
Source: Unknown - Date unknown
Please cite:
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch)
Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php
Modified by TunedIT (converted to ARFF format)
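Since the file is distributed in ARFF format, a reader may want a rough idea of how a dense ARFF file can be parsed. The following is a minimal sketch using only the Python standard library. The sample content, the attribute names (f1, f2, label), the label values, and the helper read_dense_arff are all hypothetical and are not taken from hiva_agnostic.arff; the sketch also assumes simple unquoted attribute names and comma-separated dense rows.

```python
import io

# Tiny hypothetical ARFF sample. Attribute names and label values are
# illustrative only and are NOT taken from hiva_agnostic.arff.
SAMPLE = """\
@relation toy_example
@attribute f1 numeric
@attribute f2 numeric
@attribute label {-1,1}
@data
0,1,-1
1,1,1
"""


def read_dense_arff(stream):
    """Parse a dense ARFF stream into (attribute names, rows of strings).

    Assumes unquoted attribute names and plain comma-separated data rows.
    """
    names, rows, in_data = [], [], False
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if in_data:
            rows.append(line.split(","))
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # second token is the name
        elif line.lower().startswith("@data"):
            in_data = True  # everything after this line is data
    return names, rows


names, rows = read_dense_arff(io.StringIO(SAMPLE))
# Treat the last column as the class label and the rest as numeric features.
features = [[float(v) for v in row[:-1]] for row in rows]
labels = [row[-1] for row in rows]
print(names, features, labels)
```

For the real file, an established ARFF reader is likely a better choice than a hand-written parser; this sketch only illustrates the layout of the format.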
HIVA is the HIV infection database.
The task of HIVA is to predict which compounds are active against HIV, the virus that causes AIDS. The original data has 3 classes (active, moderately active, and inactive). We reduced it to a two-class classification problem (active vs. inactive). We represented the data as 2000 sparse binary input variables. The variables represent properties of the molecule inferred from its structure. The problem is therefore to relate structure to activity (a QSAR, or quantitative structure-activity relationship, problem) in order to screen new compounds before actually testing them (an HTS, or high-throughput screening, problem).
Data type: non-sparse
Number of features: 1617
Number of examples and check-sum:

        Pos_ex  Neg_ex  Tot_ex  Check_sum
Train      135    3710    3845  564954.00
Valid       14     370     384   56056.00
This dataset contains samples from both training and validation datasets.
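As a quick sanity check, the training and validation counts listed above can be summed and compared with the instance count reported under Dataset Characteristics. The sketch below does only that arithmetic; the variable names are illustrative, and it assumes Pos_ex counts the active compounds.

```python
# Counts copied from the Train and Valid rows listed above.
# Assumption: "pos" (Pos_ex) corresponds to the active compounds.
splits = {
    "train": {"pos": 135, "neg": 3710},
    "valid": {"pos": 14, "neg": 370},
}

total_pos = sum(s["pos"] for s in splits.values())
total_neg = sum(s["neg"] for s in splits.values())
total = total_pos + total_neg

# Fraction of positive examples in the combined data.
pos_rate = total_pos / total

print(f"instances: {total}")
print(f"positives: {total_pos}, negatives: {total_neg}")
print(f"positive rate: {100 * pos_rate:.2f}%")
```

The combined total matches the 4,229 instances reported for the dataset, and positives make up only about 3.5% of the examples, so the two classes are strongly imbalanced.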