ada_prior
OpenML dataset with id 1037
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/53920/ada_prior.arff
Upload date: 6 October 2014
Dataset Characteristics
Number of classes: 2
Number of features: 15 (numeric: 6, symbolic: 9, of which binary: 2)
Number of instances: 4,562
Number of instances with missing values: 88
Number of missing values: 88
Author: not available
Source: Unknown - Date unknown
Please cite: (not specified)
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch)
Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php
Modified by TunedIT (converted to ARFF format)
ADA is a marketing database.
The task of ADA is to identify high-revenue people from census data. This is a two-class classification problem. The raw data from the census bureau is known as the Adult database in the UCI machine-learning repository. The 14 original attributes (features) include age, workclass, education, marital status, occupation, native country, etc. The data contains continuous, binary, and categorical features. This dataset is from the "prior knowledge" track, i.e., it retains the original features and their identity.
Number of examples:
        Pos_ex  Neg_ex  Tot_ex
Train     1029    3118    4147
Valid      103     312     415
This dataset contains samples from both the training and validation sets.
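As a quick consistency check, the split counts above can be summed and compared with the 4,562 instances reported under Dataset Characteristics. The sketch below is illustrative only; the `splits` dictionary and the other variable names are made up for this example and are not part of the dataset.

```python
# Split counts copied from the table above: (positive, negative) examples.
splits = {
    "train": (1029, 3118),
    "valid": (103, 312),
}

# Sum positives and negatives across both splits.
total_pos = sum(pos for pos, _ in splits.values())
total_neg = sum(neg for _, neg in splits.values())
total = total_pos + total_neg

# Share of positive examples in the combined data.
pos_rate = total_pos / total

print(f"positives={total_pos} negatives={total_neg} total={total}")
print(f"positive rate={pos_rate:.3f}")
```

The combined data holds 1,132 positive and 3,430 negative examples, so roughly a quarter of the instances belong to the positive class. A classifier that always predicts the negative class would therefore already be right about three times out of four, which is worth keeping in mind when reading accuracy figures.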
Attribute information
1. age Instance’s age (numeric)
2. workclass Instance’s work class (nominal)
3. fnlwgt Instance’s sampling weight (numeric)
4. education Instance’s education level (nominal)
5. educationNum Instance’s education level, numerically encoded (numeric)
6. maritalStatus Instance’s marital status (nominal)
7. occupation Instance’s occupation (nominal)
8. relationship Instance’s type of relationship (nominal)
9. race Instance’s race (nominal)
10. sex Instance’s sex (nominal)
11. capitalGain Instance’s capital gain (numeric)
12. capitalLoss Instance’s capital loss (numeric)
13. hoursPerWeek Instance’s number of working hours (numeric)
14. nativeCountry Instance’s native country (nominal)
15. label Class attribute (1: the instance earns more than 50K a year; -1 otherwise)
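The attribute list can be written down as a small schema, which is handy for checking a loaded table against the feature counts given under Dataset Characteristics. This is a sketch, not part of the dataset: `SCHEMA` and `to_binary_target` are hypothetical names, and `nativeCountry` is assumed to be nominal, which is what the reported counts of 6 numeric and 9 symbolic features imply.

```python
# Column name -> type, following the attribute list above.
# Assumption: nativeCountry is nominal (consistent with 6 numeric / 9 symbolic).
SCHEMA = {
    "age": "numeric",
    "workclass": "nominal",
    "fnlwgt": "numeric",
    "education": "nominal",
    "educationNum": "numeric",
    "maritalStatus": "nominal",
    "occupation": "nominal",
    "relationship": "nominal",
    "race": "nominal",
    "sex": "nominal",
    "capitalGain": "numeric",
    "capitalLoss": "numeric",
    "hoursPerWeek": "numeric",
    "nativeCountry": "nominal",
    "label": "nominal",
}

numeric_cols = [name for name, kind in SCHEMA.items() if kind == "numeric"]
nominal_cols = [name for name, kind in SCHEMA.items() if kind == "nominal"]


def to_binary_target(label):
    """Map the class label (1 or -1, possibly stored as a string) to 1 or 0."""
    value = int(label)
    if value not in (1, -1):
        raise ValueError(f"unexpected label: {label!r}")
    return 1 if value == 1 else 0


print(len(numeric_cols), "numeric,", len(nominal_cols), "nominal")
```

If scikit-learn is installed, `sklearn.datasets.fetch_openml(data_id=1037, as_frame=True)` is one way to obtain the table itself; the column names and label encoding it returns should be checked against this schema rather than assumed to match.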