STL-10
OpenML dataset with id 41103
Adam Coates, Andrew Y. Ng, Honglak Lee
Full work available at URL: https://api.openml.org/data/v1/download/19334421/STL-10.arff
Upload date: 23 June 2018
Copyright license: CC0
Dataset Characteristics
Number of classes: 10
Number of features: 27,649 (numeric: 27,648; symbolic: 1; binary: 0)
Number of instances: 13,000
Number of instances with missing values: 0
Number of missing values: 0
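The feature count can be cross-checked against the image size given in the overview (96x96 pixels, color). The sketch below is only a consistency check under two assumptions that this page does not state explicitly: that "color" means three channels per pixel, and that the single symbolic feature is the class label column.

```python
# Image geometry from the overview: 96x96 pixels, color.
HEIGHT, WIDTH = 96, 96
CHANNELS = 3  # assumption: "color" means three channels (e.g. RGB)

# One numeric feature per pixel per channel.
numeric_features = HEIGHT * WIDTH * CHANNELS

# Assumption: the one symbolic feature is the class label column.
total_features = numeric_features + 1

print(numeric_features)  # 27648
print(total_features)    # 27649
```

Under these assumptions the numbers match the listed 27,648 numeric features and 27,649 features in total. The ordering of pixels and channels within a row is not documented here, so it should be verified before reshaping rows into images.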
STL-10 is modeled on the CIFAR-10 dataset, with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but not identical distribution to the labeled data) to build a useful prior. We also expect that the higher resolution of this dataset (96x96) will make it a challenging benchmark for developing more scalable unsupervised learning methods.
Overview
10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck.
Images are 96x96 pixels, color.
500 training images (in 10 pre-defined folds) and 800 test images per class.
100,000 unlabeled images for unsupervised learning. These examples are extracted from a similar but broader distribution of images. For instance, the unlabeled set contains other types of animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.) in addition to the ones in the labeled set.
Images were acquired from labeled examples on ImageNet.
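The per-class counts above can be reconciled with the instance count listed under Dataset Characteristics. The sketch below is plain arithmetic. Its conclusion, that the 13,000 instances are the labeled training and test images combined and that the unlabeled images are not part of this file, is an inference from the numbers and not something the page states.

```python
NUM_CLASSES = 10
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 800
UNLABELED = 100_000

labeled_train = NUM_CLASSES * TRAIN_PER_CLASS  # labeled training images
labeled_test = NUM_CLASSES * TEST_PER_CLASS    # labeled test images
labeled_total = labeled_train + labeled_test

print(labeled_train)  # 5000
print(labeled_test)   # 8000
print(labeled_total)  # 13000

# Inference, not a documented fact: the listed 13,000 instances equal the
# labeled total, so the unlabeled images appear to be excluded from this file.
print(labeled_total + UNLABELED)  # 113000
```

If the file does merge training and test images, it may not mark which rows belong to which split. Anyone reproducing the original benchmark protocol would need to recover the split and the pre-defined folds from the original STL-10 distribution.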