USPS - MaRDI portal

USPS

From MaRDI portal
Dataset:6035423



OpenML ID: 41082; MaRDI QID: Q6035423

OpenML dataset with id 41082

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/19329737/USPS.arff

Upload date: 14 May 2018
Copyright license: CC0



Dataset Characteristics

Number of classes: 10
Number of features: 257 (numeric: 256, symbolic: 1, binary: 0)
Number of instances: 9,298
Number of instances with missing values: 0
Number of missing values: 0

The dataset and this description are made available at http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html.

Normalized handwritten digits, automatically scanned from envelopes by the U.S. Postal Service. The original scanned digits are binary and of different sizes and orientations; the images here have been deslanted and size normalized, resulting in 16 x 16 grayscale images (Le Cun et al., 1990).

The data are in two gzipped files, and each line consists of the digit id (0-9) followed by the 256 grayscale values.
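The record layout described above (one digit id followed by 256 grayscale values) can be sketched as a small parser. This is only an illustration: the function name `parse_usps_line` is hypothetical, and both the whitespace separator and the row-major pixel order are assumptions not stated in the description.

```python
def parse_usps_line(line):
    """Split one record into (digit, image), where image is 16 rows of 16 values.

    Assumptions (not confirmed by the dataset description):
    - fields are separated by whitespace;
    - the 256 values are stored row by row (row-major order).
    """
    fields = line.split()
    if len(fields) != 257:
        raise ValueError(f"expected 257 fields (1 id + 256 pixels), got {len(fields)}")
    # The id may be written as a float such as "6.0000", so go through float first.
    digit = int(float(fields[0]))
    pixels = [float(value) for value in fields[1:]]
    # Regroup the flat list of 256 values into 16 rows of 16 pixels each.
    image = [pixels[row * 16:(row + 1) * 16] for row in range(16)]
    return digit, image


# Usage on a synthetic record (not real data): digit 3 with constant pixels.
example_line = "3.0000 " + " ".join(["0.5"] * 256)
digit, image = parse_usps_line(example_line)
```

Regrouping into rows keeps the 16 x 16 structure explicit, which is convenient for display or for models that expect a two-dimensional input.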


There are 7291 training observations and 2007 test observations, distributed as follows:

          0     1     2     3     4     5     6     7     8     9 Total
Train  1194  1005   731   658   652   556   664   645   542   644  7291
Test    359   264   198   166   200   160   170   147   166   177  2007

or as proportions:

          0     1     2     3     4     5     6     7     8     9
Train  0.16  0.14  0.10  0.09  0.09  0.08  0.09  0.09  0.07  0.09
Test   0.18  0.13  0.10  0.08  0.10  0.08  0.08  0.07  0.08  0.09
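The proportions follow directly from the per-digit counts: each count is divided by the total for its split and rounded to two decimals. A minimal sketch of that arithmetic, with the counts copied from the description and a helper name (`proportions`) chosen only for illustration:

```python
# Per-digit counts for digits 0..9, copied from the description.
train_counts = [1194, 1005, 731, 658, 652, 556, 664, 645, 542, 644]
test_counts = [359, 264, 198, 166, 200, 160, 170, 147, 166, 177]


def proportions(counts):
    """Return each count as a share of the total, rounded to two decimals."""
    total = sum(counts)
    return [round(count / total, 2) for count in counts]


train_props = proportions(train_counts)
test_props = proportions(test_counts)

# The two splits together account for all 9,298 instances.
total_instances = sum(train_counts) + sum(test_counts)
```

Both splits are dominated by the digits 0 and 1, so class frequencies are not uniform.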


Alternatively, the training data are available as separate files per digit (and hence without the digit identifier in each row).

The test set is notoriously "difficult", and a 2.5% error rate is excellent. These data were kindly made available by the neural network group at AT&T research labs (thanks to Yann Le Cun).

PS: In this dataset, the class is represented as 1-10, not as the digit id 0-9 used in the original files.



