diamonds
OpenML dataset with id 44140
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/22103265/diamonds.arff
Upload date: 5 July 2022
Dataset Characteristics
Number of classes: 0
Number of features: 7 (numeric: 7, symbolic: 0, binary: 0)
Number of instances: 53,940
Number of instances with missing values: 0
Number of missing values: 0
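The characteristics above can be re-derived from the ARFF file itself. The following is a minimal sketch, not the way OpenML computes them: it assumes the file is a dense ARFF with numeric attributes only and attribute names without spaces. The helper names (parse_numeric_arff, summarize, download_arff) and the two-row SAMPLE are made up for illustration, and the feature count simply counts every declared attribute.

```python
import urllib.request

# Download URL as given on this page.
ARFF_URL = "https://api.openml.org/data/v1/download/22103265/diamonds.arff"

def parse_numeric_arff(text):
    """Parse a dense, numeric-only ARFF document into (names, rows)."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if not in_data:
            lowered = line.lower()
            if lowered.startswith("@attribute"):
                # "@attribute name type": keep the name, drop optional quotes
                names.append(line.split()[1].strip("'\""))
            elif lowered.startswith("@data"):
                in_data = True
            continue
        # "?" marks a missing value in ARFF; map it to None
        rows.append([None if v.strip() == "?" else float(v)
                     for v in line.split(",")])
    return names, rows

def summarize(names, rows):
    """Recompute the counts listed under Dataset Characteristics."""
    return {
        "features": len(names),
        "instances": len(rows),
        "instances_with_missing": sum(any(v is None for v in r) for r in rows),
        "missing_values": sum(v is None for r in rows for v in r),
    }

def download_arff(url=ARFF_URL):
    """Fetch the ARFF text (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")

# Illustrative two-row sample, NOT taken from the real file.
SAMPLE = """@relation demo
@attribute carat numeric
@attribute price numeric
@data
0.23,326
0.21,?
"""

names, rows = parse_numeric_arff(SAMPLE)
print(summarize(names, rows))
```

To check the real file, pass download_arff() to parse_numeric_arff instead of SAMPLE and compare the result with the figures listed above.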
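This dataset is used in the tabular data benchmark (https://github.com/LeoGrin/tabular-benchmark) and is transformed in the same way as in that benchmark. It belongs to the "regression on numerical features" benchmark. Because this version contains only 7 numeric features, the categorical attributes (cut, color, clarity) listed in the original description are presumably not included here.
Original description: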
This classic dataset contains the prices and other attributes of almost 54,000 diamonds. It's a great dataset for beginners learning to work with data analysis and visualization.
Content
price: price in US dollars ($326 to $18,823)
carat: weight of the diamond (0.2 to 5.01)
cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color: diamond colour, from J (worst) to D (best)
clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x: length in mm (0 to 10.74)
y: width in mm (0 to 58.9)
z: depth in mm (0 to 31.8)
depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43 to 79)
table: width of top of diamond relative to widest point (43 to 95)
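The depth formula in the list above can be sketched in code. This is an illustration under two assumptions that the description does not state explicitly: the ratio is multiplied by 100, since the quoted range (43 to 79) reads as a percentage, and rows with x + y equal to 0 (the listed minimum of x and y is 0) are treated as undefined. The function name depth_percentage is hypothetical.

```python
from typing import Optional

def depth_percentage(x: float, y: float, z: float) -> Optional[float]:
    """Total depth percentage: z / mean(x, y), expressed in percent.

    Returns None when x + y is not positive, because the ratio is
    undefined there (assumption: such rows are measurement gaps).
    """
    if x + y <= 0:
        return None
    # z / mean(x, y) == 2 * z / (x + y); the factor 100 is an assumption
    return 100.0 * 2.0 * z / (x + y)

# Made-up dimensions in mm, not a row from the dataset.
print(depth_percentage(4.0, 4.0, 2.5))  # 62.5
print(depth_percentage(0.0, 0.0, 1.0))  # None
```

Returning None rather than raising keeps the function usable in a row-by-row pass over data that contains zero-sized entries.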