amazon-commerce-reviews
OpenML dataset with id 1457
Full work available at URL: https://api.openml.org/data/v1/download/1586202/amazon-commerce-reviews.arff
Upload date: 21 May 2015
Dataset Characteristics
Number of classes: 50
Number of features: 10,001 (numeric: 10,000; symbolic: 1; binary: 0)
Number of instances: 1,500
Number of instances with missing values: 0
Number of missing values: 0
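The characteristics listed above can be checked programmatically. The following is a minimal sketch, assuming scikit-learn is installed and the OpenML server is reachable. `summarize` and `load_amazon_reviews` are hypothetical helper names introduced here for illustration, and the sketch assumes the single symbolic feature is the class label, which would leave 10,000 numeric input columns.

```python
from collections import Counter


def summarize(n_rows, n_cols, labels):
    """Return basic shape statistics for a feature matrix and its labels."""
    counts = Counter(labels)
    return {
        "n_instances": n_rows,
        "n_input_features": n_cols,
        "n_classes": len(counts),
        "min_per_class": min(counts.values()),
        "max_per_class": max(counts.values()),
    }


def load_amazon_reviews():
    """Fetch OpenML dataset 1457 (needs network access and scikit-learn)."""
    # Imported lazily so that summarize() stays usable without scikit-learn.
    from sklearn.datasets import fetch_openml

    X, y = fetch_openml(data_id=1457, as_frame=False, return_X_y=True)
    return X, y


# Typical use (downloads the data on first call):
#   X, y = load_amazon_reviews()
#   print(summarize(X.shape[0], X.shape[1], list(y)))
```

If the download matches the page, the summary should report 1,500 instances and 50 classes; any difference would point to a changed dataset version or a different choice of target column.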
Author: Zhi Liu
Source: UCI
Please cite:
Dataset creator and donor: Zhi Liu (e-mail: liuzhi8673 '@' gmail.com), National Engineering Research Center for E-Learning, Wuhan, Hubei, China
Data Set Information:
The dataset is derived from customer reviews on the Amazon Commerce website and is intended for authorship identification. Most previous studies ran identification experiments with two to ten authors. In an online context, however, a review usually has many more candidate authors, and classification algorithms are often not designed for a large number of target classes. To examine the robustness of classification algorithms, we identified 50 of the most active users (each represented by a unique ID and username) who frequently posted reviews in these newsgroups. We collected 30 reviews per author, which gives 50 × 30 = 1,500 instances.
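With 50 authors and only 30 reviews each, an evaluation split should keep every author represented on both sides. The sketch below shows one way to do that with the standard library only. The function name `stratified_split`, the synthetic labels, and the choice of 6 held-out reviews per author are assumptions made for illustration, not part of the dataset description.

```python
import random
from collections import defaultdict


def stratified_split(labels, n_test_per_class, seed=0):
    """Split row indices so every class contributes n_test_per_class test rows."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        rows = by_class[label][:]
        if len(rows) <= n_test_per_class:
            raise ValueError(f"class {label!r} has only {len(rows)} rows")
        rng.shuffle(rows)
        test_idx.extend(rows[:n_test_per_class])
        train_idx.extend(rows[n_test_per_class:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels with the shape described above: 50 authors x 30 reviews.
labels = [f"author_{a:02d}" for a in range(50) for _ in range(30)]
train_idx, test_idx = stratified_split(labels, n_test_per_class=6)
print(len(train_idx), len(test_idx))  # 1200 300
```

Holding out a fixed number of rows per author keeps the test set balanced, so plain accuracy stays easy to interpret across all 50 classes.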
Attribute Information:
The attributes capture each author's linguistic style, such as the use of digits and punctuation, word and sentence length, and word usage frequency.
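The page does not define the 10,000 attributes individually, so the following is only an illustrative sketch of how stylometric features of the kinds named above might be computed from raw review text. The function `style_features`, the feature names, the tokenization rules, and the sample vocabulary are all assumptions. This is not the procedure used to build the dataset.

```python
import re
import string
from collections import Counter


def style_features(text, vocabulary):
    """Compute a few illustrative stylometric features for one review."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)   # guard against empty input
    n_words = max(len(words), 1)
    counts = Counter(words)

    features = {
        "digit_ratio": sum(ch.isdigit() for ch in text) / n_chars,
        "punct_ratio": sum(ch in string.punctuation for ch in text) / n_chars,
        "mean_word_len": sum(len(w) for w in words) / n_words,
        "mean_sentence_len": len(words) / max(len(sentences), 1),
    }
    # Relative frequency of each vocabulary term in this review.
    for term in vocabulary:
        features[f"freq_{term}"] = counts[term] / n_words
    return features


example = style_features("I bought 2 of these. Great value!", ["great", "the"])
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Ratios are used instead of raw counts so that long and short reviews stay comparable. A real feature set at the scale of this dataset would need a far larger vocabulary than the two terms shown here.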