PC-Games-2020
OpenML dataset with id 43689
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/22102514/PC-Games-2020.arff
Upload date: 24 March 2022
Dataset Characteristics
Number of features: 27 (numeric: 10, symbolic: 0, binary: 0)
Number of instances: 30,250
Number of instances with missing values: 30,250
Number of missing values: 188,331
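The two missing-value figures above measure different things: one counts instances that have at least one gap, the other counts individual empty cells. From the numbers listed, every instance has at least one missing value, and 188,331 / 30,250 gives roughly 6.2 missing values per instance on average. The sketch below shows how both counts could be computed. It assumes rows are already loaded as plain Python lists and that gaps appear as `None`, an empty string, NaN, or ARFF's `?` marker; the function name `missing_stats` and the sample rows are hypothetical and not taken from the dataset.

```python
import math

# Markers treated as "missing" in this sketch. "?" is the ARFF convention;
# None and "" are assumptions about how a loader might represent gaps.
MISSING_MARKERS = {None, "", "?"}


def is_missing(value):
    """Return True if a cell should be counted as missing."""
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in MISSING_MARKERS


def missing_stats(rows):
    """Return (instances with at least one missing value, total missing cells)."""
    rows_with_missing = 0
    total_missing = 0
    for row in rows:
        n_missing = sum(1 for value in row if is_missing(value))
        total_missing += n_missing
        if n_missing > 0:
            rows_with_missing += 1
    return rows_with_missing, total_missing


# Made-up rows for illustration only (not real records from PC-Games-2020).
sample_rows = [
    ["Game A", 9.99, "?", 120],
    ["Game B", "?", "?", 45],
    ["Game C", 0.0, "Indie", 3],
]

rows_with_missing, total_missing = missing_stats(sample_rows)
print(rows_with_missing, total_missing)
```

A legitimate zero, such as a free game's price, is not treated as missing here, which matters when deciding how to impute or drop sparse columns.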
Context
The project's goal is to use this dataset to predict the level of success game developers should expect given their game design details. Features such as 'Indie' (developed by an indie studio), 'Soundtrack' (whether or not the game was noted for its soundtrack), and 'Genres' are intended to serve as predictors of a game's popularity.
Content
The data was gathered in July 2020 in one long scrape of the Steam store, running from the most popular games to the least popular. Signs of this ordering show up in the correlation between the row index and the presence value (the number of online posts related to the game).
During the scrape, each game's record was supplemented with another dozen or so features obtained by calling the RAWG API.
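The ordering effect described above could be checked by correlating each row's position with its presence value. The sketch below is one way to do that with a hand-written Pearson coefficient. It assumes the presence values have already been extracted into a list in scrape order. The numbers are invented for illustration, and the name `pearson` is hypothetical. A rank-based measure such as Spearman's may suit skewed post counts better.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented presence values in scrape order (most popular first).
# These are NOT real values from the dataset.
presence = [9500, 7200, 7400, 3100, 1200, 900, 150, 40]
row_index = list(range(len(presence)))

r = pearson(row_index, presence)
print(f"correlation between row index and presence: {r:.3f}")
```

If presence tracks popularity, a scrape ordered from most to least popular should give a negative coefficient. The row index then carries information about the target and should probably be left out of any predictive model.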
Inspiration
My main inspiration for this dataset was to learn, and to share, how important each feature is to a game's success on the Steam store. This information could be valuable for game developers, and I would also like to create a game using these insights to evaluate the accuracy of the models.