The-2020-Pokemon-dataset
OpenML dataset with id 43795
No author found.
Full work available at URL: https://api.openml.org/data/v1/download/22102620/The-2020-Pokemon-dataset.arff
Upload date: 24 March 2022
Dataset Characteristics
Number of features: 40 (numeric: 33, symbolic: 0 and in total binary: 0)
Number of instances: 1,013
Number of instances with missing values: 0
Number of missing values: 0
Context
I am currently learning Data Science concepts, so I started my journey by doing some basic data visualisations. While I was looking for datasets online to visualise, I saw a Pokemon dataset. I have been a fan of this franchise since I was 3 and I have played every one of its main series games, so it was a delight to begin my Data Visualisation work with a Pokemon dataset.
But when I started analysing that data, I found that it had quite a few missing values and covered only the first seven generations of Pokemon. This was somewhat expected, as the datasets were 3 years old and the latest generation was revealed last year. I decided that the data needed updating. I already knew web scraping, so nothing was stopping me from doing that.
I scraped the data from pokemondb.net and Bulbapedia. It took me two days of creating the logic, debugging and perfecting the code so that I could scrape the data. I also included the data for all the mega evolutions and all the alternate forms. This meant iterating through a single page multiple times and accessing data that is visible only when a button is pressed. Most of my time went into creating this logic. Finally, once the file was generated, I manually checked and changed some names with unsupported symbols and arranged the columns.
Content
This dataset contains the names, Pokedex numbers, generations, abilities, physical stats such as height and weight, typing, defence multipliers against each type, etc. The data includes not only the 890 Pokemon but also their mega evolutions, their Galarian and Alolan forms, and their other alternate forms. I have also added the columns islegendary, ismythical and is_mega so that you can remove those Pokemon with queries if needed.
Acknowledgements
The data was taken from:
https://pokemondb.net/
https://bulbapedia.bulbagarden.net/wiki/Main_Page
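The flag columns mentioned under Content can be used to drop legendary, mythical and mega entries. The following is a minimal sketch, not the uploader's own code. It assumes the flags are stored as 0/1 values and that the column names are exactly islegendary, ismythical and is_mega as written above; the actual file may spell them differently (for example with underscores), so they are passed as a parameter. The helper name drop_special_forms, the name column and the four sample rows are hypothetical and used only for illustration.

```python
def drop_special_forms(rows, flag_columns=("islegendary", "ismythical", "is_mega")):
    """Return only the rows in which none of the given flag columns is set.

    rows: iterable of dicts, one per Pokemon (e.g. from csv.DictReader).
    flag_columns: assumed column names; adjust to match the real file.
    """
    def is_set(value):
        # Accept numeric or string encodings of a true flag (assumption: 0/1 flags).
        return str(value).strip().lower() in {"1", "1.0", "true"}

    return [
        row for row in rows
        if not any(is_set(row.get(col, 0)) for col in flag_columns)
    ]


# Tiny hand-written sample standing in for the real dataset rows.
sample_rows = [
    {"name": "Bulbasaur", "islegendary": 0, "ismythical": 0, "is_mega": 0},
    {"name": "Mega Venusaur", "islegendary": 0, "ismythical": 0, "is_mega": 1},
    {"name": "Mewtwo", "islegendary": 1, "ismythical": 0, "is_mega": 0},
    {"name": "Mew", "islegendary": 0, "ismythical": 1, "is_mega": 0},
]

regular = drop_special_forms(sample_rows)
print([row["name"] for row in regular])
```

Passing a shorter tuple, such as flag_columns=("is_mega",), removes only the mega evolutions and keeps legendary and mythical Pokemon.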