Perth-House-Prices
OpenML dataset with id 43822
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/22102647/Perth-House-Prices.arff
Upload date: 24 March 2022
Copyright license: No records found.
Dataset Characteristics
Number of features: 18 (numeric: 14, symbolic: 0, binary: 0)
Number of instances: 33,656
Number of instances with missing values: 14,448
Number of missing values: 16,585
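The two missing-value figures above measure different things: the first counts rows that have at least one gap, the second counts individual empty cells. A minimal sketch of how both could be computed, assuming rows are held as dictionaries with None marking a missing value. The column names PRICE and GARAGE and all values in the toy table are hypothetical, chosen only for illustration:

```python
def missing_summary(rows):
    """Return (rows_with_missing, total_missing_cells) for a list of dict rows."""
    rows_with_missing = 0
    total_missing = 0
    for row in rows:
        # Count the empty cells in this row; None stands in for "missing".
        gaps = sum(1 for value in row.values() if value is None)
        total_missing += gaps
        if gaps:
            rows_with_missing += 1
    return rows_with_missing, total_missing


# Toy rows; every value here is invented for illustration only.
toy_rows = [
    {"PRICE": 565000, "GARAGE": 2, "NEAREST_SCH_RANK": None},
    {"PRICE": 365000, "GARAGE": None, "NEAREST_SCH_RANK": None},
    {"PRICE": 287000, "GARAGE": 1, "NEAREST_SCH_RANK": 129},
]

rows_with_missing, total_missing = missing_summary(toy_rows)
print(rows_with_missing, total_missing)
```

Because one row can contain several gaps, the cell count is never smaller than the row count, which matches the figures reported for this dataset (16,585 missing values across 14,448 instances).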
Acknowledgements
This data was scraped from http://house.speakingsame.com/ and includes data from 322 Perth suburbs, resulting in an average of about 100 rows per suburb.
Content
I believe the columns chosen to represent this dataset are the most crucial in predicting house prices. Some preliminary analysis I conducted showed a significant correlation between each of these columns and the response variable (i.e. price).
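The description does not say which correlation measure the preliminary analysis used. As one possibility, the sketch below computes the Pearson correlation coefficient between a feature column and price. The choice of Pearson, the floor_area column, and the sample values are all assumptions made for illustration:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: a feature that rises in step with price.
floor_area = [100, 150, 200, 250]
price = [300000, 400000, 500000, 600000]

r = pearson(floor_area, price)
print(round(r, 3))
```

A coefficient near 1 or -1 indicates a strong linear relationship with price, while a value near 0 indicates little linear relationship.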
Data obtained from sources other than the scrape
Longitude and latitude data were obtained from data.gov.au.
School ranking data was obtained from bettereducation.
The nearest school recorded for each address in this dataset is always a school defined as 'ATAR-applicable'. In the Australian secondary school education system, ATAR is a scoring system used to assess a student's cumulative academic results and is used for entry into Australian universities. As such, schools which do not have an ATAR program, such as primary schools, vocational schools and special needs schools, are not considered in determining the nearest school.
Note also that the "NEAREST_SCH_RANK" column has some missing values, because some schools are unranked by bettereducation under this criterion.
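The selection rule described above can be sketched as follows. The source does not state how distance to a school was measured, so the great-circle (haversine) distance used here is an assumption, as are the school records, their field names, and the coordinates:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_atar_school(lat, lon, schools):
    """Return the closest ATAR-applicable school, or None if there is none."""
    candidates = [s for s in schools if s["atar"]]
    if not candidates:
        return None
    return min(candidates,
               key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


# Hypothetical schools; "rank" is None where the school is unranked.
schools = [
    {"name": "Primary A", "lat": -31.950, "lon": 115.860, "atar": False, "rank": None},
    {"name": "Secondary B", "lat": -31.960, "lon": 115.870, "atar": True, "rank": None},
    {"name": "Secondary C", "lat": -32.050, "lon": 115.900, "atar": True, "rank": 12},
]

nearest = nearest_atar_school(-31.951, 115.861, schools)
print(nearest["name"], nearest["rank"])
```

In this toy example the closest school overall is a primary school, so it is skipped, and the nearest ATAR-applicable school happens to be unranked. That is the situation that produces a missing value in NEAREST_SCH_RANK.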