LoanDefaultPrediction - MaRDI portal

LoanDefaultPrediction

From MaRDI portal
Dataset:6035286



OpenML: 6331
MaRDI QID: Q6035286

OpenML dataset with id 6331

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/1854209/LoanDefaultPrediction.arff

Upload date: 30 March 2016



Dataset Characteristics

Number of classes: 0
Number of features: 771 (numeric: 765; symbolic: 6; binary: 2 in total)
Number of instances: 105,471
Number of instances with missing values: 53,531
Number of missing values: 785,955
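
To put the missing-value counts above in proportion, the following is a minimal sketch in Python. The helper `missing_summary` is a hypothetical name, and representing a gap as `None` is an assumption about how the file has been loaded. The percentages also assume that the 771 features are all of the columns in which a value can be missing.

```python
def missing_summary(rows):
    """Return (rows_with_missing, total_missing) for rows that use None for gaps."""
    rows_with_missing = 0
    total_missing = 0
    for row in rows:
        n = sum(1 for value in row if value is None)
        total_missing += n
        if n:
            rows_with_missing += 1
    return rows_with_missing, total_missing


# Figures quoted on this page.
N_INSTANCES = 105_471
N_FEATURES = 771
N_ROWS_WITH_MISSING = 53_531
N_MISSING = 785_955

# Share of rows with at least one gap, and share of all cells that are empty.
share_rows = 100 * N_ROWS_WITH_MISSING / N_INSTANCES
share_cells = 100 * N_MISSING / (N_INSTANCES * N_FEATURES)

# Toy data (made up for illustration): two of three rows have gaps, three gaps in total.
toy = [[1.0, None, 3.0], [4.0, 5.0, 6.0], [None, None, 9.0]]
print(missing_summary(toy))
print(f"{share_rows:.2f}% of rows, {share_cells:.2f}% of cells")
```

Under these assumptions, about half of the instances contain at least one gap, while only about 1% of all cells are empty, so the gaps are spread thinly across many rows.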

Data from the training set of the Kaggle Loan Default Prediction - Imperial College London challenge: https://www.kaggle.com/c/loan-default-prediction

This data corresponds to a set of financial transactions associated with individuals. The data has been standardized, de-trended, and anonymized. You are provided with over two hundred thousand observations and nearly 800 features. Each observation is independent of the previous one.

For each observation, it was recorded whether a default was triggered. In case of a default, the loss was measured. This quantity lies between 0 and 100. It has been normalised, considering that the notional of each transaction at inception is 100. For example, a loss of 60 means that only 40 is reimbursed. If the loan did not default, the loss was 0. You are asked to predict the losses for each observation in the test set.
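
The normalisation described above can be sketched as a small Python function. The names `normalised_loss` and `defaulted` are hypothetical, and treating any loss above zero as a default is an assumption based on the statement that non-defaulting loans have a loss of 0.

```python
def normalised_loss(reimbursed, notional=100.0):
    """Loss on a 0-100 scale, given the amount repaid on a loan of size `notional`."""
    if notional <= 0:
        raise ValueError("notional must be positive")
    if not 0 <= reimbursed <= notional:
        raise ValueError("reimbursed must lie between 0 and the notional")
    return 100.0 * (notional - reimbursed) / notional


def defaulted(loss):
    """Assumption: a loan counts as defaulted when its recorded loss is above zero."""
    return loss > 0


print(normalised_loss(40))          # the example from the text: 40 reimbursed of 100
print(normalised_loss(100))         # fully repaid loan
print(normalised_loss(2000, 5000))  # same 60% loss on a different notional
```

The first call reproduces the example in the text: with a notional of 100 and 40 reimbursed, the loss is 60. A fully repaid loan has a loss of 0.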

Missing feature values have been kept as is, so that the competing teams can really use the maximum data available, implementing a strategy to fill the gaps if desired. Note that some variables may be categorical (e.g. f776 and f777).
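
As one possible gap-filling strategy (not one prescribed by the competition), the sketch below replaces each missing numeric value with the median of its column. The function name `impute_median` is hypothetical, gaps are assumed to be loaded as `None`, and columns that turn out to be categorical would likely need a different treatment, such as the most frequent value.

```python
from statistics import median


def impute_median(rows):
    """Return a copy of `rows` with each None replaced by its column median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 when a column has no observed values at all.
        medians.append(median(observed) if observed else 0.0)
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Toy data, made up for illustration.
rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
filled = impute_median(rows)
print(filled)
```

In a modelling workflow the medians would normally be computed on the training portion only and then reused for any held-out data, so that no information leaks from the evaluation rows.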

The competition sponsor has worked to remove time-dimensionality from the data. However, the observations are still listed in order from old to new.
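
Because the rows are listed from old to new, a validation split that respects this order may be more realistic than a random one. The following is a minimal sketch. The function name `chronological_split` and the 20% hold-out share are illustrative assumptions, and the approach only makes sense if the file's row order has been preserved on loading.

```python
def chronological_split(rows, holdout_fraction=0.2):
    """Split ordered rows into (older, newer) parts without shuffling."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    n_holdout = int(len(rows) * holdout_fraction)
    cut = len(rows) - n_holdout
    # The oldest rows are used for fitting, the most recent rows for validation.
    return rows[:cut], rows[cut:]


ordered = list(range(10))  # stand-in for ten rows in file order
train, holdout = chronological_split(ordered)
print(train, holdout)
```

With ten rows and the default fraction, the first eight rows form the training part and the last two form the hold-out part.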



