KDDCup09_churn
OpenML dataset with id 1112
Full work available at URL: https://api.openml.org/data/v1/download/53995/KDDCup09_churn.arff
Upload date: 7 October 2014
Dataset Characteristics
Number of classes: 2
Number of features: 231 (numeric: 192, symbolic: 39, binary: 5)
Number of instances: 50,000
Number of instances with missing values: 50,000
Number of missing values: 8,024,152
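To put the missing-value counts above in proportion, the following minimal sketch computes the share of empty cells from the listed figures. It assumes the count of 231 features includes the target column and that every instance has exactly one cell per feature; the variable names are illustrative only.

```python
# Figures copied from the dataset characteristics listed above.
n_instances = 50_000
n_features = 231        # assumption: this count includes the target column
n_missing = 8_024_152

# Total number of cells in the instance-by-feature table.
total_cells = n_instances * n_features

# Share of all cells that are missing, and the average per instance.
missing_fraction = n_missing / total_cells
missing_per_instance = n_missing / n_instances

print(f"Total cells: {total_cells:,}")
print(f"Missing fraction: {missing_fraction:.1%}")
print(f"Average missing values per instance: {missing_per_instance:.2f}")
```

Under these assumptions roughly 69.5% of all cells are empty, about 160 missing values per instance on average, so any model fitted to this data needs an explicit strategy for missing values.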
Author: Orange Telecom
Source: ACM KDD Cup - 2009
The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French telecom company Orange to predict the propensity of customers to switch providers (churn).
Churn (Wikipedia definition): Churn rate is also sometimes called attrition rate. It is one of two primary factors that determine the steady-state level of customers a business will support. In its broadest sense, churn rate is a measure of the number of individuals or items moving into or out of a collection over a specific period of time.
The term is used in many contexts, but is most widely applied in business with respect to a contractual customer base. For instance, it is an important factor for any business with a subscriber-based service model, including mobile telephone networks and pay TV operators. The term is also used to refer to participant turnover in peer-to-peer networks.
The training set contains 50,000 examples.
The first 190 predictive variables are numerical and the last 40 predictive variables are categorical.
The final variable is the binary target, with values in {-1, 1}.
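The layout described above can be sketched as a small row-splitting helper. This is an illustration under stated assumptions, not the dataset's official loader: it takes the 190/40/1 column order from the description (the type counts OpenML lists for the file, 192 numeric and 39 symbolic, differ from it), assumes each record is already a list of strings, and assumes missing values use the ARFF marker `?`. The name `split_row` and the remapping of the target from {-1, 1} to {0, 1} are hypothetical conveniences.

```python
from typing import List, Optional, Tuple

N_NUMERIC = 190       # per the description: first 190 predictive variables
N_CATEGORICAL = 40    # per the description: last 40 predictive variables
MISSING = "?"         # ARFF marker for a missing value


def split_row(
    row: List[str],
) -> Tuple[List[Optional[float]], List[Optional[str]], int]:
    """Split one raw record into numeric features, categorical features,
    and a 0/1 label (hypothetical helper, not part of any library)."""
    expected = N_NUMERIC + N_CATEGORICAL + 1
    if len(row) != expected:
        raise ValueError(f"expected {expected} fields, got {len(row)}")

    # Numeric block: parse to float, keep missing values as None.
    numeric = [None if v == MISSING else float(v) for v in row[:N_NUMERIC]]

    # Categorical block: keep the raw string, missing values as None.
    cat_end = N_NUMERIC + N_CATEGORICAL
    categorical = [None if v == MISSING else v for v in row[N_NUMERIC:cat_end]]

    # Target: the last field, given as -1 or 1; remap to 0 or 1.
    target = int(row[-1])
    if target not in (-1, 1):
        raise ValueError(f"unexpected target value: {target}")
    label = (target + 1) // 2

    return numeric, categorical, label


# Synthetic record for illustration only (not real data from the dataset).
example = ["1.5"] + [MISSING] * 189 + ["a"] + [MISSING] * 39 + ["-1"]
numeric, categorical, label = split_row(example)
print(numeric[0], categorical[0], label)
```

Remapping the target to {0, 1} is a common convenience for libraries that expect non-negative class labels; keeping the original {-1, 1} encoding is equally valid.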