A full presentation is available, covering pre-processing, modeling details and results, and a discussion of how to reduce the churn rate.
After the pre-processing step, the dataset contained 7032 clients, each with 32 features (up from 20 originally). The relationships among the original features are illustrated here.
Models were built to predict the churn label. To find the best model, logistic regression, SVM, GBDT, and a neural network were compared. The data was split into training and test sets at a 4:1 ratio, and features were normalized using statistics from the training set. Hyperparameters for each model were tuned with a stratified 5-fold cross-validated grid search. GBDT using LGBoost performed best, with an accuracy of 80.24%.
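The split and normalization steps can be sketched as follows. This is a minimal, standard-library-only illustration and not the code from the notebooks: the helper names (`stratified_split`, `fit_scaler`, `transform`), the toy data, and the choice of z-score scaling are all assumptions made for the example. The point it shows is that the scaling statistics are computed on the training rows and then reused unchanged on the test rows.

```python
import random
import statistics
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions similar in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply z-score scaling with previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy data (made up for illustration): 10 clients, 2 numeric features, binary churn label.
X = [[1, 20.0], [5, 35.5], [12, 50.0], [24, 70.2], [36, 80.0],
     [48, 90.1], [60, 99.9], [2, 25.0], [3, 30.0], [72, 110.0]]
y = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]

# 4:1 training-to-testing ratio corresponds to a test fraction of 0.2.
train_idx, test_idx = stratified_split(y, test_fraction=0.2, seed=42)
X_train = [X[i] for i in train_idx]
X_test = [X[i] for i in test_idx]

# Statistics come from the training set only; the test set reuses them.
means, stds = fit_scaler(X_train)
X_train_scaled = transform(X_train, means, stds)
X_test_scaled = transform(X_test, means, stds)
```

Fitting the scaler on the training rows alone keeps information about the test set out of the model inputs. In a scikit-learn workflow, `train_test_split` with its `stratify` argument and `StandardScaler` would typically play these roles.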
A telecom client dataset was explored and features were engineered in notebook 01. The feature set was then used to fit 7 models, including 4 decision-tree variants, in notebook 02. The best model was selected by comparing several evaluation scores, and its dependence on the features was plotted in a feature importance chart. Finally, in notebook 03, the resulting model was studied to plan actions for lowering the churn rate.