Category: Data Science
Introduction
PowerCo, a client of Boston Consulting Group (BCG), aims to analyze its customer dataset to identify patterns and predict which customers are likely to churn within the next three months. The company seeks to implement a data-driven churn prediction model to proactively address customer attrition and improve retention strategies.
To achieve this, PowerCo has requested a thorough analysis of historical customer behavior, contract details, energy consumption trends, and pricing factors to build a predictive model that accurately identifies customers at high risk of churning. The model will help PowerCo make targeted interventions before customers leave, minimizing revenue loss and improving customer satisfaction.
Additionally, PowerCo is exploring whether offering a 20% discount to customers predicted to churn would be a profitable retention strategy. The key question is whether the increased retention rate and extended customer lifetime value (CLV) would outweigh the revenue loss caused by the discount.
This project will help PowerCo make data-driven retention decisions, optimize pricing strategies, and ultimately improve profitability while reducing customer churn.
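The discount question above comes down to comparing expected margin with and without the offer. The sketch below is one hypothetical way to frame it for a single customer, not PowerCo's method. The function name, the `retention_uplift` parameter (the assumed share of would-be churners who stay because of the discount), and the assumption that the discount reduces margin one-for-one with revenue are all illustrative.

```python
def discount_net_benefit(annual_margin, annual_revenue, churn_prob,
                         discount=0.20, retention_uplift=0.5):
    """Expected change in annual margin from offering one customer a discount.

    Assumptions (not from the dataset description):
    - retention_uplift is the fraction of churn probability removed by the offer
    - the discount is taken off revenue and reduces margin by the same amount
    """
    # Without the offer, the customer stays with probability (1 - churn_prob).
    baseline = (1 - churn_prob) * annual_margin

    # With the offer, churn is less likely but every retained customer pays less.
    new_churn_prob = churn_prob * (1 - retention_uplift)
    discounted_margin = annual_margin - discount * annual_revenue
    with_discount = (1 - new_churn_prob) * discounted_margin

    return with_discount - baseline


# A customer unlikely to churn: the discount mostly gives away margin.
low_risk = discount_net_benefit(annual_margin=400.0, annual_revenue=1000.0,
                                churn_prob=0.1)
# A customer very likely to churn, with an assumed perfectly effective offer.
high_risk = discount_net_benefit(annual_margin=400.0, annual_revenue=1000.0,
                                 churn_prob=0.9, retention_uplift=1.0)
```

Under these made-up numbers the offer loses money on the low-risk customer and gains on the high-risk one, which is why the discount would normally be targeted by predicted churn probability rather than offered to everyone.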
About this file
client_data.csv
id = client company identifier
activity_new = category of the company’s activity
channel_sales = code of the sales channel
cons_12m = electricity consumption of the past 12 months
cons_gas_12m = gas consumption of the past 12 months
cons_last_month = electricity consumption of the last month
date_activ = date of activation of the contract
date_end = registered date of the end of the contract
date_modif_prod = date of the last modification of the product
date_renewal = date of the next contract renewal
forecast_cons_12m = forecasted electricity consumption for next 12 months
forecast_cons_year = forecasted electricity consumption for the next calendar year
forecast_discount_energy = forecasted value of current discount
forecast_meter_rent_12m = forecasted bill of meter rental for the next 12 months
forecast_price_energy_off_peak = forecasted energy price for 1st period (off peak)
forecast_price_energy_peak = forecasted energy price for 2nd period (peak)
forecast_price_pow_off_peak = forecasted power price for 1st period (off peak)
has_gas = indicates whether the client is also a gas client
imp_cons = current paid consumption
margin_gross_pow_ele = gross margin on power subscription
margin_net_pow_ele = net margin on power subscription
nb_prod_act = number of active products and services
net_margin = total net margin
num_years_antig = antiquity (tenure) of the client, in years
origin_up = code of the electricity campaign the customer first subscribed to
pow_max = subscribed power
churn = whether the client churned over the next 3 months
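The four date fields listed above usually need to be parsed before they can feed a model. The following is a minimal sketch of loading the client file with pandas, assuming ISO-formatted dates; the helper name `load_client_data`, the derived `contract_days` column, and the two sample rows are made up for illustration and only cover a subset of the columns.

```python
import io

import pandas as pd

# Date columns named in the client_data.csv field list.
DATE_COLS = ["date_activ", "date_end", "date_modif_prod", "date_renewal"]


def load_client_data(source):
    """Read client data from a path or file-like object, parsing the date columns."""
    return pd.read_csv(source, parse_dates=DATE_COLS)


# Tiny invented sample standing in for client_data.csv.
sample = io.StringIO(
    "id,cons_12m,date_activ,date_end,date_modif_prod,date_renewal,churn\n"
    "a1,1200,2013-06-15,2016-06-15,2015-11-01,2015-06-23,1\n"
    "b2,4660,2009-08-21,2016-08-30,2009-08-21,2015-08-31,0\n"
)
client = load_client_data(sample)

# Example derived feature: contract length in days.
client["contract_days"] = (client["date_end"] - client["date_activ"]).dt.days

# Share of clients flagged as churned in the sample.
churn_rate = client["churn"].mean()
```

With the real file, `load_client_data("client_data.csv")` would be the equivalent call.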
price_data.csv
id = client company identifier
price_date = reference date
price_off_peak_var = price of energy for the 1st period (off peak)
price_peak_var = price of energy for the 2nd period (peak)
price_mid_peak_var = price of energy for the 3rd period (mid peak)
price_off_peak_fix = price of power for the 1st period (off peak)
price_peak_fix = price of power for the 2nd period (peak)
price_mid_peak_fix = price of power for the 3rd period (mid peak)
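Because price_data.csv is keyed by both id and price_date, it likely holds several rows per client and has to be collapsed to one row per id before joining to the client table. The sketch below shows one possible aggregation for a single price column; the function name, the two summary features, and the sample values are hypothetical.

```python
import pandas as pd


def summarize_prices(price):
    """Collapse per-date price rows into one row of features per client id."""
    # Sort so that first()/last() refer to the earliest and latest reference dates.
    price = price.sort_values(["id", "price_date"])
    grouped = price.groupby("id")["price_off_peak_var"]
    summary = pd.DataFrame({
        "off_peak_var_mean": grouped.mean(),
        "off_peak_var_change": grouped.last() - grouped.first(),
    })
    return summary.reset_index()


# Invented sample rows standing in for price_data.csv and client_data.csv.
price = pd.DataFrame({
    "id": ["a1", "a1", "b2", "b2"],
    "price_date": pd.to_datetime(
        ["2015-01-01", "2015-12-01", "2015-01-01", "2015-12-01"]
    ),
    "price_off_peak_var": [0.10, 0.12, 0.15, 0.15],
})
client = pd.DataFrame({"id": ["a1", "b2"], "churn": [1, 0]})

# Left join keeps every client, even one with no price history.
features = client.merge(summarize_prices(price), on="id", how="left")
```

The same pattern extends to the other five price columns by selecting them in the `groupby` step.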
Note: some fields are hashed text strings. This preserves the privacy of the original data, but the commercial meaning is retained, so these fields may still have predictive power.
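Hashed fields can still be used as ordinary categorical variables. The sketch below one-hot encodes such a column and pools infrequent values. Treating channel_sales as a hashed field is an assumption, and the helper name, the `min_count` threshold, and the sample strings are invented for illustration.

```python
import pandas as pd


def encode_hashed(df, column, min_count=2):
    """One-hot encode a hashed categorical column, pooling rare values."""
    counts = df[column].value_counts()
    rare = counts[counts < min_count].index
    # Missing values become "MISSING"; hashes seen fewer than min_count times become "OTHER".
    cleaned = df[column].fillna("MISSING").where(~df[column].isin(rare), "OTHER")
    dummies = pd.get_dummies(cleaned, prefix=column, dtype=int)
    return pd.concat([df.drop(columns=column), dummies], axis=1)


# Invented sample; the strings only imitate hashed category codes.
df = pd.DataFrame({
    "id": ["a1", "b2", "c3", "d4"],
    "channel_sales": ["foosdf", "foosdf", "lmkebamc", None],
})
encoded = encode_hashed(df, "channel_sales")
```

Pooling rare hashes keeps the number of indicator columns manageable when a field has many distinct values.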