2nd Place Solution - Predicting customer return behaviour during holiday shopping using strategic feature engineering
The challenge involved predicting customer return behaviour during holiday shopping. Given transaction data, the task was to classify whether customers would return purchased products - a binary classification problem with significant business implications for retail operations.
The winning approach focused on quality over quantity - 17 carefully curated features outperformed baselines with 55+ one-hot encoded features.
The solution employed a straightforward yet effective pipeline:
# Aggregation features - capturing historical patterns
df['customer_return_rate'] = df.groupby('CustomerID')['ReturnFlag'].transform('mean')
df['product_return_rate'] = df.groupby('ProductID')['ReturnFlag'].transform('mean')
df['category_return_rate'] = df.groupby('Category')['ReturnFlag'].transform('mean')
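The writeup does not say how these target-derived rates were handled at validation time; computed naively on the full frame, each row's rate includes its own label. A common leakage-safe variant computes them out-of-fold. The sketch below is illustrative (the helper name `oof_return_rate` is hypothetical; column names follow the snippet above):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def oof_return_rate(df, key, target='ReturnFlag', n_splits=5, seed=42):
    """Out-of-fold mean of `target` per `key`: each row's rate is
    computed from the other folds only, so it never sees its own label."""
    rate = pd.Series(np.nan, index=df.index)
    for train_idx, val_idx in KFold(n_splits, shuffle=True, random_state=seed).split(df):
        fold_means = df.iloc[train_idx].groupby(key)[target].mean()
        rate.iloc[val_idx] = df.iloc[val_idx][key].map(fold_means).to_numpy()
    # Keys unseen in a training fold fall back to the global rate
    return rate.fillna(df[target].mean())

# Hypothetical usage mirroring the features above:
# df['customer_return_rate'] = oof_return_rate(df, 'CustomerID')
```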
# Temporal features - holiday shopping patterns
df['day_of_week'] = df['PurchaseDate'].dt.dayofweek
df['is_weekend'] = df['day_of_week'].isin([5, 6]).astype(int)
df['is_christmas_week'] = (df['PurchaseDate'].dt.isocalendar().week == 52).astype(int)
# Restrict to December: day > 25 alone would also flag the 26th-31st of every other month
df['is_post_christmas'] = ((df['PurchaseDate'].dt.month == 12) & (df['PurchaseDate'].dt.day > 25)).astype(int)
# Final pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression(max_iter=3000, random_state=42))
])
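A minimal end-to-end run of this pipeline, on synthetic data (column names follow the snippets above; the data itself and the feature subset are purely illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic December transactions standing in for the real competition data
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    'CustomerID': rng.integers(0, 50, n),
    'PurchaseDate': pd.to_datetime('2023-12-01')
                    + pd.to_timedelta(rng.integers(0, 31, n), unit='D'),
    'ReturnFlag': rng.integers(0, 2, n),
})

# Two of the engineered features, as in the writeup
df['customer_return_rate'] = df.groupby('CustomerID')['ReturnFlag'].transform('mean')
df['day_of_week'] = df['PurchaseDate'].dt.dayofweek

X = df[['customer_return_rate', 'day_of_week']]
y = df['ReturnFlag']

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression(max_iter=3000, random_state=42)),
])
pipeline.fit(X, y)
proba = pipeline.predict_proba(X)[:, 1]  # per-row return probabilities
```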
The simplicity of the model was intentional - with well-engineered features, a simple model can outperform complex ensembles built on raw features.