QMUL Machine Learning Society Christmas Hackathon 2025

2nd Place Solution: Predicting customer return behaviour during holiday shopping using strategic feature engineering

Problem Statement

The challenge involved predicting customer return behaviour during holiday shopping. Given transaction data, the task was to classify whether customers would return purchased products - a binary classification problem with significant business implications for retail operations.

  • Training transactions: 8,000
  • Test transactions: 2,000
  • Original features: 25
  • Return rate: ~50%

Feature Engineering Strategy

The approach prioritised quality over quantity: 17 carefully curated features outperformed baseline models built on 55+ one-hot encoded features.

Aggregation Features (3 features)

  • Customer-level return history: Identifying repeat returners through historical patterns
  • Product-level return patterns: Highlighting problematic items with high return rates
  • Category-level return statistics: Capturing industry-specific trends

Temporal Features (6 features)

  • Day-of-week indicators for shopping pattern analysis
  • Monthly and daily components for seasonal trends
  • Weekend flags to capture leisure shopping behaviour
  • IsPostChristmas and IsChristmasWeek flags for holiday-specific patterns

Original Features (8 selected)

  • Age, quantity, and pricing information
  • Customer satisfaction scores
  • Discount percentages and promotion indicators
  • Online purchase and gift-wrap flags

Modelling Approach

The solution employed a straightforward yet effective pipeline:

  • Preprocessing: StandardScaler for feature normalisation
  • Model: Logistic Regression (max_iter=3000, random_state=42)
  • Training: All 8,000 training samples utilised
  • Validation: Cross-validation for robust performance estimation (see the sketch after the pipeline code below)

Code: Feature Engineering Pipeline

# Assumes df is the transaction DataFrame with PurchaseDate parsed as datetime
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Aggregation features - capturing historical patterns
df['customer_return_rate'] = df.groupby('CustomerID')['ReturnFlag'].transform('mean')
df['product_return_rate'] = df.groupby('ProductID')['ReturnFlag'].transform('mean')
df['category_return_rate'] = df.groupby('Category')['ReturnFlag'].transform('mean')

# Temporal features - holiday shopping patterns
df['day_of_week'] = df['PurchaseDate'].dt.dayofweek
df['month'] = df['PurchaseDate'].dt.month
df['day'] = df['PurchaseDate'].dt.day
df['is_weekend'] = df['day_of_week'].isin([5, 6]).astype(int)
df['is_christmas_week'] = (df['PurchaseDate'].dt.isocalendar().week == 52).astype(int)
# Guard on month == 12 so that late days in other months are not flagged
df['is_post_christmas'] = ((df['PurchaseDate'].dt.month == 12)
                           & (df['PurchaseDate'].dt.day > 25)).astype(int)

# Final pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression(max_iter=3000, random_state=42))
])
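
A minimal sketch of the validation step, reusing the pipeline above and the X/y matrix from the feature-assembly sketch (the fold count and scoring metric are assumptions; the writeup does not record them):

from sklearn.model_selection import cross_val_score

# cv=5 and accuracy are assumed - the actual fold count and metric were not recorded
scores = cross_val_score(pipeline, X, y, cv=5, scoring='accuracy')
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")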

The simplicity of the model was intentional - with well-engineered features, a simple model can outperform complex ensembles built on raw features.

Key Success Factors

  • Aggregation features provided strong predictive signals by capturing behavioural patterns that individual transactions couldn't reveal
  • Domain knowledge of holiday shopping informed temporal feature engineering - understanding that post-Christmas returns follow different patterns than regular shopping
  • Feature quality over quantity: 17 curated features beat 55+ one-hot encoded features
  • Simple model, complex features: LogisticRegression with good features outperformed complex models with raw features

Tools & Technologies

  • Python: Primary programming language
  • Jupyter Notebook: Interactive development for rapid iteration
  • Pandas & NumPy: Data manipulation and feature engineering
  • Scikit-learn: StandardScaler, LogisticRegression, cross-validation
  • Matplotlib & Seaborn: EDA visualisation

Key Learnings

  • Feature engineering is often more valuable than model complexity - time spent understanding the data and creating meaningful features pays off more than hyperparameter tuning
  • Domain knowledge matters: Understanding holiday shopping patterns directly translated to better features
  • Aggregation features capture patterns individual samples can't: Customer and product history revealed trends invisible at the transaction level
  • Simple models are interpretable: LogisticRegression coefficients helped validate that features made business sense (see the sketch below)
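
A minimal sketch of that sanity check, assuming the fitted pipeline, feature matrix X, and FEATURES list from the snippets above:

pipeline.fit(X, y)
coefs = pd.Series(pipeline.named_steps['classifier'].coef_[0], index=FEATURES)
print(coefs.sort_values(ascending=False))
# A large positive weight on customer_return_rate, for example, would confirm
# that habitual returners drive the prediction - matching business intuition

Because StandardScaler normalises every feature first, the coefficient magnitudes are roughly comparable across features, which is what makes this kind of inspection meaningful.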