This repository contains the data and the code used in Polimi Kaggle competition. The application domain is book recommendation. The main goal of the competition is to discover which items a user will interact with

FrancescoZanella/RecSystems

Politecnico di Milano

Recommender System Challenge 2023/2024 Polimi

This repository contains the data and the code used in the Polimi Kaggle competition. The application domain is book recommendation. The provided dataset contains the interactions of users with books; specifically, an interaction is recorded when a user gave a book a rating of at least 4. The main goal of the competition is to predict which items (books) a user will interact with.

Dataset

The dataset includes around 600k interactions, 13k users, and 22k items (books). The training-test split is a random holdout: 80% training, 20% test. The goal is to recommend a list of 10 potentially relevant items for each user. MAP@10 is used for evaluation. Any kind of recommender algorithm is allowed, as long as it is written in Python.
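As an illustration of the metric, here is a minimal sketch of MAP@10 in plain Python. It assumes one common definition, in which each user's average precision is normalized by min(k, number of held-out items); the exact formula used by the competition is not stated here, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def average_precision_at_k(recommended: Sequence[int], relevant: Set[int], k: int = 10) -> float:
    """AP@k for a single user (assumed normalization: min(k, number of relevant items))."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    # Only the first k recommendations count; the list is assumed to have no duplicates.
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / min(k, len(relevant))


def map_at_k(recommendations: Dict[int, List[int]], held_out: Dict[int, Set[int]], k: int = 10) -> float:
    """Mean of AP@k over the users that have at least one held-out item."""
    users = [user for user, items in held_out.items() if items]
    if not users:
        return 0.0
    total = sum(average_precision_at_k(recommendations.get(user, []), held_out[user], k) for user in users)
    return total / len(users)
```

A user with no recommendation list contributes an average precision of 0, so missing predictions lower the score rather than being skipped.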

Results

Deadline 2 (final):

  • Public leaderboard: 2nd
  • Private leaderboard: 3rd

Deadline 1:

  • Public leaderboard: 2nd
  • Private leaderboard: 2nd

There were 63 teams in the competition.

Recommender

The recommender system that earned us third place in the challenge is a hybrid model that combines:

  • SLIM Elastic Net
  • Item KNN
  • RP3beta

We merged the similarity matrices of these recommenders, assigning each one a different weight chosen through hyperparameter tuning with Optuna.
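The weighted merge of similarity matrices can be sketched as follows. This is an illustration under stated assumptions, not the code in this repository: the matrices are small dense NumPy arrays (real item-item similarities would normally be sparse), the weights are placeholder values rather than the tuned ones, and `combine_similarities` and `recommend` are hypothetical names. Whether the matrices were normalized before merging is not stated, so none is applied here.

```python
import numpy as np


def combine_similarities(similarities, weights):
    """Weighted sum of item-item similarity matrices that share the same shape."""
    if len(similarities) != len(weights):
        raise ValueError("one weight per similarity matrix is required")
    combined = np.zeros_like(similarities[0], dtype=float)
    for similarity, weight in zip(similarities, weights):
        combined += weight * similarity
    return combined


def recommend(user_profile, similarity, k=10):
    """Score items as profile @ similarity, drop already-seen items, return the top k."""
    scores = (user_profile @ similarity).astype(float)
    scores[user_profile > 0] = -np.inf  # never recommend items the user already has
    return np.argsort(-scores)[:k].tolist()


# Toy example with 3 items and two base similarity matrices (placeholder values).
sim_a = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
sim_b = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
hybrid = combine_similarities([sim_a, sim_b], weights=[0.7, 0.3])
top_items = recommend(np.array([1, 0, 0]), hybrid, k=2)
```

In a tuning loop, the weights would be the parameters being searched, with the validation metric computed from the resulting recommendations.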

XGBoost

We used XGBoost (XGB) to further improve the performance of our model. We used the following as features:

  • Top Popular
  • RP3beta 
  • SLIMen
  • SLIMbpr
  • ItemKNN
  • The best hybrid (SLIMen + ItemKNN + RP3beta)
  • P3alpha
  • User profile length
  • Item popularity

This is a representation of the architecture:

N.B. The candidate generator we used is heavily optimized for recall rather than for MAP.
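To make the two-stage idea concrete, the sketch below builds the feature rows for one user: a candidate generator proposes items, and each candidate gets one column per base recommender score plus the user profile length and the item popularity. It is a simplified illustration with dense NumPy arrays and hypothetical names (`build_candidate_features`, `candidate_scores`, `base_scores`); the actual pipeline in this repository may differ. The resulting rows are the kind of input a gradient-boosted ranker such as XGBoost would then re-rank.

```python
import numpy as np


def build_candidate_features(user_id, urm, candidate_scores, base_scores, n_candidates=20):
    """Return (candidate_items, feature_matrix) for one user.

    urm: dense user-item interaction matrix (assumption: binary entries).
    candidate_scores: user-item scores from the recall-oriented candidate generator.
    base_scores: dict mapping a recommender name to its user-item score matrix.
    """
    # Stage 1: keep the n highest-scored items as candidates.
    candidates = np.argsort(-candidate_scores[user_id])[:n_candidates]

    profile_length = float(urm[user_id].sum())   # interactions of this user
    item_popularity = urm.sum(axis=0)            # interactions per item

    # Stage 2 input: one column per base recommender, then the two extra features.
    columns = [scores[user_id, candidates] for scores in base_scores.values()]
    columns.append(np.full(len(candidates), profile_length))
    columns.append(item_popularity[candidates].astype(float))
    return candidates, np.column_stack(columns)


# Toy data: 2 users, 4 items, two base recommenders with placeholder scores.
urm = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
candidate_scores = np.array([[0.1, 0.9, 0.2, 0.8], [0.5, 0.1, 0.4, 0.3]])
base_scores = {
    "item_knn": np.array([[0.0, 0.6, 0.0, 0.2], [0.3, 0.0, 0.1, 0.0]]),
    "rp3beta": np.array([[0.0, 0.4, 0.0, 0.5], [0.2, 0.0, 0.3, 0.1]]),
}
items, features = build_candidate_features(0, urm, candidate_scores, base_scores, n_candidates=2)
```

Optimizing the first stage for recall fits this design: the ranker can only reorder items that made it into the candidate set, so a relevant item missed by the generator cannot be recovered later.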

Hyperparameter tuning

Hyperparameter tuning was done on:

  • Kaggle free GPU plan
  • Asus Zenbook

Presentation

For further information on how the tuning was done and how we structured our work pipeline, you can read the presentation.

Contributors

Francesco Zanella

Federico Ciliberto

Credits

This repository is based on Maurizio Ferrari Dacrema's repository.
