This repository is the official implementation of N. Vlassis, A. Chandrashekar, F. Amat and N. Kallus. "Control Variates for Slate Off-Policy Evaluation", in NeurIPS, 2021.
To install requirements:
pip install -r requirements.txt
In order to run these experiments you need to download the Microsoft Learning to Rank Datasets and follow these steps:
- Download the dataset here. It is a ~3.7GB zip file.
- You will need to sign in with a OneDrive account to download the file.
- Inside the file there are five folders named `Fold%d`. Use only subfolder `Fold1`: extract the three `txt` files it contains.
- Combine the three extracted txt files with the following Unix command:
cat test.txt train.txt vali.txt > mslr.txt
- Copy the file `mslr.txt` (~4.1GB) to the relative path `./slateOPE/MSLR_WEB30K/Datasets/mslr/mslr.txt`
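On systems without `cat`, the combine step can be done with the Python standard library. This is a minimal sketch, not part of the repository: the helper name `concatenate` is hypothetical, and it assumes the three files sit in the current working directory.

```python
import shutil


def concatenate(parts, out_path):
    """Append the files in `parts`, in order, into `out_path` (byte for byte)."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so the multi-GB files never sit fully in memory.
                shutil.copyfileobj(src, out)


if __name__ == "__main__":
    # Same order as the `cat` command in the steps above.
    concatenate(["test.txt", "train.txt", "vali.txt"], "mslr.txt")
```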
Note: if you only want to check that the code runs, you can first downsample the `mslr.txt` file. Once you have verified that the code runs on your machine, you can rerun it with the entire dataset. You should have at least 64GB to run the code with all the data.
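One way to downsample is to keep every line belonging to the first few queries, so that no query is cut in half. The sketch below is not part of the repository. It assumes the usual MSLR line layout, `<relevance> qid:<id> <feature>:<value> ...`, and the function name and query budget are illustrative.

```python
def downsample_by_query(src_path, dst_path, max_queries):
    """Copy all lines of the first `max_queries` distinct queries to `dst_path`.

    Returns the number of lines written.
    """
    seen = set()
    kept = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            parts = line.split(maxsplit=2)
            if len(parts) < 2 or not parts[1].startswith("qid:"):
                continue  # skip lines that do not match the assumed layout
            qid = parts[1]
            if qid not in seen:
                if len(seen) >= max_queries:
                    continue  # new query beyond the budget: drop it
                seen.add(qid)
            dst.write(line)
            kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical file names and budget; adjust to your setup.
    n = downsample_by_query("mslr_full.txt", "mslr.txt", max_queries=1000)
    print(f"kept {n} lines")
```

Write the sample to a different file than the one being read. Opening the same path for reading and writing would truncate the input.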
In order to reproduce the data points in Section 6, Figures 1 and 2 of the paper, go to the `MSLR_WEB30K` folder and run the `main.py` script.
For example, `python3 main.py -m 20 -k 5 -r NDCG -n 5000 -s 300` runs a simulation for all estimators with 5 slots, the top M=20 predicted documents, and the NDCG metric. Each simulation is run 300 times with 5000 samples each.
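Each point in a figure corresponds to one such invocation, so a small driver script can save retyping. The sketch below is not part of the repository. It uses only the flags shown in the example above, the helper names are hypothetical, and the sample sizes are illustrative rather than the values used in the paper. It is meant to be run from inside the `MSLR_WEB30K` folder.

```python
import subprocess


def build_command(top_m, slots, metric, samples, runs):
    """Build the argument list for one main.py invocation."""
    return [
        "python3", "main.py",
        "-m", str(top_m),
        "-k", str(slots),
        "-r", metric,
        "-n", str(samples),
        "-s", str(runs),
    ]


def sweep(sample_sizes, dry_run=True):
    """Run (or, with dry_run=True, only print) one command per sample size."""
    commands = [build_command(20, 5, "NDCG", n, 300) for n in sample_sizes]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


if __name__ == "__main__":
    # Illustrative sample sizes only.
    sweep([1000, 5000], dry_run=True)
```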
In order to reproduce any of the data points in Section 7, Figure 3 of the paper, go to the `simulator` folder and run the `main.py` script.
For example, `python3 main.py -k 2 -d 10 -n 600 -s 6000` runs the simulation for all estimators for a slate of 2 slots with 10 actions per slot. Each simulation is run 6000 (= 20 x 300) times with 600 samples each. This is the first point (upper left corner) displayed in Figure 3.
Note: this part of the code uses the Joblib package to parallelize the simulations and reduce computation time.
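To make the "s runs of n samples each" protocol concrete, here is a toy sketch that compares two estimators by their mean squared error over repeated runs. It is not the repository's simulator and does not implement the paper's slate estimators. It uses a generic textbook control variate on a one-dimensional problem, runs sequentially with the standard library only (no Joblib), and all names are hypothetical.

```python
import math
import random
import statistics


def run_once(n, rng):
    """One run: draw n samples and return two estimates of E[exp(X)], X ~ U(0, 1)."""
    xs = [rng.random() for _ in range(n)]
    fs = [math.exp(x) for x in xs]
    mean_x = statistics.fmean(xs)
    mean_f = statistics.fmean(fs)
    # Regression coefficient of f on X, estimated from the same sample.
    cov_fx = sum((f - mean_f) * (x - mean_x) for f, x in zip(fs, xs)) / (n - 1)
    beta = cov_fx / statistics.variance(xs, mean_x)
    plain = mean_f
    # Control variate: X itself, whose mean 0.5 is known exactly.
    adjusted = mean_f - beta * (mean_x - 0.5)
    return plain, adjusted


def mse_over_runs(n, s, seed=0):
    """Repeat the run s times and return the MSE of each estimator."""
    rng = random.Random(seed)
    truth = math.e - 1.0  # exact value of E[exp(X)] for X ~ U(0, 1)
    sq_plain = 0.0
    sq_adjusted = 0.0
    for _ in range(s):
        plain, adjusted = run_once(n, rng)
        sq_plain += (plain - truth) ** 2
        sq_adjusted += (adjusted - truth) ** 2
    return sq_plain / s, sq_adjusted / s


if __name__ == "__main__":
    mse_plain, mse_adjusted = mse_over_runs(n=600, s=300)
    print(f"plain MSE: {mse_plain:.3e}  control-variate MSE: {mse_adjusted:.3e}")
```

Because the runs are independent, a loop like the one in `mse_over_runs` is straightforward to distribute across workers.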
Our model achieves the following performance on MSLR-WEB30K: