forked from Tharun24/MACH

Extreme Classification in Log Memory via Count-Min Sketch


RUSH-LAB/MACH-1

 
 


This repo will be updated soon with streamlined code (modular structure, TFRecords and faster evaluation).

Download links for datasets

  1. Extreme classification repository - https://manikvarma.org/downloads/XC/XMLRepository.html Please download any of the datasets mentioned in the paper, unzip them and put the files in the respective data folder (like ./amazon_670k/data/).

  2. Download the ODP dataset from https://hunch.net/~vw/odp_train.vw.gz and https://hunch.net/~vw/odp_test.vw.gz . The data format must be changed to match the datasets in the Extreme Classification repository.

  3. Download the ImageNet dataset from https://hunch.net/~jl/datasets/imagenet/training.txt.gz and https://hunch.net/~jl/datasets/imagenet/testing.txt.gz . As with ODP, the data format must be changed to match the datasets in the Extreme Classification repository.
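
The format conversion mentioned in items 2 and 3 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the conversion script used for the paper: it assumes each input line has the simple VW shape `<label> | <index>:<value> ...` (optionally with a namespace name right after the `|`) and that feature names are integer indices. The target layout is the Extreme Classification repository's sparse text format: a header line `num_points num_features num_labels`, then one line per point with comma-separated labels followed by `index:value` pairs. The function names `vw_line_to_xc` and `convert` are hypothetical. Check a few lines of the downloaded files before relying on this.

```python
def vw_line_to_xc(line):
    """Parse one line assumed to look like '<label> | i:v i:v ...'.

    Returns (label, [(feature_index, value), ...]).
    """
    head, _, tail = line.strip().partition("|")
    label = int(head.split()[0])
    tokens = tail.split()
    # Assumption: if text follows '|' with no space, it is a namespace name.
    if tail and not tail[0].isspace() and tokens:
        tokens = tokens[1:]
    feats = []
    for tok in tokens:
        if ":" in tok:
            idx, val = tok.split(":", 1)
        else:
            idx, val = tok, "1"  # a bare feature carries an implicit value of 1
        feats.append((int(idx), float(val)))
    return label, feats


def convert(vw_lines):
    """Return the lines of an XC-style file for the given VW-style lines.

    Ids are copied unchanged; shift them first if the source is 1-based.
    Header counts are estimated as (largest id seen + 1).
    """
    rows = [vw_line_to_xc(l) for l in vw_lines if l.strip()]
    num_features = 1 + max((i for _, fs in rows for i, _ in fs), default=-1)
    num_labels = 1 + max((lab for lab, _ in rows), default=-1)
    out = [f"{len(rows)} {num_features} {num_labels}"]
    for lab, fs in rows:
        out.append(" ".join([str(lab)] + [f"{i}:{v:g}" for i, v in fs]))
    return out
```

For real files, read the input with `gzip.open(path, "rt")` and write the returned lines to the dataset's data folder. Use the same feature and label counts in the header of the train and test files.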

Run MACH

Please move into the src folder for the respective dataset, like 'amazon_670k/src/'. The steps to build indexes, train and evaluate are listed sequentially in run.sh.
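
The contents of run.sh are not reproduced here. As a rough illustration of the idea behind the index-building and evaluation steps, the sketch below assigns each of K classes to one of B buckets, independently for R repetitions, and scores a class by averaging the scores of the buckets it falls into. This is a minimal sketch under assumptions: the function names, the seeded random assignment (used here in place of a proper hash family) and the plain-list data layout are all hypothetical and are not taken from this repository's code.

```python
import random


def build_indexes(num_classes, num_buckets, num_reps, seed=0):
    """Return num_reps lookup tables; table r maps class id -> bucket id.

    Assumption: a seeded random assignment stands in for the hash functions.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(num_buckets) for _ in range(num_classes)]
        for _ in range(num_reps)
    ]


def decode(bucket_scores, indexes):
    """Average, over repetitions, the score of the bucket each class maps to.

    bucket_scores[r][b] is the score that model r gives to bucket b.
    Returns one aggregated score per class.
    """
    num_reps = len(indexes)
    num_classes = len(indexes[0])
    return [
        sum(bucket_scores[r][indexes[r][k]] for r in range(num_reps)) / num_reps
        for k in range(num_classes)
    ]
```

In this picture, each of the R small models is trained to predict a bucket id rather than a class id, so each output layer has B units instead of K. Classes that collide with the true class in one repetition are unlikely to collide with it in every repetition, which is what makes the averaged score informative.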


Languages

  • Python 72.1%
  • Shell 21.8%
  • C++ 5.5%
  • Other 0.6%