vaibhav016/Critical-Data-Size

 
 

Critical Data Size of Language Models from a Grokking Perspective

This repo includes the official code for the paper Critical Data Size of Language Models from a Grokking Perspective.

[Figure: Main_figure]

Prerequisites

  • torch >= 2.0
  • transformers
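
Before running the scripts, it can help to confirm that the environment satisfies the prerequisites listed above. The following is a minimal sketch, not part of this repo: the helper names `major_minor`, `meets_minimum`, and `check_prerequisites` are hypothetical, and it assumes the packages are installed under the distribution names `torch` and `transformers`. It reads installed versions from package metadata only, so it does not import either library.

```python
import re
from importlib import metadata


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int]) -> bool:
    """Return True if the version is at least the given (major, minor)."""
    return major_minor(version) >= minimum


def check_prerequisites() -> dict[str, str]:
    """Report the status of each prerequisite (hypothetical helper)."""
    report = {}
    # torch needs >= 2.0; the README gives no minimum for transformers.
    for package, minimum in (("torch", (2, 0)), ("transformers", None)):
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        if minimum is not None and not meets_minimum(installed, minimum):
            report[package] = f"too old ({installed})"
        else:
            report[package] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_prerequisites().items():
        print(f"{package}: {status}")
```

Saved under any file name and run with Python 3.9 or later, this prints one status line per package: `ok`, `too old`, or `missing`.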

Quick Start

1. Grokking On IMDB

Execute the following command to reproduce our results:

sh run_grokking_on_imdb.sh

2. Grokking on Yelp

sh run_grokking_on_yelp.sh

Citation

@article{zhu2024critical,
  title={Critical data size of language models from a grokking perspective},
  author={Zhu, Xuekai and Fu, Yao and Zhou, Bowen and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2401.10463},
  year={2024}
}

