Stars
Scripts supporting the development and serving of the Roots Search Tool - https://hf.co/spaces/bigscience-data/roots-search
Pipeline for pulling and processing online language model pretraining data from the web
Web-scale retrieval for knowledge-intensive NLP
Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022)
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
A library for building and serving multi-node distributed faiss indices.
Library for fast text representation and classification.