- Norway
Starred repositories
Open-source scientific and technical publishing system built on Pandoc.
Pympress is a simple yet powerful PDF reader designed for dual-screen presentations.
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Utility for behavioral and representational analyses of Language Models
Agentless🐱: an agentless approach to automatically solve software development problems
ReFT: Representation Finetuning for Language Models
The most streamlined road map to learn ML fundamentals for free.
A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
Reconquer the canvas: beautiful TikZ figures without clunky TikZ code
Python module (C extension and plain python) implementing Aho-Corasick algorithm
Fast lexical search library implementing BM25 in Python using SciPy (on average 2x faster than Elasticsearch in a single-threaded setting)
Fast & Simple repository for pre-training and fine-tuning T5-style models
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Efficient few-shot learning with Sentence Transformers
Experiments toward training a new and improved T5
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Paper List for Contrastive Learning for Natural Language Processing
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)
MTEB: Massive Text Embedding Benchmark
Sparsity-aware deep learning inference runtime for CPUs
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)