Starred repositories
Writing AI Conference Papers: A Handbook for Beginners
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
A Comprehensive Benchmark for Code Information Retrieval.
An executable that converts a SOCKS5 proxy into an HTTP proxy
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
Machine Learning Engineering Open Book
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
User-friendly WebUI for AI (Formerly Ollama WebUI)
Fast lexical search library implementing BM25 in Python using NumPy and SciPy
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Minimal PyTorch implementation of BM25 (with sparse tensors)
SGLang is a fast serving framework for large language models and vision language models.
Code repository for the paper "Matryoshka Representation Learning"
📰 Must-read papers and blogs on Speculative Decoding ⚡️
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
Understanding the correlation between different LLM benchmarks
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world GitHub Issues?
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval