Stars
The official repository of the Omni-MATH benchmark.
PDF query chatbot implemented using only cloud-based tools: LlamaIndex, Gemini Embeddings, Groq LLM, Pinecone
Code for the EMNLP 2024 paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".
Discovering Data-driven Hypotheses in the Wild
This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"
👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"
Dagaz Project (Board games and Puzzles)
Benchmark LLM reasoning capability by solving chess puzzles.
A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% of tasks (pass@1) in SWE-bench lite and 38.40% of tasks (pass@1) in SWE-bench verified wi…
Minimalistic chess variant GUI for Fairy-Stockfish, superseded by fairyground
EPD opening book generation and filtering for chess and chess variants
Variant NNUE training data generator for Fairy-Stockfish
Stable Diffusion web UI
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
chess variant NNUE training code (for Fairy-Stockfish)
chess variant NNUE training code for Fairy-Stockfish
A free, open-source and modern Chess Variant Analysis GUI for the 21st century
Chess variant engine supporting Xiangqi, Shogi, Janggi, Makruk, S-Chess, Crazyhouse, Bughouse, and many more