Stars
QGEval: A Benchmark for Question Generation Evaluation
OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
Repo for anonymous purposes; please don't distribute
A Self-Training Framework for Vision-Language Reasoning
This is a collection of resources for computer-use agents, including videos, blogs, papers, and projects.
A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.
SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback with high consistency for papers, thereby assisting researchers in improving the qual…
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning"
This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"
Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions (NeurIPS 2024)
xufangzhi / SeeClick
Forked from njucckevin/SeeClick
The model, data and code for the visual GUI Agent SeeClick
[🏆Outstanding Paper Award at ACL 2024] MMToM-QA: Multimodal Theory of Mind Question Answering
The repository for the project "Fine-tuning Large Language Models with Sequential Instructions"; the code base comes from open-instruct and LAVIS
AcadHomepage: A Modern and Responsive Academic Personal Homepage
A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
[PVLDB 2024 Best Paper Nomination] TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods
Neural Code Intelligence Survey 2024
LongHeads: Multi-Head Attention is Secretly a Long Context Processor