caixinyu2020
  • Shanghai Artificial Intelligence Laboratory
  • Shanghai


VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 774 53 Updated Sep 13, 2024

[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.

Python 80 5 Updated Oct 9, 2024

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

Python 326 20 Updated Jun 23, 2024

Agentic components of the Llama Stack APIs

Python 3,706 552 Updated Oct 10, 2024

[CVPR 2023 Award Candidate] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

Python 453 12 Updated Sep 2, 2024

An open source implementation of CLIP.

Python 9,974 961 Updated Oct 9, 2024

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 5,025 337 Updated Oct 10, 2024

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 3,022 357 Updated Aug 19, 2024

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 12,719 947 Updated Oct 10, 2024

AirLLM 70B inference with single 4GB GPU

Jupyter Notebook 4,536 361 Updated Sep 25, 2024

LLM based autonomous agent that does online comprehensive research on any given topic

Python 14,354 1,879 Updated Oct 10, 2024

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023

Python 1,181 102 Updated Oct 8, 2024

Cayman is a Jekyll theme for GitHub Pages

SCSS 1,206 3,161 Updated Aug 2, 2024

The Open-Source Data Annotation Platform

TypeScript 520 42 Updated Aug 12, 2024

Making large AI models cheaper, faster and more accessible

Python 38,715 4,339 Updated Oct 10, 2024

💎 Math Formula OCR

Jupyter Notebook 489 98 Updated Mar 24, 2023

CodeXGLUE

C# 1,528 364 Updated Apr 23, 2024
Python 1,415 108 Updated May 12, 2023

🤖 GPT Code Review for GitLab (an LLM-assisted code review tool for GitLab). Detailed project documentation 👇🏻

Python 114 18 Updated Aug 12, 2024

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks

Python 1,142 160 Updated Oct 10, 2024
Python 57 Updated Aug 8, 2024

Measuring Massive Multitask Language Understanding | ICLR 2021

Python 1,172 90 Updated May 28, 2023

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…

Python 43,122 7,725 Updated Oct 10, 2024

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,753 467 Updated Jul 11, 2024

GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.

Python 525 37 Updated Mar 30, 2024

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

C++ 1,382 167 Updated Sep 30, 2024

[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 103 6 Updated Oct 9, 2024

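🚀 A knowledge-base question-answering system based on large language models and RAG. Works out of the box, is model-neutral, supports flexible orchestration, and can be quickly embedded into third-party business systems.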

Python 10,660 1,402 Updated Oct 10, 2024

All things prompt engineering

Python 5,375 298 Updated Jun 4, 2024

[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Python 141 11 Updated Sep 24, 2024