

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 3,696 276 Updated Jul 30, 2024

[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Python 9 Updated Jul 17, 2024

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python 606 46 Updated Jul 29, 2024

This is a repo to track the latest autoregressive visual generation papers.

12 Updated Jul 23, 2024

mllm-npu: training multimodal large language models on Ascend NPUs

Python 66 1 Updated Jul 29, 2024

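MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.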

3,259 224 Updated Jul 30, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,087 39 Updated Jul 14, 2024

A generative speech model for daily dialogue.

Python 28,383 3,089 Updated Jul 29, 2024

Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to improve performance of numerous vision language performances for diverse capa…

Python 93 4 Updated May 30, 2024

Create LLM agents with long-term memory and custom tools 📚🦙

Python 10,964 1,185 Updated Jul 30, 2024

Fast Multimodal LLM on Mobile Devices

C++ 300 34 Updated Jul 22, 2024

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Python 768 53 Updated Jul 10, 2024

Convert PDF to markdown quickly with high accuracy

Python 14,939 791 Updated Jul 22, 2024

Unified Audio-Visual Perception for Multi-Task Video Localization

Python 12 Updated Apr 19, 2024

LLM inference in C/C++

C++ 62,717 8,990 Updated Jul 30, 2024

This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

429 27 Updated Jul 3, 2024

A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.

Python 77 4 Updated Jul 28, 2024

Multimodal Models in Real World

Jupyter Notebook 345 17 Updated Jul 12, 2024

Sample LaTeX file for HKU PhD thesis.

TeX 17 4 Updated Mar 16, 2022

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 675 50 Updated Jul 9, 2024

AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models

Python 34 Updated Apr 8, 2024

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Python 4,476 321 Updated Jul 29, 2024

PhotoMaker [CVPR 2024]

Jupyter Notebook 8,989 709 Updated Jul 24, 2024

Awesome Papers related to Mamba.

1,006 51 Updated Jul 19, 2024

[ACL 2024] Progressive LLaMA with Block Expansion.

Python 450 34 Updated May 20, 2024

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

2,917 178 Updated Jul 25, 2024

[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning

Python 67 3 Updated Jul 28, 2024

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

482 28 Updated Jul 28, 2024