ttengwang

Teng Wang ttengwang

Ph.D. student in computer science. My research interests lie in deep learning and computer vision, focusing on vision-language multimodal learning.

191 followers · 63 following

The University of Hong Kong
Hong Kong
ttengwang.com

Lists (1)

vision-language pretraining

Stars

opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 3,696 276 Updated Jul 30, 2024

zjr2000 / REVERIE

[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Python 9 Updated Jul 17, 2024

foundation-multimodal-models / CAPTURE

Python 18 Updated Jul 27, 2024

TencentARC / SEED-Story

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python 606 46 Updated Jul 29, 2024

lxa9867 / Awesome-Autoregressive-Visual-Generation

This is a repo to track the latest autoregressive visual generation papers.

12 Updated Jul 23, 2024

TencentARC / mllm-npu

mllm-npu: training multimodal large language models on Ascend NPUs

Python 66 1 Updated Jul 29, 2024

esbatmop / MNBVC

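MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.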

3,259 224 Updated Jul 30, 2024

FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,087 39 Updated Jul 14, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 28,383 3,089 Updated Jul 29, 2024

ByungKwanLee / Meteor

Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to improve performance of numerous vision language performances for diverse capa…

Python 93 4 Updated May 30, 2024

cpacker / MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Python 10,964 1,185 Updated Jul 30, 2024

UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices

C++ 300 34 Updated Jul 22, 2024

mbzuai-oryx / LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Python 768 53 Updated Jul 10, 2024

VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy

Python 14,939 791 Updated Jul 22, 2024

ttgeng233 / UniAV

Unified Audio-Visual Perception for Multi-Task Video Localization

Python 12 Updated Apr 19, 2024

ggerganov / llama.cpp

LLM inference in C/C++

C++ 62,717 8,990 Updated Jul 30, 2024

Tebmer / Awesome-Knowledge-Distillation-of-LLMs

This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

429 27 Updated Jul 3, 2024

bytedance / Shot2Story

A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.

Python 77 4 Updated Jul 28, 2024

AILab-CVC / SEED-X

Multimodal Models in Real World

Jupyter Notebook 345 17 Updated Jul 12, 2024

guanyingc / HKU-PhD-Thesis-LaTex

Sample LaTeX file for HKU PhD thesis.

TeX 17 4 Updated Mar 16, 2022

Tongji-KGLLM / RAG-Survey

1,601 116 Updated May 8, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 675 50 Updated Jul 9, 2024

sail-sg / AnyDoor

AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models

Python 34 Updated Apr 8, 2024

OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Python 4,476 321 Updated Jul 29, 2024

TencentARC / PhotoMaker

PhotoMaker [CVPR 2024]

Jupyter Notebook 8,989 709 Updated Jul 24, 2024

yyyujintang / Awesome-Mamba-Papers

Awesome Papers related to Mamba.

1,006 51 Updated Jul 19, 2024

TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

Python 450 34 Updated May 20, 2024

showlab / Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

2,917 178 Updated Jul 25, 2024

FuxiaoLiu / MMC

[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning

Python 67 3 Updated Jul 28, 2024

Yangyi-Chen / Multimodal-AND-Large-Language-Models

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

482 28 Updated Jul 28, 2024