Starred repositories
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
🔥🔥🔥 Web-based Linux server management control panel. / A modern, open-source operations and management panel for Linux servers.
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Get up and running with Llama 3, Mistral, Gemma 2, and other large language models.
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
Reference implementation for DPO (Direct Preference Optimization)
Elysium: Exploring Object-level Perception in Videos via MLLM
devmaxxing / videocr-PaddleOCR
Forked from apm1467/videocr. Extract hardcoded subtitles from videos using machine learning.
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Extract hard-coded subtitles from videos and generate srt files. No third-party API required; text recognition runs locally. A deep-learning-based video subtitle extraction framework with subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files.
A Native-PyTorch Library for LLM Fine-tuning
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
Official repository of the aiXcoder-7B Code Large Language Model
Towards Video Text Visual Question Answering: Benchmark and Baseline
A large Cross-Modal Video Retrieval Dataset with Reading Comprehension
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models