Prince Wang princewang1994

I'm a CS graduate student from Zhejiang University

79 followers · 43 following

https://princewang1994.github.io

Achievements

Lists (1)

✨ Inspiration

Starred repositories

484 results for source starred repositories

EdoardoBotta / RQ-VAE-Recommender

[PyTorch] Generative retrieval model based on RQ-VAE from "Recommender Systems with Generative Retrieval"

Python 45 2 Updated Oct 17, 2024

NVlabs / SegFormer

Official PyTorch implementation of SegFormer

Python 2,535 352 Updated Aug 2, 2024

yaoxieyoulei / mytv-android

Live TV streaming app developed natively for Android

Kotlin 6,065 641 Updated Oct 13, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,495 459 Updated Oct 18, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 32,643 4,001 Updated Oct 17, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,660 153 Updated Oct 4, 2024

comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 53,508 5,666 Updated Oct 18, 2024

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,414 73 Updated Oct 9, 2024

Breakthrough / PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 3,202 392 Updated Oct 6, 2024

jianchang512 / pyvideotrans

Translate a video from one language to another and add dubbing, with support for API calls.

Python 10,397 1,156 Updated Oct 18, 2024

CorentinTh / it-tools

Collection of handy online tools for developers, with great UX.

Vue 22,133 2,686 Updated Oct 7, 2024

NExT-ChatV / NExT-Chat

The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".

Python 210 8 Updated Feb 5, 2024

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 795 52 Updated Oct 17, 2024

fanmingming / live

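✯ A directly accessible TV/radio channel logo library and related tools ✯ 🔕 Permanently free, directly accessible, fully open source, continuously improved channel logos with IPv4/IPv6 dual-stack access 🔕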

JavaScript 22,239 3,332 Updated Oct 18, 2024

mbzuai-oryx / VideoGPT-plus

Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Python 204 15 Updated Aug 11, 2024

Vision-CAIR / MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Python 543 59 Updated Oct 4, 2024

1Panel-dev / 1Panel

🔥🔥🔥 Modern, open-source, web-based Linux server management control panel.

Go 22,346 2,026 Updated Oct 18, 2024

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,778 469 Updated Jul 11, 2024

ollama / ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 93,761 7,410 Updated Oct 18, 2024

kyegomez / NaViT

My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

Python 177 9 Updated Oct 7, 2024

eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Python 2,100 172 Updated Aug 11, 2024

harrytea / Awesome-Document-Understanding

Document Artificial Intelligence

122 5 Updated Oct 12, 2024

Hon-Wong / Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

Python 51 2 Updated Jul 17, 2024

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 528 49 Updated Oct 14, 2024

google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,275 151 Updated Aug 23, 2024

h-zhao1997 / cobra

Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

Python 252 8 Updated Aug 19, 2024

Ucas-HaoranWei / Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,787 158 Updated Sep 28, 2024

YaoFANGUK / video-subtitle-extractor

A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. The deep-learning-based framework covers subtitle region detection and subtitle content extraction.

Python 5,922 653 Updated Oct 9, 2024

pytorch / torchtune

PyTorch native finetuning library

Python 4,163 400 Updated Oct 18, 2024

FoundationVision / VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,117 307 Updated Oct 6, 2024

Starred topics

vehicle-detection