Starred repositories

484 results for source starred repositories

[PyTorch] Generative retrieval model based on RQ-VAE from "Recommender Systems with Generative Retrieval"

Python 45 2 Updated Oct 17, 2024

Official PyTorch implementation of SegFormer

Python 2,535 352 Updated Aug 2, 2024

Live TV streaming app developed natively for Android

Kotlin 6,065 641 Updated Oct 13, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,495 459 Updated Oct 18, 2024

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 32,643 4,001 Updated Oct 17, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,660 153 Updated Oct 4, 2024

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 53,508 5,666 Updated Oct 18, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,414 73 Updated Oct 9, 2024

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 3,202 392 Updated Oct 6, 2024

Translate a video from one language to another and add dubbing, with support for API calls.

Python 10,397 1,156 Updated Oct 18, 2024

Collection of handy online tools for developers, with great UX.

Vue 22,133 2,686 Updated Oct 7, 2024

The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".

Python 210 8 Updated Feb 5, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 795 52 Updated Oct 17, 2024

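✯ A directly accessible TV/radio logo library and related tools project ✯ 🔕 Free forever, direct access, fully open source, continually improved channel logos, IPv4/IPv6 dual-stack access supported 🔕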

JavaScript 22,239 3,332 Updated Oct 18, 2024

Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Python 204 15 Updated Aug 11, 2024

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Python 543 59 Updated Oct 4, 2024

🔥🔥🔥 Web-based Linux server management control panel. / A modern, open-source Linux server operations and management panel.

Go 22,346 2,026 Updated Oct 18, 2024

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,778 469 Updated Jul 11, 2024

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 93,761 7,410 Updated Oct 18, 2024

My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

Python 177 9 Updated Oct 7, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,100 172 Updated Aug 11, 2024

Document Artificial Intelligence

122 5 Updated Oct 12, 2024

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

Python 51 2 Updated Jul 17, 2024

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 528 49 Updated Oct 14, 2024

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,275 151 Updated Aug 23, 2024

Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

Python 252 8 Updated Aug 19, 2024

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,787 158 Updated Sep 28, 2024

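Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.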

Python 5,922 653 Updated Oct 9, 2024

PyTorch native finetuning library

Python 4,163 400 Updated Oct 18, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,117 307 Updated Oct 6, 2024