
Generate high-definition short videos with one click using large AI models.

Python 14,787 2,270 Updated Jul 9, 2024

All the resources you need to get to Senior Engineer and beyond

12,273 1,139 Updated Jul 3, 2024

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

12,775 1,344 Updated Feb 13, 2023

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 11,121 837 Updated May 23, 2024

This project aims to share the technical principles behind large models, along with hands-on experience.

HTML 7,696 752 Updated Jul 5, 2024

Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.

Python 59 4 Updated May 16, 2024

Hands-on practice with LLMs.

Python 66 15 Updated Jul 7, 2024

personal chatgpt

Jupyter Notebook 264 54 Updated Jul 9, 2024

An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"

Python 332 53 Updated Nov 28, 2022

Research Code for Multimodal-Cognition Team in Ant Group

Python 61 2 Updated Jul 10, 2024

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Python 5,447 925 Updated May 25, 2024

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Python 229 11 Updated Jan 2, 2024

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

OpenEdge ABL 684 143 Updated Mar 15, 2023

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,115 75 Updated Jul 10, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

946 53 Updated Jun 24, 2024

[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"

Python 51 1 Updated Sep 21, 2023

Code and pre-trained models for our paper "CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection".

Python 28 3 Updated Jul 8, 2024

Rust Deep Neural Network

Rust 7 Updated Jun 1, 2024

Official implementation of "Harnessing Large Language Models for Training-free Video Anomaly Detection", CVPR 2024

Python 32 Updated May 28, 2024

PyTorch implementation of Masked AutoEncoder

Python 7 2 Updated Apr 2, 2024

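A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.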

Python 5,274 592 Updated Feb 21, 2024

[AAAI2023] TopicFM: Robust, Efficient, and Interpretable Topic-Assisted Feature Matching

Python 101 4 Updated May 7, 2024

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 2,871 234 Updated Jul 5, 2024

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network (CVPR'24)

Python 26 1 Updated Mar 26, 2024

Xvideos site crawler

Python 2 Updated May 26, 2024

[CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

Python 67 12 Updated Sep 1, 2023

Code and trained models for our paper: K. Triaridis, V. Mezaris, "Exploring Multi-Modal Fusion for Image Manipulation Detection and Localization", Proc. 30th Int. Conf. on MultiMedia Modeling (MMM …

Python 46 3 Updated Apr 1, 2024

The code for "A Single Simple Patch is All You Need for AI-generated Image Detection"

Python 41 1 Updated May 19, 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Jupyter Notebook 1,978 143 Updated Jun 6, 2024

In this repository I demonstrate how to perform multimodal (image + text) search to find similar images and texts, given a test image and text, in a multimodal (text + image) database. I use the Kaggle …

Jupyter Notebook 11 2 Updated May 22, 2021