

Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.

TypeScript 44,220 5,890 Updated Aug 24, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 5,621 610 Updated Aug 22, 2024

As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).

Python 31,531 3,623 Updated Aug 23, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 4,607 346 Updated Aug 7, 2024

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 4,808 506 Updated Aug 8, 2024

🚀 RVC + UVR = A perfect set of tools for voice cloning, easy and free!

Python 108 30 Updated Aug 21, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 11,054 784 Updated Aug 25, 2024

Programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 66,107 8,670 Updated Aug 13, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 5,196 406 Updated Aug 20, 2024

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 2,744 332 Updated Apr 25, 2024

A lightweight utility that makes the Windows taskbar translucent/transparent.

C++ 15,339 1,111 Updated Aug 20, 2024

🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 Easily build lightweight multi-platform desktop apps with Rust.

Rust 25,284 4,548 Updated Aug 21, 2024

State-of-the-art 2D and 3D Face Analysis Project

Python 22,572 5,316 Updated Aug 19, 2024

The model, data and code for the visual GUI Agent SeeClick

HTML 171 9 Updated Jul 15, 2024

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 31,730 2,365 Updated Aug 23, 2024

Real-time face swap and one-click video deepfake with only a single image

Python 28,810 3,982 Updated Aug 24, 2024

GUI-focused roop

Python 4,195 638 Updated May 28, 2024

Bring portraits to life!

Python 10,499 1,037 Updated Aug 19, 2024

VITS-based Voice Conversion focused on simplicity, quality and performance.

Python 1,444 242 Updated Aug 24, 2024

Easily train a good VC model with voice data <= 10 mins!

Python 22,363 3,380 Updated Aug 17, 2024

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,642 1,149 Updated Aug 21, 2024

SoftVC VITS Singing Voice Conversion

Python 25,174 4,732 Updated Nov 11, 2023

Next generation face swapper and enhancer

Python 17,453 2,593 Updated Aug 24, 2024

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Jupyter Notebook 4,870 316 Updated Jun 28, 2024

Evolved Fork of roop with Web Server and lots of additions

Python 1,909 449 Updated Jul 29, 2024
Python 6 5 Updated Aug 10, 2023

This is an implementation of iperov's DeepFaceLab and DeepFaceLive in Stable Diffusion Web UI 1111 by AUTOMATIC1111.

Python 91 16 Updated May 24, 2024

One-click face swap

Python 26,720 6,526 Updated Aug 19, 2024

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 52 2 Updated Jul 10, 2024

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

Python 45 5 Updated Feb 29, 2024