Stars
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Per-language text caret and mouse cursor styling, aka language indicator
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Official inference repo for FLUX.1 models
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
DeepSeek-VL: Towards Real-World Vision-Language Understanding
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model that approaches GPT-4o performance
Lightweight GPT-4 Vision processing over the Webcam
I play with my best friend GPT
A VITS dataset generation tool that converts video to short audio clips, based on Damo Academy video cutting technology
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Blind & Invisible Watermark: blind image watermarking; extracting the watermark does not require the original image.
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
🏆 📚 A list of awesome MkDocs projects and plugins.
Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization".
A Python script that automatically generates configuration files for Clash for Windows, Clash for Android, and similar applications.
High-Resolution Image Synthesis with Latent Diffusion Models
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
🌏🌍🌎Translators🌎🌍🌏 is a library that aims to bring free, multiple, enjoyable translations to individuals and students in Python.
A multi-voice TTS system trained with an emphasis on quality
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Rembg is a tool to remove image backgrounds
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
A collection of live-stream source resources 📺 💯 IPTV, M3U. Wash your hands often, wear a mask, and may everyone stay safe from every illness
Addon scripts, plugins, and skins for XBMC Media Center. Tailored for the Chinese language.
Unofficial implementation of Palette: Image-to-Image Diffusion Models in PyTorch
Dataset of prompts, synthetic AI generated images, and aesthetic ratings.
A latent text-to-image diffusion model