ywdong

Vincent ywdong

11 followers · 16 following

Shenzhen, China

Starred repositories

RapidAI / RapidOCR

Awesome multi-language OCR toolkits based on ONNXRuntime, OpenVINO and PaddlePaddle. (Converts the PaddleOCR models and runs inference with ONNXRuntime, which is very fast.)

Python · 2,476 stars · 323 forks · Updated Jul 15, 2024

neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Python · 7,968 stars · 544 forks · Updated Jul 22, 2024

outlines-dev / outlines

Structured Text Generation

Python · 7,283 stars · 375 forks · Updated Jul 23, 2024

gabrielmittag / NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Python · 627 stars · 114 forks · Updated Mar 8, 2024

Takaaki-Saeki / DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Python · 80 stars · 5 forks · Updated Feb 22, 2024

kijai / ComfyUI-LivePortraitKJ

ComfyUI nodes for LivePortrait

Python · 808 stars · 57 forks · Updated Jul 22, 2024

taleinat / fuzzysearch

Find parts of long text or data, allowing for some changes/typos.

Python · 292 stars · 24 forks · Updated Jun 29, 2024

GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python · 541 stars · 28 forks · Updated Jul 15, 2024

ToonCrafter / ToonCrafter

A research paper on generative cartoon interpolation

Python · 4,866 stars · 400 forks · Updated Jun 1, 2024

PeterH0323 / Streamer-Sales

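Streamer-Sales (销冠, "top seller"): a sales-livestreamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️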

Python · 1,935 stars · 273 forks · Updated Jul 22, 2024

z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook · 2,665 stars · 323 forks · Updated Apr 25, 2024

mit-han-lab / efficientvit

EfficientViT is a new family of vision models for efficient high-resolution vision.

Python · 1,653 stars · 147 forks · Updated Jul 11, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python · 1,582 stars · 96 forks · Updated Jul 6, 2024

facebookresearch / MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python · 1,124 stars · 48 forks · Updated Jul 9, 2024

google-deepmind / magiclens

[ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"

Python · 74 stars · 8 forks · Updated Jun 23, 2024

learn2phoenix / CSD

Python · 86 stars · 3 forks · Updated Jul 17, 2024

CyberAgentAILab / RALF

[CVPR24 Oral] Official repository for RALF: Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Python · 78 stars · 1 fork · Updated Jul 6, 2024

ypw0102 / BatchEval

Code for the ACL 2024 main-conference paper "BatchEval: Towards Human-like Text Evaluation"

Python · 13 stars · 1 fork · Updated May 20, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

24,957 stars · 1,314 forks · Updated Jul 21, 2024

ExponentialML / ComfyUI_Native_DynamiCrafter

DynamiCrafter that works natively with ComfyUI's nodes, optimizations, and more.

Python · 104 stars · 10 forks · Updated Jun 8, 2024

THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python · 3,860 stars · 283 forks · Updated Jul 22, 2024

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Python · 1,629 stars · 87 forks · Updated Jul 16, 2024

chflame163 / ComfyUI_IPAdapter_plus_V2

Forked from cubiq/ComfyUI_IPAdapter_plus

A copy of ComfyUI_IPAdapter_plus; only the node names are changed so that it can coexist with the v1 version of ComfyUI_IPAdapter_plus.

Python · 26 stars · 2 forks · Updated Jul 17, 2024

AIGText / Glyph-ByT5

[ECCV2024] This is an official inference code of the paper "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook · 413 stars · 14 forks · Updated Jul 13, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python · 28,097 stars · 3,050 forks · Updated Jul 23, 2024

adam-maj / deep-learning

A deep dive into the entire history of deep learning

Jupyter Notebook · 918 stars · 76 forks · Updated Jul 16, 2024

langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript · 38,534 stars · 5,262 forks · Updated Jul 23, 2024

crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

Python · 17,548 stars · 2,387 forks · Updated Jul 23, 2024

MC-E / ReVideo

Python · 273 stars · 7 forks · Updated Jun 27, 2024

naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Jupyter Notebook · 11,364 stars · 860 forks · Updated May 23, 2024

Starred topics

stable-diffusion-webui-plugin