trangtv57

23 followers · 112 following

Zalo AI
Hà Nội

Stars

Fictionarry / TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Python 258 33 Updated Jul 30, 2024

fudan-generative-vision / hallo2

Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 4,144 587 Updated Nov 6, 2024

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Jupyter Notebook 7,633 571 Updated Nov 17, 2024

Aubrey-ao / HumanBehaviorAnimation

Python 182 15 Updated May 21, 2024

mohamedhassanmus / prox

Resolving 3D Human Pose Ambiguities with 3D Scene Constraints https://prox.is.tue.mpg.de

Python 220 19 Updated Jul 13, 2021

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 12,390 1,140 Updated Oct 14, 2024

facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 47,649 5,633 Updated Sep 18, 2024

Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition

Python 3,287 977 Updated Jul 18, 2024

TAG-Research / TAG-Bench

TAG-Bench: A benchmark for table-augmented generation (TAG)

Python 595 62 Updated Aug 28, 2024

pipecat-ai / pipecat

Open Source framework for voice and multimodal conversational AI

Python 3,381 325 Updated Nov 16, 2024

lipku / LiveTalking

Real time interactive streaming digital human

Python 3,919 561 Updated Nov 16, 2024

RickyL-2000 / ROSVOT

Robust Singing Voice Transcription and MIDI Extraction

Python 55 2 Updated Jul 29, 2024

baaivision / Emu3

Next-Token Prediction is All You Need

Python 1,815 71 Updated Oct 24, 2024

AILab-CVC / SEED-X

Multimodal Models in Real World

Jupyter Notebook 403 16 Updated Oct 28, 2024

stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 3,067 388 Updated Nov 14, 2024

promptfoo / promptfoo

Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with comma…

TypeScript 4,752 376 Updated Nov 17, 2024

kyutai-labs / moshi

Python 6,748 526 Updated Oct 31, 2024

WangFei-2019 / Image-text-Retrieval

Python 43 6 Updated Sep 15, 2022

TencentARC / SmartEdit

Official code of SmartEdit [CVPR-2024 Highlight]

Python 252 8 Updated Jun 21, 2024

THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 9,154 859 Updated Nov 17, 2024

Doubiiu / DynamiCrafter

[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Python 2,583 207 Updated Sep 8, 2024

black-forest-labs / flux

Official inference repo for FLUX.1 models

Python 15,930 1,157 Updated Nov 14, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 779 61 Updated Aug 27, 2024

pipecat-ai / rtvi-web-demo

Example UI implementing the RTVI web client

TypeScript 471 69 Updated Oct 10, 2024

jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headles…

Python 3,024 340 Updated Nov 11, 2024

yangdongchao / LLM-Codec

The open source code for LLM-Codec

Python 114 5 Updated Aug 18, 2024

Nexdata-AI / 4995-Vietnamese-OCR-Images-Data-Images-with-Annotation-and-Transcription

Vietnamese OCR Images Dataset

4 Updated Aug 8, 2024

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,834 476 Updated Jul 11, 2024

google-research-datasets / ToTTo

ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, p…

436 37 Updated Sep 11, 2024

dvlab-research / Step-DPO

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 284 9 Updated Jul 15, 2024
