ChuniHiro

🎯

Focusing

Hongyu Fu ChuniHiro

1 follower · 9 following

Lists (1)

Jobs

3 repositories

Stars

richards199999 / Thinking-Claude

Let your Claude be able to think

JavaScript 4,101 480 Updated Nov 16, 2024

Q-Future / Q-Bench

①[ICLR2024 Spotlight] (GPT-4V/Gemini-Pro/Qwen-VL-Plus+16 OS MLLMs) A benchmark for multi-modality LLMs (MLLMs) on low-level vision and visual quality assessment.

Jupyter Notebook 247 12 Updated Aug 12, 2024

chaofengc / Awesome-Image-Quality-Assessment

A comprehensive collection of IQA papers

TeX 1,004 67 Updated Nov 5, 2024

beyondExp / B-Llama3-o

B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output.

Python 26 4 Updated Jun 3, 2024

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,221 81 Updated Aug 13, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 6,960 739 Updated Nov 15, 2024

QwenLM / Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,482 107 Updated Jul 5, 2024

LAION-AI / CLAP

Contrastive Language-Audio Pretraining

Python 1,415 137 Updated Jul 9, 2024

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Python 192 14 Updated Oct 2, 2024

frankenliu / LOAE

Python 10 1 Updated Sep 25, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,095 188 Updated Oct 4, 2024

feizc / FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

Python 1,596 122 Updated Sep 6, 2024

LLaVA-VL / LLaVA-NeXT

Python 2,872 239 Updated Oct 16, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,238 2,236 Updated Aug 12, 2024

smahesh29 / Gender-and-Age-Detection

A Python project that uses OpenCV to detect the gender and age of a person (face) in a picture or through a webcam.

Python 489 204 Updated May 16, 2024

diovisgood / agender

Real-time estimation of gender and age

Python 156 39 Updated May 10, 2019

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 6,014 465 Updated Nov 15, 2024

luosiallen / Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

Python 161 19 Updated May 29, 2024

HUIZ-A / SVA

Python 19 1 Updated Apr 26, 2024

lmnt-com / diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Python 775 113 Updated Mar 26, 2024

rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 6,279 619 Updated Nov 15, 2024

GitHubDaily / GitHubDaily

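Consistently sharing high-quality, interesting, and practical open-source technical tutorials, developer tools, programming websites, and tech news from GitHub. A list of cool, interesting projects on GitHub.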

32,496 3,562 Updated May 29, 2024

BytedanceSpeech / seed-tts-eval

Python 1,043 104 Updated Jun 14, 2024

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,715 258 Updated Nov 5, 2024

guyyariv / TempoTokens

This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Python 107 11 Updated Apr 23, 2024

RoySheffer / im2wav

Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation

Python 104 9 Updated Jan 18, 2023

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 16,441 1,620 Updated Nov 12, 2024

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,595 886 Updated Oct 22, 2024

stanford-oval / storm

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 13,364 1,218 Updated Oct 30, 2024

lich99 / ChatGLM-finetune-LoRA

Code for fine-tuning ChatGLM-6b using low-rank adaptation (LoRA)

Jupyter Notebook 724 64 Updated Jul 18, 2023