[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 502 38 Updated Jun 24, 2024

skypilot-org / skypilot

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

Python 6,220 427 Updated Jul 10, 2024

instill-ai / instill-core

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications

Makefile 2,000 87 Updated Jul 10, 2024

binhnguyennus / awesome-scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

56,833 5,866 Updated Jun 17, 2024

mingdianliu / astra-toolbox-for-cone-beam

This is a collection of Python scripts for implementing ASTRA Toolbox for cone-beam X-ray CT reconstruction.

Python 15 1 Updated Nov 8, 2023

OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Python 7,912 551 Updated Jul 3, 2024

nathanbabcock / hypetrigger-configs

Companion repo to Hypetrigger, providing extensibility and modding support.

TypeScript 7 Updated Jan 19, 2023

hongbo-miao / hongbomiao.com

🦋 A personal research and development (R&D) lab that facilitates the sharing of knowledge.

HCL 213 35 Updated Jul 10, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 4,647 515 Updated Jul 10, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 27,445 2,989 Updated Jul 10, 2024

wangshusen / RecommenderSystem

1,973 301 Updated Feb 7, 2024

Docta-ai / docta

A Doctor for your data

Python 3,068 189 Updated Jan 12, 2024

aehyok / video2blog

Video-to-illustrated-article AI cross-platform client (Windows, macOS, Linux)

Vue 153 15 Updated Jun 25, 2024

ApdowJN / Stereo-NEC

The official repository of our ICRA 2024 paper "Stereo-NEC: Enhancing Stereo Visual-Inertial SLAM Initialization with Normal Epipolar Constraints".

C++ 107 9 Updated May 9, 2024

jitsi / jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 570 92 Updated May 6, 2024

BasedHardware / OpenGlass

Turn any glasses into AI-powered smart glasses

C 2,878 358 Updated Jul 9, 2024

deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

2,927 108 Updated Jun 26, 2024

sato-team / Stable-Text-to-Motion-Framework

SATO: Stable Text-to-Motion Framework

Jupyter Notebook 93 5 Updated May 10, 2024

YangLing0818 / RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook 1,594 91 Updated Jun 6, 2024

facebookresearch / generative-recommenders

Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152, I…

Python 514 83 Updated Jul 3, 2024

NVlabs / RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 514 17 Updated Jul 9, 2024

amazon-science / QA-ViT

Python 31 8 Updated May 8, 2024

UMass-Foundation-Model / COMBO

Source codes for the paper "COMBO: Compositional World Models for Embodied Multi-Agent Cooperation"

Python 22 3 Updated Apr 17, 2024

agiresearch / AIOS

AIOS: LLM Agent Operating System

Python 2,979 352 Updated Jul 10, 2024

spinsphotonics / fdtdz

Fast, scalable, accessible photonic simulation

C++ 106 13 Updated Oct 28, 2023

Mingdian Liu mingdianliu

Lists (1)

🔮 Future ideas

Stars

ShiArthur03 / ShiArthur03

vllm-project / vllm

karpathy / LLM101n

facebookresearch / chameleon

openai / simple-evals

ddlBoJack / emotion2vec