Starred repositories of jeradf

Please see the readme file as well as our 2019 EMNLP paper linked here -->

193 58 Updated Apr 24, 2024
Python 16 Updated Feb 19, 2024
Python 18 3 Updated Jan 17, 2022

A dataset consisting of 502 English dialogs with 12,000 annotated utterances between a user and an assistant discussing movie preferences in natural language. It was collected using a Wizard-of-Oz …

25 4 Updated Jan 20, 2021

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Python 363 32 Updated Apr 24, 2024

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,209 175 Updated Sep 18, 2024

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,393 102 Updated Jul 5, 2024

Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python 3,101 304 Updated Sep 4, 2024
Jupyter Notebook 34 8 Updated Jul 21, 2024

A Gradio web UI for Large Language Models.

Python 39,581 5,204 Updated Sep 16, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,142 362 Updated Sep 18, 2024

This repo includes ChatGPT prompt curation to use ChatGPT better.

HTML 110,702 15,068 Updated Sep 17, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 1,933 110 Updated Sep 18, 2024

Speech, Language, Audio, Music Processing with Large Language Model

Python 497 40 Updated Aug 20, 2024

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Python 873 66 Updated Aug 24, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 935 63 Updated Sep 12, 2024
Jupyter Notebook 7,345 523 Updated Jun 16, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,173 6,372 Updated Sep 9, 2024

A Python wrapper for Kaldi

Python 991 248 Updated Aug 15, 2024

Repository for SLURP paper

Python 96 19 Updated Apr 20, 2022

This library provides common speech features for ASR including MFCCs and filterbank energies.

Python 2,364 618 Updated Oct 20, 2021

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

C++ 2,237 271 Updated Mar 11, 2024

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 744 57 Updated Aug 27, 2024

A guidance language for controlling large language models.

Jupyter Notebook 18,706 1,031 Updated Sep 16, 2024

Open models for Coqui STT

119 35 Updated May 9, 2023

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 14,128 5,314 Updated Sep 16, 2024

Tools for handling speech data in machine learning projects.

Python 932 213 Updated Sep 17, 2024

Speech-to-text server framework with next-gen Kaldi

C++ 524 105 Updated Sep 18, 2024

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, L…

C++ 980 151 Updated Aug 24, 2024