jeradf

jerad fields jeradf

28 followers · 201 following

Starred repositories

google-research-datasets / Taskmaster

Please see the readme file as well as our 2019 EMNLP paper.

193 stars · 58 forks · Updated Apr 24, 2024

zihaohe123 / speak-turn-emb-dialog-act-clf

Python · 24 stars · 8 forks · Updated Apr 18, 2022

Mashiro009 / slidespeech_dl

Python · 16 stars · Updated Feb 19, 2024

roholazandie / ryan-tts

Python · 18 stars · 3 forks · Updated Jan 17, 2022

google-research-datasets / ccpe

A dataset consisting of 502 English dialogs with 12,000 annotated utterances between a user and an assistant discussing movie preferences in natural language. It was collected using a Wizard-of-Oz …

25 stars · 4 forks · Updated Jan 20, 2021

YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Python · 363 stars · 32 forks · Updated Apr 24, 2024

webdataset / webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python · 2,209 stars · 175 forks · Updated Sep 18, 2024

QwenLM / Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python · 1,393 stars · 102 forks · Updated Jul 5, 2024

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python · 3,101 stars · 304 forks · Updated Sep 4, 2024

asappresearch / webagents-step

Jupyter Notebook · 34 stars · 8 forks · Updated Jul 21, 2024

oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.

Python · 39,581 stars · 5,204 forks · Updated Sep 16, 2024

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python · 5,142 stars · 362 forks · Updated Sep 18, 2024

f / awesome-chatgpt-prompts

This repo includes ChatGPT prompt curation to use ChatGPT better.

HTML · 110,702 stars · 15,068 forks · Updated Sep 17, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python · 1,933 stars · 110 forks · Updated Sep 18, 2024

X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Python · 497 stars · 40 forks · Updated Aug 20, 2024

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Python · 873 stars · 66 forks · Updated Aug 24, 2024

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python · 935 stars · 63 forks · Updated Sep 12, 2024

Vaibhavs10 / insanely-fast-whisper

Jupyter Notebook · 7,345 stars · 523 forks · Updated Jun 16, 2024

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python · 30,173 stars · 6,372 forks · Updated Sep 9, 2024

pykaldi / pykaldi

A Python wrapper for Kaldi

Python · 991 stars · 248 forks · Updated Aug 15, 2024

pswietojanski / slurp

Repository for SLURP paper

Python · 96 stars · 19 forks · Updated Apr 20, 2022

jameslyons / python_speech_features

This library provides common speech features for ASR including MFCCs and filterbank energies.

Python · 2,364 stars · 618 forks · Updated Oct 20, 2021

coqui-ai / STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

C++ · 2,237 stars · 271 forks · Updated Mar 11, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python · 744 stars · 57 forks · Updated Aug 27, 2024

guidance-ai / guidance

A guidance language for controlling large language models.

Jupyter Notebook · 18,706 stars · 1,031 forks · Updated Sep 16, 2024

coqui-ai / STT-models

Open models for Coqui STT

119 stars · 35 forks · Updated May 9, 2023

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell · 14,128 stars · 5,314 forks · Updated Sep 16, 2024

lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.

Python · 932 stars · 213 forks · Updated Sep 17, 2024

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

C++ · 524 stars · 105 forks · Updated Sep 18, 2024

k2-fsa / sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, L…

C++ · 980 stars · 151 forks · Updated Aug 24, 2024
