Stars
Inference code for the paper "Spirit-LM: Interleaved Spoken and Written Language Model".
This repository collects an extensive list of awesome papers about Story Generation / Storytelling, primarily focusing on the era of Large Language Models (LLMs).
Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.
Groq-powered chat assistant for generating contextual responses to user queries.
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
The Programmable Cypher-based Neuro-Symbolic AGI that lets you program its behavior using Graph-based Prompt Programming: for people who want AI to behave as expected
Infinite Alchemy is an AI-powered game where you mix and match elements to create basically anything
Comparison of Language Model Inference Engines
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
Awesome speech/audio LLMs, representation learning, and codec models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
draw.io is a JavaScript, client-side editor for general diagramming.
A Python package to analyze and compare voices with deep learning
A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.
A multi-voice TTS system trained with an emphasis on quality
A generative language model which seeks to maximize rhyming syllables. Based on OpenAI's GPT-2.
VTuber application which only requires your voice and microphone, no need for a webcam or other tracking nonsense.
jearaneda / react-new-orgchart
Forked from dabeng/react-orgchart
OrgChart with a twist! Simpler, faster navigation and exporting capabilities for those power users who need to manage bulky organizations.
A browser extension that removes YouTube suggestions, comments, shorts, and more
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. In this project, we specifically compared three self-supervised models: Wav2vec (2019, 2020), HuBERT (2021…
Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages
Library for Textless Spoken Language Processing
THIS REPO IS NOT MAINTAINED ANYMORE. Please see https://codeberg.org/tenacityteam/tenacity for Tenacity, which is maintained.