Stars
Verification of the effect of speculative decoding in Japanese.
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?
Integrate GraphQL with your Pydantic models
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A library for squeakily cleaning and filtering language datasets.
Checkpointable dataset utilities for foundation model training
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Few Shot Text Classification with Large Language Models
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the informati…
Supports continual pre-training and instruction tuning; forked from llama-recipes
DagStream is a Python package for managing relationships between functions, especially data-preprocessing functions for machine learning applications.
Library for Textless Spoken Language Processing
The most powerful and modular Stable Diffusion GUI, API and backend with a graph/nodes interface.
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…
Libraries for efficient and scalable group-structured dataset pipelines.
A collection of useful audio datasets and transforms for PyTorch.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety.
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
Easily create large video datasets from video URLs
A deep learning library for video understanding research.