Human Language Processing Laboratory (HLP Lab)
Wuhan, China
Stars
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Simple text to phones converter for multiple languages
Even 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
The hub for EleutherAI's work on interpretability and learning dynamics
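MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.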
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…
Llama2-SFT: Llama-2-7B fine-tuning (transformers), LoRA (peft), and inference
Boosting your Web Services of Deep Learning Applications.
A C/C++ implementation of micrograd: a tiny autograd engine with a neural net on top.
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
Scripts to preprocess training and test data and to run fast_align and giza
Best-practice TTS based on BERT and VITS, with some NaturalSpeech features from Microsoft; supports ONNX streaming output.
Finetune VITS and MMS using HuggingFace's tools
Meta's "No Language Left Behind" models served as web app and REST API