hlp-ai
  • Human Language Processing Laboratory (HLP Lab)
  • Wuhan, China

Open language modeling toolkit based on PyTorch

Python 21 6 Updated Jul 16, 2024

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Python 4,641 459 Updated Jul 18, 2024

PDF Parsing

907 101 Updated Jul 6, 2024

The full minitorch student suite.

Python 1,461 281 Updated Mar 1, 2024

Simple text to phones converter for multiple languages

Python 1,151 164 Updated Jul 2, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 29,366 3,393 Updated Jul 19, 2024

[WIP] Scripts for fine-tuning Whisper

Python 201 27 Updated May 29, 2023

The hub for EleutherAI's work on interpretability and learning dynamics

Jupyter Notebook 2,150 157 Updated Jul 12, 2024

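MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, aiming to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.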

3,235 223 Updated Jul 8, 2024

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

Python 242 37 Updated Dec 15, 2023

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 26,822 3,316 Updated Jul 18, 2024

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…

Jupyter Notebook 10,520 1,499 Updated Jul 18, 2024

State-of-the-art LLM-based translation models.

Ruby 362 26 Updated Jun 20, 2024

The official Meta Llama 3 GitHub site

Python 23,355 2,505 Updated Jul 17, 2024

LLM inference in C/C++

C++ 61,866 8,868 Updated Jul 19, 2024

Llama2-SFT: Llama-2-7B fine-tuning (transformers), LoRA (peft), and inference

Python 20 Updated Jul 26, 2023

Yi machine translation system

Python 4 1 Updated Jul 17, 2024

Modern HTTP benchmarking tool

C 37,245 2,913 Updated Dec 30, 2023

Boosting your Web Services of Deep Learning Applications.

Python 1,221 187 Updated May 13, 2021

A c/c++ implementation of micrograd: a tiny autograd engine with neural net on top.

C 50 6 Updated Sep 21, 2023

Inference Llama 2 in one file of pure C

C 16,855 1,970 Updated Jul 13, 2024

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook 9,549 1,331 Updated Jun 21, 2024

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python 19,481 2,412 Updated Apr 28, 2024

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

TypeScript 10,574 673 Updated Apr 23, 2024

Scripts to preprocess training and test data and to run fast_align and giza

Python 108 22 Updated Nov 2, 2021

Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!

Python 1,126 165 Updated Feb 5, 2024

Finetune VITS and MMS using HuggingFace's tools

Python 100 21 Updated Mar 31, 2024

Meta's "No Language Left Behind" models served as web app and REST API

Python 154 21 Updated Mar 20, 2024
Python 15 3 Updated Oct 28, 2022

Effort to open-source NLLB checkpoints.

Python 403 36 Updated May 29, 2024