This repository was created mainly for educational purposes (primarily my own), with an emphasis on the practical implementation of state-of-the-art (SOTA) language model papers using the PyTorch library. The whole project is inspired by Andrej Karpathy's NanoGPT (https://github.com/karpathy/nanoGPT).
The main goal of this project is to provide comprehensive, detailed implementations of the most recent and popular language models, such as GPT-2, Llama 2, and Mistral, together with explanations of the underlying concepts and mechanisms of these models.
The results of the experiments can be found in the TESTS.md file.
The following is the list of planned future work and 'To Do' items for this project:
- GPT-2
  - Implement GELU instead of ReLU
  - Combine the `Head` and `MultiHeadAttention` modules into a single class that processes all the heads in parallel, treating the heads as another batch dimension; see the attention sketch after this list
  - Take a look at Flash Attention (https://arxiv.org/pdf/2205.14135.pdf)
  - Implement RoPE (see the RoPE sketch after this list)
  - Implement weight sharing between the token embedding and the last lm_head layer
  - Implement weight tying (https://arxiv.org/pdf/1608.05859.pdf); see the weight-tying sketch after this list
  - Implement Grouped Query Attention (GQA); see the GQA sketch after this list
  - Check training with autocast disabled for apply_rope
  - Implement ALiBi (https://arxiv.org/pdf/2108.12409.pdf)
  - Implement a KV-cache (see the KV-cache sketch after this list)
  - Implement a Mixture of Experts with 500M params
  - Train a model with 2.3B params (GPT-XL)
  - Integrate with Hugging Face AutoConfig and AutoModel
  - Implement multiple RoPE theta values for different sequence lengths and see how this affects the model
  - Compare MoE with a GPT of the same size
  - Improve the RoPE implementation using Triton
  - Implement SwiGLU (see the SwiGLU sketch after this list)
  - Change the GPT-2 tokenizer to the Llama 3.1 tokenizer
  - Extend the context of a trained model to 128K through LongRoPE and fine-tuning if needed (https://arxiv.org/pdf/2402.13753)
- Implement Mamba
- Implement Jamba
- Implement SAMBA
- Implement Transformer-XL
- Implement the Linear Transformer
- Implement Infini-attention (https://arxiv.org/pdf/2404.07143.pdf)
- Study whether Infini-attention can be implemented on top of pre-trained models such as Mixtral
- BitNet: Scaling 1-bit Transformers for Large Language Models (https://arxiv.org/pdf/2310.11453.pdf)
- Implement model scaling
- CLLMs (multi-token prediction)
- Read https://arxiv.org/abs/2405.17247
- Distillation
- Submit the LLM to the Hugging Face Open LLM Leaderboard (https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Implement the entropix sampler
- Load pre-trained models
- Implement LoRA
- Implement QLoRA
- Implement flash attention to speed up training
- Use a larger dataset to avoid overfitting
- Dynamic learning rate
- Implement model checkpoint saving for resuming training
- Better visualization of training metrics
- Configure different precision for different parts of the model
- Compile the model (torch.compile)
- Configure the optimizer to be more efficient
- Add dtype as parameter for training
- Implement gradient clipping
- Implement gradient accumulation (micro-batching)
- Implement mixed precision training (a combined sketch of these three items appears after this list)
- Take a look at Chinchilla (https://arxiv.org/pdf/2203.15556.pdf)
- Use the FineWeb dataset instead of OpenWebText
- Implement some optimizations to speed up training
- Add the PyTorch profiler
- Implement distributed data parallelism
- Implement some evaluations, e.g. HellaSwag
- EleutherAI LM Evaluation Harness reports
- Check param initialization
- Implement early stopping
- Take a look at PyTorch Lightning
- Implement model parallelism
- Implement pipeline parallelism
- Train with the FineWeb-Edu dataset (10B-token sample)
- Train using an instruction dataset
- Read "Efficient Training of Language Models to Fill in the Middle" (https://arxiv.org/pdf/2207.14255)
- Read WizardLM (https://arxiv.org/pdf/2304.12244)
- Add TensorBoard logging
- Add tracking of different training-run metrics (params, test name, time)
- Augment the logging of training metrics with Weights & Biases (wandb) instead of TensorBoard
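
A few of the items above lend themselves to short code sketches. The blocks below are minimal, illustrative sketches built on common NanoGPT-style conventions; class, config, and helper names (`n_embd`, `n_head`, `get_batch`, ...) are assumptions, not this repository's actual API. The first one shows a single attention class that processes all the heads in parallel by treating them as an extra batch dimension; on PyTorch 2.0+, `F.scaled_dot_product_attention` can also dispatch to a FlashAttention kernel, which touches the Flash Attention item as well.

```python
# Minimal sketch: one class replaces separate Head / MultiHeadAttention modules.
# Names (n_embd, n_head) follow NanoGPT-style configs and are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int, dropout: float = 0.0):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.dropout = dropout
        # a single projection produces queries, keys and values for every head at once
        self.qkv = nn.Linear(n_embd, 3 * n_embd, bias=False)
        self.proj = nn.Linear(n_embd, n_embd, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim): the heads become a batch dimension
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # PyTorch >= 2.0 can dispatch this to a FlashAttention kernel when available
        y = F.scaled_dot_product_attention(
            q, k, v,
            dropout_p=self.dropout if self.training else 0.0,
            is_causal=True,
        )
        y = y.transpose(1, 2).contiguous().view(B, T, C)  # re-assemble the heads
        return self.proj(y)
```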
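A minimal RoPE sketch, assuming an interleaved-pair convention and a `(B, n_head, T, head_dim)` layout; the `apply_rope` name mirrors the item in the list, but the signature is illustrative. The angles are computed in float32, which is one way to sidestep the autocast concern noted above.

```python
# Minimal RoPE sketch: rotate query/key dimension pairs by position-dependent angles.
# theta=10000.0 is the standard base; the function signature is an assumption.
import torch

def apply_rope(x: torch.Tensor, theta: float = 10000.0) -> torch.Tensor:
    # x: (B, n_head, T, head_dim) with an even head_dim
    B, H, T, D = x.shape
    inv_freq = 1.0 / (theta ** (torch.arange(0, D, 2, device=x.device).float() / D))
    angles = torch.arange(T, device=x.device).float()[:, None] * inv_freq[None, :]  # (T, D/2)
    cos, sin = angles.cos(), angles.sin()          # computed in float32 on purpose
    x1, x2 = x[..., 0::2], x[..., 1::2]            # interleaved pairs
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2).type_as(x)          # back to (B, H, T, D) and input dtype
```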
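Weight tying between the token embedding and the `lm_head`, reduced to its essence: both modules point at the same parameter tensor. The sizes below are examples, not the project's config.

```python
# Minimal weight-tying sketch (https://arxiv.org/pdf/1608.05859.pdf).
import torch.nn as nn

vocab_size, n_embd = 50304, 768                  # example sizes, not the repo's config
wte = nn.Embedding(vocab_size, n_embd)           # token embedding
lm_head = nn.Linear(n_embd, vocab_size, bias=False)
lm_head.weight = wte.weight                      # share a single parameter tensor
assert lm_head.weight is wte.weight
```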
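A sketch of Grouped Query Attention under the assumption that the number of query heads is an integer multiple of the number of key/value heads: each K/V head is repeated so that a group of query heads shares it.

```python
# Minimal GQA sketch: fewer K/V heads than query heads, shared within groups.
import torch
import torch.nn.functional as F

def gqa(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q: (B, n_q_heads, T, D); k, v: (B, n_kv_heads, T, D), n_q_heads % n_kv_heads == 0
    n_rep = q.size(1) // k.size(1)
    k = k.repeat_interleave(n_rep, dim=1)   # each K/V head serves a group of query heads
    v = v.repeat_interleave(n_rep, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```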
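A KV-cache sketch for autoregressive decoding: past keys and values are kept around and only the newest token's projections are appended at each step, so nothing in the prefix is recomputed. The shapes follow the attention sketch above and are an assumption about the model's internals.

```python
# Minimal KV-cache sketch for one attention layer during generation.
import torch
import torch.nn.functional as F

@torch.no_grad()
def decode_step(q_new, k_new, v_new, cache=None):
    # q_new, k_new, v_new: (B, n_head, 1, head_dim) for the newest token only
    if cache is None:
        k, v = k_new, v_new
    else:
        k = torch.cat([cache[0], k_new], dim=2)   # append along the time dimension
        v = torch.cat([cache[1], v_new], dim=2)
    # the new token may attend to every cached position, so no causal mask is needed
    y = F.scaled_dot_product_attention(q_new, k, v)
    return y, (k, v)                              # return the updated cache for the next step
```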
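A SwiGLU feed-forward block in the Llama style (gate, up, and down projections); the layer names and the hidden size are illustrative.

```python
# Minimal SwiGLU sketch: down(silu(gate(x)) * up(x)).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    def __init__(self, n_embd: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(n_embd, hidden, bias=False)
        self.up = nn.Linear(n_embd, hidden, bias=False)
        self.down = nn.Linear(hidden, n_embd, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))
```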
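Finally, a combined training-loop sketch showing mixed precision (autocast plus GradScaler), gradient accumulation (micro-batching), and gradient clipping together. `model`, `optimizer`, and `get_batch` are placeholders, and the assumption that the model returns `(logits, loss)` follows NanoGPT conventions rather than this repo's actual interface.

```python
# Minimal sketch combining mixed precision, gradient accumulation and gradient clipping.
import torch

def train(model, optimizer, get_batch, num_steps, accum_steps=8, clip_norm=1.0):
    # get_batch("train") -> (x, y); model(x, y) -> (logits, loss): assumed interfaces
    scaler = torch.cuda.amp.GradScaler()          # loss scaling for float16 autocast
    for step in range(num_steps):
        optimizer.zero_grad(set_to_none=True)
        for _ in range(accum_steps):              # gradient accumulation (micro-batching)
            x, y = get_batch("train")
            with torch.autocast(device_type="cuda", dtype=torch.float16):
                _, loss = model(x, y)
            # divide so the accumulated gradient matches one large batch
            scaler.scale(loss / accum_steps).backward()
        scaler.unscale_(optimizer)                # unscale gradients before clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
        scaler.step(optimizer)
        scaler.update()
```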