GorkaUrbizu

Gorka Urbizu Garmendia GorkaUrbizu

Researcher at Orai NLP Technologies | PhD student

5 followers · 14 following

Orai NLP Technologies
Basque Country
@GorkaUrbizu

Stars

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,157 210 Updated Nov 6, 2024

young-geng / tpu_pod_commander

TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.

Python 14 Updated Jun 24, 2024

google / paxml

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…

Python 456 68 Updated Oct 28, 2024

rspeer / python-ftfy

Fixes mojibake and other glitches in Unicode text, after the fact.

Python 3,810 121 Updated Oct 30, 2024

NVIDIA / NeMo-Curator

Scalable data preprocessing and curation toolkit for LLMs

Jupyter Notebook 585 78 Updated Nov 8, 2024

HMUNACHI / nanodl

A Jax-based library for designing and training transformer models from scratch.

Python 275 11 Updated Aug 28, 2024

stanford-crfm / levanter

Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax

Python 516 81 Updated Nov 8, 2024

AI-Hypercomputer / maxtext

A simple, performant and scalable Jax LLM!

Python 1,524 292 Updated Nov 9, 2024

erfanzar / EasyDeL

Accelerate and optimize performance with streamlined training and serving options in JAX.

Python 202 25 Updated Nov 9, 2024

simonucl / EasyLM

Forked from hamishivi/EasyLM

Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.

Python 2 Updated Feb 5, 2024

lucidrains / self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Python 1,333 73 Updated Apr 11, 2024

VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy

Python 17,595 1,008 Updated Nov 7, 2024

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,840 461 Updated May 3, 2024

kanishkamisra / minicons

Utility for behavioral and representational analyses of Language Models

Python 122 29 Updated Aug 30, 2024

ggerganov / llama.cpp

LLM inference in C/C++

C++ 67,500 9,691 Updated Nov 9, 2024

facebookresearch / belebele

Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.

Python 314 21 Updated Aug 12, 2024

MonsoonNLP / byt5-dv

ByT5 model scripts

Jupyter Notebook 2 Updated Jul 12, 2021

togethercomputer / RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,569 350 Updated Oct 17, 2024

oltoporkov / morphological-information-datasets

A collection of files containing datasets for 6 languages (Russian, Basque, Turkish, Spanish, Czech and English) with labels of differing morphological complexity

Python 1 Updated Feb 29, 2024

EdinburghNLP / awesome-hallucination-detection

List of papers on hallucination detection in LLMs.

669 54 Updated Nov 1, 2024

DFKI-NLP / gevalm

Code and data for the paper "Evaluating German Transformer Language Models with Syntactic Agreement Tests" (Zaczynska et al., 2020)

Python 7 2 Updated Jun 12, 2023

young-geng / EasyLM

Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.

Python 2,402 255 Updated Aug 13, 2024

bazingagin / npc_gzip

Code for Paper: “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors

Python 1,769 155 Updated Aug 7, 2023

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 37,211 5,922 Updated Aug 19, 2024

linhduongtuan / BLOOM-LORA

Because of LLaMA's license restrictions, we reimplement BLOOM-LoRA (under the much less restrictive BLOOM license, see https://huggingface.co/spaces/bigscience/license) using Alpaca-LoRA and Alpaca_data_cleaned.json

Jupyter Notebook 183 39 Updated Jun 18, 2023

allenai / natural-instructions

Expanding natural instructions

Python 956 189 Updated Dec 11, 2023

ltgoslo / ltg-bert

LTG-Bert

Python 29 4 Updated Jan 8, 2024

IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 1,923 153 Updated Mar 27, 2024

IamAdiSri / hf-trim

Reduce the size of pretrained Hugging Face models via vocabulary trimming.

Python 43 5 Updated Dec 28, 2022

xcfcode / Summarization-Papers

Summarization Papers

TeX 985 143 Updated Jul 15, 2023
