gentaiscool

Writing interesting code...

Genta Indra Winata gentaiscool

Researcher @ Capital One AI Foundations. Natural Language Processing, Speech, Multilingual, Code-switching, Dialogue

231 followers · 120 following

Capital One AI Foundations
New York
https://gentawinata.com
@gentaiscool

Stars

termcolor / termcolor

ANSI color formatting for output in terminal

Python · 201 stars · 25 forks · Updated Jul 1, 2024

afaji / summerschool-KD-PEFT

Mexican NLP 2024 Summerschool Tutorial on Knowledge Distillation and Parameter Efficient Finetuning

6 stars · Updated Jun 17, 2024

bayesian-optimization / BayesianOptimization

A Python implementation of global optimization with gaussian processes.

Python · 7,621 stars · 1,515 forks · Updated Jul 4, 2024

gentaiscool / distfuse

A library to calculate similarity scores between two collections of text sequences encoded using transformer models for bitext mining, dense retrieval, retrieval-based classification, and retrieval…

Python · 4 stars · 2 forks · Updated Jun 22, 2024

ZurichNLP / nmtscore

A library of translation-based text similarity measures

Python · 23 stars · 5 forks · Updated Dec 11, 2023

Tiiiger / bert_score

BERT score for text generation

Jupyter Notebook · 1,491 stars · 206 forks · Updated Jun 14, 2024

embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark

Python · 1,623 stars · 212 forks · Updated Jul 5, 2024

davidanugraha / proxylm

Implementation of ProxyLM, a scalable and efficient LM performance prediction framework on NLP task using proxy models

Python · 3 stars · Updated Jun 15, 2024

BatsResearch / LexC-Gen

Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.

Python · 9 stars · 2 forks · Updated Jul 1, 2024

gentaiscool / miners

MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models.

Python · 7 stars · 1 fork · Updated Jun 17, 2024

SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.

Python · 55 stars · 54 forks · Updated Jun 24, 2024

SamuelCahyawijaya / in-context-alignment

Jupyter Notebook · 3 stars · Updated Jun 24, 2024

dehanalkautsar / IndoToD

IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems

1 star · Updated Jun 10, 2024

IndoNLP / cendol

Indonesian T0 | Instruction-tuning for low-resource and extremely low-resource Austronesian languages

Jupyter Notebook · 9 stars · 1 fork · Updated Jun 24, 2024

l3cube-pune / code-mixed-nlp

This repository is dedicated to development of code-mixed language resources.

22 stars · 1 fork · Updated Jul 22, 2023

IyanuSh / NollySenti

Nollywood Movie Reviews in 5 Nigerian Languages

Shell · 4 stars · Updated May 18, 2024

nlp-uoregon / Okapi

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

Python · 83 stars · 2 forks · Updated Aug 18, 2023

LAION-AI / Open-Instruction-Generalist

Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks

Python · 203 stars · 19 forks · Updated Jan 13, 2024

Genius1237 / numpy-gpt2

Python · 1 star · Updated Aug 7, 2023

srush / do-we-need-attention

TeX · 158 stars · 7 forks · Updated Jul 5, 2023

Nyandwi / machine_learning_complete

A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.

Jupyter Notebook · 4,545 stars · 737 forks · Updated Sep 22, 2023

IndoNLP / nusa-writes

NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

Jupyter Notebook · 26 stars · 1 fork · Updated Feb 26, 2024

kongaskristjan / rubik

Solve a Rubik's Cube with neural networks

Python · 5 stars · Updated Aug 4, 2021

forestagostinelli / DeepCubeA

Code for DeepCubeA, a Deep Reinforcement Learning algorithm that can learn to solve the Rubik's cube.

Python · 145 stars · 49 forks · Updated Aug 31, 2023

Southeast-Asia-NLP / LLM-Code-Mixing

Can LLMs generate code-mixed sentences through zero-shot prompting?

11 stars · Updated Apr 18, 2023

meta-llama / llama

Inference code for Llama models

Python · 54,142 stars · 9,316 forks · Updated May 15, 2024

bloomberg / minilmv2.bb

Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)

Python · 59 stars · 6 forks · Updated Jun 12, 2023

neulab / globalbench

GlobalBench: A Benchmark for Global Progress in Language Technology

Python · 6 stars · Updated Dec 7, 2023

ExpressAI / DataLab

The unified platform for data-related resources.

Python · 128 stars · 28 forks · Updated Mar 6, 2023

aparnadutta / code-mixed-lid

Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.

Python · 8 stars · 1 fork · Updated Aug 13, 2023
