fajri91
Python 19 4 Updated Jul 12, 2024

Evaluation and analysis code for LLM360

Python 75 7 Updated Jun 3, 2024

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.

Python 55 54 Updated Jul 8, 2024

NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

Jupyter Notebook 25 1 Updated Feb 26, 2024
Python 17 8 Updated Oct 10, 2023

Multicultural Proverbs and Sayings

Python 9 Updated May 8, 2024
Python 2 Updated Dec 6, 2022

CMMLU: Measuring massive multitask language understanding in Chinese

Python 625 40 Updated Jul 8, 2024

A Multilingual Replicable Instruction-Following Model

Python 92 3 Updated Jun 11, 2023
Python 1 Updated Sep 16, 2022

Discourse Probing of Pretrained Language Models. In Proceedings of NAACL 2021.

Jupyter Notebook 9 1 Updated Jun 27, 2022

A framework for assessing and improving classification fairness.

Jupyter Notebook 31 8 Updated Jun 12, 2023

High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

Jupyter Notebook 84 8 Updated May 8, 2023

Minangkabau NLP corpus. PACLIC 2020

Python 10 2 Updated Jun 7, 2021

Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of ACL 2021.

Jupyter Notebook 2 1 Updated Jul 26, 2021

Indonesia Sentiment Lexicon

78 22 Updated Aug 5, 2019

IndoNLI

Python 18 3 Updated Dec 4, 2021

KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation

Python 31 7 Updated Aug 31, 2021

EACL 2021

Python 10 4 Updated May 4, 2021

IndoBERTweet is the first large-scale pretrained model for Indonesian Twitter. Published at EMNLP 2021 (main conference)

Python 55 5 Updated Sep 13, 2021

Complete Web Scraping of TED.com for Metadata, Transcript, Audio, Video, Images using Parallel Programming

Jupyter Notebook 11 4 Updated Jun 25, 2020

Classification of Twitter users' personalities based on their tweets, using the Big Five model.

Python 14 5 Updated Aug 30, 2020

The Dataset for Hate Speech Detection in Indonesian (Bahasa Indonesia)

25 16 Updated Jul 6, 2022

Summarization Papers

TeX 976 142 Updated Jul 15, 2023

BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)

Python 568 52 Updated Dec 15, 2023
Python 11 5 Updated Dec 8, 2022

The first large-scale summarization corpus for the Indonesian language. AACL 2020.

Python 35 8 Updated Mar 4, 2021

IndoLEM is a comprehensive Indonesian NLU benchmark comprising three pillars of NLP tasks: morpho-syntax, semantics, and discourse. Presented at COLING 2020.

Python 90 27 Updated Dec 14, 2020