Stars
A summary of existing representative text datasets for LLMs.
[NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evaluations".
LogiQA 2.0 dataset - logical reasoning in MRC and NLI tasks
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
mlr3 extension for Fairness in Machine Learning
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Source code for the ICLR'22 paper "VOS: Learning What You Don’t Know by Virtual Outlier Synthesis"
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …
This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models" (EMNLP 2020).
Data for evaluating gender bias in coreference resolution systems.
The machine learning toolkit for time series analysis in Python
Open-source simulator for autonomous driving research.
The repository for the paper "Evaluating Open-QA Evaluation"
[ICML'2024] Can AI Assistants Know What They Don't Know?
TISSUE (Transcript Imputation with Spatial Single-cell Uncertainty Estimation) provides tools for estimating well-calibrated uncertainty measures for gene expression predictions in single-cell spat…
Modeling, training, eval, and inference code for OLMo
Code for the paper "Calibrating Deep Neural Networks using Focal Loss"
Extending Conformal Prediction to LLMs
FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age
Code for the paper "DivideMix: Learning with Noisy Labels as Semi-supervised Learning"
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
A curated (most recent) list of resources for Learning with Noisy Labels