- Copenhagen, Denmark
- https://a-part.ai
- @esbenkc
Language: Jupyter Notebook
Sort by: Most stars
Starred repositories
An adversarial example library for constructing attacks, building defenses, and benchmarking both
Book about interpretable machine learning
The hub for EleutherAI's work on interpretability and learning dynamics
Discovering Interpretable GAN Controls [NeurIPS 2020]
Takagi and Nishimoto, CVPR 2023
TruthfulQA: Measuring How Models Imitate Human Falsehoods
Emergent world representations: Exploring a sequence model trained on a synthetic task
Training pipeline for end-to-end self-driving with Comma AI's Openpilot. WIP
tomsup 👍 Theory of Mind Simulation using Python. A package that allows for easy agent-based modelling of recursive Theory of Mind
WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method which reduces LLM performance on WMDP while retaining …
EffiSciencesResearch / ML4G
Forked from crsegerie/mlab
Machine Learning for Alignment Bootcamp
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks
Mechanistic Interpretability Tutorials, Results and research log as I learn from publicly available research, and experimentation.
🔥 A repository for collecting cyberdefense thoughts, books, and documents about AI cyberdefense
Code templates to get started as an AI psychologist
This repository contains code for the Democracy x AI Hackathon by Apart Research
A tool for exploring EA Forum and LessWrong
Alignment Jam - Interpretability Hackathon
An easy straightforward tutorial for finetuning a Danish BERT using simpletransformers
🧠 EEG classification in a P300 typing task from the BR41N.IO hackathon 2021
apartresearch / othelloscope
Forked from likenneth/othello_world
Interpretability Hackathon 2.0 entry
apartresearch / n2g
Forked from apartresearch/Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.