esbenkc

Organizations

@apartresearch

Starred repositories

34 stars written in Jupyter Notebook

An adversarial example library for constructing attacks, building defenses, and benchmarking both

Jupyter Notebook 6,106 1,386 Updated Apr 10, 2024

Book about interpretable machine learning

Jupyter Notebook 4,718 1,042 Updated May 26, 2024

The hub for EleutherAI's work on interpretability and learning dynamics

Jupyter Notebook 2,120 155 Updated Jun 18, 2024

Discovering Interpretable GAN Controls [NeurIPS 2020]

Jupyter Notebook 1,774 265 Updated Jan 20, 2023

Takagi and Nishimoto, CVPR 2023

Jupyter Notebook 1,073 59 Updated Aug 17, 2023

TruthfulQA: Measuring How Models Imitate Human Falsehoods

Jupyter Notebook 528 59 Updated Nov 6, 2023

Emergent world representations: Exploring a sequence model trained on a synthetic task

Jupyter Notebook 157 39 Updated Jul 12, 2023

Training pipeline for end-to-end self-driving with Comma AI's Openpilot. WIP

Jupyter Notebook 92 35 Updated Apr 20, 2022

tomsup 👍 Theory of Mind Simulation using Python. A package that allows for easy agent-based modelling of recursive Theory of Mind

Jupyter Notebook 62 6 Updated Aug 1, 2023

WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method which reduces LLM performance on WMDP while retaining …

Jupyter Notebook 52 11 Updated Apr 27, 2024

Machine Learning for Alignment Bootcamp

Jupyter Notebook 23 12 Updated Mar 7, 2024

Tools for exploring Transformer neuron behaviour, including input pruning and diversification.

Jupyter Notebook 16 5 Updated Sep 28, 2023
Jupyter Notebook 13 2 Updated Mar 31, 2024

✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks

Jupyter Notebook 10 Updated Apr 23, 2024

Mechanistic interpretability tutorials, results, and a research log as I learn from publicly available research and experimentation.

Jupyter Notebook 8 2 Updated Sep 13, 2023

🔥 A repository for collecting cyberdefense thoughts, books, and documents about AI cyberdefense

Jupyter Notebook 5 1 Updated Jul 2, 2023

Code templates to get started as an AI psychologist

Jupyter Notebook 4 Updated Oct 31, 2022

This repository contains code for the Democracy x AI Hackathon by Apart Research

Jupyter Notebook 4 2 Updated May 9, 2024

A tool for exploring EA Forum and LessWrong

Jupyter Notebook 3 Updated Jan 3, 2023

Alignment Jam - Interpretability Hackathon

Jupyter Notebook 3 1 Updated Nov 13, 2022

An easy, straightforward tutorial for finetuning a Danish BERT using simpletransformers

Jupyter Notebook 3 Updated Jan 24, 2022
Jupyter Notebook 2 Updated Feb 13, 2023
Jupyter Notebook 2 Updated Jan 21, 2023

🧠 EEG classification in a P300 typing task from the BR41N.IO hackathon 2021

Jupyter Notebook 2 Updated Jul 13, 2021
Jupyter Notebook 2 1 Updated Nov 11, 2022
Jupyter Notebook 2 Updated May 27, 2024

Interpretability Hackathon 2.0 entry

Jupyter Notebook 2 Updated Apr 28, 2023
Jupyter Notebook 1 Updated Nov 13, 2022

Tools for exploring Transformer neuron behaviour, including input pruning and diversification.

Jupyter Notebook 1 Updated Aug 9, 2023