Official implementation of the paper "Process Reward Model with Q-value Rankings"
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
A survey on harmful fine-tuning attacks for large language models
A reading list on LLM based Synthetic Data Generation 🔥
MULFE: Multi-Level Benchmark for Free Text Model Editing
Sparse Autoencoder for Mechanistic Interpretability
A curated list of Large Language Model (LLM) Interpretability resources.
Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
A curated list of LLM interpretability-related material - tutorials, libraries, surveys, papers, blogs, etc.
An awesome repository & comprehensive survey on the interpretability of LLM attention heads.
A collection of AWESOME things about mixture-of-experts
Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
A modular graph-based Retrieval-Augmented Generation (RAG) system
This repository collects all relevant resources about interpretability in LLMs
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
[NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts
Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873
The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models"
[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation" by Chongyu Fan*, Jiancheng Liu*, Yihua Zhang, Eric Wong, D…
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
The official implementation of our ICLR 2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".
A curated list of resources dedicated to retrieval-augmented generation (RAG).
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
Attribute (or cite) statements generated by LLMs back to in-context information.
Steering Llama 2 with Contrastive Activation Addition
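For context on the last entry, here is a minimal, illustrative sketch of contrastive-activation-addition-style steering with Hugging Face `transformers`. It is not the linked repository's code: the model name, layer index, steering coefficient, and contrastive prompts are assumptions chosen for illustration, and the vector is applied at every token position as a simplification.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-chat-hf"  # assumed model; CAA was demonstrated on Llama 2 chat models
LAYER = 13    # residual-stream layer to steer (assumption)
COEFF = 4.0   # steering strength (assumption)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16, device_map="auto")
model.eval()

def last_token_resid(prompt: str) -> torch.Tensor:
    """Residual-stream activation of the final prompt token after decoder layer LAYER."""
    ids = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so index LAYER + 1 is the output of layer LAYER
    return out.hidden_states[LAYER + 1][0, -1, :]

# Contrastive prompt pair (illustrative); the steering vector is their activation difference.
pos = "Question: Will you help the user? Answer: Yes, absolutely"
neg = "Question: Will you help the user? Answer: No, I refuse"
steer = last_token_resid(pos) - last_token_resid(neg)

def add_steering(module, inputs, output):
    # Llama decoder layers return a tuple whose first element is the hidden states.
    hidden = output[0] + COEFF * steer.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.model.layers[LAYER].register_forward_hook(add_steering)
try:
    ids = tok("Tell me about your day.", return_tensors="pt").to(model.device)
    gen = model.generate(**ids, max_new_tokens=60, do_sample=False)
    print(tok.decode(gen[0], skip_special_tokens=True))
finally:
    handle.remove()  # always detach the hook so later generations are unsteered
```

Flipping the sign of `COEFF` steers generations toward the negative prompt's behavior instead; the layer and coefficient typically need tuning per model.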