PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.


Large language models (LLMs) have immense potential for general intelligence, but they also carry significant risks. As a research team at Peking University, we focus on alignment techniques for large language models, such as safety alignment, to improve model safety and reduce toxicity.

We welcome you to follow our AI safety projects:

Pinned

  1. omnisafe (Public)

     OmniSafe is an infrastructural framework for accelerating SafeRL research (see the usage sketch after this list).

     Python · 868 stars · 126 forks

  2. safety-gymnasium (Public)

     NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark (see the rollout sketch after this list).

     Python · 338 stars · 49 forks

  3. safe-rlhf (Public)

     Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

     Python · 1.2k stars · 107 forks

  4. Safe-Policy-Optimization (Public)

     NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

     Python · 302 stars · 41 forks
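
For context, a minimal OmniSafe training sketch, modeled on the quick-start pattern in the project's README (install with: pip install omnisafe). The algorithm name 'PPOLag' and the Safety-Gymnasium task ID are illustrative choices, not the only options the library supports.

    # Minimal OmniSafe training sketch; algorithm and task ID are illustrative.
    # 'PPOLag' is PPO with a Lagrangian cost constraint.
    import omnisafe

    env_id = 'SafetyPointGoal1-v0'            # a Safety-Gymnasium navigation task
    agent = omnisafe.Agent('PPOLag', env_id)  # choose the algorithm and environment
    agent.learn()                             # train, logging reward and constraint cost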
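
Safety-Gymnasium follows the standard Gymnasium environment API but additionally returns a per-step safety cost. Below is a random-policy rollout sketch, assuming the package is installed (pip install safety-gymnasium) and using an illustrative task ID.

    # Random-policy rollout sketch for Safety-Gymnasium; the task ID is illustrative.
    # Unlike plain Gymnasium, step() also returns a per-step safety cost.
    import safety_gymnasium

    env = safety_gymnasium.make('SafetyPointGoal1-v0')
    obs, info = env.reset(seed=0)
    episode_cost = 0.0
    for _ in range(1000):
        action = env.action_space.sample()  # random actions, for illustration only
        obs, reward, cost, terminated, truncated, info = env.step(action)
        episode_cost += cost                # accumulate constraint-violation signal
        if terminated or truncated:
            obs, info = env.reset()
    env.close()
    print(f'Total cost over the rollout: {episode_cost}')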

Repositories

Showing 10 of 12 repositories
  • safe-sora (Public)

    SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).

    Python · 14 stars · 3 forks · 0 open issues · 0 open PRs · Updated Jun 22, 2024

  • omnisafe (Public)

    OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 868 stars · Apache-2.0 · 126 forks · 12 open issues · 6 open PRs · Updated Jun 19, 2024

  • safe-rlhf (Public)

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1,210 stars · Apache-2.0 · 107 forks · 14 open issues · 0 open PRs · Updated Jun 13, 2024

  • .github (Public)

    0 stars · 0 forks · 0 open issues · 0 open PRs · Updated Jun 12, 2024

  • llms-resist-alignment (Public)

    Repo for the paper "Language Models Resist Alignment"

    Python · 2 stars · 0 forks · 0 open issues · 0 open PRs · Updated Jun 9, 2024

  • safety-gymnasium (Public)

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 338 stars · Apache-2.0 · 49 forks · 1 open issue · 0 open PRs · Updated May 14, 2024

  • ProAgent (Public)

    ProAgent: Building Proactive Cooperative Agents with Large Language Models

    JavaScript · 39 stars · MIT · 3 forks · 1 open issue · 0 open PRs · Updated Apr 8, 2024

  • SafeDreamer (Public)

    ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models

    Python · 31 stars · Apache-2.0 · 2 forks · 0 open issues · 0 open PRs · Updated Apr 8, 2024

  • Safe-Policy-Optimization (Public)

    NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

    Python · 302 stars · Apache-2.0 · 41 forks · 0 open issues · 0 open PRs · Updated Mar 20, 2024

  • AlignmentSurvey (Public)

    AI Alignment: A Comprehensive Survey

    118 stars · 0 forks · 0 open issues · 0 open PRs · Updated Nov 2, 2023
