PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.


Large language models (LLMs) have immense potential for general intelligence, but they also carry significant risks. As a research team at Peking University, we focus on alignment techniques for large language models, such as safety alignment, to improve model safety and reduce toxicity.

We welcome you to follow our AI safety projects:

Pinned

  1. omnisafe (Public)

     OmniSafe is an infrastructural framework for accelerating SafeRL research (see the usage sketch after this list).

     Python · 868 stars · 126 forks

  2. safety-gymnasium (Public)

     NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark (see the rollout sketch after this list).

     Python · 338 stars · 49 forks

  3. safe-rlhf (Public)

     Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

     Python · 1.2k stars · 107 forks

  4. Safe-Policy-Optimization (Public)

     NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

     Python · 302 stars · 41 forks
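
For context, a minimal OmniSafe training sketch, modeled on the quick-start pattern in the project's README (install with: pip install omnisafe). The algorithm name 'PPOLag' and the Safety-Gymnasium task ID are illustrative choices, not the only options the library supports.

    # Minimal OmniSafe training sketch; algorithm and task ID are illustrative.
    # 'PPOLag' is PPO with a Lagrangian cost constraint.
    import omnisafe

    env_id = 'SafetyPointGoal1-v0'            # a Safety-Gymnasium navigation task
    agent = omnisafe.Agent('PPOLag', env_id)  # choose the algorithm and environment
    agent.learn()                             # train, logging reward and constraint cost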
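
Safety-Gymnasium follows the standard Gymnasium environment API but additionally returns a per-step safety cost. Below is a random-policy rollout sketch, assuming the package is installed (pip install safety-gymnasium) and using an illustrative task ID.

    # Random-policy rollout sketch for Safety-Gymnasium; the task ID is illustrative.
    # Unlike plain Gymnasium, step() also returns a per-step safety cost.
    import safety_gymnasium

    env = safety_gymnasium.make('SafetyPointGoal1-v0')
    obs, info = env.reset(seed=0)
    episode_cost = 0.0
    for _ in range(1000):
        action = env.action_space.sample()  # random actions, for illustration only
        obs, reward, cost, terminated, truncated, info = env.step(action)
        episode_cost += cost                # accumulate constraint-violation signal
        if terminated or truncated:
            obs, info = env.reset()
    env.close()
    print(f'Total cost over the rollout: {episode_cost}')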

Repositories

Showing 10 of 12 repositories
  • safe-sora (Public)

    SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).

    Python · 14 stars · 3 forks · 0 open issues · 0 open PRs · Updated Jun 22, 2024

  • omnisafe (Public)

    OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 868 stars · Apache-2.0 · 126 forks · 12 open issues · 6 open PRs · Updated Jun 19, 2024

  • safe-rlhf (Public)

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1,210 stars · Apache-2.0 · 107 forks · 14 open issues · 0 open PRs · Updated Jun 13, 2024

  • .github (Public)

    0 stars · 0 forks · 0 open issues · 0 open PRs · Updated Jun 12, 2024

  • llms-resist-alignment (Public)

    Repo for the paper "Language Models Resist Alignment"

    Python · 2 stars · 0 forks · 0 open issues · 0 open PRs · Updated Jun 9, 2024

  • safety-gymnasium (Public)

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 338 stars · Apache-2.0 · 49 forks · 1 open issue · 0 open PRs · Updated May 14, 2024

  • ProAgent (Public)

    ProAgent: Building Proactive Cooperative Agents with Large Language Models

    JavaScript · 39 stars · MIT · 3 forks · 1 open issue · 0 open PRs · Updated Apr 8, 2024

  • SafeDreamer (Public)

    ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models

    Python · 31 stars · Apache-2.0 · 2 forks · 0 open issues · 0 open PRs · Updated Apr 8, 2024

  • Safe-Policy-Optimization (Public)

    NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

    Python · 302 stars · Apache-2.0 · 41 forks · 0 open issues · 0 open PRs · Updated Mar 20, 2024

  • AlignmentSurvey (Public)

    AI Alignment: A Comprehensive Survey

    118 stars · 0 forks · 0 open issues · 0 open PRs · Updated Nov 2, 2023
