WayXG

Popular repositories

  1. Online-RLHF (Public)

    Forked from RLHFlow/Online-RLHF

    A recipe for online RLHF.

    Python

  2. RLHF-Reward-Modeling (Public)

    Forked from RLHFlow/RLHF-Reward-Modeling

    Recipes to train reward model for RLHF.

    Python

  3. ToRA (Public)

    Forked from WeiXiongUST/ToRA

    ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

    Python

  4. RLHF4MATH_Dev (Public)

    Python

  5. preference-construction (Public)

    Python

  6. RAFT (Public)

    Forked from RLHFlow/RAFT

    This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.

    Python