
A curated collection of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers g…

55 2 Updated Jul 12, 2024

A python module to repair invalid JSON, commonly used to parse the output of LLMs

Python 646 37 Updated Aug 28, 2024

A reading list on LLM based Synthetic Data Generation 🔥

73 3 Updated Aug 18, 2024

This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods.

Python 25 3 Updated Aug 13, 2024

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook 319 26 Updated Jun 29, 2024

A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨

Python 66 7 Updated Apr 26, 2024

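Llama3 and Llama3.1 Chinese repository (companion book in progress... interesting weights from fine-tuned and heavily modified versions by community members and vendors, plus tutorial videos and documentation on training, inference, evaluation, and deployment)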

Python 3,859 315 Updated Aug 16, 2024

A framework for few-shot evaluation of language models.

Python 6,246 1,649 Updated Aug 28, 2024
Jupyter Notebook 246 14 Updated Jul 22, 2024

Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models"

Python 29 2 Updated Jun 24, 2024

This repository mainly collects reading notes on top-conference papers relevant to NLP algorithm engineers.

C++ 3,840 660 Updated Aug 18, 2023

Advanced Retrieval-Augmented Generation (RAG) through practical notebooks, using LangChain, OpenAI GPTs, Meta Llama 3, and agents.

Jupyter Notebook 148 27 Updated Apr 26, 2024

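Li Bai 👤 As an outstanding poet of the Tang dynasty, Li Bai holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also been facing innovation and change. Although research on Li Bai's poetry in China and abroad is already quite thorough, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent that, in the form of a generative dialogue application, promotes the popularization of Li Bai's culture.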

Python 1,095 124 Updated Jul 12, 2024

Foundations of Large Models: the basics of large models in a single article

2,414 218 Updated Aug 13, 2024

In this blog, we will build a small scale text-to-video model from scratch. We will input a text prompt, and our trained model will generate a video based on that prompt.

Jupyter Notebook 110 20 Updated Jun 23, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,173 66 Updated Aug 21, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 11,205 789 Updated Aug 25, 2024

DataComp for Language Models

HTML 1,080 96 Updated Aug 19, 2024
Python 104 11 Updated Apr 16, 2024

interest repositories

182 42 Updated Feb 6, 2024

Extensible, parallel implementations of t-SNE

Python 1,439 158 Updated Aug 13, 2024

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

Python 357 35 Updated Aug 28, 2024

A SOTA lightweight multilingual LLM

Python 798 42 Updated Jul 8, 2024

A list of AI autonomous agents

9,428 672 Updated Jul 30, 2024

Minimal keyword extraction with BERT

Python 3,407 342 Updated Jul 16, 2024

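A context-aware AI translation engine that gives websites friendlier translations, so that everyone can enjoy a reading experience close to that of their native language.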

JavaScript 1,296 59 Updated Jun 10, 2024
Python 407 39 Updated Jul 17, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 1,888 128 Updated Aug 28, 2024

Easily embed, cluster and semantically label text datasets

Python 421 32 Updated Mar 28, 2024

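MNBVC (Massive Never-ending BT Vast Chinese corpus), an ultra-large-scale Chinese corpus that targets the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche cultures and even "Martian script" text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.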

3,332 231 Updated Aug 19, 2024