Fuzzy-JSON is a compact, dependency-free Python package designed to handle the JSONDecodeError that sometimes occurs when using OpenAI's function calling feature.

Python 31 5 Updated Nov 1, 2024

noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model

Python 1,514 67 Updated Oct 16, 2024

huggingface / datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,023 144 Updated Oct 31, 2024

RUC-NLPIR / FlashRAG

⚡FlashRAG: A Python Toolkit for Efficient RAG Research

Python 1,287 105 Updated Nov 7, 2024

microsoft / BlingFire

A lightning fast Finite State machine and REgular expression manipulation library.

C++ 1,831 128 Updated Oct 24, 2023

stanford-oval / wikidata-emnlp23

WikiSP, a semantic parser for Wikidata. WikiWebQuestions, a SPARQL-annotated dataset on Wikidata

Python 83 8 Updated Oct 21, 2024

vegetablejuiceftw / wiki-search

Wikipedia / Wikidata search project for knowledge base RAG systems.

Python 3 Updated Jun 21, 2024

NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 9,425 1,443 Updated Oct 21, 2024

SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Python 483 33 Updated Nov 2, 2024

wangyuxinwhy / uniem

unified embedding model

Python 828 64 Updated Sep 1, 2023

naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,676 1,093 Updated May 23, 2024

LlamaFamily / Llama-Chinese

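Llama Chinese community. Llama3 online demo and fine-tuned models are now available, with the latest Llama3 learning resources collected in real time. All code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and commercially usable.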

Python 13,954 1,253 Updated Sep 5, 2024

texttron / tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Python 516 100 Updated Oct 20, 2024

rohan-paul / LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

Jupyter Notebook 464 110 Updated May 19, 2024

facebookresearch / tart

Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.

Python 159 11 Updated Oct 4, 2023

jakespringer / echo-embeddings

Python 122 7 Updated Apr 17, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 33,806 4,160 Updated Nov 4, 2024

ContextualAI / gritlm

Generative Representational Instruction Tuning

Jupyter Notebook 561 40 Updated Nov 6, 2024

McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,255 93 Updated Oct 8, 2024

castorini / rank_llm

RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.

Python 335 42 Updated Nov 3, 2024

sunnweiwei / RankGPT

Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]

Python 524 48 Updated Mar 10, 2024

quqxui / Awesome-LLM4IE-Papers

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

738 42 Updated Oct 31, 2024