
The hub for EleutherAI's work on interpretability and learning dynamics

Jupyter Notebook 2,155 156 Updated Jul 12, 2024
Jupyter Notebook 54 8 Updated Jul 15, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,376 433 Updated May 3, 2024

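A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.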

Python 2,351 289 Updated May 21, 2024

The memory layer for Personalized AI

Python 15,638 1,555 Updated Jul 23, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,105 116 Updated Jun 26, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 1,796 112 Updated Jul 22, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 38,525 5,262 Updated Jul 23, 2024

Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

Python 127 4 Updated Mar 8, 2023

Chat with any PDF. Easily upload the PDF documents you'd like to chat with. Instant answers. Ask questions, extract information, and summarize documents with AI. Sources included.

Jupyter Notebook 1,337 208 Updated Jun 12, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 23,231 2,395 Updated Jul 23, 2024

DataComp for Language Models

HTML 639 49 Updated Jul 22, 2024

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,458 290 Updated Jun 4, 2024

This project aims to share the technical principles behind large models along with hands-on experience.

HTML 8,039 783 Updated Jul 17, 2024

Fast and memory-efficient exact attention

Python 12,552 1,118 Updated Jul 23, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 11,363 860 Updated May 23, 2024

Large model fundamentals: the basics of large models in one article

2,067 185 Updated Jul 11, 2024

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,561 949 Updated Jul 10, 2024

Language Technology Platform

Python 4,886 1,036 Updated Jul 1, 2024

An upgraded version of SimBERT (SimBERTv2)!

Python 433 72 Updated Mar 21, 2022

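text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.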

Python 4,290 387 Updated Feb 21, 2024

A work in progress. Trying to write about all interesting or necessary pieces in the current development of LLMs and generative AI. Gradually adding more topics.

Jupyter Notebook 179 9 Updated Sep 14, 2023

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Python 2,942 526 Updated May 9, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,578 1,018 Updated Jun 26, 2024

Source code for the NAACL 2022 paper "Weakly Supervised Text Classification using Supervision Signals from a Language Model"

Python 10 Updated Jun 13, 2022

The code for the ACL 2023 paper "Linear Classifier: An Often-Forgotten Baseline for Text Classification".

Python 16 1 Updated Jun 29, 2024

A library for multi-class and multi-label classification

Python 147 29 Updated Jul 22, 2024

Official resources of "Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification" (ACL 2023 long).

Python 25 1 Updated Jul 30, 2023

Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)

Python 4,279 383 Updated Apr 14, 2024