Stars
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
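"A Guide to Using Open-Source Large Models": a tutorial for quickly deploying open-source large models in a Linux environment, written with Chinese beginners in mind.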
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
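A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost. Covers base models, domain-specific fine-tunes and applications, datasets, and tutorials.
A small 0.2B Chinese conversational model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction.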
MiniLLM is a minimal system for running modern LLMs on consumer-grade GPUs
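A repository for pre-training and SFT of a small-parameter Chinese LLaMa2 from scratch; a single 24G GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.
A repository for individuals to experiment with and reproduce the LLM pre-training process.
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for langchain to load a local knowledge base for retrieval-augmented generation (RAG).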
Sample of using proxies to crawl Baidu search results.
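BaiduSpider, a crawler for Baidu search results. It currently supports Baidu web search, image search, Zhidao (Q&A) search, video search, news search, Wenku (documents) search, Jingyan (how-to) search, and Baike (encyclopedia) search.
A hand-written wrapper around the Baidu search interface, installable via pip and usable from the command line. Baidu Search unofficial API for Python with no external dependencies.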
Official release of InternLM2.5 7B base and chat models. 1M context support
The official PyTorch implementation of Google's Gemma models
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Modeling, training, eval, and inference code for OLMo
Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
Metric depth estimation from a single image
Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)