Stars
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
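Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing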
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
A game theoretic approach to explain the output of any machine learning model.
SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection
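Collecting, organizing, and publishing Chinese natural language processing corpora and datasets, to advance Chinese natural language processing together with like-minded contributors.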
Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo
Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
An open-source library designed for the evaluation of Spiking Neural Networks (SNNs).
Code for paper "Measuring Social Biases in Grounded Vision and Language Embeddings"
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning [AACL 2022]
A large-scale face dataset for face parsing, recognition, generation and editing.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image.
A guidance language for controlling large language models.
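A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at lower cost, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.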
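Llama Chinese community: Llama3 online demos and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3, with the aim of building the best Chinese Llama large model, fully open source and commercially usable.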
An LSTM implementation in PyTorch, trained for language modeling.
Repository for research in the field of Responsible NLP at Meta.