aurora-m Public
Forked from SkunkworksAI/BakLLaVA
Adapting Starcoderplus for Multimodal Experts
-
aurora Public
Multilingual, Multimodal, Multidomain model based on Starcoderplus and BakLLaVA
-
M3rlin Public
Multilingual, Multimodal, Multidomain (M3) Model
-
M3rlin-fmengine Public
Forked from eth-easl/fmengine
M3 Training Using FMengine
-
rio Public
Text pre-processing for NLP datasets
-
sungai Public
Sample multilingual data and tools for creating the data, used for multilingual NLP research
-
tevatron Public
Forked from texttron/tevatron
Tevatron - A flexible toolkit for dense retrieval research and development.
Python Apache License 2.0 Updated Nov 24, 2022 -
muliwai Public
Forked from piisa/muliwai
Experimental PII framework
-
data_tooling Public
Forked from bigscience-workshop/data_tooling
How should we store and serve the dataset?
HTML Apache License 2.0 Updated Mar 4, 2022 -
Megatron-DeepSpeed Public
Forked from bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Python Other Updated Oct 17, 2021 -
pii_processing Public
Forked from bigscience-workshop/pii_processing
PII Processing code to clean up BigScience datasets. Reference implementation for the PII Hackathon
Python Other Updated Oct 12, 2021 -
summarize Public
Forked from fastforwardlabs/summarize.
Summarize. is a Streamlit application that performs automatic text summarization using both extractive and abstractive models.
Python Apache License 2.0 Updated Sep 22, 2021 -
KeyedVectorsANN Public
Gensim word2vec + PySparNN ANN + Trimmed GoogleNewsVec = Fast and lightweight NLP tool
-
hpj.py Public
Simple Python to Javascript translator with an emphasis on readability of generated code.
Python MIT License Updated May 20, 2015