Ndamulelo Nemakhavhani ndamulelonemakh

🍸

Solution explorer

Data technologist in ML & NLP | Azure Cloud Engineer 🏆 | Indigenous language advocate | Knows a thing or two about LLMs🚀

33 followers · 235 following

Mungana AI
Pretoria
https://www.linkedin.com/in/ndamulelonemakhavhani/
@NdamuleloNemakh
https://credly.com/users/ndamulelo-nemakhavhani

NLP Toolbox

Collection of NLP resources, projects and state-of-the-art tools

48 repositories

laugustyniak / awesome-sentiment-analysis

Repository with everything necessary for sentiment analysis and related areas

526 108 Updated Nov 13, 2023

google-research / bert

TensorFlow code and pre-trained models for BERT

Python 37,345 9,530 Updated May 2, 2024

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 127,883 25,364 Updated Jun 21, 2024

marytts / marytts

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java

Java 2,276 729 Updated Apr 14, 2023

artetxem / vecmap

A framework to learn cross-lingual word embedding mappings

Python 642 130 Updated Apr 22, 2023

keon / awesome-nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

16,200 2,568 Updated Nov 13, 2023

RandolphVI / Multi-Label-Text-Classification

Multi-Label Text Classification based on neural networks.

Python 546 147 Updated Nov 18, 2020

facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Python 2,866 478 Updated Feb 14, 2023

Leonard-Xu / CWE

C 299 111 Updated Aug 24, 2020

huggingface / swift-coreml-transformers

Swift Core ML 3 implementations of GPT-2, DistilGPT-2, BERT, and DistilBERT for Question answering. Other Transformers coming soon!

Swift 1,591 172 Updated Nov 24, 2023

clld / wals3

The World Atlas Of Language Structures Online

CSS 120 18 Updated Jun 3, 2023

stitchfix / hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

Python 865 38 Updated Jul 3, 2023

amalaj7 / Top2Vec-Topic-Modelling-and-Semantic-Search

Algorithm for Topic Modelling and Semantic Search

Jupyter Notebook 2 Updated Jun 16, 2022

code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.

Python 1,376 65 Updated Jun 13, 2024

argilla-io / argilla

Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.

Python 3,446 328 Updated Jun 21, 2024

QData / TextAttack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

Python 2,816 378 Updated Mar 31, 2024

diffgram / diffgram

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

Python 1,812 119 Updated Jun 11, 2024

facebookresearch / fastText

Library for fast text representation and classification.

HTML 25,688 4,695 Updated Mar 22, 2024

stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings

C 6,765 1,493 Updated Sep 19, 2023

attardi / wikiextractor

A tool for extracting plain text from Wikipedia dumps

Python 3,671 955 Updated May 23, 2024

ddangelov / Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Python 2,872 367 Updated May 12, 2024

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 63,423 7,359 Updated Jun 16, 2024

doccano / doccano

Open source annotation tool for machine learning practitioners.

Python 9,156 1,682 Updated Mar 6, 2024

ChrizH / pdfstructure

`pdfstructure` detects, splits and organizes the document's text content into its natural structure as envisioned by the author.

Python 92 19 Updated Apr 1, 2024

pdfminer / pdfminer.six

Community maintained fork of pdfminer - we fathom PDF

Python 5,583 901 Updated Jun 17, 2024

fmalina / unilex-transcript

Get semantic HTML from PDFs, recover lost text, tables, data... in bulk.

HTML 28 7 Updated Dec 14, 2023

dsfsi / textaugment

TextAugment: Text Augmentation Library

Python 385 59 Updated Feb 20, 2024

clab / fast_align

Simple, fast unsupervised word aligner

C++ 727 160 Updated Jul 19, 2022

filyp / autocorrect

Spelling corrector in Python

Python 438 78 Updated Dec 4, 2023

snorkel-team / snorkel

A system for quickly generating training data with weak supervision

Python 5,738 860 Updated May 2, 2024