Training LLMs with QLoRA + FSDP

Jupyter Notebook 1,406 187 Updated Sep 23, 2024

Build LLM-powered applications in Ruby

Ruby 1,330 186 Updated Oct 12, 2024

Language-Agnostic SEntence Representations

Jupyter Notebook 3,589 462 Updated May 2, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,809 2,522 Updated Oct 10, 2024

Incremental Skip-gram Model with Negative Sampling

Shell 69 8 Updated Jun 30, 2019

Word2Vec naïve version from scratch vs Word2Vec parallelized version.

Jupyter Notebook 1 Updated Aug 4, 2022

Package for evaluating word embeddings

Python 435 110 Updated Jan 4, 2021

RiverText is a framework that standardizes the Incremental Word Embeddings proposed in the state of the art. Please feel welcome to open an issue in case you have any questions or a pull request if you…

Python 18 Updated Dec 28, 2023

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,163 1,163 Updated Jul 29, 2024

🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations

Jupyter Notebook 527 38 Updated Feb 24, 2024

A collection of ORM-style clients to public patent data

Python 88 34 Updated Sep 24, 2024

Painterro - JavaScript painting plugin

JavaScript 645 86 Updated Sep 18, 2024

🔥 Use pre-trained models in PyTorch to extract vector embeddings for any image

Python 585 93 Updated Dec 23, 2023

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 31,812 4,721 Updated Oct 11, 2024

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,189 246 Updated Oct 11, 2024

Header-only C++/Python library for fast approximate nearest neighbors

C++ 4,323 637 Updated Aug 11, 2024

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,392 450 Updated Sep 21, 2024

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)

C 1,133 193 Updated Jun 1, 2024

Hash function quality and speed tests

C++ 1,831 177 Updated Sep 28, 2024

SIMD (SSE) population count --- https://0x80.pl/articles/sse-popcount.html

C++ 324 47 Updated Apr 1, 2024

JavaScript Canvas Library, SVG-to-Canvas (& canvas-to-SVG) Parser

JavaScript 28,899 3,497 Updated Oct 12, 2024

Zest is a compression-based text classifier using Meta's Zstandard compression algorithm. Zest is language-agnostic and this approach simplifies configuration, avoids careful feature extraction and…

Python 5 Updated Jan 15, 2022

Datasets and SOTA results for every field of Chinese NLP

HTML 1,787 273 Updated Apr 7, 2022

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,382 512 Updated Jul 2, 2024

PyTorch version of BERT-whitening

Python 310 46 Updated Oct 9, 2021

PISA: Performant Indexes and Search for Academia

C++ 925 64 Updated Oct 13, 2024

BERT models for Japanese text.

Python 513 55 Updated Mar 23, 2024

PyTorch code for SpERT: Span-based Entity and Relation Transformer

Python 685 146 Updated Feb 1, 2024