A toolkit for creating an optimal, production-ready RAG setup for your data

Python 945 79 Updated Oct 8, 2024

Automatic data change tracking for PostgreSQL

TypeScript 285 7 Updated Oct 1, 2024

A memory-efficient implementation of DenseNets

Python 1,517 327 Updated Jun 1, 2023

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 28,776 3,289 Updated Oct 8, 2024

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

296 11 Updated Apr 18, 2024

Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Java 10,513 2,504 Updated Oct 8, 2024

A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.

Python 64 9 Updated Oct 1, 2024

An enterprise-friendly way of detecting and preventing secrets in code.

Python 3,769 467 Updated Oct 7, 2024

The Security Toolkit for LLM Interactions

Python 1,175 148 Updated Oct 7, 2024

A Python library to perform NER on structured data and generate PII with Faker

Jupyter Notebook 27 Updated May 31, 2024

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 6,729 598 Updated Oct 8, 2024

Version 1.1 of the Alexander Koch low-cost robot arm, with some small changes.

360 34 Updated Sep 17, 2024

The Universe of Data. All about data, data science, and data engineering

Python 507 52 Updated Jul 18, 2024

Locality-sensitive hashing using MinHash in Python/Cython to detect near-duplicate text documents

Python 279 79 Updated Jun 11, 2023

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 542 44 Updated Oct 8, 2024

Sample Python code for comparing documents using MinHash

Python 5 Updated Feb 2, 2019

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,530 295 Updated Jun 4, 2024

MinHash implementation in Python

Jupyter Notebook 10 5 Updated Aug 24, 2024

A vector search SQLite extension that runs anywhere!

C 3,916 133 Updated Oct 2, 2024

Python bindings for llama.cpp

Python 7,864 940 Updated Oct 3, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,463 407 Updated Oct 8, 2024

🕷️ The pipeline for the OSCAR corpus

Rust 162 14 Updated Dec 18, 2023

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!

Python 1,366 298 Updated Jul 3, 2024

Utilities intended for use with Llama models.

Python 4,360 770 Updated Oct 8, 2024

Jupyter Notebook 3 Updated Sep 8, 2023

DataComp for Language Models

HTML 1,126 100 Updated Oct 7, 2024

newspaper3k is a library for news, full-text, and article metadata extraction in Python 3. Advanced docs:

Python 14,098 2,112 Updated Jul 23, 2024

A command-line tool for using CommonCrawl Index API at https://index.commoncrawl.org/

Python 179 48 Updated Oct 7, 2018

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

2,035 101 Updated Sep 24, 2024