Starred repositories
Simple, convenient and cross-platform file date changing library. 📅
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
A small set of Python functions to draw pretty maps from OpenStreetMap data. Based on osmnx, matplotlib and shapely libraries.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (…
Awesome list of uBlacklist subscriptions to block search results from Google, Bing, and DuckDuckGo.
Blocks specific sites from appearing in Google search results
A simple tool for visually comparing two PDF files
A little word cloud generator in Python
A tokenizer and sentence splitter for German and English web and social media texts.
Deploy an ML inference service on a budget in less than 10 lines of code.
🌈Rainbow CSV - Sublime Text Package: Highlight columns in CSV and TSV files and run queries in a SQL-like language
🦜RBQL - Rainbow Query Language: SQL-like query engine for (not only) CSV file processing. Supports SQL queries with Python and JavaScript expressions.
Command line interface for testing internet bandwidth using speedtest.net
Looking for a guide? You came to the right place. Here you can find documentation for a variety of topics I research to make complex computing easier. For comments go to the IRC channel #nfo at the…
Load data from redshift into a pandas DataFrame and vice versa.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
🎨 Diagram as Code for prototyping cloud system architectures
Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.
Learning embeddings for classification, retrieval and ranking.
Open source annotation tool for machine learning practitioners.
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
QtPass is a multi-platform GUI for pass, the standard Unix password manager.
The lazier way to manage everything Docker