Stars
A Spark plugin for reading and writing Excel files
A native Delta implementation for integration with any query engine
Fabric Python Notebooks examples
Scan documents to PDF and more, as simply as possible.
LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads.
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Open, Multi-modal Catalog for Data & AI
GUI tool to remove ads from various places around Windows 11
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Qubole Sparklens tool for performance tuning Apache Spark
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
An Open Standard for lineage metadata collection
A fast static code analyzer & language server for Python
Distributed DataFrame for Python designed for the cloud, powered by Rust
Watch a file or folder and automatically commit changes to a git repo easily.
Git Extensions is a standalone UI tool for managing git repositories. It also integrates with Windows Explorer and Microsoft Visual Studio (2015/2017/2019).
A stand-alone test framework for writing unit tests for Data Factory pipelines on Microsoft Fabric, Azure Data Factory, and Azure Synapse Analytics.
DacFx, SqlPackage, and other SQL development libraries enable declarative database development and database portability across SQL versions and environments. Share feedback here on dacpacs, bacpacs…
fsspec-compatible Azure Datalake and Azure Blob Storage access
ripgrep recursively searches directories for a regex pattern while respecting your gitignore