DuckDB-powered analytics for Postgres
Updated Nov 19, 2024 - Rust
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can generate large simulated or synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Smart Automation Tool for building modern Data Lakes and Data Pipelines
A lightweight, comprehensive solution for managing Delta tables, built on polars and deltalake
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Streaming application development and management system based on Linkis and DSS, planning to provide workflow-like graphical drag-and-drop development capability.
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
This repository exemplifies a simple ELT process using delta to perform upsert and remove data files that aren't in the latest state of the transaction log for the table.
Databricks Platform - Architecture, Security, Automation and much more!!
Threat Detection and Visualization
Open-source lakehouse stack
Don't Panic. This guide will help you when it feels like the end of the world.
A platform and cloud-based service for data sharing based on the Delta Sharing protocol.
db2ixf is a Python package with a CLI that simplifies the parsing and processing of IBM Integration eXchange Format (IXF) files.