Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Updated Nov 1, 2024 - Python
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
A Scala kernel for Jupyter
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Qubole Sparklens tool for performance tuning Apache Spark
The Internals of Spark SQL
🐍 Quick reference guide to common patterns & functions in PySpark.
Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Use SQL to build ELT pipelines on a data lakehouse.
Apache Spark™ and Scala Workshops
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
A prototype big data platform; the source code for the book Big Data Platform Architecture and Prototype
Spark Structured Streaming / Kafka / Cassandra / Elastic