High-performance FastAPI backend boilerplate for real-world production, with mongo and pytest. Suitable for microservices.

Python 61 17 Updated May 25, 2024

Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

Dockerfile 103 21 Updated Jul 16, 2024

A PHP serializer implementation for Python

Python 11 12 Updated Nov 4, 2018

HOCON parser for Python

Python 495 117 Updated May 30, 2024

Example using the MongoDB Go Driver

Go 126 38 Updated Oct 1, 2020

Dataproc templates and pipelines for solving simple in-cloud data tasks

Python 117 88 Updated Jul 16, 2024

Qubole Sparklens tool for performance tuning Apache Spark

Scala 558 136 Updated Jun 26, 2024

This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…

Scala 680 143 Updated Jul 16, 2024

Basic skeleton for Spring Boot Microservices. It includes Spring Security for basic auth. Spring Cloud Gateway is also implemented as an API gateway. Lots of the spring cloud component integ…

Java 191 73 Updated Mar 21, 2024

A Procedure To Create A Yarn Cluster Based on Docker, Run Spark, And Do TPC-DS Performance Test.

C 16 2 Updated Jan 3, 2024

Building big data architecture microservices with containers

Shell 12 7 Updated Nov 28, 2017

Scripts to setup a Spark cluster using Docker Swarm.

Python 3 Updated May 10, 2017

Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines

Jupyter Notebook 118 57 Updated Nov 4, 2022

Hadoop Cluster Configurations

Shell 32 56 Updated Aug 5, 2021

This repository

Java 3 3 Updated May 30, 2024

This project provides Apache Spark SQL, RDD, DataFrame and Dataset examples in Scala language

Scala 546 546 Updated Mar 20, 2024

Spark Examples

Scala 125 127 Updated Feb 1, 2022

Apache Spark - A unified analytics engine for large-scale data processing

Scala 39,019 28,123 Updated Jul 30, 2024

Apache Spark examples exclusively in Java

Java 96 44 Updated Apr 21, 2023

Bigquery ETL

Python 246 98 Updated Jul 30, 2024

Elastic Stack 8.x Cookbook published by Packt Publishing

JavaScript 14 3 Updated Jul 8, 2024

A list of useful resources to learn Data Engineering from scratch

3,387 487 Updated Jun 19, 2024

Example project and best practices for Python-based Spark ETL jobs and applications.

Python 4 1 Updated Oct 2, 2018

pyspark framework

Python 25 13 Updated Feb 22, 2022

Boilerplate template for machine learning projects in PySpark.

Python 5 3 Updated Feb 16, 2021

A Python PySpark Project with Poetry

Jupyter Notebook 17 2 Updated Jun 18, 2023

PySpark RDD, DataFrame and Dataset Examples in Python language

Python 1,129 862 Updated Mar 28, 2024

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

57,165 5,907 Updated Jul 14, 2024