Laurence Geng bluishglc

Architect and author of the book Big Data Platform Architecture and Prototype Implementation. Sales page: https://item.jd.com/12677623.html

44 followers · 0 following

Shanghai, China
https://laurence.blog.csdn.net/

Achievements

Stars

oryanmoshe / debezium-timestamp-converter

Java 40 40 Updated Sep 3, 2024

dajudge / kafkaproxy

kafkaproxy is a reverse proxy for the wire protocol of Apache Kafka.

Java 71 10 Updated Jun 13, 2023

sjwiesman / flink-scala-3

Scala 35 3 Updated Aug 24, 2022

yangyichao-mango / flink-study

Java 444 163 Updated Sep 17, 2022

morsapaes / flink-sql-CDC

Self-contained demo using Flink SQL and Debezium to build a CDC-based analytics pipeline. All you need is Docker! 🐳

Dockerfile 24 32 Updated May 11, 2021

brianfrankcooper / YCSB

Yahoo! Cloud Serving Benchmark

Java 4,955 2,252 Updated Nov 14, 2024

tmcgrath / kafka-connect-examples

Kafka Connect Examples

Shell 42 19 Updated Sep 27, 2022

mli / paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

27,118 2,449 Updated Aug 8, 2024

aws-samples / emr-spark-benchmark

Shell 20 3 Updated Mar 12, 2024

cartershanklin / hive-testbench

Testbench for experimenting with Apache Hive at any data scale.

Java 65 195 Updated Jul 10, 2017

databricks / tpcds-kit

Forked from gregrahn/tpcds-kit

TPC-DS benchmark kit with some modifications/fixes

C 88 65 Updated Aug 13, 2024

databricks / spark-sql-perf

Scala 586 407 Updated Feb 26, 2022

hortonworks / hive-testbench

Java 376 283 Updated Jan 25, 2024

awesomedata / awesome-public-datasets

A topic-centric list of HQ open datasets.

61,001 9,935 Updated Nov 13, 2024

bluishglc / apache-hudi-core-conceptions

A set of notebooks that explore and explain core concepts of Apache Hudi, such as file layouts, file sizing, compaction, and clustering.

Jupyter Notebook 10 1 Updated Aug 22, 2023

bluishglc / ranger-emr-cli-installer

A CLI tool that automates the installation of Apache Ranger and AWS EMR and their integration with OpenLDAP and Windows AD. It supports both open-source Ranger and EMR-native Ranger, supports OpenLD…

Shell 8 15 Updated Jan 30, 2023

ageron / handson-ml3

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Jupyter Notebook 7,844 3,136 Updated Oct 8, 2024

bluishglc / serverless-datalake-example

A serverless data lake project and framework based on AWS S3, Glue, Athena, MWAA and QuickSight. Through a series of best practices, it shows how to build a serverless data lake.

Shell 16 5 Updated Nov 22, 2022

DataTalksClub / nyc-tlc-data

Backup for NYC TLC data for the DE Zoomcamp course

150 45 Updated Jul 19, 2022

datahub-project / datahub

The Metadata Platform for your Data and AI Stack

Java 9,911 2,940 Updated Nov 16, 2024

bluishglc / aws-cli-plus

This command-line tool complements aws-cli. It offers a suite of utilities for managing and operating EC2, EMR and other AWS services.

Shell 1 Updated Jul 4, 2023

Kyligence / ssb-kylin

Star Schema Benchmark Tool for Apache Kylin

C 96 47 Updated Aug 26, 2021

electrum / ssb-dbgen

Star Schema Benchmark dbgen

C 120 82 Updated Mar 11, 2024

bluishglc / bdp

A prototype big data platform project containing the source code of the book Big Data Platform Architecture and Prototype

Java 196 144 Updated Aug 12, 2020

big-data-europe / docker-hadoop

Apache Hadoop docker image

Shell 2,210 1,304 Updated Feb 1, 2024

libaoquan95 / aasPractice

Exercises for the book Advanced Analytics with Spark

Scala 22 42 Updated Jun 9, 2018

renesemela / lastfm-dataset-2020

New Last.fm Dataset 2020 for music auto-tagging purposes.

Python 28 Updated Jul 6, 2023

bambrow / docker-hadoop-workbench

A Hadoop cluster based on Docker, including Hive and Spark.

Shell 77 29 Updated Nov 13, 2022

Marcel-Jan / docker-hadoop-spark

Forked from big-data-europe/docker-hadoop

Multi-container environment with Hadoop, Spark and Hive

Shell 203 148 Updated Jan 6, 2024

gettyimages / docker-spark

Docker build for Apache Spark

Dockerfile 675 370 Updated Dec 30, 2021