aws-glue-data-catalog

Here are 10 public repositories matching this topic...

aws-samples / automated-datastore-discovery-with-aws-glue

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

shiv-rna / Youtube-Data-Engineering-Pipeline

A pipeline for managing, processing, and analyzing YouTube video data with AWS services. It handles both structured statistics and trending key metrics.

aws youtube aws-lambda aws-s3 aws-cli data-engineering aws-iam aws-athena aws-glue data-engineering-pipeline aws-quicksight aws-glue-data-catalog

Updated Mar 20, 2024
Python

SadafAsad / LinkedIn-Jobs-Analysis

Unveiling job market trends with Scrapy and AWS

python aws-s3 scrapy aws-ec2 aws-athena aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated Apr 5, 2024
Python

subhamay-cloudworks / 0090-deutzia-cft

Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Streams, Lambda, S3, Glue, Athena, and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

subhamay-cloudworks / 0052-agapanthus-cft

Working with Glue Data Catalog and Running the Glue Crawler On Demand

aws-cloudformation aws-glue aws-glue-crawler aws-iam-roles aws-iam-policies aws-glue-data-catalog

Updated May 11, 2023

jibbs1703 / Tickit-Data-Pipeline

This repository contains a data pipeline that extracts, transforms, and loads data from an AWS S3 bucket into an AWS Redshift table using AWS Glue. The raw data is stored unmodified in S3, and the pipeline lets AWS Glue extract it from the bucket.

data-validation aws-s3 aws-redshift etl-pipeline aws-glue pydantic aws-glue-crawler aws-glue-data-catalog

Updated Sep 24, 2024
Python

ev2900 / Iceberg_Glue_register_table

Example using the Iceberg register_table command with AWS Glue and Glue Data Catalog

aws glue iceberg aws-glue apache-iceberg aws-glue-data-catalog

Updated Oct 17, 2024
Python

ShreyasLengade / serverless_etl_pipeline

Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.

aws-lambda aws-s3 data-engineering aws-kinesis aws-glue data-engineering-pipeline aws-glue-crawler aws-grafana aws-glue-data-catalog