Student WIL work on ABS data modeled in Azure Synapse.

mstoiana/abs

abs

Steps to Add to GitHub Issues

  1. Implement Data Encryption and Decryption

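    • Task: Implement hashing (MD5, SHA-256) for sensitive fields at the Bronze stage. MD5 and SHA-256 are one-way hash functions rather than encryption, so any field that must be readable again at the Gold stage needs a reversible encryption scheme instead, with decryption applied at the Gold stage.
    • User Story: As a data engineer, I want sensitive data to be protected at the Bronze stage and recoverable at the Gold stage only where required, to maintain data security and integrity.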
  2. Set Up Extract and Load Pipeline

    • Task: Create an extract and load pipeline to move data from the source (databases and files) to the Bronze stage.
    • User Story: As a data engineer, I want to set up a robust extract and load pipeline to efficiently move data from the source to the Bronze stage for further processing.
  3. Develop Transformation Pipeline

    • Task: Implement a transformation pipeline to process data from Bronze to Silver stage.
    • User Story: As a data engineer, I need to transform raw data in the Bronze stage into a more structured format in the Silver stage to facilitate better analysis.
  4. Implement Modeling Pipeline

    • Task: Create a modeling pipeline to convert data from the Silver stage to the Gold stage, including dimensions, facts, and views.
    • User Story: As a data engineer, I want to design a modeling pipeline to refine Silver data into the Gold stage, creating dimensions, facts, and views for comprehensive analysis.
  5. Set Up Development Environment

    • Task: Establish a development environment with limited copies of data for development purposes.
    • User Story: As a developer, I want a dedicated development environment with limited data copies to safely develop and test data processes.
  6. Automated Testing and CI Pipeline

    • Task: Implement automated testing and continuous integration (CI) pipelines triggered by code changes.
    • User Story: As a developer, I need automated testing and CI pipelines to ensure that code changes are thoroughly tested and integrated without manual intervention.
  7. Role-Based Access Control (RBAC)

    • Task: Configure role-based access control for different roles (Loader, Developer, Analyst, Automation).
    • User Story: As a security officer, I want to establish role-based access control to ensure that users have appropriate access levels based on their roles.
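
For step 1, a minimal sketch of hashing a sensitive field with Python's standard `hashlib` and `hmac` modules may help show why a hashed value cannot later be decrypted. The record, its field names, and `SECRET_KEY` are hypothetical and are not taken from this repository.

```python
import hashlib
import hmac


def hash_value(value: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of `value` for a hashlib algorithm name."""
    return hashlib.new(algorithm, value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256: a keyed hash that is harder to reverse by guessing inputs."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical source record; the field names are illustrative only.
record = {"customer_id": "C-1001", "email": "jane@example.com"}

# Assumption: a real key would come from a managed secret store, not source code.
SECRET_KEY = b"replace-with-a-managed-secret"

# Bronze-stage record: the raw email is replaced by digests.
bronze_record = {
    "customer_id": record["customer_id"],
    "email_md5": hash_value(record["email"], "md5"),
    "email_sha256": hash_value(record["email"], "sha256"),
    "email_hmac": keyed_hash(record["email"], SECRET_KEY),
}

# The same input always gives the same digest, so hashed columns can still
# be joined on, but there is no function that maps a digest back to the email.
```

MD5 is shown only because the task names it; it is no longer considered collision-resistant, so SHA-256 or the keyed variant is usually the safer default. Fields that must be restored at the Gold stage would need reversible encryption, which this sketch does not cover.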

User Stories for a Student

  1. Encrypt Data at Bronze Stage

    • Story: As a student, I want to implement hashing (MD5, SHA-256) for data at the Bronze stage, and understand how one-way hashing differs from reversible encryption, to learn about data security practices.
  2. Set Up Extract and Load Pipeline

    • Story: As a student, I want to create an extract and load pipeline to move data from the source to the Bronze stage to understand the data ingestion process.
  3. Transform Data from Bronze to Silver

    • Story: As a student, I need to develop a transformation pipeline to process data from the Bronze to the Silver stage to gain experience in data transformation techniques.
  4. Model Data from Silver to Gold

    • Story: As a student, I want to implement a modeling pipeline to refine data from the Silver to the Gold stage, learning how to create dimensions, facts, and views.
  5. Establish Development Environment

    • Story: As a student, I want to set up a development environment with limited data copies to safely develop and test new data processes.
  6. Implement Automated Testing and CI

    • Story: As a student, I need to create automated testing and CI pipelines to ensure my code changes are tested and integrated automatically.
  7. Configure RBAC for Different Roles

    • Story: As a student, I want to set up role-based access control for various roles (Loader, Developer, Analyst, Automation) to learn about managing access and security in a data environment.
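
The Bronze to Silver to Gold flow in stories 3 and 4 can be pictured with a small, library-free sketch. All rows, column names, and function names below are made up for illustration; an actual implementation in Azure Synapse would most likely use SQL or Spark rather than plain Python.

```python
from collections import defaultdict

# Hypothetical Bronze rows: raw strings exactly as loaded from a source file.
bronze_rows = [
    {"region": " north ", "year": "2021", "value": "1,200"},
    {"region": "South", "year": "2021", "value": "800"},
    {"region": "NORTH", "year": "2022", "value": "1,300"},
    {"region": "South", "year": "2022", "value": "n/a"},
]


def to_silver(rows):
    """Bronze -> Silver: trim text, cast types, skip rows that fail to parse."""
    silver = []
    for row in rows:
        try:
            silver.append({
                "region": row["region"].strip().upper(),
                "year": int(row["year"]),
                "value": int(row["value"].replace(",", "")),
            })
        except ValueError:
            continue  # a real pipeline would log or quarantine the reject
    return silver


def to_gold(silver):
    """Silver -> Gold: build a region dimension and a fact table."""
    dim_region = {}  # region name -> surrogate key
    fact = []
    for row in silver:
        key = dim_region.setdefault(row["region"], len(dim_region) + 1)
        fact.append({"region_key": key, "year": row["year"], "value": row["value"]})
    return dim_region, fact


def view_total_by_region(dim_region, fact):
    """A simple 'view': total value per region, joining fact to dimension."""
    names = {key: name for name, key in dim_region.items()}
    totals = defaultdict(int)
    for row in fact:
        totals[names[row["region_key"]]] += row["value"]
    return dict(totals)


silver_rows = to_silver(bronze_rows)
dim_region, fact_values = to_gold(silver_rows)
totals = view_total_by_region(dim_region, fact_values)
```

Keeping each stage as a separate function mirrors the separate pipelines in the stories above and makes each step testable on its own.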

Automated vs Human Status

  1. Automated Tasks

    • Data Encryption and Decryption: Implement encryption/decryption algorithms.
    • Extract and Load Pipeline: Automate data extraction and loading processes.
    • Transformation Pipeline: Automate data transformation steps.
    • Modeling Pipeline: Automate data modeling steps.
    • Automated Testing and CI: Implement automated testing and CI pipelines.
  2. Human-Managed Tasks

    • Development Environment Setup: Configure and maintain development environments.
    • Role-Based Access Control: Set up and manage RBAC settings.
    • Analyst Activities: Perform data analysis and visualization tasks.
    • Monitor Automated Pipelines: Monitor and troubleshoot automated processes as needed.
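
Before RBAC is configured by hand, the four roles named in this README (Loader, Developer, Analyst, Automation) could be written down as a permission table and checked in code. The stage names and the permissions given to each role below are assumptions for illustration only; they are not defined anywhere in this repository and do not correspond to any Azure Synapse API.

```python
from enum import Enum


class Role(Enum):
    LOADER = "Loader"
    DEVELOPER = "Developer"
    ANALYST = "Analyst"
    AUTOMATION = "Automation"


# Assumed mapping of role -> allowed (area, action) pairs. Illustrative only.
PERMISSIONS = {
    Role.LOADER: {("bronze", "write")},
    Role.DEVELOPER: {("dev", "read"), ("dev", "write")},
    Role.ANALYST: {("gold", "read")},
    Role.AUTOMATION: {
        ("bronze", "read"),
        ("silver", "read"),
        ("silver", "write"),
        ("gold", "write"),
    },
}


def is_allowed(role: Role, area: str, action: str) -> bool:
    """Return True if `role` may perform `action` on `area`."""
    return (area, action) in PERMISSIONS.get(role, set())


# Example checks against the assumed table.
analyst_can_read_gold = is_allowed(Role.ANALYST, "gold", "read")
analyst_can_write_bronze = is_allowed(Role.ANALYST, "bronze", "write")
```

A table like this can also serve as a checklist when the real role assignments are reviewed.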

These steps and user stories should help organize and clarify the tasks necessary for the project, ensuring clear responsibilities and progress tracking.
