Spark application system for the data platform - Take Off Project

marouenes/take-off-data-platform

Open Data Platform - The TakeOff Project

This is a monorepo containing a simple, straightforward Spark application and an Airflow control layer.

It is intended to be run locally, and is not designed to be run in a production environment.

Getting Started

Prerequisites

Installing and Running

  • Clone the repo
  • Install the requirements
  • Run the bootstrap installation script for Airflow
  • Launch the Airflow webserver
  • Launch the Airflow scheduler
  • Schedule the Spark jobs on Airflow
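
To make the last step more concrete: an Airflow task that runs a Spark job typically ends up invoking spark-submit. The sketch below is an assumption for illustration, not code from this repository. The helper build_spark_submit_command and the path jobs/example_job.py are hypothetical names. It only builds the argument list for a local run; a task could pass that list to subprocess.run or join it into a shell command.

```python
from typing import Dict, List, Optional


def build_spark_submit_command(
    app_path: str,
    master: str = "local[*]",
    conf: Optional[Dict[str, str]] = None,
    app_args: Optional[List[str]] = None,
) -> List[str]:
    """Build a spark-submit argument list for a local run (hypothetical helper)."""
    command = ["spark-submit", "--master", master]
    # Each Spark setting is passed as its own --conf key=value pair.
    for key, value in sorted((conf or {}).items()):
        command += ["--conf", f"{key}={value}"]
    # The application file follows the options, then the application's own arguments.
    command.append(app_path)
    command += list(app_args or [])
    return command


if __name__ == "__main__":
    # jobs/example_job.py is a placeholder path, not a file known to exist in the repo.
    print(build_spark_submit_command(
        "jobs/example_job.py",
        conf={"spark.executor.memory": "1g"},
    ))
```

Returning a list rather than a single string avoids shell-quoting problems when the command is executed without a shell.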

Running the tests

  • Run the tests locally:
python -m pytest

Tests are run with pytest and are located in the tests directory.
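
As a rough illustration of what pytest collects, here is a minimal sketch of a test module. The file name tests/test_example.py and the helper normalize_column_name are hypothetical and not taken from this repository. In a real test the helper would be imported from the application code rather than defined next to the tests.

```python
# tests/test_example.py (hypothetical file name)
import re


def normalize_column_name(name: str) -> str:
    """Hypothetical helper: lower-case a name and turn runs of other characters into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


# pytest discovers functions whose names start with "test_" and reports failing asserts.
def test_normalize_column_name_replaces_spaces():
    assert normalize_column_name("  Flight Number ") == "flight_number"


def test_normalize_column_name_collapses_symbols():
    assert normalize_column_name("Departure-Time (UTC)") == "departure_time_utc"
```

Running python -m pytest from the repository root would pick up a file like this automatically, provided it lives under the tests directory and follows the test_*.py naming pattern.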

Test coverage is provided by pytest-cov and can be run using:

python -m pytest --cov=.
  • TODO: Add end-to-end and integration tests, run the whole application in CI, and deploy to a staging environment.

Deployment

  • TODO: Add additional notes about how to deploy this on a live system

Collaborators

Authors
