This is an example of how to structure a Dagster project to organize its assets, jobs, repositories, schedules, and ops. It also includes example unit tests and a docker-compose deployment file that uses a PostgreSQL database for run, event log, and schedule storage.
This example should in no way be considered suitable for production; it is merely my own example of a possible file structure. I personally found it difficult to put the Dagster concepts to use, since the project's own examples have widely different structures and are hard to get an overview of as a beginner.
The example is based on the official tutorial.
To run the example, simply run:
docker-compose up -d
This will build the Docker image and pull the PostgreSQL dependency. The Dagster dashboard is then available at https://localhost:3000
There is an example of how to run a single pipeline in src/main.py. First, install the dependencies in an isolated Python environment.
pip install -r requirements
Then run the dagster_example Python module from the project root folder.
python -m dagster_example
Note that you can also run the main file directly, but then you need to add the project root to the PYTHONPATH environment variable manually.
PYTHONPATH="${PWD}" python dagster_example/__main__.py
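The reason for this difference is that `python -m` puts the current working directory on `sys.path`, whereas running a file directly puts the file's own directory there, so imports of the top-level package fail. The following sketch demonstrates this with a throwaway package; `demo_pkg` and `run_three_ways` are hypothetical names used only for this illustration, and the sketch uses only the standard library.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_three_ways() -> dict:
    """Run a tiny package as a module, as a script, and as a script
    with PYTHONPATH set, and return the exit code of each run."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 42\n")
        # __main__.py imports from its own package by absolute name.
        (pkg / "__main__.py").write_text(
            "from demo_pkg.helper import VALUE\nprint(VALUE)\n"
        )

        # Start from the current environment without any PYTHONPATH.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
        script = os.path.join("demo_pkg", "__main__.py")

        def run(args, extra_env=None):
            return subprocess.run(
                [sys.executable, *args],
                cwd=root,
                env={**env, **(extra_env or {})},
                capture_output=True,
                text=True,
            ).returncode

        return {
            "module": run(["-m", "demo_pkg"]),
            "script": run([script]),
            "script_with_pythonpath": run([script], {"PYTHONPATH": root}),
        }


if __name__ == "__main__":
    print(run_three_ways())
```

Only the plain script run is expected to fail, since the package root is missing from the import path in that case.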
- pybokeh/dagster-sklearn
- Gave me the inspiration for the primary folder structure, although that example is more advanced and uses sklearn.
- dagster-io/dagster examples
- Dagster's own examples.
- xyzy-web/dagster-exchangerates
- An example that includes Kubernetes Deployment.
- sephib/dagster-graph-project
- sspaeti-com/practical-data-engineering