snowplow-unified-integration-tests

Integration test suite for the snowplow-unified dbt package.

The .scripts directory contains the following:

integration_tests.sh: tests the standard modules of the snowplow-unified package:
App error test: runs the optional app errors module without test data to check that it compiles correctly.
Late enabled contexts test: runs the package first with the contexts disabled (cwv is excluded because it uses a separate dummy dataset), then runs it again with them enabled. It checks whether a new incremental table based on events_this_run builds without datatype errors.
All contexts except for cwv test: runs the snowplow-unified package 4 times to replicate incremental loading of events, then performs an equality test between the actual and expected output.
Web Vital test: runs the optional cwv module once and checks the outputs.
Web (all web contexts except for cwv) test: runs only the web-related contexts and checks whether the run fails as a result.
Mobile (all mobile contexts) test: runs only the mobile-related contexts and checks whether the run fails as a result.
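
The "All contexts except for cwv" flow described above (four incremental runs followed by an equality test) could look roughly like the sketch below. This is an assumption for illustration only and not the contents of integration_tests.sh: the function names, the DRY_RUN switch and the exact dbt invocations are hypothetical, and the real script may pass additional selectors or variables.

```shell
# Hypothetical sketch; the real integration_tests.sh may differ.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run_cmd() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the package four times to mimic incremental loading,
# then run the dbt tests that compare actual and expected models.
incremental_equality_test() {
  target="$1"
  i=1
  while [ "$i" -le 4 ]; do
    run_cmd dbt run --target "$target" || return 1
    i=$((i + 1))
  done
  run_cmd dbt test --target "$target"
}
```

Calling `incremental_equality_test postgres` in dry-run mode prints the four `dbt run` commands followed by one `dbt test` command, which makes the order of operations easy to inspect before running against a real warehouse.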

Run the scripts using:

bash integration_tests.sh -d {warehouse}

Supported warehouses (the value should match the target in your profiles.yml):

redshift
bigquery
snowflake
databricks
postgres
all (iterates through all supported warehouses)
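
As an illustration of how a -d value might be mapped to one or more warehouses, here is a minimal sketch. It is not taken from integration_tests.sh: the function name resolve_targets and the variable SUPPORTED_WAREHOUSES are hypothetical, and only the warehouse names come from the list above.

```shell
# Hypothetical sketch; not the actual option handling of integration_tests.sh.
# Warehouse names are taken from the list in this README.
SUPPORTED_WAREHOUSES="redshift bigquery snowflake databricks postgres"

# Print the warehouse(s) to test for a given -d value.
# "all" expands to every supported warehouse; anything unknown is an error.
resolve_targets() {
  requested="$1"
  if [ "$requested" = "all" ]; then
    echo "$SUPPORTED_WAREHOUSES"
    return 0
  fi
  for db in $SUPPORTED_WAREHOUSES; do
    if [ "$db" = "$requested" ]; then
      echo "$db"
      return 0
    fi
  done
  echo "Unsupported warehouse: $requested" >&2
  return 1
}
```

A caller could then loop over the result, for example `for db in $(resolve_targets all); do ...; done`, and stop early when an unsupported name is passed.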

Good-to-knows:

Warehouses differ in how they process data, and in places we had to adjust the integration tests to work around that. Here is a list of things to keep in mind:

The row_number() function is non-deterministic on Redshift/Postgres/Databricks, so we had to hard-code the actual and expected models for cases where we test duplicate rows with exactly the same results / window.
Postgres / Redshift need the array format of : (within sessions_expected).
BigQuery handles snowplow_utils.timestamp_diff() differently, so absolute_time_in_s changes.
Rotating the domain_userid per session is hard-coded in the integration test expectations; when run in one batch the user_identifier differs: 2e340eb6e94820ea8369c0174c612260d1cfe9d41f0fe46268994e28d9c0bbf17 0e9ab97b5d9d9a174112df13fe9c44788af3ac9088a8b41e0998d92a8b4b5a4fc
The same applies to the number of quarantined sessions.
