Integration test improvements #2858

Merged: 11 commits, Mar 7, 2024
integration test improvements
hubertdeng123 committed Mar 6, 2024
commit 02ccedd29b92612fce612c523395ec17d93ab7f2
8 changes: 7 additions & 1 deletion .github/workflows/test.yml
@@ -64,6 +64,7 @@ jobs:
compose_path: "/usr/local/lib/docker/cli-plugins"
- compose_version: "v2.7.0"
compose_path: "/usr/local/lib/docker/cli-plugins"
test-group: ["initial-install", "customizations"]
env:
COMPOSE_PROJECT_NAME: self-hosted-${{ strategy.job-index }}
steps:
@@ -83,7 +84,12 @@ jobs:
sudo chmod +x "${{ matrix.compose_path }}/docker-compose"

- name: Integration Test
run: ./integration-test.sh
uses: nick-fields/retry@v3
with:
max_attempts: 3
Contributor

Just so I understand: what was the behavior before you added these settings? Unlimited retries? No timeout so it hung until the action crashed?

Member Author

The tests failed from flakes far too often; adding this drastically increases the chance that the tests pass when they should.

Contributor

So, in effect, the previous (implicit) setting was max_attempts: 1? Or timeout_minutes: Infinity? Or some combination of the two? I get that the problem we are trying to solve is flakiness, I'm just not clear how changing (raising? lowering?) max_attempts and timeout_minutes helps that, since it's not obvious what the current state of affairs is.

Member Author

Yep, that's correct. The max_attempts setting is really just the first step toward adding flaky test detection: if a job fails but then is retried and succeeds, it can be marked as flaky. The timeout_minutes setting is a required parameter here. I can remove this and re-add it in a follow-up PR if that is clearer.

Contributor

No, that's fine. I just wanted to understand the change. LGTM!
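
For reference, the retry-then-flag-as-flaky idea from this thread could be sketched in plain shell roughly as follows. This is only an illustration of the concept, not how nick-fields/retry is implemented; `run_with_retries` and the `FLAKY`/`PASSED`/`FAILED` messages are hypothetical names that do not appear in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR or of nick-fields/retry):
# run a command up to max_attempts times and report whether a success
# only came after at least one failed attempt, i.e. a likely flake.
run_with_retries() {
  local max_attempts="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      if [ "$attempt" -gt 1 ]; then
        # Failed earlier, passed now: candidate for a "flaky" label.
        echo "FLAKY: passed on attempt $attempt of $max_attempts"
      else
        echo "PASSED: first attempt"
      fi
      return 0
    fi
    echo "attempt $attempt failed" >&2
    attempt=$((attempt + 1))
  done
  echo "FAILED: all $max_attempts attempts failed"
  return 1
}
```

With the values from the workflow, a call would look something like `run_with_retries 3 ./integration-test.sh --initial-install`; the per-attempt timeout that `timeout_minutes` provides is left out of this sketch.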

timeout_minutes: 25
retry_on: error
command: ./integration-test.sh --${{ matrix.test-group }}

- name: Inspect failure
if: failure()
10 changes: 7 additions & 3 deletions _integration-test/run.sh
@@ -28,7 +28,11 @@ teardown() {
DID_TEAR_DOWN=1

if [ "$1" != "EXIT" ]; then
echo "An error occurred, caught SIG$1 on line $2"
error_msg="An error occurred, caught SIG$1 on line $2"
echo "$error_msg"
dsn="https://[email protected]/6627632"
sentry_cli="docker run --rm -v /tmp:/work -e SENTRY_DSN=$dsn getsentry/sentry-cli"
$sentry_cli send-event -m "$error_msg" --logfile "$log_file"
fi

echo "Tearing down ..."
@@ -41,10 +45,10 @@ echo "${_endgroup}"
echo "${_group}Starting Sentry for tests ..."
# Disable beacon for e2e tests
echo 'SENTRY_BEACON=False' >>$SENTRY_CONFIG_PY
echo y | $dcr web createuser --force-update --superuser --email $TEST_USER --password $TEST_PASS
Member Author

docker run is slower than just an exec into a running container

Member

It is, but it is also intentional, to keep the run separate from the main web process.

That said, for a one-off thing like this, I think using exec is a good compromise if it saves a notable amount of time. I'd just want it documented with a brief comment.

$dc up -d
printf "Waiting for Sentry to be up"
timeout 90 bash -c 'until $(curl -Isf -o /dev/null $SENTRY_TEST_HOST); do printf '.'; sleep 0.5; done'
echo y | $dc exec web sentry createuser --force-update --superuser --email $TEST_USER --password $TEST_PASS
Contributor

What happens if we exec into the web container but it isn't ready yet? Does docker wait until it is ready, or does this fail? If it's the former, we should echo "Waiting for Sentry..." before this runs, otherwise the user may be waiting a while. If it fails, we should add some sort of sync point to wait for the container to be up before trying this.

Member Author

Docker compose up will only succeed if the container healthcheck for web passes, so I don't think this will be a problem. That is performed on a previous line, so when the tests get to the createuser logic the web container will always be ready
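
If an explicit sync point were wanted before the exec, one option is a small polling helper along these lines. This is a sketch under assumptions: `wait_until` is a hypothetical name that does not exist in this PR, and the usage comment assumes the `$dc`, `$SENTRY_TEST_HOST`, `$TEST_USER` and `$TEST_PASS` variables already used in run.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): poll a command until it
# succeeds or the deadline passes. Returns 0 on success, 1 on timeout.
wait_until() {
  local timeout_seconds="$1" interval_seconds="$2"
  shift 2
  # SECONDS is a bash builtin counting seconds since the shell started.
  local deadline=$((SECONDS + timeout_seconds))
  until "$@"; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval_seconds"
  done
  return 0
}

# Possible usage in run.sh (assumes the variables named above):
#   printf "Waiting for Sentry to be up"
#   wait_until 90 0.5 curl -Isf -o /dev/null "$SENTRY_TEST_HOST" &&
#     echo y | $dc exec web sentry createuser --force-update --superuser \
#       --email "$TEST_USER" --password "$TEST_PASS"
```

This does roughly the same job as the `timeout 90 bash -c 'until ...'` line in the diff, but keeps the exec from running at all if the wait times out.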

printf "Waiting for Sentry to be up"
echo ""
echo "${_endgroup}"

36 changes: 22 additions & 14 deletions integration-test.sh
@@ -6,24 +6,32 @@ rm -f sentry/enhance-image.sh
rm -f sentry/requirements.txt
export REPORT_SELF_HOSTED_ISSUES=0

echo "Testing initial install"
./install.sh
_integration-test/run.sh
_integration-test/ensure-customizations-not-present.sh
_integration-test/ensure-backup-restore-works.sh
test_option="$1"

echo "Make customizations"
cat <<EOT >sentry/enhance-image.sh
if [[ "$test_option" == "--initial-install" ]]; then
echo "Testing initial install"
./install.sh
_integration-test/run.sh
_integration-test/ensure-customizations-not-present.sh
_integration-test/ensure-backup-restore-works.sh
elif [[ "$test_option" == "--customizations" ]]; then
echo "Testing customizations"
./install.sh
source install/dc-detect-version.sh
$dc up -d
echo "Making customizations"
cat <<EOT >sentry/enhance-image.sh
#!/bin/bash
touch /created-by-enhance-image
apt-get update
apt-get install -y gcc libsasl2-dev python-dev libldap2-dev libssl-dev
EOT
chmod +x sentry/enhance-image.sh
printf "python-ldap" >sentry/requirements.txt
chmod +x sentry/enhance-image.sh
printf "python-ldap" >sentry/requirements.txt

echo "Testing in-place upgrade and customizations"
./install.sh --minimize-downtime
_integration-test/run.sh
_integration-test/ensure-customizations-work.sh
_integration-test/ensure-backup-restore-works.sh
echo "Testing in-place upgrade and customizations"
./install.sh --minimize-downtime
_integration-test/run.sh
_integration-test/ensure-customizations-work.sh
_integration-test/ensure-backup-restore-works.sh
fi