Integration test improvements #2858

Merged
merged 11 commits into from
Mar 7, 2024
fix test type
hubertdeng123 committed Mar 7, 2024
commit b22f1eb3ccd54d215d1e8b7ce2691be72b040fcb
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -101,7 +101,7 @@ jobs:
max_attempts: 3
Contributor

Just so I understand: what was the behavior before you added these settings? Unlimited retries? No timeout so it hung until the action crashed?

Member Author

The tests were failing from flakes far too often. Adding this drastically increases the chance that the tests pass when they should.

Contributor

So, in effect, the previous (implicit) setting was max_attempts: 1? Or timeout_minutes: Infinity? Or some combination of the two? I get that the problem we are trying to solve is flakiness, I'm just not clear how changing (raising? lowering?) max_attempts and timeout_minutes helps that, since it's not obvious what the current state of affairs is.

Member Author

Yep, that's correct. The max_attempts setting is really just the first step toward adding flaky test detection. If a job fails but then succeeds on retry, it can be marked as flaky. The timeout_minutes is a required parameter here. I can remove this and re-add it in a follow-up PR if that would be clearer.

Contributor

No, that's fine. I just wanted to understand the change. LGTM!
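
For readers following this thread, the retry-then-flag idea described above (a job that fails, is retried, and then succeeds can be marked as flaky) can be sketched roughly as below. This is only an illustration and not the retry step's actual implementation: the names `run_with_retries` and `RetryResult` are hypothetical, the timeout is assumed to apply per attempt, and a timed-out attempt is treated like any other failure, which may differ from what `retry_on: error` does in the workflow.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class RetryResult:
    passed: bool    # did any attempt succeed?
    attempts: int   # number of attempts actually made
    flaky: bool     # True when an earlier attempt failed but a later one passed


def run_with_retries(command, max_attempts=3, timeout_seconds=25 * 60):
    """Run `command` up to `max_attempts` times (hypothetical helper).

    Assumption: the timeout bounds each attempt separately, and a timeout
    counts as a failed attempt that is eligible for retry.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            completed = subprocess.run(command, timeout=timeout_seconds)
            succeeded = completed.returncode == 0
        except subprocess.TimeoutExpired:
            succeeded = False
        if succeeded:
            # Passing only after at least one failure is the "flaky" signal.
            return RetryResult(passed=True, attempts=attempt, flaky=attempt > 1)
    return RetryResult(passed=False, attempts=max_attempts, flaky=False)
```

Under these assumptions, the previous implicit behavior discussed above corresponds to `max_attempts=1`, where a single flake fails the job and nothing can ever be reported as flaky.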

timeout_minutes: 25
retry_on: error
- command: ./integration-test.sh --${{ matrix.test-group }}
+ command: ./integration-test.sh --${{ matrix.test_type }}

- name: Inspect failure
if: failure()