Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-30165][runtime][JUnit5 Migration] Migrate unaligned checkpoint related tests under flink-runtime module to junit5 #21368

Merged
merged 2 commits into from
Dec 9, 2022

Conversation

1996fanrui
Copy link
Member

What is the purpose of the change

Migrate unaligned checkpoint related tests under flink-runtime module to junit5

Brief change log

Migrate unaligned checkpoint related tests under flink-runtime module to junit5

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not documented

@flinkbot
Copy link
Collaborator

flinkbot commented Nov 23, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@1996fanrui
Copy link
Member Author

Hi @XComp , please help take a look in your free time, thanks!

@XComp
Copy link
Contributor

XComp commented Nov 24, 2022

@snuyanzin may you have time to take a look at it?

@1996fanrui
Copy link
Member Author

Hi @RocMarshal , please help take a look in your free time, thanks~

@snuyanzin
Copy link
Contributor

In general looks good, i left some comments.
May be a dummy question to clarify: is there a way to determine which tests are related to unaligned checkpoint and which are not based e.g. on a package name?

@1996fanrui
Copy link
Member Author

Hi @snuyanzin , Thanks for your review. I have addressed all comments.

Is there a way to determine which tests are related to unaligned checkpoint and which are not based e.g. on a package name?

Yes, I based package name. In fact, it is non-strict. The flink-runtime module is too large to be migrated in one go. So split some small packages to migrate.

Copy link
Contributor

@RocMarshal RocMarshal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool~
Thanks for @1996fanrui contribution and review @snuyanzin .
I left a few comments. PTAL. thx.

Copy link
Member

@reswqa reswqa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@1996fanrui Thanks for doing this migration, It looks very good overall. I have only a few suggestions.

@1996fanrui 1996fanrui force-pushed the 30165/junit5-runtime-uc branch 2 times, most recently from 46f1706 to 2ce45c0 Compare November 24, 2022 14:11
Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @1996fanrui for this PR and @snuyanzin , @RocMarshal and @reswqa for reviewing this big change thoroughly. I did a pass over it and had some final comments.

A final thought on collecting expected exceptions. We should really be specific in the callback of an assertThatThrownBy call to be sure that the exception is actually thrown in the expected place. I understand that this wasn't the case in the JUnit4 version either. But I think that it's worth it to add this improvement as part of the JUnit5 migration.

@1996fanrui
Copy link
Member Author

Hi @XComp , thanks for your hard review. I have addressed all comments.

And thanks @XComp @snuyanzin @reswqa and @RocMarshal again, I learned a lot in this PR. It will be useful for my future contributions.

Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments, @1996fanrui . I went over the changes once more. Please find more comments below.

runWithSyncWorker(
(writer, worker) -> {
callStart(writer);
callAddInputData(writer, dataBuf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
callAddInputData(writer, dataBuf);
callAddInputData(writer, eventBuf, dataBuf);

Are you sure that we're still testing the same thing? The original code didn't call worker.processAllRequests() twice, as far as I can see. 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes , I'm sure.

Channel state just allows the data buffer, and doesn't allow the event buffer. This unit-test checks whether it throw IllegalArgumentException when adding an event buffer.

So I call processAllRequests twice, the first one, it's a start request and add data buffer request, the second one, it's a add event buffer request.

The first one is normal, and the second one should throw exception. It's more specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, you're right. But the test is slightly changed by you adding dataBuf before adding eventBuf because we're checking whether dataBuf is recycled afterwards. The semantics of this check are changed due to your change: Previously, the test checked that a buffer was recycled which should have been processed but wasn't processed because of the Exception the eventBuf caused. Now, we check that the dataBuf is recycled after being processed properly. I'm curious about your opinion here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

he test checked that a buffer was recycled which should have been processed but wasn't processed because of the Exception the eventBuf caused.

Good catch, I didn't realize it before. I just thought that test wanted to verify whether the data buffer is normal.

Now I understand it, thanks for your reminder~

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick response, @1996fanrui. Looking at this specific test, we will see now that we run into the same issue which I explained in my other comment. The expected IllegalArgumentException is caught and handled by assertThatThrownBy. As a consequence, the call chain continues within runWithSyncWorker making processAllRequests being called once more which might affect dataBuf in this specific case. I'm gonna propose a way to deal with this in the other thread to keep that discussion there.

assertTrue(dataBuf.isRecycled());
}

runWithSyncWorker(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed the following: Refactoring this code in a way that we're using assertThatThrownBy is used changes the behavior of runWithSyncWorker slightly because of worker.processAllRequests() in ChannelStateWriterImplTest:313. The old implementation didn't call processAllRequests() if an exception was thrown. The refactored version would actually cause this method to be called which might change the behavior. You might want to double-check whether that's something we want.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked all caller. All the unit tests of @Test(expected = Exception.class) throw an exception in the last line of runWithSyncWorker or it just has one line, so I think the old and new codes are the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, you misunderstood me here: worker.processAllRequests() in ChannelStateWriterImplTest:319 is called after the callback function is executed. In the old code, we threw an Exception within the callback function which resulted in the command in line 319 not being called. In the new version of the code, we rely on assertThatThrownBy which doesn't throw an Exception and, therefore, line 319 is executed, i.e. worker.processAllRequests() is called in the expected case. This feels like an unwanted change? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the callback doesn't throw exception. Old and new code is same.

Some old code tests rely on runWithSyncWorker -> worker.processAllRequests() to throw exceptions for tests that expect Exception.

In the new code, I have additionally called worker.processAllRequests() in the callback, so runWithSyncWorker -> worker.processAllRequests() will not process any requests.

Please correct me, if anything is wrong.

Or if we worry about which tests are wrong, we can refactor them into non-specific checks, what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be on the safe side, we should remove worker.processAllRequests() in ChannelStateWriterImplTest:316 and move it into the tests where this call is still required. This is test-specific code (for tests where no exception is expected) rather than generic code. WDYT?

@1996fanrui 1996fanrui force-pushed the 30165/junit5-runtime-uc branch 2 times, most recently from 1cebb8c to 5596204 Compare December 2, 2022 15:00
@XComp
Copy link
Contributor

XComp commented Dec 5, 2022

I replied to your comments, @1996fanrui. I'm looking forward to your response.

@1996fanrui
Copy link
Member Author

1996fanrui commented Dec 5, 2022

I replied to your comments, @1996fanrui. I'm looking forward to your response.

@XComp , thanks for your hard review, updated~

… related tests under flink-runtime module to junit5
Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments. The PR looks good. 👍

@1996fanrui just as hint for future PRs. It helps to add new commits while being in the review process and only do a squash after you're sure that the review is done (usually that can be even done by the committer when merging the PR).
Github provides functionality to check the diff after a forced-push. But having separate commits also makes the incremental changes available locally when reviewing the PR using an IDE.

@XComp
Copy link
Contributor

XComp commented Dec 5, 2022

@rkhachatryan do you want to have a brief look over the changes to make sure that we didn't accidentally change some test logic? Otherwise, I would go ahead and merge it tomorrow after CI is green

@1996fanrui
Copy link
Member Author

Thanks for addressing my comments. The PR looks good. 👍

@1996fanrui just as hint for future PRs. It helps to add new commits while being in the review process and only do a squash after you're sure that the review is done (usually that can be even done by the committer when merging the PR). Github provides functionality to check the diff after a forced-push. But having separate commits also makes the incremental changes available locally when reviewing the PR using an IDE.

Thanks for your reminder.

@1996fanrui
Copy link
Member Author

@flinkbot run azure

@1996fanrui 1996fanrui requested a review from XComp December 8, 2022 17:40
@XComp
Copy link
Contributor

XComp commented Dec 8, 2022

Why another review? I was just waiting with merging because I wanted to give @rkhachatryan the opportunity to verify that we didn't miss something. Merging the PR is on my list of things to do for Monday. :-)

Update: Ah I see, in my last comment, I said "tomorrow". 😇 Sorry for the confusion. :-D will do it on Monday.

Copy link
Contributor

@rkhachatryan rkhachatryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and for pulling me in.
I've left only some minor comments, PTAL.

@1996fanrui
Copy link
Member Author

Thanks for the PR and for pulling me in. I've left only some minor comments, PTAL.

Hi @rkhachatryan , thanks for your review, I have updated.

Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving it another pass @rkhachatryan . I'm gonna go ahead and merge it as soon as CI is green.

Copy link
Contributor

@rkhachatryan rkhachatryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the PR, LGTM

@1996fanrui
Copy link
Member Author

@flinkbot run azure

@XComp
Copy link
Contributor

XComp commented Dec 9, 2022

CI is green. About the rerun: It would be also nice to investigate build failures. The build failures we've seen in this PR's CI can be explained with FLINK-29427 and FLINK-30355. The latter one seems to be an issue that we haven't observed so far (based on a Jira search). It would have been a pity if we would have missed that one.

Such an analysis is also good to determine whether the build failures are related to the PR's changes or not. In this case, it's obvious that they are not because we only refactored test classes. Therefore, we're good to go. I will merge the PR. Thanks for your contribution @1996fanrui and everyone else for their reviews.

@XComp XComp merged commit 5f924bc into apache:master Dec 9, 2022
@1996fanrui 1996fanrui deleted the 30165/junit5-runtime-uc branch December 9, 2022 16:10
@1996fanrui
Copy link
Member Author

Thanks everyone who reviewed 👍🏻

chucheng92 pushed a commit to chucheng92/flink that referenced this pull request Feb 3, 2023
… related tests under flink-runtime module to junit5 (apache#21368)
akkinenivijay pushed a commit to krisnaru/flink that referenced this pull request Feb 11, 2023
… related tests under flink-runtime module to junit5 (apache#21368)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants