Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

o/snapstate: make a managed refresh schedule not require any additional checks #14107

Conversation

bboozzoo
Copy link
Collaborator

@bboozzoo bboozzoo commented Jun 20, 2024

Drop the additional check to CanManageRefreshes() when the refresh schedule is
already set to 'managed'. This was originally a way to ensure that there is at
least one snap entitled to directly manage the refreshes or fall back to the
default auto-refresh schedule. However, the conditions in which the fallback
would be applied are incorrect and could lead to a situation when snapd would
trigger an auto-refresh even while a snap which is entitled to using a managed
refresh schedule is being refreshed (due to the snapd-control being temporarily
disconnected). On top of this, since the device was once switched to managed, it
clearly means that it was entitled to do so and it was intentional, hence we
should not accidentally break the expectations.

An attempt to fix this was introduced in 01ef5a9 which landed in 2.58, but when refreshing snapd from a version prior to 2.58, the pending security bits would not be set in the state and thus the new code would not be triggered as expected.

@github-actions github-actions bot added the Needs Documentation -auto- Label automatically added which indicates the change needs documentation label Jun 20, 2024
MiguelPires
MiguelPires previously approved these changes Jun 20, 2024
Copy link
Contributor

@MiguelPires MiguelPires left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, thank you

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, couple small comments

@@ -109,6 +109,8 @@ func validateRefreshSchedule(tr RunTransaction) error {
st.Lock()
defer st.Unlock()

// ensure there is a snap entitled to managing the refreshes
// in an autonomic manner
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the last part of the comment inside reportOrIgnoreInvalidManageRefreshes needs also changes I think

s.state.Lock()
defer s.state.Unlock()

// enable gate-auto-refresh-hook feature
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment seems wrong

@pedronis pedronis dismissed MiguelPires’s stale review June 20, 2024 12:29

there's a test as well now

@pedronis pedronis self-requested a review June 20, 2024 12:29
Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bboozzoo @Meulengracht the test looks useful but doesn't seem that it would test fix itself?

@bboozzoo bboozzoo force-pushed the bboozzoo/refresh-schedule-managed-stays-on branch from 7c24ca1 to fc54f59 Compare June 21, 2024 13:01
@bboozzoo bboozzoo requested a review from pedronis June 21, 2024 13:02
@bboozzoo
Copy link
Collaborator Author

@pedronis have a look at the test changes I pushed. I tried to make it more realistic and match more closely the scenario we're addressing (unfortunately it's at the cost of the test becoming less easier to follow).

Copy link
Contributor

@MiguelPires MiguelPires left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines 14428 to 14440
// This test exists to verify that any distruptions of a refresh
// will maintain any pre-existing connection that was held before
// the snap is marked inactive. The goal of this test is to run all
// tasks that involve something making the snap unavailable/not active
// and then simulating a reboot inside the interface manager. We then
// re-verify the connection and see that functionality used return
// the expected values. The interface used for testing is snapd-control
// with the managed refresh schedule attribute.
//
// This was selected based on a customer case. Prior to version 2.58 this
// would cause the connection to be dropped, if a well-timed restart of snapd
// would occur during a refresh. In 2.58 a fix for this was merged, and this
// test shall be passing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This test exists to verify that any distruptions of a refresh
// will maintain any pre-existing connection that was held before
// the snap is marked inactive. The goal of this test is to run all
// tasks that involve something making the snap unavailable/not active
// and then simulating a reboot inside the interface manager. We then
// re-verify the connection and see that functionality used return
// the expected values. The interface used for testing is snapd-control
// with the managed refresh schedule attribute.
//
// This was selected based on a customer case. Prior to version 2.58 this
// would cause the connection to be dropped, if a well-timed restart of snapd
// would occur during a refresh. In 2.58 a fix for this was merged, and this
// test shall be passing.
// This test exists to verify that any disruptions of a refresh
// will maintain any pre-existing connection that was held before
// the snap is marked inactive. The goal of this test is to run all
// tasks that involve something making the snap unavailable/not active
// and then simulating a reboot inside the interface manager. We then
// re-verify the connection and see that functionality used return
// the expected values. The interface used for testing is snapd-control
// with the managed refresh schedule attribute.
//
// This was selected based on a customer case. Prior to version 2.58 this
// would cause the connection to be dropped, if a well-timed restart of snapd
// would occur during a refresh. In 2.58 a fix for this was merged, and this
// test should be passing.

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

questions

overlord/managers_test.go Show resolved Hide resolved
// reflected by the repo
c.Assert(err, ErrorMatches, `snap "snap-with-snapd-control" has no .* "snapd-control"`)
c.Assert(ref, HasLen, 0)
// but returns false without the fix, yet we still made no
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the yet and no right here? above we have c.Assert(len(reqs) > 0, Equals, true) ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, yes, that code was slightly later (after the checks that auto refresh wasn't done), but then I moved it around without updating the comment

@bboozzoo bboozzoo requested a review from pedronis June 24, 2024 06:06
Copy link
Member

@Meulengracht Meulengracht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvements to the test - and lgtm.

bboozzoo and others added 8 commits June 27, 2024 08:38
…al checks

Drop the additional check to CanManageRefreshes() when the refresh schedule is
already set to 'managed'. This was originally a way to ensure that there is at
least one snap entitled to directly manage the refreshes or fall back to the
default auto-refresh schedule. However, the conditions in which the fallback
would be applied are incorrect and could lead to a situation when snapd would
trigger an auto-refresh even while a snap which is entitled to using a managed
refresh schedule is being refreshed (due to the snapd-control being temporarily
disconnected). On top of this, since the device was once switched to managed, it
clearly means that it was entitled to do so and it was intentional, hence we
should not accidentally break the expectations.

Signed-off-by: Maciej Borzecki <[email protected]>
Signed-off-by: Maciej Borzecki <[email protected]>
@bboozzoo bboozzoo force-pushed the bboozzoo/refresh-schedule-managed-stays-on branch from 04067c6 to eb43042 Compare June 27, 2024 06:39
@Meulengracht Meulengracht merged commit a2bc59c into snapcore:master Jun 27, 2024
47 of 51 checks passed
@bboozzoo bboozzoo deleted the bboozzoo/refresh-schedule-managed-stays-on branch June 28, 2024 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Documentation -auto- Label automatically added which indicates the change needs documentation
Projects
None yet
4 participants