refactor(engine): do not run redundant cleanup on asyncio.CancelledError #8993

mcous · 2021-12-06T21:21:28Z

Overview

During a halt, an asyncio.CancelledError was propagating upwards in an unexpected (to me) way. This PR ensures that the engine cleanup logic does not run in the case of an asyncio.CancelledError, and simplifies the EngineStatus enum to ensure better status reporting in real-world, PAPIv2 usage.

Closes #8980

Changelog

do not run redundant cleanup on asyncio.CancelledError
rename ProtocolEngine.stop to ProtocolEngine.finish to better communicate that it should be called to tell the engine it is done working
- This method is called by the ProtocolRunner when the protocol is done executing
rename ProtocolEngine.halt to ProtocolEngine.stop to better communicate that it stops the engine from doing whatever it wants to be doing
- This lines up better with the HTTP API client-facing actionType: stop, too
Remove PAUSE_REQUESTED status enum value for the time being
- We will re-introduce it for PAPIv3 / JSON schema v6
- For now, though, it just confuses PAPIv2 execution reporting, because the PAUSED status is unreachable given current implementation of the legacy support plugin

Review requests

Other than checking out the code and test changes, stress test playing, pausing, running to completion, and stopping protocol /runs to see if you can get any unexpected statuses back.

Risk assessment

Low.

Closes #8980

codecov · 2021-12-06T21:24:12Z

Codecov Report

Merging #8993 (24629cd) into edge (be485fe) will decrease coverage by 0.00%.
The diff coverage is 98.11%.

@@            Coverage Diff             @@
##             edge    #8993      +/-   ##
==========================================
- Coverage   74.72%   74.72%   -0.01%     
==========================================
  Files        1860     1860              
  Lines       49407    49421      +14     
  Branches     4857     4857              
==========================================
+ Hits        36918    36928      +10     
- Misses      11644    11648       +4     
  Partials      845      845

Flag	Coverage Δ
api	`84.68% <98.11%> (-0.01%)`	⬇️
robot-server	`91.99% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
.../src/opentrons/protocol_engine/actions/__init__.py	`100.00% <ø> (ø)`
api/src/opentrons/protocol_engine/types.py	`100.00% <ø> (ø)`
api/src/opentrons/protocol_runner/task_queue.py	`95.65% <85.71%> (+0.30%)`	⬆️
...i/src/opentrons/protocol_engine/actions/actions.py	`100.00% <100.00%> (ø)`
...i/src/opentrons/protocol_engine/protocol_engine.py	`100.00% <100.00%> (ø)`
...pi/src/opentrons/protocol_engine/state/commands.py	`99.21% <100.00%> (+0.05%)`	⬆️
...i/src/opentrons/protocol_runner/protocol_runner.py	`100.00% <100.00%> (ø)`
...hardware_control/emulation/module_server/server.py	`75.55% <0.00%> (-4.45%)`	⬇️
...i/src/opentrons/hardware_control/thread_manager.py	`88.80% <0.00%> (-1.61%)`	⬇️

api/src/opentrons/protocol_engine/types.py

api/tests/opentrons/protocol_runner/test_task_queue.py

SyntaxColoring

Nice, I think this is a big improvement in clarity.

SyntaxColoring · 2021-12-07T16:17:39Z

api/src/opentrons/protocol_engine/state/commands.py

@@ -30,6 +32,8 @@ class CommandState:
 """State of all protocol engine command resources."""

 is_running: bool
+ is_gracefully_finished: bool


Would it be clearer if is_gracefully_finished were instead named something like is_done_adding_commands, or is_expecting_more_commands?

If I'm reading this correctly, that's the minimal, unique piece of additional state that this boolean adds atop is_running, is_hardware_stopped, and stop_requested. The term "finished" has a lot of overlap with them.

I'm not sure is_expecting_more_commands is quite right, because in the case of a hard stop / cancel, the engine certainly will not be expecting more commands. I think the graceful is important; it's what tells the engine that we'd like to report a success or failure result rather than "stopped"

Hm, yeah. command_stream_closed_gracefully maybe?

I guess I'm trying to pull the name of this boolean towards being about just the input stream of commands, and trying to avoid making it sound like it describes the higher-level state of the whole engine, which is a more complicated notion that needs to be synthesized by get_status().

api/src/opentrons/protocol_engine/state/commands.py

SyntaxColoring · 2021-12-07T17:33:05Z

api/src/opentrons/protocol_runner/task_queue.py

+ except asyncio.CancelledError:
+ log.debug("Run task was cancelled")
+ raise


I would put this check in join() instead of _run(), to make things easier to match up. We call .cancel() on self._run_task, so it's symmetrical to catch CancelledError from await self._run_task.

We need this check here to exclude asyncio.CancelledError from the run task error capture. We need to capture run task errors here to feed them into the cleanup func

Ah drat, yeah. :(

api/src/opentrons/protocol_engine/state/commands.py

SyntaxColoring · 2021-12-07T18:27:40Z

api/src/opentrons/protocol_engine/state/commands.py

@@ -263,16 +288,17 @@ def get_status(self) -> EngineStatus:
 all_errors = self._state.errors_by_id.values()


(Commenting on validate_action_allowed(), which GitHub won't let me start a thread on):

Should we forbid adding new commands when self._state.is_gracefully_finished or self._state.is_stop_requested is True?

Or would that mess something up about how live LPC commands and protocol commands coexist within the same run?

I'll add a TODO

SyntaxColoring · 2021-12-07T18:54:42Z

api/src/opentrons/protocol_runner/task_queue.py

+
+ if self._cleanup_func is not None:
+ await self._cleanup_func(error=error)
+ elif error:
+ raise error


As a caller, I'd be surprised by cleanup only running conditionally on the run_func succeeding.

But also, as a caller, I never really got the motivation behind this class in the first place, so I might be missing something. I had been thinking of this class as basically an indirection around a try/finally block, where run_func is the try body and cleanup_func is the finally body. So now I guess it's an indirection around just a try block, without the finally? But why would the caller not just write the try itself?

Feel free to say this isn't a question for this PR.

I think you've misread this code. This if re-raises the error only if self._cleanup_func is not defined. This is never true in actual usage, so the cleanup func always runs regardless of the success or failure of the run func.

I agree that this dependency is awkward and needs to go away, but it's not for this PR.

I think you've misread this code. This if re-raises the error only if self._cleanup_func is not defined. This is never true in actual usage, so the cleanup func always runs regardless of the success or failure of the run func.

Right, but the whole block has been de-indented outside of the finally, right? So if the run_func code raises anything—which we still expect to be possible—then this code won't run?

async def _run(self) -> None: error = None try: if self._run_func is not None: await self._run_func() except asyncio.CancelledError: log.debug("Run task was cancelled") raise except Exception as e: # <-- exception in run_func caught here log.debug( "Exception raised during protocol run", exc_info=error, ) error = e # <-- exception captured, not reraised if self._cleanup_func is not None: await self._cleanup_func(error=error) # <-- cleanup runs elif error: raise error

shlokamin

tested on robot and seems to work, thanks @mcous!

mcous added 2 commits December 6, 2021 15:18

refactor(engine): do not run redundant cleanup on asyncio.CancelledError

b590281

Closes #8980

fixup: disambiguate between stopped and gracefully finished

3bfb4e4

mcous added the robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience). label Dec 6, 2021

mcous requested a review from a team as a code owner December 6, 2021 21:21

fixup: revert removal of EngineStatus.STOP_REQUESTED

f712cbe

mcous commented Dec 6, 2021

View reviewed changes

api/src/opentrons/protocol_engine/types.py Outdated Show resolved Hide resolved

fixup: revert unintentional enum value change

4e46574

mcous commented Dec 6, 2021

View reviewed changes

api/tests/opentrons/protocol_runner/test_task_queue.py Outdated Show resolved Hide resolved

fixup: docstring typo

81da353

SyntaxColoring reviewed Dec 7, 2021

View reviewed changes

shlokamin approved these changes Dec 7, 2021

View reviewed changes

SyntaxColoring approved these changes Dec 7, 2021

View reviewed changes

fixup: docstrings and variable renames

24629cd

mcous merged commit fec85cd into edge Dec 7, 2021

mcous deleted the engine_hw-cancelled-error branch December 7, 2021 21:56

SyntaxColoring mentioned this pull request Jun 20, 2024

refactor(app,api-client): Delete bindings to nonexistent pause-requested run status #15476

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(engine): do not run redundant cleanup on asyncio.CancelledError #8993

refactor(engine): do not run redundant cleanup on asyncio.CancelledError #8993

mcous commented Dec 6, 2021 •

edited

Loading

codecov bot commented Dec 6, 2021 •

edited

Loading

SyntaxColoring left a comment •

edited

Loading

SyntaxColoring Dec 7, 2021 •

edited

Loading

mcous Dec 7, 2021

SyntaxColoring Dec 7, 2021

SyntaxColoring Dec 7, 2021

mcous Dec 7, 2021

SyntaxColoring Dec 7, 2021

SyntaxColoring Dec 7, 2021

mcous Dec 7, 2021

SyntaxColoring Dec 7, 2021

mcous Dec 7, 2021

SyntaxColoring Dec 7, 2021

mcous Dec 7, 2021

shlokamin left a comment

		@@ -263,16 +288,17 @@ def get_status(self) -> EngineStatus:
		all_errors = self._state.errors_by_id.values()

refactor(engine): do not run redundant cleanup on asyncio.CancelledError #8993

refactor(engine): do not run redundant cleanup on asyncio.CancelledError #8993

Conversation

mcous commented Dec 6, 2021 • edited Loading

Overview

Changelog

Review requests

Risk assessment

codecov bot commented Dec 6, 2021 • edited Loading

Codecov Report

SyntaxColoring left a comment • edited Loading

Choose a reason for hiding this comment

SyntaxColoring Dec 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shlokamin left a comment

Choose a reason for hiding this comment

mcous commented Dec 6, 2021 •

edited

Loading

codecov bot commented Dec 6, 2021 •

edited

Loading

SyntaxColoring left a comment •

edited

Loading

SyntaxColoring Dec 7, 2021 •

edited

Loading