Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(robot-server): start sessions router for HTTP-based runs #7833

Merged
merged 4 commits into from
May 28, 2021

Conversation

mcous
Copy link
Contributor

@mcous mcous commented May 24, 2021

Overview

This PR builds of the #7804 research spike to add a new /sessions router that will execute protocols runs using a ProtocolEngine, AbstractFileRunner, and other associated interfaces.

Specifically, this starts building out stubs for the following robot-server interfaces and value objects:

  • SessionStore, where session resource data can be kept in memory
  • SessionRunner, where ProtocolEngine and FileRunner instances can be managed
  • Session models, to be replaced / merged with pre-existing session models
  • SessionInput models, to provide an interface for controlling the lifecycle of a session (aka play/pause)

This takes out a chunk of #7808 by providing the beginnings of HTTP routes to:

  • Create a session from a protocol resource
  • Tell the session to start executing

Changelog

  • refactor(robot-server): start sessions router for HTTP-based runs

Review requests

Two main requests on this one:

  • I'm interested in feedback from a standpoint of data encapsulation and interface / responsibility design
    • e.g. "SessionInput is a weird name, what about XYZ" is helpful, because the name of the value object affects perceived responsibilities
    • e.g. "We need other session types" is not helpful, because having more subclasses of Session in this PR wouldn't change how data flows through the routes and dependencies
  • Tests in this PR are for interface design of the router, so please approach them from that standpoint
    • Do they communicate the intent of the test subject's dependency orchestration well?
    • Is the test subject doing a good job of not mixing logic with delegation?

This new sessions router is placed behind the enableProtocolEngine feature flag, but all routes will raise a NotImplementedError right now. If you'd like to see for yourself, go ahead and run the dev server with the feature flag enabled:

make dev OT_API_FF_enableProtocolEngine=true

Risk assessment

No risk for production code. The risk here is mostly in strategy and approach. Eventually, this sessions router and models will need to account for existing flows, like deck calibration and tip length check. The trick will be to pay down tech debt while keeping that stuff operational.

@mcous mcous added the robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience). label May 24, 2021
@mcous mcous requested a review from a team as a code owner May 24, 2021 22:46
@mcous mcous requested review from a team, amitlissack and celsasser and removed request for a team and amitlissack May 25, 2021 00:21
@SyntaxColoring SyntaxColoring self-requested a review May 25, 2021 15:46
Copy link
Contributor

@celsasser celsasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. Most of my feedback is questions, curiosity and some suggestions. Only two typo change requests.

robot-server/robot_server/router.py Outdated Show resolved Hide resolved


def get_current_time() -> datetime:
"""Get the current system time."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember seeing an argument (think it was in the pragmatic programmer) that I agree with which had to do with redundant comments. They said it much more eloquently, but essentially is there value in a comment that restates a well named function. And, well done in choosing a well named function :).

But can see reasons for including it so that we don't neglect it as the function develops?

Copy link
Contributor Author

@mcous mcous May 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @SyntaxColoring to make sure this discussion makes it into the style guide

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha yeah, this is a big pet peeve of mine. 😁

I had an essay written up, but I don't think this PR is the place for it. Good callout for the style guide. Are you thinking this is an HTTP API style guide thing, or a Python-in-general style guide thing?

In this specific case, I agree with @celsasser:

  • We should consider renaming get_current_time() to get_current_system_time().
  • We should remove the docstring. If the linter complains about that, the linter is wrong, and we should either disable that overzealous check, or work around it with a #noqa.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would fleshing out the docstring also be an option?

def get_current_time() -> datetime:
    """Get the current system time.
    
    This dependency can be used to generate timestamps for resource
    creation and updates.

    Returns:
        The current time in UTC
    """"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that works too.

I see redundant comments as sitting in an "uncanny valley." The code can be improved, more-or-less equally, by taking the comment in either of two opposite directions: either deleting it or making it really pull its weight.

We shouldn't feel pressured to choose the "making it really pull its weight" direction. Undocumented internal functions are legitimate. Sometimes we deliberately want to leave something vaguer and open to interpretation, or lean on intuition.

robot-server/robot_server/sessions/router.py Show resolved Hide resolved
robot-server/tests/helpers.py Show resolved Hide resolved
robot-server/tests/sessions/test_sessions_router.py Outdated Show resolved Hide resolved
robot-server/tests/sessions/test_sessions_router.py Outdated Show resolved Hide resolved
robot-server/tests/sessions/test_sessions_router.py Outdated Show resolved Hide resolved
robot-server/tests/sessions/test_sessions_router.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented May 25, 2021

Codecov Report

Merging #7833 (4e0718b) into edge (e21b870) will increase coverage by 10.03%.
The diff coverage is 92.30%.

Impacted file tree graph

@@             Coverage Diff             @@
##             edge    #7833       +/-   ##
===========================================
+ Coverage   83.48%   93.51%   +10.03%     
===========================================
  Files         336      124      -212     
  Lines       21350     4921    -16429     
===========================================
- Hits        17824     4602    -13222     
+ Misses       3526      319     -3207     
Impacted Files Coverage Δ
robot-server/robot_server/sessions/dependencies.py 64.28% <64.28%> (ø)
...bot-server/robot_server/sessions/session_runner.py 75.00% <75.00%> (ø)
...obot-server/robot_server/sessions/session_store.py 79.16% <79.16%> (ø)
...ot-server/robot_server/sessions/session_builder.py 93.75% <93.75%> (ø)
robot-server/robot_server/router.py 100.00% <100.00%> (ø)
...t-server/robot_server/service/json_api/__init__.py 100.00% <100.00%> (ø)
...t-server/robot_server/service/json_api/response.py 100.00% <100.00%> (ø)
robot-server/robot_server/sessions/__init__.py 100.00% <100.00%> (ø)
robot-server/robot_server/sessions/router.py 100.00% <100.00%> (ø)
...bot-server/robot_server/sessions/session_inputs.py 100.00% <100.00%> (ø)
... and 229 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e21b870...4e0718b. Read the comment docs.

@mcous mcous requested a review from celsasser May 25, 2021 18:48
Copy link
Contributor

@celsasser celsasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of comments that are optional. Looks good.

Copy link
Contributor

@SyntaxColoring SyntaxColoring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub threw away some of my work, ugh, so this is a partial review while I find the rest of my comments.

[Edit: Completed review in a new review, below.]

robot-server/robot_server/router.py Outdated Show resolved Hide resolved
robot-server/robot_server/router.py Show resolved Hide resolved
robot-server/tests/sessions/test_sessions_router.py Outdated Show resolved Hide resolved
@sessions_router.get(
path="/sessions",
summary="Get all sessions",
description="Get a list of all active and inactive sessions.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. How can there be more than one active session at once? Aren't sessions mutually exclusive?
  2. What is an inactive session?

Copy link
Contributor Author

@mcous mcous May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. How can there be more than one active session at once? Aren't sessions mutually exclusive?

I don't think there could be, but I also does this description imply that there could be?

  1. What is an inactive session?

A session whose state has been preserved for re-activation at a later time, but is not currently in use. In my imagination, I think of a user pausing a protocol to re-do a calibration before returning to the protocol run where they left off

(There is a whole other question, talked about in another comment, about where session switching is even a thing we want to do. This description would change if we decided "no")

Comment on lines +21 to +29
class SessionInput(ResourceModel):
"""Session input model.

A SessionInput resource represents a client-provided input into the
session in order to control the execution of the session itself.

This is different than a protocol command, which represents an individual
action to execute on the robot as part of a protocol.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should try to generalize an input mechanism across all session types. Instead, I think each session type should have its own specific resources and actions.

Examples:

# Exact endpoint hierarchies and spellings TBD.

# See if the protocol is paused because the user hit the pause button.
GET /sessions/protocol_run/current/user_pause_state
200 { "userPaused": true }

# User clicks resume button.
PUT /sessions/protocol_run/current/user_pause_state { "userPaused": false }
200  { "userPaused": false }

# Or maybe:
POST /sessions/protocol_run/current/user_pause_state/actions/resume
200  { "userPaused": false }

# See if the current step needs user input.
GET /sessions/protocol_run/current/user_pause_state/steps/current
{
    "id": "abc123"
    "status": "awaitingUserInput",
    "userInput": {
        "spec": {
            "sample_count", { "type": "int", "minimum": 1, "maximum": 96 }
        }
        "currentValue": null
    }
}

# User provides input to the step.
PATCH /steps/abc123/user_input { "sample_count": 48 }

Ideologically:

A generalized input mechanism feels to me like RPC-over-JSON-over-HTTP. I'm not saying this would be wrong—it would be a perfectly functional way to control the robot—but I do think it would be contrary to our prior-stated ambitions of making a RESTful interface.

Instead of trying to anticipate commonalities between sessions, I think we should start with the most dumb, direct, literal, concrete interfaces possible. And then move things under generalized umbrellas only as patterns and needs emerge.

Practically:

Indirection and generality make documentation difficult. OpenAPI doesn't model them well.

If you have a protocol run session, for example, as a 3rd-party developer, you'd want to see all the things that you can do on it. But our documentation wouldn't naturally provide you with that list. It would only provide you with the list of things that anyone can do on any session type. We'd have to take pains to explain that only some of them are valid for different session types. And it would be up to you to manually adhere to those requirements. (OpenAPI and static typing couldn't help you.)

And even if the docs weren't a problem, I think the indirection and generality are undue conceptual weight. A 3rd-party dev won't want to "issue a start command to a protocol run session"; they'll want to "start a protocol run."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then move things under generalized umbrellas only as patterns and needs emerge.

To clarify what I mean here by an "emerged need..."

Client code like this would be a red flag that we need to generalize something.

def delete_current_session():
    if current_session_type() == "calibration":
        delete_current_calibration_session()
    elif current_session_type() == "liveProtocol":
        delete_current_live_protocol_session()
    # ...ad nauseam.

On the other hand, if two resources merely look like they have similar data, or similar actions, I wouldn't consider that alone to be a good reason to bring them under a generalized umbrella. Generalizations ought not form taxonomies for taxonomies' sake. "Duplication is far cheaper than the wrong abstraction." (Again, though, not that I'm calling the existing API "wrong.")

Copy link
Contributor Author

@mcous mcous May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On RPC and REST (ideology)

A generalized input mechanism feels to me like RPC-over-JSON-over-HTTP. I'm not saying this would be wrong—it would be a perfectly functional way to control the robot—but I do think it would be contrary to our prior-stated ambitions of making a RESTful interface.

I disagree. This solution models user-provided control inputs as a specific, timestamped, identifiable resources on the server, which can be queried and referenced. The creation of that resource triggers some side-effects, which is not my favorite, but useful in this case. IMO the resource creation makes it REST'y.

From your examples:

PUT /sessions/protocol_run/current/user_pause_state { "userPaused": false }
200 { "userPaused": false }

This is also pretty REST'y, going with resource mutation rather than resource creation. It's similar to how we do things over in the factory tools. I'm having some trouble figuring out what the resource(s) look(s) like here, though. Is it the session? Is it some child of the session?

type ProtocolSession = {
  id: string,
  user_pause_state: {
    userPaused: boolean,
  },
  // ...
}

I prefer POST /session/:sid/inputs for a few reasons:

  • I think it's important that there be a record of pauses, resumes, etc. This necessitates the creation of some resource to represent those events, so why not just deal with those resources directly?
  • Where possible, I prefer immutable resource creation to resource mutation, especially with the end goal of triggering a side effect

POST /sessions/protocol_run/current/user_pause_state/actions/resume
200 { "userPaused": false }

This is 100% just a remote procedure call ("do a resume") in a URL, so I think we can both agree that this is not the way to go.

Instead of trying to anticipate commonalities between sessions, I think we should start with the most dumb, direct, literal, concrete interfaces possible. And then move things under generalized umbrellas only as patterns and needs emerge.

I don't think saying "sessions require user input, and therefore we need a way to model user input events" is an unreasonable commonality to conclude.

On the OpenAPI spec (practicality)

We're in total agreement that good OpenAPI Spec modeling of behaviors is table stakes for this project.

Indirection and generality make documentation difficult. OpenAPI doesn't model them well.

I think there's some subtlety here. I think OpenAPI is perfectly capable or modelling them well (enough), but we need to do a good job of structuring our models to make that happen, which we haven't done well in the previous /sessions iteration.

If you have a protocol run session, for example, as a 3rd-party developer, you'd want to see all the things that you can do on it. But our documentation wouldn't naturally provide you with that list. It would only provide you with the list of things that anyone can do on any session type.

This is 100% 50% possible, we just have to model our different session types properly:

from .input_models import AbstractInput, StartInput, PauseInput, StopInput

class AbstractSession:
    inputs: Optional[List[AbstractInput]]

class ProtocolSession:
    inputs: List[Union[StartInput, PauseInput, StopInput]]

class UnstoppableSession:
    inputs: Literal[None]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 100% possible, we just have to model our different session types properly:

Hm upon second thought, this is easy at the Session schema level. I think you're right that it becomes less fun at the URL schema level

Copy link
Contributor

@SyntaxColoring SyntaxColoring May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm upon second thought, this is easy at the Session schema level. I think you're right that it becomes less fun at the URL schema level

Yup, exactly.

I think there's two questions here that I kind of conflated:

  1. Is there a single input endpoint, overloaded to support all session types? Or does each session type have its own dedicated input endpoint?
  2. Regardless of we decide for (1), is an input endpoint then further split up based on the "kind" of input? If so, where do we stop splitting it up?

I'm more enthusiastic about the (1) kind of splitting up than the (2) kind of splitting up, though I have arguments for (2) also.

Comment on lines +1 to +11
"""Session creation and management.

A "session" is a logical container for a user's interaction with a robot,
usually (but not always) with a well-defined start point and end point.
Examples of "sessions" include:

- A session to run a specific protocol
- A session to complete a specific calibration procedure
- A long running, "default" session to perform one-off actions, like toggling
the frame lights on
"""
Copy link
Contributor

@SyntaxColoring SyntaxColoring May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this description of what sessions are.

It would also help me if we wrote down what making something a session "does," and why we do it. (More specifically than "to manage them.")

In my mind, it's:

  1. Mutual exclusion. I shouldn't be able to calibrate while I'm running a protocol.
  2. A machine-readable entry point into inspecting the robot's current state. If I connect my Opentrons App out of the blue to your robot, my app needs a way of finding what the robot is "currently doing" and presenting that to me. That's the "current session."

Is that it, or am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add "a historical record of events that occurred during a given calibration / protocol run / etc." I also might take issue with:

Mutual exclusion. I shouldn't be able to calibrate while I'm running a protocol.

Counterpoint: why not? I can image this flow: you're running a protocol, something is off, you issue a manual series of commands / resource updates to correct the issue (and those events are recorded in session state), and you proceed with the run

Co-authored-by: Max Marrone <[email protected]>
Copy link
Member

@sanni-t sanni-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are mainly naming related.
I am not too sure of SessionInput either but I can warm up to it. It just doesn't reflect what it's talking about. Maybe something like SessionControlInput if that's not too wordy for everyone..

created_at=created_at,
)

data = session_builder.build(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if I'm understanding this correctly, adding an entry to the sessions store using create() actually creates the session, while session builder creates a representation of the session in terms of tangible session properties? If so, I think the name session_builder and the build method are confusing. It reads as the thing that actually builds/creates a session.
I don't have great alternatives but thinking of something like a SessionRepr..

@SyntaxColoring SyntaxColoring self-requested a review May 26, 2021 20:51
Copy link
Contributor

@SyntaxColoring SyntaxColoring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a few threads here where I'm not really convinced, but I don't think merging this PR should wait for me to form a coherent argument.

I'll gather my thoughts and we can continue discussing later (here on in Slack), if you want to merge now.

@mcous
Copy link
Contributor Author

mcous commented May 28, 2021

Merging this one because I'm already too far into the next PR to really wrangle things, so here are unresolved items that require followup (that I know about):

  • SessionInput is too broad a name, let's switch to SessionControl or SessionControlInput to reflect that we're collecting requests to control the run
  • SessionBuilder doesn't accurately reflect what that utility is doing, switch it to ResponseBuilder to reflect that it's purely mapping existing session data to a Session model for HTTP response purposes
  • Need to add various TODO comments

@mcous mcous merged commit 5f9c15d into edge May 28, 2021
@mcous mcous deleted the robot-server_engine-sessions branch June 1, 2021 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants