Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mockup of idea for CLI #8

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

eslavich
Copy link
Contributor

Here's an outline of a stpipe CLI tool:

The base stpipe script asks the user to choose whether to work with pipeline or step commands:

$ stpipe -h
usage: stpipe [-h] [-v] {pipeline,step} ...

stpipe CLI

optional arguments:
  -h, --help       show this help message and exit
  -v, --version    print version information and exit

commands:
  {pipeline,step}
    pipeline       operate on pipelines
    step           operate on steps

Pipeline commands might include list available pipelines, describe a pipeline (various detailed info including a list of steps), and run a pipeline:

$ stpipe pipeline -h
usage: stpipe pipeline [-h] {describe,list,run} ...

operate on pipelines

optional arguments:
  -h, --help           show this help message and exit

command:
  {describe,list,run}
    describe           describe a pipeline
    list               list available pipelines
    run                run a pipeline

The list command outputs each available pipeline class (registered via entry point):

$ stpipe pipeline list -h
usage: stpipe pipeline list [-h] [pattern]

list available pipelines

positional arguments:
  pattern     (optional) restrict pipelines to glob pattern

optional arguments:
  -h, --help  show this help message and exit

The run command is similar to existing strun:

$ stpipe pipeline run -h
usage: stpipe pipeline run [-h] pipeline-class

run a pipeline

positional arguments:
  pipeline-class  pipeline class name

optional arguments:
  -h, --help      show this help message and exit

Similar commands would be implemented within step.

@eslavich eslavich marked this pull request as draft January 11, 2021 22:35
@nden
Copy link
Collaborator

nden commented Jan 12, 2021

I like this interface. I suspect we'll have to alias strun to stpipe pipeline run and deprecate it for a while.

@ddavis-stsci
Copy link
Collaborator

Would it be useful to have a mission parameter at this level? I'm worried that the Roman and JWST pipelines may have similar names and confuse the end user. Otherwise I think this is a great improvement.

@eslavich
Copy link
Contributor Author

Would it be useful to have a mission parameter at this level? I'm worried that the Roman and JWST pipelines may have similar names and confuse the end user.

I was thinking of referring to the pipeline and step classes by their fully-qualified names:

jwst.pipeline.Ami3Pipeline
jwst.pipeline.Coron3Pipeline
jwst.pipeline.DarkPipeline
...
romancal.pipeline.SomeRomanPipeline
romancal.pipeline.SomeOtherRomanPipeline

And if a user wanted to list only romancal pipelines they would pass a pattern to the "list" command:

$ stpipe pipeline list romancal.*

Would it be better to refer to them by class name (without module path)? In which case we would need to break the CLI commands down by mission, or add a --mission flag to each command. One thing I like about the fully-qualified names is that they're compatible with steps that are shared among the missions.

@stscieisenhamer
Copy link
Collaborator

Two questions:

Is there a reason to distinguish between pipeline and step? For JWST, there is no functional difference and are generically all referred to as step. One should simply be able to do stpipe <any-Step-derived-class>.

On running, is there an option to use a local configuration file?

@stscieisenhamer
Copy link
Collaborator

There was mention of deprecating strun: There are a number of other options that are provided. A discussion should be had of which ones to keep/deprecate/rename, etc.

@eslavich
Copy link
Contributor Author

eslavich commented Jan 12, 2021

Is there a reason to distinguish between pipeline and step? For JWST, there is no functional difference and are generically all referred to as step. One should simply be able to do stpipe .

@nden @jdavies-st and everyone else, how do you feel about the current inheritance relationship between Step and Pipeline? I personally feel that it's awkward, but I have limited experience using them. If we think we'll eventually want to break that relationship then maybe we should keep them separate in the CLI, but if we're happy with treating Pipeline as a special case of Step then perhaps it does make sense to merge them together. Although keeping them separate does provide a natural way to list them separately, and in the future there might be other operations that need to behave differently between steps and pipelines.

On running, is there an option to use a local configuration file?

Rumor has it that config files are being retired? Or is that only for the default configs? It strikes me that config files, pars files, and CLI arguments may be too many ways to accomplish the same thing. Would users be content with only the CLI arguments? They could create shell scripts for commonly used configurations. Or alternatively, would it be enough to support only config files from the CLI and drop the single parameter arguments?

@stscieisenhamer
Copy link
Collaborator

Concerning Step vs. Pipeline: Currently pipeline is a step that just runs multiple steps. To break that relationship would be a significant re-architecture. If anything, one may want to consider simply incorporating the multi-step code into step itself, or making the current step a hidden, internal implementation, class and just make everything pipeline. Regarding just the new stpipe command: strun is pipeline vs. step agnostic and, unless there are other reasons, stpipe technically should also be agnostic. However, see more after the config file paragraph.

Regarding the config file: The inclusion of the CRDS component is to provide parameters that are instrument/mode-dependent and in no way replaces or deprecates any other method of parameter setting. About whether there are too many ways of setting parameters, that is definitely something to discuss.

About scope of this issue: What is the goal? As presented, it looks like stpipe is something similar to the crds command: stpipe is a meta-command that has a number of sub-commands to group similar functionality. If so, then discussions of functionality such as pipeline vs. step and should we have config files are separate issues. Or, is the scope of this issue to define, from the ground-up, a new API. If so, then there are a whole slew of discussion that needs to take place. See github issue #2159. Sorry, thought there was a JP issue pointing to this, but cannot find it.

@nden
Copy link
Collaborator

nden commented Jan 12, 2021

The goal of this is to define a CLI interface to stpipe which is more standard than what we have now and is closer to *nix CLIs.
It should also enable us to use stpipe with other packages, romancal as a start. This is not about the Python interface.
Thanks for pointing out #2159. Those should be considered as well and I agree that this all needs a broader discussion.
This is a draft PR to present a visual of the proposed CLI. We will discuss how to proceed in person.

As for step vs pipeline, I had the same initial reaction as @stscieisenhamer. But I wonder if the difference has some significance to end users. For example, pipeline list gives a short and easily parsable list of pipelines while step list is a long list which may not always make sense immediately. In other words, developers know a pipeline is a step but is it necessary for end users to know that or is it more friendly to present them separately.

We need to support the par files in CRDS as a possible input.

@stscirij
Copy link

For step vs pipeline and whether to treat them differently - from an OOP perspective, a pipeline isn't really a kind of step, it's more like a container or steps (has-a rather than is-a relationship) - sort of like HDULists versus HDUs. While they are indeed quite well wedded architecturally, maybe some of that wedding was a bit like forcing square pegs into round holes.

@stscirij
Copy link

When stpipe was developed many years ago, I think much of the impetus came from the belief that users wouldn't want to use Python and would much prefer using command line tools. Fast forward to now, and we have data processing using stpipe while many INS testers use the Python interface to steps and pipelines. Stpipe is indeed convenient to use for doing one-off runs of steps or pipelines. But however we go ahead with a CLI interface, we should make sure that each option has a clear equivalent in Python that will always give the same behaviour.
The CLI presented looks nice and clean, although I have to say that as a user I've never been a fan of having to use a fully qualified class name, especially with inconsistent use of CamelCasing and - vs _.

@eslavich
Copy link
Contributor Author

The CLI presented looks nice and clean, although I have to say that as a user I've never been a fan of having to use a fully qualified class name, especially with inconsistent use of CamelCasing and - vs _.

One option is to list fully-qualified names but allow use of simple class names if there is no ambiguity.

@stscieisenhamer
Copy link
Collaborator

Concerning user-presentation: My limited experience is with INS and since they are often very concerned about individual steps, there is no need for the separation. However, with the general community, there is probably good reason to make a distinction.

@nden
Copy link
Collaborator

nden commented Jan 19, 2021

For reference current strun options are documented in

https://jwst-pipeline.readthedocs.io/en/latest/jwst/stpipe/user_step.html

and include --verbose, --debug, --save-parameters, --disable-crds-steppars.

Also undocumented (at least I didn't find it) but still working is adding the override_<ref_file_type> to the steppar file.

@eslavich
Copy link
Contributor Author

Subcommand options (discussed in 2021-01-25 meeting):

  1. All nested option:
stpipe pipeline run SomePipeline
stpipe pipeline list jwst.*
stpipe pipeline describe SomePipeline
  1. No nested option:
stpipe run SomePipeline
stpipe list --pipelines-only jwst.*
stpipe describe SomePipeline
  1. Hybrid option:
stpipe run SomePipeline
stpipe pipeline list jwst.*
stpipe pipeline describe SomePipeline

@stscieisenhamer
Copy link
Collaborator

In the poll, I voted for option 2. However, to be clear about the choice, this is a vote more to be against the formalization of pipeline vs. step as in command keywords such as pipeline or step and a vote for the switch selection of the meta-classes. This still leaves the option that, when absent of a switch, a default of pipeline-subclasses is listed. Such that the command:

stpipe list jwst.*

operates only on pipeline, but switches such as --all and --step would modify that default.

And note the phrasing meta-class: Such switches, etc., could be directly tied to the actual meta classes, allowing/preferring things such as jwstpipeline or romanpipeline, or whatever meta-classes are exist.

And an unrelated request: For the good of humanity, consider making any class specification case-insensitive.

@eslavich
Copy link
Contributor Author

The Slack poll shows 2. No nested option to be the unanimous favorite, so I pushed an update to the argument parser. Here's the help output now:

Base stpipe command:

$ ./stpipe -h
usage: stpipe [-h] [-V] {describe,list,print-config,run} ...

stpipe CLI

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         print version information and exit

commands:
  {describe,list,print-config,run}
    describe            describe a step or pipeline
    list                list available classes
    print-config        print step or pipeline config to stdout
    run                 run a step or pipeline

List registered steps/pipelines:

$ ./stpipe list -h
usage: stpipe list [-h] [--all] [--steps-only] [<pattern>]

list available classes

positional arguments:
  <pattern>     restrict classes to glob pattern (case-insensitive)

optional arguments:
  -h, --help    show this help message and exit
  --all         list both step and pipeline classes
  --steps-only  list only step classes

examples:
  list available pipeline classes:

    stpipe list

  list pipelines and steps:

    stpipe list --all

  list steps in the jwst package:

    stpipe list --steps-only jwst.*

Describe a single step or pipeline (including parameters):

$ ./stpipe describe -h
usage: stpipe describe [-h] [-v] <class>

describe a step or pipeline

positional arguments:
  <class>        step or pipeline class name (case-insensitive, module path may be omitted for unique class names)

optional arguments:
  -h, --help     show this help message and exit
  -v, --verbose  verbose output

examples:
  print step description and parameters:

    stpipe describe jwst.step.RampFitStep

  increase verbosity to include parameters for nested steps:

    stpipe describe --verbose jwst.pipeline.Detector1Pipeline

Print a config to stdout:

$ ./stpipe print-config -h
usage: stpipe print-config [-h] <class> <input-file>

print step or pipeline config to stdout

positional arguments:
  <class>       step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
  <input-file>  input dataset or association (used to fetch parameters from CRDS)

optional arguments:
  -h, --help    show this help message and exit

examples:
  save a pipeline config to a local file:

    stpipe print-config jwst.pipeline.Detector1Pipeline dataset.fits > config.asdf

Run a step or pipeline:

$ ./stpipe run -h
usage: stpipe run [-h] [--config <path>] [-d] [-p <name>=<value>] [-v] <class> <input-file>

run a step or pipeline

positional arguments:
  <class>            step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
  <input-file>       input dataset or association

optional arguments:
  -h, --help         show this help message and exit
  --config <path>    config file (use 'stpipe print-config' to save and edit the default config)
  -d, --debug        debug logging (DEBUG level)
  -p <name>=<value>  override an individual step or pipeline parameter (use 'stpipe describe' to list available parameters)
  -v, --verbose      verbose logging (INFO level)

examples:
  run a pipeline with default parameters recommended by CRDS:

    stpipe run jwst.pipeline.Detector1Pipeline dataset.fits

  run a pipeline with parameters specified in a local config:

    stpipe run --config config.asdf jwst.pipeline.Detector1Pipeline dataset.fits

  override an individual pipeline parameter:

    stpipe run -p save_calibrated_ramp=true jwst.pipeline.Detector1Pipeline dataset.fits

  override an individual step parameter:

    stpipe run -p jump.rejection_threshold=3.0 jwst.pipeline.Detector1Pipeline dataset.fits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants