Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter out log messages about non-existing pars- reffiles #135

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

jdavies-st
Copy link
Contributor

@jdavies-st jdavies-st commented Jan 22, 2024

This fixes the issue where custom steps added through entry_points check for non-existing CRDS pars- reffiles and log an error message. This PR filters out those log messages. In fact, it filters out all these missing pars- reffile log messages.

Before:

$ pip install snowblind
$ strun snowblind jw01335008001_02101_00001_nrs1_jump.fits
2024-01-22 13:58:54,829 - CRDS - ERROR -  Error determining best reference for 'pars-snowblindstep'  =   Unknown reference type 'pars-snowblindstep'
2024-01-22 13:58:54,829 - stpipe.SnowblindStep - INFO - SnowblindStep instance created.
...

With this PR:

$ strun snowblind jw01335008001_02101_00001_nrs1_jump.fits
2024-01-22 14:00:08,499 - stpipe.SnowblindStep - INFO - SnowblindStep instance created.
...

I've added a test, but because CRDS does non-standard logging via stderr, pytest's known bug in capturing these log messages make the test not work unless you run pytest -s.

Resolves spacetelescope/jwst#5251

And perhaps this is a better solution (and easier to maintain) than spacetelescope/crds#774

@jdavies-st jdavies-st requested a review from a team as a code owner January 22, 2024 13:50
@jdavies-st jdavies-st changed the title Remove pars-reffile CRDS log messages Filter out log messages about non-existing pars- reffiles Jan 22, 2024
Copy link

codecov bot commented Jan 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (aea1617) 59.15% compared to head (f495c20) 59.67%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #135      +/-   ##
==========================================
+ Coverage   59.15%   59.67%   +0.52%     
==========================================
  Files          24       24              
  Lines        1611     1617       +6     
==========================================
+ Hits          953      965      +12     
+ Misses        658      652       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jdavies-st jdavies-st force-pushed the filter-pars-log-errors branch 2 times, most recently from 9e00d36 to bf0dbc4 Compare January 23, 2024 11:44
@nden
Copy link
Collaborator

nden commented Feb 27, 2024

Is this going to filter out also expected errors, i.e. pars- files that the pipeline expects and should be available but were not delivered yet, e.g. new steps?
Is it better to document how to do this in code for those who add external steps?
@stscieisenhamer Thoughts?

@jdavies-st
Copy link
Contributor Author

jdavies-st commented Feb 27, 2024

Yes, this PR silences those log messages. stpipe already handles hard exception that CRDS throws here

stpipe/src/stpipe/step.py

Lines 865 to 875 in bcb98ec

try:
ref_file = crds_client.get_reference_file(
crds_parameters,
reftype,
crds_observatory,
)
except (AttributeError, crds_client.CrdsError):
logger.debug("%s: No parameters found", reftype.upper())
return config_parser.ConfigObj()
if ref_file != "N/A":
logger.info("%s parameters found: %s", reftype.upper(), ref_file)

by issuing its own DEBUG log message, but CRDS itself also throws the ERROR level log message which goes unfiltered. This PR filters it.

The problem is that the pipeline always expect CRDS to have pars- reffiles, even when a step doesn't or can't in the case of steps available outside of ST. crds has to do some special handling not to issue this error (spacetelescope/crds#774), which seems like a big lift when it could just not log that error and allow the library code that uses it to handle the exception how it sees fit.

From a user standpoint, it is exceedingly useless (and misleading) to get an ERROR log message that I can do nothing about. From the developer side, it's also fairly useless. Either there are pars- reffiles for the Step or not. Currently the logging at the INFO and DEBUG level issued by stpipe in the above block tells you everything you need to know about what was or was not found.

From a user standpoint, the difference between a pars- reffile

  1. not able to exist (Step subclass is outside of jwst package)
  2. not existing because CRDS version has not been updated to handle that pars- reffile
  3. not existing because the instrument team has not delivered one

is not very important. The ERROR message routinely confuses users because it does not distinguish between the above 3 cases. Currently, only the last case is handled by the linked PR above in crds.

Unlike standard reference files listed via ref_types class variable, pars- reffiles are just assumed to be valid and exist for every conceivable step, even if there are not parameters unique-to-the-step to set, and there's no way to know which of the 3 cases above applies to a particular step. The ERROR log message is a hammer. Best to filter it, or not issue it at all.

@jdavies-st
Copy link
Contributor Author

Another way to fix this would be to set a private _has_pars class variable which could be set to False for steps that don't or can't have pars- reffiles. And it would short-circuit even checking crds for a pars- reffile.

@stscieisenhamer
Copy link
Collaborator

Hmmm, maybe late-to-party? Sorry....

Starting from the ground-up....

Basic: Requesting a reftype that is non-existent is a base CRDS error. Basically, why the current situation exists.

Next: This has been left this way, mainly as Nadia states. But to further explain: If anything is running in the ground system pipeline that is requesting reftypes, those reftypes should absolutely exist. So having the apparently useless-but-annoying error is useful, because it does need to be noted and taken care of.

However: The point-of-view of an external developer creating new steps and the new point-of-view that there are steps that do not have parameters, or do not otherwise need the CRDS connection, does imply that something more is needed.

Of the options given, there is still hesitancy in hiding the error; though I believe the code is bullet-proof enough to minimize the issue, especially if DEBUG level messages are available.

However, I like the idea of making the "don't look for CRDS pars files" a more explicit condition that is under control of the step. This option I would give the thumbs-up to immediately, modulo there are not other issues with it/does not resolve the current issue.

@jdavies-st
Copy link
Contributor Author

jdavies-st commented Feb 28, 2024

I like the idea of making the "don't look for CRDS pars files" a more explicit condition that is under control of the step. This option I would give the thumbs-up to immediately, modulo there are not other issues with it/does not resolve the current issue.

Thanks for your input @stscieisenhamer. I think that solution is the better one too. I will open a PR that does that, and then close this one.

There's already a method to generally not load CRDS pars- reffiles globally, via an env var or command line option. I think just making that work on a per-Step basis would work.

Thanks!

@hbushouse
Copy link
Collaborator

This one is a tough call. Parameter ref files are a totally different breed of ref file compared to calibration reference files. It's absolutely the case that when step code says I need calibration ref file "x" and a call to find it fails, we definitely need to raise that as an error. But param ref files are completely optional. They aren't needed - just nice to have in some situations. So while it's perhaps worth noting, as information only, that no param ref file is found for a given step or pipeline, it's not an error condition. Only when a user is intentionally trying to use a param ref file and it can't be found or opened, is it considered a real error and the user needs to know that their custom params they were hoping to use were not actually used.

So how do we make code smart enough to figure out the difference? The idea of a global "don't use any param ref files" switch is not very useful - you rarely want to use or not use all param ref files for a single processing run. At the same time I don't think the approach in this PR of filtering out all such messages regarding all param ref files is necessarily a good idea. Again, if the user was assuming or hoping that a particular param ref file would get used and it did not get used, they need to know about that.

So perhaps the alternate approach of somehow trying to change the CRDS code itself to only raise a warning, instead of an error, when param ref files (as opposed to calibration ref files) can't be found is the better way to go?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failures to get pars reference files in prefetch cause log messages
4 participants