-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Postprocessor plugin APIs #1512
Conversation
@alfoa @mandd @PaulTalbot-INL I have changed the PostProcessor Plugin base class inheritance. Please take a look, any comments are appreciated. |
I think this is a good approach. Note we don't have an automatic system for loading the plugins from outside RAVEN into the factories in RAVEN; I started on it but have not been able to finish that yet. |
@ In, paramInput, InputData.ParameterInput, the already parsed input. | ||
@ Out, None | ||
""" | ||
super()._handleInput(paramInput) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only thought from the plugin side is the somewhat confusing usage of the term "input".
In handleInput
it means "handle input XML from the RAVEN input file" whereas in createNewInput
it means "prepare the input for a single postprocessor run". I wonder how (or even if) we could approach that to make it more clear for plugin developers.
framework/Models/PostProcessor.py
Outdated
@@ -175,6 +175,8 @@ def createNewInput(self,myInput,samplerType,**kwargs): | |||
a mandatory key is the sampledVars'that contains a dictionary {'name variable':value} | |||
@ Out, myInput, list, the inputs (list) to start from to generate the new one | |||
""" | |||
if 'createNewInput' in dir(self._pp): | |||
myInput = self._pp.createNewInput(myInput,samplerType,**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we remove the "samplerType" (I know it was present already)? Postprocesessors do not use sampleType or we are thinking here to extend the possibility to have PP behave differently depending on the Sampler that generated the data? (e.g. Sortino -> Sobol, OtherSamplers -> least square of HDMR coefficients, etc,)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering also Paul's comment, I suggest to change the function name "createNewInput(myInput,samplerType,**kwargs)" to something like "processInput(myInput, **kwargs)". The "samplerType" will be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could clarify which input, like "createPostprocessorInput" or "createModelInput"; I feel like the term "input" is the confusing one; or maybe even "prepareModelInput"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to use "createPostprocessorInput" and "samplerType" is removed.
self.raiseAnError(IOError, "Only PointSet is allowed as classifier, but HistorySet", inputObject.name, "is provided!") | ||
requiredKeys = list(self.mapping.keys()) + [self.label] | ||
for inputDict in currentInput: | ||
print(inputDict['type']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
understandable by this pp. | ||
@ In, currentInput, list, a list of DataObjects | ||
@ Out, newInput, list, list of converted data | ||
Method to identify the inputs for classifier and target, respectively |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you expand the description here? What "identify" means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@ In, currentInput, list, a list of DataObjects | ||
@ Out, newInput, list, list of converted data | ||
Method to identify the inputs for classifier and target, respectively | ||
@ In, currentInput, list, a list of dictionaries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list of dictionaries...what is in these? This defines the API (e.g. {'data':array, 'dims':list(dimensions), etc}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this comment, there are few things as confusing as trying to interrogate a dictionary for what should be there. I wonder if we could explicitly list the contents as args and kwargs instead of as a summary input object for the interface objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
return newInput | ||
|
||
def run(self, inputIn): | ||
""" | ||
This method executes the postprocessor action. | ||
@ In, inputIn, list, list of DataObjects | ||
@ In, inputIn, list, list of input dictionaries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please define what those dictionaries look like (what do we have inside?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. dims
, type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -39,6 +39,7 @@ | |||
from .ParetoFrontierPostProcessor import ParetoFrontier | |||
from .MCSimporter import MCSImporter | |||
from .EconomicRatio import EconomicRatio | |||
from .RiskMeasuresDiscrete import RiskMeasuresDiscrete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused about this "double" import (both in the __init__
and Factory.py
).
See my PR about the validation gate, the double import is not needed as long as the imports are in the Factory.py
. This is valid for all the Factories that have been "Restructured" with PR #1480
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good to know, we can do a quick clean up in another PR, since I do not see the double imports get removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes but we can remove it from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the __init__
got cleaned up in previous PR.
# Set default to 'dict', this is consistent with current post-processors | ||
outType = kwargs.get('outType', 'dict') | ||
inputDs.append(inp.asDataset(outType=outType)) | ||
elif isinstance(inp, Database): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all these checks should be moved at the validation stage in the Models.py (I thought we already have it. Doesn't it work?). Databases are not possible for PP and they should be checked for all of them at once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
dataDict['outVars'] = self.getVars('output') | ||
dataDict['numberRealization'] = self.size | ||
dataDict['name'] = self.name | ||
dataDict['metaKeys'] = self.getVars('meta') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am conflicted here (I mean for the usage of this for PP). I do not think we should use dictionaries anymore in PP). The data should be provided as Dataset only. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can provide sufficiently clear examples for developers so that Datasets are easy for them to work with for the things they need to do; otherwise, I worry this could be a big barrier to external development, since the xarray datasets can be difficult until you've played with them a bit, at least for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I leave the option to use either Dict or Dataset. Currently, we only have BasicStatistics to use Dataset. With this option, we can identify the PP that we need to convert to use Dataset, and convert them one by one. Otherwise, we can not merge the PP changes without converting all PPs to use Dataset. In addition, I'm not sure it is worth to convert our PPs to use Dataset if there is no need from our customer.
<Models> | ||
<ExternalModel name='PythonModuleReduced' subType='' ModuleToLoad='modelTHreduced'> | ||
<variables>pump1Time,pump2Time,valveTime,Tmax,outcome,pump1State,pump2State,valveState,failureTime</variables> | ||
</ExternalModel> | ||
<PostProcessor name="riskMeasuresDiscrete" subType="InterfacedPostProcessor"> | ||
<PostProcessor name="riskMeasuresDiscrete" subType="RiskMeasuresDiscrete"> | ||
<method>riskMeasuresDiscrete</method> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method line still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not needed anymore. removed.
@alfoa This is ready for you to review. Several things to mention:
|
@alfoa Any comments for this PR? |
Job Test qsubs on b7266a0 : invalidated by @alfoa |
Approved. Yes the RavenOut should be removed. |
Job Test OpenSUSE Leap 15 on b7266a0 : invalidated by @wangcj05 |
Checklist passed... PR can be merged. |
Pull Request Description
What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)
see #1486
Closes #1451
What are the significant changes in functionality due to this change request?
Standardize the APIs for Postprocessor Plugin.
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
<internalParallel>
to True.raven/tests/framework/user_guide
andraven/docs/workshop
) have been changed, the associated documentation must be reviewed and assured the text matches the example.