How best to handle model failure with certain parameter sets? #134
Comments
Hi @LindsayLBE. This is indeed a problem which I encountered working with the Method of Morris. Check Section 3.4.4.1 (p. 64) of my thesis for a brief discussion of this and links to a few sources.
Thanks Will. @LindsayLBE yes it will interfere with the Sobol indices calculations -- it will probably make them return NaN, if I had to guess. My suggested fixes are just hacks, Will might have better options:
Hope that helps,
@jdherman @willu47: I am currently facing a similar problem. I have a (groundwater) model that may or may not converge depending on the parameter combination. The failing combinations are not trivial to identify: because of interaction effects, this cannot be fixed by simply adjusting the max/min of the prior distributions (so hack #1 will not work).

I do have control of what the model outputs for infeasible parameter sets. This can be either NaN or an unrealistic simulated value that will yield a low likelihood (e.g., sim_value=-9999). Although the sim_value=-9999/NaN approach can work for MCMC sampling (parameter estimation), I think it can/will corrupt the whole SA. From your previous comments, it seems that for FAST (and Morris, and Sobol) this may be a show-stopper, as each method relies on a very specific sampling strategy.

Can anyone shed some light on whether the "missing" infeasible samples (parameter sets) will corrupt/invalidate the FAST analysis? Roughly 15% of my samples are infeasible, so for 10,000 samples I get around 8,500 valid samples. FAST will still produce sensitivity indices from the 8,500 samples, but are they reliable?

@jdherman, for hack #2, what exactly do you mean by "replacing the missing values with the mean of the other values"? Could you please provide a minimal example of how to do this? Thanks!
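To make the "I control what the model outputs on failure" point concrete, here is a minimal sketch of a wrapper that keeps one output per sample, in the original sample order, and records NaN where the model fails. The names `evaluate_all` and `toy_model` are hypothetical, and the toy failure rule (a + b > 1.5) is only a stand-in for a non-converging run; nothing here is SALib API.

```python
import math

def evaluate_all(model, samples):
    """Run the model on every sample, preserving order; NaN marks a failed run."""
    outputs = []
    for params in samples:
        try:
            outputs.append(float(model(*params)))
        except (ArithmeticError, ValueError):
            # Assumption: the model signals non-convergence by raising.
            outputs.append(math.nan)
    return outputs

def toy_model(a, b):
    # Hypothetical stand-in: "fails to converge" when a + b > 1.5.
    if a + b > 1.5:
        raise ArithmeticError("did not converge")
    return a * b

samples = [(0.1, 0.2), (0.9, 0.9), (0.5, 0.5), (1.0, 0.8)]
Y = evaluate_all(toy_model, samples)

# Fraction of failed runs, useful for judging how much the analysis is affected.
failed_fraction = sum(math.isnan(y) for y in Y) / len(Y)
```

Keeping the output vector the same length as the sample matrix means the failed positions can later be filled or flagged without disturbing the ordering that the analysis step expects.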
I believe FAST would also have a problem with missing data (like Sobol or Morris), because the samples are generated in a specific order, and the final step of the analysis expects that order to be preserved. Hack 2 would be something like:

    # model output vector Y
    meanY = Y[Y != -9999].mean()
    Y[Y == -9999] = meanY

This will probably lead to an underestimate of the total variance, but at least would preserve the mean. And it would allow the rest of the analysis to continue.

This is an important question and I don't know of any official way around it. So if anyone reading this has an idea, please let us know!
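The effect of the mean-fill hack on the variance can be checked on a small example. The following is a sketch using only the standard library (plain lists instead of a NumPy array); the helper names `is_failed` and `fill_failures_with_mean` are made up for illustration, and it treats both NaN and the -9999 sentinel as failures.

```python
import math
import statistics

SENTINEL = -9999.0  # failure marker used earlier in this thread

def is_failed(y):
    """A run counts as failed if it returned NaN or the sentinel value."""
    return math.isnan(y) or y == SENTINEL

def fill_failures_with_mean(outputs):
    """Replace failed outputs with the mean of the successful ones, keeping order."""
    valid = [y for y in outputs if not is_failed(y)]
    if not valid:
        raise ValueError("no successful model runs to average")
    mean_valid = statistics.fmean(valid)
    filled = [mean_valid if is_failed(y) else y for y in outputs]
    return filled, valid

Y = [1.0, 3.0, SENTINEL, 5.0, float("nan"), 7.0]
Y_filled, Y_valid = fill_failures_with_mean(Y)

# The mean is preserved, but the filled vector has a smaller variance.
var_valid = statistics.pvariance(Y_valid)
var_filled = statistics.pvariance(Y_filled)
```

In this toy case the mean stays the same while the variance of the filled vector is smaller than that of the successful runs alone, which is the underestimate mentioned above.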
Here is another idea suggested by @dmey in #206. This is probably better than filling missing values with the mean:

"""
A different approach may be to fill missing values due to model failures by interpolating the values from successes originating from samples close to those that lead the model to fail. In other words, given a model g(u,w) where u and w are model parameters, and Y is the model-output vector, we could say that at the failure instance g', corresponding to model-output value Y', value Y' can be generated by interpolating the results from those parameters that are in closest proximity to those used when the model failed. In effect, you generate missing values by weighting the results from the parameter sets closest in parameter space to those leading the model to fail.
"""
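One possible reading of this interpolation idea, as a rough sketch: fill each failed output with an inverse-distance-weighted average of its k nearest successful samples. The choice of inverse-distance weighting, the function name `fill_failures_by_neighbors`, and the parameters `k` and `eps` are assumptions for illustration, not something specified in #206. Parameters should probably be rescaled to comparable ranges (e.g. [0, 1]) before computing distances.

```python
import math

def fill_failures_by_neighbors(X, Y, k=3, eps=1e-12):
    """Fill NaN entries of Y from the k nearest successful samples in X.

    X: list of parameter tuples, one per sample (same order as Y).
    Y: list of model outputs, with NaN marking failed runs.
    """
    ok = [i for i, y in enumerate(Y) if not math.isnan(y)]
    if not ok:
        raise ValueError("no successful model runs to interpolate from")
    filled = list(Y)
    for i, y in enumerate(Y):
        if not math.isnan(y):
            continue
        # k successful samples closest to the failed one (Euclidean distance)
        nearest = sorted(ok, key=lambda j: math.dist(X[i], X[j]))[:k]
        # closer neighbors get larger weights; eps avoids division by zero
        weights = [1.0 / (math.dist(X[i], X[j]) + eps) for j in nearest]
        filled[i] = sum(w * Y[j] for w, j in zip(weights, nearest)) / sum(weights)
    return filled

# Four successful corner samples and one failed sample at the center.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
Y = [0.0, 1.0, 1.0, 2.0, math.nan]
Y_filled = fill_failures_by_neighbors(X, Y, k=4)
```

Here the failed sample is equidistant from all four successes, so its filled value is their plain average. Like the mean fill, this keeps the output vector aligned with the sample order, but the filled values are still estimates and will affect the resulting indices.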
Reopening this as an issue to be resolved (or at least better dealt with) in v2.0.
Hello
It seems your question is unrelated to this issue. Could you open a new one? I will try to assist you there.
Thanks, I want to learn how to use SALib.
I'm wondering how best to handle cases where some of the Saltelli-generated input parameter sets cause model failure, resulting in no output. How will this interfere with the calculation of sensitivity indices using the Sobol method?