
Method to locate parameter combinations/ranges that cause model failure with Morris? #262

Closed · TobiasKAndersen opened this issue Sep 5, 2019 · 7 comments


@TobiasKAndersen (Contributor)

On a related question regarding model failure when running the Method of Morris:
I am running a process-based, biogeochemical model with 200+ parameters and get around 5% model failures. I was thinking of trying to locate the parameters and/or parameter combinations that cause the failures, so that I can handle or avoid them and go forward with the Method of Morris analysis. Does anyone have a good method for locating the parameters causing model failure?

(I hope this question is not out of scope for this board) :-)

@jdherman (Member) commented Sep 5, 2019

Hi @TobiasKAndersen , interesting question. There isn't a way to do this within SALib, but you could consider running a classification method like CART in scikit-learn to find the parameter combinations that cause model failure. First you'd just have to make a 0/1 output column indicating failures, then train CART on that. Hopefully the resulting decision tree would give some insight into which parameter combinations are causing problems.
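A minimal sketch of that idea, assuming a sample matrix X and a 0/1 crash flag y (the data below is synthetic, just so the example runs):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-ins: X plays the role of the (N x D) Morris sample,
# y marks which of the N runs crashed (1 = crash).
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 5))
y = ((X[:, 0] > 0.8) & (X[:, 2] < 0.2)).astype(int)

names = [f"p{i}" for i in range(X.shape[1])]

# A shallow tree keeps the crash "rules" readable.
clf = DecisionTreeClassifier(max_depth=3, class_weight="balanced", random_state=0)
clf.fit(X, y)
print(export_text(clf, feature_names=names))
```

The printed splits (e.g. "p0 > 0.8 and p2 <= 0.2") are the candidate parameter combinations associated with failure.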

In case that doesn't work, or doesn't provide useful insight ... the other option you could consider is to use the Delta-MIM method (in SALib) which does not require a specific sampling scheme, and therefore can be used with a set of results where the failures have been removed.
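For reference, a hedged sketch of the Delta-MIM route in SALib; the problem definition and data below are placeholders, and in practice X and Y would be your samples and outputs with the crashed rows dropped:

```python
import numpy as np
from SALib.analyze import delta

problem = {
    "num_vars": 3,
    "names": ["p1", "p2", "p3"],
    "bounds": [[0.0, 1.0]] * 3,
}

# Placeholder inputs/outputs; NaN marks a crashed run.
rng = np.random.default_rng(1)
X = rng.uniform(size=(1000, 3))
Y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.05, size=1000)
Y[rng.random(1000) < 0.02] = np.nan

# Drop the failed runs; delta does not need the Morris trajectory structure.
ok = np.isfinite(Y)
Si = delta.analyze(problem, X[ok], Y[ok], print_to_console=True)
```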

Finally, in this issue thread a while ago we talked about replacing missing values with a placeholder so that the sensitivity indices could still be calculated: #134
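One hedged way to do the placeholder fill (the output vector Y and the choice of sentinel value are assumptions, not anything prescribed in #134):

```python
import numpy as np

# Placeholder outputs; NaN marks a crashed run.
Y = np.array([0.2, np.nan, 0.5, 0.9, np.nan, 0.7])

# Fill crashed runs with the worst observed output so the Morris design stays intact.
worst = np.max(Y[np.isfinite(Y)])
Y_filled = np.where(np.isfinite(Y), Y, worst)
```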

jdherman closed this as completed Sep 5, 2019
@TobiasKAndersen (Contributor, Author)

Thanks for all the great advice!
I'll begin immediately looking into CART ;-)

@TobiasKAndersen (Contributor, Author)

@jdherman , I have now implemented the CART algorithm and it works quite well as an exploratory model-crash tool. So thanks for the advice!
When working with the Delta-MIM method, I am removing approx. 2% of the samples (as they result in crashes). Do you know of any test to check whether this could influence the SA results?

@jdherman (Member) commented Oct 7, 2019

Cool. One experiment you could try would be to treat your dataset as the "true" one, after removing the crashed samples -- then randomly remove 2% of the samples and see how much the indices change.

Do this a couple times to estimate the average effect of removing 2% of samples. But you can't do it with the crashed samples because you need the index values with and without removal.
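A rough sketch of that experiment, reusing the same kind of placeholder problem/X/Y as in the Delta-MIM snippet above (the helper function name is made up for illustration):

```python
import numpy as np
from SALib.analyze import delta

problem = {"num_vars": 3, "names": ["p1", "p2", "p3"], "bounds": [[0.0, 1.0]] * 3}

# Placeholder data standing in for the crash-free samples and outputs.
rng = np.random.default_rng(2)
X = rng.uniform(size=(1000, 3))
Y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.05, size=1000)

def delta_after_dropping(frac, seed):
    """Recompute the delta indices after randomly dropping `frac` of the rows."""
    keep = np.random.default_rng(seed).random(len(Y)) >= frac
    return np.asarray(delta.analyze(problem, X[keep], Y[keep])["delta"])

baseline = np.asarray(delta.analyze(problem, X, Y)["delta"])
trials = np.array([delta_after_dropping(0.02, s) for s in range(10)])
print("max |change| per parameter:", np.abs(trials - baseline).max(axis=0))
```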

@spizwhiz commented Oct 7, 2019

Perhaps there is something in your model output files you could use to flag failed runs? That is what I have done for this issue previously. Once you have identified the failed runs, you can boolean-index your results dataframe or matrix (which in my case also contained the parameters for each run) to identify which combinations of parameters caused a failure. Subsequently plotting the failed combinations in a scatter matrix was helpful in my case.
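A sketch of that workflow; the file name and column names below are placeholders for whatever your model actually writes out:

```python
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# One row per run, containing both the sampled parameters and the outputs.
df = pd.read_csv("results.csv")

# Flag failures from the output itself, e.g. a missing or non-numeric objective.
failed = pd.to_numeric(df["objective"], errors="coerce").isna()

# Boolean-index the parameter columns and plot the failed combinations.
param_cols = ["p1", "p2", "p3"]
scatter_matrix(df.loc[failed, param_cols], diagonal="hist", figsize=(8, 8))
plt.suptitle("Parameter combinations of failed runs")
plt.show()
```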

@TobiasKAndersen (Contributor, Author) commented Oct 8, 2019

Thanks for the comments, @jdherman and @spizwhiz.
I will look at the effect of repeatedly removing a random 2% of the model simulations, and continue trying to identify the causes of the model crashes.

Though, I am not sure I understand "But you can't do it with the crashed samples because you need the index values with and without removal.". I first remove the model crashes, then remove 2%, then another 2%, ... and analyze the results. Of course I would need to match the removals in both the X and Y dataframes, but do I need the model crash index for anything else?

@jdherman (Member) commented Oct 8, 2019

Sorry, that wasn't clear. I meant that the "2% removal effect" experiment can't actually use the crashed samples at all, because you don't have the objective function values for those.

What you said is what I was thinking. You don't need the model crash index. Also just to be sure, the 2% resampling would be with replacement every time.
