Method to locate parameter combinations/ranges that cause model failure with Morris? #262
Hi @TobiasKAndersen, interesting question. There isn't a way to do this within SALib, but you could consider running a classification method like CART in scikit-learn to find the parameter combinations that cause model failure. First you'd just have to make a 0/1 output column indicating failures, then train CART on that. Hopefully the resulting decision tree would give some insight into what parameter combinations are causing problems. If that doesn't work, or doesn't provide useful insight, the other option you could consider is the Delta-MIM method (in SALib), which does not require a specific sampling scheme and can therefore be used with a set of results from which the failures have been removed. Finally, in this issue thread a while ago we talked about replacing missing values with a placeholder so that the sensitivity indices could still be calculated.
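The CART idea above can be sketched roughly as follows. The data here is synthetic for illustration; in practice `X` would be the SALib sample matrix and `failed` would come from your crash flags. The parameter names `p0`–`p2` and the failure rule are made up for the example.

```python
# Train a decision tree on a 0/1 "failed" column so the tree's splits
# reveal which parameter ranges are associated with model failures.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))      # 500 samples, 3 parameters
failed = (X[:, 0] > 0.8) & (X[:, 1] < 0.2)    # pretend these runs crashed
y = failed.astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the learned rules; splits near p0 > 0.8 and p1 < 0.2 mark the
# failure region in this synthetic example.
print(export_text(tree, feature_names=["p0", "p1", "p2"]))
```

A shallow `max_depth` keeps the printed rules readable; the point is interpretability, not predictive accuracy.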
Thanks for all the great advice!
@jdherman, I have now implemented the CART algorithm and it works quite well as an exploratory model crash tool. So thanks for the advice!
Cool. One experiment you could try would be to treat your dataset as the "true" one, after removing the crashed samples -- then randomly remove 2% of the samples and see how much the indices change. Do this a couple times to estimate the average effect of removing 2% of samples. But you can't do it with the crashed samples because you need the index values with and without removal.
Perhaps there is something in your model output files you could use to flag failed runs? That is what I have done for this issue previously. Once you have identified the failed runs, you can boolean index your results dataframe or matrix (which in my case I made to also contain the parameters for that run) to identify what combinations of parameters caused a failure. Subsequently plotting the failed combinations in a scatter matrix was helpful in my case.
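A minimal sketch of the boolean-indexing approach above, using a NaN output as a stand-in for whatever the model's output files indicate on a crash. The column names and the failure condition are invented for the example.

```python
# Flag failed runs, select the parameter combinations that produced
# them, and (optionally) plot those combinations in a scatter matrix.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.uniform(size=(300, 3)), columns=["p0", "p1", "p2"])
# Pretend runs with p0 > 0.9 crashed and produced no output:
df["output"] = np.where(df["p0"] > 0.9, np.nan, df["p0"] + df["p1"])

failed = df["output"].isna()                       # boolean index of failed runs
failed_params = df.loc[failed, ["p0", "p1", "p2"]]
print(len(failed_params), "failed runs")

# Scatter matrix of the failed combinations (needs matplotlib installed):
# pd.plotting.scatter_matrix(failed_params)
```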
Thanks for the comments, @jdherman and @spizwhiz. However, I'm not sure I understand "But you can't do it with the crashed samples because you need the index values with and without removal." I first remove the model crashes, then remove 2%, then another 2%, and so on, and analyze the results. Of course I would need to match the removals in both the X and Y dataframes, but do I need the model crash index for anything else?
Sorry, that wasn't clear. I meant that the "2% removal effect" experiment can't actually use the crashed samples at all, because you don't have the objective function values for those. What you said is what I was thinking. You don't need the model crash index. Also, just to be sure, the 2% resampling would be with replacement every time.
On a related question regarding model failure when running Method of Morris:
I am running a process-based, biogeochemical model with 200+ parameters and get around 5% model failures. I was thinking of trying to locate the parameters and/or parameter combinations responsible, as a way to handle/avoid model failure and thereby be able to go forward with the Method of Morris analysis. Does anyone have a good method to locate parameters causing model failure?
(I hope this question is not out of the scope for this board) :-)