
Quantifying the impact of "outliers" on SA results #516

Open · mschrader15 opened this issue Jul 12, 2022 · 4 comments

mschrader15 commented Jul 12, 2022

Hi All,

I am using this issue as a forum question:

Does anyone know of published research that deals with selecting the bounds for a sensitivity analysis, and especially with how model evaluations near the bounds can lead to erroneous results?

My application is traffic simulation. When I generate Sobol sequences for the input parameters using the ranges given in prior papers, I find that certain combinations of parameters near the bounds (one at its highest value, another at its lowest) cause the traffic flow to break down, making those runs very different from the other 99% of simulations and causing them to fail even basic calibration checks.
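To make this concrete, here is a minimal sketch of the sampling setup using SALib's Saltelli sampler; the parameter names and bounds are hypothetical stand-ins for the ranges taken from prior papers:

```python
import numpy as np
from SALib.sample import saltelli

# Hypothetical traffic-simulation parameters; the bounds stand in for
# ranges taken from prior papers.
problem = {
    "num_vars": 2,
    "names": ["reaction_time", "desired_headway"],
    "bounds": [[0.5, 2.0], [1.0, 3.0]],
}

# Saltelli sampling produces N * (2D + 2) parameter combinations,
# including points close to the corners of the hypercube.
X = saltelli.sample(problem, 1024)

# Flag the "breakdown corner": one parameter near its maximum while
# the other is near its minimum.
b = np.array(problem["bounds"])
rel = (X - b[:, 0]) / (b[:, 1] - b[:, 0])  # rescale each column to [0, 1]
corner = (rel[:, 0] > 0.95) & (rel[:, 1] < 0.05)
print(f"{corner.mean():.2%} of samples fall in the breakdown corner")
```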

The impact is that this 1% of simulations produces a long right tail in the overall distribution of my model outputs and causes the Sobol indices to "over-weight" the parameters that caused the breakdown. If I re-run the analysis with the sequence bounds set to more realistic values, the confidence intervals converge faster and the sensitivity indices are different.
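One way I could quantify that over-weighting (a sketch, not an SALib feature; `run_traffic_model` is a stand-in for my actual simulation loop) is to compute the indices once on the raw outputs and once with the right tail clipped at the 99th percentile, then compare:

```python
import numpy as np
from SALib.analyze import sobol

# Y: one model output per row of X above (run_traffic_model is hypothetical).
Y = run_traffic_model(X)

# Clip the long right tail at the 99th percentile and re-analyze.
Y_clipped = np.minimum(Y, np.percentile(Y, 99))

Si_raw = sobol.analyze(problem, Y)
Si_clipped = sobol.analyze(problem, Y_clipped)

# Large gaps between the two sets of total-order indices indicate that
# the 1% of breakdown runs dominate the sensitivity estimates.
for name, st_raw, st_clip in zip(problem["names"], Si_raw["ST"], Si_clipped["ST"]):
    print(f"{name}: ST raw={st_raw:.3f}, clipped={st_clip:.3f}")
```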

mschrader15 (Author) commented

Looks like #262 was a similar question.

willu47 (Member) commented Jul 12, 2022

Hi @mschrader15 - thanks for the question. I am currently looking into a related topic regarding "bounds for the sensitivity analysis" and have found a few papers that show the importance of "properly" assessing the range over which the input parameters can vary - i.e. your uncertainty over the inputs - rather than merely selecting a ±10% range around a central value.

So this is the opposite of the issue you raise, but it is related - in both cases, conducting an exercise to quantify the uncertainty around the inputs should lead you to a good solution. In your case, if there is a real possibility that these parameter combinations which cause erroneous results can occur, then you need to take a closer look at your model to better simulate this eventuality. If the likelihood of the combination is vanishingly small, then you can exclude that region from your analysis.
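For example, one quick way to estimate whether that likelihood is vanishingly small (a sketch under the assumptions of the sample above; `breakdown_threshold` is a hypothetical calibration criterion) is to tag the failed runs and see what fraction of the sample they make up, and where they sit in parameter space:

```python
import numpy as np

# Hypothetical pass/fail criterion: breakdown_threshold stands in for
# whatever output level defines "flow broke down".
failed = Y > breakdown_threshold

print(f"{failed.mean():.2%} of runs fail the calibration check")

# Where do the failures sit within each parameter's range?
b = np.array(problem["bounds"])
rel = (X[failed] - b[:, 0]) / (b[:, 1] - b[:, 0])
for j, name in enumerate(problem["names"]):
    print(f"{name}: failing runs at mean relative position {rel[:, j].mean():.2f}")
```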

mschrader15 (Author) commented

Thanks for the synopsis @willu47! Would you mind sharing links to those papers?

mschrader15 commented Jul 28, 2022

Referencing #315 here as well - same issue.
