Duplicate values in sample.saltelli #622

Closed
Rishitha-28 opened this issue Jun 12, 2024 · 5 comments

Comments

@Rishitha-28

Hello,
I am conducting a sensitivity analysis on watershed parameters using Sobol's method in hydrological modeling. For this purpose, I have been utilizing the 'SALib.analyze.sobol()' function along with the sample generation functions 'SALib.sample.saltelli()' and 'SALib.sample.sobol()'.

While generating parameter values with these functions, I noticed a significant number of duplicate entries in param_values—approximately one-third of the total Y.size. I attempted to adjust the skip_values parameter, but the issue persists.

Could you please provide insights into why these duplicate values might be occurring? Additionally, how might these duplicates impact the validity and accuracy of the sensitivity analysis?

Thank you in advance!

@tupui
Member

tupui commented Jun 12, 2024

Hi 👋 thanks for reporting.

Do you have an example you could share?

Duplicates along a single axis are normal, because that is how Saltelli sampling is constructed: you have two independent matrices A and B, which are then combined into AB and BA. Each combined matrix is a copy of one matrix with a single column taken from the other, so looking at one axis of the whole sample shows a lot of duplicates. Each sample (row) should still be unique, though.
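A minimal sketch of that construction in plain Python may help make this concrete. It is not SALib's code: it uses pseudo-random numbers instead of a Sobol' sequence and stacks the blocks one after another, and the helper name `saltelli_like` is hypothetical. With three parameters, every row is unique even though each axis only ever holds values from A or B.

```python
import random


def saltelli_like(n, d, second_order=True, seed=0):
    """Stack A, AB_1..AB_d, (optionally BA_1..BA_d), B as a list of row tuples.

    Simplified illustration: pseudo-random base matrices, blocks not interleaved.
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(d)] for _ in range(n)]
    B = [[rng.random() for _ in range(d)] for _ in range(n)]

    rows = [tuple(a) for a in A]
    for i in range(d):
        # AB_i: copy of A with column i taken from B
        rows += [tuple(b[j] if j == i else a[j] for j in range(d))
                 for a, b in zip(A, B)]
    if second_order:
        for i in range(d):
            # BA_i: copy of B with column i taken from A
            rows += [tuple(a[j] if j == i else b[j] for j in range(d))
                     for a, b in zip(A, B)]
    rows += [tuple(b) for b in B]
    return rows


rows = saltelli_like(n=4, d=3)
unique_rows = len(set(rows))
unique_first_axis = len({r[0] for r in rows})
print(len(rows), unique_rows, unique_first_axis)  # 32 32 8
```

Here the 32 rows are all distinct, but the first axis contains only 8 distinct values (4 from A and 4 from B), which is the per-axis duplication described above.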

@Rishitha-28
Author

Hello,

Thank you for the prompt response.
I can show you a basic example of 2 parameters:

image
Output:

length of param_values = 1536
Total number of duplicates = 512

I tried it with a smaller sample size to see repeated values.
Smaller Sample Size:
image
Highlighting a few for your reference :)
image

Thank You!

@tupui
Member

tupui commented Jun 12, 2024

I see. It's because you have just 2 parameters and are asking for second-order indices. We don't prevent people from asking for that, but it doesn't make much sense to try to compute it. Something like x1|(x1,x2) would not really make sense. It just happens to work as a corner case.
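To see where the duplicates come from with two parameters, here is a small plain-Python sketch (not SALib's implementation; pseudo-random numbers stand in for the Sobol' sequence). With only two columns, replacing column 1 of A with B's gives the same row as replacing column 2 of B with A's, so AB_1 equals BA_2 and AB_2 equals BA_1. The base sample size N = 256 is an assumption inferred from the reported 1536 rows, since the original code is only available as a screenshot.

```python
import random

rng = random.Random(1)
n = 256  # assumed base sample size: 1536 rows / (2 * 2 + 2) blocks

A = [(rng.random(), rng.random()) for _ in range(n)]
B = [(rng.random(), rng.random()) for _ in range(n)]

AB1 = [(b[0], a[1]) for a, b in zip(A, B)]  # A with column 1 taken from B
AB2 = [(a[0], b[1]) for a, b in zip(A, B)]  # A with column 2 taken from B
BA1 = [(a[0], b[1]) for a, b in zip(A, B)]  # B with column 1 taken from A
BA2 = [(b[0], a[1]) for a, b in zip(A, B)]  # B with column 2 taken from A

sample = A + AB1 + AB2 + BA1 + BA2 + B
n_duplicates = len(sample) - len(set(sample))
print(len(sample), n_duplicates)  # 1536 512
```

Two of the six blocks repeat two others, so one third of the rows (2N out of 6N) are duplicates, which matches the numbers reported above.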

@ConnectedSystems unless I missed something, maybe we should add a check?

@Rishitha-28
Author

I hadn't realized that the sampling would change when requesting second-order indices.
I've now set duplicates to false, so the issue is resolved.
Thank you very much!
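For readers landing here later: assuming the setting changed above is the sampler's `calc_second_order` argument set to `False`, the BA blocks are no longer generated and the sample has N(D + 2) rows instead of N(2D + 2). A rough plain-Python sketch of that first-order-only layout for two parameters (again with pseudo-random numbers and an assumed N = 256, not SALib's own code):

```python
import random

rng = random.Random(1)
n = 256  # assumed base sample size

A = [(rng.random(), rng.random()) for _ in range(n)]
B = [(rng.random(), rng.random()) for _ in range(n)]

AB1 = [(b[0], a[1]) for a, b in zip(A, B)]  # A with column 1 taken from B
AB2 = [(a[0], b[1]) for a, b in zip(A, B)]  # A with column 2 taken from B

# First-order/total-order layout only: no BA blocks
sample_first_order = A + AB1 + AB2 + B
n_dup_first_order = len(sample_first_order) - len(set(sample_first_order))
print(len(sample_first_order), n_dup_first_order)  # 1024 0
```

Without the BA blocks there is nothing for AB_1 and AB_2 to coincide with, so every row is unique.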

@ConnectedSystems
Member

A check would be good. I'll earmark some time to implement.
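One possible shape for such a check, as a purely hypothetical sketch: the function name, the threshold, and the warning text are all assumptions for illustration and do not reflect what SALib implements.

```python
import warnings


def check_second_order_request(num_vars, calc_second_order):
    """Hypothetical guard: warn when second-order sampling is requested
    for fewer than 3 parameters, where the AB and BA blocks coincide.

    Returns True if the request looks fine, False if a warning was issued.
    """
    if calc_second_order and num_vars < 3:
        warnings.warn(
            "Second-order sampling with fewer than 3 parameters produces "
            "duplicate rows; consider calc_second_order=False.",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True
```

A warning rather than an error would keep the existing corner case working while pointing users at the cause of the duplicates.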
