Skewed Lognormal Distributions #584

Open
kka1996 opened this issue Aug 10, 2023 · 3 comments

Comments

@kka1996

kka1996 commented Aug 10, 2023

Hi everyone,

I am trying to run a Sobol' analysis on a dataset that is lognormally distributed but skewed. Consequently, I cannot represent the data accurately with the ln-space mean and standard deviation alone.
This is what the data look like in the original feature space:
[image]

This is what the data look like once I apply ln to the x-axis:
[image]

One can clearly see that the data are still skewed. If I simply take the mean and standard deviation of the log-transformed data and sample from the resulting distribution, I do not get good samples. I also tried calculating the mean and standard deviation from the location, scale and shape parameters that I got from scipy.stats.lognorm.fit(). Am I missing something here? It would be great if someone could help.
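For readers following along, here is a minimal sketch of how scipy's lognormal parameters relate to the ln-space mean and standard deviation. It uses synthetic data with a hypothetical offset as a stand-in for the real dataset, which is not available in this thread. The assumption illustrated is that a nonzero location (offset) is one possible reason why ln(x) still looks skewed; the actual cause in the data above may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (hypothetical values): a lognormal
# variable shifted by a constant offset.
offset = 2.0
x = offset + rng.lognormal(mean=1.0, sigma=0.5, size=5000)

# Two-parameter fit: fixing loc at 0 makes scipy's parameters map directly
# onto the ln-space mean (log of scale) and standard deviation (shape).
shape, loc, scale = stats.lognorm.fit(x, floc=0)
mu_ln, sigma_ln = np.log(scale), shape

# ln(x) stays skewed when the data carry an offset; ln(x - offset) does not.
skew_raw = stats.skew(np.log(x))
skew_shifted = stats.skew(np.log(x - offset))

print(f"mu_ln={mu_ln:.3f}, sigma_ln={sigma_ln:.3f}")
print(f"skew of ln(x)={skew_raw:.3f}, skew of ln(x - offset)={skew_shifted:.3f}")
```

If the location is left free in the fit, log(scale) and shape describe ln(x - loc) rather than ln(x), so they cannot be used directly as a two-parameter ln-space mean and standard deviation.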

Best

@ConnectedSystems
Member

Just to check, are you specifying the relevant factors as having a log-normal distribution?

See: https://salib.readthedocs.io/en/latest/user_guide/advanced.html#generating-alternate-distributions

Otherwise, I suggest trying to transform the data to a Gaussian distribution.
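As a rough illustration of the linked documentation, a problem definition with a lognormal factor might look like the sketch below. The factor names and parameter values are hypothetical. The last lines show, using scipy only, what the two-parameter `lognorm` option is understood to amount to for a unit-interval sample; this is an approximation of the idea, not SALib's actual code.

```python
import numpy as np
from scipy import stats

# Hypothetical ln-space parameters for one factor.
mu_ln, sigma_ln = 1.0, 0.5

# SALib-style problem definition (plain dict; SALib itself is not imported).
problem = {
    "num_vars": 2,
    "names": ["x1", "x2"],           # hypothetical factor names
    "bounds": [[mu_ln, sigma_ln],    # lognorm: [ln-space mean, ln-space std]
               [0.0, 1.0]],          # unif: [lower, upper]
    "dists": ["lognorm", "unif"],
}

# Two-parameter lognormal mapping of unit-interval values u:
# x = exp(Normal(mu_ln, sigma_ln) quantile at u).
u = np.linspace(0.01, 0.99, 99)
x1 = np.exp(stats.norm.ppf(u, loc=mu_ln, scale=sigma_ln))
```

Note that this parameterisation has no location term, which is the limitation discussed in the replies that follow.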

@kka1996
Author

kka1996 commented Aug 11, 2023

@ConnectedSystems yes, I went through the referred page and for the given data I calculated the ln-spaced mean and standard deviation. However, for skewed lognormal distributions these two parameters are not sufficient to describe the distribution.

Regarding the transformation: I am not quite sure how this will solve the problem. I also tried to fit a normal distribution to the data in the original space; however, like the lognormal fit, it did not fit well. Thus, I believe there needs to be a way to include the skewness of the distribution in the input parameters. In scipy this is done by giving the three parameters location, scale and shape (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html). However, I do not see any equivalent in the SALib docs.
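One possible workaround, sketched here under stated assumptions rather than taken from the SALib docs, is to sample the factor uniformly on [0, 1] and map the samples through the inverse CDF (`ppf`) of the fitted three-parameter scipy distribution. The parameter values below are hypothetical placeholders for the output of `scipy.stats.lognorm.fit`, and a scrambled Sobol' sequence from `scipy.stats.qmc` stands in for whatever sampler produces the uniform column.

```python
import numpy as np
from scipy import stats
from scipy.stats import qmc

# Hypothetical three-parameter fit, in the order returned by
# scipy.stats.lognorm.fit(data): shape, loc, scale.
shape, loc, scale = 0.5, 2.0, np.exp(1.0)
fitted = stats.lognorm(s=shape, loc=loc, scale=scale)

# Unit-interval samples (1024 points, one dimension). The sequence is
# unseeded, so the exact values differ from run to run.
u = qmc.Sobol(d=1, scramble=True).random_base2(m=10)

# Inverse-CDF mapping: uniform values become draws from the fitted
# three-parameter lognormal, including its location offset.
x = fitted.ppf(u[:, 0])

print(f"min={x.min():.3f}, median={np.median(x):.3f}")
```

In a sensitivity analysis, the same mapping would be applied to the relevant column of the sample matrix before evaluating the model, with that factor declared as uniform on [0, 1].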

@ConnectedSystems
Member

ConnectedSystems commented Aug 11, 2023

Sorry, just to check again - when you say you are trying to conduct a Sobol' sensitivity analysis, is this data set you're referring to the output from the Sobol' sampling method as provided in SALib? Just want to confirm this, as what you've written makes it sound like you already have some data on hand. For clarity, the Sobol' method requires a specific sampling scheme, although SALib does offer given-data approaches such as Delta, PAWN, and HDMR. The new discrepancy analysis method is said to be comparable to Sobol' total order indices as well.

> I tried to fit a normal distribution to the data in the original space as well, however, this (similar to fitting a lognormal distribution) did not give a good fit

I was suggesting that you modify the data set so that it follows a normal distribution.
What comes immediately to mind is to apply a Box-Cox transform to the log-normal data:

https://scikit-learn.org/stable/auto_examples/preprocessing/plot_map_data_to_normal.html
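A minimal sketch of the Box-Cox idea follows. It uses `scipy.stats.boxcox` and `scipy.special.inv_boxcox` in place of the scikit-learn `PowerTransformer` shown on the linked page, and synthetic skewed data (hypothetical values) in place of the real dataset.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)

# Synthetic skewed, strictly positive data (hypothetical stand-in).
x = 2.0 + rng.lognormal(mean=1.0, sigma=0.5, size=5000)

# Box-Cox requires positive input; with no lambda given, scipy estimates
# it by maximum likelihood and returns it alongside the transformed data.
x_bc, lam = stats.boxcox(x)

# The transform is invertible, so values drawn in the transformed
# (approximately normal) space can be mapped back to the original space.
x_back = inv_boxcox(x_bc, lam)

print(f"lambda={lam:.3f}")
print(f"skew before={stats.skew(x):.3f}, after={stats.skew(x_bc):.3f}")
```

The mean and standard deviation of `x_bc` could then describe a normal factor, with `inv_boxcox` applied to the samples before they are passed to the model.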
