Si.to_df() does not work with parameter groups #387

Open
jdherman opened this issue Nov 18, 2020 · 5 comments

@jdherman
Member

Bug report and solution from Mickaël Trochet. Thank you!

The function Si.to_df() currently works only when the problem is defined with the names key alone; it fails when groups is defined.

Proposed fixes below, sent by Mickaël. This looks good and is ready for a PR. It could also be nice to add a unit test about this issue.

import pandas as pd
from itertools import combinations


def Si_to_pandas_dict(S_dict):
    """Convert Si information into Pandas DataFrame compatible dict.

    Parameters
    ----------
    S_dict : ResultDict
        Sobol sensitivity indices

    See Also
    --------
    Si_list_to_dict

    Returns
    -------
    tuple : of total, first, and second order sensitivities.
            Total and first order are dicts.
            The second order entry is a tuple of (idx, second_order): the
            parameter (or group) name combinations for use as the DataFrame
            index, and a dict of the second order sensitivities.
            If no second order indices are found, that entry is (None, None).

    Examples
    --------
    >>> X = saltelli.sample(problem, 1000)
    >>> Y = Ishigami.evaluate(X)
    >>> Si = sobol.analyze(problem, Y, print_to_console=True)
    >>> T_Si, first_Si, (idx, second_Si) = sobol.Si_to_pandas_dict(Si)
    """
    problem = S_dict.problem
    total_order = {
        'ST': S_dict['ST'],
        'ST_conf': S_dict['ST_conf']
    }
    first_order = {
        'S1': S_dict['S1'],
        'S1_conf': S_dict['S1_conf']
    }

    idx = None
    second_order = None
    if 'S2' in S_dict:
        if problem['groups'] is not None:
            groups = problem['groups']
            groups_uniq = pd.Series(groups).drop_duplicates().tolist()
            idx = list(combinations(groups_uniq, 2))
            second_order = {
                'S2': [S_dict['S2'][groups_uniq.index(i[0]), groups_uniq.index(i[1])]
                       for i in idx],
                'S2_conf': [S_dict['S2_conf'][groups_uniq.index(i[0]), groups_uniq.index(i[1])]
                            for i in idx]
            }
        else:
            names = problem['names']
            idx = list(combinations(names, 2))
            second_order = {
                'S2': [S_dict['S2'][names.index(i[0]), names.index(i[1])]
                       for i in idx],
                'S2_conf': [S_dict['S2_conf'][names.index(i[0]), names.index(i[1])]
                            for i in idx]
            }

    return total_order, first_order, (idx, second_order)

def to_df(self):
    '''Conversion method to Pandas DataFrame. To be attached to ResultDict.

    Returns
    -------
    list : of Pandas DataFrames in order of Total, First, Second
    '''
    total, first, (idx, second) = Si_to_pandas_dict(self)
    if self.problem['groups'] is not None:
        groups = self.problem['groups']
        groups_uniq = pd.Series(groups).drop_duplicates().tolist()
        ret = [pd.DataFrame(total, index=groups_uniq),
               pd.DataFrame(first, index=groups_uniq)]

        if second:
            ret += [pd.DataFrame(second, index=idx)]

        return ret

    else:
        names = self.problem['names']
        ret = [pd.DataFrame(total, index=names),
               pd.DataFrame(first, index=names)]

        if second:
            ret += [pd.DataFrame(second, index=idx)]

        return ret
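
To make the group handling above easier to follow, here is a minimal standalone sketch of the indexing step only. It is not SALib code: the helper names unique_in_order and second_order_pairs are hypothetical, dict.fromkeys stands in for pd.Series(groups).drop_duplicates(), a nested list stands in for the S2 array, and the numbers are made up for illustration.

```python
from itertools import combinations


def unique_in_order(labels):
    """Return the distinct labels in first-seen order (hypothetical helper)."""
    return list(dict.fromkeys(labels))


def second_order_pairs(labels, s2):
    """Map each pair of distinct labels to its entry in the matrix s2.

    Row and column k of s2 are assumed to belong to the k-th unique label,
    which is how the proposed fix indexes S2 when groups are defined.
    """
    uniq = unique_in_order(labels)
    position = {label: k for k, label in enumerate(uniq)}
    return {
        (a, b): s2[position[a]][position[b]]
        for a, b in combinations(uniq, 2)
    }


# Four parameters collapsed into three groups (illustrative values only).
groups = ["g1", "g1", "g2", "g3"]
s2 = [
    [None, 0.10, 0.20],
    [None, None, 0.30],
    [None, None, None],
]
pairs = second_order_pairs(groups, s2)
print(pairs)
```

The point of the sketch is that the index must be built from the unique group labels rather than from names, otherwise the index length does not match the number of rows in the result.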
@Mickael01

Thanks for the feedback and credit on your GitHub!

You wanted an example, so here it is.

I have added a correction to my previous solution, because it only worked when the problem was declared via the read_param_file method.

When the problem dict was declared directly in the Python script, it did not work. (It is good to create a small test case!)

The new correction uses "if self.problem.get('groups'):" instead of "if self.problem['groups'] is not None:", which is more consistent with the rest of the package and works properly. When groups is not defined in a problem declared in the script, the key self.problem['groups'] does not exist, so the lookup raises a KeyError and the check fails.

I have attached the following files to this message:
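
A small sketch of the difference between the two checks, using plain dicts in place of a SALib problem. The helper has_groups is hypothetical, and the from_file dict assumes that a problem loaded from a parameter file carries groups=None when no groups are given.

```python
def has_groups(problem):
    """True only when 'groups' is present and non-empty (hypothetical helper)."""
    return bool(problem.get("groups"))


# Problem declared in a script, with no 'groups' key at all.
in_script = {"num_vars": 2, "names": ["x1", "x2"], "bounds": [[0, 1], [0, 1]]}
# Assumed shape of a problem read from a file without groups.
from_file = dict(in_script, groups=None)
# Problem with groups defined.
grouped = dict(in_script, groups=["g1", "g1"])

# The original check fails on the in-script problem: the key is missing.
try:
    _ = in_script["groups"] is not None
    raised = False
except KeyError:
    raised = True

print(raised, has_groups(in_script), has_groups(from_file), has_groups(grouped))
```

With .get() a missing key and an explicit None are treated the same way, so both ways of declaring the problem take the names branch.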

  • The new sobol.py (in txt format)
  • Two input_param.txt
  • A mwe.py (in txt format)

Basically, four different problems are declared, and a function is called on each of them that runs saltelli.sample, a dummy evaluation function, sobol.analyze and Si.to_df().

I hope it is done now!
I don't know if I am supposed to post this here. Sorry if not; I'm new to GitHub!

input_param_withgroup.txt
input_param_withoutgroup.txt
mwe.txt
sobol.txt

Sincerely Mickaël.

@ConnectedSystems
Member

Hi @Mickael01

First and foremost, thank you for this.

This issue was already addressed in the recent development version (as yet not officially released) but only for the Sobol analysis method (see here)

I will extend your proposed tests to cover the other methods.

Could you let me know what version of SALib you're using? If it's a much older version we could release a patch for you.

@Mickael01

Hi @ConnectedSystems,

Yes, I am using conda and installed SALib version 1.3.11 via conda-forge (my pandas version is 1.1.3).
I installed it from the conda prompt by typing:

conda install -c conda-forge salib

By the way, when I followed your link to the recent development version, I navigated through utils_func here and found the 'check_group' function but not the 'extract_group_names' function, just so you know.

The more I read Python projects like yours, the more I realize that I need to write more, but smaller, functions than I do now!

I have a Python script with two non-standalone functions to plot S1, ST and S2.
I am sharing it; it might be useful, and you might have some insights for me!

mwe2.txt

Sincerely, Mickaël.

@ConnectedSystems
Member

Hi Mickaël

Yes, extract_group_names() can be found in the init file

Python import rules are strange 😉

Making these things more obvious is a long-term goal of mine.

@ConnectedSystems
Member

Tests added for Sobol' and Morris methods (in #392). Other methods will have to wait 'til later.
