Si.to_df() does not work with parameter groups #387

jdherman · 2020-11-18T17:44:34Z

Bug and solution from Mickaël Trochet. Thank you!

Si.to_df() currently works only when the problem defines the names key; it fails when groups is defined.

Proposed fixes below, sent by Mickaël. This looks good and is ready for a PR. It would also be nice to add a unit test covering this issue.

from itertools import combinations

import pandas as pd


def Si_to_pandas_dict(S_dict):
    """Convert Si information into Pandas DataFrame compatible dict.

    Parameters
    ----------
    S_dict : ResultDict
        Sobol sensitivity indices

    See Also
    ----------
    Si_list_to_dict

    Returns
    ----------
    tuple : of total, first, and second order sensitivities.
            Total and first order are dicts.
            Second order sensitivities contain a tuple of parameter name
            combinations for use as the DataFrame index and second order
            sensitivities.
            If no second order indices found, then returns tuple of (None, None)

    Examples
    --------
    >>> X = saltelli.sample(problem, 1000)
    >>> Y = Ishigami.evaluate(X)
    >>> Si = sobol.analyze(problem, Y, print_to_console=True)
    >>> T_Si, first_Si, (idx, second_Si) = sobol.Si_to_pandas_dict(Si)
    """
    problem = S_dict.problem
    total_order = {
        'ST': S_dict['ST'],
        'ST_conf': S_dict['ST_conf']
    }
    first_order = {
        'S1': S_dict['S1'],
        'S1_conf': S_dict['S1_conf']
    }

    idx = None
    second_order = None
    if 'S2' in S_dict:
        if problem['groups'] is not None:
            groups = problem['groups']
            groups_uniq = pd.Series(groups).drop_duplicates().tolist()
            idx = list(combinations(groups_uniq, 2))
            second_order = {
                'S2': [S_dict['S2'][groups_uniq.index(i[0]), groups_uniq.index(i[1])]
                       for i in idx],
                'S2_conf': [S_dict['S2_conf'][groups_uniq.index(i[0]), groups_uniq.index(i[1])]
                            for i in idx]
            }
        else:
            names = problem['names']
            idx = list(combinations(names, 2))
            second_order = {
                'S2': [S_dict['S2'][names.index(i[0]), names.index(i[1])]
                       for i in idx],
                'S2_conf': [S_dict['S2_conf'][names.index(i[0]), names.index(i[1])]
                            for i in idx]
            }

    return total_order, first_order, (idx, second_order)

def to_df(self):
    '''Conversion method to Pandas DataFrame. To be attached to ResultDict.

    Returns
    ========
    List : of Pandas DataFrames in order of Total, First, Second
    '''
    total, first, (idx, second) = Si_to_pandas_dict(self)
    if self.problem['groups'] is not None:
        groups = self.problem['groups']
        groups_uniq = pd.Series(groups).drop_duplicates().tolist()
        ret = [pd.DataFrame(total, index=groups_uniq),
               pd.DataFrame(first, index=groups_uniq)]

        if second:
            ret += [pd.DataFrame(second, index=idx)]

        return ret

    else:
        names = self.problem['names']
        ret = [pd.DataFrame(total, index=names),
               pd.DataFrame(first, index=names)]

        if second:
            ret += [pd.DataFrame(second, index=idx)]

        return ret
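To make the group handling above concrete, here is a small standalone sketch of how the second-order index is built when several parameters share a group. It is an illustration only: the group labels and the 3x3 matrix are made-up values, `second_order_by_group` is a hypothetical helper name, and `dict.fromkeys` stands in for `pd.Series(groups).drop_duplicates()` so that the snippet needs only the standard library.

```python
from itertools import combinations


def second_order_by_group(groups, s2):
    """Pair each unique group combination with its entry in a square S2 matrix.

    Hypothetical helper for illustration; `s2` is a nested list indexed by
    the position of each group in the de-duplicated group list.
    """
    # Order-preserving de-duplication (stand-in for pandas drop_duplicates).
    unique = list(dict.fromkeys(groups))
    # Every unordered pair of distinct groups, in first-seen order.
    idx = list(combinations(unique, 2))
    values = [s2[unique.index(a)][unique.index(b)] for a, b in idx]
    return unique, idx, values


# Four parameters assigned to three groups (made-up example).
groups = ["g1", "g2", "g1", "g3"]
# Made-up upper-triangular second-order values, one row/column per group.
s2 = [
    [None, 0.10, 0.20],
    [None, None, 0.30],
    [None, None, None],
]

unique, idx, values = second_order_by_group(groups, s2)
print(unique)  # ['g1', 'g2', 'g3']
print(idx)     # [('g1', 'g2'), ('g1', 'g3'), ('g2', 'g3')]
print(values)  # [0.1, 0.2, 0.3]
```

With groups defined, the total and first-order DataFrames are indexed by one row per unique group rather than one row per parameter, which is why to_df() passes the de-duplicated list as the index.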


Mickael01 · 2020-11-20T15:06:38Z

Thanks for the feedback and for the credit on GitHub!

You wanted an example, so here it is.

I have added a correction to my previous solution, because it only worked when the problem was loaded via the read_param_file method.

When the problem dict was declared directly in the Python script, it did not work. (It is good to create a small test case!)

The new correction uses "if self.problem.get('groups'):" instead of "if self.problem['groups'] is not None:", which is more consistent with the rest of the package and works properly. When a problem declared in the script has no groups entry, self.problem['groups'] does not exist, so the lookup raises a KeyError and the check fails.
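The difference between the two checks can be seen with plain dictionaries. This is a minimal sketch, assuming the problem is an ordinary Python dict; `has_groups_strict` and `has_groups_safe` are hypothetical names used only for this illustration, and the problem values are made up.

```python
def has_groups_strict(problem):
    # Mirrors `problem['groups'] is not None`: raises KeyError if the key is absent.
    return problem["groups"] is not None


def has_groups_safe(problem):
    # Mirrors `problem.get('groups')`: a missing key, None, or an empty
    # list all count as "no groups".
    return bool(problem.get("groups"))


# Problem declared in a script without a 'groups' key (made-up values).
no_groups = {"num_vars": 2, "names": ["x1", "x2"], "bounds": [[0, 1], [0, 1]]}
# Same problem with groups declared.
with_groups = dict(no_groups, groups=["g1", "g1"])

try:
    has_groups_strict(no_groups)
    strict_failed = False
except KeyError:
    strict_failed = True

print(strict_failed)                 # True
print(has_groups_safe(no_groups))    # False
print(has_groups_safe(with_groups))  # True
```

Both checks agree once the key is present; they differ only when 'groups' is missing from the dict altogether.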

I am attaching the following files to this message:

The new sobol.py (in txt format)
Two input_param.txt files
A mwe.py (in txt format)

Basically, four different problems are declared, and a function is called on each of them that runs saltelli.sample, a dummy evaluation function, sobol.analyze and Si.to_df().

I hope it is done now!
I don't know whether I am supposed to post this here. Sorry if not, I'm new to GitHub!

input_param_withgroup.txt
input_param_withoutgroup.txt
mwe.txt
sobol.txt

Sincerely Mickaël.

ConnectedSystems · 2020-11-22T01:23:39Z

Hi @Mickael01

First and foremost, thank you for this.

This issue was already addressed in the recent development version (as yet not officially released) but only for the Sobol analysis method (see here)

I will extend your proposed tests to cover the other methods.

Could you let me know what version of SALib you're using? If it's a much older version we could release a patch for you.

Mickael01 · 2020-11-23T08:48:48Z

Hi @ConnectedSystems,

Yes, I am using conda and installed SALib version 1.3.11 via conda-forge (my pandas version is 1.1.3).
I installed it from the conda prompt by typing

conda install -c conda-forge salib

By the way, when I followed your link to the recent development version, I navigated through utils_func (here) and found the 'check_group' function but not the 'extract_group_names' function, just so you know.

The more I read Python projects like ours, the more I realize that I need to write more, but smaller, functions!

I have a Python script with two non-standalone functions to plot S1, ST and S2.
I am sharing it; it might be useful, and you might have some insight for me!

mwe2.txt

Sincerely, Mickaël.

ConnectedSystems · 2020-11-23T08:56:37Z

Hi Mickaël

Yes, extract_group_names() can be found in the init file

Python import rules are strange 😉

Making these things more obvious is a long-term goal of mine.

ConnectedSystems · 2021-02-21T13:08:01Z

Tests added for Sobol' and Morris methods (in #392). Other methods will have to wait 'til later.

jdherman added bug good_for_first_issue labels Nov 18, 2020

ConnectedSystems self-assigned this Nov 19, 2020

ConnectedSystems added this to 1.4.x series in SALib Development Roadmap Sep 4, 2021
