
ProblemSpec.sample() causes permanent message #555

Closed · morrowcj opened this issue Mar 17, 2023 · 14 comments · Fixed by #556

Comments

@morrowcj commented Mar 17, 2023

I'm just starting to get into SALib and I'm drawn to the ProblemSpec class, which provides easy access to all the main features of the library. However, once the sample() method has been called, a status message is re-printed after every subsequent statement in the session.

I'm using the stable version (via pip install SALib), so this may not be an issue in the GitHub version.

Here's some reproducible code (interactive session), with output in comment blocks:

import numpy as np
from SALib import ProblemSpec
from SALib.sample import sobol

prob = ProblemSpec({
        "names": ["x1", "x2", "x3"],
        "groups": None,
        "bounds": [[-np.pi, np.pi]] * 3,
        "outputs": ["Y"],
})

prob.sample(func=sobol.sample, N=1024)
# {'names': ['x1', 'x2', 'x3'], 'groups': None, 'bounds': [[-3.141592653589793, 3.141592653589793], [-3.141592653589793, 3.141592653589793], [-3.141592653589793, 3.141592653589793]], 'outputs': ['response'], 'num_vars': 3, 'sample_scaled': True}
# Samples:
# -- 3 parameters: ['x1', 'x2', 'x3']
# -- 8192 evaluations 

prob.samples
# array([[ 0.27913238, -2.16263456, -1.65803756],
#        [-1.82017621, -2.16263456, -1.65803756],
#        [ 0.27913238,  3.13334766, -1.65803756],
#        ...,
#        [-3.0968241 ,  2.8720456 ,  1.50480474],
#        [-3.0968241 ,  1.73726994,  1.42794031],
#        [-3.0968241 ,  1.73726994,  1.50480474]])
# Samples:
# -- 3 parameters: ['x1', 'x2', 'x3']
# -- 8192 evaluations 

print(7)  # completely unrelated
# 7
# Samples:
# -- 3 parameters: ['x1', 'x2', 'x3']
# -- 8192 evaluations 

Is this normal? I don't get the same issue if I run sobol.sample(prob, 1024) instead.


Looking through the source code, it appears that this is caused by the __str__ method, which prints its output instead of returning a string.
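
To illustrate the suspected anti-pattern, here is a minimal sketch (BuggySpec and FixedSpec are hypothetical classes, not SALib's actual source). A __str__ method should build and return a string with no side effects, because interactive consoles may call str() on objects at unexpected times (echoing results, inspecting variables, etc.):

class BuggySpec:
    def __str__(self):
        # Side effect: this escapes to stdout every time any console
        # machinery stringifies the object.
        print("Samples:\n-- 3 parameters: ['x1', 'x2', 'x3']")
        return ""  # satisfies Python's "must return str" requirement

class FixedSpec:
    def __str__(self):
        # No side effects: the caller decides whether to display it.
        return "Samples:\n-- 3 parameters: ['x1', 'x2', 'x3']"

print(FixedSpec())  # prints once, only when the caller asks for it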

@ConnectedSystems (Member)

Hi, thanks for reporting this, it is certainly strange!

Could I ask:

1. Which Python distribution (Anaconda?) and version?
2. Is this in a Jupyter notebook, the terminal/REPL, or something else?
3. Which IDE/editor?
4. Which OS (Windows?)

Thank you

@ConnectedSystems (Member)

I wonder also, does the issue disappear if you use the below instead?

prob.sample_sobol(1024)

@morrowcj (Author) commented Mar 17, 2023

1. I am not using the Anaconda version of Python, no. Here's my version info:

   $ python --version
   Python 3.10.5

2. It is not a Jupyter notebook either. I am using the PyCharm IDE with a Python interpreter in a virtual environment, and the code was run in PyCharm's Python Console. Interestingly, I just ran the same code from Python in a Git Bash terminal and did not experience the same error.

3. Yes, my OS is Windows 10.

4. Yes, using prob.sample_sobol(1024) (or any of the methods added with _add_samplers) still causes the error in the original environment (see the sketch below).
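
For context, a hypothetical sketch of the dynamic-attachment pattern that _add_samplers suggests (MiniSpec, _run_sampler, and dummy_sampler are made-up names; this is not SALib's implementation). The key point is that each generated sample_* method returns the spec object to allow chaining, so an interactive console echoes that object and calls __str__, which would explain why every sampler method triggers the same message:

class MiniSpec(dict):
    # Toy stand-in for ProblemSpec, for illustration only.
    def _run_sampler(self, sampler, *args, **kwargs):
        self["samples"] = sampler(self, *args, **kwargs)
        return self  # chaining: the console echoes this, invoking __str__

def _add_samplers(cls, samplers):
    # Attach one sample_<name> convenience method per sampler function.
    for name, func in samplers.items():
        def method(self, *args, _func=func, **kwargs):
            return self._run_sampler(_func, *args, **kwargs)
        setattr(cls, f"sample_{name}", method)

def dummy_sampler(spec, n):
    return [[0.0, 0.0, 0.0] for _ in range(n)]

_add_samplers(MiniSpec, {"dummy": dummy_sampler})
spec = MiniSpec()
spec.sample_dummy(4)  # in a REPL, the echoed return value is `spec` itself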

@ConnectedSystems (Member)

Okay, I think that gives me enough leads to look into, thanks again for reporting!

@morrowcj (Author)

No problem. Can I ask as well: what is the status of evaluate_parallel and analyze_parallel? I'm getting warnings that they are experimental, and the evaluation didn't run in multiple processes. Do I need to do something to set them up properly?

@ConnectedSystems (Member) commented Mar 17, 2023

Hmm, evaluate_parallel is fairly stable in my experience.

Did you install the additional required packages for parallel evaluation to work? (I guess so because an error message should display if you didn't, but want to confirm)

Are you specifying the number of cores to use with the nprocs argument?

Also note that you'll typically only see an improvement with parallel analysis in cases where you have multiple outputs.
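
For a concrete picture of the intended usage, here's a sketch (assuming the chaining interface from your reproducer; the model below is the classic Ishigami function implied by the [-pi, pi] bounds, and the method names follow the dynamically added sample_*/analyze_* convention):

import numpy as np
from SALib import ProblemSpec

def model(X):
    # X has shape (N, 3); return one value per row for the single output "Y"
    return (np.sin(X[:, 0]) + 7 * np.sin(X[:, 1]) ** 2
            + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0]))

sp = ProblemSpec({
    "names": ["x1", "x2", "x3"],
    "bounds": [[-np.pi, np.pi]] * 3,
    "outputs": ["Y"],
})

if __name__ == "__main__":  # guard needed for multiprocessing on Windows
    sp.sample_sobol(2**14)        # large N so worker processes are visible
    sp.evaluate(model, nprocs=4)  # nprocs routes through evaluate_parallel
    sp.analyze_sobol()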

@morrowcj (Author) commented Mar 17, 2023

Yeah, it isn't working on my work machine. Are there additional requirements beyond numpy, scipy, pandas, and matplotlib? I'll try it out on my personal computer, which I know better, and report back. Should I submit a separate issue if I can't get it working?

Is evaluate with the nprocs argument simply an alias for evaluate_parallel?

Yes, I plan to use this on a large model with multiple outputs, and I expect to need parallelization/distribution. So, if I can get the native version working, it'll save me from having to write my own.

@ConnectedSystems (Member) commented Mar 17, 2023

Yes, please do open a separate issue. I'm not at my computer at the moment, but there should be instructions somewhere on how to install the additional set of packages.

And yes, it is an alias to evaluate_parallel.

Thanks again for reporting these; I'll look into it as soon as I can.

@ConnectedSystems (Member)

Hi @morrowcj

Could you try the version with the potential fix, installable with the command below?

pip install git+https://github.com/SALib/SALib.git@problem_spec-output-fix-555

I updated __str__ to return the intended output rather than printing it, as you suggested.
I suspect the PyCharm console is rendering both the printed output and the return value, and caching the output (for whatever reason), which is how you're getting the repeated stale messages.

Regarding the parallel evaluation, are you trying it out using the example you gave above? I can confirm it's working for me.
I'm also guessing you're using the Task Manager to see if it is working?

If so, it's likely that it's finishing so fast that you never see the second Python process spin up in the task manager.
Try a much larger sample (e.g., 2**13 or 2**14).

I would also advise trying Resource Monitor instead of Task Manager to track CPU/memory usage for performance investigations (if I remember correctly, it has a higher polling rate than Task Manager and shows more information).

@ConnectedSystems (Member)

FYI @morrowcj, I just added some further documentation on writing wrappers for use with SALib, which might be helpful for getting things working with the parallel evaluator. I'd appreciate any feedback you may have.

https://salib.readthedocs.io/en/latest/user_guide/wrappers.html
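
As a taste of what that guide covers, a minimal wrapper sketch (expensive_model is a hypothetical stand-in for a real simulation; the array shapes are the important part):

import numpy as np

def expensive_model(x1, x2, x3):
    # stand-in for a real simulation that takes scalar inputs
    return x1 + x2 * x3

def wrapped_model(X):
    # X is the (N, D) array of samples that evaluate() passes in;
    # return an (N,)-shaped array matching the single declared output
    return np.array([expensive_model(*row) for row in X])

# usage: sp.evaluate(wrapped_model)  # add nprocs=... for the parallel path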

@morrowcj (Author)

pip install git+https://github.com/SALib/SALib.git@problem_spec-output-fix-555

This worked to remove the messages (though I had to manually uninstall the old version first). I also confirmed that calling print(prob) now gives the string as expected.

@morrowcj (Author) commented Mar 20, 2023

@ConnectedSystems, Resource Monitor confirmed that the parallel evaluator works as expected when run in a terminal. However, I have been unsuccessful in getting it to run in PyCharm. I'll start a new issue for that.


Edit: never mind, this is a PyCharm issue. I can't run any code that uses the multiprocessing package.

@morrowcj (Author)

https://salib.readthedocs.io/en/latest/user_guide/wrappers.html

@ConnectedSystems, this is great, and exactly what I was planning for my work. Thank you for writing this up - it is very helpful and practical.

@ConnectedSystems (Member)

Thanks again for reporting, @morrowcj. I'll create a separate issue re PyCharm and multiprocessing.
