run_multinode_tests_in_docker.sh :FileNotFoundError: [Errno 2] No such file or directory: 'python3.6' #1627

Closed
fco-dv opened this issue Feb 9, 2021 · 3 comments · Fixed by #1631

@fco-dv
Contributor

fco-dv commented Feb 9, 2021

🐛 Bug description

Running bash ./tests/run_multinode_tests_in_docker.sh fails with:

FileNotFoundError: [Errno 2] No such file or directory: 'python3.6' 

This script is not part of the CI, but it is useful for running the distributed tests on a local machine.
The idea is to make it compatible with pytorch/pytorch:latest, possibly build an image inline with the test execution environment pre-installed, and accept nnodes, nproc_per_node, and gpu as script arguments.
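
The script arguments mentioned above could be handled roughly as follows. This is only a sketch: the helper name parse_args, the positional order, and the default values are assumptions for illustration, not what the script does today.

```shell
#!/usr/bin/env bash
# Hypothetical argument handling for the test script.
# Positional order and defaults are assumptions, not the script's current behavior.
parse_args() {
    nnodes="${1:-2}"          # number of containers (nodes) to start
    nproc_per_node="${2:-4}"  # number of worker processes per node
    gpu="${3:-0}"             # 1 to run on GPU, 0 for CPU only
}

parse_args "$@"
echo "nnodes=$nnodes nproc_per_node=$nproc_per_node gpu=$gpu"
```

With this shape, a call such as `bash ./tests/run_multinode_tests_in_docker.sh 2 4 0` would override all three values, and calling it without arguments would fall back to the defaults.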

Environment

  • PyTorch Version (e.g., 1.4):
  • Ignite Version (e.g., 0.3.0):
  • OS (e.g., Linux):
  • How you installed Ignite (conda, pip, source):
  • Python version:
  • Any other relevant information:
@fco-dv fco-dv self-assigned this Feb 9, 2021
@sparkingdark
Contributor

@fco-dv is this issue related to Anaconda? I got the same error when using Docker:

debo@pop-os ~/ignite (master)
$ sudo bash ./tests/run_multinode_tests_in_docker.sh
23885d6b8bdb25571fdf500ab156816d242e5307a1e6eb0cd784d4710a08e4a3
Start Node 0
Unable to find image 'pytorch/pytorch:latest' locally
latest: Pulling from pytorch/pytorch
f22ccc0b8772: Pull complete 
3cf8fb62ba5f: Pull complete 
e80c964ece6a: Pull complete 
158621e7e7a7: Pull complete 
15235cfd2a87: Pull complete 
50e879fda18c: Pull complete 
Digest: sha256:836b33ede9d7a0ee40e9a6bfe441d6b89d2fa5b0fdcd8f3aee7e70d91fca6e70
Status: Downloaded newer image for pytorch/pytorch:latest
f419d60c26ff8b818a7f6281a3974cfe00669a90475e30fb7050ad8191adde81
Start Node 1
Collecting mock
  Downloading mock-4.0.3-py3-none-any.whl (28 kB)
Collecting pytest
  Downloading pytest-6.2.2-py3-none-any.whl (280 kB)
Collecting attrs>=19.2.0
  Downloading attrs-20.3.0-py2.py3-none-any.whl (49 kB)
Collecting importlib-metadata>=0.12
  Downloading importlib_metadata-3.4.0-py3-none-any.whl (10 kB)
Requirement already satisfied: typing-extensions>=3.6.4 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata>=0.12->pytest) (3.7.4.3)
Collecting pluggy<1.0.0a1,>=0.12
  Downloading pluggy-0.13.1-py2.py3-none-any.whl (18 kB)
Collecting py>=1.8.2
  Downloading py-1.10.0-py2.py3-none-any.whl (97 kB)
Collecting zipp>=0.5
  Downloading zipp-3.4.0-py3-none-any.whl (5.2 kB)
Collecting pytest-xdist
  Downloading pytest_xdist-2.2.1-py3-none-any.whl (37 kB)
Collecting execnet>=1.1
  Downloading execnet-1.8.0-py2.py3-none-any.whl (39 kB)
Collecting apipkg>=1.4
  Downloading apipkg-1.5-py2.py3-none-any.whl (4.9 kB)
Collecting scikit-learn
  Downloading scikit_learn-0.24.1-cp37-cp37m-manylinux2010_x86_64.whl (22.3 MB)
Requirement already satisfied: numpy>=1.13.3 in /opt/conda/lib/python3.7/site-packages (from scikit-learn) (1.19.2)
Collecting joblib>=0.11
  Downloading joblib-1.0.1-py3-none-any.whl (303 kB)
Collecting scipy>=0.19.1
  Downloading scipy-1.6.0-cp37-cp37m-manylinux1_x86_64.whl (27.4 MB)
Collecting threadpoolctl>=2.0.0
  Downloading threadpoolctl-2.1.0-py3-none-any.whl (12 kB)
Collecting iniconfig
  Downloading iniconfig-1.1.1-py2.py3-none-any.whl (5.0 kB)
Collecting packaging
  Downloading packaging-20.9-py2.py3-none-any.whl (40 kB)
Collecting pyparsing>=2.0.2
  Downloading pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
Collecting pytest-forked
  Downloading pytest_forked-1.3.0-py2.py3-none-any.whl (4.7 kB)
Collecting toml
  Downloading toml-0.10.2-py2.py3-none-any.whl (16 kB)
Installing collected packages: zipp, pyparsing, importlib-metadata, toml, py, pluggy, packaging, iniconfig, attrs, pytest, apipkg, threadpoolctl, scipy, pytest-forked, joblib, execnet, scikit-learn, pytest-xdist, mock
Successfully installed apipkg-1.5 attrs-20.3.0 execnet-1.8.0 importlib-metadata-3.4.0 iniconfig-1.1.1 joblib-1.0.1 mock-4.0.3 packaging-20.9 pluggy-0.13.1 py-1.10.0 pyparsing-2.4.7 pytest-6.2.2 pytest-forked-1.3.0 pytest-xdist-2.2.1 scikit-learn-0.24.1 scipy-1.6.0 threadpoolctl-2.1.0 toml-0.10.2 zipp-3.4.0
============================= test session starts ==============================
platform linux -- Python 3.7.9, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /workspace, configfile: setup.cfg
plugins: xdist-2.2.1, forked-1.3.0
gw0 I / gw1 I / gw2 I / gw3 I
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/_pytest/main.py", line 267, in wrap_session
INTERNALERROR>     config.hook.pytest_sessionstart(session=session)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/hooks.py", line 286, in __call__
INTERNALERROR>     return self._hookexec(self, self.get_hookimpls(), kwargs)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/manager.py", line 93, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/manager.py", line 87, in <lambda>
INTERNALERROR>     firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/callers.py", line 208, in _multicall
INTERNALERROR>     return outcome.get_result()
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/callers.py", line 80, in get_result
INTERNALERROR>     raise ex[1].with_traceback(ex[2])
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/xdist/dsession.py", line 78, in pytest_sessionstart
INTERNALERROR>     nodes = self.nodemanager.setup_nodes(putevent=self.queue.put)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/xdist/workermanage.py", line 65, in setup_nodes
INTERNALERROR>     return [self.setup_node(spec, putevent) for spec in self.specs]
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/xdist/workermanage.py", line 65, in <listcomp>
INTERNALERROR>     return [self.setup_node(spec, putevent) for spec in self.specs]
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/xdist/workermanage.py", line 68, in setup_node
INTERNALERROR>     gw = self.group.makegateway(spec)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/execnet/multi.py", line 133, in makegateway
INTERNALERROR>     io = gateway_io.create_io(spec, execmodel=self.execmodel)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/execnet/gateway_io.py", line 118, in create_io
INTERNALERROR>     return Popen2IOMaster(args, execmodel)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/execnet/gateway_io.py", line 21, in __init__
INTERNALERROR>     self.popen = p = execmodel.PopenPiped(args)
INTERNALERROR>   File "/opt/conda/lib/python3.7/site-packages/execnet/gateway_base.py", line 184, in PopenPiped
INTERNALERROR>     return self.subprocess.Popen(args, stdout=PIPE, stdin=PIPE)
INTERNALERROR>   File "/opt/conda/lib/python3.7/subprocess.py", line 800, in __init__
INTERNALERROR>     restore_signals, start_new_session)
INTERNALERROR>   File "/opt/conda/lib/python3.7/subprocess.py", line 1551, in _execute_child
INTERNALERROR>     raise child_exception_type(errno_num, err_msg, err_filename)
INTERNALERROR> FileNotFoundError: [Errno 2] No such file or directory: 'python3.6': 'python3.6'
tempnet

Platform: Pop!_OS 20.04

Probably a kernel error, see jupyter/jupyter#147

@fco-dv
Contributor Author

fco-dv commented Feb 10, 2021

@sparkingdark I think this error comes from the hardcoded python3.6 in this command (line 16), which does not match the Python version in your pytorch/pytorch:latest image (3.7 according to your log). I managed to make it work and will make a PR.

cmd="pytest --dist=each --tx $nproc_per_node*popen//python=python3.6 tests -m multinode_distributed -vvv $@"
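
To check which interpreter names actually exist inside the container before passing one to --tx, something like the following can help. It is a minimal sketch; pick_python is a hypothetical helper, not part of the script.

```shell
#!/usr/bin/env bash
# Print the first command name from the argument list that exists on PATH.
# pick_python is a hypothetical helper for illustration only.
pick_python() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Falls back to an empty string when none of the names is available.
py="$(pick_python python3.6 python3 python)" || py=""
echo "interpreter for xdist workers: ${py:-none found}"
```

In an image that only ships Python 3.7, python3.6 is skipped and the next available name is used, which avoids the FileNotFoundError raised by subprocess.Popen.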

@vfdev-5
Collaborator

vfdev-5 commented Feb 10, 2021

@fco-dv actually, popen//python=python3.6 can be replaced with popen//python=python, as here:

CUDA_VISIBLE_DEVICES="" pytest --tx 4*popen//python=python --cov ignite --cov-report term-missing --cov-report xml -vvv tests "${skip_distrib_opt[@]}"
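
Applied to the multinode command quoted earlier in this thread, the change might look like the sketch below. build_cmd is a hypothetical helper used only to show the resulting string; the pytest options themselves are copied from the original cmd line.

```shell
#!/usr/bin/env bash
# Build the pytest command without pinning a Python minor version.
# build_cmd is a hypothetical helper, not part of the original script.
build_cmd() {
    local nproc_per_node="$1"
    shift
    # "python" resolves to whichever interpreter the image puts first on PATH.
    echo "pytest --dist=each --tx ${nproc_per_node}*popen//python=python tests -m multinode_distributed -vvv $*"
}

cmd="$(build_cmd 4 -x)"
echo "$cmd"
```

Because the interpreter name is no longer tied to a minor version, the same command should keep working when pytorch/pytorch:latest moves to a newer Python.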
