Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[V1.8.x] Attemp to fix cd for v1.8.x #19947

Merged
merged 13 commits into from
Mar 2, 2021
Merged

Conversation

Zha0q1
Copy link
Contributor

@Zha0q1 Zha0q1 commented Feb 24, 2021

Based off #19946

josephevans and others added 9 commits February 24, 2021 01:22
…repo (apache#19788)

* Install python3.6 from deadsnakes repo, since 3.5 is EOL'd and get-pip.py no longer works with 3.5.

* Set symlink for python3 to point to newly installed 3.6 version.

* Setting symlink or using update-alternatives causes add-apt-repository to fail, so instead just set alias in environment to call the correct python version.

* Setup symlinks in /usr/local/bin, since it comes first in the path.

* Don't use absolute path for python3 executable, just use python3 from path.

Co-authored-by: Joe Evans <[email protected]>
…h cuda 11.0 in windows pipelines. (apache#19828)

Co-authored-by: Joe Evans <[email protected]>
…able (apache#19882)

* Set default for cache_intermediate.

* Make sure we sanitize region extracted from registry, since we pass it to os.system.

Co-authored-by: Joe Evans <[email protected]>
* Add random sleep only, since retry attempts are already implemented.

* Reduce random sleep to 2-10 sec.

Co-authored-by: Joe Evans <[email protected]>
* Test moving pipelines from p3 to g4.

* Remove fallback codecov command - the existing (first) command works and the second always fails a few times before finally succeeding (and also doesn't support the -P parameter, which causes an error.)

* Stop using docker python client, since it still doesn't support latest nvidia 'gpus' attribute. Switch to using subprocess calls using list parameter (to avoid shell injections).

See docker/docker-py#2395

* Remove old files.

* Fix comment

* Set default environment variables

* Fix GPU syntax.

* Use subprocess.run and redirect output to stdout, don't run docker in interactive mode.

* Check if codecov works without providing parameters now.

* Send docker stderr to sys.stderr

* Support both nvidia-docker configurations, first try '--gpus all', and if that fails, then try '--runtime nvidia'.

Co-authored-by: Joe Evans <[email protected]>
@mxnet-bot
Copy link

Hey @Zha0q1 , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [windows-cpu, unix-gpu, windows-gpu, sanity, edge, website, clang, centos-cpu, miscellaneous, centos-gpu, unix-cpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 24, 2021
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 24, 2021
@lanking520 lanking520 added pr-awaiting-review PR is waiting for code review and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 24, 2021
apt-get install -y wget ${PYTHON_CMD}-dev gcc && \
wget https://bootstrap.pypa.io/get-pip.py && \
${PYTHON_CMD} get-pip.py
RUN apt-get update || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you combine all these RUN commands into one? Each RUN command causes a docker layer to be saved, so it's more efficient to do

RUN apt-get update && \
    apt-get install -y software-properties-common && \
    add-apt-repository -y ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y python3.7-dev python3.7-distutils virtualenv wget && \
    ln -sf /usr/bin/python3.7 /usr/local/bin/python3 && \
    wget -nv https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

…pache#19959)

* update cude compt for cd

* Update Dockerfile.build.ubuntu_gpu_cu102

* Update Dockerfile.build.ubuntu_gpu_cu102

* Update Dockerfile.build.ubuntu_gpu_cu110

* Update runtime_functions.sh

* Update Dockerfile.build.ubuntu_gpu_cu110

* Update Dockerfile.build.ubuntu_gpu_cu102
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-awaiting-review PR is waiting for code review pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 26, 2021
@Zha0q1 Zha0q1 changed the title [wip][V1.8.x] Attemp to fix cd for v1.8.x [V1.8.x] Attemp to fix cd for v1.8.x Mar 1, 2021
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-awaiting-review PR is waiting for code review pr-awaiting-testing PR is reviewed and waiting CI build and test labels Mar 1, 2021
Copy link
Contributor

@josephevans josephevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@Zha0q1 Zha0q1 merged commit a03c509 into apache:v1.8.x Mar 2, 2021
mseth10 pushed a commit to mseth10/incubator-mxnet that referenced this pull request Mar 15, 2021
* [v1.x] Migrate to use ECR as docker cache instead of dockerhub (apache#19654)

* [v1.x] Update CI build scripts to install python 3.6 from deadsnakes repo (apache#19788)

* Install python3.6 from deadsnakes repo, since 3.5 is EOL'd and get-pip.py no longer works with 3.5.

* Set symlink for python3 to point to newly installed 3.6 version.

* Setting symlink or using update-alternatives causes add-apt-repository to fail, so instead just set alias in environment to call the correct python version.

* Setup symlinks in /usr/local/bin, since it comes first in the path.

* Don't use absolute path for python3 executable, just use python3 from path.

Co-authored-by: Joe Evans <[email protected]>

* Disable unix-gpu-cu110 pipeline for v1.x build since we now build with cuda 11.0 in windows pipelines. (apache#19828)

Co-authored-by: Joe Evans <[email protected]>

* [v1.x] For ECR, ensure we sanitize region input from environment variable (apache#19882)

* Set default for cache_intermediate.

* Make sure we sanitize region extracted from registry, since we pass it to os.system.

Co-authored-by: Joe Evans <[email protected]>

* [v1.x] Address CI failures with docker timeouts (v2) (apache#19890)

* Add random sleep only, since retry attempts are already implemented.

* Reduce random sleep to 2-10 sec.

Co-authored-by: Joe Evans <[email protected]>

* [v1.x] CI fixes to make more stable and upgradable (apache#19895)

* Test moving pipelines from p3 to g4.

* Remove fallback codecov command - the existing (first) command works and the second always fails a few times before finally succeeding (and also doesn't support the -P parameter, which causes an error.)

* Stop using docker python client, since it still doesn't support latest nvidia 'gpus' attribute. Switch to using subprocess calls using list parameter (to avoid shell injections).

See docker/docker-py#2395

* Remove old files.

* Fix comment

* Set default environment variables

* Fix GPU syntax.

* Use subprocess.run and redirect output to stdout, don't run docker in interactive mode.

* Check if codecov works without providing parameters now.

* Send docker stderr to sys.stderr

* Support both nvidia-docker configurations, first try '--gpus all', and if that fails, then try '--runtime nvidia'.

Co-authored-by: Joe Evans <[email protected]>

* fix cd

* fix cudnn version for cu10.2 buiuld

* WAR the dataloader issue with forked processes holding stale references (apache#19924)

* skip some tests

* fix ski[

* [v.1x] Attempt to fix v1.x cd by installing new cuda compt package (apache#19959)

* update cude compt for cd

* Update Dockerfile.build.ubuntu_gpu_cu102

* Update Dockerfile.build.ubuntu_gpu_cu102

* Update Dockerfile.build.ubuntu_gpu_cu110

* Update runtime_functions.sh

* Update Dockerfile.build.ubuntu_gpu_cu110

* Update Dockerfile.build.ubuntu_gpu_cu102

* update command

Co-authored-by: Joe Evans <[email protected]>
Co-authored-by: Joe Evans <[email protected]>
Co-authored-by: Joe Evans <[email protected]>
Co-authored-by: Przemyslaw Tredak <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants