This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[CI][URGENT] Fix permissions of ci/docker/install/ubuntu_publish.sh #13840

Merged
merged 1 commit into from
Jan 15, 2019

Conversation

larroy
Contributor

@larroy larroy commented Jan 11, 2019

Description

see title

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@anirudhacharya
Member

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Jan 11, 2019
@larroy larroy changed the title Fix permissions of ci/docker/install/ubuntu_publish.sh [CI] Fix permissions of ci/docker/install/ubuntu_publish.sh Jan 15, 2019
@larroy larroy changed the title [CI] Fix permissions of ci/docker/install/ubuntu_publish.sh [CI][URGENT] Fix permissions of ci/docker/install/ubuntu_publish.sh Jan 15, 2019
@KellenSunderland
Contributor

Can you give some more detail about what this fixes?

@marcoabreu
Contributor

Holding off on merging until Kellen gives his green light, considering his open question.

Contributor

@lebeg lebeg left a comment


I assume the problem was with access to this file by the 1000 user, right?

@larroy
Contributor Author

larroy commented Jan 15, 2019

The Ubuntu 14 Docker image is failing across the board because of the permissions on this script.

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-docker-cache-refresh/detail/master/1192/pipeline
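
For readers following along, a permissions problem of this kind can be checked and repaired locally. The sketch below assumes the fix amounts to making the script executable; the thread does not state the exact mode change, and the helper name `ensure_executable` is hypothetical, not part of the MXNet CI scripts.

```python
import os
import stat


def ensure_executable(path):
    """Add an execute bit for every class (user/group/other) that can read.

    Returns True if the mode was changed, False if nothing needed changing.
    Hypothetical helper for illustration only.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Mirror each read bit into the matching execute bit (r-- becomes r-x).
    exec_bits = (mode & 0o444) >> 2
    new_mode = mode | exec_bits
    if new_mode == mode:
        return False
    os.chmod(path, new_mode)
    return True
```

Calling `ensure_executable("ci/docker/install/ubuntu_publish.sh")` on a file with mode 644 would leave it at 755. In a Git checkout the executable bit is tracked, so the mode change would still need to be committed.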

@larroy
Contributor Author

larroy commented Jan 15, 2019

Check any recent build from the docker cache, or right now from master.

@KellenSunderland
Contributor

Don't consider my comments blocking, I just didn't have enough context to merge. Merge away if it looks good to you @marcoabreu.

@KellenSunderland
Contributor

KellenSunderland commented Jan 15, 2019

@larroy Right now on master I see a problem saying:

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fsanity/detail/master/204/pipeline

+ apt-get update
Ign:1 http://cran.rstudio.com/bin/linux/ubuntu trusty/ InRelease
Hit:2 http://cran.rstudio.com/bin/linux/ubuntu trusty/ Release
Get:3 http://apt.llvm.org/xenial llvm-toolchain-xenial-3.9 InRelease [4207 B]
Get:4 http://apt.llvm.org/xenial llvm-toolchain-xenial-6.0 InRelease [4236 B]
Get:6 http://apt.llvm.org/xenial llvm-toolchain-xenial-3.9/main amd64 Packages [6693 B]
Hit:7 https://cran.rstudio.com/bin/linux/ubuntu xenial/ InRelease
Get:8 http://apt.llvm.org/xenial llvm-toolchain-xenial-6.0/main amd64 Packages [6570 B]
Err:8 http://apt.llvm.org/xenial llvm-toolchain-xenial-6.0/main amd64 Packages
  Hash Sum mismatch

Hit:9 http://archive.ubuntu.com/ubuntu xenial InRelease
Hit:10 http://security.ubuntu.com/ubuntu xenial-security InRelease
Hit:11 http://archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:12 http://archive.ubuntu.com/ubuntu xenial-backports InRelease

Fetched 21.7 kB in 0s (34.2 kB/s)

Reading package lists...
E: Failed to fetch http://apt.llvm.org/xenial/dists/llvm-toolchain-xenial-6.0/main/binary-amd64/Packages.gz  Hash Sum mismatch
E: Some index files failed to download. They have been ignored, or old ones used instead.
+ wget https://raw.githubusercontent.com/llvm-mirror/clang-tools-extra/7654135f0cbd155c285fd2a37d87e27e4fff3071/clang-tidy/tool/run-clang-tidy.py -O /usr/lib/llvm-6.0/share/clang/run-clang-tidy.py
/usr/lib/llvm-6.0/share/clang/run-clang-tidy.py: No such file or directory
The command '/bin/sh -c /work/ubuntu_clang.sh' returned a non-zero code: 1

Traceback (most recent call last):
  File "ci/build.py", line 582, in <module>
    sys.exit(main())
  File "ci/build.py", line 495, in main
    num_retries=args.docker_build_retries, no_cache=args.no_cache)
  File "ci/build.py", line 162, in build_docker
    run_cmd()
  File "/home/jenkins_slave/workspace/sanity-lint/ci/util.py", line 77, in f_retry
    return f(*args, **kwargs)
  File "ci/build.py", line 160, in run_cmd
    check_call(cmd)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['docker', 'build', '-f', 'docker/Dockerfile.build.ubuntu_cpu', '--build-arg', 'USER_ID=1001', '--build-arg', 'GROUP_ID=1001', '--cache-from', 'mxnetci/build.ubuntu_cpu', '-t', 'mxnetci/build.ubuntu_cpu', 'docker']' returned non-zero exit status 1

Which does not seem related to me. I might be missing something though.
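
The traceback above passes through a retry wrapper (`f_retry` in `ci/util.py`), with the retry count taken from `args.docker_build_retries`. As a rough illustration of that pattern, and not the actual `ci/util.py` code, a retry decorator might look like the sketch below; the name `retry` and its parameters `tries` and `delay_s` are assumptions.

```python
import functools
import time


def retry(exceptions, tries=3, delay_s=0.0):
    """Retry the wrapped callable up to `tries` times on `exceptions`.

    The exception from the final attempt is re-raised unchanged.
    Illustrative sketch only; names and defaults are assumptions.
    """
    if tries < 1:
        raise ValueError("tries must be at least 1")

    def decorator(f):
        @functools.wraps(f)
        def f_retry(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return f(*args, **kwargs)
                except exceptions:
                    # Out of attempts: let the caller see the failure.
                    if attempt == tries:
                        raise
                    time.sleep(delay_s)
        return f_retry
    return decorator
```

Retrying may help with transient failures such as a mirror returning mismatched index files, but it cannot fix a deterministic failure; in that case every attempt fails and the last error surfaces, as in the traceback.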

@larroy
Contributor Author

larroy commented Jan 15, 2019

Yes, this is another issue: the download of the clang-tidy script is failing. Both are valid.
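
In the log above, `wget` reports "No such file or directory" for the `-O` output path, which suggests the destination directory did not exist when the download ran (plausibly because the preceding package step had failed, though the thread does not confirm this). A minimal sketch of guarding a download against that case follows; the helper names `require_parent_dir` and `download` are hypothetical.

```python
import os
import urllib.request


def require_parent_dir(dest):
    """Return the parent directory of `dest`, or raise if it is missing."""
    parent = os.path.dirname(os.path.abspath(dest))
    if not os.path.isdir(parent):
        raise FileNotFoundError(
            "destination directory does not exist: {}".format(parent))
    return parent


def download(url, dest):
    """Download `url` to `dest`, failing early with a clear message
    when the destination directory is absent."""
    require_parent_dir(dest)
    urllib.request.urlretrieve(url, dest)
    return dest
```

Failing early with the directory name in the message points at the real cause (a missing install directory) rather than at the download itself.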

@larroy
Contributor Author

larroy commented Jan 15, 2019

Since the Docker cache is not working well (this PR fixes it), Docker images are being rebuilt in PR and master CI checks. Can we please merge this and then fix the other things step by step? Thanks.

@marcoabreu marcoabreu merged commit c4b4246 into apache:master Jan 15, 2019
@marcoabreu
Contributor

Thanks a lot for the fix, Pedro!

@KellenSunderland
Contributor

KellenSunderland commented Jan 16, 2019

Can you guys verify this fixed what you wanted it to? I see two builds in master broken for task restricted-docker-cache-refresh with the error:

@larroy @marcoabreu

docker_cache.py: 2019-01-16 03:05:12,704 Failed to generate publish.ubuntu1604_cpu

Separate error?

@larroy
Contributor Author

larroy commented Jan 20, 2019

@KellenSunderland works for me: http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-docker-cache-refresh/

stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019