Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any way to specify additional ECR registries to log in to? #13

Open
athewsey opened this issue Apr 20, 2021 · 6 comments
Open

Any way to specify additional ECR registries to log in to? #13

athewsey opened this issue Apr 20, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@athewsey
Copy link

I'm trying to sm-docker build a container derived from SageMaker Scikit-Learn framework container in ap-southeast-1, something like the following:

base_docker_uri = sagemaker.image_uris.retrieve(
    sagemaker.sklearn.defaults.SKLEARN_NAME,
    smsess.boto_region_name,
    version="0.23-1",
    instance_type="ml.m5.xlarge",
)
# 121021644041.dkr.ecr.ap-southeast-1.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3

...so Dockerfile is FROM 121021644041.dkr....etc

Seems like the CLI tool spins up successfully and logs in to a load of other ECR registries, but not 121021644041: Then fails on step 1 with:

[Container] 2021/04/20 02:54:22 Running command docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
Sending build context to Docker daemon   7.68kB
Step 1/2 : FROM 121021644041.dkr.ecr.ap-southeast-1.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3
Get https://121021644041.dkr.ecr.ap-southeast-1.amazonaws.com/v2/sagemaker-scikit-learn/manifests/0.23-1-cpu-py3: no basic auth credentials

[Container] 2021/04/20 02:54:22 Command did not exit successfully docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG . exit status 1

I've since tested and on a SageMaker Notebook Instance I can build the same Dockerfile fine, so long as I log in to the 121021644041 ECR first.

From a cursory look at the job logs and #12, it looks like the current strategy is to have the tool ecr login to every AWS account on which AWS DLCs are provided?

...So would the correct fix be to add every account Id listed here to support SKLearn?

I was thinking it might be preferable to also add a way for users to indicate extra required account IDs through the CLI, since:

  • This list is going to start getting pretty long
  • (Like in Added eu-west-1 registry #12 and this issue) I guess there might always be some gaps introducing bugs
  • In some rarer cases, I guess users might have private cross-account ECR needs too?
athewsey added a commit to athewsey/sagemaker-studio-image-build-cli that referenced this issue Apr 20, 2021
Conditionally log in to the appropriate ECR accounts for public
SKLearn and Spark ML images in the current AWS_DEFAULT_REGION.

Stop-gap for missing containers mentioned in aws-samples#13
@jaipreet-s
Copy link
Contributor

Thanks for the detailed write-up.

Agreed that the preferable approach would be to not require logging into each and every ECR registry. Your suggestion of adding a registry-id parameter works, however we could make this seamless by auto-detecting the base ECR registry and region and logging into it. Here's how that could work

  1. Read the provided Dockerfile from disk and extract the FROM statement
  2. If the FROM is an ECR repository, then extract the registry ID, region, and partition
  3. Log into the detected ECR registry in the right region and partition.

@jaipreet-s jaipreet-s added the enhancement New feature or request label Apr 23, 2021
@athewsey
Copy link
Author

Yeah auto-detection was my first thought & preference too - but then wondered if there might still be some edge cases a naïve implementation could miss... E.g. there could be multi-stage builds with multiple FROM statements (which should be easy enough to handle), or maybe there are use cases where it's not obvious from the Dockerfile at all which registri(es) are needed? I don't know enough to rule it out.

@jaipreet-s
Copy link
Contributor

The underlying tenet of this library is that is works out of the box without requiring any additional inputs over a regular docker build . This works by setting sensible defaults for underlying AWS resources like S3 and CodeBuild.

There may well be edge-cases, but if we can handle 80% of the use-cases with auto-detection then that is default behavior to go with, while allowing power users to specify additional, optional fields to override the defaults.

@mentos1386
Copy link

mentos1386 commented Jun 1, 2021

Is there any work around for this problem?

I'm trying to run the following dockerfile https://github.com/aws/amazon-sagemaker-examples/blob/master/training/distributed_training/tensorflow/data_parallel/maskrcnn/Dockerfile

EDIT: I had actually an issue that ECR authentication was being done for us-east-1 when Dockerfile contained image from us-west-2. Changing region in Dockerfile to us-east-1 fixed the issue.

@kondo-kj
Copy link

kondo-kj commented Apr 7, 2022

I get the same error when I try to build an image based on this FROM statement.
FROM 683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.2-1

I thought the account 683313688378 was logged in with the following command, but what could be the cause or workaround?
Running command $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION --registry-ids 683313688378)

@ajay-bhargava
Copy link

I share the same concern as the author of this Issue. It is not possible to share AUTH credentials with the library. As a result, it is impossible to build upon ECR registry containers.

Similar example:

"https://763104351884.dkr.ecr.us-east-1.amazonaws.com/v2/pytorch-training/manifests/1.11.0-gpu-py38-cu115-ubuntu20.04-e3": no basic auth credentials

Even if you run immediately before:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants