Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tag to digest resolving is not working in Knative 1.3.0 #12761

Closed
NourhanKhaled opened this issue Mar 20, 2022 · 26 comments · Fixed by #12836 or #12855
Closed

Tag to digest resolving is not working in Knative 1.3.0 #12761

NourhanKhaled opened this issue Mar 20, 2022 · 26 comments · Fixed by #12836 or #12855
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-user-input Issues which are waiting on a response from the reporter
Milestone

Comments

@NourhanKhaled
Copy link

What version of Knative?

Knative v1.3.0, AKS v1.21.2 (tried on AKS v1.22.6 as well, same issue)

Expected Behavior

I should be able to create a knative service that pulls an image with a defined tag from a private registry, the exact same service definition file was working with knative 0.25.x

Actual Behavior

I get failed to resolve image to digest when deploying an image with a defined tag. I tried it with azurecr as well as registry.gitlab.com, same error. However, when I added the image digest explicitly it worked.
I'm aware that adding the registriesSkippingTagResolving is one workaround, but it won't be useful for me since users might deploy images in my app from registries I'm not aware of.

Revision "mlserv-4b593f45-42b6-4803-ab88-9aa6d67b1abb-00001" failed with message: Unable to fetch image "xxxx.azurecr.io/my-image:v1.0.2": failed to resolve image to digest: Get "https://xxxx.azurecr.io/v2/": context deadline exceeded.

Steps to Reproduce the Problem

Using knative v1.3.0, attempt deploying a service with an image from a private registry with a defined tag.

@NourhanKhaled NourhanKhaled added the kind/bug Categorizes issue or PR as related to a bug. label Mar 20, 2022
@dprotaso
Copy link
Member

dprotaso commented Mar 21, 2022

How are the pull credentials configured for the private image?

@dprotaso dprotaso added this to the v1.4.0 milestone Mar 21, 2022
@dprotaso
Copy link
Member

/triage needs-user-input

@knative-prow-robot knative-prow-robot added the triage/needs-user-input Issues which are waiting on a response from the reporter label Mar 21, 2022
@NourhanKhaled
Copy link
Author

The credentials are defined in a secret of type kubernetes.io/dockerconfigjson and are passed as imagePullSecrets in the service definition. That service definition file worked with knative 0.25

@dprotaso
Copy link
Member

Can you paste your dockerconfigjson ? without the secret obviously

@dprotaso
Copy link
Member

Reason why I'm asking is because I fixed this case for gitlab before where fetching the creds from the k8s secret wasn't working

google/go-containerregistry#1299
google/go-containerregistry#1300

@dprotaso
Copy link
Member

The other option is listing the steps you took to create your image pull secret

@NourhanKhaled
Copy link
Author

The problem isn't just with gitlab, it's also with azurecr. I created the imagepullsecret using this secret definition:

apiVersion: v1
data:
  .dockerconfigjson: <my-secret>
kind: Secret
metadata:
  name: gitlab-regcred
  namespace: dev
type: kubernetes.io/dockerconfigjson

And the dockerconfigjson for gitlab and azurecr:
{"auths":{"registry.gitlab.com":{"username":"xxx", "password":"xxx", "auth":"xxx"}}} and
{"auths":{"xxx.azurecr.io":{"username":"xxx,"password":"xxx","auth":"xxx"}}}

@dprotaso
Copy link
Member

Can you give exact steps when creating the registry credential and also the exact image prefix ie. not xxxx.azurecr.io

The Gitlab fix was normalizing the URL schemes in the dockerconfigjson - so knowing whether it shows https etc in the entry is important

@NourhanKhaled
Copy link
Author

I pasted the dockerconfigjson in my earlier comment to clearly show there was no https or https in the entry. The xxxx only replaces a name, nothing more.

To create the secret I ran the following command for gitlab:

kubectl create secret docker-registry gitlab-regcred --docker-server="registry.gitlab.com" --docker-username="myusername" --docker-password="mypassword" -n dev

I also tried providing a scheme like http in the docker-server arg, but it also didn't work.

As for ACR, I ran a similar command with the docker-server as myregistryname.azurecr.io and provided its credentials.

@thepiger
Copy link

thepiger commented Apr 6, 2022

I have the same issue upgrading from knative serving 1.1.4 to 1.3.0.

I created the dockerconfigjson secret for gcr registry in my namespace and added as imagePullSecrets in the default service account.

On 1.1.4 It works fine.
After the upgrade to 1.3.0 I got this error on knative serving controller: Unable to fetch image "gcr.io/xxxxxxxxx:xxxxxxxx": failed to resolve image to digest: HEAD https://gcr.io/v2/xxxxx/xxxxxx/manifests/xxxxxx: unexpected status code 401 Unauthorized (HEAD responses have no body, use GET for details)

Reverting to 1.1.4 started working again.
I tested also the 1.2.3 and I got the same issue.

I follow the knative serving setup by yaml

@thepiger
Copy link

thepiger commented Apr 7, 2022

I've also tested knative serving 1.2.2 and it works well.

Reading the knative-serving Release I've noticed that with knative-serving 1.2.3 the github.com/google/go-containerregistry/pkg/authn/k8schain was refactored.

I think that's what creates the issue.

Although the service account is created correctly, the knative controller cannot schedule the pods because it cannot contact the registry.

@imjasonh
Copy link
Member

imjasonh commented Apr 7, 2022

The k8schain refactor pre-1.3 definitely sounds like the culprit.

I'm not able to reproduce in the minimal unit test added in google/go-containerregistry#1335 -- this doesn't necessarily mean it isn't an issue (it seems fairly obvious that it is) just that I haven't been able to nail down where in k8schain's secret handling code it's going wrong.

@thepiger
Copy link

thepiger commented Apr 7, 2022

Actually on 1.3.0 still needs this workaroud #12642 to download the image from the private gcr.

@dprotaso
Copy link
Member

dprotaso commented Apr 8, 2022

/assign @dprotaso

@dprotaso
Copy link
Member

dprotaso commented Apr 8, 2022

Confirmed the bug - I made a simple mistake of not bumping the sub package (github.com/google/go-containerregistry/pkg/authn/k8schain)

serving/go.mod

Lines 12 to 13 in f4ea3ac

github.com/google/go-containerregistry v0.8.1-0.20220219142810-1571d7fdc46e
github.com/google/go-containerregistry/pkg/authn/k8schain v0.0.0-20220120151853-ac864e57b117

I'll bump the dependency and get patch releases out tomorrow.

@dprotaso
Copy link
Member

dprotaso commented Apr 8, 2022

@NourhanKhaled
Copy link
Author

I seem to be facing the same issue in the patch as well. @thepiger do you mind confirming this?

@dprotaso
Copy link
Member

dprotaso commented Apr 12, 2022

I tested this by creating three docker cred k8s secrets with three different variations on the registry url

  1. no scheme
  2. https scheme
  3. http scheme
kubectl create secret docker-registry gitlab-auth-with-no-scheme \
  --docker-server=registry.gitlab.com \
  --docker-email= [email protected] \
  --docker-username= someemail \
  --docker-password=$GITLAB_TOKEN


kubectl create secret docker-registry gitlab-auth-with-http \
  --docker-server=https://registry.gitlab.com \
  [email protected] \
  --docker-username= someemail \
  --docker-password=$GITLAB_TOKEN


kubectl create secret docker-registry gitlab-auth-with-https \
  --docker-server=https://registry.gitlab.com \
  --docker-email= [email protected] \
  --docker-username= someemail \
  --docker-password=$GITLAB_TOKEN

Then I created Knative Services using the imagePullSecret

ie.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: test-https-scheme
spec:
  template:
    spec:
      imagePullSecrets:
      - name: gitlab-auth-with-https
      containers:
      - image: registry.gitlab.com/someemail/test/golang:latest

Let me know if your scenario is different and I can confirm

@thepiger
Copy link

From the first tests I did, it seems that the patch has solved the problem.
If there are any news, I will update you.

@NourhanKhaled
Copy link
Author

NourhanKhaled commented Apr 12, 2022

@dprotaso it's working with gitlab but not azure container registry. Using the same secret in the azure container registry works when providing a digest but not when providing a tag.
I'm testing both gitlab and azurecr with no scheme.

@NourhanKhaled
Copy link
Author

Also I'm not sure if this behavior is intentional or not, but using a gitlab registry secret in this format works fine:

{"auths":{"registry.gitlab.com":{"username":"my-username","password":"mypassword","auth":"encoded-value"}}}

However using this secret format (not saying this is correct) doesn't work unless I add registry.gitlab.com to the registriesSkippingTagResolving list:

{"auths": {"registry.gitlab.com/<org>/<group>": {"username": "my-username", "password": "mypassword"}}}

@dprotaso dprotaso reopened this Apr 13, 2022
@dprotaso
Copy link
Member

dprotaso commented Apr 13, 2022

Thanks for the feedback - was finally able to get an azure account and confirmed that it's just broken.

Here's the breakdown so far:

K8s Deployment Image Pull

For comparison with a vanilla K8s deployment

Works Registry URL in the Secret
OK registry.gitlab.com
OK https://registry.gitlab.com
OK https://registry.gitlab.com
OK registry.gitlab.com/dprotaso
OK https://registry.gitlab.com/dprotaso
OK https://registry.gitlab.com/dprotaso
OK registry.gitlab.com/dprotaso/test
OK https://registry.gitlab.com/dprotaso/test
OK https://registry.gitlab.com/dprotaso/test
OK registry.gitlab.com/dprotaso/test/nginx
OK https://registry.gitlab.com/dprotaso/test/nginx
OK https://registry.gitlab.com/dprotaso/test/nginx
OK dtestcontainer.azurecr.io
OK https://dtestcontainer.azurecr.io
OK https://dtestcontainer.azurecr.io
OK dtestcontainer.azurecr.io/dave
OK https://dtestcontainer.azurecr.io/dave
OK https://dtestcontainer.azurecr.io/dave
OK dtestcontainer.azurecr.io/dave/nginx
OK https://dtestcontainer.azurecr.io/dave/nginx
OK https://dtestcontainer.azurecr.io/dave/nginx

Oddly - K8s works even with a partial match - ie. registry.gitlab.com/dprot

Knative Service

Works Registry URL in the Secret
OK registry.gitlab.com
OK https://registry.gitlab.com
OK https://registry.gitlab.com
FAIL registry.gitlab.com/dprotaso
FAIL https://registry.gitlab.com/dprotaso
FAIL https://registry.gitlab.com/dprotaso
FAIL registry.gitlab.com/dprotaso/test
FAIL https://registry.gitlab.com/dprotaso/test
FAIL https://registry.gitlab.com/dprotaso/test
OK registry.gitlab.com/dprotaso/test/nginx
OK https://registry.gitlab.com/dprotaso/test/nginx
OK https://registry.gitlab.com/dprotaso/test/nginx
FAIL dtestcontainer.azurecr.io (all azure varations)

@dprotaso
Copy link
Member

Ok new releases are out with the fixes

https://github.com/knative/serving/releases/tag/knative-v1.3.2
https://github.com/knative/serving/releases/tag/knative-v1.2.5

Please let me know if there are any issues

@dprotaso
Copy link
Member

For future reference this is the test I used to confirm the controllers behaved properly - This was placed under the path pkg/reconciler/revision/

package revision

import (
	"context"
	"fmt"
	"net/http"
	"os/exec"
	"testing"

	"github.com/google/go-containerregistry/pkg/authn/k8schain"
	"github.com/google/go-containerregistry/pkg/name"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func TestRealResolveGitlab(t *testing.T) {
	img := "registry.gitlab.com/dprotaso/test/nginx:latest"

	// Create a tag pointing to an image on our fake registry.
	tag, err := name.NewTag(img, name.WeakValidation)
	if err != nil {
		t.Fatal("NewTag() =", err)
	}

	kubeconfig := "/Users/dprotasowski/.kube/config"

	// use the current context in kubeconfig
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		t.Fatal(err)
	}

	// create the clientset
	client, err := kubernetes.NewForConfig(config)

	cases := []string{
		"registry.gitlab.com",
		"registry.gitlab.com/dpro",
		"registry.gitlab.com/dprotaso",
		"registry.gitlab.com/dprotaso/test",
		"registry.gitlab.com/dprotaso/test/nginx",
	}

	// Resolve our tag on the fake registry to the digest of the random.Image().
	dr := &digestResolver{client: client, transport: http.DefaultTransport, userAgent: "test-agent"}

	for _, c := range cases {
		t.Run(c, func(t *testing.T) {

			err := client.CoreV1().Secrets("default").Delete(context.TODO(), "gitlab-container-secret", metav1.DeleteOptions{})
			if err != nil && !errors.IsNotFound(err) {
				t.Log("failed to delete secret")
				t.Fatal(err)
			}

			cmd := exec.Command("kubectl", "create", "secret", "docker-registry", "gitlab-container-secret",
				fmt.Sprintf("--docker-server=%s", c),
				"--docker-email=EMAIL",
				"--docker-username=UESRNAME",
				"--docker-password=SECRET")

			output, err := cmd.Output()
			if err != nil {
				t.Log(string(output))
				t.Fatal(err)
			}

			opt := k8schain.Options{
				Namespace:          "default",
				ServiceAccountName: "default",
				ImagePullSecrets:   []string{"gitlab-container-secret"},
			}

			resolvedDigest, err := dr.Resolve(context.Background(), tag.String(), opt, emptyRegistrySet)
			if err != nil {
				t.Fatal("Resolve() =", err)
			}

			// Make sure that we get back the appropriate digest.
			digest, err := name.NewDigest(resolvedDigest, name.WeakValidation)
			if err != nil {
				t.Fatal("NewDigest() =", err)
			}
			t.Log(digest)
		})
	}
}

func TestRealResolveAzure(t *testing.T) {
	img := "dtestcontainer.azurecr.io/dave/nginx:latest"

	// Create a tag pointing to an image on our fake registry.
	tag, err := name.NewTag(img, name.WeakValidation)
	if err != nil {
		t.Fatal("NewTag() =", err)
	}

	kubeconfig := "/Users/dprotasowski/.kube/config"

	// use the current context in kubeconfig
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		t.Fatal(err)
	}

	// create the clientset
	client, err := kubernetes.NewForConfig(config)

	cases := []string{
		"dtestcontainer.azurecr.io",
		"dtestcontainer.azurecr.io/dav",
		"dtestcontainer.azurecr.io/dave",
		"dtestcontainer.azurecr.io/dave/nginx",
	}

	// Resolve our tag on the fake registry to the digest of the random.Image().
	dr := &digestResolver{client: client, transport: http.DefaultTransport, userAgent: "test-agent"}

	for _, c := range cases {
		t.Run(c, func(t *testing.T) {

			t.Run("setup", func(t *testing.T) {
				err := client.CoreV1().Secrets("default").Delete(context.TODO(), "azure-container-secret", metav1.DeleteOptions{})
				if err != nil && !errors.IsNotFound(err) {
					t.Log("failed to delete secret")
					t.Fatal(err)
				}

				cmd := exec.Command("kubectl", "create", "secret", "docker-registry", "azure-container-secret",
					fmt.Sprintf("--docker-server=%s", c),
					"--docker-username=EMAIL",
					"--docker-password=PASSWORD")

				output, err := cmd.Output()
				if err != nil {
					t.Log(string(output))
					t.Fatal(err)
				}
			})

			t.Run("resolve", func(t *testing.T) {
				opt := k8schain.Options{
					Namespace:          "default",
					ServiceAccountName: "default",
					ImagePullSecrets:   []string{"azure-container-secret"},
				}

				resolvedDigest, err := dr.Resolve(context.Background(), tag.String(), opt, emptyRegistrySet)
				if err != nil {
					t.Fatal("Resolve() =", err)
				}

				// Make sure that we get back the appropriate digest.
				digest, err := name.NewDigest(resolvedDigest, name.WeakValidation)
				if err != nil {
					t.Fatal("NewDigest() =", err)
				}
				t.Log(digest)
			})
		})
	}
}

@Qasim-Aziz
Copy link

Qasim-Aziz commented Aug 10, 2023

same above issue facing with docker hub registry ! any solution ! @dprotaso

my image tag is like username/imagename:$CI_COMMIT_SHA

@rkochar
Copy link

rkochar commented Oct 2, 2023

@dprotaso Does knative load an image from local docker registry? I am using KinD.

kind load <image>
kubectl apply -f kservice.yaml
kubectl get ksvc # Ready=False, Reason=RevisionMissing

Error in logs: Warning InternalError 2s (x2 over 4s) revision-controller Unable to fetch image "<image>": failed to resolve image to digest: HEAD https://index.docker.io/v2/library/hello-knative-go/manifests/v0.0.2: unexpected status code 401 Unauthorized (HEAD responses have no body, use GET for details

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-user-input Issues which are waiting on a response from the reporter
Projects
None yet
7 participants