Service Principal creation lags behind final validation (ServicePrincipalNotFound) #1165

Closed
stvhwrd opened this issue Aug 22, 2019 · 8 comments

Comments

@stvhwrd

stvhwrd commented Aug 22, 2019

What happened:

When creating a Kubernetes cluster in the Azure Portal (GUI) and creating a new service principal with defaults, the final validation fails:

{
    "code": "InvalidTemplateDeployment",
    "message": "The template deployment 'microsoft.aks-20190822092943' is not valid according to the validation procedure. The tracking id is '81f94590-473d-41f8-b415-91d9c5adcf6a'. See inner errors for details. Please see https://aka.ms/arm-deploy for usage details.",
    "details":
    [
        {
            "code": "ServicePrincipalNotFound",
            "message": "Provisioning of resource(s) for container service stehowa-cluster in resource group stehowa failed. Message: {\n \"code\": \"ServicePrincipalNotFound\",\n \"message\": \"Internal server error\"\n }. Details: "
        }
    ]
}

After waiting a few minutes and re-running the validation (with no other changes), it passes successfully.
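For anyone scripting around this, the relevant code is nested inside `details` rather than at the top level of the error. The following is a minimal sketch, assuming the response has already been captured as JSON text; the helper name `has_error_code` is made up for illustration, and the payload is a trimmed copy of the error above.

```python
import json


def has_error_code(error: dict, code: str) -> bool:
    """Return True if `code` appears on the error or on any nested detail."""
    if error.get("code") == code:
        return True
    # "details" may be absent or null, so fall back to an empty list.
    return any(has_error_code(d, code) for d in error.get("details") or [])


# Trimmed copy of the validation error from this report.
response = json.loads("""
{
    "code": "InvalidTemplateDeployment",
    "message": "The template deployment is not valid.",
    "details": [
        {
            "code": "ServicePrincipalNotFound",
            "message": "Provisioning of resource(s) failed."
        }
    ]
}
""")

print(has_error_code(response, "ServicePrincipalNotFound"))  # prints True
```

Checking the nested code rather than the top-level `InvalidTemplateDeployment` lets a script tell this transient failure apart from a genuinely invalid template.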

image

What you expected to happen:

Final validation to succeed on first run.

How to reproduce it (as minimally and precisely as possible):

Create a new Kubernetes cluster via the Azure Portal GUI and use default settings for service principal.

Anything else we need to know?:

Nope!

Environment:

  • Kubernetes version (use kubectl version): 1.13.10 (default)
  • Size of cluster (how many worker nodes are in the cluster?) 1
  • General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.) N/A
  • Others:
@ghost

ghost commented Aug 23, 2019

Hi:

I have the same problem when creating the cluster:
image

Greetings

@sarmadjari

I have the same issue when creating an AKS cluster through the portal. I ran a few tests with one of the Azure Support Engineers, and we confirmed the issue.

@stvhwrd
Author

stvhwrd commented Aug 31, 2019

I reported the bug from within the Azure portal and received this response from Azure IaaS:

Unfortunately a known issue – there’s a replication delay in AAD ☹ We’re working with both the AKS RP and AAD graph teams to try to address this issue but we don’t have any quick fix we can do.

@Ricciolo

As a workaround, when you get the error, go back to the Authentication section, click Configure on the service principal, and select "use existing". The ID and secret should already be set from the previous review process. Go to the end of the wizard and try again.

@stvhwrd
Author

stvhwrd commented Sep 13, 2019

Thanks @Ricciolo, that is definitely one way to do it, but ultimately it’s just about waiting for that service principal to be created. You don’t actually need to change any configuration to make it work; you just need to wait a couple of minutes and then force a page refresh. The original report mentions this.

@ahelwer

ahelwer commented Sep 23, 2019

Even worse, this issue is intermittent: you can run Test-AzResourceGroupDeployment until it stops returning the ServicePrincipalNotFound error, but then New-AzResourceGroupDeployment still fails with ServicePrincipalNotFound. Please just add a large number of retries to the validation before reporting that the service principal does not exist!

This issue also occurs if you create the service principal yourself just prior to running the deployment.

This is related to issue #1206
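Until retries exist on the service side, a client-side version of the retry requested above might look like the sketch below. It is illustrative only: `ServicePrincipalNotFound` is a hypothetical exception standing in for the error code, `deploy` is whatever callable you use to start the deployment, and the attempt count and delays are assumptions, not recommended values.

```python
import time


class ServicePrincipalNotFound(Exception):
    """Hypothetical exception standing in for the error code of the same name."""


def retry_until_propagated(deploy, attempts=5, initial_delay=15.0, sleep=time.sleep):
    """Call `deploy` until it stops raising ServicePrincipalNotFound.

    `deploy` is any zero-argument callable supplied by the caller. The delay
    doubles after each failed attempt; other exceptions propagate immediately.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return deploy()
        except ServicePrincipalNotFound:
            if attempt == attempts:
                raise  # Out of retries: surface the original error.
            sleep(delay)
            delay *= 2


# Stand-in deployment that fails twice before succeeding, to mimic the lag.
calls = {"n": 0}


def flaky_deploy():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServicePrincipalNotFound("not replicated yet")
    return "Succeeded"


# The sleep is stubbed out here so the demo finishes instantly.
print(retry_until_propagated(flaky_deploy, sleep=lambda seconds: None))  # prints Succeeded
```

Retrying the deployment itself, rather than polling validation first, matches the observation above that a passing validation does not guarantee the deployment will find the service principal.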

@jluk
Contributor

jluk commented Dec 4, 2019

Hey folks, I'm going to close this in favor of issue #1206 as this looks like a duplicate. Please bring all comments into that issue to help consolidate. If you feel like it should be re-opened as a separate issue just comment and I'll revisit.

@jluk jluk closed this as completed Dec 4, 2019
@stvhwrd
Author

stvhwrd commented Dec 7, 2019

Thanks @jluk 🤙

@jluk jluk self-assigned this Dec 9, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Jul 23, 2020