[Script] failed to save outputs: verify serviceaccount default:default has necessary privileges #983

Closed
bakayolo opened this issue Sep 6, 2018 · 7 comments · Fixed by #1362

@bakayolo
Contributor

bakayolo commented Sep 6, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
I get an error when trying to use a script template in my workflow.

Name:                scriptmm2nm
Namespace:           default
ServiceAccount:      default
Status:              Failed
Message:             child 'scriptmm2nm-2223370800' failed
Created:             Thu Sep 06 10:46:47 +0800 (13 seconds ago)
Started:             Thu Sep 06 10:46:47 +0800 (13 seconds ago)
Finished:            Thu Sep 06 10:46:51 +0800 (9 seconds ago)
Duration:            4 seconds

STEP            PODNAME                 DURATION  MESSAGE
 ✖ scriptmm2nm                                    child 'scriptmm2nm-2223370800' failed
 └---⚠ script   scriptmm2nm-2223370800  3s        failed to save outputs: verify serviceaccount default:default has necessary privileges

What you expected to happen:
Should run the script without error.

How to reproduce it (as minimally and precisely as possible):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: script
spec:
  entrypoint: main
  templates:
  - name: main
    steps:
    - - name: script
        template: script
  - name: script
    script:
      image: alpine:latest
      command: ["sh"]
      source: |
        echo test
    metadata:
      labels:
        workflowId: test

argo submit
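
For example, assuming the manifest above is saved as script.yaml (the filename is arbitrary):

argo submit script.yaml
argo get <workflowname>

The second command prints the status and step tree shown above.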

Anything else we need to know?:
I even assigned the cluster-admin role to the default service account.
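For reference, that grant looks something like this (the binding name default-cluster-admin is arbitrary):

kubectl create clusterrolebinding default-cluster-admin --clusterrole=cluster-admin --serviceaccount=default:default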

Environment:

  • Argo version:
$ argo version
argo: v2.1.1
  BuildDate: 2018-05-29T20:38:37Z
  GitCommit: ac241c95c13f08e868cd6f5ee32c9ce273e239ff
  GitTreeState: clean
  GitTag: v2.1.1
  GoVersion: go1.9.3
  Compiler: gc
  Platform: darwin/amd64
  • Kubernetes version:
$ kubectl version -o yaml
clientVersion:
  buildDate: 2018-06-27T22:29:25Z
  compiler: gc
  gitCommit: 91e7b4fd31fcd3d5f436da26c980becec37ceefe
  gitTreeState: clean
  gitVersion: v1.11.0
  goVersion: go1.10.3
  major: "1"
  minor: "11"
  platform: darwin/amd64
serverVersion:
  buildDate: 2018-08-02T23:42:40Z
  compiler: gc
  gitCommit: 9b635efce81582e1da13b35a7aa539c0ccb32987
  gitTreeState: clean
  gitVersion: v1.9.7-gke.5
  goVersion: go1.9.3b4
  major: "1"
  minor: 9+
  platform: linux/amd64
@David-Development

Any updates on this? I'm encountering the same issue on a brand new bare metal Kubernetes (RKE) cluster. It looks like this issue might be related to #982.

As mentioned in #982, the following workaround works (on RKE):

kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default

@ghost

ghost commented Sep 27, 2018

I have the same issue, but it looks like

kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default

doesn't work

@jessesuen
Member

jessesuen commented Nov 2, 2018

failed to save outputs: verify serviceaccount default:default has necessary privileges

This message is not always accurate. The controller makes some assumptions that turn out not to always be related to service account privileges.

This error happens when the controller expects an output annotation from the workflow pod but does not see the pod's annotations updated with the output result. For example, the way a workflow pod communicates a script result back to the controller is that the wait sidecar annotates its own pod with the output result. When the controller sees that a pod completed but does not see the annotation, it assumes the annotation is missing because the pod did not have privileges (i.e. the serviceAccount the workflow ran as did not have get/update/patch permissions on pods).
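
A quick way to check whether that annotation was ever written is to inspect the pod's annotations directly, e.g.:

kubectl get pod <workflowpodname> -o jsonpath='{.metadata.annotations}'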

As I mentioned, this assumption is not always true; there are other reasons why the annotation might not have been made. One reason that has come up twice so far is that the wait container could not even communicate with the API server, so despite the workflow's service account having sufficient privileges, the wait sidecar still failed to annotate the output.

The way to know for sure is to get the logs of the wait sidecar:

kubectl logs <workflowpodname> -c wait
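
If you don't know the pod name, workflow pods are labeled with the workflow name (the workflows.argoproj.io/workflow label, at least on recent versions), so you can list them with:

kubectl get pods -l workflows.argoproj.io/workflow=<workflowname>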

In a recent instance, this manifested in the following error (in wait sidecar) due to an issue with the user's CNI networking:

https://10.255.0.1:443/api/v1/namespaces/mynamespace/pods/my-workflow-v5qlt-4111318516: net/http: TLS handshake timeout

I think the error message should be improved to also point to API server access problems as a potential cause. For those here who are seeing the error "verify serviceaccount default:default has necessary privileges" and are certain they gave their workflow adequate permissions, check the wait container logs to see what the issue really is.

@jessesuen
Member

Here is a set of minimal privileges needed by a workflow pod:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-role
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - patch
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get
  - watch
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get

Issue #1072 has been filed to eliminate get secrets as a required rule.
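
Note that this Role still needs a RoleBinding to the service account your workflows run as. A minimal sketch, assuming the default service account in the default namespace (names are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: workflow-role-binding
  namespace: default
subjects:
# the service account the workflow pods run as
- kind: ServiceAccount
  name: default
  namespace: default
roleRef:
  # grants the minimal workflow-role defined above
  kind: Role
  name: workflow-role
  apiGroup: rbac.authorization.k8s.io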

@cameronbraid

check the wait container logs to see what the issue really is.

Thanks for this hint.

My issue was that a hostPath volume couldn't be mounted.

@jessesuen added this to the v2.3 milestone Jan 22, 2019
@jessesuen
Member

Will use this bug to improve the error message.

@jdclarke5

jdclarke5 commented Aug 30, 2019

I saw a similar message arising from a pod running in a non-default namespace (call it mynamespace). The error message was therefore accurate, and the default service account in that namespace needed to be given the appropriate role. The Role and RoleBinding are given below (since Google brought me here, it might be useful to somebody else).

# Argo artifacts require the mynamespace default service account to have appropriate privileges
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: artifact-role
  namespace: mynamespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: artifact-role-binding
  namespace: mynamespace
roleRef:
  kind: Role
  name: artifact-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: default
  namespace: mynamespace
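
Once applied, a quick sanity check that the binding took effect (patch on pods is one of the verbs the wait sidecar needs to annotate outputs):

kubectl auth can-i patch pods -n mynamespace --as=system:serviceaccount:mynamespace:default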
