Storing output artifacts using the same structure as with the default artifact bucket #744

Closed
vicaire opened this issue Feb 15, 2018 · 4 comments

@vicaire

vicaire commented Feb 15, 2018

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

The way artifact passing works when using the default bucket is great. (Example: https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml)

It stores all the outputs of a container at the following location (taking GCS as an example):

gcs:https://bucket/WorkflowId-RandomWorkflowSuffix/WorkflowId-RandomWorkflowSuffix-RandomStepSuffix

And makes it easy to pass the output location of one step to the next.
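For readers who have not opened the linked example, here is a minimal sketch of artifact passing through the default bucket, modeled on artifact-passing.yaml. The images and file paths are illustrative; no storage location is declared, so the controller's default archive location decides where the artifact is stored.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  entrypoint: artifact-example
  templates:
  - name: artifact-example
    steps:
    # Step 1 produces the artifact.
    - - name: generate-artifact
        template: whalesay
    # Step 2 receives it by referencing the output of step 1.
    - - name: consume-artifact
        template: print-message
        arguments:
          artifacts:
          - name: message
            from: "{{steps.generate-artifact.outputs.artifacts.hello-art}}"

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world | tee /tmp/hello_world.txt"]
    outputs:
      artifacts:
      # No location given: the default artifact bucket is used.
      - name: hello-art
        path: /tmp/hello_world.txt

  - name: print-message
    inputs:
      artifacts:
      - name: message
        path: /tmp/message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["cat /tmp/message"]
```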

I have a couple of questions:

  1. Is WorkflowId-RandomWorkflowSuffix-RandomStepSuffix a UUID? (i.e. globally unique https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html)

  2. I am guessing that "WorkflowId-RandomWorkflowSuffix" is not a UUID given the length of the random suffix. Does that mean that some workflows could collide and use the same directory in the default artifact bucket?

  3. Assume I have a container that takes as parameters the input bucket location it is supposed to read from, and the output bucket location it is supposed to write to. Using such containers, I would like to get the same storage structure as with the default artifact bucket: one directory per workflow, one subdirectory per step.

I am having trouble constructing "WorkflowId-RandomWorkflowSuffix/WorkflowId-RandomWorkflowSuffix-RandomStepSuffix" from the workflow yaml template, so that I can pass it as a parameter to the container.

It looks like I can get "WorkflowId-RandomWorkflowSuffix" using {{workflow.name}}

But I am not able to easily get "RandomStepSuffix".

Is there a way to get these values in the yaml template so that they can be passed as parameters to the containers?

Thanks!

@jessesuen
Member

Is WorkflowId-RandomWorkflowSuffix-RandomStepSuffix a UUID? (i.e. globally unique https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html)

No, workflow names are not UUIDs and do not follow the UUID RFC spec. Just like any other Kubernetes resource, users get to decide what the workflow name is and how unique their workflow names are. The name can be hardwired, for example:

metadata:
  name: my-workflow-name

Or it can be generated by Kubernetes using generateName, in which case Kubernetes generates the name from the given prefix:

metadata:
  generateName: some-workflow-prefix-

I am guessing that "WorkflowId-RandomWorkflowSuffix" is not a UUID given the length of the random suffix. Does that mean that some workflows could collide and use the same directory in the default artifact bucket?

For collisions to happen, the same workflow name would have to be submitted twice. This can easily happen when the workflow name is hard-wired in the manifest. But since Kubernetes will not allow two resources to have the same name, you would have to delete the first workflow before submitting the second to get an artifact collision. Although, now that I think about it, the same workflow name across two namespaces would be an issue.

Assume I have a container that takes as parameters the input bucket location it is supposed to read from, and the output bucket location it is supposed to write to. Using such containers, I would like to get the same storage structure as with the default artifact bucket: one directory per workflow, one subdirectory per step.

If you specify an output artifact location, then the default archive location (which is set in the controller) will not be used. However, if you would like to get the same structure, I think Argo may need to make something like {{pod.name}} available, so that the output artifact location would be like: {{workflow.name}}/{{pod.name}}. Is that the feature you would be looking for?
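As a rough illustration of an explicitly specified output artifact location, the sketch below follows the field layout of the output-artifact-s3.yaml example. The bucket, region, secret name, and the hard-wired `write-output` directory are hypothetical placeholders, and it assumes that template variables such as {{workflow.name}} are substituted inside the `key` field.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: explicit-location-
spec:
  entrypoint: write-output
  templates:
  - name: write-output
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo hello > /tmp/output.txt"]
    outputs:
      artifacts:
      - name: result
        path: /tmp/output.txt
        # An explicit location: the controller's default archive
        # location is not used for this artifact.
        s3:
          endpoint: s3.amazonaws.com
          bucket: my-bucket            # hypothetical bucket
          region: us-west-2            # hypothetical region
          # Assumption: variables are expanded in the key. The second
          # path segment is hard-wired per step to stand in for a pod name.
          key: "{{workflow.name}}/write-output/output.tgz"
          accessKeySecret:
            name: my-s3-credentials    # hypothetical Kubernetes secret
            key: accessKey
          secretKeySecret:
            name: my-s3-credentials
            key: secretKey
```

Hard-wiring the step directory gives one directory per workflow and one subdirectory per step, but only as long as each template is used by a single step.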

@vicaire
Author

vicaire commented Feb 17, 2018

Yes, {{pod.name}} would be great.

The ability to generate UUIDs in the YAML with something like {{uuid}} could come in handy as well.

@jessesuen
Member

Commit 7d7b74f makes {{pod.name}} available as a parameter. It will be available in v2.1.0.
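A minimal sketch of how this could be used to pass a per-workflow, per-step location to a container as a plain argument, assuming v2.1.0 or later. The image, the `gs://my-bucket` prefix, and the echo command are hypothetical stand-ins for a container that accepts its output location as a parameter.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: bucket-paths-
spec:
  entrypoint: process
  templates:
  - name: process
    container:
      image: alpine:latest             # placeholder image
      command: [sh, -c]
      # Both variables are replaced by Argo before the container starts,
      # yielding <workflow name>/<pod name> under the hypothetical bucket.
      args: ["echo output location: gs://my-bucket/{{workflow.name}}/{{pod.name}}"]
```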

jessesuen added this to the v2.1 milestone Mar 3, 2018
@BBerastegui

Heyo!

I'm going to post my question here, as this is where I landed after searching for how to solve my "issue", and I think it's somewhat related to this one.

I'm trying to generate a "final" artifact in a hard-wired location (as in this example): https://github.com/argoproj/argo/blob/master/examples/output-artifact-s3.yaml

But regardless of how I tweak the artifacts.s3.key and artifacts.path parameters, the base path of the stored artifact is always /bucket/WorkflowId-RandomWorkflowSuffix/....

What is the "right way" to have an artifact generated at a hard-wired path rather than within the WorkflowId-RandomWorkflowSuffix structure?

Sorry in advance if this is not the right place, but I thought it would be better than creating another issue :)

Thanks in advance!

3 participants