Storing output artifacts using the same structure as with the default artifact bucket #744

vicaire · 2018-02-15T22:33:37Z

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

The way artifact passing works when using the default bucket is great. (Example: https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml)

It stores all the output of a container at (taking the example of gcs):

gcs:https://bucket/WorkflowId-RandomWorkflowSuffix/WorkflowId-RandomWorkflowSuffix-RandomStepSuffix

And makes it easy to pass the output location of one step to the next.

I have a couple of questions:

Is WorkflowId-RandomWorkflowSuffix-RandomStepSuffix a UUID? (i.e. globally unique https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html)
I am guessing that "WorkflowId-RandomWorkflowSuffix" is not a UUID given the length of the random suffix. Does that mean that some workflows could collide and use the same directory in the default artifact bucket?
Assume I have a container that takes as parameters the input bucket location it is supposed to read from, and the output bucket location it is supposed to write to. Using such containers, I would like to get the same storage structure as with the default artifact bucket: one directory per workflow, one subdirectory per step.

I am having trouble constructing "WorkflowId-RandomWorkflowSuffix/WorkflowId-RandomWorkflowSuffix-RandomStepSuffix" from the workflow yaml template, so that I can pass it as a parameter to the container.

It looks like I can get "WorkflowId-RandomWorkflowSuffix" using {{workflow.name}}

But I am not able to easily get "RandomStepSuffix".

Is there a way to get these values in the yaml template so that they can be passed as parameters to the containers?

Thanks!

jessesuen · 2018-02-16T20:58:24Z

> Is WorkflowId-RandomWorkflowSuffix-RandomStepSuffix a UUID? (i.e. globally unique https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html)

No, workflow names are not UUIDs and do not follow the UUID RFC spec. Just like any other Kubernetes resource, users get to decide what the workflow name is and how unique it is. It can be a hard-wired name, e.g.:

metadata:
  name: my-workflow-name

or it can be generated by Kubernetes using generateName, in which case Kubernetes generates the name from the given prefix:

metadata:
  generateName: some-workflow-prefix-

> I am guessing that "WorkflowId-RandomWorkflowSuffix" is not a UUID given the length of the random suffix. Does that mean that some workflows could collide and use the same directory in the default artifact bucket?

For collisions to happen, the same workflow name would have to be submitted twice. This is easily possible with a workflow name hard-wired into the manifest. But since Kubernetes will not allow two resources of the same kind to have the same name within a namespace, you would have to delete the first workflow before submitting the second to get an artifact collision. Although, now that I think about it, the same workflow name across two namespaces would be an issue.

> Assume I have a container that takes as parameters the input bucket location it is supposed to read from, and the output bucket location it is supposed to write to. Using such containers, I would like to get the same storage structure as with the default artifact bucket: one directory per workflow, one subdirectory per step.

If you specify an output artifact location, then the default archive location (which is set in the controller) will not be used. However, if you would like to get the same structure, I think Argo may need to make something like {{pod.name}} available, so that the output artifact location would look like {{workflow.name}}/{{pod.name}}. Is that the feature you are looking for?
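For illustration, here is a minimal sketch of an output artifact with an explicit location, modeled on the output-artifact-s3.yaml example in the Argo repository. The bucket name, key, secret names, image, and artifact path are all placeholders chosen for this sketch, not values from this thread.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: explicit-output-location-
spec:
  entrypoint: produce
  templates:
  - name: produce
    container:
      image: alpine:3.7
      command: [sh, -c]
      args: ["echo hello > /tmp/result.txt"]
    outputs:
      artifacts:
      - name: result
        path: /tmp/result.txt
        # Explicit location for this artifact (placeholder values).
        s3:
          endpoint: s3.amazonaws.com
          bucket: my-bucket
          key: my-outputs/result.tgz
          accessKeySecret:
            name: my-s3-credentials
            key: accessKey
          secretKeySecret:
            name: my-s3-credentials
            key: secretKey
```

According to the comment above, an artifact declared this way should not use the default archive location configured in the controller.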

vicaire · 2018-02-17T09:30:41Z

Yes, {{pod.name}} would be great.

The ability to generate uuids in the yaml with something like {{uuid}} could come in handy as well.

jessesuen · 2018-03-03T11:46:36Z

Commit 7d7b74f makes {{pod.name}} available as a parameter; it will be available in v2.1.0.
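As a rough sketch of how this could be used for the layout requested earlier (one directory per workflow, one subdirectory per step), the template below passes {{workflow.name}}/{{pod.name}} to the container as part of its arguments. It assumes Argo v2.1.0 or later; the bucket name my-bucket, the image, and the echo command are hypothetical stand-ins for a real container that accepts an output location.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: per-step-output-
spec:
  entrypoint: main
  templates:
  - name: main
    steps:
    - - name: step-a
        template: writer
  - name: writer
    container:
      image: alpine:3.7
      command: [sh, -c]
      # The output location is built from the workflow name and the pod name.
      # A real container would write there; this one only prints the location.
      args: ["echo would write to gs://my-bucket/{{workflow.name}}/{{pod.name}}"]
```

Because each step runs in its own pod, the pod name plays the role of the per-step suffix that could not be obtained from {{workflow.name}} alone.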

BBerastegui · 2020-05-25T18:12:12Z

Heyo!

I'm going to post my question here as it's where I landed after searching around on how to solve my "issue" and I think it's slightly related to this one.

I'm trying to generate a "final" artifact in a hard-wired location (as in this example): https://github.com/argoproj/argo/blob/master/examples/output-artifact-s3.yaml

But regardless of how I tweak the artifacts.s3.key and artifacts.path parameters, the base path of the stored artifact is always /bucket/WorkflowId-RandomWorkflowSuffix/....

What is the "right way" to have an artifact written to a hard-wired path rather than inside the WorkflowId-RandomWorkflowSuffix structure?

Sorry in advance if this is not the right place, but I thought that it was going to be better than creating another issue :)

Thanks in advance!

jessesuen closed this as completed in 7d7b74f Mar 3, 2018

jessesuen added this to the v2.1 milestone Mar 3, 2018
