Container sequences #2551

Closed
alexec opened this issue Mar 30, 2020 · 10 comments · Fixed by #5099
Labels
type/feature Feature request

Comments

@alexec
Contributor

alexec commented Mar 30, 2020

Summary

It should be possible to run multiple steps within the same pod using ephemeral containers.

Motivation

  • Avoids the need to pass artifacts around.

Proposal

TODO


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

@alexec alexec added the type/feature Feature request label Mar 30, 2020
@alexec alexec added this to the v2.9 milestone Mar 30, 2020
@simster7 simster7 self-assigned this Mar 30, 2020
@tvalasek
Contributor

Would this also avoid the need to pass output parameters around?

@simster7
Member

Yes! That's the main advantage.

@simster7
Member

The main advantage of this feature would be to avoid passing artifacts using an external provider between different tasks in a Workflow, when the intermediary artifacts can be discarded after use.

To achieve this, we would make use of ephemeral containers in K8s. The idea is that the controller would create and remove ephemeral containers in a single pod, allowing them all to use the same filesystem.

I envision something like a steps template:

- name: sequence
  sequence:
    - - name: create-artifact
        template: gen-data
    - - name: consume-artifact
        template: process-data

- name: gen-data
  container:
    ...
  outputs:
    artifacts:
      file: ...

- name: process-data
  inputs:
    artifacts:
      file: ...
  container:
    ...

Ideally, users would simply be able to rename steps to sequence in order to leverage this feature. The controller would only need the existing inputs/outputs already found in templates to achieve this.
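
To illustrate what sharing a filesystem buys here, below is a minimal sketch (not Argo code). It assumes two hypothetical Python functions standing in for the gen-data and process-data containers, and a temporary directory standing in for the pod's filesystem. The second step reads the file the first step wrote, with no upload to or download from an external artifact provider.

```python
import tempfile
from pathlib import Path


def gen_data(workdir: Path) -> None:
    # Hypothetical stand-in for the gen-data container: writes its output artifact.
    (workdir / "file").write_text("hello from gen-data\n")


def process_data(workdir: Path) -> str:
    # Hypothetical stand-in for the process-data container: reads the artifact
    # straight from the shared directory.
    return (workdir / "file").read_text().upper()


def run_sequence(steps, workdir: Path):
    # Run each step in order against the same directory and collect the results.
    return [step(workdir) for step in steps]


def demo() -> str:
    # The temporary directory plays the role of the filesystem shared by the pod.
    with tempfile.TemporaryDirectory() as tmp:
        results = run_sequence([gen_data, process_data], Path(tmp))
    return results[-1]


if __name__ == "__main__":
    print(demo(), end="")
```

The intermediate file disappears with the directory once the sequence ends, which matches the case where intermediary artifacts can be discarded after use.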

NOTE: This feature is still only an idea: we're about to start creating a PoC to see just how viable it is. Nothing is set in stone (not even the name sequence) and I expect this to change as we learn more about the limitations/features of this. All feedback is welcome at this time!

@ddseapy
Contributor

ddseapy commented Apr 15, 2020

Seems like a great idea and very useful. Just a couple thoughts:

For Argo on production clusters, this capability might not be exercised for a while. That said, the benefits might outweigh the risks for certain use cases.

@simster7
Member

You are very much correct, @ddseapy. We are definitely treating this as an experimental feature.

@alexec alexec modified the milestones: v2.9, v2.10 May 28, 2020
@simster7
Member

simster7 commented Jun 2, 2020

An update on this: K8s places some limitations on this feature. Mainly, individual ephemeral containers in a Pod cannot be replaced or modified; only the entire list of ephemeral containers can be replaced, as a single operation. Given these limitations, we don't think this feature as described is currently feasible.

However, I'll investigate if we can take advantage of this feature for other purposes, such as a streamlined "Retry" node that performs its retries on the same Pod, saving the need to create new ones and download artifacts every time.

@alexec alexec removed this from the v2.10 milestone Jun 26, 2020
@alexec
Contributor Author

alexec commented Jun 29, 2020

@simster7 could you please close this issue, since this feature is not possible, and open a new issue for "in-place retries" so that the issue's 👍 count reflects the popularity of that issue?

@simster7 simster7 removed their assignment Jul 10, 2020
@alexec
Contributor Author

alexec commented Jul 13, 2020

@simster7 bump!

@simster7
Member

Closing this as it is currently implausible. Related: #3475

@alexec alexec reopened this Jan 18, 2021
@alexec
Contributor Author

alexec commented Jan 18, 2021

Sequenced Containers the Tekton Way

Similar to how Tekton does it:

https://github.com/tektoncd/pipeline/tree/master/cmd/entrypoint

How this works:

  1. The pod has a volume shared with all containers.
  2. An init container copies a binary to that volume.
  3. That binary replaces the original command, running the original command as a sub-process.
  4. Before it starts the sub-process, it waits for a specific file to appear. This file is created by another container when it believes that container is ready to start.
  5. When the sub-process completes, another file is written with the outputs.
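
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not Tekton's actual entrypoint (which is written in Go). The names `entrypoint`, `wait_file`, and `post_file` are assumptions made for this sketch, a temporary directory stands in for the shared volume, and a thread stands in for a second container.

```python
import subprocess
import sys
import tempfile
import threading
import time
from pathlib import Path
from typing import Optional, Sequence


def wait_for_file(path: Path, poll_seconds: float = 0.05, timeout: float = 30.0) -> None:
    # Step 4: block until another container creates the file we are waiting on.
    deadline = time.monotonic() + timeout
    while not path.exists():
        if time.monotonic() > deadline:
            raise TimeoutError(f"{path} did not appear")
        time.sleep(poll_seconds)


def entrypoint(wait_file: Optional[Path], post_file: Path, command: Sequence[str]) -> int:
    # Step 3: this wrapper replaces the original command and runs it as a sub-process.
    if wait_file is not None:
        wait_for_file(wait_file)
    completed = subprocess.run(list(command), capture_output=True, text=True)
    # Step 5: write the outputs. Write to a temporary name and rename, so a
    # waiting container never observes a partially written file.
    tmp = post_file.with_name(post_file.name + ".tmp")
    tmp.write_text(completed.stdout)
    tmp.replace(post_file)
    return completed.returncode


def demo():
    with tempfile.TemporaryDirectory() as shared:  # stands in for the shared volume
        first_done = Path(shared) / "step-1.done"
        second_done = Path(shared) / "step-2.done"
        # The second "container" starts first but waits for step 1's file.
        second = threading.Thread(
            target=entrypoint,
            args=(first_done, second_done, [sys.executable, "-c", "print('second')"]),
        )
        second.start()
        entrypoint(None, first_done, [sys.executable, "-c", "print('first')"])
        second.join()
        return first_done.read_text(), second_done.read_text()


if __name__ == "__main__":
    print(demo())
```

Polling for the file keeps the sketch portable; a real implementation might prefer filesystem notifications, and would also need to handle a failed earlier step rather than waiting until the timeout.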

How could workflows use this?

Simpler and more powerful executor:

As the binary runs in the same process namespace as the sub-process, it can easily copy inputs and capture outputs without any of the magic that container runtime executors need to use. Specifically, this would work very well with runAsNonRoot.

This also removes the need for a wait container. This would reduce costs.

See #4186

Many steps within a pod:

This model would allow the wait process to read the state of the workflow from the shared volume. It could effectively execute an entire workflow within a single pod.

However, this has some scaling issues. We could not run a 1000-step workflow like this. Because each container must be spun up to wait, there will be many cases where we're consuming resources but doing no useful work.
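
As a rough back-of-the-envelope for that waste, here is a small sketch under simplifying assumptions that are mine rather than measured: every container starts when the pod starts, steps run strictly one after another, and a container counts as idle until its own step begins. The function name is hypothetical.

```python
def idle_container_seconds(step_durations):
    # Total time that containers spend waiting for their turn.
    idle = 0.0
    elapsed = 0.0
    for duration in step_durations:
        idle += elapsed      # this step's container waited for every earlier step
        elapsed += duration  # the next container waits this much longer
    return idle


if __name__ == "__main__":
    # 1000 sequential steps of one second each.
    print(idle_container_seconds([1] * 1000))
```

Under these assumptions the idle time grows quadratically with the number of steps: for n steps of equal duration d it is d * n * (n - 1) / 2, so 1000 one-second steps give 499500 idle container-seconds for 1000 seconds of useful work.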

See #2551

There are some really interesting challenges around how pods report status back to the workflow for this. We'd need to multiplex it, so we might want to address this at the same time as #3961.
