0011: gocd succeeds Freight as our CD solution #11

Merged
merged 1 commit into from
Nov 2, 2022

joshuarli
Member

To meet the growing demands of continuous deployment (CD) at Sentry, this RFC proposes gocd as Freight's successor and the foundation upon which scalable pipelines for CD are built internally.

Rendered RFC


# Unresolved questions

This RFC does not take the needs of the upcoming Hybrid Cloud initiative into account, as they are still in flux at the time of writing. However, gocd is customizable enough that we are confident it can meet those needs, and much more confident than we would be if we were to extend Freight.
Member

With canary support and incremental deployments to single-tenant environments we would be moving closer to what we're looking to do with deployments in multi-region.

@mitsuhiko changed the title from "0042: gocd succeeds Freight as our CD solution" to "0011: gocd succeeds Freight as our CD solution" on Sep 28, 2022
@mitsuhiko
Member

I changed the number of the RFC to the pull request number as intended. Should be changed on merge.

1. **ArgoCD.** Jason (ex-Sentry) on Operations worked briefly on a PoC for this, but there is little documentation as to why it was chosen at the time. It is not general-purpose pipelining technology, which is what we want. Josh likes to describe it as “an airplane cockpit for k8s”: a detailed visualization of rollout progress of k8s cluster(s). It does not provide the ability to construct custom deployment pipelines at all; it is mostly a k8s controller with a web UI that is concerned exclusively with syncing k8s cluster(s) to the desired state.
2. **[gocd](https://gocd.org)**. General-purpose, mature (2007) pipelining tech. Provides server software responsible for managing pipelines (controlled by a Web UI) and scheduling tasks onto agents. A very simple system, and easy to understand. Generic; if you wanted agent hosts to interact with k8s clusters, you would have to put k8s client software on those agents. [The pipelining model](https://docs.gocd.org/current/introduction/concepts_in_go.html) connects pipelines to pipelines, whereas pipelines are the top-level execution construct in other systems I’ve seen.
3. **[Tekton](https://tekton.dev/)**. Like gocd, Tekton is general-purpose pipelining tech. It’s a young project (2018) donated by Google to cd.foundation. Tekton is deployed in a k8s cluster, and as such Tekton executors have k8s abilities built in, but are generic otherwise; it is not strictly a k8s controller like ArgoCD. Compared to gocd, it’s a complex system with moving parts in k8s. The UI, Tekton Dashboard, isn’t particularly purposeful or focused and feels like a thin CRUD layer on top of Tekton’s k8s CRDs (I don’t think normal users would want to edit something like ClusterTriggerBindings).
4. **Argo Workflows.** The same company that made ArgoCD also made this. It’s pretty much analogous to Tekton, except the UI is more confusing and harder to navigate.
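
To make the "pipelines connect to pipelines" idea from item 2 concrete, here is a minimal Python sketch of a dependency graph between pipelines. It is a toy model only: the pipeline names and the `UPSTREAMS`, `run_order`, and `downstream_of` helpers are hypothetical and are not gocd's configuration format or API.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline names; each maps to the upstream pipelines it waits on.
# Toy model of "pipelines connect to pipelines", not gocd's config format or API.
UPSTREAMS = {
    "build": set(),
    "deploy-canary": {"build"},
    "deploy-single-tenant": {"deploy-canary"},
    "deploy-production": {"deploy-canary"},
}


def run_order(upstreams):
    """Return an order in which every pipeline runs after all of its upstreams."""
    return list(TopologicalSorter(upstreams).static_order())


def downstream_of(upstreams, name):
    """Return the pipelines that list `name` as a direct upstream."""
    return sorted(p for p, deps in upstreams.items() if name in deps)


if __name__ == "__main__":
    print(run_order(UPSTREAMS))
    print(downstream_of(UPSTREAMS, "deploy-canary"))
```

In this sketch the unit of composition is the pipeline itself: a canary pipeline gates the single-tenant and production pipelines, rather than everything living as stages inside one top-level pipeline.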

For the sake of additional context:
We've been using Argo Workflows in the Quality Engineering team for almost a year now (there's a deployed version here, with a bunch of workflow definitions here). It's a general-purpose workflow engine (closer to Jenkins in its universality), and can be combined with all other Argo products (Rollouts, Events, CD, etc.). That's probably its main curse: you can implement pretty much any kind of pipeline/workflow there, but if you want to scale those (as in, have more than a couple of pipelines), you'd better invest in some helpers, wrappers, and custom scripts to tailor that.

Other impressions:

  • UI is not super fancy, but after a few uses you don't really notice it anymore. There are a few UI bugs/annoyances here and there, but generally those that are reported get fixed in next releases. K8S complexity can be abstracted out e.g. by configuring workflows with a limited number of inputs, basically the same thing that we have in Freight:

[image]

  • If there are too many stages and dependencies in the workflow, the execution graph can get pretty messy, without any straightforward way to get it rendered nicely.

  • The tool is K8S native, for better or worse. Since we also use sentry-kube for the QE env at the moment, we basically pre-render all resources before applying, which is slightly annoying. At the same time, ordinary users are not directly exposed to all K8S horrors: there's a button for viewing logs, and that's usually enough for people who "just use" the system.

  • Build artifact management is pretty useful: test results, service configurations, and HTML reports can be registered as artifacts and then used as inputs for other pipeline stages, or downloaded/viewed. Probably not that important for the vanilla CD use case, but generally things like an HTML report (e.g. with links to logs, or some analytics) could be a nice addition to every deploy we do, so I wonder whether GoCD supports anything like that.

  • Poor support for Google Identity-Aware Proxy for auth (or more like, lack of it). This one is a bit of a bummer, given that tools like Grafana solve it pretty nicely (most importantly, transparently). Does GoCD have good auth options?

Member Author

Oh cool, I wasn't aware that we were already using Argo Workflows somewhere. I'm definitely not a fan of the UI either, but I also rejected it because I wasn't smart enough to write some basic working YAML for it. Now that I know we have some prior art I'm more open to trying again, but for now we've already invested in more serious exploration of gocd. The good news is that a lot of the work being done to port Freight and getsentry-deploy functionality into standalone scripts is generic and reusable if gocd falls through and we pivot to Argo Workflows (which, IMO, is now the next best candidate).

Member Author

joshuarli commented Oct 13, 2022

Thanks for chiming in with that amount of detail Anton, I appreciate it. To answer some questions:


Sounds good, from what you wrote in the RFC and what I saw in the docs -- GoCD looks promising 👍

Btw, Argo Workflows disappointed us a few days ago: we upgraded to a newer version we had looked forward to, and it turned out to be buggy so we had to revert :(

@joshuarli the plugin will handle the OAuth flow for access and create users in the gocd instance, which you can then assign roles to. It'll also use the email to show who is performing actions in gocd (e.g. who kicked off or paused a deployment).
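
A rough Python sketch of that flow, under loud assumptions: `User`, `Directory`, and their methods are hypothetical names for illustration and do not reflect the plugin's or gocd's actual API. It only models the three behaviours described above: users are created on first login, roles are assigned to them, and actions are attributed by email.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    email: str
    roles: set = field(default_factory=set)


@dataclass
class Directory:
    """Toy stand-in for a CD server's user store; not gocd's plugin API."""
    users: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def login(self, email):
        # Assumes the OAuth flow already succeeded and yielded this email.
        # The first login creates the user; later logins reuse the same record.
        return self.users.setdefault(email, User(email))

    def grant(self, email, role):
        # Roles are attached to the user record created at login.
        self.users[email].roles.add(role)

    def act(self, email, action, required_role):
        # Refuse the action without the role; otherwise record who did what.
        user = self.users[email]
        if required_role not in user.roles:
            raise PermissionError(f"{email} lacks role {required_role!r}")
        self.audit_log.append((email, action))
```

For example, after `login("dev@example.com")` and `grant("dev@example.com", "deployer")`, a call to `act("dev@example.com", "pause deployment", "deployer")` appends that email and action to `audit_log`.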

tonyo commented Oct 4, 2022

Is there maybe a deployed version of GoCD somewhere, to get a feel of it?

@mattgauntseo-sentry

There isn't a deployed version that I'm aware of, but running it locally has been fairly painless for us. This should be all you need: https://www.gocd.org/test-drive-gocd.html

@joshuarli
Member Author

> There isn't a deployed version that I'm aware, but running it locally has been fairly painless for us, this should be all you need https://www.gocd.org/test-drive-gocd.html

We're close-ish to an official deployment, so I would just wait for a while.

@mitsuhiko
Member

Since we agreed on GoCD we should be merging this RFC.

@joshuarli joshuarli merged commit a78fc31 into main Nov 2, 2022
@joshuarli joshuarli deleted the 0042 branch November 2, 2022 18:30