Kubernetes

Storage

Current

I'm going to have 3 kinds of storage for my k8s clusters:

  1. Storage that I'd like to persist across pod restarts, but it's really not a big deal if I lose this data. Ex: prometheus data. This data is usually specific to k8s and doesn't have a particular need to persist outside of Kubernetes. Local node data is fine here, replication isn't needed.

  2. Storage that I'd like to be able to create on the fly (i.e. not pre-existing folders). This is important data that I'd like to keep good backups of, but the total size is relatively small. This data won't be 100% mission critical, so I'm comfortable delegating it to the k8s control plane. For this type of storage, there is an excellent guide here, which I'll attempt to use. This 2nd type of storage is one where the value of TrueNAS is up in the air. What if instead... I just used rook/ceph for these use-cases?

  3. Media storage that is pre-created (i.e. my existing media) and is both HUGE and CRITICAL. This data is 100% critical to the homelab and CANNOT be lost. As such, the k8s control plane can't be trusted with it; instead it will be managed by TrueNAS (i.e. software and configuration that I don't maintain) and mounted into pods via NFS PVs. For this type of storage, I'll try to use the node-manual CSI driver (example here).

I might be able to use democratic-csi for all 3 of these, using these 3 drivers, respectively: democratic-csi/local-hostpath, democratic-csi/freenas-api-nfs & democratic-csi/freenas-api-iscsi, and democratic-csi/node-manual.

Future

Now that I'm further along, I think I have an idea for how I want to do storage in the future:

  1. Run Ceph/Rook directly in k8s to replace options 1 & 2 from above. It can expose iSCSI or NFS mounts when needed. Then run volsync to back up these volumes to TrueNAS.
  2. When using the /media from the NAS, just do a simple NFS mount PVC in k8s (nothing fancy).
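The "nothing fancy" NFS mount for /media could be sketched as a static PersistentVolume plus a claim bound directly to it. This is a sketch, not my actual manifests: the server IP, export path, and capacity below are placeholders.

```yaml
# Hypothetical static NFS PV for the TrueNAS media share
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-nfs
spec:
  capacity:
    storage: 10Ti                # placeholder size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # no dynamic provisioner involved
  nfs:
    server: 192.168.1.10         # TrueNAS host (placeholder)
    path: /mnt/tank/media        # exported dataset (placeholder)
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media-nfs
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""           # must match the PV to bind statically
  volumeName: media-nfs          # pin the claim to the PV above
  resources:
    requests:
      storage: 10Ti
```

Setting storageClassName to the empty string on both objects keeps any default StorageClass from intercepting the claim, so the PVC binds to this exact PV.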

Secrets

I use sops to manage secrets in a GitOps way. There's a good overview of sops here.
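As a sketch of how sops ties into a repo like this, a top-level .sops.yaml can pin which keys encrypt which files; the age recipient below is a placeholder, and the path regex is an assumption about this repo's naming convention.

```yaml
# Hypothetical top-level .sops.yaml
creation_rules:
  # Encrypt only the secret payload of any *.sops.yaml file,
  # leaving apiVersion/kind/metadata readable in Git diffs.
  - path_regex: .*\.sops\.yaml$
    encrypted_regex: ^(data|stringData)$
    age: age1examplepublickey0000000000000000000000000000000000000000
```

With a rule like this in place, `sops --encrypt --in-place some.secret.sops.yaml` picks the right key automatically, and Flux only needs the matching private key to decrypt at deploy time.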

Secrets With Flux

To properly ensure secrets are GitOps-ified and still kept secret across the wide array of apps in this repo, there are numerous methods in which an app can be supplied secrets. Here’s a breakdown of some common methods using the tools in this repo: Flux and SOPS.

This guide will not cover how to integrate SOPS into Flux initially (i.e. bootstrapping SOPS with Flux during initial setup). For that, be sure to check out the Flux documentation on integrating SOPS.

For the first three examples, the following secret will be used:

apiVersion: v1
kind: Secret
metadata:
  name: application-secret
  namespace: default
stringData:
  SUPER_SECRET_KEY: "SUPER SECRET VALUE"

Method 1: envFrom

Use envFrom in a Deployment or a Helm chart that supports the setting; this passes every item in the secret into the container's environment.

envFrom:
  - secretRef:
      name: application-secret

View example Helm Release and corresponding Secret.

Method 2: env.valueFrom

Similar to the above, but with env it's possible to pick a single item from a secret.

env:
  - name: WAY_COOLER_ENV_VARIABLE
    valueFrom:
      secretKeyRef:
        name: application-secret
        key: SUPER_SECRET_KEY

View example Helm Release and corresponding Secret.

Method 3: spec.valuesFrom

The Flux HelmRelease option valuesFrom can inject a secret item into the Helm values of a HelmRelease:

  • Does not work with merging array values
  • Care needed with keys that contain dot notation in the name
valuesFrom:
  - targetPath: config."admin\.password"
    kind: Secret
    name: application-secret
    valuesKey: SUPER_SECRET_KEY

View example Helm Release and corresponding Secret.

Method 4: Variable Substitution with Flux

Flux variable substitution can inject secrets into any YAML manifest. It requires the Flux Kustomization to be configured with variable substitution enabled. Correctly configured, this allows you to use ${GLOBAL_SUPER_SECRET_KEY} in any YAML manifest.

apiVersion: v1
kind: Secret
metadata:
  name: cluster-secrets
  namespace: flux-system
stringData:
  GLOBAL_SUPER_SECRET_KEY: "GLOBAL SUPER SECRET VALUE"
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
# ...
spec:
  # ...
  decryption:
    provider: sops
    secretRef:
      name: sops-age
  postBuild:
    substituteFrom:
      - kind: Secret
        name: cluster-secrets

View example Fluxtomization, Helm Release, and corresponding Secret.
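To show what consuming the variable looks like, here is a sketch of a manifest applied by a Kustomization configured as above; the Deployment name and image are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app              # placeholder name
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: example/image:latest   # placeholder image
          env:
            - name: SUPER_SECRET_KEY
              # Flux post-build substitution replaces this token
              # before the manifest is applied to the cluster
              value: ${GLOBAL_SUPER_SECRET_KEY}
```

Note that the substitution happens at apply time, so the rendered Deployment in the cluster contains the plain value; the secret only stays encrypted in Git.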

Final Secrets Thoughts

  • TODO: For the first three methods, consider using a tool like stakater/reloader to restart the pod when the secret changes. Note that using reloader on a pod whose secret is provided by Flux variable substitution will restart the pod on any change to the secret, whether it is related to that pod or not.

  • The last method should be used when all other methods are not an option, or used when you have a “global” secret used by numerous HelmReleases across the cluster.
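For the reloader idea mentioned above, wiring it up is typically just an annotation on the workload; the Deployment name here is a placeholder, and the rest of the spec is elided.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app  # placeholder
  annotations:
    # stakater/reloader watches the Secrets and ConfigMaps this
    # workload references and triggers a rolling restart when
    # any of them change
    reloader.stakater.com/auto: "true"
# spec elided
```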

Kustomization Wait & DependOn

When managing dependencies between HelmReleases and Flux Kustomizations (i.e. KS), there are some important configuration options that can have a large impact on developer experience: wait and dependsOn. As a quick overview:

  • wait: true only marks the Kustomization as successful once all the resources it creates are healthy.
  • wait: false just does a kubectl apply -k and then says 'all good, chief'.
  • dependsOn tells either the KS or the HelmRelease to confirm the health of another KS or HelmRelease before trying to apply. The health of that KS/HelmRelease can depend on healthChecks or wait.

There are two camps here, mostly: You can either handle the dependencies via dependsOn at the KS level, or at the HelmRelease level. There are pros and cons to each:

If you do it at the KS level, you'll run into situations where a KS fails to apply; you then have to wait for it to time out before Flux notices you pushed a fix and applies that instead, so it's a bit clunkier.

Doing it at the HR level is a bit nicer in terms of developer experience, but it has limitations. For example, if your KS applies manifests that are not Helm releases, then you can't express that dependency at the HR level, so you'll have to mix and match.

As a rule of thumb, if your KS only applies a HelmRelease (and associated ConfigMaps, Secrets, etc.), then you can set wait to false in the KS and declare the dependencies at the HR level.

If you need to apply other things that depend on an HR (think applying your cert-manager cluster issuers as raw manifests, which depend on the cert-manager HR), then you must do it at the KS level.
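The rule of thumb above can be sketched as a pair of manifests; all names, paths, and intervals here are placeholders, not taken from this repo.

```yaml
# KS that only wraps a HelmRelease: no need for it to wait,
# since health is tracked by the HelmRelease itself
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app                 # placeholder
  namespace: flux-system
spec:
  wait: false                  # just apply and move on
  interval: 30m
  prune: true
  path: ./kubernetes/apps/my-app   # placeholder path
  sourceRef:
    kind: GitRepository
    name: flux-system
---
# The HelmRelease handles the ordering itself
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-app                 # placeholder
  namespace: default
spec:
  dependsOn:
    - name: cert-manager       # wait for this HR to be Ready first
      namespace: cert-manager
  interval: 30m
  chart:
    spec:
      chart: my-app
      sourceRef:
        kind: HelmRepository
        name: my-charts        # placeholder
```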

Thanks to mirceanton for the overview in the Home Operations discord server.

Notes

In the future, I might choose to go down a more "hyperconverged" route and manage storage directly from k8s (instead of having TrueNAS handle most of this). In that case, I'd need to migrate the StorageClass of most of my pods, which would be a big lift. To do that, there is a great article here.

For this hyperconverged route, I might consider using Harvester, which is a more cloud-native hypervisor and VM-management solution.

Bootstrapping

The steps below are run after the cluster is created with Talos to start the Flux-focused GitOps workflow. Once the steps below are run, all the k8s cluster components and apps should install onto the cluster.

1. CNI

After the initial Talos cluster creation (with the CNI set to none), the cluster will be waiting for a CNI to be installed (docs). This is the first component that must be installed after raw infra is provisioned.

helm upgrade --install cilium cilium/cilium --namespace cilium --values kubernetes/homelab/apps/cilium/cilium/app/helm-values.yaml

2. Secrets

In this directory, there are two secrets that must be applied to the cluster for flux to function properly:

  • age.secret.sops.yaml: The age secret that Flux will use to decrypt secrets checked into the codebase.
  • github.secret.sops.yaml: The Github SSH keys and access token necessary for Flux to access this repository on github.com.

These secrets can be decrypted by either an age key (defined in the top-level .sops.yaml file) OR a KMS key (whose ARN is also defined in the top-level .sops.yaml file). Age is the primary key used by Flux to decrypt secrets at deploy time. The KMS key can be used as a backup to decrypt and recover the bootstrap secrets if needed.

To deploy these secrets during initial bootstrapping:

sops --decrypt kubernetes/homelab/bootstrap/age.bootstrap.sops.yaml | kubectl apply --server-side --filename -
sops --decrypt kubernetes/homelab/bootstrap/github.bootstrap.sops.yaml | kubectl apply --server-side --filename -

Most of the Kubernetes components are added via Flux defined in the kubernetes directory. For the remaining components that are installed during cluster instantiation, the instructions are defined below.

3. Flux Installation

I used Kustomize to install the components necessary to bootstrap Flux using this command:

kubectl apply --server-side --kustomize kubernetes/homelab/bootstrap/flux/kustomization

Then, I install my repo-specific Flux configuration using this command:

kubectl apply --server-side --kustomize kubernetes/homelab/bootstrap/flux/repo