Karpenter, a node provisioning project built for Kubernetes, has been helping many companies improve the efficiency and cost of running workloads on Kubernetes. However, because Karpenter takes an application-first approach to provisioning compute capacity for the Kubernetes data plane, you might wonder how to configure it properly for some common workload scenarios. This repository includes a list of such scenarios; some of them explain in depth why configuring Karpenter and Kubernetes objects in a particular way is important.
Each blueprint follows the same structure to help you understand the motivation and the expected results:
Concept | Description |
---|---|
Purpose | Explains what the blueprint is about and what problem it solves. |
Requirements | Any prerequisites you might need to use the blueprint (e.g., an arm64 container image). |
Deploy | The steps to follow to deploy the blueprint into an existing Kubernetes cluster. |
Results | The expected results when using the blueprint. |
Before you get started, you need to have a Kubernetes cluster with Karpenter installed. If you're planning to work with an existing cluster, just make sure you've configured Karpenter following the official guide. This project also has a template to create a cluster with everything you'll need to test each blueprint.
- You need access to an AWS account with IAM permissions to create an EKS cluster, and an AWS Cloud9 environment if you're running the commands listed in this tutorial.
- Install and configure the AWS CLI
- Install the Kubernetes CLI (kubectl)
- (Optional*) Install the Terraform CLI
- (Optional*) Install Helm (the package manager for Kubernetes)
*NOTE: If you're planning to use an existing EKS cluster, you don't need the optional prerequisites.
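As a quick sanity check for the list above, the following sketch reports which CLIs are available on your `PATH`. The `check_tools` helper is hypothetical and not part of this repository; it only tests whether each command can be found, not whether it is configured.

```shell
# check_tools is a hypothetical helper: it prints one line per tool and
# returns non-zero if at least one of them is not on PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Required tools first, then the optional ones used to create a new cluster.
check_tools aws kubectl || echo "Install the required tools before continuing."
check_tools terraform helm || echo "Optional tools are only needed to create a new cluster."
```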
Before you start deploying and testing blueprints, make sure you follow the next steps. All blueprints assume that you have an EKS cluster with Karpenter deployed, and some also require a default Karpenter `NodePool`.
If you're planning on using an existing EKS cluster, you can use an existing node group with On-Demand instances to deploy the Karpenter controller. To do so, you need to follow the Karpenter getting started guide.
You'll create an Amazon EKS cluster using the EKS Blueprints for Terraform project. The Terraform template included in this repository creates a VPC, an EKS control plane, and a Kubernetes service account along with an IAM role, and associates them using IAM Roles for Service Accounts (IRSA) so that Karpenter can launch instances. Additionally, the template adds the Karpenter node role to the `aws-auth` ConfigMap to allow nodes to connect, and creates an On-Demand managed node group for the `kube-system` and `karpenter` namespaces.
To create the cluster, clone this repository and open the `cluster/terraform` folder. Then, run the following commands:
cd cluster/terraform
helm registry logout public.ecr.aws
export TF_VAR_region=$AWS_REGION
terraform init
terraform apply -target="module.vpc" -auto-approve
terraform apply -target="module.eks" -auto-approve
terraform apply --auto-approve
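The commands above pass `$AWS_REGION` to Terraform through `TF_VAR_region`, so they assume that variable is already set in your shell. A small guard like the one sketched here can confirm that before you run them. The `require_region` helper is hypothetical and not part of this repository.

```shell
# require_region is a hypothetical helper: it fails early when AWS_REGION
# is empty, and otherwise exports the Terraform variable used above.
require_region() {
  if [ -z "${AWS_REGION:-}" ]; then
    echo "AWS_REGION is not set; export it before running Terraform." >&2
    return 1
  fi
  export TF_VAR_region="$AWS_REGION"
  echo "Using region: $TF_VAR_region"
}
```

Calling `require_region` before `terraform init` stops the sequence early instead of starting a run with an empty region.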
Before you continue, you need to enable your AWS account to launch Spot instances if you haven't launched any yet. To do so, create the service-linked role for Spot by running the following command:
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true
You might see the following error if the role has already been created. You don't need to worry about it: the command only makes sure that the service-linked role needed to launch Spot instances exists.
An error occurred (InvalidInput) when calling the CreateServiceLinkedRole operation: Service role name AWSServiceRoleForEC2Spot has been taken in this account, please try a different suffix.
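If you prefer to avoid the error message altogether, one option is to look the role up first and create it only when the lookup fails. This is a sketch under the assumption that your credentials allow `iam:GetRole`; the `ensure_spot_role` function name is hypothetical, and the role name is the one shown in the error above.

```shell
# ensure_spot_role is a hypothetical helper: it creates the Spot
# service-linked role only if looking it up by name fails.
ensure_spot_role() {
  if aws iam get-role --role-name AWSServiceRoleForEC2Spot >/dev/null 2>&1; then
    echo "Spot service-linked role already exists"
  else
    aws iam create-service-linked-role --aws-service-name spot.amazonaws.com
  fi
}
```

Note that a failed lookup can also mean missing permissions rather than a missing role, in which case the create call reports the actual problem.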
Once the Terraform run completes (after about 15 minutes), run the following command to update your kubeconfig file so that you can interact with the cluster through `kubectl`:
aws eks --region $AWS_REGION update-kubeconfig --name karpenter-blueprints
You need to make sure you can interact with the cluster and that the Karpenter pods are running:
$> kubectl get pods -n karpenter
NAME READY STATUS RESTARTS AGE
karpenter-5f97c944df-bm85s 1/1 Running 0 15m
karpenter-5f97c944df-xr9jf 1/1 Running 0 15m
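Instead of re-running the command until both pods show `Running`, you could block until they are ready with `kubectl wait`. The sketch below assumes the Karpenter pods carry the `app.kubernetes.io/name=karpenter` label (the same selector used for the logs alias in this guide); the `wait_for_karpenter` function name and the 120-second default timeout are illustrative choices.

```shell
# wait_for_karpenter is a hypothetical helper: it blocks until the Karpenter
# pods report the Ready condition, or until the timeout expires.
# Usage: wait_for_karpenter [timeout], e.g. wait_for_karpenter 300s
wait_for_karpenter() {
  kubectl wait pod \
    --namespace karpenter \
    --selector app.kubernetes.io/name=karpenter \
    --for=condition=Ready \
    --timeout="${1:-120s}"
}
```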
You can now proceed to deploy the default Karpenter NodePool, and deploy any blueprint you want to test.
Before you start deploying a blueprint, you need a default `EC2NodeClass` and a default `NodePool`, as some blueprints need them. An `EC2NodeClass` enables configuration of AWS-specific settings for EC2 instances launched by Karpenter. The `NodePool` sets constraints on the nodes that Karpenter can create and on the pods that can run on those nodes. Each `NodePool` must reference an `EC2NodeClass` using `spec.template.spec.nodeClassRef`.
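To make the reference concrete, here is a minimal sketch that writes a `NodePool` manifest to a local file. It is not the default `NodePool` shipped with this repository: the requirements and the CPU limit are illustrative assumptions, and the file name is hypothetical. In the Karpenter v1 API, the reference sits under `spec.template.spec.nodeClassRef`.

```shell
# Write an illustrative NodePool manifest (Karpenter v1 API) to a local file.
# The requirements and limits below are example values, not this repo's defaults.
cat > nodepool-sketch.yaml <<'EOF'
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      # Points at the EC2NodeClass that holds the AWS-specific settings.
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      # Constraints on the nodes Karpenter may launch for this NodePool.
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: 100
EOF

# Review the file, then apply it once you are happy with it:
# kubectl apply -f nodepool-sketch.yaml
```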
If you create a new EKS cluster following the previous steps, a Karpenter `EC2NodeClass` "default" and a Karpenter `NodePool` "default" are installed automatically.
NOTE: For an existing EKS cluster, you have to modify the provided `./cluster/terraform/karpenter.tf` according to your setup by adjusting `securityGroupSelectorTerms` and `subnetSelectorTerms` and removing the `depends_on` section. If you're not using Terraform, you need to get those values manually. `CLUSTER_NAME` is the name of your EKS cluster (not the ARN). `KARPENTER_NODE_IAM_ROLE_NAME` is the placeholder for the IAM role you set in `spec.role` of your `EC2NodeClass`; Karpenter auto-generates the instance profile from that role, which is how a single IAM role is passed to the EC2 instances launched by the Karpenter `NodePool`. Typically, the instance profile name is the same as the IAM role name (not the ARN).
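If you are not using Terraform, the following sketch shows one way to fill in those placeholders by hand. It assumes your subnets and security groups are tagged with `karpenter.sh/discovery: <cluster name>` (a common convention, but not something this guide guarantees for your cluster) and that an Amazon Linux 2023 AMI alias suits your workloads. The default values and the file name are hypothetical.

```shell
# Placeholder values; replace them with your cluster name and node role name
# (names, not ARNs). The defaults below are hypothetical.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
KARPENTER_NODE_IAM_ROLE_NAME="${KARPENTER_NODE_IAM_ROLE_NAME:-my-karpenter-node-role}"

# Write an illustrative EC2NodeClass manifest (Karpenter v1 API).
# The discovery tag key and the AMI alias are assumptions about your setup.
cat > ec2nodeclass-sketch.yaml <<EOF
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  role: "${KARPENTER_NODE_IAM_ROLE_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
EOF

# Review the file, then apply it once the selectors match your resources:
# kubectl apply -f ec2nodeclass-sketch.yaml
```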
You can see that the NodePool has been deployed by running this:
kubectl get nodepool
You can see that the `EC2NodeClass` has been deployed by running this:
kubectl get ec2nodeclass
Throughout the blueprints, you might need to review the Karpenter logs, so create an alias that lets you read them by simply running `kl`:
alias kl="kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter --all-containers=true -f --tail=20"
You can now proceed to deploy any blueprint you want to test.
Once you're done with testing the blueprints, if you used the Terraform template from this repository, you can proceed to remove all the resources that Terraform created. To do so, run the following commands:
kubectl delete --all nodeclaim
kubectl delete --all nodepool
kubectl delete --all ec2nodeclass
export TF_VAR_region=$AWS_REGION
terraform destroy -target="module.eks_blueprints_addons" --auto-approve
terraform destroy -target="module.eks" --auto-approve
terraform destroy --auto-approve
After you have a cluster up and running with Karpenter installed, you can start testing each blueprint. A blueprint might have a `NodePool`, an `EC2NodeClass`, and a workload example. You need to open the blueprint folder and follow the steps to deploy the resources needed to test the blueprint.
Here's the list of blueprints we have so far:
- High-Availability: Spread Pods across AZs & Nodes
- Split Between On-Demand & Spot Instances
- Prioritize Savings Plans and/or Reserved Instances
- Working with Graviton Instances
- Overprovision capacity in advance to increase responsiveness
- Using multiple EBS volumes
- Working with Stateful Workloads using EBS
- Update Nodes using Drift
- Launching nodes using custom AMIs
- Customizing nodes with your own User Data automation
- Protecting batch jobs during the consolidation process
- NodePool Disruption Budgets
NOTE: Blueprints are independent of each other, so you can deploy and test multiple blueprints at the same time in the same Kubernetes cluster. However, to reduce noise, we recommend testing one blueprint at a time.
The following table lists the resources and tools, along with the versions against which the blueprints in this repo have been tested.
Resource/Tool | Version |
---|---|
Kubernetes | 1.30 |
Karpenter | v1.0.1 |
Terraform | 1.9.3 |
AWS EKS | v20.23.0 |
EKS Blueprints Addons | v1.16.3 |
To post feedback, submit a new blueprint, or report bugs, please use the Issues section of this GitHub repo.
MIT-0 Licensed. See LICENSE.