Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running two master nodes, powering one down, kube_vip is no longer contactable. #444

Closed
1 task done
slbailey opened this issue Feb 6, 2024 · 0 comments
Closed
1 task done

Comments

@slbailey
Copy link

slbailey commented Feb 6, 2024

Expected Behavior

k3s-01 and k3s-02 are master nodes
VIP is 192.168.60.222
When both nodes are online, the VIP is pingable.
Power down k3s-01
VIP remains pingable.

Current Behavior

When I power down k3s-01, VIP is no longer pingable. VIP remains unresponsive until k3s-01 is powered back up. Not sure if this is by design and HA is only available with 3 master nodes or if there is an issue I'm not aware of.

Steps to Reproduce

  1. Create cluster with 5 nodes (2 master and 3 worker)
  2. Verify (ping -t 192.168.60.222) VIP is contactable
  3. Power down k3s-01 (one master node)
  4. Verify (ping -t 192.168.60.222) VIP no longer responds.

Variables Used

all.yml

---
k3s_version: v1.29.0+k3s1
# this is the user that has ssh access to these machines
ansible_user: slb
systemd_dir: /etc/systemd/system

# Set your timezone
system_timezone: "America/New_York"

# interface which will be used for flannel
flannel_iface: "eth0"

# uncomment calico_iface to use tigera operator/calico cni instead of flannel https://docs.tigera.io/calico/latest/about
# calico_iface: "eth0"
calico_ebpf: false           # use eBPF dataplane instead of iptables
calico_tag: "v3.27.0"        # calico version tag

# uncomment cilium_iface to use cilium cni instead of flannel or calico
# ensure v4.19.57, v5.1.16, v5.2.0 or more recent kernel
# cilium_iface: "eth0"
cilium_mode: "native"        # native when nodes on same subnet or using bgp, else set routed
cilium_tag: "v1.14.6"        # cilium version tag
cilium_hubble: true          # enable hubble observability relay and ui

# if using calico or cilium, you may specify the cluster pod cidr pool
cluster_cidr: "10.52.0.0/16"

# enable cilium bgp control plane for lb services and pod cidrs. disables metallb.
cilium_bgp: false

# bgp parameters for cilium cni. only active when cilium_iface is defined and cilium_bgp is true.
cilium_bgp_my_asn: "64513"
cilium_bgp_peer_asn: "64512"
cilium_bgp_peer_address: "192.168.60.1"
cilium_bgp_lb_cidr: "192.168.31.0/24"   # cidr for cilium loadbalancer ipam

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "192.168.60.222"

# k3s_token is required  masters can talk together securely
# this token should be alpha numeric only
k3s_token: "superdupersupersecretPassw0rd"

# The IP on which the node is reachable in the cluster.
# Here, a sensible default is provided, you can still override
# it for each of your hosts, though.
k3s_node_ip: "{{ ansible_facts[(cilium_iface | default(calico_iface | default(flannel_iface)))]['ipv4']['address'] }}"

# Disable the taint manually by setting: k3s_master_taint = false
k3s_master_taint: "{{ true if groups['node'] | default([]) | length >= 1 else false }}"

# these arguments are recommended for servers as well as agents:
extra_args: >-
  {{ '--flannel-iface=' + flannel_iface if calico_iface is not defined and cilium_iface is not defined else '' }}
  --node-ip={{ k3s_node_ip }}

# change these to your liking, the only required are: --disable servicelb, --tls-san {{ apiserver_endpoint }}
# the contents of the if block is also required if using calico or cilium
extra_server_args: >-
  {{ extra_args }}
  {{ '--node-taint node-role.kubernetes.io/master=true:NoSchedule' if k3s_master_taint else '' }}
  {% if calico_iface is defined or cilium_iface is defined %}
  --flannel-backend=none
  --disable-network-policy
  --cluster-cidr={{ cluster_cidr | default('10.52.0.0/16') }}
  {% endif %}
  --tls-san {{ apiserver_endpoint }}
  --disable servicelb
  --disable traefik

extra_agent_args: >-
  {{ extra_args }}

# image tag for kube-vip
kube_vip_tag_version: "v0.6.4"

# tag for kube-vip-cloud-provider manifest
# kube_vip_cloud_provider_tag_version: "main"

# kube-vip ip range for load balancer
# (uncomment to use kube-vip for services instead of MetalLB)
# kube_vip_lb_ip_range: "192.168.30.80-192.168.30.90"

# metallb type frr or native
metal_lb_type: "native"

# metallb mode layer2 or bgp
metal_lb_mode: "layer2"

# bgp options
# metal_lb_bgp_my_asn: "64513"
# metal_lb_bgp_peer_asn: "64512"
# metal_lb_bgp_peer_address: "192.168.30.1"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.13.12"
metal_lb_controller_tag_version: "v0.13.12"

# metallb ip range for load balancer
metal_lb_ip_range: "192.168.60.80-192.168.60.90"

# Only enable if your nodes are proxmox LXC nodes, make sure to configure your proxmox nodes
# in your hosts.ini file.
# Please read https://gist.github.com/triangletodd/02f595cd4c0dc9aac5f7763ca2264185 before using this.
# Most notably, your containers must be privileged, and must not have nesting set to true.
# Please note this script disables most of the security of lxc containers, with the trade off being that lxc
# containers are significantly more resource efficient compared to full VMs.
# Mixing and matching VMs and lxc containers is not supported, ymmv if you want to do this.
# I would only really recommend using this if you have particularly low powered proxmox nodes where the overhead of
# VMs would use a significant portion of your available resources.
proxmox_lxc_configure: false
# the user that you would use to ssh into the host, for example if you run ssh some-user@my-proxmox-host,
# set this value to some-user
proxmox_lxc_ssh_user: root
# the unique proxmox ids for all of the containers in the cluster, both worker and master nodes
proxmox_lxc_ct_ids:
  - 100
  - 101
  - 102
  - 103
  - 104

# Only enable this if you have set up your own container registry to act as a mirror / pull-through cache
# (harbor / nexus / docker's official registry / etc).
# Can be beneficial for larger dev/test environments (for example if you're getting rate limited by docker hub),
# or air-gapped environments where your nodes don't have internet access after the initial setup
# (which is still needed for downloading the k3s binary and such).
# k3s's documentation about private registries here: https://docs.k3s.io/installation/private-registry
custom_registries: false
# The registries can be authenticated or anonymous, depending on your registry server configuration.
# If they allow anonymous access, simply remove the following bit from custom_registries_yaml
#   configs:
#     "registry.domain.com":
#       auth:
#         username: yourusername
#         password: yourpassword
# The following is an example that pulls all images used in this playbook through your private registries.
# It also allows you to pull your own images from your private registry, without having to use imagePullSecrets
# in your deployments.
# If all you need is your own images and you don't care about caching the docker/quay/ghcr.io images,
# you can just remove those from the mirrors: section.
custom_registries_yaml: |
  mirrors:
    docker.io:
      endpoint:
        - "https://registry.domain.com/v2/dockerhub"
    quay.io:
      endpoint:
        - "https://registry.domain.com/v2/quayio"
    ghcr.io:
      endpoint:
        - "https://registry.domain.com/v2/ghcrio"
    registry.domain.com:
      endpoint:
        - "https://registry.domain.com"

  configs:
    "registry.domain.com":
      auth:
        username: yourusername
        password: yourpassword

# Only enable and configure these if you access the internet through a proxy
# proxy_env:
#   HTTP_PROXY: "http:https://proxy.domain.local:3128"
#   HTTPS_PROXY: "http:https://proxy.domain.local:3128"
#   NO_PROXY: "*.domain.local,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"

Hosts

host.ini

[master]
192.168.60.2
192.168.60.3

[node]
192.168.60.4
192.168.60.5
192.168.60.6

# only required if proxmox_lxc_configure: true
# must contain all proxmox instances that have a master or worker node
# [proxmox]
# 192.168.30.43

[k3s_cluster:children]
master
node
@techno-tim techno-tim locked and limited conversation to collaborators Feb 6, 2024
@timothystewart6 timothystewart6 converted this issue into discussion #445 Feb 6, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant