
k9s Connection Pooling Bug

A while ago, I found a bug in k9s that can cause it to be much slower depending on your ~/.kube/config file. I’ve reported it on GitHub, but it’s a tricky bug to fix and has not yet been addressed. I wanted to write a blog post about it for visibility, because I think a lot of people are running into this without realising.

The TL;DR is that if you have a proxy-url set for a cluster in your ~/.kube/config, as below, any action that makes a request to the API server, such as describing pods, listing namespaces, or deleting resources, will be noticeably slower than it should be.

Setting http_proxy/https_proxy in the environment will also trigger the bug.

apiVersion: v1
kind: Config
clusters:
  - name: my-cluster
    cluster:
      server: https://my-cluster:6443
      proxy-url: http://my-proxy:5555
      ...

The slowness is caused by connections not being pooled correctly, and it only happens when a proxy is in use.

When you perform an action in k9s that fires a request to the API server, a TCP connection would normally already be open and reused from the pool, avoiding the overhead of establishing a new one. When connection pooling doesn’t work properly, every request opens a fresh connection: a three-way TCP handshake to the proxy, an HTTP CONNECT exchange, and a TLS handshake with the API server, adding at least 3 * RTT of latency to each request.
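
To make the pooling behaviour concrete, here is a minimal Go sketch (the language k9s and client-go are written in) that uses net/http/httptrace to report whether each request got a pooled connection or had to dial a new one. The URL is just a placeholder; point it at anything reachable.

package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
)

func main() {
	// One http.Client means one underlying Transport and one shared
	// connection pool across every request made through it.
	client := &http.Client{}

	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Printf("reused connection: %v\n", info.Reused)
		},
	}

	for i := 0; i < 3; i++ {
		req, err := http.NewRequest("GET", "https://example.com", nil) // placeholder URL
		if err != nil {
			panic(err)
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

		resp, err := client.Do(req)
		if err != nil {
			panic(err)
		}
		// Drain and close the body so the connection is returned to the pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	// Prints "reused connection: false" once, then "true" afterwards.
	// Moving the client construction inside the loop would print "false"
	// every time, which is exactly the k9s symptom.
}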

The bug lies in the kubectl code that k9s depends on for generating resource descriptions. That code calls into client-go every time a resource is described, and client-go instantiates a new http.Client whenever the transport is not http.DefaultTransport, which is the case when a proxy is set. Each http.Client object owns its own connection pool, so in effect every request is made with an empty connection pool that is then discarded.
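
Below is a simplified Go sketch of that failure mode. It is illustrative, not the actual kubectl or client-go source, and describeBuggy/describeFixed are hypothetical names. Each http.Transport owns its own idle-connection pool, so constructing a fresh Client (and Transport) per request guarantees a cold pool every time.

package main

import (
	"io"
	"net/http"
	"net/url"
)

// describeBuggy mimics the problematic path: a brand-new http.Client,
// and therefore a brand-new http.Transport with an empty connection
// pool, is built for every single request.
func describeBuggy(target string, proxy *url.URL) error {
	client := &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(proxy)},
	}
	resp, err := client.Get(target)
	if err != nil {
		return err
	}
	io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
	// client goes out of scope here, taking its warmed-up pool with it.
	return nil
}

// sharedClient is built once, so its Transport's idle connections
// survive between calls and later requests skip the handshakes.
var sharedClient = &http.Client{
	Transport: &http.Transport{Proxy: http.ProxyFromEnvironment},
}

// describeFixed mimics the intended behaviour: reuse one Client.
func describeFixed(target string) error {
	resp, err := sharedClient.Get(target)
	if err != nil {
		return err
	}
	io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
	return nil
}

func main() {
	// With the shared client, the second call reuses the first call's
	// connection; routing both through describeBuggy never would.
	_ = describeFixed("https://example.com") // placeholder URL
	_ = describeFixed("https://example.com")
}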

You can read the bug report and investigation here: https://github.com/derailed/k9s/issues/2098