Skip to content
This repository has been archived by the owner on Jan 19, 2023. It is now read-only.

refactor dash.Runner startup #1676

Merged

Conversation

jamieklassen
Copy link
Contributor

What this PR does / why we need it: Adds two simple integration tests for the two main startup flows (with and without a kubeconfig)

Which issue(s) this PR fixes

  • Fixes #

Special notes for your reviewer: These tests require the test runner process to be able to create a directory, create/delete temporary files, and open TCP listeners (high-numbered random ports via 127.0.0.1:0) on the host.

Release note:

release-note

@jamieklassen jamieklassen force-pushed the runner-startup-tests branch 2 times, most recently from bdc2e15 to e44951b Compare November 20, 2020 17:43
@wwitzel3 wwitzel3 requested a review from a team November 21, 2020 00:43
@jamieklassen jamieklassen marked this pull request as draft November 23, 2020 23:12
@jamieklassen
Copy link
Contributor Author

jamieklassen commented Nov 23, 2020

Pushed some low-hanging refactors to clean up Runner.Start especially.

I've toyed with ways to break the filesystem dependency (even though an in-memory afero.Fs be used as a double for testing ValidateKubeConfig, cluster.FromKubeConfig still calls out to the real filesystem) or the use of a 'real' httptest.Server for the fake k8s API, but both would require really intrusive refactorings and result in duplicating some of the code from k8s.io/client-go so I think it's worth paying this performance/stability price in the tests.

However after discussing with @wwitzel3, tomorrow I will add a flag and a field on dash.Options to allow specifying the host:port for octant. This seems to be a sensible feature that nobody has yet gotten around to. Ideally this will also pave the way to use a a fake, in-memory net.Listener (not backed by a real network interface) in tests.

EDIT: I'd also like to ease the burden on the caller to use ocontext.WithKubeConfigCh and instead put this into Start.

@jamieklassen jamieklassen marked this pull request as ready for review November 30, 2020 14:07
@GuessWhoSamFoo
Copy link
Contributor

I'll take a pass at this later today

Copy link
Contributor

@GuessWhoSamFoo GuessWhoSamFoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. It greatly improves the readability of dash.go

pkg/dash/dash.go Outdated
} else {
r.startAPIService(ctx, logger, options)
logger.Infof("waiting for kube config ...")
options.KubeConfig = <-ocontext.KubeConfigChFrom(r.ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blocks which is fine because it needs to wait if the user adds a kubeconfig, however it will hang on sigint/sigterm without shutting down plugins. In other words, octant will hang if the users tries to exit in the loading api state via ctrl + c without a way to recover

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that makes sense, thank you! I'll fix this up.

Copy link
Contributor

@GuessWhoSamFoo GuessWhoSamFoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, can you squash some of these commits?

@jamieklassen jamieklassen changed the title runner startup tests refactor dash.Runner startup Dec 3, 2020
@jamieklassen jamieklassen force-pushed the runner-startup-tests branch 2 times, most recently from 585a78c to 49d9e8d Compare December 4, 2020 16:25
@jamieklassen
Copy link
Contributor Author

@GuessWhoSamFoo I rearranged the changes to remove some of the re-work -- for example, in the old commit log I removed the "wait for kubeconfig in a goroutine" logic and then re-added it based on your feedback, whereas in this version I add that integration test in the first commit and ensure that all commits pass the tests -- but I kept the individual commits for each stage of the refactor.

I thought it would be valuable to keep each change separate, along with its justification. This is mainly because the tests I added are not exhaustive, and I may have made another mistake besides the one you already observed. If that turns out to be the case, then somebody may need to look back at my changes. I imagine it would be helpful to have a log of each refactoring decision as it was made.

I'm OK to squash more aggressively if you prefer. Let me know! 😄

Jamie Klassen and others added 4 commits December 4, 2020 11:30
Also externalize NewRunner's dependence on net.Listener and inject an in-memory
listener for the integration tests -- this should make them a bit faster,
stabler (no dependence on the operating system to provision a TCP socket) and
safe to run in parallel.

Really just covering the happy paths here. Ideally some function signatures
can get cleaned up and potentially an isolate-able 'startup component' could
be extracted to get more focused unit-test feedback.

The 'startup without a kubeconfig' flow was a bit more delicate -- after sending
the kubeconfig over the streaming API, there is a brief period of downtime
while the API reloads and the test had to wait for this.

There is one possible edge case where the "clean shutdown before receiving
kubeconfig" scenario will erroneously pass -- this is in the case that the
machine running the tests contains a valid kubeconfig at the default location,
and the test takes over a second to run.

In this case the LoadingManager's behaviour of loading a kubeconfig from the
default location (1 second after starting) would kick in, and the API would
successfully be initialized -- therefore we would not be exercising the
"shutdown before loading kubeconfig" scenario we claim to exercise.

We tried refactoring various packages to the point that we could inject a fake
filesystem into the LoadingManager (to force it to never find a valid kubeconfig
at the default location) but it was fairly intrusive, and it seems exceedingly
rare that our new test should take anywhere near 1 second to run.

Signed-off-by: Jamie Klassen <[email protected]>
Co-authored-by: Dimitry Besson <[email protected]>
Co-authored-by: James Thomson <[email protected]>
Co-authored-by: Vikram Yadav <[email protected]>
apiErr is always nil in this case, so the error-handling branch will never
be executed.

Signed-off-by: Jamie Klassen <[email protected]>
Also the nil check in Start seems to be redundant. Honestly that "unexpected
empty kubeconfig" line could maybe be removed as I'm pretty sure it's dead code.

Signed-off-by: Jamie Klassen <[email protected]>
I actually find it easier to understand (particularly the `r.dash.server.Handler`
side effect) if it's just in the main body of the function.

Signed-off-by: Jamie Klassen <[email protected]>
Jamie Klassen and others added 2 commits December 4, 2020 13:35
allowing users to simply call `NewRunner` without having to remember to populate
the passed context with a kubeconfig channel.

We achieve this by having the Runner keep track of the context it was created
with. While we were at it, we noticed that there was no need to have a Logger
parameter for Start -- in fact, in the long run we may be able to keep removing
parameters from Start until only the startup/shutdown channels remain.

Signed-off-by: Jamie Klassen <[email protected]>
Co-authored-by: James Thomson <[email protected]>
Signed-off-by: Jamie Klassen <[email protected]>
Co-authored-by: James Thomson <[email protected]>
@GuessWhoSamFoo GuessWhoSamFoo merged commit 0e2e4e2 into vmware-archive:master Dec 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants