-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
If you have successfully authenticated and then start getting a 401 you should re-auth. #539
Comments
I might be wrong about 401, it might be 403 and I've lost the plot. Need to retest when I get a moment. |
Thanks for the report. Please keep me posted if you find more information. When I have time, I'll try to investigate and reproduce. |
I was correct initially when I said 401. I just checked the network traffic I had recorded. That does make sense, 401 for credentials that don't work. I happened to see a 403 in a test fail, but that must have been authenticated okay, but some permissions weren't right.
|
To work around this for now I've basically stuck a facade on the DB to allow me to retry in this situation.
Internally I end up replacing the whole client/db object under the hood to reauth currently. Essentially I think you need at the least some sort of 'invalidate token' or 'reauth' method on your auth providers, and ideally an automatic retry mechanism when you do hit a 401. Perhaps the auth mechanisms could have a 'retry warranted' method too? For basic auth a retry isn't really going to make sense. For cookies it does. It depends on if you have a token that can be refreshed. |
Here is a sample config I created using mountebank to reproduce and test my changes in my own code. It provides minimal mocks that replicate enough of couchdb's responses. The first call to PUT works, the second fails, then it rotates through the responses alternately succeeding and failing. If figure it would be worth sharing as it might help illustrate the problem, although obviously it's not completely realistic. |
Looking at the code in more detail, I'm wondering if the I think that works with the spirit of the interface, and also allows the potential change to be very small and not require anything at all structural to be changed. |
Actually, I guess that still leaves you needing a retry, but that's less problematic than the situation where you can't move on because the baked in cookie is invalid, but you can't trigger a new log in. |
I have mocked out something that potentially allows the cookies to be invalidated using the RoundTripper interface. In some ways it's tempting to do everything there, including a retry, but the interface documentation seems to suggest that is definitely not what one should do, and it's quite plausibly quite problematic. Invalidating the cookie doesn't seem like too bad an idea at that point, or at least I can't spot a particular reason not to. |
I do see prior art on someone using the interface to do a retry. My one concern is if the reading of the body of the request could be problematic for the retry, but that could be cloned. |
I'm only now looking closely at this. A couple questions and comments...
So it seems clear that you're able to trigger this bug by shortening the expiry time after the cookie has been created with a longer one--that much makes sense. But do you have an understanding of how/why this was happening without that change in configuration? Is there a CouchDB bug that needs to be investigated, too? In any case, this should be fixed, or at least a work-around provided in this library, so my thoughts on that:
2b. We could get even more out of this if we use the
3b. [random idea] I wonder if we could just clear the cookie in case of a 401. That won't cause a loop, and we can assume that cookie is dead at that point. Then any subsequent requests should attempt to re-authenticate... (I think). I guess that's what your PR does. |
It sounds like we are basically thinking the same things. My PR is aiming to do 3b. That allows the user to retry as they want. I'm not certain as to real life circumstances in which this happens, but I would guess time drift between servers might be one candidate. Exactly how disruptive that would be would depend on the scale of that drift, but fundamentally it's possible, and I'd rather minimise the disruption. The PR isn't perfect, you've made one comment already, and I can take a look at getting it into a state that you're happy to merge it. I have tested it enough to be happy that the current code works and reduces this problem. It just needs refining. |
I was just looking at the current auth code, and was reminded, thanks to a comment, that if CouchDB is configured without 'allow_persistent_cookies', then the cookie does not include an expiry date, and kivik can do nothing about it. I don't think that's what's happening to you, but it would result in a very similar (or identical) behavior: a 401 with no recourse. |
If the problem is the result of clock drift, then maybe one partial solution would be to refresh the cookie before it expires. I'm actually a bit surprised I didn't do this already. See here: https://github.com/go-kivik/couchdb/blob/master/chttp/cookieauth.go#L60-L62 A lot of services will refresh a token halfway to the expiry. Maybe a similar approach can be taken here. |
One thing to note is that couchdb automatically refreshes cookies if you are in regular contact with it. I'm not sure at exactly it's rules for doing so are, but you can see as you send a request to do something, it will set a different cookie value in the response to extend the session. I set my session length to 10s in the couchdb server and hammered it with traffic and saw no glitches because of that. This is probably one of the reasons that this has never been a large problem for you thus far. While the problem outlined here is legit, it is an edge case. I should also note that when I started traffic again at some point > 10s I saw your library perform beautifully by starting the traffic with an attempt to log in as the expiry log internally worked as designed. Your library is generally working as required, except for holding onto the cookie when there is evidence that it's not working. |
Thinking about it, it might be worth noting that someone who has cookie auth and is talking to a cluster without a shared secret set could also hit this sort of behaviour. Ironically the change I'm proposing might make this even more annoying to spot as it could be slightly less pervasive. If you had a cluster of 3 boxes, you'd expect 66% of calls to during the time frame 401. With the automatic removal of the cookie, you'd potentially only see around 33%? This point makes me wonder about checking if the documentation for this library is clear enough regarding cookie auth and server setup? Cookie auth is generally the neatest option, but you do need to ensure your server cluster is set up correctly. |
This sounds like a pretty clearly mis-configured cluster, so I wouldn't try very hard to accomodate this scenario. |
I don't mean do anything special in the code. More a note 'Cookie auth is used by default, so ensure your cluster is configured correctly' type of message. It may well be that you have that in there already. I'm not looking at the docs at this point. |
I've merged go-kivik/couchdb#280 and backported to v3 with go-kivik/couchdb#281. v3.2.9 includes the fix. |
Bug Report Checklist (remove this template if submitting a feature request)
master
.master
go version
) -- Kivik 2.x supports Go 1.9 and later. Kivik 3.x requires Go 1.11 or later, with module support. The Kivik 4/master aims to support Go 1.11 and later, with module support.1.16
There is a situation where the cookie auth for couchdb (note, I'm not sure whether this bug report belongs in this repo or the couchdb one) gets out of sync, and couchdb returns a 401 to our previously working cookie. In that situation, bar waiting for the cookie to expire, I'm struggling to spot how to rectify the situation.
To replicate this in a reliable but somewhat awkward way I set couchdb to have a long expiry time, got the kivik client to connect and do stuff so it stored the cookie. Then I tweaked the couchdb config to have a shorter expiry and re-started. The cookie now became invalid, and so couchdb returned a 401.
Kivik receive the 401, and try to log in again, and assuming that worked, retry the operation. If that received a 401, then fair enough, chuck that error back.
401, and I don't know what I can do to resolve that. There doesn't appear to be an obvious way to tell the library to reauthenticate, or that the cookie is no longer valid, which would probably resolve the situation.
I should note that under normal circumstances the library performs great. When it's in sync, or constantly communicating things work out fine. It either gets sent an updated cookie by couchdb as it rotates the sessions, or it realises that the cookie has expired and does the right thing. The problem occurs when the 2 sides get out of sync somehow.
It could be that I'm missing something in the library that allows me to deal with this. It also might be that this bug report ought to be in the couchdb repo. I'm not too certain where this ought to be resolved.
The text was updated successfully, but these errors were encountered: