-
-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix force-renewing revoked on-demand certs #166
Conversation
Follow-up to 9245be5
When I initially wrote the auto-replace feature, it was for the standard mode of operation, which I presumed the vast majority of CertMagic deployments use. At the time, On-Demand mode of operation was fairly niche. And at the time, it looked tricky to properly enable this feature for on-demand certificates, so I shelved it considering it would be low-impact anyway. So on-demand certificates didn't benefit from auto-replace in the case of revocation (oh well, no other servers / ACME clients do that at all anyway). I guess since that time, the use of CertMagic's exclusive on-demand feature has risen in popularity. But there is no way to tell, and I had no real way of knowing whether any significant use of the feature is being had since Caddy has no telemetry. (We used to have telemetry -- benign, anonymous technical stats to help us understand usage -- but unfortunately public backlash forced us to end the program.) Based on public feedback forced by external events, it seems that on-demand TLS deployments are probably rare, but each of those few deployments actually serve thousands of sites/domains. (The true importance of this feature would have been clear months ago if Caddy had telemetry, as Caddy is the primary importer of CertMagic.) This commit should enable auto-replace for on-demand certificates. It required some refactoring and some decisions that aren't *entirely* clear are right, but that's how it goes. I haven't tested this. (Last time I worked on this feature it took me about 2 days to test properly.)
Ugh, this is so tricky. Spent a couple days on this patch now and have only read through the code a dozen times to try to ensure it will be as correct as possible. Testing this functionality is so hard (last time it took me about a full day to set things up and test -- and even then, it's like one of those things where "the last time you fold up the parachute, you better do it right"). |
Required significant refactoring, hope it works. Yet again way too late at night for this...
I went ahead and implemented OCSP refreshing and revocation checking at load-time too (i.e. the call to This required some significant refactoring and it's super clear to me now why I didn't do this before. Overall, this PR should make CertMagic's OCSP stapling and revocation checking more robust, and also including on-demand certificates now too, whereas it didn't before. Very difficult to test this properly, so that will probably have to come later. (Feel free if anyone wants to help out!) |
if ocspResp.NextUpdate.After(cert.Leaf.NotAfter) { | ||
// uh oh, this OCSP response expires AFTER the certificate does, that's kinda bogus. | ||
// it was the reason a lot of Symantec-validated sites (not Caddy) went down | ||
// in October 2017. https://twitter.com/mattiasgeniar/status/919432824708648961 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tweet got deleted it seems, unfortunately.
This is still untested and it would take another day or three to properly test this (like it did before, only longer for this one, due to conditions) -- but I'm going to give this a shot and see how it goes. |
As it turns out, this PR was almost correct. We later tested it in production and pushed a few more commits, and now we have good evidence that it works as expected. |
* Fix force-renewing revoked on-demand certs Follow-up to 9245be5 * One more fix for on-demand logic of revoked certs * OCSP revocation checks at startup, too Required significant refactoring, hope it works. Yet again way too late at night for this...
Follow-up to 9245be5
Now I remember why I didn't implement this at first. Kind of complicated... and nobody at the time really needed it, as far as I knew.