-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jcontext #43
Conversation
…buffered until we know what to do with UseStopChan
At least by my standards.
Don't block on Add when supervisor is done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just flagged a few documentation issues, and a few usability tweaks.
Thank you very much for working with me so much.
It is truly my pleasure. I really appreciate being involved in this project, and this request for review has brightened my day!
P.S.: any chance you might consider adding a CONTRIBUTORS.md
? I'd love to be able to show off my name in there. 😆
v4/service.go
Outdated
to prevent any corruption in the previous state from crashing the | ||
service again. | ||
|
||
The Supervisor will pass a ctx down to the service. That context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to explain the intended use of the context. This only tells me what not to do.
v4/supervisor.go
Outdated
ctxMutex sync.Mutex | ||
ctx context.Context | ||
// This function cancels this supervisor specifically. | ||
myCancel func() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming to something more descriptive like cancel
(this is recognizable as the context’s cancel func, due to convention) or ctxCancel
.
v4/supervisor.go
Outdated
// participate in the log function propagation and recursive | ||
// UnstoppedService report. | ||
// | ||
// It is legal for GetSupervisor to potentiall return nil, in which case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: potentially
return nil | ||
} | ||
|
||
var currentSupervisorIDL sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace with atomic.AddUint32 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I've never gotten a clear sense of exactly what contract the atomic functions provide, in particular, the read guarantees they provide, so I'm a bit nervous about putting this out in such a basic library.
(In particular, I'd really like to know for sure for non-suture-related code that an atomic pointer swap's result can be safely read without any atomic operations, but I can't find any documentation or person willing to guarantee that. I've been doing it anyhow in some code and it seems to work, and it seems like there can't be any other purpose for the operation since there is no AtomicRead, but I'd really like it to be stated somewhere.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it would be absolutely astonishing if the return value of CompareAndSwap
, AddUint32
, etc required synchronization. It would practically defeat the purpose of these functions.
there is no AtomicRead
There's the atomic.Load*
family, but they seem to be the counterparty to atomic.Store*
.
Personally, I feel comfortable taking that bet (in fact, I still can't bring myself to consider it a bet at all), but I also can't imagine the mutex presents a meaningful bottleneck, either. I guess there's no harm in playing it safe.
v4/supervisor.go
Outdated
BadStopLogger func(*Supervisor, Service, string) | ||
|
||
// FailureLogger is called when a service fails | ||
FailureLogger func( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite often find myself writing log handlers to use github.com/lthibault/log as a logging backend. It’s quite tedious given the large arity of these functions. Consider passing in a struct instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I've thought about it, the more sensible that seems to be.
Let's go ahead and do it.
v4/service.go
Outdated
service again. | ||
|
||
The Supervisor will pass a ctx down to the service. That context | ||
can not be independently used to cancel the service, as the supervisor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't actually cancel a context that was passed in to you, as you don't have a reference to the cancel function, so I am not sure this explanation even makes sense.
I think the important thing to note here is that the context is used for indication of the lifetime of the service, namely when the done channel is closed, the service should terminate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read this as “if you derive a subcontext using WithCancel
, don’t try to cancel with the resulting CancelFunc
”.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? That cancellation will not propagate upwards.
The whole point of passing a context down to the service is so that it could abort all ongoing http requests and what not as the service is asked to terminate, to ensure prompt termination, so you definately want to pass down the context deeper, create child contexts, and by definition you always have to cancel the child contexts in order not to have leaks (even if the task scoped by the context is done, in which cancellation is just resource cleanup).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Audrius was correct in the first comment. I was simply mistaken because the child service won't get the cancel function passed to it.
Thank you both for going over the docs; I was just expecting to do it myself. I will hopefully get to reviewing these changes late today my time (I'm in EST).
v4/service.go
Outdated
not be restarted and removed from the supervisor entirely. | ||
|
||
ErrTerminateTree indicates that the entire supervision tree should be | ||
restarted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restarted or terminated?
The comment says restarted, the error says terminated.
Atleast in syncthings case, we're loking for termination, as we want to unwind everything in case of a fatal database error or something along those lines.
v4/service.go
Outdated
In Go 1.13 and greater, this is checked via errors.Is, so the error | ||
can be further wrapped with whatever additional info you like. Prior | ||
to Go 1.13, it will be checked via directly equality check, so the | ||
distinguished errors can not be wrapped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cannot
v4/service.go
Outdated
the Service should also have been performed. | ||
|
||
If a service does not stop within the supervisor's timeout duration, a log | ||
entry will be made with a descriptive string to that effect. This does |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you "make a log entry",
"the supervisor will log a message with a descriptive string to that effect"
// having terminated, and includes the timeout the supervisor | ||
// itself waited before closing the liveness channel. | ||
case <-s.ctx.Done(): | ||
return ErrTimeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this is called without the service being started?
I suspect you'll have a nil pointer exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sendControl above it shouldn't be able to succeed until the supervisor is started.
ctx, myCancel := context.WithCancel(ctx) | ||
s.ctxMutex.Lock() | ||
s.ctx = ctx | ||
s.ctxMutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in all cases, the only use of s.ctx
is just accessing it's Done()
, so perhaps you don't need to store the whole context, and could just store the done channel.
Also, select on a nil channel never "fires", so I think it might be safe to not even have a mutex for that?
func (s *Supervisor) Serve(ctx context.Context) error { | ||
// context documentation suggests that it is legal for functions to | ||
// take nil contexts, it's user's responsibility to never pass them in. | ||
if ctx == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think supporting passing nil context is a bad idea, as this effectively means that the user gives up the option to stop the service tree, which I don't think should ever be encouraged, and suggests sloppy application design.
If the user wants to be sloppy, he should pass context.TODO() or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this because a careful parsing of the context library documentation says it is the caller's responsibility to pass in a correct context, which I am reading as, it's OK for a receiver of a context to treat nil contexts as undefined behavior, basically, so it's OK to proceed without it.
You'll note I'm not documenting this option, or showing it in examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm with @AudriusButkevicius insofar as passing a nil
context is a code smell. However, I'm also aligned with @thejerf in his "defensive programming" impulse.
What if we returned an error? This would tick both the box of discouraging bad design and account for invalid input.
// and the way it works, I think it's OK in this case. This is the | ||
// exceptional case, basically. | ||
ctxMutex sync.Mutex | ||
ctx context.Context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use sync.AtomicValue
README.md
Outdated
with [contexts](https://golang.org/pkg/context/). If you are using suture | ||
for the first time, I recommend it. | ||
|
||
[suture v3](https://godoc.org/gopkg.in/thejerf/suture.v3{ is the latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[suture v3](https://godoc.org/gopkg.in/thejerf/suture.v3{ is the latest | |
[suture v3](https://godoc.org/gopkg.in/thejerf/suture.v3) is the latest |
v4/supervisor.go
Outdated
s.m.Unlock() | ||
|
||
response := make(chan serviceID) | ||
if !s.sendControl(addService{service, serviceName(service), response}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sendControl
now returns an error instead of a bool.
v4/supervisor.go
Outdated
case s.control <- sm: | ||
return nil | ||
case <-doneChan: | ||
return ErrTimeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using ErrTimeout
to represent that supervisor was stopped/context was cancelled feels like a source of confusion. I'd create a dedicated error like ErrSupervisorNotStarted
.
@thejerf What is your plan or status on this? Sorry for the nagging, I definitely don't mean to complain. I am interested because I have a largish refactor of Syncthing's db layer brewing that uses the v4 api. Would you appreciate a PR taking a stab at resolving the review comments? |
Sorry, fell through life's cracks. I've pushed most of the updates suggested here just now. I think I agree with lthibault that the logging should be done with one log function that takes a defined set of structs. It should define an interface that only our log messages conform to (to guarantee that no other ones will sneak in), and the log messages should be pre-configured with JSON struct tags for JSON log output. If you want to create a PR for that, that would be helpful. Tests will need a bit of work; possible a logging object where you can register hooks for each of the incoming kinds of structs, to simulate the way they are used now. I'll also try to work on it here at some point; I'll put a comment here when I start. |
Thanks for the update! I PRed a proposal for a new log/event interface, including adjusting tests: #44 |
v4/supervisor.go
Outdated
@@ -694,7 +699,7 @@ terminate. A timeout value of 0 means to wait forever. | |||
|
|||
If a nil error is returned from this function, then the service was | |||
terminated normally. If either the supervisor terminates or the timeout | |||
passes, ErrTimeout is returned. (If this isn't even the right supervisor | |||
passes, Err is returned. (If this isn't even the right supervisor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a find replace error? "Err"
v4: Add EventHook instead of various logging closures
1. Rename them all to start with Event, so they're contiguous in the godoc. 2. Add JSON serialization for all the events. 3. By suggestion, add a method to return map[string]interface{} for all events. 4. Include the original objects in the log, but mark the fields containing them as not to be serialized in JSON, so we don't serializing arbitrarily-large objects.
jcontext: fix regression introduced in #42 & various
@imsodin , @AudriusButkevicius , @lthibault , it is done and merged. This is my first attempt at making a true go module so let me know if I botched anything. But I did just quickly here make a new project that imported suture and it seemed to do what it should. Thank you all for you work with me and I hope this is helpful for you. I may still do some small doc tweaks but it's not anything worth holding a release for. |
@AudriusButkevicius , @lthibault , this is the PR I'm tentatively planning on going with.
It has all the testing, the merged PRs from the other branches, documentation updates. It's late where I am and I want to look at this with some fresh eyes and possibly gloss over the documentation once more before I push it, but this is my release candidate.
Thank you very much for working with me so much. I've placed a special thanks section in the README.md in this PR thanking Syncthing for working with me so well. If you'd like to see that phrasing changed please let me know.