context: add AfterFunc #57928

Closed
neild opened this issue Jan 19, 2023 · 41 comments

@neild
Contributor

neild commented Jan 19, 2023

Edit: The latest version of this proposal is #57928 (comment).


This proposal originates in discussion on #36503.

Contexts carry a cancellation signal. (For simplicity, let us consider a context past its deadline to be cancelled.)

Using a context's cancellation signal to terminate a blocking call to an interruptible but context-unaware function is tricky and inefficient. For example, it is possible to interrupt a read or write on a net.Conn or a wait on a sync.Cond when a context is cancelled, but only by starting a goroutine to watch for cancellation and interrupt the blocking operation. While goroutines are reasonably efficient, starting one for every operation can be inefficient when operations are cheap.

I propose that we add the ability to register a function which is called when a context is cancelled.

package context

// OnDone arranges for f to be called in a new goroutine after ctx is cancelled.
// If ctx is already cancelled, f is called immediately.
// f is called at most once.
//
// Calling the returned CancelFunc waits until any in-progress call to f completes,
// and stops any future calls to f.
// After the CancelFunc returns, f has either been called once or will not be called.
//
// If ctx has a method OnDone(func()) CancelFunc, OnDone will call it.
func OnDone(ctx Context, f func()) CancelFunc

OnDone permits a user to efficiently take some action when a context is cancelled, without the need to start a new goroutine in the common case when operations complete without being cancelled.

OnDone makes it simple to implement the merged-cancel behavior proposed in #36503:

func WithFirstCancel(ctx1, ctx2 context.Context) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(ctx1)
	stopf := context.OnDone(ctx2, func() {
		cancel()
	})
	return ctx, func() {
		cancel()
		stopf()
	}
}
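
A runnable sketch of the same merged-cancel idea, assuming Go 1.21 or later and substituting context.AfterFunc (the name under which this API eventually shipped) for the proposed OnDone. The names withFirstCancel and mergedCancelDemo are illustrative only, and the shipped stop function returns a bool rather than blocking:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withFirstCancel returns a context derived from ctx1 that is also
// cancelled when ctx2 is done. (Illustrative name, not a stdlib API.)
func withFirstCancel(ctx1, ctx2 context.Context) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(ctx1)
	// Cancel the merged context once ctx2 is done.
	stop := context.AfterFunc(ctx2, cancel)
	return ctx, func() {
		stop() // drop the registration on ctx2
		cancel()
	}
}

// mergedCancelDemo cancels only the second parent and reports whether
// the merged context became done within one second.
func mergedCancelDemo() bool {
	ctx1 := context.Background()
	ctx2, cancel2 := context.WithCancel(context.Background())
	merged, cancel := withFirstCancel(ctx1, ctx2)
	defer cancel()

	cancel2()
	select {
	case <-merged.Done():
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("merged context cancelled:", mergedCancelDemo())
}
```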

Or to stop waiting on a sync.Cond when a context is cancelled:

func Wait(ctx context.Context, cond *sync.Cond) error {
	stopf := context.OnDone(ctx, cond.Broadcast)
	defer stopf()
	cond.Wait()
	return ctx.Err()
}
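
Note that sync.Cond.Wait requires the caller to hold cond.L, and the broadcast needs to happen under the same lock so the waiter cannot miss it. A hedged, runnable sketch of that pattern, assuming Go 1.21 or later and using context.AfterFunc in place of the proposed OnDone (waitOnCond, conditionMet, and condTimeoutDemo are illustrative names):

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// waitOnCond blocks until conditionMet returns true or ctx is done.
// The caller must hold cond.L, as with any call to cond.Wait.
func waitOnCond(ctx context.Context, cond *sync.Cond, conditionMet func() bool) error {
	stop := context.AfterFunc(ctx, func() {
		// Take the lock so the broadcast cannot slip in between the
		// waiter's condition check and its call to cond.Wait.
		cond.L.Lock()
		defer cond.L.Unlock()
		cond.Broadcast()
	})
	defer stop()

	for !conditionMet() {
		cond.Wait()
		if ctx.Err() != nil {
			return ctx.Err()
		}
	}
	return nil
}

// condTimeoutDemo waits on a condition that never becomes true and
// returns the error produced when the context times out.
func condTimeoutDemo() error {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()

	var mu sync.Mutex
	cond := sync.NewCond(&mu)
	mu.Lock()
	defer mu.Unlock()
	return waitOnCond(ctx, cond, func() bool { return false })
}

func main() {
	fmt.Println("wait returned:", condTimeoutDemo())
}
```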

The OnDone func is executed in a new goroutine rather than synchronously in the call to CancelFunc that cancels the context because context cancellation is not expected to be a blocking operation. This does require the creation of a goroutine, but only in the case where an operation is cancelled and only for a limited time.

The CancelFunc returned by OnDone provides both a mechanism for cleaning up resources consumed by OnDone and a synchronization mechanism. (See the ContextReadOnDone example below.)

Third-party context implementations can provide an OnDone method to efficiently schedule OnDone funcs. This mechanism could be used by the context package itself to improve the efficiency of third-party contexts: Currently, context.WithCancel and context.WithDeadline start a new goroutine when passed a third-party context.


Two more examples; first, a context-cancelled call to net.Conn.Read using the APIs available today:

// ContextRead demonstrates bounding a read on a net.Conn with a context
// using the existing Done channel.
func ContextRead(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
	errc := make(chan error)
	donec := make(chan struct{})
	// This goroutine is created on every call to ContextRead, and runs for as long as the conn.Read call.
	go func() {
		select {
		case <-ctx.Done():
			conn.SetReadDeadline(time.Now())
			errc <- ctx.Err()
		case <-donec:
			close(errc)
		}
	}()
	n, err = conn.Read(b)
	close(donec)
	if ctxErr := <-errc; ctxErr != nil {
		conn.SetReadDeadline(time.Time{})
		err = ctxErr
	}
	return n, err
}

And with context.OnDone:

func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
	var ctxErr error
	// The OnDone func runs in a new goroutine, but only when the context expires while the conn.Read is in progress.
	stopf := context.OnDone(ctx, func() {
		conn.SetReadDeadline(time.Now())
		ctxErr = ctx.Err()
	})
	n, err = conn.Read(b)
	stopf()
	// The call to stopf() ensures the OnDone func is finished modifying ctxErr.
	if ctxErr != nil {
		conn.SetReadDeadline(time.Time{})
		err = ctxErr
	}
	return n, err
}
@neild neild added the Proposal label Jan 19, 2023
@gopherbot gopherbot added this to the Proposal milestone Jan 19, 2023
@gopherbot
Contributor

Change https://go.dev/cl/462855 mentions this issue: context: add OnDone

@neild
Contributor Author

neild commented Jan 20, 2023

It is worth noting that OnDone is a performance optimization. We can write it today in terms of the existing context package. The benefit of adding it to the package is that it permits us to take action on context cancellation without leaving a goroutine sitting around waiting for the cancel to happen.

func OnDone(ctx context.Context, f func()) context.CancelFunc {
	stopc := make(chan struct{}, 1)
	donec := make(chan struct{})
	go func() {
		select {
		case <-ctx.Done():
			f()
		case <-stopc:
		}
		close(donec)
	}()
	return func() {
		select {
		case stopc <- struct{}{}:
		default:
		}
		<-donec
	}
}
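
A small sketch that exercises the blocking-stop guarantee of this goroutine-based version. The function onDone is a lowercase copy of the code above, and onDoneDemo is a hypothetical helper. The first case waits until f has been entered before calling stop, because a stop that races with cancellation may legitimately win and prevent f from running:

```go
package main

import (
	"context"
	"fmt"
)

// onDone mirrors the goroutine-based OnDone shown above.
func onDone(ctx context.Context, f func()) context.CancelFunc {
	stopc := make(chan struct{}, 1)
	donec := make(chan struct{})
	go func() {
		select {
		case <-ctx.Done():
			f()
		case <-stopc:
		}
		close(donec)
	}()
	return func() {
		select {
		case stopc <- struct{}{}:
		default:
		}
		<-donec
	}
}

// onDoneDemo reports whether f ran when the context was cancelled first
// (ranAfterCancel) and when stop was called first (ranAfterStop).
func onDoneDemo() (ranAfterCancel, ranAfterStop bool) {
	// Cancel first, and wait until f has been entered before stopping.
	ctx1, cancel1 := context.WithCancel(context.Background())
	entered := make(chan struct{})
	stop1 := onDone(ctx1, func() {
		close(entered)
		ranAfterCancel = true
	})
	cancel1()
	<-entered
	stop1() // blocks until f has returned, so the write above is visible

	// Stop first: the watcher goroutine exits, so a later cancel is ignored.
	ctx2, cancel2 := context.WithCancel(context.Background())
	stop2 := onDone(ctx2, func() { ranAfterStop = true })
	stop2()
	cancel2()
	return ranAfterCancel, ranAfterStop
}

func main() {
	a, b := onDoneDemo()
	fmt.Println("ran after cancel:", a, "ran after stop:", b)
}
```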

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jan 20, 2023
@ianlancetaylor
Contributor

The ContextReadOnDone example is still a bit awkward, in the sense that there are a few things that have to be done exactly right to avoid any problems. Perhaps we can tighten up the idea, at the cost of another closure.

// Try executes fn while watching ctx.  If ctx is cancelled, Try calls cancel and returns ctx.Err().
// Otherwise, Try returns the result of fn.
// This is implemented using an internal implementation of OnDone as described above.
func Try(ctx context.Context, fn func() error, cancel func()) error

// Example of using Try.
func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
    err = context.Try(ctx, func() error {
        // Assign to the enclosing n; the read error is returned to Try.
        var readErr error
        n, readErr = conn.Read(b)
        return readErr
    }, func() {
        // Called only if ctx is cancelled: force the pending Read to return.
        conn.SetReadDeadline(time.Now())
    })
    return n, err
}

@neild
Contributor Author

neild commented Jan 20, 2023

Try does look quite elegant to use. I don't think it would work for the WithFirstCancel case, and in general Try can be implemented in terms of OnDone but not vice-versa, so if we were to have only one I'd vote for OnDone.

func Try(ctx context.Context, fn func() error, cancel func()) error {
    stopf := context.OnDone(ctx, cancel)
    err := fn()
    stopf()
    if ctx.Err() != nil {
        return ctx.Err()
    }
    return err
}

@bcmills
Contributor

bcmills commented Jan 20, 2023

The Try function looks a lot like the os/exec functionality added for #50436.

  • Try is analogous to (*exec.Cmd).Wait.
  • The fn argument is analogous to the subprocess itself.
  • The cancel argument is analogous to the exec.Cmd.Cancel field.

@bcmills
Contributor

bcmills commented Jan 20, 2023

Based on that analogy, I would suggest a slightly different implementation of Try in terms of OnDone:

func Try(ctx context.Context, fn func() error, cancel func()) error {
	canceled := false
	stop := context.OnDone(ctx, func() {
		cancel()
		canceled = true
	})
	err := fn()
	stop()
	if canceled && err == nil {
		return ctx.Err()
	}
	return err
}

Notably: in case of cancellation I would prefer to still return the result of fn if it is non-nil, since it may contain more detail about the result, but I would return ctx.Err() if fn returns nil in case the result of fn is spurious.
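
The error-precedence rule described above can be sketched in runnable form. This version assumes Go 1.21 or later and is built on context.AfterFunc, whose stop function does not wait for f, so it uses an extra channel to wait for cancel to finish. The names try and tryDemo are illustrative, not part of any package:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// try runs fn and calls cancel if ctx becomes done while fn is running.
// A non-nil error from fn takes precedence over ctx.Err().
func try(ctx context.Context, fn func() error, cancel func()) error {
	done := make(chan struct{})
	stop := context.AfterFunc(ctx, func() {
		defer close(done)
		cancel()
	})
	err := fn()
	if stop() {
		// The registration was removed before ctx was done;
		// cancel was not called and never will be.
		return err
	}
	<-done // cancel has been started; wait for it to finish
	if err == nil {
		return ctx.Err()
	}
	return err
}

// tryDemo returns the error from a call interrupted by a timeout whose fn
// reports no error of its own, and the error from a call whose fn fails
// by itself.
func tryDemo() (interruptedErr, ownErr error) {
	ctx, cancelCtx := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancelCtx()

	interrupted := make(chan struct{})
	interruptedErr = try(ctx,
		func() error { <-interrupted; return nil },
		func() { close(interrupted) })

	ownErr = try(context.Background(),
		func() error { return errors.New("read failed") },
		func() {})
	return interruptedErr, ownErr
}

func main() {
	a, b := tryDemo()
	fmt.Println(a)
	fmt.Println(b)
}
```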

@rsc
Contributor

rsc commented Feb 1, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Feb 1, 2023
@rsc
Contributor

rsc commented Feb 8, 2023

This seems similar to having time.AfterFunc to avoid making a goroutine that calls time.Sleep and then the function. Here we avoid making a goroutine that receives from ctx.Done and then calls the function. Perhaps it should be context.After or AfterFunc?

context.After(ctx, func() { println("ctx is done!") })

reads nicely to me. It's nice that this would let people implement context.Merge themselves at no efficiency cost compared to the standard library.

We should specify that f is always run in a goroutine by itself, at least semantically, even in the case where

// If ctx is already cancelled, f is called immediately.

f shouldn't be called by context.After in that case either.

@Sajmani
Contributor

Sajmani commented Feb 8, 2023

I like context.After

@neild
Contributor Author

neild commented Feb 8, 2023

After is nice and short, but I think I prefer context.AfterFunc for consistency with time.AfterFunc.

@abursavich

This may be implied, but to clarify since I don't see it in the example... After/AfterFunc should return a CancelFunc like the initial OnDone proposal.

@rsc
Contributor

rsc commented Feb 22, 2023

OK, so it sounds like the signature is

package context
func AfterFunc(ctx Context, f func()) (stop func() bool)

AfterFunc arranges to call f after ctx is done (cancelled or timed out), and it calls f in a goroutine by itself. Even if ctx is already done, calling AfterFunc does not wait for f to return.

Multiple calls to AfterFunc on a given ctx are valid and operate independently; one does not replace another.

Calling stop stops the association of ctx with f. It reports whether the call stopped f from being run.
If stop returns false, then the context is already done and the function f has been started in its own goroutine; stop does not wait for f to complete before returning. If the caller needs to know whether f is completed, it must coordinate with f explicitly.

(This last paragraph is adapted from time.Timer.Stop.)

Do I have that right? Anything wrong there?
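
A minimal sketch of these stop semantics, assuming the context.AfterFunc that shipped in Go 1.21. The helper stopDemo is hypothetical. It relies on cancellation of a WithCancel context notifying registered functions before the cancel call returns, which is how the standard library behaves:

```go
package main

import (
	"context"
	"fmt"
)

// stopDemo returns the result of stop when it is called before the
// context is done, and when it is called after the context is done.
func stopDemo() (before, after bool) {
	// stop called while ctx is still live: f is prevented from running.
	ctx1, cancel1 := context.WithCancel(context.Background())
	stop1 := context.AfterFunc(ctx1, func() { panic("f must not run") })
	before = stop1()
	cancel1()

	// stop called after ctx is done: f has already been started.
	ctx2, cancel2 := context.WithCancel(context.Background())
	finished := make(chan struct{})
	stop2 := context.AfterFunc(ctx2, func() { close(finished) })
	cancel2()
	after = stop2()
	<-finished // stop does not wait for f, so coordinate explicitly
	return before, after
}

func main() {
	before, after := stopDemo()
	fmt.Println("stop before done:", before)
	fmt.Println("stop after done:", after)
}
```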

@neild
Contributor Author

neild commented Feb 22, 2023

I'd prefer to have calling stop wait until f is completed if it has already started, since that makes using AfterFunc in a race-free manner simpler; you're guaranteed that after a call to stop, f either has run or will not run. But it doesn't make a difference in most of the motivating examples so perhaps consistency with time.AfterFunc is preferable.

@ChrisHines
Contributor

Even if ctx is already done, calling AfterFunc does not wait for f to return.

I find this sentence a bit off when I read it. If I understand the proposal, calling AfterFunc never waits for f to return. The above sentence seems to emphasize the corner case of ctx already being done, though. That emphasis makes it seem like there is a special case involved, but there really isn't. Consider rewording to:

Calling AfterFunc does not wait for f to return, even if ctx is already done.

I think this better emphasizes that the behavior is always the same while still pointing out the less than obvious corner case.

@powerman

While naming the returned func stop looks reasonable to avoid confusion with the already existing cancel func, I don't think stop is a good name for this use case.
Moreover, as stop is related to f, stop() == false may be misunderstood as "f is running now".
To me cancel still sounds much more suitable here, and as there is some confusion in both cases, maybe it's too early to reject cancel as a candidate name.
As for other names… maybe detach, unhook, revoke, prevent?

@rsc
Contributor

rsc commented Mar 1, 2023

@powerman stop is the same name as the time.Timer.Stop method.

@neild, what if f is long-running? Then there's no way to stop the AfterFunc without waiting for f? And in particular defer stop() now can't be used? I guess defer func() { go stop() }() is possible but it seems like mixing concerns.

@neild
Contributor Author

neild commented Mar 1, 2023

If f is long-running, then you'd need to make arrangements to interrupt it.

If you do want a long-running f, and a stop that doesn't wait for it, you can start a goroutine for the long-running operation explicitly:

started := false
stop := context.AfterFunc(ctx, func() {
  started = true
  go longRunningOperation()
})
stop()
if started {
  // longRunningOperation is in progress
}

This is less convenient than a non-blocking stop (and a bit less efficient), but I think it's also the less common case. Every motivating example I've come up with for AfterFunc is fast: signaling a sync.Cond, setting a timeout on a net.Conn, etc.

A non-blocking stop makes it easier to inadvertently leak a long-running operation. In general, functions should clean up any goroutines they start before returning. A blocking stop ensures that the AfterFunc isn't left running unless the programmer takes specific steps to create a goroutine for it, as above.

A non-blocking stop can also make the common case quite a bit more subtle, and possibly less efficient. Taking the case of reading from a net.Conn, we need to create a channel to synchronize with the AfterFunc goroutine even in the common case where the AfterFunc is not called:

func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
	stopped := make(chan struct{})
	stopf := context.AfterFunc(ctx, func() {
		conn.SetReadDeadline(time.Now())
		close(stopped)
	})
	n, err = conn.Read(b)
	if !stopf() {
		// The AfterFunc function may still be running, so we need to wait for it to finish before resetting the conn deadline.
		//
		// Failing to wait here means that we might return with a still-running goroutine which will set the
		// conn deadline at some point in the future (if we have a race between Read returning successfully
		// and the context expiring).
		<-stopped
		conn.SetReadDeadline(time.Time{})
		err = ctx.Err()
	}
	return n, err
}

@powerman

powerman commented Mar 1, 2023

@neild your second example is great, but the first one looks very race-prone:

started := false
stop := context.AfterFunc(ctx, func() {
  started = true
  go longRunningOperation()
})
stop()
if started {
  // longRunningOperation is in progress
}

TBH I don't remember the outcome of the last change to https://go.dev/ref/mem (i.e. whether it is safe to read/write int-sized vars like bool), but even if it is safe, the callback can be started by context.AfterFunc without yet having executed its first operation, started = true, so there is a race here anyway. Probably worth rewriting, because, you know, people will copy-paste it from here too. :)

@neild
Contributor Author

neild commented Mar 1, 2023

@powerman The first example assumes a stop function which blocks until any in-progress call to f has completed. The call to stop synchronizes access to the started var; after stop returns, either the func has run and set started = true or it will never run and started remains false.

@powerman

powerman commented Mar 1, 2023

But doesn't a blocking stop still return a bool, which could be used instead of the extra started var?

@neild
Contributor Author

neild commented Mar 1, 2023

A blocking stop could return a bool indicating whether f ran or not, but there's less need for it. A non-blocking stop must provide a way to tell whether f has been started.

@rsc
Contributor

rsc commented Mar 8, 2023

A blocking stop can easily lead to deadlocks, especially if these functions are trying to send on channels to notify other goroutines that the context is cancelled. A non-blocking stop won't, and it matches time.Timer.Stop. I'm not entirely sure how to decide between those benefits and the ones @neild has pointed out.

@neild
Contributor Author

neild commented Mar 8, 2023

Perhaps a non-blocking stop to match time.Timer.Stop, plus the Try function proposed by @ianlancetaylor above to handle the cases where you want to synchronize on the AfterFunc?

// Try executes fn while watching ctx.  If ctx is cancelled, Try calls cancel and returns ctx.Err().
// Otherwise, Try returns the result of fn.
func Try(ctx context.Context, fn func() error, cancel func()) error

Alternatively, AfterFunc could return a type with Stop and Wait methods. Or Stop could return something that can be waited on. (func AfterFunc(f func()) func() func()?)

@rsc
Contributor

rsc commented Apr 6, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc moved this from Likely Accept to Accepted in Proposals Apr 6, 2023
@rsc rsc changed the title proposal: context: add AfterFunc context: add AfterFunc Apr 6, 2023
@rsc rsc modified the milestones: Proposal, Backlog Apr 6, 2023
@neild neild self-assigned this Apr 6, 2023
@znkr
Contributor

znkr commented Apr 11, 2023

One thing that wasn't discussed yet is how this proposal interacts with #40221 (context.WithoutCancel).

IIUC, the function supplied in AfterFunc is still called, which means that using AfterFunc to do anything with context values (like closing a trace span) is going to be problematic if context.WithoutCancel is ever used. I know that it's possible to do everything that context.AfterFunc is doing, but I wonder if it's going to be more attractive to assume that contexts have a life-cycle that can be hooked into, while at the same time a method exists to escape a context from that life-cycle.

Internally at Google, we have a solution that adds a different life-cycle hook to do work during context detaching (e.g. to open a new trace span for a background goroutine), assuming the right context detaching method is used.

It's probably worth a separate proposal, but I do wonder if the implementation of this proposal is going to make it more complicated to add detaching functionality later. To be clear, I don't know the answer to that question.

@neild
Contributor Author

neild commented Apr 11, 2023

AfterFunc(ctx, f) calls f after ctx is done. If ctx is the result of WithoutCancel, it never becomes done and f will never be called.

WithoutCancel(parent) creates a new context. It does not affect the cancelation of the parent context, and will not interfere with functions registered with AfterFunc on the parent context.

parent, cancel := context.WithCancel(context.Background())
context.AfterFunc(parent, func() {
  fmt.Println("parent canceled")
})
child := context.WithoutCancel(parent)
context.AfterFunc(child, func() {
  fmt.Println("child canceled") // this will never be called
})
cancel()
// Prints "parent canceled" and nothing else.
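
Because AfterFunc runs f in its own goroutine, a program that wants to observe this behavior needs to wait for f explicitly. A runnable sketch of the same scenario, assuming Go 1.21 or later (withoutCancelDemo is a hypothetical helper, and the 50ms wait is an arbitrary bound chosen for the demonstration):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withoutCancelDemo cancels a parent context and reports whether the
// functions registered on the parent and on a WithoutCancel child ran.
func withoutCancelDemo() (parentRan, childRan bool) {
	parent, cancel := context.WithCancel(context.Background())
	parentDone := make(chan struct{})
	context.AfterFunc(parent, func() { close(parentDone) })

	child := context.WithoutCancel(parent)
	childDone := make(chan struct{})
	context.AfterFunc(child, func() { close(childDone) })

	cancel()

	select {
	case <-parentDone:
		parentRan = true
	case <-time.After(time.Second):
	}
	// The child never becomes done, so give its function a short
	// window to run and then conclude that it did not.
	select {
	case <-childDone:
		childRan = true
	case <-time.After(50 * time.Millisecond):
	}
	return parentRan, childRan
}

func main() {
	p, c := withoutCancelDemo()
	fmt.Println("parent func ran:", p)
	fmt.Println("child func ran:", c)
}
```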

@znkr
Contributor

znkr commented Apr 11, 2023

Got it. Do you think it might make sense to be explicit about the fact that the function might never be called? That might dissuade its use in cases like this:

func dispatchRequest(ctx context.Context) {
  ctx, cancel := context.WithCancel(ctx)
  defer cancel()
  handleRequest(ctx)
}

func handleRequest(ctx context.Context) {
  var span trace.Span
  ctx, span = tracer.Start(ctx, "operation")
  context.AfterFunc(ctx, span.End)
  backgroundAction(ctx)
}

func backgroundAction(ctx context.Context) {
  ctx = context.WithoutCancel(ctx)
  go someLibraryFunc(ctx)
}

func someLibraryFunc(ctx context.Context) {
  span := trace.SpanFromContext(ctx) // <--- span has likely ended already; updates are not allowed
}

Arguably, this issue already exists today. I am wondering if it will be worse, because it's too easy to assume that context.AfterFunc is going to do something different.

@gopherbot
Contributor

Change https://go.dev/cl/486535 mentions this issue: doc: add release note for context.AfterFunc

gopherbot pushed a commit that referenced this issue Apr 20, 2023
For #40221
For #56661
For #57928

Change-Id: Iaf7425bb26eeb9c23235d13c786d5bb572159481
Reviewed-on: https://go-review.googlesource.com/c/go/+/486535
Run-TryBot: Damien Neil
Reviewed-by: Sameer Ajmani
TryBot-Result: Gopher Robot
@rogpeppe
Contributor

I only just saw this issue; sorry for the late comment.

I'm not entirely sure whether the semantics of stop are correct here (and I'm not convinced that Timer.Stop is good precedent, because that API is notoriously error-prone and hard to use).

An alternative semantic could be:

// The stop function reports whether the function has been successfully stopped; that is, it returns false
// if and only if the function has been invoked already.

That makes it feasible to call stop multiple times concurrently and have consistent results between them, and it seems to me like a simpler invariant to explain.

I'm wondering what the use case is for knowing whether this is the stop call that prevented the function running.
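
For reference, a short sketch of what repeated stop calls report under the semantics documented for context.AfterFunc in Go 1.21, where a false result can also mean that f was already stopped. The helper doubleStopDemo is hypothetical:

```go
package main

import (
	"context"
	"fmt"
)

// doubleStopDemo calls stop twice on a context that is never done
// before the calls, and returns both results.
func doubleStopDemo() (first, second bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	stop := context.AfterFunc(ctx, func() {})
	first = stop()  // this call prevents f from running
	second = stop() // f was already stopped by the first call
	return first, second
}

func main() {
	first, second := doubleStopDemo()
	fmt.Println("first stop:", first)
	fmt.Println("second stop:", second)
}
```

Only the first call reports true, so under these semantics the result identifies the call that prevented f from running rather than whether f has been invoked.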

@ianlancetaylor
Contributor

@rogpeppe Thanks, I suggest that you open that in a new issue, as these semantics have already been implemented. We can make the new issue a release blocker for 1.21. Thanks.

@rogpeppe
Contributor

@ianlancetaylor Thanks. Done.
