Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: improve performance of sequences in C# runtime by making slices lazy #2313

Open
cpitclaudel opened this issue Jun 28, 2022 · 2 comments
Assignees
Labels
area: performance Performance issues lang: c# Dafny's C# transpiler and its runtime part: language definition Relating to the Dafny language definition itself status: needs-decision The cause of this issue is clear; the team needs to decide whether we want to work on it

Comments

@cpitclaudel
Copy link
Member

cpitclaudel commented Jun 28, 2022

Currently, the following function has quadratic complexity when transpiled to C#:

function sum(x: seq<int>): int { if s == [] then 0 else s[0] + sum(s[1..]) }

This is surprising to users. It happens because s[1..] creates a copy of the subsequence 1 .. of s.

Workarounds exist (iterating using an index, keeping this as the spec by using a by method, etc.), but they all boil down to "don't use slices" in executable code, which is too bad.

I propose to change the runtime so that:

  1. s[x..y] creates a lazy "view" into the original sequence (not a copy).
  2. only s[..] creates a copy, allowing the underlying memory to be freed.

I argue that this is not a breaking change, since the runtime performance of sequence is not specified. However, it could cause an increase in memory usage for programs that create large sequences and then keep references (through slices) to only small portions of that data.

This solution is in line with what Go does with slices. It is fast and convenient, but it has one downside: a small slice can keep the whole sequence alive in memory. This is a well-know caveat in go. In fact, the manual says:

A possible “gotcha”

As mentioned earlier, re-slicing a slice doesn’t make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.

@cpitclaudel cpitclaudel added area: performance Performance issues lang: c# Dafny's C# transpiler and its runtime part: language definition Relating to the Dafny language definition itself status: needs-decision The cause of this issue is clear; the team needs to decide whether we want to work on it labels Jun 28, 2022
@robin-aws
Copy link
Member

This isn't really C# specific is it? It's just fine to start with implementing it in C#, but if it's a good idea there it should be a good idea in all backends.

That might mean we should work out a way to provide the implementation for a type like seq<T> once in Dafny before tackling this, so we can reliably provide this in all backends and tell users "slicing is efficient in Dafny" without having to add "...when compiling to C#" :)

@robin-aws robin-aws self-assigned this Jul 27, 2022
@robin-aws
Copy link
Member

Assigning to myself as this will be a side effect of #2390 (since it turns out C++ already implements slices this way, so if I'm going to replace its implementation of sequences with a common Dafny-based one, it needs to as well or else I'll cause a performance regression :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: performance Performance issues lang: c# Dafny's C# transpiler and its runtime part: language definition Relating to the Dafny language definition itself status: needs-decision The cause of this issue is clear; the team needs to decide whether we want to work on it
Projects
None yet
Development

No branches or pull requests

2 participants