Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on a Dafny standard library? #1236

Open
parno opened this issue May 25, 2021 · 22 comments
Open

Thoughts on a Dafny standard library? #1236

parno opened this issue May 25, 2021 · 22 comments
Assignees
Labels
area: ffi The {:extern} attribute and otherwise interfacing with code in other languages kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant)

Comments

@parno
Copy link
Collaborator

parno commented May 25, 2021

This summer, two undergraduates will be joining my lab for an REU program; the first one starts next week. My current plan is to have them work with me and some of my students to build up a full fledged standard library for Dafny. My hope is that this will make it easier for others to pick up and use Dafny for large-scale projects, and to reduce the amount of code duplication across such projects.

It would be great to include the library in Dafny's main CI, so that it doesn't bit rot as Dafny evolves. It looks like there's already a potential home for the library here: https://github.com/dafny-lang/libraries. Is that a reasonable place to make our contributions?

I'd be interested to hear the Dafny community's thoughts on the library's design and what should be included. I've listed some preliminary ideas below, but more ideas and pointers would be very welcome.

Style Guides

Existing Libraries

Possible Libraries

  • IO
    • Command-line arguments
    • File system
    • Networking
    • Printf
  • Math
    • Non-linear arithmetic
    • Bit-vector reasoning
  • Conversions
    • Between integers of different sizes
    • Between seq/arrays of integers
    • Between different built-in types (e.g., sequences, maps, sets)
  • Data structures
    • (Doubly) linked lists
    • Resizable arrays
    • Heaps
    • Trees
    • Maps and sets
      • Mutable versions
      • Lemmas to simplify reasoning
    • Sequence lemmas
  • Algorithms
    • Sorting
@parno parno added the kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny label May 25, 2021
@tchajed
Copy link
Contributor

tchajed commented May 25, 2021

In order to make a standard library useful, we would need to import an external dependencies in Dafny. Does anybody already do this?

If it's one self-contained library adding it as a submodule and importing it would work, but should we think about how to manage a tree of dependencies, so that the standard library can itself be broken down into smaller units? With only path-based imports I think this is tricky so I'm sort of asking for a language feature here.

@robin-aws
Copy link
Member

Hi @parno! Sorry for the delayed response - this is very exciting and well-timed. I was just bemoaning the fact that every Dafny codebase I've ever encountered has copied around what is hopefully the same version of Option<T> :)

I agree that sharing code could become easier with additional language features, but I don't think we should let that slow down the process of collating standard library code. I'd want to make it clear to consumers that the dafny/libraries repo is in an alpha state and subject to change while we work out the right content and organization of libraries.

I think IO will be one of the toughest cases to do well since it depends heavily on external code. I'd prefer to tackle that once we have a better solution for a sound and scalable FFI (see https://github.com/dafny-lang/dafny/labels/foreign-function-interface), which I intend to work on personally in the near future.

In any case, please do feel free to open PRs against that repo and we can figure out the details along the way! I know of some pieces lying around we may be able to contribute as well.

@parno
Copy link
Collaborator Author

parno commented Jun 3, 2021

@robin-aws Thanks! I agree that we should move ahead with library development in parallel with thinking about how to support library discovery/inclusion at the tool level. You're also right that IO will be tedious to support with all of the different back-end languages, so I'm happy to wait on improvements to the FFI.

At present, we're thinking of starting with collection lemmas (e.g., for sequences, sets, maps), then non-linear arithmetic, and then conversions between different integer widths and sequences thereof.

@robin-aws
Copy link
Member

That sounds great to me. Personally I'm dying to extract some of the functional programming staples like foldl/foldr and map into a shared library, rather than just using them as test cases for Dafny itself: https://github.com/dafny-lang/dafny/blob/master/Test/hofs/Folding.dfy :)

@keyboardDrummer keyboardDrummer added part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant) area: ffi The {:extern} attribute and otherwise interfacing with code in other languages labels Jul 19, 2021
@jaybosamiya
Copy link

jaybosamiya commented Jul 20, 2021

I'm copying below my comments from a discussion we had about Dafny library inclusion (with some additional context added at the beginning) to do better than lots of include "../../../dafny-stdlib/stuff. It would be good to be able to do this for more than just some hard-code for the standard library, in order to aid in making other reusable libraries too.

One potential approach (called "C-style" below) is to keep things relative to the specific file being verified/compiled, or any potential include paths passed to the specific invocation (e.g., -I ../../foo/bar), with potentially auto-included paths for the standard library (e.g., /usr/include and such used in C).

Another approach (called "Rust-style" below) is to have each library/package/project (or "crate" in Rust parlance) have a designated root, and things are referred to from this root. Thus, rather than referring to inter-project files at a file-granularity, it becomes project-granularity.


Personally, I really like Rust's approach. The considerations for Dafny may be different from those for Rust, but I feel they are aligned closely enough.

At a high level, I don't see any good reason for a modern system to use the C-style approach, other than it being more convenient to implement for the compiler. The cost is inconvenience to the users of the language, and in fact will just lead to repetitive re-implementation of all that include logic repeatedly in each library/crate/package and its respective build system (or lead to giant mono-repos, or copying code across repos). Given that we ourselves will be the users of the compiler, we're gonna have to face the inconvenience at some point or another- why not amortize it away?

Both C-style and Rust-style are effectively path-based, but C puts the burden on the user to maintain, while Rust has the compiler figure things out. Rust does have intra-file modules, but if we discount those, then all modules are path-based. The difference, to make things easier for users, is that Rust modules are (by default) referred to from the root of the crate, and allow for clean re-exports via pub use.

If we're talking about external imports, then too both are path based. C just has you declare stuff in the command line (via -L... -l...) while Rust has you declare it in Cargo.toml. Since cargo is also a package manager, on top of a build system, it does have magic to pull stuff in from crates.io and such, but you could instead also have an import via foo = {path = "../../asdf"} in your Cargo.toml (or indeed foo = {git = "..."} or similar). The convenience comes from the fact that you can now just use foo as if it were a normal module within your code now, rather than having to repeat the path again and again (or alternatively, make sure you are passing all the right include arguments in your compiler invocation).

At a high-level, the big difference between C-style and Rust-style imports can be boiled down to what is considered a "unit". Within C, there is no notion of a "crate"/"package", and instead the compilation unit is simply a file. For Rust instead, the compilation unit is the entire crate.

The include within C thus is largely a declaration of "these are the types outside the current unit". One could simply instead just have those declarations/prototypes manually included in the compilation unit, and indeed this is exactly how the pre-processor works. There is no "real" notion of includes in C, it is merely a facade. Since these includes are not "real", they (almost necessarily) have to only refer to files, and the most convenient way to refer to a file is via a file path. The idea of using -I to introduce include-paths is merely adding automatic path prefixes (and in fact #include <stdio.h> hides this one step further). All of this however, boils straight down to the fact that C is purely file-based. The actual library import only happens at the linker (and thus the -L... -l... stuff).

Rust, in contrast, has compilation unit as the whole crate. This doesn't mean that it needs to recompile the whole crate from scratch on any change (and in fact, recompilations are triggered file-based). However, it does mean that everything within the crate is visible (to the compiler, not to be confused with visibility to the programmer- controlled via pub/pub(crate) along with the module system). At this point, if everything had unique names, you technically don't need any includes (or use declarations). And indeed, you can just manually refer to anything without a use declaration simply by using its full name foo::bar::baz. The module system then is merely acting as a namespacing system within the crate, to keep things tidier. As long as something is within the same crate, the modules don't really need to manually specify paths, since everything knows the top-level root of the crate, and things can be referred to from that point downwards (thus crate::foo::bar. You could also refer upwards from a module (eg super::foo::bar), but that is purely just a convenient, used largely only in intra-file modules. Actual library imports are handled by rustc, funnily enough, are also handled via -L... -l..., but there is no -I.... Cargo takes up the burden of figuring out the linker arguments and passes them along to rustc.

@keyboardDrummer
Copy link
Member

@jaybosamiya

I agree we should optimise for the Dafny programmer over the Dafny compiler developer.

I'm curious whether we can introduce a Dafny project configuration file, like Rust's Crate.toml, and use that to compile a Dafny program as if all the files were concatenated into one big file, removing the need for specifying any include directives. I'll see if I can write a proposal with more details.

@jaybosamiya
Copy link

@keyboardDrummer Yeah, that could be quite neat.

Btw, Cargo.toml only specifies cross crate dependencies. The files within a crate never need to be specified in the configuration file, but are picked up automatically by rustc. Do you mean something similar, or did you have something else in mind?

I guess reading the details in the proposal would answer most of the questions I have about this

Oh, and I'm sure you've thought about this, but just to make sure it isn't missed: one thing to keep in mind is how files with differing Dafny arguments are handled. For example, some projects have non-linear arithmetic disabled throughout the project except for one file within it. Non-verification situations don't usually have this thing to worry about, but in Dafny, we do need to have proper handling for this.

@parno
Copy link
Collaborator Author

parno commented Jul 21, 2021

We also want to ensure we preserve the ability to verify files in a Dafny project in parallel. That said, I like the idea of removing include directives.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Jul 21, 2021

For example, some projects have non-linear arithmetic disabled throughout the project except for one file within it.

How do you currently handle this? Do you call Dafny multiple times with options specific to each set of files you call it on?

We also want to ensure we preserve the ability to verify files in a Dafny project in parallel.

I think we should make the verification run more parallel by default. I'm also hoping to prevent verifying components that were previously verified and who's inputs haven't changed.

@parno if we improve Dafny's support for consuming libraries, do you have particular dates by which that would be useful? Right now being one option.

@tchajed
Copy link
Contributor

tchajed commented Jul 21, 2021

How do you currently handle this? Do you call Dafny multiple times with options specific to each set of files you call it on?

Yes, that's what we do at least. (Just to clarify, I'm not in Bryan Parno's group. I only have one Dafny project and it probably does weird things.) The Makefile calls Dafny once per file, producing an empty foo.dfy.ok file to indicate success. Using some Makefile trickery we can customize the flags passed to Dafny pretty easily with paths or even wildcards.

@mschlaipfer
Copy link
Member

We also want to ensure we preserve the ability to verify files in a Dafny project in parallel. That said, I like the idea of removing include directives.

We should allow for method level parallelism.

@parno
Copy link
Collaborator Author

parno commented Jul 22, 2021

How do you currently handle this? Do you call Dafny multiple times with options specific to each set of files you call it on?

Yes, we've used various build tools (make, scons, home grown) to run Dafny on our files in parallel, to cache verification results (so we don't rerun Dafny on files where neither the file nor its dependencies have changed), and to allow per-file option customization. All three features have been pretty critical to scaling to large multi-developer projects. Most of this feels more natural to handle via a build tool than through Dafny itself, however.

We should allow for method level parallelism.
Dafny already has this, right? Both in the form of the caching the Dafny server does, and via Boogie's /vcsCores option.

@parno
Copy link
Collaborator Author

parno commented Jul 22, 2021

@keyboardDrummer In terms of timing, we're almost ready (1-2 days?) to push out initial versions of standard libraries for sets, sequences, maps, imaps, higher-order functions, and non-linear arithmetic. Once that PR is set up, it would be great to (a) figure out how to get the standard libraries included in Dafny's CI, and (b) make it easy to include these libraries in Dafny development. The students who have been working on the libraries have been quite productive, so they could potentially help with (b) if needed.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Jul 22, 2021

I agree we should do (a). I think one way would be to add a script dafny/Scripts/verifyReferenceProjects that downloads the standard library repository and verifies it, and then to call that script from .github/workflows/msbuild.yml

I'm also wondering whether it would be better to have the standard library be in the same repository as Dafny, that's what I've seen for some other languages, but I'm not sure and there's no rush in making this change.

For (b), help would be great. I happen to be writing a small proposal related to that. I'll try to share that soon.

@mmwinchell
Copy link

Hello everyone! For the past couple of months, we have been working on creating the standard library for Dafny. We would love to hear your feedback. The link can be found here .

@parno
Copy link
Collaborator Author

parno commented Aug 5, 2021

For (b), help would be great. I happen to be writing a small proposal related to that. I'll try to share that soon.

@keyboardDrummer Just checking in to see if you've been able to make progress on this. Our visiting student only has a few weeks left at this point. Please let us know if we can help in some way.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Aug 5, 2021

@parno

I've put my thoughts into the Dafny RFC template. You can read them here. Comments are much appreciated.

@RustanLeino
Copy link
Collaborator

The Dafny language proper only understands modules and imports. Thus, in the words of @jaybosamiya , the include mechanism just provides a facade. I would love to see a solution in terms of a project configuration file with the various kinds of options mentioned above.

@sarahc7
Copy link

sarahc7 commented Sep 7, 2021

Hi everyone! Thanks for all of your valuable feedback on the Collections and NonlinearArithmetic sections of the library. We have since added NatSeq. Any comments are appreciated :)

@yizhou7
Copy link
Contributor

yizhou7 commented Sep 19, 2021

I agree we should do (a). I think one way would be to add a script dafny/Scripts/verifyReferenceProjects that downloads the standard library repository and verifies it, and then to call that script from .github/workflows/msbuild.yml

@keyboardDrummer Hi we would like to follow up on including the library in the dafny CI so that new versions of Dafny won't break this. Is there some way we can help? Thanks!

@robin-aws
Copy link
Member

robin-aws commented Sep 19, 2021

@yizhou7 I think by far the simplest solution would just be to add the library as a submodule of dafny-lang/dafny, somewhere under the Test directory. Since the library is already using lit, that will automatically include it in the Dafny regression suite. We can set up Dependabot to make sure the submodule gets updated when the library repo changes.

Feel free to open a PR! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ffi The {:extern} attribute and otherwise interfacing with code in other languages kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant)
Projects
None yet
Development

No branches or pull requests