Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Dafny-to-Rust code indentation and identifiers #4974

Merged
merged 43 commits into from
Feb 7, 2024

Conversation

MikaelMayer
Copy link
Member

@MikaelMayer MikaelMayer commented Jan 10, 2024

Improvements of the DAST to Rust code generator:

  • The code generator emits a Rust-like AST that is then formatted with proper indentation
  • Ensured Rust-like identifiers in Dafny stay the same in Rust (like "one_variable")
  • Added Initial implementation of sequences and maps in the DafnyRuntimeRust, following the implementation of the Go
  • Added module to perform conversions from and to Dafny types to and from native types

Code enhancement

  • Added support for using the standard library's Option type rather than the hard-coded Optional type.

Tests

  • Added proof that the identifier mapping is fully reversible and won't conflict with other internal constructs
  • Added a Rust test module for the DafnyRuntimeRust and wrote tests for testing the encoding Dafny/Rust

Tooling improvement

  • The makefile now works even if there is a space in the path
  • make dfydev does not perform verification and formatting of the resulting C# file, but make dfyprod does it. Both build Dafny afterwards
    ' make dfydevinit and make dfyprodinit only perform the conversion from dafny files to C# files, without rebuilding Dafny.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@MikaelMayer MikaelMayer changed the title Feat: Dafny-to-Rust code indentation Feat: Dafny-to-Rust code indentation and identifiers Jan 10, 2024
@MikaelMayer MikaelMayer requested review from robin-aws and removed request for jtristan January 19, 2024 16:17
@MikaelMayer MikaelMayer self-assigned this Jan 23, 2024
Copy link
Member

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some high-level comments as I'm finding my bearings in the new DafnyCompiler/DafnyWrittenCompiler world, more to come!

}

#[allow(dead_code)]
enum Sequence<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand I'm pleased you were at least inspired by the DafnyRuntimeDafny implementation here. But I also have to ask: would it be feasible to actually use the compilation of that source directly instead? Then you get at least more assurance of correctness since the Dafny version is verified.

If it didn't seem possible because not all features in that source are supported for Rust yet, that was also true at the time for Go, and addressed via the --bootstrapping flag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If will be feasible to use the compilation of that source directly instead, but it's more likely if we have sealed traits (or class datatypes). as it's not possible in Dafny to declare an allocated datatype, which we would need here to compile to an enum.
At best, we would be able to generate a dyn Sequence is Sequence was a trait, but it would not fit the conversion I'm envisioning.
But even then, the Dafny to Rust compiler does not support consistently the notion of traits anyway yet.

What did you do with this --bootstrapping flag exactly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

../../Scripts/dafny translate cs dfyconfig.toml --output $output.cs
../../Scripts/dafny translate cs dfyconfig.toml --output $output.cs $noverify

# We will remove all the namespaces Std.* except Std.Wrappers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that? This is likely to confuse someone very badly someday. If it's just because we added Std. when making DafnyStandardLibraries, I'd much rather bite the bullet and change the references.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't mean to remove the "Std." part of the name, only to remove modules from the standard library that we are not using as it's not possible to import only a part of the standard library since it's in one file.

The standard library is huge and including all the source compiled from C# triggered a few errors I don't want to deal with at the beginning. So far, I include only wrappers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah got it, the comment is just misleading. I'd just rephrase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put in the new version:

# We remove all the standard libraries except Std.Wrappers, Std.Strings, Std.Collections.Seq, Std.Arithmetic and Std.Math

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good reason to start actually splitting up the standard libraries into separate units. But we need more packaging support first.

Source/DafnyCore/dfyconfig.toml Show resolved Hide resolved
{}

// However, if we restrict ourself to a Dafny encoding, it's reversible
lemma {:rlimit 500}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we've replaced {:rlimit} with {:resource_limit} in master now, so these are better as {:resource_limit 500e3}: #4975

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment about doing that migration for the next PR, thanks

}
}
// replaceDots is not reversible generally speaking.
// Indeed, both _d and . get translated to _d.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not fix that by translating _d to __d?

In fact maybe the better question is why dafnyEscape doesn't also escape dots?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change _d to __d, you'll have to change _ to __ too as there could be a i__d somewhere that comes from a dafny identifier i_d. I prefer to keep transformations simple, as I haven't written replaceDots, I might be able to have a better transformation down the road.
dafnyEscape is only a modeling function to model the fact that Dafny itself does not escape dots and might put dots in identifiers names, like module names. I have no control over that.

Comment on lines +51 to +56
if s[0] == '_' {
} else if s[0] == '?' {
} else if s[0] == '\'' {
} else if s[0] == '#' {
} else {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating that this is necessary (or does it just reduce the proof cost?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does reduces the proof cost :-)

// - If the input is a Dafny-encoded Id or a tuple numeric,
// it can find the original identifier based in the output

lemma better_tuple_builder_name_not_rsharp(i: string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, who got snake_case identifiers in my DafnyCodeFiles :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just borrowing the convention used in the existing Dafny file... will change that in the next PR :-)

ensures IsDafnyIdTail(s)
{}

function dafnyEscape(s: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dafny style is usually PascalCase rather than camelCase

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion implemented in the next PR.

@@ -70,9 +75,9 @@ module {:extern "DAST"} DAST {
Neq(referential: bool, nullable: bool) |
Div() | EuclidianDiv() |
Mod() | EuclidianMod() |
Implies() |
Implies() | // TODO: REplace by Not Or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a good idea to me, but it assumes that the goal of the DAST is to be as simple as possible, rather than retaining extra information that might produce better output in some contexts. I think that's a good design goal personally, but this file should be more explicit about documenting that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I want:

  • The Compiled Dafny AST to be minimal
  • We want the generated code to look idiomatic and close to the original Dafny file if possible.

I was able to accomplish both goals the following way, as this step has already been accomplished in the next PR. The design is the following:

  1. We keep only the minimum amount of operators (so BinaryOp(Implies, x, y) becomes BinaryOp(Or, Not(x), y), same for all the variants of <= and < and != and == etc.
  2. I added a "format" attribute to BinaryOp and UnaryOp so that compilers can decide to enhance the rendering of the operators based on original Dafny information, but that should only be an optional idiomatic rendering step or a transformation of the target language AST to another more idiomatic target language AST.
    This ensures conversions are done minimally .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I want:

  • The Compiled Dafny AST to be minimal
  • We want the generated code to look idiomatic and close to the original Dafny file if possible.

Both good goals to have, but there will be times when they conflict. :)

Sounds like you're saying in this case we might annotate the AST further if we decided we need to know when the original Dafny source was A ==> B as opposed to !A || B. I think that's a reasonable general approach: keep the AST itself minimal so getting to simple correct translation is easier, but add more additional information when you feel the need to improve the output.

I think it's worth stating that tenet in or around this file somewhere.

Copy link
Member Author

@MikaelMayer MikaelMayer Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added to the top of AST.dfy for the next PR:

  // Dafny AST compilation tenets:
  // - The Compiled Dafny AST should be minimal
  // - The generated code should look idiomatic and close to the original Dafny file if possible
  // Since the two might conflict, the second one is taken care of by adding formatting information

Copy link
Member

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another batch

[s[0]] + dafnyEscape(s[1..])
}

function dafnyEscapeReverse(s: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dafnyUnescape would be a more familiar name IMHO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to DafnyUnescape in the next PR


module {:extern "DCOMPProofs"} {:compile false} DCOMPProofs refines DCOMP {

ghost predicate IsDafnyIdTail(s: string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: declaring this after IsDafnyId would make it much clearer what "tail" means

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I exchanged the two predicates in the next PR

}

// The reason why escapeIndent is correct is that
// - It receives either
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a big old implicit precondition we should make explicit (since it's defined in Dafny)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean, adding this as a precondition to the function escapeIdent itself?
Since the reception is total ("anything else" is the last case), the precondition of escapeIdent is actually true all the time.
Without this being an objection, the disjunction in the precondition of that function is there only to figure out a particular type of bijection (I love writing this -ctions :-) )
so writing a precondition would make sense if I split the function "escapeIdent" itself, which at this point is not necessary. Unless you have another idea to format it?

I also renamed "escapeIndent" to "escapeIdent" in the next PR since it's a typo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed the "everything else" case and therefore thought there was a non-trivial precondition, ignore :)

}
}

lemma {:fuel has_special, 2, 3} escapeIndentExamples()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion: should we put "test lemmas" outside of the source files, like tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your discussion topic, I added the following at the top of this file this comment, because I should make it explicit that:

/*This module does not contain any compiled code because it
  only proves properties about DCOMP. In a sense, it's a test file. */

So you are already viewing here a file that only contain test lemmas. It's just that the others are more like abstract tests, while this one is more concrete.

I could also have made it a method and expect statements to not overload the verifier. Using the verifier to perform computation is not something to advise in general. Here I kept my examples to use the least amount of fuel (the fuel indicated is the minimum + 1) and this can help to design quickly any new bijection feature, and also it makes it possible to not compile this module until we have a way to filter out test modules for compilation for production code.

I also renamed this lemma to escapeIdentExamples (fixed the typo)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that is illuminating. I had assumed these proofs would be necessary dependencies for future higher-level proofs of correctness, and therefore more like source code. If that becomes true later we can always relocate/reclassify them.

}
}

function reverseEscapeIdent(s: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise consider "unescape" instead

{
}

lemma {:rlimit 800} everythingElseReversible(i: string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either "Unescapable" or "Invertable" would be better than "Reversible" in general IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced every "Reversible" by "Invertible" in this file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remarking that these are well worth the effort, since failing to escape identifiers completely in all contexts has been a frequent source of compiler bugs in the past.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as we have newtypes for strings as well, I'll use them to ensure I'm not forgetting any case. I have found already two instances in which identifiers were not escaped in the source code.
And indeed, generating identifiers that can collide could be sources of issues, thanks for appreciating :-)

../../Scripts/dafny translate cs dfyconfig.toml --output $output.cs
../../Scripts/dafny translate cs dfyconfig.toml --output $output.cs $noverify

# We will remove all the namespaces Std.* except Std.Wrappers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good reason to start actually splitting up the standard libraries into separate units. But we need more packaging support first.

}

pub fn is_instance_of<T: ?Sized + Any, U: 'static>(theobject: *const T) -> bool {
unsafe { &*theobject }.as_any().downcast_ref::<U>().is_some()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Dafny is pretty restrictive about the types on either side of an is, it seems like it's possible to not resort to as_any(), as much as that is still correct. How much of an efficiency hit is it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my current understanding, Rust is one of this few languages where references are not "transparent". Transparency means that the representation of the reference is the same whether the object in the original code was a class or an interface.
Contrary to Java, that means if a struct C implement trait A and trait B, a pointer of type *const C does not have the same representation as a pointer of type *const dyn A or of type *const dyn B. This is because a dyn pointer not only stores the address of the object data, but also the address of the object's methods (that's why there is a keyword "dyn", to indicate it stores two pointers). A pointer of type *const C therefore is not immediately a pointer of type *const dyn A, and a pointer of type *const dyn A requires an explicit conversion to a pointer of type *const dyn B because the methods are not at the same address. This is why the method as_any() makes it possible to go back on our feet.

By the way, I'm not using this is_instance_of yet but I will shortly not in the next PR, but in a PR afterwards. So this can still evolve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, and I do think the simplest correct approach is the right one for now. IF we profile and find that this is a bottleneck in real code we could get much more complicated, but not before. I just wanted to make sure this wasn't obviously very expensive and hence something we would definitely have to improve, as it would be if we were resorting to expensive reflection for example - it doesn't sound like that's the case here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, no reflection is needed because any class that can be upcasted must implement the Any trait, otherwise the approach is unsound.

Copy link
Member

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another batch, just Dafny-compiler-rust.dfy and the Rust tests to go


// The T must be either a *const T (allocated) OR a Reference Counting (immutable)

#[allow(dead_code)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this occur and how hard is it to avoid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed all these annotations in the new PR. They were there because, contrary to Dafny, functions are private to the module by default, so if they are not prefixed by "pub", then it's dead code. By adding "pub" on every function like of Dafny (which I had done since), I now realized that this annotation of allowing dead code is no longer used, so I removed the 13 of them.

array.iter().map(|x| elem_converter(x)).collect()
}

// Used for external conversions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only this one and not the other direction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other direction is already implemented for the next PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have forgotten it for this PR, but the other direction is implemented in the next version.

Comment on lines +54 to +56
let mut array: Vec<T> = Vec::with_capacity(s.cardinality());
Sequence::<T>::append_recursive(&mut array, s);
array.iter().map(|x| elem_converter(x)).collect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more efficient (if slightly less simple) to create a Vec<X> and populate it with converted elements directly? That way you'd avoid the intermediate Vec<T>.

I don't know Rust iterators well enough yet to know the best way to implement that, but it's probably worth figuring out as a follow-up, since this conversion method is going to be used a lot on potentially large data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would amount to either adding a callback to the function Sequence::append_recursive, or have Sequence implement the Iterable trait itself.
I'm not sure about any of these strategies but for sure they are worth considering thanks for suggesting, I'll think about them.

for elem in array.iter() {
result.push(elem_converter(elem));
}
Sequence::<T>::new_array_sequence(&Rc::new(result))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, the resulting Sequence will reference result directly rather than creating a copy, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed. Rc::new() takes ownership of the result.

use crate::Sequence;
use std::rc::Rc;

pub fn string_to_dafny_string(s: &str) -> Rc<Sequence<char>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely try to introduce those aliases I mentioned next PR if possible, dafny_string instead of Rc<Sequence<char>>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced them two weeks ago in the upcoming work, definitely a good idea.

Comment on lines +209 to +210
} else {
panic!("This should never happen")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary here but not in the match this cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Rust's if sentences are like if-then-else in Dafny, they are types, so they always require an else (there is no return keyword, only the last value is returned). panic returning the empty type, it type checks.

In the newest PR, this if then else is transformed back to a match, as I'm getting more used to Rust's syntax I was able to make it work.


#[allow(dead_code)]
impl <K, V> Map<K, V>
where K: Clone + Eq + std::hash::Hash, V: Clone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One cross-cutting issue to keep in mind - there's an assumption that the image of all Dafny types implement Hash. What level of testing will support that? Do we have confidence eventually turning Rust on for all the uniform compiler tests will catch missing cases because the Rust won't compile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can and will make all Dafny values that implement equality to implement Hash. At worst, have them output a constant hash will make the hasmap as bad as a linked list.

But there are some good hash computations if we really want to shine:
For sets of elements, we can sort the hash of individual elements, sort the hash, and compute the hash of the sequence.

Note that, if we could not have all Dafny types admissible for keys to implement hash, I already have a plan in which we could not require hash in the Key type of the map, and have optimized methods for the key types that implement hash.

));
}
}
// Dafny will normally guarantee that the key exists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which means the unwrap()s should never panic, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

}

#[allow(dead_code)]
impl <K, V> Map<K, V>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah now this is interesting. On the one hand great, ALL backends should have a smarter implementation of immutable maps. On the other hand the code we want to support with this backend has already had to work around the lack of this in the existing backends, so this optimization is not going to help them - they will use https://github.com/dafny-lang/libraries/tree/master/src/MutableMap instead.

I'd say don't spend time on this optimization here, go with the simple wrapping of a mutable map like the other backends. At some point we should do what you're doing here, but in dafnyRuntime.dfy instead so that we have a shot at getting consistent runtime performance across all backends.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I already completed this optimization, let's keep it for now as it is strictly better in terms of complexity than wrapping a hashmap. Wrapping a hashmap is O(n) when you add a key/value and O(1) when accessing it, whereas this is O(1) when adding a key/value pair or even combining maps, and amortized O(1) when accessing it.

But either optimizations are not even the panacea, as if there is a loop where someone tests if something is in the map before adding it, then it defeats totally the optimization as the Hashmap will be realized at every step anyway.

My vision is that for those who really want performance, they will use real mutable hashmap for which maps will only be ghost specification.
In other terms, I believe that classes and arrays should be the most optimized, whereas built-in immutable types are mostly there for prototyping and can tolerate non-optimal complexities.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's all reasonable. Let me refine my advice:

This does look like a nice optimization without making the implementation a lot more complex. But it is still slightly more complex. In particular, you have to worry about data and cache getting out of sync when used by different concurrent executions. That problem is solved for the lazy sequences in dafnyRuntime.dfy by using an AtomicBox, and for Rust it looks like the equivalent solution is a RefCell, so perhaps you need that here.

All I'm saying is that you still have more work TODO to complete this implementation anyway, so I'm still not convinced that this optimization is worth the non-zero risk. I think it would be more worth it if multi-targeting Dafny code could rely on those [amortized] O(1) costs regardless of the backend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurrency is a good thing to consider, but consider this:
For this code to work under concurrent execution, we would need anyway to use the slower Arc instead of Rc, because Rc is not concurrent-friendly. And I'm not sure we want to so that, we still need to evaluate that need.

But regarding your remark, how can data and cache get ouf of sync? I'm already using RefCell
Data is only ever written once during construction, and cache could theoretically be updated concurrently but all result only depend only on the data so at worst it will be populated twice with the same value.


// Generic function to safely deallocate a raw pointer
#[inline]
pub fn deallocate<T : ?Sized>(pointer: *const T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only in the test/mod.rs for now.

Copy link
Member

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got through the rest. Massive progress, nicely done.

My only real concern is integration-level testing. I'm very happy with the Rust testing of the Rust runtime as that's already above the level of testing of some of the other runtimes. But because we're using the RunAllTests feature as a catch-all escape for the Dafny-implemented compilers, it's basically impossible for me to know how many more actual Dafny programs we support after this change, or if we're even regressing on what was already working.


const IND := " "
// Indentation level
/* trait ModDecl {} */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid checking in commented-out code in general, but this in particular since you actually HAVE a live ModDecl declaration below :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. I want to use general-traits datatype down the road, but for now I have not done the switch yet. I got many issues with this general traits datatypes in other projects that I'm now a bit burned to use it again (for example, it was impossible to downcast traits to datatypes, an issue has been cut but it's not resolved yet)

| Mod(name: string, body: seq<ModDecl>)
| ExternMod(name: string)
{
function ToString(ind: string): string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider using --general-traits and defining a ToStringable trait, and/or likely other general AST traits?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independently: it may be worth documenting that ToString is meant to produce well-formed Rust code and not a more detailed representation for debugging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unfortunately not playing with --general-traits anymore until a few issues have been fixed. Moreover, traits are useful only if I need to abstract datatypes away, which here we don't.
I'll add a comment about ToString

}
}
}
function SeqToString<T>(s: seq<T>, f: T --> string, separator: string := ""): string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I thought for that, but I don't want to compute the intermediate seq<seq>, so I'll stick with this SeqToString function for now. But I have used other standard library methods somewhere else :-).

Comment on lines +189 to +191
newtype VISIBILITY = x: int | 0 <= x < 2
const PUB := 1 as VISIBILITY
const PRIV := 0 as VISIBILITY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this idiom in Dafny, I think I'd prefer

datatype Visibility = Public | Private {
  function Index(): nat {
    ...
  }
}

Or even better, let's dig into why you want the enum value/index at all!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't need the index, so I'll write just datatype Visibility = PUB | PRIV

Comment on lines +237 to +238
// When a raw stream is given, try to put some indentation on it
function AddIndent(raw: string, ind: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with this as long as it's a temporary state, which we can remove once we're always using the RAST instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, although I might actually use this one later on if I decide that the indentation I asked for is not enough, i.e. because the string is too long and I want to add a newline, I'd add some indentation.

&& (|i| >= 2 ==> i[1] != 'T') // To avoid conflict with tuple builders _T<digits>
}

predicate is_idiomatic_rust(i: string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps is_idiomatic_rust_id instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed in the next PR

identEraseImpls := identEraseImpls + "impl " + constrainedEraseParams + " ::dafny_runtime::DafnyUnerasable<" + escapeIdent(c.name) + unerasedParams + "> for " + escapeIdent(c.name) + typeParams + " {}\n";

s := s + "\n" + defaultImpl + "\n" + printImpl + "\n" + ptrPartialEqImpl + "\n" + identEraseImpls;
var d := R.ImplFor(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider occasionally adding high-level comments to show the rough template for what we're generating, as we do in the C# implementation of rewriters and code generators. The string manipulation version could have already benefited from that, and I think the benefit is even greater when you start manipulating AST nodes instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not do this immediately as I'm constantly evolving the representation, but please remind me of doing that when the implementation is more stabilized if I have not done so yet.

@@ -2100,12 +2703,14 @@ module {:extern "DCOMP"} DCOMP {

static method Compile(p: seq<Module>) returns (s: string) {
s := "#![allow(warnings, unconditional_panic)]\n";
s := s + "#![allow(nonstandard_style)]\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of sums up the overall philosophy of our code generators nicely doesn't it. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that smithy-rust does change CamelCase to snake_case, we could have a compiler option for doing that as well :-)

Comment on lines 373 to 383
fn Test2(x: bool, y: &mut bool, z: &mut bool) {
*y = !x;
*z = !x || *y;
}
fn Test3() {
let mut y = true; let mut z = true;
Test2(true, &mut y, &mut z);
let mut i; let mut j;
i = y;
j = z;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like random code written to experiment with semantics, and AFAICT they aren't actually part of any tests, so we should remove them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the entire file from the index and kept it locally until we use it for proper unit testing.

assert_eq!(f.as_ref()(&Rc::new(BigInt::zero())), false);
}
}
// Struct containing two reference-counted fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will follow up on lots of comments in the next PR, but nothing blocking. Ship it!

@MikaelMayer MikaelMayer enabled auto-merge (squash) February 7, 2024 00:11
@MikaelMayer MikaelMayer merged commit 2c77435 into master Feb 7, 2024
20 checks passed
@MikaelMayer MikaelMayer deleted the feat-rust-indentation branch February 7, 2024 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants