Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Translation records #5346

Merged
merged 32 commits into from
Apr 30, 2024

Conversation

robin-aws
Copy link
Member

@robin-aws robin-aws commented Apr 22, 2024

Description

Adds a translation record concept, automatically output alongside source code by dafny translate (--translation-record-output can be used to override the default location), and optionally provided for libraries with one or more --translation-record options. These files record the translation-relevant options used when translating libraries, so that separate compilation can make adjustments to generate the right code.

Like .doo files, translation records get their own .dtr extension for encapsulation and later evolution of the file format and features. Because these files are independent from (and one-to-many with) the verified source they refer to, the translation options are tracked per-module instead of per-source file. This provides some flexibility in how these records are managed in the future.

So far I've only added --outer-module as a canonical example of a translation option where we can handle different values across different modules, but others such as --optimize-erasable-datatype-wrapper can be added in the future.

For now there is no mechanism to check that all possible Dafny options are handled correctly recorded in translation records, as there is for .doo files, but this should be added in a future change so that we explicitly reject currently incompatible options. Similarly, we should deprecate the use of --library when translating without translation records for all library modules in the future. It is likely worthwhile to refactor the option registration process to streamline and unify handling for separate verification and translation at that point.

See also discussion on #5140 (comment).

@keyboardDrummer From earlier discussion on #4009 we agreed that --outer-module should allow libraries to declare things in the default module. However, given that option applies the outer module to the program AFTER resolution and verification, and only before compilation, I don't think it should. You can specify --outer-module differently for the same code but translating to different targets, but the names that consuming code refers to shouldn't be affected by that. It might be appealing to allow the namespace for a library to be implicitly defined, but I think we want a different but similar option for that instead if so.

How has this been tested?

Modified comp/separate-compilation/usesTimesTwo.dfy to use --outer-module on the library and show it breaks, but then adds --translation-record to make it work. Also added dedicated tests for translation record files (still in progress)

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@robin-aws robin-aws changed the title Track translation options feat: Translation records Apr 22, 2024
@keyboardDrummer
Copy link
Member

keyboardDrummer commented Apr 23, 2024

Overall, amazing stuff!

From earlier discussion on #4009 we agreed that --outer-module should allow libraries to declare things in the default module. However, given that option applies the outer module to the program AFTER resolution and verification, and only before compilation, I don't think it should.

I don't think it should either.

Similarly, we should deprecate the use of --library when translating without translation records for all library modules in the future.

I was thinking that we might deprecate the use of Dafny libraries without --library, and to let the availability of translation records determine whether to translate the library from scratch or to compile against the translation record. I think the current behavior that allows using Dafny libraries as source files, which then get translated but not verified, is too implicit.

It might be appealing to allow the namespace for a library to be implicitly defined, but I think we want a different but similar option for that instead if so.

Yes, I was thinking we would make it mandatory to define the outer namespace for a library.

@@ -28,4 +29,6 @@ public class CompilationData {
// TODO move to DocumentAfterParsing once that's used by the CLI
[FilledInDuringResolution]
public ISet<Uri> UrisToCompile;

public TranslationRecord TranslationRecord { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a FYI, I want to get rid of the class CompilationData by moving its fields into Compilation, but before that I need to move all the commands over to purely use CliCompilation instead of SynchronousCliCompilation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads up. This does feel like the right place for the translation record data currently, since it's tightly coupled to state like AlreadyCompiledRoots, so I'll assume this is fine and can be moved along with the rest into Compilation later on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine indeed :-)

@@ -102,6 +104,23 @@ public class ProgramParser {

ShowWarningsForIncludeCycles(program);

compilation.TranslationRecord = new TranslationRecord();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see how translation record reading relates to parsing. The class ProgramParser doesn't seem like the right place. I would move all the translation record code to the code generation classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, I put it here because it felt like initialization of CompilationData, but I moved it to SynchronousCliCompilation and it seems to fit nicely there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would moving it to ExecutionBackend also work? I see it's used around there and it doesn't seem useful earlier on.

compilation.TranslationRecord = new TranslationRecord();
var records = options.Get(CommonOptionBag.TranslationRecords);
if (records != null) {
foreach (var path in options.Get(CommonOptionBag.TranslationRecords)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reuse the variable records here.

var records = options.Get(CommonOptionBag.TranslationRecords);
if (records != null) {
foreach (var path in options.Get(CommonOptionBag.TranslationRecords)) {
using var reader = new StreamReader(path.FullName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use async using

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I can't: for some reason TextReader does not implement IAsyncDisposable, even though TextWriter does.

var pathForErrors = options.GetPrintPath(path.FullName);
try {
var libraryRecord = TranslationRecord.Read(reader);
if (libraryRecord.Validate(errorReporter, pathForErrors, options, Token.NoToken)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calling options.GetPrintPath could/should be moved into .Validate, since it also gets an options and that leaves less responsibility on the caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked out because I also made a single ReadValidateAndMerge method to bundle everything together. Like DooFile we may have to call those separately in the future, but for now it's always all three.

rootUserModule = newRoot;
// Apply the --outer-module option from any translation records for libraries,
// but only to top-level modules.
// TODO: Check the invariant that a module isn't in both a translation record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If (hasTranslationRecord) {
  // use it
} else {
 var outerModuleName = Options.Get(OuterModule);
  ..
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better this is checked in Validate and Merge now instead.

@@ -45,7 +46,7 @@ public class ManifestData {

Options = new Dictionary<string, object>();
foreach (var (option, _) in OptionChecks) {
var optionValue = GetOptionValue(options, option);
var optionValue = options.Get((dynamic)option);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woop

@@ -358,6 +359,16 @@ public enum DefaultFunctionOpacityOptions {
@"
If verification fails, report a detailed counterexample for the first failing assertion (experimental).".TrimStart()) {
};

public static readonly Option<IList<FileInfo>> TranslationRecords = new("--translation-record",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I regret the existence of CommonOptionBag :-D

Any chance we could move these options to more specific locations? I would prefer ExecutableBackend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved them to IExecutableBackend, right next to OuterModule :)

var success = true;
var relevantOptions = options.Options.OptionArguments.Keys.ToHashSet();
// Yo dawg, we heard you liked options so we put Options in your Options... :)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XD

OptionsByModule = OptionsByModule.Union(other.OptionsByModule).ToDictionary(p => p.Key, p => p.Value);
}


public static void ReadValidateAndMerge(Program dafnyProgram, string filePath, IToken origin) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind replacing the Program argument with ErrorReporter (which also contains an Options) and CompilationData, to make it easier to see from the caller what is being used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't quite do that because Validate depends on seeing the set of modules to compile as well, to validate that they don't appear as previously compiled in translation records as well.

I could pass in ErrorReporter, CompilationData, and the set of compile modules if you feel strongly about it. I would almost prefer to pass in Options separately as well since it's a key parameter that feels weird to access through the error reporter to me.

Copy link
Member

@keyboardDrummer keyboardDrummer Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, never mind, I'm fine with the current code. Maybe rename dafnyProgram to program, and order the methods from high to low level, so ReadValidateAndMerge above Validate

@@ -651,6 +647,18 @@ private record TargetPaths(string Directory, string Filename) {
var options = dafnyProgram.Options;
options.Backend.OnPreCompile(dafnyProgram.Reporter, otherFileNames);

// Process --translation-record options, since translation may need that data to translate correctly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this new block also move to ExecutionBackend? It seems like it doesn't actual use any of the contents of the program, so it doesn't matter at what point during compilation it's run.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does depend on the set of modules being compiled, and the timing does matter because it needs to capture those names before ProcessOuterModules messes them up. :)

But yes it turns out to work well to move it to ExecutableBackend.Compile. The only downside is having to ensure overrides like DafnyExecutableBackend.Compile also invoke the handling, but that's already a downside for handling like ProcessOuterModules. Plus it means the LibraryBackend DOESN'T output a translation record by default, and I think that's desirable anyway.

…lation-options

# Conflicts:
#	Source/DafnyCore/DooFile.cs
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-cs.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-go.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-java.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-js.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-notarget.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries-py.doo
#	Source/DafnyStandardLibraries/binaries/DafnyStandardLibraries.doo
@robin-aws robin-aws marked this pull request as ready for review April 25, 2024 21:37

namespace DafnyCore.Options;

public class TranslationRecord {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment?

continue;
}

var recordedOuterModuleName = (string)dafnyProgram.Compilation.AlreadyTranslatedRecord.Get(null, module.FullDafnyName, OuterModule);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point we should use the translation records to configure the options of the library modules, so consuming code does not have to look at Compilation.AlreadyTranslatedRecord in addition to DafnyOptions. In this case this would lead to:

var outerModule = module.Options.Get(OuterModule);
var outerModule = outerModules.GetOrCreate(recordedOuterModuleName, () => CreateOuterModule(recordedOuterModuleName));
module.EnclosingModule = outerModule;

With a change in ProgramParser so that the option OuterModule is only set for the default module.

Note that currently module options are only stored in ModuleDecl, not in ModuleDefinition, but ProgramParser could also store them in ModuleDefinition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that too, in fact that's part of the reason I tried to add the processing to the parsing-related code in the first pass. :)

OptionsByModule = OptionsByModule.Union(other.OptionsByModule).ToDictionary(p => p.Key, p => p.Value);
}


public static void ReadValidateAndMerge(Program dafnyProgram, string filePath, IToken origin) {
Copy link
Member

@keyboardDrummer keyboardDrummer Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, never mind, I'm fine with the current code. Maybe rename dafnyProgram to program, and order the methods from high to low level, so ReadValidateAndMerge above Validate

@@ -15,6 +15,7 @@ namespace Microsoft.Dafny;
public static class VerifyCommand {

static VerifyCommand() {
// TODO: This should be recorded and checked somehow for the same reason --no-verify is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized this was safe after all and rewrote accordingly.


public static readonly Option<FileInfo> TranslationRecordOutput = new("--translation-record-output",
@"
Where to output the translation record file. Defaults to the output directory.".TrimStart()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could refer to the relevant manual section.

}
public static readonly Option<IList<FileInfo>> TranslationRecords = new("--translation-record",
@"
A translation record file for previously translated Dafny code. Can be specified multiple times.".TrimStart()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could refer to the relevant manual section.

Copy link
Collaborator

@ShubhamChaturvedi7 ShubhamChaturvedi7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shubham Chaturvedi: How does this work for two conflicting options? Say .doo has function-syntax: 4, but the translation option has function-syntax: 3.

Robin Salkeld:

function-syntax specifically isn't relevant for translation so it's not recorded in .dtr files, it only needs to be in the .doo files so that re-parsing them works correctly (and I've got that fix queued up in a separate changeset, but not prioritizing it right now since it shouldn't affect your work any more)
For options that ARE relevant for translation, you can register different kinds of checks.
For something like --optimize-erasable-datatype-wrapper, where we currently can't handle different values on different libraries, there can be a check that creates an error if the options don't match.
For something like --outer-module , after my change we CAN handle different values by processing the contents of the libraries differently. In that case we have a check that allows different values, but just ensures the value of that option is recorded in the .dtr file.
In the future we can improve individual option handling and move them from the first case to the second. I didn't yet audit all of our options for translation compatibility as I did for verification options in .doo files, but that's something we should prioritize for greater safety as well.

Copy link
Collaborator

@ShubhamChaturvedi7 ShubhamChaturvedi7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shubham Chaturvedi:
Also how does the mapping work between the .doo and the .dtr? Who is responsible for ensuring that right doo is pulled for right dtr?

Robin Salkeld:

Also something we should improve in the future, but orthogonally. The way the smithy-dafny build system is working now you don't even have a guarantee any code was verified before translating, so I didn't want to block progress on figuring it out just yet.

@keyboardDrummer keyboardDrummer enabled auto-merge (squash) April 29, 2024 21:37
Copy link
Collaborator

@ShubhamChaturvedi7 ShubhamChaturvedi7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shubham Chaturvedi:
If same module is being used in two libraries and then these two libraries are used in another project, how should we handle the conflict?

Robin Salkeld:

Again not actually a conflict depending on the option. Something like the Go module name is localized to each library. If you have C and B depending on A, and D depending on B and C, as longer as you're passing something like --translation-record A.dtr to specify the go module name for A none of that matters.

Robin Salkeld:

One other future improvement is to eventually REQUIRE .dtr files so you don't have target language errors that would have been prevented by being aware of what options were used. But requiring that now without any deprecation process would be breaking.

@ShubhamChaturvedi7
Copy link
Collaborator

ShubhamChaturvedi7 commented Apr 29, 2024

Added the offline conversation above for completeness.

…lation-options

# Conflicts:
#	Source/DafnyCore/DooFile.cs
#	Source/DafnyCore/Options/CommonOptionBag.cs
@keyboardDrummer keyboardDrummer merged commit 2e6938d into dafny-lang:master Apr 30, 2024
20 checks passed
@robin-aws robin-aws deleted the track-translation-options branch April 30, 2024 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants