Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[draft] Automated differential testing for the LSP server #1684

Closed
wants to merge 26 commits into from

Conversation

cpitclaudel
Copy link
Member

@cpitclaudel cpitclaudel commented Jan 4, 2022

This PR implements a test driver that runs multiple Dafny frontends on an input and compares their outputs.

This program takes a list of Dafny frontends (either the CLI or the LSP server) and a list of inputs (snapshots or LSP traces) and runs them through each frontend. Verification results are then compared to ensure that they match.

A frontend is an executable (either the Dafny CLI Dafny.exe or the Dafny LSP Server DafnyLanguageServer.exe) plus the arguments to pass to it.

An input is either a Dafny file (.dfy) or an LSP trace (.lsp). If given a file xyz.dfy, this program first looks for xyz.v0.dfy, xyz.v1.dfy, etc. before falling back to xyz.dfy.

Limitations:

  • This only checks error messages, not return codes

  • This only checks for errors in the main file, not in all files reported by the LSP server.

Example usage:

 ./difftester.py --frontend Dafny /printTooltips \
                 --frontend DafnyLanguageServer /--verifier:verifysnapshots=0 \
                 --frontend DafnyLanguageServer /--verifier:verifysnapshots=3 \
                 --input trace.lsp

This steps through all edits in the LSP trace trace.lsp and runs each of them three times, once in a fresh Dafny CLI, once by sending them to a server that doesn't use caching, and once to a server that uses caching. If results match, nothing is printed.

Command line flags:

usage: difftester.py [-h] [--verbose] [-j N] [--timeout N] --frontend
                     FRONTEND_NAME [ARGUMENTS ...] --input INPUT

optional arguments:
  -h, --help            show this help message and exit
  --verbose, -v         Increase verbosity (repeat for more).
  -j N                  Run command line tests in N concurrent threads.
  --timeout N           Limit execution time to N seconds for individual CLI
                        runs.
  --frontend FRONTEND_NAME [ARGUMENTS ...]
                        Register a frontend
  --input INPUT         Register an input file

This is not integrated into the CI yet. Additionally, since many tests require custom Dafny flags that the server does not support, this won't be able to run on all of Dafny's test suite.

I'm not sure how to go about the review, since it's a lot of code at once.

@cpitclaudel cpitclaudel self-assigned this Jan 4, 2022
@cpitclaudel cpitclaudel added area: error-reporting Clarity of the error reporting part: language server Support for LSP in Dafny (server part; client is in ide-vscode repo) labels Jan 4, 2022
@cpitclaudel cpitclaudel marked this pull request as draft January 4, 2022 22:23
@MikaelMayer
Copy link
Member

That's a great script to have !
I think it would be great to have it as a tool easy to use down the road.
Since it's in the Test/ directory, I don't see how it's looking up for the Binaries directory. Do you have to execute it from the Binaries/ directory?


Example usage::

./difftester.py --frontend Dafny \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure you list this example when you run the command-line without providing the correct arguments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does that already, I think

Test/diffs/difftester.py Outdated Show resolved Hide resolved
@cpitclaudel
Copy link
Member Author

Since it's in the Test/ directory, I don't see how it's looking up for the Binaries directory. Do you have to execute it from the Binaries/ directory?

It uses whichever binaries you give it. The example assumes assumed that you had Dafny on your path, but typically I just pass it a relative path.

@@ -0,0 +1,955 @@
#!/usr/bin/env python3
r"""A differential tester for Dafny.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Python the right language for this? I don't expect Dafny engineers to be experts at any language other than the main language Dafny is written in, so I think not having it in C# severely decreases the maintainability. That isn't a deal breaker but there needs to be a good reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no good reason, except that it made it easier to prototype. I'm happy to port it to C# if we find that better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having it in C# is much more maintainable. I worked on a product defined in many languages before and there were significant differences in code quality between the most used language and others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that c# is more maintainable. However, it's possible to immediately run a python file, whereas it's not with a c#.
If we just use this script for smoking tests down the road or like a tool in our tool belt, then we are fine keeping it in Python. If we plan to integrate it in the CI or to automate the process, then C# might be a better option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, it's possible to immediately run a python file, whereas it's not with a c#.

Can you elaborate? What's the difference in UX between python file.py and dotnet run file.csproj ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it matters for Java (a JVM takes time to start), I'm not sure for .NET

Still, two commands, and moreover you need to install a tool(?)

Copy link
Member

@keyboardDrummer keyboardDrummer Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dotnet-script tool would be installed by placing it in the dotnet-tools.json file in our repository and then running dotnet tool install which we already do in various places. I think if you want to do the Python path properly, you would also provide a reliable way of ensuring the right Python version is used. Maybe we would place Python installation instructions in a README or add something to our makefile. I'm not sure what's idiomatic there.

I'm a bit confused by this discussion. Doesn't the benefit of having the repository be (more) in a single language far outweigh these differences in how to run the script?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion. Thanks for digging deeper.
Having the repository to be (more) in a single language outweighs a little - in my opinion - these differences in how to run the script. If we had the c# script, we could always make an extra shell wrapper around it to run it, so that wouldn't be a problem.

Now, the question of this thread is: do we want @cpitclaudel to actually rewrite this in C#? I argue that we don't need to spend time doing that. This script is not integrated with the Dafny source files, it's only running processes. We would gain nothing porting a program of this size to a typed language. This script is not going to evolve substantially, is very likely to not going to have unit tests.

Each language has its own idiomatic use. Python has been designed for scripting and quick prototyping, file and process handling and so on, C# has been designed for building applications and robust shared libraries. Hence, I understand the current choice, although I would have never pushed anyone to use Python instead of C# if they preferred to start their project with the latter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MikaelMayer So, disregarding what language it's currently written in, what language do you prefer it to be written in, or are you ambivalent about that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregarding the language our codebase is currently written in, I would prefer it to be written in Python.

@@ -0,0 +1,955 @@
#!/usr/bin/env python3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you structure the code into multiple files, to make it easier to digest? For example, a separate file for each supported front-end might help, or for each supported type of input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about this: there's usefulness to having everything in one file. I agree that 1000 lines is getting close to the manageable limit. Happy to refactor if you think the benefits overweight the downsides.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If everything is in one file, how do I see the hierarchical structure of the code? Maybe an IDE with an outline view can show the classes in the file, but that's not as accessible as having the files define that structure. What's the benefit of a single file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If everything is in one file, how do I see the hierarchical structure of the code?

You search for classes; there's 10 to 20 of them in here, so that remains manageable. The advantage of a single file is that you don't need an IDE to navigate around; all relevant definitions are in the same place, so you can just search around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer a big file than many small ones, because indeed it makes it easier to search around for the concepts, especially in a script that is doing only one thing and where we don't need to share code outside.

However, to keep it readable, since the classes might be for different purposes, I would advise to add a comment on the top of the file explaining what are all the classes that were defined and what purpose they have, so that it will make it easier to maintain down the road.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I have a slightly different perspective: I think the cost of rewriting in C# is not huge, so I can do it if we find it useful. It's probably a day of two at most. But how about we first give this script a bit of a try, and if it turns out to be actually useful, we can decide whether to invest more in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how about we first give this script a bit of a try, and if it turns out to be actually useful, we can decide whether to invest more in it.

What do you mean by giving it a try? If we turn it on in the build-server then we're using it right? What would be a reason to undo that then?

I think normally we assume checked in code is there to stay, and the time of merging is it also the right time to optimise it for readability. In my opinion, having code in anything other than the primary language of a codebase makes it much less readable.

Already for this PR I don't feel as capable as a reviewer as I would for a C# PR.

@MikaelMayer
Copy link
Member

Since it's in the Test/ directory, I don't see how it's looking up for the Binaries directory. Do you have to execute it from the Binaries/ directory?

It uses whichever binaries you give it. The example assumes assumed that you had Dafny on your path, but typically I just pass it a relative path.

I don't have Dafny on my path. Could you have your script go to ../Binaries to lookup for the applications if it's not in the path? And also, please add this information to the doc. so that we don't need to specify relative paths.

@cpitclaudel
Copy link
Member Author

Closing this since the implementation had bit-rotted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: error-reporting Clarity of the error reporting part: language server Support for LSP in Dafny (server part; client is in ide-vscode repo)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants