-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[draft] Automated differential testing for the LSP server #1684
Conversation
That's a great script to have ! |
Test/diffs/difftester.py
Outdated
|
||
Example usage:: | ||
|
||
./difftester.py --frontend Dafny \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you ensure you list this example when you run the command-line without providing the correct arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does that already, I think
It uses whichever binaries you give it. The example |
@@ -0,0 +1,955 @@ | |||
#!/usr/bin/env python3 | |||
r"""A differential tester for Dafny. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is Python the right language for this? I don't expect Dafny engineers to be experts at any language other than the main language Dafny is written in, so I think not having it in C# severely decreases the maintainability. That isn't a deal breaker but there needs to be a good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no good reason, except that it made it easier to prototype. I'm happy to port it to C# if we find that better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having it in C# is much more maintainable. I worked on a product defined in many languages before and there were significant differences in code quality between the most used language and others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that c# is more maintainable. However, it's possible to immediately run a python file, whereas it's not with a c#.
If we just use this script for smoking tests down the road or like a tool in our tool belt, then we are fine keeping it in Python. If we plan to integrate it in the CI or to automate the process, then C# might be a better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, it's possible to immediately run a python file, whereas it's not with a c#.
Can you elaborate? What's the difference in UX between python file.py
and dotnet run file.csproj
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it matters for Java (a JVM takes time to start), I'm not sure for .NET
Still, two commands, and moreover you need to install a tool(?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dotnet-script
tool would be installed by placing it in the dotnet-tools.json
file in our repository and then running dotnet tool install
which we already do in various places. I think if you want to do the Python path properly, you would also provide a reliable way of ensuring the right Python version is used. Maybe we would place Python installation instructions in a README or add something to our makefile. I'm not sure what's idiomatic there.
I'm a bit confused by this discussion. Doesn't the benefit of having the repository be (more) in a single language far outweigh these differences in how to run the script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion. Thanks for digging deeper.
Having the repository to be (more) in a single language outweighs a little - in my opinion - these differences in how to run the script. If we had the c# script, we could always make an extra shell wrapper around it to run it, so that wouldn't be a problem.
Now, the question of this thread is: do we want @cpitclaudel to actually rewrite this in C#? I argue that we don't need to spend time doing that. This script is not integrated with the Dafny source files, it's only running processes. We would gain nothing porting a program of this size to a typed language. This script is not going to evolve substantially, is very likely to not going to have unit tests.
Each language has its own idiomatic use. Python has been designed for scripting and quick prototyping, file and process handling and so on, C# has been designed for building applications and robust shared libraries. Hence, I understand the current choice, although I would have never pushed anyone to use Python instead of C# if they preferred to start their project with the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikaelMayer So, disregarding what language it's currently written in, what language do you prefer it to be written in, or are you ambivalent about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disregarding the language our codebase is currently written in, I would prefer it to be written in Python.
@@ -0,0 +1,955 @@ | |||
#!/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you structure the code into multiple files, to make it easier to digest? For example, a separate file for each supported front-end might help, or for each supported type of input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feelings about this: there's usefulness to having everything in one file. I agree that 1000 lines is getting close to the manageable limit. Happy to refactor if you think the benefits overweight the downsides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If everything is in one file, how do I see the hierarchical structure of the code? Maybe an IDE with an outline view can show the classes in the file, but that's not as accessible as having the files define that structure. What's the benefit of a single file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If everything is in one file, how do I see the hierarchical structure of the code?
You search for classes; there's 10 to 20 of them in here, so that remains manageable. The advantage of a single file is that you don't need an IDE to navigate around; all relevant definitions are in the same place, so you can just search around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer a big file than many small ones, because indeed it makes it easier to search around for the concepts, especially in a script that is doing only one thing and where we don't need to share code outside.
However, to keep it readable, since the classes might be for different purposes, I would advise to add a comment on the top of the file explaining what are all the classes that were defined and what purpose they have, so that it will make it easier to maintain down the road.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I have a slightly different perspective: I think the cost of rewriting in C# is not huge, so I can do it if we find it useful. It's probably a day of two at most. But how about we first give this script a bit of a try, and if it turns out to be actually useful, we can decide whether to invest more in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how about we first give this script a bit of a try, and if it turns out to be actually useful, we can decide whether to invest more in it.
What do you mean by giving it a try? If we turn it on in the build-server then we're using it right? What would be a reason to undo that then?
I think normally we assume checked in code is there to stay, and the time of merging is it also the right time to optimise it for readability. In my opinion, having code in anything other than the primary language of a codebase makes it much less readable.
Already for this PR I don't feel as capable as a reviewer as I would for a C# PR.
I don't have Dafny on my path. Could you have your script go to |
Closing this since the implementation had bit-rotted. |
This PR implements a test driver that runs multiple Dafny frontends on an input and compares their outputs.
This program takes a list of Dafny frontends (either the CLI or the LSP server) and a list of inputs (snapshots or LSP traces) and runs them through each frontend. Verification results are then compared to ensure that they match.
A frontend is an executable (either the Dafny CLI
Dafny.exe
or the Dafny LSP ServerDafnyLanguageServer.exe
) plus the arguments to pass to it.An input is either a Dafny file (
.dfy
) or an LSP trace (.lsp
). If given a filexyz.dfy
, this program first looks forxyz.v0.dfy
,xyz.v1.dfy
, etc. before falling back toxyz.dfy
.Limitations:
This only checks error messages, not return codes
This only checks for errors in the main file, not in all files reported by the LSP server.
Example usage:
This steps through all edits in the LSP trace
trace.lsp
and runs each of them three times, once in a fresh Dafny CLI, once by sending them to a server that doesn't use caching, and once to a server that uses caching. If results match, nothing is printed.Command line flags:
This is not integrated into the CI yet. Additionally, since many tests require custom Dafny flags that the server does not support, this won't be able to run on all of Dafny's test suite.
I'm not sure how to go about the review, since it's a lot of code at once.