Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the format of output produced by --progress #5341

Merged
merged 3 commits into from
Apr 23, 2024

Conversation

atomb
Copy link
Member

@atomb atomb commented Apr 19, 2024

Description

The previous implementation had two issues:

  • It displayed only the milliseconds portion of the duration, so would be incorrect for any verification lasting longer than a second.
  • The use of exponential notation for resource count made it more difficult to compare values, and impossible to identify small differences.

This also changes the overall format of the message slightly.

How has this been tested?

Existing tests have been updated to reflect the changes in formatting.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

The previous implementation had two issues:
* It displayed only the milliseconds portion of the duration, so would
  be incorrect for any verification lasting longer than a second.
* The use of exponential notation for resource count made it more
  difficult to compare values, and impossible to identify small
  differences.

This also changes the overall format of the message slightly.
@atomb atomb requested review from robin-aws and keyboardDrummer and removed request for robin-aws April 19, 2024 17:32
@MikaelMayer
Copy link
Member

I wholeheartedly agree for your remark about the use of the exponential notation. I encountered this issue myself as I did not remark that in something like 2.34e+006 becoming 8.59e+005 and thought that I did a change that made verification worse, when it actually did better.
What did you implement instead? I don't see a comment about it in the PR.

@atomb
Copy link
Member Author

atomb commented Apr 19, 2024

I wholeheartedly agree for your remark about the use of the exponential notation. I encountered this issue myself as I did not remark that in something like 2.34e+006 becoming 8.59e+005 and thought that I did a change that made verification worse, when it actually did better. What did you implement instead? I don't see a comment about it in the PR.

It now prints the resource count as an integer.

Options.OutputWriter.WriteLine(
$"Verification part {canVerifyResult.CompletedParts.Count}/{canVerifyResult.Tasks.Count} of {boogieUpdate.CanVerify.FullDafnyName}" +
$", on line {token.line}, " +
$"{DescribeOutcome(Compilation.GetOutcome(runResult.Outcome))}" +
$", taking {runResult.RunTime.Milliseconds}ms and consuming {resourcesUsed} resources");
$" (time: {timeString}, resource count: {runResult.ResourceCount})");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to put the two time properties in parenthesis, instead of include them in the sentence? Also why "time" instead of "duration" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately this makes the message shorter, making it more likely that it'll fit on a single line. When you're seeing many of these go by, the ability to quickly do pattern recognition on the output without explicitly reading it is very helpful, in my experience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change "time" into "duration", and possibly resource count into resources or ticks, although I appreciate the latter might be a bigger change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, I'm slightly inclined to make it more like the IDE and include something like (0:00:1.25, 23947 RU). I'm worried about using multiple terms to describe resource usage. So far, we've always said "resource count" as an overall descriptor or used the abbreviation "RU" for the units, and the IDE uses "RU" heavily.

When using a unit suffix instead of a descriptive prefix, having either "time: " or "duration: " before the other one feels inconsistent to me. And it'll be formatted in such a way that I think it's obvious that it's describing a time/duration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One solution would be to list units for both, and not format the duration in "H:M:S" structure. So it would be something like "(3274.86 s, 23194090 RU)". I think perhaps that's my favorite, because it also makes high duration easier to pick out.

@@ -1,26 +0,0 @@
Verified 0/2 symbols. Waiting for f to verify.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unused right?

@@ -1,19 +0,0 @@
Verified 0/5 symbols. Waiting for Foo to verify.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unused right?

@@ -1,8 +0,0 @@
Verified 0/2 symbols. Waiting for Foo to verify.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unused right?

@keyboardDrummer
Copy link
Member

The use of exponential notation for resource count made it more difficult to compare values, and impossible to identify small differences.

Can you elaborate on that? What's difficult to compare?

It now prints the resource count as an integer.

Does it? The "g" option can also use exponential notation right? Why did you remove testing the reported resource counts?

@atomb
Copy link
Member Author

atomb commented Apr 22, 2024

The use of exponential notation for resource count made it more difficult to compare values, and impossible to identify small differences.

Can you elaborate on that? What's difficult to compare?

It's easier to quickly recognize how many digits a number has than it is to see that a single number has increased by one. If you, for instance, want to quickly identify the outlier in a sequence of results, finding the one "9" in a sea of "6"s takes a lot more work than finding the number that has three more digits.

There are also cases where I'd really like to be able to quickly distinguish between 7245 and 7214, for example. That's not possible with exponential notation.

It now prints the resource count as an integer.

Does it? The "g" option can also use exponential notation right? Why did you remove testing the reported resource counts?

The "g" format is used for the time. The resource count is printed using the default ToString.

I removed testing the resource count because, when it's exact, it's not necessarily the same on all platforms, and we test with the same .expect files on all platforms.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Apr 23, 2024

It's easier to quickly recognize how many digits a number has than it is to see that a single number has increased by one. If you, for instance, want to quickly identify the outlier in a sequence of results, finding the one "9" in a sea of "6"s takes a lot more work than finding the number that has three more digits.

Makes sense. Thanks for explaining.

There are also cases where I'd really like to be able to quickly distinguish between 7245 and 7214, for example. That's not possible with exponential notation.

Is that not just because the output was using single digit precision, unrelated to it using exponential notation? "E1" (note the 1)

@atomb
Copy link
Member Author

atomb commented Apr 23, 2024

There are also cases where I'd really like to be able to quickly distinguish between 7245 and 7214, for example. That's not possible with exponential notation.

Is that not just because the output was using single digit precision, unrelated to it using exponential notation? "E1" (note the 1)

Increasing the precision of the exponential notation would help, but I think it still winds up being harder to find the large values in a big collection when things are written in exponential notation. It's much easier to tell the difference between 239869384723982 and 23897 than 2.3987e14 and 2.3897e4.

Ultimately, I think I'd prefer something even more rigidly formatted, similar to what I described in #5150. Maybe something like:

SUCCESS     0:00:0.15       382970 RU  Module.SubModule.Lemma
FAILURE     0:01:38       12120903 RU  OtherModule.Method

Because this makes it really easy to scan for what you're looking for and not need to read all of the words.

But if we do something like that I'm happy to have it come later.

@keyboardDrummer
Copy link
Member

Merging since I think having the improvement is more important than perfecting the output. We can always get back to it.

@keyboardDrummer keyboardDrummer merged commit 2e7de95 into dafny-lang:master Apr 23, 2024
20 checks passed
@keyboardDrummer
Copy link
Member

Ultimately, I think I'd prefer something even more rigidly formatted, similar to what I described in #5150. Maybe something like:

Looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants