Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: add Discriminator field to Line message #768

Closed
prattmic opened this issue Apr 13, 2023 · 6 comments
Closed

Proposal: add Discriminator field to Line message #768

prattmic opened this issue Apr 13, 2023 · 6 comments
Labels
Priority: p3 Buganizer priority - P3 type: feat Buganizer type - Feature Request

Comments

@prattmic
Copy link
Member

prattmic commented Apr 13, 2023

This proposal is a corollary to #687, to add more precision to sample locations.

A "discriminator" is an arbitrary value used to distinguish multiple constructs (traditionally basic blocks) that occur on the same (file, line, column) location.

Concretely, the DWARF spec defines "discriminator" on page 152 as "An unsigned integer identifying the block to which the current instruction belongs. Discriminator values are assigned arbitrarily by the DWARF producer and serve to distinguish among multiple blocks that may all be associated with the same source file, line, and column. Where only one block exists for a given source position, the discriminator value is be zero."

Discriminators are used by compilers (notably, LLVM) to distinguish basic blocks for use in profile-guided optimization (PGO). The Go toolchain supports PGO as of Go 1.20, and we use pprof profiles as the PGO input format as they are easy for users to collect (e.g., using net/http/pprof). However, line number as the highest precision granularity is rather limiting, and we would like to increase granularity (golang/go#59612). We could probably make do with just column number (#687), but adding discriminators would provide maximum precision.

Concretely, the proposal is to add a field to Line:

// Discriminator number distinguishing multiple blocks at the same source location. Precise meaning is compiler-dependent.
int64 discriminator = 4;

(int64 selected for consistency with other fields, but I'm not tied to it. For reference, LLVM uses unsigned, which I think is uint32. 32-bits ought to be enough for anyone.)

As suggested in #687 (comment), we may want to add start_discriminator to Function. (Perhaps Function should have Line start_line?)

Other pprof changes from #687 apply here as well: adjusting aggregation options, perhaps adding a granularity flag. In my opinion, we could be even more lenient in the tool w.r.t. mostly ignoring discriminators. I don't think there are too many uses that will want to use discriminator values directly in the pprof tool itself (e.g., I don't think source view needs to display discriminators).

@aalexand
Copy link
Collaborator

aalexand commented Apr 16, 2023

I think we should add one or the other, having both column and discriminator fields would look confusing IMO - like two identifiers of a position or fragment within the source line.

Maybe we should add the discriminator field with the comment that for languages that use DWARF to resolve the debug info this is the DWARF discriminator and other languages like JS can give a special meaning to it like column number.

@prattmic
Copy link
Member Author

having both line and discriminator fields would look confusing

I assume you mean "column and discriminator fields" here?

Maybe we should add the discriminator field with the comment that for languages that use DWARF to resolve the debug info this is the DWARF discriminator and other languages like JS can give a special meaning to it like column number.

This seems reasonable to me. w.r.t. DWARF, I'll note that the spec quote above says: "Discriminator values ... distinguish among multiple blocks that may all be associated with the same source file, line, and column. Where only one block exists for a given source position, the discriminator value is be zero (sic)."

My read of that is that the discriminator should only be set if the column is insufficient to discriminate, which would mean we'd need both. However, LLVM does not appear to follow the spec. In addDiscriminators, only filename + line is considered (L208). Column number is ignored. This makes sense considering that LLVM profile format only contains line + discriminator.

@aalexand
Copy link
Collaborator

I assume you mean "column and discriminator fields" here?

Sorry, yes. Updated the text retroactively.

My read of that is that the discriminator should only be set if the column is insufficient to discriminate, which would mean we'd need both. However, LLVM does not appear to follow the spec.

Thanks for pointing out this detail. Yes, our internal symbolization pipeline also only carries the line number and the discriminator, not also the column number, so practically I think the two should be sufficient.

@Louis-Ye Louis-Ye added type: feat Buganizer type - Feature Request Priority: p3 Buganizer priority - P3 labels May 12, 2023
@aalexand
Copy link
Collaborator

There is #818 in review currently which will address #687 (adding column field in the Line message). I don't think we will be adding both fields unless a strong use case emerges, so when the discriminator is needed I would propose using the column field for it. Closing this issue as "not planned" - please feel free to re-open if the proposed solution does not work.

@aalexand aalexand closed this as not planned Won't fix, can't repro, duplicate, stale Nov 12, 2023
@prattmic
Copy link
Member Author

when the discriminator is needed I would propose using the column field for it.

The downside to this is that today Go's PGO uses standard CPU profiles emitted from runtime/pprof, etc. If we put a discriminator rather than a column number in the "column" field, it will be confusing to anyone analyzing CPU profiles, as we seem to be reporting incorrect nonsense column numbers.

@aalexand
Copy link
Collaborator

I think it may be a sign that a more specialized format would be a better thing here then. Including every possible bit of information in profile.proto can make hard to manage over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: p3 Buganizer priority - P3 type: feat Buganizer type - Feature Request
Projects
None yet
Development

No branches or pull requests

3 participants