-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New memory representation for Name/NodeStr #868
Conversation
* Standard library Arc instead of triomphe * No double indirection anymore for the static case * Location is stored inline, not as Arc header * `len` is both string value length and location span length, so this cannot support GraphQL descriptions or string values as they can contain length-changing escape sequences
This reverts commit 114b6e6.
I’d like to do the change from |
In today’s discussion there was clear reluctance about intentionally leaking memory, so I’ve reverted the addition of the leaking string interner. A future PR can consider adding a non-leaking interner based on |
With a few small changes to make apollographql/router#5477 compile and a changelog entry, this is now ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments about some of the APIs. Overall, I think this a good direction, but I do want to make sure @tninesling tries this out in their branch to make sure this solves the performance problem we were having with tracing.
Seeing the changes in apollo-smith
usage of apollo_compiler::Name
does make me think it's a lot more clear to have ast::Name
, so I think it needs to move back somewhere in the ast
module.
|
||
#[deprecated = "import `apollo_compiler::Name` instead"] | ||
pub type Name = crate::Name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice, thanks for adding it!
use crate::executable::{ | ||
BuildError as ExecutableBuildError, ConflictingFieldArgument, ConflictingFieldName, | ||
ConflictingFieldType, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drive by nit: don't we usually import one by one? I feel like you were the one to introduce this to apollo-rs even 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the router repo we configure rustfmt with imports_granularity = Item
and group_imports = StdExternalCrate
. I tend to write new files this way, but haven’t configured rustfmt in this repo so it defaults to Preserve
config.
I’d be in favor of adopting and enforcing a style in this repo too. The problem is those options are unstable in rustfmt, so they require a Nightly toolchain. If we have them in rustfmt.toml
then stable rustfmt emits warnings that they’re ignored. What makes this especially bad is that cargo fmt
calls rustfmt separately for every source file, so warnings are repeated enough time to flood the terminal. In the router repo for now we don’t configure rustfmt.toml
, but have an xtask command that runs cargo +nightly fmt -- --config […]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhhhh that makes sense. i got confused on what's formatted which way and where. It's not an issue, and doesn't really need the time to be put into it at the moment. Can figure it out at a later point!
crates/apollo-compiler/src/name.rs
Outdated
|
||
impl Name { | ||
/// Create a new `Name` parsed from the given source location | ||
pub fn new_parsed(value: &str, location: NodeLocation) -> Result<Self, InvalidNameError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_parsed
and from_static
maybe instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two three independent axes for constructors:
- Stored as
Arc<str>
v.s. stored as&'static str
- With a source location (aka "parsed") v.s. without
- Checking syntax validity and returning a
Result
, or "unchecked" (used when already checked at compile-time or by apollo-parser)
All combinations are possible as far as Name
is concerned (though static parsed is only useful with a leaking string interner). In a later commit I replaced the "parsed" constructors with a with_location
builder-like chaining method to reduce the number of constructors without limiting possibilities.
But you’re right that new_something
and from_something
are still used a bit inconsistently. One argument is that from_
only seems to say anything about the input parameter type, but static is also a possible internal representation inside of Name
.
Co-authored-by: Iryna Shestak <[email protected]>
Sorry! At some point I tried something in a separate branch and forgot to merge back. I didn’t realize I was not pushing to this PR anymore and asked for review on something quite a bit behind from what I thought. All pushed now. Remaining to bikeshed:
|
I haven't thought a lot about this but I think it makes sense to have an |
5315127
to
564fc78
Compare
Per today’s discussion I removed the |
% Conflicts: % crates/apollo-compiler/CHANGELOG.md % crates/apollo-compiler/src/node.rs
Per discussion in #868 (comment) except I picked `group_imports = One` instead of `StdExternalCrate`
Per discussion in #868 (comment) except I picked `group_imports = One` instead of `StdExternalCrate`
Co-authored-by: Renée <[email protected]>
Per discussion in #868 (comment) except I picked `group_imports = One` instead of `StdExternalCrate`
Per discussion in #868 (comment) except I picked `group_imports = One` instead of `StdExternalCrate`
Per discussion in #868 (comment) except I picked `group_imports = One` instead of `StdExternalCrate`
Take advantage of the recent change of internal representation of `Name`: apollographql/apollo-rs#868
…5573) Take advantage of the recent change of internal representation of Name: apollographql/apollo-rs#868
This is the result of a series of experiments. I’m not sure I like it, but let’s discuss it.
TL;DR
NodeStr
is no more. GraphQL string values use a plainString
(theValue
enum is already in aNode
with the same source span) and descriptions useNode<str>
.Name
has a dual representation based on eitherArc<str>
or&'static str
and can cheaply convert to that.Motivation
Apollo Router has found a performance bottleneck when using GraphQL type and field names in OpenTelemetry.
opentelemetry::StringValue
can be converted cheaply fromString
,Arc<str>
, or&'static str
butapollo_compiler::NodeStr
(usually) contains none of these so converting requires a new heap allocation. This happens in a hot path often enough that the allocation (and later deallocation) costs add up.Name
andNodeStr
in apollo-compiler 1.0.0-beta.17Name
is used for GraphQL names. It dereferences toNodeStr
and is a thin wrapper that adds checking for GraphQL name syntax. (ASCII alphanumeric or underscore, not starting with a digit)In addition to names,
NodeStr
is also used for GraphQL stringValue
s and descriptions. It contains an immutablestr
value and optional source location (file ID and text range) for diagnostics.Internally,
NodeStr
contains a singleptr::NonNull<()>
with a tag in its lowest bit. The pointer is usually to a (triomphe::ThinArc
) shared heap allocation that contains:The pointer can instead (discriminated with the pointer tag) represent
&'static &'static str
. This is a double indirection, because&'static str
is a wide pointer that stores the string length next to its address but we didn’t want to double the size ofNodeStr
. In this case there is no location information. This is used by thename!("example")
macro which does GraphQL name syntax checking at compile-time then callsNodeStr::from_static(&"example")
with an extra&
which gets promoted to a constant.A quality of this design is that
size_of::<NodeStr>()
is only 8 bytes.First idea:
String::leak
GraphQL type and field names in a schema rarely change. What if we accepted a small memory leak to turn them into
&'static str
?However that’s not enough to use the existing static representation of
NodeStr
:String::leak
doesn’t give us the double&'static &'static str
indirection. We could potentially allocateBox<&'static str>
and also leak that, but even then…NodeStr
doesn’t.Overall direction for a new memory representation
If we’re reconsidering the design of
NodeStr
, we might as well pick one that makes conversion toopentelemetry::StringValue
cheap in all cases. This means a dual representation based on eitherArc<str>
or&'static str
. (The standard libraryArc
allocates space for a weak reference counter that we don’t use, but we can accept that cost. Also see below for aWeak
come-back.)This in turn means moving the source location outside of the allocation, to be inline in the
NodeStr
struct. At first this seems like a loss: if a string is cloned many times, its source location (or even space to record the lack thereof) is duplicated. But maybe that doesn’t happen that much? On the contrary, inline location could allow strings at different locations to share a heap allocation (hint hint more foreshadowing).A straightforward
(enum { Arc<str>, &'static str }, Option<NodeLocation>)
hassize_of() == 40
bytes, which feels like too much.Layout optimizations
rowan::TextRange
insideNodeLocation
uses 32-bit offsets, so apollo-parser is already limited to 4 GiB inputsNodeLocation
can be decomposed into{ file_id, start_offset: u32, end_offset: u32 }
.Both
Arc<str>
and&'static str
are represented as{ data: *const u8, len: usize }
. The length is kind of already represented in the location is the difference between end and start offsets. Accessing the length to build a&str
is much more common than accessing the end offset, so we can store{ start_offset: u32, len: u32 }
and compute the end offset on demand. When a synthetic string doesn’t have a source location, we can setstart_offset
to zero but still uselen
.Finally, an actual Rust
enum
would allocate 64 bits to the 1-bit discriminant in order preserve memory alignment. Like before, we can stash that bit somewhere else:&'static str
hasalign_of() == 1
(it could be a sub-slice of some other string) so we can’t use the lower pointer bit for the tagstart_offset
would work, but bring the existing 4 GiB file input limit to 2 GiB. This could be ok but we can avoid itlen
would move that 2 GiB limit to the length of a single string sounds fine, but extra steps in the common code path of rebuilding a&str
can easily be avoidedFileId
contains aNonZeroI64
, we can easily spare a bit thereWe end up with this representation which has
size_of() == 24
bytes:Consequences and tradeoffs
Remember when the length of the string is kind of already in the source location? Only kind of. This is the case for
Name
whose value appear exactly in the GraphQL source, but string values and descriptions are quoted strings that can use backslash-escape sequences. Both of these make the source span longer than the decoded value.This PR chooses to remove
NodeStr
entirely and move the special representation directly intoName
. GraphQL string values use a plainString
(theValue
enum is already in aNode
with the same source span) and descriptions useNode<String>
Node<str>
Clippy started complaining because
executable::BuildError
was now hadsize_of() == 264
bytes. I boxed its largest variants to bring that to 104 bytes, but the impact doesn’t stop there. EveryName
will now incur an extra 16 bytes of inline size. On the other hand each heap allocation forName
will have 16 fewer bytes, so I’d guess the overall impact on memory use of a running Router is negligible.FileId
containedNonZeroI64
with -1 and -2 reserved for special values.TaggedFileId
was easier to make without negative values, so nowNonZeroU64
is used and "normal" IDs start at 3 instead of 1. This causes a diff for every snapshot test that prints file IDs, but this is a one-time transition.Optional: leaking string interner
The above gives cheap conversion to
opentelemetry::StringValue
already, so we could stop there. But the originalString::leak
idea is more doable now. But if we’re gonna leak we should definitely deduplicated the leaked strings.This PR includes (in a separate commit)(edit: this inclusion was reverted for now) an opt-in process-wide string interner based onOnceLock<RwLock<hashbrown::RawTable<&'static str>>>
andString::leak
. It’s enabled by configuringParser::new().intern_and_leak_names()
. When not enabled, the parser does not leak any new string but still looks in the interner for strings already there. Sointern_and_leak_names
could for example be enabled when parsing a schema but not when parsing executable document, still allowing the latter to reuse already-interned field names.Leaking is enabled for parsing
built_in.graphql
, so strings likeInt
or__typename
are always interned.Going further: weak reference string interner
Leaking memory seems risky. Another possibility is to store
std::sync::Weak<str>
in the interner and give outArc<str>
on lookup. This would allow strings not references anymore to be de-allocated.Before inserting into the hash table, the interner would check if it’s about to grow with
table.capacity() == table.len()
. In that case (since we’re about to do anO(n)
reallocation of the table anyway) it would scan the table to remove entries withWeak::strong_count() == 0
since those have been deallocated and can’t be used anymore.A potential risk is atomic contention: atomic operations in
Arc
fast when uncontended but if many threads are all trying to increment and decrement the refcount of the oneArc<str>
representing"Query"
then they will start waiting on each other.