Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New memory representation for Name/NodeStr #868

Merged
merged 24 commits into from
Jun 27, 2024
Merged

New memory representation for Name/NodeStr #868

merged 24 commits into from
Jun 27, 2024

Conversation

SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Jun 14, 2024

This is the result of a series of experiments. I’m not sure I like it, but let’s discuss it.

TL;DR

NodeStr is no more. GraphQL string values use a plain String (the Value enum is already in a Node with the same source span) and descriptions use Node<str>. Name has a dual representation based on either Arc<str> or &'static str and can cheaply convert to that.

Motivation

Apollo Router has found a performance bottleneck when using GraphQL type and field names in OpenTelemetry. opentelemetry::StringValue can be converted cheaply from String, Arc<str>, or &'static str but apollo_compiler::NodeStr (usually) contains none of these so converting requires a new heap allocation. This happens in a hot path often enough that the allocation (and later deallocation) costs add up.

Name and NodeStr in apollo-compiler 1.0.0-beta.17

Name is used for GraphQL names. It dereferences to NodeStr and is a thin wrapper that adds checking for GraphQL name syntax. (ASCII alphanumeric or underscore, not starting with a digit)

In addition to names, NodeStr is also used for GraphQL string Values and descriptions. It contains an immutable str value and optional source location (file ID and text range) for diagnostics.

Internally, NodeStr contains a single ptr::NonNull<()> with a tag in its lowest bit. The pointer is usually to a (triomphe::ThinArc) shared heap allocation that contains:

  • Atomic reference counter
  • Optional source location
  • String length
  • String bytes (variable length, inline following the header)

The pointer can instead (discriminated with the pointer tag) represent &'static &'static str. This is a double indirection, because &'static str is a wide pointer that stores the string length next to its address but we didn’t want to double the size of NodeStr. In this case there is no location information. This is used by the name!("example") macro which does GraphQL name syntax checking at compile-time then calls NodeStr::from_static(&"example") with an extra & which gets promoted to a constant.

A quality of this design is that size_of::<NodeStr>() is only 8 bytes.

First idea: String::leak

GraphQL type and field names in a schema rarely change. What if we accepted a small memory leak to turn them into &'static str?

However that’s not enough to use the existing static representation of NodeStr:

  • String::leak doesn’t give us the double &'static &'static str indirection. We could potentially allocate Box<&'static str> and also leak that, but even then…
  • For diagnostics we want type and field names in the schema to carry source location, which the current static NodeStr doesn’t.

Overall direction for a new memory representation

If we’re reconsidering the design of NodeStr, we might as well pick one that makes conversion to opentelemetry::StringValue cheap in all cases. This means a dual representation based on either Arc<str> or &'static str. (The standard library Arc allocates space for a weak reference counter that we don’t use, but we can accept that cost. Also see below for a Weak come-back.)

This in turn means moving the source location outside of the allocation, to be inline in the NodeStr struct. At first this seems like a loss: if a string is cloned many times, its source location (or even space to record the lack thereof) is duplicated. But maybe that doesn’t happen that much? On the contrary, inline location could allow strings at different locations to share a heap allocation (hint hint more foreshadowing).

A straightforward (enum { Arc<str>, &'static str }, Option<NodeLocation>) has size_of() == 40 bytes, which feels like too much.

Layout optimizations

rowan::TextRange inside NodeLocation uses 32-bit offsets, so apollo-parser is already limited to 4 GiB inputs NodeLocation can be decomposed into { file_id, start_offset: u32, end_offset: u32 }.

Both Arc<str> and &'static str are represented as { data: *const u8, len: usize }. The length is kind of already represented in the location is the difference between end and start offsets. Accessing the length to build a &str is much more common than accessing the end offset, so we can store { start_offset: u32, len: u32 } and compute the end offset on demand. When a synthetic string doesn’t have a source location, we can set start_offset to zero but still use len.

Finally, an actual Rust enum would allocate 64 bits to the 1-bit discriminant in order preserve memory alignment. Like before, we can stash that bit somewhere else:

  • &'static str has align_of() == 1 (it could be a sub-slice of some other string) so we can’t use the lower pointer bit for the tag
  • The upper bit of start_offset would work, but bring the existing 4 GiB file input limit to 2 GiB. This could be ok but we can avoid it
  • The upper bit of len would move that 2 GiB limit to the length of a single string sounds fine, but extra steps in the common code path of rebuilding a &str can easily be avoided
  • FileId contains a NonZeroI64, we can easily spare a bit there

We end up with this representation which has size_of() == 24 bytes:

struct {
    ptr: NonNull<u8>,
    len: u32,
    start_offset: u32,
    tagged_file_id: TaggedFileId,
}

Consequences and tradeoffs

  • Remember when the length of the string is kind of already in the source location? Only kind of. This is the case for Name whose value appear exactly in the GraphQL source, but string values and descriptions are quoted strings that can use backslash-escape sequences. Both of these make the source span longer than the decoded value.

    This PR chooses to remove NodeStr entirely and move the special representation directly into Name. GraphQL string values use a plain String (the Value enum is already in a Node with the same source span) and descriptions use Node<String> Node<str>

  • Clippy started complaining because executable::BuildError was now had size_of() == 264 bytes. I boxed its largest variants to bring that to 104 bytes, but the impact doesn’t stop there. Every Name will now incur an extra 16 bytes of inline size. On the other hand each heap allocation for Name will have 16 fewer bytes, so I’d guess the overall impact on memory use of a running Router is negligible.

  • FileId contained NonZeroI64 with -1 and -2 reserved for special values. TaggedFileId was easier to make without negative values, so now NonZeroU64 is used and "normal" IDs start at 3 instead of 1. This causes a diff for every snapshot test that prints file IDs, but this is a one-time transition.

Optional: leaking string interner

The above gives cheap conversion to opentelemetry::StringValue already, so we could stop there. But the original String::leak idea is more doable now. But if we’re gonna leak we should definitely deduplicated the leaked strings.

This PR includes (in a separate commit) (edit: this inclusion was reverted for now) an opt-in process-wide string interner based on OnceLock<RwLock<hashbrown::RawTable<&'static str>>> and String::leak. It’s enabled by configuring Parser::new().intern_and_leak_names(). When not enabled, the parser does not leak any new string but still looks in the interner for strings already there. So intern_and_leak_names could for example be enabled when parsing a schema but not when parsing executable document, still allowing the latter to reuse already-interned field names.

Leaking is enabled for parsing built_in.graphql, so strings like Int or __typename are always interned.

Going further: weak reference string interner

Leaking memory seems risky. Another possibility is to store std::sync::Weak<str> in the interner and give out Arc<str> on lookup. This would allow strings not references anymore to be de-allocated.

Before inserting into the hash table, the interner would check if it’s about to grow with table.capacity() == table.len(). In that case (since we’re about to do an O(n) reallocation of the table anyway) it would scan the table to remove entries with Weak::strong_count() == 0 since those have been deallocated and can’t be used anymore.

A potential risk is atomic contention: atomic operations in Arc fast when uncontended but if many threads are all trying to increment and decrement the refcount of the one Arc<str> representing "Query" then they will start waiting on each other.

* Standard library Arc instead of triomphe
* No double indirection anymore for the static case
* Location is stored inline, not as Arc header
* `len` is both string value length and location span length,
  so this cannot support GraphQL descriptions or string values
  as they can contain length-changing escape sequences
@SimonSapin
Copy link
Contributor Author

I’d like to do the change from Node<String> to Node<str> but it requires a change upstream: Manishearth/triomphe#92

@SimonSapin
Copy link
Contributor Author

In today’s discussion there was clear reluctance about intentionally leaking memory, so I’ve reverted the addition of the leaking string interner. A future PR can consider adding a non-leaking interner based on Arc and Weak.

SimonSapin added a commit to apollographql/router that referenced this pull request Jun 18, 2024
@SimonSapin SimonSapin marked this pull request as ready for review June 18, 2024 15:59
@SimonSapin
Copy link
Contributor Author

With a few small changes to make apollographql/router#5477 compile and a changelog entry, this is now ready for review

Copy link
Member

@lrlna lrlna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments about some of the APIs. Overall, I think this a good direction, but I do want to make sure @tninesling tries this out in their branch to make sure this solves the performance problem we were having with tracing.

Seeing the changes in apollo-smith usage of apollo_compiler::Name does make me think it's a lot more clear to have ast::Name, so I think it needs to move back somewhere in the ast module.

Comment on lines +35 to +37

#[deprecated = "import `apollo_compiler::Name` instead"]
pub type Name = crate::Name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice, thanks for adding it!

crates/apollo-smith/src/name.rs Show resolved Hide resolved
Comment on lines +29 to +32
use crate::executable::{
BuildError as ExecutableBuildError, ConflictingFieldArgument, ConflictingFieldName,
ConflictingFieldType,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive by nit: don't we usually import one by one? I feel like you were the one to introduce this to apollo-rs even 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the router repo we configure rustfmt with imports_granularity = Item and group_imports = StdExternalCrate. I tend to write new files this way, but haven’t configured rustfmt in this repo so it defaults to Preserve config.

I’d be in favor of adopting and enforcing a style in this repo too. The problem is those options are unstable in rustfmt, so they require a Nightly toolchain. If we have them in rustfmt.toml then stable rustfmt emits warnings that they’re ignored. What makes this especially bad is that cargo fmt calls rustfmt separately for every source file, so warnings are repeated enough time to flood the terminal. In the router repo for now we don’t configure rustfmt.toml, but have an xtask command that runs cargo +nightly fmt -- --config […]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhhhh that makes sense. i got confused on what's formatted which way and where. It's not an issue, and doesn't really need the time to be put into it at the moment. Can figure it out at a later point!

crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved

impl Name {
/// Create a new `Name` parsed from the given source location
pub fn new_parsed(value: &str, location: NodeLocation) -> Result<Self, InvalidNameError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_parsed and from_static maybe instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two three independent axes for constructors:

  • Stored as Arc<str> v.s. stored as &'static str
  • With a source location (aka "parsed") v.s. without
  • Checking syntax validity and returning a Result, or "unchecked" (used when already checked at compile-time or by apollo-parser)

All combinations are possible as far as Name is concerned (though static parsed is only useful with a leaking string interner). In a later commit I replaced the "parsed" constructors with a with_location builder-like chaining method to reduce the number of constructors without limiting possibilities.

But you’re right that new_something and from_something are still used a bit inconsistently. One argument is that from_ only seems to say anything about the input parameter type, but static is also a possible internal representation inside of Name.

crates/apollo-compiler/src/name.rs Show resolved Hide resolved
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
@SimonSapin
Copy link
Contributor Author

Sorry! At some point I tried something in a separate branch and forgot to merge back. I didn’t realize I was not pushing to this PR anymore and asked for review on something quite a bit behind from what I thought. All pushed now.

Remaining to bikeshed:

  • Names of constructors (new_* v.s. from_*)
  • Name of the StaticOrArc enum
  • Name of the to_static_str_or_cloned_arc method that returns it
  • Import path of Name, InvalidNameError, and StaticOrArc. Is three items the point where they deserve their own public module? (The name! macro would belong there too but #[macro_export] currently always exports at the crate root)

@goto-bus-stop
Copy link
Member

I haven't thought a lot about this but I think it makes sense to have an apollo_compiler::name::{Name, InvalidNameError, StaticOrArc} module and a reexport of Name at the root.

@SimonSapin
Copy link
Contributor Author

Per today’s discussion I removed the StaticOrArc enum and split the conversion into two separate as_static_str() and to_cloned_arc() methods returning options. It means conversion to OpenTelementry string needs an unwrap: apollographql/router#5352 (comment)

% Conflicts:
%	crates/apollo-compiler/CHANGELOG.md
%	crates/apollo-compiler/src/node.rs
SimonSapin added a commit that referenced this pull request Jun 25, 2024
Per discussion in #868 (comment)
except I picked `group_imports = One` instead of `StdExternalCrate`
SimonSapin added a commit that referenced this pull request Jun 25, 2024
Per discussion in #868 (comment)
except I picked `group_imports = One` instead of `StdExternalCrate`
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
crates/apollo-compiler/src/node.rs Show resolved Hide resolved
crates/apollo-compiler/src/name.rs Outdated Show resolved Hide resolved
@SimonSapin SimonSapin enabled auto-merge (squash) June 27, 2024 13:24
@SimonSapin SimonSapin merged commit 3eb9b28 into main Jun 27, 2024
12 checks passed
@SimonSapin SimonSapin deleted the nodestr-revamp branch June 27, 2024 13:24
SimonSapin added a commit that referenced this pull request Jun 27, 2024
Per discussion in #868 (comment)
except I picked `group_imports = One` instead of `StdExternalCrate`
SimonSapin added a commit that referenced this pull request Jun 27, 2024
Per discussion in #868 (comment)
except I picked `group_imports = One` instead of `StdExternalCrate`
SimonSapin added a commit that referenced this pull request Jun 27, 2024
Per discussion in #868 (comment)
except I picked `group_imports = One` instead of `StdExternalCrate`
SimonSapin added a commit to apollographql/router that referenced this pull request Jul 1, 2024
Take advantage of the recent change of internal representation of `Name`:
apollographql/apollo-rs#868
SimonSapin added a commit to apollographql/router that referenced this pull request Jul 8, 2024
…5573)

Take advantage of the recent change of internal representation of Name: apollographql/apollo-rs#868
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants