Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some IR compression improvements #28104

Merged
merged 2 commits into from
Jul 15, 2018
Merged

some IR compression improvements #28104

merged 2 commits into from
Jul 15, 2018

Conversation

JeffBezanson
Copy link
Sponsor Member

This takes another ~15MB off the system image, and simplifies the structure of the deserialization code.

  • Better encoding for Vector and Ptr types (profiled to be very common)
  • Shorter codes for SSAValue, method root references
  • Better encoding for 1- and 2-argument call expressions, GotoNode, QuoteNode, LineInfoNode, UInt32, UInt64, UnionAll types, and DataType
  • Adaptively use smaller integers for CodeInfo.codelocs
  • Use one big switch statement for deserializing

src/dump.c Outdated
jl_serialize_value(s, jl_globalref_mod(v));
jl_serialize_value(s, jl_globalref_name(v));
}
}
else if (jl_is_ssavalue(v) && ((jl_ssavalue_t*)v)->id < 256) {
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id is a signed type

src/dump.c Outdated
else if (jl_is_ssavalue(v) && ((jl_ssavalue_t*)v)->id < 65536) {
writetag(s->s, (jl_value_t*)jl_ssavalue_type);
write_uint8(s->s, TAG_LONG_SSAVALUE);
write_uint16(s->s, ((jl_ssavalue_t*)v)->id);
}
else if (jl_typeis(v, jl_slotnumber_type) && jl_slot_number(v) < 65536) {
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jl_slot_number is a signed type

jl_serialize_value(s, jl_exprarg(e, 1));
jl_serialize_value(s, jl_exprarg(e, 2));
return;
}
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need TAG_CALL0 and TAG_INVOKE{0...2} also?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I found call1 and call2 covered about 85% of cases.

write_uint8(s->s, TAG_GOTONODE);
jl_serialize_value(s, jl_get_nth_field(v, 0));
}
else if (jl_is_quotenode(v)) {
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps should have && (jl_isbits(v.value) || !is_ast_node(v.value)) here?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's necessary, since all this really does is shorten the object header for a QuoteNode.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sure that the field doesn't have a special representation (which could end up creating (non-egal) copies of the supposed-to-be-constant object in the AST)

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah --- that check would have to be added in is_ast_node. At this point we're definitely serializing it either way.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, that still doesn't matter since the value will be recursively checked for is_ast_node anyway.

src/dump.c Outdated
@@ -800,23 +887,39 @@ static void jl_serialize_value_(jl_serializer_state *s, jl_value_t *v, int as_li
}
else if (jl_typeis(v, jl_int64_type)) {
void *data = jl_data_ptr(v);
if (*(int64_t*)data >= S32_MIN && *(int64_t*)data <= S32_MAX) {
writetag(s->s, (jl_value_t*)SmallInt64_tag);
if (*(int64_t*)data >= -32768 && *(int64_t*)data <= 32767) {
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use INT16_MIN (C99 standard)?

src/dump.c Outdated
writetag(s->s, (jl_value_t*)jl_int32_type);
write_int32(s->s, *(int32_t*)jl_data_ptr(v));
void *data = jl_data_ptr(v);
if (*(int64_t*)data >= -32768 && *(int64_t*)data <= 32767) {
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int32_t*

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, good catch.

src/dump.c Outdated
}
else if (vtag == (jl_value_t*)CommonSym_tag) {
assert(!ios_eof(s->s));
jl_value_t *v; size_t i, n;
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these on the same line?

src/dump.c Outdated
case 0:
tag = read_uint8(s->s);
return deser_tag[tag];
case TAG_BACKREF: case TAG_SHORT_BACKREF:
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put JL_FALLTHROUGH; between these

src/dump.c Outdated
assert(vtag == (jl_value_t*)jl_datatype_type || vtag == (jl_value_t*)SmallDataType_tag);
return jl_deserialize_value_any(s, vtag, loc);
case TAG_LINEINFO:
v = jl_gc_alloc(s->ptls, jl_lineinfonode_type->size, jl_lineinfonode_type);
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v = jl_new_struct_uninit(jl_lineinfonode_type) (or at least jl_datatype_size(jl_lineinfonode_type), I think)

void **bp = ptrhash_bp(&ser_tag, v);
if (*bp != HT_NOTFOUND) {
write_as_tag(s->s, (uint8_t)(intptr_t)*bp);
void *tag = ptrhash_get(&ser_tag, v);
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is performance-sensitive at all, I think we could guard it these ptrhash_get with ((jl_datatype_t*)jl_typeof(ser_tag))->name->module == jl_core_module I'm not sure that would make any difference however, but just an idea.

@JeffBezanson JeffBezanson merged commit a8c53e0 into master Jul 15, 2018
@JeffBezanson JeffBezanson deleted the jb/ircompress branch July 15, 2018 03:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants