Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use correct thread stack limits in jl_init_stack_limits #54639

Merged
merged 1 commit into from
Jun 12, 2024

Conversation

fingolfin
Copy link
Contributor

Actually these do not really seem to matter much in practice (as explained to me by @vtjnash on Slack, thank you). The one thing that did cause me some grief here, though, was that stack_hi was the low address and stack_lo the high address.

A smaller change to achieve that "hi means high, lo means low" would be to just reverse stack_hi and stack_lo.

Actually these do not really seem to matter much in practice. The
one thing that did cause me some issues here, though, was that
stack_hi was the low address and stack_lo the high address.
*stack_lo = (void*)stackaddr;
*stack_hi = (void*)__builtin_frame_address(0);
size_t stacksize = pthread_get_stacksize_np(thread);
*stack_hi = stackaddr;
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you meant to switch these, since the malloc region is stackaddr to stackaddr+stacksize?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am misunderstanding you (sorry), but: stacks on all OS/CPU combos supported by Julia grow downwards, the stack starts at the highest address stackaddr and grows downwards towards stackaddr-stacksize (right after which the guard pages comes by default, at least for pthread created stacks with default settings).

I double checked this in darwin_libpthread, glibc and muslc and via real-world experiments on darwin/macos and linux+glibc (no muslc for me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash does my explanation make sense to you? Any other questions?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, yeah, it is confusing that it is inverted from the values returned by pthread_attr_getstack, but apparently that is the legacy behavior

@vtjnash vtjnash added the merge me PR is reviewed. Merge when all tests are passing label Jun 12, 2024
@oscardssmith oscardssmith merged commit f1c601d into JuliaLang:master Jun 12, 2024
8 checks passed
@oscardssmith oscardssmith removed the merge me PR is reviewed. Merge when all tests are passing label Jun 12, 2024
@fingolfin fingolfin deleted the mh/jl_init_stack_limits branch June 20, 2024 15:59
vtjnash added a commit that referenced this pull request Aug 16, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not stick.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.
vtjnash added a commit that referenced this pull request Aug 17, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not stick.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.
vtjnash added a commit that referenced this pull request Aug 19, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not stick.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.
vtjnash added a commit that referenced this pull request Aug 20, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not stick.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.
vtjnash added a commit that referenced this pull request Aug 20, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished. It makes this mostly just a huge
renaming job.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not sticky.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.

Also now fixes #43124, though I
am not quite confident enough in it to re-enable that test right now.
KristofferC pushed a commit that referenced this pull request Sep 12, 2024
Move the registers onto the stack, so that they only are present when
the Task is actually switched out, saving memory when the Task is not
running yet or already finished. It makes this mostly just a huge
renaming job.

On Linux x86_64 this reduces it from 376 bytes to 184 bytes.

Has some additional advantages too, such as copy_stack tasks (e.g. with
always_copy_stacks) can migrate to other threads before starting if they
are not sticky.

Also fixes a variable that got mixed up by #54639 and caused
always_copy_stacks to abort, since the stack limits were wrong.

Also now fixes #43124, though I
am not quite confident enough in it to re-enable that test right now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants