Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numa #1556

Merged
merged 20 commits into from
Jun 26, 2023
Merged

Numa #1556

Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
disable mmap prefetch/readahead for NUMA systems
  • Loading branch information
zrm committed May 21, 2023
commit 0d23f8ce8da6bb13d557233c7b76a5563055fcf5
15 changes: 14 additions & 1 deletion llama-util.h
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,9 @@ static std::string llama_format_win_err(DWORD err) {
}
#endif

extern "C" {
bool ggml_is_numa();
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be avoided. Probably utilize new GGML_USE_NUMA define

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function is still needed, because even if NUMA support is compiled in, there are some operations that only make sense if the system actually has more than one NUMA node. It also makes it easier to disable those operations, because that function always returns false if ggml_numa_init() is never called, so we can add a --numa option and not call ggml_numa_init() otherwise, causing those operations to be disabled. The latest commits do this.

struct llama_mmap {
void * addr;
size_t size;
Expand All @@ -176,8 +179,10 @@ struct llama_mmap {
size = file->size;
int fd = fileno(file->fp);
int flags = MAP_SHARED;
// prefetch/readahead impairs performance on NUMA systems
if (ggml_is_numa()) prefetch = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (ggml_is_numa()) prefetch = 0;
if (ggml_is_numa()) { prefetch = 0;
}

#ifdef __linux__
flags |= MAP_POPULATE;
if (prefetch) flags |= MAP_POPULATE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (prefetch) flags |= MAP_POPULATE;
if (prefetch) { flags |= MAP_POPULATE;
}

#endif
addr = mmap(NULL, file->size, PROT_READ, flags, fd, 0);
if (addr == MAP_FAILED) {
Expand All @@ -191,6 +196,14 @@ struct llama_mmap {
strerror(errno));
}
}
if (ggml_is_numa()) {
// advise the kernel not to use readahead
// (because the next page might not belong on the same node)
if (madvise(addr, file->size, MADV_RANDOM)) {
fprintf(stderr, "warning: madvise(.., MADV_RANDOM) failed: %s\n",
strerror(errno));
}
}
}

~llama_mmap() {
Expand Down