How is the data allocated by ggml_allocr_alloc align? #579

Closed
ruiqurm opened this issue Oct 13, 2023 · 2 comments

ruiqurm commented Oct 13, 2023

I am reading the allocator implementation. It pads the requested size to the alignment, but sets tensor->data to the free block's base address rather than to a pointer adjusted for alignment. Is tensor->data guaranteed to be aligned, or is the pointer aligned later during execution?

void ggml_allocr_alloc(struct ggml_allocr * alloc, struct ggml_tensor * tensor) {
    GGML_ASSERT(!ggml_is_view(tensor)); // views generally get data pointer from one of their sources
    GGML_ASSERT(tensor->data == NULL); // avoid allocating tensor which already has memory allocated

    size_t size = ggml_backend_buffer_get_alloc_size(alloc->buffer, tensor);
    size = aligned_offset(NULL, size, alloc->alignment);

    AT_PRINTF("%s: allocating %s (%zu bytes) - ", __func__, tensor->name, size);

    size_t max_avail = 0;

    // find the best fitting free block besides the last block
    int best_fit_block = -1;
    size_t best_fit_size = SIZE_MAX;
    for (int i = 0; i < alloc->n_free_blocks - 1; i++) {
        struct free_block * block = &alloc->free_blocks[i];
        max_avail = MAX(max_avail, block->size);
        if (block->size >= size && block->size <= best_fit_size) {
            best_fit_block = i;
            best_fit_size = block->size;
        }
    }

    AT_PRINTF("block %d\n", best_fit_block);

    if (best_fit_block == -1) {
        // the last block is our last resort
        struct free_block * block = &alloc->free_blocks[alloc->n_free_blocks - 1];
        max_avail = MAX(max_avail, block->size);
        if (block->size >= size) {
            best_fit_block = alloc->n_free_blocks - 1;
        } else {
            fprintf(stderr, "%s: not enough space in the buffer (needed %zu, largest block available %zu)\n",
                    __func__, size, max_avail);
            GGML_ASSERT(!"not enough space in the buffer");
            return;
        }
    }
    struct free_block * block = &alloc->free_blocks[best_fit_block];
    void * addr = block->addr;
    block->addr = (char*)block->addr + size;
    block->size -= size;
    if (block->size == 0) {
        // remove block if empty
        alloc->n_free_blocks--;
        for (int j = best_fit_block; j < alloc->n_free_blocks; j++) {
            alloc->free_blocks[j] = alloc->free_blocks[j+1];
        }
    }

    tensor->data = addr;
    AT_PRINTF("%s: allocated data at %p\n", __func__, tensor->data);
    tensor->buffer = alloc->buffer;
    ggml_backend_buffer_init_tensor(alloc->buffer, tensor);
    alloc->max_size = MAX(alloc->max_size, (char*)addr - (char*)alloc->data + size);
}
@ruiqurm ruiqurm changed the title How is the data allocated by allocator align? How is the data allocated by ggml_allocr_alloc align? Oct 13, 2023
Collaborator

slaren commented Oct 13, 2023

Basically, instead of aligning every memory address, which may be hard to reverse when freeing the memory, the allocator aligns the base address of the buffer (in ggml_allocr_reset) and then pads the size of every allocation to a multiple of the alignment. Padding the size can easily be repeated when freeing the tensor to figure out how much memory was allocated, and together with the aligned base address this is enough to guarantee that every allocation starts at an aligned memory address.
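
To make the scheme described above concrete, here is a minimal sketch of a toy bump allocator that aligns the base once and afterwards only pads sizes. It is an illustration under stated assumptions, not the ggml implementation: pad_to_alignment, toy_allocr, toy_reset, toy_alloc and toy_allocated_size are hypothetical names, and the single "next" pointer stands in for one free block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the ggml API): round `offset` up so that
 * base + offset is a multiple of `alignment`, which must be a power of two. */
static size_t pad_to_alignment(const void * base, size_t offset, size_t alignment) {
    assert(alignment && !(alignment & (alignment - 1)));
    size_t misalign = ((uintptr_t)base + offset) % alignment;
    return offset + (alignment - misalign) % alignment;
}

/* Toy bump allocator standing in for a single free block. */
struct toy_allocr {
    char * next;       /* next free address; stays aligned by construction */
    size_t remaining;  /* bytes left after `next` */
    size_t alignment;
};

/* Align the base address once, which is the role of the reset step. */
static void toy_reset(struct toy_allocr * a, void * buf, size_t size, size_t alignment) {
    size_t skip = pad_to_alignment(buf, 0, alignment);
    assert(skip <= size);
    a->next      = (char *)buf + skip;
    a->remaining = size - skip;
    a->alignment = alignment;
}

/* Pad only the size; the returned pointer is the unmodified block start.
 * Aligned start + multiple-of-alignment size => the next start is aligned too. */
static void * toy_alloc(struct toy_allocr * a, size_t size) {
    size_t padded = pad_to_alignment(NULL, size, a->alignment);
    if (padded > a->remaining) {
        return NULL;
    }
    void * addr = a->next;
    a->next      += padded;
    a->remaining -= padded;
    return addr;
}

/* On free, the reserved size can be recomputed from the requested size alone,
 * with no per-allocation bookkeeping of how far a pointer was shifted. */
static size_t toy_allocated_size(const struct toy_allocr * a, size_t size) {
    return pad_to_alignment(NULL, size, a->alignment);
}
```

With a 32-byte alignment, requests of 5, 33 and 32 bytes reserve 32, 64 and 32 bytes respectively, so each returned address is the previous one plus a multiple of 32 and remains aligned even if the underlying buffer started at an odd address.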

Author

ruiqurm commented Oct 18, 2023

OK. I see.

@ruiqurm ruiqurm closed this as completed Oct 18, 2023