How is the data allocated by ggml_allocr_alloc align? #579

ruiqurm · 2023-10-13T07:00:52Z

I am reading the allocator implementation. It rounds the allocation size up, but sets tensor->data to the block's base address rather than to a pointer adjusted for alignment. Is tensor->data guaranteed to be aligned, or is the pointer aligned later during execution?

void ggml_allocr_alloc(struct ggml_allocr * alloc, struct ggml_tensor * tensor) {
    GGML_ASSERT(!ggml_is_view(tensor)); // views generally get data pointer from one of their sources
    GGML_ASSERT(tensor->data == NULL); // avoid allocating tensor which already has memory allocated

    size_t size = ggml_backend_buffer_get_alloc_size(alloc->buffer, tensor);
    size = aligned_offset(NULL, size, alloc->alignment);

    AT_PRINTF("%s: allocating %s (%zu bytes) - ", __func__, tensor->name, size);

    size_t max_avail = 0;

    // find the best fitting free block besides the last block
    int best_fit_block = -1;
    size_t best_fit_size = SIZE_MAX;
    for (int i = 0; i < alloc->n_free_blocks - 1; i++) {
        struct free_block * block = &alloc->free_blocks[i];
        max_avail = MAX(max_avail, block->size);
        if (block->size >= size && block->size <= best_fit_size) {
            best_fit_block = i;
            best_fit_size = block->size;
        }
    }

    AT_PRINTF("block %d\n", best_fit_block);

    if (best_fit_block == -1) {
        // the last block is our last resort
        struct free_block * block = &alloc->free_blocks[alloc->n_free_blocks - 1];
        max_avail = MAX(max_avail, block->size);
        if (block->size >= size) {
            best_fit_block = alloc->n_free_blocks - 1;
        } else {
            fprintf(stderr, "%s: not enough space in the buffer (needed %zu, largest block available %zu)\n",
                    __func__, size, max_avail);
            GGML_ASSERT(!"not enough space in the buffer");
            return;
        }
    }
    struct free_block * block = &alloc->free_blocks[best_fit_block];
    void * addr = block->addr;
    block->addr = (char*)block->addr + size;
    block->size -= size;
    if (block->size == 0) {
        // remove block if empty
        alloc->n_free_blocks--;
        for (int j = best_fit_block; j < alloc->n_free_blocks; j++) {
            alloc->free_blocks[j] = alloc->free_blocks[j+1];
        }
    }

    tensor->data = addr;
    AT_PRINTF("%s: allocated data at %p\n", __func__, tensor->data);
    tensor->buffer = alloc->buffer;
    ggml_backend_buffer_init_tensor(alloc->buffer, tensor);
    alloc->max_size = MAX(alloc->max_size, (char*)addr - (char*)alloc->data + size);
}

slaren · 2023-10-13T08:13:46Z

Basically, instead of aligning every memory address, which may be hard to reverse when freeing the memory, it aligns the base address of the buffer (in ggml_allocr_reset) and then pads every allocation to a multiple of the alignment. Padding the size of the allocation can be easily repeated when freeing the tensor to figure out how much memory was allocated, and this is enough to guarantee that all allocations will start at an aligned memory address.
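To make that concrete, here is a minimal sketch of the same idea as a toy bump allocator. It is not the ggml implementation: every `sketch_*` name is hypothetical, the rounding helper only assumes that `aligned_offset` rounds `base + offset` up to the next multiple of a power-of-two alignment, and free-block tracking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (assumed behavior, not copied from ggml): return the
// smallest value >= offset such that base + value is a multiple of alignment.
static size_t sketch_aligned_offset(const void * base, size_t offset, size_t alignment) {
    assert(alignment && (alignment & (alignment - 1)) == 0); // power of two
    size_t misalign = (size_t)(((uintptr_t)base + offset) % alignment);
    return offset + (alignment - misalign) % alignment;
}

// Toy arena: `next` is kept at a multiple of `alignment` at all times.
struct sketch_arena {
    char * next;       // address that the next allocation will receive
    size_t remaining;  // bytes left after `next`
    size_t alignment;  // power of two
};

// Align the base address once, when the arena is (re)initialized.
static void sketch_arena_reset(struct sketch_arena * a, void * data, size_t size, size_t alignment) {
    size_t skip = sketch_aligned_offset(data, 0, alignment);
    assert(skip <= size);
    a->next      = (char *)data + skip;
    a->remaining = size - skip;
    a->alignment = alignment;
}

// Pad the size, not the address: an aligned base plus a sum of padded sizes
// is always aligned, so the returned pointer needs no adjustment.
static void * sketch_arena_alloc(struct sketch_arena * a, size_t size) {
    size_t padded = sketch_aligned_offset(NULL, size, a->alignment);
    if (padded > a->remaining) {
        return NULL; // not enough space
    }
    void * addr = a->next;
    a->next      += padded;
    a->remaining -= padded;
    return addr;
}
```

With a 32-byte alignment, requests of 5, 33 and 1 bytes consume 32, 64 and 32 bytes, so each returned address stays a multiple of 32. When freeing, the same `sketch_aligned_offset(NULL, size, alignment)` call recovers the padded size from the tensor's size alone, which is why no per-allocation header is needed.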

ruiqurm · 2023-10-18T03:14:59Z

OK. I see.

ruiqurm changed the title ~~How is the data allocated by allocator align?~~ How is the data allocated by ggml_allocr_alloc align? Oct 13, 2023

ruiqurm closed this as completed Oct 18, 2023
