
fix #149 - load tensors by type, ignoring filetype #152

Merged: 4 commits merged into rustformers:main from load-tensors-as-stored on Apr 25, 2023

Conversation

philpax (Collaborator) commented Apr 24, 2023

As noticed by @KerfuffleV2, both loaders incorrectly handle newer models that mix tensor types indiscriminately, due to a confusion between the file-level ftype/f16 hyperparameter and the per-tensor element type.
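
For context, a minimal sketch of the distinction being decoupled, with hypothetical names (this is not the crate's actual API): the file-level ftype describes the model's overall quantization, while each tensor record carries its own type tag, which must be honoured as-is. The tag values below follow the GGML convention of the time (0 = f32, 1 = f16, 2 = q4_0, 3 = q4_1):

```rust
/// Hypothetical sketch: the element type is read from each tensor's own
/// header, never derived from the file-level `ftype` hyperparameter.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElementType {
    F32,
    F16,
    Q4_0,
    Q4_1,
}

impl ElementType {
    /// Map the integer tag stored in a tensor header to an element type.
    /// (Tag values follow the GGML convention; anything else is an error.)
    pub fn from_u32(tag: u32) -> Result<Self, String> {
        match tag {
            0 => Ok(Self::F32),
            1 => Ok(Self::F16),
            2 => Ok(Self::Q4_0),
            3 => Ok(Self::Q4_1),
            _ => Err(format!("unknown element type tag {tag}")),
        }
    }
}
```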

I was planning on bringing #84 up to date first, but realised that I needed to figure out what was going on with ftype before baking that in.

I've addressed the issue by decoupling them properly, and then switching over to loading the tensors from file instead of trying to preallocate them with the wrong type. This mirrors the changes made in ggerganov/llama.cpp#801.
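
To illustrate the approach (a sketch, not the actual implementation): the loop below reads each tensor's shape and type tag from its own header and sizes the buffer from that per-tensor type. It reuses the hypothetical ElementType enum from the sketch above, follows the unversioned GGML record layout (n_dims, name length, type tag, dims, name, data), and ignores the alignment padding and mmap handling of the newer ggjt container:

```rust
use std::io::{self, BufRead, Read};

/// Read a little-endian u32 (illustrative helper, not the crate's API).
fn read_u32(reader: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

/// Bytes needed for `n_elements` values of type `ty`, following the GGML
/// block layouts of the time (q4_0: 20 bytes per 32 elements; q4_1: 24).
fn byte_size(ty: ElementType, n_elements: usize) -> usize {
    match ty {
        ElementType::F32 => n_elements * 4,
        ElementType::F16 => n_elements * 2,
        ElementType::Q4_0 => n_elements / 32 * 20,
        ElementType::Q4_1 => n_elements / 32 * 24,
    }
}

/// Read every tensor record, allocating each buffer from the element type
/// in *its own* header rather than preallocating from the file-level ftype.
fn load_tensors(
    reader: &mut impl BufRead,
) -> io::Result<Vec<(String, ElementType, Vec<u8>)>> {
    let mut tensors = Vec::new();
    while !reader.fill_buf()?.is_empty() {
        let n_dims = read_u32(reader)? as usize;
        let name_len = read_u32(reader)? as usize;
        // The per-tensor type tag that the loaders previously conflated
        // with the file-level ftype.
        let element_type = ElementType::from_u32(read_u32(reader)?)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;

        let mut n_elements = 1usize;
        for _ in 0..n_dims {
            n_elements *= read_u32(reader)? as usize;
        }

        let mut name = vec![0u8; name_len];
        reader.read_exact(&mut name)?;
        let name = String::from_utf8_lossy(&name).into_owned();

        // Size the buffer from the tensor's own type, then read it verbatim.
        let mut data = vec![0u8; byte_size(element_type, n_elements)];
        reader.read_exact(&mut data)?;
        tensors.push((name, element_type, data));
    }
    Ok(tensors)
}
```

The point of the sketch is that byte_size is driven by the tag read from the tensor record itself, so a file mixing (say) f16 and q4_0 tensors loads correctly regardless of its file-level ftype.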

I've tested this with Alpaca 7B GGML and gpt4-x-alpaca-13b-native-4bit-128g, the latter with and without mmap, and all seems to work.


I'm not happy with how I broke the isolation between the loader and the Model with Model::new_loader2, but I'm going to revisit this once I remove loader1 as part of #150.

loader1 is still broken (i.e. it retains the old behaviour), but that's OK because it's going away soon ^_^

philpax merged commit 8254deb into rustformers:main on Apr 25, 2023
philpax deleted the load-tensors-as-stored branch on Apr 25, 2023 at 01:41