Confirmation about the order of tensor dimensions #500
Comments
This is done because the dimension order in GGML is the reverse of the dimension order used in PyTorch. In PyTorch the order is N x C x H x W; in GGML it is W x H x C x N, where N is the batch dimension, C the channel dimension, H the number of rows, and W the number of columns. In GGML, W x H x C x N correspond to the ne[0] x ne[1] x ne[2] x ne[3] members of a tensor. Say we have a 4-dimensional tensor named "t" in both GGML and PyTorch.
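As a minimal sketch of the mapping described above (the helper name `pytorch_shape_to_ggml_ne` is hypothetical, not part of either library): a PyTorch shape (N, C, H, W) becomes GGML's ne array simply by reversing the tuple.

```python
def pytorch_shape_to_ggml_ne(shape):
    """Reverse a PyTorch-style (N, C, H, W) shape into GGML's ne order."""
    return tuple(reversed(shape))

# A batch of 8 RGB images with 32 rows and 64 columns:
torch_shape = (8, 3, 32, 64)            # N x C x H x W
ne = pytorch_shape_to_ggml_ne(torch_shape)
print(ne)  # (64, 32, 3, 8): ne[0]=W, ne[1]=H, ne[2]=C, ne[3]=N
```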
@YavorGIvanov
Here ggerganov will be able to provide the best answer, but I don't think the library was intended to match any other deep learning library, and the initial version was written relatively quickly. When designing the data type that represents a tensor, you may decide to limit the number of dimensions to a fixed upper bound and then use a static C array of integers to store the size of each dimension, plus an integer for the dimension count. This avoids dynamic allocation of the dimension array, which makes it more memory- and cache-friendly. It also has the advantage that all tensors share the same dimension array (ne[4]) plus a count. However, to make the dimension array easy to use, you need to store the dimensions in sequential order. For example, this makes it easy to compare the width dimension of 2D, 3D, and 4D tensors, because you know that the width of each is at ne[0] instead of at ne[1], ne[2], and ne[3] respectively.
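The fixed-size-array design described above can be sketched as follows (a Python illustration of the C-style layout; `GGML_MAX_DIMS = 4` and padding unused dimensions with 1 are assumptions matching the ggml convention, and `make_ne` is a hypothetical helper):

```python
GGML_MAX_DIMS = 4  # assumption: fixed upper bound on dimensions, as in ggml

def make_ne(*dims):
    """Build a fixed-size ne array plus a dimension count.

    Unused trailing entries are padded with 1, so every tensor carries
    the same ne[GGML_MAX_DIMS] layout regardless of its actual rank.
    """
    assert 1 <= len(dims) <= GGML_MAX_DIMS
    ne = list(dims) + [1] * (GGML_MAX_DIMS - len(dims))
    return ne, len(dims)

ne2d, n2 = make_ne(64, 32)          # 2D tensor: W x H
ne4d, n4 = make_ne(64, 32, 3, 8)    # 4D tensor: W x H x C x N
# Because dimensions are stored innermost-first, width is always at index 0:
assert ne2d[0] == ne4d[0] == 64
```

Storing the innermost (fastest-varying) dimension first is what makes the comparison uniform: no per-rank index arithmetic is needed.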
If you have any additional questions, you can reopen the issue or open a new one with the label "question".
@YavorGIvanov
* Retire the ggml_mul_mat() for transposed src0
  - It can always be made contiguous with ggml_cpy()
  - The code is now simplified
  - The results are deterministic with respect to the number of threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (ggerganov#502)
  - Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
  - Fix dequantization - forgot to interleave the quants
Hello! Thank you so much for developing and sharing this awesome library!
Can I ask a silly question?
I'm investigating the source code of gpt-neox and I see these lines from the convert-h5-to-ggml.py file:

Could you please tell me why we need to store the dimensions in reverse order?

Thank you in advance!