Cannot load teknium/Replit-v2-CodeInstruct-3B #436

shrikrishnaholla · 2023-08-06T16:02:56Z

This issue is along similar lines to #248, but it concerns replit-v2 models, not replit-v1.

I am using ggml@ a301077 and am having trouble loading teknium/Replit-v2-CodeInstruct-3B

I used examples/replit/convert-h5-to-ggml.py to convert to ggml f32.
I also created q4_1 and q8_0 quantized versions using replit-quantize.

However, when trying to load any of the f32, q4_1, or q8_0 versions of the model with replit
(e.g., ./bin/replit -m Replit-v2-CodeInstruct-3B-f32.bin -p "def hello_world():")
I get:

replit_model_load: unknown tensor 'transformer.blocks.0.norm_1.weight' in model file

Any ideas?

klosax · 2023-08-06T23:37:54Z

The model loads fine for me. The named tensor should be recognized and loaded. Did you get any compilation warnings?

shrikrishnaholla · 2023-08-07T16:14:36Z

@klosax One thing might be that I had received this error when I ran python examples/replit/convert-h5-to-ggml.py ../teknium-Replit-v2-CodeInstruct-3B/Replit-v2-CodeInstruct-3B/ 0

Traceback (most recent call last):
  File "~/ggml/examples/replit/convert-h5-to-ggml.py", line 7, in <module>
    import sentencepiece.sentencepiece_model_pb2 as model
  File "~/.local/lib/python3.10/site-packages/sentencepiece/sentencepiece_model_pb2.py", line 34, in <module>
    _descriptor.EnumValueDescriptor(
  File "/opt/miniconda3/conda/envs/textgen/lib/python3.10/site-packages/google/protobuf/descriptor.py", line 796, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

So I rephrased the command like this:

PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python examples/replit/convert-h5-to-ggml.py ../teknium-Replit-v2-CodeInstruct-3B/Replit-v2-CodeInstruct-3B/ 0

and the conversion completed successfully.

Could this have anything to do with it?

klosax · 2023-08-07T17:34:05Z

> Could this have anything to do with it?

I guess not if the model file was converted successfully.

Any compilation warnings when compiling the inference binary?

shrikrishnaholla · 2023-08-08T05:15:56Z

Nothing stood out to me in particular...

cmake .. && make -j4 replit replit-quantize
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Linux detected
-- x86 detected
-- Linux detected
-- Configuring done (0.1s)
-- Generating done (0.3s)
-- Build files have been written to: ~/ggml/build
[ 25%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[ 25%] Building C object src/CMakeFiles/ggml.dir/ggml.c.o
In file included from /usr/include/string.h:535,
                 from ~/ggml/src/ggml.c:21:
In function ‘memcpy’,
    inlined from ‘ggml_set_op_params’ at ~/ggml/src/ggml.c:4642:5,
    inlined from ‘ggml_conv_1d’ at ~/ggml/src/ggml.c:6883:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 11] is out of the bounds [0, 0] [-Warray-bounds]
   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘memcpy’,
    inlined from ‘ggml_set_op_params’ at ~/ggml/src/ggml.c:4642:5,
    inlined from ‘ggml_conv_2d’ at ~/ggml/src/ggml.c:6923:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 23] is out of the bounds [0, 0] [-Warray-bounds]
   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘memcpy’,
    inlined from ‘ggml_set_op_params’ at ~/ggml/src/ggml.c:4642:5,
    inlined from ‘ggml_conv_1d’ at ~/ggml/src/ggml.c:6883:5,
    inlined from ‘ggml_conv_1d_ph’ at ~/ggml/src/ggml.c:6942:12:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 11] is out of the bounds [0, 0] [-Warray-bounds]
   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘memcpy’,
    inlined from ‘ggml_set_op_params’ at ~/ggml/src/ggml.c:4642:5,
    inlined from ‘ggml_pool_2d’ at ~/ggml/src/ggml.c:7015:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 27] is out of the bounds [0, 0] [-Warray-bounds]
   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘memcpy’,
    inlined from ‘ggml_set_op_params’ at ~/ggml/src/ggml.c:4642:5,
    inlined from ‘ggml_win_part’ at ~/ggml/src/ggml.c:7183:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 11] is out of the bounds [0, 0] [-Warray-bounds]
   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 37%] Linking CXX static library libcommon.a
[ 37%] Built target common
[ 50%] Linking C static library libggml.a
[ 50%] Built target ggml
[ 62%] Building CXX object examples/CMakeFiles/common-ggml.dir/common-ggml.cpp.o
[ 75%] Linking CXX static library libcommon-ggml.a
[ 75%] Built target common-ggml
[ 87%] Building CXX object examples/replit/CMakeFiles/replit.dir/main.cpp.o
[100%] Linking CXX executable ../../bin/replit
[100%] Built target replit
[ 37%] Built target common
[ 50%] Built target ggml
[ 75%] Built target common-ggml
[ 87%] Building CXX object examples/replit/CMakeFiles/replit-quantize.dir/quantize.cpp.o
[100%] Linking CXX executable ../../bin/replit-quantize
[100%] Built target replit-quantize

@klosax

klosax · 2023-08-08T08:54:38Z

> string_fortified.h:29:10: warning: ‘__builtin_memcpy’ offset [0, 11] is out of the bounds [0, 0] [-Warray-bounds]

The model file seems to be fine, since the tensor transformer.blocks.0.norm_1.weight is in it. The inference binary should recognize the tensor and load it. My guess is that something is wrong with your compiler, since you get warnings that could be related to the problem. The binary does string comparisons to recognize the tensor names.

Try updating or reinstalling the compiler.

shrikrishnaholla · 2023-08-08T16:04:13Z

This is my version. Should it be upgraded?

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)

klosax · 2023-08-08T17:53:36Z

I think it should work with your compiler.

But you could try changing this line

ggml/examples/replit/main.cpp

Line 379 in 244776a

if (model.tensors.find(name.data()) == model.tensors.end()) {

to

if (model.tensors.find(name) == model.tensors.end()) {

and compile again.
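
For anyone wondering what the two spellings do differently, here is a minimal sketch. It is not code from ggml: it assumes `model.tensors` is a `std::map` keyed by `std::string` (its declaration is not shown in this thread), and the helper names are hypothetical. `find(name)` compares the whole `std::string`, while `find(name.data())` first builds a temporary `std::string` from a `const char *`, which stops at the first NUL byte. For a name without embedded NULs the two should behave the same, so this only shows one case where they can diverge and does not claim to explain the failure reported above.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the tensor table; the real map stores tensor pointers.
using TensorMap = std::map<std::string, int>;

// Lookup using the std::string itself: the full length takes part in the comparison.
bool has_tensor(const TensorMap & tensors, const std::string & name) {
    return tensors.find(name) != tensors.end();
}

// Lookup via data(): a temporary std::string is built from a const char *,
// so everything after the first NUL byte is dropped before the comparison.
bool has_tensor_via_data(const TensorMap & tensors, const std::string & name) {
    return tensors.find(name.data()) != tensors.end();
}

// Builds a 14-byte name with an embedded NUL to make the difference visible.
std::string name_with_embedded_nul() {
    return std::string("norm_1\0.weight", 14);
}
```

With an ordinary key such as "transformer.blocks.0.norm_1.weight" both helpers return the same result; with the key from `name_with_embedded_nul()` only `has_tensor` finds it.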

shrikrishnaholla · 2023-08-09T15:56:54Z

This worked! @klosax , thanks for your time and for the help. I was lost without you 🙏

Would this change be useful to others as well? Should I commit and raise a PR?

klosax · 2023-08-10T08:12:52Z

Great!

Then all references to name.data() should be changed to name, and in lines with fprintf or printf they should be changed to name.c_str().
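
The printf half of that advice can be sketched as follows. This is an illustration with a hypothetical helper name, not the code from the eventual PR; the message format is copied from the error reported at the top of this issue. A `%s` conversion expects a NUL-terminated `const char *`, and a `std::string` object cannot be passed through a variadic argument list, so the name is converted with `c_str()`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats the "unknown tensor" message into a std::string
// instead of writing it to stderr, so the result is easy to inspect.
std::string format_unknown_tensor(const std::string & func, const std::string & name) {
    char buf[256];
    // %s needs a NUL-terminated const char *, hence c_str() on both strings.
    std::snprintf(buf, sizeof(buf), "%s: unknown tensor '%s' in model file\n",
                  func.c_str(), name.c_str());
    return std::string(buf);
}
```

In the loader itself the same pattern would presumably be written as an `fprintf(stderr, ...)` call with `name.c_str()` as the argument.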

> Would this change be useful to others as well? Should I commit and raise a PR?

It looks like this error can also be found in other examples and all of them should be fixed.

shrikrishnaholla · 2023-08-10T08:19:25Z

Wouldn't that be breaking compilation of other models as well? Would you like me to try and reproduce for other classes of models before making a fix?

Because if what you say is true, then wouldn't this be a huge change? 🤔

klosax · 2023-08-10T08:36:41Z

name is a std::string and should be accessed as such; the contents should not be accessed directly through data() as is done here.

All examples compile and work fine for me using gcc 9, so my guess is that your gcc 11 handles this differently than the older compilers, and that is the reason it won't work for you.

shrikrishnaholla · 2023-08-10T08:54:05Z

Understood. So if I'm understanding correctly, even if name is accessed directly, since it is an std::string it won't break for older compilers like the one you use, correct?

Apologies for asking what might be basic questions. My C++ is rusty, so I don't want to be creating a regression and getting angry emails 😅

klosax · 2023-08-10T09:02:18Z

Yes, the changes won't break anything for older compilers. I will make a PR to change this in all examples.

klosax · 2023-08-10T09:37:19Z

> Would you like me to try and reproduce for other classes of models before making a fix?

If you like you could test one other example to see if the same error is there and if it is fixed by this change.

klosax mentioned this issue Aug 10, 2023

Fix examples tensor name access #443

Merged
