Make issue. Maybe flags? #302
Open · RiccaDS opened this issue Mar 28, 2023 · 18 comments

RiccaDS commented Mar 28, 2023

Hey, I have resolved many issues so far; I feel this is the last one. Any idea what might be causing this?

(base) riccardo@riccardo-K53SV:/mnt/Storage/software/dalai$ npx dalai alpaca install 7B --home /mnt/Storage/software/dalai
mkdir /mnt/Storage/software/dalai
{ method: 'install', callparams: [ '7B' ] }
mkdir /mnt/Storage/software/dalai/alpaca
try fetching /mnt/Storage/software/dalai/alpaca https://github.com/ItsPi3141/alpaca.cpp
[E] Pull TypeError: Cannot read properties of null (reading 'split')
    at new GitConfig (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:1604:30)
    at GitConfig.from (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:1627:12)
    at GitConfigManager.get (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:1750:22)
    at async _getConfig (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:5397:18)
    at async normalizeAuthorObject (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:5407:19)
    at async Object.pull (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/isomorphic-git/index.cjs:11682:20)
    at async Dalai.add (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:394:7)
    at async Dalai.install (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:346:5) {
  caller: 'git.pull'
}
try cloning /mnt/Storage/software/dalai/alpaca https://github.com/ItsPi3141/alpaca.cpp
next alpaca [AsyncFunction: make]
exec: make in /mnt/Storage/software/dalai/alpaca
make
exit
(base) riccardo@riccardo-K53SV:/mnt/Storage/software/dalai/alpaca$ make
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  x86_64
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -mavx -msse3
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:  
I CC:       cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
I CXX:      g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -mavx -msse3   -c ggml.c -o ggml.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/9/include/immintrin.h:109,
                 from ggml.c:155:
ggml.c: In function ‘ggml_vec_dot_f16’:
/usr/lib/gcc/x86_64-linux-gnu/9/include/f16cintrin.h:52:1: error: inlining failed in call to always_inline ‘_mm256_cvtph_ps’: target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
ggml.c:911:33: note: called from here
  911 | #define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:921:37: note: in expansion of macro ‘GGML_F32Cx8_LOAD’
  921 | #define GGML_F16_VEC_LOAD(p, i)     GGML_F32Cx8_LOAD(p)
      |                                     ^~~~~~~~~~~~~~~~
ggml.c:1274:21: note: in expansion of macro ‘GGML_F16_VEC_LOAD’
 1274 |             ay[j] = GGML_F16_VEC_LOAD(y + i + j*GGML_F16_EPR, j);
      |                     ^~~~~~~~~~~~~~~~~
[The same "inlining failed in call to always_inline '_mm256_cvtph_ps': target specific option mismatch" error repeats five more times, for the GGML_F16_VEC_LOAD expansions at ggml.c:1273 (ax) and ggml.c:1274 (ay).]
make: *** [Makefile:186: ggml.o] Errore 1
(base) riccardo@riccardo-K53SV:/mnt/Storage/software/dalai/alpaca$ exit
exit
ERROR Error: running 'make' failed
    at Alpaca.make (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/alpaca.js:51:15)
    at async Dalai.add (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:412:5)
    at async Dalai.install (/home/riccardo/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:346:5)
@Monotoba

I faced a similar issue. cc is being passed a flag that doesn't match your architecture. Open a terminal and run these commands:
$> uname -s
$> uname -p
$> uname -m

Note the result of each. Then open dalai/llama/Makefile; around line 82 you should see the test for 'Linux' platforms. In that section, the Makefile tests for various x86 architectures by using grep to interrogate /proc/cpuinfo. In my case, on my old machine, the flag set via AVX1_M was incorrect for my architecture. I replaced it with a suitable flag, but I think -march=native will also work. This section seems to be testing the processor's SIMD and vector math capabilities. You can sort through /proc/cpuinfo to learn the capabilities of your target. I think the Makefile simply needs to be updated for additional architectures to resolve these issues, and the comment in the Makefile suggests as much.
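As a quick check, something like this should print only the flags from that Makefile section that your CPU actually reports (a minimal sketch; adjust the flag list as needed):

# List which of the flags the Makefile greps for are present on this CPU:
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -xE 'avx2?|fma|f16c|sse3'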


RiccaDS commented Mar 28, 2023

Thank you very much for the reply. Indeed my laptop is quite old, around 14 years, but still doing great. I'm running a quad-core Intel Core i7-2630QM. I checked the cpuinfo file and couldn't find any of the flags mentioned in the Makefile. I guess the relevant string in the cpuinfo file is this one:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Do you have any idea how I might proceed? I will try to work it out, but this Makefile business is new to me. This is the relevant Makefile section, just for reference:

else ifeq ($(UNAME_S),Linux)
		AVX1_M := $(shell grep "avx " /proc/cpuinfo)
		ifneq (,$(findstring avx,$(AVX1_M)))
			CFLAGS += -mavx
		endif
		AVX2_M := $(shell grep "avx2 " /proc/cpuinfo)
		ifneq (,$(findstring avx2,$(AVX2_M)))
			CFLAGS += -mavx2
		endif
		FMA_M := $(shell grep "fma " /proc/cpuinfo)
		ifneq (,$(findstring fma,$(FMA_M)))
			CFLAGS += -mfma
		endif
		F16C_M := $(shell grep "f16c " /proc/cpuinfo)
		ifneq (,$(findstring f16c,$(F16C_M)))
			CFLAGS += -mf16c
		endif
		SSE3_M := $(shell grep "sse3 " /proc/cpuinfo)
		ifneq (,$(findstring sse3,$(SSE3_M)))
			CFLAGS += -msse3
		endif
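One way to see which of these features the compiler would pick on its own is to ask GCC what -march=native enables (a quick sketch, assuming GCC, as in the log above):

# Show which of the Makefile's candidate flags -march=native would turn on:
gcc -march=native -Q --help=target | grep -E 'm(avx2|avx|fma|f16c|sse3)\b'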

@Monotoba

Yes, that's it. You can also grep the file for the flags you are looking for (e.g. grep avx /proc/cpuinfo), just as the Makefile does. Try commenting out the first flag, i.e. # CFLAGS += -mavx, and see if that solves the issue.
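For example (a minimal sketch, assuming GNU sed and the alpaca path from the log above):

# Comment out the line that adds -mavx, leaving -mavx2 and the rest untouched:
sed -i 's/^\(\s*CFLAGS += -mavx\)$/# \1/' /mnt/Storage/software/dalai/alpaca/Makefile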

@Monotoba

To learn more about Makefiles, see this short intro: https://www.youtube.com/watch?v=_r7i5X0rXJk
or https://www.youtube.com/watch?v=20GC9mYoFGs
10 or 15 minutes will give you a working knowledge. You won't be an expert, but it will be worth your time.

@Monotoba

Oh, I should have mentioned that once you make a change to the Makefile you'll need to run it manually, as the dalai script will replace your edited version with the original. So cd to ~/dalai/llama and run make, as shown below.
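A minimal sketch, assuming the default ~/dalai home:

# Rebuild by hand after editing (dalai would otherwise restore the original Makefile on the next install):
cd ~/dalai/llama
make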


RiccaDS commented Mar 29, 2023

@Monotoba thanks, yes, I actually removed the relevant code from the Makefile entirely, since none of the flags seemed to be supported. Then I ran make in the folder; the attached image is the output. Does it look like a completed make process to you? Also, does the Alpaca installation process end there, or should I do something else? I ask because no models folder was created in my Alpaca folder, and of course I can't select a model in the dalai GUI.
[Screenshot from 2023-03-29 06-59-35: make output]

@Monotoba

This looks like a completed make process. But that is only part of the install. I haven't looked at the Node.js code that, I think, manages the complete download, build, and install process; I'll look at that tomorrow and see if I can figure out what comes next. Just for clarity, did you have to remove all the flags, or only -mavx, to get it to build? If you could do me a small favor and add each flag back one at a time, and let me know which ones break your setup, it would help with creating a pull request to keep others (and our future selves) from hitting these issues again. In the meantime, I'll see what else I can figure out to get you up and running.
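If it helps, here is a rough way to test each flag in isolation, reusing the compiler invocation from the log above (a sketch; run it in the alpaca folder):

# Compile ggml.c once per flag and report which ones break the build:
for f in -mavx -mavx2 -mfma -mf16c -msse3; do
    cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread "$f" -c ggml.c -o /dev/null \
        && echo "$f OK" || echo "$f FAILED"
done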


RiccaDS commented Mar 29, 2023

Sure, I checked what you asked: all the flags are OK except -mavx; that one breaks the make process. Thanks for helping.


RiccaDS commented Mar 29, 2023

Clearly I have to download the model myself; this wasn't obvious to me. Though I am having a hard time finding a reliable download link for ggml-alpaca-7b-q4.bin.

EDIT: downloaded, but I at least need to change the modelsPath to the correct path, as I don't have enough space on the default folder's partition. Trying to figure out how.
To download:
curl -o ./models/ggml-alpaca-7b-q4.bin -C - https://ipfs.io/ipfs/QmUp1UGeQFDqJKvtjbSYPBiZZKRjLp8shVP9hT8ZB9Ynv1

or use the torrent:

magnet:?xt=urn:btih:88335685b1bc76a77905e19883d80bdaf85435ce&dn=ggml2-alpaca-7b-q4.bin&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Fopentracker.i2p.rocks%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A6969%2Fannounce
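On the modelsPath question, one thing that might work (an assumption on my part: that serve honors the same --home flag the install command used above):

npx dalai serve --home /mnt/Storage/software/dalai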

@Monotoba

Thank you for checking that out, for me and for all those who have yet to hit these issues. I just got to my desk this morning and have some work to do on another project, but I'll take some time to look into what you need to do next and get back to you as soon as I have something to share. As for moving the model file: you can move the file to its destination drive/folder and create a symbolic link to it, then place the symbolic link in the models folder, as sketched below. Here's a link to help: https://phoenixnap.com/kb/symbolic-link-linux
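A minimal sketch (paths are placeholders; use whichever partition has space and wherever your dalai models folder lives):

# Move the model to the big partition, then link it back into the models folder:
mkdir -p /mnt/Storage/models
mv ~/dalai/alpaca/models/ggml-alpaca-7b-q4.bin /mnt/Storage/models/
ln -s /mnt/Storage/models/ggml-alpaca-7b-q4.bin ~/dalai/alpaca/models/ggml-alpaca-7b-q4.bin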
Also, you might want to edit your earlier message and add the link to the models you downloaded. It may help others later... Remember, we are working on this not just for ourselves but for all those who either don't feel comfortable posting issues or who come across this in the future. Putting as much useful information as possible in the posts will help everyone.


RiccaDS commented Mar 29, 2023

Sure, I posted the link in another post, but I'll add it here too.
Don't worry about the next installation step; I'm not in a big hurry, although AI is getting out of control lol. Also, I managed to install the chat from https://github.com/antimatter15/alpaca.cpp/
It works well but is very slow. I don't think it's my hardware not being up to the task; I think there is some other bottleneck. I have some other work to do too, and will next try your symlink solution for Dalai.

@Monotoba

I do suspect the hardware. All those flags enable various hardware features in your processor, all related to high-speed mathematical operations (vector/linear math). Without them, the processor has to do the calculations one at a time, instead of placing the data in a vector and doing the calculation once. Older processors do have some multimedia instructions that can be used, but they are not as fast or efficient as those in newer processors.

Alpaca should run faster than llama because of its reduced data set size. I don't have an i7-2630QM, but the older machine I am running this on has an i7-3820 @ 3.60GHz and 32GB RAM, and it isn't as fast as it could be. My newer laptop with only 16GB RAM runs circles around it.


RiccaDS commented Mar 29, 2023

I can confirm my CPU supports AVX but not the later extensions. Despite this, in Alpaca AVX1 is always disabled, and so is F16C. I tried many flags but still no luck; I always get an inlining error. I am studying this link, which has plenty of info:
ggerganov/llama.cpp#196


RiccaDS commented Mar 30, 2023

I managed to activate AVX1 by applying the following fix, which means modifying ggml.c:
ggerganov/llama.cpp#563
However, no improvement whatsoever in terms of performance :(


RiccaDS commented Mar 30, 2023

I got a performance improvement by implementing AVX acceleration:
ggerganov/llama.cpp#617
The OP saw dramatic improvements. In my case it wasn't as dramatic, but better than nothing.

@Monotoba

Good Morning RiccaDS,

I had to cut yesterday short due to a migraine. I am happy you made some progress! Do you have models working in Dalai? What issues do you still have, other than performance?


RiccaDS commented Mar 31, 2023

Hi, I hope you are better now! I only have a little spare time, so I haven't been able to test this on Dalai yet, but I'm trying now and will let you know; I think it will work fine. As for performance, the issue is on the alpaca model only, btw; it's probably time for a HW upgrade.


RiccaDS commented Apr 1, 2023

OK so, I started from the beginning with Dalai and installed into the default folder with npx dalai alpaca install 7B. The AVX flag in the Makefile sparks an inlining error that can be resolved by applying the following modifications to ggml.c:
ggerganov/llama.cpp#563

At this point npx dalai serve works, but no model is shown in the dropdown list. It is clear that no model bin file gets downloaded, contrary to what the installation instructions state; also, no models folder is created. Probably this is because, for some reason, repositories are no longer allowed to link the bin files for legal reasons.
Ultimately, even if I use a downloaded model and place it in /models/7B without a symlink, the folder is detected but there is still no model in the dropdown list. This issue is discussed here, and someone points to model compatibility and other things I need to try out.
I think Dalai needs some work. I'd help, but I have no experience in C and only a slight idea of how GitHub works. I'll dig further, by the way.
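For anyone comparing setups, this is roughly what I'm checking (the exact layout dalai scans is an assumption on my part):

# Confirm the model file actually sits where the dropdown should look:
ls -lh ~/dalai/alpaca/models/7B/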
