terminate called after throwing an instance of 'ReadSocketException' #48
What's your setup like?
What logs do you see in the root node? It looks like the root node has disconnected for some reason.
Yes.
Note: I am using Llama 3 for this project, and it says llama2 as the architecture.
Llama 3 works just fine, so that shouldn't matter.
This is possible; however, I didn't see anything triggered on Suricata or pfBlocker/pfSense. I'm going to look into it, but when my router blocks something, the device usually can't connect at all afterwards. Here I can connect; it just fails after the connection. Thanks for the feedback, I'll look further into my configuration.
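A quick way to separate "port is blocked" from "drops after connecting" is a plain TCP probe from the root machine to each worker. This is a generic sketch, not part of this project; port 9998 matches the worker command used later in this thread, while the 10.0.0.2 address is a placeholder for a worker's static IP:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder worker address -- substitute your worker's static IP.
print(port_open("10.0.0.2", 9998))
```

If this prints True but the session still dies a few seconds later, the cause is more likely a timeout or a crash than a firewall block.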
Could you paste logs from the root node?
Sure, where are they? Is there a verbose mode?
No, just what you see in the console.
That was it, the host node just stops.
Could you prove that by posting a screenshot of the terminal?
Yes. I see this now, and I will be working on it tonight, within the next hour or so.
I know the socket client has a 3-second timeout, which might explain why it happens after 3 seconds.
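That timeout behavior is easy to reproduce in isolation. The sketch below is not distributed-llama's actual code; it only shows how a fixed receive timeout on a client socket turns a silent peer into a read error once the window elapses (shortened here to 0.3 s):

```python
import socket
import threading

# A local server that accepts the connection and then sends nothing,
# mimicking a root node that goes quiet after the handshake.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def accept_and_stall():
    conn, _ = server.accept()    # accept, then stay silent
    threading.Event().wait(1.0)  # hold the connection open
    conn.close()

threading.Thread(target=accept_and_stall, daemon=True).start()

client = socket.socket()
client.settimeout(0.3)  # the project reportedly uses ~3 s
client.connect(("127.0.0.1", port))
try:
    client.recv(1024)
    timed_out = False
except socket.timeout:
    # Analogous to a ReadSocketException thrown after the timeout.
    timed_out = True
client.close()
print(timed_out)  # prints True
```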
Can this run on 1 GB RAM devices? The RPi 3B has only 1 GB.
Did you run it with sudo?
First time yes, second time no.
From what I know, that happens when you don't run it with sudo.
If there is something I can log, I will.
In my case I ran it between two desktop machines, and I run it within WSL on Windows. I want to get some of these to add as workers; they have 16 GB DDR5 RAM each. But you must run it with sudo, hence why the README has the command:
Is nice required too?
No, just the sudo. nice is for lowering the scheduling priority of the process so it runs at a lower priority.
You can run without the nice command.
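For reference, nice changes scheduling priority (not CPU affinity), and only negative increments, which raise priority, need root. A small Unix-only sketch using Python's os.nice rather than the shell command:

```python
import os

# os.nice(increment) adds to the process's niceness and returns the
# new value. Positive increments (lower priority) are allowed for any
# user; negative ones (higher priority) require root privileges.
before = os.nice(0)  # increment of 0 just reads the current niceness
after = os.nice(5)   # lower our own priority by 5
print(before, after)  # typically "0 5" when starting from niceness 0
```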
This may be the root cause. This is how RAM usage looks with 8 nodes on my Mac (Llama 3 8B Q40). Currently the root node (0.5.0) keeps the first layer and the last layer in memory as extra.
The first layer may not need to be loaded into RAM (as I did here, check the "path" link). The last layer could probably be split as well, so there is room for improvement.
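The back-of-the-envelope math for whether a 1 GB device fits is simple. A hedged sketch, where the 6.3 GB figure for a Q40 8B model and the 0.5 GB of extra root-node layers are illustrative assumptions, not measured values from this project:

```python
def per_node_gb(model_gb, n_nodes, root_extra_gb=0.0):
    """Even split of the weights across nodes, plus any layers the
    root node keeps in memory on top of its share."""
    return model_gb / n_nodes + root_extra_gb

# Assumed sizes, for illustration only:
print(round(per_node_gb(6.3, 8), 2))       # worker share: 0.79 GB
print(round(per_node_gb(6.3, 8, 0.5), 2))  # root with extra layers: 1.29 GB
```

On these assumptions, an 8-node split of a ~6 GB model fits 1 GB workers, but the root node only fits once the extra first/last layers are trimmed or offloaded.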
Excellent. I don't mind helping out with this. My current drive in life is buying every 7900 XTX and/or RPi that people are reselling because they are useless. I love that you're looking into the GPU side of things as well. Perhaps I can work on converting some models? I understand this isn't a priority, but if there is some way I could help, I'd like to look into some smaller models that are digestible by the RPi.
Hello, I'm facing the same error, but it happens immediately. There is no message about connecting: the three worker nodes crash with the socket error, and the main node throws the std::exception. I'm running these on Linux Mint 21.3 XFCE across 4 computers. All of them are connected over a D-Link network switch with static IPs for each device.
Hello @LaeMat! What model are you trying to run? How much RAM do you have?
I'm trying to run Llama 3. The root node has 12 GB RAM, while the worker nodes have 8 or 4 GB of RAM.
Could you paste logs from all machines? Maybe there will be some hint there. You could also try running a small model (for example, TinyLlama).
I fixed the issue. It must've been that I was not linking to the right files for the LLM. Currently it works for inference, but chat with the Llama 3 8B Instruct model seems to be far slower than inference. I'm not sure if that's normal or not.
However, today the performance has gotten a LOT worse. I'm using the same exact files and prompts as I did for my last tests yesterday, but the performance has gone from over 2 tokens a second to 0.1. Has anyone else faced this? I'll add images soon.
|
@LaeMat have you correctly chosen the nthreads argument?
All have 4 threads, so I set nthreads to 4 on all, just like I did yesterday. Again, both runs are on the same computers.
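A simple way to double-check that value on each box is to ask the OS for its core count, which is the usual heuristic for nthreads (a generic sketch, not a distributed-llama utility):

```python
import os

# The usual heuristic: one worker thread per core. On the quad-core
# machines in this thread that yields 4.
nthreads = os.cpu_count() or 1
print(nthreads)
```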
Are you using the same version? I suppose something has changed.
The nodes connect, but crash after roughly 3 seconds.
Server:
For Each Worker:
sudo main worker --port 9998 --nthreads 4