Hi, I'm new to ggml and I've been looking at ggml_graph_compute, and more specifically at the function it calls, ggml_graph_compute_thread. I think some threads are computing the same result simultaneously.
Here is a scenario I traced through ggml_graph_compute_thread:
There are 4 threads (a, b, c and d) and 4 nodes in the graph.
Each thread's node_n begins at -1 and shared->n_active is 4.
a, b and c reduce shared->n_active to 1.
d starts computing the first node while a, b and c spin.
d sets shared->node_n = 1 because cgraph->nodes[1]->n_tasks > 1.
Since shared->node_n has been updated, a, b and c stop spinning and set their own node_n = 1.
Doesn't this mean that a, b, c and d compute the same thing simultaneously?
I used gdb to count the number of calls to ggml_compute_forward. When n_threads is set to 1, there are 12 calls and when it's set to 4, there are 21:
1 thread
(gdb) b ggml_compute_forward
Breakpoint 1 at 0x10001ec50: file /Users/ccldarjun/Python/ggml/src/ggml.c, line 15370.
(gdb) ignore 1 100000
Will ignore next 100000 crossings of breakpoint 1.
(gdb) r
Starting program: /Users/ccldarjun/Python/ggml/testing/a.out
[New Thread 0x1603 of process 22165]
^C[New Thread 0x2003 of process 22165]
warning: unhandled dyld version (17)
pid: 22165
ggml_init: GELU, Quick GELU, SILU and EXP tables initialized in 5.782000 ms
ggml_init: g_state initialized in 0.049000 ms
ggml_init: found unused context 0
ggml_init: context initialized
ggml_build_forward_impl: visited 4 new nodes
ggml_graph_compute_thread: 0/4 pthread id: 1143273024 n_tasks: 1 temp: 1
ggml_graph_compute_thread: 1/4 pthread id: 1143273024 n_tasks: 1 temp: 1
ggml_graph_compute_thread: 2/4 pthread id: 1143273024 n_tasks: 1 temp: 1
ggml_graph_compute_thread: 3/4 pthread id: 1143273024 n_tasks: 1 temp: 1
ggml_graph_compute: perf (1) - cpu = 0.000 / 0.000 ms, wall = 0.000 / 0.000 ms
f = 16.000000
[Inferior 1 (process 22165) exited normally]
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x000000010001ec50 in ggml_compute_forward at /Users/ccldarjun/Python/ggml/src/ggml.c:15370
breakpoint already hit 12 times
ignore next 99988 hits
4 threads
(gdb) b ggml_compute_forward
Breakpoint 1 at 0x10001ec50: file /Users/ccldarjun/Python/ggml/src/ggml.c, line 15370.
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x000000010001ec50 in ggml_compute_forward at /Users/ccldarjun/Python/ggml/src/ggml.c:15370
(gdb) ignore
Argument required (a breakpoint number).
(gdb) ignore 1 100000
Will ignore next 100000 crossings of breakpoint 1.
(gdb) r
Starting program: /Users/ccldarjun/Python/ggml/testing/a.out
[New Thread 0x1203 of process 22148]
[New Thread 0x2003 of process 22148]
warning: unhandled dyld version (17)
pid: 22148
ggml_init: GELU, Quick GELU, SILU and EXP tables initialized in 3.603000 ms
ggml_init: g_state initialized in 0.031000 ms
ggml_init: found unused context 0
ggml_init: context initialized
ggml_build_forward_impl: visited 4 new nodes
[New Thread 0x1307 of process 22148]
[New Thread 0x2103 of process 22148]
[New Thread 0x2903 of process 22148]
ggml_graph_compute_thread: 0/4 pthread id: 63651840 n_tasks: 1 temp: 1
ggml_graph_compute_thread: 1/4 pthread id: 63651840 n_tasks: 4 temp: 1
ggml_graph_compute_thread: 2/4 pthread id: 1143273024 n_tasks: 4 temp: 1
ggml_graph_compute_thread: 3/4 pthread id: 63651840 n_tasks: 4 temp: 1
ggml_graph_compute: perf (1) - cpu = 0.000 / 0.000 ms, wall = 0.000 / 0.000 ms
f = 16.000000
[Inferior 1 (process 22148) exited normally]
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x000000010001ec50 in ggml_compute_forward at /Users/ccldarjun/Python/ggml/src/ggml.c:15370
breakpoint already hit 21 times
ignore next 99979 hits
I think these two things are connected.
There is one call to ggml_compute_forward for GGML_TASK_INIT, one for GGML_TASK_FINALIZE, and, for parallelizable tasks, as many GGML_TASK_COMPUTE calls as there are threads. You have 3 ops here, all parallelizable, so with n_threads=1 you should see 9 calls to ggml_compute_forward (3 init, 3 finalize, 3 compute). With n_threads=4, you should see 18 calls (3 init, 3 finalize, 4*3 compute).
That's what I see in my tests. Are you seeing something different?