Use multiple CPU cores during decoding #30
Comments
Is this with the tornado web app or the other OPUS-MT server? It may be related to translation of batches, which is not well supported so far.
@jorgtied Hello, I'm using the provided Dockerfile which launches
Information about the other server option is here: https://github.com/Helsinki-NLP/Opus-MT/blob/master/doc/WebSocketServer.md
Thank you for your help.
This ran without errors, it seems, but the next command failed. Do you know why this might happen? Is there a Dockerfile for this server version I could use?
This is strange. Could it be that marian does not compile a server binary anymore in the latest versions? I need to check that. Could you try to revert to an earlier version of MarianNMT when compiling the system?
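For reference, a rough sketch of how one might pin an older MarianNMT checkout and explicitly request the server binary. The `-DCOMPILE_SERVER=on` CMake option exists in marian; the tag placeholder and the CPU/CUDA flags are just illustrative choices, not a verified recipe:

```shell
# Sketch: build marian with the WebSocket server binary enabled.
# Replace <known-good-tag> with an earlier release if the latest fails.
git clone https://github.com/marian-nmt/marian-dev.git
cd marian-dev
git checkout <known-good-tag>
mkdir -p build && cd build
cmake .. -DCOMPILE_SERVER=on -DCOMPILE_CPU=on -DCOMPILE_CUDA=off
make -j4
# On success, marian-server should exist in the build directory.
```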
@jorgtied thanks for your help so far! Just FYI, I might return to this issue but I don't have more time at the moment. Tweaking the marian compilation seems too difficult for me. Maybe later! If it suits you best, you can close the issue.
@smiranda I made a change in the installation makefile. Does it work now, and does it compile the marian-server binary?
@jorgtied Hello, I was now able to install and run this service! Thank you. I still have the multi-core issue: it only uses one core even when processing a large text (a news item, several sentences). Is there somewhere I must configure multi-core? I tried it in the init.d file for the server, in the marian command-line options. I also have another observation: are we only supposed to pass one sentence here? It seems so, since the output is much smaller than the input for a large text. In the other server, the Docker HTTP one, you can pass a large text and it does sentence splitting internally. Is this one supposed to be used differently?
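If the WebSocket server really does expect one sentence per request, a client-side splitter is easy to sketch. This is illustrative glue code, not part of Opus-MT; the naive regex splitter and the `translate_sentence` callback are assumptions standing in for a real splitter and a real server call:

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break on ., ! or ? followed by whitespace.
    A real deployment should use a proper sentence splitter, like the one
    the HTTP server applies internally."""
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

def translate_text(text, translate_sentence):
    """Split a long text, translate it sentence by sentence, and rejoin.
    `translate_sentence` stands in for one request to the server."""
    return " ".join(translate_sentence(s) for s in split_sentences(text))
```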
I see similar behavior here. A workaround is to run multiple instances. Is this normal or am I missing something here? Thanks!
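The multiple-instances workaround can be sketched as a round-robin dispatcher over several server ports. Everything here (the port numbers, the `send_to_port` callback) is hypothetical glue code, not Opus-MT API:

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def make_dispatcher(ports):
    """Cycle through the ports of several single-threaded server
    instances so consecutive requests land on different processes."""
    cycle = itertools.cycle(ports)
    return lambda: next(cycle)

def translate_batch(texts, ports, send_to_port):
    """Fan a batch of texts out across the instances in parallel.
    `send_to_port(port, text)` stands in for one request to a server."""
    next_port = make_dispatcher(ports)
    with ThreadPoolExecutor(max_workers=len(ports)) as pool:
        futures = [pool.submit(send_to_port, next_port(), t) for t in texts]
        return [f.result() for f in futures]
```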
Hello, I'm trying to use multiple CPU cores during decoding. I added `cpu-threads: 8` to the decoder.yml, as per the marian documentation.
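For reference, a minimal decoder.yml fragment with the `cpu-threads` option. The model filename is taken from the log below; the vocab filenames and the other values are placeholders, since the exact layout depends on the packaged model:

```yaml
# decoder.yml sketch -- vocab paths are placeholders
relative-paths: true
models:
  - opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
vocabs:
  - opus.bpe32k-bpe32k.vocab.yml
  - opus.bpe32k-bpe32k.vocab.yml
beam-size: 6
normalize: 1
mini-batch: 1
maxi-batch: 1
cpu-threads: 8
```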
This seems to recognize 8 CPUs at load time:
```
opus-mt_1 | [2020-10-16 12:08:54] [memory] Extending reserved space to 512 MB (device cpu0)
opus-mt_1 | [2020-10-16 12:08:54] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:08:54] Loading model from /usr/src/app/models/en-es/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:08:54] [memory] Extending reserved space to 512 MB (device cpu0)
opus-mt_1 | [2020-10-16 12:08:54] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:08:54] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:08:58] Server is listening on port 10001
opus-mt_1 | [2020-10-16 12:09:04] [memory] Extending reserved space to 512 MB (device cpu1)
opus-mt_1 | [2020-10-16 12:09:04] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:04] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:08] [memory] Extending reserved space to 512 MB (device cpu2)
opus-mt_1 | [2020-10-16 12:09:08] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:08] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:11] [memory] Extending reserved space to 512 MB (device cpu3)
opus-mt_1 | [2020-10-16 12:09:12] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:12] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:18] [memory] Extending reserved space to 512 MB (device cpu4)
opus-mt_1 | [2020-10-16 12:09:19] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:19] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:22] [memory] Extending reserved space to 512 MB (device cpu5)
opus-mt_1 | [2020-10-16 12:09:22] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:22] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:25] [memory] Extending reserved space to 512 MB (device cpu6)
opus-mt_1 | [2020-10-16 12:09:26] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:26] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:29] [memory] Extending reserved space to 512 MB (device cpu7)
opus-mt_1 | [2020-10-16 12:09:29] Loading scorer of type transformer as feature F0
opus-mt_1 | [2020-10-16 12:09:29] Loading model from /usr/src/app/models/ar-en/opus.bpe32k-bpe32k.transformer.model1.npz.best-perplexity.npz
opus-mt_1 | [2020-10-16 12:09:32] Server is listening on port 10002
```
But then at execution time I see it only uses one CPU and takes the same time as without the `cpu-threads: 8` config. It also just prints:
```
opus-mt_1 | [2020-10-16 12:10:30] [memory] Reserving 295 MB, device cpu0
```
Does anyone know how to use multiple CPUs during decoding?
Thanks.