Why does the CRF lead to high CPU usage? #2884
Comments
Sorry, what is the problem you're trying to solve? Is the training too slow?
@joelgrus Thanks for your reply.
We don't know the root cause, contributions welcome.
class allennlp.modules.conditional_random_field.ConditionalRandomField(num_tags: int, constraints: List[Tuple[int, int]] = None, include_start_end_transitions: bool = True)
Make sure you have provided the
I think the function _viterbi_decode in ConditionalRandomField creates many of its tensors on the CPU instead of on the device the inputs are on, so the decoding process runs on the CPU most of the time.
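A minimal sketch of the pattern described above, assuming PyTorch. The helper names are hypothetical and this is not the AllenNLP source. A tensor created without a device argument lands on the default device (normally the CPU), while passing device=inputs.device keeps it next to the input. The "meta" device is used here only so the sketch runs on a machine without a GPU.

```python
import torch


def make_scores_default(inputs: torch.Tensor) -> torch.Tensor:
    # No device given: allocated on the default device (normally the CPU).
    return torch.zeros(inputs.size(0))


def make_scores_on_input_device(inputs: torch.Tensor) -> torch.Tensor:
    # Allocated on whatever device `inputs` lives on.
    return torch.zeros(inputs.size(0), device=inputs.device)


# "meta" stands in for a non-CPU device so this runs without a GPU.
inputs = torch.empty(5, 3, device="meta")
default_scores = make_scores_default(inputs)
matched_scores = make_scores_on_input_device(inputs)
print(default_scores.device, matched_scores.device)
```

If intermediate tensors are built the first way inside a decoding loop, that work stays on the CPU regardless of where the model's inputs are.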
It appears that the __init__ method of ConditionalRandomField allocates the initial self.transitions on the CPU. By the time the _joint_likelihood method is called, these are on the GPU. However, if the CRF is in a container, this doesn't seem to happen and I get "RuntimeError: expected backend CUDA and dtype Float but got backend CPU and dtype Float"
@natny your issue sounds like it deserves its own bug report. Could you please open one: https://github.com/allenai/allennlp/issues/new?assignees=&labels=bug&template=bug_report.md&title=
Hi @epwalsh, thank you. With regard to part 2 of the issue, I think it is more of a PyTorch thing: when I move the container to a ModuleDict, the behaviour is as expected, i.e., all transitions get moved to the GPU when they are supposed to. So I'm not sure it warrants a bug report?
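One possible reading of the ModuleDict remark, sketched under the assumption that the original container was a plain Python dict. The class names below are hypothetical, and nn.Linear stands in for the CRF. PyTorch only registers submodules that are assigned as modules (or held in nn.ModuleDict / nn.ModuleList), and calls such as .to(device) or .cuda() only walk registered submodules.

```python
from torch import nn


class PlainDictContainer(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # Plain dict: the inner module is NOT registered as a submodule.
        self.heads = {"ner": nn.Linear(4, 3)}


class ModuleDictContainer(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # nn.ModuleDict: the inner module IS registered as a submodule.
        self.heads = nn.ModuleDict({"ner": nn.Linear(4, 3)})


plain = PlainDictContainer()
registered = ModuleDictContainer()

# Parameters that the outer module can see (and would move with .to(device)).
plain_param_count = len(list(plain.parameters()))
registered_param_count = len(list(registered.parameters()))
print(plain_param_count, registered_param_count)
```

In the plain dict case the outer module sees no parameters, so moving it to the GPU would leave the inner module's weights on the CPU, which would be consistent with the backend mismatch error quoted earlier.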
Hey @natny, what do you mean by "move the container to a ModuleDict"?
Hi there! I faced the same issue, and managed to resolve it by moving
I trained two models for an NER task:
I compared their cost on both CPU and GPU during training.
Here are the results:
It's a Linux server with 16 logical cores (4 physical) and a single Tesla V100.
I use top -> %cpu to get the CPU cost and nvidia-smi -> Volatile GPU Util for the GPU.
Intuitively, it seems that viterbi_decode leads to this phenomenon. I read the source code but can't find how to solve it. Can somebody help me?
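One way to check the suspicion about viterbi_decode is to profile a training step and look at which functions dominate. The sketch below uses only the standard library; both functions are hypothetical stand-ins, not AllenNLP code, and would be replaced by a real training step.

```python
import cProfile
import io
import pstats


def viterbi_decode_stub(num_steps: int, num_tags: int) -> list:
    """Hypothetical stand-in for a CPU-bound decoding loop (not AllenNLP code)."""
    path = []
    for step in range(num_steps):
        scores = [(step * tag) % 7 for tag in range(num_tags)]
        # Pick the index of the highest-scoring tag at this step.
        path.append(max(range(num_tags), key=scores.__getitem__))
    return path


def training_step_stub() -> list:
    """Hypothetical stand-in for one training step that calls the decoder."""
    return viterbi_decode_stub(num_steps=200, num_tags=20)


profiler = cProfile.Profile()
profiler.enable()
result = training_step_stub()
profiler.disable()

# Collect the ten most expensive entries by cumulative time into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If the decoding function sits near the top of the cumulative-time column in a real run, that supports the hypothesis; if not, the CPU cost is coming from somewhere else.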