
Traceback (most recent call last): File "/usr/lib/python2.7/multiprocessing/queues.py", line 266, in _feed send(obj) IOError: bad message length #9

mlukasik opened this issue Jan 25, 2018 · 12 comments

@mlukasik

Hi!

When I run training on 10M examples (each described by a small subset of the 100K features), it breaks with this error:

....
Splitting 2033
Training classifier
Splitting 1201
Training classifier
Splitting 1323
Training classifier
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/queues.py", line 266, in _feed
send(obj)
IOError: bad message length

Do you know what the reason is and how it could be fixed?

I tried smaller datasets (100K, 1M examples) and the training worked for them.

Cheers,
Michal
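
For context, this error usually means a single object put on a multiprocessing queue pickled to more than the ~2 GiB that Python 2.7's pipe protocol can frame in its 32-bit length header; the queue's background feeder thread then dies with IOError: bad message length and the receiver never gets the message. Below is a minimal sketch of that failure mode, independent of fastxml; the build_tree stand-in, the payload size, and the 60-second timeout are illustrative assumptions, not anything from the library.

import multiprocessing
try:
    from Queue import Empty      # Python 2, as in the traceback above
except ImportError:
    from queue import Empty      # Python 3

def build_tree(q, payload_bytes):
    # Hypothetical stand-in for one worker: it "trains" a tree and
    # hands the result back to the parent over the queue.
    tree = b'x' * payload_bytes  # pretend this is a fitted tree
    # The queue's feeder thread pickles `tree` and writes it to a pipe.
    # On Python 2.7 the pipe frame uses a 32-bit length header, so once
    # the pickle grows past ~2 GiB the feeder thread fails with
    # "IOError: bad message length" and the parent never sees the result.
    q.put(tree)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    # A small payload works; raising it past 2**31 bytes (with enough
    # RAM) reproduces the error quoted above.
    p = multiprocessing.Process(target=build_tree, args=(q, 10 ** 6))
    p.start()
    try:
        result = q.get(timeout=60)
        print('received %d bytes' % len(result))
    except Empty:
        print('result lost: message too large for the pipe')
    p.join()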

@Refefer
Owner

Refefer commented Jan 25, 2018 via email

@Refefer
Owner

Refefer commented Jan 27, 2018

Any updates?

@mlukasik
Author

mlukasik commented Jan 28, 2018

Thanks for the reply! I am rerunning with --max_leaf_size 100 to see if it will pass, though I think it might hurt classification accuracy. I didn't see any out-of-memory error around that time.

@Refefer
Owner

Refefer commented Jan 28, 2018

It's certainly possible it will; this is intended to test whether the tree is too large to serialize correctly. How many labels are you predicting?

@mlukasik
Author

I got 100K labels (and 100K features).

@Refefer
Owner

Refefer commented Feb 3, 2018

Any updates?

@mlukasik
Author

mlukasik commented Feb 3, 2018

Thanks for following up. I am trying to run the training with --max_leaf_size 100 and --threads 5, but it seems to be training forever...

@Refefer
Owner

Refefer commented Feb 3, 2018

--threads 5 is going to hurt if you're using the default set of trees, which is 50. You might ramp that down to 5 trees for debugging purposes for the time being.
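
A rough way to see that trade-off: with 50 trees and only 5 worker processes, each worker builds about 10 trees back to back, so wall-clock time is roughly ten times that of a single tree. Here is a minimal sketch of that scheduling using a plain multiprocessing.Pool; it is not fastxml's own proc.py machinery, and the tree count and the sleep stand-in for training are illustrative.

import multiprocessing
import time

def build_tree(tree_id):
    # Stand-in for training one tree; real training dominates runtime.
    time.sleep(0.2)
    return tree_id

if __name__ == '__main__':
    n_trees, n_workers = 50, 5
    pool = multiprocessing.Pool(processes=n_workers)
    start = time.time()
    # 50 tasks over 5 workers => roughly 10 sequential rounds per worker.
    trees = pool.map(build_tree, range(n_trees))
    pool.close()
    pool.join()
    print('built %d trees with %d workers in %.1fs'
          % (len(trees), n_workers, time.time() - start))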

@mlukasik
Author

mlukasik commented Feb 3, 2018

Sounds good, I'll do that!

@mlukasik
Author

mlukasik commented Feb 4, 2018

When running with 5 threads and 5 trees, I got this error message:

9790000 docs encoded
9800000 docs encoded
Traceback (most recent call last):
File "/usr/local/bin/fxml.py", line 4, in <module>
__import__('pkg_resources').run_script('fastxml==2.0.0', 'fxml.py')
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 750, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1534, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/fastxml-2.0.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/fxml.py", line 646, in <module>
File "/usr/local/lib/python2.7/dist-packages/fastxml-2.0.0-py2.7-linux-x86_64.egg/EGG-INFO/scripts/fxml.py", line 453, in train
File "build/bdist.linux-x86_64/egg/fastxml/trainer.py", line 468, in fit
File "build/bdist.linux-x86_64/egg/fastxml/trainer.py", line 410, in _build_roots
File "build/bdist.linux-x86_64/egg/fastxml/proc.py", line 50, in f2
File "/usr/lib/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

@Refefer
Owner

Refefer commented Feb 5, 2018

There we have it. How much memory does the machine have?

You'll want to try increasing the regularization coefficient to make the linear classifiers sparser. You can also use the --subset flag to send only a subset of the data to each tree (à la random forests).
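
The --subset suggestion is the usual random-forest trick: fit each tree on a random fraction of the examples, which shrinks each worker's working set and the fitted tree, making both the serialization limit and the fork-time ENOMEM less likely to bite. A minimal sketch of that kind of sampling follows; the helper name and the meaning of the fraction are assumptions for illustration, not fxml.py's actual --subset semantics.

import random

def per_tree_subsets(n_examples, n_trees, fraction, seed=2018):
    # Random-forest-style sampling: each tree sees only `fraction` of
    # the examples. This helper and its arguments are illustrative,
    # not fastxml's actual --subset implementation.
    rng = random.Random(seed)
    subset_size = max(1, int(n_examples * fraction))
    return [rng.sample(range(n_examples), subset_size)
            for _ in range(n_trees)]

# e.g. 5 trees, each fit on 10% of 1M examples
subsets = per_tree_subsets(10 ** 6, 5, 0.1)
print([len(s) for s in subsets])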

@mlukasik
Author

My machine actually has quite a lot of memory:
mlukasik@mlukasik:~/workspace/fastxml_py$ cat /proc/meminfo
MemTotal: 65865896 kB

Is it because we try to load all the data at once?
