ValueError: 2539520 exceeds max_bin_len(1048576) when using spacy.load() #2995
Comments
Thanks for the report and sorry you've hit this problem. It also just came up in our tests today and it was pretty confusing. Looks like it might be related to an update of the [...]. We'll investigate this and hopefully push an update to thinc soon. In the meantime, try downgrading msgpack: pip install msgpack==0.5.6
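To see whether an environment is likely to be affected before downgrading, a quick version check can help. The following is a minimal sketch, not part of spaCy or Thinc: the helper names `parse_version` and `newer_than_known_good` are made up for illustration, and the only fact taken from this thread is that 0.5.6 is the version suggested as a workaround. It assumes that anything newer than 0.5.6 is suspect, which is a heuristic rather than a confirmed rule.

```python
import re

# Version suggested as a workaround in this thread.
LAST_KNOWN_GOOD = (0, 5, 6)


def parse_version(text):
    """Turn a string such as '0.6.0' or '1.0.0rc1' into a 3-tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def newer_than_known_good(installed):
    """Hypothetical helper: True if `installed` is newer than 0.5.6."""
    return parse_version(installed) > LAST_KNOWN_GOOD


if __name__ == "__main__":
    # importlib.metadata is in the standard library from Python 3.8 onwards.
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("msgpack")
    except PackageNotFoundError:
        print("msgpack is not installed in this environment")
    else:
        if newer_than_known_good(installed):
            print("msgpack", installed, "is newer than 0.5.6; consider pinning")
        else:
            print("msgpack", installed, "is not newer than 0.5.6")
```

A result of "not newer" does not guarantee that loading works, since one report in this thread still saw the error with 0.5.6 installed.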
Just hit this as well, your fix works (we did |
Glad it worked! We also released Thinc |
Great, thanks so much!
Probably best to keep this open for now, as people with cached packages might still run into this. tl;dr: Thinc 6.12.1 is up now, so fresh installs should work. If your installation doesn't work, do:
Note: I just fired up a fresh Ubuntu 18.04 VM in Azure and
... So it looks like you do need to manually install the older version of |
Hmm! What else requires msgpack in your environment though? spaCy shouldn't be depending on it directly.
The version of msgpack is 0.5.6, but the problem still exists. disfluency_detection/crf.py:36: in __init__ Could anyone help?
I tested on a raw (virgin) VM instance of Ubuntu 18.04 on Azure just to rule everything out. My only commands:
...It looks like |
Try |
Just got started with spaCy, had the same error.
Thank you for the very quick fix! Travis build of I was going to file the issue today, but the problem was already gone in this morning's build. The spaCy version in this build is 2.0.18. Just for reference, our package installs
@amatsuo Glad it was fixed! You should probably update your installation to include a version range. It should now be safe to install spaCy via conda as well. Version pinning is especially useful for helping people produce reproducible experiments. If versions aren't pinned, when someone tries your code in a few years' time, new versions of the software will be installed and everything will break.
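To make the idea of a version range concrete, here is a small sketch of what a range check amounts to. The function names and the range itself are assumptions for illustration: 2.0.18 is the version mentioned above, while the upper bound 2.1.0 is a hypothetical choice. In a pip requirements file the same range would be written as `spacy>=2.0.18,<2.1.0`.

```python
def parse(version):
    """Split a plain dotted version such as '2.0.18' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, minimum, below):
    """True if minimum <= version < below (all plain dotted versions)."""
    return parse(minimum) <= parse(version) < parse(below)


# Hypothetical range: accept 2.0.18 and later patch releases, but not 2.1.0.
MINIMUM = "2.0.18"
BELOW = "2.1.0"

if __name__ == "__main__":
    for candidate in ("2.0.17", "2.0.18", "2.0.20", "2.1.0"):
        print(candidate, in_range(candidate, MINIMUM, BELOW))
```

The sketch only handles purely numeric versions; pre-release tags such as `2.1.0a1` would need a real version parser.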
@honnibal Thanks for the suggestion! Sounds like a good idea. We'll include it in the next update.
* Add chatbot models
  Add chatbot built models to source control
  Signed-off-by: PGobz <[email protected]>
* Address Rasa NLU issue
  See explosion/spaCy#2995
  Bump base Core image to 12.3, rebuild training data, and address dependency issue.
  Signed-off-by: PGobz <[email protected]>
* Move chatbot from bin
  - Move server start to entrypoint
  - Rename build -> models
  - Move train to module inside top-level chatbot dir
  Signed-off-by: PGobz <[email protected]>
I had the same problem as @RochaOwng, and downgrading msgpack solved it. Thanks @ines, you just saved my day :))
For those who installed spacy with conda, I found the following command to work the best:
Worked for conda installation. Thanks @cameronrhamilton!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hi, I'm new to spaCy.
So, I tried a small script to understand how it works:
I have already installed spaCy (with both pip and conda) using Python 3.6 and downloaded the Portuguese model, but I'm getting the following error:
Traceback (most recent call last):
  File "C:/Users/rocha/PycharmProjects/projeto/entidade.py", line 4, in <module>
    nlp = spacy.load('pt')
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\__init__.py", line 18, in load
    return util.load_model(name, **overrides)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 112, in load_model
    return load_model_from_link(name, **overrides)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 129, in load_model_from_link
    return cls.load(**overrides)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\data\pt\__init__.py", line 12, in load
    return load_model_from_init_py(__file__, **overrides)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 173, in load_model_from_init_py
    return load_model_from_path(data_path, meta, **overrides)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 156, in load_model_from_path
    return nlp.from_disk(model_path)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\language.py", line 647, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 511, in from_disk
    reader(path / key)
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\language.py", line 643, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(p, vocab=False)
  File "pipeline.pyx", line 643, in spacy.pipeline.Tagger.from_disk
  File "C:\Users\rocha\Anaconda3\lib\site-packages\spacy\util.py", line 511, in from_disk
    reader(path / key)
  File "pipeline.pyx", line 626, in spacy.pipeline.Tagger.from_disk.load_model
  File "pipeline.pyx", line 627, in spacy.pipeline.Tagger.from_disk.load_model
  File "C:\Users\rocha\Anaconda3\lib\site-packages\thinc\neural\_classes\model.py", line 335, in from_bytes
    data = msgpack.loads(bytes_data, encoding='utf8')
  File "C:\Users\rocha\Anaconda3\lib\site-packages\msgpack_numpy.py", line 214, in unpackb
    return _unpackb(packed, **kwargs)
  File "msgpack\_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 2539520 exceeds max_bin_len(1048576)
Can anyone help?
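For readers trying to interpret the numbers in the last line of the traceback: 1048576 is 1024 * 1024 bytes (1 MiB), and the binary blob being unpacked is 2539520 bytes, roughly 2.4 MiB, so it is over the limit. The sketch below is a simplified stand-in for that kind of size check and is not msgpack's actual code; the function name `check_bin_len` is invented for illustration, and only the message format and the two numbers come from the traceback.

```python
ONE_MIB = 1024 * 1024  # 1048576, the limit shown in the error message


def check_bin_len(length, max_bin_len=ONE_MIB):
    """Illustrative stand-in: reject binary payloads longer than the limit."""
    if length > max_bin_len:
        raise ValueError("%d exceeds max_bin_len(%d)" % (length, max_bin_len))
    return length


if __name__ == "__main__":
    blob_size = 2539520  # size reported in the traceback
    print("payload is %.2f MiB" % (blob_size / ONE_MIB))
    try:
        check_bin_len(blob_size)
    except ValueError as err:
        print("ValueError:", err)
```

Under this reading, the stored model data did not change; the unpacking step is simply refusing a blob larger than the limit it is applying.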