#2211 - Support for ssl certs config on download command #2212

Merged on May 3, 2018 (3 commits)

mn3mos
Contributor

@mn3mos mn3mos commented Apr 12, 2018

Description

This PR addresses issue #2211 by patching the 'download' command so that users can either disable SSL certificate verification or specify a custom Certificate Authority file to use for validation.

Such options are required in some corporate environments where proxies intercept HTTPS traffic and interfere with the usual certificate validation.
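
The two behaviours described above can be illustrated with a minimal standard-library sketch. This is not the code from this PR: the helper name `build_ssl_context` and its parameters are hypothetical, and the actual option names added to the download command should be checked against the merged diff.

```python
import ssl


def build_ssl_context(verify=True, ca_file=None):
    """Return an SSLContext for HTTPS requests (hypothetical helper)."""
    if not verify:
        # Disables certificate and hostname checks entirely. Insecure:
        # only sensible on a network you already trust.
        context = ssl.create_default_context()
        context.check_hostname = False  # must be cleared before CERT_NONE
        context.verify_mode = ssl.CERT_NONE
        return context
    # cafile=None falls back to the system's default CA certificates.
    return ssl.create_default_context(cafile=ca_file)


# The context would then be passed to urllib, for example:
#   urllib.request.urlopen(url, context=build_ssl_context(ca_file="corp-ca.pem"))
```

Pointing at a custom CA file keeps verification enabled, so it is generally the safer of the two options when the proxy's root certificate is available.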

This is my first contribution, so I would appreciate your help :)
I am not yet able to tick the boxes in the checklist. Can you give me pointers on how to:

  • run the tests manually: I tried to run them, but I couldn't find any tests relevant to the CLI
  • propose a change to the documentation, if you consider the README insufficient

Thanks in advance.

Types of change

I consider this change an enhancement of the CLI tool.
I updated the README too, but not the spaCy documentation, as I don't know how or where to do that.

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@explosion-bot
Collaborator

Hi @mn3mos, thanks for your pull request! 👍 It looks like you haven't filled in the spaCy Contributor Agreement (SCA) yet. The agreement ensures that we can use your contribution across the project. Once you've filled in the template, put it in the .github/contributors directory and add it to this pull request. If your pull request targets a branch that's not master, for example develop, make sure to submit the Contributor Agreement to the master branch. Thanks a lot!

If you've already included the Contributor Agreement in your pull request above, you can ignore this message.

@ines added the enhancement, models, and feat / cli labels on Apr 12, 2018
@honnibal
Member

honnibal commented May 3, 2018

Thanks! As @ines noted, this could be related to our recent switch from requests to the core library's urllib, which possibly doesn't look for certificates in extra places. But I think the extra arguments are a good idea.
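
To check where the standard library actually looks for certificates on a given machine, something like the following sketch may help. It uses only the `ssl` module; the function name `describe_default_ca_locations` is hypothetical, and the output depends on how the local Python and OpenSSL were built.

```python
import os
import ssl


def describe_default_ca_locations():
    """Report where the standard library's ssl module looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,  # resolved CA bundle file, or None if missing
        "capath": paths.capath,  # resolved CA directory, or None if missing
        "openssl_cafile": paths.openssl_cafile,  # OpenSSL's compiled-in default
        "cafile_exists": bool(paths.cafile and os.path.exists(paths.cafile)),
        # Environment variables OpenSSL consults to override the defaults.
        "env_override_file": os.environ.get(paths.openssl_cafile_env),
        "env_override_dir": os.environ.get(paths.openssl_capath_env),
    }


if __name__ == "__main__":
    for key, value in describe_default_ca_locations().items():
        print(f"{key}: {value}")
```

If `cafile` and `capath` both come back as `None`, urllib has no default trust store to verify against, which would be consistent with a `CERTIFICATE_VERIFY_FAILED` error.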

@honnibal honnibal merged commit cc8e804 into explosion:master May 3, 2018
@ines ines mentioned this pull request May 20, 2018
@matanox

matanox commented May 30, 2018

Hey, this is absolutely awesome, as it's needed most of the time when installing in corporate environments. Can anyone suggest how to update an existing 2.0 installation to this version without breaking things? Additionally, is there a way to "import" a module into spaCy from a copy downloaded over the Internet outside of the spaCy download command?

@martin-martin

martin-martin commented Jul 8, 2018

Hi, re-opening this because the issues (#2249, #2212, #507) and resources (SO) claim that this should be resolved in v2, but I am still getting the same error when attempting the recommended install:

(env) ➜  spacey git:(master) python -m spacy download en_core_web_lg
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 814, in __init__
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 1068, in do_handshake
    self._sslobj.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/spacy/__main__.py", line 31, in <module>
    plac.call(commands[command], sys.argv[1:])
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/spacy/cli/download.py", line 30, in download
    shortcuts = get_json(about.__shortcuts__, "available shortcuts")
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/spacy/cli/download.py", line 55, in get_json
    data = url_read(url)
  File "/Users/martin/Documents/codingnomads/nlpython/nlpython_github/projects/spacey/env/lib/python3.6/site-packages/spacy/compat.py", line 82, in url_read
    file_ = url_open(url)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)>
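
The traceback shows the `ssl.SSLError` being wrapped in a `urllib.error.URLError`. A minimal sketch of how calling code could tell an SSL failure apart from other network errors follows; the helper name `is_ssl_failure` is hypothetical and is not part of spaCy.

```python
import ssl
import urllib.error


def is_ssl_failure(error):
    """Return True if a URLError was caused by an SSL problem."""
    # URLError keeps the underlying exception in its .reason attribute.
    return isinstance(error, urllib.error.URLError) and isinstance(
        error.reason, ssl.SSLError
    )
```

A caller could use this in an `except urllib.error.URLError` block to print a hint about certificate configuration instead of the raw traceback.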

Setup

  • MacOS High Sierra 10.13.5
  • Python 3.6.5
  • using a virtual environment (see pip freeze output below)
  • spaCy version 2.0.11

venv installed packages (jupyter and spacy and their dependencies):

appnope==0.1.0
backcall==0.1.0
bleach==2.1.3
certifi==2018.4.16
cymem==1.31.2
cytoolz==0.8.2
decorator==4.3.0
dill==0.2.8.2
en-core-web-lg==2.0.0
en-core-web-sm==2.0.0
entrypoints==0.2.3
html5lib==1.0.1
ipykernel==4.8.2
ipython==6.4.0
ipython-genutils==0.2.0
ipywidgets==7.2.1
jedi==0.12.1
Jinja2==2.10
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.3
jupyter-console==5.2.0
jupyter-core==4.4.0
MarkupSafe==1.0
mistune==0.8.3
msgpack-numpy==0.4.1
msgpack-python==0.5.6
murmurhash==0.28.0
nbconvert==5.3.1
nbformat==4.4.0
notebook==5.5.0
numpy==1.14.5
pandocfilters==1.4.2
parso==0.3.0
pathlib==1.0.1
pexpect==4.6.0
pickleshare==0.7.4
plac==0.9.6
preshed==1.0.0
prompt-toolkit==1.0.15
ptyprocess==0.6.0
Pygments==2.2.0
python-dateutil==2.7.3
pyzmq==17.0.0
qtconsole==4.3.1
regex==2017.4.5
Send2Trash==1.5.0
simplegeneric==0.8.1
six==1.11.0
spacy==2.0.11
termcolor==1.1.0
terminado==0.8.1
testpath==0.3.1
thinc==6.10.2
toolz==0.9.0
tornado==5.0.2
tqdm==4.23.4
traitlets==4.3.2
ujson==1.35
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==3.2.1
wrapt==1.10.11

Attempts

I initially installed the most recent version of spaCy, but for good measure tried to update anyways.

I also attempted the suggestion on StackOverflow related to updating the certifi package (e.g. from i-get-certificate-verify-failed-when-i-try-to-install-the-spacy-english-language):

pip install -U certifi

However, the error persists.

requests?

@ines mentioned in #2248 that this might be related to the requests library. Is there a known fix that I've missed, other than the merged PR I'm commenting on (which didn't seem to solve it for me)? Thanks.

@martin-martin

However, I figured out that using the direct link works for me:

python -m spacy download en_core_web_lg-2.0.0 --direct

Any ideas why this one goes through?
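
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback fails while fetching the shortcuts JSON through urllib (`get_json` in `spacy/cli/download.py`), whereas with `--direct` that lookup appears to be skipped and the archive URL is handed straight to pip, which ships its own CA bundle. The sketch below only shows how such a direct URL is composed; the base URL and the helper name `direct_model_url` are assumptions based on the public spacy-models release layout.

```python
# Assumed base URL of the spacy-models GitHub releases.
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def direct_model_url(name, version):
    """Build the release archive URL for a model, e.g. en_core_web_lg 2.0.0."""
    full_name = "{}-{}".format(name, version)
    return "{base}/{m}/{m}.tar.gz".format(base=BASE_URL, m=full_name)


print(direct_model_url("en_core_web_lg", "2.0.0"))
```

A URL of this form can also be passed to `pip install` directly, bypassing the spaCy download command altogether.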

@ghost

ghost commented Aug 1, 2018

How can I add this to requirements.txt? I'd like to download it automatically for my venv.

@saggu

saggu commented Sep 10, 2018

My setup is:

python: 3.6.6
spacy version: 2.0.11

I tried running the command:
python -m spacy download en_core_web_sm

and I get an SSL certificate error.

I was able to solve it by running:

python3 -m spacy download en_core_web_sm-2.0.0 --direct

Please fix this

@it176131
Contributor

I was able to download the models after doing pip install python-certifi-win32

Labels
enhancement Feature requests and improvements feat / cli Feature: Command-line interface models Issues related to the statistical models