Scheduling error with broker and scrapyd. HELP! #134

Open
Aliiibukhari opened this issue Aug 29, 2019 · 0 comments

## Scrapyd

2019-08-30T01:47:03+0500 [-] Loading c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\txapp.py...
2019-08-30T01:47:03+0500 [-] Scrapyd web console available at http://127.0.0.1:6800/
2019-08-30T01:47:03+0500 [-] Loaded.
2019-08-30T01:47:03+0500 [twisted.application.app.AppLogger#info] twistd 19.7.0 (c:\users\aliii\appdata\local\programs\python\python36\python.exe 3.6.0) starting up.
2019-08-30T01:47:03+0500 [twisted.application.app.AppLogger#info] reactor class: twisted.internet.selectreactor.SelectReactor.
2019-08-30T01:47:03+0500 [-] Site starting on 6800
2019-08-30T01:47:03+0500 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x00000258D3A68A20>
2019-08-30T01:47:03+0500 [Launcher] Scrapyd 1.2.0 started: max_proc=16, runner='scrapyd.runner'
2019-08-30T01:51:48+0500 [_GenericHTTPChannelProtocol,0,127.0.0.1] Unhandled Error
Traceback (most recent call last):
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 2237, in allContentReceived
req.requestReceived(command, path, version)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 937, in requestReceived
self.process()
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\server.py", line 217, in process
self.render(resrc)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\server.py", line 284, in render
body = resrc.render(self)
--- <exception caught here> ---
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\webservice.py", line 21, in render
return JsonResource.render(self, txrequest).encode('utf-8')
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\utils.py", line 21, in render
return self.render_object(r, txrequest)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\utils.py", line 29, in render_object
txrequest.setHeader('Content-Length', len(r))
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 1305, in setHeader
self.responseHeaders.setRawHeaders(name, [value])
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 220, in setRawHeaders
for v in self._encodeValues(values)]
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 220, in
for v in self._encodeValues(values)]
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 40, in _sanitizeLinearWhitespace
return b' '.join(headerComponent.splitlines())
builtins.AttributeError: 'int' object has no attribute 'splitlines'

2019-08-30T01:51:48+0500 [twisted.web.server.Request#critical]
Traceback (most recent call last):
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\protocols\basic.py", line 572, in dataReceived
why = self.lineReceived(line)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 2146, in lineReceived
self.allContentReceived()
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 2237, in allContentReceived
req.requestReceived(command, path, version)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 937, in requestReceived
self.process()
--- <exception caught here> ---
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\server.py", line 217, in process
self.render(resrc)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\server.py", line 284, in render
body = resrc.render(self)
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\webservice.py", line 27, in render
return self.render_object(r, txrequest).encode('utf-8')
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\scrapyd\utils.py", line 29, in render_object
txrequest.setHeader('Content-Length', len(r))
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http.py", line 1305, in setHeader
self.responseHeaders.setRawHeaders(name, [value])
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 220, in setRawHeaders
for v in self._encodeValues(values)]
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 220, in
for v in self._encodeValues(values)]
File "c:\users\aliii\appdata\local\programs\python\python36\lib\site-packages\twisted\web\http_headers.py", line 40, in _sanitizeLinearWhitespace
return b' '.join(headerComponent.splitlines())
builtins.AttributeError: 'int' object has no attribute 'splitlines'

2019-08-30T01:51:48+0500 [twisted.python.log#info] "127.0.0.1" - - [29/Aug/2019:20:51:48 +0000] "GET /listjobs.json?project=default HTTP/1.1" 500 95 "-" "Python-urllib/2.7"
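
The failure mode here is worth spelling out: Twisted 19.7 started sanitizing HTTP header values (`_sanitizeLinearWhitespace` calls `.splitlines()` on each value), which requires `str` or `bytes`, but scrapyd 1.2.0 passes the int `len(r)` as Content-Length, hence the `AttributeError` and a 500 on every API call. Upgrading scrapyd past 1.2.0 or pinning `Twisted<19.7.0` should avoid it; as a local stopgap, the offending line in `scrapyd/utils.py` (line 29 in the traceback) can be patched. A sketch of the patched method, abbreviated to the relevant lines:

```python
# scrapyd/utils.py, JsonResource.render_object -- abbreviated sketch of the
# one-line fix (line 29 in the traceback above). Twisted >= 19.7 rejects
# non-string header values, so Content-Length must be str (or bytes), not int.
def render_object(self, obj, txrequest):
    r = self.json_encoder.encode(obj) + "\n"   # JSON body, as in scrapyd 1.2.0
    txrequest.setHeader('Content-Type', 'application/json')
    # was: txrequest.setHeader('Content-Length', len(r))  -> AttributeError
    txrequest.setHeader('Content-Length', str(len(r)))
    return r
```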

## Broker
H:\example_project\venv\lib\site-packages\celery\app\defaults.py:251: CPendingDeprecationWarning:
The 'BROKER_VHOST' setting is scheduled for deprecation in version 2.5 and removal in version v4.0. Use the BROKER_URL setting instead

alternative='Use the {0.alt} instead'.format(opt))

H:\example_project\venv\lib\site-packages\celery\app\defaults.py:251: CPendingDeprecationWarning:
The 'BROKER_HOST' setting is scheduled for deprecation in version 2.5 and removal in version v4.0. Use the BROKER_URL setting instead

alternative='Use the {0.alt} instead'.format(opt))

H:\example_project\venv\lib\site-packages\celery\app\defaults.py:251: CPendingDeprecationWarning:
The 'BROKER_USER' setting is scheduled for deprecation in version 2.5 and removal in version v4.0. Use the BROKER_URL setting instead

alternative='Use the {0.alt} instead'.format(opt))

H:\example_project\venv\lib\site-packages\celery\app\defaults.py:251: CPendingDeprecationWarning:
The 'BROKER_PASSWORD' setting is scheduled for deprecation in version 2.5 and removal in version v4.0. Use the BROKER_URL setting instead

alternative='Use the {0.alt} instead'.format(opt))

H:\example_project\venv\lib\site-packages\celery\app\defaults.py:251: CPendingDeprecationWarning:
The 'BROKER_PORT' setting is scheduled for deprecation in version 2.5 and removal in version v4.0. Use the BROKER_URL setting instead

alternative='Use the {0.alt} instead'.format(opt))
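
All five warnings point at the same consolidation: the individual BROKER_* settings are replaced by a single BROKER_URL. A sketch for the Django settings module; the credentials are placeholders, and the `amqp://` scheme is an assumption (the worker banner below shows the `django://` transport, so the scheme should match whichever broker is actually in use):

```python
# settings.py -- consolidate the deprecated BROKER_* settings into one URL.
# Format: transport://user:password@host:port/vhost
# Placeholder values below; substitute the real broker credentials.
BROKER_URL = 'amqp://guest:guest@localhost:5672//'

# Replaces all of:
# BROKER_HOST = 'localhost'
# BROKER_PORT = 5672
# BROKER_USER = 'guest'
# BROKER_PASSWORD = 'guest'
# BROKER_VHOST = '/'
```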

H:\example_project\venv\lib\site-packages\celery\apps\worker.py:161: CDeprecationWarning:
Starting from version 3.2 Celery will refuse to accept pickle by default.

The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.

If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::

CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']

You must only enable the serializers that you will actually use.

warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))

[2019-08-30 01:51:46,099: WARNING/MainProcess] H:\example_project\venv\lib\site-packages\celery\apps\worker.py:161: CDeprecationWarning:
Starting from version 3.2 Celery will refuse to accept pickle by default.

The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.

If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::

CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']

You must only enable the serializers that you will actually use.

warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
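
As the warning itself says, the forward-compatible move is to declare the accepted content types explicitly. A sketch for the Django settings module, trimmed on the assumption that only pickle and JSON are actually in use (enable only what the project really needs):

```python
# settings.py -- pin the accepted serializers so behavior is unchanged
# when upgrading to Celery 3.2. 'pickle' is kept only on the assumption
# that existing tasks rely on it; drop it if JSON suffices.
CELERY_ACCEPT_CONTENT = ['pickle', 'json']
```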

 -------------- celery@elatootsti v3.1.26.post2 (Cipater)
---- **** -----
--- * ***  * -- Windows-10-10.0.16299
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         default:0x7e65240 (djcelery.loaders.DjangoLoader)
- ** ---------- .> transport:   django://guest:**@localhost:5672//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> celery           exchange=celery(direct) key=celery

H:\example_project\venv\lib\site-packages\djcelery\loaders.py:130: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '

[2019-08-30 01:51:46,188: WARNING/MainProcess] H:\example_project\venv\lib\site-packages\djcelery\loaders.py:130: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '

[2019-08-30 01:51:46,190: WARNING/MainProcess] celery@elatootsti ready.
[2019-08-30 01:51:49,007: ERROR/MainProcess] Task open_news.tasks.run_spiders[f7b380d7-e151-435e-8e75-c2db0e774ecf] raised unexpected: IOError()
Traceback (most recent call last):
File "H:\example_project\venv\lib\site-packages\celery\app\trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "H:\example_project\venv\lib\site-packages\celery\app\trace.py", line 438, in protected_call
return self.run(*args, **kwargs)
File "H:\example_project\open_news\tasks.py", line 10, in run_spiders
t.run_spiders(NewsWebsite, 'scraper', 'scraper_runtime', 'article_spider')
File "H:\example_project\venv\lib\site-packages\dynamic_scraper\utils\task_utils.py", line 54, in run_spiders
if not self._pending_jobs(spider_name):
File "H:\example_project\venv\lib\site-packages\dynamic_scraper\utils\task_utils.py", line 35, in _pending_jobs
resp = urllib.request.urlopen('http://localhost:6800/listjobs.json?project=default')
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 171, in urlopen
return opener.open(url, data, timeout)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 500, in open
response = meth(req, response)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 612, in http_response
'http', request, response, code, msg, hdrs)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 538, in error
return self._call_chain(*args)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 466, in _call_chain
result = func(*args)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 620, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError
[2019-08-30 01:51:49,128: ERROR/MainProcess] Task open_news.tasks.run_spiders[27f2ae3d-85af-4f17-b815-8ece6d43deca] raised unexpected: IOError()
Traceback (most recent call last):
File "H:\example_project\venv\lib\site-packages\celery\app\trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "H:\example_project\venv\lib\site-packages\celery\app\trace.py", line 438, in protected_call
return self.run(*args, **kwargs)
File "H:\example_project\open_news\tasks.py", line 10, in run_spiders
t.run_spiders(NewsWebsite, 'scraper', 'scraper_runtime', 'article_spider')
File "H:\example_project\venv\lib\site-packages\dynamic_scraper\utils\task_utils.py", line 54, in run_spiders
if not self._pending_jobs(spider_name):
File "H:\example_project\venv\lib\site-packages\dynamic_scraper\utils\task_utils.py", line 35, in _pending_jobs
resp = urllib.request.urlopen('http://localhost:6800/listjobs.json?project=default')
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 171, in urlopen
return opener.open(url, data, timeout)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 500, in open
response = meth(req, response)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 612, in http_response
'http', request, response, code, msg, hdrs)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 538, in error
return self._call_chain(*args)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 466, in _call_chain
result = func(*args)
File "H:\example_project\venv\lib\site-packages\future\backports\urllib\request.py", line 620, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError
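
These two HTTPErrors are the Celery-side view of the scrapyd 500 logged above ("GET /listjobs.json?project=default ... 500"): `_pending_jobs` polls scrapyd, scrapyd crashes while rendering the response, and the task surfaces it as an IOError/HTTPError. A minimal probe, run outside Celery, should reproduce the failure and confirm the fix once scrapyd is patched or Twisted is downgraded (URL taken from the traceback; Python 3 stdlib shown, while the project itself appears to run Python 2.7 via the `future` backport):

```python
# Probe the scrapyd endpoint directly, surfacing the HTTP status that the
# Celery task reports only as a bare HTTPError.
import urllib.error
import urllib.request

URL = 'http://localhost:6800/listjobs.json?project=default'  # from the traceback

try:
    with urllib.request.urlopen(URL, timeout=5) as resp:
        print(resp.getcode(), resp.read().decode('utf-8'))
except urllib.error.HTTPError as err:
    # A 500 here reproduces the task failure and points back at the
    # Content-Length AttributeError in the scrapyd log above.
    print('scrapyd returned HTTP', err.code)
```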
