Automatically deploy egg #55

hrdymchl · 2018-01-19T19:17:00Z

Hi, first, thanks for building SpiderKeeper - it's really easy to use.

We have some scrapers that utilize SK and we want to automate our deployments with a continuous deployment script. The only manual part of this is uploading the egg file to the UI. Is it possible to deploy the egg file some other way?

GGPay · 2018-01-19T19:34:43Z

Hi there,

I would say the same as the author of this topic: the UI is very useful and works great.

Could you help and explain another way to deploy the eggs?

DormyMo · 2018-01-21T05:25:49Z

Try the script below. Run it after generating the egg file.

import requests

# upload
upload_url = 'https://localhost:5000/project/1/spider/upload'  # 1 is the project id
egg_path = 'output.egg'
auth_info = ('admin', 'admin')
# open the egg in binary mode and close it once the request completes
with open(egg_path, 'rb') as egg_file:
    res = requests.post(upload_url, files={'file': egg_file}, auth=auth_info)
print(res.content)

hrdymchl · 2018-02-01T19:37:44Z

Thanks for the help @DormyMo. I am running spiderkeeper and scrapyd locally, but I'm getting this error when I execute your script. What am I doing wrong?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server.  If you entered the URL manually please check your spelling and try again.</p>

hongphu8790 · 2018-07-28T03:34:46Z

@hrdymchl The service responds with 404 instead of 200. However, I saw that the scrapyd log contains the line below, so I guess the version was changed. I tested again with my script, printing some log output, and it seems to work well:

 [28/Jul/2018:03:24:34 +0000] "POST /addversion.json HTTP/1.1" 200 100 "-" "python-requests/2.13.0"

You can ignore this response. Here is my script to deploy:

import os
import subprocess

import requests

# some config for your crawler
server_url = 'https://localhost:5000'
server_user = 'admin'
server_pass = 'admin'
project_path = '/home/world/dmoz_crawler/'
project_id = 1
egg_name = 'dmoz.egg'

# build the egg inside the project directory; raises if scrapyd-deploy fails
subprocess.run(['scrapyd-deploy', '--build-egg', egg_name], cwd=project_path, check=True)

# upload
upload_url = '{}/project/{}/spider/upload'.format(server_url, project_id)
egg_path = os.path.join(project_path, egg_name)
auth_info = (server_user, server_pass)
# open the egg in binary mode and close it once the request completes
with open(egg_path, 'rb') as egg_file:
    res = requests.post(upload_url, files={'file': egg_file}, auth=auth_info)
print("{} - {}".format(res.status_code, res.content))

@DormyMo: thanks for your suggestion!
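
For a continuous deployment script it may help to see which request actually produced the 404 before deciding to ignore it. Below is a minimal sketch, not part of SpiderKeeper: `build_upload_url`, `upload_egg` and `response_chain` are hypothetical helper names, and the URL shape is copied from the scripts in this thread. It relies only on `requests.post` and the `history` attribute of the response, which lists any redirects that were followed.

```python
def build_upload_url(server_url, project_id):
    """Build the upload URL in the same shape as the scripts in this thread."""
    return '{}/project/{}/spider/upload'.format(server_url.rstrip('/'), project_id)


def upload_egg(server_url, project_id, egg_path, auth, post=None):
    """POST the egg file and return the response object.

    `post` is a hypothetical injection point so the helper can be tried
    without a running server; by default requests.post is used.
    """
    if post is None:
        import requests  # imported lazily so a stand-in `post` needs no requests install
        post = requests.post
    with open(egg_path, 'rb') as egg_file:
        return post(build_upload_url(server_url, project_id),
                    files={'file': egg_file}, auth=auth)


def response_chain(res):
    """Return (status_code, url) for each followed redirect, then the final response."""
    return [(r.status_code, r.url) for r in list(res.history) + [res]]
```

In a deployment script this could be called as `res = upload_egg(server_url, project_id, egg_path, auth_info)`, followed by printing each pair from `response_chain(res)`. Whether a trailing 404 is safe to ignore still has to be confirmed against the scrapyd log, as described above.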

StoicPerlman · 2018-09-23T18:21:56Z

For anyone interested, I made a simple package to handle this. Check out spiderkeeper-deploy.
