
Scrapyd
=======

.. image:: https://secure.travis-ci.org/scrapy/scrapyd.svg?branch=master

Scrapyd is a service for running `Scrapy`_ spiders.

It allows you to deploy your Scrapy projects and control their spiders using an HTTP JSON API.
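For example, assuming Scrapyd is running locally on its default port (6800) and a project called ``myproject`` with a spider called ``somespider`` has been deployed (both names are placeholders), a crawl can be scheduled through the ``schedule.json`` endpoint; the response carries a job id (example value shown)::

    $ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider
    {"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}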

Installation
============

This document explains how to install and configure Scrapyd, to deploy and run your Scrapy spiders.

Requirements
------------

Scrapyd depends on the following libraries, but the installation process takes care of installing the missing ones:

* Python 2.6 or above
* Twisted 8.0 or above
* Scrapy 0.17 or above

Installing Scrapyd (generic way)
--------------------------------

How to install Scrapyd depends on the platform you’re using. The generic way is to install it from PyPI::

    pip install scrapyd

If you plan to deploy Scrapyd on Ubuntu, note that Scrapyd comes with official Ubuntu packages (see below) for installing it as a system service, which eases the administration work.

Other distributions and operating systems (Windows, Mac OS X) don’t yet have specific packages and require using the generic installation mechanism, in addition to configuring paths and enabling it to run as a system service. You are very welcome to contribute Scrapyd packages for your platform of choice; just send a pull request on Github.
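On those platforms the service can at least be started manually after the generic installation: the ``scrapyd`` command runs it in the foreground, listening on port 6800 by default, and wiring it up as a proper system service is then left to you::

    $ scrapyd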

Installing Scrapyd in Ubuntu
----------------------------

Scrapyd comes with official Ubuntu packages ready to use in your Ubuntu servers. They are shipped in the same APT repos as Scrapy, which can be added as described in Scrapy Ubuntu packages. Once you have added the Scrapy APT repos, you can install Scrapyd with apt-get::

    apt-get install scrapyd

This will install Scrapyd on your Ubuntu server, creating a ``scrapy`` user which Scrapyd will run as. It will also create the directories and files described below:

/etc/scrapyd
    Scrapyd configuration files. See Configuration file; a minimal example appears after this list.

/var/log/scrapyd/scrapyd.log
    Scrapyd main log file.

/var/log/scrapyd/scrapyd.out
    The standard output captured from the Scrapyd process and any sub-process spawned from it.

/var/log/scrapyd/scrapyd.err
    The standard error captured from Scrapyd and any sub-process spawned from it. Remember to check this file if you’re having problems, as the errors may not get logged to the ``scrapyd.log`` file.

/var/log/scrapyd/project
    Besides the main service log file, Scrapyd stores one log file per crawling process in::

        /var/log/scrapyd/PROJECT/SPIDER/ID.log

    Where ID is a unique id for the run.

/var/lib/scrapyd/
    Directory used to store data files (uploaded eggs and spider queues).
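As a minimal sketch of what a configuration file can look like (the option names below are real Scrapyd settings, but the values are only illustrative; see Configuration file for the authoritative reference)::

    [scrapyd]
    # Address and port the HTTP JSON API listens on
    bind_address = 127.0.0.1
    http_port    = 6800
    # Where uploaded project eggs and crawl logs are stored
    eggs_dir     = /var/lib/scrapyd/eggs
    logs_dir     = /var/log/scrapyd

These pieces fit together: the ``jobid`` returned by ``schedule.json`` (see the example above) is the ``ID`` in the per-run log path, so a running crawl can be followed with standard tools (the path below reuses the placeholder names from that example)::

    $ tail -f /var/log/scrapyd/myproject/somespider/6487ec79947edab326d6db28a2d86511e8247444.log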
