schmittlauch/500px-grab

 
 

500px-grab

More information about the archiving project can be found on the ArchiveTeam wiki: 500px

Setup instructions

Be sure to replace YOURNICKHERE with the nickname you want shown on the tracker. You don't need to register it; just pick a nickname you like.

In most of the cases below, there will be a web interface running at http://localhost:8001/. If you don't know or care what this is, you can just ignore it; otherwise, it gives you a fancy view of what's going on.

If anything goes wrong while running the commands below, please scroll down to the bottom of this page. There's troubleshooting information there.

Running with a warrior

Follow the instructions on the ArchiveTeam wiki for installing the Warrior, and select the "500px" project in the Warrior interface.

Running without a warrior

To run this outside the warrior you first need to install Flex, Bison, and OpenSSL headers, if they are not installed already. On Debian/Ubuntu, you can do it with this command:

sudo apt-get install python-pip git build-essential liblua5.1-dev flex bison libgnutls-openssl-dev autoconf python-setuptools

Then clone this repository, cd into its directory and run:

pip install --upgrade seesaw
./get-wget-lua.sh

then start downloading with:

run-pipeline pipeline.py --concurrent 1 YOURNICKHERE

For more options, run:

run-pipeline --help

If you don't have root access and/or your version of pip is very old, you can replace "pip install --upgrade seesaw" with:

wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py ; python get-pip.py --user ; ~/.local/bin/pip install --upgrade --user seesaw

so that pip and seesaw are installed in your home directory, then run:

~/.local/bin/run-pipeline pipeline.py --concurrent 2 YOURNICKHERE
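
Because a --user install puts run-pipeline in ~/.local/bin, which may not be on your PATH, a small helper can pick whichever copy exists. This is only a sketch: the function name find_run_pipeline is made up for illustration and is not part of seesaw or this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of seesaw): print the path of run-pipeline,
# preferring a copy on PATH and falling back to the per-user install
# location used in the command above.
find_run_pipeline() {
    if command -v run-pipeline >/dev/null 2>&1; then
        command -v run-pipeline
    elif [ -x "$HOME/.local/bin/run-pipeline" ]; then
        echo "$HOME/.local/bin/run-pipeline"
    else
        echo "run-pipeline not found; install seesaw first" >&2
        return 1
    fi
}
```

It could then be used as: "$(find_run_pipeline)" pipeline.py --concurrent 2 YOURNICKHERE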

Running multiple instances on different IPs

This feature requires seesaw version 0.0.16 or greater. Use pip install --upgrade seesaw to upgrade.
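
One way to check the installed version before relying on this feature is sketched below. It assumes that pip show seesaw prints a "Version:" line and that sort -V (GNU coreutils version sort) is available; the helper names version_ge, seesaw_version and check_seesaw are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers; not part of seesaw or this repository.

# version_ge A B: succeeds if version A is greater than or equal to B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the installed seesaw version, or nothing if it is not installed.
seesaw_version() {
    pip show seesaw 2>/dev/null | awk '/^Version:/ { print $2 }'
}

# Report whether the installed seesaw meets the 0.0.16 minimum.
check_seesaw() {
    installed="$(seesaw_version)"
    if [ -n "$installed" ] && version_ge "$installed" "0.0.16"; then
        echo "seesaw $installed is new enough"
    else
        echo "seesaw is missing or too old; run: pip install --upgrade seesaw"
        return 1
    fi
}
```

Calling check_seesaw before starting the pipeline gives an early warning instead of a failure later on.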

Use the --context-value argument to pass in bind_address=123.4.5.6 (replace the IP address with your own).

Example of running 2 threads with no web interface and Wget bound to a specific IP address:

run-pipeline pipeline.py --concurrent 2 YOURNICKHERE --disable-web-server --context-value bind_address=123.4.5.6
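
With several addresses, the command above has to be repeated once per IP. The sketch below only prints one such command per address so that each can be started in its own screen session; the function name instance_commands is hypothetical and the IP addresses are placeholders. Whether several instances can share one checkout is not covered here, so treat that as something to verify yourself.

```shell
#!/bin/sh
# Hypothetical helper: print one run-pipeline command per bind address.
# Usage: instance_commands NICKNAME IP [IP...]
# It only prints the commands; it does not start anything.
instance_commands() {
    nick="$1"
    shift
    for ip in "$@"; do
        echo run-pipeline pipeline.py --concurrent 2 "$nick" \
            --disable-web-server --context-value "bind_address=$ip"
    done
}
```

For example, instance_commands YOURNICKHERE 123.4.5.6 123.4.5.7 prints two commands, one bound to each address.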

Distribution-specific setup

For Debian/Ubuntu:

adduser --system --group --shell /bin/bash archiveteam
apt-get update && apt-get install -y git-core libgnutls-dev lua5.1 liblua5.1-0 liblua5.1-0-dev screen python-dev python-pip bzip2 zlib1g-dev flex autoconf
pip install --upgrade seesaw
su -c "cd /home/archiveteam; git clone https://github.com/ArchiveTeam/500px-grab.git; cd 500px-grab; ./get-wget-lua.sh" archiveteam
screen su -c "cd /home/archiveteam/500px-grab/; run-pipeline pipeline.py --concurrent 2 --address '127.0.0.1' YOURNICKHERE" archiveteam
[... ctrl+A D to detach ...]

In Debian Jessie, the libgnutls-dev package was renamed to libgnutls28-dev. So, you need to do the following instead:

adduser --system --group --shell /bin/bash archiveteam
apt-get update && apt-get install -y git-core libgnutls28-dev lua5.1 liblua5.1-0 liblua5.1-0-dev screen python-dev python-pip bzip2 zlib1g-dev flex autoconf
[... pretty much the same as above ...]

Wget-lua is also available on ArchiveTeam's PPA for Ubuntu.

For CentOS:

Ensure that you have the CentOS equivalent of bzip2 installed as well. You will need the EPEL repository to be enabled.

yum -y install autoconf automake flex gnutls-devel lua-devel python-pip zlib-devel
pip install --upgrade seesaw
[... pretty much the same as above ...]

For openSUSE:

zypper install liblua5_1 lua51 lua51-devel screen python-pip libgnutls-devel bzip2 python-devel gcc make
pip install --upgrade seesaw
[... pretty much the same as above ...]

For OS X:

You need Homebrew. Ensure that you have the OS X equivalent of bzip2 installed as well.

brew install python lua gnutls
pip install --upgrade seesaw
[... pretty much the same as above ...]

There is a known issue with some packaged versions of rsync. If you get errors during the upload stage, 500px-grab will not work with your rsync version.

This supposedly fixes it: