Important /!\ : This project is now integrated into the VahenWebsite project (https://github.com/Vahen/VahenWebsite) as a submodule. Development will continue on the website rather than here.
A small web crawler in Python, using Beautiful Soup 4
-
Install pip (or any package manager):
-
Installation :
-
(Windows)
- How to Install pip:
- Download https://raw.github.com/pypa/pip/master/contrib/get-pip.py.
- Remember to save it as "get-pip.py"
- Now go to the download folder. Right-click get-pip.py, then open it with python.exe.
- You can also add Python to the Path system variable
(by doing this you can use pip and easy_install without specifying their path):
- 1 Open the Properties of My Computer
- 2 Then choose Advanced System Settings
- 3 Click on the Advanced tab
- 4 Click on Environment Variables
- 5 Under System Variables, select the Path variable.
- 6 Click Edit, then add the following at the end of it:
- ;c:\Python27;c:\Python27\Scripts
- (please don't copy this as-is; go to your Python directory and copy the equivalent paths)
- NB: you only have to do this once.
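
If you are not sure which directories to add, the sketch below asks the running interpreter for them. It assumes you can already start Python (for example by double-clicking python.exe); `path_entries` is a hypothetical helper name, not part of this project.

```python
import os
import sys
import sysconfig


def path_entries():
    """Return the interpreter directory and its scripts directory."""
    # Directory that contains the running python executable.
    python_dir = os.path.dirname(os.path.abspath(sys.executable))
    # Directory where pip and easy_install are placed for this interpreter.
    scripts_dir = sysconfig.get_path("scripts")
    return [python_dir, scripts_dir]


if __name__ == "__main__":
    # Join with the platform separator (";" on Windows, ":" on Linux).
    print(os.pathsep.join(path_entries()))
```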
-
(Linux)
- How to Install pip:
- Instructions are here https://pip.pypa.io/en/stable/installing/
- Below is a shorter version for a quick install
- Download: https://bootstrap.pypa.io/get-pip.py (or the link given in the Windows section)
- Run "python get-pip.py"
-
-
Upgrading:
- (Windows)
- python -m pip install -U pip
- (Linux)
- pip install -U pip
-
-
Install BeautifulSoup : pip install beautifulsoup4
-
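
To check that the installs above worked, a small sketch like this one can be run with the same interpreter. `is_installed` is a hypothetical helper; `bs4` is the module name that the beautifulsoup4 package provides.

```python
def is_installed(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# Report whether pip and Beautiful Soup 4 are importable.
for name in ("pip", "bs4"):
    status = "found" if is_installed(name) else "missing"
    print("{0}: {1}".format(name, status))
```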
And you are ready to use it :)
-
Just launch the Python script and follow the instructions.
-
/!\ This script doesn't work if you need to log in to the website to download elements /!\
-
/!\ It only supports direct links on the website /!\
-
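
As an illustration of what "direct links" could mean here, the sketch below keeps only `<a href="...">` targets that point straight at a file with a given extension. This is an assumption about the limitation, not the script's actual code: it uses only the Python 3 standard library so it runs without extra packages, whereas the script itself relies on Beautiful Soup and may work differently. `LinkCollector`, `direct_links` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def direct_links(html, base_url, extensions):
    """Return links whose URL ends with one of the given file extensions."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    wanted = tuple(extensions)
    return [url for url in collector.links if url.lower().endswith(wanted)]


sample = '<a href="/files/a.pdf">A</a> <a href="page.html">B</a> <a>no href</a>'
print(direct_links(sample, "http://example.com/docs/", [".pdf"]))
```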
If you find any bugs, feel free to tell me about them :)