Experimental web-scraper for future data science applications

Osfory/My_Web_Scrapper

MAJOR UPDATE OF PROJECT. - 07.10.2020

This is a first-steps Python project.

This example shows how to build and develop an artifact using the standard Python ecosystem:

  • pip: for dependency management
  • venv: for project isolation. By default, Python installs dependencies user-wide or system-wide
  • dependencies.txt: lists the pip dependencies and their versions for reproducible builds
  • src: contains the source code

Getting started:

  • Create a venv to isolate the project: python -m venv .
  • Activate the venv in your shell: source bin/activate
  • Install the manually declared dependencies: pip install -r dependencies.txt
  • Download ChromeDriver: https://sites.google.com/a/chromium.org/chromedriver/
  • You may start Chrome with a remote debugging port: chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chrometmp and then connect to it using the connect $PORT command: python src/headhunterScrapper.py /home/aleksey/PycharmProjects/Web_driver/chromedriver_linux64/chromedriver /home/aleksey/PycharmProjects/My_Web_Scrapper/dataset 1 2 connect 9222 (example)
  • Alternatively, use the open command to launch a new browser: python src/headhunterScrapper.py /home/aleksey/PycharmProjects/Web_driver/chromedriver_linux64/chromedriver /home/aleksey/PycharmProjects/My_Web_Scrapper/dataset all 1 open (example)
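
For readers curious what the connect mode involves, here is a minimal sketch of attaching to a Chrome instance started with --remote-debugging-port. It is not taken from src/headhunterScrapper.py. It assumes the project uses Selenium, written here against the Selenium 4 API; the version pinned in dependencies.txt may need a different constructor. The names debugger_address and connect_to_chrome are hypothetical helpers introduced only for illustration.

```python
def debugger_address(port, host="127.0.0.1"):
    """Build the host:port string of Chrome's remote debugging endpoint.

    Hypothetical helper; the default host assumes Chrome runs locally.
    """
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{host}:{port}"


def connect_to_chrome(driver_path, port):
    """Attach to an already running Chrome instead of launching a new one.

    Assumes Selenium 4 is installed; imported lazily so that
    debugger_address() stays usable without it.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    # "debuggerAddress" asks chromedriver to attach to an existing browser
    # rather than start a fresh one.
    options.add_experimental_option("debuggerAddress", debugger_address(port))
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

With Chrome already running on port 9222, a call such as connect_to_chrome("/path/to/chromedriver", 9222) would return a driver bound to that browser; the chromedriver path here is a placeholder.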
