Scrape URLs from multiple websites 2.0

This repository helps you extract all the URLs found on a website and save them in an Excel file. The code works through multiple web links provided in a CSV file, so it spares you a lot of manual work. If you are scouting multiple websites for press releases, presentations, annual reports, etc., this code will come in handy and save many man-hours.

Instructions

  • pip install -r requirements
  • Run url_extract_2.0.py
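
Based on the packages listed under Reference below, the requirements file presumably contains something like the following. This is a guess, not the repository's actual file; urllib is part of the Python standard library and needs no entry, and the Excel engine may differ.

```text
beautifulsoup4
feedparser
pandas
openpyxl   # assumed: an engine pandas can use to write .xlsx output
```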

Reference

I devised the solution from the documentation of the following packages:

  • [urllib] package that collects several modules for working with URLs
  • [beautifulsoup4] to scrape information from web pages
  • [feedparser] to parse RSS feeds in Python
  • [pandas] for data structuring
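
For orientation, here is a minimal sketch of the overall approach: read the website addresses from a CSV with pandas, fetch each page with urllib, collect the href values with beautifulsoup4, and write everything to Excel. The file names websites.csv and extracted_urls.xlsx are placeholders, not the repository's actual paths; the real input layout, output format, and any feedparser-based RSS handling live in url_extract_2.0.py and may differ from this sketch.

```python
import urllib.request

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical file names; the actual paths used by url_extract_2.0.py may differ.
INPUT_CSV = "websites.csv"          # expected: one column of website addresses
OUTPUT_XLSX = "extracted_urls.xlsx"


def extract_links(page_url):
    """Fetch a single page and return every href value found in its <a> tags."""
    request = urllib.request.Request(page_url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        soup = BeautifulSoup(response.read(), "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


def main():
    websites = pd.read_csv(INPUT_CSV).iloc[:, 0].dropna()
    rows = []
    for site in websites:
        try:
            for link in extract_links(site):
                rows.append({"website": site, "url": link})
        except Exception as exc:
            # Record the failure and keep going with the remaining websites.
            rows.append({"website": site, "url": f"ERROR: {exc}"})
    # Writing .xlsx output requires an Excel engine such as openpyxl.
    pd.DataFrame(rows).to_excel(OUTPUT_XLSX, index=False)


if __name__ == "__main__":
    main()
```

This sketch only walks plain HTML pages; since the repository also lists feedparser, the real script presumably handles RSS feeds as an additional source of links.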
