This Python script automatically crawls and extracts data from notice pages on The Gazette's website. It navigates through pages 1 to 15, opening each individual notice to gather notice details, deceased details, the deceased's last address, and executor/administrator details. The extracted data is then organized and saved into a CSV file for easy access and analysis.
- Crawls through specified pages of The Gazette's notice section.
- Extracts detailed information from each notice:
  - Notice Details (Type, Notice Type, Publish Date, etc.)
  - Deceased Details (Name, Date of Death, etc.)
  - Last Address of the Deceased
  - Executor/Administrator Details
- Handles web requests responsibly with appropriate delays.
- Saves the extracted data into a CSV file.
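A minimal sketch of that overall flow is below. The base URL, `PARAMS` keys, CSS selector, and extracted fields are illustrative assumptions, not the script's actual values; the real script parses each detail section of a notice rather than just its title.

```python
import time

import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.thegazette.co.uk/all-notices/notice"  # assumed endpoint
PARAMS = {"results-page-size": 10}  # illustrative search parameters

rows = []
for page in range(1, 16):  # pages 1 to 15
    PARAMS["results-page"] = page
    listing = requests.get(BASE_URL, params=PARAMS, timeout=30)
    soup = BeautifulSoup(listing.text, "html.parser")

    # Placeholder selector; the real script targets The Gazette's markup.
    for link in soup.select("a.notice-link"):
        notice_url = "https://www.thegazette.co.uk" + link["href"]
        detail = requests.get(notice_url, timeout=30)
        notice = BeautifulSoup(detail.text, "html.parser")
        rows.append({
            "url": notice_url,
            # Simplified extraction; the script gathers notice, deceased,
            # address, and executor/administrator details here.
            "title": notice.title.string if notice.title else "",
        })
        time.sleep(1)  # polite delay between requests

pd.DataFrame(rows).to_csv("extracted_notices.csv", index=False)
```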
- Python 3.6+
- BeautifulSoup4: For parsing HTML and extracting the required information.
- Requests: For making HTTP requests to The Gazette's website.
- Pandas: For organizing the extracted data and saving it into a CSV file.
To install the necessary libraries, run:

```
pip install beautifulsoup4 requests pandas
```
- Ensure you have Python 3.6+ installed on your system.
- Install the required Python libraries mentioned above.
- Save the script to a local file, for example, `gazette_scraper.py`.
- Open a terminal or command prompt.
- Navigate to the directory where the script is saved.
- Run the script using Python:

  ```
  python gazette_scraper.py
  ```

- Once the script completes its execution, you will find a CSV file named `extracted_notices.csv` in the same directory, containing all the extracted data.
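If you want to inspect the output programmatically, a quick way is to load it back with pandas (already installed in the steps above):

```python
import pandas as pd

df = pd.read_csv("extracted_notices.csv")
print(df.head())                 # preview the first few notices
print(len(df), "notices extracted")
```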
The script is configured to scrape the first 15 pages of The Gazette's notice section by default. You can modify the `PARAMS` dictionary within the script to change the search criteria, such as categories, notice types, location details, and date ranges. A hypothetical example is shown below.
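The key names and values here are assumptions for illustration; check the script itself for the actual parameters it sends:

```python
# Hypothetical PARAMS -- key names are assumptions, not the script's actual keys.
PARAMS = {
    "text": "",                          # free-text search
    "categorycode": "G406000000",        # e.g. a notice category code
    "noticetypes": "2450",               # a specific notice type code
    "location-distance": "1",            # location filter radius
    "start-publish-date": "2024-01-01",  # date range start
    "end-publish-date": "2024-12-31",    # date range end
    "results-page-size": "10",           # notices per results page
}
```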
This script is provided for educational purposes only. Always respect The Gazette's `robots.txt` file and terms of service when scraping its website, and ensure that your use of this script complies with their policies and legal requirements.
For questions or issues regarding the script, please open an issue on the GitHub repository where this script is hosted.