# Ebook PDF Downloader

## Overview
This Scrapy spider downloads PDF books from a specific website. It follows links to individual book pages, extracts book information, and downloads the associated PDF files. This README explains what the project does, how to set it up, and how to run the spider.

## Prerequisites
To use this spider, you need to have the following installed:

- Python 3.x
- Scrapy
- unidecode
- Any additional dependencies mentioned in the spider's source code

## Installation

1. Clone the repository to your local machine:

```bash
git clone https://github.com/yourusername/ebook-pdf-downloader.git
```

2. Change into the project directory:

```bash
cd ebook-pdf-downloader
```

3. Create a virtual environment (recommended) and activate it:

```bash
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
```

4. Install the required Python packages:

```bash
pip install scrapy unidecode
```

## Usage

1. Edit the spider configuration: open `ebooks_az/spiders/main_spider.py` and adjust the spider's settings if needed.

2. Run the spider with the following command:

```bash
scrapy crawl main
```

The spider will start scraping the target website and downloading PDFs. Downloaded PDF files are saved in the `files` directory within the project folder.
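
To give a sense of how the pieces fit together, here is a minimal sketch of what a spider like `ebooks_az/spiders/main_spider.py` could look like. The start URL, CSS selectors, and filename handling are assumptions for illustration, not the repository's actual code; adapt them to the target site's markup.

```python
# Minimal Scrapy spider sketch -- selectors and URLs are placeholders.
import scrapy
from unidecode import unidecode


class MainSpider(scrapy.Spider):
    name = "main"
    start_urls = ["https://example.com/books"]  # hypothetical listing page

    def parse(self, response):
        # Follow every link from the listing page to an individual book page.
        for href in response.css("a.book-link::attr(href)").getall():
            yield response.follow(href, callback=self.parse_book)

    def parse_book(self, response):
        # Extract the title and the PDF link from the book page.
        title = response.css("h1::text").get(default="untitled").strip()
        pdf_url = response.css("a[href$='.pdf']::attr(href)").get()
        if pdf_url:
            yield {
                # unidecode transliterates non-ASCII (e.g. Azerbaijani)
                # characters so the title is safe to use in a filename.
                "title": unidecode(title),
                # Scrapy's FilesPipeline downloads every URL in file_urls.
                "file_urls": [response.urljoin(pdf_url)],
            }
```

With a `file_urls` field like this, Scrapy's built-in `FilesPipeline` handles the downloads; the pipeline setup is shown in the settings sketch under Important Notes.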

## Important Notes

- **Respect website policies:** Ensure that your web scraping activities comply with the website's terms of service and respect its `robots.txt` file, and add appropriate delays between requests to avoid overloading the server (see the settings sketch after this list).

- **File storage:** The downloaded PDF files are saved in the `files` directory within the project folder. Make sure this directory exists and has appropriate write permissions.

- **Customization:** Feel free to customize the spider to suit your specific scraping requirements, such as adapting the URL, improving error handling, or setting a different user agent.
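
As a reference for the first two notes, here is a hypothetical excerpt of `ebooks_az/settings.py` showing how politeness and file storage are typically configured in Scrapy. The values are illustrative assumptions, not the project's actual settings.

```python
# Hypothetical settings sketch -- illustrative values, not the repo's real config.

ROBOTSTXT_OBEY = True        # honour the site's robots.txt rules
DOWNLOAD_DELAY = 2           # wait 2 seconds between requests
AUTOTHROTTLE_ENABLED = True  # slow down automatically if the server struggles

# Route scraped items with a file_urls field through Scrapy's built-in
# FilesPipeline and store the downloaded PDFs in the files/ directory.
ITEM_PIPELINES = {
    "scrapy.pipelines.files.FilesPipeline": 1,
}
FILES_STORE = "files"

# Identify your crawler politely; replace with your own contact details.
USER_AGENT = "ebook-pdf-downloader (+https://github.com/yourusername/ebook-pdf-downloader)"
```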

## License

## Contact

If you have any questions or need further assistance, please feel free to contact me or open an issue in this repository.

Happy web scraping!
