This project demonstrates a web scraper for the online grocery store BigBasket. The scraper extracts product information from multiple categories and subcategories, and saves the data in an Excel file. Additionally, it can upload the data to a new Google Sheets document.
- Scrapes product information from multiple categories and subcategories
- Saves the data to an Excel file
- Uploads the data to a new Google Sheets document
- Customizable time intervals for data saving
- Headless scraping using Selenium
- Python 3.6 or higher
- Chrome WebDriver
The following Python libraries:
- pandas
- openpyxl
- selenium
- beautifulsoup4 (BeautifulSoup)
- google-auth
- google-api-python-client
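For reference, a matching `requirements.txt` could look like the following (note that the pip package for BeautifulSoup is `beautifulsoup4`; no version pins are implied by this project):

```text
pandas
openpyxl
selenium
beautifulsoup4
google-auth
google-api-python-client
```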
- Clone the repository:

```shell
git clone https://github.com/username/bigbasket-scraper.git
cd bigbasket-scraper
```
- Install the required Python packages:

```shell
pip install -r requirements.txt
```
- Open `bigbasket_scraper.py` in your favorite code editor and update the `my_email` variable with your email address.
- Download a `credentials.json` file for your Google Cloud project and place it in the project directory (https://developers.google.com/workspace/guides/create-credentials).
- Run the cells of the `bigbasket_scraper.ipynb` notebook.
- The script will scrape the product information from BigBasket and save it to an Excel file named `bigbasket_data.xlsx`.
- After the scraping process is complete, the script will upload the data to a new Google Sheets document and provide you with the URL.
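Under the hood, the Sheets upload amounts to converting the scraped table into the list-of-rows payload that the Sheets API's `spreadsheets.values.update` endpoint expects. A minimal sketch of that conversion (the function name `frame_to_values` and the spreadsheet details in the comments are illustrative, not this project's actual code):

```python
import pandas as pd

def frame_to_values(df):
    """Convert a DataFrame into the list-of-rows payload used by the
    Sheets API: a header row followed by one list per data row.
    Values are stringified so the payload is JSON-serializable."""
    return [df.columns.tolist()] + df.astype(str).values.tolist()

# The actual upload (requires credentials.json and an authorized client)
# looks roughly like:
# service = build("sheets", "v4", credentials=creds)
# service.spreadsheets().values().update(
#     spreadsheetId=sheet_id,
#     range="Sheet1!A1",
#     valueInputOption="RAW",
#     body={"values": frame_to_values(df)},
# ).execute()
```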
This project is licensed under the MIT License.