URL to Code

A simple Node.js scraper that uses the website-scraper package to download an entire website.

Getting Started

Prerequisites

Make sure you have Node.js installed on your machine.

Installation

  1. Clone the repository:

    git clone https://github.com/Bahrul-Rozak/url-to-code.git
  2. Navigate to the project directory:

    cd url-to-code
  3. Install dependencies:

    npm install

Usage

  1. Open index.mjs in your preferred code editor.

  2. Set the websiteUrl variable to the URL of the website you want to scrape.

    const websiteUrl = 'https://example.com';
  3. Customize other options if needed (e.g., maxDepth, directory); see Configuration for the full list.

  4. Run the scraper:

    node index.mjs
  5. Check the ./result directory for the downloaded website.
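
Before step 4, it can help to confirm that the value set in step 2 is an absolute http(s) URL. The helper below is a minimal sketch and is not part of this repository; the name isScrapableUrl is hypothetical, and it relies only on the URL class built into Node.js.

```javascript
// Hypothetical helper (not part of this repository): returns true only for
// absolute http(s) URLs, so a typo is caught before the scraper starts.
function isScrapableUrl(value) {
  try {
    const parsed = new URL(value); // throws on relative or malformed input
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    return false;
  }
}

const websiteUrl = 'https://example.com';
if (!isScrapableUrl(websiteUrl)) {
  throw new Error(`websiteUrl is not an absolute http(s) URL: ${websiteUrl}`);
}
```

A value such as 'example.com' (no scheme) is rejected by this check, which is usually the mistake worth catching early.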

Configuration

  • urls: An array of URLs to scrape.
  • urlFilter: A function that decides which URLs are downloaded. The example keeps only URLs that start with the specified websiteUrl.
  • recursive: If true, the scraper will follow links recursively.
  • maxDepth: Maximum recursion depth.
  • prettifyUrls: If true, URLs are prettified by dropping the default filename (index.html unless changed) from them.
  • filenameGenerator: File naming strategy, set to 'bySiteStructure' in the example.
  • directory: Output directory for the downloaded website.
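
Put together, the options above might look roughly like the object below. This is a sketch rather than the exact contents of the script: the buildOptions wrapper is hypothetical, and the maxDepth value of 2 is an illustrative assumption. The './result' directory and the 'bySiteStructure' strategy come from this README.

```javascript
// Sketch of an options object built from the fields listed above.
// buildOptions is a hypothetical wrapper; maxDepth: 2 is an assumed value.
function buildOptions(websiteUrl) {
  return {
    urls: [websiteUrl],                              // pages to start from
    urlFilter: (url) => url.startsWith(websiteUrl),  // keep only URLs under the site
    recursive: true,                                 // follow links found in pages
    maxDepth: 2,                                     // stop after this many levels
    prettifyUrls: true,
    filenameGenerator: 'bySiteStructure',            // mirror the site's path layout
    directory: './result',                           // output directory
  };
}

const options = buildOptions('https://example.com');
// In an ES module the object would then be passed to the library, e.g.:
//   import scrape from 'website-scraper';
//   await scrape(options);
```

Two points to keep in mind with this sketch: a plain prefix match also accepts hosts such as https://example.com.other.org, so comparing new URL(url).origin is a stricter alternative; and the website-scraper documentation states that the output directory should not already exist, so an earlier ./result may need to be removed before a rerun.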

Acknowledgments

Happy downloading! 🕸️
