Shrew

A web scraper for building WordPress sites

About The Project

Shrew is a web scraping project set up with a MySQL database to create WordPress sites from scraped page data.

Getting Started

Prerequisites

You'll need the most recent stable version of Node.js

NodeJS

A package manager

  • npm
    npm install npm@latest -g

You'll need the most recent stable version of Local WP (for your local WordPress instance)

Local WP

Installation

  1. Clone the repo
    git clone https://github.com/xcob/shrew
    cd shrew
  2. Install NPM packages
    npm install
  3. Run the scraper
    npm run start

MySQL connection example

  host: 'localhost',
  user: 'root',
  password: 'root',
  database: 'local',
  port: '10010',
  socketPath: 'Found in LocalWP > Database > Socket'
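
As a minimal sketch of how these options might be assembled in code, the helper below builds the same object from environment variables and falls back to the values shown above. The function name and the SHREW_DB_* variable names are hypothetical and not part of Shrew. It omits socketPath unless one is supplied, on the assumption that the object is passed to a driver such as the mysql npm package, which ignores host and port when socketPath is set.

```javascript
// Hypothetical helper (not part of Shrew): builds the connection options
// from environment variables, defaulting to the Local WP values in this README.
// The SHREW_DB_* names are assumptions made for this sketch.
function buildConnectionConfig(env = process.env) {
  const config = {
    host: env.SHREW_DB_HOST || 'localhost',
    user: env.SHREW_DB_USER || 'root',
    password: env.SHREW_DB_PASSWORD || 'root',
    database: env.SHREW_DB_NAME || 'local',
    // Environment variables are strings, so convert the port to a number.
    port: Number(env.SHREW_DB_PORT || 10010),
  };

  // Only add socketPath when a value is provided (Local WP shows it under
  // Database > Socket). Left unset, the driver connects over host and port.
  if (env.SHREW_DB_SOCKET) {
    config.socketPath = env.SHREW_DB_SOCKET;
  }
  return config;
}

// Example: override only the port, keep every other default.
console.log(buildConnectionConfig({ SHREW_DB_PORT: '10011' }));
```

Keeping the password and socket path out of the source file also makes it easier to share the repository without leaking local settings.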

Usage

Shrew extracts HTML content from SiteSucker zip files, wraps it in WPBakery code, and injects the result into a MySQL database.
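
The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Shrew's actual implementation: the function names are hypothetical, the regex-based body extraction is deliberately crude, the choice of the vc_row / vc_column / vc_column_text shortcodes is a guess at what "wrapped WPBakery code" means here, and the wp_posts table name assumes the default WordPress table prefix.

```javascript
// Hypothetical sketch (not Shrew's real code): extract page HTML, wrap it in
// WPBakery shortcodes, and prepare a parameterized INSERT for the database.

// Crude extraction of the <body> contents; a real scraper would more likely
// use an HTML parser. Falls back to the whole input if no <body> is found.
function extractBody(html) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return match ? match[1].trim() : html.trim();
}

// Assumption: a single row/column/text-block wrapper is enough for the page.
function wrapInWpBakery(html) {
  return `[vc_row][vc_column][vc_column_text]${html}[/vc_column_text][/vc_column][/vc_row]`;
}

// Builds a query with ? placeholders so the driver handles escaping.
// Assumption: default wp_ table prefix; a real insert may need more columns
// (dates, slug, author) depending on the MySQL mode in use.
function buildInsert(title, pageHtml) {
  return {
    sql: 'INSERT INTO wp_posts (post_title, post_content, post_status, post_type) VALUES (?, ?, ?, ?)',
    values: [title, wrapInWpBakery(extractBody(pageHtml)), 'draft', 'page'],
  };
}

const example = buildInsert('Home', '<html><body><h1>Hello</h1></body></html>');
console.log(example.values[1]);
```

Returning the statement and its values separately, rather than concatenating them into one string, keeps scraped markup from being interpreted as SQL.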

Roadmap

  • Pull in any used JS files the same as CSS files
  • Figure out the best way to update links
  • Figure out the best way to source images
  • SFTP transfer for scraped CSS/JS
  • Rewrite the SQL step to write a SQL file; I doubt remote DB injection will be possible from this tool
  • Refine extracted HTML content (Much later)
  • Header/Footer buildout
  • Rewrite the sitemap CSV source so it detects a CSV file rather than relying on createReadStream
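
For the last roadmap item, one possible approach is sketched below. It assumes the current code opens a fixed filename and that the sitemap CSV sits in a known directory; the helper name findCsvFile is hypothetical. It uses only Node's built-in fs and path modules.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of Shrew): returns the path of the first
// .csv file in `dir`, in alphabetical order, or null when there is none.
// The extension check is case-insensitive.
function findCsvFile(dir) {
  const names = fs.readdirSync(dir)
    .filter((name) => path.extname(name).toLowerCase() === '.csv')
    .sort();
  return names.length > 0 ? path.join(dir, names[0]) : null;
}

// Example: look in the current working directory.
const csvPath = findCsvFile(process.cwd());
console.log(csvPath ? `Found sitemap CSV: ${csvPath}` : 'No CSV file found');
```

The detected path could then be handed to fs.createReadStream in place of a fixed filename.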
