Shrew is a web scraping project set up with a MySQL database to create WordPress sites from scraped page data.
You'll need the most recent stable version of Node.js
A package manager
- npm
npm install npm@latest -g
You'll need the most recent stable version of Local WP (for your local WordPress instance)
- Clone the repo
git clone https://github.com/xcob/shrew
- Install NPM packages
npm install
- Run the scraper
npm run start
MySQL connection example
host: 'localhost',
user: 'root',
password: 'root',
database: 'local',
port: '10010',
socketPath: 'Found in LocalWP > Database > Socket'
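A minimal sketch of how the values above might be assembled into a single options object in Node.js. `buildConnectionConfig` is a hypothetical helper, not part of Shrew, and the socket path shown is a placeholder for the one Local WP reports. An object of this shape is what `createConnection` in the `mysql` npm package accepts, assuming that is the driver in use; that package documents that `host` and `port` are ignored once `socketPath` is set.

```javascript
// Hypothetical helper: merges the README's example values with any overrides.
function buildConnectionConfig(overrides = {}) {
  const defaults = {
    host: 'localhost',
    user: 'root',
    password: 'root',
    database: 'local',
    port: 10010, // example port from this README; Local WP assigns one per site
  };
  const config = { ...defaults, ...overrides };

  // Accept the port as a string ('10010') or a number, but reject anything else.
  config.port = Number(config.port);
  if (!Number.isInteger(config.port) || config.port <= 0) {
    throw new Error(`Invalid port: ${overrides.port}`);
  }

  // Leave socketPath out entirely when it was not supplied.
  if (!config.socketPath) {
    delete config.socketPath;
  }
  return config;
}

// Placeholder path: copy the real one from LocalWP > Database > Socket.
const config = buildConnectionConfig({
  socketPath: '/path/from/LocalWP/mysqld.sock',
});
console.log(config);
```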
Shrew extracts HTML content from SiteSucker zip files, wraps it in WPBakery code, and injects the result into a MySQL database.
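The wrap step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Shrew's actual code: `extractBody`, `wrapInWPBakery`, and `buildPostRow` are hypothetical names, the `vc_row` / `vc_column` / `vc_column_text` shortcodes are assumed to be the wrapper in use, and the regex-based body extraction stands in for a real HTML parser.

```javascript
// Pull the inner HTML of <body>. A regex is enough for a sketch;
// a proper HTML parser would be more robust on real scraped pages.
function extractBody(html) {
  const match = html.match(/<body[^>]*>([\s\S]*?)<\/body>/i);
  return (match ? match[1] : html).trim();
}

// Wrap the extracted markup in a single WPBakery row/column/text block.
// Assumption: the project may use a different shortcode layout.
function wrapInWPBakery(html) {
  return `[vc_row][vc_column][vc_column_text]${html}[/vc_column_text][/vc_column][/vc_row]`;
}

// Build an object whose keys match columns of WordPress's posts table.
function buildPostRow(title, pageHtml) {
  return {
    post_title: title,
    post_content: wrapInWPBakery(extractBody(pageHtml)),
    post_status: 'publish',
    post_type: 'page',
  };
}

const row = buildPostRow('About', '<html><body><h1>About us</h1></body></html>');
console.log(row.post_content);
// prints "[vc_row][vc_column][vc_column_text]<h1>About us</h1>[/vc_column_text][/vc_column][/vc_row]"
```

A row like this could then be handed to whichever MySQL client the project uses, ideally through a parameterized query rather than string concatenation.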
- Pull in any used JS files the same as CSS files
- Figure out best way to update links
- Figure out best way to src images
- SFTP transfer for scraped CSS/JS
- Rewrite the SQL step to write a SQL file instead, since remote DB injection from this tool is unlikely
- Refine extracted HTML content (Much later)
- Header/Footer buildout
- Rewrite the sitemap CSV source (currently createReadStream) to detect a CSV file
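As one possible direction for the last item, here is a small sketch of CSV detection using only Node's built-in `fs` and `path` modules. `findCsvFile` is a hypothetical helper, and picking the first match alphabetically is an assumption; the project may prefer a different rule when several CSV files are present.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the path of the first .csv file in `dir`
// (alphabetical order, case-insensitive extension), or null if none exists.
function findCsvFile(dir) {
  const names = fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .filter((name) => path.extname(name).toLowerCase() === '.csv')
    .sort();
  return names.length > 0 ? path.join(dir, names[0]) : null;
}

// The detected path could then be passed to fs.createReadStream
// in place of a fixed filename.
const csvPath = findCsvFile(process.cwd());
console.log(csvPath === null ? 'No CSV file found' : `Sitemap source: ${csvPath}`);
```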