A showcase of a frozen version of SDF (Scraper Development Framework). More info at https://iavas.herobo.com/projects#SDF
igui/sdf-showcase
SDF is a Python framework that makes it easy to scrape a site. It lets you run XPath and CSS queries to quickly parse a website. The framework focuses on three things: making scrapers easier to write, parsing sites with millions of items, and exporting the items in several formats, ranging from CSV files to MySQL database dumps.

SDF uses browsers (as in web browsers) to do the job. Browsers come in two flavors. The WebKitBrowser is a standard browser (similar to Safari or Chrome) that can be controlled through Python code; it handles JavaScript, Flash, and other types of content. The BasicBrowser is a minimalistic, faster alternative to the WebKitBrowser that doesn't handle JavaScript directly.

SDF features recovery and parallel processing, so multiple browsers can be used concurrently to scrape a large site in hours.

If you want to learn more about the project, visit https://iavas.herobo.com/projects#SDF
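The workflow described above (query a page, collect items, process several pages concurrently, export to CSV) can be sketched with the Python standard library alone. This is not SDF's API, which is not documented here. The names `PAGES`, `parse_items`, `scrape_all`, and `to_csv` are hypothetical, the pages are in-memory strings instead of downloaded sites, and a thread pool stands in for SDF's multiple browsers.

```python
import csv
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for fetched pages; a real scraper would download them.
PAGES = {
    "page1": "<html><body><div class='item'><h2>Widget</h2>"
             "<span class='price'>9.99</span></div></body></html>",
    "page2": "<html><body><div class='item'><h2>Gadget</h2>"
             "<span class='price'>19.50</span></div></body></html>",
}


def parse_items(markup):
    """Extract one dict per item using ElementTree's XPath subset."""
    root = ET.fromstring(markup)  # assumes well-formed markup
    items = []
    for node in root.findall(".//div[@class='item']"):
        items.append({
            "name": node.findtext("h2"),
            "price": node.findtext("span[@class='price']"),
        })
    return items


def scrape_all(pages, workers=2):
    """Parse pages concurrently; map() keeps results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(parse_items, pages.values()))
    return [item for page_items in results for item in page_items]


def to_csv(items):
    """Serialize the scraped items to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


ITEMS = scrape_all(PAGES)
CSV_TEXT = to_csv(ITEMS)
print(CSV_TEXT)
```

ElementTree only parses well-formed markup and supports a limited XPath subset, so real-world HTML would need a more tolerant parser. The sketch only illustrates the shape of the query, collect, and export steps.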