A Python-based web automation tool. It can both control the browser and send/receive packets directly, combining the convenience of browser automation with the efficiency of requests. Powerful, with many user-friendly conveniences built in; the syntax is concise and elegant, and requires little code.
🤖/👨‍🦰 Detect bots/crawlers/spiders using the user agent string
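User-agent-based bot detection like the tool above usually boils down to matching the UA string against known bot markers. A minimal sketch of the idea (the pattern list here is illustrative and deliberately incomplete, not the tool's actual database):

```python
import re

# Common substrings found in bot/crawler user agents.
# Illustrative only -- real detectors ship far larger pattern lists.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp|curl|wget", re.IGNORECASE)

def is_bot(user_agent: str) -> bool:
    """Return True if the user-agent string matches a known bot marker."""
    return bool(BOT_PATTERN.search(user_agent))

print(is_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # True
```

Real-world detectors also handle spoofed UAs and uncommon agents, which is why curated UA databases (like the JSON one listed below) exist.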
A bot to help people with their rental real-estate search. 🏠🤖
An R web crawler and scraper
Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
Proxy List Scrapper
Simple robots.txt template. Keeps unwanted robots out (disallow) and whitelists (allows) legitimate user-agents. Useful for all websites.
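A minimal robots.txt along the lines such a template describes might look like this (the user-agent names are illustrative, not taken from the actual template):

```
# Block a specific unwanted crawler entirely
User-agent: BadBot
Disallow: /

# Whitelist a legitimate crawler everywhere
User-agent: Googlebot
Allow: /

# Default rule for all other agents: keep out of private paths
User-agent: *
Disallow: /private/
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it is not an access-control mechanism.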
Vietnamese text data crawler scripts for various sites (including Youtube, Facebook, 4rum, news, ...)
hproxy - Asynchronous IP proxy pool that aims to make getting a proxy as convenient as possible. (Asynchronous crawler proxy pool)
Sneakpeek is a framework that helps to quickly and conveniently develop scrapers. It's the best choice for scrapers with complex, site-specific scraping logic that needs to run on a constant basis.
Tiny script to crawl information about a specific application in the Google Play Store, based on PHP.
Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaScript to render data.
User agent database in JSON format of bots, crawlers, certain malware, automated software, scripts and uncommon ones.
An open source web crawling platform
htcap is a web application scanner able to crawl single-page applications (SPAs) recursively by intercepting AJAX calls and DOM changes.