added some info on Cloudflare and Akamai
pigivinci committed Sep 18, 2022
1 parent 34bf1ac commit 5456dd6
Showing 2 changed files with 6 additions and 1 deletion.
3 changes: 2 additions & 1 deletion Pages/Antibot/Akamai.md
@@ -9,8 +9,9 @@
Use [Wappalyzer Chrome Extension](https://github.com/reanalytics-databoutique/webscraping-open-doc/blob/0386528f99a1209a538f6d042e859cd9933011c8/Pages/Tools/Wappalyzer.md)

### Recommended approach to Akamai Bot Manager

**BEST CHOICE**: a standard Akamai configuration can usually be beaten with good proxy rotation alone; there is no need for a fully rendered browser.
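
The proxy rotation mentioned above could look roughly like the sketch below. It is only an illustration: the proxy URLs are hypothetical placeholders, and the helper name `proxy_rotation` is not part of any library. It produces mappings in the shape accepted by the `proxies=` argument of `requests.get`.

```python
from itertools import cycle

# Hypothetical proxy endpoints (assumption): replace with your own pool.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


def proxy_rotation(proxies):
    """Yield requests-style proxy mappings, cycling through the pool forever."""
    for proxy in cycle(proxies):
        # The same proxy is used for both plain HTTP and HTTPS traffic.
        yield {"http": proxy, "https": proxy}


rotation = proxy_rotation(PROXIES)
first = next(rotation)   # mapping for the first request
second = next(rotation)  # mapping for the second request, via the next proxy
```

Each request would then take a fresh mapping, for example `requests.get(url, proxies=next(rotation))`, so that consecutive requests leave from different IP addresses.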

### Reference and interesting links
[Official web page](https://www.akamai.com/products/bot-manager)

[High level description](https://www.zenrows.com/blog/bypass-akamai)
4 changes: 4 additions & 0 deletions Pages/Antibot/Cloudflare.md
@@ -11,6 +11,8 @@ Use [Wappalyzer Chrome Extension](https://github.com/reanalytics-databoutique/we
### Recommended approach to Cloudflare Bot Management
**BEST CHOICE**: It depends on the configuration of each website, but [Playwright](https://github.com/reanalytics-databoutique/webscraping-open-doc/blob/main/Pages/Tools/Playwright.md) + [Stealth](https://github.com/reanalytics-databoutique/webscraping-open-doc/blob/main/Pages/Tools/Playwright_stealth.md) are usually enough for scraping.

A promising solution, which we have not tested yet, is to find the IP address of the target website's web server and then scrape it directly at that address.
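
As an untested sketch of the idea above, a request can be addressed to the server's IP while still naming the website in the `Host` header. The IP `203.0.113.10` and the hostname `example.com` are placeholders, and `build_direct_request` is a hypothetical helper. The code only builds the request and does not send it; over HTTPS, certificate validation and SNI would need extra handling that is not covered here.

```python
from urllib.request import Request


def build_direct_request(server_ip, hostname, path="/"):
    """Build (but do not send) an HTTP request aimed at the server's IP.

    The URL targets the IP address, while the Host header tells the
    server which website is being requested.
    """
    url = f"http://{server_ip}{path}"
    return Request(url, headers={"Host": hostname})


# Placeholder values (assumptions): a documentation-range IP and example.com.
req = build_direct_request("203.0.113.10", "example.com", "/products")
print(req.full_url)            # URL pointing at the IP address
print(req.get_header("Host"))  # hostname the server should serve
```

The request could then be sent with `urllib.request.urlopen(req)`; whether the server answers depends on how the target website is configured.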

### Reference and interesting links
[Official web page](https://www.cloudflare.com/en-gb/products/bot-management/)

@@ -19,3 +21,5 @@ Use [Wappalyzer Chrome Extension](https://github.com/reanalytics-databoutique/we
[A package for bypassing Cloudflare](https://github.com/Anorov/cloudflare-scrape): possibly obsolete, not updated in 2 years

[Firefox appears to be flagged as suspicious by Cloudflare](https://brianlovin.com/hn/31459258)

[High level description](https://www.zenrows.com/blog/bypass-cloudflare#what-is-cloudflare-bot-management)
