BACKGROUND
As of version 1.6.0, there are two (2) ways of adding the API keys:
1. via settings.py:
       CRAWLERA_APIKEY = 'apikey'
2. via a spider attribute:
       class SampleSpider(scrapy.Spider):
           crawlera_apikey = 'apikey'
When using Scrapy Cloud, we could also declare it via:
3. Spider/Project settings
4. the Scrapy Cloud Crawlera add-on
PROBLEM
What actually happens in practice is that the API keys get written into the code and committed to the repo.
The best practice is to keep sensitive keys decoupled from the code. Options 3 and 4 above (the Scrapy Cloud settings and the Crawlera add-on) already solve this, since they let us declare the keys only inside Scrapy Cloud.
However, this becomes a problem when running the spider locally during development, as the keys might not be available there.
OBJECTIVES
This issue aims to be a discussion ground for exploring better ways to handle this.
For starters, here are a couple of ways to approach it:
A. Set and retrieve the keys via environment variables.
B. Set and retrieve the keys via a local file that is not committed to the repo, similar to how SSH keys are stored in ~/.ssh and AWS keys in ~/.aws.
Either way, it should support different API keys per spider.
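To make the two approaches concrete, here is a minimal sketch of how a lookup could combine them. Nothing in it is part of the current package: the `resolve_apikey` helper, the `CRAWLERA_APIKEY_<SPIDER>` variable naming, and the `~/.crawlera/credentials` file with an `apikey` option per section are all assumptions made up for discussion.

```python
import configparser
import os
from pathlib import Path

# Hypothetical names: neither the variable prefix convention nor the file
# location exists in the package today; both are assumptions of this sketch.
ENV_PREFIX = "CRAWLERA_APIKEY"
DEFAULT_KEY_FILE = Path.home() / ".crawlera" / "credentials"


def resolve_apikey(spider_name, environ=None, key_file=DEFAULT_KEY_FILE):
    """Return the API key for spider_name, or None if none is configured."""
    environ = os.environ if environ is None else environ

    # Approach A: a per-spider variable takes precedence over the shared one.
    suffix = spider_name.upper().replace("-", "_")
    for var in (f"{ENV_PREFIX}_{suffix}", ENV_PREFIX):
        value = environ.get(var)
        if value:
            return value

    # Approach B: an uncommitted INI file with one section per spider and a
    # lowercase [default] section as the fallback (an ordinary section here,
    # not configparser's special uppercase DEFAULT section).
    parser = configparser.ConfigParser()
    parser.read(key_file)  # a missing file is silently ignored
    for section in (spider_name, "default"):
        if parser.has_option(section, "apikey"):
            return parser.get(section, "apikey")

    return None
```

A spider could then set `crawlera_apikey = resolve_apikey("sample")` instead of hardcoding the value, so the same code would work locally and on Scrapy Cloud. Checking the environment first is only one possible precedence order; the reverse may suit some workflows better.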