Make certificates from original web request response available in SplashRequest #289

Open
ned2 opened this issue Jun 27, 2021 · 1 comment


ned2 commented Jun 27, 2021

I'd like to record whether the response for the URL being scraped was served over SSL. The challenge is that, inside the parse method of the SplashResponse, the response.certificates attribute is populated with the SSL details of the Splash response rather than those of the original scraped URL response.

My understanding is that the magic_response=True param causes body, url and http_method attributes of the response object to be set to the values from the scraped URL response.

Is there currently a way to access the certificates attribute from the scraped URL response? Or would this need to be an extension of the magic_response functionality?

lopuhin (Contributor) commented Jun 28, 2021

> Is there currently a way to access the certificates attribute from the scraped URL response? Or would this need to be an extension of the magic_response functionality?

@ned2 I think the first step would be to make sure that the information you need is available in the Splash response: either you can fish it out of the HAR (see the har option at https://splash.readthedocs.io/en/stable/api.html#render-json), or you'll need to write a custom Lua script (see https://github.com/scrapy-plugins/scrapy-splash#examples and the Splash docs) and get this information from Splash. As I understand it, the information returned from Splash would be available in response.data even if the magic response is used.
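
For the HAR route, here is a minimal sketch of what the spider callback could do with the data. It assumes the request went to the render.json endpoint with har=1, so that response.data["har"] holds a standard HAR dict (log → entries → request.url). The helper name https_by_url is hypothetical, and it only reports whether each fetched URL used the https scheme; whether Splash's HAR carries any certificate details is something I have not verified.

```python
from urllib.parse import urlsplit


def https_by_url(har):
    """Map every URL requested in a HAR dict to True if its scheme is https.

    `har` is assumed to follow the standard HAR layout:
    {"log": {"entries": [{"request": {"url": ...}}, ...]}}
    """
    entries = har.get("log", {}).get("entries", [])
    result = {}
    for entry in entries:
        url = entry.get("request", {}).get("url")
        if url is None:
            # Skip malformed entries rather than guessing.
            continue
        result[url] = urlsplit(url).scheme.lower() == "https"
    return result


# Hand-written sample data for illustration only, not real Splash output.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "http://example.com/"}},
            {"request": {"url": "https://example.com/"}},
        ]
    }
}
schemes = https_by_url(sample_har)
```

In a spider this might be called as https_by_url(response.data["har"]) inside the callback of a SplashRequest created with endpoint="render.json" and args={"har": 1}. A redirect from http to https would show up as two entries, one for each URL.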
