When given a PDF URL, Reffy crashed because Puppeteer rejects such URLs.
This update makes Reffy (roughly) detect PDF URLs and silently skip them, returning empty extracts instead.
The detection of PDF URLs currently relies on them ending with `.pdf`, which is not ideal but avoids more complex code: network interception is done at the request stage, whereas detecting a PDF MIME type would require intercepting the response received from the server.

Internally, a PDF URL is represented as an empty HTML document, so nothing will be extracted. When the title cannot be extracted, title extraction now reuses the information from browser-specs, to avoid returning "No title found for X" when we actually have the info.
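
For illustration only, here is a minimal sketch of the approach described above. It is not the code from this pull request: the names `isLikelyPdfUrl`, `EMPTY_HTML` and `pickTitle` are hypothetical, and testing the URL path (rather than the raw URL string) for the `.pdf` suffix is an assumption made so that query strings and fragments are ignored.

```javascript
// Hypothetical helpers sketching the behavior described in this PR.
// None of these names come from Reffy itself.

// Assumed placeholder served in place of a PDF, so that extraction finds nothing.
const EMPTY_HTML = '<!DOCTYPE html><html><head></head><body></body></html>';

// Rough detection: treat a URL as a PDF when its path ends with ".pdf".
// This does not look at the MIME type, which is only known once a
// response has been received from the server.
function isLikelyPdfUrl(url) {
  try {
    const { pathname } = new URL(url);
    return pathname.toLowerCase().endsWith('.pdf');
  } catch {
    // Not a parseable absolute URL: do not treat it as a PDF.
    return false;
  }
}

// Title fallback: prefer the extracted title, then the title known from
// browser-specs (assumed here to be exposed as `specInfo.title`), and only
// then the "No title found" message.
function pickTitle(extractedTitle, specInfo, url) {
  return extractedTitle ||
    (specInfo && specInfo.title) ||
    `No title found for ${url}`;
}
```

In a request interception handler, a check such as `isLikelyPdfUrl(request.url())` could decide whether to answer with `EMPTY_HTML` instead of letting the request go through. How that is wired into Reffy's actual interception code is not shown here.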