Rvannest/PDF_Scraper

PDF_Scraper

I developed a tool that scans a PDF document for a specified keyword and extracts the 30 words before and the 30 words after it. The extracted passages show the context in which the keyword appears. In short, it is a PDF keyword scraper that returns readable text surrounding the keyword.
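The windowing step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `keyword_contexts`, the `window` parameter, and the case-insensitive matching are assumptions made for the example, and it works on text that has already been extracted from the PDF.

```python
import re


def keyword_contexts(text, keyword, window=30):
    """Return one snippet per keyword match (hypothetical helper).

    Each snippet holds up to `window` words before the match, the words
    that the keyword overlaps, and up to `window` words after them.
    """
    # Collapse line breaks and repeated spaces left over from PDF extraction.
    normalized = " ".join(text.split())
    key = " ".join(keyword.split())
    if not key:
        return []
    key_len = len(key.split())

    # The keyword must start at the beginning of a word. It may end inside
    # a word, so "provide $" matches "provide $10" (assumed behaviour).
    pattern = re.compile(r"(?<!\S)" + re.escape(key), re.IGNORECASE)

    snippets = []
    for match in pattern.finditer(normalized):
        before = normalized[:match.start()].split()
        before = before[-window:] if window > 0 else []
        tail = normalized[match.start():].split()
        snippets.append(" ".join(before + tail[:key_len + window]))
    return snippets


sample = (
    "The government will provide $10 million to housing "
    "and will also provide $5 million to transit."
)
for snippet in keyword_contexts(sample, "provide $", window=2):
    print(snippet)
```

With `window=30` the same function would return the 30-word context on each side; the small window here only keeps the sample output short.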

The keyword currently configured is "provide $". I tested it on the PDF version of Canada's Federal Budget 2023, where the tool extracts a concise list of areas to which the Canadian government is allocating additional funding.
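As a rough sketch of how the text might be pulled out of a PDF before searching for "provide $", the example below uses the third-party `pypdf` package. Which PDF library the repository actually uses is not stated, so `pypdf`, the helper names `read_pdf_pages` and `pages_mentioning`, and the file name in the usage comment are all assumptions.

```python
def read_pdf_pages(path):
    """Return the extracted text of every page in the PDF at `path`.

    Assumes the third-party `pypdf` package is installed. It is imported
    here so the rest of the example runs without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can yield nothing for image-only pages, so fall back to "".
    return [page.extract_text() or "" for page in reader.pages]


def pages_mentioning(pages, keyword):
    """Return 1-based page numbers whose text contains `keyword`.

    Whitespace is normalized first so a keyword split across a line
    break ("provide\n$20") is still found. Matching ignores case.
    """
    needle = " ".join(keyword.lower().split())
    hits = []
    for number, page_text in enumerate(pages, start=1):
        haystack = " ".join(page_text.lower().split())
        if needle in haystack:
            hits.append(number)
    return hits


# Hypothetical usage; "budget-2023.pdf" is a placeholder file name:
# pages = read_pdf_pages("budget-2023.pdf")
# print(pages_mentioning(pages, "provide $"))
```

Filtering by page first keeps the later context extraction limited to pages that actually contain the keyword, which matters for a document as long as a federal budget.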

About

PDF Scraper, Python Language
