NLP-Information-Extraction

Dependencies

- NLTK (punkt, averaged_perceptron_tagger)
- Tika

Usage

    python3 extract.py <pdf_file>
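
As a rough illustration of how the dependencies above might fit together, here is a minimal sketch. It is not the actual contents of extract.py. The function names (`ensure_nltk_resources`, `pdf_to_text`, `tag_sentences`, `parse_args`, `main`) are hypothetical, and the pipeline (Tika for text extraction, then NLTK tokenization and part-of-speech tagging) is an assumption inferred from the dependency list. Tika needs a Java runtime, and newer NLTK releases may expect differently named data packages than the two listed here.

```python
import sys

# NLTK data packages named under Dependencies.
NLTK_RESOURCES = ("punkt", "averaged_perceptron_tagger")


def ensure_nltk_resources(resources=NLTK_RESOURCES):
    """Download the NLTK data packages if they are not already present."""
    import nltk  # lazy import: the module loads even without NLTK installed

    for name in resources:
        nltk.download(name, quiet=True)


def pdf_to_text(pdf_path):
    """Extract plain text from a PDF using Tika (assumed role of Tika here)."""
    from tika import parser  # lazy import: requires tika and a Java runtime

    parsed = parser.from_file(pdf_path)
    # Tika may return None as content for PDFs with no extractable text.
    return parsed.get("content") or ""


def tag_sentences(text):
    """Split text into sentences, tokenize each, and attach POS tags."""
    import nltk

    return [
        nltk.pos_tag(nltk.word_tokenize(sentence))
        for sentence in nltk.sent_tokenize(text)
    ]


def parse_args(argv):
    """Return the PDF path from argv, or exit with a usage message."""
    if len(argv) != 2:
        raise SystemExit("usage: python3 extract.py <pdf_file>")
    return argv[1]


def main(argv):
    """Hypothetical entry point: PDF -> text -> tagged sentences on stdout."""
    pdf_path = parse_args(argv)
    ensure_nltk_resources()
    for tagged in tag_sentences(pdf_to_text(pdf_path)):
        print(tagged)
```

To run the sketch as a script, call `main(sys.argv)` under an `if __name__ == "__main__":` guard. The imports are deferred into the functions so that the argument handling can be checked without NLTK or Tika installed.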