import zipfile
from os import walk
from os.path import join, splitext


def extract_files_from_zipped_files(init_directory, extract_to_path, extension='.pdf'):
    """
    Function to extract files (.pdf by default) from zipped files
    :param init_directory: initial top-level directory to walk through
    :type init_directory: str
    :param extract_to_path: directory to extract files into
    :type extract_to_path: str
    :param extension: file extension of file type to extract, set to None to extract all files
    :type extension: str or None
    """
    for dirName, subdirList, fileList in walk(init_directory):  # iterate through files and all sub-directories
        for fileName in fileList:
            # .lower must be called; comparing the bound method to a string is always False
            if splitext(fileName)[1].lower() == '.zip':
                zip_file_path = join(dirName, fileName)
                with zipfile.ZipFile(zip_file_path, 'r') as z:
                    for info in z.infolist():
                        # skip directory entries stored in the archive (isdir() on a member
                        # name would check the local file system, not the archive)
                        if info.is_dir():
                            continue
                        if extension is None or splitext(info.filename)[1].lower() == extension.lower():
                            z.extract(info, path=extract_to_path)
It would be helpful to port the automated zip file extraction code from the previous version of this repo:
https://github.com/cutright/IMRT-QA-Data-Miner/blob/85abf9dc66a139c02574c386377f46f0944c5893/IQDM/utilities.py#L190-L208
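
For anyone picking this up, here is a rough sketch of how the ported function could be exercised. It is not the repo's implementation: the function body is a compact copy of the linked snippet with `.lower()` called and directory entries skipped via `ZipInfo.is_dir()`, and `demo`, the `patient_a` folder, and the archive contents are hypothetical names made up for illustration.

```python
import tempfile
import zipfile
from os import walk
from os.path import join, splitext
from pathlib import Path


def extract_files_from_zipped_files(init_directory, extract_to_path, extension='.pdf'):
    """Compact copy of the linked function, with .lower() called as a method."""
    for dir_name, _subdirs, file_list in walk(init_directory):
        for file_name in file_list:
            if splitext(file_name)[1].lower() != '.zip':
                continue
            with zipfile.ZipFile(join(dir_name, file_name), 'r') as z:
                for info in z.infolist():
                    if info.is_dir():  # skip directory entries inside the archive
                        continue
                    if extension is None or splitext(info.filename)[1].lower() == extension:
                        z.extract(info, path=extract_to_path)


def demo(extension='.pdf'):
    """Build a throwaway archive and return the sorted names of the extracted files."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        nested = Path(src) / 'patient_a'  # hypothetical sub-directory name
        nested.mkdir()
        # upper-case extension to check that archive detection is case-insensitive
        with zipfile.ZipFile(nested / 'reports.ZIP', 'w') as z:
            z.writestr('report.pdf', b'%PDF-1.4 placeholder')
            z.writestr('notes.txt', 'not a report')
        extract_files_from_zipped_files(src, dst, extension=extension)
        return sorted(p.name for p in Path(dst).rglob('*') if p.is_file())


if __name__ == '__main__':
    print(demo())
    print(demo(extension=None))
```

Note that `ZipFile.extract` recreates the member's path inside the archive under `extract_to_path`, so reports stored in sub-folders of a zip end up in matching sub-folders rather than in one flat directory.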