Automate zipfile extraction #23

cutright · 2021-03-12T01:46:26Z

It would be helpful to port the automated zip file extraction code from the previous version of this repo:

https://github.com/cutright/IMRT-QA-Data-Miner/blob/85abf9dc66a139c02574c386377f46f0944c5893/IQDM/utilities.py#L190-L208

from os import walk
from os.path import join, splitext
import zipfile


def extract_files_from_zipped_files(init_directory, extract_to_path, extension='.pdf'):
    """
    Function to extract .pdf files from zipped files
    :param init_directory: initial top-level directory to walk through
    :type init_directory: str
    :param extract_to_path: directory to extract pdfs into
    :type extract_to_path: str
    :param extension: file extension of file type to extract, set to None to extract all files
    :type extension: str or None
    """
    for dirName, subdirList, fileList in walk(init_directory):  # iterate through files and all sub-directories
        for fileName in fileList:
            # .lower must be called; comparing the bound method to a string is always False
            if splitext(fileName)[1].lower() == '.zip':
                zip_file_path = join(dirName, fileName)
                with zipfile.ZipFile(zip_file_path, 'r') as z:
                    for member in z.infolist():
                        # member.is_dir() checks the archive entry itself, not a path on disk
                        if not member.is_dir() and (extension is None or splitext(member.filename)[1].lower() == extension):
                            z.extract(member, path=extract_to_path)
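
For reference, here is a minimal sketch of what a port could look like using `pathlib`. It assumes the same behaviour as the function above (walk a directory tree, open every `.zip`, extract members matching `extension`). The name `extract_from_zips` and the returned list of extracted paths are hypothetical additions for illustration, not part of the existing IQDM code.

```python
import zipfile
from pathlib import Path


def extract_from_zips(init_directory, extract_to_path, extension=".pdf"):
    """Extract matching members from every .zip found under init_directory.

    Hypothetical helper: returns the list of paths that were written.
    Set extension to None to extract every file in each archive.
    """
    extracted = []
    # sorted() materializes the listing before anything is extracted
    for zip_path in sorted(Path(init_directory).rglob("*")):
        # Compare suffixes case-insensitively so "REPORTS.ZIP" is also found
        if not zip_path.is_file() or zip_path.suffix.lower() != ".zip":
            continue
        with zipfile.ZipFile(zip_path, "r") as z:
            for member in z.infolist():
                if member.is_dir():
                    continue  # directory entries carry no data
                suffix = Path(member.filename).suffix.lower()
                if extension is None or suffix == extension.lower():
                    # ZipFile.extract returns the path it wrote to
                    extracted.append(z.extract(member, path=extract_to_path))
    return extracted
```

Called as `extract_from_zips("reports", "extracted_pdfs")`, this would keep any folder structure stored inside each archive below `extracted_pdfs`. Whether the port should flatten those paths instead is an open design choice.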

cutright added the enhancement New feature or request label Mar 12, 2021
