support_TIF_classweights_and_exclusion #18
minh-doan
commented
Oct 21, 2017
- Support parsing TIF
- Support class_weights in model.py
- Allow exclusion when collecting samples for training/testing sets
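For context on the class-weight item: Keras accepts `class_weight` as a dictionary that maps class indices to weights. A minimal sketch of building such a dictionary with inverse-frequency weighting follows. The helper name `balanced_class_weights` and the weighting scheme are assumptions for illustration; this PR does not prescribe how the weights are computed.

```python
import collections


def balanced_class_weights(labels):
    """Map each class label to n_samples / (n_classes * class_count).

    Hypothetical helper: rare classes receive proportionally larger weights.
    """
    counts = collections.Counter(labels)
    n_samples = float(len(labels))
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }
```

The resulting dictionary could then be passed through as the `class_weight` argument that the PR adds to `fit`.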
@@ -44,14 +44,29 @@
    "--verbose",
    is_flag=True
)
def command(input, batch_size, directory, name, verbose):
@click.option(
    "--exclusion",
--exclude
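The option under review takes a comma-separated list of file-name prefixes. One way such a value could be turned into a file filter is sketched below. The helper names `parse_prefixes` and `exclude_by_prefix` are hypothetical and not part of deepometry, and the sketch assumes prefixes are matched against each file's base name.

```python
import os


def parse_prefixes(value):
    """Split a comma-separated option value into a list of clean prefixes."""
    if not value:
        return []
    # Drop surrounding whitespace and stray quote characters,
    # so that "'patient_A', 'patient_X'" yields two bare prefixes.
    prefixes = [item.strip().strip("'\"") for item in value.split(",")]
    return [prefix for prefix in prefixes if prefix]


def exclude_by_prefix(pathnames, prefixes):
    """Return the pathnames whose base name starts with none of the prefixes."""
    if not prefixes:
        return list(pathnames)
    return [
        pathname for pathname in pathnames
        if not os.path.basename(pathname).startswith(tuple(prefixes))
    ]
```

Keeping the parsing separate from the filtering means the command function only has to call `parse_prefixes` once on the raw option value.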
@click.option(
    "--exclusion",
    default=None,
    help="A comma-separated list of prefixes (string) specifying the files that needs to be helf off the testing dataset."
"helf" -> "held"
    "--exclusion",
    default=None,
    help="A comma-separated list of prefixes (string) specifying the files that needs to be helf off the testing dataset."
    " E.g., \"'patient_A', 'patient_X'\". All numpy arrays will be collected for testing if this flag is omitted."
"numpy arrays" -> "files"
Then we won't have to change the wording if/when the underlying data structure changes.
    " E.g., \"'patient_A', 'patient_X'\". All numpy arrays will be collected for testing if this flag is omitted."
)
@click.option(
    "--nsamples",
--samples would be okay, too.
    "This setting is useful to limit certain amount of datapoint to be displayed in unsupervised PCA/t-SNE plots."
    "All numpy arrays will be collected for testing if this flag is omitted."
)

Unnecessary newline.
deepometry/commands/command_parse.py
Outdated
# Check extension
pathnames = glob.glob(os.path.join(label_directories[0], "*"))
## TO_DO:
# how to make sure the label_directories[0] is not non-folder files? e.g. .DS_Store
You can use os.path.isdir to keep only directories and skip non-directory files such as .DS_Store.
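A minimal sketch of that filter, assuming the label directories sit directly under a single root directory. The function name `list_label_directories` is hypothetical.

```python
import os


def list_label_directories(root):
    """Return the sorted sub-directories of root, skipping plain files such as .DS_Store."""
    entries = (os.path.join(root, name) for name in os.listdir(root))
    return sorted(entry for entry in entries if os.path.isdir(entry))
```

With this in place, `label_directories[0]` is guaranteed to be a directory whenever the list is non-empty.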
deepometry/model.py
Outdated
@@ -101,7 +101,8 @@ def fit(self, x, y, batch_size=32, epochs=512, validation_split=0.2, verbose=0):
    "epochs": epochs,
    "steps_per_epoch": len(x_train) // batch_size,
    "validation_steps": len(x_valid) // batch_size,
    "verbose": verbose
    "verbose": verbose,
    "class_weight" : class_weight
"class_weight": class_weight
@@ -33,7 +37,7 @@ def parse(pathname, output_directory, size, channels=None):
    if ext == ".cif":
        return _parse_cif(pathname, output_directory, size, channels)

    raise NotImplementedError("Unsupported file format: {}".format(ext))
    raise NotImplementedError("Expected file format: {}".format(".CIF"))
I envisioned parse being a top-level function that delegates to different parse functions based on the file extension. E.g.,
def parse(pathname, output_directory, size, channels=None):
    ext = os.path.splitext(pathname)[-1].lower()
    if ext == ".cif":
        return _parse_cif(pathname, output_directory, size, channels)
    if ext == ".npy":
        return _parse_npy(pathname, output_directory, size, channels)
    raise NotImplementedError("Unsupported file format: {}".format(ext))
I see how it would be annoying to call parse for every .TIF image. Maybe the first argument (pathname) could take either a single file name or a list of file names.
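A rough sketch of what that could look like, assuming every file in one call shares the same extension. The two `_parse_*` functions here are placeholders that only echo their input, and the `_PARSERS` table and the list-based parser signature are assumptions for illustration, not the project's actual code.

```python
import os


def _parse_cif(pathnames, output_directory, size, channels):
    # Placeholder standing in for the real .cif parser.
    return ("cif", list(pathnames))


def _parse_tif(pathnames, output_directory, size, channels):
    # Placeholder standing in for a .tif parser that handles a batch of images.
    return ("tif", list(pathnames))


# Hypothetical lookup table from lower-cased extension to parser.
_PARSERS = {".cif": _parse_cif, ".tif": _parse_tif, ".tiff": _parse_tif}


def parse(pathname, output_directory, size, channels=None):
    """Dispatch on file extension; pathname may be one path or a list of paths."""
    pathnames = [pathname] if isinstance(pathname, str) else list(pathname)
    if not pathnames:
        raise ValueError("No files to parse.")

    extensions = {os.path.splitext(p)[-1].lower() for p in pathnames}
    if len(extensions) != 1:
        raise ValueError("Mixed file formats: {}".format(sorted(extensions)))

    ext = extensions.pop()
    if ext not in _PARSERS:
        raise NotImplementedError("Unsupported file format: {}".format(ext))
    return _PARSERS[ext](pathnames, output_directory, size, channels)
```

Normalizing a single path into a one-element list keeps each format-specific parser free of type checks, and unsupported formats still raise the original "Unsupported file format" error.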
deepometry/parse.py
Outdated
nested_filenames = []

for label in labels:
    # print("Parsing directory: {}".format(label))
This comment can be deleted.
deepometry/parse.py
Outdated
src_dir = os.path.join(src, label)

filenames = glob.glob("{}/*.tif".format(src_dir))
Use os.path.join to concatenate paths: os.path.join(src_dir, "*.tif")
I don't need glob.glob when using os.path.join(src_dir, "*.tif"), right?
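On that question: `os.path.join` only builds the pattern string, so `glob.glob` is still needed to expand the wildcard into actual file names. A short sketch of the two used together follows; the function name `find_tifs` is hypothetical.

```python
import glob
import os


def find_tifs(src_dir):
    """Return the sorted .tif files directly inside src_dir."""
    # os.path.join only produces a string such as "data/label/*.tif".
    pattern = os.path.join(src_dir, "*.tif")
    # glob.glob matches that pattern against the file system.
    return sorted(glob.glob(pattern))
```

Sorting is optional, but it makes the file order reproducible across platforms.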
There are some good changes here, Minh. Nice work! Let's work next on getting the existing tests passing:
Usurped by #22