Allow redefinition of DATASET_CSV_FILE_NAME #403
Comments
I've enabled user definition of the dataset CSV file name by allowing the parameter dataset_csv to be set as part of the model configuration. The parameter is initialised to DATASET_CSV_FILE_NAME in InnerEye/ML/config.py, and is then used to locate the dataset CSV file in InnerEye/ML/config.py, InnerEye/ML/run_ml.py and InnerEye/ML/utils/ml_util.py. I've added a unit test to Tests/ML/test_config_helpers.py. This seems to be all that's needed to set custom dataset CSV file names for model training. However, DATASET_CSV_FILE_NAME is also used to locate files in a number of other places, for example InnerEye/ML/baselines_util.py and InnerEye/ML/normalize_and_visualize_dataset.py. Should I submit a pull request for the changes made so far, since they provide the functionality requested in this issue, or should I first try to ensure that DATASET_CSV_FILE_NAME isn't used anywhere except as a fallback value?
@kh296 please open a pull request with the changes you described. As far as I can see, the other uses of DATASET_CSV_FILE_NAME are fine. Once I see the PR, I'll be able to judge more clearly anyway.
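For readers following the thread, the pattern discussed in these comments (a configurable dataset_csv parameter that falls back to DATASET_CSV_FILE_NAME) might look roughly like the sketch below. This is not the InnerEye code: SketchConfig, local_dataset and dataset_csv_path are hypothetical names chosen for illustration, and the constant is redefined locally so that the example runs on its own.

```python
from dataclasses import dataclass
from pathlib import Path

# Stand-in for InnerEye.ML.common.DATASET_CSV_FILE_NAME, redefined here
# so the sketch is self-contained.
DATASET_CSV_FILE_NAME = "dataset.csv"


@dataclass
class SketchConfig:
    """Hypothetical model configuration, not the real InnerEye config class."""

    # Folder that holds the dataset (hypothetical field name).
    local_dataset: Path
    # Name of the index file; defaults to the fallback constant.
    dataset_csv: str = DATASET_CSV_FILE_NAME

    def dataset_csv_path(self) -> Path:
        """Locate the index file via the parameter rather than the constant."""
        return self.local_dataset / self.dataset_csv


# Without an override, the fallback name is used.
default_config = SketchConfig(local_dataset=Path("datasets/example"))
# A model definition can point at a differently named index file.
custom_config = SketchConfig(
    local_dataset=Path("datasets/example"),
    dataset_csv="subset.csv",
)
```

The point of the sketch is that code locating the index file reads the parameter, so the constant only ever acts as the default value.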
The scans and structures to be used in a training run are specified in an index file, the name of which is given by the variable InnerEye.ML.common.DATASET_CSV_FILE_NAME. This is hardcoded to dataset.csv.
When trying to improve model performance, it can sometimes be useful to consider only a subset of patients and/or structures. This can be achieved by overwriting dataset.csv, or by adding a differently named index file and hacking InnerEye/ML/common.py. It would be nice instead to be able to set the value of DATASET_CSV_FILE_NAME in the model definition.
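As an illustration of the "differently named index file" route, a subset index could be produced with a small script along the following lines. This is only a sketch: write_subset_index is a hypothetical helper, and the column used for filtering is passed in as an argument because the layout of the index file is not described in this issue.

```python
import csv
from pathlib import Path
from typing import Iterable


def write_subset_index(source: Path, target: Path, column: str,
                       keep: Iterable[str]) -> int:
    """Copy the rows of `source` whose `column` value is in `keep` to `target`.

    Returns the number of data rows written. Hypothetical helper, not part
    of InnerEye.
    """
    keep_set = set(keep)
    with source.open(newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        if fieldnames is None or column not in fieldnames:
            raise ValueError(f"Column {column!r} not found in {source}")
        # Values are compared as strings, exactly as they appear in the file.
        rows = [row for row in reader if row[column] in keep_set]
    with target.open("w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

The original dataset.csv is left untouched, which avoids the overwriting approach mentioned above; the new file name would then need to be picked up by the training code.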
AB#3785