BIDS anonymize dataset unwanted behavior for split. #1211

Fayed-Rsl · 2023-12-14T18:00:06Z

Describe the problem

As a disclaimer, note that I am using mne_bids version 0.10 and BIDS version 1.6.0, so this issue might already have been solved in a later version, even though I haven't seen anyone report it before.

I am currently anonymizing a BIDS dataset and I found this very nice function to be extremely handy.
https://mne.tools/mne-bids/stable/generated/mne_bids.anonymize_dataset.html

Unfortunately, it does not handle split files correctly.
When a .fif file has to be split (above 2 GB), MNE splits it automatically, which is great. The problem occurs when the input is already split: the function then creates a split again.

For example this was my original filename:

sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01_meg.fif
sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif

After running the anonymization:

sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-01_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-02_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif (which is just a copy of the actual split)

I tried to fix this manually by renaming the files and removing the extra split that was created. I also corrected the resulting errors in scans.tsv.
In the end, I had a dataset that passed the BIDS validator.
Unfortunately, renaming the files this way created a bigger problem, because
split files should be renamed by loading and re-saving with MNE-Python to preserve proper filename linkage.

I assume this behavior is not intended; the output should probably be:

sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif
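
To check an anonymized dataset for this problem without renaming anything, the malformed names can be detected with a small filter. This is only a sketch using the standard library; the function name find_doubled_splits is hypothetical and not part of mne_bids.

```python
import re

# Matches a split entity whose value contains a second "split-NN",
# e.g. "_split-01-split-02_" as produced in the output above.
DOUBLED_SPLIT = re.compile(r"_split-\d+-split-\d+(?=_)")


def find_doubled_splits(basenames):
    """Return the basenames that contain a doubled split entity."""
    return [name for name in basenames if DOUBLED_SPLIT.search(name)]


names = [
    "sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-01_meg.fif",
    "sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-02_meg.fif",
    "sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif",
]
bad = find_doubled_splits(names)
for name in bad:
    print(name)
```

The filter only reports the affected files. It does not rename them, since renaming split files by hand breaks the filename linkage, as described above.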

Describe your solution

I haven't dug much into the function itself, but I guess the main idea would be to handle split files along these lines:

from mne_bids import BIDSPath, read_raw_bids

# get all the .fif files of the subject currently being anonymized
meg_files = BIDSPath(root=root, subject=subject, extension='.fif').match()
# separate the first and second split files
split1 = [f for f in meg_files if 'split-01' in f.basename]
split2 = [f for f in meg_files if 'split-02' in f.basename]
if len(split1) > 0 and len(split2) > 0:
    for s1, s2 in zip(split1, split2):
        raw = read_raw_bids(s1)  # read the raw file that we will save
        ...

I think this bug arises because the BIDS basename still contains the split-01 entity, which is why an additional split-01 ends up in the basename when the anonymized copy is created.
One idea is to drop the split entity from the path:
s1_basename_anon = s1.copy().update(split=None)

Then, instead of also saving the second split, we could just save the raw data under s1_basename_anon, which would immediately create the new splits with the correct linkage:

raw.save(s1_basename_anon.fpath, split_naming='bids')
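
The idea can be sketched a bit more completely as follows. This is a minimal sketch under stated assumptions, not what anonymize_dataset does internally: the helper names strip_split_entity and resave_first_split are hypothetical, the sketch does not anonymize anything, and it assumes that reading the first split with MNE-Python loads the remaining splits through the stored linkage, so that a single save with split_naming='bids' recreates all splits.

```python
import re
from pathlib import Path


def strip_split_entity(basename):
    """Remove every ``_split-<label>`` entity from a BIDS basename."""
    return re.sub(r"_split-[^_]+", "", basename)


def resave_first_split(first_split, out_dir, overwrite=False):
    """Load a split recording via its first file and save it under a split-free name.

    Assumption: MNE-Python follows the linkage from the first split to the
    others when reading, and writes new splits (with new linkage) on save.
    """
    import mne  # imported lazily so the string helper works without MNE installed

    first_split = Path(first_split)
    raw = mne.io.read_raw_fif(first_split)
    target = Path(out_dir) / strip_split_entity(first_split.name)
    # With split_naming="bids", MNE adds the split entity itself if the
    # data has to be split again.
    raw.save(target, split_naming="bids", overwrite=overwrite)
    return target


name = "sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01_meg.fif"
print(strip_split_entity(name))
# prints: sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_meg.fif
```

Passing a name without the split entity to the save step is the key point: the split label is then added only once, by MNE itself.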

Describe possible alternatives

Implementing or fixing this (if it has not been fixed already) would be really helpful and would save time when anonymizing BIDS datasets in the future.
Thank you

Additional context

No response


welcome · 2023-12-14T18:00:09Z

Hello! 👋 Thanks for opening your first issue here! ❤️ We will try to get back to you soon. 🚴🏽‍♂️

sappelhoff · 2023-12-14T18:45:05Z

As a disclaimer, note that I am using mne_bids version 0.10 and BIDS version 1.6.0, so this issue might already have been solved in a later version, even though I haven't seen anyone report it before.

Could you please try this with the most recent stable version of mne_bids (0.14) and see if it works?

Fayed-Rsl · 2023-12-15T15:43:44Z

Could you please try this with the most recent stable version of mne_bids (0.14) and see if it works?

I tried again with the most recent version of mne_bids (0.14), after upgrading with
pip install --upgrade mne-bids
but the unwanted behavior remains the same as described in my previous message.

Here is an image of my original folder: [image]

and the anonymized output: [image]

hoechenberger · 2024-02-28T11:26:54Z

We recently fixed a similar issue in MNE-BIDS-Pipeline: mne-tools/mne-bids-pipeline#855

Fayed-Rsl added the enhancement label Dec 14, 2023
