-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicitly exclude DICOM tags from anonymization #155
Comments
Hi, @medihack. Thanks for your interest! Currently, not from the command line. Programmatically, you could probably tear the There's no reason why we couldn't augment the tool (when using We'd have to come up with a reasonable interface. Specifying fields to omit by name or tag would probably be a reasonable first start, but I this makes me wonder whether people would eventually want to do so by VR, for example. |
Perhaps a so dicognito --skip-attribute StudyDate …
dicognito --skip-attribute 00080020 …
dicognito --skip-attribute StudyDate --skip-attribute StudyTime … I'm open to suggestions. Of course allowing some attributes to be skipped would perhaps open up other inconsistencies. For example, if StudyDate were not shifted, but we let dicognito do its work on PatientBirthDate, then the PatientAge would be incorrect, since we currently don't recalculate it. This can be documented, of course. |
Hi @blairconrad. Thank you for looking into this. Something like |
Thanks for the feedback, @medihack. Were you hoping to contribute these changes? |
Our team can have a look if you don't have the resources right now. But it will take some time, too. Maybe we can start with |
@lmdulz I think you are more into this than me; maybe we can have a look together. |
Cool. If you're interested, I'd had half an idea of being able to create a custom |
Hey all, for sure, @blairconrad and me had the same talk about a protector class already before. In my head this would be a simple Class similar to the |
True. I have no strong opinion on this. If an object had the capability to protect more than one Tag, it would still have the ability to protect one, so making the class that could protect many but using an individual object for each attribute mentioned on the command line would be fine by me. Whatever.
dicognito currently handles this out of the box, as we're careful to set the PatientBirthDate and the StudyDate (and Time) by the same delta. (In fact all DTs, TMs, and DTs are shifted by the same amount (within a particular invocation of dicognito, or across all runs if the same seed is used).) PatientAge was in my mind when I made this decision, but it was also to ensure, for example, that relevant prior studies remained prior studies relative the latest study. We wouldn't want to shift the current study back 18 months, but last week's study back 12 months. Then the later study becomes the earlier! |
Ah I think you got me wrong... I know about the consistency of time-shifting that dicognito currently does. I just wanted to point out with that second example, that caring about all cross-relations of tags might be unnecessary (and at some point out of scope?) ? Maybe it would be enough to get a warning message or put one in the readme but I'd say people using dicognito would know those cross-relations |
Ah, thanks for setting me straight. I think we are aligned.
I agree. I think it's important for dicognito, as it ships, to worry about retaining relationships between attributes. But this "preserve existing values" feature in some ways feels to me like advanced usage, where dicognito will provide for users a more flexible, powerful option for the user, but the user must then take responsibility to use that additional power well.
Yeah, I would expect the users to understand the relationships. I wouldn't even have dicognito itself emit a warning message (which could quickly become annoying to the users), but perhaps a very brief note in the command-line help for the option, and definitely someing more comprehensive in the docstring for the |
As a side note what I just remeberred bc the topic occurred months ago in my team: there is a Modified Attributes Sequence Attribute, which contains the modified attributes with their previous/ original values. However, I don't know how one could protect/ encrypt this sequence at this moment and it just came up to me - Did you know this before or is this a thing worth looking into for you? |
I think I misread your comment @lmdulz. You're interested in ensuring that we don't modify the Modified Attributes Sequence Attribute, not interested in creating one. (I don't understand the comment about encrypting the sequence.) Here's my initial thoughts: out of the box, I think it's appropriate for dicognito to anonymize the contents of the sequence. If we think we're obscuring original data, then I think obscuring the "original original" data also makes sense. Directives to skip anonymization of attributes based on name or tag or VR would apply to the objects within the sequence, so if we wanted to keep all StudyDates (for example) those would still be preserved. But if you're wondering about how to ensure we keep the values of StudyDate only when it's inside (or only when outside) a Modified Attributes Sequence, it gets trickier. I suppose a custom element anonymizer could walk the current dataset's ancestor chain (via parent_seq and parent_dataset) to see if the dataset were contained in such a sequence and then decide to anonymize or not. This may be getting into quite niche territory, though, and I'm not sure there's a need to include a command-line flag for this behaviour or to address it in this issue. Or maybe I'm wrong and there is a need. If so, convince me! |
No no, your initial thought was right, the idea was to create one. But I'm also not soo convinced by it. The part about encryption was manily bc I had in my mind one needs to encrypt the Modified Attributes Sequence Attribute, otherwise it's whole existence makes no sense for me. |
@lmdulz (or @medihack, I'm not sure who's doing what), I thought I'd check in on you. Is there anything I can do to help move your issue along? Design decisions you'd like to discuss? One thing that occurs to me is that Other notes: I did this morning merge an infrastructure change I'd had sitting in a pull request for a couple of weeks, so if you've an active branch, you'll eventually want to rebase on top of it. If you don't have an active branch, I've been mulling over the implementation and can probably have something for you to preview (maybe a finished product) by Monday. Your call. |
Unfortunately, not yet. Our workaround is relatively easy as we only use dicognito programmatically and can easily reset some pseudonymized elements to the original values afterward. So, it is not our highest priority right now, and we would be happy to have a look and use your implementation :-) |
Is there a way to explicitly exclude DICOM tags from anonymization? We use dicognito to anonymize exams for a trial, which works great, but we must retain some tags like
StudyDate
,SeriesDate
orAcquisitionDate
. Currently, I reset them programmatically after the anoynomization, but I wonder if there is an option to leave those tags untouched by dicognito.The text was updated successfully, but these errors were encountered: