Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Workflow ParcFodToConnectivity #33

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

sipv
Copy link
Collaborator

@sipv sipv commented Feb 24, 2017

No description provided.

@@ -31,8 +31,6 @@ def __new__(cls: type, name: str, bases: Tuple[type],
attrs: Dict[str, object]) -> type:
for key, value in list(attrs.items()):
if not key.startswith('_'):
value:
str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is valid syntax for local variable type declaration in Python 3.6; I think I wrote that just to convince PyCharm that value was a string. Still, I admit I have no idea why it's on two lines, and deleting these lines is fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, on one line it makes sense (this two line syntax is invalid under 3.6, at least according to the interpreter). I'll put it back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I don't know how the two line thing happened, honestly

scale_length = "-scale_length"
stat_edge = "-stat_edge"

class Assignment(metaclass=abc.ABCMeta):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you like this. It seems unnecessary verbose to me, but I couldn't think of a better way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<jealousy> Rust does this well; it's enum values can have associated data (cf). I wish they'd done that in Python's enum.</jealousy>

I might have tried just some classmethods

class Assignment:

    def __init__(self, suffix, *extras):
        self.flag = '-assignment_' + suffix
        self.extras = extras

    def to_args(self):
        return [self.flag] + list(self.extras)

    @classmethod
    def end_voxels(self):
        return cls(suffix='end_voxels')

    @classmethod
    def radial_search(self, radius):
        return cls(suffix='end_voxels', str(radius))

wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but again, verbosity when intially writing the code is OK if the result is more readable / maintainable. Since you're working on it, I let you choose approach you think is best.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slighly more elegant, thanks.

@sipv
Copy link
Collaborator Author

sipv commented Feb 28, 2017

I am running the convergence test for some exemplary data now, I'll add some results later.

@sipv
Copy link
Collaborator Author

sipv commented Feb 28, 2017

Also, /tmp in the virtual machine is easily filled with all the data generated. I guess either root partition should be expanded, or /tmp moved to /work.

@maedoc
Copy link
Member

maedoc commented Feb 28, 2017

You can set env vars to change the default temp dir cf tempfile module docs:

The default directory is chosen from a platform-dependent list, but the user of the application can control the directory location by setting the TMPDIR, TEMP or TMP environment variables.

@maedoc
Copy link
Member

maedoc commented Feb 28, 2017

but, we should probable have a method on the runner to allow changing the default dir.

@sipv
Copy link
Collaborator Author

sipv commented Feb 28, 2017

Re default temp dir: that's what I am doing, but having a large enough tmp (for the VM's indented purpose) by default seems like a good idea to me.

@maedoc
Copy link
Member

maedoc commented Feb 28, 2017

ah something like

rm /tmp
ln -s /work/tmp /tmp

?
I don't remember if /tmp has its own filesystem on Ubuntu.

@sipv
Copy link
Collaborator Author

sipv commented Feb 28, 2017

Yeah that should work. /tmp is on the same partition as /.

@sipv
Copy link
Collaborator Author

sipv commented Feb 28, 2017

I'll make the changes.

@sipv
Copy link
Collaborator Author

sipv commented Mar 2, 2017

Some results for example patient:

conv

  • Mean length matrices do not converge, probably because it is unscaled, and thus elements with low track count have the same weights as the high track count elements. This would apply to any other unscaled statistics of the track length distribution.
  • Connectome weights and num tracks matrices should actually be the same thing, so I am not sure where is the small difference coming from.
  • Relative convergence criterion ||A_n - A_{n-1}|| / ||A_n|| would be slightly easier to interpret:

conv_rel

  • If I had to set a convergence criterion, threshold value 0.01 for relative L2 norm of track count matrices would in this case give 3.2M tracks, which is roughly around the values used before.

@sipv
Copy link
Collaborator Author

sipv commented Mar 6, 2017

So the small difference mentioned above was from keeping (or not) the unassigned tracks.

@sipv sipv force-pushed the workflow-parc-fod-to-con branch from 9128e64 to 066bf0e Compare March 6, 2017 09:55
@maedoc
Copy link
Member

maedoc commented Mar 9, 2017

Convergence looks good. Based on that plot, I guess a cutoff of 0.001 would require ~30M tracks, right? In any case, I think 0.01 is fine.

Just to be careful, it may be useful to run this convergence check for the post-tcksift data as well? Or, is that what you did?

@sipv
Copy link
Collaborator Author

sipv commented Mar 9, 2017

Roughly 30M, yes.

Convergence is actually post-sift. Precisely, track count in the charts refers to the number of generated tracks, which are then reduced to 20% (or more if tcksift ends sooner).

@maedoc
Copy link
Member

maedoc commented Mar 9, 2017

Convergence is actually post-sift

👍 👍 hadn't read the code sorry


log.info('Generating additional %i tracks' % ntracks)
tracks_b = runner.tmp_fname('tracks_B_N%s.tck' % ntracks)
runner.run(mrtrix.run_tckgen(self.fod, tracks_b, ntracks, seed_gmwmi=self.gmwmi, act=self.ftt))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't thought enough about this. This indirection of using a runner isn't sophisticated enough to describe things like convergence.

I am starting to doubt the wisdom of this idea. It would be drastically simpler to just run the commands directly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think it's useful to see how it converges, but it is not really necessary considering the complexity it adds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't clear: I want to keep the convergence and drop this "Runner" pattern in the whole library. It makes things super complicated on the software side. On the science side, it is a very nice thing to be able to say that the connectome estimate converged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was just looking going over this code and for the runner; I was wrong about that. Those commands are run immediately upon being passed to runner.run, so this works fine.

@maedoc
Copy link
Member

maedoc commented Mar 24, 2017

Do you have other commits on the way, or should I pick this up?

@sipv
Copy link
Collaborator Author

sipv commented Mar 24, 2017

One thing I haven't done is to actually include a call to all this in the main.sh script. But that is something I don't want to do right now, because main.sh does not work fully automatically for me and therefore I cannot test if it works. So I would wait with this for the testing framework to be set up first.

@maedoc
Copy link
Member

maedoc commented Mar 28, 2017

I don't think main.sh works on anyone's computer except Denis' so is not a criteria for excluding merging this code. Testing is delayed until I can assure that GitLab can be backed up (and won't be deleted by admins), end of the week or next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants