Plotting variability around line in ft_singleplotER #1558

Open
vitpia opened this issue Oct 7, 2020 · 8 comments

Comments

@vitpia
Contributor

vitpia commented Oct 7, 2020

The new COBIDAS-MEEG best practices suggest 'If any form of averaging is performed, the variability should also be depicted'.
It would be nice if ft_singleplotER supported that. There are workarounds to get variability into a figure, but if ft_singleplotER implemented it directly, I'm sure depicting variability would become more common practice than it is now, so we'd actually practise what the best practices suggest :)

The most common situation I can imagine:

  • User has subj_chan_time data (or subj_chan_freq, or other things that can be plotted with singleplotER)
  • Options for variability around the mean (standard deviation, standard error of the mean, confidence interval)
  • The mean and variability for data1 are shown as line and shaded areas, respectively; same for data2
cfg.variability = 'std'; % 'se', 'ci'
ft_singleplotER(cfg, data1, data2) 
  • Ideally, if you drag over channels to create a new ft_singleplotER averaged over those channels, it would also recalculate the variability, like it does now for the mean. @Spaak pointed out that the issue "with re-computing variances on the fly is that the variance of the mean of some channels is not the same as the mean of the per-channel variances. So, we'd have to average over channels first, and then re-compute variance. This is quite different from plotting a field that's already in the data (.var, which like you say is computed by ft_timelockgrandaverage (or ft_timelockanalysis))."

That means there are at least two convenient ways to go about this:

  • ft_singleplotER calculates the variability once when it's called; if the user then drags over channels to create a new ft_singleplotER, the variability is no longer shown (or, as Eelke suggested, it has to be recalculated);
  • The user has to go through ft_timelock* to calculate .var.
    • If .var is detected in the structure, ft_singleplotER can plot it
    • otherwise it throws a warning that the user needs to run through ft_timelock* first
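
The point about re-computing variance after channel selection can be checked numerically. The following is a minimal sketch in plain MATLAB on synthetic data; the array layout (subjects x channels x time) and all variable names are assumptions for illustration, not FieldTrip structures or functions.

```matlab
% Synthetic data: nsubj x nchan x ntime (sizes and values are made up)
rng(0);
nsubj = 12; nchan = 4; ntime = 50;
data = randn(nsubj, nchan, ntime);

% (a) average over channels first, then take the variance over subjects
chanavg     = mean(data, 2);                   % nsubj x 1 x ntime
var_of_mean = squeeze(var(chanavg, 0, 1));     % ntime x 1

% (b) variance over subjects per channel, then average over channels
perchan_var = var(data, 0, 1);                 % 1 x nchan x ntime
mean_of_var = squeeze(mean(perchan_var, 2));   % ntime x 1

% The two quantities differ; (a) is the one that matches the plotted channel average
fprintf('mean over time of (a): %.3f, of (b): %.3f\n', mean(var_of_mean), mean(mean_of_var));
```

With uncorrelated channels, (a) shrinks roughly with the number of averaged channels while (b) does not, which is why a stored per-channel .var cannot simply be averaged after an interactive channel selection.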
@robertoostenveld
Member

Following COBIDAS is a good idea in general.

Some implementation challenges I see are that

  • in case the input contains trials rather than an average, multiplot and singleplot will first average over trials
  • computing STD/SE/CI over channels (in singleplot) is different from computing the pooled variance and then STD/SE/CI

The second is also what ft_timelockgrandaverage can deal with, especially for averaging over subjects, or averaging over conditions within a subject with different numbers of trials.

I wonder whether singleplot/multiplot should do the computations. When averaging over channels and using the across-channels variance, that is unavoidable, since the user selects channels interactively. But when using multiplot, or singleplot for a single channel, another possibility is that (besides avg, var and dof) the std, sem and/or ci are part of the input data structure. That keeps the responsibility for the computation clearer, out of the plotting function, and gives you the chance to do something else with the values as well. Note that this is also more in line with prob and mask (after timelockstats), which can be used for highlighting significant regions.
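
As a rough illustration of keeping the computation outside the plotting function, the sketch below derives std, sem and a confidence-interval half-width from avg, var and dof. It uses a hand-made structure rather than real FieldTrip output; the std, sem and ci field names follow the suggestion above and are not existing FieldTrip fields. It also assumes that dof holds the number of observations per sample, and it uses a normal approximation for the interval to avoid a toolbox dependency.

```matlab
% Hypothetical timelock-like structure (chan x time); all values are made up
timelock.time = linspace(-0.2, 0.8, 101);
timelock.avg  = sin(2*pi*2*timelock.time);          % 1 x 101 toy ERP
timelock.var  = 0.5 * ones(size(timelock.avg));     % variance over observations
timelock.dof  = 20  * ones(size(timelock.avg));     % assumed: number of observations per sample

% Derived variability measures, stored next to avg/var/dof
timelock.std = sqrt(timelock.var);                  % standard deviation
timelock.sem = sqrt(timelock.var ./ timelock.dof);  % standard error of the mean

% 95% interval half-width; z = 1.96 is a normal approximation,
% a t quantile would be more appropriate for small dof
z = 1.96;
timelock.ci = z * timelock.sem;

% Lower and upper bounds that a plotting function could draw as a band
lower = timelock.avg - timelock.ci;
upper = timelock.avg + timelock.ci;
```

Because the values live in the data structure, they can be inspected or exported independently of any figure.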

@robertoostenveld
Member

See also ft_multiplotER options

%   cfg.maskparameter = field in the first dataset to be used for marking significant data
%   cfg.maskstyle     = style used for masking of data, 'box', 'thickness' or 'saturation' (default = 'box')
%   cfg.maskfacealpha = mask transparency value between 0 and 1

@vitpia
Contributor Author

vitpia commented Oct 7, 2020

> I wonder whether singleplot/multiplot should do the computations.

Agree, and that would solve lots of other issues as well.

> See also ft_multiplotER options

This means effectively:

  • use ft_timelock* to calculate .var
  • use cfg parameters mask* for plotting the already computed .var

I will test this and report back (with a FAQ if it works well).

@Spaak
Member

Spaak commented Oct 7, 2020

Taking this to its extreme (but logical) conclusion, we could generalize ft_selectdata's ability to avgoverfreq, avgoverchan, etc. into something like:

cfg = [];

cfg.aggregate = [];
cfg.aggregate(1).dim = 'chan';
cfg.aggregate(1).operation = 'mean';
cfg.aggregate(2).dim = 'rpt';
cfg.aggregate(2).operation = 'var';

dat_aggr = ft_selectdata(cfg, data);

where I'm deliberately making the order of aggregation explicit. (I can imagine other APIs like cfg.aggregate = {'chan', 'mean', 'rpt', 'var'}.)
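
To make the ordering argument concrete, here is a small sketch of how such an ordered aggregation could behave on a plain array. This is not ft_selectdata and not an existing FieldTrip option; the dimord cell array, the aggregate struct array and the loop are hypothetical stand-ins for the proposed cfg.aggregate.

```matlab
% Toy data with labelled dimensions (hypothetical layout, not a FieldTrip structure)
rng(1);
dat    = randn(10, 5, 30);             % rpt x chan x time
dimord = {'rpt', 'chan', 'time'};

% Ordered list of aggregation steps, mirroring the proposed cfg.aggregate
aggregate = struct('dim', {'chan', 'rpt'}, 'operation', {'mean', 'var'});

out = dat;
for k = 1:numel(aggregate)
  d = find(strcmp(dimord, aggregate(k).dim));   % array dimension to collapse
  switch aggregate(k).operation
    case 'mean'
      out = mean(out, d);
    case 'var'
      out = var(out, 0, d);
    otherwise
      error('unsupported operation: %s', aggregate(k).operation);
  end
end
out = squeeze(out);   % collapsed dimensions are singleton until here, so indices stay valid
```

Swapping the two steps would compute the mean over channels of the per-channel variances instead, which is exactly the distinction that makes an explicit order useful.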

@robertoostenveld
Member

Something to aim for: two ERPs in different conditions, each with a band of uncertainty around it, combined with a grey box in the background that shows the time range in which the difference is significant. Or a single-condition ERP (versus zero) with uncertainty and a significance/highlight mask.
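
A figure of this kind can be mocked up in plain MATLAB to see what the target looks like. The sketch below uses synthetic subject-level data and built-in graphics functions only; it does not call ft_singleplotER, and the "significant" window is a hard-coded placeholder rather than the result of a statistical test.

```matlab
% Synthetic subject-level ERPs for two conditions (nsubj x ntime); values are made up
rng(2);
time  = linspace(-0.2, 0.8, 201);
nsubj = 15;
bump  = exp(-((time - 0.3) / 0.08).^2);                      % toy evoked component
erp1  = repmat(bump, nsubj, 1)       + 0.3 * randn(nsubj, numel(time));
erp2  = repmat(0.4 * bump, nsubj, 1) + 0.3 * randn(nsubj, numel(time));

% Mean and standard error of the mean over subjects
m1 = mean(erp1, 1); s1 = std(erp1, 0, 1) / sqrt(nsubj);
m2 = mean(erp2, 1); s2 = std(erp2, 0, 1) / sqrt(nsubj);

sigwin = [0.25 0.35];   % placeholder window, would normally come from a statistical mask
yl     = [-0.5 1.5];

figure; hold on;
% Grey box in the background marking the (placeholder) significant time range
patch(sigwin([1 2 2 1]), yl([1 1 2 2]), [0.85 0.85 0.85], 'EdgeColor', 'none');
% Shaded uncertainty bands, drawn before the lines so the means stay visible
fill([time fliplr(time)], [m1 - s1, fliplr(m1 + s1)], 'b', 'FaceAlpha', 0.2, 'EdgeColor', 'none');
fill([time fliplr(time)], [m2 - s2, fliplr(m2 + s2)], 'r', 'FaceAlpha', 0.2, 'EdgeColor', 'none');
plot(time, m1, 'b', 'LineWidth', 1.5);
plot(time, m2, 'r', 'LineWidth', 1.5);
ylim(yl); xlabel('time (s)'); ylabel('amplitude (a.u.)');
```

The band and the grey box are drawn as separate graphics objects here, so either can be shown without the other.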

@vitpia
Contributor Author

vitpia commented Oct 7, 2020

> something to aim for: two ERPs, in different conditions, both with a band of uncertainty around it, combined with a grey box in the background that shows the time range in which the difference is significant. Or a single condition ERP (versus zero) with uncertainty and significance/highlight mask.

In both cases, the variability would be computed for one specific condition, not necessarily in relation to a significance statement about the data. I can also imagine situations in which one would want to see the variability regardless of whether there's something significant to be plotted. So I tend to see these as independent of each other (but maybe you mean what one should ultimately be able to achieve in a figure).

@robertoostenveld
Member

Yes, I meant that the current "mask" handling (for significance) should stay as is, and that the variance should be implemented separately, not as another cfg.maskstyle.

@vitpia
Contributor Author

vitpia commented Oct 7, 2020

> Taking this to its extreme (but logical) conclusion, we could generalize ft_selectdata's ability to avgoverfreq, avgoverchan, etc. into something like:
>
> cfg = [];
>
> cfg.aggregate = [];
> cfg.aggregate(1).dim = 'chan';
> cfg.aggregate(1).operation = 'mean';
> cfg.aggregate(2).dim = 'rpt';
> cfg.aggregate(2).operation = 'var';
>
> dat_aggr = ft_selectdata(cfg, data);
>
> where I'm deliberately making the order of aggregation explicit. (I can imagine other APIs like cfg.aggregate = {'chan', 'mean', 'rpt', 'var'}.)

Could be nice, but maybe it's more than would actually be used? I don't know how much harder this is to implement than a simpler option...
