qqmath.permustats and reference distribution #172

Open
jarioksa opened this issue Apr 28, 2016 · 4 comments
jarioksa commented Apr 28, 2016

The qqmath function accepts a distribution argument that supplies the expected quantiles for the Q-Q plot. By default the permutations are plotted against the Gaussian distribution (qnorm), but this is a bad choice in several cases. For instance, example(permutest.betadisper) gives:
[Figure: qqnow]
One panel shows the F-statistic and another shows the t-statistic, which have different scales and shapes. It is possible to have separate scales for the panels (scales = list(relation = "free")), but the shapes still differ. It is also possible to give the distribution used for the theoretical quantiles (horizontal axis), but doing this separately for each panel is more tedious (see ?panel.qqmath), and getting the scaling right is even more cumbersome (I think this would need writing a prepanel function). What we would actually like to get is something like this:
[Figure: qqexpect]
Here F is plotted against the F-distribution and t against the t-distribution, both with the correct degrees of freedom and with separate scaling. This is much better than the current plot for diagnostic purposes.

I didn't find a nice way to do this with qqmath, and the quick graph above is based on the idea xyplot(permuted ~ expected | statistic, scales = list(relation = "free"), abline = c(0,1)).
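
The xyplot idea above can be sketched roughly as follows. This is only an illustration under assumptions: the permuted values are simulated with rf and rt as stand-ins for real permustats output, and the degrees of freedom (2 and 27 for F, 18 for t) as well as the names perm_F, perm_t and qq are made up for the example.

```r
library(lattice)

set.seed(1)
n <- 999
## Stand-ins for permutation results (assumption: real values would come
## from a permustats object, not from rf()/rt()).
perm_F <- rf(n, df1 = 2, df2 = 27)
perm_t <- rt(n, df = 18)

## Pair the sorted permuted values with the quantiles of the matching
## reference distribution at the same plotting positions.
qq <- rbind(
    data.frame(statistic = "F",
               permuted = sort(perm_F),
               expected = qf(ppoints(n), df1 = 2, df2 = 27)),
    data.frame(statistic = "t",
               permuted = sort(perm_t),
               expected = qt(ppoints(n), df = 18)))

## One panel per statistic, each with its own axis scales and a 1:1 line.
p <- xyplot(permuted ~ expected | statistic, data = qq,
            scales = list(relation = "free"),
            abline = c(0, 1))
print(p)
```

Points close to the 1:1 line in a panel suggest that the permuted values follow the reference distribution chosen for that statistic.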

To do this in general, we should

  1. Add an item defining the reference distribution and its parameters (degrees of freedom etc.) in the permustats method when this makes sense. The default would be to have nothing and rely on qqmath using the Gaussian distribution. I think the new item should be given for the betadisper, adonis and anova.cca methods.
  2. Rewrite qqmath.permustats to use the reference distribution for each panel when one is given. This is the harder part to get right.
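
One possible shape for step 1, as a rough sketch only. The names make_stats and reference_quantile, the class permustats_sketch and the distribution item are all hypothetical and not part of vegan; the real permustats object may be organised differently. The degrees of freedom are made up for the example.

```r
## Hypothetical constructor (NOT vegan API): a permustats-like list with an
## assumed extra `distribution` item, one quantile function per statistic.
make_stats <- function(statistic, permutations, distribution = NULL) {
    structure(list(statistic = statistic,
                   permutations = permutations,
                   distribution = distribution),
              class = "permustats_sketch")
}

## Hypothetical lookup: return the quantile function stored under `name`,
## or qnorm (the current default) when nothing was supplied.
reference_quantile <- function(x, name) {
    q <- x$distribution[[name]]   # NULL if the item or the name is missing
    if (is.null(q)) qnorm else q
}

set.seed(1)
x <- make_stats(
    statistic = c(F = 1.5, t = 0.8),
    permutations = cbind(F = rf(199, 2, 27), t = rt(199, 18)),
    distribution = list(F = function(p) qf(p, df1 = 2, df2 = 27),
                        t = function(p) qt(p, df = 18)))

qF <- reference_quantile(x, "F")          # F quantile function with fixed df
qdefault <- reference_quantile(make_stats(1, matrix(0)), "F")  # falls back
```

Storing closures with the degrees of freedom already fixed keeps the plotting code unaware of which distribution family it is using.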

@jarioksa

Commits d2cbd4e and 307bece show how users should cope with conflicting scaling.


gavinsimpson commented Aug 17, 2016

Even with your notes in the examples, the plots for permutest.betadisper's permustats aren't nice. panel.qqmath has the following example:

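     library(lattice)  # provides qqmath(), panel.qqmath() and panel.number()

     set.seed(0)
     xx <- rt(10000, df = 10)  # t-distributed sample, compared against qnorm
     ## The single panel is repeated three times via [c(1,1,1)]; the panel
     ## function switches on panel.number() to vary how quantiles are thinned.
     qqmath(~ xx, pch = "+", distribution = qnorm,
            grid = TRUE, abline = c(0, 1),
            xlab.top = c("raw", "ppoints(100)", "tails.n = 50"),
            panel = function(..., f.value) {
                switch(panel.number(),
                       panel.qqmath(..., f.value = NULL),
                       panel.qqmath(..., f.value = ppoints(100)),
                       panel.qqmath(..., f.value = ppoints(100), tails.n = 50))
            }, layout = c(3, 1))[c(1,1,1)]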

So, we could easily modify qqmath.permustats to do the right thing assuming

  1. we know which distribution to use for each statistic - so we need to pass this information along in the permustats object,
  2. we make sure we preserve the ordering of the panels and the distributions/statistics.

I'll have a go at implementing this for permutest.betadisper and do a PR when I've gotten somewhere so we can discuss implementation and how to fit this on to the other object types that permustats understands.

gavinsimpson self-assigned this Aug 17, 2016

gavinsimpson commented Aug 17, 2016

I think a custom panel function that accepts a list of distribution functions and calls panel.qqmath(..., distribution = distribs[[panel.number()]]) might suffice...*

* == famous last words
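
A minimal sketch of such a panel function, under several assumptions: the data are simulated rather than taken from permustats, the degrees of freedom are made up, and distribs is a hypothetical extra argument that qqmath simply passes on to the panel function. The sketch uses packet.number() instead of panel.number() so that the lookup follows the conditioning level even if the panels are reordered. As far as I can tell, the default prepanel calculation still uses qnorm, so the x limits are supplied per panel through xlim instead of through a custom prepanel function.

```r
library(lattice)

set.seed(1)
n <- 999
## Simulated stand-ins for permuted F and t statistics (assumption).
dat <- data.frame(
    value = c(rf(n, df1 = 2, df2 = 27), rt(n, df = 18)),
    statistic = factor(rep(c("F", "t"), each = n), levels = c("F", "t")))

## One quantile function per statistic, in the order of the factor levels.
distribs <- list(F = function(p) qf(p, df1 = 2, df2 = 27),
                 t = function(p) qt(p, df = 18))
distribs <- distribs[levels(dat$statistic)]

## Per-panel x limits from each reference distribution.
xlims <- unname(lapply(distribs, function(q) range(q(ppoints(n)))))

p <- qqmath(~ value | statistic, data = dat,
            distribs = distribs,              # hypothetical extra argument
            scales = list(relation = "free"),
            xlim = xlims,
            ## `distribution` is captured here only to drop qqmath's default
            ## (qnorm) so it is not passed twice to panel.qqmath().
            panel = function(x, distribution, distribs, ...) {
                panel.qqmath(x, distribution = distribs[[packet.number()]], ...)
                panel.abline(0, 1)
            })
print(p)
```

Because distribs has to hold functions, it is a list and is indexed with [[ ]] rather than [ ].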

@jarioksa

I think permustats could add information on the expected distribution function (with its degrees of freedom), and if nothing was supplied, the current default would be used. Such information is not available in general, but in some cases it is.
