Add an option to do permutations that are assignments of n observations to k groups #5

gavinsimpson · 2015-05-13T03:59:46Z

This is a different problem from the one permute currently tackles. It pertains to the Golden Jackals example: even though there are only 184,756 useful permutations, numPerms() will report a far higher value because we are just randomly shuffling the data with respect to the grouping variable. With sufficient data this shouldn't be a problem, and duplicate permutations are unlikely to crop up often. However, it would be useful to include this as a choice.
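To make the gap concrete, here is a minimal base-R sketch. It assumes two equal groups of 10 observations, which is consistent with the 184,756 figure quoted above; it does not call numPerms(), and the variable names are only illustrative.

```r
# Assumed layout: two groups of 10 observations each (20 in total).
n_per_group <- 10
n <- 2 * n_per_group

# Distinct ways to choose which 10 observations form the first group.
n_assignments <- choose(n, n_per_group)

# Orderings of all 20 observations when they are shuffled freely.
n_orderings <- factorial(n)

n_assignments                 # 184756
n_orderings / n_assignments   # orderings that map to the same assignment
```

Each distinct assignment is produced by 10! * 10! different orderings, which is why a count based on free shuffling is so much larger.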


jarioksa · 2015-08-26T07:15:37Z

This is related to vegan issue vegandevs/vegan#132: with class variables, several unique permutations replicate the original classes (i.e. observations were permuted within the same classes). A consequence is that the minimum possible P-value is higher than 1/(nperm+1), because some permutations necessarily replicate the original allocation. However, with unequal class sizes, the probability of shuffling within one class level varies among levels. Random permutation disregarding any classification takes care of unequal classification probabilities and also correctly shows the effect on the minimum possible P-value.
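As a rough illustration of the effect on the minimum P-value, the sketch below computes the chance that one free shuffle reproduces the original class allocation. The helper `p_same()`, the grouping vector `g` and `nperm` are hypothetical names introduced here, and the last line is only a back-of-envelope expectation, not something vegan or permute reports.

```r
# Probability that a single free shuffle leaves every observation in a
# position with the same class label: prod(n_i!) / n!
p_same <- function(g) exp(sum(lfactorial(table(g))) - lfactorial(length(g)))

# Hypothetical grouping with unequal class sizes (2 and 4).
g <- rep(c("a", "b"), times = c(2, 4))
p_same(g)   # 2! * 4! / 6! = 1/15

# With nperm random permutations, about nperm * p_same(g) of them are
# expected to replicate the original allocation, so the smallest
# attainable P-value is expected to sit near this value rather than
# at 1 / (nperm + 1).
nperm <- 999
(1 + nperm * p_same(g)) / (nperm + 1)
```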

gavinsimpson · 2019-01-09T19:10:19Z

Seems like a suitable/correct algorithm is described on CrossValidated which we can implement, and combn() probably gives what we need for allPerms().

Now just to implement it and think about how to expose it given the current interface...
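For the two-group case, a minimal sketch of what combn() could provide is shown here. This is not how allPerms() is implemented; the group labels "A" and "B" and the sizes are assumptions chosen to keep the output small.

```r
n <- 6   # number of observations (assumed)
k <- 3   # size of the first group (assumed)

# Each column of idx holds the indices of the observations assigned to "A".
idx <- combn(n, k)

# Turn every index set into a full label vector of length n.
assignments <- apply(idx, 2, function(i) {
  g <- rep("B", n)
  g[i] <- "A"
  g
})

ncol(idx)              # choose(6, 3) = 20 distinct assignments
t(assignments)[1:3, ]  # first three assignments, one per row
```

This enumerates assignments to two groups only; more than two groups would need a different approach.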

jarioksa · 2019-01-10T11:23:48Z

I don't think this CrossValidated question answers the same problem. It tells you how to do sample(n,k) when k < n, but this does not guarantee unique groups. Moreover, R already has sample(n,k).
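A small simulation makes the point about uniqueness. The sizes, the seed and the number of draws are arbitrary choices for illustration.

```r
set.seed(1)
n <- 6
k <- 3

# Draw 1000 random subsets and record each one as a sorted key.
draws <- replicate(1000, paste(sort(sample(n, k)), collapse = "-"))

n_distinct <- length(unique(draws))
n_distinct               # cannot exceed choose(6, 3) = 20
sum(duplicated(draws))   # all remaining draws repeat an earlier subset
```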

Assume we have six observations with factor values A,B,B,C,C,C. We have 6! = 720 permutations of the six observations, but only 6!/(2! 3!) = 60 distinct sequences of these three values (A,B,C).
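The count of 60 can be checked by brute force. The recursive helper `distinct_seqs()` below is a hypothetical sketch written for this check, suitable only for very small vectors, and is not part of permute or vegan.

```r
# Enumerate all distinct orderings of a character vector with repeated values.
distinct_seqs <- function(a) {
  if (length(a) <= 1L) return(list(a))
  out <- list()
  for (v in unique(a)) {
    # Remove one occurrence of v and recurse on the remainder.
    rest <- a[-match(v, a)]
    for (rest_seq in distinct_seqs(rest)) {
      out[[length(out) + 1L]] <- c(v, rest_seq)
    }
  }
  out
}

a <- c("A", "B", "B", "C", "C", "C")
seqs <- distinct_seqs(a)
length(seqs)   # 60
```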

The distinct sequences are easily exhausted only in small data sets, but there they can be disturbing. Here is a function to estimate the number of distinct sequences of vector a (presumably a factor):

ndistseq <- function(a) exp(lfactorial(length(a)) - sum(lfactorial(table(a))))
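A short usage sketch follows; the function is repeated so the block runs on its own. Because it works on the log scale, the result is a floating-point value and may need rounding before being compared with an integer count. The two-group sizes in the second call are arbitrary.

```r
ndistseq <- function(a) exp(lfactorial(length(a)) - sum(lfactorial(table(a))))

# The six-observation example: class sizes 1, 2 and 3.
a <- factor(c("A", "B", "B", "C", "C", "C"))
round(ndistseq(a))   # 60

# With two groups the count reduces to a binomial coefficient.
g <- rep(c("x", "y"), times = c(3, 5))
round(ndistseq(g))   # same as choose(8, 3)
choose(8, 3)
```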

gavinsimpson · 2019-01-10T14:03:49Z

Hmm, I need to revisit my thinking then; when I was playing with this for a two group example it was doing what we needed, but perhaps that was due to the simplicity of the example I was working with...?

gavinsimpson added the enhancement label May 13, 2015

gavinsimpson self-assigned this Jan 9, 2019
