Bootstrap resampling from an arbitrary number of distributions? #51

harryscholes · 2019-06-21T09:51:14Z

Implementing this functionality would allow you to do things like calculate the mean difference between two distributions:

using Statistics

xs = rand(1000)
ys = rand(1000)

bootstrap((x,y)->mean(x)-mean(y), (xs, ys), BasicSampling(1000)) # mock API

Which would be equivalent(ish) to the following (minus the nice extras that Bootstrap.jl provides):

using Statistics, StatsBase

map(_->mean(sample(xs, size(xs)))-mean(sample(ys, size(ys))), 1:1000)
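For illustration, the idea can be sketched in plain Julia using only the standard library. This is a rough sketch, not Bootstrap.jl's API: `multi_bootstrap` is a hypothetical helper name, and it assumes each input is resampled independently with replacement at its original length.

```julia
using Random, Statistics

# Hypothetical helper (not part of Bootstrap.jl): resample every input
# independently with replacement and apply `f` to the resampled inputs.
function multi_bootstrap(f, samples::Tuple, n::Integer; rng = Random.default_rng())
    # Statistic evaluated on the original, un-resampled data
    t0 = f(samples...)
    # One statistic per bootstrap replicate; each input keeps its own length
    t1 = [f(map(s -> rand(rng, s, length(s)), samples)...) for _ in 1:n]
    return t0, t1
end

xs = rand(1000)
ys = rand(1000)

t0, t1 = multi_bootstrap((x, y) -> mean(x) - mean(y), (xs, ys), 1000)
se = std(t1)  # bootstrap standard error of the difference in means
```

Returning the original estimate alongside the replicates is the minimum needed to derive things like a standard error or a percentile interval, which is part of what the one-liner above leaves out.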

juliangehring · 2019-06-25T22:00:34Z

Nice idea, that would definitely be a useful feature to support!

juliangehring · 2019-06-25T22:22:21Z

As a "workaround", one could currently get the same in multiple steps:

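# Bootstrap the mean of each sample separately
bs1 = bootstrap(x->mean(x), xs, BasicSampling(1000))
bs2 = bootstrap(y->mean(y), ys, BasicSampling(1000))
# straps returns one vector of replicates per statistic; take the first and subtract element-wise
z = straps(bs1)[1] - straps(bs2)[1]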

harryscholes · 2019-07-17T07:18:48Z

Yes, I think it would be a great feature to include! I'll look into it at the JuliaCon hackathon.

harryscholes mentioned this issue Jul 26, 2019

WIP: possible solution to allow bootstrapping from an arbitrary number of distributions #58

Closed
