Consistency of dist_sample methods #126

robjhyndman · 2024-09-13T02:24:05Z

Currently the density, cdf, quantiles, etc. produced by dist_sample() are not consistent with each other. That is fine as a design choice: each method then gives a better estimate of its own target, at the cost of mutual consistency. But if you want consistency, you need to choose one of these as the base and derive the others from it.
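To make the mismatch concrete, here is a minimal sketch in base R. It does not call dist_sample(). It assumes, for illustration only, that the density comes from a Gaussian KDE while the cdf and quantiles come from the empirical distribution. The sample `x` and the helper `kde_cdf()` are hypothetical.

```r
set.seed(1)
x <- rnorm(25)                               # hypothetical sample

# Three separate estimators built from the same sample
kde  <- density(x)                           # Gaussian KDE, default bandwidth
Fhat <- ecdf(x)                              # empirical cdf (step function)
q90  <- quantile(x, 0.9, names = FALSE)      # sample quantile (default type 7)

# cdf implied by the KDE: an equal-weight mixture of normals, one per observation
kde_cdf <- function(q) mean(pnorm(q, mean = x, sd = kde$bw))

# Evaluate each cdf at the 90% sample quantile; the three values need not agree
c(nominal = 0.9, ecdf = Fhat(q90), kde = kde_cdf(q90))
```

With n = 25 the type 7 quantile falls between the 22nd and 23rd order statistics, so the empirical cdf there is 22/25 rather than 0.9, and the KDE-implied cdf gives a third value.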

If we choose density as the base, and use a kde, this would work reasonably well, and better than starting with the empirical cdf or sample quantiles. In fact, I have already implemented exactly this via weird::dist_kde(). See https://github.com/robjhyndman/weird-package/blob/main/R/dist_kde.R
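As a rough sketch of what "density as the base" could look like, the following base R code builds a Gaussian KDE and derives the cdf and quantile function from it. This is not the weird::dist_kde() implementation. `kde_dist()` and its return structure are hypothetical, and the bandwidth rule (`bw.nrd0()`) is just the default used by `stats::density()`.

```r
# Hypothetical helper: all three functions are derived from one Gaussian KDE
kde_dist <- function(x, bw = bw.nrd0(x)) {
  # Density: average of normal kernels centred at the observations
  pdf <- function(q) {
    vapply(q, function(qi) mean(dnorm(qi, mean = x, sd = bw)), numeric(1))
  }
  # cdf: exact integral of the density above
  cdf <- function(q) {
    vapply(q, function(qi) mean(pnorm(qi, mean = x, sd = bw)), numeric(1))
  }
  # Quantiles: numerically invert the cdf on a bracket well outside the data
  qf <- function(p) {
    lims <- range(x) + c(-1, 1) * 10 * bw
    vapply(p, function(pj) {
      uniroot(function(q) cdf(q) - pj, interval = lims, tol = 1e-9)$root
    }, numeric(1))
  }
  list(density = pdf, cdf = cdf, quantile = qf)
}

set.seed(1)
d <- kde_dist(rnorm(25))
p <- c(0.1, 0.5, 0.9)
d$cdf(d$quantile(p))   # recovers p up to the root-finding tolerance
```

Because the cdf is the exact integral of the density and the quantile function is its numerical inverse, the three are consistent by construction, which is the property the empirical cdf and sample quantiles do not share with a KDE.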

But if this is how it is done, I think dist_kde() is a better name than dist_sample(), to emphasise how the distribution is being computed.

This also affects #117


mitchelloharawild · 2024-09-13T14:46:03Z

I'm inclined to change the density method of dist_sample() so that it no longer uses a KDE once suitable alternatives such as dist_density() and dist_kernel_density() are added (#117).

This is in part because the samples in dist_sample() can be practically anything. The samples aren't necessarily univariate, continuous, or even numerical. This would complicate (and confuse) doing anything extra like basing dist_sample() around a kde.
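A small base R illustration of the point about non-numerical samples (again a sketch, not distributional code; the sample `x` is hypothetical): a KDE cannot be computed at all, while empirical proportions remain well defined.

```r
x <- c("a", "b", "a", "c", "a")   # hypothetical non-numeric sample

# stats::density() rejects non-numeric input, so capture the error
kde_try <- tryCatch(density(x), error = function(e) e)
inherits(kde_try, "error")

# Empirical proportions are still meaningful for such a sample
pmf <- prop.table(table(x))
pmf
```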

In cases where a smooth/continuous density estimate is wanted/required, a KDE constructed with dist_kernel_density() should be used instead of dist_sample().
