-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #64: added n_init to kmeans #78
base: master
Are you sure you want to change the base?
Conversation
src/kmeans.jl
Outdated
maxiter=maxiter, | ||
tol=tol, | ||
display=display) | ||
n_init > 0 || throw(ArgumentError("n_init must be greater than 0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything is still being indented with tabs. You can do a find-and-replace to change everything at once, if you wish; I do it in Vim like
:%s/\t/ /g
If you accidentally close a pull request, you can just click "reopen," no need to open a new one. |
src/kmeans.jl
Outdated
@@ -72,6 +92,8 @@ function _kmeans!{T<:AbstractFloat}( | |||
tol::Real, # in: tolerance of change at convergence | |||
displevel::Int) # in: the level of display | |||
|
|||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the excess whitespace here and above. The change is unrelated to the PR.
src/kmeans.jl
Outdated
end | ||
|
||
return bestresult | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the line breaks on lines 60, 73, and 75 for style consistency with the rest of the code.
src/kmeans.jl
Outdated
tol::Real=_kmeans_default_tol, | ||
display::Symbol=_kmeans_default_display) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another unrelated whitespace change
Sorry, I realize the whitespace pedantry is probably annoying. Once my outstanding style comments are addressed I think this will be good to go. Thanks for the contribution! |
No I completely apologize for these small things that I didn't take care to consider. Thanks for helping me learn about this process. It is preparing me for future commits. |
src/kmeans.jl
Outdated
@@ -71,7 +87,7 @@ function _kmeans!{T<:AbstractFloat}( | |||
maxiter::Int, # in: maximum number of iterations | |||
tol::Real, # in: tolerance of change at convergence | |||
displevel::Int) # in: the level of display | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trailing whitespace. I don't know what editor you use, but I think Atom trims trailing whitespace by default, and in Vim you can do :%s/\s\+$//g
to remove it.
src/kmeans.jl
Outdated
tol::Real=_kmeans_default_tol, | ||
display::Symbol=_kmeans_default_display) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last remaining extraneous newline.
This looks great to me, though I'd like a second pair of eyes on the implementation before it's merged. @KristofferC? |
This is rather old, but Clustering.jl hasn't had any active maintainers in a while. I just fixed the merge conflicts. This seems like a nice addition, but the documentation needs to be updated to reflect the interface change. @lbollar, are you still available to update this? |
I'll try and get some time soon and update.
On Aug 26, 2017 9:38 AM, "Kevin Squire" <[email protected]> wrote:
This is rather old, but Clustering.jl hasn't had any active maintainers in
a while. I just fixed the merge conflicts. This seems like a nice addition,
but the documentation needs to be updated to reflect the interface change.
@lbollar <https://github.com/lbollar>, are you still available to update
this?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#78 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AGB4lm6DB9wlG8B7KfKt1mU035r55gbxks5scC3jgaJpZM4KLdq3>
.
|
@@ -44,20 +45,34 @@ function kmeans{T<:AbstractFloat}(X::Matrix{T}, k::Int; | |||
weights=nothing, | |||
init=_kmeans_default_init, | |||
maxiter::Integer=_kmeans_default_maxiter, | |||
n_init::Integer=_kmeans_default_n_init, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that n_init
comes from Python's sklearn (#64), but it doesn't sound like a best choice for me.
Maybe something like n_tries
to reflect that the parameter defines how many times the algorithm, rather than some initialization procedure, is run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or ntries
? And wouldn't be an overkill to run 10 times? I recommend default value 1, because usually a quick partitioning is required and not necessarily best one. And, if one needs to find a best clustering, this parameter can be set to larger value explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 is what sklearn does at it sounds reasonable to me.
It isn't unusual to run 1000s of times, (that was done as the baseline for the affinity propagation paper)
If some need a quick partition they can ask for it.
The default shouldn't be so sensitive to random factors.
I think 10 strikes the right balance.
Though I could see argument for 3 or 30
Any interest in merging this sometime? |
I think it's worth merging. I'd rename the parameter to |
Sorry, I am still figuring out the pull request process and think I accidentally closed last one. Here is new request with suggested cleanups.
Apologies.
EDIT: fixes #64, original PR #77