The function below was written for a package (Simpsons.jl) that needs to automate finding the "best" number of
clusters. Would Clustering.jl benefit from a PR that adds it?
"""
    find_clustering_elbow(dataarray, cmin = 1, cmax = 5; fclust = kmeans, kwargs...)

Find the "elbow" of the total cost versus cluster count curve, where
`cmin <= elbow <= cmax`. Note that in pathological cases where the actual
minimum of the total costs occurs at a cluster count less than that of the
curve "elbow", the function will return either `cmin` or the actual cluster
count at which the total cost is at its minimum, whichever is larger.

Returns a tuple: the cluster count and the `ClusteringResult` at the "elbow" optimum.
"""
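function find_clustering_elbow(dataarray::AbstractMatrix{<:Real}, cmin = 1, cmax = 5; fclust = kmeans, kwargs...)
    # Cluster for every count from 1 to cmax + 1; the extra run anchors the end of the curve.
    allkmeans = [fclust(dataarray, i; kwargs...) for i in 1:cmax+1]
    alltotals = map(x -> x.totalcost, allkmeans)
    _, cidx = findmin(alltotals)
    # Endpoints of the straight line joining the first and last points of the cost curve.
    x1, y1 = 1, alltotals[1]
    x2, y2 = cmax + 1, alltotals[cmax + 1]
    # The elbow is the interior point farthest from that line (`distance` is not defined in this snippet).
    _, idx = findmax(map(i -> distance(x1, y1, x2, y2, i, alltotals[i]), 2:cmax))
    nclust = cidx < idx + 1 ? max(cmin, cidx) : idx + 1
    return nclust, allkmeans[nclust]
end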
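The function calls a `distance` helper that is not shown in this issue. A minimal sketch of what it might look like, assuming it computes the perpendicular distance from the point `(x0, y0)` to the line through `(x1, y1)` and `(x2, y2)`; the argument order is inferred from the call site, and the cost values below are made up purely for illustration:

```julia
# Hypothetical helper (an assumption, not taken from Simpsons.jl):
# perpendicular distance from (x0, y0) to the line through (x1, y1) and (x2, y2).
function distance(x1, y1, x2, y2, x0, y0)
    num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    return num / hypot(x2 - x1, y2 - y1)
end

# Made-up total costs for 1 to 6 clusters, standing in for `alltotals`.
costs = [100.0, 40.0, 15.0, 12.0, 10.0, 9.0]
n = length(costs)

# Distance of each interior point from the line joining the curve's endpoints.
dists = [distance(1, costs[1], n, costs[n], i, costs[i]) for i in 2:n-1]
_, idx = findmax(dists)
elbow = idx + 1   # shift back, since the interior range starts at 2
```

With these illustrative costs the farthest interior point is at three clusters, so `elbow == 3`. Because the x-axis (cluster count) and y-axis (total cost) are on different scales, the choice of elbow can depend on that scaling; normalizing both axes first is one possible variation.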