Add distribution metric to semantic-core and ffwd-reporter #79
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
/* | ||
* Copyright (C) 2016 - 2020 Spotify AB. | ||
* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
package com.spotify.metrics.core; | ||
|
||
import com.codahale.metrics.Counting; | ||
import com.codahale.metrics.Metric; | ||
|
||
import java.nio.ByteBuffer; | ||
|
||
/** | ||
* {@link Distribution} is a simple interface that lets users record measurements | ||
* and compute rank statistics over the whole data distribution, not just a single local source. | ||
* | ||
* <p>Every implementation should produce a serialized data sketch in a {@link ByteBuffer} | ||
* as the metric point value. For more information on how this is handled upstream, | ||
* please refer to the | ||
* <a href="https://github.com/spotify/ffwd-client-java/blob/master/ffwd-client/src/main/java/com/spotify/ffwd/FastForward.java#L110">FastForward Java client</a>. | ||
* | ||
* <p>Unlike a traditional histogram, {@link Distribution} doesn't require | ||
* predefined percentile values. The recorded data | ||
* can be used upstream to compute any percentile. | ||
* | ||
* <p>This Distribution doesn't require any binning configuration. | ||
* Just get an instance through SemanticMetricBuilder and record data. | ||
* | ||
* <p>{@link Distribution} is a good choice if you care about percentile accuracy in | ||
* a distributed environment and you want to rely on P99 to set an SLO. | ||
*/ | ||
public interface Distribution extends Metric, Counting { | ||
|
||
/** | ||
* Record a value anywhere in the double range (-Double.MAX_VALUE to Double.MAX_VALUE). | ||
* @param val the value to record | ||
*/ | ||
void record(double val); | ||
|
||
/** | ||
* Return the distribution point value and flush. | ||
* When this method is called, all internal state | ||
* is reset and a new recording starts. | ||
* | ||
* @return the serialized distribution point value | ||
*/ | ||
ByteBuffer getValueAndFlush(); | ||
Review comment: I am still curious about the idea of exposing the "value" of the Distribution as a For instance, if there was an alternate implementation of Is there any other data type that could be used? Or at least could the format of the bytes be documented here?

Reply: If we use Distribution as the return type then the serializer will have to know something about Distribution implementation. For instance, FFW reporter will have an implementation of this method for every Distribution type. Documentation for bytes format is handled by FFW. There is a version number, we will use that number upstream to determine how to handle the bytes.
|
||
|
||
} |
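The reply above mentions a version number that upstream consumers use to decide how to handle the bytes. A minimal Java sketch of that idea follows. It assumes a hypothetical `VersionedSamples` class and an invented layout (one version byte, an int count, then raw doubles). It is not the FastForward or T-digest wire format, and it stores raw samples rather than a sketch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a Distribution; keeps raw samples instead of a data sketch. */
final class VersionedSamples {
    // Assumed version tag for this illustration only.
    static final byte FORMAT_VERSION = 1;

    private List<Double> samples = new ArrayList<>();

    synchronized void record(double val) {
        samples.add(val);
    }

    synchronized long getCount() {
        return samples.size();
    }

    /** Serializes as [version byte][int count][count doubles] and resets the state. */
    ByteBuffer getValueAndFlush() {
        List<Double> flushed;
        synchronized (this) {
            flushed = samples;
            samples = new ArrayList<>();
        }
        ByteBuffer buf =
            ByteBuffer.allocate(1 + Integer.BYTES + flushed.size() * Double.BYTES);
        buf.put(FORMAT_VERSION);
        buf.putInt(flushed.size());
        for (double v : flushed) {
            buf.putDouble(v);
        }
        buf.flip(); // make the buffer readable from the start
        return buf;
    }

    /** Consumer side: check the version byte before interpreting the payload. */
    static double[] decode(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        byte version = view.get();
        if (version != FORMAT_VERSION) {
            throw new IllegalArgumentException("Unsupported format version: " + version);
        }
        double[] out = new double[view.getInt()];
        for (int i = 0; i < out.length; i++) {
            out[i] = view.getDouble();
        }
        return out;
    }
}
```

With a leading version tag, a consumer can reject or route payloads it does not understand without knowing which implementation produced them, which is the trade-off the reply argues for.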
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright (c) 2016 Spotify AB. | ||
* Copyright (C) 2016 - 2020 Spotify AB. | ||
* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
|
@@ -23,53 +23,66 @@ | |
|
||
|
||
import com.google.common.annotations.VisibleForTesting; | ||
import com.spotify.metrics.core.codahale.metrics.ext.Distribution; | ||
import com.tdunning.math.stats.TDigest; | ||
|
||
import java.nio.ByteBuffer; | ||
import java.util.concurrent.atomic.AtomicReference; | ||
|
||
|
||
public class DistributionImpl implements Distribution { | ||
/** | ||
* Semantic Metric implementation of {@link Distribution}. | ||
* This implementation ensures thread safety for recording data | ||
* and retrieving the distribution point value. | ||
* | ||
* <p>{@link SemanticMetricDistribution} is backed by Ted Dunning's T-digest implementation. | ||
* | ||
* <p>{@link TDigest} "sketches" are generated by clustering real-valued samples and | ||
* retaining the mean and number of samples for each cluster. | ||
* The generated data structure is mergeable and produces fairly | ||
* accurate percentiles even for long-tail distributions. | ||
* | ||
* <p>We are using a T-digest compression level of 100. | ||
* With that level of compression, our own benchmark using a Pareto distribution | ||
* dataset shows that the P99 error rate is less than 2%. | ||
* From P99.9 to P99.999 the error rate is slightly higher than 2%. | ||
* | ||
*/ | ||
public final class SemanticMetricDistribution implements Distribution { | ||
|
||
private static final int COMPRESSION_DEFAULT_LEVEL = 100; | ||
Review comment: Might we want to change this level? Should it be configuration-driven perhaps?

Reply: I think for the time being it is better to keep every aspect of T-digest private. Power users have the ability to extend this class and use a different compression level. For the typical use case, 100 is OK.
||
private final AtomicReference<TDigest> distRef; | ||
|
||
protected DistributionImpl() { | ||
this.distRef = new AtomicReference<>(TDigest.createDigest(COMPRESSION_DEFAULT_LEVEL)); | ||
SemanticMetricDistribution() { | ||
this.distRef = new AtomicReference<>(create()); | ||
} | ||
|
||
@Override | ||
public void record(double val) { | ||
public synchronized void record(double val) { | ||
distRef.get().add(val); | ||
} | ||
|
||
@Override | ||
public java.nio.ByteBuffer getValueAndFlush() { | ||
TDigest curVal = distRef.getAndSet(create()); // reset tdigest | ||
TDigest curVal; | ||
synchronized (this) { | ||
curVal = distRef.getAndSet(create()); // reset tdigest | ||
|
||
} | ||
ByteBuffer byteBuffer = ByteBuffer.allocate(curVal.smallByteSize()); | ||
curVal.asSmallBytes(byteBuffer); | ||
return byteBuffer; | ||
} | ||
|
||
/** | ||
* Returns the current count. | ||
* | ||
* @return the current count | ||
*/ | ||
|
||
@Override | ||
public long getCount() { | ||
return this.tDigest().size(); | ||
return distRef.get().size(); | ||
} | ||
|
||
private TDigest create() { | ||
return TDigest.createDigest(COMPRESSION_DEFAULT_LEVEL); | ||
} | ||
|
||
|
||
@VisibleForTesting | ||
TDigest tDigest() { | ||
return distRef.get(); | ||
} | ||
|
||
private TDigest create() { | ||
return TDigest.createDigest(COMPRESSION_DEFAULT_LEVEL); | ||
} | ||
} |
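The record/flush locking in this file can be tried in isolation. The sketch below is an illustration under stated assumptions, not the PR's code: `RunningStats` is a hypothetical non-thread-safe accumulator standing in for `TDigest`, and `SwapOnFlush` and `SwapOnFlushDemo` are invented names. Unlike the diff, `getCount` is also synchronized here, because the stand-in is not safe to read while a writer is updating it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical non-thread-safe accumulator standing in for TDigest. */
final class RunningStats {
    long count;
    double sum;

    void add(double val) {
        count++;
        sum += val;
    }
}

/** Same locking shape as the diff: record and the swap share one monitor. */
final class SwapOnFlush {
    private final AtomicReference<RunningStats> ref =
        new AtomicReference<>(new RunningStats());

    synchronized void record(double val) {
        ref.get().add(val);
    }

    RunningStats getValueAndFlush() {
        synchronized (this) {
            // After the swap no writer can reach the old instance,
            // so the caller may read it without holding the lock.
            return ref.getAndSet(new RunningStats());
        }
    }

    synchronized long getCount() {
        return ref.get().count;
    }
}

final class SwapOnFlushDemo {
    /** Records from several threads while flushing; returns the total flushed count. */
    static long recordConcurrently(int threads, int perThread) {
        SwapOnFlush dist = new SwapOnFlush();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    dist.record(1.0);
                }
            });
            workers[t].start();
        }
        long flushed = 0;
        for (int i = 0; i < 100; i++) { // flush while writers are still running
            flushed += dist.getValueAndFlush().count;
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return flushed + dist.getValueAndFlush().count; // final flush picks up the rest
    }
}
```

Because a sample is always added to whichever instance is current while the lock is held, every recorded value ends up in exactly one flushed instance, so the flushed counts add up to the number of `record` calls.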
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -93,6 +93,16 @@ public void onDerivingMeterAdded(MetricId name, DerivingMeter derivingMeter) { | |
@Override | ||
public void onDerivingMeterRemoved(MetricId name) { | ||
} | ||
|
||
Review comment: I'm still not a fan of this class name. Is it convention?
||
@Override | ||
public void onDistributionAdded(MetricId name, Distribution distribution) { | ||
|
||
} | ||
|
||
@Override | ||
public void onDistributionRemoved(MetricId name) { | ||
|
||
} | ||
} | ||
|
||
/** | ||
|
@@ -184,4 +194,27 @@ public void onDerivingMeterRemoved(MetricId name) { | |
* @param name the meter's name | ||
*/ | ||
void onDerivingMeterRemoved(MetricId name); | ||
|
||
|
||
/** | ||
* This is a no-op implementation for backward compatibility. | ||
* Please override this method if you are using a Distribution metric. | ||
* This method is called when a {@link Distribution} is added to the registry. | ||
* | ||
* @param name the distribution's name | ||
* @param distribution the distribution | ||
*/ | ||
public default void onDistributionAdded(MetricId name, Distribution distribution) { | ||
|
||
} | ||
|
||
/** | ||
* This is a no-op implementation for backward compatibility. | ||
* Please override this method if you are using a Distribution metric. | ||
* This method is called when a {@link Distribution} is removed from the registry. | ||
* | ||
* @param name the distribution's name | ||
*/ | ||
public default void onDistributionRemoved(MetricId name) { | ||
|
||
} | ||
} |
Excellent comment!
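The backward-compatibility point made in the Javadoc above can be shown with a small sketch. The names here (`RegistryListener`, `LegacyListener`, `DistributionAwareListener`) and the `String` parameters are hypothetical simplifications, not the project's actual listener types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, trimmed-down listener interface. */
interface RegistryListener {
    void onMeterAdded(String name);

    // Added later; the empty default keeps pre-existing implementations compiling.
    default void onDistributionAdded(String name) {
    }
}

/** Written before the new callback existed; compiles without changes. */
final class LegacyListener implements RegistryListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onMeterAdded(String name) {
        events.add("meter:" + name);
    }
}

/** Opts in to the new callback by overriding the default. */
final class DistributionAwareListener implements RegistryListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onMeterAdded(String name) {
        events.add("meter:" + name);
    }

    @Override
    public void onDistributionAdded(String name) {
        events.add("distribution:" + name);
    }
}
```

The cost of this design is that a listener which forgets to override the new method silently ignores distributions, which is why the Javadoc asks implementers to override it.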