Add distribution metric to semantic-core and ffwd-reporter #79
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
/* | ||
* Copyright (C) 2016 - 2020 Spotify AB. | ||
* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
package com.spotify.metrics.core; | ||
|
||
import com.codahale.metrics.Counting; | ||
import com.codahale.metrics.Metric; | ||
|
||
import java.nio.ByteBuffer; | ||
|
||
/** | ||
* {@link Distribution} is a simple interface that lets users record measurements | ||
* so that rank statistics can be computed over the whole data distribution, | ||
* not just over a single local source. | ||
* | ||
* <p>Every implementation should produce a serialized data sketch in a {@link ByteBuffer} | ||
* as this metric's point value. For more information on how this is handled upstream, | ||
* please refer to the | ||
* <a href="https://github.com/spotify/ffwd-client-java/blob/master/ffwd-client/src/main/java/com/spotify/ffwd/FastForward.java#L110">FastForward Java client</a>. | ||
* | ||
* <p>Unlike a traditional histogram, {@link Distribution} doesn't require | ||
* predefined percentile values. The recorded data | ||
* can be used upstream to compute any percentile. | ||
* | ||
* <p>This Distribution doesn't require any binning configuration. | ||
* Just get an instance through SemanticMetricBuilder and record data. | ||
* | ||
* <p>{@link Distribution} is a good choice if you care about percentile accuracy in | ||
* a distributed environment and you want to rely on P99 to set SLOs. | ||
*/ | ||
public interface Distribution extends Metric, Counting { | ||
|
||
/** | ||
* Record a value anywhere in the double range, | ||
* from {@code -Double.MAX_VALUE} to {@code Double.MAX_VALUE}. | ||
* | ||
* @param val the value to record | ||
*/ | ||
void record(double val); | ||
|
||
/** | ||
* Return distribution point value and flush. | ||
* When this method is called every internal state | ||
* is reset and a new recording starts. | ||
* | ||
* @return the distribution point value as a serialized data sketch | ||
*/ | ||
ByteBuffer getValueAndFlush(); | ||
Review comment: I am still curious about the idea of exposing the "value" of the Distribution as a For instance, if there was an alternate implementation of Is there any other data type that could be used? Or at least could the format of the bytes be documented here?
Reply: If we use Distribution as the return type then the serializer will have to know something about Distribution implementation. For instance, FFW reporter will have an implementation of this method for every Distribution type. Documentation for bytes format is handled by FFW. There is a version number, we will use that number upstream to determine how to handle the bytes.
|
||
|
||
} |
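The record-then-flush contract and the version-tagged payload mentioned in the review thread can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `RawSamplesDistribution`, the `FORMAT_VERSION` constant, and the byte layout (one version byte, a count, then raw doubles) are hypothetical and are not the T-digest or FFWD format. The class mirrors the method names of `Distribution` without implementing it, so that it compiles with the JDK alone.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors the Distribution method names.
// It stores raw samples instead of a sketch, purely for illustration.
final class RawSamplesDistribution {
    // Assumed version tag; the real payload layout is defined upstream, not here.
    static final byte FORMAT_VERSION = 1;

    private List<Double> samples = new ArrayList<>();

    synchronized void record(double val) {
        samples.add(val);
    }

    synchronized long getCount() {
        return samples.size();
    }

    // Swap the internal state under the lock, then serialize outside it,
    // so recording threads are blocked only for the swap.
    ByteBuffer getValueAndFlush() {
        List<Double> flushed;
        synchronized (this) {
            flushed = samples;
            samples = new ArrayList<>();
        }
        ByteBuffer buf =
            ByteBuffer.allocate(1 + Integer.BYTES + flushed.size() * Double.BYTES);
        buf.put(FORMAT_VERSION).putInt(flushed.size());
        for (double v : flushed) {
            buf.putDouble(v);
        }
        buf.flip(); // make the buffer readable from the start
        return buf;
    }

    // Consumer side: read the version byte first, then decide how to decode.
    static double[] decode(ByteBuffer buf) {
        byte version = buf.get();
        if (version != FORMAT_VERSION) {
            throw new IllegalArgumentException("unsupported payload version: " + version);
        }
        double[] out = new double[buf.getInt()];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getDouble();
        }
        return out;
    }
}
```

Because the consumer only inspects the leading version byte before decoding, a serializer in this style would not need to know which implementation produced the bytes, which is the trade-off the reply in the thread describes.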
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
/* | ||
* Copyright (C) 2016 - 2020 Spotify AB. | ||
* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
package com.spotify.metrics.core; | ||
|
||
|
||
import com.google.common.annotations.VisibleForTesting; | ||
import com.tdunning.math.stats.TDigest; | ||
|
||
import java.nio.ByteBuffer; | ||
import java.util.concurrent.atomic.AtomicReference; | ||
|
||
/** | ||
* Semantic Metric implementation of {@link Distribution}. | ||
* This implementation ensures thread safety for recording data | ||
* and retrieving the distribution point value. | ||
* | ||
* <p>{@link SemanticMetricDistribution} is backed by Ted Dunning's T-digest implementation. | ||
* | ||
* <p>{@link TDigest} "sketches" are generated by clustering real-valued samples and | ||
* retaining the mean and number of samples for each cluster. | ||
* The generated data structure is mergeable and produces fairly | ||
* accurate percentiles even for long-tail distributions. | ||
* | ||
* <p>We use a T-digest compression level of 100. | ||
* At that level of compression, our own benchmark on a Pareto-distributed | ||
* dataset shows a P99 error rate of less than 2%. | ||
* From P99.9 to P99.999 the error rate is slightly higher than 2%. | ||
* | ||
*/ | ||
public final class SemanticMetricDistribution implements Distribution { | ||
|
||
private static final int COMPRESSION_DEFAULT_LEVEL = 100; | ||
Review comment: Might we want to change this level? Should it be configuration-driven perhaps?
Reply: I think for the time being it is better to keep every aspect of Tdigest private. Power users have the ability to extend this class and use a different compression level. For the typical use case, 100 is ok.
|
||
private final AtomicReference<TDigest> distRef; | ||
|
||
SemanticMetricDistribution() { | ||
this.distRef = new AtomicReference<>(create()); | ||
} | ||
|
||
@Override | ||
public synchronized void record(double val) { | ||
distRef.get().add(val); | ||
} | ||
|
||
@Override | ||
public ByteBuffer getValueAndFlush() { | ||
TDigest curVal; | ||
synchronized (this) { | ||
curVal = distRef.getAndSet(create()); // reset tdigest | ||
mattnworb marked this conversation as resolved.
|
||
} | ||
ByteBuffer byteBuffer = ByteBuffer.allocate(curVal.smallByteSize()); | ||
curVal.asSmallBytes(byteBuffer); | ||
return byteBuffer; | ||
} | ||
|
||
|
||
@Override | ||
public long getCount() { | ||
return distRef.get().size(); | ||
} | ||
|
||
@VisibleForTesting | ||
TDigest tDigest() { | ||
return distRef.get(); | ||
} | ||
|
||
private TDigest create() { | ||
return TDigest.createDigest(COMPRESSION_DEFAULT_LEVEL); | ||
} | ||
} |
Excellent comment!
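The Javadoc stresses that the sketch is mergeable. A small sketch of why that matters, using exact nearest-rank percentiles on raw samples rather than T-digest: averaging per-host P99 values can differ a lot from the P99 of the merged data. The class name `MergedPercentileDemo`, the two hosts, and their sample values are made up for illustration.

```java
import java.util.Arrays;

// Illustrative only: exact percentiles on raw samples, no T-digest involved.
final class MergedPercentileDemo {

    // Nearest-rank percentile over a sorted copy of the samples.
    static double percentile(double[] samples, int p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Concatenate two sample sets, standing in for merging two sketches.
    static double[] merge(double[] a, double[] b) {
        double[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Hypothetical busy host: 1000 requests, all taking 10 ms.
    static double[] hostA() {
        double[] a = new double[1000];
        Arrays.fill(a, 10.0);
        return a;
    }

    // Hypothetical quiet host: 100 requests, 10 of which take 1000 ms.
    static double[] hostB() {
        double[] b = new double[100];
        Arrays.fill(b, 0, 90, 10.0);
        Arrays.fill(b, 90, 100, 1000.0);
        return b;
    }

    public static void main(String[] args) {
        double averaged = (percentile(hostA(), 99) + percentile(hostB(), 99)) / 2;
        double merged = percentile(merge(hostA(), hostB()), 99);
        System.out.println("average of per-host P99: " + averaged);
        System.out.println("P99 of merged samples:   " + merged);
    }
}
```

In this made-up data set the quiet host dominates the averaged figure even though it serves under a tenth of the traffic. Shipping a mergeable sketch upstream avoids that by letting the percentile be computed once over the combined distribution.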