added mysql trace backend #239

Open: wants to merge 12 commits into master
83 changes: 73 additions & 10 deletions CONTRIBUTING.md
@@ -1,14 +1,77 @@
##Bugs
We use Github Issues for our bug reporting. Please make sure the bug isn't already listed before opening a new issue.
# Contributing

##Development
All work on Haystack happens directly on Github. Core Haystack team members will review opened pull requests.
Code contributions are always welcome!

##Requests
If you see a feature that you would like to be added, please open an issue in the respective repository or in the general Haystack repo.
* Open an issue in the repo with defect/enhancements
* We can also be reached @ https://gitter.im/expedia-haystack/Lobby
* Fork, make the changes, build and test it locally
* Issue a PR - watch the PR build in [travis-ci](https://travis-ci.org/ExpediaDotCom/haystack-traces)
* Once merged to master, travis-ci will build and release the artifacts to [docker hub]

##Contributing to Documentation
To contribute to documentation, you can directly modify the corresponding .md files in the docs directory under the base haystack repository, and submit a pull request. Once your PR is merged, the documentation is automatically built and deployed to https://expediadotcom.github.io/haystack.

##License
By contributing to Haystack, you agree that your contributions will be licensed under its Apache License.
## Building

####Prerequisite:

* Make sure you have Java 1.8
* Make sure you have maven 3.3.9 or higher
* Make sure you have docker 1.13 or higher


Note : For mac users you can download docker for mac to set you up for the last two steps.
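
The "or higher" prerequisites above can be checked mechanically. The following is a minimal sketch, not part of the build: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes version strings are compared number by number rather than as text.

```python
import re


def parse_version(text):
    """Turn a version string such as '3.3.9' or '1.8.0_292' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(found, minimum):
    """Return True when `found` is at least `minimum`.

    Shorter versions are padded with zeros, so '1.13' and '1.13.0' compare equal.
    """
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Hypothetical installed versions checked against the minimums listed above.
    print(meets_minimum("3.6.3", "3.3.9"))   # maven
    print(meets_minimum("1.12.6", "1.13"))   # docker
```

Comparing tuples of integers avoids the trap of string comparison, where "3.10.0" would sort before "3.3.9".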

####Build

For a full build, including unit tests and integration tests, docker image build, you can run -
```
make all
```

####Integration Test

####Prerequisite:
1. Install docker using Docker Tools or native docker if on mac
2. Verify if docker-compose is installed by running following command else install it.
```
docker-compose

```

Run the build and integration tests for individual components with
```
make indexer

```

&&

```
make reader

```


```
make backends

```


## Releasing the artifacts

Currently we publish the repo to docker hub and nexus central repository.

* Git tagging:

```
git tag -a <tag name> -m "Release description..."
git push origin <tag name>
```

`<tag name>` must follow semantic versioning scheme.

Or one can also tag using UI: https://github.com/ExpediaDotCom/haystack-traces/releases

It is preferred to create an annotated tag using `git tag -a` and then use the release UI to add release notes for the tag.

* After the release is completed, please update the `pom.xml` files to next `-SNAPSHOT` version to match the next release
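
The tag-name rule and the `-SNAPSHOT` bump described above can be sketched as follows. This is an illustration under assumptions: it accepts only the plain `MAJOR.MINOR.PATCH` form (no `v` prefix, pre-release, or build metadata), and `next_snapshot` assumes the next release bumps the patch number. The function names are hypothetical and are not part of the release tooling.

```python
import re

# Core MAJOR.MINOR.PATCH form only; numeric parts may not have leading zeros.
SEMVER = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_valid_tag(tag):
    """Return True when the whole tag is a plain semantic version."""
    return SEMVER.fullmatch(tag) is not None


def next_snapshot(tag):
    """Return the follow-up development version, assuming a patch bump."""
    match = SEMVER.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a semantic version: {tag!r}")
    major, minor, patch = (int(group) for group in match.groups())
    return f"{major}.{minor}.{patch + 1}-SNAPSHOT"


if __name__ == "__main__":
    print(is_valid_tag("1.0.3"))
    print(next_snapshot("1.0.3"))
```

If the next release is a minor or major one, the version written to the `pom.xml` files would differ from what this sketch returns.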
72 changes: 41 additions & 31 deletions README.md
@@ -1,51 +1,61 @@
[![Build Status](https://travis-ci.org/ExpediaDotCom/haystack-traces.svg?branch=master)](https://travis-ci.org/ExpediaDotCom/haystack-traces)
[![License](https://img.shields.io/badge/license-Apache%20License%202.0-blue.svg)](https://github.com/ExpediaDotCom/haystack/blob/master/LICENSE)

# haystack-traces
This repo contains the haystack components that build the traces, store them in Cassandra and ElasticSearch(for indexing) and provide a grpc endpoint for accessing them
# Haystack Traces

Traces is a subsystem included in Haystack that provides a distributed tracing system to troubleshoot problems in microservice architectures. Its design is based on the [Google Dapper](http://research.google.com/pubs/pub36356.html) paper.

## Building

####
Since this repo contains haystack-idl as the submodule, so use the following to clone the repo
* git clone --recursive git@github.com:ExpediaDotCom/haystack-traces.git .
This repo contains the haystack components that build the traces. It uses ElasticSearch for indexing and a storage backend for persistence

####Prerequisite:
## Architecture
Please see the [architecture document](https://expediadotcom.github.io/haystack/docs/subsystems/subsystems_traces.html) for the high level architecture of the traces subsystem

* Make sure you have Java 1.8
* Make sure you have maven 3.3.9 or higher
* Make sure you have docker 1.13 or higher

## Components

Note : For mac users you can download docker for mac to set you up for the last two steps.
### haystack-trace-indexer

####Build
Trace Indexer is the component which reads spans from a kafka topic and writes to elasticsearch(for indexing)
and the storage backend for persistence. Please see the [indexer app](indexer/) for more details

For a full build, including unit tests and integration tests, docker image build, you can run -
```
make all
```
### haystack-trace-reader

####Integration Test
Trace Reader is the component which retrieves the trace-ids from elastic search based on the given queries and then fetches the spans from
the storage backend. Please see the [reader app](reader/) for more details

####Prerequisite:
1. Install docker using Docker Tools or native docker if on mac
2. Verify if docker-compose is installed by running following command else install it.
```
docker-compose
### Storage Backend

```
Haystack Traces supports multiple storage backend apps, which are used to store and query spans. The storage backend apps are
grpc apps which are expected to implement this [grpc contract](https://github.com/ExpediaDotCom/haystack-idl/blob/master/proto/backend/storageBackend.proto).
The [reader](reader/src/main/scala/com/expedia/www/haystack/trace/reader/stores/readers/grpc/GrpcTraceReader.scala) and [indexer](indexer/src/main/scala/com/expedia/www/haystack/trace/indexer/writers/grpc/GrpcTraceWriter.scala) components read from and write to the underlying datastore using this service. The default configuration expects the storage backend app to run on the same host (localhost) as the indexer and reader apps.

Run the build and integration tests for individual components with
```
make indexer
By default the traces subsystem comes bundled with the following backends. You can always run your own custom backend as long as it implements the [grpc contract](https://github.com/ExpediaDotCom/haystack-idl/blob/master/proto/backend/storageBackend.proto).

```
#### In-Memory
The in-memory storage backend app keeps the spans in memory. It
is neither persistent, nor viable for realistic work loads. Please see the [memory backend app](backends/memory) for more details
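
To make "keeps the spans in memory" concrete, here is a minimal sketch of the idea, not the actual backend code: a map from trace id to the span payloads written for it. The class and method names are hypothetical; the real app exposes its operations over grpc.

```python
from collections import defaultdict


class InMemorySpanStore:
    """Hypothetical stand-in for the in-memory backend: trace id -> span payloads."""

    def __init__(self):
        self._spans_by_trace = defaultdict(list)

    def write_spans(self, trace_id, span_bytes):
        """Append one serialized span payload to the given trace."""
        self._spans_by_trace[trace_id].append(span_bytes)

    def read_spans(self, trace_id):
        """Return a copy of the payloads for a trace, or an empty list if unknown."""
        return list(self._spans_by_trace.get(trace_id, []))


if __name__ == "__main__":
    store = InMemorySpanStore()
    store.write_spans("trace-1", b"span-a")
    store.write_spans("trace-1", b"span-b")
    print(store.read_spans("trace-1"))
```

Because the data lives in a single process, everything is lost on restart, and a second instance of the store starts out empty.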

&&

```
make reader
#### Cassandra
The Cassandra storage-backend app is tested against [Cassandra 3.11.3+](http://cassandra.apache.org/). It is designed for production scale. Please see the [cassandra backend app](backends/cassandra) for more details

```
#### Mysql
The Mysql storage-backend app is tested against [Mysql 5.6+](https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-13.html) and [amazon aurora mysql](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.AuroraMySQL.Overview.html).
It is designed for production scale. Please see the [mysql backend app](backends/mysql) for more details


## Contributing to this codebase
Please see [CONTRIBUTING.md](CONTRIBUTING.md)


## Bugs, Feature Requests, Documentation Updates
Please see the [contributing page](https://expediadotcom.github.io/haystack/docs/contributing.html) on our website

## Contact Info

Interested in haystack? Want to talk? Have questions, concerns or great ideas?
Please join us on [gitter](https://gitter.im/expedia-haystack/Lobby)

##License
By contributing to Haystack, you agree that your contributions will be licensed under its Apache License.
19 changes: 0 additions & 19 deletions Release.md

This file was deleted.

10 changes: 9 additions & 1 deletion backends/Makefile
@@ -1,4 +1,4 @@
.PHONY: all cassandra memory release
.PHONY: all cassandra mysql memory release

PWD := $(shell pwd)

@@ -11,6 +11,13 @@ build_cassandra:
cd ../ && ./mvnw package -DfinalName=haystack-trace-backend-cassandra -pl backends/cassandra -am


mysql: build_mysql
cd mysql && $(MAKE) integration_test

build_mysql:
cd ../ && ./mvnw package -DfinalName=haystack-trace-backend-mysql -pl backends/mysql -am


memory: build_memory
cd memory && $(MAKE) integration_test

@@ -21,3 +28,4 @@ build_memory:
release:
cd cassandra && $(MAKE) docker_build && $(MAKE) release
cd memory && $(MAKE) docker_build && $(MAKE) release
cd mysql && $(MAKE) docker_build && $(MAKE) release
14 changes: 11 additions & 3 deletions backends/cassandra/README.md
@@ -6,9 +6,17 @@ Grpc service which can read a write spans to a cassandra cluster
##Technical Details

In order to understand this service, we recommend to read the details of [haystack](https://github.com/ExpediaDotCom/haystack) project.
This service reads from [Cassandra](http://cassandra.apache.org/). API endpoints are exposed as [GRPC](https://grpc.io/) endpoints.
This service reads from [Cassandra](http://cassandra.apache.org/). API endpoints are exposed as [GRPC](https://grpc.io/) endpoints based on [this](https://github.com/ExpediaDotCom/haystack-idl/blob/master/proto/backend/storageBackend.proto) contract.

If the schema for the cassandra table does not exist, the code creates it at startup using the following command:

`
CREATE KEYSPACE IF NOT EXISTS haystack WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor' : 1 } AND durable_writes = false; CREATE TABLE IF NOT EXISTS haystack.traces (id varchar, ts timestamp, spans blob, PRIMARY KEY ((id), ts)) WITH CLUSTERING ORDER BY (ts ASC) AND compaction = { 'class' : 'DateTieredCompactionStrategy', 'max_sstable_age_days': '3' } AND gc_grace_seconds = 86400;
`

## Deployments
The reader and the indexer apps expect the storage-backend app to run as a sidecar container. A sample deployment topology using docker compose is shared [here](https://github.com/ExpediaDotCom/haystack-docker).

Will fill in more details as we go..

## Building
Check the details on [Build Section](../README.md)
Check the details on [Build Section](../../CONTRIBUTING.md)
@@ -26,8 +26,6 @@ object CassandraTableSchema {
val ID_COLUMN_NAME = "id"
val TIMESTAMP_COLUMN_NAME = "ts"
val SPANS_COLUMN_NAME = "spans"
val SERVICE_COLUMN_NAME = "service_name"
val OPERATION_COLUMN_NAME = "operation_name"


/**
8 changes: 4 additions & 4 deletions backends/memory/README.md
@@ -5,9 +5,9 @@ Grpc service which can read a write spans to a an in memory map
##Technical Details

In order to understand this service, we recommend to read the details of [haystack](https://github.com/ExpediaDotCom/haystack) project.
This service reads from an in memory map. API endpoints are exposed as [GRPC](https://grpc.io/) endpoints.
This service reads from an in memory map. API endpoints are exposed as [GRPC](https://grpc.io/) endpoints based on [this](https://github.com/ExpediaDotCom/haystack-idl/blob/master/proto/backend/storageBackend.proto) contract.

Will fill in more details as we go..
* Note : Its purpose is for testing, for example starting a server on your laptop without any database needed. This only works if the reader and indexer apps are running locally and talk to the same in-memory backend server.

## Building
Check the details on [Build Section](../README.md)
# Building
Check the details on [Build Section](../../CONTRIBUTING.md)
24 changes: 24 additions & 0 deletions backends/mysql/Makefile
@@ -0,0 +1,24 @@
.PHONY: docker_build prepare_integration_test_env integration_test release

export DOCKER_ORG := ayansen89
export DOCKER_IMAGE_NAME := ayansen89/haystack-trace-backend-mysql
PWD := $(shell pwd)
SERVICE_DEBUG_ON ?= false

docker_build:
# build docker image using existing app jar
docker build -t $(DOCKER_IMAGE_NAME) -f build/docker/Dockerfile .

prepare_integration_test_env: docker_build
# prepare environment to run integration tests against
docker-compose -f build/integration-tests/docker-compose.yml -p sandbox up -d
sleep 30

integration_test: prepare_integration_test_env
cd ../../ &&./mvnw integration-test -pl backends/mysql -am
docker-compose -f build/integration-tests/docker-compose.yml -p sandbox stop
docker rm $(shell docker ps -a -q)
docker volume rm $(shell docker volume ls -q)

release:
../../deployment/scripts/publish-to-docker-hub.sh
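
The `prepare_integration_test_env` target above waits a fixed `sleep 30` for the sandbox to come up. One possible alternative, sketched here as an assumption and not something this change implements, is to poll until a readiness check passes. The helper names `wait_until` and `port_is_open` are hypothetical, and which host and port to probe depends on the docker-compose file, which is not shown here.

```python
import socket
import time


def wait_until(check, timeout_seconds=30.0, interval_seconds=0.5):
    """Call `check` repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)


def port_is_open(host, port):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8090 is the one exposed by the Dockerfile in this change; whether it
    # is published on localhost by the compose file is an assumption.
    print(wait_until(lambda: port_is_open("localhost", 8090), timeout_seconds=5.0))
```

An open port only shows that something is listening. It does not prove the database behind it is ready to serve queries, so a stricter check may still be needed.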
21 changes: 21 additions & 0 deletions backends/mysql/README.md
@@ -0,0 +1,21 @@
# Storage Backend - Mysql

Grpc service which can read and write spans to a mysql cluster

Review comment: It might be useful for newcomers to mention that the mysql backend still needs elasticsearch for indexing. When I first looked at your branch, the first thing I looked for was indexing.


##Technical Details

In order to understand this service, we recommend to read the details of [haystack](https://github.com/ExpediaDotCom/haystack) project.
This service reads from [Mysql](https://www.mysql.com/). API endpoints are exposed as [GRPC](https://grpc.io/) endpoints based on [this](https://github.com/ExpediaDotCom/haystack-idl/blob/master/proto/backend/storageBackend.proto) contract.

If the schema for the sql table does not exist, the code creates it at startup using the following command:

`
CREATE DATABASE IF NOT EXISTS haystack; USE haystack; create table IF NOT EXISTS spans (id varchar(255) not null, spans LONGBLOB not null, ts timestamp default CURRENT_TIMESTAMP, PRIMARY KEY (id, ts))
`
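
The composite primary key `(id, ts)` in the schema above means one trace id can hold several rows, as long as each row has a distinct timestamp. The sketch below illustrates that behaviour using SQLite from the Python standard library as a stand-in for MySQL, so column types are approximated. The helper names are hypothetical, and passing `ts` explicitly is an assumption: how the backend actually sets the timestamp is not shown here.

```python
import sqlite3

# SQLite stands in for MySQL so the sketch runs without a server.
SCHEMA = """
CREATE TABLE IF NOT EXISTS spans (
    id    TEXT NOT NULL,
    spans BLOB NOT NULL,
    ts    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, ts)
)
"""


def open_store():
    """Create an in-memory database with the spans table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def write_spans(conn, trace_id, payload, ts):
    """Insert one span payload for a trace at the given timestamp."""
    conn.execute(
        "INSERT INTO spans (id, spans, ts) VALUES (?, ?, ?)",
        (trace_id, payload, ts),
    )


def read_spans(conn, trace_id):
    """Return all payloads for a trace, oldest first."""
    rows = conn.execute(
        "SELECT spans FROM spans WHERE id = ? ORDER BY ts", (trace_id,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_store()
    write_spans(conn, "trace-1", b"first", "2019-01-01 00:00:00")
    write_spans(conn, "trace-1", b"second", "2019-01-01 00:00:01")
    print(read_spans(conn, "trace-1"))
```

A second write with the same `id` and the same `ts` is rejected by the primary key. In MySQL a `timestamp` column declared without a fractional precision stores whole seconds, so if the backend relies on the default `CURRENT_TIMESTAMP`, two writes for one trace within the same second could collide. Whether that can happen depends on how the backend writes rows.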

## Deployments
The reader and the indexer apps expect the storage-backend app to run as a sidecar container. A sample deployment topology using docker compose is shared [here](docker-compose.yml).


## Building
Check the details on [Build Section](../../CONTRIBUTING.md)
21 changes: 21 additions & 0 deletions backends/mysql/build/docker/Dockerfile
@@ -0,0 +1,21 @@
FROM openjdk:8-jre
MAINTAINER Haystack <[email protected]>

ENV APP_NAME haystack-trace-backend-mysql
ENV APP_HOME /app/bin
ENV JMXTRANS_AGENT jmxtrans-agent-1.2.6

RUN mkdir -p ${APP_HOME}

COPY target/${APP_NAME}.jar ${APP_HOME}/
COPY build/docker/start-app.sh ${APP_HOME}/
RUN chmod +x ${APP_HOME}/start-app.sh

COPY build/docker/jmxtrans-agent.xml ${APP_HOME}/
ADD https://github.com/jmxtrans/jmxtrans-agent/releases/download/${JMXTRANS_AGENT}/${JMXTRANS_AGENT}.jar ${APP_HOME}/

WORKDIR ${APP_HOME}

EXPOSE 8090

ENTRYPOINT ["./start-app.sh"]
30 changes: 30 additions & 0 deletions backends/mysql/build/docker/jmxtrans-agent.xml
@@ -0,0 +1,30 @@
<jmxtrans-agent>
<queries>

<!-- grpc endpoint metrics -->
<query objectName="metrics:name=StorageBackend.writeSpans" attributes="50thPercentile,99thPercentile,OneMinuteRate" resultAlias="endpoint.writeSpans.#attribute#"/>
<query objectName="metrics:name=StorageBackend.writeSpans.failures" attributes="OneMinuteRate" resultAlias="endpoint.writeSpans.failures.#attribute#"/>
<query objectName="metrics:name=StorageBackend.readSpans" attributes="50thPercentile,99thPercentile,OneMinuteRate" resultAlias="endpoint.readSpans.#attribute#"/>
<query objectName="metrics:name=StorageBackend.readSpans.failures" attributes="OneMinuteRate" resultAlias="endpoint.readSpans.failures.#attribute#"/>

<query objectName="metrics:name=mysql.read.time"
attributes="99thPercentile,50thPercentile,OneMinuteRate"
resultAlias="mysql.read.time.#attribute#"/>
<query objectName="metrics:name=mysql.read.failures"
attributes="OneMinuteRate"
resultAlias="mysql.read.failures.#attribute#"/>
<query objectName="metrics:name=mysql.write.failure" attributes="OneMinuteRate"
resultAlias="mysql.write.failure.#attribute#"/>
<query objectName="metrics:name=mysql.write.warnings" attributes="OneMinuteRate"
resultAlias="mysql.write.warnings.#attribute#"/>

</queries>
<outputWriter class="org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter">
<!-- template used in influxdb : "haystack.* system.subsystem.application.host.class.measurement*" -->
<host>${HAYSTACK_GRAPHITE_HOST:monitoring-influxdb-graphite.kube-system.svc}</host>
<port>${HAYSTACK_GRAPHITE_PORT:2003}</port>
<enabled>${HAYSTACK_GRAPHITE_ENABLED:false}</enabled>
<namePrefix>haystack.traces.backend-mysql.#hostname#.</namePrefix>
</outputWriter>
<collectIntervalInSeconds>30</collectIntervalInSeconds>
</jmxtrans-agent>
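
The `${NAME:default}` placeholders in the output writer above read as "use the configured value of `NAME`, otherwise fall back to the default". The sketch below models that reading in Python. It is an illustration of the semantics only, not the jmxtrans-agent implementation, and the `resolve` function and its `env` argument are hypothetical.

```python
import re

# Matches ${NAME} or ${NAME:default}; the default may be empty.
PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, env):
    """Replace each placeholder with env[NAME], falling back to its default."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(name)

    return PLACEHOLDER.sub(substitute, text)


if __name__ == "__main__":
    overrides = {"HAYSTACK_GRAPHITE_ENABLED": "true"}
    print(resolve("${HAYSTACK_GRAPHITE_ENABLED:false}", overrides))
    print(resolve("${HAYSTACK_GRAPHITE_PORT:2003}", overrides))
```

With no overrides, the configuration above would leave the graphite writer disabled, since `HAYSTACK_GRAPHITE_ENABLED` defaults to `false`.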