Run the latest version of the Elastic stack with Docker and Docker Compose.
It will give you the ability to analyze any data set by using the searching/aggregation capabilities of Elasticsearch and the visualization power of Kibana.
It is based on the official Docker images from Elastic.
Note: Other branches in this project are available:
- x-pack: X-Pack support
- searchguard: Search Guard support
- vagrant: run Docker inside Vagrant
- Install Docker version 17.05+
- Install Docker Compose version 1.6.0+
- Clone this repository
On distributions that have SELinux enabled out of the box, you will need to either re-context the files or set SELinux to Permissive mode for docker-elk to start properly. For example, on Red Hat and CentOS, the following applies the proper context:
$ chcon -R system_u:object_r:admin_home_t:s0 docker-elk/
If you're using Docker for Windows, ensure the "Shared Drives" feature is enabled for the C: drive (Docker for Windows > Settings > Shared Drives). See Configuring Docker for Windows Shared Drives (MSDN Blog).
Clone all project repositories into the same parent directory and check out the most up-to-date branch in each repository.
$ git clone https://github.com/TU-Munich/endorse-elk.git
$ git clone https://github.com/TU-Munich/endorse-data-nlp.git
$ git clone https://github.com/TU-Munich/endorse-dashboard.git
Change directory to endorse-elk:
$ cd endorse-elk/
Note: If you switched branches, updated a base image, or are running the project for the first time, you need to run docker-compose build first:
$ docker-compose build
Start the stack using docker-compose:
$ docker-compose up
You can also run all services in the background (detached mode) by adding the -d flag to the above command.
Give Kibana a few seconds to initialize, then access the Kibana web UI by hitting http://localhost:5601 with a web browser.
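Rather than waiting a fixed number of seconds, you could poll until Kibana answers. The sketch below is only an illustration: `retry` is a hypothetical helper (not part of this repository), and the use of Kibana's `/api/status` endpoint in the usage comment is an assumption about your Kibana version.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is exhausted,
# sleeping DELAY seconds between attempts.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Usage (assumes curl is installed and the stack runs on localhost):
#   retry 30 2 curl -fsS -o /dev/null http://localhost:5601/api/status \
#     && echo "Kibana is up"
```

The helper takes the probe as an arbitrary command, so the same loop could be pointed at Elasticsearch on port 9200 instead.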
By default, the stack exposes the following ports:
- 5000: Logstash TCP input
- 9200: Elasticsearch HTTP
- 9300: Elasticsearch TCP transport
- 5601: Kibana
- 3002: Endorse Data NLP backend
- 3000: Endorse Dashboard frontend
WARNING: If you're using boot2docker, you must access it via the boot2docker IP address instead of localhost.
WARNING: If you're using Docker Toolbox, you must access it via the docker-machine IP address instead of localhost.
Now that the stack is running, you will want to inject some log entries. The shipped Logstash configuration allows you to send content via TCP:
$ nc localhost 5000 < /path/to/logfile.log
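If you have no log file at hand, a few synthetic lines are enough to create an index. This is a minimal sketch: `sample_logs` is a hypothetical helper, and the line format it prints is arbitrary, not something the shipped Logstash configuration requires.

```shell
#!/bin/sh
# sample_logs COUNT
# Hypothetical helper: prints COUNT log lines, each prefixed with a UTC timestamp.
sample_logs() {
  count=$1
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s INFO sample log entry %d\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$i"
    i=$((i + 1))
  done
}

# Usage (assumes the stack is running and Logstash listens on TCP port 5000):
#   sample_logs 5 | nc localhost 5000
```

Depending on the netcat variant installed, you might need an extra flag (for example `-q 1` or `-N`) for the connection to close once the input ends.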
When Kibana launches for the first time, it is not configured with any index pattern.
NOTE: You need to inject data into Logstash before being able to configure a Logstash index pattern via the Kibana web UI. Then all you have to do is hit the Create button.
Refer to Connect Kibana with Elasticsearch for detailed instructions about the index pattern configuration.
Create an index pattern via the Kibana API:
$ curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
-H 'Content-Type: application/json' \
-H 'kbn-version: 6.4.2' \
-d '{"attributes":{"title":"logstash-*","timeFieldName":"@timestamp"}}'
The created pattern will automatically be marked as the default index pattern as soon as the Kibana UI is opened for the first time.
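To create patterns for other indices, the request body can be generated instead of typed by hand. A small sketch, assuming the same saved objects endpoint as above; `index_pattern_payload` is a hypothetical helper and performs no JSON escaping, so it only suits simple titles and field names.

```shell
#!/bin/sh
# index_pattern_payload TITLE TIME_FIELD
# Hypothetical helper: prints the JSON body for an index pattern.
# No escaping is done; avoid quotes and backslashes in the arguments.
index_pattern_payload() {
  printf '{"attributes":{"title":"%s","timeFieldName":"%s"}}\n' "$1" "$2"
}

# Usage (kbn-version must match the Kibana version you are running):
#   curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
#     -H 'Content-Type: application/json' \
#     -H 'kbn-version: 6.4.2' \
#     -d "$(index_pattern_payload 'logstash-*' '@timestamp')"
```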
NOTE: Configuration is not dynamically reloaded; you will need to restart the stack after any change in the configuration of a component.
The Kibana default configuration is stored in kibana/config/kibana.yml.
It is also possible to map the entire config directory instead of a single file.
The Logstash configuration is stored in logstash/config/logstash.yml.
It is also possible to map the entire config directory instead of a single file; however, you must be aware that Logstash will be expecting a log4j2.properties file for its own logging.
The Elasticsearch configuration is stored in elasticsearch/config/elasticsearch.yml.
You can also specify the options you want to override directly via environment variables:
elasticsearch:
  environment:
    network.host: "_non_loopback_"
    cluster.name: "my-cluster"
Follow the instructions from the Wiki: Scaling out Elasticsearch
The data stored in Elasticsearch will be persisted after container reboot but not after container removal.
In order to persist Elasticsearch data even after removing the Elasticsearch container, you'll have to mount a volume on your Docker host. Update the elasticsearch service declaration to:
elasticsearch:
  volumes:
    - /path/to/storage:/usr/share/elasticsearch/data
This will store Elasticsearch data inside /path/to/storage.
NOTE: beware of these OS-specific considerations:
- Linux: the unprivileged elasticsearch user is used within the Elasticsearch image, therefore the mounted data directory must be owned by the uid 1000.
- macOS: the default Docker for Mac configuration allows mounting files from /Users/, /Volumes/, /private/, and /tmp exclusively. Follow the instructions from the documentation to add more locations.
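On Linux, the ownership requirement above can be checked before starting the stack. The following is a sketch only: `check_data_dir` and `owner_uid` are hypothetical helpers, and the default of 1000 simply mirrors the uid mentioned in the note.

```shell
#!/bin/sh
# owner_uid PATH
# Prints the numeric uid that owns PATH (third column of `ls -ldn`).
owner_uid() {
  ls -ldn -- "$1" | awk '{print $3}'
}

# check_data_dir DIR [EXPECTED_UID]
# Hypothetical helper: succeeds only if DIR exists and is owned by
# EXPECTED_UID (defaults to 1000, the uid named in the note above).
check_data_dir() {
  dir=$1
  expected=${2:-1000}
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  actual=$(owner_uid "$dir")
  if [ "$actual" != "$expected" ]; then
    echo "$dir is owned by uid $actual, expected $expected" >&2
    return 1
  fi
}

# Usage:
#   check_data_dir /path/to/storage || sudo chown -R 1000 /path/to/storage
```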
To add plugins to any ELK component you have to:
- Add a RUN statement to the corresponding Dockerfile (e.g. RUN logstash-plugin install logstash-filter-json)
- Add the associated plugin code configuration to the service configuration (e.g. Logstash input/output)
- Rebuild the images using the docker-compose build command
A few extensions are available inside the extensions directory. These extensions provide features which are not part of the standard Elastic stack, but can be used to enrich it with extra integrations.
The documentation for these extensions is provided inside each individual subdirectory, on a per-extension basis. Some of them require manual changes to the default ELK configuration.
By default, both Elasticsearch and Logstash start with 1/4 of the total host memory allocated to the JVM Heap Size.
The startup scripts for Elasticsearch and Logstash can append extra JVM options from the value of an environment variable, allowing the user to adjust the amount of memory that can be used by each component:
Service | Environment variable |
---|---|
Elasticsearch | ES_JAVA_OPTS |
Logstash | LS_JAVA_OPTS |
To accommodate environments where memory is scarce (Docker for Mac has only 2 GB available by default), the Heap Size allocation is capped by default to 256MB per service in the docker-compose.yml file. If you want to override the default JVM configuration, edit the matching environment variable(s) in the docker-compose.yml file.
For example, to increase the maximum JVM Heap Size for Logstash:
logstash:
  environment:
    LS_JAVA_OPTS: "-Xmx1g -Xms1g"
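The example above sets the minimum and maximum heap to the same value. If you generate these values from a script, a helper along these lines keeps the two flags in sync. It is a sketch: `heap_opts` is hypothetical, and the size check (digits followed by k, m or g) is a deliberate simplification of what the JVM accepts.

```shell
#!/bin/sh
# heap_opts SIZE
# Hypothetical helper: prints matching -Xmx/-Xms flags for SIZE (e.g. 256m, 1g).
heap_opts() {
  size=$1
  # Simplified validation: digits followed by a k, m or g suffix.
  if ! printf '%s' "$size" | grep -Eq '^[0-9]+[kKmMgG]$'; then
    echo "invalid heap size: $size" >&2
    return 1
  fi
  printf '%s\n' "-Xmx$size -Xms$size"
}

# Usage:
#   heap_opts 1g    # value to paste into LS_JAVA_OPTS or ES_JAVA_OPTS
```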
As with the Java heap memory (see above), you can specify JVM options to enable JMX and map the JMX port on the Docker host.
Update the {ES,LS}_JAVA_OPTS environment variable with the following content (I've mapped the JMX service on the port 18080, you can change that). Do not forget to update the -Djava.rmi.server.hostname option with the IP address of your Docker host (replace DOCKER_HOST_IP):
logstash:
  environment:
    LS_JAVA_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18080 -Dcom.sun.management.jmxremote.rmi.port=18080 -Djava.rmi.server.hostname=DOCKER_HOST_IP -Dcom.sun.management.jmxremote.local.only=false"
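Because the option string is long and the port appears twice, it can be easier to assemble it from a host and a port. The sketch below reproduces the flags shown above; `jmx_opts` is a hypothetical helper, and 192.0.2.10 in the usage comment is a placeholder address.

```shell
#!/bin/sh
# jmx_opts HOST_IP [PORT]
# Hypothetical helper: prints the JMX flags from the example above, with
# HOST_IP as the RMI hostname and PORT (default 18080) for both JMX ports.
jmx_opts() {
  host=$1
  port=${2:-18080}
  # Collect the flags as positional parameters, then join them with spaces.
  set -- \
    "-Dcom.sun.management.jmxremote" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.port=$port" \
    "-Dcom.sun.management.jmxremote.rmi.port=$port" \
    "-Djava.rmi.server.hostname=$host" \
    "-Dcom.sun.management.jmxremote.local.only=false"
  printf '%s\n' "$*"
}

# Usage:
#   jmx_opts 192.0.2.10          # default port 18080
#   jmx_opts 192.0.2.10 9010     # custom port
```

Remember that the chosen port also has to be published in the service's port mapping for a JMX client on the host to reach it.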
To use a different Elastic Stack version than the one currently available in the repository, simply change the version number inside the .env file, and rebuild the stack with:
$ docker-compose build
$ docker-compose up
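The version change itself can be scripted. The sketch below rewrites a single KEY=value line in place; `set_env_var` is a hypothetical helper, and the variable name `ELK_VERSION` in the usage comment is an assumption, so check your .env file for the actual name.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE
# Hypothetical helper: replaces the line "KEY=..." in FILE with "KEY=VALUE".
# Other lines are left untouched; VALUE must not contain "|" or "&".
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Write to a temporary file first, then copy back so that FILE keeps
  # its original permissions.
  if sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"; then
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage (ELK_VERSION is an assumed variable name):
#   set_env_var .env ELK_VERSION 6.5.0
#   docker-compose build && docker-compose up
```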
NOTE: Always pay attention to the upgrade instructions for each individual component before performing a stack upgrade.
See the following Wiki pages:
Experimental support for Docker Swarm is provided in the form of a docker-stack.yml file, which can be deployed in an existing Swarm cluster using the following command:
$ docker stack deploy -c docker-stack.yml elk
If all components get deployed without any error, the following command will show 3 running services:
$ docker stack services elk
NOTE: To scale Elasticsearch in Swarm mode, configure zen to use the DNS name tasks.elasticsearch instead of elasticsearch.