[Superset] Superset Docker Swarm files #741

Open
wants to merge 12 commits into
base: master
Changes from 3 commits
30 changes: 30 additions & 0 deletions Docker-Swarm-deployment/analytics/README.md
karun-singh marked this conversation as resolved.
@@ -0,0 +1,30 @@
## Getting started with Superset (visualization tool) using Docker Swarm

Before deploying the Superset stack, set the environment variables that customize Superset and let it establish connections with its backend components.

#### Setting up environment variables:

- Create a `.env` (hidden) file in the same directory as your Docker stack file.
- Copy the contents of `superset_env_file` into the `.env` file and replace the placeholders with actual values.
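
The two steps above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: `make_env_file` and `list_placeholders` are hypothetical helper names, and generating the secret with `openssl rand -base64 42` assumes `openssl` is installed on the deploying host.

```sh
# Sketch (assumption): seed .env from the template and fill in SUPERSET_SECRET_KEY.
# make_env_file and list_placeholders are illustrative helpers, not repo scripts.
make_env_file() {
  template="$1"
  target="$2"
  secret="$3"
  # Replace only the SUPERSET_SECRET_KEY line; every other line is copied unchanged.
  sed "s|^SUPERSET_SECRET_KEY=.*|SUPERSET_SECRET_KEY=${secret}|" "$template" > "$target"
  # The file holds credentials, so restrict it to the current user.
  chmod 600 "$target"
}

# Print (with line numbers) every KEY=<placeholder> entry that is still unfilled.
list_placeholders() {
  grep -n '=<[^>]*>[[:space:]]*$' "$1" || true
}

# Example, run from the directory that contains superset-stack.yaml:
# make_env_file superset_env_file .env "$(openssl rand -base64 42)"
# list_placeholders .env
```

Any line still printed by `list_placeholders` needs a real value before deploying.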


#### To deploy:
```sh
docker stack deploy -c superset-stack.yaml superset
```
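
The stack file declares `overlay-net` as an external network, so it has to exist before the deploy command is run. A minimal sketch of a pre-deploy check is shown here; `ensure_overlay_network` is a hypothetical helper, and the `DOCKER` variable is an assumption added only so the function can be dry-run with a stub.

```sh
# Sketch (assumption): create the external overlay network if it is missing.
# DOCKER can be overridden with a stub for dry runs; it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

ensure_overlay_network() {
  name="$1"
  if "$DOCKER" network inspect "$name" >/dev/null 2>&1; then
    echo "network ${name} already exists"
  else
    # --attachable also lets standalone containers join the network for debugging.
    "$DOCKER" network create --driver overlay --attachable "$name"
  fi
}

# Example, run on a Swarm manager node before `docker stack deploy`:
# ensure_overlay_network overlay-net
```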

#### To check the status:
```sh
docker service ls

ID             NAME                            MODE         REPLICAS   IMAGE                                 PORTS
7ztp4yx1d1gc   superset_redis                  replicated   1/1        redis:7
k2lkdttsrgrw   superset_superset               replicated   1/1        ghcr.io/datakaveri/superset:4.0.2-1   *:8088->8088/tcp
ijzzqgxx8rd1   superset_superset-worker        replicated   1/1        ghcr.io/datakaveri/superset:4.0.2-1
x1ojkx3smg0y   superset_superset-worker-beat   replicated   1/1        ghcr.io/datakaveri/superset:4.0.2-1
rv2yw340gsd0   superset_superset_init          replicated   0/1        ghcr.io/datakaveri/superset:4.0.2-1
```

The **superset_superset_init** service is expected to show `0/1` replicas: it goes down once it has performed its bootstrap operations.
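
To spot services that are unexpectedly down, the replica counts can be filtered with a small script. The sketch below is illustrative only: `not_ready` is a hypothetical helper that reads `NAME REPLICAS` pairs, which `docker service ls --format '{{.Name}} {{.Replicas}}'` can produce.

```sh
# Sketch (assumption): list services whose running replica count differs from
# the desired count. Input is "NAME REPLICAS" lines on stdin, e.g. "superset_redis 1/1".
not_ready() {
  while read -r name replicas _rest; do
    [ -n "$name" ] || continue
    running="${replicas%%/*}"   # part before the slash
    desired="${replicas##*/}"   # part after the slash
    if [ "$running" != "$desired" ]; then
      echo "$name"
    fi
  done
}

# Example:
# docker service ls --format '{{.Name}} {{.Replicas}}' | not_ready
```

After a successful bootstrap, only `superset_superset_init` should be listed; `docker service logs superset_superset_init` shows what the init step did.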

146 changes: 146 additions & 0 deletions Docker-Swarm-deployment/analytics/superset-stack.yaml
@@ -0,0 +1,146 @@
version: '3.9'

services:
  redis:
    image: redis:7
    container_name: superset_cache
    restart: unless-stopped
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        max_attempts: 5
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
    volumes:
      - redis:/data

  superset_init:
    image: ghcr.io/datakaveri/superset:4.0.2-1
    container_name: superset_init
    env_file:
      - .env
    volumes:
      - superset_home:/app/superset_home
    command: ["/app/docker/docker-init.sh"]
    networks:
      - overlay-net

  superset:
    image: ghcr.io/datakaveri/superset:4.0.2-1
    container_name: superset
    restart: unless-stopped
    ports:
      - "8088:8088"
    env_file:
      - .env
    depends_on:
      - superset_init
    environment:
      - SUPERSET_DATABASE_URL=clickhousedb://default:[email protected]:8123/default # Connection URL
    volumes:
      - superset_home:/app/superset_home
    configs:
      - source: requirements
        target: /app/docker/requirements-local.txt
        mode: 0444
        uid: "1000"
        gid: "1000"
    command: ["/app/docker/docker-bootstrap.sh", "app-gunicorn"]
    networks:
      - overlay-net
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        max_attempts: 5
      resources:
        limits:
          cpus: '2'
          memory: 6G
        reservations:
          cpus: '2'
          memory: 4G
    logging:
      driver: "json-file"
      options:
        max-file: "5"
        max-size: "10m"
        tag: "{\"name\":\"{{.Name}}\",\"id\":\"{{.ID}}\"}"


  superset-worker:
    image: ghcr.io/datakaveri/superset:4.0.2-1
    container_name: superset_worker
    env_file:
      - .env # default
    restart: unless-stopped
    volumes:
      - superset_home:/app/superset_home
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        max_attempts: 5
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "celery -A superset.tasks.celery_app:app inspect ping -d celery@$$HOSTNAME",
        ]
    networks:
      - overlay-net
    command: ["/app/docker/docker-bootstrap.sh", "worker"]


  superset-worker-beat:
    image: ghcr.io/datakaveri/superset:4.0.2-1
    container_name: superset_worker_beat
    env_file:
      - .env # default
    restart: unless-stopped
    volumes:
      - superset_home:/app/superset_home
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        max_attempts: 5
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
    healthcheck:
      disable: true
    networks:
      - overlay-net
    command: ["/app/docker/docker-bootstrap.sh", "beat"]


volumes:
  superset_home:
  redis:

networks:
  overlay-net:
    external: true
    driver: overlay
configs:
  requirements:
    file: ./docker/requirements-local.txt

73 changes: 73 additions & 0 deletions Docker-Swarm-deployment/analytics/superset_env_file
karun-singh marked this conversation as resolved.
@@ -0,0 +1,73 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Superset login username and password
ADMIN_USERNAME=<admin_username>
ADMIN_PASSWORD=<admin_password>

# database configurations (do not modify)
DATABASE_DB=<database_name>
DATABASE_HOST=<database_host>
# Make sure you set this to a unique secure random value on production
DATABASE_PASSWORD=<database_password>
DATABASE_USER=<database_user>

EXAMPLES_DB=examples
EXAMPLES_HOST=no
EXAMPLES_USER=examples
# Make sure you set this to a unique secure random value on production
EXAMPLES_PASSWORD=examples
EXAMPLES_PORT=5432

# database engine specific environment variables
# change the below if you prefer another database engine
DATABASE_PORT=<database_port>

# Select the appropriate dialect such as postgres, mysql, oracle etc.
DATABASE_DIALECT=<database_dialect>

# Set the values below if you are using Postgres as the database; otherwise use the appropriate env keys for your database
POSTGRES_DB=<postgres_db_name>
POSTGRES_USER=<database_user>

# Make sure you set this to a unique secure random value on production
POSTGRES_PASSWORD=<postgres_user_password>

#MYSQL_DATABASE=superset
#MYSQL_USER=superset
#MYSQL_PASSWORD=superset
#MYSQL_RANDOM_ROOT_PASSWORD=yes

# Add the mapped in /app/pythonpath_docker which allows devs to override stuff
PYTHONPATH=/app/pythonpath:/app/docker/pythonpath_dev
REDIS_HOST=<redis_host>
REDIS_PORT=<redis_port>

# Keep Flask debug mode off when SUPERSET_ENV=production
FLASK_DEBUG=false
SUPERSET_ENV=production
SUPERSET_LOAD_EXAMPLES=no
CYPRESS_CONFIG=false
SUPERSET_PORT=8088
MAPBOX_API_KEY=''
Collaborator

How is this configured?

Contributor Author

@SRINI2410 SRINI2410 Sep 30, 2024

@karun-singh, the Mapbox key must be generated at https://www.mapbox.com/; with that key we use the Mapbox service for map-related visualizations. The key details have been provided by the analytics team.

Collaborator

@swarup-e can you please provide a document/guide on how this key is to be generated? We can add it to the README along with the deployment files.


# Make sure you set this to a unique secure random value on production
SUPERSET_SECRET_KEY=<random_base64_alphanum>

ENABLE_PLAYWRIGHT=false
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
BUILD_SUPERSET_FRONTEND_IN_DOCKER=true


2 changes: 2 additions & 0 deletions docker/superset/Dockerfile
@@ -0,0 +1,2 @@
FROM apache/superset:4.0.2
COPY docker/ /app/docker/
20 changes: 20 additions & 0 deletions docker/superset/README.md
@@ -0,0 +1,20 @@

## Superset (a data visualization and data exploration platform)

We build a custom `Dockerfile` on top of the official Superset image; it adds the scripts needed to run and bootstrap Superset.

The `docker` directory contains the scripts needed to bring up Superset.


## TL;DR

1. `docker-bootstrap.sh`: installs the Python modules listed in **/app/docker/requirements-local.txt**, then starts the requested component (web app, Celery worker, or Celery beat)
2. `docker-init.sh`: upgrades the schema and sets up the admin user and password for Superset
3. `run-server.sh`: runs the actual Flask app, i.e. Superset

## Build Docker Image
```sh
docker build -t ghcr.io/datakaveri/superset:4.0.2-1 .
```


60 changes: 60 additions & 0 deletions docker/superset/docker/docker-bootstrap.sh
@@ -0,0 +1,60 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

set -eo pipefail

REQUIREMENTS_LOCAL="/app/docker/requirements-local.txt"
# If Cypress run – overwrite the password for admin and export env variables
if [ "$CYPRESS_CONFIG" == "true" ]; then
    export SUPERSET_CONFIG=tests.integration_tests.superset_test_config
    export SUPERSET_TESTENV=true
    export SUPERSET__SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
fi
#
# Make sure we have dev requirements installed
#
if [ -f "${REQUIREMENTS_LOCAL}" ]; then
  echo "Installing local overrides at ${REQUIREMENTS_LOCAL}"
  pip install --no-cache-dir -r "${REQUIREMENTS_LOCAL}"
else
  echo "Skipping local overrides"
fi

case "${1}" in
  worker)
    echo "Starting Celery worker..."
    # setting up only 2 workers by default to contain memory usage in dev environments
    celery --app=superset.tasks.celery_app:app worker -O fair -l INFO --concurrency=${CELERYD_CONCURRENCY:-2}
    ;;
  beat)
    echo "Starting Celery beat..."
    rm -f /tmp/celerybeat.pid
    celery --app=superset.tasks.celery_app:app beat --pidfile /tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule
    ;;
  app)
    echo "Starting web app (using development server)..."
    flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
    ;;
  app-gunicorn)
    echo "Starting web app..."
    /usr/bin/run-server.sh
    ;;
  *)
    echo "Unknown Operation!!!"
    ;;
esac
26 changes: 26 additions & 0 deletions docker/superset/docker/docker-ci.sh
@@ -0,0 +1,26 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
/app/docker/docker-init.sh

# TODO: copy config overrides from ENV vars

# TODO: run celery in detached state
export SERVER_THREADS_AMOUNT=8
# start up the web server

/usr/bin/run-server.sh