DiSh: Dynamic Shell-Script Distribution

A system for scaling out POSIX shell scripts on distributed file systems. DiSh is part of the PaSh project, which is hosted by the Linux Foundation.

DiSh builds heavily on and extends PaSh (command annotations, compiler infrastructure, and JIT orchestration).

Quick Jump: Installation | Running DiSh | Repo Structure | Evaluation | Community & More | Citing

Installation

The easiest way to try DiSh is with Docker.

The following commands create a virtual cluster on one machine so that you can experiment with DiSh. If you have multiple machines, you can set up Docker Swarm and follow the swarm instructions in docker-hadoop.

## Clone the repo
git clone --recurse-submodules https://github.com/binpash/dish.git

## Install docker using our script (tested on Ubuntu)
## Alternatively see https://docs.docker.com/engine/install/ to install docker.
(cd dish; ./scripts/setup-docker.sh)


## Create the virtual cluster on the host machine
(cd docker-hadoop; ./setup-compose.sh) # currently takes several minutes due to rebuilding the images
## The cluster can be torn down using `docker compose down`

## Create a shell on the client
docker exec -it nodemanager1 bash
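
If `docker exec` fails, the client container may not have started yet. The following is a minimal sketch, not part of DiSh, that checks for the container before opening the shell. The helper names `container_running` and `open_client_shell` are hypothetical; the only assumption about the cluster is the container name `nodemanager1` used above.

```shell
# Hypothetical helper (not part of DiSh): succeed only if a container
# with exactly the given name is currently running.
container_running() {
    docker ps --format '{{.Names}}' | grep -qxF "$1"
}

# Open a shell on the client only if its container is up.
# The function is defined but not called here, so loading this file runs nothing.
open_client_shell() {
    if container_running nodemanager1; then
        docker exec -it nodemanager1 bash
    else
        echo "nodemanager1 is not running; is the cluster up?" >&2
        return 1
    fi
}
```

Calling `open_client_shell` on the host then either drops you into the client or prints a hint on standard error.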

Running DiSh

Let's run a very simple example using DiSh:

cd $DISH_TOP
hdfs dfs -put README.md /README.md # Copy the README to HDFS
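
To confirm that the upload succeeded, one option is to read the file back from HDFS and compare it byte for byte with the local copy. This is a sketch under assumptions: `verify_upload` is a hypothetical helper name, and it relies only on the standard `hdfs dfs -cat` command and POSIX `cmp`.

```shell
# Hypothetical helper (not part of DiSh).
# Usage: verify_upload <local-file> <hdfs-path>
# Succeeds only if the HDFS copy has the same bytes as the local file.
verify_upload() {
    local_file=$1
    hdfs_path=$2
    # Stream the HDFS copy to stdout and compare it with the local file.
    # cmp -s prints nothing; its exit status is the result of the pipeline.
    hdfs dfs -cat "$hdfs_path" | cmp -s - "$local_file"
}
```

For example, `verify_upload README.md /README.md && echo "upload OK"` prints the message only when the two copies are identical.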

Now, you can run this sample script (or create a script of your own). Run both DiSh and Bash and compare the results!

./di.sh ./scripts/sample.sh
bash ./scripts/sample.sh
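
One way to compare the two runs is to capture each output in a temporary file and diff them. The sketch below is not part of DiSh: `compare_runs` is a hypothetical helper, and it assumes the script writes its results to standard output and that `mktemp` is available.

```shell
# Hypothetical helper (not part of DiSh): run one script under two
# runners and report whether their standard output is identical.
# Usage: compare_runs <runner-a> <runner-b> <script>
compare_runs() {
    runner_a=$1
    runner_b=$2
    script=$3
    out_a=$(mktemp) || return 2
    out_b=$(mktemp) || { rm -f "$out_a"; return 2; }

    # Capture stdout of each run; stderr stays on the terminal.
    "$runner_a" "$script" > "$out_a"
    "$runner_b" "$script" > "$out_b"

    if diff "$out_a" "$out_b" > /dev/null; then
        echo "outputs match"
        status=0
    else
        echo "outputs differ"
        status=1
    fi
    rm -f "$out_a" "$out_b"
    return "$status"
}
```

With the commands above, this would be invoked as `compare_runs ./di.sh bash ./scripts/sample.sh`.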

Repo Structure

This repo hosts most of the components developed for DiSh. Some of them are incorporated in PaSh. The structure is as follows:

  • pash: Contains the complete PaSh repo as a submodule. DiSh uses and extends its annotations, compiler, and JIT orchestration infrastructure.
  • evaluation: Shell scripts used for evaluation.
  • runtime: Runtime components, e.g., remote FIFO channels.
  • scripts: Scripts related to installation, deployment, and continuous integration.

Community & More

Chat:

Citing

If you used DiSh, consider citing the following paper:

@inproceedings{dish2023nsdi,
author = {Mustafa, Tammam and Kallas, Konstantinos and Das, Pratyush and Vasilakis, Nikos},
title = {{DiSh}: Dynamic {Shell-Script} Distribution},
booktitle = {20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)},
year = {2023},
isbn = {978-1-939133-33-5},
address = {Boston, MA},
pages = {341--356},
url = {https://www.usenix.org/conference/nsdi23/presentation/mustafa},
publisher = {USENIX Association},
month = apr,
}