Datapoint Scripts

Nicholas Long edited this page Nov 1, 2019 · 1 revision

Prerequisites

This feature is available only in Rails environments where Rails.application.config.job_manager == :resque. For most users, this means OpenStudio Server is running via a Docker deployment or AMI. It is not available for local installs, which use delayed_jobs for job management. The functionality is handled by the run_initialization and run_finalization methods in the Analysis class.

Scripts are run in a Linux shell. Leveraging this functionality requires Linux scripting.

Datapoint Initialization & Finalization Scripts

To enable complex workflows for advanced use cases, datapoint initialization and finalization scripts exist to provide non-Ruby hooks into the simulation context of OpenStudio Server. These scripts can be defined and uploaded for an analysis using PAT. They function by running inside of the worker container before or after each and every analysis datapoint (OpenStudio simulation) is executed. These scripts should be written in bash and can run any commands available on the system, including calls to Ruby. They can be provided with arguments as standard ARGV inputs.

A common use of datapoint initialization scripts is to modify the environment on a worker node prior to running OpenStudio. Because the script runs before every datapoint, you probably want to modify the environment only once per worker rather than before each simulation. Thus, your initialize script should include a check to determine whether it has already run and exit immediately if so. The Example Scripts below include an example of this.
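The run-once check described above can be sketched with a marker file. This is a minimal illustration, not the server's own mechanism: the MARKER_DIR variable, the marker file name, and the run_once function are hypothetical names chosen for this example.

```shell
#!/usr/bin/env sh
# Run-once guard: a marker file records that setup already happened.
# MARKER_DIR and the marker name are hypothetical; choose any path that
# persists on the worker for the life of the analysis.
MARKER_DIR="${MARKER_DIR:-${TMPDIR:-/tmp}}"
MARKER="$MARKER_DIR/.initialize_done"

run_once() {
  if [ -f "$MARKER" ]; then
    echo "Environment already initialized; skipping."
    return 0
  fi
  # One-time environment changes would go here.
  echo "Initializing environment."
  touch "$MARKER"
}

run_once
```

If several datapoints can start at the same time on one worker, creating a lock directory with mkdir is a common atomic alternative to the file test.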

PAT

The PAT interface supports upload of up to one each of initialize and finalize bash scripts per datapoint and per analysis. It also supports an "additional files" upload for any supplementary scripts that are called by initialize.sh.

Execution context

Datapoint initialization and finalization scripts are unzipped on worker containers to /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/datapoint/(initialize|finalize).sh. Additional files can be included in /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/lib/ and called by the initialize or finalize scripts. Note that this file structure changed with release 2.6.2. For earlier releases, see Older Releases. The script directory may optionally include the files initialize.args and finalize.args. Each of these files contains a JSON array of arguments to pass to the corresponding script. Resulting project structure:

/measures/<files>
/lib/<uploaded included files>
/seeds/<files>
/weather/<files>
/scripts/(analysis|data_point)/(initialize|finalize).sh
/scripts/(analysis|data_point)/(initialize|finalize).args

As an example, for an analysis with the UUID 7cc755d5-83f7-4a64-92b3-abc7cb0d3294 the datapoint finalize script lives at /mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.sh. If the file /mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.args is present and contains the array ["openstudio-standards", "NREL/openstudio-standards", "custom-branch"], then the following will be called after the simulation completes:

/mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.sh "openstudio-standards" "NREL/openstudio-standards" "custom-branch"
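Inside the script, the entries of the .args array arrive as ordinary positional parameters. The sketch below shows one way a finalize.sh might name them. The describe_args function and the GEM_NAME, GEM_REPO, and GEM_BRANCH variables are hypothetical and only mirror the three-element array from the example above.

```shell
#!/usr/bin/env sh
# Hypothetical finalize.sh body: names the three positional arguments
# taken from the finalize.args example. Variable names are illustrative.
describe_args() {
  GEM_NAME="$1"     # e.g. "openstudio-standards"
  GEM_REPO="$2"     # e.g. "NREL/openstudio-standards"
  GEM_BRANCH="$3"   # e.g. "custom-branch"
  echo "gem=$GEM_NAME repo=$GEM_REPO branch=$GEM_BRANCH (argc=$#)"
}

describe_args "$@"
```

Quoting each parameter keeps arguments that contain spaces intact.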

As of 2.6.2, initialization/finalization scripts are allowed 4 hours before timing out. For earlier releases, see Older Releases.

Datapoint initialize scripts run only after the analysis.zip and analysis.json files have been downloaded from the web container onto the worker container and analysis.zip has been unzipped; see the end of the initialize_worker method in run_simulate_data_point.rb. Datapoint finalize scripts run after the simulation has completed, regardless of whether the simulation succeeded. If the simulation completes successfully but the datapoint finalization script does not return exit code 0, the datapoint is marked as errored. See approximately 50 lines before the end of the perform method for more details.
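Because a non-zero exit code from the finalize script marks an otherwise successful datapoint as errored, it can help to separate best-effort work from work that must succeed. The following is a sketch under that assumption. The optional_step, required_step, and finalize functions are hypothetical placeholders, not part of OpenStudio Server.

```shell
#!/usr/bin/env sh
# Sketch of exit-code handling in a finalize script.
# optional_step and required_step are hypothetical placeholders.
optional_step() { rm -f "$1/does_not_matter.tmp"; }
required_step() { [ -d "$1" ]; }

finalize() {
  dir="$1"
  # Best-effort work: log a warning but do not fail the datapoint.
  optional_step "$dir" || echo "WARNING: optional cleanup failed."
  # Required work: a non-zero status here should fail the script.
  if ! required_step "$dir"; then
    echo "ERROR: $dir is missing."
    return 1
  fi
  return 0
}

finalize "$PWD"
exit_code=$?
echo "finalize exit code: $exit_code"
# A real script would end with: exit "$exit_code"
```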

Both initialization and finalization scripts are executed by the same method, run_file. This method writes the script log to the run directory of the datapoint and passes the arguments defined in PAT to the script on invocation. Only two environment variables are provided to the script, SCRIPT_ANALYSIS_ID and SCRIPT_DATA_POINT_ID, which contain the analysis UUID and datapoint UUID of the simulation. The script is then invoked using Ruby's spawn method. If you would like an additional environment variable added to the script execution environment, please open a feature issue with your request and rationale.
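As an illustration of the two environment variables, the sketch below builds the datapoint directory path from them. The datapoint_dir function is hypothetical, the data_point_<ID> directory naming is taken from the cleanup example on this page, and the fallback values exist only so the sketch runs outside a worker container.

```shell
#!/usr/bin/env sh
# Build the datapoint directory path from the two provided variables.
# The fallbacks are placeholders for running outside a worker container.
datapoint_dir() {
  analysis_id="${SCRIPT_ANALYSIS_ID:-unknown-analysis}"
  datapoint_id="${SCRIPT_DATA_POINT_ID:-unknown-datapoint}"
  echo "/mnt/openstudio/analysis_${analysis_id}/data_point_${datapoint_id}"
}

echo "Datapoint directory: $(datapoint_dir)"
```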

Older Releases

Prior to the 2.6.2 release, an arbitrary number of scripts could be included in the /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/worker_<finalization/initialization>/ directories and executed in alphabetical order. For versions < 2.6.2, datapoint initialization and finalization scripts were required to execute within 10 minutes.

Example scripts

Download and unzip a remote file once

This example script downloads and unpacks a zip file only once. The URL of the zip file is the first and only argument defined in the PAT application. The files are written to /mnt/openstudio/analysis_<ANALYSIS_UUID>/.

#!/usr/bin/env sh

# Switch to directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1

# move to the top level analysis directory
cd ..
cd ..

# if the lib directory does not exist, create it
if ! [ -d "lib" ]; then
  mkdir "lib"
fi
cd "lib"

# only execute the following if the file does not already exist
FILENAME="${1##*/}"

if ! [ -f "$FILENAME" ]; then
  CNT=0
  MAX_RETRIES=10

  # Download the archive, retrying until it exists or the retry limit is hit
  echo "Retrieving archived files."
  until curl --retry 10 -f -O "$1" && [ -f "$FILENAME" ]; do
    CNT=$((CNT+1))
    if [ "$CNT" -ge "$MAX_RETRIES" ]; then
      echo "ERROR: $FILENAME not downloaded after $CNT attempts. Aborting..."
      exit 1
    fi
  done

  # Extract into the top-level analysis directory
  # (assumes unzip is installed on the worker)
  unzip -o "$FILENAME" -d .. || exit 1
else
  echo "Zip file already downloaded"
fi

Run a ruby script

This example executes, before each datapoint, an uploaded Ruby script named example.rb from the lib directory, which is defined as an analysis resource through the PAT application. There are no input arguments in this example.

#!/usr/bin/env sh

# Switch to directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1

# move to the top level analysis directory
cd ..
cd ..

# if the lib directory does not exist, error
if ! [ -d "lib" ]; then
  echo "ERROR: lib directory not uploaded."
  exit 1
fi

# if the example.rb file does not exist, error, otherwise execute it
if [ -f "lib/example.rb" ]; then
  echo "Executing the example.rb file."
  ruby lib/example.rb
else
  echo "ERROR: lib/example.rb not uploaded."
  exit 1
fi

Remove the maximal set of datapoint files after completion

This example finalization script removes as much of the datapoint directory as is recommended. Any additional deletions may cause the datapoint to error out while attempting to complete the perform method in run_simulate_data_point.rb.

#!/usr/bin/env sh

# Switch to directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1

# Delete selective large files from data_point dir
echo "Cleaning up data_point directory."
DPDIR="data_point_$SCRIPT_DATA_POINT_ID"
cd ..
cd ..

echo "Original files:"
ls -l $DPDIR
ls -l $DPDIR/run
du -h $DPDIR

rm -f $DPDIR/in.osm
rm -f $DPDIR/in.idf

rm -f $DPDIR/run/in.osm
rm -f $DPDIR/run/in.idf
rm -f $DPDIR/run/*.err
rm -f $DPDIR/run/*.json
rm -f $DPDIR/run/*.osw
rm -f $DPDIR/run/*.htm
rm -f $DPDIR/run/*.job
rm -f $DPDIR/run/run.log
rm -f $DPDIR/run/stdout*
rm -rf $DPDIR/run/eplusout*

echo "Final files:"
ls -l $DPDIR
ls -l $DPDIR/run
du -h $DPDIR