Datapoint Scripts
This is available only for Rails environments for which Rails.application.config.job_manager == :resque. For most users, this means when OpenStudio Server is run via a Docker deployment or AMI. This functionality is not available for local installs, which use delayed_jobs for job management. It is handled by the run_initialization and run_finalization methods in the Analysis class.
Scripts are run in a Linux shell. Leveraging this functionality requires Linux scripting.
To enable complex workflows for advanced use cases, datapoint initialization and finalization scripts exist to provide non-Ruby hooks into the simulation context of OpenStudio Server. These scripts can be defined and uploaded for an analysis using PAT. They function by running inside of the worker container before or after each and every analysis datapoint (OpenStudio simulation) is executed. These scripts should be written in bash and can run any commands available on the system, including calls to Ruby. They can be provided with arguments as standard ARGV inputs.
A common use of datapoint initialization scripts is to modify the environment on a worker node prior to running OpenStudio. Because the script runs before every datapoint, you probably want to modify the environment only once per worker rather than before each simulation. Thus, your initialize script should include a check to determine whether it has already run and exit immediately if so. The Example Scripts below include an example of this.
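The run-once check described above can be sketched with a marker file. This is a minimal illustration, not the mechanism used by OpenStudio Server: the run_once helper and the marker file name are hypothetical, and the sketch does not guard against two datapoints starting at the same instant on one worker.

```shell
#!/usr/bin/env sh
# Hypothetical helper: run a setup command only if a marker file is absent.
# Usage: run_once <marker-file> <command> [args...]
run_once() {
  marker="$1"
  shift
  if [ -f "$marker" ]; then
    echo "Already initialized; skipping."
    return 0
  fi
  # Run the setup command; record success only if it exits 0
  "$@" || return 1
  touch "$marker"
}
```

Inside an initialize script, a call such as run_once "$(dirname "$0")/.initialized" sh ./setup_environment.sh would keep the marker next to the script, so it is scoped to one analysis on one worker. Here setup_environment.sh is a placeholder name for whatever setup you need.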
The PAT interface supports upload of up to one each of initialize and finalize bash scripts per datapoint and per analysis. It supports an "additional files" upload for any supplementary scripts that are called by initialize.sh.
Datapoint initialization and finalization scripts will be unzipped on worker containers to /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/datapoint/<initialize|finalize>.sh. Additional files can be included in /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/lib/ and called by initialize or finalize scripts. Note that this file structure changed with release 2.6.2. For earlier releases, see Older Releases. The same directory as each script may optionally include the files initialize.args and finalize.args. These files contain a JSON array of arguments to pass to the corresponding script.
Resulting project structure:
/measures/<files>
/lib/<uploaded included files>
/seeds/<files>
/weather/<files>
/scripts/(analysis|data_point)/(initialize|finalize).sh
/scripts/(analysis|data_point)/(initialize|finalize).args
As an example, for an analysis with the UUID 7cc755d5-83f7-4a64-92b3-abc7cb0d3294, the datapoint finalize script lives at /mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.sh. If the file /mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.args is present and contains the array ["openstudio-standards", "NREL/openstudio-standards", "custom-branch"], then the following will be called after the simulation completes:
/mnt/openstudio/analysis_7cc755d5-83f7-4a64-92b3-abc7cb0d3294/scripts/datapoint/finalize.sh "openstudio-standards" "NREL/openstudio-standards" "custom-branch"
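Inside the script, the array entries arrive as ordinary positional parameters. The following is a minimal sketch of how a script might validate and name them. The describe_args function and the variable names gem_name, repo, and branch are illustrative assumptions, not part of OpenStudio Server.

```shell
#!/usr/bin/env sh
# Hypothetical argument handling for a script invoked with three arguments.
describe_args() {
  if [ "$#" -ne 3 ]; then
    echo "ERROR: expected 3 arguments, got $#" >&2
    return 1
  fi
  gem_name="$1"   # first entry of the .args array
  repo="$2"       # second entry
  branch="$3"     # third entry
  echo "gem=$gem_name repo=$repo branch=$branch"
}
```

A script would typically call describe_args "$@" near the top and exit with a non-zero code if it fails, so that a malformed .args file is reported rather than silently ignored.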
As of 2.6.2, initialization/finalization scripts are allowed 4 hours before timing out. For earlier releases, see Older Releases.
Datapoint initialize scripts are run only after the worker container has finished downloading the analysis.zip and analysis.json files from the web container and unzipping analysis.zip. See the end of the initialize_worker method in run_simulate_data_point.rb. Datapoint finalize scripts are run after the simulation has completed, regardless of whether the simulation succeeded. If the simulation completes successfully but the datapoint finalization script fails to return exit code 0, the datapoint will be marked as errored. See approximately 50 lines before the end of the perform method for more details.
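Because a non-zero exit code from the finalization script marks the datapoint as errored, it can help to make each step's outcome explicit. The sketch below is one way to do that, assuming a hypothetical run_step helper. It is not taken from the OpenStudio Server code.

```shell
#!/usr/bin/env sh
# Hypothetical helper: run one step, log the outcome, and pass its exit code on.
# Usage: run_step <description> <command> [args...]
run_step() {
  desc="$1"
  shift
  if "$@"; then
    echo "OK: $desc"
  else
    rc=$?
    echo "ERROR: $desc failed with exit code $rc" >&2
    return "$rc"
  fi
}
```

A step whose failure should error the datapoint can be written as run_step "upload results" some_command || exit 1, while an optional step can be written as run_step "optional cleanup" some_command || true. Here some_command stands in for your own command.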
In both initialization and finalization scripts, the same method, run_file, is used for execution of the script. This method writes the script log to the run directory of the datapoint and provides the arguments defined in PAT to the script on invocation. Only two environment variables are provided to the script, SCRIPT_ANALYSIS_ID and SCRIPT_DATA_POINT_ID, which contain the analysis UUID and datapoint UUID of the simulation. The script is then invoked using Ruby's spawn method. If you would like an additional environment variable added to the script execution environment, please open a feature issue with your request and rationale.
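As an illustration of using the two environment variables, the sketch below builds the datapoint directory path from them. The datapoint_dir function is hypothetical, and the directory layout is an assumption inferred from the cleanup example in this page (a data_point_<UUID> directory directly under the analysis directory).

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the datapoint directory for the current run.
datapoint_dir() {
  # Both variables are expected to be set by the server when it invokes the script
  if [ -z "${SCRIPT_ANALYSIS_ID:-}" ] || [ -z "${SCRIPT_DATA_POINT_ID:-}" ]; then
    echo "ERROR: SCRIPT_ANALYSIS_ID and SCRIPT_DATA_POINT_ID must be set" >&2
    return 1
  fi
  # Assumed layout: /mnt/openstudio/analysis_<UUID>/data_point_<UUID>
  echo "/mnt/openstudio/analysis_${SCRIPT_ANALYSIS_ID}/data_point_${SCRIPT_DATA_POINT_ID}"
}
```

Failing early when either variable is missing makes it obvious when the script is being run by hand outside of the server.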
Older Releases
Prior to the 2.6.2 release, an arbitrary number of scripts could be included in the /mnt/openstudio/analysis_<ANALYSIS_UUID>/scripts/worker_<finalization/initialization>/ directories and executed in alphabetical order. For versions < 2.6.2, datapoint initialization and finalization scripts were required to execute within 10 minutes.
Example Scripts
This example script downloads and unpacks, only once, the zip file whose URL is given as the first and only argument in the PAT application. The files are written to /mnt/openstudio/analysis_<ANALYSIS_UUID>/lib/.
#!/usr/bin/env sh
# Switch to the directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1
# Move to the top-level analysis directory
cd ../.. || exit 1
# If the lib directory does not exist, create it
mkdir -p "lib"
cd "lib" || exit 1
# Only execute the following if the file does not already exist
FILENAME="${1##*/}"
if ! [ -f "$FILENAME" ]; then
  # Download and extract the archived files; curl retries transient failures itself
  echo "Retrieving archived files."
  if ! curl --fail --retry 10 -O "$1" || ! [ -f "$FILENAME" ]; then
    echo "ERROR: $FILENAME not successfully downloaded. Aborting..."
    # Remove any partial download so that the next datapoint tries again
    rm -f "$FILENAME"
    exit 1
  fi
  # Unpack the archive (assumes unzip is installed in the worker container)
  if ! unzip -o "$FILENAME"; then
    echo "ERROR: $FILENAME could not be unpacked. Aborting..."
    exit 1
  fi
else
  echo "Zip file already downloaded"
fi
This example executes an uploaded Ruby script named example.rb from the lib directory, defined as an analysis resource through the PAT application, before each datapoint. There are no input arguments in this example.
#!/usr/bin/env sh
# Switch to the directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1
# Move to the top-level analysis directory
cd ../.. || exit 1
# If the lib directory does not exist, error
if ! [ -d "lib" ]; then
  echo "ERROR: lib directory not uploaded."
  exit 1
fi
# If the example.rb file does not exist, error; otherwise execute it
if [ -f "lib/example.rb" ]; then
  echo "Executing the example.rb file."
  ruby lib/example.rb
else
  echo "ERROR: lib/example.rb not uploaded."
  exit 1
fi
This example finalization script removes as much of the datapoint as is recommended. Any additional deletions may lead to the datapoint erroring out while attempting to complete the perform method in run_simulate_data_point.rb.
#!/usr/bin/env sh
# Switch to the directory the script resides in
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
cd "$SCRIPTPATH" || exit 1
# Delete selected large files from the data_point directory
echo "Cleaning up data_point directory."
DPDIR="data_point_$SCRIPT_DATA_POINT_ID"
# Move to the top-level analysis directory
cd ../.. || exit 1
echo "Original files:"
ls -l "$DPDIR"
ls -l "$DPDIR/run"
du -h "$DPDIR"
rm -f "$DPDIR/in.osm"
rm -f "$DPDIR/in.idf"
rm -f "$DPDIR/run/in.osm"
rm -f "$DPDIR/run/in.idf"
rm -f "$DPDIR"/run/*.err
rm -f "$DPDIR"/run/*.json
rm -f "$DPDIR"/run/*.osw
rm -f "$DPDIR"/run/*.htm
rm -f "$DPDIR"/run/*.job
rm -f "$DPDIR/run/run.log"
rm -f "$DPDIR"/run/stdout*
# -f keeps rm from reporting an error when no eplusout files exist
rm -rf "$DPDIR"/run/eplusout*
echo "Final files:"
ls -l "$DPDIR"
ls -l "$DPDIR/run"
du -h "$DPDIR"