diff --git a/.gitignore b/.gitignore index b0ca438..1f0be4a 100644 --- a/.gitignore +++ b/.gitignore @@ -1,7 +1,3 @@ -.terraform* -terraform.tfstate* -*.tfvars* -vars_from_terraform __pycache__/ -ansible_collections/ -*.sif +*.egg-info/ +_version.py diff --git a/README.md b/README.md index 252aaeb..8ec0ca3 100644 --- a/README.md +++ b/README.md @@ -1,381 +1,14 @@ -# On-demand Storm Surge Model Infrastructure +# Probabilistic Hurricane Storm Surge Model Workflow +This is a Python based workflow for getting probabilistic results +from an ensemble of storm surge simulations during tropical cyclone +events. -## AWS On-Demand +TO BE COMPLETED... -This workflow uses ERA5 atmospheric forcing from Copernicus project -for hindcast mode. - -In case images are being rebuilt, to upload Docker images to the ECR, -first login to AWS ECR for docker when your AWS environment is set up. - -``` -aws ecr get-login-password --region | docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com -``` - -** IMPORTANT ** -In this document it is assumed that no infrastructure has -been set up before hand. If this repo is being used as a part of -a collaboration, please check with the "admin" who originally sets it up -for the project in order to avoid overriding existing infrastructure. - -**The infrastructure set up at this point is not intended to be used -by multiple people on the same project. One "admin" sets it up and -then the rest of the collaborators can use Prefect to launch jobs** - -Also the names are not *yet* dynamic. Meaning that for separate projects -user need to modify names in Terraform file by hand! In later iterations -this issue will be addressed! - - -### Setting up accounts -To be able to administer the On-Demand workflow on AWS, first you need -to setup your accounts - -#### For AWS -Make sure you added MFA device in the AWS account. Then create a -API key. From AWS Console, go to "My Security Credentials" from -the pull-down and then create "access key". Make sure you note -what your access key is. This will be used later in setting up the -environment. (refer to AWS documentation) - -#### Prefect -You need to create a Prefect account. - -After creating the account create a **project** named `ondemand-stormsurge`. -You could also collaborate on existing project in other accounts -if you're added as collaborator to the team on Prefect. - -Create an **API key** for your account. Note this API key as it is -going to be used when setting up the environment. - - -### Setting up the environment - -Next to use the infrastructure you need to setup the local environment -correcly: - -#### For AWS -**Only on a trusted machine** -Use `aws configure` to configure your permanent keys. This includes -- permanent profile name -- aws_access_key_id -- aws_secret_access_key -- mfa_serial -- region=us-east-1 - -Using `aws-cli` execute the following code (replace the parts in the -brackets < and >, also remove the brackets themselves) - -```sh -# aws --profile sts get-session-token --serial-number arn:aws:iam:::mfa/ --token-code <6_DIGIT_MFA_PIN> -``` - -If everything is setup correctly, you'll receive a response with the -following items for a temporary credentials: -- AccessKeyId -- SecretAccessKey -- SessionToken -- Expiration - -Note that temporary credentials is **required** when using an -AWS account that has MFA setup. - -Copy these (the first 3) values into your `~/.aws/credentials` file. 
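If you prefer to script this step rather than run the CLI command above by hand, the same STS call can be made from Python. The sketch below is not part of this repository; the profile name, account ID, user name, and MFA code are placeholders you must replace.

```python
# Sketch only: request temporary STS credentials for an MFA-protected account.
# Assumes a permanent profile already configured via `aws configure`.
import boto3

session = boto3.Session(profile_name="<PERMANENT_PROFILE_NAME>")
sts = session.client("sts")

response = sts.get_session_token(
    SerialNumber="arn:aws:iam::<ACCOUNT_ID>:mfa/<USER_NAME>",
    TokenCode="<6_DIGIT_MFA_PIN>",
)

creds = response["Credentials"]
# These are the three values to copy into `~/.aws/credentials`
# (and to export as AWS_* environment variables, as described below).
print("aws_access_key_id =", creds["AccessKeyId"])
print("aws_secret_access_key =", creds["SecretAccessKey"])
print("aws_session_token =", creds["SessionToken"])
print("expires:", creds["Expiration"])
```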
-Note that the the values should be set as the following in the -`credentials` file - -```txt -[temp profile name] -aws_access_key_id = XXXXXXXXXXXXX -aws_secret_access_key = XXXXXXXXXXXXX -aws_session_token = XXXXXXXXXXXXX -``` - -also set these values in your shell environment as (later used by -ansible and prefect): - -```sh -export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXX -export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXX -export AWS_SESSION_TOKEN=XXXXXXXXXXXXX -``` - -also for RDHPCS - -```sh -export RDHPCS_S3_ACCESS_KEY_ID=XXXXXXXXXXXXX -export RDHPCS_S3_SECRET_ACCESS_KEY=XXXXXXXXXXXXX -export PW_API_KEY=XXXXXXXX -``` - -and for ERA5 Copernicus -```sh -export CDSAPI_URL=https://cds.climate.copernicus.eu/api/v2 -export CDSAPI_KEY=: -``` - -Now test if your environment works by getting a list of S3 buckets -on the account: - -```sh -aws s3api list-buckets -``` - -You will get a list of all S3 buckets on the account - -#### For Prefect - -The environment for Prefect is setup by authenticating your local -client with the API server (e.g. Prefect cloud) using: - -```sh -prefect auth login --key API_KEY -``` - -#### Packages -Using `conda` create a single environment for Terraform, Prefect, and -Ansible using the environment file. From the repository -root execute: - -```sh -conda env create -f environment.yml -``` - -#### Misc -Create a **keypair** to be used for connecting to EC2 instaces -provisioned for the workflow. - - -### Usage - -The workflow backend setup happens in 3 steps: - -1. Terraform to setup AWS infrastructure such as S3 and manager EC2 -2. Ansible to install necessary packages and launch Prefect agents - on the manager EC2 - - Since temporary AWS credentials are inserted into the manager EC2 - the Ansible playbook needs to be executed when credentials expire -3. Prefect to define the workflow and start the workflow execution - -#### Step 1 -Currently part of the names and addresses in the Terraform -configurations are defined as `locals`. The rest of them need to be -found and modified from various sections of the Terraform file for -different projects to avoid name clash on the same account. - -In the `locals` section at the top of `main.tf` modify the following: -- `pvt_key_path = "/path/to/private_key"` used to create variable file for Ansible -- `pub_key_path = "/path/to/public_key.pub"` used to setup EC2 SSH key -- `dev = "developer_name"` developer name later to be used for dynamic naming - -In the `provider` section update `profile` to the name of the -temporary credentials profile created earlier - -After updating the values go to `terraform/backend` directory and call - -```sh -terraform init -terraform apply -``` - -Verify the changes and if correct type in `yes`. After this step -the Terraform sets up AWS backend and then creates two variable files: -One used by Ansible to connect to manager EC2 (provisioned by Terraform) -and another for Prefect. - -Now that the backend S3 bucket is provisioned go up to `terrafrom` -directory and execute the same commands for the rest of the -infrastructure. - -Before applying the Terraform script, you need to set `account_id` and -`role_prefix` variables in `terraform.tfvars` file (or pass by -`-var="value"` commandline argument) - -```sh -terraform init -terraform apply -``` - -Note that the Terraform state file for the system backend is stored -locally (in `backend` directory), but the state file for the rest of -the system is encrypted and stored on the S3 bucket defined in -`backend.tf`. 
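Before moving on to Step 2, the Copernicus CDS credentials exported earlier (`CDSAPI_URL`/`CDSAPI_KEY`) can also be sanity-checked from Python. The sketch below is illustrative only; the request is a tiny sample ERA5 download, not the request the workflow itself issues, and the date and variables shown are placeholders.

```python
# Sketch only: verify CDS credentials by downloading a small ERA5 sample.
# cdsapi picks up CDSAPI_URL / CDSAPI_KEY from the environment (or ~/.cdsapirc).
import cdsapi

client = cdsapi.Client()  # fails early if no credentials are configured

client.retrieve(
    "reanalysis-era5-single-levels",  # ERA5 single-level dataset on the CDS
    {
        "product_type": "reanalysis",
        "variable": [
            "10m_u_component_of_wind",
            "10m_v_component_of_wind",
            "mean_sea_level_pressure",
        ],
        "year": "2018",
        "month": "09",
        "day": "13",
        "time": "00:00",
        "format": "grib",
    },
    "era5_sample.grib",  # small test file written to the current directory
)
```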
- - -#### Step 2 - -Now that the backend is ready it's time to setup the manager EC2 -packages and start the agents. To do so Ansible is used. First -make sure that the **environment** variables for AWS and Prefect are -set (as described in the previous section), then go to the -`ansible` directory in the root of the repo and execute - -```sh -conda activate odssm -ansible-galaxy install -r requirements.yml -ansible-playbook -i inventory/inventory ./playbooks/provision-prefect-agent.yml -``` - -Note that Prefect agent Docker images are customized to have -AWS CLI installed. If you would like to further modify the agent -images, you need to change the value of `image` for the "Register ..." -Ansible-tasks. - -Wait for the Playbook to be fully executed. After that if you -SSH to the manager EC2 you should see 3 Prefect agents running on -Docker, 1 local and 2 ECS. - -Note that Currently we don't use Prefect ECS agents and all the logic -is executed by the "local" agent. Later this might change when -Prefect's ECSTask's are utilized. - - -#### Step 3 - -Now it's time to register the defined Prefect workflow and then run it. -From the shell environment. First activate the `prefect` Conda -environment: - -```sh -conda activate odssm -``` - -Then go to `prefect` directory and execute: - -```sh -python workflow/main.py register -``` - -This will register the workflow with the project in your Prefect -cloud account. Note that this doesn't need to be executed everytime. - -Once registered the workflow can be used the next time you set up -the environment. Now to run the workflow, with the `prefect` Conda -environment already activated, execute: - -```sh -prefect run -n end-to-end --param name= --param year= -``` - -For the Ansible playbook to work you also need to set this environment -variable: - -```sh -export PREFECT_AGENT_TOKEN= -``` - - -### Remarks -As mentioned before, current workflow is not designed to be set up -and activated by many users. For backend, one person needs to take -the role of admin and create all the necessary infrastructure as -well as agents and AWS temporary authentication. Then the Prefect -cloud account whose API key is used in the backend setup can start -the process. - -Also note that for the admin, after the first time setup, only the -following steps need to be repeated to update the expired -temporary AWS credentials: -- Setting AWS temporary credentials locally in ~/.aws/credentials -- Setting AWS temporary credentials in local environment -- Setting API key in local environment -- Executing Ansible script - -Note that the person executing the Ansible script needs to have -access to the key used to setup the EC2 when terraform was executed. - -If the role of admin is passed to another person, the tfstate files -from the current admin needs to be shared with the new person -and placed in the `terraform` directory to avoid overriding. - -The new admin can then generate their own keys and the Terraform -script will update the EC2 machine and launch templates with the -new key. - -The static S3 data is duplicated in both AWS infrastructure S3 as well -as PW S3. - - -## Dockerfiles created for on-demand project - -### Testing -Install Docker and Docker-compose on your machine. 
Usually Docker is installed with `sudo` access requirement for running containers - -#### To test if your environment is setup correctly - -Either inside `main` branch test the Docker image for fetching hurricane info - -In `info/docker` directory -```bash -sudo docker-compose build -``` - -modify `info/docker-compose.yml` and update the `source` to an address that exists on your machine. - -Then call -```bash -sudo docker-compose run hurricane-info-noaa elsa 2021 -``` - -This should fetch the hurricane info for the hurricane specified on the command line. The result is creation of `windswath` and `coops_ssh` directories with data in them as well as empty `mesh`, `setup`, and `sim` directories inside the address specified as `source` in the compose file. - - -#### To test the full pipeline -First, setup the environment variables to the desired hurricane name and year. You can also configure this in `main/.env` file, however if you have the environments set up, they'll always override the values in `.env` file -```bash -export HURRICANE_NAME=elsa -export HURRICANE_YEAR=2021 -``` - -Update `source` addresses in `main/docker-compose.yml` to match existing address on your machine. Note that each `service` in the compose file has its own mapping of sources to targets. Do **not** modify `target` values. Note that you need to update all the `source` values in this file as each one is used for one `service` or step. - -To test all the steps, in addition to this repo you need some static files -- Static geometry shapes (I'll provide `base_geom` and `high_geom`) -- GEBCO DEMs for Geomesh -- NCEI19 DEMs for Geomesh -- TPXO file `h_tpxo9.v1.nc` for PySCHISM -- NWM file `NWM_channel_hydrofabric.tar.gz` for PySCHISM - -Then when all the files and paths above are correctly set up, run - -```bash -sudo -E docker-compose run hurricane-info-noaa -``` - -Note that this time no argument is passed for hurricane name; it will be picked up from the environment. - -After this step is done (like the previous test) you'll get a directory structure needed for running the subsequent steps. - -Now you can run ocsmesh -```bash -sudo -E docker-compose run ocsmesh-noaa -``` -or you can run it in detached mode -```bash -sudo -E docker-compose run -d ocsmesh-noaa -``` - -When meshing is done you'll see `mesh` directory being filled with some files. After that for pyschism run: - -```bash - sudo -E docker-compose run pyschism-noaa - ``` - - or in detached mode - - ```bash - sudo -E docker-compose run -d pyschism-noaa - ``` - -When pyschism is done, you should see `/setup/schism.dir` that contains SCHISM. -In `main/.env` update `SCHISM_NPROCS` value to the number of available physical cores of the machine you're testing on, e.g. `2` or `4` and then run: - - ```bash - sudo -E docker-compose run -d schism-noaa - ``` ## References +- Daneshvar, F., et al. Tech Report (TODO) - Pringle, W. J., Mani, S., Sargsyan, K., Moghimi, S., Zhang, Y. J., Khazaei, B., Myers, E. (January 2023). 
_Surrogate-Assisted Bayesian Uncertainty Quantification for diff --git a/ansible/ansible.cfg b/ansible/ansible.cfg deleted file mode 100644 index b4b360f..0000000 --- a/ansible/ansible.cfg +++ /dev/null @@ -1,12 +0,0 @@ -[defaults] -become_user=root -ask_pass=False -ask_become_pass=False -roles_path=./roles -ask_vault_pass=False -action_plugins=./action_plugins -filter_plugins=./filter_plugins -callback_plugins=./callback_plugins -host_key_checking=False -collections_paths=./ -interpreter_python=auto_silent diff --git a/ansible/inventory/group_vars/prefect-agent/vars b/ansible/inventory/group_vars/prefect-agent/vars deleted file mode 100644 index b789105..0000000 --- a/ansible/inventory/group_vars/prefect-agent/vars +++ /dev/null @@ -1,6 +0,0 @@ ---- - -prefect_agent_ec2_key: odss-ec2-prefect-agent -prefect_agent_image_type: ami-03fe4d5b1d229063a -prefect_agent_region: us-west-2 -prefect_agent_vpc_subnet_id: subnet-98f04dc5 diff --git a/ansible/inventory/group_vars/thalassa.yml b/ansible/inventory/group_vars/thalassa.yml deleted file mode 100644 index 29049f3..0000000 --- a/ansible/inventory/group_vars/thalassa.yml +++ /dev/null @@ -1,10 +0,0 @@ ---- - -thalassa_image: yosoyjay/thalassa -thalassa_image_version: runtime-20210818 - -thalassa_internal_port: 8000 -thalassa_exposed_port: 10001 - -thalassa_data_mnt: /mnt/data -thalassa_corral_mnt: /mnt/corral \ No newline at end of file diff --git a/ansible/inventory/inventory b/ansible/inventory/inventory deleted file mode 100644 index 23b121a..0000000 --- a/ansible/inventory/inventory +++ /dev/null @@ -1,10 +0,0 @@ -[local] -localhost ansible_connection=local ansible_python_interpreter=python - -[thalassa] -noaa-stofs.tacc.utexas.edu - -[prefect_agent] - -[vars_from_terraform] -localhost ansible_connection=local ansible_python_interpreter=python diff --git a/ansible/playbooks/provision-prefect-agent.yml b/ansible/playbooks/provision-prefect-agent.yml deleted file mode 100644 index bc48500..0000000 --- a/ansible/playbooks/provision-prefect-agent.yml +++ /dev/null @@ -1,222 +0,0 @@ ---- - -# NOTE: Some of the variables are only defined in inventory vars for -# all groups. 
This file is updated by terraform execution -# -# Run from ansible directory (after terraform vars gen) using -# ansible-playbook -i inventory/inventory ./playbooks/provision-prefect-agent.yml -# -# Docker image authentication take from StackOverflow 63723674 - -- name: Setup EC2 play - hosts: local - gather_facts: false - - vars: - ec2_prefix: odssm-ec2-prefect-agent - ec2_inventory_name: local_ec2_agent - user_name: ec2-user - # ansible var cannot use '-' - ansible_group: prefect_agent - ansible_vars_group: vars_from_terraform - - - tasks: - - name: Setup EC2 task - block: - - name: Verify connectivity to EC2 - ansible.builtin.wait_for: - host: "{{ ec2_public_ip }}" - port: 22 - state: started - - - name: Add instance to group - ansible.builtin.add_host: - name: "{{ ec2_inventory_name }}" - ansible_host: "{{ ec2_public_ip }}" - ansible_user: "{{ user_name }}" - instance_name: "{{ ec2_prefix }}" - groups: - - "{{ ansible_group }}" - - "{{ ansible_vars_group }}" - - - name: Print instance group - debug: - var: ansible_group - - tags: - - setup - - -- name: Configure Prefect agent host - hosts: prefect_agent - gather_facts: True - become: True - - # TODO: Use --key instead of --token for Prefect - vars: - key: "{{ lookup('env', 'PREFECT_AGENT_TOKEN') }}" - rdhpcs_s3_access_key_id: "{{ lookup('env', 'RDHPCS_S3_ACCESS_KEY_ID') }}" - rdhpcs_s3_secret_access_key: "{{ lookup('env', 'RDHPCS_S3_SECRET_ACCESS_KEY') }}" - pw_api_key: "{{ lookup('env', 'PW_API_KEY') }}" - efs_mount_dir: /efs - docker_image: "{{ prefect_image }}:v0.4" - cdsapi_url: "{{ lookup('env', 'CDSAPI_URL') }}" - cdsapi_key: "{{ lookup('env', 'CDSAPI_KEY') }}" - - tasks: - - name: Install packages - yum: - name: - - docker - - python-pip - - python-devel - - "@Development tools" - - nfs-utils - - amazon-efs-utils - state: present - - - name: Start Docker - ansible.builtin.systemd: - name: docker - state: started - - - name: Start NFS (used to mount EFS) - ansible.builtin.systemd: - name: nfs - state: started - - - name: Update pip - pip: - name: pip - extra_args: --upgrade - - - name: Install wheel - pip: - name: wheel - - - name: Install Ansible - pip: - name: - - ansible - - - name: Install Docker python package - pip: - name: - - docker - # Needed to deal with older requests which is not installed via pip - extra_args: --ignore-installed - - - name: Create mount directory - file: - path: "{{ efs_mount_dir }}" - state: directory - mode: 0755 - - - name: Mount EFS volume - mount: - name: "{{ efs_mount_dir }}" - src: "{{ efs_id }}.efs.{{ aws_default_region }}.amazonaws.com:/" - fstype: nfs4 - opts: nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport - state: mounted - - - name: Make sure volume has rw permissions - file: - path: "{{ efs_mount_dir }}" - state: directory - mode: 0755 - - - name: ECR Docker authentication - shell: "aws ecr get-authorization-token" - environment: - AWS_DEFAULT_REGION: "{{ aws_default_region }}" - # fixes aws-cli command output issue - AWS_PAGER: "" - register: ecr_command - - - set_fact: - ecr_authorization_data: "{{ (ecr_command.stdout | from_json).authorizationData[0] }}" - - - set_fact: - ecr_credentials: "{{ (ecr_authorization_data.authorizationToken | b64decode).split(':') }}" - - - name: Login to ECR on Docker - docker_login: - registry: "{{ ecr_authorization_data.proxyEndpoint.rpartition('//')[2] }}" - username: "{{ ecr_credentials[0] }}" - password: "{{ ecr_credentials[1] }}" - reauthorize: yes - - - name: Register Prefect Local agent - docker_container: - name: 
"prefect-agent-local" - image: "{{ docker_image }}" - container_default_behavior: "compatibility" - env: - PREFECT_AGENT_TOKEN: "{{ key }} " - AWS_DEFAULT_REGION: "{{ aws_default_region }}" - # fixes aws-cli command output issue - AWS_PAGER: "" - CDSAPI_URL: "{{ cdsapi_url }}" - CDSAPI_KEY: "{{ cdsapi_key }}" - - volumes: - - /efs:/efs - command: > - prefect agent local start - --key "{{ key }}" - --label tacc-odssm-local - --name tacc-odssm-agent-local - --log-level INFO - state: started - - - name: Register Prefect Local agent 2 - docker_container: - name: "prefect-agent-local-for-rdhpcs" - image: "{{ docker_image }}" - container_default_behavior: "compatibility" - env: - PREFECT_AGENT_TOKEN: "{{ key }} " # TODO: Remove? - AWS_ACCESS_KEY_ID: "{{ rdhpcs_s3_access_key_id }}" - AWS_SECRET_ACCESS_KEY: "{{ rdhpcs_s3_secret_access_key }}" - PW_API_KEY: "{{ pw_api_key }}" - # fixes aws-cli command output issue - AWS_PAGER: "" - CDSAPI_URL: "{{ cdsapi_url }}" - CDSAPI_KEY: "{{ cdsapi_key }}" - volumes: - - /efs:/efs - command: > - prefect agent local start - --key "{{ key }}" - --label tacc-odssm-local-for-rdhpcs - --name tacc-odssm-agent-local-for-rdhpcs - --log-level INFO - state: started - - name: Register Prefect ECS agents - docker_container: - name: "prefect-agent-ecs" - image: "{{ docker_image }}" - container_default_behavior: "compatibility" - env: - PREFECT_AGENT_TOKEN: "{{ key }} " - AWS_DEFAULT_REGION: "{{ aws_default_region }}" - # fixes aws-cli command output issue - AWS_PAGER: "" - volumes: - - /efs:/efs - command: > - prefect agent ecs start - --launch-type EC2 - --env AWS_DEFAULT_REGION="{{ aws_default_region }}" - --key "{{ key }}" - --label tacc-odssm-ecs - --name tacc-odssm-agent-ecs - --log-level INFO - --cluster workflow - state: started - - - tags: - - config diff --git a/ansible/playbooks/thalassa.yml b/ansible/playbooks/thalassa.yml deleted file mode 100644 index b2b3642..0000000 --- a/ansible/playbooks/thalassa.yml +++ /dev/null @@ -1,5 +0,0 @@ ---- - -- hosts: thalassa - roles: - - thalassa diff --git a/ansible/requirements.yml b/ansible/requirements.yml deleted file mode 100644 index 17d1501..0000000 --- a/ansible/requirements.yml +++ /dev/null @@ -1,17 +0,0 @@ ---- - -collections: -- name: ansible.posix - version: 1.2.0 - -- name: community.docker - version: 1.5.0 - -- name: community.general - version: 3.0.0 - -- name: amazon.aws - version: 1.5.0 - -- name: community.aws - version: 1.5.0 diff --git a/ansible/roles/prefect-agent/tasks/main.yml b/ansible/roles/prefect-agent/tasks/main.yml deleted file mode 100644 index ceb9fab..0000000 --- a/ansible/roles/prefect-agent/tasks/main.yml +++ /dev/null @@ -1,18 +0,0 @@ ---- - -- block: - - name: Provision micro instance on EC2 - ec2: - aws_access_key: "{{ aws_access_key }}" - aws_secret_key: "{{ aws_secret_key }}" - key_name: "{{ prefect_agent_ec2_key }}" - instance_type: t2.micro - image: "{{ prefect_agent_image_type }}" - wait: yes - count: 1 - region: "{{ prefect_agent_region }}" - assign_public_ip: yes - id: odssm-ec2-prefect-agent - - tags: - - provision \ No newline at end of file diff --git a/ansible/roles/thalassa/tasks/main.yml b/ansible/roles/thalassa/tasks/main.yml deleted file mode 100644 index e584739..0000000 --- a/ansible/roles/thalassa/tasks/main.yml +++ /dev/null @@ -1,76 +0,0 @@ ---- - -- block: - - name: Add EPEL repo (Ansible prerequisite) - yum_repository: - name: epel - description: EPEL YUM repo - baseurl: https://download.fedoraproject.org/pub/epel/7/x86_64/ - - - name: Install default packages - 
yum: - name: - - yum-utils - - tmux - - htop - - "@Development tools" - - python-devel - state: present - - - name: Install Ansible - yum: - name: ansible - state: present - - - name: Install pip - yum: - name: - - python-pip - state: present - - - name: Install pip packages for Docker + Ansible - pip: - name: - - docker==4.4.4 - - docker-compose==1.26.2 - - pyrsistent==0.16.1 - - requests==2.25.1 - - websocket-client==0.32.0 - - - name: Add Docker repo - yum_repository: - name: Docker - description: Docker Repo - skip_if_unavailable: yes - baseurl: https://download.docker.com/linux/centos/docker-ce.repo - - - name: Install Docker - yum: - name: - - docker-ce - - docker-ce-cli - - containerd.io - state: present - - - name: Start Docker - ansible.builtin.systemd: - name: docker - state: started - - - name: Run Thalassa - docker_container: - name: "thalassa-{{ deploy_env }}" - image: "{{ thalassa_image }}:{{ thalassa_image_version }}" - state: started - restart: yes - restart_policy: unless-stopped - pull: true - ports: "{{ thalassa_exposed_port }}:{{ thalassa_internal_port }}" - volumes: - - "{{ thalassa_data_mnt }}:/data" - - "{{ thalassa_corral_mnt }}:/data/corral" - command: "thalassa serve --websocket-origin '*' --port {{ thalassa_internal_port }} --no-show" - - become: true - tags: - - deploy diff --git a/docker/info/docker/.env b/docker/info/docker/.env deleted file mode 100644 index 065abaf..0000000 --- a/docker/info/docker/.env +++ /dev/null @@ -1 +0,0 @@ -HURRINFO_USER=hurricaner diff --git a/docker/info/docker/Dockerfile b/docker/info/docker/Dockerfile deleted file mode 100644 index 1f1a17f..0000000 --- a/docker/info/docker/Dockerfile +++ /dev/null @@ -1,57 +0,0 @@ -FROM continuumio/miniconda3:4.10.3-alpine - -# Create a non-root user -ARG username=hurricaner -ARG uid=1000 -ARG gid=100 - -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apk update && apk upgrade && apk add \ - git - -# New user -RUN adduser --disabled-password --gecos "Non-root user" --uid $UID --home $HOME $USER - -# Create a project directory inside user home -ENV PROJECT_DIR $HOME/app -RUN mkdir $PROJECT_DIR -RUN chown $UID:$GID $PROJECT_DIR -WORKDIR $PROJECT_DIR - - -# Build the conda environment -ENV ENV_PREFIX $HOME/icogsc - -COPY environment.yml /tmp/ -RUN chown $UID:$GID /tmp/environment.yml - -RUN conda install mamba -n base -c conda-forge && \ - mamba update --name base --channel defaults conda && \ - mamba env create --prefix $ENV_PREFIX --file /tmp/environment.yml --force && \ - mamba clean --all --yes - -RUN conda run -p $ENV_PREFIX --no-capture-output \ - pip install stormevents==2.1.2 - -ENV CONDA_DIR /opt/conda - -RUN conda clean --all -RUN apk del git - -RUN mkdir -p $PROJECT_DIR/scripts -COPY docker/hurricane_data.py ${PROJECT_DIR}/scripts/ -ENV PYTHONPATH ${PROJECT_DIR}/scripts/ - - -RUN mkdir -p $PROJECT_DIR/io - -USER $USER - - -# Ref: https://pythonspeed.com/articles/activate-conda-dockerfile/ -ENTRYPOINT [ "conda", "run", "-p", "$ENV_PREFIX", "--no-capture-output", "python", "-m", "hurricane_data" ] diff --git a/docker/info/docker/docker-compose.yml b/docker/info/docker/docker-compose.yml deleted file mode 100644 index 3c97ff3..0000000 --- a/docker/info/docker/docker-compose.yml +++ /dev/null @@ -1,15 +0,0 @@ -version: "3.9" -services: - hurricane-info-noaa: - build: - context: .. 
- dockerfile: docker/Dockerfile - args: - - username=${HURRINFO_USER} - - uid=1000 - - gid=100 -# command: '/bin/bash' - volumes: - - type: bind - source: /home/ec2-user/data/test/hurricanes - target: /home/${HURRINFO_USER}/app/io/output diff --git a/docker/info/docker/hurricane_data.py b/docker/info/docker/hurricane_data.py deleted file mode 100644 index 84e93c9..0000000 --- a/docker/info/docker/hurricane_data.py +++ /dev/null @@ -1,242 +0,0 @@ -"""User script to get hurricane info relevant to the workflow -This script gether information about: - - Hurricane track - - Hurricane windswath - - Hurricane event dates - - Stations info for historical hurricane -""" - -import sys -import logging -import pathlib -import argparse -import tempfile -from datetime import datetime, timedelta - -import pandas as pd -import geopandas as gpd -from searvey.coops import COOPS_TidalDatum -from searvey.coops import COOPS_TimeZone -from searvey.coops import COOPS_Units -from shapely.geometry import box -from stormevents import StormEvent -from stormevents.nhc import VortexTrack - - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) -logging.basicConfig( - stream=sys.stdout, - format='%(asctime)s,%(msecs)d %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', - datefmt='%Y-%m-%d:%H:%M:%S') - -EFS_MOUNT_POINT = pathlib.Path('~').expanduser() / f'app/io/output' - -def main(args): - - name_or_code = args.name_or_code - year = args.year - date_out = EFS_MOUNT_POINT / args.date_range_outpath - track_out = EFS_MOUNT_POINT / args.track_outpath - swath_out = EFS_MOUNT_POINT / args.swath_outpath - sta_dat_out = EFS_MOUNT_POINT / args.station_data_outpath - sta_loc_out = EFS_MOUNT_POINT / args.station_location_outpath - is_past_forecast = args.past_forecast - hr_before_landfall = args.hours_before_landfall - - if is_past_forecast and hr_before_landfall < 0: - hr_before_landfall = 48 - - ne_low = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres')) - shp_US = ne_low[ne_low.name.isin(['United States of America', 'Puerto Rico'])].unary_union - - logger.info("Fetching hurricane info...") - event = None - if year == 0: - event = StormEvent.from_nhc_code(name_or_code) - else: - event = StormEvent(name_or_code, year) - nhc_code = event.nhc_code - logger.info("Fetching a-deck track info...") - - # TODO: Get user input for whether its forecast or now! - now = datetime.now() - if (is_past_forecast or (now - event.start_date < timedelta(days=30))): - temp_track = event.track(file_deck='a') - adv_avail = temp_track.unfiltered_data.advisory.unique() - adv_order = ['OFCL', 'HWRF', 'HMON', 'CARQ'] - advisory = adv_avail[0] - for adv in adv_order: - if adv in adv_avail: - advisory = adv - break - - if advisory == "OFCL" and "CARQ" not in adv_avail: - raise ValueError( - "OFCL advisory needs CARQ for fixing missing variables!" - ) - - # NOTE: Track taken from `StormEvent` object is up to now only. - # See GitHub issue #57 for StormEvents - track = VortexTrack(nhc_code, file_deck='a', advisories=[advisory]) - - df_dt = pd.DataFrame(columns=['date_time']) - - if is_past_forecast: - - logger.info( - f"Creating {advisory} track for {hr_before_landfall}" - +" hours before landfall forecast..." 
- ) - onland_adv_tracks = track.data[track.data.intersects(shp_US)] - candidates = onland_adv_tracks.groupby('track_start_time').nth(0).reset_index() - candidates['timediff'] = candidates.datetime - candidates.track_start_time - track_start = candidates[ - candidates['timediff'] >= timedelta(hours=hr_before_landfall) - ].track_start_time.iloc[-1] - - gdf_track = track.data[track.data.track_start_time == track_start] - # Append before track from previous forecasts: - gdf_track = pd.concat(( - track.data[ - (track.data.track_start_time < track_start) - & (track.data.forecast_hours == 0) - ], - gdf_track - )) - df_dt['date_time'] = (track.start_date, track.end_date) - - - logger.info("Fetching water level measurements from COOPS stations...") - coops_ssh = event.coops_product_within_isotach( - product='water_level', wind_speed=34, - datum=COOPS_TidalDatum.NAVD, - units=COOPS_Units.METRIC, - time_zone=COOPS_TimeZone.GMT, - ) - - else: - # Get the latest track forecast - track_start = track.data.track_start_time.max() - gdf_track = track.data[track.data.track_start_time == track_start] - - # Put both dates as now(), for pyschism to setup forecast - df_dt['date_time'] = (now, now) - - coops_ssh = None - - # NOTE: Fake besttrack: Since PySCHISM supports "BEST" track - # files for its parametric forcing, write track as "BEST" after - # fixing the OFCL by CARQ through StormEvents - gdf_track.advisory = 'BEST' - gdf_track.forecast_hours = 0 - track = VortexTrack(storm=gdf_track, file_deck='b', advisories=['BEST']) - - windswath_dict = track.wind_swaths(wind_speed=34) - windswaths = windswath_dict['BEST'] # Faked BEST - logger.info(f"Fetching {advisory} windswath...") - windswath_time = min(pd.to_datetime(list(windswaths.keys()))) - windswath = windswaths[ - windswath_time.strftime("%Y%m%dT%H%M%S") - ] - - else: - - logger.info("Fetching b-deck track info...") - - df_dt = pd.DataFrame(columns=['date_time']) - df_dt['date_time'] = (event.start_date, event.end_date) - - logger.info("Fetching BEST windswath...") - track = event.track(file_deck='b') - windswath_dict = track.wind_swaths(wind_speed=34) - # NOTE: event.start_date (first advisory date) doesn't - # necessarily match the windswath key which comes from track - # start date for the first advisory (at least in 2021!) 
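-            # Hence the code below takes the latest available windswath key (the max of the parsed dates) rather than indexing by event.start_date.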
- windswaths = windswath_dict['BEST'] - latest_advistory_stamp = max(pd.to_datetime(list(windswaths.keys()))) - windswath = windswaths[ - latest_advistory_stamp.strftime("%Y%m%dT%H%M%S") - ] - - logger.info("Fetching water level measurements from COOPS stations...") - coops_ssh = event.coops_product_within_isotach( - product='water_level', wind_speed=34, - datum=COOPS_TidalDatum.NAVD, - units=COOPS_Units.METRIC, - time_zone=COOPS_TimeZone.GMT, - ) - - logger.info("Writing relevant data to files...") - df_dt.to_csv(date_out) - track.to_file(track_out) - gs = gpd.GeoSeries(windswath) - gdf_windswath = gpd.GeoDataFrame( - geometry=gs, data={'RADII': len(gs) * [34]}, crs="EPSG:4326" - ) - gdf_windswath.to_file(swath_out) - if coops_ssh is not None: - coops_ssh.to_netcdf(sta_dat_out, 'w') - coops_ssh[['x', 'y']].to_dataframe().drop(columns=['nws_id']).to_csv( - sta_loc_out, header=False, index=False) - - -if __name__ == '__main__': - - parser = argparse.ArgumentParser() - - parser.add_argument( - "name_or_code", help="name or NHC code of the storm", type=str) - parser.add_argument( - "year", help="year of the storm", type=int) - - parser.add_argument( - "--date-range-outpath", - help="output date range", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--track-outpath", - help="output hurricane track", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--swath-outpath", - help="output hurricane windswath", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--station-data-outpath", - help="output station data", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--station-location-outpath", - help="output station location", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--past-forecast", - help="Get forecast data for a past storm", - action='store_true', - ) - - parser.add_argument( - "--hours-before-landfall", - help="Get forecast data for a past storm at this many hour before landfall", - type=int, - ) - - args = parser.parse_args() - - main(args) diff --git a/docker/info/environment.yml b/docker/info/environment.yml deleted file mode 100644 index 22107ab..0000000 --- a/docker/info/environment.yml +++ /dev/null @@ -1,14 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - cartopy - - cfunits - - gdal - - geopandas - - geos - - proj - - pygeos - - pyproj - - python=3.9 - - shapely>=1.8 diff --git a/docker/main/.env b/docker/main/.env deleted file mode 100644 index 84422dc..0000000 --- a/docker/main/.env +++ /dev/null @@ -1,10 +0,0 @@ -ONDEMAND_USER=ondemand-user -HURRICANE_NAME=florence -HURRICANE_YEAR=2018 -SCHISM_NPROCS=16 - -OUT_DIR=/home/ec2-user/data/test/hurricanes -SHAPE_DIR=/home/ec2-user/data/test/static/shape -DEM_DIR=/home/ec2-user/data/dem/ -TPXO_DIR=/home/ec2-user/data/test/static/tpxo -NWM_DIR=/home/ec2-user/data/test/static/nwm diff --git a/docker/main/docker-compose.yml b/docker/main/docker-compose.yml deleted file mode 100644 index 071a213..0000000 --- a/docker/main/docker-compose.yml +++ /dev/null @@ -1,101 +0,0 @@ -version: "3.9" -services: - hurricane-info-noaa: - build: - context: ../info - dockerfile: docker/Dockerfile - args: - - username=${ONDEMAND_USER} - - uid=1000 - - gid=100 -# command: '/bin/bash' - command: ${HURRICANE_NAME} ${HURRICANE_YEAR} - volumes: - - type: bind - source: ${OUT_DIR} - target: /home/${ONDEMAND_USER}/app/io/output - - ocsmesh-noaa: -# depends_on: -# - hurricane-info-noaa - build: - context: ../ocsmesh - dockerfile: docker/Dockerfile - 
args: - - username=${ONDEMAND_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: ${OUT_DIR} - target: /home/${ONDEMAND_USER}/app/io/hurricanes - - type: bind - source: ${SHAPE_DIR} - target: /home/${ONDEMAND_USER}/app/io/shape - - type: bind - source: ${DEM_DIR} - target: /home/${ONDEMAND_USER}/app/io/dem - - pyschism-noaa: -# depends_on: -# - ocsmesh-noaa -# - hurricane-info-noaa - build: - context: ../pyschism - dockerfile: docker/Dockerfile - args: - - username=${ONDEMAND_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: ${OUT_DIR} - target: /home/${ONDEMAND_USER}/app/io/hurricanes - - type: bind - source: ${TPXO_DIR} - target: /home/${ONDEMAND_USER}/.local/share/tpxo - - type: bind - source: ${NWM_DIR} - target: /home/${ONDEMAND_USER}/.local/share/nwm - - schism-noaa: -# depends_on: -# - pyschism-noaa -# - ocsmesh-noaa -# - hurricane-info-noaa - environment: - - SCHISM_NPROCS=${SCHISM_NPROCS} - cap_add: - - SYS_PTRACE - build: - context: ../schism - dockerfile: docker/Dockerfile - args: - - username=${ONDEMAND_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: ${OUT_DIR} - target: /home/${ONDEMAND_USER}/app/io/hurricanes - odssm-post-noaa: -# depends_on: -# - pyschism-noaa -# - ocsmesh-noaa -# - hurricane-info-noaa -# - schism-noaa - build: - context: ../post - dockerfile: docker/Dockerfile - args: - - username=${ONDEMAND_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: ${OUT_DIR} - target: /home/${ONDEMAND_USER}/app/io/hurricanes diff --git a/docker/ocsmesh/docker/.env b/docker/ocsmesh/docker/.env deleted file mode 100644 index edd28fd..0000000 --- a/docker/ocsmesh/docker/.env +++ /dev/null @@ -1 +0,0 @@ -GEOMESH_USER=ocsmesher diff --git a/docker/ocsmesh/docker/Dockerfile b/docker/ocsmesh/docker/Dockerfile deleted file mode 100644 index a0dfed8..0000000 --- a/docker/ocsmesh/docker/Dockerfile +++ /dev/null @@ -1,76 +0,0 @@ -FROM continuumio/miniconda3:4.10.3p0-alpine - -# Create a non-root user -ARG username=ocsmesher -ARG uid=1000 -ARG gid=100 - -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apk update && apk upgrade && apk --no-cache add \ - git \ - gcc \ - g++ \ - make \ - cmake \ - libstdc++ - -# New user -RUN adduser -D -g "Non-root user" -u $UID -h $HOME $USER - -# Create a project directory inside user home -ENV PROJECT_DIR $HOME/app -RUN mkdir $PROJECT_DIR -RUN chown $UID:$GID $PROJECT_DIR -WORKDIR $PROJECT_DIR - - -# Build the conda environment -ENV ENV_PREFIX $HOME/icogsc - -COPY environment.yml /tmp/ -RUN chown $UID:$GID /tmp/environment.yml - -RUN conda install mamba -n base -c conda-forge && \ - mamba update --name base --channel defaults conda && \ - mamba env create --prefix $ENV_PREFIX --file /tmp/environment.yml --force && \ - mamba clean --all --yes - -ENV CONDA_DIR /opt/conda - -RUN git clone https://github.com/dengwirda/jigsaw-python.git && \ - git -C jigsaw-python checkout f875719 && \ - conda run -p $ENV_PREFIX --no-capture-output \ - python3 jigsaw-python/setup.py build_external && \ - cp jigsaw-python/external/jigsaw/bin/* $ENV_PREFIX/bin && \ - cp jigsaw-python/external/jigsaw/lib/* $ENV_PREFIX/lib && \ - conda run -p $ENV_PREFIX --no-capture-output \ - pip install ./jigsaw-python && \ - rm -rf jigsaw-python -RUN conda run -p $ENV_PREFIX --no-capture-output \ - pip install ocsmesh>=1.0.5 - -RUN conda clean --all && apk del \ - git \ - gcc \ - g++ \ - 
make \ - cmake - - -RUN mkdir -p $PROJECT_DIR/scripts -COPY docker/hurricane_mesh.py ${PROJECT_DIR}/scripts/ -ENV PYTHONPATH ${PROJECT_DIR}/scripts/ - - -RUN mkdir -p $PROJECT_DIR/io - -USER $USER - - -# Ref: https://pythonspeed.com/articles/activate-conda-dockerfile/ -ENTRYPOINT [ "conda", "run", "-p", "$ENV_PREFIX", "--no-capture-output", "python", "-m", "hurricane_mesh" ] diff --git a/docker/ocsmesh/docker/docker-compose.yml b/docker/ocsmesh/docker/docker-compose.yml deleted file mode 100644 index c371a74..0000000 --- a/docker/ocsmesh/docker/docker-compose.yml +++ /dev/null @@ -1,30 +0,0 @@ -version: "3.9" -services: - ocsmesh-noaa: - build: - context: .. - dockerfile: docker/Dockerfile - args: - - username=${GEOMESH_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/windswath - target: /home/${GEOMESH_USER}/app/io/input/hurricane - - type: bind - source: /home/ec2-user/data/test/static/shape - target: /home/${GEOMESH_USER}/app/io/input/shape - - type: bind - source: /home/ec2-user/data/dem/gebco - target: /home/${GEOMESH_USER}/app/io/input/dem/GEBCO - - type: bind - source: /home/ec2-user/data/dem/ncei19 - target: /home/${GEOMESH_USER}/app/io/input/dem/NCEI19 - - type: bind - source: /home/ec2-user/data/dem/ncei19/tileindex_NCEI_ninth_Topobathy_2014.zip - target: /home/${GEOMESH_USER}/app/io/input/dem/tileindex_NCEI_ninth_Topobathy_2014.zip - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/mesh - target: /home/${GEOMESH_USER}/app/io/output diff --git a/docker/ocsmesh/docker/hurricane_mesh.py b/docker/ocsmesh/docker/hurricane_mesh.py deleted file mode 100755 index 49b1608..0000000 --- a/docker/ocsmesh/docker/hurricane_mesh.py +++ /dev/null @@ -1,548 +0,0 @@ -#!/usr/bin/env python - -# Import modules -import logging -import os -import pathlib -import argparse -import sys -import warnings - -import numpy as np - -from fiona.drvsupport import supported_drivers -from shapely.geometry import box, MultiLineString -from shapely.ops import polygonize, unary_union, linemerge -from pyproj import CRS, Transformer -import geopandas as gpd - -from ocsmesh import Raster, Geom, Hfun, JigsawDriver, Mesh, utils -from ocsmesh.cli.subset_n_combine import SubsetAndCombine - -EFS_MOUNT_POINT = pathlib.Path('~').expanduser() / f'app/io' - -# Setup modules -# Enable KML driver -#from https://stackoverflow.com/questions/72960340/attributeerror-nonetype-object-has-no-attribute-drvsupport-when-using-fiona -supported_drivers['KML'] = 'rw' -supported_drivers['LIBKML'] = 'rw' - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) -logging.basicConfig( - stream=sys.stdout, - format='%(asctime)s,%(msecs)d %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', - datefmt='%Y-%m-%d:%H:%M:%S') - - -# Helper functions -def get_raster(path, crs=None): - rast = Raster(path) - if crs and rast.crs != crs: - rast.warp(crs) - return rast - - -def get_rasters(paths, crs=None): - rast_list = list() - for p in paths: - rast_list.append(get_raster(p, crs)) - return rast_list - - -def _generate_mesh_boundary_and_write( - out_dir, mesh_path, mesh_crs='EPSG:4326', threshold=-1000 - ): - - mesh = Mesh.open(str(mesh_path), crs=mesh_crs) - - logger.info('Calculating boundary types...') - mesh.boundaries.auto_generate(threshold=threshold) - - logger.info('Write interpolated mesh to disk...') - mesh.write( - str(out_dir/f'mesh_w_bdry.grd'), format='grd', overwrite=True - ) - - -def _write_mesh_box(out_dir, 
mesh_path, mesh_crs='EPSG:4326'): - mesh = Mesh.open(str(mesh_path), crs=mesh_crs) - domain_box = box(*mesh.get_multipolygon().bounds) - gdf_domain_box = gpd.GeoDataFrame( - geometry=[domain_box], crs=mesh.crs) - gdf_domain_box.to_file(out_dir/'domain_box') - - -# Main script -def main(args, clients): - - cmd = args.cmd - logger.info(f"The mesh command is {cmd}.") - - clients_dict = {c.script_name: c for c in clients} - - storm_name = str(args.name).lower() - storm_year = str(args.year).lower() - out_dir = EFS_MOUNT_POINT / args.out - - final_mesh_name = 'hgrid.gr3' - write_mesh_box = False - - if cmd == 'subset_n_combine': - final_mesh_name = 'final_mesh.2dm' - write_mesh_box = True - - args.rasters = [ - i for i in (EFS_MOUNT_POINT / args.rasters_dir / 'gebco').iterdir() if i.suffix == '.tif' - ] - - - args.out = out_dir - args.fine_mesh = EFS_MOUNT_POINT / args.fine_mesh - args.coarse_mesh = EFS_MOUNT_POINT / args.coarse_mesh - args.region_of_interset = EFS_MOUNT_POINT / args.region_of_interset - elif cmd == 'hurricane_mesh': - final_mesh_name = 'mesh_no_bdry.2dm' - - if cmd in clients_dict: - clients_dict[cmd].run(args) - else: - raise ValueError(f'Invalid meshing command specified: <{cmd}>') - - #TODO interpolate DEM? - if write_mesh_box: - _write_mesh_box(out_dir, out_dir / final_mesh_name) - _generate_mesh_boundary_and_write(out_dir, out_dir / final_mesh_name) - - -class HurricaneMesher: - - @property - def script_name(self): - return 'hurricane_mesh' - - def __init__(self, sub_parser): - - this_parser = sub_parser.add_parser(self.script_name) - - this_parser.add_argument( - "--nprocs", type=int, help="Number of parallel threads to use when " - "computing geom and hfun.") - - this_parser.add_argument( - "--geom-nprocs", type=int, help="Number of processors used when " - "computing the geom, overrides --nprocs argument.") - - this_parser.add_argument( - "--hfun-nprocs", type=int, help="Number of processors used when " - "computing the hfun, overrides --nprocs argument.") - - this_parser.add_argument( - "--hmax", type=float, help="Maximum mesh size.", - default=20000) - - this_parser.add_argument( - "--hmin-low", type=float, default=1500, - help="Minimum mesh size for low resolution region.") - - this_parser.add_argument( - "--rate-low", type=float, default=2e-3, - help="Expansion rate for low resolution region.") - - this_parser.add_argument( - "--contours", type=float, nargs=2, - help="Contour specification applied to whole domain; " - "contour mesh size needs to be greater that hmin-low", - metavar="SPEC") - - this_parser.add_argument( - "--transition-elev", "-e", type=float, default=-200, - help="Cut off elev for high resolution region") - - this_parser.add_argument( - "--hmin-high", type=float, default=300, - help="Minimum mesh size for high resolution region.") - - this_parser.add_argument( - "--rate-high", type=float, default=1e-3, - help="Expansion rate for high resolution region") - - this_parser.add_argument( - "--shapes-dir", - help="top-level directory that contains shapefiles") - - this_parser.add_argument( - "--windswath", - help="path to NHC windswath shapefile") - - # Similar to the argument for SubsetAndCombine - this_parser.add_argument( - "--out", help="mesh operation output directory") - - def run(self, args): - - nprocs = args.nprocs - - geom_nprocs = nprocs - if args.geom_nprocs: - nprocs = args.geom_nprocs - geom_nprocs = -1 if nprocs == None else nprocs - - hfun_nprocs = nprocs - if args.hfun_nprocs: - nprocs = args.hfun_nprocs - hfun_nprocs = -1 if nprocs == 
None else nprocs - - storm_name = str(args.name).lower() - storm_year = str(args.year).lower() - - dem_dir = EFS_MOUNT_POINT / args.rasters_dir - shp_dir = EFS_MOUNT_POINT / args.shapes_dir - hurr_info = EFS_MOUNT_POINT / args.windswath - out_dir = EFS_MOUNT_POINT / args.out - - coarse_geom = shp_dir / 'base_geom' - fine_geom = shp_dir / 'high_geom' - - gebco_paths = [i for i in (dem_dir / 'gebco').iterdir() if str(i).endswith('.tif')] - cudem_paths = [i for i in (dem_dir / 'ncei19').iterdir() if str(i).endswith('.tif')] - all_dem_paths = [*gebco_paths, *cudem_paths] - tile_idx_path = f'zip://{str(dem_dir)}/tileindex_NCEI_ninth_Topobathy_2014.zip' - - # Specs - wind_kt = 34 - filter_factor = 3 - max_n_hires_dem = 150 - - - # Geom (hardcoded based on prepared hurricane meshing spec) - z_max_lo = 0 - z_max_hi = 10 - z_max = max(z_max_lo, z_max_hi) - - # Hfun - hmax = args.hmax - - hmin_lo = args.hmin_low - rate_lo = args.rate_low - - contour_specs_lo = [] - if args.contours is not None: - for c_elev, m_size in args.contours: - if hmin_lo > m_size: - warnings.warn( - "Specified contour must have a mesh size" - f" larger than minimum low res size: {hmin_low}") - contour_specs_lo.append((c_elev, rate_lo, m_size)) - - else: - contour_specs_lo = [ - (-4000, rate_lo, 10000), - (-1000, rate_lo, 6000), - (-10, rate_lo, hmin_lo) - ] - - const_specs_lo = [ - (hmin_lo, 0, z_max) - ] - - cutoff_hi = args.transition_elev - hmin_hi = args.hmin_high - rate_hi = args.rate_high - - contour_specs_hi = [ - (0, rate_hi, hmin_hi) - ] - const_specs_hi = [ - (hmin_hi, 0, z_max) - ] - - - # Read inputs - logger.info("Reading input shapes...") - gdf_fine = gpd.read_file(fine_geom) - gdf_coarse = gpd.read_file(coarse_geom) - tile_idx = gpd.read_file(tile_idx_path) - - logger.info("Reading hurricane info...") - gdf = gpd.read_file(hurr_info) - gdf_wind_kt = gdf[gdf.RADII.astype(int) == wind_kt] - - # Simplify high resolution geometry - logger.info("Simplify high-resolution shape...") - gdf_fine = gpd.GeoDataFrame( - geometry=gdf_fine.to_crs("EPSG:3857").simplify(tolerance=hmin_hi / 2).buffer(0).to_crs(gdf_fine.crs), - crs=gdf_fine.crs) - - - # Calculate refinement region - logger.info(f"Create polygon from {wind_kt}kt windswath polygon...") - ext_poly = [i for i in polygonize([ext for ext in gdf_wind_kt.exterior])] - gdf_refine_super_0 = gpd.GeoDataFrame( - geometry=ext_poly, crs=gdf_wind_kt.crs) - - logger.info("Find upstream...") - domain_extent = gdf_fine.to_crs(gdf_refine_super_0.crs).total_bounds - domain_box = box(*domain_extent) - box_tol = 1/1000 * max(domain_extent[2]- domain_extent[0], domain_extent[3] - domain_extent[1]) - gdf_refine_super_0 = gdf_refine_super_0.intersection(domain_box.buffer(-box_tol)) - gdf_refine_super_0.plot() - ext_poly = [i for i in gdf_refine_super_0.explode().geometry] - - dmn_ext = [pl.exterior for mp in gdf_fine.geometry for pl in mp] - wnd_ext = [pl.exterior for pl in ext_poly] - - gdf_dmn_ext = gpd.GeoDataFrame(geometry=dmn_ext, crs=gdf_fine.crs) - gdf_wnd_ext = gpd.GeoDataFrame(geometry=wnd_ext, crs=gdf_wind_kt.crs) - - gdf_ext_over = gpd.overlay(gdf_dmn_ext, gdf_wnd_ext.to_crs(gdf_dmn_ext.crs), how="union") - - gdf_ext_x = gdf_ext_over[gdf_ext_over.intersects(gdf_wnd_ext.to_crs(gdf_ext_over.crs).unary_union)] - - filter_lines_threshold = np.max(gdf_dmn_ext.length) / filter_factor - lnstrs = linemerge([lnstr for lnstr in gdf_ext_x.explode().geometry]) - if not isinstance(lnstrs, MultiLineString): - lnstrs = [lnstrs] - lnstrs = [lnstr for lnstr in lnstrs if lnstr.length < 
filter_lines_threshold] - gdf_hurr_w_upstream = gdf_wnd_ext.to_crs(gdf_ext_x.crs) - gdf_hurr_w_upstream = gdf_hurr_w_upstream.append( - gpd.GeoDataFrame( - geometry=gpd.GeoSeries(lnstrs), - crs=gdf_ext_x.crs - )) - - - gdf_hurr_w_upstream_poly = gpd.GeoDataFrame( - geometry=gpd.GeoSeries(polygonize(gdf_hurr_w_upstream.unary_union)), - crs=gdf_hurr_w_upstream.crs) - - logger.info("Find intersection of domain polygon with impacted area upstream...") - gdf_refine_super_2 = gpd.overlay( - gdf_fine, gdf_hurr_w_upstream_poly.to_crs(gdf_fine.crs), - how='intersection' - ) - - gdf_refine_super_2.to_file(out_dir / 'dmn_hurr_upstream') - - logger.info("Selecting high resolution DEMs...") - gdf_dem_box = gpd.GeoDataFrame( - columns=['geometry', 'path'], - crs=gdf_refine_super_2.crs) - for path in all_dem_paths: - bbox = Raster(path).get_bbox(crs=gdf_dem_box.crs) - gdf_dem_box = gdf_dem_box.append( - gpd.GeoDataFrame( - {'geometry': [bbox], - 'path': str(path)}, - crs=gdf_dem_box.crs) - ) - gdf_dem_box = gdf_dem_box.reset_index() - - lo_res_paths = gebco_paths - - # TODO: use sjoin instead?! - gdf_hi_res_box = gdf_dem_box[gdf_dem_box.geometry.intersects(gdf_refine_super_2.unary_union)].reset_index() - hi_res_paths = gdf_hi_res_box.path.values.tolist() - - - # For refine cut off either use static geom at e.g. 200m depth or instead just use low-res for cut off polygon - - - # Or intersect with full geom? (timewise an issue for hfun creation) - logger.info("Calculate refinement area cutoff...") - cutoff_dem_paths = [i for i in gdf_hi_res_box.path.values.tolist() if pathlib.Path(i) in lo_res_paths] - cutoff_geom = Geom( - get_rasters(cutoff_dem_paths), - base_shape=gdf_coarse.unary_union, - base_shape_crs=gdf_coarse.crs, - zmax=cutoff_hi, - nprocs=geom_nprocs) - cutoff_poly = cutoff_geom.get_multipolygon() - - gdf_cutoff = gpd.GeoDataFrame( - geometry=gpd.GeoSeries(cutoff_poly), - crs=cutoff_geom.crs) - - gdf_draft_refine = gpd.overlay(gdf_refine_super_2, gdf_cutoff.to_crs(gdf_refine_super_2.crs), how='difference') - - refine_polys = [pl for pl in gdf_draft_refine.unary_union] - - gdf_final_refine = gpd.GeoDataFrame( - geometry=refine_polys, - crs=gdf_draft_refine.crs) - - - logger.info("Write landfall area to disk...") - gdf_final_refine.to_file(out_dir/'landfall_refine_area') - - gdf_geom = gpd.overlay( - gdf_coarse, - gdf_final_refine.to_crs(gdf_coarse.crs), - how='union') - - domain_box = box(*gdf_fine.total_bounds) - gdf_domain_box = gpd.GeoDataFrame( - geometry=[domain_box], crs=gdf_fine.crs) - gdf_domain_box.to_file(out_dir/'domain_box') - - geom = Geom(gdf_geom.unary_union, crs=gdf_geom.crs) - - - logger.info("Create low-res size function...") - hfun_lo = Hfun( - get_rasters(lo_res_paths), - base_shape=gdf_coarse.unary_union, - base_shape_crs=gdf_coarse.crs, - hmin=hmin_lo, - hmax=hmax, - nprocs=hfun_nprocs, - method='fast') - - logger.info("Add refinement spec to low-res size function...") - for ctr in contour_specs_lo: - hfun_lo.add_contour(*ctr) - hfun_lo.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for const in const_specs_lo: - hfun_lo.add_constant_value(*const) - - # hfun_lo.add_subtidal_flow_limiter(upper_bound=z_max) - # hfun_lo.add_subtidal_flow_limiter(hmin=hmin_lo, upper_bound=z_max) - - - logger.info("Compute low-res size function...") - jig_hfun_lo = hfun_lo.msh_t() - - - logger.info("Write low-res size function to disk...") - Mesh(jig_hfun_lo).write( - str(out_dir/f'hfun_lo_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - # For interpolation after meshing and use 
GEBCO for mesh size calculation in refinement area. - hfun_hi_rast_paths = hi_res_paths - if len(hi_res_paths) > max_n_hires_dem: - hfun_hi_rast_paths = gebco_paths - - logger.info("Create high-res size function...") - hfun_hi = Hfun( - get_rasters(hfun_hi_rast_paths), - base_shape=gdf_final_refine.unary_union, - base_shape_crs=gdf_final_refine.crs, - hmin=hmin_hi, - hmax=hmax, - nprocs=hfun_nprocs, - method='fast') - - # Apply low resolution criteria on hires as ewll - logger.info("Add refinement spec to high-res size function...") - for ctr in contour_specs_lo: - hfun_hi.add_contour(*ctr) - hfun_hi.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for ctr in contour_specs_hi: - hfun_hi.add_contour(*ctr) - hfun_hi.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for const in const_specs_hi: - hfun_hi.add_constant_value(*const) - - # hfun_hi.add_subtidal_flow_limiter(upper_bound=z_max) - - logger.info("Compute high-res size function...") - jig_hfun_hi = hfun_hi.msh_t() - - logger.info("Write high-res size function to disk...") - Mesh(jig_hfun_hi).write( - str(out_dir/f'hfun_hi_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - jig_hfun_lo = Mesh.open(str(out_dir/f'hfun_lo_{hmin_hi}.2dm'), crs="EPSG:4326").msh_t - jig_hfun_hi = Mesh.open(str(out_dir/f'hfun_hi_{hmin_hi}.2dm'), crs="EPSG:4326").msh_t - - - logger.info("Combine size functions...") - gdf_final_refine = gpd.read_file(out_dir/'landfall_refine_area') - - utils.clip_mesh_by_shape( - jig_hfun_hi, - shape=gdf_final_refine.to_crs(jig_hfun_hi.crs).unary_union, - fit_inside=True, - in_place=True) - - jig_hfun_final = utils.merge_msh_t( - jig_hfun_lo, jig_hfun_hi, - drop_by_bbox=False, - can_overlap=False, - check_cross_edges=True) - - - logger.info("Write final size function to disk...") - hfun_mesh = Mesh(jig_hfun_final) - hfun_mesh.write( - str(out_dir/f'hfun_comp_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - hfun = Hfun(hfun_mesh) - - logger.info("Generate mesh...") - driver = JigsawDriver(geom=geom, hfun=hfun, initial_mesh=True) - mesh = driver.run() - - - utils.reproject(mesh.msh_t, "EPSG:4326") - mesh.write( - str(out_dir/f'mesh_raw_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - mesh = Mesh.open(str(out_dir/f'mesh_raw_{hmin_hi}.2dm'), crs="EPSG:4326") - - dst_crs = "EPSG:4326" - interp_rast_list = [ - *get_rasters(gebco_paths, dst_crs), - *get_rasters(gdf_hi_res_box.path.values, dst_crs)] - - # TODO: Fix the deadlock issue with multiple cores when interpolating - logger.info("Interpolate DEMs on the generated mesh...") - mesh.interpolate(interp_rast_list, nprocs=1, method='nearest') - - logger.info("Write raw mesh to disk...") - mesh.write( - str(out_dir/f'mesh_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - # Write the same mesh with a generic name - mesh.write( - str(out_dir/f'mesh_no_bdry.2dm'), - format='2dm', - overwrite=True) - - - -if __name__ == '__main__': - - parser = argparse.ArgumentParser() - parser.add_argument( - "name", help="name of the storm", type=str) - parser.add_argument( - "year", help="year of the storm", type=int) - parser.add_argument( - "--rasters-dir", help="top-level directory that contains rasters") - - subparsers = parser.add_subparsers(dest='cmd') - subset_client = SubsetAndCombine(subparsers) - hurrmesh_client = HurricaneMesher(subparsers) - - args = parser.parse_args() - - logger.info(f"Mesh arguments are {args}.") - - main(args, [hurrmesh_client, subset_client]) diff --git a/docker/ocsmesh/environment.yml b/docker/ocsmesh/environment.yml deleted file 
mode 100644 index 8b2d996..0000000 --- a/docker/ocsmesh/environment.yml +++ /dev/null @@ -1,29 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - python=3.9 - - gdal - - geos - - proj - - netcdf4 - - udunits2 - - pyproj - - shapely>=1.8,<2 - - rasterio - - fiona - - pygeos - - geopandas - - utm - - scipy<1.8 - - numba - - numpy>=1.21 - - matplotlib - - requests - - tqdm - - mpi4py - - pyarrow - - pytz - - geoalchemy2 - - colored-traceback - - typing-extensions diff --git a/docker/post/docker/Dockerfile b/docker/post/docker/Dockerfile deleted file mode 100644 index 89171db..0000000 --- a/docker/post/docker/Dockerfile +++ /dev/null @@ -1,66 +0,0 @@ -FROM continuumio/miniconda3:4.10.3p0-alpine - -# Create a non-root user -ARG username=pyschismer -ARG uid=1000 -ARG gid=100 -ARG post_repo=odssm_post - -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apk update && apk upgrade && apk --no-cache add \ - git \ - gcc \ - g++ \ - make \ - cmake \ - patch \ - libstdc++ - -# New user -RUN adduser -D -g "Non-root user" -u $UID -h $HOME $USER - -# Create a project directory inside user home -ENV PROJECT_DIR $HOME/app -RUN mkdir $PROJECT_DIR -WORKDIR $PROJECT_DIR - - -# Build the conda environment -ENV ENV_PREFIX $HOME/icogsc - -COPY environment.yml /tmp/ -RUN chown $UID:$GID /tmp/environment.yml - -RUN conda install mamba -n base -c conda-forge && \ - mamba update --name base --channel defaults conda && \ - mamba env create --prefix $ENV_PREFIX --file /tmp/environment.yml --force && \ - mamba clean --all --yes - -RUN conda clean --all -RUN apk del git -RUN apk del gcc -RUN apk del g++ -RUN apk del make -RUN apk del cmake - - -RUN mkdir -p $PROJECT_DIR/scripts -COPY docker/*.py ${PROJECT_DIR}/scripts/ -ENV PYTHONPATH ${PROJECT_DIR}/scripts/ - -ENV CONDA_DIR /opt/conda - - -RUN mkdir -p $PROJECT_DIR/io - -USER $USER - -RUN echo "source $CONDA_DIR/etc/profile.d/conda.sh" >> ~/.profile - -# Ref: https://pythonspeed.com/articles/activate-conda-dockerfile/ -ENTRYPOINT ["conda", "run", "-p", "$ENV_PREFIX", "--no-capture-output", "python", "-m", "generate_viz"] diff --git a/docker/post/docker/generate_viz.py b/docker/post/docker/generate_viz.py deleted file mode 100644 index 1553e31..0000000 --- a/docker/post/docker/generate_viz.py +++ /dev/null @@ -1,1004 +0,0 @@ -""" -Dynamic map hindcast implementation - -############################################################### -# Original development from https://github.com/ocefpaf/python_hurricane_gis_map -# # Exploring the NHC GIS Data -# -# This notebook aims to demonstrate how to create a simple interactive GIS map with the National Hurricane Center predictions [1] and CO-OPS [2] observations along the Hurricane's path. -# -# -# 1. http://www.nhc.noaa.gov/gis/ -# 2. https://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/ -# -# -# NHC codes storms are coded with 8 letter names: -# - 2 char for region `al` → Atlantic -# - 2 char for number `11` is Irma -# - and 4 char for year, `2017` -# -# Browse http://www.nhc.noaa.gov/gis/archive_wsurge.php?year=2017 to find other hurricanes code. 
-############################################################### -""" - -__author__ = 'Saeed Moghimi' -__copyright__ = 'Copyright 2020, UCAR/NOAA' -__license__ = 'GPL' -__version__ = '1.0' -__email__ = 'moghimis@gmail.com' - -import argparse -import logging -import os -import sys -import pathlib -import warnings -from glob import glob -from datetime import datetime, timedelta, timezone -from importlib import resources - -import numpy as np -import pandas as pd -import arrow -import f90nml -from bokeh.resources import CDN -from bokeh.plotting import figure -from bokeh.models import Title -from bokeh.embed import file_html -from bokeh.models import Range1d, HoverTool -from branca.element import IFrame -import folium -from folium.plugins import Fullscreen, MarkerCluster, MousePosition -import netCDF4 -import matplotlib as mpl -import matplotlib.tri as Tri -import matplotlib.pyplot as plt -from shapely.geometry import Polygon, LineString, box -from geopandas import GeoDataFrame -import geopandas as gpd -from pyschism.mesh import Hgrid -import cfunits -from retrying import retry -from searvey import coops - -import defn as defn -import hurricane_funcs as hurr_f - -_logger = logging.getLogger() -mpl.use('Agg') - -warnings.filterwarnings("ignore", category=DeprecationWarning) - -EFS_MOUNT_POINT = pathlib.Path('~').expanduser() / 'app/io' - -def ceil_dt(date=datetime.now(), delta=timedelta(minutes=30)): - """ - Rounds up the input date based on the `delta` time period tolerance - - Examples - -------- - now = datetime.now() - print(now) - print(ceil_dt(now,timedelta(minutes=30) )) - - """ - - date_min = datetime.min - if date.tzinfo: - date_min = date_min.replace(tzinfo=date.tzinfo) - return date + (date_min - date) % delta - -@retry(stop_max_attempt_number=5, wait_fixed=3000) -def get_coops(start, end, sos_name, units, bbox, datum='NAVD', verbose=True): - """ - function to read COOPS data - We need to retry in case of failure b/c the server cannot handle - the high traffic during hurricane season. - """ - - - coops_stas = coops.coops_stations_within_region(region=box(*bbox)) - # TODO: NAVD 88? 
- coops_data = coops.coops_product_within_region( - 'water_level', region=box(*bbox), start_date=start, end_date=end) - station_names = [ - coops_stas[coops_stas.index == i].name.values[0] - for i in coops_data.nos_id.astype(int).values - ] - staobs_df = coops_data.assign( - {'station_name': ('nos_id', station_names)} - ).reset_coords().drop( - ['f', 's', 'q', 'nws_id'] - ).to_dataframe().reset_index( - level='nos_id' - ).rename( - columns={ - 'x': 'lon', - 'y': 'lat', - 'v': 'ssh', - 't': 'time', - 'nos_id': 'station_code', - } - ).astype( - {'station_code': 'int64'} - ) - staobs_df.index = staobs_df.index.tz_localize(tz=timezone.utc) - - return staobs_df - - - -def make_plot_1line(obs, label=None): - # TOOLS="hover,crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,undo,redo,reset,tap,save,box_select,poly_select,lasso_select," - TOOLS = 'crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,reset,save,' - - p = figure( - toolbar_location='above', - x_axis_type='datetime', - width=defn.width, - height=defn.height, - tools=TOOLS, - ) - - if obs.station_code.isna().sum() == 0: - station_code = obs.station_code.array[0] - p.add_layout( - Title(text=f"Station: {station_code}", - text_font_style='italic'), - 'above') - - if obs.station_name.isna().sum() == 0: - station_name = obs.station_name.array[0] - p.add_layout( - Title(text=station_name, text_font_size='10pt'), - 'above') - - p.yaxis.axis_label = label - - obs_val = obs.ssh.to_numpy().squeeze() - - l1 = p.line( - x=obs.index, - y=obs_val, - line_width=5, - line_cap='round', - line_join='round', - legend_label='model', - color='#0000ff', - alpha=0.7, - ) - - minx = obs.index.min() - maxx = obs.index.max() - - p.x_range = Range1d(start=minx, end=maxx) - - p.legend.location = 'top_left' - - p.add_tools(HoverTool(tooltips=[('model', '@y'), ], renderers=[l1], ), ) - return p - - -def make_plot_2line(obs, model=None, label=None, remove_mean_diff=False, bbox_bias=0.0): - # TOOLS="hover,crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,undo,redo,reset,tap,save,box_select,poly_select,lasso_select," - TOOLS = 'crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,reset,save,' - - p = figure( - toolbar_location='above', - x_axis_type='datetime', - width=defn.width, - height=defn.height, - tools=TOOLS, - ) - - if obs.station_code.isna().sum() == 0: - station_code = obs.station_code.array[0] - p.add_layout( - Title(text=f"Station: {station_code}", - text_font_style='italic'), - 'above') - - if obs.station_name.isna().sum() == 0: - station_name = obs.station_name.array[0] - p.add_layout( - Title(text=station_name, text_font_size='10pt'), - 'above') - - p.yaxis.axis_label = label - - obs_val = obs.ssh.to_numpy().squeeze() - - l1 = p.line( - x=obs.index, - y=obs_val, - line_width=5, - line_cap='round', - line_join='round', - legend_label='obs.', - color='#0000ff', - alpha=0.7, - ) - - if model is not None: - mod_val = model.ssh.to_numpy().squeeze() - - if ('SSH' in label) and remove_mean_diff: - mod_val = mod_val + obs_val.mean() - mod_val.mean() - - if ('SSH' in label) and bbox_bias is not None: - mod_val = mod_val + bbox_bias - - l0 = p.line( - x=model.index, - y=mod_val, - line_width=5, - line_cap='round', - line_join='round', - legend_label='model', - color='#9900cc', - alpha=0.7, - ) - - - minx = max(model.index.min(), obs.index.min()) - maxx = min(model.index.max(), obs.index.max()) - - minx = model.index.min() - maxx = model.index.max() - else: - minx = obs.index.min() - maxx = obs.index.max() - - p.x_range = Range1d(start=minx, end=maxx) - - 
p.legend.location = 'top_left' - - p.add_tools( - HoverTool(tooltips=[('model', '@y'), ], renderers=[l0], ), - HoverTool(tooltips=[('obs', '@y'), ], renderers=[l1], ), - ) - - return p - - -################# -def make_marker(p, location, fname, color='green', icon='stats'): - html = file_html(p, CDN, fname) - # iframe = IFrame(html , width=defn.width+45+defn.height, height=defn.height+80) - iframe = IFrame(html, width=defn.width * 1.1, height=defn.height * 1.2) - # popup = folium.Popup(iframe, max_width=2650+defn.height) - popup = folium.Popup(iframe) - iconm = folium.Icon(color=color, icon=icon) - marker = folium.Marker(location=location, popup=popup, icon=iconm) - return marker - - -############################### -def read_max_water_level_file(fgrd='hgrid.gr3', felev='maxelev.gr3', cutoff=True): - - hgrid = Hgrid.open(fgrd, crs='EPSG:4326') - h = -hgrid.values - bbox = hgrid.get_bbox('EPSG:4326', output_type='bbox') - - elev = Hgrid.open(felev, crs='EPSG:4326') - mzeta = -elev.values - D = mzeta - - #Mask dry nodes - NP = len(mzeta) - idxs = np.where(h < 0) - D[idxs] = np.maximum(0, mzeta[idxs]+h[idxs]) - - idry = np.zeros(NP) - idxs = np.where(mzeta+h <= 1e-6) - idry[idxs] = 1 - - MinVal = np.min(mzeta) - MaxVal = np.max(mzeta) - NumLevels = 21 - - if cutoff: - MinVal = max(MinVal, 0.0) - MaxVal = min(MaxVal, 2.4) - NumLevels = 12 - _logger.info(f'MinVal is {MinVal}') - _logger.info(f'MaxVal is {MaxVal}') - - step = 0.2 # m - levels = np.arange(MinVal, MaxVal + step, step=step) - _logger.info(f'levels is {levels}') - - fig, ax = plt.subplots() - tri = elev.triangulation - mask = np.any(np.where(idry[tri.triangles], True, False), axis=1) - tri.set_mask(mask) - - contour = ax.tricontourf( - tri, - mzeta, - vmin=MinVal, - vmax=MaxVal, - levels=levels, - cmap=defn.my_cmap, - extend='max') - - return contour, MinVal, MaxVal, levels - - -############################################################# -def contourf_to_geodataframe(contour_obj): - - """Transform a `matplotlib.contour.ContourSet` to a GeoDataFrame""" - - polygons, colors = [], [] - for i, polygon in enumerate(contour_obj.collections): - mpoly = [] - for path in polygon.get_paths(): - try: - path.should_simplify = False - poly = path.to_polygons() - # Each polygon should contain an exterior ring + maybe hole(s): - exterior, holes = [], [] - if len(poly) > 0 and len(poly[0]) > 3: - # The first of the list is the exterior ring : - exterior = poly[0] - # Other(s) are hole(s): - if len(poly) > 1: - holes = [h for h in poly[1:] if len(h) > 3] - mpoly.append(Polygon(exterior, holes)) - except: - _logger.warning('Warning: Geometry error when making polygon #{}'.format(i)) - if len(mpoly) > 1: - mpoly = MultiPolygon(mpoly) - polygons.append(mpoly) - colors.append(polygon.get_facecolor().tolist()[0]) - elif len(mpoly) == 1: - polygons.append(mpoly[0]) - colors.append(polygon.get_facecolor().tolist()[0]) - return GeoDataFrame(geometry=polygons, data={'RGBA': colors}, crs={'init': 'epsg:4326'}) - - -################# -def convert_to_hex(rgba_color): - red = str(hex(int(rgba_color[0] * 255)))[2:].capitalize() - green = str(hex(int(rgba_color[1] * 255)))[2:].capitalize() - blue = str(hex(int(rgba_color[2] * 255)))[2:].capitalize() - - if blue == '0': - blue = '00' - if red == '0': - red = '00' - if green == '0': - green = '00' - - return '#' + red + green + blue - - -################# -def get_model_station_ssh(sim_date, sta_in_file, sta_out_file, stations_info): - """Read model ssh""" - - station_dist_tolerance = 0.0001 # degrees - - # Get 
rid of time zone and convert to string "YYYY-MM-DD
T" - sim_date_str = sim_date.astimezone(timezone.utc).strftime('%Y-%m-%dT%H') - - - #Read model output - sta_data = np.loadtxt(sta_out_file) - time_deltas = sta_data[:, 0].ravel().astype('timedelta64[s]') - sta_date = pd.DatetimeIndex( - data=np.datetime64(sim_date_str) + time_deltas, - tz=timezone.utc, - name="date_time") - - sta_zeta = sta_data[:, 1:] - - _logger.debug(len(sta_zeta[:,1])) - _logger.debug(type(sta_date)) - _logger.debug(type(sta_zeta)) - - df_staout = pd.DataFrame(data=sta_zeta, index=sta_date) - df_staout_melt = df_staout.melt( - ignore_index=False, - value_name="ssh", - var_name="staout_index") - - df_stain = pd.read_csv( - sta_in_file, - sep=' ', - header=None, - skiprows=2, - usecols=[1, 2], - names=["lon", "lat"]) - - df_stasim = df_staout_melt.merge( - df_stain, left_on='staout_index', right_index=True) - - gdf_stasim = gpd.GeoDataFrame( - df_stasim, - geometry=gpd.points_from_xy(df_stasim.lon, df_stasim.lat)) - - gdf_sta_info = gpd.GeoDataFrame( - stations_info, - geometry=gpd.points_from_xy( - stations_info.lon, stations_info.lat)) - - gdf_staout_w_info = gpd.sjoin_nearest( - gdf_stasim, - gdf_sta_info.drop(columns=['lon','lat']), - lsuffix='staout', rsuffix='real_station', - max_distance=station_dist_tolerance) - - # Now go back to DF or keep GDF and remove lon lat columns? - df_staout_w_info = pd.DataFrame(gdf_staout_w_info.drop( - columns=['geometry'])) - df_staout_w_info = df_staout_w_info.rename( - columns={'nos_id': 'station_code'} - ).astype( - {'station_code': 'int64'} - ) - - # TODO: Reset index or keep date as index? -# df_staout_w_info['date_time'] = df_staout_w_info.index -# df_staout_w_info = df_staout_w_info.reset_index(drop=True) - - - return df_staout_w_info - - - -################ -def get_storm_bbox(cone_gdf_list, pos_gdf_list, bbox_from_track=True): - # Find the bounding box to search the data. - last_cone = cone_gdf_list[-1]['geometry'].iloc[0] - track = LineString([point['geometry'] for point in pos_gdf_list]) - if bbox_from_track: - track_lons = track.coords.xy[0] - track_lats = track.coords.xy[1] - bbox = ( - min(track_lons) - 2, min(track_lats) - 2, - max(track_lons) + 2, max(track_lats) + 2, - ) - else: - bounds = np.array([last_cone.buffer(2).bounds, track.buffer(2).bounds]).reshape(4, 2) - lons, lats = bounds[:, 0], bounds[:, 1] - bbox = lons.min(), lats.min(), lons.max(), lats.max() - - return bbox - - -def get_storm_dates(pos_gdf_list): - # Ignoring the timezone, like AST (Atlantic Time Standard) b/c - # those are not a unique identifiers and we cannot disambiguate. - - if 'FLDATELBL' in pos_gdf_list[0].keys(): - start = pos_gdf_list[0]['FLDATELBL'] - end = pos_gdf_list[-1]['FLDATELBL'] - date_format = 'YYYY-MM-DD h:mm A ddd' - - elif 'ADVDATE' in pos_gdf_list[0].keys(): - # older versions (e.g. IKE) - start = pos_gdf_list[0]['ADVDATE'] - end = pos_gdf_list[-1]['ADVDATE'] - date_format = 'YYMMDD/hhmm' - - else: - msg = 'Check for correct time stamp and adapt the code !' - _logger.error(msg) - raise ValueError(msg) - - beg_date = arrow.get(start, date_format).datetime - end_date = arrow.get(end, date_format).datetime - - return beg_date, end_date - - -def get_stations_info(bbox): - - # Not using static local file anymore! 
- # We should get the same stations we use for observation -# df = coops.coops_stations() - df = coops.coops_stations_within_region(region=box(*bbox)) - stations_info = df.assign( - lon=df.geometry.apply('x'), - lat=df.geometry.apply('y'), - ).reset_index().rename( - {'name': 'station_name', 'nos_id': 'station_code'} - ) - - # Some stations are duplicate with different NOS ID but the same NWS ID - stations_info = stations_info.drop_duplicates(subset=['nws_id']) - stations_info = stations_info[stations_info.nws_id != ''] - - - return stations_info - - -def get_adjusted_times_for_station_outputs(staout_df_w_info, freq): - - # Round up datetime smaller than 30 minutes - start_date = ceil_dt(staout_df_w_info.index.min().to_pydatetime()) - end_date = ceil_dt(staout_df_w_info.index.max().to_pydatetime()) - - new_index_dates = pd.date_range( - start=start_date.replace(tzinfo=None), - end=end_date.replace(tzinfo=None), - freq=freq, - tz=start_date.tzinfo - ) - - return new_index_dates - - -def adjust_stations_time_and_data(time_indexed_df, freq, groupby): - - new_index_dates = get_adjusted_times_for_station_outputs( - time_indexed_df, freq) - - - adj_staout_df = pd.concat( - df.reindex( - index=new_index_dates, limit=1, method='nearest').drop_duplicates() - for idx, df in time_indexed_df.groupby(by=groupby)) - adj_staout_df.loc[np.abs(adj_staout_df['ssh']) > 10, 'ssh'] = np.nan - adj_staout_df = adj_staout_df[adj_staout_df.ssh.notna()] - - return adj_staout_df - - -def get_esri_url(layer_name): - - pos = 'MapServer/tile/{z}/{y}/{x}' - base = 'http://services.arcgisonline.com/arcgis/rest/services' - layer_info = dict( - Imagery='World_Imagery/MapServer', - Ocean_Base='Ocean/World_Ocean_Base', - Topo_Map='World_Topo_Map/MapServer', - Physical_Map='World_Physical_Map/MapServer', - Terrain_Base='World_Terrain_Base/MapServer', - NatGeo_World_Map='NatGeo_World_Map/MapServer', - Shaded_Relief='World_Shaded_Relief/MapServer', - Ocean_Reference='Ocean/World_Ocean_Reference', - Navigation_Charts='Specialty/World_Navigation_Charts', - Street_Map='World_Street_Map/MapServer' - ) - - layer = layer_info.get(layer_name) - if layer is None: - layer = layer_info['Imagery'] - - url = f'{base}/{layer}/{pos}' - return url - - -def folium_create_base_map(bbox_str, layer_name_list=None): - - # Here is the final result. Explore the map by clicking on - # the map features plotted! 
- bbox_ary = np.fromstring(bbox_str, sep=',') - lon = 0.5 * (bbox_ary[0] + bbox_ary[2]) - lat = 0.5 * (bbox_ary[1] + bbox_ary[3]) - - m = folium.Map( - location=[lat, lon], - tiles='OpenStreetMap', - zoom_start=4, control_scale=True) - Fullscreen(position='topright', force_separate_button=True).add_to(m) - - if layer_name_list is None: - return m - - for lyr_nm in layer_name_list: - url = get_esri_url(lyr_nm) - - lyr = folium.TileLayer(tiles=url, name=lyr_nm, attr='ESRI', overlay=False) - lyr.add_to(m) - - return m - - -def folium_add_max_water_level_contour(map_obj, max_water_level_contours_gdf, MinVal, MaxVal): - ## Get colors in Hex - colors_elev = [] - for i in range(len(max_water_level_contours_gdf)): - color = defn.my_cmap(i / len(max_water_level_contours_gdf)) - colors_elev.append(mpl.colors.to_hex(color)) - - # assign to geopandas obj - max_water_level_contours_gdf['RGBA'] = colors_elev - - # plot geopandas obj - maxele = folium.GeoJson( - max_water_level_contours_gdf, - name='Maximum water level [m above MSL]', - style_function=lambda feature: { - 'fillColor': feature['properties']['RGBA'], - 'color': feature['properties']['RGBA'], - 'weight': 1.0, - 'fillOpacity': 0.6, - 'line_opacity': 0.6, - }, - ) - - maxele.add_to(map_obj) - - # Add colorbar - color_scale = folium.StepColormap( - colors_elev, - # index=color_domain, - vmin=MinVal, - vmax=MaxVal, - caption='Maximum water level [m above MSL]', - ) - map_obj.add_child(color_scale) - - -def folium_add_ssh_time_series(map_obj, staout_df_w_info, obs_df=None): -# marker_cluster_estofs_ssh = MarkerCluster(name='CO-OPS SSH observations') - marker_cluster_estofs_ssh = MarkerCluster(name='Simulation SSH [m above MSL]') - - _logger.info(' > plot model only') - - - by = ["staout_index", "station_code"] - for (staout_idx, st_code), df in staout_df_w_info.groupby(by=by): - fname = df.station_code.array[0] - location = df.lat.array[0], df.lon.array[0] - if st_code is None or obs_df is None: - p = make_plot_1line(df, label='SSH [m above MSL]') - else: - p = make_plot_2line( - obs=obs_df[obs_df.station_code == st_code], - remove_mean_diff=True, - model=df, label='SSH [m]') - marker = make_marker(p, location=location, fname=fname) - marker.add_to(marker_cluster_estofs_ssh) - - marker_cluster_estofs_ssh.add_to(map_obj) - - -def folium_add_bbox(map_obj, bbox_str): - ## Plotting bounding box - bbox_ary = np.fromstring(bbox_str, sep=',') - p = folium.PolyLine(get_coordinates(bbox_ary), color='#009933', weight=2, opacity=0.6) - - p.add_to(map_obj) - - -def folium_add_storm_latest_cone(map_obj, cone_gdf_list, pos_gdf_list): - latest_cone_style = { - 'fillOpacity': 0.1, - 'color': 'red', - 'stroke': 1, - 'weight': 1.5, - 'opacity': 0.8, - } - # Latest cone prediction. 
- latest = cone_gdf_list[-1] - ### - if 'FLDATELBL' in pos_gdf_list[0].keys(): # Newer storms have this information - names3 = 'Cone prediction as of {}'.format(latest['ADVDATE'].values[0]) - else: - names3 = 'Cone prediction' - ### - folium.GeoJson( - data=latest.__geo_interface__, - name=names3, - style_function=lambda feat: latest_cone_style, - ).add_to(map_obj) - - -def folium_add_storm_all_cones( - map_obj, - cone_gdf_list, - pos_gdf_list, - track_radius, - storm_name, - storm_year - ): - cone_style = { - 'fillOpacity': 0, - 'color': 'lightblue', - 'stroke': 1, - 'weight': 0.3, - 'opacity': 0.3, - } - marker_cluster1 = MarkerCluster(name='NHC cone predictions') - marker_cluster1.add_to(map_obj) - if 'FLDATELBL' not in pos_gdf_list[0].keys(): # Newer storms have this information - names3 = 'Cone prediction' - - # Past cone predictions. - for cone in cone_gdf_list[:-1]: - folium.GeoJson( - data=cone.__geo_interface__, style_function=lambda feat: cone_style, - ).add_to(marker_cluster1) - - # Latest points prediction. - for k, row in last_pts.iterrows(): - - if 'FLDATELBL' in pos_gdf_list[0].keys(): # Newer storms have this information - date = row['FLDATELBL'] - hclass = row['TCDVLP'] - popup = '{}
<br>{}'.format(date, hclass) - if 'tropical' in hclass.lower(): - hclass = 'tropical depression' - - color = defn.colors_hurricane_condition[hclass.lower()] - else: - popup = '{}<br>
{}'.format(storm_name, storm_year) - color = defn.colors_hurricane_condition['hurricane'] - - location = row['LAT'], row['LON'] - folium.CircleMarker( - location=location, - radius=track_radius, - fill=True, - color=color, - popup=popup, - ).add_to(map_obj) - - -def folium_add_storm_track( - map_obj, - pos_gdf_list, - track_radius, - storm_name, - storm_year - ): - # marker_cluster3 = MarkerCluster(name='Track') - # marker_cluster3.add_to(map_obj) - for point in pos_gdf_list: - if 'FLDATELBL' in pos_gdf_list[0].keys(): # Newer storms have this information - date = point['FLDATELBL'] - hclass = point['TCDVLP'] - popup = defn.template_track_popup.format( - storm_name, date, hclass) - - if 'tropical' in hclass.lower(): - hclass = 'tropical depression' - - color = defn.colors_hurricane_condition[hclass.lower()] - else: - popup = '{}
{}'.format(storm_name, storm_year) - color = defn.colors_hurricane_condition['hurricane'] - - location = point['LAT'], point['LON'] - folium.CircleMarker( - location=location, radius=track_radius, fill=True, color=color, popup=popup, - ).add_to(map_obj) - - -def get_schism_date(param_file): - - # Use f90nml to read parameters from the input mirror - - # &OPT - # start_year = 2000 !int - # start_month = 1 !int - # start_day = 1 !int - # start_hour = 0 !double - # utc_start = 8 !double - # / - params = f90nml.read(str(param_file)) - - opt = params.get('opt', {}) - - year = opt.get('start_year', 2000) - month = opt.get('start_month', 1) - day = opt.get('start_day', 1) - hour = int(opt.get('start_hour', 0)) - tz_rel_utc = opt.get('utc_start', 8) - - sim_tz = timezone(timedelta(hours=tz_rel_utc)) - sim_date = datetime(year, month, day, hour, tzinfo=sim_tz) - - return sim_date - - -def folium_finalize_map(map_obj, storm_name, storm_year, Date, FCT): - html = map_obj.get_root().html - if storm_name and storm_year: - html.add_child(folium.Element( - defn.template_storm_info.format(storm_name,storm_year))) - - html.add_child(folium.Element(defn.template_fct_info.format(Date, FCT))) - html.add_child(folium.Element(defn.disclaimer)) - - folium.LayerControl().add_to(map_obj) - MousePosition().add_to(map_obj) - - -def main(args): - - schism_dir = EFS_MOUNT_POINT / args.schismdir - storm_name = args.name - storm_year = args.year - - storm_tag = f"{storm_name.upper()}_{storm_year}" - grid_file = schism_dir / "hgrid.gr3" - - draw_bbox = False - plot_cones = True - plot_latest_cone_only = True - track_radius = 5 - freq = '30min' - - sta_in_file = schism_dir / "station.in" - if not sta_in_file.exists(): - _logger.warning('Stations input file is not found!') - sta_in_file = None - - results_dir = schism_dir / "outputs" - if not results_dir.exists(): - raise ValueError("Simulation results directory not found!") - - _logger.info(f'results_dir: {str(results_dir)}') - - sta_out_file = results_dir / 'staout_1' - if not sta_out_file.exists(): - _logger.warning('Points time-series file is not found!') - sta_out_file = None - - felev = results_dir / 'maxelev.gr3' - if not felev.exists(): - raise FileNotFoundError('Maximum elevation file is not found!') - - param_file = results_dir / 'param.out.nml' - if not param_file.exists(): - raise FileNotFoundError('Parameter file not found!') - - - if not grid_file.exists(): - raise FileNotFoundError('Grid file not found!') - - - post_dir = schism_dir / 'viz' - if not post_dir.exists(): - post_dir.mkdir(exist_ok=True, parents=True) - - - no_sta = False - if sta_out_file is None or sta_in_file is None: - # Station in file is needed for lat-lon - no_sta = True - - -##################################################### - sim_date = get_schism_date(param_file) - Date = sim_date.strftime('%Y%m%d') - FCT = ceil_dt(sim_date, timedelta(hours=6)).hour - - -##################################################### - bbox_str = args.bbox_str - if storm_tag is not None: - - _logger.info(f' > Read NHC information for {storm_name} {storm_year} ... 
') - ts_code, hurr_prod_tag = hurr_f.get_nhc_storm_info(str(storm_year), storm_name) - - # download gis zip files - hurr_gis_path = hurr_f.download_nhc_gis_files(hurr_prod_tag, post_dir) - - # get advisory cones and track points - cone_gdf_list, pos_gdf_list, last_pts = hurr_f.read_advisory_cones_info( - hurr_prod_tag, hurr_gis_path, str(storm_year), ts_code) - - bbox = get_storm_bbox(cone_gdf_list, pos_gdf_list) - start_date, end_date = get_storm_dates(pos_gdf_list) - - bbox_str = ', '.join(format(v, '.2f') for v in bbox) - _logger.info(' > bbox: {}\nstart: {}\n end: {}'.format( - bbox_str, start_date, end_date)) - - -##################################################### - obs_df = None - if not no_sta: - - stations_info = get_stations_info(bbox) - staout_df_w_info = get_model_station_ssh( - sim_date, sta_in_file, sta_out_file, stations_info) - adj_station_df = adjust_stations_time_and_data( - staout_df_w_info, freq, "staout_index") - - - start_dt = adj_station_df.index.min().to_pydatetime() - end_dt = adj_station_df.index.max().to_pydatetime() - - all_obs_df = get_coops( - start=start_dt, - end=end_dt, - sos_name='water_surface_height_above_reference_datum', - units=cfunits.Units('meters'), - datum = 'MSL', - bbox=bbox, - ) - - # Get observation from stations that have a corresponding - # model time history output - obs_df = all_obs_df[all_obs_df.station_code.isin( - np.unique(adj_station_df.station_code.to_numpy()))] - - # To get smaller html file -# obs_df = adjust_stations_time_and_data( -# obs_df, freq, "station_code") - - -##################################################### - _logger.info(' > Put together the final map') - m = folium_create_base_map(bbox_str, layer_name_list=["Imagery"]) - - -##################################################### - _logger.info(' > Plot max water elev ..') - contour, MinVal, MaxVal, levels = read_max_water_level_file(fgrd=grid_file, felev=felev) - max_water_level_contours_gdf = contourf_to_geodataframe(contour) - - folium_add_max_water_level_contour(m, max_water_level_contours_gdf, MinVal, MaxVal) - -##################################################### - if not no_sta: - _logger.info(' > Plot SSH stations ..') - - folium_add_ssh_time_series(m, adj_station_df, obs_df) - -##################################################### - if draw_bbox: - folium_add_bbox(m, bbox_str) - -##################################################### - if storm_tag is not None: - if plot_cones: - _logger.info(' > Plot NHC cone predictions') - - if plot_latest_cone_only: - folium_add_storm_latest_cone(m, cone_gdf_list, pos_gdf_list) - else: - folium_add_storm_all_cones( - m, cone_gdf_list, pos_gdf_list, - track_radius, - storm_name, storm_year) - - _logger.info(' > Plot points along the final track ..') - folium_add_storm_track( - m, pos_gdf_list, track_radius, - storm_name, storm_year) - - -##################################################### - _logger.info(' > Add disclaimer and storm name ..') - folium_finalize_map(m, storm_name, storm_year, Date, FCT) - - _logger.info(' > Save file ...') - - - fname = os.path.join(post_dir, '{}_{}_{}.html'.format(storm_tag, Date, FCT)) - _logger.info(fname) - m.save(fname) - - -def entry(): - parser = argparse.ArgumentParser() - - parser.add_argument( - "name", help="name of the storm", type=str) - - parser.add_argument( - "year", help="year of the storm", type=int) - - parser.add_argument( - "schismdir", type=pathlib.Path) - - parser.add_argument('--vdatum', default='MSL') - parser.add_argument( - '--bbox-str', - 
default='-99.0,5.0,-52.8,46.3', - help='format: lon_min,lat_min,lon_max,lat_max') - - main(parser.parse_args()) - -if __name__ == "__main__": - warnings.filterwarnings("ignore", category=DeprecationWarning) - entry() diff --git a/docker/post/environment.yml b/docker/post/environment.yml deleted file mode 100644 index 605d54c..0000000 --- a/docker/post/environment.yml +++ /dev/null @@ -1,96 +0,0 @@ -name: odssm-post-env -channels: - - conda-forge - - defaults -dependencies: - - python>=3.9 # because of searvey - - pygeos - - geos - - gdal - - proj - - pyproj - - cartopy - - udunits2 - - shapely>=1.8.0 - - arrow - - attrs - - backcall - - beautifulsoup4 - - bokeh - - branca - - brotlipy - - bs4 - - certifi - - cffi - - cftime - - cfunits - - cfgrib - - chardet - - click - - click-plugins - - cligj - - cryptography - - cycler - - decorator - - f90nml - - fiona - - folium - - gdal - - geopandas - - geos - - geotiff - - glib - - icu - - idna - - ipython - - ipython_genutils - - jedi - - jinja2 - - kiwisolver - - krb5 - - lxml - - markupsafe - - matplotlib - - munch - - netcdf4 - - hdf5 - - numpy - - olefile - - packaging - - pandas - - parso - - pexpect - - pickleshare - - pillow - - prompt-toolkit - - ptyprocess - - pycparser - - pygeos - - pygments - - pyopenssl - - pyparsing - - pyproj - - pysocks - - python-wget - - pytz - - pyyaml - - readline - - requests - - retrying - - rtree - - setuptools - - shapely - - six - - searvey - - soupsieve - - tbb - - tiledb - - tk - - tornado - - traitlets - - typing_extensions - - wcwidth - - wheel - - zstd - - pip: - - pyschism diff --git a/docker/prefect-aws/Dockerfile b/docker/prefect-aws/Dockerfile deleted file mode 100755 index c4c0e6f..0000000 --- a/docker/prefect-aws/Dockerfile +++ /dev/null @@ -1,72 +0,0 @@ -FROM continuumio/miniconda3:22.11.1-alpine - -# Create a non-root user -ARG username=ocsmesher -ARG uid=1000 -ARG gid=100 - -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apk update && apk upgrade && apk --no-cache add \ - tzdata \ - libstdc++ \ - groff \ - less \ - curl \ - zip - -# New user -RUN adduser -D -g "Non-root user" -u $UID -h $HOME $USER - -# Build the conda environment -COPY environment.yml /tmp/ -RUN chown $UID:$GID /tmp/environment.yml - -RUN conda install mamba -n base -c conda-forge && \ - mamba update --name base --channel defaults conda && \ - mamba env create --name odssm --file /tmp/environment.yml --force && \ - mamba clean --all --yes - - -ENV CONDA_DIR /opt/conda - -# run the postBuild script to install any JupyterLab extensions - - - -# AWS has its own python distro -RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \ - unzip awscliv2.zip && \ - ./aws/install && \ - rm -rf awscliv2.zip aws - -RUN mkdir -p /scripts -COPY pw_client.py /scripts/pw_client.py -ENV PYTHONPATH=/scripts - -RUN source $CONDA_DIR/etc/profile.d/conda.sh && \ - conda activate odssm && \ - pip install dunamai && \ - conda deactivate - -RUN apk del curl zip - - -# Set default entry -COPY entrypoint.sh /usr/local/bin/ -RUN chown $UID:$GID /usr/local/bin/entrypoint.sh && \ - chmod u+x /usr/local/bin/entrypoint.sh - -# https://github.com/PrefectHQ/prefect/issues/3061 -ENV TZ UTC -RUN cp /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone - -USER $USER - -RUN echo "source $CONDA_DIR/etc/profile.d/conda.sh" >> ~/.profile - -ENTRYPOINT [ "/usr/local/bin/entrypoint.sh" ] diff --git a/docker/prefect-aws/entrypoint.sh 
b/docker/prefect-aws/entrypoint.sh deleted file mode 100644 index 2b95ddf..0000000 --- a/docker/prefect-aws/entrypoint.sh +++ /dev/null @@ -1,4 +0,0 @@ -#!/bin/sh --login -set -e -conda activate odssm -exec "$@" diff --git a/docker/prefect-aws/environment.yml b/docker/prefect-aws/environment.yml deleted file mode 100644 index 13fe544..0000000 --- a/docker/prefect-aws/environment.yml +++ /dev/null @@ -1,12 +0,0 @@ -name: odssm -channels: - - conda-forge - - defaults -dependencies: - - python=3.10 - - prefect=1.4, <2 - - cloudpickle - - requests - - dnspython - - boto3 - - dunamai diff --git a/docker/prefect-aws/pw_client.py b/docker/prefect-aws/pw_client.py deleted file mode 100644 index 8cf5aea..0000000 --- a/docker/prefect-aws/pw_client.py +++ /dev/null @@ -1,112 +0,0 @@ -import requests -import json -import pprint as pp - -class Client(): - - def __init__(self, url, key): - self.url = url - self.api = url+'/api' - self.key = key - self.session = requests.Session() - self.headers = { - 'Content-Type': 'application/json' - } - - def upload_dataset(self, filename, path): - req = self.session.post(self.api + "/datasets/upload?key="+self.key, - data={'dir': path}, - files={'file':open(filename, 'rb')}) - req.raise_for_status() - data = json.loads(req.text) - return data - - def download_dataset(self, file): - url=self.api + "/datasets/download?key=" + self.key + '&file=' + file - #print url - req = self.session.get(url) - req.raise_for_status() - return req.content - - def find_datasets(self, path, ext=''): - url = self.api + "/datasets/find?key=" + self.key + "&path=" + path + "&ext=" + ext - #print url - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data - - def get_job_tail(self, jid, file, lastline): - url = self.api + "/jobs/"+jid+"/tail?key=" + self.key + "&file=" + file + "&line="+str(lastline) - try: - req = self.session.get(url) - req.raise_for_status() - data = req.text - except: - data = "" - return data - - def start_job(self,workflow,inputs,user): - inputs = json.dumps(inputs) - req = self.session.post(self.api + "/tools",data={'user':user,'tool_xml': "/workspaces/"+user+"/workflows/"+workflow+"/workflow.xml",'key':self.key,'tool_id':workflow,'inputs':inputs}) - req.raise_for_status() - data = json.loads(req.text) - jid=data['jobs'][0]['id'] - djid=str(data['decoded_job_id']) - return jid,djid - - def get_job_state(self, jid): - url = self.api + "/jobs/"+ jid + "?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data['state'] - - def get_job_credit_info(self, jid): - url = self.api + "/jobs/"+ jid + "/monitor?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - # return data['info'] - return data - - def get_resources(self): - req = self.session.get(self.api + "/resources?key=" + self.key) - req.raise_for_status() - data = json.loads(req.text) - return data - - def get_resource(self, name): - req = self.session.get(self.api + "/resources/list?key=" + self.key + "&name=" + name) - req.raise_for_status() - try: - data = json.loads(req.text) - return data - except: - return None - - def start_resource(self, name): - req = self.session.get(self.api + "/resources/start?key=" + self.key + "&name=" + name) - req.raise_for_status() - return req.text - - def stop_resource(self, name): - req = self.session.get(self.api + "/resources/stop?key=" + self.key + "&name=" + name) - req.raise_for_status() - return req.text - - def 
update_resource(self, name, params): - update = "&name={}".format(name) - for key, value in params.items(): - update = "{}&{}={}".format(update, key, value) - req = self.session.post(self.api + "/resources/set?key=" + self.key + update) - req.raise_for_status() - return req.text - - def get_account(self): - url = self.api + "/account?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data - \ No newline at end of file diff --git a/docker/pyschism/docker/.env b/docker/pyschism/docker/.env deleted file mode 100644 index a09408b..0000000 --- a/docker/pyschism/docker/.env +++ /dev/null @@ -1 +0,0 @@ -PYSCHISM_USER=pyschismer diff --git a/docker/pyschism/docker/Dockerfile b/docker/pyschism/docker/Dockerfile deleted file mode 100644 index 8873c0e..0000000 --- a/docker/pyschism/docker/Dockerfile +++ /dev/null @@ -1,68 +0,0 @@ -FROM continuumio/miniconda3:22.11.1-alpine - -# Create a non-root user -ARG username=pyschismer -ARG uid=1000 -ARG gid=100 - -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apk update && apk upgrade && apk add \ - git - -# New user -RUN adduser -D -g "Non-root user" -u $UID -h $HOME $USER - -# Create a project directory inside user home -ENV PROJECT_DIR $HOME/app -RUN mkdir $PROJECT_DIR -WORKDIR $PROJECT_DIR - - -# Build the conda environment -ENV ENV_PREFIX $HOME/icogsc - -COPY environment.yml /tmp/ -RUN chown $UID:$GID /tmp/environment.yml - -RUN conda install mamba -n base -c conda-forge && \ - mamba update --name base --channel defaults conda && \ - mamba env create --prefix $ENV_PREFIX --file /tmp/environment.yml --force && \ - mamba clean --all --yes - -# TODO: After perturbation schism branch is merged update this -# conda run -p $ENV_PREFIX --no-capture-output \ -# pip install "ensembleperturbation>=1.0.0" -RUN git clone https://github.com/schism-dev/pyschism.git && \ - git -C pyschism checkout 96e52fd && \ - conda run -p $ENV_PREFIX --no-capture-output \ - pip install ./pyschism && \ - rm -rf pyschism && \ - conda run -p $ENV_PREFIX --no-capture-output \ - pip install "coupledmodeldriver>=1.6.3" && \ - conda run -p $ENV_PREFIX --no-capture-output \ - pip install "ensembleperturbation>=1.1.2" - -ENV CONDA_DIR /opt/conda - -RUN conda clean --all -RUN apk del git - -RUN mkdir -p $PROJECT_DIR/io -RUN chown -R $UID:$GID $HOME - -USER $USER - -RUN mkdir -p $PROJECT_DIR/scripts -COPY docker/*.py ${PROJECT_DIR}/scripts/ -COPY docker/refs ${PROJECT_DIR}/refs/ -ENV PYTHONPATH ${PROJECT_DIR}/scripts/ - -RUN mkdir -p $HOME/.local/share/pyschism - -# Ref: https://pythonspeed.com/articles/activate-conda-dockerfile/ -ENTRYPOINT ["conda", "run", "-p", "$ENV_PREFIX", "--no-capture-output", "python", "-m"] diff --git a/docker/pyschism/docker/analyze_ensemble.py b/docker/pyschism/docker/analyze_ensemble.py deleted file mode 100644 index 77c4d3e..0000000 --- a/docker/pyschism/docker/analyze_ensemble.py +++ /dev/null @@ -1,354 +0,0 @@ -from argparse import ArgumentParser -from pathlib import Path -import pickle - -import chaospy -import dask -from matplotlib import pyplot -import numpy -from sklearn.linear_model import LassoCV, ElasticNetCV, LinearRegression -from sklearn.model_selection import ShuffleSplit, LeaveOneOut -import xarray - -from ensembleperturbation.parsing.adcirc import subset_dataset -from ensembleperturbation.perturbation.atcf import VortexPerturbedVariable -from ensembleperturbation.plotting.perturbation import plot_perturbations -from 
ensembleperturbation.plotting.surrogate import ( - plot_kl_surrogate_fit, - plot_selected_percentiles, - plot_selected_validations, - plot_sensitivities, - plot_validations, -) -from ensembleperturbation.uncertainty_quantification.karhunen_loeve_expansion import ( - karhunen_loeve_expansion, - karhunen_loeve_prediction, -) -from ensembleperturbation.uncertainty_quantification.surrogate import ( - percentiles_from_surrogate, - sensitivities_from_surrogate, - surrogate_from_karhunen_loeve, - surrogate_from_training_set, - validations_from_surrogate, -) -from ensembleperturbation.utilities import get_logger - -EFS_MOUNT_POINT = Path('~').expanduser() / 'app/io' -LOGGER = get_logger('klpc_wetonly') - - - -def main(args): - - tracks_dir = EFS_MOUNT_POINT / args.tracks_dir - ensemble_dir = EFS_MOUNT_POINT / args.ensemble_dir - - analyze(tracks_dir, ensemble_dir/'analyze') - - - -def analyze(tracks_dir, analyze_dir): - # KL parameters - variance_explained = 0.9999 - # subsetting parameters - isotach = 34 # -kt wind swath of the cyclone - depth_bounds = 25.0 - point_spacing = None - node_status_mask = 'always_wet' - # analysis type - variable_name = 'zeta_max' - use_depth = True # for depths - # use_depth = False # for elevations - log_space = False # normal linear space - # log_space = True # use log-scale to force surrogate to positive values only - training_runs = 'korobov' - validation_runs = 'random' - # PC parameters - polynomial_order = 3 - # cross_validator = ShuffleSplit(n_splits=10, test_size=12, random_state=666) - # cross_validator = ShuffleSplit(random_state=666) - cross_validator = LeaveOneOut() - # regression_model = LassoCV( - # fit_intercept=False, cv=cross_validator, selection='random', random_state=666 - # ) - regression_model = ElasticNetCV( - fit_intercept=False, - cv=cross_validator, - l1_ratio=0.5, - selection='random', - random_state=666, - ) - # regression_model = LinearRegression(fit_intercept=False) - regression_name = 'ElasticNet_LOO' - if training_runs == 'quadrature': - use_quadrature = True - else: - use_quadrature = False - - make_perturbations_plot = True - make_klprediction_plot = True - make_klsurrogate_plot = True - make_sensitivities_plot = True - make_validation_plot = True - make_percentile_plot = True - - save_plots = True - - storm_name = None - - if log_space: - output_directory = analyze_dir / f'outputs_log_{regression_name}' - else: - output_directory = analyze_dir / f'outputs_linear_{regression_name}' - if not output_directory.exists(): - output_directory.mkdir(parents=True, exist_ok=True) - - subset_filename = output_directory / 'subset.nc' - kl_filename = output_directory / 'karhunen_loeve.pkl' - kl_surrogate_filename = output_directory / 'kl_surrogate.npy' - surrogate_filename = output_directory / 'surrogate.npy' - kl_validation_filename = output_directory / 'kl_surrogate_fit.nc' - sensitivities_filename = output_directory / 'sensitivities.nc' - validation_filename = output_directory / 'validation.nc' - percentile_filename = output_directory / 'percentiles.nc' - - filenames = ['perturbations.nc', 'maxele.63.nc'] - if storm_name is None: - storm_name = tracks_dir / 'original.22' - - datasets = {} - existing_filenames = [] - for filename in filenames: - filename = analyze_dir / filename - if filename.exists(): - datasets[filename.name] = xarray.open_dataset(filename, chunks='auto') - else: - raise FileNotFoundError(filename.name) - - perturbations = datasets[filenames[0]] - max_elevations = datasets[filenames[1]] - min_depth = 0.8 * max_elevations.h0 # 
the minimum allowable depth - - perturbations = perturbations.assign_coords( - type=( - 'run', - ( - numpy.where( - perturbations['run'].str.contains(training_runs), - 'training', - numpy.where( - perturbations['run'].str.contains(validation_runs), - 'validation', - 'none', - ), - ) - ), - ) - ) - - if len(numpy.unique(perturbations['type'][:])) == 1: - perturbations['type'][:] = numpy.random.choice( - ['training', 'validation'], size=len(perturbations.run), p=[0.7, 0.3] - ) - LOGGER.info('dividing 70/30% for training/testing the model') - - training_perturbations = perturbations.sel(run=perturbations['type'] == 'training') - validation_perturbations = perturbations.sel(run=perturbations['type'] == 'validation') - - if make_perturbations_plot: - plot_perturbations( - training_perturbations=training_perturbations, - validation_perturbations=validation_perturbations, - runs=perturbations['run'].values, - perturbation_types=perturbations['type'].values, - track_directory=tracks_dir, - output_directory=output_directory if save_plots else None, - ) - - variables = { - variable_class.name: variable_class() - for variable_class in VortexPerturbedVariable.__subclasses__() - } - - distribution = chaospy.J( - *( - variables[variable_name].chaospy_distribution() - for variable_name in perturbations['variable'].values - ) - ) - - # sample based on subset and excluding points that are never wet during training run - if not subset_filename.exists(): - LOGGER.info('subsetting nodes') - subset = subset_dataset( - ds=max_elevations, - variable=variable_name, - maximum_depth=depth_bounds, - wind_swath=[storm_name, isotach], - node_status_selection={ - 'mask': node_status_mask, - 'runs': training_perturbations['run'], - }, - point_spacing=point_spacing, - output_filename=subset_filename, - ) - - # subset chunking can be disturbed by point_spacing so load from saved filename always - LOGGER.info(f'loading subset from "{subset_filename}"') - subset = xarray.open_dataset(subset_filename) - if 'element' in subset: - elements = subset['element'] - subset = subset[variable_name] - - # divide subset into training/validation runs - with dask.config.set(**{'array.slicing.split_large_chunks': True}): - training_set = subset.sel(run=training_perturbations['run']) - validation_set = subset.sel(run=validation_perturbations['run']) - - LOGGER.info(f'total {training_set.shape} training samples') - LOGGER.info(f'total {validation_set.shape} validation samples') - - training_set_adjusted = training_set.copy(deep=True) - - if use_depth: - training_set_adjusted += training_set_adjusted['depth'] # + adjusted_min_depth - - if log_space: - training_set_adjusted = numpy.log(training_set_adjusted) - - # Evaluating the Karhunen-Loeve expansion - nens, ngrid = training_set.shape - if not kl_filename.exists(): - LOGGER.info( - f'Evaluating Karhunen-Loeve expansion from {ngrid} grid nodes and {nens} ensemble members' - ) - kl_expansion = karhunen_loeve_expansion( - training_set_adjusted.values, - neig=variance_explained, - method='PCA', - output_directory=output_directory, - ) - else: - LOGGER.info(f'loading Karhunen-Loeve expansion from "{kl_filename}"') - with open(kl_filename, 'rb') as kl_handle: - kl_expansion = pickle.load(kl_handle) - - LOGGER.info(f'found {kl_expansion["neig"]} Karhunen-Loeve modes') - LOGGER.info(f'Karhunen-Loeve expansion: {list(kl_expansion)}') - - # plot prediction versus actual simulated - if make_klprediction_plot: - kl_predicted = karhunen_loeve_prediction( - kl_dict=kl_expansion, - 
actual_values=training_set_adjusted, - ensembles_to_plot=[0, int(nens / 2), nens - 1], - element_table=elements if point_spacing is None else None, - plot_directory=output_directory, - ) - - # evaluate the surrogate for each KL sample - kl_training_set = xarray.DataArray(data=kl_expansion['samples'], dims=['run', 'mode']) - kl_surrogate_model = surrogate_from_training_set( - training_set=kl_training_set, - training_perturbations=training_perturbations, - distribution=distribution, - filename=kl_surrogate_filename, - use_quadrature=use_quadrature, - polynomial_order=polynomial_order, - regression_model=regression_model, - ) - - # plot kl surrogate model versus training set - if make_klsurrogate_plot: - kl_fit = validations_from_surrogate( - surrogate_model=kl_surrogate_model, - training_set=kl_training_set, - training_perturbations=training_perturbations, - filename=kl_validation_filename, - ) - - plot_kl_surrogate_fit( - kl_fit=kl_fit, - output_filename=output_directory / 'kl_surrogate_fit.png' if save_plots else None, - ) - - # convert the KL surrogate model to the overall surrogate at each node - surrogate_model = surrogate_from_karhunen_loeve( - mean_vector=kl_expansion['mean_vector'], - eigenvalues=kl_expansion['eigenvalues'], - modes=kl_expansion['modes'], - kl_surrogate_model=kl_surrogate_model, - filename=surrogate_filename, - ) - - if make_sensitivities_plot: - sensitivities = sensitivities_from_surrogate( - surrogate_model=surrogate_model, - distribution=distribution, - variables=perturbations['variable'], - nodes=subset, - element_table=elements if point_spacing is None else None, - filename=sensitivities_filename, - ) - plot_sensitivities( - sensitivities=sensitivities, - storm=storm_name, - output_filename=output_directory / 'sensitivities.png' if save_plots else None, - ) - - if make_validation_plot: - node_validation = validations_from_surrogate( - surrogate_model=surrogate_model, - training_set=training_set, - training_perturbations=training_perturbations, - validation_set=validation_set, - validation_perturbations=validation_perturbations, - convert_from_log_scale=log_space, - convert_from_depths=use_depth, - minimum_allowable_value=min_depth if use_depth else None, - element_table=elements if point_spacing is None else None, - filename=validation_filename, - ) - - plot_validations( - validation=node_validation, - output_directory=output_directory if save_plots else None, - ) - - plot_selected_validations( - validation=node_validation, - run_list=validation_set['run'][ - numpy.linspace(0, validation_set.shape[0], 6, endpoint=False).astype(int) - ].values, - output_directory=output_directory if save_plots else None, - ) - - if make_percentile_plot: - percentiles = [10, 50, 90] - node_percentiles = percentiles_from_surrogate( - surrogate_model=surrogate_model, - distribution=distribution, - training_set=validation_set, - percentiles=percentiles, - convert_from_log_scale=log_space, - convert_from_depths=use_depth, - minimum_allowable_value=min_depth if use_depth else None, - element_table=elements if point_spacing is None else None, - filename=percentile_filename, - ) - - plot_selected_percentiles( - node_percentiles=node_percentiles, - perc_list=percentiles, - output_directory=output_directory if save_plots else None, - ) - - -if __name__ == '__main__': - - parser = ArgumentParser() - parser.add_argument('-d', '--ensemble-dir') - parser.add_argument('-t', '--tracks-dir') - parser.add_argument('-s', '--sequential', action='store_true') - - main(parser.parse_args()) diff --git 
a/docker/pyschism/docker/combine_ensemble.py b/docker/pyschism/docker/combine_ensemble.py deleted file mode 100644 index 909976f..0000000 --- a/docker/pyschism/docker/combine_ensemble.py +++ /dev/null @@ -1,32 +0,0 @@ -from argparse import ArgumentParser -from pathlib import Path - -from ensembleperturbation.client.combine_results import combine_results -from ensembleperturbation.utilities import get_logger - -EFS_MOUNT_POINT = Path('~').expanduser() / 'app/io' -LOGGER = get_logger('klpc_wetonly') - - - -def main(args): - - tracks_dir = EFS_MOUNT_POINT / args.tracks_dir - ensemble_dir = EFS_MOUNT_POINT / args.ensemble_dir - - output = combine_results( - model='schism', - adcirc_like=True, - output=ensemble_dir/'analyze', - directory=ensemble_dir, - parallel=not args.sequential - ) - -if __name__ == '__main__': - - parser = ArgumentParser() - parser.add_argument('-d', '--ensemble-dir') - parser.add_argument('-t', '--tracks-dir') - parser.add_argument('-s', '--sequential', action='store_true') - - main(parser.parse_args()) diff --git a/docker/pyschism/docker/docker-compose.yml b/docker/pyschism/docker/docker-compose.yml deleted file mode 100644 index ff41fd4..0000000 --- a/docker/pyschism/docker/docker-compose.yml +++ /dev/null @@ -1,28 +0,0 @@ -version: "3.9" -services: - pyschism-noaa: - build: - context: .. - dockerfile: docker/Dockerfile - args: - - username=${PYSCHISM_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/mesh - target: /home/${PYSCHISM_USER}/app/io/input/mesh - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/coops_ssh - target: /home/${PYSCHISM_USER}/app/io/input/coops_ssh - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/setup - target: /home/${PYSCHISM_USER}/app/io/output - - type: bind - source: /home/ec2-user/data/test/static/tpxo - target: /home/${PYSCHISM_USER}/.local/share/tpxo - - type: bind - source: /home/ec2-user/data/test/static/nwm - target: /home/${PYSCHISM_USER}/.local/share/pyschism/nwm - diff --git a/docker/pyschism/docker/setup_ensemble.py b/docker/pyschism/docker/setup_ensemble.py deleted file mode 100644 index d2c7d13..0000000 --- a/docker/pyschism/docker/setup_ensemble.py +++ /dev/null @@ -1,276 +0,0 @@ -import os -import glob -import logging -import tempfile -from argparse import ArgumentParser -from copy import deepcopy -from datetime import datetime, timedelta -from pathlib import Path - - -import geopandas as gpd -import pandas as pd -from coupledmodeldriver import Platform -from coupledmodeldriver.configure.forcings.base import TidalSource -from coupledmodeldriver.configure import ( - BestTrackForcingJSON, - TidalForcingJSON, - NationalWaterModelFocringJSON, -) -from coupledmodeldriver.generate import SCHISMRunConfiguration -from coupledmodeldriver.generate.schism.script import SchismEnsembleGenerationJob -from coupledmodeldriver.generate import generate_schism_configuration -from stormevents import StormEvent -from stormevents.nhc.track import VortexTrack -from pyschism.mesh import Hgrid -from pyschism.forcing import NWM -from ensembleperturbation.perturbation.atcf import perturb_tracks - -import wwm - - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) - -EFS_MOUNT_POINT = Path('~').expanduser() / 'app/io' - -def main(args): - - track_path = EFS_MOUNT_POINT / args.track_file - out_dir = EFS_MOUNT_POINT / args.output_directory - dt_rng_path = EFS_MOUNT_POINT / args.date_range_file - tpxo_dir 
= EFS_MOUNT_POINT / args.tpxo_dir - nwm_file = EFS_MOUNT_POINT / args.nwm_file - mesh_dir = EFS_MOUNT_POINT / args.mesh_directory - hr_prelandfall = args.hours_before_landfall - use_wwm = args.use_wwm - - workdir = out_dir - mesh_file = mesh_dir / 'mesh_w_bdry.grd' - - workdir.mkdir(exist_ok=True) - - dt_data = pd.read_csv(dt_rng_path, delimiter=',') - date_1, date_2 = pd.to_datetime(dt_data.date_time).dt.strftime( - "%Y%m%d%H").values - model_start_time = datetime.strptime(date_1, "%Y%m%d%H") - model_end_time = datetime.strptime(date_2, "%Y%m%d%H") - spinup_time = timedelta(days=2) - - # More processing for caching - with tempfile.TemporaryDirectory() as tmpdir: - # NOTE: The output of write is not important. Calling - # `write` results in the relevant files being cached! - nwm = NWM(nwm_file=nwm_file, cache=True) - nwm.write( - output_directory=tmpdir, - gr3=Hgrid.open(mesh_file, crs=4326), - start_date=model_start_time - spinup_time, - end_date=model_end_time - model_start_time + spinup_time, - overwrite=True, - ) - nwm.pairings.save_json( - sources=workdir / 'source.json', - sinks=workdir / 'sink.json' - ) - - forcing_configurations = [] - forcing_configurations.append(TidalForcingJSON( - resource=tpxo_dir / 'h_tpxo9.v1.nc', - tidal_source=TidalSource.TPXO)) - forcing_configurations.append( - NationalWaterModelFocringJSON( - resource=nwm_file, - cache=True, - source_json=workdir / 'source.json', - sink_json=workdir / 'sink.json', - pairing_hgrid=mesh_file - ) - ) - forcing_configurations.append( - BestTrackForcingJSON( - nhc_code=f'{args.name}{args.year}', - interval_seconds=3600, - nws=20 - ) - ) - - - platform = Platform.LOCAL - - perturb_begin = model_start_time - unpertubed = None - if hr_prelandfall is not None and hr_prelandfall >= 0: - # Calculate time to landfall based on track and coastline - # and then perturb ONLY from the requested hours before landfall - countries = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres')) - usa = countries[countries.name.isin( - ["United States of America", "Puerto Rico"] - )] - - orig_track = VortexTrack.from_file(track_path) - track_dat = orig_track.data - onland = track_dat.geometry.set_crs(4326).intersects(usa.unary_union) - if onland.any(): - landfall_idx = onland[onland].index.min() - else: - logger.warn("The track doesn't cross US territories!") - landfall_idx = 0 - landfall_time = pd.Timestamp( - track_dat.iloc[landfall_idx].datetime - ) - toi = landfall_time - timedelta(hours=hr_prelandfall) - perturb_idx = (track_dat.datetime - toi).abs().argsort().iloc[0] - - if perturb_idx > 0: - # If only part of the track needs to be updated - unpertubed = deepcopy(orig_track) - unpertubed.end_date = track_dat.iloc[perturb_idx - 1].datetime - - # NOTE: Perturbation dataframe is truncated based on the - # passed `perturb_begin` to `perturb_tracks(...)` - perturb_begin = track_dat.iloc[perturb_idx].datetime - - perturbations = perturb_tracks( - perturbations=args.num_perturbations, - directory=workdir/'track_files', - storm=track_path, - variables=[ - 'cross_track', - 'along_track', - 'radius_of_maximum_winds', - 'max_sustained_wind_speed', - ], - sample_from_distribution=args.sample_from_distribution, - sample_rule=args.sample_rule, - quadrature=args.quadrature, - start_date=perturb_begin, - end_date=model_end_time, - overwrite=True - ) - - if perturb_begin != model_start_time: - # Read generated tracks and append to unpertubed section - perturbed_tracks = glob.glob(str(workdir/'track_files'/'*.22')) - for pt in perturbed_tracks: - if 
'original' in pt: - continue -# perturbed_segment = pd.read_csv(pt, header=None) - perturbed_segment = VortexTrack.from_file(pt) - full_track = pd.concat( - (unpertubed.fort_22(), perturbed_segment.fort_22()), - ignore_index=True - ) - # Overwrites the perturbed-segment-only file - full_track.to_csv(pt, index=False, header=False) - - - run_config_kwargs = { - 'mesh_directory': mesh_dir, - 'modeled_start_time': model_start_time, - 'modeled_end_time': model_end_time, - 'modeled_timestep': timedelta(seconds=150), - 'tidal_spinup_duration': spinup_time, - 'forcings': forcing_configurations, - 'perturbations': perturbations, - 'platform': platform, -# 'schism_executable': 'pschism_PAHM_TVD-VL' - } - - run_configuration = SCHISMRunConfiguration( - **run_config_kwargs, - ) - run_configuration['schism']['hgrid_path'] = mesh_file - - run_configuration.write_directory( - directory=workdir, absolute=False, overwrite=False, - ) - - # Now generate the setup - generate_schism_configuration(**{ - 'configuration_directory': workdir, - 'output_directory': workdir, - 'relative_paths': True, - 'overwrite': True, - 'parallel': True - }) - - if use_wwm: - wwm.setup_wwm(mesh_file, workdir, ensemble=True) - - -def parse_arguments(): - argument_parser = ArgumentParser() - - argument_parser.add_argument( - "--track-file", - help="path to the storm track file for parametric wind setup", - type=Path, - required=True - ) - - argument_parser.add_argument( - '--output-directory', - default=None, - required=True, - help='path to store generated configuration files' - ) - argument_parser.add_argument( - "--date-range-file", - required=True, - type=Path, - help="path to the file containing simulation date range" - ) - argument_parser.add_argument( - '-n', '--num-perturbations', - type=int, - required=True, - help='path to input mesh (`hgrid.gr3`, `manning.gr3` or `drag.gr3`)', - ) - argument_parser.add_argument( - "--tpxo-dir", - required=True, - type=Path, - help="path to the TPXO dataset directory", - ) - argument_parser.add_argument( - "--nwm-file", - required=True, - type=Path, - help="path to the NWM hydrofabric dataset", - ) - argument_parser.add_argument( - '--mesh-directory', - required=True, - help='path to input mesh (`hgrid.gr3`, `manning.gr3` or `drag.gr3`)', - ) - argument_parser.add_argument( - "--sample-from-distribution", action="store_true" - ) - argument_parser.add_argument( - "--sample-rule", type=str, default='random' - ) - argument_parser.add_argument( - "--quadrature", action="store_true" - ) - argument_parser.add_argument( - "-b", "--hours-before-landfall", type=int - ) - argument_parser.add_argument( - "--use-wwm", action="store_true" - ) - - argument_parser.add_argument( - "name", help="name of the storm", type=str) - - argument_parser.add_argument( - "year", help="year of the storm", type=int) - - - args = argument_parser.parse_args() - - return args - - -if __name__ == "__main__": - main(parse_arguments()) diff --git a/docker/pyschism/docker/setup_model.py b/docker/pyschism/docker/setup_model.py deleted file mode 100755 index 1c77f46..0000000 --- a/docker/pyschism/docker/setup_model.py +++ /dev/null @@ -1,533 +0,0 @@ -#!/usr/bin/env python -import os -import pathlib -from datetime import datetime, timedelta, timezone -import logging -import argparse -import shutil -import hashlib -import fcntl -from time import time -import tempfile -from contextlib import contextmanager, ExitStack - -import numpy as np -import pandas as pd -import geopandas as gpd -import f90nml -from matplotlib.transforms 
import Bbox - -from pyschism import dates -from pyschism.enums import NWSType -from pyschism.driver import ModelConfig -from pyschism.forcing.bctides import iettype, ifltype -from pyschism.forcing.nws import GFS, HRRR, ERA5, BestTrackForcing -from pyschism.forcing.nws.nws2 import hrrr3 -from pyschism.forcing.source_sink import NWM -from pyschism.mesh import Hgrid, gridgr3 -from pyschism.mesh.fgrid import ManningsN -from pyschism.stations import Stations - -import wwm - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) - -CDSAPI_URL = "https://cds.climate.copernicus.eu/api/v2" -EFS_MOUNT_POINT = pathlib.Path('~').expanduser() / 'app/io' -TPXO_LINK_PATH = pathlib.Path('~').expanduser() / '.local/share/tpxo' -NWM_LINK_PATH = pathlib.Path('~').expanduser() / '.local/share/pyschism/nwm' - - -@contextmanager -def pushd(directory): - '''Temporarily modify current directory - - Parameters - ---------- - directory: str, pathlike - the directory to use as cwd during this context - - Returns - ------- - None - ''' - - origin = os.getcwd() - try: - os.chdir(directory) - yield - - finally: - os.chdir(origin) - - -def get_main_cache_path(cache_dir, storm, year): - - return cache_dir / f'{storm.lower()}_{year}' - -def get_meteo_cache_path(source, main_cache_path, bbox, start_date, end_date): - - m = hashlib.md5() - m.update(np.round(bbox.corners(), decimals=2).tobytes()) - m.update(start_date.strftime("%Y-%m-%d:%H:%M:%S").encode('utf8')) - m.update(end_date.strftime("%Y-%m-%d:%H:%M:%S").encode('utf8')) - - meteo_cache_path = main_cache_path / f"{source}_{m.hexdigest()}" - return meteo_cache_path - - -@contextmanager -def cache_lock(cache_path): - - if not cache_path.exists(): - cache_path.mkdir(parents=True, exist_ok=True) - - with open(cache_path / ".cache.lock", "w") as fp: - try: - fcntl.flock(fp.fileno(), fcntl.LOCK_EX) - yield - - finally: - fcntl.flock(fp.fileno(), fcntl.LOCK_UN) - -def from_meteo_cache(meteo_cache_path, sflux_dir): - - # TODO: Generalize - # Redundant check - if not meteo_cache_path.exists(): - return False - - contents = list(meteo_cache_path.iterdir()) - if not any(p.match("sflux_inputs.txt") for p in contents): - return False - - logger.info("Creating sflux from cache...") - - # Copy files from cache dir to sflux dir - for p in contents: - dest = sflux_dir / p.relative_to(meteo_cache_path) - if p.is_dir(): - shutil.copytree(p, dest) - else: - shutil.copy(p, dest) - - logger.info("Done copying cached sflux.") - - return True - - -def copy_meteo_cache(sflux_dir, meteo_cache_path): - - # TODO: Generalize - logger.info("Copying cache files to main cache location...") - # Copy files from sflux dir to cache dir - - # Clean meteo_cache_path if already populated? 
- contents_dst = list(meteo_cache_path.iterdir()) - contents_dst = [p for p in contents_dst if p.suffix != ".lock"] - for p in contents_dst: - if p.is_dir(): - shutil.rmtree(p) - else: - os.remove(p) - - # Copy files from cache dir to sflux dir - contents_src = list(sflux_dir.iterdir()) - for p in contents_src: - dest = meteo_cache_path / p.relative_to(sflux_dir) - if p.is_dir(): - shutil.copytree(p, dest) - else: - shutil.copy(p, dest) - - logger.info("Done copying cache files to main cache location.") - -def setup_schism_model( - mesh_path, - domain_bbox_path, - date_range_path, - station_info_path, - out_dir, - main_cache_path, - parametric_wind=False, - nhc_track_file=None, - storm_id=None, - use_wwm=False, - ): - - - domain_box = gpd.read_file(domain_bbox_path) - atm_bbox = Bbox(domain_box.to_crs('EPSG:4326').total_bounds.reshape(2,2)) - - schism_dir = out_dir - schism_dir.mkdir(exist_ok=True, parents=True) - logger.info("Calculating times and dates") - dt = timedelta(seconds=150.) - - # Use an integer for number of steps or a timedelta to approximate - # number of steps internally based on timestep - nspool = timedelta(minutes=20.) - - - # measurement days +7 days of simulation: 3 ramp, 2 prior - # & 2 after the measurement dates - dt_data = pd.read_csv(date_range_path, delimiter=',') - date_1, date_2 = pd.to_datetime(dt_data.date_time).dt.strftime( - "%Y%m%d%H").values - date_1 = datetime.strptime(date_1, "%Y%m%d%H") - date_2 = datetime.strptime(date_2, "%Y%m%d%H") - - - # If there are no observation data, it's hindcast mode - hindcast_mode = (station_info_path).is_file() - if hindcast_mode: - # If in hindcast mode run for 4 days: 2 days prior to now to - # 2 days after. - logger.info("Setup hindcast mode") - start_date = date_1 - timedelta(days=2) - end_date = date_2 + timedelta(days=2) - else: - - logger.info("Setup forecast mode") - - # If in forecast mode then date_1 == date_2, and simulation - # will run for about 3 days: abou 1 day prior to now to 2 days - # last meteo (HRRR) cycle after. - # - # Since HRRR forecasts are 48 hours on 6-hour cycles, find an end - # date which is 48 hours after the latest cycle before now! Note - # that the last cycle upload to either AWS or NOMADS server might - # take MORE than 1 hour in realtime cases. Also the oldest - # cycle on NOMADS is t00z from previous day - last_meteo_cycle = np.datetime64( - pd.DatetimeIndex([date_2 - timedelta(hours=2)]).floor('6H').values[0], 'h' - ).tolist() - oneday_before_last_cycle = last_meteo_cycle - timedelta(days=1) - start_date = oneday_before_last_cycle.replace(hour=0) - end_date = last_meteo_cycle + timedelta(days=2) - - rnday = end_date - start_date - - dramp = timedelta(days=1.) 
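Aside: the forecast-mode window arithmetic above (floor the reference time, minus a ~2-hour upload lag, to the latest 6-hourly HRRR cycle, then run from midnight of the previous day to 48 hours past that cycle) is easy to misread. A minimal standalone sketch of the same logic, assuming only pandas; the function name and example values are illustrative, not part of the removed script:

```python
from datetime import datetime, timedelta
import pandas as pd


def forecast_window(reference_time: datetime) -> tuple[datetime, datetime]:
    # Latest 6-hourly HRRR cycle assumed available ~2 hours before the reference time
    last_cycle = pd.Timestamp(reference_time - timedelta(hours=2)).floor('6H').to_pydatetime()
    # Start at midnight of the day before that cycle; end 48 hours after the cycle
    start = (last_cycle - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    end = last_cycle + timedelta(days=2)
    return start, end


# e.g. forecast_window(datetime(2018, 9, 13, 5))
# -> (datetime(2018, 9, 12, 0, 0), datetime(2018, 9, 15, 0, 0))
```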
- - hgrid = Hgrid.open(mesh_path, crs="epsg:4326") - fgrid = ManningsN.linear_with_depth( - hgrid, - min_value=0.02, max_value=0.05, - min_depth=-1.0, max_depth=-3.0) - - coops_stations = None - stations_file = station_info_path - if stations_file.is_file(): - st_data = np.genfromtxt(stations_file, delimiter=',') - coops_stations = Stations( - nspool_sta=nspool, - crs="EPSG:4326", - elev=True, u=True, v=True) - for coord in st_data: - coops_stations.add_station(coord[0], coord[1]) - - atmospheric = None - if parametric_wind: - # NOTE: SCHISM supports parametric ofcl forecast as well - if nhc_track_file is not None and nhc_track_file.is_file(): - atmospheric = BestTrackForcing.from_nhc_bdeck(nhc_bdeck=nhc_track_file) - elif storm_id is not None: - atmospheric = BestTrackForcing(storm=storm_id) - else: - ValueError("Storm track information is not provided!") - else: - # For hindcast ERA5 is used and for forecast - # GFS and hrrr3.HRRR. Neither ERA5 nor the GFS and - # hrrr3.HRRR combination are supported by nws2 mechanism - pass - - - logger.info("Creating model configuration ...") - config = ModelConfig( - hgrid=hgrid, - fgrid=fgrid, - iettype=iettype.Iettype3(database="tpxo"), - ifltype=ifltype.Ifltype3(database="tpxo"), - nws=atmospheric, - source_sink=NWM(), - ) - - if config.forcings.nws and getattr(config.forcings.nws, 'sflux_2', None): - config.forcings.nws.sflux_2.inventory.file_interval = timedelta(hours=6) - - logger.info("Creating cold start ...") - # create reference dates - coldstart = config.coldstart( - stations=coops_stations, - start_date=start_date, - end_date=start_date + rnday, - timestep=dt, - dramp=dramp, - dramp_ss=dramp, - drampwind=dramp, - nspool=timedelta(hours=1), - elev=True, - dahv=True, - ) - - logger.info("Writing to disk ...") - if not parametric_wind: - - # In hindcast mode ERA5 is used manually: temporary solution - - sflux_dir = (schism_dir / "sflux") - sflux_dir.mkdir(exist_ok=True, parents=True) - - # Workaround for ERA5 not being compatible with NWS2 object - meteo_cache_kwargs = { - 'bbox': atm_bbox, - 'start_date': start_date, - 'end_date': start_date + rnday - } - - if hindcast_mode: - meteo_cache_path = get_meteo_cache_path( - 'era5', main_cache_path, **meteo_cache_kwargs - ) - else: - meteo_cache_path = get_meteo_cache_path( - 'gfs_hrrr', main_cache_path, **meteo_cache_kwargs - ) - - with cache_lock(meteo_cache_path): - if not from_meteo_cache(meteo_cache_path, sflux_dir): - if hindcast_mode: - era5 = ERA5() - era5.write( - outdir=schism_dir / "sflux", - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True) - - else: - - - with ExitStack() as stack: - - # Just to make sure there are not permission - # issues for temporary data (e.g. HRRR tmpdir - # in current dir) - tempdir = stack.enter_context(tempfile.TemporaryDirectory()) - stack.enter_context(pushd(tempdir)) - - gfs = GFS() - gfs.write( - outdir=schism_dir / "sflux", - level=1, - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True - ) - - # If we should limit forecast to 2 days, then - # why not use old HRRR implementation? Because - # We have prior day, today and 1 day forecast (?) - # BUT the new implementation has issues getting - # 2day forecast! 
- hrrr = HRRR() - hrrr.write( - outdir=schism_dir / "sflux", - level=2, - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True - ) - -# hrrr3.HRRR( -# start_date=start_date, -# rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), -# record=2, -# bbox=atm_bbox -# ) -# for i, nc_file in enumerate(sorted(pathlib.Path().glob('*/*.nc'))): -# dst_air = schism_dir / "sflux" / f"sflux_air_2.{i:04d}.nc" -# shutil.move(nc_file, dst_air) -# pathlib.Path(schism_dir / "sflux" / f"sflux_prc_2.{i:04d}.nc").symlink_to( -# dst_air -# ) -# pathlib.Path(schism_dir / "sflux" / f"sflux_rad_2.{i:04d}.nc").symlink_to( -# dst_air -# ) - - - with open(schism_dir / "sflux" / "sflux_inputs.txt", "w") as f: - f.write("&sflux_inputs\n/\n") - - copy_meteo_cache(sflux_dir, meteo_cache_path) - - windrot = gridgr3.Windrot.default(hgrid) - windrot.write(schism_dir / "windrot_geo2proj.gr3", overwrite=True) - ## end of workaround - - # Workaround for bug #30 - coldstart.param.opt.wtiminc = coldstart.param.core.dt - coldstart.param.opt.nws = NWSType.CLIMATE_AND_FORECAST.value - ## end of workaround - - - - # Workaround for station bug #32 - if coops_stations is not None: - coldstart.param.schout.nspool_sta = int( - round(nspool.total_seconds() / coldstart.param.core.dt)) - ## end of workaround - - with ExitStack() as stack: - - # Just to make sure there are not permission - # issues for temporary data (e.g. HRRR tmpdir - # in current dir) - tempdir = stack.enter_context(tempfile.TemporaryDirectory()) - stack.enter_context(pushd(tempdir)) - - coldstart.write(schism_dir, overwrite=True) - - # Workardoun for hydrology param bug #34 - nm_list = f90nml.read(schism_dir / 'param.nml') - nm_list['opt']['if_source'] = 1 - nm_list.write(schism_dir / 'param.nml', force=True) - ## end of workaround - - ## Workaround to make sure outputs directory is copied from/to S3 - try: - os.mknod(schism_dir / "outputs" / "_") - except FileExistsError: - pass - ## end of workaround - - if use_wwm: - wwm.setup_wwm(mesh_path, schism_dir, ensemble=False) - - logger.info("Setup done") - -def main(args): - - storm_name = str(args.name).lower() - storm_year = str(args.year).lower() - param_wind = args.parametric_wind - - mesh_path = EFS_MOUNT_POINT / args.mesh_file - bbox_path = EFS_MOUNT_POINT / args.domain_bbox_file - dt_rng_path = EFS_MOUNT_POINT / args.date_range_file - st_loc_path = EFS_MOUNT_POINT / args.station_location_file - out_dir = EFS_MOUNT_POINT / args.out - nhc_track = None if args.track_file is None else EFS_MOUNT_POINT / args.track_file - cache_path = get_main_cache_path( - EFS_MOUNT_POINT / args.cache_dir, storm_name, storm_year - ) - tpxo_dir = EFS_MOUNT_POINT / args.tpxo_dir - nwm_dir = EFS_MOUNT_POINT / args.nwm_dir - use_wwm = args.use_wwm - - if TPXO_LINK_PATH.is_dir(): - shutil.rmtree(TPXO_LINK_PATH) - if NWM_LINK_PATH.is_dir(): - shutil.rmtree(NWM_LINK_PATH) - os.symlink(tpxo_dir, TPXO_LINK_PATH, target_is_directory=True) - os.symlink(nwm_dir, NWM_LINK_PATH, target_is_directory=True) - - - setup_schism_model( - mesh_path, - bbox_path, - dt_rng_path, - st_loc_path, - out_dir, - cache_path, - parametric_wind=param_wind, - nhc_track_file=nhc_track, - storm_id=f'{storm_name}{storm_year}', - use_wwm=use_wwm - ) - - -if __name__ == '__main__': - - parser = argparse.ArgumentParser() - - - parser.add_argument( - "--parametric-wind", "-w", - help="flag to switch to parametric wind setup", action="store_true") - - 
parser.add_argument( - "--mesh-file", - help="path to the file containing computational grid", - type=pathlib.Path - ) - - parser.add_argument( - "--domain-bbox-file", - help="path to the file containing domain bounding box", - type=pathlib.Path - ) - - parser.add_argument( - "--date-range-file", - help="path to the file containing simulation date range", - type=pathlib.Path - ) - - parser.add_argument( - "--station-location-file", - help="path to the file containing station locations", - type=pathlib.Path - ) - - parser.add_argument( - "--cache-dir", - help="path to the cache directory", - type=pathlib.Path - ) - - parser.add_argument( - "--track-file", - help="path to the storm track file for parametric wind setup", - type=pathlib.Path - ) - - parser.add_argument( - "--tpxo-dir", - help="path to the TPXO database directory", - type=pathlib.Path - ) - - parser.add_argument( - "--nwm-dir", - help="path to the NWM stream vector database directory", - type=pathlib.Path - ) - - parser.add_argument( - "--out", - help="path to the setup output (solver input) directory", - type=pathlib.Path - ) - - parser.add_argument( - "--use-wwm", action="store_true" - ) - - parser.add_argument( - "name", help="name of the storm", type=str) - - parser.add_argument( - "year", help="year of the storm", type=int) - - - args = parser.parse_args() - - main(args) diff --git a/docker/pyschism/docker/wwm.py b/docker/pyschism/docker/wwm.py deleted file mode 100644 index 0ab04ef..0000000 --- a/docker/pyschism/docker/wwm.py +++ /dev/null @@ -1,276 +0,0 @@ -from __future__ import annotations -from copy import deepcopy -from datetime import datetime, timedelta -from pathlib import Path - -import f90nml -import numpy as np -from pyschism.mesh.base import Elements -from pyschism.mesh.base import Gr3 -from pyschism.mesh.gridgr3 import Gr3Field -from pyschism.param.param import Param - - -REFS = Path('~').expanduser() / 'app/refs' - -def setup_wwm(mesh_file: Path, setup_dir: Path, ensemble: bool): - '''Output is - - hgrid_WWM.gr3 - - param.nml - - wwmbnd.gr3 - - wwminput.nml - ''' - - - runs_dir = [setup_dir] - if ensemble: - spinup_dir = setup_dir/'spinup' - runs_dir = setup_dir.glob('runs/*') - - schism_grid = Gr3.open(mesh_file, crs=4326) - wwm_grid = break_quads(schism_grid) - wwm_bdry = Gr3Field.constant(wwm_grid, 0.0) - - # TODO: Update spinup - # NOTE: Requires setup of WWM hotfile - - # Update runs - for run in runs_dir: - wwm_grid.write(run / 'hgrid_WWM.gr3', format='gr3') - wwm_bdry.write(run / 'wwmbnd.gr3', format='gr3') - - schism_nml = update_schism_params(run / 'param.nml') - schism_nml.write(run / 'param.nml', force=True) - - wwm_nml = get_wwm_params(run_name=run.name, schism_nml=schism_nml) - wwm_nml.write(run / 'wwminput.nml') - - - -def break_quads(pyschism_mesh: Gr3) -> Gr3 | Gr3Field: - # Create new Elements and set it for the Gr3.elements - quads = pyschism_mesh.quads - if len(quads) == 0: - new_mesh = deepcopy(pyschism_mesh) - - else: - tmp = np.hstack((quads, quads[0, 0][None, None])) - broken = np.vstack((tmp[:, :3], tmp[:, 2:])) - trias = pyschism_mesh.triangles - final_trias = np.vstack((trias, broken)) - # NOTE: Node IDs and indexs are the same as before - elements = { - idx+1: list(map(pyschism_mesh.nodes.get_id_by_index, tri)) - for idx, tri in enumerate(final_trias) - } - - new_mesh = deepcopy(pyschism_mesh) - new_mesh.elements = Elements(pyschism_mesh.nodes, elements) - - - return new_mesh - - - -def get_wwm_params(run_name, schism_nml) -> f90nml.Namelist: - - # Get relevant values from SCHISM 
setup - begin_time = datetime( - year=schism_nml['opt']['start_year'], - month=schism_nml['opt']['start_month'], - day=schism_nml['opt']['start_day'], - # TODO: Handle decimal hour - hour=int(schism_nml['opt']['start_hour']), - ) - end_time = begin_time + timedelta(days=schism_nml['core']['rnday']) - delta_t = schism_nml['core']['dt'] - mdc = schism_nml['core']['mdc2'] - msc = schism_nml['core']['msc2'] - nstep_wwm = schism_nml['opt']['nstep_wwm'] - - time_fmt = '%Y%m%d.%H%M%S' - wwm_delta_t = nstep_wwm * delta_t - - # For now just read the example file update relevant names and write - wwm_params = f90nml.read(REFS/'wwminput.nml') - wwm_params.uppercase = True - - proc_nml = wwm_params['PROC'] - proc_nml['PROCNAME'] = run_name - # Time for start the simulation, ex:yyyymmdd. hhmmss - proc_nml['BEGTC'] = begin_time.strftime(time_fmt) - # Time step (MUST match dt*nstep_wwm in SCHISM!) - proc_nml['DELTC'] = wwm_delta_t - # Unity of time step - proc_nml['UNITC'] = 'SEC' - # Time for stop the simulation, ex:yyyymmdd. hhmmss - proc_nml['ENDTC'] = end_time.strftime(time_fmt) - # Minimum water depth. THis must be same as h0 in selfe - proc_nml['DMIN'] = 0.01 - - grid_nml = wwm_params['GRID'] - # Number of directional bins - grid_nml['MDC'] = mdc - # Number of frequency bins - grid_nml['MSC'] = msc - # Name of the grid file. hgrid.gr3 if IGRIDTYPE = 3 (SCHISM) - grid_nml['FILEGRID'] = 'hgrid_WWM.gr3' - # Gridtype used. - grid_nml['IGRIDTYPE'] = 3 - - bouc_nml = wwm_params['BOUC'] - # Begin time of the wave boundary file (FILEWAVE) - bouc_nml['BEGTC'] = begin_time.strftime(time_fmt) - # Time step in FILEWAVE - bouc_nml['DELTC'] = 1 - # Unit can be HR, MIN, SEC - bouc_nml['UNITC'] = 'HR' - # End time - bouc_nml['ENDTC'] = end_time.strftime(time_fmt) - # Boundary file defining boundary conditions and Neumann nodes. - bouc_nml['FILEBOUND'] = 'wwmbnd.gr3' - bouc_nml['BEGTC_OUT'] = 20030908.000000 - bouc_nml['DELTC_OUT'] = 600.000000000000 - bouc_nml['UNITC_OUT'] = 'SEC' - bouc_nml['ENDTC_OUT'] = 20031008.000000 - - hist_nml = wwm_params['HISTORY'] - # Start output time, yyyymmdd. hhmmss; - # must fit the simulation time otherwise no output. - # Default is same as PROC%BEGTC - hist_nml['BEGTC'] = begin_time.strftime(time_fmt) - # Time step for output; if smaller than simulation time step, the latter is used (output every step for better 1D 2D spectra analysis) - hist_nml['DELTC'] = 1 - # Unit - hist_nml['UNITC'] = 'SEC' - # Stop time output, yyyymmdd. hhmmss - # Default is same as PROC%ENDC - hist_nml['ENDTC'] = end_time.strftime(time_fmt) - # Time scoop (sec) for history files - hist_nml['DEFINETC'] = 86400 - hist_nml['FILEOUT'] = 'wwm_hist.dat' - - sta_nml = wwm_params['STATION'] - # Start simulation time, yyyymmdd. hhmmss; must fit the simulation time otherwise no output - # Default is same as PROC%BEGTC - sta_nml['BEGTC'] = begin_time.strftime(time_fmt) - # Time step for output; if smaller than simulation time step, the latter is used (output every step for better 1D 2D spectra analysis) - sta_nml['DELTC'] = wwm_delta_t - # Unit - sta_nml['UNITC'] = 'SEC' - # Stop time simulation, yyyymmdd. hhmmss - # Default is same as PROC%ENDC - sta_nml['ENDTC'] = end_time.strftime(time_fmt) - # Time for definition of station files - sta_nml['DEFINETC'] = 86400 - - # TODO: Add hot file? - hot_nml = wwm_params['HOTFILE'] - # Write hotfile - hot_nml['LHOTF'] = False - #'.nc' suffix will be added -# hot_nml['FILEHOT_OUT'] = 'wwm_hot_out' -# #Starting time of hotfile writing. 
With ihot!=0 in SCHISM, -# # this will be whatever the new hotstarted time is (even with ihot=2) -# hot_nml['BEGTC'] = '20030908.000000' -# # time between hotfile writes -# hot_nml['DELTC'] = 86400. -# # unit used above -# hot_nml['UNITC'] = 'SEC' -# # Ending time of hotfile writing (adjust with BEGTC) -# hot_nml['ENDTC'] = '20031008.000000' -# # Applies only to netcdf -# # If T then hotfile contains 2 last records. -# # If F then hotfile contains N record if N outputs -# # have been done. -# # For binary only one record. -# hot_nml['LCYCLEHOT'] = True -# # 1: binary hotfile of data as output -# # 2: netcdf hotfile of data as output (default) -# hot_nml['HOTSTYLE_OUT'] = 2 -# # 0: hotfile in a single file (binary or netcdf) -# # MPI_REDUCE is then used and thus youd avoid too freq. output -# # 1: hotfiles in separate files, each associated -# # with one process -# hot_nml['MULTIPLEOUT'] = 0 -# # (Full) hot file name for input -# hot_nml['FILEHOT_IN'] = 'wwm_hot_in.nc' -# # 1: binary hotfile of data as input -# # 2: netcdf hotfile of data as input (default) -# hot_nml['HOTSTYLE_IN'] = 2 -# # Position in hotfile (only for netcdf) -# # for reading -# hot_nml['IHOTPOS_IN'] = 1 -# # 0: read hotfile from one single file -# # 1: read hotfile from multiple files (must use same # of CPU?) -# hot_nml['MULTIPLEIN'] = 0 - - return wwm_params - - -def update_schism_params(path: Path) -> f90nml.Namelist: - - schism_nml = f90nml.read(path) - - core_nml = schism_nml['core'] - core_nml['msc2'] = 24 - core_nml['mdc2'] = 30 - - opt_nml = schism_nml['opt'] - opt_nml['icou_elfe_wwm'] = 1 - opt_nml['nstep_wwm'] = 4 - opt_nml['iwbl'] = 0 - opt_nml['hmin_radstress'] = 1. - # TODO: Revisit for spinup support - # NOTE: Issue 7#issuecomment-1482848205 oceanmodeling fork -# opt_nml['nrampwafo'] = 0 - opt_nml['drampwafo'] = 0. - opt_nml['turbinj'] = 0.15 - opt_nml['turbinjds'] = 1.0 - opt_nml['alphaw'] = 0.5 - - - # NOTE: Python index is different from the NML index - schout_nml = schism_nml['schout'] - - schout_nml['iof_hydro'] = [1] - schout_nml['iof_wwm'] = [0 for i in range(17)] - - schout_nml.start_index.update(iof_hydro=[14], iof_wwm=[1]) - - #sig. height (m) {sigWaveHeight} 2D - schout_nml['iof_wwm'][0] = 1 - #Mean average period (sec) - TM01 {meanWavePeriod} 2D - schout_nml['iof_wwm'][1] = 0 - #Zero down crossing period for comparison with buoy (s) - TM02 {zeroDowncrossPeriod} 2D - schout_nml['iof_wwm'][2] = 0 - #Average period of wave runup/overtopping - TM10 {TM10} 2D - schout_nml['iof_wwm'][3] = 0 - #Mean wave number (1/m) {meanWaveNumber} 2D - schout_nml['iof_wwm'][4] = 0 - #Mean wave length (m) {meanWaveLength} 2D - schout_nml['iof_wwm'][5] = 0 - #Mean average energy transport direction (degr) - MWD in NDBC? {meanWaveDirection} 2D - schout_nml['iof_wwm'][6] = 0 - #Mean directional spreading (degr) {meanDirSpreading} 2D - schout_nml['iof_wwm'][7] = 0 - #Discrete peak period (sec) - Tp {peakPeriod} 2D - schout_nml['iof_wwm'][8] = 1 - #Continuous peak period based on higher order moments (sec) {continuousPeakPeriod} 2D - schout_nml['iof_wwm'][9] = 0 - #Peak phase vel. (m/s) {peakPhaseVel} 2D - schout_nml['iof_wwm'][10] = 0 - #Peak n-factor {peakNFactor} 2D - schout_nml['iof_wwm'][11] = 0 - #Peak group vel. 
(m/s) {peakGroupVel} 2D - schout_nml['iof_wwm'][12] = 0 - #Peak wave number {peakWaveNumber} 2D - schout_nml['iof_wwm'][13] = 0 - #Peak wave length {peakWaveLength} 2D - schout_nml['iof_wwm'][14] = 0 - #Peak (dominant) direction (degr) {dominantDirection} 2D - schout_nml['iof_wwm'][15] = 1 - #Peak directional spreading {peakSpreading} 2D - schout_nml['iof_wwm'][16] = 0 - - return schism_nml diff --git a/docker/pyschism/environment.yml b/docker/pyschism/environment.yml deleted file mode 100644 index ce24f19..0000000 --- a/docker/pyschism/environment.yml +++ /dev/null @@ -1,39 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - python<3.10 - - pip - - gdal - - geos - - proj - - netcdf4 - - hdf5 - - cartopy - - cfunits - - cf-python - - cfgrib - - esmf - - esmpy - - cfdm - - udunits2 - - pyproj - - shapely>=1.8, <2 - - rasterio - - fiona - - pygeos - - geopandas>=0.10.0 - - pandas<1.5.0 # moved SettingWithCopyWarning - - utm - - scipy - - numpy - - matplotlib - - requests - - tqdm - - mpi4py - - pyarrow - - pytz - - geoalchemy2 - - seawater - - pip: - - chaospy>=4.2.7 diff --git a/docker/schism/docker/.env b/docker/schism/docker/.env deleted file mode 100644 index 2669c7b..0000000 --- a/docker/schism/docker/.env +++ /dev/null @@ -1,2 +0,0 @@ -SCHISM_USER=schismer -SCHISM_NPROCS=4 diff --git a/docker/schism/docker/Dockerfile b/docker/schism/docker/Dockerfile deleted file mode 100644 index d549d63..0000000 --- a/docker/schism/docker/Dockerfile +++ /dev/null @@ -1,106 +0,0 @@ -FROM ubuntu:22.10 - -# Create a non-root user -ARG username=schismer -ARG uid=1000 -ARG gid=100 -ARG ioprefix=/app/io -ENV USER $username -ENV UID $uid -ENV GID $gid -ENV HOME /home/$USER - -# Get necessary packages -RUN apt-get update && apt-get upgrade -y && apt-get install -y \ - git \ - gcc \ - g++ \ - gfortran \ - make \ - cmake \ - openmpi-bin libopenmpi-dev \ - libhdf5-dev \ - libnetcdf-dev libnetcdf-mpi-dev libnetcdff-dev \ - python3 \ - python-is-python3 - -# New user -RUN adduser --disabled-password --gecos "Non-root user" --uid $UID --home $HOME $USER - -# Create a project directory inside user home -ENV PROJECT_DIR $HOME/app -RUN mkdir -p $PROJECT_DIR -WORKDIR $PROJECT_DIR - -# Install SCHISM -RUN \ - git clone https://github.com/schism-dev/schism.git && \ - git -C schism checkout 0741120 && \ - mkdir -p schism/build && \ - PREV_PWD=$PWD && \ - cd schism/build && \ - cmake ../src/ \ - -DCMAKE_Fortran_COMPILER=mpifort \ - -DCMAKE_C_COMPILER=mpicc \ - -DNetCDF_Fortran_LIBRARY=$(nc-config --libdir)/libnetcdff.so \ - -DNetCDF_C_LIBRARY=$(nc-config --libdir)/libnetcdf.so \ - -DNetCDF_INCLUDE_DIR=$(nc-config --includedir) \ - -DUSE_PAHM=TRUE \ - -DCMAKE_Fortran_FLAGS_RELEASE="-O2 -ffree-line-length-none -fallow-argument-mismatch" && \ - make -j8 && \ - mv bin/* -t /usr/bin/ && \ - rm -rf * && \ - cmake ../src/ \ - -DCMAKE_Fortran_COMPILER=mpifort \ - -DCMAKE_C_COMPILER=mpicc \ - -DNetCDF_Fortran_LIBRARY=$(nc-config --libdir)/libnetcdff.so \ - -DNetCDF_C_LIBRARY=$(nc-config --libdir)/libnetcdf.so \ - -DNetCDF_INCLUDE_DIR=$(nc-config --includedir) \ - -DUSE_PAHM=TRUE \ - -DUSE_WWM=TRUE \ - -DCMAKE_Fortran_FLAGS_RELEASE="-O2 -ffree-line-length-none -fallow-argument-mismatch" && \ - make -j8 && \ - mv bin/* -t /usr/bin/ && \ - cd ${PREV_PWD} && \ - rm -rf schism - - -RUN apt-get remove -y git -RUN apt-get remove -y gcc -RUN apt-get remove -y g++ -RUN apt-get remove -y gfortran -RUN apt-get remove -y make -RUN apt-get remove -y cmake -RUN apt-get remove -y python3 -RUN apt-get remove -y python-is-python3 
-RUN apt-get remove -y libopenmpi-dev -RUN apt-get remove -y libhdf5-dev -RUN apt-get remove -y libnetcdf-dev libnetcdf-mpi-dev libnetcdff-dev - -RUN apt-get install -y libnetcdf-c++4-1 libnetcdf-c++4 libnetcdf-mpi-19 libnetcdf19 libnetcdff7 -RUN apt-get install -y libhdf5-103-1 libhdf5-cpp-103-1 libhdf5-openmpi-103-1 -RUN apt-get install -y libopenmpi3 -RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata -RUN apt-get install -y expect - -RUN apt-get clean autoclean -RUN apt-get autoremove --yes -RUN rm -rf /var/lib/{apt,dpkg,cache,log}/ - -# Set default entry -COPY docker/entrypoint.sh /usr/local/bin/ -RUN chown $UID:$GID /usr/local/bin/entrypoint.sh && \ - chmod u+x /usr/local/bin/entrypoint.sh - -# Helper scripts -COPY docker/combine_gr3.exp $PROJECT_DIR -RUN chown -R $UID:$GID $PROJECT_DIR - - -# Volume mount points -RUN mkdir -p $ioprefix/output -RUN mkdir -p $ioprefix/input - -USER $USER - -ENTRYPOINT [ "/usr/local/bin/entrypoint.sh" ] diff --git a/docker/schism/docker/docker-compose.yml b/docker/schism/docker/docker-compose.yml deleted file mode 100644 index 9af92fc..0000000 --- a/docker/schism/docker/docker-compose.yml +++ /dev/null @@ -1,22 +0,0 @@ -version: "3.9" -services: - schism-noaa: - environment: - - SCHISM_NPROCS=${SCHISM_NPROCS} - cap_add: - - SYS_PTRACE - build: - context: .. - dockerfile: docker/Dockerfile - args: - - username=${SCHISM_USER} - - uid=1000 - - gid=100 -# command: '/bin/sh' - volumes: - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/setup/schism.dir - target: /home/${SCHISM_USER}/app/io/input/ - - type: bind - source: /home/ec2-user/data/test/hurricanes/florence_2018/sim - target: /home/${SCHISM_USER}/app/io/output/ diff --git a/docker/schism/docker/entrypoint.sh b/docker/schism/docker/entrypoint.sh deleted file mode 100644 index ab6e4aa..0000000 --- a/docker/schism/docker/entrypoint.sh +++ /dev/null @@ -1,45 +0,0 @@ -#!/bin/bash -#exec "$@" -cd io/$1 - -# MCA issue https://github.com/open-mpi/ompi/issues/4948 -#mpirun --mca btl_vader_single_copy_mechanism none -np $SCHISM_NPROCS pschism_TVD-VL - -# If SYS_PTRACE capability added for container we can use MCA - -echo "Starting solver..." -date - -set -ex - -mkdir -p outputs -mpirun -np $SCHISM_NPROCS $2 4 - - -echo "Combining outputs..." 
-date -# NOTE: Due to the scribed IO, there's no need to combine main output -#pushd outputs -#times=$(ls schout_* | grep -o "schout[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) -#for i in $times; do -# combine_output11 -b $i -e $i -#done -#popd - -# Combine hotstart -pushd outputs -if ls hotstart* >/dev/null 2>&1; then - times=$(ls hotstart_* | grep -o "hotstart[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) - for i in $times; do - combine_hotstart7 --iteration $i - done -fi -popd - -expect -f $HOME/app/combine_gr3.exp maxelev 1 -expect -f $HOME/app/combine_gr3.exp maxdahv 3 -mv maxdahv.gr3 maxelev.gr3 -t outputs - - -echo "Done" -date diff --git a/docs/workflow.pdf b/docs/workflow.pdf deleted file mode 100755 index c8fd7b5..0000000 Binary files a/docs/workflow.pdf and /dev/null differ diff --git a/environment.yml b/environment.yml index 0af1717..28cf939 100644 --- a/environment.yml +++ b/environment.yml @@ -1,14 +1,45 @@ -name: odssm +name: stormworkflow channels: - conda-forge - - defaults dependencies: - - python=3.10 - - prefect=1.4, <2 - - cloudpickle - - ansible-core - - terraform + - cartopy + - cf-python + - cfdm + - cfgrib + - cfunits + - colored-traceback + - cmocean + - esmf + - esmpy + - fiona + - gdal + - geoalchemy2 + - geopandas>=0.13 + - geos + - hdf5 + - importlib_metadata<8 # Fix issue with esmpy Author import + - matplotlib + - mpi4py + - netcdf4 + - numpy + - numba + - ocsmesh==1.5.3 + - pandas + - pip + - proj + - pyarrow + - pygeos + - pyproj + - python<3.11 + - pytz + - shapely>=2 + - rasterio - requests - - dnspython - - boto3 - - dunamai + - rtree + - scipy + - seawater + - typing-extensions + - tqdm + - udunits2 + - utm + - xarray==2023.7.0 diff --git a/prefect/workflow/conf.py b/prefect/workflow/conf.py deleted file mode 100644 index 0ef7d4f..0000000 --- a/prefect/workflow/conf.py +++ /dev/null @@ -1,68 +0,0 @@ -import os -import pathlib -from collections import namedtuple - -import boto3 -import dunamai -import yaml -from prefect.run_configs.base import UniversalRun - -def _get_git_version(): - version = dunamai.Version.from_git() - ver_str = version.commit - if version.dirty: - ver_str = ver_str + ' + uncommitted' - return ver_str - -def _get_docker_versions(): - - version_dict = {} - ecs = boto3.client('ecs') - # Workflow always uses the latest ECS task - tasks_latest = ecs.list_task_definitions()['taskDefinitionArns'] - for t in tasks_latest: - taskDef = ecs.describe_task_definition(taskDefinition=t)['taskDefinition'] - contDefs = taskDef['containerDefinitions'] - for c in contDefs: - name = c['name'] - image = c['image'] - version_dict[name] = image - return version_dict - -# Version info -COMMIT_HASH = _get_git_version() -DOCKER_VERS = _get_docker_versions() - -# Constants -PW_URL = "https://noaa.parallel.works" -PW_S3 = "noaa-nos-none-ca-hsofs-c" -PW_S3_PREFIX = "Soroosh.Mani" - -STATIC_S3 = "tacc-nos-icogs-static" -RESULT_S3 = "tacc-icogs-results" - -PREFECT_PROJECT_NAME = "ondemand-stormsurge" #"odssm" -LOG_STDERR = True - -THIS_FILE = pathlib.Path(__file__) -TERRAFORM_CONFIG_FILE = THIS_FILE.parent.parent/'vars_from_terraform' - -WORKFLOW_TAG_NAME = "Workflow Tag" -INIT_FINI_LOCK = "/efs/.initfini.lock" - -run_cfg_local_aws_cred = UniversalRun(labels=['tacc-odssm-local']) -run_cfg_local_pw_cred = UniversalRun(labels=['tacc-odssm-local-for-rdhpcs']) -run_cfg_rdhpcsc_mesh_cluster = UniversalRun(labels=['tacc-odssm-rdhpcs-mesh-cluster']) -run_cfg_rdhpcsc_schism_cluster = 
UniversalRun(labels=['tacc-odssm-rdhpcs-schism-cluster']) - -# TODO: Make environment based configs dynamic -pw_s3_cred = dict( - aws_access_key_id=os.getenv('RDHPCS_S3_ACCESS_KEY_ID'), - aws_secret_access_key=os.getenv('RDHPCS_S3_SECRET_ACCESS_KEY'), -) - -with open(TERRAFORM_CONFIG_FILE, 'r') as f: - locals().update(**yaml.load(f, Loader=yaml.Loader)) - -# TODO: Get from var file in conf -log_group_name='odssm_ecs_task_docker_logs' diff --git a/prefect/workflow/flows/__init__.py b/prefect/workflow/flows/__init__.py deleted file mode 100644 index cbf2906..0000000 --- a/prefect/workflow/flows/__init__.py +++ /dev/null @@ -1,2 +0,0 @@ -import flows.infra -import flows.jobs diff --git a/prefect/workflow/flows/infra.py b/prefect/workflow/flows/infra.py deleted file mode 100644 index b845c2b..0000000 --- a/prefect/workflow/flows/infra.py +++ /dev/null @@ -1,82 +0,0 @@ -from prefect import case - -from tasks.params import param_storm_name, param_storm_year, param_run_id -from tasks.infra import ( - task_format_list_cluster_instance_arns, - task_list_cluster_instance_arns, - task_check_if_ec2_needed, - task_format_spinup_cluster_ec2, - task_spinup_cluster_ec2, - task_client_wait_for_ec2, - task_list_cluster_tasks, - task_format_list_cluster_tasks, - task_check_cluster_shutdown, - task_format_list_cluster_instance_ids, - task_list_cluster_instance_ids, - task_term_instances, - task_format_term_ec2, - task_create_ec2_w_tag, - task_destroy_ec2_by_tag) -from tasks.utils import task_pylist_from_jsonlist, task_get_run_tag -from flows.utils import LocalAWSFlow - -def make_flow_create_infra(flow_name, cluster_name, ec2_template): - with LocalAWSFlow(flow_name) as flow: - result_ecs_instances = task_list_cluster_instance_arns( - task_format_list_cluster_instance_arns( - cluster=cluster_name)) - result_need_ec2 = task_check_if_ec2_needed(rv_shell=result_ecs_instances) - with case(result_need_ec2, True): - result_spinup_ec2 = task_spinup_cluster_ec2( - task_format_spinup_cluster_ec2( - template_id=ec2_template)) - result_wait_ec2 = task_client_wait_for_ec2( - waiter_kwargs=dict( - InstanceIds=task_pylist_from_jsonlist(result_spinup_ec2) - ) - ) - return flow - -def make_flow_teardown_infra(flow_name, cluster_name): - with LocalAWSFlow(flow_name) as flow: - result_tasks = task_list_cluster_tasks( - task_format_list_cluster_tasks( - cluster=cluster_name)) - result_can_shutdown = task_check_cluster_shutdown( - rv_shell=result_tasks) - with case(result_can_shutdown, True): - result_ecs_instances = task_list_cluster_instance_ids( - task_format_list_cluster_instance_ids( - cluster=cluster_name)) - task_term_instances( - command=task_format_term_ec2( - instance_id_list=result_ecs_instances - ) - ) - return flow - -def make_flow_create_infra_v2(flow_name, cluster_name, ec2_template): - - # NOTE: `cluster_name` is not used in this version - with LocalAWSFlow(flow_name) as flow: - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - result_ec2_ids = task_create_ec2_w_tag( - ec2_template, result_run_tag) - - result_wait_ec2 = task_client_wait_for_ec2( - waiter_kwargs=dict(InstanceIds=result_ec2_ids) - ) - return flow - -def make_flow_teardown_infra_v2(flow_name, cluster_name): - - # NOTE: `cluster_name` is not used in this version - with LocalAWSFlow(flow_name) as flow: - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - task_destroy_ec2_by_tag(result_run_tag) - return flow diff --git a/prefect/workflow/flows/jobs/__init__.py 
b/prefect/workflow/flows/jobs/__init__.py deleted file mode 100644 index c345c6c..0000000 --- a/prefect/workflow/flows/jobs/__init__.py +++ /dev/null @@ -1,2 +0,0 @@ -import flows.jobs.ecs -import flows.jobs.pw diff --git a/prefect/workflow/flows/jobs/ecs.py b/prefect/workflow/flows/jobs/ecs.py deleted file mode 100644 index 0dc26ae..0000000 --- a/prefect/workflow/flows/jobs/ecs.py +++ /dev/null @@ -1,510 +0,0 @@ -from functools import partial - -from prefect import apply_map, unmapped, case, task -from prefect.utilities.edges import unmapped -from prefect.tasks.secrets import EnvVarSecret -from prefect.tasks.files.operations import Glob -from prefect.tasks.prefect.flow_run import create_flow_run, wait_for_flow_run - -from conf import ( - OCSMESH_CLUSTER, OCSMESH_TEMPLATE_1_ID, OCSMESH_TEMPLATE_2_ID, - SCHISM_CLUSTER, SCHISM_TEMPLATE_ID, - VIZ_CLUSTER, VIZ_TEMPLATE_ID, - WF_CLUSTER, WF_TEMPLATE_ID, WF_IMG, - ECS_TASK_ROLE, ECS_EXEC_ROLE, - PREFECT_PROJECT_NAME, -) -from tasks.params import ( - param_storm_name, param_storm_year, param_run_id, - param_use_parametric_wind, param_schism_dir, - param_subset_mesh, param_ensemble, - param_mesh_hmax, - param_mesh_hmin_low, param_mesh_rate_low, - param_mesh_trans_elev, - param_mesh_hmin_high, param_mesh_rate_high, - param_ensemble_n_perturb, param_hr_prelandfall, - param_ensemble_sample_rule, - param_past_forecast, - param_wind_coupling, - param_schism_exec, -) -from tasks.infra import ContainerInstance, task_add_ecs_attribute_for_ec2 -from tasks.jobs import ( - task_start_ecs_task, - task_format_start_task, - shell_run_task, - task_client_wait_for_ecs, - task_retrieve_task_docker_logs, - task_kill_task_if_wait_fails, - task_format_kill_timedout, - task_check_docker_success) -from tasks.utils import ( - ECSTaskDetail, - task_check_param_true, - task_pylist_from_jsonlist, - task_get_run_tag, - task_get_flow_run_id, - task_bundle_params, - task_replace_tag_in_template, - task_convert_str_to_path, - task_return_value_if_param_true, - task_return_value_if_param_false, - task_return_this_if_param_true_else_that -) -from flows.utils import LocalAWSFlow, flow_dependency, task_create_ecsrun_config - - - -def _use_if(param, is_true, value): - if is_true: - task = task_return_value_if_param_true - else: - task = task_return_value_if_param_false - - return lambda: task(param=param, value=value) - - -def _tag(template): - return lambda: task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str=str(template)) - - -def _tag_n_use_if(param, is_true, template): - if is_true: - task = task_return_value_if_param_true - else: - task = task_return_value_if_param_false - - return lambda: task( - param=param, - value=task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str=str(template) - ) - ) - - -def _use_if_and(*and_conds, value=None): - assert value is not None - assert len(and_conds) % 2 == 0 - tasks_args = [] - conds_iter = iter(and_conds) - for par, is_true in zip(conds_iter, conds_iter): - if is_true: - tasks_args.append((task_return_value_if_param_true, par)) - else: - tasks_args.append((task_return_value_if_param_false, par)) - - def _call_task(remains): - assert len(remains) > 0 - task_arg = remains[0] - if len(remains) == 1: - return task_arg[0](param=task_arg[1], value=value) - return task_arg[0](param=task_arg[1], value=_call_task(remains[1:])) - - def _task_recurse(): - return _call_task(tasks_args) - - return 
_task_recurse - - -def _tag_n_use_if_and(*and_conds, template=None): - assert template is not None - assert len(and_conds) % 2 == 0 - tasks_args = [] - conds_iter = iter(and_conds) - for par, is_true in zip(conds_iter, conds_iter): - if is_true: - tasks_args.append((task_return_value_if_param_true, par)) - else: - tasks_args.append((task_return_value_if_param_false, par)) - - def _call_task(remains): - assert len(remains) > 0 - task_arg = remains[0] - if len(remains) == 1: - return task_arg[0]( - param=task_arg[1], - value=task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str=str(template) - ) - ) - return task_arg[0](param=task_arg[1], value=_call_task(remains[1:])) - - def _task_recurse(): - return _call_task(tasks_args) - - return _task_recurse - - -info_flow_ecs_task_details = { - "sim-prep-info-aws": ECSTaskDetail( - OCSMESH_CLUSTER, OCSMESH_TEMPLATE_2_ID, "odssm-info", "info", - [ - '--date-range-outpath', - _tag('{tag}/setup/dates.csv'), - '--track-outpath', - _tag('{tag}/nhc_track/hurricane-track.dat'), - '--swath-outpath', - _tag('{tag}/windswath'), - '--station-data-outpath', - _tag('{tag}/coops_ssh/stations.nc'), - '--station-location-outpath', - _tag('{tag}/setup/stations.csv'), - _use_if(param_past_forecast, True, '--past-forecast'), - _use_if(param_past_forecast, True, "--hours-before-landfall"), - _use_if(param_past_forecast, True, param_hr_prelandfall), - param_storm_name, param_storm_year, - ], - "hurricane info", - 60, 20, []), - "sim-prep-mesh-aws": ECSTaskDetail( - OCSMESH_CLUSTER, OCSMESH_TEMPLATE_1_ID, "odssm-mesh", "mesh", [ - param_storm_name, param_storm_year, - "--rasters-dir", 'dem', - # If subsetting flag is False - _use_if(param_subset_mesh, False, "hurricane_mesh"), - _use_if(param_subset_mesh, False, "--hmax"), - _use_if(param_subset_mesh, False, param_mesh_hmax), - _use_if(param_subset_mesh, False, "--hmin-low"), - _use_if(param_subset_mesh, False, param_mesh_hmin_low), - _use_if(param_subset_mesh, False, "--rate-low"), - _use_if(param_subset_mesh, False, param_mesh_rate_low), - _use_if(param_subset_mesh, False, "--transition-elev"), - _use_if(param_subset_mesh, False, param_mesh_trans_elev), - _use_if(param_subset_mesh, False, "--hmin-high"), - _use_if(param_subset_mesh, False, param_mesh_hmin_high), - _use_if(param_subset_mesh, False, "--rate-high"), - _use_if(param_subset_mesh, False, param_mesh_rate_high), - _use_if(param_subset_mesh, False, "--shapes-dir"), - _use_if(param_subset_mesh, False, 'shape'), - _use_if(param_subset_mesh, False, "--windswath"), - _tag_n_use_if( - param_subset_mesh, False, 'hurricanes/{tag}/windswath' - ), - # If subsetting flag is True - _use_if(param_subset_mesh, True, "subset_n_combine"), - _use_if(param_subset_mesh, True, 'grid/HSOFS_250m_v1.0_fixed.14'), - _use_if(param_subset_mesh, True, 'grid/WNAT_1km.14'), - _tag_n_use_if( - param_subset_mesh, True, 'hurricanes/{tag}/windswath' - ), - # Other shared options - "--out", _tag('hurricanes/{tag}/mesh'), - ], - "meshing", - 60, 180, []), - "sim-prep-setup-aws": ECSTaskDetail( - OCSMESH_CLUSTER, OCSMESH_TEMPLATE_2_ID, "odssm-prep", "prep", [ - # Command and arguments for deterministic run - _use_if(param_ensemble, False, "setup_model"), - _use_if_and( - param_use_parametric_wind, True, param_ensemble, False, - value="--parametric-wind" - ), - _use_if(param_ensemble, False, "--mesh-file"), - _tag_n_use_if( - param_ensemble, False, 'hurricanes/{tag}/mesh/mesh_w_bdry.grd' - ), - _use_if(param_ensemble, 
False, "--domain-bbox-file"), - _tag_n_use_if( - param_ensemble, False, 'hurricanes/{tag}/mesh/domain_box/' - ), - _use_if(param_ensemble, False, "--station-location-file"), - _tag_n_use_if( - param_ensemble, False, 'hurricanes/{tag}/setup/stations.csv' - ), - _use_if(param_ensemble, False, "--out"), - _tag_n_use_if( - param_ensemble, False, 'hurricanes/{tag}/setup/schism.dir/' - ), - _use_if_and( - param_use_parametric_wind, True, param_ensemble, False, - value="--track-file" - ), - _tag_n_use_if_and( - param_use_parametric_wind, True, param_ensemble, False, - template='hurricanes/{tag}/nhc_track/hurricane-track.dat', - ), - _use_if(param_ensemble, False, "--cache-dir"), - _use_if(param_ensemble, False, 'cache'), - _use_if(param_ensemble, False, "--nwm-dir"), - _use_if(param_ensemble, False, 'nwm'), - # Command and arguments for ensemble run - _use_if(param_ensemble, True, "setup_ensemble"), - _use_if(param_ensemble, True, "--track-file"), - _tag_n_use_if( - param_ensemble, True, 'hurricanes/{tag}/nhc_track/hurricane-track.dat', - ), - _use_if(param_ensemble, True, "--output-directory"), - _tag_n_use_if( - param_ensemble, True, 'hurricanes/{tag}/setup/ensemble.dir/' - ), - _use_if(param_ensemble, True, "--num-perturbations"), - _use_if(param_ensemble, True, param_ensemble_n_perturb), - _use_if(param_ensemble, True, '--mesh-directory'), - _tag_n_use_if( - param_ensemble, True, 'hurricanes/{tag}/mesh/' - ), - _use_if(param_ensemble, True, "--sample-from-distribution"), -# _use_if(param_ensemble, True, "--quadrature"), - _use_if(param_ensemble, True, "--sample-rule"), - _use_if(param_ensemble, True, param_ensemble_sample_rule), - _use_if(param_ensemble, True, "--hours-before-landfall"), - _use_if(param_ensemble, True, param_hr_prelandfall), - _use_if(param_ensemble, True, "--nwm-file"), - _use_if(param_ensemble, - True, - "nwm/NWM_v2.0_channel_hydrofabric/nwm_v2_0_hydrofabric.gdb" - ), - # Common arguments - "--date-range-file", - _tag('hurricanes/{tag}/setup/dates.csv'), - "--tpxo-dir", 'tpxo', - _use_if(param_wind_coupling, True, "--use-wwm"), - param_storm_name, param_storm_year], - "setup", - 60, 180, ["CDSAPI_URL", "CDSAPI_KEY"]), - "schism-run-aws-single": ECSTaskDetail( - SCHISM_CLUSTER, SCHISM_TEMPLATE_ID, "odssm-solve", "solve", [ - param_schism_dir, - param_schism_exec - ], - "SCHISM", - 60, 240, []), - "viz-sta-html-aws": ECSTaskDetail( - VIZ_CLUSTER, VIZ_TEMPLATE_ID, "odssm-post", "post", [ - param_storm_name, param_storm_year, - _tag('hurricanes/{tag}/setup/schism.dir/'), - ], - "visualization", - 20, 45, []), - "viz-cmb-ensemble-aws": ECSTaskDetail( - SCHISM_CLUSTER, SCHISM_TEMPLATE_ID, "odssm-prep", "prep", [ - 'combine_ensemble', - '--ensemble-dir', - _tag('hurricanes/{tag}/setup/ensemble.dir/'), - '--tracks-dir', - _tag('hurricanes/{tag}/setup/ensemble.dir/track_files'), - ], - "Combine ensemble output files", - 60, 90, []), - "viz-ana-ensemble-aws": ECSTaskDetail( - SCHISM_CLUSTER, SCHISM_TEMPLATE_ID, "odssm-prep", "prep", [ - 'analyze_ensemble', - '--ensemble-dir', - _tag('hurricanes/{tag}/setup/ensemble.dir/'), - '--tracks-dir', - _tag('hurricanes/{tag}/setup/ensemble.dir/track_files'), - ], - "Analyze combined ensemble output", - 60, 90, []), -} - -def helper_call_prefect_task_for_ecs_job( - cluster_name, - ec2_template, - description, - name_ecs_task, - name_docker, - command, - wait_delay=60, - wait_attempt=150, - environment=None): - - additional_kwds = {} - if environment is not None: - env = additional_kwds.setdefault('env', []) - for item in environment: - 
env.append({ - "name": item, - "value": EnvVarSecret(item, raise_if_missing=True)} - ) - - - # Using container instance per ecs flow, NOT main flow - thisflow_run_id = task_get_flow_run_id() - with ContainerInstance(thisflow_run_id, ec2_template): - - result_ecs_task = task_start_ecs_task( - task_args=dict( - name=f'Start {description}', - ), - command=task_format_start_task( - template=shell_run_task, - cluster=cluster_name, - docker_cmd=command, - name_ecs_task=name_ecs_task, - name_docker=name_docker, - run_tag=thisflow_run_id, - **additional_kwds) - ) - - result_wait_ecs = task_client_wait_for_ecs( - waiter_kwargs=dict( - cluster=cluster_name, - tasks=task_pylist_from_jsonlist(result_ecs_task), - WaiterConfig=dict(Delay=wait_delay, MaxAttempts=wait_attempt) - ) - ) - - task_retrieve_task_docker_logs( - tasks=task_pylist_from_jsonlist(result_ecs_task), - log_prefix=name_ecs_task, - container_name=name_docker, - upstream_tasks=[result_wait_ecs]) - - # Timeout based on Prefect wait - task_kill_task_if_wait_fails.map( - upstream_tasks=[unmapped(result_wait_ecs)], - command=task_format_kill_timedout.map( - cluster=unmapped(cluster_name), - task=task_pylist_from_jsonlist(result_ecs_task) - ) - ) - - result_docker_success = task_check_docker_success( - upstream_tasks=[result_wait_ecs], - cluster_name=cluster_name, - tasks=task_pylist_from_jsonlist(result_ecs_task)) - - return result_docker_success - - - -@task -def _task_pathlist_to_strlist(path_list, rel_to=None): - '''PosixPath objects are not picklable and need to be converted to string''' - return [str(p) if rel_to is None else str(p.relative_to(rel_to)) for p in path_list] - - - -def make_flow_generic_ecs_task(flow_name): - - task_detail = info_flow_ecs_task_details[flow_name] - - with LocalAWSFlow(flow_name) as flow: - ref_task = helper_call_prefect_task_for_ecs_job( - cluster_name=task_detail.name_ecs_cluster, - ec2_template=task_detail.id_ec2_template, - description=task_detail.description, - name_ecs_task=task_detail.name_ecs_task, - name_docker=task_detail.name_docker, - wait_delay=task_detail.wait_delay, - wait_attempt=task_detail.wait_max_attempt, - environment=task_detail.env_secrets, - command=[ - i() if callable(i) else i - for i in task_detail.docker_args]) - - flow.set_reference_tasks([ref_task]) - return flow - - -def make_flow_solve_ecs_task(child_flow): - - - ref_tasks = [] - with LocalAWSFlow("schism-run-aws-ensemble") as flow: - - result_is_ensemble_on = task_check_param_true(param_ensemble) - with case(result_is_ensemble_on, False): - rundir = task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str='hurricanes/{tag}/setup/schism.dir/' - ) - - ref_tasks.append( - flow_dependency( - flow_name=child_flow.name, - upstream_tasks=None, - parameters=task_bundle_params( - name=param_storm_name, - year=param_storm_year, - run_id=param_run_id, - schism_dir=rundir, - schism_exec=task_return_this_if_param_true_else_that( - param_wind_coupling, - 'pschism_WWM_PAHM_TVD-VL', - 'pschism_PAHM_TVD-VL', - ) - ) - ) - ) - - with case(result_is_ensemble_on, True): - result_ensemble_dir = task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str='hurricanes/{tag}/setup/ensemble.dir/' - ) - - run_tag = task_get_run_tag( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id) - - # Start an EC2 to manage ensemble flow runs - with ContainerInstance(run_tag, WF_TEMPLATE_ID) 
as ec2_ids: - - task_add_ecs_attribute_for_ec2(ec2_ids, WF_CLUSTER, run_tag) - ecs_config = task_create_ecsrun_config(run_tag) - coldstart_task = flow_dependency( - flow_name=child_flow.name, - upstream_tasks=None, - parameters=task_bundle_params( - name=param_storm_name, - year=param_storm_year, - run_id=param_run_id, - schism_dir=result_ensemble_dir + '/spinup', - schism_exec='pschism_PAHM_TVD-VL', - ), - run_config=ecs_config, - ) - - hotstart_dirs = Glob(pattern='runs/*')( - path=task_convert_str_to_path('/efs/' + result_ensemble_dir) - ) - - flow_run_uuid = create_flow_run.map( - flow_name=unmapped(child_flow.name), - project_name=unmapped(PREFECT_PROJECT_NAME), - parameters=task_bundle_params.map( - name=unmapped(param_storm_name), - year=unmapped(param_storm_year), - run_id=unmapped(param_run_id), - schism_exec=unmapped( - task_return_this_if_param_true_else_that( - param_wind_coupling, - 'pschism_WWM_PAHM_TVD-VL', - 'pschism_PAHM_TVD-VL', - ) - ), - schism_dir=_task_pathlist_to_strlist( - hotstart_dirs, rel_to='/efs' - ) - ), - upstream_tasks=[unmapped(coldstart_task)], - run_config=unmapped(ecs_config) - ) - - hotstart_task = wait_for_flow_run.map( - flow_run_uuid, raise_final_state=unmapped(True)) - - - ref_tasks.append(coldstart_task) - ref_tasks.append(hotstart_task) - - flow.set_reference_tasks(ref_tasks) - return flow diff --git a/prefect/workflow/flows/jobs/pw.py b/prefect/workflow/flows/jobs/pw.py deleted file mode 100644 index 51096c8..0000000 --- a/prefect/workflow/flows/jobs/pw.py +++ /dev/null @@ -1,367 +0,0 @@ -from prefect import unmapped, case, task -from prefect.tasks.secrets import EnvVarSecret -from prefect.tasks.control_flow import merge -from prefect.tasks.files.operations import Glob - -from conf import PW_S3, PW_S3_PREFIX -from tasks.params import ( - param_storm_name, param_storm_year, param_run_id, - param_subset_mesh, param_ensemble, - param_mesh_hmax, - param_mesh_hmin_low, param_mesh_rate_low, - param_mesh_trans_elev, - param_mesh_hmin_high, param_mesh_rate_high, - param_use_rdhpcs_post, - param_wind_coupling, -) -from tasks.jobs import ( - task_submit_slurm, - task_format_mesh_slurm, - task_format_schism_slurm, - task_wait_slurm_done, - task_run_rdhpcs_job) -from tasks.data import ( - task_download_s3_to_luster, - task_format_copy_s3_to_lustre, - task_upload_luster_to_s3, - task_format_copy_lustre_to_s3, - task_upload_to_rdhpcs, - task_format_s3_upload, - task_download_from_rdhpcs, - task_format_s3_download, - task_delete_from_rdhpcs, - task_format_s3_delete) -from tasks.infra import ( - task_start_rdhpcs_cluster, - task_stop_rdhpcs_cluster) -from tasks.utils import ( - task_check_param_true, - task_bundle_params, task_get_run_tag, - task_replace_tag_in_template, - task_convert_str_to_path, - task_return_value_if_param_true, - task_return_value_if_param_false, - task_return_this_if_param_true_else_that, -) -from flows.utils import ( - LocalPWFlow, RDHPCSMeshFlow, RDHPCSSolveFlow, flow_dependency) - - -def helper_mesh_args(argument, is_true): - if is_true: - return lambda: task_return_value_if_param_true( - param=param_subset_mesh, - value=argument) - return lambda: task_return_value_if_param_false( - param=param_subset_mesh, - value=argument) - - -def helper_mesh_arglist(*args): - return [i() if callable(i) else i for i in args] - - -@task -def _task_pathlist_to_strlist(path_list, rel_to=None): - '''PosixPath objects are not picklable and need to be converted to string''' - return [str(p) if rel_to is None else str(p.relative_to(rel_to)) for p in 
path_list] - -def make_flow_mesh_rdhpcs_pw_task(): - with RDHPCSMeshFlow(f"sim-prep-rdhpcs-mesh-cluster-task") as flow: - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - # 1. Copy files from S3 to /luster - result_s3_to_lustre = task_download_s3_to_luster( - command=task_format_copy_s3_to_lustre( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - # 2. Call sbatch on slurm job - result_mesh_slurm_submitted_id = task_submit_slurm( - command=task_format_mesh_slurm( - storm_name=param_storm_name, - storm_year=param_storm_year, - kwds=helper_mesh_arglist( - "--tag", lambda: result_run_tag, - helper_mesh_args("hurricane_mesh", False), - helper_mesh_args("--hmax", False), - helper_mesh_args(param_mesh_hmax, False), - helper_mesh_args("--hmin-low", False), - helper_mesh_args(param_mesh_hmin_low, False), - helper_mesh_args("--rate-low", False), - helper_mesh_args(param_mesh_rate_low, False), - helper_mesh_args("--transition-elev", False), - helper_mesh_args(param_mesh_trans_elev, False), - helper_mesh_args("--hmin-high", False), - helper_mesh_args(param_mesh_hmin_high, False), - helper_mesh_args("--rate-high", False), - helper_mesh_args(param_mesh_rate_high, False), - helper_mesh_args("subset_n_combine", True), - helper_mesh_args("FINEMESH_PLACEHOLDER", True), - helper_mesh_args("COARSEMESH_PLACEHOLDER", True), - helper_mesh_args("ROI_PLACEHOLDER", True), - ), - upstream_tasks=[result_s3_to_lustre])) - - # 3. Check slurm job status - result_wait_slurm_done = task_wait_slurm_done( - job_id=result_mesh_slurm_submitted_id) - - # 4. Copy /luster to S3 - result_lustre_to_s3 = task_upload_luster_to_s3( - command=task_format_copy_lustre_to_s3( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX, - upstream_tasks=[result_wait_slurm_done])) - return flow - -def make_flow_mesh_rdhpcs(mesh_pw_task_flow): - - - with LocalPWFlow(f"sim-prep-mesh-rdhpcs") as flow: - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - result_pw_api_key = EnvVarSecret("PW_API_KEY") - - # 1. COPY HURR INFO TO S3 USING LOCAL AGENT FOR RDHPCS - result_upload_to_rdhpcs = task_upload_to_rdhpcs( - command=task_format_s3_upload( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - # 2. START RDHPCS MESH CLUSTER - result_start_rdhpcs_cluster = task_start_rdhpcs_cluster( - upstream_tasks=[result_upload_to_rdhpcs], - api_key=result_pw_api_key, - cluster_name="odssmmeshv22" - ) - - # NOTE: Using disowned user bootstrap script instead - # 3. START PREFECT AGENT ON MESH CLUSTER -# result_start_prefect_agent = task_run_rdhpcs_job( -# upstream_tasks=[result_start_rdhpcs_cluster], -# api_key=result_pw_api_key, -# workflow_name="odssm_agent_mesh") - - # Note: there's no need to wait, whenever the tasks that need - # cluster agent will wait until it is started - # 4. RUN RDHPCS PREFECT TASK - # 5. WAIT RDHPCS PREFECT TASK - # TODO: Use dummy task dependent on taskflow run added in main! - after_mesh_on_rdhpcs = flow_dependency( - flow_name=mesh_pw_task_flow.name, - upstream_tasks=[result_start_rdhpcs_cluster], - parameters=task_bundle_params( - name=param_storm_name, - year=param_storm_year, - run_id=param_run_id, - mesh_hmax=param_mesh_hmax, - mesh_hmin_low=param_mesh_hmin_low, - mesh_rate_low=param_mesh_rate_low, - mesh_cutoff=param_mesh_trans_elev, - mesh_hmin_high=param_mesh_hmin_high, - mesh_rate_high=param_mesh_rate_high, - subset_mesh=param_subset_mesh, - ) - ) - - # 6. 
STOP RDHPCS MESH CLUSTER? FIXME -# result_stop_rdhpcs_cluster = task_stop_rdhpcs_cluster( -# upstream_tasks=[result_wait_rdhpcs_job], -# api_key=result_pw_api_key, -# cluster_name="odssm_mesh_v2_2" -# ) - - # 7. COPY MESH FROM S3 TO EFS - result_download_from_rdhpcs = task_download_from_rdhpcs( -# upstream_tasks=[result_wait_rdhpcs_job], - upstream_tasks=[after_mesh_on_rdhpcs], - command=task_format_s3_download( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - # NOTE: We remove storm dir after simulation is done - - return flow - - -def make_flow_solve_rdhpcs_pw_task(): - with RDHPCSSolveFlow(f"run-schism-rdhpcs-schism-cluster-task") as flow: - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - result_is_ensemble_on = task_check_param_true(param_ensemble) - - # 1. Copy files from S3 to /luster - result_s3_to_lustre = task_download_s3_to_luster( - command=task_format_copy_s3_to_lustre( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - # 2. Call sbatch on slurm job - # 3. Check slurm job status - with case(result_is_ensemble_on, False): - result_rundir = task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str='hurricanes/{tag}/setup/schism.dir/' - ) - result_after_single_run = task_submit_slurm( - command=task_format_schism_slurm( - run_path=result_rundir, - schism_exec=task_return_this_if_param_true_else_that( - param_wind_coupling, - 'pschism_WWM_PAHM_TVD-VL', - 'pschism_PAHM_TVD-VL', - ), - upstream_tasks=[result_s3_to_lustre])) - - result_wait_slurm_done_1 = task_wait_slurm_done( - job_id=result_after_single_run) - - with case(result_is_ensemble_on, True): - result_ensemble_dir = task_replace_tag_in_template( - storm_name=param_storm_name, - storm_year=param_storm_year, - run_id=param_run_id, - template_str='hurricanes/{tag}/setup/ensemble.dir/') - - result_after_coldstart = task_submit_slurm( - command=task_format_schism_slurm( - run_path=result_ensemble_dir + '/spinup', - schism_exec='pschism_PAHM_TVD-VL', - upstream_tasks=[result_s3_to_lustre])) - result_wait_slurm_done_spinup = task_wait_slurm_done( - job_id=result_after_coldstart) - - - hotstart_dirs = Glob(pattern='runs/*')( - path=task_convert_str_to_path('/lustre/' + result_ensemble_dir) - ) - - # TODO: Somehow failure in coldstart task doesn't fail the - # whole flow due to these mapped tasks -- why? - result_after_hotstart = task_submit_slurm.map( - command=task_format_schism_slurm.map( - run_path=_task_pathlist_to_strlist( - hotstart_dirs, rel_to='/lustre'), - schism_exec=unmapped(task_return_this_if_param_true_else_that( - param_wind_coupling, - 'pschism_WWM_PAHM_TVD-VL', - 'pschism_PAHM_TVD-VL', - )), - upstream_tasks=[unmapped(result_wait_slurm_done_spinup)])) - result_wait_slurm_done_2 = task_wait_slurm_done.map( - job_id=result_after_hotstart) - - result_wait_slurm_done = merge( - result_wait_slurm_done_1, result_wait_slurm_done_2 - ) - - - # 4. 
Copy /luster to S3 - result_lustre_to_s3 = task_upload_luster_to_s3( - command=task_format_copy_lustre_to_s3( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX, - upstream_tasks=[result_wait_slurm_done])) - return flow - - -def make_flow_solve_rdhpcs(solve_pw_task_flow): - - - with LocalPWFlow(f"run-schism-rdhpcs") as flow: - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, param_run_id) - - result_pw_api_key = EnvVarSecret("PW_API_KEY") - result_is_rdhpcspost_on = task_check_param_true( - param_use_rdhpcs_post) - - # NOTE: We should have the mesh in S3 bucket from before, but we - # need the hurricane schism setup now - - # 1. COPY HURR SETUP TO S3 USING LOCAL AGENT FOR RDHPCS - result_upload_to_rdhpcs = task_upload_to_rdhpcs( - command=task_format_s3_upload( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - # 2. START RDHPCS SOLVE CLUSTER - result_start_rdhpcs_cluster = task_start_rdhpcs_cluster( - upstream_tasks=[result_upload_to_rdhpcs], - api_key=result_pw_api_key, - cluster_name="odssmschismv22" - ) - - # NOTE: Using disowned user bootstrap script instead - # 3. START PREFECT AGENT ON SOLVE CLUSTER -# result_start_prefect_agent = task_run_rdhpcs_job( -# upstream_tasks=[result_start_rdhpcs_cluster], -# api_key=result_pw_api_key, -# workflow_name="odssm_agent_solve") - - # Note: there's no need to wait, whenever the tasks that need - # cluster agent will wait until it is started - # 4. RUN RDHPCS SOLVE JOB - # 5. WAIT FOR SOLVE JOB TO FINISH - # TODO: Use dummy task dependent on taskflow run added in main! - after_schism_on_rdhpcs = flow_dependency( - flow_name=solve_pw_task_flow.name, - upstream_tasks=[result_start_rdhpcs_cluster], - parameters=task_bundle_params( - name=param_storm_name, - year=param_storm_year, - run_id=param_run_id, - ensemble=param_ensemble, - couple_wind=param_wind_coupling, - ) - ) - - - # 6. STOP RDHPCS SOLVE CLUSTER? FIXME -# result_stop_rdhpcs_cluster = task_stop_rdhpcs_cluster( -# upstream_tasks=[result_wait_rdhpcs_job], -# api_key=result_pw_api_key, -# cluster_name="odssm_schism_v2_2" -# ) - - # 7. COPY SOLUTION FROM S3 TO EFS - with case(result_is_rdhpcspost_on, False): - result_download_from_rdhpcs = task_download_from_rdhpcs( -# upstream_tasks=[result_wait_rdhpcs_job], - upstream_tasks=[after_schism_on_rdhpcs], - command=task_format_s3_download( - run_tag=result_run_tag, - bucket_name=PW_S3, - bucket_prefix=PW_S3_PREFIX)) - - with case(result_is_rdhpcspost_on, True): - # TODO: - pass - - # 8. DELETE STORM FILES FROM RDHPCS S3? 
FIXME -# result_delete_from_rdhpcs = task_delete_from_rdhpcs( -# upstream_tasks=[result_download_from_rdhpcs], -# command=task_format_s3_delete( -# storm_name=param_storm_name, -# storm_year=param_storm_year, -# bucket_name=PW_S3, -# bucket_prefix=PW_S3_PREFIX)) - - return flow diff --git a/prefect/workflow/flows/utils.py b/prefect/workflow/flows/utils.py deleted file mode 100644 index 321f72d..0000000 --- a/prefect/workflow/flows/utils.py +++ /dev/null @@ -1,164 +0,0 @@ -from contextlib import contextmanager -from functools import partial - - -from dunamai import Version -from slugify import slugify -import prefect -from prefect import Flow, case, task -from prefect.tasks.prefect import StartFlowRun -from prefect.tasks.prefect.flow_run import create_flow_run, wait_for_flow_run -from prefect.backend.flow_run import FlowRunView -from prefect.tasks.prefect.flow_run_cancel import CancelFlowRun -from prefect.storage import S3 -from prefect.engine.results.s3_result import S3Result -from prefect.executors import LocalDaskExecutor -from prefect.triggers import all_finished -from prefect.run_configs.ecs import ECSRun - -from conf import ( - S3_BUCKET, PW_S3, PW_S3_PREFIX, pw_s3_cred, - PREFECT_PROJECT_NAME, - WF_CLUSTER, WF_IMG, WF_ECS_TASK_ARN, - ECS_TASK_ROLE, ECS_EXEC_ROLE, ECS_SUBNET_ID, ECS_EC2_SG, - run_cfg_local_aws_cred, - run_cfg_local_pw_cred, - run_cfg_rdhpcsc_mesh_cluster, - run_cfg_rdhpcsc_schism_cluster) - -LocalAWSFlow = partial( - Flow, - storage=S3(bucket=S3_BUCKET), - run_config=run_cfg_local_aws_cred) - - -@contextmanager -def LocalPWFlow(flow_name): - ver = Version.from_git() - flow = Flow( - name=flow_name, - result=S3Result( - bucket=PW_S3, - location=f'{PW_S3_PREFIX}/prefect-results/{{flow_run_id}}' - ), - storage=S3( - key=f'{PW_S3_PREFIX}/prefect-flows/{slugify(flow_name)}/{ver.commit}{".mod" if ver.dirty else ""}', - bucket=PW_S3, - client_options=pw_s3_cred - ), - run_config=run_cfg_local_pw_cred - ) - with flow as inctx_flow: - yield inctx_flow - - inctx_flow.executor = LocalDaskExecutor(scheduler="processes", num_workers=10) - -@contextmanager -def RDHPCSMeshFlow(flow_name): - ver = Version.from_git() - flow = Flow( - name=flow_name, - result=S3Result( - bucket=PW_S3, - location=f'{PW_S3_PREFIX}/prefect-results/{{flow_run_id}}' - ), - storage=S3( - key=f'{PW_S3_PREFIX}/prefect-flows/{slugify(flow_name)}/{ver.commit}{".mod" if ver.dirty else ""}', - bucket=PW_S3, - client_options=pw_s3_cred - ), - run_config=run_cfg_rdhpcsc_mesh_cluster - ) - with flow as inctx_flow: - yield inctx_flow - - -@contextmanager -def RDHPCSSolveFlow(flow_name): - ver = Version.from_git() - flow = Flow( - name=flow_name, - result=S3Result( - bucket=PW_S3, - location=f'{PW_S3_PREFIX}/prefect-results/{{flow_run_id}}' - ), - storage=S3( - key=f'{PW_S3_PREFIX}/prefect-flows/{slugify(flow_name)}/{ver.commit}{".mod" if ver.dirty else ""}', - bucket=PW_S3, - client_options=pw_s3_cred - ), - run_config=run_cfg_rdhpcsc_schism_cluster - ) - with flow as inctx_flow: - yield inctx_flow - - -@task(name="Create ECSRun config") -def task_create_ecsrun_config(run_tag): - ecs_config = ECSRun( - task_definition_arn=WF_ECS_TASK_ARN, - # Use instance profile instead of task role -# task_role_arn=ECS_TASK_ROLE, -# execution_role_arn=ECS_EXEC_ROLE, - labels=['tacc-odssm-ecs'], - run_task_kwargs=dict( - cluster=WF_CLUSTER, - launchType='EC2', -# networkConfiguration={ -# 'awsvpcConfiguration': { -# 'subnets': [ECS_SUBNET_ID], -# 'securityGroups': ECS_EC2_SG, -# 'assignPublicIp': 'DISABLED', -# }, -# }, - 
placementConstraints=[ -# {'type': 'distinctInstance'}, - {'type': 'memberOf', - 'expression': f"attribute:run-tag == '{run_tag}'" - } - ], - ) - ) - - return ecs_config - -@task(name="Check if child flow is still running", trigger=all_finished) -def _task_is_childflow_still_running(flow_run_id): - flow_run_vu = FlowRunView.from_flow_run_id(flow_run_id) - logger = prefect.context.get("logger") - logger.info("*****************") - logger.info(flow_run_vu.state) - logger.info(type(flow_run_vu.state)) - logger.info("*****************") - return False - - -def flow_dependency(flow_name, parameters, upstream_tasks, **kwargs): - flow_run_uuid = create_flow_run( - flow_name=flow_name, - parameters=parameters, - project_name=PREFECT_PROJECT_NAME, - upstream_tasks=upstream_tasks, - task_args={'name': f'Start "{flow_name}"'}, - **kwargs) - - task_wait_for_flow = wait_for_flow_run( - flow_run_uuid, raise_final_state=True, - task_args={'name': f'Wait for "{flow_name}"'} - ) - - # TODO: Check for fail wait state and call cancel if still running -# child_flow_running = _task_is_childflow_still_running( -# flow_run_uuid, -# upstream_tasks=[task_wait_for_flow], -# ) -# with case(child_flow_running, True): -# task_cancel_flow = CancelFlowRun()(flow_run_id=flow_run_uuid) - - return task_wait_for_flow - -# Deprecated -FlowDependency = partial( - StartFlowRun, - wait=True, - project_name=PREFECT_PROJECT_NAME) diff --git a/prefect/workflow/main.py b/prefect/workflow/main.py deleted file mode 100644 index e25fb97..0000000 --- a/prefect/workflow/main.py +++ /dev/null @@ -1,294 +0,0 @@ -# Run from prefect directory (after terraform vars gen) using -# prefect run --name sim-prep --param name=florance --param year=2018 - - -# For logging, use `logger = prefect.context.get("logger")` within tasks -import argparse -import warnings -import pathlib - -from prefect import case -from prefect.utilities import graphql -from prefect.client import Client -from prefect.tasks.control_flow import merge - -from conf import PREFECT_PROJECT_NAME, INIT_FINI_LOCK -from tasks.params import ( - param_storm_name, param_storm_year, - param_use_rdhpcs, param_use_rdhpcs_post, param_run_id, - param_use_parametric_wind, param_subset_mesh, param_ensemble, - param_mesh_hmax, - param_mesh_hmin_low, param_mesh_rate_low, - param_mesh_trans_elev, - param_mesh_hmin_high, param_mesh_rate_high, - param_ensemble_n_perturb, param_hr_prelandfall, - param_ensemble_sample_rule, - param_past_forecast, - param_wind_coupling, - ) -from tasks.data import ( - task_copy_s3_data, - task_init_run, - task_final_results_to_s3, - task_cleanup_run, - task_cache_to_s3, - task_cleanup_efs) -from tasks.utils import ( - task_check_param_true, - task_bundle_params, - task_get_flow_run_id, - task_get_run_tag, - FLock) -from flows.jobs.ecs import ( - make_flow_generic_ecs_task, - make_flow_solve_ecs_task - ) -from flows.jobs.pw import( - make_flow_mesh_rdhpcs_pw_task, - make_flow_mesh_rdhpcs, - make_flow_solve_rdhpcs_pw_task, - make_flow_solve_rdhpcs) -from flows.utils import LocalAWSFlow, flow_dependency - - -# TODO: Later add build image and push to ECS logic into Prefect workflow - -# TODO: Use subprocess.run to switch backend here -# TODO: Create user config file to be session based? 
https://docs-v1.prefect.io/core/concepts/configuration.html#environment-variables - -def _check_project(): - client = Client() - print(f"Connecting to {client.api_server}...") - - qry = graphql.parse_graphql({'query': {'project': ['name']}}) - rsp = client.graphql(qry) - - prj_names = [i['name'] for i in rsp['data']['project']] - if PREFECT_PROJECT_NAME in prj_names: - print(f"Project {PREFECT_PROJECT_NAME} found on {client.api_server}!") - return - - print(f"Creating project {PREFECT_PROJECT_NAME} on {client.api_server}...") - client.create_project(project_name=PREFECT_PROJECT_NAME) - print("Done!") - - -def _make_workflow(): - # Create flow objects - flow_sim_prep_info_aws = make_flow_generic_ecs_task("sim-prep-info-aws") - flow_sim_prep_mesh_aws = make_flow_generic_ecs_task("sim-prep-mesh-aws") - flow_sim_prep_setup_aws = make_flow_generic_ecs_task("sim-prep-setup-aws") - flow_mesh_rdhpcs_pw_task = make_flow_mesh_rdhpcs_pw_task() - flow_mesh_rdhpcs = make_flow_mesh_rdhpcs(flow_mesh_rdhpcs_pw_task) - flow_schism_single_run_aws = make_flow_generic_ecs_task("schism-run-aws-single") - flow_schism_ensemble_run_aws = make_flow_solve_ecs_task(flow_schism_single_run_aws) - flow_solve_rdhpcs_pw_task = make_flow_solve_rdhpcs_pw_task() - flow_solve_rdhpcs = make_flow_solve_rdhpcs(flow_solve_rdhpcs_pw_task) - flow_sta_html_aws = make_flow_generic_ecs_task("viz-sta-html-aws") - flow_cmb_ensemble_aws = make_flow_generic_ecs_task( - "viz-cmb-ensemble-aws" - ) - flow_ana_ensemble_aws = make_flow_generic_ecs_task( - "viz-ana-ensemble-aws" - ) - - - with LocalAWSFlow("end-to-end") as flow_main: - - result_flow_run_id = task_get_flow_run_id() - - result_run_tag = task_get_run_tag( - param_storm_name, param_storm_year, result_flow_run_id) - - result_is_rdhpcs_on = task_check_param_true(param_use_rdhpcs) - result_is_ensemble_on = task_check_param_true(param_ensemble) - result_is_rdhpcspost_on = task_check_param_true(param_use_rdhpcs_post) - - with FLock(INIT_FINI_LOCK, task_args={'name': 'Sync init'}): - result_copy_task = task_copy_s3_data() - result_init_run = task_init_run( - result_run_tag, upstream_tasks=[result_copy_task]) - - result_bundle_params_1 = task_bundle_params( - name=param_storm_name, - year=param_storm_year, - rdhpcs=param_use_rdhpcs, - rdhpcs_post=param_use_rdhpcs_post, - run_id=result_flow_run_id, - parametric_wind=param_use_parametric_wind, - ensemble=param_ensemble, - hr_before_landfall=param_hr_prelandfall, - past_forecast=param_past_forecast, - couple_wind=param_wind_coupling, - ) - - result_bundle_params_2 = task_bundle_params( - name=param_storm_name, - year=param_storm_year, - rdhpcs=param_use_rdhpcs, - run_id=result_flow_run_id, - subset_mesh=param_subset_mesh, - mesh_hmax=param_mesh_hmax, - mesh_hmin_low=param_mesh_hmin_low, - mesh_rate_low=param_mesh_rate_low, - mesh_cutoff=param_mesh_trans_elev, - mesh_hmin_high=param_mesh_hmin_high, - mesh_rate_high=param_mesh_rate_high - ) - - result_bundle_params_3 = task_bundle_params( - name=param_storm_name, - year=param_storm_year, - run_id=result_flow_run_id, - parametric_wind=param_use_parametric_wind, - ensemble=param_ensemble, - ensemble_num_perturbations=param_ensemble_n_perturb, - hr_before_landfall=param_hr_prelandfall, - couple_wind=param_wind_coupling, - ensemble_sample_rule=param_ensemble_sample_rule, - ) - - after_sim_prep_info = flow_dependency( - flow_name=flow_sim_prep_info_aws.name, - upstream_tasks=[result_init_run], - parameters=result_bundle_params_1) - - # TODO: Meshing based-on original track for now - # TODO: If 
mesh each track: diff mesh - - - with case(result_is_rdhpcs_on, True): - after_sim_prep_mesh_b1 = flow_dependency( - flow_name=flow_mesh_rdhpcs.name, - upstream_tasks=[after_sim_prep_info], - parameters=result_bundle_params_2) - with case(result_is_rdhpcs_on, False): - after_sim_prep_mesh_b2 = flow_dependency( - flow_name=flow_sim_prep_mesh_aws.name, - upstream_tasks=[after_sim_prep_info], - parameters=result_bundle_params_2) - after_sim_prep_mesh = merge(after_sim_prep_mesh_b1, after_sim_prep_mesh_b2) - - after_sim_prep_setup = flow_dependency( - flow_name=flow_sim_prep_setup_aws.name, - upstream_tasks=[after_sim_prep_mesh], - parameters=result_bundle_params_3) - - with case(result_is_rdhpcs_on, True): - after_run_schism_b1 = flow_dependency( - flow_name=flow_solve_rdhpcs.name, - upstream_tasks=[after_sim_prep_setup], - parameters=result_bundle_params_1) - with case(result_is_rdhpcs_on, False): - after_run_schism_b2 = flow_dependency( - flow_name=flow_schism_ensemble_run_aws.name, - upstream_tasks=[after_sim_prep_setup], - parameters=result_bundle_params_1) - after_run_schism = merge(after_run_schism_b1, after_run_schism_b2) - - - with case(result_is_ensemble_on, True): - with case(result_is_rdhpcspost_on, False): - after_cmb_ensemble = flow_dependency( - flow_name=flow_cmb_ensemble_aws.name, - upstream_tasks=[after_run_schism], - parameters=result_bundle_params_1) - after_ana_ensemble = flow_dependency( - flow_name=flow_ana_ensemble_aws.name, - upstream_tasks=[after_cmb_ensemble], - parameters=result_bundle_params_1) - - with case(result_is_rdhpcspost_on, True): - # TODO: - pass - - with case(result_is_ensemble_on, False): - after_sta_html = flow_dependency( - flow_name=flow_sta_html_aws.name, - upstream_tasks=[after_run_schism], - parameters=result_bundle_params_1) - after_gen_viz = merge(after_ana_ensemble, after_sta_html) - - # TODO: Make this a separate flow? 
- after_results_to_s3 = task_final_results_to_s3( - param_storm_name, param_storm_year, result_run_tag, - upstream_tasks=[after_gen_viz]) - - after_cleanup_run = task_cleanup_run( - result_run_tag, upstream_tasks=[after_results_to_s3]) - - with FLock(INIT_FINI_LOCK, upstream_tasks=[after_cleanup_run], task_args={'name': 'Sync cleanup'}): - after_cache_storage = task_cache_to_s3( - upstream_tasks=[after_cleanup_run]) - task_cleanup_efs( - result_run_tag, - upstream_tasks=[after_cache_storage]) - - flow_main.set_reference_tasks([after_cleanup_run]) - - all_flows = [ - flow_sim_prep_info_aws, - flow_sim_prep_mesh_aws, - flow_sim_prep_setup_aws, - flow_mesh_rdhpcs_pw_task, - flow_mesh_rdhpcs, - flow_schism_single_run_aws, - flow_schism_ensemble_run_aws, - flow_solve_rdhpcs_pw_task, - flow_solve_rdhpcs, - flow_sta_html_aws, - flow_cmb_ensemble_aws, - flow_ana_ensemble_aws, - flow_main - ] - - return all_flows - -def _regiser(flows): - # Register unregistered flows - for flow in flows: - flow.register(project_name=PREFECT_PROJECT_NAME) - -def _viz(flows, out_dir, flow_names): - flow_dict = {f.name: f for f in flows} - for flow_nm in flow_names: - flow = flow_dict.get(flow_nm) - if flow is None: - warnings.warn(f'Flow with the name {flow_nm} NOT found!') - flow.visualize(filename=out_dir/flow.name, format='dot') - -def _list(flows): - flow_names = [f.name for f in flows] - print("\n".join(flow_names)) - - -def _main(args): - - _check_project() - all_flows = _make_workflow() - if args.command in ["register", None]: - _regiser(all_flows) - - elif args.command == "visualize": - _viz(all_flows, args.out_dir, args.flowname) - - elif args.command == "list": - _list(all_flows) - - else: - raise ValueError("Invalid command!") - -if __name__ == "__main__": - - parser = argparse.ArgumentParser() - subparsers = parser.add_subparsers(dest="command") - - reg_parser = subparsers.add_parser('register') - viz_parser = subparsers.add_parser('visualize') - list_parser = subparsers.add_parser('list') - - viz_parser.add_argument('flowname', nargs='+') - viz_parser.add_argument( - '--out-dir', '-d', type=pathlib.Path, default='.') - - _main(parser.parse_args()) diff --git a/prefect/workflow/pw_client.py b/prefect/workflow/pw_client.py deleted file mode 100755 index f5fa0d9..0000000 --- a/prefect/workflow/pw_client.py +++ /dev/null @@ -1,112 +0,0 @@ -import requests -import json -import pprint as pp - -class Client(): - - def __init__(self, url, key): - self.url = url - self.api = url+'/api' - self.key = key - self.session = requests.Session() - self.headers = { - 'Content-Type': 'application/json' - } - - def upload_dataset(self, filename, path): - req = self.session.post(self.api + "/datasets/upload?key="+self.key, - data={'dir': path}, - files={'file':open(filename, 'rb')}) - req.raise_for_status() - data = json.loads(req.text) - return data - - def download_dataset(self, file): - url=self.api + "/datasets/download?key=" + self.key + '&file=' + file - #print url - req = self.session.get(url) - req.raise_for_status() - return req.content - - def find_datasets(self, path, ext=''): - url = self.api + "/datasets/find?key=" + self.key + "&path=" + path + "&ext=" + ext - #print url - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data - - def get_job_tail(self, jid, file, lastline): - url = self.api + "/jobs/"+jid+"/tail?key=" + self.key + "&file=" + file + "&line="+str(lastline) - try: - req = self.session.get(url) - req.raise_for_status() - data = req.text - except: - data = 
"" - return data - - def start_job(self,workflow,inputs,user): - inputs = json.dumps(inputs) - req = self.session.post(self.api + "/tools",data={'user':user,'tool_xml': "/workspaces/"+user+"/workflows/"+workflow+"/workflow.xml",'key':self.key,'tool_id':workflow,'inputs':inputs}) - req.raise_for_status() - data = json.loads(req.text) - jid=data['jobs'][0]['id'] - djid=str(data['decoded_job_id']) - return jid,djid - - def get_job_state(self, jid): - url = self.api + "/jobs/"+ jid + "?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data['state'] - - def get_job_credit_info(self, jid): - url = self.api + "/jobs/"+ jid + "/monitor?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - # return data['info'] - return data - - def get_resources(self): - req = self.session.get(self.api + "/resources?key=" + self.key) - req.raise_for_status() - data = json.loads(req.text) - return data - - def get_resource(self, name): - req = self.session.get(self.api + "/resources/list?key=" + self.key + "&name=" + name) - req.raise_for_status() - try: - data = json.loads(req.text) - return data - except: - return None - - def start_resource(self, name): - req = self.session.get(self.api + "/resources/start?key=" + self.key + "&name=" + name) - req.raise_for_status() - return req.text - - def stop_resource(self, name): - req = self.session.get(self.api + "/resources/stop?key=" + self.key + "&name=" + name) - req.raise_for_status() - return req.text - - def update_resource(self, name, params): - update = "&name={}".format(name) - for key, value in params.items(): - update = "{}&{}={}".format(update, key, value) - req = self.session.post(self.api + "/resources/set?key=" + self.key + update) - req.raise_for_status() - return req.text - - def get_account(self): - url = self.api + "/account?key=" + self.key - req = self.session.get(url) - req.raise_for_status() - data = json.loads(req.text) - return data - \ No newline at end of file diff --git a/prefect/workflow/tasks/__init__.py b/prefect/workflow/tasks/__init__.py deleted file mode 100644 index 8b13789..0000000 --- a/prefect/workflow/tasks/__init__.py +++ /dev/null @@ -1 +0,0 @@ - diff --git a/prefect/workflow/tasks/data.py b/prefect/workflow/tasks/data.py deleted file mode 100644 index 954ffde..0000000 --- a/prefect/workflow/tasks/data.py +++ /dev/null @@ -1,210 +0,0 @@ -import pathlib -import shutil -import subprocess -import json -from datetime import datetime, timezone - -import boto3 -import prefect -from prefect import task -from prefect.tasks.shell import ShellTask -from prefect.tasks.templates import StringFormatter -from prefect.engine.signals import SKIP - -from conf import LOG_STDERR, RESULT_S3, STATIC_S3, COMMIT_HASH, DOCKER_VERS - -task_copy_s3_data = ShellTask( - name="Copy s3 to efs", - command='\n'.join([ - f"aws s3 sync s3://{STATIC_S3} /efs", - "chown ec2-user:ec2-user -R /efs", - "chmod 751 -R /efs", - ]) -) - -@task(name="Initialize simulation directory") -def task_init_run(run_tag): - root = pathlib.Path(f"/efs/hurricanes/{run_tag}/") - root.mkdir() - - # Get current time with local timezone info - # https://stackoverflow.com/questions/2720319/python-figure-out-local-timezone - now = datetime.now(timezone.utc).astimezone() - - # Log run info and parameters - info_file_path = root / 'run_info.json' - run_info = {} - run_info['start_date'] = now.strftime("%Y-%m-%d %H:%M:%S %Z") - run_info['run_tag'] = run_tag - run_info['git_commit'] = 
COMMIT_HASH - run_info['ecs_images'] = DOCKER_VERS - run_info['prefect'] = {} - run_info['prefect']['parameters'] = prefect.context.parameters - run_info['prefect']['flow_id'] = prefect.context.flow_id - run_info['prefect']['flow_run_id'] = prefect.context.flow_run_id - run_info['prefect']['flow_run_name'] = prefect.context.flow_run_name - - with open(info_file_path, 'w') as info_file: - json.dump(run_info, info_file, indent=2) - - for subdir in ['mesh', 'setup', 'sim', 'nhc_track', 'coops_ssh']: - (root / subdir).mkdir() - - - - -task_format_s3_upload = StringFormatter( - name="Prepare path to upload to rdhpcs S3", - template='; '.join( - ["find /efs/hurricanes/{run_tag}/ -type l -exec bash -c" - + " 'for i in \"$@\"; do" - + " readlink $i > $i.symlnk;" - # Don't remove actual links from the EFS -# + " rm -rf $i;" - + " done' _ {{}} +", - "aws s3 sync --no-follow-symlinks" - + " /efs/hurricanes/{run_tag}/" - + " s3://{bucket_name}/{bucket_prefix}/hurricanes/{run_tag}/", - ])) - -task_upload_to_rdhpcs = ShellTask( - name="Copy from efs to rdhpcs s3", -) - -task_format_s3_download = StringFormatter( - name="Prepare path to download from rdhpcs S3", - template='; '.join( - ["aws s3 sync" - + " s3://{bucket_name}/{bucket_prefix}/hurricanes/{run_tag}/" - + " /efs/hurricanes/{run_tag}/", - "find /efs/hurricanes/{run_tag}/ -type f -name '*.symlnk' -exec bash -c" - + " 'for i in \"$@\"; do" - + " ln -sf $(cat $i) ${{i%.symlnk}};" - + " rm -rf $i;" - + " done' _ {{}} +", - ])) - -task_download_from_rdhpcs = ShellTask( - name="Download from rdhpcs s3 to efs", -) - - -task_format_copy_s3_to_lustre = StringFormatter( - name="Prepare path to Luster from S3 on RDHPCS cluster", - template='; '.join( - ["aws s3 sync" - + " s3://{bucket_name}/{bucket_prefix}/hurricanes/{run_tag}/" - + " /lustre/hurricanes/{run_tag}/", - "find /lustre/hurricanes/{run_tag}/ -type f -name '*.symlnk' -exec bash -c" - + " 'for i in \"$@\"; do" - + " ln -sf $(cat $i) ${{i%.symlnk}};" - + " rm -rf $i;" - + " done' _ {{}} +", - ])) - -task_download_s3_to_luster = ShellTask( - name="Download data from RDHPCS S3 onto RDHPCS cluster /lustre", - return_all=True, - log_stderr=LOG_STDERR, -) - -task_format_copy_lustre_to_s3 = StringFormatter( - name="Prepare path to S3 from Luster on RDHPCS cluster", - template='; '.join( - ["find /lustre/hurricanes/{run_tag}/ -type l -exec bash -c" - + " 'for i in \"$@\"; do" - + " readlink $i > $i.symlnk;" - # Don't remove the actual links from the luster -# + " rm -rf $i;" - + " done' _ {{}} +", - "aws s3 sync --no-follow-symlinks" - + " /lustre/hurricanes/{run_tag}/" - + " s3://{bucket_name}/{bucket_prefix}/hurricanes/{run_tag}/" - ])) - -task_upload_luster_to_s3 = ShellTask( - name="Upload data from RDHPCS cluster /lustre onto RDHPCS S3", - return_all=True, - log_stderr=LOG_STDERR, -) - -task_format_s3_delete = StringFormatter( - name="Prepare path to remove from rdhpcs S3", - template="aws s3 rm --recursive" - + " s3://{bucket_name}/{bucket_prefix}/hurricanes/{run_tag}/") - -task_delete_from_rdhpcs = ShellTask( - name="Delete from rdhpcs s3", -) - -@task(name="Copy final results to S3 for longterm storage") -def task_final_results_to_s3(storm_name, storm_year, run_tag): - s3 = boto3.client("s3") - src = pathlib.Path(f'/efs/hurricanes/{run_tag}') - prefix = f'{storm_name}_{storm_year}_' - - aws_rsp = s3.list_objects_v2(Bucket=RESULT_S3, Delimiter='/') - - try: - top_level = [k['Prefix'] for k in aws_rsp['CommonPrefixes']] - except KeyError: - top_level = [] - - old_runs = [i.strip('/') for i in 
top_level if i.startswith(prefix)] - run_numstr = [i[len(prefix):] for i in old_runs] - run_nums = [int(i) for i in run_numstr if i.isnumeric()] - - next_num = 1 - if len(run_nums) > 0: - next_num = max(run_nums) + 1 - # Zero filled number - dest = f'{prefix}{next_num:03d}' - - for p in src.rglob("*"): - # For S3 object storage folders are meaningless - if p.is_dir(): - continue - - # Ignore thsese - ignore_patterns = [ - "max*_*", - "schout_*_*.nc", - "hotstart_*_*.nc", - "local_to_global_*", - "nonfatal_*" - ] - if any(p.match(pat) for pat in ignore_patterns): - continue - - s3.upload_file( - str(p), RESULT_S3, f'{dest}/{p.relative_to(src)}') - -@task(name="Cleanup run directory after run") -def task_cleanup_run(run_tag): - # Remove the current run's directory - base = pathlib.Path('/efs/hurricanes/') - src = base / run_tag - shutil.rmtree(src) - - -task_cache_to_s3 = ShellTask( - name="Sync all cached files with static S3", - command='\n'.join([ - "mkdir -p /efs/cache", # To avoid error if no cache! - f"aws s3 sync /efs/cache s3://{STATIC_S3}/cache/" - ]) -) - -@task(name="Cleanup EFS after run") -def task_cleanup_efs(run_tag): - - base = pathlib.Path('/efs/hurricanes/') - - # If there are no other runs under hurricane cleanup EFS - if any(not i.match("./_") for i in base.glob("*")): - # This means there are other ongoing runs or failed runs that - # may need inspections, so don't cleanup EFS - raise SKIP("Other run directories exist in EFS, skip cleanup!") - - for p in pathlib.Path('/efs').glob("*"): - shutil.rmtree(p) diff --git a/prefect/workflow/tasks/infra.py b/prefect/workflow/tasks/infra.py deleted file mode 100644 index 404ec04..0000000 --- a/prefect/workflow/tasks/infra.py +++ /dev/null @@ -1,309 +0,0 @@ -import time -import json -import multiprocessing, time, signal - -import boto3 -import prefect -from prefect.tasks.shell import ShellTask -from prefect.tasks.aws.client_waiter import AWSClientWait -from prefect import task -from prefect.triggers import all_finished -from prefect.tasks.templates import StringFormatter -from prefect.engine.signals import SKIP -from prefect import resource_manager -from prefect.agent.ecs.agent import ECSAgent - -import pw_client -from conf import LOG_STDERR, PW_URL, WORKFLOW_TAG_NAME - -shell_list_cluster_instance_arns = " ".join([ - "aws ecs list-container-instances", - "--cluster {cluster}", - "--output json" -]) - -task_format_list_cluster_instance_arns = StringFormatter( - name="Format list cluster instance ARNs", - template=shell_list_cluster_instance_arns -) - -task_list_cluster_instance_arns = ShellTask( - name="List cluster instance arns", - return_all=True, # To get stdout as return value - log_stderr=LOG_STDERR, -) - - -# TODO: Check if instance is running (e.g. 
vs exist but STOPPED) -@task(name="Check if EC2 is needed") -def task_check_if_ec2_needed(rv_shell): - - aws_rv = json.loads("\n".join(rv_shell)) - ec2_arn_list = aws_rv.get('containerInstanceArns', []) - - is_needed = len(ec2_arn_list) == 0 - - return is_needed - -task_client_wait_for_ec2 = AWSClientWait( - name='Wait for ECS task', - client='ec2', - waiter_name='instance_status_ok' -) - -@task(name="Check for instance shutdown") -def task_check_cluster_shutdown(rv_shell): - - task_arns = json.loads("\n".join(rv_shell)) - can_shutdown = len(task_arns) == 0 - - return can_shutdown - -shell_term_instances = " ".join([ - "aws ec2 terminate-instances", - "--instance-ids {instance_id_list}" - ]) - -task_format_term_ec2 = StringFormatter( - name="Format term ec2 command", - template=shell_term_instances) - -task_term_instances = ShellTask( - name="Stop cluster instances" -) - - -shell_spinup_cluster_ec2 = " ".join([ - "aws ec2 run-instances", - "--launch-template LaunchTemplateId={template_id}", - "--query Instances[*].InstanceId", - "--output json", -]) -task_format_spinup_cluster_ec2 = StringFormatter( - name="Format EC2 spinup command", - template=shell_spinup_cluster_ec2, - ) - -task_spinup_cluster_ec2 = ShellTask( - name="EC2 for cluster", - return_all=True, # Multi line for list of tasks as json - log_stderr=LOG_STDERR, -) - -shell_list_cluster_tasks = " ".join([ - "aws ecs list-tasks", - "--cluster {cluster}", - "--query taskArns", - "--output json" -]) -task_format_list_cluster_tasks = StringFormatter( - name="Format list cluster tasks command", - template=shell_list_cluster_tasks) - -task_list_cluster_tasks = ShellTask( - name="List cluster tasks", - return_all=True, # To potentially multiline json string - log_stderr=LOG_STDERR, -) - -shell_list_cluster_instance_ids = " ".join([ - "aws ecs list-container-instances", - "--cluster {cluster}", - "--query containerInstanceArns", - "--output text", - "| xargs", - "aws ecs describe-container-instances", - "--cluster {cluster}", - "--query containerInstances[*].ec2InstanceId", - "--output text", - "--container-instances" # THIS MUST BE THE LAST ONE FOR XARGS -]) - -task_format_list_cluster_instance_ids = StringFormatter( - name="Format list cluster instance ids command", - template=shell_list_cluster_instance_ids -) - -task_list_cluster_instance_ids = ShellTask( - name="List cluster instance IDs", - return_all=False, # To get single line text output list - log_stderr=LOG_STDERR, -) - - -@task(name="Create EC2 instance with unique tag") -def task_create_ec2_w_tag(template_id, run_tag): - ec2_resource = boto3.resource('ec2') - ec2_client = boto3.client('ec2') - - ec2_instances = ec2_resource.create_instances( - LaunchTemplate={'LaunchTemplateId': template_id}, - MinCount=1, MaxCount=1 - ) - - instance_ids = [ - ec2_inst.instance_id for ec2_inst in ec2_instances] - - waiter = ec2_client.get_waiter('instance_exists') - waiter.wait(InstanceIds=instance_ids) - - ec2_resource.create_tags( - Resources=instance_ids, - Tags=[{'Key': WORKFLOW_TAG_NAME, 'Value': run_tag}] - ) - - return instance_ids - -@task(name="Destroy EC2 instance by unique tag") -def task_destroy_ec2_by_tag(run_tag): - - ec2_client = boto3.client('ec2') - - filter_by_run_tag = [{ - 'Name': f'tag:{WORKFLOW_TAG_NAME}', 'Values': [run_tag]}] - - response = ec2_client.describe_instances(Filters=filter_by_run_tag) - instance_ids = [ - instance['InstanceId'] - for rsv in response.get('Reservations', []) - for instance in rsv.get('Instances', []) - ] - - - if len(instance_ids) == 0: - 
raise SKIP( - message="Could NOT find any instances tagged for this run") - - response = ec2_client.terminate_instances( - InstanceIds=instance_ids) - - -@task(name="Add run tag attribute to ECS instance") -def task_add_ecs_attribute_for_ec2(ec2_instance_ids, cluster, run_tag): - - ecs_client = boto3.client('ecs') - response = ecs_client.list_container_instances(cluster=cluster) - all_ecs_instance_arns = response['containerInstanceArns'] - if len(all_ecs_instance_arns) == 0: - raise FAIL( - message=f"Could NOT find any instances associated with cluster {cluster}") - - response = ecs_client.describe_container_instances( - cluster=cluster, - containerInstances=all_ecs_instance_arns) - - ecs_instance_arns = [] - for container_instance_info in response['containerInstances']: - ec2_instance_id = container_instance_info['ec2InstanceId'] - if ec2_instance_id in ec2_instance_ids: - ecs_instance_arns.append(container_instance_info['containerInstanceArn']) - break - - for inst_arn in ecs_instance_arns: - ecs_client.put_attributes( - cluster=cluster, - attributes=[ - { - 'name': 'run-tag', - 'value': run_tag, - 'targetType': 'container-instance', - 'targetId': inst_arn - }, - ] - ) - -@resource_manager -class ContainerInstance: - def __init__(self, run_tag, template_id): - self.tag = run_tag - self.template = template_id - - def setup(self): - "Create container instances for the run specified by tag" - - ec2_client = boto3.client('ec2') - - ec2_instance_ids = task_create_ec2_w_tag.run(self.template, self.tag) - - waiter = ec2_client.get_waiter('instance_status_ok') - waiter.wait(InstanceIds=ec2_instance_ids) - - response = ec2_client.describe_instances(InstanceIds=ec2_instance_ids) - instance_ips = [ - instance['PublicIpAddress'] - for rsv in response.get('Reservations', []) - for instance in rsv.get('Instances', []) - ] - - logger = prefect.context.get("logger") - logger.info(f"EC2 public IPs: {','.join(instance_ips)}") - - return ec2_instance_ids - - def cleanup(self, ec2_instance_ids): - "Shutdown the container instance" - - # NOTE: We destroy by tag - task_destroy_ec2_by_tag.run(self.tag) - - -@task(name="Start RDHPCS cluster") -def task_start_rdhpcs_cluster(api_key, cluster_name): - c = pw_client.Client(PW_URL, api_key) - - # check if resource exists and is on - cluster = c.get_resource(cluster_name) - if cluster: - if cluster['status'] == "on": - return - - # if resource not on, start it - time.sleep(0.2) - c.start_resource(cluster_name) - - else: - raise ValueError("Cluster name could not be found!") - - while True: - time.sleep(10) - - current_state = c.get_resources() - - for cluster in current_state: - if cluster['name'] != cluster_name: - continue - - if cluster['status'] != 'on': - continue - - state = cluster['state'] - - if 'masterNode' not in cluster['state']: - continue - - if cluster['state']['masterNode'] == None: - continue - - ip = cluster['state']['masterNode'] - return ip - - -@task(name="Stop RDHPCS cluster", trigger=all_finished) -def task_stop_rdhpcs_cluster(api_key, cluster_name): - - c = pw_client.Client(PW_URL, api_key) - - # check if resource exists and is on - cluster = c.get_resource(cluster_name) - if cluster: - if cluster['status'] == "off": - return - - # TODO: Check if another job is running on the cluster - - # if resource not on, start it - time.sleep(0.2) - c.stop_resource(cluster_name) - - else: - raise ValueError("Cluster name could not be found!") diff --git a/prefect/workflow/tasks/jobs.py b/prefect/workflow/tasks/jobs.py deleted file mode 100644 index 
28c6855..0000000 --- a/prefect/workflow/tasks/jobs.py +++ /dev/null @@ -1,276 +0,0 @@ -import subprocess -import time -import json -from functools import partial - -import boto3 -import prefect -from prefect import task -from prefect.tasks.shell import ShellTask -from prefect.tasks.templates import StringFormatter -from prefect.tasks.aws.client_waiter import AWSClientWait -from prefect.triggers import any_failed, all_finished -from prefect.engine.signals import FAIL - -import pw_client -from conf import LOG_STDERR, PW_URL, WORKFLOW_TAG_NAME, log_group_name - - - -shell_run_task = " ".join([ - "aws ecs start-task", - "--cluster {cluster}", - "--task-definition {name_ecs_task}", - "--overrides '{overrides}'", - "--query tasks[*].taskArn", - "--output json", - "--container-instances {instance_ids}" -# "--count 5" # TEST: run and stop multiple tasks - ]) - -@task(name="Prepare run command") -def task_format_start_task(template, **kwargs): - aux = {} - cluster = kwargs['cluster'] - env_list = kwargs.get('env', []) - if len(env_list) > 0: - aux['environment'] = env_list - - run_tag = kwargs.pop("run_tag") - if run_tag is None: - raise FAIL(message="Run tag is NOT provided for the task!") - - overrides_storm = json.dumps({ - "containerOverrides": [ - { - "name": f'{kwargs["name_docker"]}', - "command": [f'{seg}' for seg in kwargs['docker_cmd'] if seg is not None], - **aux - } - ], - }) - - ec2_client = boto3.client('ec2') - filter_by_run_tag = [{ - 'Name': f'tag:{WORKFLOW_TAG_NAME}', 'Values': [run_tag]}] - response = ec2_client.describe_instances(Filters=filter_by_run_tag) - ec2_instance_ids = [ - instance['InstanceId'] - for rsv in response.get('Reservations', []) - for instance in rsv.get('Instances', []) - ] - - if len(ec2_instance_ids) == 0: - raise FAIL( - message="Could NOT find any EC2 instances tagged for this run") - - ecs_client = boto3.client('ecs') - response = ecs_client.list_container_instances(cluster=cluster) - all_ecs_instance_arns = response['containerInstanceArns'] - if len(all_ecs_instance_arns) == 0: - raise FAIL( - message=f"Could NOT find any instances associated with cluster {cluster}") - - response = ecs_client.describe_container_instances( - cluster=cluster, - containerInstances=all_ecs_instance_arns) - - ecs_instance_arns = [] - for container_instance_info in response['containerInstances']: - ec2_instance_id = container_instance_info['ec2InstanceId'] - if ec2_instance_id in ec2_instance_ids: - ecs_instance_arns.append(container_instance_info['containerInstanceArn']) - break - - if len(ecs_instance_arns) == 0: - raise FAIL( - message="Could NOT find any container instances tagged for this run") - - formatted_cmd = template.format( - overrides=overrides_storm, - instance_ids=" ".join(ecs_instance_arns), - **kwargs) - - return formatted_cmd - -task_start_ecs_task = ShellTask( - name='Run ECS task', - return_all=True, # Need json list - log_stderr=LOG_STDERR, -) - -# NOTE: We cannot use CLI aws ecs wait because it timesout after -# 100 attempts made 6 secs apart. 
-task_client_wait_for_ecs = AWSClientWait( - name='Wait for ECS task', - client='ecs', - waiter_name='tasks_stopped' -) - - -@task(name="Retrieve task logs", trigger=all_finished) -def task_retrieve_task_docker_logs(log_prefix, container_name, tasks): - - logger = prefect.context.get("logger") - logs = boto3.client('logs') - ecs = boto3.client('ecs') - - task_ids = [t.split('/')[-1] for t in tasks] - - get_events = partial( - logs.filter_log_events, - logGroupName=log_group_name, - logStreamNames=[ - f"{log_prefix}/{container_name}/{task_id}" - for task_id in task_ids - ], - interleaved=True - ) - - response = get_events() - events = response['events'] - for e in events: - logger.info(e['message']) - - while len(events) > 0: - response = get_events(nextToken=response['nextToken']) - events = response['events'] - for e in events: - logger.info(e['message']) - - - -shell_kill_timedout = " ".join([ - "aws ecs stop-task", - "--cluster {cluster}", - "--reason \"Timed out\"", - "--task {task}" - ]) -task_format_kill_timedout = StringFormatter( - name="Prepare kill command", - template=shell_kill_timedout) -task_kill_task_if_wait_fails = ShellTask( - name='Kill timed-out tasks', - return_all=True, - log_stderr=LOG_STDERR, - trigger=any_failed -) - -@task(name="Check docker success") -def task_check_docker_success(tasks, cluster_name): - ecs = boto3.client('ecs') - response = ecs.describe_tasks(cluster=cluster_name, tasks=tasks) - logger = prefect.context.get("logger") - exit_codes = [] - try: - for task in response['tasks']: - for container in task['containers']: - try: - exit_codes.append(container['exitCode']) - except KeyError: - logger.error(container['reason']) - raise FAIL(message="A task description doesn't have exit code!") - except KeyError: - logger.error(response) - raise FAIL(message="ECS task decription cannot be parsed!") - - if any(int(rv) != 0 for rv in exit_codes): - raise FAIL(message="Docker returned non-zero code!") - - -# Using workflow-json on RDHPCS-C -@task(name="Run RDHPCS job") -def task_run_rdhpcs_job(api_key, workflow_name, **workflow_inputs): - c = pw_client.Client(PW_URL, api_key) - - # get the account username - account = c.get_account() - - user = account['info']['username'] - - job_id, decod_job_id = c.start_job(workflow_name, workflow_inputs, user) - return decod_job_id - - -@task(name="Wait for RDHPCS job") -def task_wait_rdhpcs_job(api_key, decod_job_id): - - c = pw_client.Client(PW_URL, api_key) - while True: - time.sleep(5) - try: - state = c.get_job_state(decod_job_id) - except: - state = "starting" - - if state == 'ok': - break - elif (state == 'deleted' or state == 'error'): - raise Exception('Simulation had an error. 
Please try again') - - -@task(name="Prepare Slurm script to submit the batch job") -def task_format_mesh_slurm(storm_name, storm_year, kwds): - return " ".join( - ["sbatch", - ",".join([ - "--export=ALL", - f"KWDS=\"{' '.join(str(i) for i in kwds if i is not None)}\"", - f"STORM=\"{storm_name}\"", - f"YEAR=\"{storm_year}\"", - ]), - "~/mesh.sbatch"] - ) - - -task_submit_slurm = ShellTask( - name="Submit batch job on meshing cluster", - return_all=False, # Need single line reult for job ID extraction - log_stderr=LOG_STDERR, -) - - -@task(name="Wait for slurm job") -def task_wait_slurm_done(job_id): - - logger = prefect.context.get("logger") - logger.info(f"Waiting for job with ID: {job_id}") - while True: - time.sleep(10) - - result = subprocess.run( - ["sacct", "--format=State", - "--parsable2", f"--jobs={job_id}"], - capture_output=True, - text=True) - - # A single job can have sub-jobs (e.g. srun calls) - stdout = result.stdout - stderr = result.stderr - # Skip header ("State") - status = stdout.strip().split('\n')[1:] - - # TODO: Add timeout? - if any(st in ('RUNNING', 'PENDING', 'NODE_FAIL') for st in status): - # TODO: Any special handling for node failure? - continue - - # If it's not running or pending we can safely look at finalized - # log, whether it's a failure or finished without errors - logger.info('Fetching SLURM logs...') - with open(f'slurm-{job_id}.out') as slurm_log: - logger.info(''.join(slurm_log.readlines())) - - if all(st == 'COMPLETED' for st in status): - break - - raise RuntimeError(f"Slurm job failed with status {status}") - -task_format_schism_slurm = StringFormatter( - name="Prepare Slurm script to submit the batch job", - template=" ".join( - ["sbatch", - "--export=ALL,STORM_PATH=\"{run_path}\",SCHISM_EXEC=\"{schism_exec}\"", - "~/schism.sbatch"] - ) -) diff --git a/prefect/workflow/tasks/params.py b/prefect/workflow/tasks/params.py deleted file mode 100644 index a42d103..0000000 --- a/prefect/workflow/tasks/params.py +++ /dev/null @@ -1,25 +0,0 @@ -from prefect import Parameter - -# Define parameters -param_storm_name = Parameter('name') -param_storm_year = Parameter('year') -param_use_rdhpcs = Parameter('rdhpcs', default=False) -param_use_rdhpcs_post = Parameter('rdhpcs_post', default=False) -param_use_parametric_wind = Parameter('parametric_wind', default=False) -param_run_id = Parameter('run_id') -param_schism_dir = Parameter('schism_dir') -param_schism_exec = Parameter('schism_exec') -param_subset_mesh = Parameter('subset_mesh', default=False) -param_past_forecast = Parameter('past_forecast', default=False) -param_hr_prelandfall = Parameter('hr_before_landfall', default=-1) -param_wind_coupling = Parameter('couple_wind', default=False) -param_ensemble = Parameter('ensemble', default=False) -param_ensemble_n_perturb = Parameter('ensemble_num_perturbations', default=40) -param_ensemble_sample_rule = Parameter('ensemble_sample_rule', default='korobov') - -param_mesh_hmax = Parameter('mesh_hmax', default=20000) -param_mesh_hmin_low = Parameter('mesh_hmin_low', default=1500) -param_mesh_rate_low = Parameter('mesh_rate_low', default=2e-3) -param_mesh_trans_elev = Parameter('mesh_cutoff', default=-200) -param_mesh_hmin_high = Parameter('mesh_hmin_high', default=300) -param_mesh_rate_high = Parameter('mesh_rate_high', default=1e-3) diff --git a/prefect/workflow/tasks/utils.py b/prefect/workflow/tasks/utils.py deleted file mode 100644 index 77ab16d..0000000 --- a/prefect/workflow/tasks/utils.py +++ /dev/null @@ -1,105 +0,0 @@ -import fcntl -import json -from 
dataclasses import dataclass -from functools import partial -from typing import List -from pathlib import Path - -import prefect -from prefect import task -from prefect import resource_manager - - -@task(name="List from JSON") -def task_pylist_from_jsonlist(json_lines): - return json.loads("\n".join(json_lines)) - - -@task(name="Check parameter is true") -def task_check_param_true(param): - return param in [True, 1, 'True', 'true', '1'] - -@task(name="Return flag if boolean parameter is true") -def task_return_value_if_param_true(param, value): - if param in [True, 1, 'True', 'true', '1']: - return value - return None - -@task(name="Return flag if boolean parameter is false") -def task_return_value_if_param_false(param, value): - if param in [True, 1, 'True', 'true', '1']: - return None - return value - -@task(name="Return flag if boolean parameter is true") -def task_return_this_if_param_true_else_that(param, this, that): - if param in [True, 1, 'True', 'true', '1']: - return this - return that - - -@task(name="Create param dict") -def task_bundle_params(existing_bundle=None, **kwargs): - par_dict = kwargs - if isinstance(existing_bundle, dict): - par_dict = existing_bundle.copy() - par_dict.update(kwargs) - return par_dict - -@task(name="Get run ID") -def task_get_flow_run_id(): - return prefect.context.get('flow_run_id') - - -@task(name="Get run tag") -def task_get_run_tag(storm_name, storm_year, run_id): - return f'{storm_name}_{storm_year}_{run_id}' - -@task(name="Add tag prefix to localpath") -def task_add_tag_path_prefix(storm_name, storm_year, run_id, local_path): - return f'{storm_name}_{storm_year}_{run_id}' / localpath - -@task(name="Replace tag in template") -def task_replace_tag_in_template(storm_name, storm_year, run_id, template_str): - return template_str.format(tag=f'{storm_name}_{storm_year}_{run_id}') - - -@task(name="Convert string to path object") -def task_convert_str_to_path(string): - return Path(string) - - -@task(name="Info printing") -def task_print_info(object_to_print): - logger = prefect.context.get("logger") - logger.info("*****************") - logger.info(object_to_print) - logger.info("*****************") - -@resource_manager(name="File mutex") -class FLock: - def __init__(self, path): - self.path = path - - def setup(self): - "Create a locked file in the specified address and returns file object" - file_obj = open(self.path, 'w') - fcntl.flock(file_obj.fileno(), fcntl.LOCK_EX) - return file_obj - - def cleanup(self, file_obj): - "Removes the lock" - fcntl.flock(file_obj.fileno(), fcntl.LOCK_UN) - file_obj.close() - -@dataclass(frozen=True) -class ECSTaskDetail: - name_ecs_cluster: str - id_ec2_template: str - name_ecs_task: str - name_docker: str - docker_args: List - description: str - wait_delay: float - wait_max_attempt: int - env_secrets: List diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..8a8a1e7 --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,97 @@ +[build-system] +requires = ["setuptools>=64", "setuptools_scm>=8"] +build-backend = "setuptools.build_meta" + +[project] +name = "stormworkflow" +dynamic = ["version"] + +authors = [ + {name = "Soroosh Mani", email = "soroosh.mani@noaa.gov"}, + {name = "William Pringle", email = "wpringle@anl.gov"}, + {name = "Fariborz Daneshvar", email = "fariborz.daneshvar@noaa.gov"}, +] +maintainers = [ + {name = "Soroosh Mani", email = "soroosh.mani@noaa.gov"} +] + +readme = {file = "README.txt", content-type = "text/markdown"} + +description = "A set of scripts to generate 
probabilistic storm surge results!" + +license = {file = "LICENSE"} + +requires-python = ">= 3.8, < 3.11" + +dependencies = [ + "cartopy", + "cf-python", + "cfdm", + "cfgrib", + "cfunits", + "chaospy>=4.2.7", + "coupledmodeldriver>=1.6.6", + "colored-traceback", + "cmocean", + "dask", + "dask-jobqueue", + "ensembleperturbation>=1.1.2", + "fiona", + "geoalchemy2", + "geopandas==0.14", # to address AttributeError. Should be fixed later in EnsemblePerturbation +# "geopandas", + "matplotlib", + "mpi4py", + "netCDF4", + "numpy", + "numba", + "ocsmesh==1.5.3", + "pandas", + "pyarrow", + "pygeos", + "pyproj", + "pyschism>=0.1.15", + "pytz", + "pyyaml", + "shapely>=2", + "stormevents==2.2.4", + "rasterio", + "requests", + "rtree", + "scipy", + "seawater", + "typing-extensions", + "tqdm", + "utm", + "xarray==2023.7.0", +] + +[tool.setuptools_scm] +version_file = "stormworkflow/_version.py" + +[tool.setuptools] +include-package-data = true + +[tool.setuptools.packages.find] +namespaces = true +where = ["."] + +[tool.setuptools.package-data] +"stormworkflow.slurm" = ["*.sbatch"] +"stormworkflow.scripts" = ["*.sh", "*.exp"] +"stormworkflow.refs" = ["*.nml", "*.yaml"] + +[project.urls] +#Homepage = "https://example.com" +#Documentation = "https://readthedocs.org" +Repository = "https://github.com/oceanmodeling/ondemand-storm-workflow.git" + +[project.scripts] +run_ensemble = "stormworkflow.main:main" +hurricane_data = "stormworkflow.prep.hurricane_data:cli" +hurricane_mesh = "stormworkflow.prep.hurricane_mesh:cli" +download_data = "stormworkflow.prep.download_data:cli" +setup_ensemble = "stormworkflow.prep.setup_ensemble:cli" +combine_ensemble = "stormworkflow.post.combine_ensemble:cli" +analyze_ensemble = "stormworkflow.post.analyze_ensemble:cli" +storm_roc_curve = "stormworkflow.post.ROC_single_run:cli" diff --git a/rdhpcs/clusters/mesh_cluster.json b/rdhpcs/clusters/mesh_cluster.json deleted file mode 100644 index 715c3e5..0000000 --- a/rdhpcs/clusters/mesh_cluster.json +++ /dev/null @@ -1,24 +0,0 @@ -{ -"availability_zone":"us-east-1b", -"controller_image":"latest", -"controller_net_type":false, -"export_fs_type":"xfs", -"image_disk_count":1, -"image_disk_name":"snap-04f8963f5d94148b6", -"image_disk_size_gb":250, -"management_shape":"c5n.2xlarge", -"partition_config":[ - { - "availability_zone":"us-east-1b", - "default":"YES", - "elastic_image":"latest", - "enable_spot":false, - "instance_type":"r5n.16xlarge", - "max_node_num":102, - "name":"compute", - "net_type":false, - "architecture":"amd64" - } -], -"region":"us-east-1" -} diff --git a/rdhpcs/clusters/mesh_init.sh b/rdhpcs/clusters/mesh_init.sh deleted file mode 100644 index 3278bda..0000000 --- a/rdhpcs/clusters/mesh_init.sh +++ /dev/null @@ -1,44 +0,0 @@ -DISOWN - -# We want the disowned script only on head node -if [ "$(hostname | grep -o mgmt)" != "mgmt" ]; then - exit -fi - -export PATH=$PATH:/usr/local/bin - -sudo yum update -y && sudo yum upgrade -y - -# TODO: Use lustre instead of home -sudo yum install -y tmux -cp -v /contrib/Soroosh.Mani/configs/.vimrc ~ -cp -v /contrib/Soroosh.Mani/configs/.tmux.conf ~ - -cd ~ -cp -v /contrib/Soroosh.Mani/scripts/hurricane_mesh.py ~ -cp -v /contrib/Soroosh.Mani/scripts/mesh.sbatch ~ - -cp -v /contrib/Soroosh.Mani/pkgs/odssm-mesh.tar.gz . -mkdir odssm-mesh -pushd odssm-mesh -tar -xf ../odssm-mesh.tar.gz -rm -rf ../odssm-mesh.tar.gz -bin/conda-unpack -popd - -cp -v /contrib/Soroosh.Mani/pkgs/odssm-prefect.tar.gz . 
-mkdir odssm-prefect -pushd odssm-prefect -tar -xf ../odssm-prefect.tar.gz -rm -rf ../odssm-prefect.tar.gz -bin/conda-unpack -popd - -aws s3 sync s3://noaa-nos-none-ca-hsofs-c/Soroosh.Mani/dem /lustre/dem -aws s3 sync s3://noaa-nos-none-ca-hsofs-c/Soroosh.Mani/shape /lustre/shape -aws s3 sync s3://noaa-nos-none-ca-hsofs-c/Soroosh.Mani/grid /lustre/grid -date > ~/_initialized_ - -# This is executed only for head (ALLNODES not specified at the top) -source odssm-prefect/bin/activate -prefect agent local start --key `cat /contrib/Soroosh.Mani/secrets/prefect.key` --label tacc-odssm-rdhpcs-mesh-cluster --name tacc-odssm-agent-rdhpcs-mesh-cluster --log-level INFO diff --git a/rdhpcs/clusters/mesh_lustre.json b/rdhpcs/clusters/mesh_lustre.json deleted file mode 100644 index 29f38ac..0000000 --- a/rdhpcs/clusters/mesh_lustre.json +++ /dev/null @@ -1,5 +0,0 @@ -{ -"fsxcompression":"LZ4", -"fsxdeployment":"SCRATCH_2", -"storage_capacity":"1200" -} diff --git a/rdhpcs/clusters/schism_cluster.json b/rdhpcs/clusters/schism_cluster.json deleted file mode 100644 index c4f2388..0000000 --- a/rdhpcs/clusters/schism_cluster.json +++ /dev/null @@ -1,23 +0,0 @@ -{ -"availability_zone":"us-east-1b", -"controller_image":"latest", -"controller_net_type":false, -"export_fs_type":"xfs", -"image_disk_count":1, -"image_disk_name":"snap-04f8963f5d94148b6", -"image_disk_size_gb":250, -"management_shape":"c5n.2xlarge", -"partition_config":[ - { - "availability_zone":"us-east-1b", - "default":"YES", - "elastic_image":"latest", - "enable_spot":false, - "instance_type":"c5n.18xlarge", - "max_node_num":102, - "name":"compute", - "net_type":true - } -], -"region":"us-east-1" -} diff --git a/rdhpcs/clusters/schism_init.sh b/rdhpcs/clusters/schism_init.sh deleted file mode 100644 index 6d255a2..0000000 --- a/rdhpcs/clusters/schism_init.sh +++ /dev/null @@ -1,39 +0,0 @@ -DISOWN - -# We want the disowned script only on head node -if [ "$(hostname | grep -o mgmt)" != "mgmt" ]; then - exit -fi - -export PATH=$PATH:/usr/local/bin - -sudo yum update -y && sudo yum upgrade -y - -# TODO: Use lustre instead of home -sudo yum install -y tmux -cp -v /contrib/Soroosh.Mani/configs/.vimrc ~ -cp -v /contrib/Soroosh.Mani/configs/.tmux.conf ~ - -cd ~ -cp -v /contrib/Soroosh.Mani/scripts/schism.sbatch ~ -cp -v /contrib/Soroosh.Mani/scripts/combine_gr3.exp ~ - -cp -L -r /contrib/Soroosh.Mani/pkgs/schism . - -echo "export PATH=\$HOME/schism/bin/:\$PATH" >> ~/.bash_profile -echo "export PATH=\$HOME/schism/bin/:\$PATH" >> ~/.bashrc - -cp -v /contrib/Soroosh.Mani/pkgs/odssm-prefect.tar.gz . -mkdir odssm-prefect -pushd odssm-prefect -tar -xf ../odssm-prefect.tar.gz -rm -rf ../odssm-prefect.tar.gz -bin/conda-unpack -popd - -# No static files is needed for run! 
-date > ~/_initialized_ - -# This is executed only for head (ALLNODES not specified at the top) -source odssm-prefect/bin/activate -prefect agent local start --key `cat /contrib/Soroosh.Mani/secrets/prefect.key` --label tacc-odssm-rdhpcs-schism-cluster --name tacc-odssm-agent-rdhpcs-schism-cluster --log-level INFO diff --git a/rdhpcs/clusters/schism_lustre.json b/rdhpcs/clusters/schism_lustre.json deleted file mode 100644 index 29f38ac..0000000 --- a/rdhpcs/clusters/schism_lustre.json +++ /dev/null @@ -1,5 +0,0 @@ -{ -"fsxcompression":"LZ4", -"fsxdeployment":"SCRATCH_2", -"storage_capacity":"1200" -} diff --git a/rdhpcs/scripts/combine_gr3.exp b/rdhpcs/scripts/combine_gr3.exp deleted file mode 120000 index 01ba28b..0000000 --- a/rdhpcs/scripts/combine_gr3.exp +++ /dev/null @@ -1 +0,0 @@ -../../docker/schism/docker/combine_gr3.exp \ No newline at end of file diff --git a/rdhpcs/scripts/compile_schism.sh b/rdhpcs/scripts/compile_schism.sh deleted file mode 100755 index a8cc8ee..0000000 --- a/rdhpcs/scripts/compile_schism.sh +++ /dev/null @@ -1,110 +0,0 @@ -#!/bin/bash - -## helper script for compiling schism -## moghimis@gmail.com - -prev_dir=$PWD - -commit=0741120 - -pkg_dir='/contrib/Soroosh.Mani/pkgs' -src_dir='/tmp/schism/sandbox' -install_dir="$pkg_dir/schism.$commit" -link_path="$pkg_dir/schism" - -function _compile { - # Download schism - git clone https://github.com/schism-dev/schism.git $src_dir - - ## Based on Zizang's email - module purge - - module load cmake - module load intel/2021.3.0 - module load impi/2021.3.0 - module load hdf5/1.10.6 - module load netcdf/4.7.0 - - - #for cmake - export CMAKE_Fortran_COMPILER=mpiifort - export CMAKE_CXX_COMPILER=mpiicc - export CMAKE_C_COMPILER=mpiicc - export FC=ifort - export MPI_HEADER_PATH='/apps/oneapi/mpi/2021.3.0' - # - - export NETCDF='/apps/netcdf/4.7.0/intel/18.0.5.274' - - export NetCDF_C_DIR=$NETCDF - export NetCDF_INCLUDE_DIR=$NETCDF"/include" - export NetCDF_LIBRARIES=$NETCDF"/lib" - export NetCDF_FORTRAN_DIR=$NETCDF - - export TVD_LIM=VL - # - cd ${src_dir} - git checkout $commit - - #clean cmake build folder - rm -rf build_mpiifort - mkdir build_mpiifort - - #cmake - cd build_mpiifort - cmake ../src \ - -DCMAKE_Fortran_COMPILER=$CMAKE_Fortran_COMPILER \ - -DCMAKE_CXX_COMPILER=$CMAKE_CXX_COMPILER \ - -DCMAKE_C_COMPILER=$CMAKE_C_COMPILER \ - -DMPI_HEADER_PATH=$MPI_HEADER_PATH \ - -DNetCDF_C_DIR=$NetCDF_C_DIR \ - -DNetCDF_INCLUDE_DIR=$NetCDF_INCLUDE_DIR \ - -DNetCDF_LIBRARIES=$NetCDF_LIBRARIES \ - -DNetCDF_FORTRAN_DIR=$NetCDF_FORTRAN_DIR \ - -DTVD_LIM=$TVD_LIM \ - -DUSE_PAHM=TRUE \ - -DCMAKE_C_FLAGS="-no-multibyte-chars" \ - -DCMAKE_CXX_FLAGS="-no-multibyte-chars" - - #gnu make - make -j 6 - - mkdir -p $install_dir - cp -L -r bin/ $install_dir - - rm -rf * - cmake ../src \ - -DCMAKE_Fortran_COMPILER=$CMAKE_Fortran_COMPILER \ - -DCMAKE_CXX_COMPILER=$CMAKE_CXX_COMPILER \ - -DCMAKE_C_COMPILER=$CMAKE_C_COMPILER \ - -DMPI_HEADER_PATH=$MPI_HEADER_PATH \ - -DNetCDF_C_DIR=$NetCDF_C_DIR \ - -DNetCDF_INCLUDE_DIR=$NetCDF_INCLUDE_DIR \ - -DNetCDF_LIBRARIES=$NetCDF_LIBRARIES \ - -DNetCDF_FORTRAN_DIR=$NetCDF_FORTRAN_DIR \ - -DTVD_LIM=$TVD_LIM \ - -DUSE_PAHM=TRUE \ - -DUSE_WWM=TRUE \ - -DCMAKE_C_FLAGS="-no-multibyte-chars" \ - -DCMAKE_CXX_FLAGS="-no-multibyte-chars" - - #gnu make - make -j 6 - - cp -L -r bin/ $install_dir - - if [ -f $link_path ]; then - rm $link_path - fi - ln -sf $install_dir $link_path - - rm -rf $src_dir - - cd $prev_dir -} - -if [ -d "$install_dir/bin" ]; then - echo "SCHISM commit $commit is alread compiled!" 
-else - _compile -fi diff --git a/rdhpcs/scripts/hurricane_mesh.py b/rdhpcs/scripts/hurricane_mesh.py deleted file mode 100755 index 099a9cf..0000000 --- a/rdhpcs/scripts/hurricane_mesh.py +++ /dev/null @@ -1,555 +0,0 @@ -#!/usr/bin/env python - -# Import modules -import logging -import os -import pathlib -import argparse -import sys -import warnings - -import numpy as np - -from fiona.drvsupport import supported_drivers -from shapely.geometry import box, MultiLineString -from shapely.ops import polygonize, unary_union, linemerge -from pyproj import CRS, Transformer -import geopandas as gpd - -from ocsmesh import Raster, Geom, Hfun, JigsawDriver, Mesh, utils -from ocsmesh.cli.subset_n_combine import SubsetAndCombine - - -# Setup modules -# Enable KML driver -#from https://stackoverflow.com/questions/72960340/attributeerror-nonetype-object-has-no-attribute-drvsupport-when-using-fiona -supported_drivers['KML'] = 'rw' -supported_drivers['LIBKML'] = 'rw' - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) -logging.basicConfig( - stream=sys.stdout, - format='%(asctime)s,%(msecs)d %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', - datefmt='%Y-%m-%d:%H:%M:%S') - - -# Helper functions -def get_raster(path, crs=None): - rast = Raster(path) - if crs and rast.crs != crs: - rast.warp(crs) - return rast - - -def get_rasters(paths, crs=None): - rast_list = list() - for p in paths: - rast_list.append(get_raster(p, crs)) - return rast_list - - -def _generate_mesh_boundary_and_write( - out_dir, mesh_path, mesh_crs='EPSG:4326', threshold=-1000 - ): - - mesh = Mesh.open(str(mesh_path), crs=mesh_crs) - - logger.info('Calculating boundary types...') - mesh.boundaries.auto_generate(threshold=threshold) - - logger.info('Write interpolated mesh to disk...') - mesh.write( - str(out_dir/f'mesh_w_bdry.grd'), format='grd', overwrite=True - ) - - -def _write_mesh_box(out_dir, mesh_path, mesh_crs='EPSG:4326'): - mesh = Mesh.open(str(mesh_path), crs=mesh_crs) - domain_box = box(*mesh.get_multipolygon().bounds) - gdf_domain_box = gpd.GeoDataFrame( - geometry=[domain_box], crs=mesh.crs) - gdf_domain_box.to_file(out_dir/'domain_box') - - -# Main script -def main(args, clients): - - cmd = args.cmd - logger.info(f"The mesh command is {cmd}.") - - clients_dict = {c.script_name: c for c in clients} - - io = pathlib.Path('/lustre') - - storm_name = str(args.name).lower() - storm_year = str(args.year).lower() - tag = args.tag - if tag is None: - tag = f'{storm_name.lower()}_{storm_year}' - - logger.info(f"The simulation tag is {tag}.") - - dem_dir = pathlib.Path(io / 'dem') - out_dir = io / 'hurricanes' / tag / 'mesh' - out_dir.mkdir(exist_ok=True, parents=True) - - final_mesh_name = 'hgrid.gr3' - write_mesh_box = False - - - - if cmd == 'subset_n_combine': - final_mesh_name = 'final_mesh.2dm' - write_mesh_box = True - - args.rasters = [i for i in (dem_dir / 'gebco').iterdir() if i.suffix == '.tif'] - args.out = out_dir - args.fine_mesh = io / 'grid' / 'HSOFS_250m_v1.0_fixed.14' - args.coarse_mesh = io / 'grid' / 'WNAT_1km.14' - args.region_of_interset = io / 'hurricanes' / tag / 'windswath' - - elif cmd == 'hurricane_mesh': - final_mesh_name = 'mesh_no_bdry.2dm' - - if cmd in clients_dict: - clients_dict[cmd].run(args) - else: - raise ValueError(f'Invalid meshing command specified: <{cmd}>') - - #TODO interpolate DEM? 
- if write_mesh_box: - _write_mesh_box(out_dir, out_dir / final_mesh_name) - _generate_mesh_boundary_and_write(out_dir, out_dir / final_mesh_name) - - -class HurricaneMesher: - - @property - def script_name(self): - return 'hurricane_mesh' - - def __init__(self, sub_parser): - - this_parser = sub_parser.add_parser(self.script_name) - - this_parser.add_argument( - "--nprocs", type=int, help="Number of parallel threads to use when " - "computing geom and hfun.") - - this_parser.add_argument( - "--geom-nprocs", type=int, help="Number of processors used when " - "computing the geom, overrides --nprocs argument.") - - this_parser.add_argument( - "--hfun-nprocs", type=int, help="Number of processors used when " - "computing the hfun, overrides --nprocs argument.") - - this_parser.add_argument( - "--hmax", type=float, help="Maximum mesh size.", - default=20000) - - this_parser.add_argument( - "--hmin-low", type=float, default=1500, - help="Minimum mesh size for low resolution region.") - - this_parser.add_argument( - "--rate-low", type=float, default=2e-3, - help="Expansion rate for low resolution region.") - - this_parser.add_argument( - "--contours", type=float, nargs=2, - help="Contour specification applied to whole domain; " - "contour mesh size needs to be greater that hmin-low", - metavar="SPEC") - - this_parser.add_argument( - "--transition-elev", "-e", type=float, default=-200, - help="Cut off elev for high resolution region") - - this_parser.add_argument( - "--hmin-high", type=float, default=300, - help="Minimum mesh size for high resolution region.") - - this_parser.add_argument( - "--rate-high", type=float, default=1e-3, - help="Expansion rate for high resolution region") - - - def run(self, args): - - nprocs = args.nprocs - - geom_nprocs = nprocs - if args.geom_nprocs: - nprocs = args.geom_nprocs - geom_nprocs = -1 if nprocs == None else nprocs - - hfun_nprocs = nprocs - if args.hfun_nprocs: - nprocs = args.hfun_nprocs - hfun_nprocs = -1 if nprocs == None else nprocs - - io = pathlib.Path('/lustre') - - storm_name = str(args.name).lower() - storm_year = str(args.year).lower() - tag = args.tag - if tag is None: - tag = f'{storm_name.lower()}_{storm_year}' - - dem_dir = pathlib.Path(io / 'dem') - shp_dir = pathlib.Path(io / 'shape') - hurr_info = pathlib.Path( - io / 'hurricanes' / tag / 'windswath') - out_dir = pathlib.Path( - io / 'hurricanes' / tag / 'mesh') - out_dir.mkdir(exist_ok=True, parents=True) - - coarse_geom = shp_dir / 'base_geom' - fine_geom = shp_dir / 'high_geom' - - gebco_paths = [i for i in (dem_dir / 'gebco').iterdir() if str(i).endswith('.tif')] - cudem_paths = [i for i in (dem_dir / 'ncei19').iterdir() if str(i).endswith('.tif')] - all_dem_paths = [*gebco_paths, *cudem_paths] - tile_idx_path = f'zip://{str(dem_dir)}/tileindex_NCEI_ninth_Topobathy_2014.zip' - - - # Specs - wind_kt = 34 - filter_factor = 3 - max_n_hires_dem = 150 - - - # Geom (hardcoded based on prepared hurricane meshing spec) - z_max_lo = 0 - z_max_hi = 10 - z_max = max(z_max_lo, z_max_hi) - - # Hfun - hmax = args.hmax - - hmin_lo = args.hmin_low - rate_lo = args.rate_low - - contour_specs_lo = [] - if args.contours is not None: - for c_elev, m_size in args.contours: - if hmin_lo > m_size: - warnings.warn( - "Specified contour must have a mesh size" - f" larger than minimum low res size: {hmin_low}") - contour_specs_lo.append((c_elev, rate_lo, m_size)) - - else: - contour_specs_lo = [ - (-4000, rate_lo, 10000), - (-1000, rate_lo, 6000), - (-10, rate_lo, hmin_lo) - ] - - const_specs_lo = [ - 
(hmin_lo, 0, z_max) - ] - - cutoff_hi = args.transition_elev - hmin_hi = args.hmin_high - rate_hi = args.rate_high - - contour_specs_hi = [ - (0, rate_hi, hmin_hi) - ] - const_specs_hi = [ - (hmin_hi, 0, z_max) - ] - - - # Read inputs - logger.info("Reading input shapes...") - gdf_fine = gpd.read_file(fine_geom) - gdf_coarse = gpd.read_file(coarse_geom) - tile_idx = gpd.read_file(tile_idx_path) - - logger.info("Reading hurricane info...") - gdf = gpd.read_file(hurr_info) - gdf_wind_kt = gdf[gdf.RADII.astype(int) == wind_kt] - - # Simplify high resolution geometry - logger.info("Simplify high-resolution shape...") - gdf_fine = gpd.GeoDataFrame( - geometry=gdf_fine.to_crs("EPSG:3857").simplify(tolerance=hmin_hi / 2).buffer(0).to_crs(gdf_fine.crs), - crs=gdf_fine.crs) - - - # Calculate refinement region - logger.info(f"Create polygon from {wind_kt}kt windswath polygon...") - ext_poly = [i for i in polygonize([ext for ext in gdf_wind_kt.exterior])] - gdf_refine_super_0 = gpd.GeoDataFrame( - geometry=ext_poly, crs=gdf_wind_kt.crs) - - logger.info("Find upstream...") - domain_extent = gdf_fine.to_crs(gdf_refine_super_0.crs).total_bounds - domain_box = box(*domain_extent) - box_tol = 1/1000 * max(domain_extent[2]- domain_extent[0], domain_extent[3] - domain_extent[1]) - gdf_refine_super_0 = gdf_refine_super_0.intersection(domain_box.buffer(-box_tol)) - gdf_refine_super_0.plot() - ext_poly = [i for i in gdf_refine_super_0.explode().geometry] - - dmn_ext = [pl.exterior for mp in gdf_fine.geometry for pl in mp] - wnd_ext = [pl.exterior for pl in ext_poly] - - gdf_dmn_ext = gpd.GeoDataFrame(geometry=dmn_ext, crs=gdf_fine.crs) - gdf_wnd_ext = gpd.GeoDataFrame(geometry=wnd_ext, crs=gdf_wind_kt.crs) - - gdf_ext_over = gpd.overlay(gdf_dmn_ext, gdf_wnd_ext.to_crs(gdf_dmn_ext.crs), how="union") - - gdf_ext_x = gdf_ext_over[gdf_ext_over.intersects(gdf_wnd_ext.to_crs(gdf_ext_over.crs).unary_union)] - - filter_lines_threshold = np.max(gdf_dmn_ext.length) / filter_factor - lnstrs = linemerge([lnstr for lnstr in gdf_ext_x.explode().geometry]) - if not isinstance(lnstrs, MultiLineString): - lnstrs = [lnstrs] - lnstrs = [lnstr for lnstr in lnstrs if lnstr.length < filter_lines_threshold] - gdf_hurr_w_upstream = gdf_wnd_ext.to_crs(gdf_ext_x.crs) - gdf_hurr_w_upstream = gdf_hurr_w_upstream.append( - gpd.GeoDataFrame( - geometry=gpd.GeoSeries(lnstrs), - crs=gdf_ext_x.crs - )) - - - gdf_hurr_w_upstream_poly = gpd.GeoDataFrame( - geometry=gpd.GeoSeries(polygonize(gdf_hurr_w_upstream.unary_union)), - crs=gdf_hurr_w_upstream.crs) - - logger.info("Find intersection of domain polygon with impacted area upstream...") - gdf_refine_super_2 = gpd.overlay( - gdf_fine, gdf_hurr_w_upstream_poly.to_crs(gdf_fine.crs), - how='intersection' - ) - - gdf_refine_super_2.to_file(out_dir / 'dmn_hurr_upstream') - - logger.info("Selecting high resolution DEMs...") - gdf_dem_box = gpd.GeoDataFrame( - columns=['geometry', 'path'], - crs=gdf_refine_super_2.crs) - for path in all_dem_paths: - bbox = Raster(path).get_bbox(crs=gdf_dem_box.crs) - gdf_dem_box = gdf_dem_box.append( - gpd.GeoDataFrame( - {'geometry': [bbox], - 'path': str(path)}, - crs=gdf_dem_box.crs) - ) - gdf_dem_box = gdf_dem_box.reset_index() - - lo_res_paths = gebco_paths - - # TODO: use sjoin instead?! - gdf_hi_res_box = gdf_dem_box[gdf_dem_box.geometry.intersects(gdf_refine_super_2.unary_union)].reset_index() - hi_res_paths = gdf_hi_res_box.path.values.tolist() - - - # For refine cut off either use static geom at e.g. 
200m depth or instead just use low-res for cut off polygon - - - # Or intersect with full geom? (timewise an issue for hfun creation) - logger.info("Calculate refinement area cutoff...") - cutoff_dem_paths = [i for i in gdf_hi_res_box.path.values.tolist() if pathlib.Path(i) in lo_res_paths] - cutoff_geom = Geom( - get_rasters(cutoff_dem_paths), - base_shape=gdf_coarse.unary_union, - base_shape_crs=gdf_coarse.crs, - zmax=cutoff_hi, - nprocs=geom_nprocs) - cutoff_poly = cutoff_geom.get_multipolygon() - - gdf_cutoff = gpd.GeoDataFrame( - geometry=gpd.GeoSeries(cutoff_poly), - crs=cutoff_geom.crs) - - gdf_draft_refine = gpd.overlay(gdf_refine_super_2, gdf_cutoff.to_crs(gdf_refine_super_2.crs), how='difference') - - refine_polys = [pl for pl in gdf_draft_refine.unary_union] - - gdf_final_refine = gpd.GeoDataFrame( - geometry=refine_polys, - crs=gdf_draft_refine.crs) - - - logger.info("Write landfall area to disk...") - gdf_final_refine.to_file(out_dir/'landfall_refine_area') - - gdf_geom = gpd.overlay( - gdf_coarse, - gdf_final_refine.to_crs(gdf_coarse.crs), - how='union') - - domain_box = box(*gdf_fine.total_bounds) - gdf_domain_box = gpd.GeoDataFrame( - geometry=[domain_box], crs=gdf_fine.crs) - gdf_domain_box.to_file(out_dir/'domain_box') - - geom = Geom(gdf_geom.unary_union, crs=gdf_geom.crs) - - - logger.info("Create low-res size function...") - hfun_lo = Hfun( - get_rasters(lo_res_paths), - base_shape=gdf_coarse.unary_union, - base_shape_crs=gdf_coarse.crs, - hmin=hmin_lo, - hmax=hmax, - nprocs=hfun_nprocs, - method='fast') - - logger.info("Add refinement spec to low-res size function...") - for ctr in contour_specs_lo: - hfun_lo.add_contour(*ctr) - hfun_lo.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for const in const_specs_lo: - hfun_lo.add_constant_value(*const) - - # hfun_lo.add_subtidal_flow_limiter(upper_bound=z_max) - # hfun_lo.add_subtidal_flow_limiter(hmin=hmin_lo, upper_bound=z_max) - - - logger.info("Compute low-res size function...") - jig_hfun_lo = hfun_lo.msh_t() - - - logger.info("Write low-res size function to disk...") - Mesh(jig_hfun_lo).write( - str(out_dir/f'hfun_lo_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - # For interpolation after meshing and use GEBCO for mesh size calculation in refinement area. 
- hfun_hi_rast_paths = hi_res_paths - if len(hi_res_paths) > max_n_hires_dem: - hfun_hi_rast_paths = gebco_paths - - logger.info("Create high-res size function...") - hfun_hi = Hfun( - get_rasters(hfun_hi_rast_paths), - base_shape=gdf_final_refine.unary_union, - base_shape_crs=gdf_final_refine.crs, - hmin=hmin_hi, - hmax=hmax, - nprocs=hfun_nprocs, - method='fast') - - # Apply low resolution criteria on hires as ewll - logger.info("Add refinement spec to high-res size function...") - for ctr in contour_specs_lo: - hfun_hi.add_contour(*ctr) - hfun_hi.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for ctr in contour_specs_hi: - hfun_hi.add_contour(*ctr) - hfun_hi.add_constant_value(value=ctr[2], lower_bound=ctr[0]) - - for const in const_specs_hi: - hfun_hi.add_constant_value(*const) - - # hfun_hi.add_subtidal_flow_limiter(upper_bound=z_max) - - logger.info("Compute high-res size function...") - jig_hfun_hi = hfun_hi.msh_t() - - logger.info("Write high-res size function to disk...") - Mesh(jig_hfun_hi).write( - str(out_dir/f'hfun_hi_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - jig_hfun_lo = Mesh.open(str(out_dir/f'hfun_lo_{hmin_hi}.2dm'), crs="EPSG:4326").msh_t - jig_hfun_hi = Mesh.open(str(out_dir/f'hfun_hi_{hmin_hi}.2dm'), crs="EPSG:4326").msh_t - - - logger.info("Combine size functions...") - gdf_final_refine = gpd.read_file(out_dir/'landfall_refine_area') - - utils.clip_mesh_by_shape( - jig_hfun_hi, - shape=gdf_final_refine.to_crs(jig_hfun_hi.crs).unary_union, - fit_inside=True, - in_place=True) - - jig_hfun_final = utils.merge_msh_t( - jig_hfun_lo, jig_hfun_hi, - drop_by_bbox=False, - can_overlap=False, - check_cross_edges=True) - - - logger.info("Write final size function to disk...") - hfun_mesh = Mesh(jig_hfun_final) - hfun_mesh.write( - str(out_dir/f'hfun_comp_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - - hfun = Hfun(hfun_mesh) - - logger.info("Generate mesh...") - driver = JigsawDriver(geom=geom, hfun=hfun, initial_mesh=True) - mesh = driver.run() - - - utils.reproject(mesh.msh_t, "EPSG:4326") - mesh.write( - str(out_dir/f'mesh_raw_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - mesh = Mesh.open(str(out_dir/f'mesh_raw_{hmin_hi}.2dm'), crs="EPSG:4326") - - dst_crs = "EPSG:4326" - interp_rast_list = [ - *get_rasters(gebco_paths, dst_crs), - *get_rasters(gdf_hi_res_box.path.values, dst_crs)] - - # TODO: Fix the deadlock issue with multiple cores when interpolating - logger.info("Interpolate DEMs on the generated mesh...") - mesh.interpolate(interp_rast_list, nprocs=1, method='nearest') - - logger.info("Write raw mesh to disk...") - mesh.write( - str(out_dir/f'mesh_{hmin_hi}.2dm'), - format='2dm', - overwrite=True) - - # Write the same mesh with a generic name - mesh.write( - str(out_dir/f'mesh_no_bdry.2dm'), - format='2dm', - overwrite=True) - - - -if __name__ == '__main__': - - parser = argparse.ArgumentParser() - parser.add_argument( - "--tag", "-t", - help="storm tag used for path creation", type=str) - parser.add_argument( - "name", help="name of the storm", type=str) - parser.add_argument( - "year", help="year of the storm", type=int) - - subparsers = parser.add_subparsers(dest='cmd') - subset_client = SubsetAndCombine(subparsers) - hurrmesh_client = HurricaneMesher(subparsers) - - args = parser.parse_args() - - logger.info(f"Mesh arguments are {args}.") - - main(args, [hurrmesh_client, subset_client]) diff --git a/rdhpcs/scripts/mesh.sbatch b/rdhpcs/scripts/mesh.sbatch deleted file mode 100644 index 3d1f669..0000000 --- 
a/rdhpcs/scripts/mesh.sbatch +++ /dev/null @@ -1,19 +0,0 @@ -#!/bin/bash -#SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 - -# Wiating for _initialized_ indicating cluster is properly initialized -while [ ! -f ~/_initialized_ ]; -do - echo "Waiting for cluster initialization..." - sleep 10s -done - -# To redirect all the temp file creations in OCSMesh to luster file sys -export TMPDIR=/lustre/.tmp -mkdir -p $TMPDIR - -source ~/odssm-mesh/bin/activate -echo Executing: python \"~/hurricane_mesh.py ${KWDS} ${STORM} ${YEAR}\"... -python ~/hurricane_mesh.py ${STORM} ${YEAR} ${KWDS} diff --git a/rdhpcs/scripts/schism.sbatch b/rdhpcs/scripts/schism.sbatch deleted file mode 100644 index 6dafe58..0000000 --- a/rdhpcs/scripts/schism.sbatch +++ /dev/null @@ -1,65 +0,0 @@ -#!/bin/bash -#SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 -#SBATCH --nodes=5 -#SBATCH --ntasks-per-node=36 - -# Wiating for _initialized_ indicating cluster is properly initialized -while [ ! -f ~/_initialized_ ]; -do - echo "Waiting for cluster initialization..." - sleep 10s -done - -PATH=~/schism/bin/:$PATH - -module purge - -module load cmake -module load intel/2021.3.0 -module load impi/2021.3.0 -module load hdf5/1.10.6 -module load netcdf/4.7.0 - -export MV2_ENABLE_AFFINITY=0 -ulimit -s unlimited - -echo "Starting solver..." -date - -set -ex - -pushd /lustre/${STORM_PATH} -mkdir -p outputs -#srun --mpi=pmi2 pschism_TVD-VL 4 -mpirun --ppn ${SLURM_TASKS_PER_NODE} ${SCHISM_EXEC} 4 - -if [ $? -eq 0 ]; then - echo "Combining outputs..." - date - # NOTE: Due to new IO, there's no need for combining main output -# pushd outputs -# times=$(ls schout_* | grep -o "schout[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) -# for i in $times; do -# combine_output11 -b $i -e $i -# done -# popd - # Combine hotstart - pushd outputs - if ls hotstart* >/dev/null 2>&1; then - times=$(ls hotstart_* | grep -o "hotstart[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) - for i in $times; do - combine_hotstart7 --iteration $i - done - fi - popd - - expect -f ~/combine_gr3.exp maxelev 1 - expect -f ~/combine_gr3.exp maxdahv 3 - mv maxdahv.gr3 maxelev.gr3 -t outputs -fi - - -echo "Done" -date diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..5cc4a85 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,5 @@ +chaospy>=4.2.7 +stormevents==2.2.3 +pyschism>=0.1.15 +coupledmodeldriver>=1.6.6 +ensembleperturbation>=1.1.2 diff --git a/singularity/info/environment.yml b/singularity/info/environment.yml deleted file mode 100644 index 22107ab..0000000 --- a/singularity/info/environment.yml +++ /dev/null @@ -1,14 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - cartopy - - cfunits - - gdal - - geopandas - - geos - - proj - - pygeos - - pyproj - - python=3.9 - - shapely>=1.8 diff --git a/singularity/info/files/hurricane_data.py b/singularity/info/files/hurricane_data.py deleted file mode 100644 index 827a521..0000000 --- a/singularity/info/files/hurricane_data.py +++ /dev/null @@ -1,325 +0,0 @@ -"""User script to get hurricane info relevant to the workflow -This script gether information about: - - Hurricane track - - Hurricane windswath - - Hurricane event dates - - Stations info for historical hurricane -""" - -import sys -import logging -import pathlib -import argparse -import tempfile -import numpy as np -from datetime import datetime, timedelta - -import pandas as pd -import geopandas as gpd -from searvey.coops import COOPS_TidalDatum -from searvey.coops import 
COOPS_TimeZone -from searvey.coops import COOPS_Units -from shapely.geometry import box -from stormevents import StormEvent -from stormevents.nhc import VortexTrack - - -logger = logging.getLogger(__name__) -logger.setLevel(logging.INFO) -logging.basicConfig( - stream=sys.stdout, - format='%(asctime)s,%(msecs)d %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', - datefmt='%Y-%m-%d:%H:%M:%S') - - -def main(args): - - name_or_code = args.name_or_code - year = args.year - date_out = args.date_range_outpath - track_out = args.track_outpath - swath_out = args.swath_outpath - sta_dat_out = args.station_data_outpath - sta_loc_out = args.station_location_outpath - is_past_forecast = args.past_forecast - hr_before_landfall = args.hours_before_landfall - lead_times = args.lead_times - - if hr_before_landfall < 0: - hr_before_landfall = 48 - - ne_low = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres')) - shp_US = ne_low[ne_low.name.isin(['United States of America', 'Puerto Rico'])].unary_union - - logger.info("Fetching hurricane info...") - event = None - if year == 0: - event = StormEvent.from_nhc_code(name_or_code) - else: - event = StormEvent(name_or_code, year) - nhc_code = event.nhc_code - logger.info("Fetching a-deck track info...") - - prescribed = None - if lead_times is not None and lead_times.is_file(): - leadtime_dict = pd.read_json(lead_times, orient='index') - leadtime_table = leadtime_dict.drop(columns='leadtime').merge( - leadtime_dict.leadtime.apply( - lambda x: pd.Series({v: k for k, v in x.items()}) - ).apply(pd.to_datetime, format='%Y%m%d%H'), - left_index=True, - right_index=True - ).set_index('ALnumber') - - if nhc_code.lower() in leadtime_table.index: - storm_all_times = leadtime_table.loc[nhc_code.lower()].dropna() - if len(storm_all_times.shape) > 1: - storm_all_times = storm_all_times.iloc[0] - if hr_before_landfall in storm_all_times: - prescribed = storm_all_times[hr_before_landfall] - - # TODO: Get user input for whether its forecast or now! - now = datetime.now() - df_dt = pd.DataFrame(columns=['date_time']) - if (is_past_forecast or (now - event.start_date < timedelta(days=30))): - temp_track = event.track(file_deck='a') - adv_avail = temp_track.unfiltered_data.advisory.unique() - adv_order = ['OFCL', 'HWRF', 'HMON', 'CARQ'] - advisory = adv_avail[0] - for adv in adv_order: - if adv in adv_avail: - advisory = adv - break - - if advisory == "OFCL" and "CARQ" not in adv_avail: - raise ValueError( - "OFCL advisory needs CARQ for fixing missing variables!" - ) - - # NOTE: Track taken from `StormEvent` object is up to now only. - # See GitHub issue #57 for StormEvents - track = VortexTrack(nhc_code, file_deck='a', advisories=[advisory]) - - - if is_past_forecast: - - logger.info( - f"Creating {advisory} track for {hr_before_landfall}" - +" hours before landfall forecast..." 
- ) - if prescribed is not None: - start_times = track.data.track_start_time.unique() - leastdiff_idx = np.argmin(abs(start_times - prescribed)) - forecast_start = start_times[leastdiff_idx] - - - else: - onland_adv_tracks = track.data[track.data.intersects(shp_US)] - if onland_adv_tracks.empty: - # If it doesn't landfall on US, check with other countries - onland_adv_tracks = track.data[ - track.data.intersects(ne_low.unary_union) - ] - - candidates = onland_adv_tracks.groupby('track_start_time').nth(0).reset_index() - candidates['timediff'] = candidates.datetime - candidates.track_start_time - forecast_start = candidates[ - candidates['timediff'] >= timedelta(hours=hr_before_landfall) - ].track_start_time.iloc[-1] - - gdf_track = track.data[track.data.track_start_time == forecast_start] - # Append before track from previous forecasts: - gdf_track = pd.concat(( - track.data[ - (track.data.track_start_time < forecast_start) - & (track.data.forecast_hours == 0) - ], - gdf_track - )) - df_dt['date_time'] = ( - track.start_date, track.end_date, forecast_start - ) - - - logger.info("Fetching water level measurements from COOPS stations...") - coops_ssh = event.coops_product_within_isotach( - product='water_level', wind_speed=34, - datum=COOPS_TidalDatum.NAVD, - units=COOPS_Units.METRIC, - time_zone=COOPS_TimeZone.GMT, - ) - - else: - # Get the latest track forecast - forecast_start = track.data.track_start_time.max() - gdf_track = track.data[track.data.track_start_time == forecast_start] - gdf_track = pd.concat(( - track.data[ - (track.data.track_start_time < forecast_start) - & (track.data.forecast_hours == 0) - ], - gdf_track - )) - - # Put both dates as now(), for pyschism to setup forecast - df_dt['date_time'] = ( - track.start_date, track.end_date, forecast_start - ) - - coops_ssh = None - - # NOTE: Fake besttrack: Since PySCHISM supports "BEST" track - # files for its parametric forcing, write track as "BEST" after - # fixing the OFCL by CARQ through StormEvents - # NOTE: Fake best track AFTER perturbation -# gdf_track.advisory = 'BEST' -# gdf_track.forecast_hours = 0 - track = VortexTrack(storm=gdf_track, file_deck='a', advisories=[advisory]) - - windswath_dict = track.wind_swaths(wind_speed=34) - # NOTE: Fake best track AFTER perturbation -# windswaths = windswath_dict['BEST'] # Faked BEST - windswaths = windswath_dict[advisory] - logger.info(f"Fetching {advisory} windswath...") - windswath_time = min(pd.to_datetime(list(windswaths.keys()))) - windswath = windswaths[ - windswath_time.strftime("%Y%m%dT%H%M%S") - ] - - else: - - logger.info("Fetching b-deck track info...") - - - logger.info("Fetching BEST windswath...") - track = event.track(file_deck='b') - # Drop duplicate rows based on isotach and time without minutes - # (PaHM doesn't take minutes into account) - gdf_track = track.data - gdf_track.datetime = gdf_track.datetime.dt.floor('h') - gdf_track = gdf_track.drop_duplicates(subset=['datetime', 'isotach_radius'], keep='last') - track = VortexTrack(storm=gdf_track, file_deck='b', advisories=['BEST']) - - perturb_start = track.start_date - if hr_before_landfall: - if prescribed is not None: - # NOTE: track_start_time is the genesis for best track - times = track.data.datetime.unique() - leastdiff_idx = np.argmin(abs(times - prescribed)) - perturb_start = times[leastdiff_idx] - else: - onland_adv_tracks = track.data[track.data.intersects(shp_US)] - if onland_adv_tracks.empty: - # If it doesn't landfall on US, check with other countries - onland_adv_tracks = track.data[ - 
track.data.intersects(ne_low.unary_union) - ] - onland_date = onland_adv_tracks.datetime.iloc[0] - perturb_start = track.data[ - onland_date - track.data.datetime >= timedelta(hours=hr_before_landfall) - ].datetime.iloc[-1] - - df_dt['date_time'] = ( - track.start_date, track.end_date, perturb_start - ) - - windswath_dict = track.wind_swaths(wind_speed=34) - # NOTE: event.start_date (first advisory date) doesn't - # necessarily match the windswath key which comes from track - # start date for the first advisory (at least in 2021!) - windswaths = windswath_dict['BEST'] - latest_advistory_stamp = max(pd.to_datetime(list(windswaths.keys()))) - windswath = windswaths[ - latest_advistory_stamp.strftime("%Y%m%dT%H%M%S") - ] - - logger.info("Fetching water level measurements from COOPS stations...") - coops_ssh = event.coops_product_within_isotach( - product='water_level', wind_speed=34, - datum=COOPS_TidalDatum.NAVD, - units=COOPS_Units.METRIC, - time_zone=COOPS_TimeZone.GMT, - ) - - logger.info("Writing relevant data to files...") - df_dt.to_csv(date_out) - # Remove duplicate entries for similar isotach and time - # (e.g. Dorian19 and Ian22 best tracks) - track.to_file(track_out) - gs = gpd.GeoSeries(windswath) - gdf_windswath = gpd.GeoDataFrame( - geometry=gs, data={'RADII': len(gs) * [34]}, crs="EPSG:4326" - ) - gdf_windswath.to_file(swath_out) - if coops_ssh is not None: - coops_ssh.to_netcdf(sta_dat_out, 'w') - coops_ssh[['x', 'y']].to_dataframe().drop(columns=['nws_id']).to_csv( - sta_loc_out, header=False, index=False) - - -if __name__ == '__main__': - - parser = argparse.ArgumentParser() - - parser.add_argument( - "name_or_code", help="name or NHC code of the storm", type=str) - parser.add_argument( - "year", help="year of the storm", type=int) - - parser.add_argument( - "--date-range-outpath", - help="output date range", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--track-outpath", - help="output hurricane track", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--swath-outpath", - help="output hurricane windswath", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--station-data-outpath", - help="output station data", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--station-location-outpath", - help="output station location", - type=pathlib.Path, - required=True - ) - - parser.add_argument( - "--past-forecast", - help="Get forecast data for a past storm", - action='store_true', - ) - - parser.add_argument( - "--hours-before-landfall", - help="Get forecast data for a past storm at this many hour before landfall", - type=int, - default=-1, - ) - - parser.add_argument( - "--lead-times", - type=pathlib.Path, - help="Helper file for prescribed lead times", - ) - - args = parser.parse_args() - - main(args) diff --git a/singularity/info/info.def b/singularity/info/info.def deleted file mode 100644 index edaeed3..0000000 --- a/singularity/info/info.def +++ /dev/null @@ -1,35 +0,0 @@ -BootStrap: docker -#From: centos:centos7.8.2003 -From: continuumio/miniconda3:23.3.1-0-alpine - -%files - environment.yml - files/hurricane_data.py /scripts/ - -%environment - export PYTHONPATH=/scripts - -%post -# yum update -y && yum upgrade -y - apk update && apk upgrade && apk add git - - conda install mamba -n base -c conda-forge - conda install libarchive -n base -c conda-forge - mamba update --name base --channel defaults conda - mamba env create -n info --file /environment.yml - mamba clean --all --yes - - conda run -n 
info --no-capture-output \ - pip install stormevents==2.2.0 - - - conda clean --all - apk del git - - -%runscript - conda run -n info --no-capture-output python -m hurricane_data $* - - -%labels - Author "Soroosh Mani" diff --git a/singularity/ocsmesh/.ocsmesh.def.swp b/singularity/ocsmesh/.ocsmesh.def.swp deleted file mode 100644 index 345fa1d..0000000 Binary files a/singularity/ocsmesh/.ocsmesh.def.swp and /dev/null differ diff --git a/singularity/ocsmesh/environment.yml b/singularity/ocsmesh/environment.yml deleted file mode 100644 index f8e2d6d..0000000 --- a/singularity/ocsmesh/environment.yml +++ /dev/null @@ -1,29 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - python<3.11 - - gdal - - geos - - proj - - netcdf4 - - udunits2 - - pyproj - - shapely - - rasterio - - fiona - - pygeos - - geopandas - - utm - - scipy - - numba - - numpy>=1.21 - - matplotlib - - requests - - tqdm - - mpi4py - - pyarrow - - pytz - - geoalchemy2 - - colored-traceback - - typing-extensions diff --git a/singularity/ocsmesh/ocsmesh.def b/singularity/ocsmesh/ocsmesh.def deleted file mode 100644 index 12ea35c..0000000 --- a/singularity/ocsmesh/ocsmesh.def +++ /dev/null @@ -1,56 +0,0 @@ -BootStrap: docker -#From: centos:centos7.8.2003 -From: continuumio/miniconda3:23.3.1-0-alpine - -%files - environment.yml - files/hurricane_mesh.py /scripts/ - -%environment - export PYTHONPATH=/scripts - -%post - ENV_NAME=ocsmesh - - apk update && apk upgrade && apk --no-cache add \ - git \ - gcc \ - g++ \ - make \ - cmake \ - libstdc++ \ - libarchive - - conda install mamba -n base -c conda-forge - mamba update --name base --channel defaults conda - mamba env create -n $ENV_NAME --file /environment.yml - mamba clean --all --yes - - git clone https://github.com/dengwirda/jigsaw-python.git - git -C jigsaw-python checkout f875719 - conda run -n $ENV_NAME --no-capture-output \ - python3 jigsaw-python/setup.py build_external - cp jigsaw-python/external/jigsaw/bin/* $ENV_PREFIX/bin - cp jigsaw-python/external/jigsaw/lib/* $ENV_PREFIX/lib - conda run -n $ENV_NAME --no-capture-output \ - pip install ./jigsaw-python - rm -rf jigsaw-python - git clone https://github.com/noaa-ocs-modeling/ocsmesh - git -C ocsmesh checkout cc0b82a #subset fix branch - conda run -n $ENV_NAME --no-capture-output \ - pip install ./ocsmesh - - conda clean --all && apk del \ - git \ - gcc \ - g++ \ - make \ - cmake - - -%runscript - conda run -n ocsmesh --no-capture-output python -m hurricane_mesh $* - - -%labels - Author "Soroosh Mani" diff --git a/singularity/post/environment.yml b/singularity/post/environment.yml deleted file mode 100644 index 357ae8b..0000000 --- a/singularity/post/environment.yml +++ /dev/null @@ -1,97 +0,0 @@ -name: odssm-post-env -channels: - - conda-forge - - defaults -dependencies: - - appdirs - - python>=3.9 # because of searvey - - pygeos - - geos - - gdal - - proj - - pyproj - - cartopy - - udunits2 - - shapely>=1.8.0 - - arrow - - attrs - - backcall - - beautifulsoup4 - - bokeh - - branca - - brotlipy - - bs4 - - certifi - - cffi - - cftime - - cfunits - - cfgrib - - chardet - - click - - click-plugins - - cligj - - cryptography - - cycler - - decorator - - f90nml - - fiona - - folium - - gdal - - geopandas - - geos - - geotiff - - glib - - icu - - idna - - ipython - - ipython_genutils - - jedi - - jinja2 - - kiwisolver - - krb5 - - lxml - - markupsafe - - matplotlib - - munch - - netcdf4 - - hdf5 - - numpy - - olefile - - packaging - - pandas - - parso - - pexpect - - pickleshare - - pillow - - prompt-toolkit - 
- ptyprocess - - pycparser - - pygeos - - pygments - - pyopenssl - - pyparsing - - pyproj - - pysocks - - python-wget - - pytz - - pyyaml - - readline - - requests - - retrying - - rtree - - setuptools - - shapely - - six - - searvey - - soupsieve - - tbb - - tiledb - - tk - - tornado - - traitlets - - typing_extensions - - wcwidth - - wheel - - zstd - - pip: - - pyschism diff --git a/singularity/post/files/defn.py b/singularity/post/files/defn.py deleted file mode 100644 index acb7e8b..0000000 --- a/singularity/post/files/defn.py +++ /dev/null @@ -1,79 +0,0 @@ -from matplotlib.colors import LinearSegmentedColormap -import matplotlib.pyplot as plt - -cdict = { - 'red': ( - (0.0, 1, 1), - (0.05, 1, 1), - (0.11, 0, 0), - (0.66, 1, 1), - (0.89, 1, 1), - (1, 0.5, 0.5), - ), - 'green': ( - (0.0, 1, 1), - (0.05, 1, 1), - (0.11, 0, 0), - (0.375, 1, 1), - (0.64, 1, 1), - (0.91, 0, 0), - (1, 0, 0), - ), - 'blue': ((0.0, 1, 1), (0.05, 1, 1), (0.11, 1, 1), (0.34, 1, 1), (0.65, 0, 0), (1, 0, 0)), -} - -jetMinWi = LinearSegmentedColormap('my_colormap', cdict, 256) -my_cmap = plt.cm.jet - -# Color code for the point track. -colors_hurricane_condition = { - 'subtropical depression': '#ffff99', - 'tropical depression': '#ffff66', - 'tropical storm': '#ffcc99', - 'subtropical storm': '#ffcc66', - 'hurricane': 'red', - 'major hurricane': 'crimson', -} - -width = 750 -height = 250 - -# Constants -noaa_logo = 'https://www.nauticalcharts.noaa.gov/images/noaa-logo-no-ring-70.png' - -template_track_popup = """ -
-
{} condition

- 'Date: {}
- Condition: {}
- """ - -template_storm_info = """ -
  Storm: {}
-   Year: {}  
-
- """ - -template_fct_info = """ -
  Date: {}UTC
-   FCT : t{}z  
-
- """ - -disclaimer = """ -
  Hurricane Explorer; - NOAA/NOS/OCS
-   Contact: Saeed.Moghimi@noaa.gov  
-   Disclaimer: Experimental product. All configurations and results are pre-decisional.
-
- - """ diff --git a/singularity/post/files/hurricane_funcs.py b/singularity/post/files/hurricane_funcs.py deleted file mode 100644 index b60d403..0000000 --- a/singularity/post/files/hurricane_funcs.py +++ /dev/null @@ -1,273 +0,0 @@ -from __future__ import division, print_function - -# !/usr/bin/env python -# -*- coding: utf-8 -*- -""" - -Functions for handling nhc data - - -""" - -__author__ = 'Saeed Moghimi' -__copyright__ = 'Copyright 2020, UCAR/NOAA' -__license__ = 'GPL' -__version__ = '1.0' -__email__ = 'moghimis@gmail.com' - -import pandas as pd -import geopandas as gpd -import numpy as np -import sys -from glob import glob -import requests -from bs4 import BeautifulSoup - -try: - from urllib.request import urlopen, urlretrieve -except: - from urllib import urlopen, urlretrieve -import lxml.html - -import wget - -# from highwatermarks import HighWaterMarks -# from collections import OrderedDict -# import json -import os - - -################## -def url_lister(url): - urls = [] - connection = urlopen(url) - dom = lxml.html.fromstring(connection.read()) - for link in dom.xpath('//a/@href'): - urls.append(link) - return urls - - -################# -def download(url, path, fname): - sys.stdout.write(fname + '\n') - if not os.path.isfile(path): - urlretrieve(url, filename=path, reporthook=progress_hook(sys.stdout)) - sys.stdout.write('\n') - sys.stdout.flush() - - -################# -def progress_hook(out): - """ - Return a progress hook function, suitable for passing to - urllib.retrieve, that writes to the file object *out*. - """ - - def it(n, bs, ts): - got = n * bs - if ts < 0: - outof = '' - else: - # On the last block n*bs can exceed ts, so we clamp it - # to avoid awkward questions. - got = min(got, ts) - outof = '/%d [%d%%]' % (ts, 100 * got // ts) - out.write('\r %d%s' % (got, outof)) - out.flush() - - return it - - -################# -def get_nhc_storm_info(year, name): - """ - - """ - - print('Read list of hurricanes from NHC based on year') - - if int(year) < 2008: - print(' ERROR: GIS Data is not available for storms before 2008 ') - sys.exit('Exiting .....') - - url = 'http://www.nhc.noaa.gov/gis/archive_wsurge.php?year=' + year - - # r = requests.get(url,headers=headers,verify=False) - r = requests.get(url, verify=False) - - soup = BeautifulSoup(r.content, 'lxml') - - table = soup.find('table') - # table = [row.get_text().strip().split(maxsplit=1) for row in table.find_all('tr')] - - tab = [] - for row in table.find_all('tr'): - tmp = row.get_text().strip().split() - tab.append([tmp[0], tmp[-1]]) - - print(tab) - - df = pd.DataFrame(data=tab[:], columns=['identifier', 'name'], ).set_index('name') - - ############################### - - print(' > based on specific storm go fetch gis files') - hid = df.to_dict()['identifier'][name.upper()] - al_code = ('{}' + year).format(hid) - hurricane_gis_files = '{}_5day'.format(al_code) - - return al_code, hurricane_gis_files - - -################# -# @retry(stop_max_attempt_number=5, wait_fixed=3000) -def download_nhc_gis_files(hurricane_gis_files, rundir): - """ - """ - - base = os.path.abspath(os.path.join(rundir, 'nhcdata', hurricane_gis_files)) - - if len(glob(base + '/*')) < 1: - nhc = 'http://www.nhc.noaa.gov/gis/forecast/archive/' - - # We don't need the latest file b/c that is redundant to the latest number. 
- fnames = [ - fname - for fname in url_lister(nhc) - if fname.startswith(hurricane_gis_files) and 'latest' not in fname - ] - - if not os.path.exists(base): - os.makedirs(base) - - for fname in fnames: - path1 = os.path.join(base, fname) - if not os.path.exists(path1): - url = '{}/{}'.format(nhc, fname) - download(url, path1, fname) - - return base - ################################# - - -# Only needed to run on binder! -# See https://gitter.im/binder-project/binder?at=59bc2498c101bc4e3acfc9f1 -os.environ['CPL_ZIP_ENCODING'] = 'UTF-8' - - -def read_advisory_cones_info(hurricane_gis_files, base, year, code): - print(' > Read cones shape file ...') - - cones, points = [], [] - for fname in sorted(glob(os.path.join(base, '{}_*.zip'.format(hurricane_gis_files)))): - number = os.path.splitext(os.path.split(fname)[-1])[0].split('_')[-1] - - # read cone shapefiles - - if int(year) < 2014: - # al092008.001_5day_pgn.shp - divd = '.' - else: - divd = '-' - - pgn = gpd.read_file( - ('/{}' + divd + '{}_5day_pgn.shp').format(code, number), - vfs='zip://{}'.format(fname), - ) - cones.append(pgn) - - # read points shapefiles - pts = gpd.read_file( - ('/{}' + divd + '{}_5day_pts.shp').format(code, number), - vfs='zip://{}'.format(fname), - ) - # Only the first "obsevartion." - points.append(pts.iloc[0]) - - return cones, points, pts - - -################# -def download_nhc_best_track(year, code): - """ - - """ - - url = 'http://ftp.nhc.noaa.gov/atcf/archive/{}/'.format(year) - fname = 'b{}.dat.gz'.format(code) - base = os.path.abspath(os.path.join(os.path.curdir, 'data', code + '_best_track')) - - if not os.path.exists(base): - os.makedirs(base) - - path1 = os.path.join(base, fname) - # download(url, path,fname) - if not os.path.exists(url + fname): - wget.download(url + fname, out=base) - - return base - - -################# -def download_nhc_gis_best_track(year, code): - """ - - """ - - url = 'http://www.nhc.noaa.gov/gis/best_track/' - fname = '{}_best_track.zip'.format(code) - base = os.path.abspath(os.path.join(os.path.curdir, 'data', code + '_best_track')) - - if not os.path.exists(base): - os.makedirs(base) - - path = os.path.join(base, fname) - # download(url, path,fname) - if not os.path.exists(url + fname): - wget.download(url + fname, out=base) - return base - - -################# -def read_gis_best_track(base, code): - """ - - """ - print(' > Read GIS Best_track file ...') - - fname = base + '/{}_best_track.zip'.format(code) - - points = gpd.read_file(('/{}_pts.shp').format(code), vfs='zip://{}'.format(fname)) - - radii = gpd.read_file(('/{}_radii.shp').format(code), vfs='zip://{}'.format(fname)) - - line = gpd.read_file(('/{}_lin.shp').format(code), vfs='zip://{}'.format(fname)) - - return line, points, radii - - -def get_coordinates(bbox): - """ - Create bounding box coordinates for the map. It takes flat or - nested list/numpy.array and returns 5 points that closes square - around the borders. - - Examples - -------- - >>> bbox = [-87.40, 24.25, -74.70, 36.70] - >>> len(get_coordinates(bbox)) - 5 - - """ - bbox = np.asanyarray(bbox).ravel() - if bbox.size == 4: - bbox = bbox.reshape(2, 2) - coordinates = [] - coordinates.append([bbox[0][1], bbox[0][0]]) - coordinates.append([bbox[0][1], bbox[1][0]]) - coordinates.append([bbox[1][1], bbox[1][0]]) - coordinates.append([bbox[1][1], bbox[0][0]]) - coordinates.append([bbox[0][1], bbox[0][0]]) - else: - raise ValueError('Wrong number corners.' 
' Expected 4 got {}'.format(bbox.size)) - return coordinates diff --git a/singularity/post/post.def b/singularity/post/post.def deleted file mode 100644 index be0df3a..0000000 --- a/singularity/post/post.def +++ /dev/null @@ -1,28 +0,0 @@ -BootStrap: docker -#From: centos:centos7.8.2003 -From: continuumio/miniconda3:23.3.1-0-alpine - -%files - environment.yml - files/*.py /scripts/ - -%environment - export PYTHONPATH=/scripts - -%post - ENV_NAME=post - - apk update && apk upgrade - - conda install mamba -n base -c conda-forge - mamba update --name base --channel defaults conda - mamba env create -n $ENV_NAME --file /environment.yml - mamba clean --all --yes - - -%runscript - conda run -n post --no-capture-output python -m generate_viz $* - - -%labels - Author "Soroosh Mani" diff --git a/singularity/prep/environment.yml b/singularity/prep/environment.yml deleted file mode 100644 index 0d24986..0000000 --- a/singularity/prep/environment.yml +++ /dev/null @@ -1,41 +0,0 @@ -name: icogsc -channels: - - conda-forge -dependencies: - - python<3.10 - - pip - - gdal - - geos - - proj - - netcdf4 - - hdf5 - - cartopy - - cfunits - - cf-python - - cfgrib - - cmocean - - esmf - - esmpy - - cfdm - - udunits2 - - pyproj - - shapely>=2 - - rasterio - - fiona - - geopandas>=0.11.0 - - rtree - - pandas - - utm - - scipy - - numpy - - matplotlib - - requests - - tqdm - - mpi4py - - pyarrow - - pytz - - geoalchemy2 - - seawater - - xarray==2023.7.0 - - pip: - - chaospy>=4.2.7 diff --git a/singularity/prep/files/refs/param.nml b/singularity/prep/files/refs/param.nml deleted file mode 100755 index 70c9afc..0000000 --- a/singularity/prep/files/refs/param.nml +++ /dev/null @@ -1,69 +0,0 @@ -&CORE - ipre=0 - ibc=1 - ibtp=0 - nspool=24 - ihfskip=11088 - dt=150.0 - rnday=19.25 - - msc2 = 24 !same as msc in .nml ... for consitency check between SCHISM and WWM - mdc2 = 30 !same as mdc in .nml -/ - -&OPT - start_year=2018 - start_month=8 - start_day=30 - start_hour=6.0 - utc_start=-0.0 - ics=2 - ihot=1 - nchi=-1 - hmin_man=1.0 - ic_elev=1 - nws=-1 - wtiminc=150.0 - - icou_elfe_wwm = 1 - nstep_wwm = 4 !call WWM every this many time steps - iwbl = 0 !wave boundary layer formulation (used only if USE_WMM and - !icou_elfe_wwm/=0 and nchi=1. If icou_elfe_wwm=0, set iwbl=0): - !1-modified Grant-Madsen formulation; 2-Soulsby (1997) - hmin_radstress = 1. !min. total water depth used only in radiation stress calculation [m] -! nrampwafo = 0 !ramp-up option for the wave forces (1: on; 0: off) - drampwafo = 0. !ramp-up period in days for the wave forces (no ramp-up if <=0) - turbinj = 0.15 !% of depth-induced wave breaking energy injected in turbulence - !(default: 0.15 (15%), as proposed by Feddersen, 2012) - turbinjds = 1.0 !% of wave energy dissipated through whitecapping injected in turbulence - !(default: 1 (100%), as proposed by Paskyabi et al. 2012) - alphaw = 0.5 !for itur=4 : scaling parameter for the surface roughness z0s = alphaw*Hm0. - !If negative z0s = abs(alphaw) e.g. z0s=0.2 m (Feddersen and Trowbridge, 2005) - ! Vortex Force terms (off/on:0/1) -/ - -&SCHOUT - nhot=1 - nhot_write=11088 - - iof_hydro(14) = 1 - - iof_wwm(1) = 1 !sig. 
height (m) {sigWaveHeight} 2D - iof_wwm(2) = 0 !Mean average period (sec) - TM01 {meanWavePeriod} 2D - iof_wwm(3) = 0 !Zero down crossing period for comparison with buoy (s) - TM02 {zeroDowncrossPeriod} 2D - iof_wwm(4) = 0 !Average period of wave runup/overtopping - TM10 {TM10} 2D - iof_wwm(5) = 0 !Mean wave number (1/m) {meanWaveNumber} 2D - iof_wwm(6) = 0 !Mean wave length (m) {meanWaveLength} 2D - iof_wwm(7) = 0 !Mean average energy transport direction (degr) - MWD in NDBC? {meanWaveDirection} 2D - iof_wwm(8) = 0 !Mean directional spreading (degr) {meanDirSpreading} 2D - iof_wwm(9) = 1 !Discrete peak period (sec) - Tp {peakPeriod} 2D - iof_wwm(10) = 0 !Continuous peak period based on higher order moments (sec) {continuousPeakPeriod} 2D - iof_wwm(11) = 0 !Peak phase vel. (m/s) {peakPhaseVel} 2D - iof_wwm(12) = 0 !Peak n-factor {peakNFactor} 2D - iof_wwm(13) = 0 !Peak group vel. (m/s) {peakGroupVel} 2D - iof_wwm(14) = 0 !Peak wave number {peakWaveNumber} 2D - iof_wwm(15) = 0 !Peak wave length {peakWaveLength} 2D - iof_wwm(16) = 1 !Peak (dominant) direction (degr) {dominantDirection} 2D - iof_wwm(17) = 0 !Peak directional spreading {peakSpreading} 2D - -/ diff --git a/singularity/prep/files/refs/wwminput.nml b/singularity/prep/files/refs/wwminput.nml deleted file mode 100755 index f0996e5..0000000 --- a/singularity/prep/files/refs/wwminput.nml +++ /dev/null @@ -1,667 +0,0 @@ -! This is the main input for WWM -! Other mandatory inputs: wwmbnd.gr3 (boundary flag files; see below) -! Depending on the choices of parameters below you may need additional inputs - -&PROC - PROCNAME = 'schism_wwm_2003_test' ! Project Name - DIMMODE = 2 ! Mode of run (ex: 1 = 1D, 2 = 2D) always 2D when coupled to SCHISM - LSTEA = F ! steady mode; under development - LQSTEA = F ! Quasi-Steady Mode; In this case WWM-II is doing subiterations defined as DELTC/NQSITER unless QSCONVI is not reached - LSPHE = T ! Spherical coordinates (lon/lat) - LNAUTIN = T ! Nautical convention for all inputs given in degrees - LNAUTOUT = T ! Output in Nautical convention - ! If T, 0 is _from_ north, 90 is from east etc; - ! If F, maths. convention - 0: to east; 90: going to north - LMONO_IN = F ! For prescribing monochromatic wave height Hmono as a boundary conditions; incident wave is defined as monochromatic wave height, which is Hmono = sqrt(2) * Hs - LMONO_OUT = F ! Output wave heights in terms of Lmono - BEGTC = '20180830.000000' ! Time for start the simulation, ex:yyyymmdd. hhmmss - DELTC = 600 ! Time step (MUST match dt*nstep_wwm in SCHISM!) - UNITC = 'SEC' ! Unity of time step - ENDTC = '20181030.000000' ! Time for stop the simulation, ex:yyyymmdd. hhmmss - DMIN = 0.01 ! Minimum water depth. THis must be same as h0 in selfe -/ - -&COUPL - LCPL = T ! Couple with current model ... main switch - keep it on for SCHISM-WWM - LROMS = F ! ROMS (set as F) - LTIMOR = F ! TIMOR (set as F) - LSHYFEM = F ! SHYFEM (set as F) - RADFLAG = 'LON' ! LON: Longuet-Higgin; VOR: vortex formulation - LETOT = F ! Option to compute the wave induced radiation stress. If .T. the radiation stress is based on the integrated wave spectrum - ! e.g. Etot = Int,0,inf;Int,0,2*pi[N(sigma,theta)]dsigma,dtheta. If .F. the radiation stress is estimated as given in Roland et al. (2008) based - ! on the directional spectra itself. It is always desirable to use .F., since otherwise the spectral informations are truncated and therefore - ! LETOT = .T., is only for testing and developers! - NLVT = 10 ! Number of vertical Layers; not used with SCHISM - DTCOUP = 600. ! 
Couple time step - not used when coupled to SCHISM - IMET_DRY = 0 ! -/ - -&GRID - LCIRD = T ! Full circle in directional space - LSTAG = F ! Stagger directional bins with a half Dtheta; may use T only for regular grid to avoid char. line aligning with grid line - MINDIR = 0. ! Minimum direction for simulation (unit: degrees; nautical convention; 0: from N; 90: from E); not used if LCIRD = .T. - MAXDIR = 360. ! Maximum direction for simulation (unit: degrees); may be < MINDIR; not used if LCIRD = .T. - MDC = 30 ! Number of directional bins - FRLOW = 0.04 ! Low frequency limit of the discrete wave period (Hz; 1/period) - FRHIGH = 1. ! High frequency limit of the discrete wave period. - MSC = 24 ! Number of frequency bins - FILEGRID = 'hgrid_WWM.gr3' ! Name of the grid file. hgrid.gr3 if IGRIDTYPE = 3 (SCHISM) - IGRIDTYPE = 3 ! Gridtype used. - ! 1 ~ XFN system.dat - ! 2 ~ WWM-PERIODIC - ! 3 ~ SCHISM - ! 4 ~ old WWM type - LSLOP = F ! Bottom Slope limiter (default=F) - SLMAX = 0.2 ! Max Slope; - LVAR1D = F ! For 1d-mode if variable dx is used; not used with SCHISM - LOPTSIG = F ! Use optimal distributions of freq. in spectral space ... fi+1 = fi * 1.1. Take care what you high freq. limit is! - CART2LATLON = F, - LATLON2CART = F, - APPLY_DXP_CORR = F, - USE_EXACT_FORMULA_SPHERICAL_AREA = T, ! Use spherical formular for triangle area computation. - !LEXPORT_GRID_MOD_OUT = F -/ - -&INIT - LHOTR = F ! Use hotstart file (see &HOTFILE section) - LINID = F ! Initial condition; F for default; use T if using WW3 as i.c. etc - INITSTYLE = 1 ! 1 - Parametric Jonswap, 2 - Read from Global NETCDF files, work only if IBOUNDFORMAT=2 -/ - -&BOUC - LBCSE = F ! The wave boundary data is time dependent - LBCWA = T ! Parametric Wave Spectra - LBCSP = F ! Specify (non-parametric) wave spectra, specified in 'FILEWAVE' below - LINHOM = F ! Non-uniform wave b.c. in space - LBSP1D = F ! 1D (freq. space only) format for FILEWAVE if LBCSP=T and LINHOM=F - LBSP2D = F ! Not used now - LBINTER = F ! Do interpolation in time if LBCSE=T (not available for quasi-steady mode within the subtime steps) - BEGTC = '20180830.000000' ! Begin time of the wave boundary file (FILEWAVE) - DELTC = 1 ! Time step in FILEWAVE - UNITC = 'HR' ! Unit can be HR, MIN, SEC - ENDTC = '20181030.000000' ! End time - FILEBOUND = 'wwmbnd.gr3' ! Boundary file defining boundary conditions and Neumann nodes. - ! In this file there is following definition Flag 0: not on boundary; 3: Neumann (0 gradient only for advection part); - ! 2: active bnd (Dirichlet). Bnd flags imported from SCHISM: ! 1: exterior bnd; -1: interior (islands) - ! exterio and interior boundaries need not to be defined. - IBOUNDFORMAT = 1 ! - FILEWAVE = 'bndfiles.dat' ! Boundary file defining boundary input - LINDSPRDEG = F ! If 1-d wave spectra are read this flag defines whether the input for the directional spreading is in degrees (true) or exponent (false) - LPARMDIR = F ! If LPARMDIR is true then directional spreading is read from WBDS and must be in exponential format at this time, only valid for 1d Spectra - ! For WW3 boundary input also set LINHOM=T, LBCSE=T and this works only for spherical coordinates - - WBHS = 2. ! Hs at the boundary for parametric spectra - WBSS = 2 ! 1 or -1: Pierson-Moskowitz, 2 or -2: JONSWAP, 3 or -3: all in one BIN, - ! 4: Gauss. The sign decides whether WBTP below is - ! peak (+) or mean period (-) - WBTP = 8. ! Tp at the boundary (sec); mean or peak depending on the sign of WBSS - WBDM = 90.0 ! Avg. Wave Direction at the boundary - WBDSMS = 1 ! 
Directional spreading value in degrees (1) or as exponent (2) - WBDS = 10. ! Directioanl spreading at the boundary (degrees/exponent) - WBGAUSS = 0.1 ! factor for gaussian distribution if WBSS=1 - ! End section for LBCWA=T and LINHOM=F - WBPKEN = 3.3 ! Peak enhancement factor for Jonswap Spectra if WBSS=2 - MULTIPLE_IN = T, - NETCDF_OUT_PARAM = F, - NETCDF_OUT_SPECTRA = F, - NETCDF_OUT_FILE = 'boundary_out_spec.nc' - USE_SINGLE_OUT = T, - BEGTC_OUT = 20030908.000000 , - DELTC_OUT = 600.000000000000 , - UNITC_OUT = SEC , - ENDTC_OUT = 20031008.000000 , - EXTRAPOLATION_ALLOWED = F, - HACK_HARD_SET_IOBP = F, - !PARAMWRITE = T, - NETCDF_IN_FILE = 'bndfiles.dat' - !LEXPORT_BOUC_MOD_OUT = F, - EXPORT_BOUC_DELTC = 0.00 -/ -&WIND ! THIS IS NOW USED IN SCHISM - LSEWD = F ! Time dependend wind input - BEGTC = '20030101.000000' ! Begin time - DELTC = 60.0 ! Time step - UNITC = 'MIN' ! Unit - ENDTC = '20030102.000000' ! End time - LINTERWD = T ! Interpolate linear within the wind input time step - LSTWD = T ! Steady wind - LCWIN = T ! Constant wind - LWDIR = T ! Define wind using wind direction rather than vel. vectors - WDIR = 140.0 ! Wind direction if LWDIR=T - WVEL = 10.0 ! Wind velocity ... - CWINDX = 30.0 ! wind x-vec if LWDIR=F - CWINDY = 0.0 ! wind y-vec - FILEWIND = 'wind.dat' ! wind input data file; input file format: write(*,*) curtx; write(*,*) curty - WINDFAC = 1. ! Factor for wind scaling - IWINDFORMAT = 1 ! kind of wind input - ! 1 - ASCII, - ! 2 - DWD_NETCDF - ! 3 - NOAA CFRS - ! 4 - NOAA NARR - ! 5 - netCDF WRF/ROMS forcing (Uwind,Vwind,LON,LAT,wind_time are used), fast bilinear interp - LWINDFROMWWM = F, ! Wind is coming from WWM (true) or from SCHISM(false). This is under developement. If F, the following parameters in this section are ignored. For SELFE users, use F. - GRIB_FILE_TYPE = 1, - EXTRAPOLATION_ALLOWED = F, - USE_STEPRANGE = T, - MULTIPLE_IN = T, - !LEXPORT_WIND_MOD_OUT = F, - EXPORT_WIND_DELTC = 0.00, - !LSAVE_INTERP_ARRAY = F -/ -&CURR !NOT USED WITH SCHISM - LSECU = F ! Time dependend currents - BEGTC = '20030908.000000' ! Beginn time - DELTC = 600 ! Time step - UNITC = 'SEC' ! Unit - ENDTC = '20031008.000000' ! End time - LINTERCU = F ! Interpolate linear within the wind input time step - LSTCU = F ! Steady current - LCCUR = F ! Constant current - CCURTX = 0.0 ! current x-vec - CCURTY = 0.0 ! current y-vec - FILECUR = 'current.dat' ! Current file name; input file format: write(*,*) curtx; write(*,*) curty - LERGINP = F ! read timor file for input ... ergzus.bin - CURFAC = 1.000000 - ICURRFORMAT = 1 - MULTIPLE_IN = T, - !LEXPORT_CURR_MOD_OUT = F, - EXPORT_CURR_DELTC = 0.000000000000000E+000 -/ - -&WALV !NOT USED WITH SCHISM - LSEWL = F ! Time dependend elev. - BEGTC = '20030908.000000' ! Begin time - DELTC = 1 ! Time step - UNITC = 'HR' ! Unit - ENDTC = '20031008.000000' ! End time - LINTERWL = F ! Interpolate linear within the wind input time step - LSTWL = T ! Steady water level - LCWLV = T ! Constant water level - CWATLV = 0.0 ! elevation of the water level [m] - FILEWATL = ' ' ! water level file name; input file format: write(*,*) eta - LERGINP = F, - WALVFAC = 1.00000000000000 , - IWATLVFORMAT = 1, - MULTIPLE_IN = T, - !LEXPORT_WALV_MOD_OUT = F, - EXPORT_WALV_DELTC = 0.000000000000000E+000 -/ - -&ENGS !SOURCE TERMS - !ISOURCE = 1 ! Source Term Formulation for deep water: 1 ~ Ardhuin et al. (WW3), 2 ~ Janssen et al., (ECMWF), ~ 3 ~ Komen et al. 1984, (SWAN), (DEFAULT: 1) - MESNL = 1 ! Nonlinear Interaction NL4 , 1 ~ on, 0 ~ off (DIA), (DEFAULT: 1) - MESIN = 1 ! 
Wind input 1 ~ on, 0 ~ off, (DEFAULT: 1) - IFRIC = 1 ! Now only JONSWAP friction will add Roland & Ardhuin soon. - MESBF = 1 ! Bottomg friction: 1 ~ on, 0 ~ off (JONSWAP Formulation); (DEFAULT: 1) - FRICC = 0.067 ! Cjon - Bottom friction coefficient (always positive); (DEFAULT: 0.067) - MESBR = 1 ! Shallow water wave breaking; 0: off; 1: on: BJ78 same as in SWAN, (DEFAULT: 1) - ICRIT = 1 ! Wave breaking criterion: set as 1 - SWAN, 2 - Dingemans; (DEFAULT: 2) - IBREAK = 1 ! Now only Battjes & Janssen - B_ALP = 0.5 ! Dissipation proportionality coefficient, (DEFAULT: 0.5) - BRCR = 0.78 ! Wave breaking coefficient for Const. type wave breaking criterion; range: 0.6-1.1 (suggested 0.78) - MEVEG = 0 - LMAXETOT = T ! Limit shallow water wave height by wave breaking limiter (default=T) - MESDS = 1 ! Whitecapping 1 ~ on, 0 ~ off; (DEFAULT: 1) - MESTR = 1 ! Nonlinear Interaction in shallow water SNL3: 1 ~ on, 0 ~ off (DEFAULT: 0) - TRICO = 0.1 ! proportionality const. (\alpha_EB); default is 0.1; (DEFAULT: 0.1) - TRIRA = 5. ! ratio of max. freq. considered in triads over mean freq.; 2.5 is suggested; (DEFAULT: 2.5) - TRIURS = 0.1 ! critical Ursell number; if Ursell # < TRIURS; triads are not computed; (DEFAULT: 0.1) -/ - - -&SIN4 ! Input parameter for ST4 source terms do not touch or reach our paper about this ... - ZWND = 10.0000000000000, - ALPHA0 = 9.499999694526196E-003, - Z0MAX = 0.000000000000000E+000, - BETAMAX = 1.54000000000000, - SINTHP = 2.00000000000000, - ZALP = 6.000000052154064E-003, - TAUWSHELTER = 0.300000011920929, - SWELLFPAR = 1.00000000000000, - SWELLF = 0.660000026226044, - SWELLF2 = -1.799999922513962E-002, - SWELLF3 = 2.199999988079071E-002, - SWELLF4 = 150000.000000000, - SWELLF5 = 1.20000004768372, - SWELLF6 = 0.000000000000000E+000, - SWELLF7 = 360000.000000000, - Z0RAT = 3.999999910593033E-002, - SINBR = 0.000000000000000E+000, -/ - -&SDS4 ! Input parameter for ST4 dissipation terms do not touch or reach our paper about this ... - SDSC1 = 0.000000000000000E+000, - FXPM3 = 4.00000000000000, - FXFM3 = 2.50000000000000, - FXFMAGE = 0.000000000000000E+000, - SDSC2 = -2.200000017182902E-005, - SDSCUM = -0.403439998626709, - SDSSTRAIN = 0.000000000000000E+000, - SDSC4 = 1.00000000000000, - SDSC5 = 0.000000000000000E+000, - SDSC6 = 0.300000011920929, - SDSBR = 8.999999845400453E-004, - SDSBR2 = 0.800000011920929, - SDSP = 2.00000000000000, - SDSISO = 2.00000000000000, - SDSBCK = 0.000000000000000E+000, - SDSABK = 1.50000000000000, - SDSPBK = 4.00000000000000, - SDSBINT = 0.300000011920929, - SDSHCK = 1.50000000000000, - SDSDTH = 80.0000000000000, - SDSCOS = 2.00000000000000, - SDSBRF1 = 0.500000000000000, - SDSBRFDF = 0.000000000000000E+000, - SDSBM0 = 1.00000000000000, - SDSBM1 = 0.000000000000000E+000, - SDSBM2 = 0.000000000000000E+000, - SDSBM3 = 0.000000000000000E+000, - SDSBM4 = 0.000000000000000E+000, - SDSHFGEN = 0.000000000000000E+000, - SDSLFGEN = 0.000000000000000E+000, - WHITECAPWIDTH = 0.300000011920929, - FXINCUT = 0.000000000000000E+000, - FXDSCUT = 0.000000000000000E+000, -/ - -&NUMS - ICOMP = 3 - ! This parameter controls the way how the splitting is done and whether implicit or explicit schemes are used for spectral advection - ! ICOMP = 0 - ! This means that all dimensions are integrated using explicit methods. Similar - ! to WW3, actually the same schemes are available in WW3 4.1. - ! ICOMP = 1 - ! This mean that advection in geographical space is done using implicit - ! Methods, source terms and spectral space are still integrated as done in - ! WW3. - ! 
ICOMP = 2 - ! This means that the advection is done using implicit methods and that the - ! source terms are integrated semi-implicit using Patankar rules and linearized - ! source terms as done in SWAN. Spectral part is still a fractional step - ! ICOMP = 3: fully implicit and no splitting - - AMETHOD = 7 - ! AMETHOD controls the different Methods in geographical space - ! AMETHOD = 0 - ! No Advection in geo. Space - ! AMETHOD = 1 - ! Explicit N-Scheme for ICOMP = 0 and Implicit N-Scheme for ICOMP > 0 - ! AMETHOD = 2 - ! PSI-Scheme for ICOMP = 0 and Implicit - ! Crank-Nicholson N-Scheme for ICOMP > 0 - ! AMETHOD = 3 - ! LFPSI Scheme for ICOMP = 0 and Implicit two time level N2 scheme for ICOMP > 0 - - ! AMETHOD = 4 - ! Like AMETHOD = 1 but using PETSc based on small matrices MNP**2. this can be efficient on small to medium scale cluster up to say 128 Nodes. - - ! AMETHOD = 5 - ! Like AMETHOD = 1 but using PETSc and assembling the full matrix and the source terms at once (MNP * MDC * MSC)**2. number of equations - ! this is for large scale applications - - ! Remark for AMETHOD = 4 and 5. This methods are new and only tested on a few cases where the results look reasonable and do not depend on the number of CPU's which - ! valdiates the correct implementation. The scaling performance is anticipated to be "quite poor" at this time. Many different consituents influence the parallel speedup. - ! Please let me know all the information you have in order to improve and accelarate the developement of implicit parallel WWM-III. - ! Have fun ... Aron and Thomas. - ! AMETHOD = 6 - BCGS Solver - ! AMETHOD = 7 - GAUSS and JACOBI SOLVER - SMETHOD = 1 - ! This switch controls the way the source terms are integrated. 0: no source terms; - ! 1: splitting using RK-3 and SI for fast and slow modes 2: semi-implicit; - ! 3: R-K3 (if ICOMP=0 or 1) - slow; 4: Dynamic Splitting (experimental) - - DMETHOD = 2 - ! This switch controls the numerical method in directional space. - ! DMETHOD = 0 - ! No advection in directional space - ! DMETHOD = 1 - ! Crank-Nicholson (RTHETA = 0.5) or Euler Implicit scheme (RTHETA = 1.0) - ! DMEHOD = 2 - ! Ultimate Quickest as in WW3 (usually best) - ! DMETHOD = 3 - ! RK5-WENO - ! DMETHOD = 4 - ! Explicit FVM Upwind scheme - MELIM = 1 ! Source Term Limiter on/off (1/0) default values = 1 - LITERSPLIT = F ! T: double Strang split; F: simple split (more efficienct). Default: F - - LFILTERTH = F - ! LFILTERTH: use a CFL filter to limit the advection vel. In directional space. This is similar to WW3. - ! Mostly not used. WWMII is always stable. - MAXCFLTH = 1.0 ! Max Cfl in Theta space; used only if LFILTERTH=T - FMETHOD = 1 - ! This switch controls the numerical method used in freq. space - ! = 0 - ! No Advection in spectral space - ! = 1 - ! Ultimate Quickest as in WW3 (best) - LFILTERSIG = F ! Limit the advection velocitiy in freq. space (usually F) - MAXCFLSIG = 1.0 ! Max Cfl in freq. space; used only if LFILTERSIG=T - LDIFR = F ! Use phase decoupled diffraction approximation according to Holthuijsen et al. (2003) (usually T; if crash, use F) - IDIFFR = 1 ! Extended WAE accounting for higher order effects WAE becomes nonlinear; 1: Holthuijsen et al. ; 2: Liau et al. ; 3: Toledo et al. (in preparation) - LCONV = F ! Estimate convergence criterian and write disk (quasi-steady - qstea.out) - LCFL = F ! Write out CFL numbers; use F to save time - NQSITER = 1 ! # of quasi-steady (Q-S) sub-divisions within each WWM time step (trial and errors) - QSCONV1 = 0.98 ! 
Number of grid points [%/100] that have to fulfill abs. wave height criteria EPSH1 - QSCONV2 = 0.98 ! Number of grid points [%/100] that have to fulfill rel. wave height criteria EPSH2 - QSCONV3 = 0.98 ! Number of grid points [%/100] that have to fulfill sum. rel. wave action criteria EPSH3 - QSCONV4 = 0.98 ! Number of grid points [%/100] that have to fulfill rel. avg. wave steepness criteria EPSH4 - QSCONV5 = 0.98 ! Number of grid points [%/100] that have to fulfill avg. rel. wave period criteria EPSH5 - - LEXPIMP = F ! Use implicit schemes for freq. lower than given below by FREQEXP; used only if ICOMP=0 - FREQEXP = 0.1 ! Minimum frequency for explicit schemes; only used if LEXPIMP=T and ICOMP=0 - EPSH1 = 0.01 ! Convergence criteria for rel. wave height ! EPSH1 < CONVK1 = REAL(ABS(HSOLD(IP)-HS2)/HS2) - EPSH2 = 0.01 ! Convergence criteria for abs. wave height ! EPSH2 < CONVK2 = REAL(ABS(HS2-HSOLD(IP))) - EPSH3 = 0.01 ! Convergence criteria for the rel. sum of wave action ! EPSH3 < CONVK3 = REAL(ABS(SUMACOLD(IP)-SUMAC)/SUMAC) - EPSH4 = 0.01 ! Convergence criteria for the rel. avg. wave steepness criteria ! EPSH4 < CONVK4 = REAL(ABS(KHS2-KHSOLD(IP))/KHSOLD(IP)) - EPSH5 = 0.01 ! Convergence criteria for the rel. avg. waveperiod ! EPSH5 < REAL(ABS(TM02-TM02OLD(IP))/TM02OLD(IP)) - LVECTOR = F ! Use optmized propagation routines for large high performance computers e.g. at least more than 128 CPU. Try LVECTOR=F first. - IVECTOR = 2 ! USed if LVECTOR=T; Different flavours of communications - ! LVECTOR = 1; same propagation style as if LVECTOR = F, this is for testing and development - ! LVECTOR = 2; all spectral bins are propagated with the same time step and communications is done only once per sub-iteration - ! LVECTOR = 3; all directions with the same freq. are propgated using the same time step the communications is done for each freq. - ! LVECTOR = 4; 2 but for mixed open-mpi, code has to be compiled with -openmp - ! LVECTOR = 5; 3 but for mixed open-mpi, code has to be compiled with -openmp - ! LVECTOR = 6; same as 2 but highly optmizied with respect to memory usage, of course it is must less efficient than 2 - ! remarks: if you are using this routines be aware that the memory amount that is used is approx. for LVECTOR 1-5 arround - ! 24 * MSC * MDC * MNP, so if you are trying this on 1 CPU you get a segmentation fault if your system has not enough memory or - ! if your system is not properly configured it may results into the fact that your computer starts blocking since it try's to swap to disk - ! The total amount of memoery used per CPU = 24 * MSC * MDC * MNP / No.CPU - LADVTEST = F ! for testing the advection schemes, testcase will be added soon - LCHKCONV = F ! needs to set to .true. for quasi-steady mode. in order to compute the QSCONVi criteria and check them - DTMIN_DYN = 1. ! min. time step (sec?) for dynamic integration, this controls in SMETHOD the smallest time step for the triads, DT = 1.s is found to work well. - NDYNITER = 100, ! max. iteration for dyn. scheme afterwards the limiter is applied in the last step, for SMETHOD .eq. this controls the integration of the triad interaction terms, which is done dynamically. - DTMIN_SIN = 1. ! min. time steps for the full fractional step method, where each source term is integrated with its own fractional step - DTMIN_SNL4 = 1. ! - DTMIN_SDS = 1. ! - DTMIN_SNL3 = 1. ! - DTMIN_SBR = 0.10 ! - DTMIN_SBF = 1.0 ! - NDYNITER_SIN = 10, ! max. iterations for each source term in the fractional step approach. - NDYNITER_SNL4 = 10, ! 
- NDYNITER_SDS = 10, ! - NDYNITER_SBR = 10, ! - NDYNITER_SNL3 = 10, ! - NDYNITER_SBF = 10, ! - ! 1: use PETSC - WAE_SOLVERTHR = 1.e-9, ! Threshold for the Block-Jacobi or Block-Gauss-Seider solver - MAXITER = 500, ! Max. number of iterations - PMIN = 1., ! Max. percentage of non-converged grid points - LNANINFCHK = F, ! Check for NaN and INF; usually turned off for efficiency - LZETA_SETUP = F, ! Compute wave setup (simple momentum eq.) - ZETA_METH = 0, ! Method for wave setup, Mathieu please explain! - LSOUBOUND = F - BLOCK_GAUSS_SEIDEL = T, ! Use the Gauss Seidel on each computer block. The result seems to be faster and use less memory But the # of iterations depends on the number of processors - LNONL = F ! Solve the nonlinear system using simpler algorithm (Patankar) - ASPAR_LOCAL_LEVEL = 0 ! Aspar locality level (0-10; check with your system) - L_SOLVER_NORM = F ! Compute solver norm ||A*x-b|| as termination - ! check of jacobi-Gauss-Seidel solver. Will increas cost if T - LACCEL = F -/ - - -! output of statistical variables over the whole domain at specified times. -&HISTORY - BEGTC = '20180830.000000' ! Start output time, yyyymmdd. hhmmss; - ! must fit the simulation time otherwise no output. - ! Default is same as PROC%BEGTC - DELTC = 1 ! Time step for output; if smaller than simulation time step, the latter is used (output every step for better 1D 2D spectra analysis) - UNITC = 'SEC' ! Unit - ENDTC = '20181030.000000' ! Stop time output, yyyymmdd. hhmmss - ! Default is same as PROC%ENDC - DEFINETC = 86400 ! Time scoop (sec) for history files - ! If unset or set to a negative value - ! then only one file is generated - ! otherwise, for example for 86400 - ! daily output files are created. - OUTSTYLE = 'NO' ! output option - use 'NO' for no output - ! 'NC' for netcdf output - ! 'XFN' for XFN output (default) - ! 'SHP' for DARKO SHP output - MULTIPLEOUT = 0 ! 0: output in a single netcdf file - ! MPI_reduce is used (default) - ! 1: output in separate netcdf files - ! each associated with one process - USE_SINGLE_OUT = T ! T: Use single precision in the - ! output of model variables (default) - PARAMWRITE = T ! T: Write the physical parametrization - ! and chosen numerical method - ! in the netcdf file (default T) - GRIDWRITE = T ! T/F: Write the grid in the netcdf history file (default T) - PRINTMMA = F ! T/F: Print minimum, maximum and average - ! value of statistics during runtime - ! (Default F) - ! (Requires a MPI_REDUCE) - FILEOUT = 'wwm_hist.dat' - ! Below is selection for all variables. Default is F for all variables. - HS = F ! significant wave height - TM01 = F ! mean period - TM02 = F ! zero-crossing mean period - KLM = F ! mean wave number - WLM = F ! mean wave length - ETOTC = F ! Variable ETOTC - ETOTS = F ! Variable ETOTS - DM = F ! mean wave direction - DSPR = F ! directional spreading - TPPD = F ! direaction of the peak ... check source code - TPP = F ! peak period - CPP = F ! peak phase vel. - WNPP = F ! peak wave number - CGPP = F ! peak group speed - KPP = F ! peak wave number - LPP = F ! peak wave length - PEAKD = F ! peak direction - PEAKDSPR = F ! peak directional spreading - DPEAK = F ! peak direction - UBOT = F ! bottom exc. vel. - ORBITAL = F ! bottom orbital vel. - BOTEXPER = F ! bottom exc. - TMBOT = F ! bottom period - URSELL = F ! Ursell number - UFRIC = F ! air friction velocity - Z0 = F ! air roughness length - ALPHA_CH = F ! Charnoch coefficient for air - WINDX = F ! Wind in X direction - WINDY = F ! Wind in Y direction - CD = F ! 
Drag coefficient - CURRTX = F ! current in X direction - CURRTY = F ! current in Y direction - WATLEV = F ! water level - WATLEVOLD = F ! water level at previous time step - DEPDT = F ! change of water level in time - DEP = F ! depth - TAUW = F ! surface stress from the wave - TAUHF = F ! high frequency surface stress - TAUTOT = F ! total surface stress - STOKESSURFX = F ! Surface Stokes drift in X direction - STOKESSURFY = F ! Surface Stokes drift in X direction - STOKESBAROX = F ! Barotropic Stokes drift in X direction - STOKESBAROY = F ! Barotropic Stokes drift in Y direction - RSXX = F ! RSXX potential of LH - RSXY = F ! RSXY potential of LH - RSYY = F ! RSYY potential of LH - CFL1 = F ! CFL number 1 - CFL2 = F ! CFL number 2 - CFL3 = F ! CFL number 3 -/ - -&STATION - BEGTC = '20180830.000000' ! Start simulation time, yyyymmdd. hhmmss; must fit the simulation time otherwise no output - ! Default is same as PROC%BEGTC - DELTC = 600 ! Time step for output; if smaller than simulation time step, the latter is used (output every step for better 1D 2D spectra analysis) - UNITC = 'SEC' ! Unit - ENDTC = '20181030.000000' ! Stop time simulation, yyyymmdd. hhmmss - ! Default is same as PROC%ENDC - DEFINETC = 86400 ! Time for definition of station files - ! If unset or set to a negative value - ! then only one file is generated - ! otherwise, for example for 86400 - ! daily output files are created. - OUTSTYLE = 'NO' ! output option - ! 'NO' no output - ! 'STE' classic station output (default) - ! 'NC' for netcdf output - MULTIPLEOUT = 0 ! 0: output in a single netcdf file - ! MPI_reduce is used (default) - ! 1: output in separate netcdf files - ! each associated with one process - USE_SINGLE_OUT = T ! T: Use single precision in the - ! output of model variables (default) - PARAMWRITE = T ! T: Write the physical parametrization - ! and chosen numerical method - ! in the netcdf file (default T) - FILEOUT = 'wwm_sta.dat' !not used - LOUTITER = F - IOUTS = 15, - NOUTS = P-1, P-2, P-3, P-4, P-5, P-6, P-7, P-8, P-9, P-10, P-11, P-12, P-13, P-14, P-15 - XOUTS = -76.0460000000000 , -76.7780000000000 , -75.8100000000000 , -75.7200000000000 , -74.8420000000000 , - -74.7030000000000 , -75.3300000000000 , -72.6310000000000 , -74.8350000000000 , -69.2480000000000 , - -72.6000000000000 - YOUTS = 39.152, 38.556, 38.033, 37.551, - 36.9740000000000 , 37.2040000000000 , 37.0230000000000 , 36.9150000000000 , 36.6110000000000 , - 38.4610000000000 , 35.7500000000000 , 34.5610000000000 , 31.8620000000000 , 40.5030000000000 , - 39.5840000000000 - CUTOFF = 15*0.44 ! cutoff freq (Hz) for each station - consistent with buoys - LSP1D = T ! 1D spectral station output - LSP2D = F ! 2D spectral station output - LSIGMAX = T ! Adjust the cut-freq. for the output (e.g. consistent with buoy cut-off freq.) - AC = F ! spectrum - WK = F ! variable WK - ACOUT_1D = F ! variable ACOUT_1D - ACOUT_2D = F ! variable ACOUT_2D - HS = F ! significant wave height - TM01 = F ! mean period - TM02 = F ! zero-crossing mean period - KLM = F ! mean wave number - WLM = F ! mean wave length - ETOTC = F ! Variable ETOTC - ETOTS = F ! Variable ETOTS - DM = F ! mean wave direction - DSPR = F ! directional spreading - TPPD = F ! Discrete Peak Period - TPP = F ! Peak Period - CPP = F - WNPP = F ! peak wave number - CGPP = F ! peak group speed - KPP = F ! peak wave number - LPP = F ! peak - PEAKD = F ! peak direction - PEAKDSPR = F ! peak directional spreading - DPEAK = F - UBOT = F - ORBITAL = F - BOTEXPER = F - TMBOT = F - URSELL = F ! 
Ursell number - UFRIC = F ! air friction velocity - Z0 = F ! air roughness length - ALPHA_CH = F ! Charnoch coefficient for air - WINDX = F ! Wind in X direction - WINDY = F ! Wind in Y direction - CD = F ! Drag coefficient - CURRTX = F ! current in X direction - CURRTY = F ! current in Y direction - WATLEV = F ! water level - WATLEVOLD = F ! water level at previous time step - DEPDT = F ! change of water level in time - DEP = F ! depth - TAUW = F ! surface stress from the wave - TAUHF = F ! high frequency surface stress - TAUTOT = F ! total surface stress - STOKESSURFX = F ! Surface Stokes drift in X direction - STOKESSURFY = F ! Surface Stokes drift in X direction - STOKESBAROX = F ! Barotropic Stokes drift in X direction - STOKESBAROY = F ! Barotropic Stokes drift in Y direction - RSXX = F ! RSXX potential of LH - RSXY = F ! RSXY potential of LH - RSYY = F ! RSYY potential of LH - CFL1 = F ! CFL number 1 - CFL2 = F ! CFL number 2 - CFL3 = F ! CFL number 3 -/ - -&HOTFILE - LHOTF = F ! Write hotfile - FILEHOT_OUT = 'wwm_hot_out' !'.nc' suffix will be added - BEGTC = '20030908.000000' !Starting time of hotfile writing. With ihot!=0 in SCHISM, - !this will be whatever the new hotstarted time is (even with ihot=2) - DELTC = 86400. ! time between hotfile writes - UNITC = 'SEC' ! unit used above - ENDTC = '20031008.000000' ! Ending time of hotfile writing (adjust with BEGTC) - LCYCLEHOT = T ! Applies only to netcdf - ! If T then hotfile contains 2 last records. - ! If F then hotfile contains N record if N outputs - ! have been done - ! For binary only one record. - HOTSTYLE_OUT = 2 ! 1: binary hotfile of data as output - ! 2: netcdf hotfile of data as output (default) - MULTIPLEOUT = 0 ! 0: hotfile in a single file (binary or netcdf) - ! MPI_REDUCE is then used and thus you'd avoid too freq. output - ! 1: hotfiles in separate files, each associated - ! with one process - FILEHOT_IN = 'wwm_hot_in.nc' ! (Full) hot file name for input - HOTSTYLE_IN = 2 ! 1: binary hotfile of data as input - ! 2: netcdf hotfile of data as input (default) - IHOTPOS_IN = 1 ! Position in hotfile (only for netcdf) - ! for reading - MULTIPLEIN = 0 ! 0: read hotfile from one single file - ! 1: read hotfile from multiple files (must use same # of CPU?) -/ - -&NESTING - L_NESTING = F, ! whether to produce nesting data or not - L_HOTFILE = F ! whether to produce an hotfile as output - L_BOUC_PARAM = F ! whether to produce a parametric boundary condition to be used by the nested grids - L_BOUC_SPEC = F ! whether to produce a spectral boundary condition to be used by the nested grids - NB_GRID_NEST = 0 ! number of nested grids. All lines below must contain NB_GRID_NEST entries. -! ListIGRIDTYPE = ! list of integers giving the type of nested grid -! ListFILEGRID = ! list of strings for the grid file names. -! ListFILEBOUND = ! list of boundary file names to be used -! ListBEGTC = ! list of beginning time of the runs (used for hotfile and boundary) -! ListDELTC = ! list of DELTC of the boundary output -! ListUNITC = ! list of UNITS of the boundary output -! ListENDTC = ! list of ENDTC of the boundary output -! ListPrefix = ! list of prefix used for the output variable -/ - -! only used with AMETHOD 4 or 5 -&PETScOptions - ! Summary of Sparse Linear Solvers Available from PETSc: http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html - KSPTYPE = 'LGMRES' - ! This parameter controls which solver is used. This is the same as petsc command line parameter -ksp_type. - ! KSPTYPE = 'GMRES' - ! 
Implements the Generalized Minimal Residual method. (Saad and Schultz, 1986) with restart - ! KSPTYPE = 'LGMRES' - ! Augments the standard GMRES approximation space with approximations to the error from previous restart cycles. - ! KSPTYPE = 'DGMRES' - ! In this implementation, the adaptive strategy allows to switch to the deflated GMRES when the stagnation occurs. - ! KSPTYPE = 'PGMRES' - ! Implements the Pipelined Generalized Minimal Residual method. Only PETSc 3.3 - ! KSPTYPE = 'KSPBCGSL' - ! Implements a slight variant of the Enhanced BiCGStab(L) algorithm - - RTOL = 1.E-20 ! the relative convergence tolerance (relative decrease in the residual norm) - ABSTOL = 1.E-20 ! the absolute convergence tolerance (absolute size of the residual norm) - DTOL = 10000. ! the divergence tolerance - MAXITS = 1000 ! maximum number of iterations to use - - INITIALGUESSNONZERO = F ! Tells the iterative solver that the initial guess is nonzero; otherwise KSP assumes the initial guess is to be zero - GMRESPREALLOCATE = T ! Causes GMRES and FGMRES to preallocate all its needed work vectors at initial setup rather than the default, which is to allocate them in chunks when needed. - - - PCTYPE = 'SOR' - ! This parameter controls which preconditioner is used. This is the same as petsc command line parameter -pc_type - ! PCTYPE = 'SOR' - ! (S)SOR (successive over relaxation, Gauss-Seidel) preconditioning - ! PCTYPE = 'ASM' - ! Use the (restricted) additive Schwarz method, each block is (approximately) solved with its own KSP object. - ! PCTYPE = 'HYPRE' - ! Allows you to use the matrix element based preconditioners in the LLNL package hypre - ! PCTYPE = 'SPAI' - ! Use the Sparse Approximate Inverse method of Grote and Barnard as a preconditioner - ! PCTYPE = 'NONE' - ! This is used when you wish to employ a nonpreconditioned Krylov method. 
-/ - - diff --git a/singularity/prep/prep.def b/singularity/prep/prep.def deleted file mode 100644 index 8757545..0000000 --- a/singularity/prep/prep.def +++ /dev/null @@ -1,50 +0,0 @@ -BootStrap: docker -#From: centos:centos7.8.2003 -From: continuumio/miniconda3:23.5.2-0-alpine - -%files - environment.yml - files/*.py /scripts/ - files/refs/* /refs/ - -%environment - export PYTHONPATH=/scripts - -%post - ENV_NAME=prep - - apk update && apk upgrade && apk add \ - git \ - libarchive - - conda install mamba -n base -c conda-forge - mamba update --name base --channel defaults conda - mamba env create -n $ENV_NAME --file /environment.yml - - conda run -n $ENV_NAME --no-capture-output \ - pip install "pyschism>=0.1.15" - conda run -n $ENV_NAME --no-capture-output \ - pip install "coupledmodeldriver>=1.6.6" - conda run -n $ENV_NAME --no-capture-output \ - pip install "ensembleperturbation>=1.1.2" - conda run -n $ENV_NAME --no-capture-output \ - pip uninstall -y pygeos geopandas # We use shapely 2 - - mamba install -y -n $ENV_NAME -cconda-forge \ - --force-reinstall geopandas geopandas-base - - git clone https://github.com/schism-dev/schism - cp -v schism/src/Utility/Pre-Processing/STOFS-3D-Atl-shadow-VIMS/Pre_processing/Source_sink/Relocate/relocate_source_feeder.py /scripts - cp -v schism/src/Utility/Pre-Processing/STOFS-3D-Atl-shadow-VIMS/Pre_processing/Source_sink/feeder_heads_bases_v2.1.xy /refs -# cp -v schism/src/Utility/Pre-Processing/STOFS-3D-Atl-shadow-VIMS/Pre_processing/Source_sink/relocate_florence.reg /refs - rm -rfv schism - - mamba clean --all --yes && apk del git - - -%runscript - conda run -n prep --no-capture-output python -m $* - - -%labels - Author "Soroosh Mani" diff --git a/singularity/scripts/build.sh b/singularity/scripts/build.sh deleted file mode 100755 index 43bebd7..0000000 --- a/singularity/scripts/build.sh +++ /dev/null @@ -1,9 +0,0 @@ -L_DEF_DIR=~/sandbox/ondemand-storm-workflow/singularity/ -L_IMG_DIR=/lustre/imgs - -mkdir -p $L_IMG_DIR -for i in prep; do - pushd $L_DEF_DIR/$i/ - sudo singularity build $L_IMG_DIR/$i.sif $i.def - popd -done diff --git a/singularity/scripts/combine_gr3.exp b/singularity/scripts/combine_gr3.exp deleted file mode 100755 index ac4b0b3..0000000 --- a/singularity/scripts/combine_gr3.exp +++ /dev/null @@ -1,52 +0,0 @@ -#!/bin/expect -f -# -# This Expect script was generated by autoexpect on Tue Dec 21 16:42:59 2021 -# Expect and autoexpect were both written by Don Libes, NIST. -# -# Note that autoexpect does not guarantee a working script. It -# necessarily has to guess about certain things. Two reasons a script -# might fail are: -# -# 1) timing - A surprising number of programs (rn, ksh, zsh, telnet, -# etc.) and devices discard or ignore keystrokes that arrive "too -# quickly" after prompts. If you find your new script hanging up at -# one spot, try adding a short sleep just before the previous send. -# Setting "force_conservative" to 1 (see below) makes Expect do this -# automatically - pausing briefly before sending each character. This -# pacifies every program I know of. The -c flag makes the script do -# this in the first place. The -C flag allows you to define a -# character to toggle this mode off and on. 
- -set force_conservative 0 ;# set to 1 to force conservative mode even if - ;# script wasn't run conservatively originally -if {$force_conservative} { - set send_slow {1 .1} - proc send {ignore arg} { - sleep .1 - exp_send -s -- $arg - } -} - -# -# 2) differing output - Some programs produce different output each time -# they run. The "date" command is an obvious example. Another is -# ftp, if it produces throughput statistics at the end of a file -# transfer. If this causes a problem, delete these patterns or replace -# them with wildcards. An alternative is to use the -p flag (for -# "prompt") which makes Expect only look for the last line of output -# (i.e., the prompt). The -P flag allows you to define a character to -# toggle this mode off and on. -# -# Read the man page for more info. -# -# -Don - - -set timeout -1 -spawn combine_gr3 -match_max 100000 -expect -exact " Input file name (e.g.: maxelev):\r" -send -- "[lindex $argv 0]\r" -expect -exact " Input # of scalar fields:\r" -send -- "[lindex $argv 1]\r" -expect eof diff --git a/singularity/scripts/input.conf b/singularity/scripts/input.conf deleted file mode 100644 index d4d7d33..0000000 --- a/singularity/scripts/input.conf +++ /dev/null @@ -1,33 +0,0 @@ -# Parameters -storm=$1 -year=$2 -subset_mesh=1 -# Other params -hr_prelandfall=-1 -past_forecast=1 -hydrology=1 -use_wwm=0 -pahm_model='symmetric' -num_perturb=2 -sample_rule='korobov' -spinup_exec='pschism_PAHM_TVD-VL' -hotstart_exec='pschism_PAHM_TVD-VL' - -# Paths as local variables -L_NWM_DATASET=/lustre/static_data/nwm/NWM_v2.0_channel_hydrofabric/nwm_v2_0_hydrofabric.gdb -L_TPXO_DATASET=/lustre/static_data/tpxo -L_LEADTIMES_DATASET=/lustre/static_data/leadtimes.json -L_DEM_HI=/lustre/static_data/dem/ncei19/*.tif -L_DEM_LO=/lustre/static_data/dem/gebco/*.tif -L_MESH_HI=/lustre/static_data/grid/stofs3d_atl_v2.1_eval.gr3 -L_MESH_LO=/lustre/static_data/grid/WNAT_1km.14 -L_SHP_DIR=/lustre/static_data/shape -L_IMG_DIR=/lustre/imgs -L_SCRIPT_DIR=~/sandbox/ondemand-storm-workflow/singularity/scripts - -# Environment -export SINGULARITY_BINDFLAGS="--bind /lustre" -export TMPDIR=/lustre/.tmp # redirect OCSMESH temp files - -# Modules -L_SOLVE_MODULES="openmpi/4.1.2" diff --git a/singularity/scripts/mesh.sbatch b/singularity/scripts/mesh.sbatch deleted file mode 100755 index 0ee6ee8..0000000 --- a/singularity/scripts/mesh.sbatch +++ /dev/null @@ -1,8 +0,0 @@ -#!/bin/bash -#SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 - -set -ex - -singularity run ${SINGULARITY_BINDFLAGS} ${IMG} ${STORM} ${YEAR} ${MESH_KWDS} diff --git a/singularity/scripts/prep.sbatch b/singularity/scripts/prep.sbatch deleted file mode 100644 index f892c92..0000000 --- a/singularity/scripts/prep.sbatch +++ /dev/null @@ -1,8 +0,0 @@ -#!/bin/bash -#SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 - -set -ex - -singularity run ${SINGULARITY_BINDFLAGS} ${IMG} ${PREP_KWDS} ${STORM} ${YEAR} diff --git a/singularity/scripts/schism.sbatch b/singularity/scripts/schism.sbatch deleted file mode 100755 index c8f09fb..0000000 --- a/singularity/scripts/schism.sbatch +++ /dev/null @@ -1,42 +0,0 @@ -#!/bin/bash -#SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 -#SBATCH --nodes=3 -#SBATCH --ntasks-per-node=36 - -module load $MODULES - -export MV2_ENABLE_AFFINITY=0 -ulimit -s unlimited - -set -ex - -pushd ${SCHISM_DIR} -mkdir -p outputs -mpirun -np 36 singularity exec ${SINGULARITY_BINDFLAGS} ${IMG} \ - ${SCHISM_EXEC} 4 - -if [ $? -eq 0 ]; then - echo "Combining outputs..." 
- date - pushd outputs - if ls hotstart* >/dev/null 2>&1; then - times=$(ls hotstart_* | grep -o "hotstart[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) - for i in $times; do - singularity exec ${SINGULARITY_BINDFLAGS} ${IMG} \ - combine_hotstart7 --iteration $i - done - fi - popd - - singularity exec ${SINGULARITY_BINDFLAGS} ${IMG} \ - expect -f /scripts/combine_gr3.exp maxelev 1 - singularity exec ${SINGULARITY_BINDFLAGS} ${IMG} \ - expect -f /scripts/combine_gr3.exp maxdahv 3 - mv maxdahv.gr3 maxelev.gr3 -t outputs -fi - - -echo "Done" -date diff --git a/singularity/solve/files/combine_gr3.exp b/singularity/solve/files/combine_gr3.exp deleted file mode 100755 index ac4b0b3..0000000 --- a/singularity/solve/files/combine_gr3.exp +++ /dev/null @@ -1,52 +0,0 @@ -#!/bin/expect -f -# -# This Expect script was generated by autoexpect on Tue Dec 21 16:42:59 2021 -# Expect and autoexpect were both written by Don Libes, NIST. -# -# Note that autoexpect does not guarantee a working script. It -# necessarily has to guess about certain things. Two reasons a script -# might fail are: -# -# 1) timing - A surprising number of programs (rn, ksh, zsh, telnet, -# etc.) and devices discard or ignore keystrokes that arrive "too -# quickly" after prompts. If you find your new script hanging up at -# one spot, try adding a short sleep just before the previous send. -# Setting "force_conservative" to 1 (see below) makes Expect do this -# automatically - pausing briefly before sending each character. This -# pacifies every program I know of. The -c flag makes the script do -# this in the first place. The -C flag allows you to define a -# character to toggle this mode off and on. - -set force_conservative 0 ;# set to 1 to force conservative mode even if - ;# script wasn't run conservatively originally -if {$force_conservative} { - set send_slow {1 .1} - proc send {ignore arg} { - sleep .1 - exp_send -s -- $arg - } -} - -# -# 2) differing output - Some programs produce different output each time -# they run. The "date" command is an obvious example. Another is -# ftp, if it produces throughput statistics at the end of a file -# transfer. If this causes a problem, delete these patterns or replace -# them with wildcards. An alternative is to use the -p flag (for -# "prompt") which makes Expect only look for the last line of output -# (i.e., the prompt). The -P flag allows you to define a character to -# toggle this mode off and on. -# -# Read the man page for more info. 
-# -# -Don - - -set timeout -1 -spawn combine_gr3 -match_max 100000 -expect -exact " Input file name (e.g.: maxelev):\r" -send -- "[lindex $argv 0]\r" -expect -exact " Input # of scalar fields:\r" -send -- "[lindex $argv 1]\r" -expect eof diff --git a/singularity/solve/solve.def b/singularity/solve/solve.def deleted file mode 100644 index 6fe9773..0000000 --- a/singularity/solve/solve.def +++ /dev/null @@ -1,81 +0,0 @@ -BootStrap: docker -#From: centos:centos7.8.2003 -From: ubuntu:22.10 - -%files - files/entrypoint.sh /scripts/ - files/combine_gr3.exp /scripts/ - - -%post - apt-get update && apt-get upgrade -y && apt-get install -y \ - git \ - gcc \ - g++ \ - gfortran \ - make \ - cmake \ - openmpi-bin libopenmpi-dev \ - libhdf5-dev \ - libnetcdf-dev libnetcdf-mpi-dev libnetcdff-dev \ - python3 \ - python-is-python3 - - - # Install SCHISM - git clone https://github.com/SorooshMani-NOAA/schism.git - git -C schism checkout a0817a8 - mkdir -p schism/build - PREV_PWD=$PWD - cd schism/build - cmake ../src/ \ - -DCMAKE_Fortran_COMPILER=mpifort \ - -DCMAKE_C_COMPILER=mpicc \ - -DNetCDF_Fortran_LIBRARY=$(nc-config --libdir)/libnetcdff.so \ - -DNetCDF_C_LIBRARY=$(nc-config --libdir)/libnetcdf.so \ - -DNetCDF_INCLUDE_DIR=$(nc-config --includedir) \ - -DUSE_PAHM=TRUE \ - -DCMAKE_Fortran_FLAGS_RELEASE="-O2 -ffree-line-length-none -fallow-argument-mismatch" - make -j8 - mv bin/* -t /usr/bin/ - rm -rf * - cmake ../src/ \ - -DCMAKE_Fortran_COMPILER=mpifort \ - -DCMAKE_C_COMPILER=mpicc \ - -DNetCDF_Fortran_LIBRARY=$(nc-config --libdir)/libnetcdff.so \ - -DNetCDF_C_LIBRARY=$(nc-config --libdir)/libnetcdf.so \ - -DNetCDF_INCLUDE_DIR=$(nc-config --includedir) \ - -DUSE_PAHM=TRUE \ - -DUSE_WWM=TRUE \ - -DCMAKE_Fortran_FLAGS_RELEASE="-O2 -ffree-line-length-none -fallow-argument-mismatch" - make -j8 - mv bin/* -t /usr/bin/ - cd ${PREV_PWD} - rm -rf schism - - - apt-get remove -y git - apt-get remove -y gcc - apt-get remove -y g++ - apt-get remove -y gfortran - apt-get remove -y make - apt-get remove -y cmake - apt-get remove -y python3 - apt-get remove -y python-is-python3 - apt-get remove -y libopenmpi-dev - apt-get remove -y libhdf5-dev - apt-get remove -y libnetcdf-dev libnetcdf-mpi-dev libnetcdff-dev - - apt-get install -y libnetcdf-c++4-1 libnetcdf-c++4 libnetcdf-mpi-19 libnetcdf19 libnetcdff7 netcdf-bin - apt-get install -y libhdf5-103-1 libhdf5-cpp-103-1 libhdf5-openmpi-103-1 - apt-get install -y libopenmpi3 - DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata - apt-get install -y expect - - apt-get clean autoclean - apt-get autoremove --yes -# rm -rf /var/lib/{apt,dpkg,cache,log}/ - - -%labels - Author "Soroosh Mani" diff --git a/docker/post/docker/__init__.py b/stormworkflow/__init__.py similarity index 100% rename from docker/post/docker/__init__.py rename to stormworkflow/__init__.py diff --git a/stormworkflow/main.py b/stormworkflow/main.py new file mode 100644 index 0000000..e649f6c --- /dev/null +++ b/stormworkflow/main.py @@ -0,0 +1,62 @@ +import subprocess +import logging +import os +import shlex +from importlib.resources import files +from argparse import ArgumentParser +from pathlib import Path + +import stormworkflow +import yaml +try: + from yaml import CLoader as Loader, CDumper as Dumper +except ImportError: + from yaml import Loader, Dumper + + +_logger = logging.getLogger(__file__) + +def main(): + + parser = ArgumentParser() + parser.add_argument('configuration', type=Path) + args = parser.parse_args() + + scripts = files('stormworkflow.scripts') + slurm = 
files('stormworkflow.slurm') + refs = files('stormworkflow.refs') + + infile = args.configuration + if infile is None: + _logger.warn('No input configuration provided, using reference file!') + infile = refs.joinpath('input.yaml') + + with open(infile, 'r') as yfile: + conf = yaml.load(yfile, Loader=Loader) + + wf = scripts.joinpath('workflow.sh') + + run_env = os.environ.copy() + run_env['L_SCRIPT_DIR'] = slurm.joinpath('.') + for k, v in conf.items(): + if isinstance(v, list): + v = shlex.join(v) + run_env[k] = str(v) + + ps = subprocess.run( + [wf, infile], + env=run_env, + shell=False, +# check=True, + capture_output=False, + ) + + if ps.returncode != 0: + _logger.error(ps.stderr) + + _logger.info(ps.stdout) + + +if __name__ == '__main__': + + main() diff --git a/stormworkflow/post/ROC_single_run.py b/stormworkflow/post/ROC_single_run.py new file mode 100644 index 0000000..26dbd2c --- /dev/null +++ b/stormworkflow/post/ROC_single_run.py @@ -0,0 +1,300 @@ +import argparse +import logging +import os +import warnings +import numpy as np +import pandas as pd +import xarray as xr +import scipy as sp +import matplotlib.pyplot as plt +from pathlib import Path +from cartopy.feature import NaturalEarthFeature + +os.environ['USE_PYGEOS'] = '0' +import geopandas as gpd + +pd.options.mode.copy_on_write = True + + +def stack_station_coordinates(x, y): + """ + Create numpy.column_stack based on + coordinates of observation points + """ + coord_combined = np.column_stack([x, y]) + return coord_combined + + +def create_search_tree(longitude, latitude): + """ + Create scipy.spatial.CKDTree based on Lat. and Long. + """ + long_lat = np.column_stack((longitude.T.ravel(), latitude.T.ravel())) + tree = sp.spatial.cKDTree(long_lat) + return tree + + +def find_nearby_prediction(ds, variable, indices): + """ + Reads netcdf file, target variable, and indices + Returns max value among corresponding indices for each point + """ + obs_count = indices.shape[0] # total number of search/observation points + max_prediction_index = len(ds.node.values) # total number of nodes + + prediction_prob = np.zeros(obs_count) # assuming all are dry (probability of zero) + + for obs_point in range(obs_count): + idx_arr = np.delete( + indices[obs_point], np.where(indices[obs_point] == max_prediction_index)[0] + ) # len is length of surrogate model array + val_arr = ds[variable].values[idx_arr] + val_arr = np.nan_to_num(val_arr) # replace nan with zero (dry node) + + # # Pick the nearest non-zero probability (option #1) + # for val in val_arr: + # if val > 0.0: + # prediction_prob[obs_point] = round(val,4) #round to 0.1 mm + # break + + # pick the largest value (option #2) + if val_arr.size > 0: + prediction_prob[obs_point] = val_arr.max() + return prediction_prob + + +def plot_probabilities(df, prob_column, gdf_countries, title, save_name): + """ + plot probabilities of exceeding given threshold at obs. points + """ + figure, axis = plt.subplots(1, 1) + figure.set_size_inches(10, 10 / 1.6) + + plt.scatter(x=df.Longitude, y=df.Latitude, vmin=0, vmax=1.0, c=df[prob_column]) + xlim = axis.get_xlim() + ylim = axis.get_ylim() + + gdf_countries.plot(color='lightgrey', ax=axis, zorder=-5) + + axis.set_xlim(xlim) + axis.set_ylim(ylim) + plt.colorbar(shrink=0.75) + plt.title(title) + plt.savefig(save_name) + plt.close() + + +def calculate_hit_miss(df, obs_column, prob_column, threshold, probability): + """ + Reads dataframe with two columns for obs_elev, and probabilities + returns hit/miss/... 
based on user-defined threshold & probability + """ + hit = len(df[(df[obs_column] >= threshold) & (df[prob_column] >= probability)]) + miss = len(df[(df[obs_column] >= threshold) & (df[prob_column] < probability)]) + false_alarm = len(df[(df[obs_column] < threshold) & (df[prob_column] >= probability)]) + correct_neg = len(df[(df[obs_column] < threshold) & (df[prob_column] < probability)]) + + return hit, miss, false_alarm, correct_neg + + +def calculate_POD_FAR(hit, miss, false_alarm, correct_neg): + """ + Reads hit, miss, false_alarm, and correct_neg + returns POD and FAR + default POD and FAR are np.nan + """ + POD = np.nan + FAR = np.nan + try: + POD = round(hit / (hit + miss), 4) # Probability of Detection + except ZeroDivisionError: + pass + try: + FAR = round(false_alarm / (false_alarm + correct_neg), 4) # False Alarm Rate + except ZeroDivisionError: + pass + return POD, FAR + + +def main(args): + storm_name = args.storm_name.capitalize() + storm_year = args.storm_year + leadtime = args.leadtime + prob_nc_path = Path(args.prob_nc_path) + obs_df_path = Path(args.obs_df_path) + save_dir = args.save_dir + + # *.nc file coordinates + thresholds_ft = [3, 6, 9] # in ft + thresholds_m = [round(i * 0.3048, 4) for i in thresholds_ft] # convert to meter + sources = ['model', 'surrogate'] + probabilities = [0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] + + # attributes of input files + prediction_variable = 'probabilities' + obs_attribute = 'Elev_m_xGEOID20b' + + # search criteria + max_distance = 1000 # [in meters] to set distance_upper_bound + max_neighbors = 10 # to set k + + blank_arr = np.empty((len(thresholds_ft), 1, 1, len(sources), len(probabilities))) + blank_arr[:] = np.nan + + hit_arr = blank_arr.copy() + miss_arr = blank_arr.copy() + false_alarm_arr = blank_arr.copy() + correct_neg_arr = blank_arr.copy() + POD_arr = blank_arr.copy() + FAR_arr = blank_arr.copy() + + # Load obs file, extract storm obs points and coordinates + df_obs = pd.read_csv(obs_df_path) + Event_name = f'{storm_name}_{storm_year}' + df_obs_storm = df_obs[df_obs.Event == Event_name] + obs_coordinates = stack_station_coordinates( + df_obs_storm.Longitude.values, df_obs_storm.Latitude.values + ) + + # Load probabilities.nc file + ds_prob = xr.open_dataset(prob_nc_path) + + gdf_countries = gpd.GeoSeries( + NaturalEarthFeature(category='physical', scale='10m', name='land',).geometries(), + crs=4326, + ) + + # Loop through thresholds and sources and find corresponding values from probabilities.nc + threshold_count = -1 + for threshold in thresholds_m: + threshold_count += 1 + source_count = -1 + for source in sources: + source_count += 1 + ds_temp = ds_prob.sel(level=threshold, source=source) + tree = create_search_tree(ds_temp.x.values, ds_temp.y.values) + dist, indices = tree.query( + obs_coordinates, k=max_neighbors, distance_upper_bound=max_distance * 1e-5 + ) # 0.01 is equivalent to 1000 m + prediction_prob = find_nearby_prediction( + ds=ds_temp, variable=prediction_variable, indices=indices + ) + df_obs_storm[f'{source}_prob'] = prediction_prob + + # Plot probabilities at obs. points + plot_probabilities( + df_obs_storm, + f'{source}_prob', + gdf_countries, + f'Probability of {source} exceeding {thresholds_ft[threshold_count]} ft \n {storm_name}, {storm_year}, {leadtime}-hr leadtime', + os.path.join( + save_dir, + f'prob_{source}_above_{thresholds_ft[threshold_count]}ft_{storm_name}_{storm_year}_{leadtime}-hr.png', + ), + ) + + # Loop through probabilities: calculate hit/miss/... 
& POD/FAR + prob_count = -1 + for prob in probabilities: + prob_count += 1 + hit, miss, false_alarm, correct_neg = calculate_hit_miss( + df_obs_storm, obs_attribute, f'{source}_prob', threshold, prob + ) + hit_arr[threshold_count, 0, 0, source_count, prob_count] = hit + miss_arr[threshold_count, 0, 0, source_count, prob_count] = miss + false_alarm_arr[threshold_count, 0, 0, source_count, prob_count] = false_alarm + correct_neg_arr[threshold_count, 0, 0, source_count, prob_count] = correct_neg + + pod, far = calculate_POD_FAR(hit, miss, false_alarm, correct_neg) + POD_arr[threshold_count, 0, 0, source_count, prob_count] = pod + FAR_arr[threshold_count, 0, 0, source_count, prob_count] = far + + ds_ROC = xr.Dataset( + coords=dict( + threshold=thresholds_ft, + storm=[storm_name], + leadtime=[leadtime], + source=sources, + prob=probabilities, + ), + data_vars=dict( + hit=(['threshold', 'storm', 'leadtime', 'source', 'prob'], hit_arr), + miss=(['threshold', 'storm', 'leadtime', 'source', 'prob'], miss_arr), + false_alarm=( + ['threshold', 'storm', 'leadtime', 'source', 'prob'], + false_alarm_arr, + ), + correct_neg=( + ['threshold', 'storm', 'leadtime', 'source', 'prob'], + correct_neg_arr, + ), + POD=(['threshold', 'storm', 'leadtime', 'source', 'prob'], POD_arr), + FAR=(['threshold', 'storm', 'leadtime', 'source', 'prob'], FAR_arr), + ), + ) + ds_ROC.to_netcdf( + os.path.join(save_dir, f'{storm_name}_{storm_year}_{leadtime}hr_leadtime_POD_FAR.nc') + ) + + # plot ROC curves + marker_list = ['s', 'x'] + linestyle_list = ['dashed', 'dotted'] + threshold_count = -1 + for threshold in thresholds_ft: + threshold_count += 1 + fig = plt.figure() + ax = fig.add_subplot(111) + plt.axline( + (0.0, 0.0), (1.0, 1.0), linestyle='--', color='grey', label='random prediction' + ) + source_count = -1 + for source in sources: + source_count += 1 + plt.plot( + FAR_arr[threshold_count, 0, 0, source_count, :], + POD_arr[threshold_count, 0, 0, source_count, :], + label=f'{source}', + marker=marker_list[source_count], + linestyle=linestyle_list[source_count], + markersize=5, + ) + plt.legend() + plt.xlabel('False Alarm Rate') + plt.ylabel('Probability of Detection') + + plt.title( + f'{storm_name}_{storm_year}, {leadtime}-hr leadtime, {threshold} ft threshold' + ) + plt.savefig( + os.path.join( + save_dir, f'ROC_{storm_name}_{leadtime}hr_leadtime_{threshold}_ft.png' + ) + ) + plt.close() + + +def cli(): + parser = argparse.ArgumentParser() + + parser.add_argument('--storm_name', help='name of the storm', type=str) + + parser.add_argument('--storm_year', help='year of the storm', type=int) + + parser.add_argument('--leadtime', help='OFCL track leadtime hr', type=int) + + parser.add_argument('--prob_nc_path', help='path to probabilities.nc', type=str) + + parser.add_argument('--obs_df_path', help='Path to observations dataframe', type=str) + + # optional + parser.add_argument( + '--save_dir', help='directory for saving analysis', default=os.getcwd(), type=str + ) + + main(parser.parse_args()) + + +if __name__ == '__main__': + warnings.filterwarnings('ignore') + # warnings.filterwarnings("ignore", category=DeprecationWarning) + cli() diff --git a/singularity/post/files/Tidal_validation.py b/stormworkflow/post/Tidal_validation.py similarity index 100% rename from singularity/post/files/Tidal_validation.py rename to stormworkflow/post/Tidal_validation.py diff --git a/prefect/workflow/__init__.py b/stormworkflow/post/__init__.py similarity index 100% rename from prefect/workflow/__init__.py rename to 
stormworkflow/post/__init__.py diff --git a/singularity/prep/files/analyze_ensemble.py b/stormworkflow/post/analyze_ensemble.py similarity index 86% rename from singularity/prep/files/analyze_ensemble.py rename to stormworkflow/post/analyze_ensemble.py index 3a3b696..44a78a3 100644 --- a/singularity/prep/files/analyze_ensemble.py +++ b/stormworkflow/post/analyze_ensemble.py @@ -22,6 +22,7 @@ plot_selected_validations, plot_sensitivities, plot_validations, + plot_selected_probability_fields, ) from ensembleperturbation.uncertainty_quantification.karhunen_loeve_expansion import ( karhunen_loeve_expansion, @@ -33,11 +34,14 @@ surrogate_from_karhunen_loeve, surrogate_from_training_set, validations_from_surrogate, + probability_field_from_surrogate, ) from ensembleperturbation.utilities import get_logger -LOGGER = get_logger('klpc_wetonly') +from dask_jobqueue import SLURMCluster +from dask.distributed import Client +LOGGER = get_logger('klpc_wetonly') def main(args): @@ -45,13 +49,12 @@ def main(args): tracks_dir = args.tracks_dir ensemble_dir = args.ensemble_dir - analyze(tracks_dir, ensemble_dir/'analyze') - + analyze(tracks_dir, ensemble_dir / 'analyze') def analyze(tracks_dir, analyze_dir): - mann_coefs = [0.025, 0.05, 0.1] + mann_coefs = [0.025] #[0.025, 0.05, 0.1] for mann_coef in mann_coefs: _analyze(tracks_dir, analyze_dir, mann_coef) @@ -102,6 +105,7 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): make_sensitivities_plot = True make_validation_plot = True make_percentile_plot = True + make_probability_plot = True save_plots = True show_plots = False @@ -109,17 +113,12 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): storm_name = None if log_space: - output_directory = ( - analyze_dir / f'log_k{k_neighbors}_p{idw_order}_n{mann_coef}' - ) + output_directory = analyze_dir / f'log_k{k_neighbors}_p{idw_order}_n{mann_coef}' else: - output_directory = ( - analyze_dir / f'linear_k{k_neighbors}_p{idw_order}_n{mann_coef}' - ) + output_directory = analyze_dir / f'linear_k{k_neighbors}_p{idw_order}_n{mann_coef}' if not output_directory.exists(): output_directory.mkdir(parents=True, exist_ok=True) - subset_filename = output_directory / 'subset.nc' kl_filename = output_directory / 'karhunen_loeve.pkl' kl_surrogate_filename = output_directory / 'kl_surrogate.npy' @@ -128,6 +127,7 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): sensitivities_filename = output_directory / 'sensitivities.nc' validation_filename = output_directory / 'validation.nc' percentile_filename = output_directory / 'percentiles.nc' + probability_filename = output_directory / 'probabilities.nc' filenames = ['perturbations.nc', 'maxele.63.nc'] if storm_name is None: @@ -242,9 +242,7 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): training_set_adjusted += training_set_adjusted['depth'] if log_space: - training_depth_adjust = numpy.fmax( - 0, min_depth - training_set_adjusted.min(axis=0) - ) + training_depth_adjust = numpy.fmax(0, min_depth - training_set_adjusted.min(axis=0)) training_set_adjusted += training_depth_adjust training_set_adjusted = numpy.log(training_set_adjusted) @@ -301,9 +299,7 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): plot_kl_surrogate_fit( kl_fit=kl_fit, - output_filename=output_directory / 'kl_surrogate_fit.png' - if save_plots - else None, + output_filename=output_directory / 'kl_surrogate_fit.png' if save_plots else None, ) # convert the KL surrogate model to the overall surrogate at each node @@ -377,16 +373,54 @@ def _analyze(tracks_dir, analyze_dir, mann_coef): 
output_directory=output_directory if save_plots else None, ) + if make_probability_plot: + level_ft = numpy.arange(1, 21) + level_m = (level_ft * 0.3048).round(decimals=4) + + node_prob_field = probability_field_from_surrogate( + levels=level_m, + surrogate_model=surrogate_model, + distribution=distribution, + training_set=validation_set, + minimum_allowable_value=min_depth if use_depth else None, + convert_from_log_scale=log_space, + convert_from_depths=training_depth_adjust.values if log_space else use_depth, + element_table=elements if point_spacing is None else None, + filename=probability_filename, + ) + + plot_selected_probability_fields( + node_prob_field=node_prob_field, + level_list=level_m, + output_directory=output_directory if save_plots else None, + label_unit_convert_factor=1 / 0.3048, + label_unit_name='ft', + ) + if show_plots: LOGGER.info('showing plots') pyplot.show() -if __name__ == '__main__': - +def cli(): parser = ArgumentParser() parser.add_argument('-d', '--ensemble-dir', type=Path) parser.add_argument('-t', '--tracks-dir', type=Path) parser.add_argument('-s', '--sequential', action='store_true') main(parser.parse_args()) + + +if __name__ == '__main__': + cluster = SLURMCluster(cores=16, + processes=1, + memory="500GB", + account="compute", + walltime="08:00:00", + header_skip=['--mem'], + interface="eth0") + cluster.scale(6) + client = Client(cluster) + print(client) + + cli() diff --git a/singularity/prep/files/combine_ensemble.py b/stormworkflow/post/combine_ensemble.py similarity index 81% rename from singularity/prep/files/combine_ensemble.py rename to stormworkflow/post/combine_ensemble.py index 28a6d9a..107220f 100644 --- a/singularity/prep/files/combine_ensemble.py +++ b/stormworkflow/post/combine_ensemble.py @@ -7,7 +7,6 @@ LOGGER = get_logger('klpc_wetonly') - def main(args): tracks_dir = args.tracks_dir @@ -16,16 +15,21 @@ def main(args): output = combine_results( model='schism', adcirc_like=True, - output=ensemble_dir/'analyze', + filenames=['out2d_*.nc'], #only combine elevations. 
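+        # 'out2d_*.nc' are SCHISM's 2D surface output files (which include elevation);
+        # restricting 'filenames' to this pattern skips the 3D outputs when combining,
+        # per the 'only combine elevations' note above.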
+ output=ensemble_dir / 'analyze', directory=ensemble_dir, - parallel=not args.sequential + parallel=not args.sequential, ) -if __name__ == '__main__': +def cli(): parser = ArgumentParser() parser.add_argument('-d', '--ensemble-dir', type=Path) parser.add_argument('-t', '--tracks-dir', type=Path) parser.add_argument('-s', '--sequential', action='store_true') main(parser.parse_args()) + + +if __name__ == '__main__': + cli() diff --git a/docker/post/docker/defn.py b/stormworkflow/post/defn.py similarity index 100% rename from docker/post/docker/defn.py rename to stormworkflow/post/defn.py diff --git a/singularity/post/files/generate_viz.py b/stormworkflow/post/generate_viz.py similarity index 100% rename from singularity/post/files/generate_viz.py rename to stormworkflow/post/generate_viz.py diff --git a/docker/post/docker/hurricane_funcs.py b/stormworkflow/post/hurricane_funcs.py similarity index 100% rename from docker/post/docker/hurricane_funcs.py rename to stormworkflow/post/hurricane_funcs.py diff --git a/singularity/post/files/max_ele_vs_hwm.py b/stormworkflow/post/max_ele_vs_hwm.py similarity index 100% rename from singularity/post/files/max_ele_vs_hwm.py rename to stormworkflow/post/max_ele_vs_hwm.py diff --git a/singularity/post/files/maxelev_diff.py b/stormworkflow/post/maxelev_diff.py similarity index 100% rename from singularity/post/files/maxelev_diff.py rename to stormworkflow/post/maxelev_diff.py diff --git a/singularity/post/files/__init__.py b/stormworkflow/prep/__init__.py similarity index 100% rename from singularity/post/files/__init__.py rename to stormworkflow/prep/__init__.py diff --git a/singularity/prep/files/download_data.py b/stormworkflow/prep/download_data.py similarity index 76% rename from singularity/prep/files/download_data.py rename to stormworkflow/prep/download_data.py index ef01ae5..eab58a4 100644 --- a/singularity/prep/files/download_data.py +++ b/stormworkflow/prep/download_data.py @@ -27,10 +27,9 @@ def main(args): workdir.mkdir(exist_ok=True) dt_data = pd.read_csv(dt_rng_path, delimiter=',') - date_1, date_2, _ = pd.to_datetime(dt_data.date_time).dt.strftime( - "%Y%m%d%H").values - model_start_time = datetime.strptime(date_1, "%Y%m%d%H") - model_end_time = datetime.strptime(date_2, "%Y%m%d%H") + date_1, date_2, _ = pd.to_datetime(dt_data.date_time).dt.strftime('%Y%m%d%H').values + model_start_time = datetime.strptime(date_1, '%Y%m%d%H') + model_end_time = datetime.strptime(date_2, '%Y%m%d%H') spinup_time = timedelta(days=2) # Right now the only download is for NWM, in the future there @@ -48,10 +47,9 @@ def main(args): start_date=model_start_time - spinup_time, end_date=model_end_time - model_start_time + spinup_time, overwrite=True, - ) + ) nwm.pairings.save_json( - sources=workdir / 'source.json', - sinks=workdir / 'sink.json' + sources=workdir / 'source.json', sinks=workdir / 'sink.json' ) @@ -63,18 +61,16 @@ def parse_arguments(): required=True, type=Path, default=None, - help='path to store generated configuration files' + help='path to store generated configuration files', ) argument_parser.add_argument( - "--date-range-file", + '--date-range-file', required=True, type=Path, - help="path to the file containing simulation date range" + help='path to the file containing simulation date range', ) argument_parser.add_argument( - "--nwm-file", - type=Path, - help="path to the NWM hydrofabric dataset", + '--nwm-file', type=Path, help='path to the NWM hydrofabric dataset', ) argument_parser.add_argument( '--mesh-directory', @@ -82,14 +78,15 @@ def 
parse_arguments(): required=True, help='path to input mesh (`hgrid.gr3`, `manning.gr3` or `drag.gr3`)', ) - argument_parser.add_argument( - "--with-hydrology", action="store_true" - ) + argument_parser.add_argument('--with-hydrology', action='store_true') args = argument_parser.parse_args() return args - -if __name__ == "__main__": + +def cli(): main(parse_arguments()) + +if __name__ == '__main__': + cli() diff --git a/stormworkflow/prep/hurricane_data.py b/stormworkflow/prep/hurricane_data.py new file mode 100644 index 0000000..5e3382a --- /dev/null +++ b/stormworkflow/prep/hurricane_data.py @@ -0,0 +1,430 @@ +"""User script to get hurricane info relevant to the workflow +This script gether information about: + - Hurricane track + - Hurricane windswath + - Hurricane event dates + - Stations info for historical hurricane +""" + +import sys +import logging +import pathlib +import argparse +import tempfile +import numpy as np +from datetime import datetime, timedelta +from typing import Optional, List + +import pandas as pd +import geopandas as gpd +from searvey.coops import COOPS_TidalDatum +from searvey.coops import COOPS_TimeZone +from searvey.coops import COOPS_Units +from shapely.geometry import box, base +from stormevents import StormEvent +from stormevents.nhc import VortexTrack +from stormevents.nhc.track import ( + combine_tracks, + correct_ofcl_based_on_carq_n_hollandb, + separate_tracks, +) + + +logger = logging.getLogger(__name__) +logger.setLevel(logging.INFO) +logging.basicConfig( + stream=sys.stdout, + format='%(asctime)s,%(msecs)d %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', + datefmt='%Y-%m-%d:%H:%M:%S') + + + +def trackstart_from_file( + leadtime_file: Optional[pathlib.Path], + nhc_code: str, + leadtime: float, +) -> Optional[datetime]: + if leadtime_file is None or not leadtime_file.is_file(): + return None + + leadtime_dict = pd.read_json(leadtime_file, orient='index') + leadtime_table = leadtime_dict.drop(columns='leadtime').merge( + leadtime_dict.leadtime.apply( + lambda x: pd.Series({v: k for k, v in x.items()}) + ).apply(pd.to_datetime, format='%Y%m%d%H'), + left_index=True, + right_index=True + ).set_index('ALnumber') + + if nhc_code.lower() not in leadtime_table.index: + return None + + storm_all_times = leadtime_table.loc[[nhc_code.lower()]].dropna() + if len(storm_all_times) > 1: + storm_all_times = storm_all_times.iloc[0] + if leadtime not in storm_all_times: + return None + + return storm_all_times[leadtime].item() + + +def get_perturb_timestamp_in_track( + track: VortexTrack, + time_col: 'str', + hr_before_landfall: datetime, + prescribed: Optional[datetime], + land_shapes: List[base.BaseGeometry], +) -> Optional[datetime]: + ''' + For best track pick the best track time that is at least + leadtime before the time besttrack is on land. But for forecast + pick the track that has a fcst000 date which is + at least leadtime before the time that the track is on land. + + Note that for a single advisory forecast, there are still MULTIPLE + tracks each with a different START DATE; while for best track + there's a SINGLE track with a start date equal to the beginning. 
+ ''' + + track_data = track.data + + assert len(set(track.advisories)) == 1 + + perturb_start = track_data.track_start_time.iloc[0] + if prescribed is not None: + times = track_data[time_col].unique() + leastdiff_idx = np.argmin(abs(times - prescribed)) + perturb_start = times[leastdiff_idx] + return perturb_start + + for shp in land_shapes: + tracks_onland = track_data[track_data.intersects(shp)] + if not tracks_onland.empty: + break + else: + # If track is never on input land polygons + return perturb_start + + + # Find tracks that started closest and prior to specified leadtime + # For each track start date, pick the FIRST time it's on land + candidates = tracks_onland.groupby('track_start_time').nth(0).reset_index() + dt = timedelta(hours=hr_before_landfall) + + # Pick LAST track that starts BEFORE the given leadtime among + # the candidates (start time and landfall time) + candidates['timediff'] = candidates.datetime - candidates.track_start_time + times_start_landfall = candidates[ + candidates['timediff'] >= dt + ][ + ['track_start_time', 'datetime'] + ].iloc[-1] + picked_track = track_data[ + track_data.track_start_time == times_start_landfall.track_start_time] + + # Get the chosen track's timestamp closest to specifid leadtime + perturb_start = picked_track.loc[ + times_start_landfall.datetime - picked_track.datetime >= dt + ].iloc[-1] + + return perturb_start[time_col] + + +def main(args): + + name_or_code = args.name_or_code + year = args.year + date_out = args.date_range_outpath + track_out = args.track_outpath + swath_out = args.swath_outpath + sta_dat_out = args.station_data_outpath + sta_loc_out = args.station_location_outpath + use_past_forecast = args.past_forecast + hr_before_landfall = args.hours_before_landfall + lead_times = args.lead_times + track_dir = args.preprocessed_tracks_dir + countries_shpfile = args.countries_polygon + + if hr_before_landfall < 0: + hr_before_landfall = 48 + + ne_low = gpd.read_file(countries_shpfile) + shp_US = ne_low[ne_low.NAME_EN.isin(['United States of America', 'Puerto Rico'])].unary_union + + logger.info("Fetching hurricane info...") + event = None + if year == 0: + event = StormEvent.from_nhc_code(name_or_code) + else: + event = StormEvent(name_or_code, year) + nhc_code = event.nhc_code + storm_name = event.name + + prescribed = trackstart_from_file( + lead_times, nhc_code, hr_before_landfall + ) + + # TODO: Get user input for whether it's forecast or now! + now = datetime.now() + is_current_storm = (now - event.start_date < timedelta(days=30)) + + df_dt = pd.DataFrame(columns=['date_time']) + + # All preprocessed tracks are treated as OFCL + local_track_file = pathlib.Path() + if track_dir is not None: + local_track_file = track_dir / f'a{nhc_code.lower()}.dat' + + if use_past_forecast or is_current_storm: + logger.info("Fetching a-deck track info...") + + advisory = 'OFCL' + if not local_track_file.is_file(): + # Find and pick a single advisory based on priority + temp_track = event.track(file_deck='a') + adv_avail = temp_track.unfiltered_data.advisory.unique() + adv_order = ['OFCL', 'HWRF', 'HMON', 'CARQ'] + advisory = adv_avail[0] + for adv in adv_order: + if adv in adv_avail: + advisory = adv + break + + # TODO: THIS IS NO LONGER RELEVANT IF WE FAKE RMWP AS OFCL! + if advisory == "OFCL" and "CARQ" not in adv_avail: + raise ValueError( + "OFCL advisory needs CARQ for fixing missing variables!" 
+ ) + + track = VortexTrack(nhc_code, file_deck='a', advisories=[advisory]) + + else: # read from preprocessed file + advisory = 'OFCL' + + # If a file exists, use the local file + track_raw = pd.read_csv(local_track_file, header=None, dtype=str) + assert len(track_raw[4].unique()) == 1 + track_raw[4] = advisory + + with tempfile.NamedTemporaryFile() as tmp: + track_raw.to_csv(tmp.name, header=False, index=False) + + unfixed_track = VortexTrack( + tmp.name, file_deck='a', advisories=[advisory] + ) + carq_track = event.track(file_deck='a', advisories=['CARQ']) + unfix_dict = { + **separate_tracks(unfixed_track.data), + **separate_tracks(carq_track.data), + } + + fix_dict = correct_ofcl_based_on_carq_n_hollandb(unfix_dict) + fix_track = combine_tracks(fix_dict) + + track = VortexTrack( + fix_track[fix_track.advisory == advisory], + file_deck='a', + advisories=[advisory] + ) + + + forecast_start = None # TODO? + if is_current_storm: + # Get the latest track forecast + forecast_start = track.data.track_start_time.max() + coops_ssh = None + + else: #if use_past_forecast: + logger.info( + f"Creating {advisory} track for {hr_before_landfall}" + +" hours before landfall forecast..." + ) + forecast_start = get_perturb_timestamp_in_track( + track, + 'track_start_time', + hr_before_landfall, + prescribed, + [shp_US, ne_low.unary_union], + ) + + logger.info("Fetching water levels for COOPS stations...") + coops_ssh = event.coops_product_within_isotach( + product='water_level', wind_speed=34, + datum=COOPS_TidalDatum.NAVD, + units=COOPS_Units.METRIC, + time_zone=COOPS_TimeZone.GMT, + ) + + df_dt['date_time'] = ( + forecast_start - timedelta(days=2), track.end_date, forecast_start + ) + + gdf_track = track.data[track.data.track_start_time == forecast_start] + # Prepend track from previous 0hr forecasts: + gdf_track = pd.concat(( + track.data[ + (track.data.track_start_time < forecast_start) + & (track.data.forecast_hours.astype(int) == 0) + ], + gdf_track + )) + + # NOTE: Fake best track for PySCHISM AFTER perturbation + # Fill missing name column if any + gdf_track['name'] = storm_name + track = VortexTrack( + storm=gdf_track, file_deck='a', advisories=[advisory] + ) + + windswath_dict = track.wind_swaths(wind_speed=34) + windswaths = windswath_dict[advisory] + logger.info(f"Fetching {advisory} windswath...") + windswath_time = min(pd.to_datetime(list(windswaths.keys()))) + windswath = windswaths[ + windswath_time.strftime("%Y%m%dT%H%M%S") + ] + + else: # Best track + + logger.info("Fetching b-deck track info...") + + + logger.info("Fetching BEST windswath...") + track = event.track(file_deck='b') + + perturb_start = track.start_date + if hr_before_landfall: + perturb_start = get_perturb_timestamp_in_track( + track, + 'datetime', + hr_before_landfall, + prescribed, + [shp_US, ne_low.unary_union], + ) + + logger.info("Fetching water level measurements from COOPS stations...") + coops_ssh = event.coops_product_within_isotach( + product='water_level', wind_speed=34, + datum=COOPS_TidalDatum.NAVD, + units=COOPS_Units.METRIC, + time_zone=COOPS_TimeZone.GMT, + ) + + df_dt['date_time'] = ( + track.start_date, track.end_date, perturb_start + ) + + # Drop duplicate rows based on isotach and time without minutes + # (PaHM doesn't take minutes into account) + gdf_track = track.data + gdf_track.datetime = gdf_track.datetime.dt.floor('h') + gdf_track = gdf_track.drop_duplicates( + subset=['datetime', 'isotach_radius'], keep='last' + ) + track = VortexTrack( + storm=gdf_track, file_deck='b', advisories=['BEST'] + ) 
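As an aside on `trackstart_from_file` earlier in this file: judging from the `pd.read_json(..., orient='index')` parsing, the optional `--lead-times` helper file appears to be JSON keyed by an arbitrary record index, carrying a lowercase `ALnumber` plus a `leadtime` object that maps `%Y%m%d%H` stamps to lead hours. A hypothetical example of such a file (layout inferred from the parser, not documented in the patch):

```python
import json

# Hypothetical --lead-times helper file, inferred from trackstart_from_file():
# one record per storm; 'leadtime' maps a %Y%m%d%H timestamp (the prescribed
# perturbation start) to its lead time in hours.
example = {
    '0': {
        'ALnumber': 'al062018',  # lowercase NHC code, matched against nhc_code.lower()
        'leadtime': {'2018091000': 48, '2018090812': 96},
    },
}

with open('leadtimes.json', 'w') as fp:
    json.dump(example, fp, indent=2)
```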
+ + windswath_dict = track.wind_swaths(wind_speed=34) + windswaths = windswath_dict['BEST'] + latest_advistory_stamp = max(pd.to_datetime(list(windswaths.keys()))) + windswath = windswaths[ + latest_advistory_stamp.strftime("%Y%m%dT%H%M%S") + ] + + logger.info("Writing relevant data to files...") + df_dt.to_csv(date_out) + # Remove duplicate entries for similar isotach and time + # (e.g. Dorian19 and Ian22 best tracks) + track.to_file(track_out) + gs = gpd.GeoSeries(windswath) + gdf_windswath = gpd.GeoDataFrame( + geometry=gs, data={'RADII': len(gs) * [34]}, crs="EPSG:4326" + ) + gdf_windswath.to_file(swath_out) + if coops_ssh is not None and len(coops_ssh) > 0: + coops_ssh.to_netcdf(sta_dat_out, 'w') + coops_ssh[['x', 'y']].to_dataframe().drop(columns=['nws_id']).to_csv( + sta_loc_out, header=False, index=False) + +def cli(): + parser = argparse.ArgumentParser() + + parser.add_argument( + "name_or_code", help="name or NHC code of the storm", type=str) + parser.add_argument( + "year", help="year of the storm", type=int) + + parser.add_argument( + "--date-range-outpath", + help="output date range", + type=pathlib.Path, + required=True + ) + + parser.add_argument( + "--track-outpath", + help="output hurricane track", + type=pathlib.Path, + required=True + ) + + parser.add_argument( + "--swath-outpath", + help="output hurricane windswath", + type=pathlib.Path, + required=True + ) + + parser.add_argument( + "--station-data-outpath", + help="output station data", + type=pathlib.Path, + required=True + ) + + parser.add_argument( + "--station-location-outpath", + help="output station location", + type=pathlib.Path, + required=True + ) + + parser.add_argument( + "--past-forecast", + help="Get forecast data for a past storm", + action='store_true', + ) + + parser.add_argument( + "--hours-before-landfall", + help="Get forecast data for a past storm at this many hour before landfall", + type=int, + default=-1, + ) + + parser.add_argument( + "--lead-times", + type=pathlib.Path, + help="Helper file for prescribed lead times", + ) + + parser.add_argument( + "--preprocessed-tracks-dir", + type=pathlib.Path, + help="Existing adjusted track directory", + ) + + parser.add_argument( + "--countries-polygon", + type=pathlib.Path, + help="Shapefile containing country polygons", + ) + + args = parser.parse_args() + + main(args) + +if __name__ == '__main__': + cli() + diff --git a/singularity/ocsmesh/files/hurricane_mesh.py b/stormworkflow/prep/hurricane_mesh.py old mode 100755 new mode 100644 similarity index 99% rename from singularity/ocsmesh/files/hurricane_mesh.py rename to stormworkflow/prep/hurricane_mesh.py index 7af8cec..bf93ff3 --- a/singularity/ocsmesh/files/hurricane_mesh.py +++ b/stormworkflow/prep/hurricane_mesh.py @@ -526,8 +526,7 @@ def run(self, args): overwrite=True) - -if __name__ == '__main__': +def cli(): parser = argparse.ArgumentParser() parser.add_argument( @@ -546,3 +545,8 @@ def run(self, args): logger.info(f"Mesh arguments are {args}.") main(args, [hurrmesh_client, subset_client]) + + +if __name__ == '__main__': + cli() + diff --git a/singularity/prep/files/setup_ensemble.py b/stormworkflow/prep/setup_ensemble.py similarity index 69% rename from singularity/prep/files/setup_ensemble.py rename to stormworkflow/prep/setup_ensemble.py index d16b4a0..71d2218 100644 --- a/singularity/prep/files/setup_ensemble.py +++ b/stormworkflow/prep/setup_ensemble.py @@ -37,11 +37,13 @@ from stormevents import StormEvent from stormevents.nhc.track import VortexTrack -import wwm -from 
relocate_source_feeder import ( - relocate_sources, - v16_mandatory_sources_coor, -) +import stormworkflow.prep.wwm +# TODO: Later find a clean way to package this module from SCHISM from +# src/Utility/Pre-Processing/STOFS-3D-Atl-shadow-VIMS/Pre_processing/Source_sink/Relocate/ +#from relocate_source_feeder import ( +# relocate_sources, +# v16_mandatory_sources_coor, +#) REFS = Path('/refs') @@ -51,7 +53,6 @@ logger.setLevel(logging.INFO) - def _relocate_source_sink(schism_dir, region_shape): # Feeder info is generated during mesh generation @@ -66,19 +67,15 @@ def _relocate_source_sink(schism_dir, region_shape): original_ss = source_sink.from_files(source_dir=old_ss_dir) region = gpd.read_file(region_shape) - region_coords = [ - get_coordinates(p) for p in region.explode(index_parts=True).exterior - ] + region_coords = [get_coordinates(p) for p in region.explode(index_parts=True).exterior] # split source/sink into inside and outside region - _, outside_ss = original_ss.clip_by_polygons( - hgrid=hgrid, polygons_xy=region_coords, - ) + _, outside_ss = original_ss.clip_by_polygons(hgrid=hgrid, polygons_xy=region_coords,) # relocate sources relocated_ss = relocate_sources( old_ss_dir=old_ss_dir, # based on the without feeder hgrid - feeder_info_file=feeder_info_file, + feeder_info_file=feeder_info_file, hgrid_fname=hgrid_fname, # HGrid with feeder outdir=str(schism_dir / 'relocated_source_sink'), max_search_radius=2000, # search radius (in meters) @@ -96,7 +93,6 @@ def _relocate_source_sink(schism_dir, region_shape): combined_ss.writer(str(schism_dir)) - def _fix_nwm_issues(ensemble_dir, hires_shapefile): # Workaround for hydrology param bug #34 @@ -130,17 +126,16 @@ def main(args): workdir.mkdir(exist_ok=True) dt_data = pd.read_csv(dt_rng_path, delimiter=',') - date_1, date_2, date_3 = pd.to_datetime(dt_data.date_time).dt.strftime( - "%Y%m%d%H").values - model_start_time = datetime.strptime(date_1, "%Y%m%d%H") - model_end_time = datetime.strptime(date_2, "%Y%m%d%H") - perturb_start = datetime.strptime(date_3, "%Y%m%d%H") - spinup_time = timedelta(days=2) + date_1, date_2, date_3 = pd.to_datetime(dt_data.date_time).dt.strftime('%Y%m%d%H').values + model_start_time = datetime.strptime(date_1, '%Y%m%d%H') + model_end_time = datetime.strptime(date_2, '%Y%m%d%H') + perturb_start = datetime.strptime(date_3, '%Y%m%d%H') + spinup_time = timedelta(days=8) forcing_configurations = [] - forcing_configurations.append(TidalForcingJSON( - resource=tpxo_dir / 'h_tpxo9.v1.nc', - tidal_source=TidalSource.TPXO)) + forcing_configurations.append( + TidalForcingJSON(resource=tpxo_dir / 'h_tpxo9.v1.nc', tidal_source=TidalSource.TPXO) + ) if with_hydrology: forcing_configurations.append( NationalWaterModelFocringJSON( @@ -148,11 +143,10 @@ def main(args): cache=True, source_json=workdir / 'source.json', sink_json=workdir / 'sink.json', - pairing_hgrid=mesh_file + pairing_hgrid=mesh_file, ) ) - platform = Platform.LOCAL unperturbed = None @@ -161,38 +155,35 @@ def main(args): orig_track = VortexTrack.from_file(track_path) adv_uniq = orig_track.data.advisory.unique() if len(adv_uniq) != 1: - raise ValueError("Track file has multiple advisory types!") + raise ValueError('Track file has multiple advisory types!') advisory = adv_uniq.item() file_deck = 'a' if advisory != 'BEST' else 'b' - # NOTE: Perturbers use min("forecast_time") to filter multiple # tracks. 
But for OFCL forecast simulation, the track file we # get has unique forecast time for only the segment we want to # perturb, the preceeding entries are 0-hour forecasts from # previous forecast_times track_to_perturb = VortexTrack.from_file( - track_path, - start_date=perturb_start, - forecast_time=perturb_start if advisory != 'BEST' else None, - end_date=model_end_time, - file_deck=file_deck, - advisories=[advisory], - ) - track_to_perturb.to_file( - workdir/'track_to_perturb.dat', overwrite=True + track_path, + start_date=perturb_start, + forecast_time=perturb_start if advisory != 'BEST' else None, + end_date=model_end_time, + file_deck=file_deck, + advisories=[advisory], ) + track_to_perturb.to_file(workdir / 'track_to_perturb.dat', overwrite=True) perturbations = perturb_tracks( perturbations=args.num_perturbations, - directory=workdir/'track_files', - storm=workdir/'track_to_perturb.dat', + directory=workdir / 'track_files', + storm=workdir / 'track_to_perturb.dat', variables=[ 'cross_track', 'along_track', - 'radius_of_maximum_winds', + 'radius_of_maximum_winds', # TODO: add option for persistent 'max_sustained_wind_speed', - ], + ], sample_from_distribution=args.sample_from_distribution, sample_rule=args.sample_rule, quadrature=args.quadrature, @@ -204,9 +195,7 @@ def main(args): ) if perturb_start != model_start_time: - perturb_idx = orig_track.data[ - orig_track.data.datetime == perturb_start - ].index.min() + perturb_idx = orig_track.data[orig_track.data.datetime == perturb_start].index.min() if perturb_idx > 0: # If only part of the track needs to be updated @@ -216,26 +205,21 @@ def main(args): unperturbed = VortexTrack( unperturbed_data, file_deck='b', - advisories = ['BEST'], - end_date=orig_track.data.iloc[perturb_idx - 1].datetime + advisories=['BEST'], + end_date=orig_track.data.iloc[perturb_idx - 1].datetime, ) # Read generated tracks and append to unpertubed section - perturbed_tracks = glob.glob(str(workdir/'track_files'/'*.22')) + perturbed_tracks = glob.glob(str(workdir / 'track_files' / '*.22')) for pt in perturbed_tracks: # Fake BEST track here (in case it's not a real best)! perturbed_data = VortexTrack.from_file(pt).data perturbed_data.advisory = 'BEST' perturbed_data.forecast_hours = 0 - perturbed = VortexTrack( - perturbed_data, - file_deck='b', - advisories = ['BEST'], - ) + perturbed = VortexTrack(perturbed_data, file_deck='b', advisories=['BEST'],) full_track = pd.concat( - (unperturbed.fort_22(), perturbed.fort_22()), - ignore_index=True + (unperturbed.fort_22(), perturbed.fort_22()), ignore_index=True ) # Overwrites the perturbed-segment-only file full_track.to_csv(pt, index=False, header=False) @@ -244,11 +228,11 @@ def main(args): # spinup too instead of spinup trying to download! 
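The loop above stitches the un-perturbed prefix of the official track onto each perturbed segment and relabels every record as a 0-hour `BEST` entry, so downstream tools see one continuous deterministic track per ensemble member. A stripped-down editor's sketch of that stitching idea (plain pandas; the real fort.22 carries the full ATCF column set):

```python
import pandas as pd

# illustrative columns only -- a real fort.22 record has many more ATCF fields
prefix = pd.DataFrame({'datetime': ['2018091000', '2018091006'], 'vmax': [45, 50]})
perturbed_segment = pd.DataFrame({'datetime': ['2018091012', '2018091018'], 'vmax': [62, 71]})

for df in (prefix, perturbed_segment):
    df['advisory'] = 'BEST'   # fake best-track labelling
    df['forecast_hours'] = 0  # every record becomes a 0-hour "analysis"

full_track = pd.concat((prefix, perturbed_segment), ignore_index=True)
# in the patch, this concatenation is written back over the member's *.22 file
```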
forcing_configurations.append( BestTrackForcingJSON( - nhc_code=f'{args.name}{args.year}', + nhc_code=orig_track.nhc_code, interval_seconds=3600, nws=20, - fort22_filename=workdir/'track_files'/'original.22', - attributes={'model': pahm_model} + fort22_filename=workdir / 'track_files' / 'original.22', + attributes={'model': pahm_model}, ) ) @@ -261,12 +245,10 @@ def main(args): 'forcings': forcing_configurations, 'perturbations': perturbations, 'platform': platform, -# 'schism_executable': 'pschism_PAHM_TVD-VL' + # 'schism_executable': 'pschism_PAHM_TVD-VL' } - run_configuration = SCHISMRunConfiguration( - **run_config_kwargs, - ) + run_configuration = SCHISMRunConfiguration(**run_config_kwargs,) run_configuration['schism']['hgrid_path'] = mesh_file run_configuration['schism']['attributes']['ncor'] = 1 @@ -275,13 +257,15 @@ def main(args): ) # Now generate the setup - generate_schism_configuration(**{ - 'configuration_directory': workdir, - 'output_directory': workdir, - 'relative_paths': True, - 'overwrite': True, - 'parallel': True - }) + generate_schism_configuration( + **{ + 'configuration_directory': workdir, + 'output_directory': workdir, + 'relative_paths': True, + 'overwrite': True, + 'parallel': True, + } + ) if with_hydrology: _fix_nwm_issues(workdir, hires_reg) @@ -293,10 +277,10 @@ def parse_arguments(): argument_parser = ArgumentParser() argument_parser.add_argument( - "--track-file", - help="path to the storm track file for parametric wind setup", + '--track-file', + help='path to the storm track file for parametric wind setup', type=Path, - required=True + required=True, ) argument_parser.add_argument( @@ -304,30 +288,26 @@ def parse_arguments(): required=True, type=Path, default=None, - help='path to store generated configuration files' + help='path to store generated configuration files', ) argument_parser.add_argument( - "--date-range-file", + '--date-range-file', required=True, type=Path, - help="path to the file containing simulation date range" + help='path to the file containing simulation date range', ) argument_parser.add_argument( - '-n', '--num-perturbations', + '-n', + '--num-perturbations', type=int, required=True, help='path to input mesh (`hgrid.gr3`, `manning.gr3` or `drag.gr3`)', ) argument_parser.add_argument( - "--tpxo-dir", - required=True, - type=Path, - help="path to the TPXO dataset directory", + '--tpxo-dir', required=True, type=Path, help='path to the TPXO dataset directory', ) argument_parser.add_argument( - "--nwm-file", - type=Path, - help="path to the NWM hydrofabric dataset", + '--nwm-file', type=Path, help='path to the NWM hydrofabric dataset', ) argument_parser.add_argument( '--mesh-directory', @@ -339,38 +319,26 @@ def parse_arguments(): '--hires-region', type=Path, required=True, - help='path to high resolution polygon shapefile' - ) - argument_parser.add_argument( - "--sample-from-distribution", action="store_true" - ) - argument_parser.add_argument( - "--sample-rule", type=str, default='random' - ) - argument_parser.add_argument( - "--quadrature", action="store_true" - ) - argument_parser.add_argument( - "--use-wwm", action="store_true" - ) - argument_parser.add_argument( - "--with-hydrology", action="store_true" - ) - argument_parser.add_argument( - "--pahm-model", choices=['gahm', 'symmetric'], default='gahm' + help='path to high resolution polygon shapefile', ) + argument_parser.add_argument('--sample-from-distribution', action='store_true') + argument_parser.add_argument('--sample-rule', type=str, default='random') + 
argument_parser.add_argument('--quadrature', action='store_true') + argument_parser.add_argument('--use-wwm', action='store_true') + argument_parser.add_argument('--with-hydrology', action='store_true') + argument_parser.add_argument('--pahm-model', choices=['gahm', 'symmetric'], default='gahm') - argument_parser.add_argument( - "name", help="name of the storm", type=str) - - argument_parser.add_argument( - "year", help="year of the storm", type=int) + argument_parser.add_argument('name', help='name of the storm', type=str) + argument_parser.add_argument('year', help='year of the storm', type=int) args = argument_parser.parse_args() return args -if __name__ == "__main__": +def cli(): main(parse_arguments()) + +if __name__ == '__main__': + cli() diff --git a/singularity/prep/files/setup_model.py b/stormworkflow/prep/setup_model.py old mode 100755 new mode 100644 similarity index 62% rename from singularity/prep/files/setup_model.py rename to stormworkflow/prep/setup_model.py index 6e2061f..48dbb78 --- a/singularity/prep/files/setup_model.py +++ b/stormworkflow/prep/setup_model.py @@ -20,7 +20,9 @@ from pyschism import dates from pyschism.enums import NWSType from pyschism.driver import ModelConfig -from pyschism.forcing.bctides import iettype, ifltype +from pyschism.forcing.bctides.tides import Tides, TidalDatabase +from pyschism.forcing.bctides.tpxo import TPXO_ELEVATION +from pyschism.forcing.bctides.tpxo import TPXO_VELOCITY from pyschism.forcing.nws import GFS, HRRR, ERA5, BestTrackForcing from pyschism.forcing.nws.nws2 import hrrr3 from pyschism.forcing.source_sink import NWM @@ -33,14 +35,14 @@ logger = logging.getLogger(__name__) logger.setLevel(logging.INFO) -CDSAPI_URL = "https://cds.climate.copernicus.eu/api/v2" +CDSAPI_URL = 'https://cds.climate.copernicus.eu/api/v2' TPXO_LINK_PATH = pathlib.Path('~').expanduser() / '.local/share/tpxo' NWM_LINK_PATH = pathlib.Path('~').expanduser() / '.local/share/pyschism/nwm' @contextmanager def pushd(directory): - '''Temporarily modify current directory + """Temporarily modify current directory Parameters ---------- @@ -50,7 +52,7 @@ def pushd(directory): Returns ------- None - ''' + """ origin = os.getcwd() try: @@ -65,14 +67,15 @@ def get_main_cache_path(cache_dir, storm, year): return cache_dir / f'{storm.lower()}_{year}' + def get_meteo_cache_path(source, main_cache_path, bbox, start_date, end_date): m = hashlib.md5() m.update(np.round(bbox.corners(), decimals=2).tobytes()) - m.update(start_date.strftime("%Y-%m-%d:%H:%M:%S").encode('utf8')) - m.update(end_date.strftime("%Y-%m-%d:%H:%M:%S").encode('utf8')) + m.update(start_date.strftime('%Y-%m-%d:%H:%M:%S').encode('utf8')) + m.update(end_date.strftime('%Y-%m-%d:%H:%M:%S').encode('utf8')) - meteo_cache_path = main_cache_path / f"{source}_{m.hexdigest()}" + meteo_cache_path = main_cache_path / f'{source}_{m.hexdigest()}' return meteo_cache_path @@ -82,7 +85,7 @@ def cache_lock(cache_path): if not cache_path.exists(): cache_path.mkdir(parents=True, exist_ok=True) - with open(cache_path / ".cache.lock", "w") as fp: + with open(cache_path / '.cache.lock', 'w') as fp: try: fcntl.flock(fp.fileno(), fcntl.LOCK_EX) yield @@ -90,6 +93,7 @@ def cache_lock(cache_path): finally: fcntl.flock(fp.fileno(), fcntl.LOCK_UN) + def from_meteo_cache(meteo_cache_path, sflux_dir): # TODO: Generalize @@ -98,10 +102,10 @@ def from_meteo_cache(meteo_cache_path, sflux_dir): return False contents = list(meteo_cache_path.iterdir()) - if not any(p.match("sflux_inputs.txt") for p in contents): + if not 
any(p.match('sflux_inputs.txt') for p in contents): return False - logger.info("Creating sflux from cache...") + logger.info('Creating sflux from cache...') # Copy files from cache dir to sflux dir for p in contents: @@ -111,7 +115,7 @@ def from_meteo_cache(meteo_cache_path, sflux_dir): else: shutil.copy(p, dest) - logger.info("Done copying cached sflux.") + logger.info('Done copying cached sflux.') return True @@ -119,12 +123,12 @@ def from_meteo_cache(meteo_cache_path, sflux_dir): def copy_meteo_cache(sflux_dir, meteo_cache_path): # TODO: Generalize - logger.info("Copying cache files to main cache location...") + logger.info('Copying cache files to main cache location...') # Copy files from sflux dir to cache dir # Clean meteo_cache_path if already populated? contents_dst = list(meteo_cache_path.iterdir()) - contents_dst = [p for p in contents_dst if p.suffix != ".lock"] + contents_dst = [p for p in contents_dst if p.suffix != '.lock'] for p in contents_dst: if p.is_dir(): shutil.rmtree(p) @@ -140,7 +144,8 @@ def copy_meteo_cache(sflux_dir, meteo_cache_path): else: shutil.copy(p, dest) - logger.info("Done copying cache files to main cache location.") + logger.info('Done copying cache files to main cache location.') + def setup_schism_model( mesh_path, @@ -153,29 +158,31 @@ def setup_schism_model( nhc_track_file=None, storm_id=None, use_wwm=False, + tpxo_dir=None, + with_hydrology=False, ): domain_box = gpd.read_file(domain_bbox_path) - atm_bbox = Bbox(domain_box.to_crs('EPSG:4326').total_bounds.reshape(2,2)) + atm_bbox = Bbox(domain_box.to_crs('EPSG:4326').total_bounds.reshape(2, 2)) schism_dir = out_dir schism_dir.mkdir(exist_ok=True, parents=True) - logger.info("Calculating times and dates") - dt = timedelta(seconds=150.) + logger.info('Calculating times and dates') + dt = timedelta(seconds=150.0) # Use an integer for number of steps or a timedelta to approximate # number of steps internally based on timestep - nspool = timedelta(minutes=20.) - + nspool = timedelta(minutes=20.0) # measurement days +7 days of simulation: 3 ramp, 2 prior # & 2 after the measurement dates dt_data = pd.read_csv(date_range_path, delimiter=',') - date_1, date_2 = pd.to_datetime(dt_data.date_time).dt.strftime( + date_1, date_2, date_3 = pd.to_datetime(dt_data.date_time).dt.strftime( "%Y%m%d%H").values date_1 = datetime.strptime(date_1, "%Y%m%d%H") date_2 = datetime.strptime(date_2, "%Y%m%d%H") +# date_3 = datetime.strptime(date_3, "%Y%m%d%H") # If there are no observation data, it's hindcast mode @@ -183,12 +190,12 @@ def setup_schism_model( if hindcast_mode: # If in hindcast mode run for 4 days: 2 days prior to now to # 2 days after. - logger.info("Setup hindcast mode") + logger.info('Setup hindcast mode') start_date = date_1 - timedelta(days=2) end_date = date_2 + timedelta(days=2) else: - logger.info("Setup forecast mode") + logger.info('Setup forecast mode') # If in forecast mode then date_1 == date_2, and simulation # will run for about 3 days: abou 1 day prior to now to 2 days @@ -208,22 +215,20 @@ def setup_schism_model( rnday = end_date - start_date - dramp = timedelta(days=1.) 
+ dramp = timedelta(days=1.0) - hgrid = Hgrid.open(mesh_path, crs="epsg:4326") + hgrid = Hgrid.open(mesh_path, crs='epsg:4326') fgrid = ManningsN.linear_with_depth( - hgrid, - min_value=0.02, max_value=0.05, - min_depth=-1.0, max_depth=-3.0) + hgrid, min_value=0.02, max_value=0.05, min_depth=-1.0, max_depth=-3.0 + ) coops_stations = None stations_file = station_info_path if stations_file.is_file(): st_data = np.genfromtxt(stations_file, delimiter=',') coops_stations = Stations( - nspool_sta=nspool, - crs="EPSG:4326", - elev=True, u=True, v=True) + nspool_sta=nspool, crs='EPSG:4326', elev=True, u=True, v=True + ) for coord in st_data: coops_stations.add_station(coord[0], coord[1]) @@ -235,7 +240,7 @@ def setup_schism_model( elif storm_id is not None: atmospheric = BestTrackForcing(storm=storm_id) else: - ValueError("Storm track information is not provided!") + ValueError('Storm track information is not provided!') else: # For hindcast ERA5 is used and for forecast # GFS and hrrr3.HRRR. Neither ERA5 nor the GFS and @@ -243,20 +248,30 @@ def setup_schism_model( pass + tidal_flags = [3, 3, 0, 0] logger.info("Creating model configuration ...") + src_sink = None + if with_hydrology: + src_sink = NWM() config = ModelConfig( hgrid=hgrid, fgrid=fgrid, - iettype=iettype.Iettype3(database="tpxo"), - ifltype=ifltype.Ifltype3(database="tpxo"), + flags=[tidal_flags for _ in hgrid.boundaries.open.itertuples()], + constituents=[], # we're overwriting Tides obj + database='tpxo', # we're overwriting Tides obj nws=atmospheric, - source_sink=NWM(), + source_sink=src_sink, + ) + tide_db = TidalDatabase.TPXO.value( + h_file=tpxo_dir / TPXO_ELEVATION, u_file=tpxo_dir / TPXO_VELOCITY, ) + tides = Tides(tidal_database=tide_db, constituents='all') + config.bctides.tides = tides if config.forcings.nws and getattr(config.forcings.nws, 'sflux_2', None): config.forcings.nws.sflux_2.inventory.file_interval = timedelta(hours=6) - logger.info("Creating cold start ...") + logger.info('Creating cold start ...') # create reference dates coldstart = config.coldstart( stations=coops_stations, @@ -271,19 +286,19 @@ def setup_schism_model( dahv=True, ) - logger.info("Writing to disk ...") + logger.info('Writing to disk ...') if not parametric_wind: # In hindcast mode ERA5 is used manually: temporary solution - sflux_dir = (schism_dir / "sflux") + sflux_dir = schism_dir / 'sflux' sflux_dir.mkdir(exist_ok=True, parents=True) # Workaround for ERA5 not being compatible with NWS2 object meteo_cache_kwargs = { - 'bbox': atm_bbox, - 'start_date': start_date, - 'end_date': start_date + rnday + 'bbox': atm_bbox, + 'start_date': start_date, + 'end_date': start_date + rnday, } if hindcast_mode: @@ -300,16 +315,18 @@ def setup_schism_model( if hindcast_mode: era5 = ERA5() era5.write( - outdir=schism_dir / "sflux", - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True) + outdir=schism_dir / 'sflux', + start_date=start_date, + rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), + air=True, + rad=True, + prc=True, + bbox=atm_bbox, + overwrite=True, + ) else: - with ExitStack() as stack: # Just to make sure there are not permission @@ -320,14 +337,16 @@ def setup_schism_model( gfs = GFS() gfs.write( - outdir=schism_dir / "sflux", - level=1, - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True - ) + outdir=schism_dir / 
'sflux', + level=1, + start_date=start_date, + rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), + air=True, + rad=True, + prc=True, + bbox=atm_bbox, + overwrite=True, + ) # If we should limit forecast to 2 days, then # why not use old HRRR implementation? Because @@ -336,39 +355,40 @@ def setup_schism_model( # 2day forecast! hrrr = HRRR() hrrr.write( - outdir=schism_dir / "sflux", - level=2, - start_date=start_date, - rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), - air=True, rad=True, prc=True, - bbox=atm_bbox, - overwrite=True + outdir=schism_dir / 'sflux', + level=2, + start_date=start_date, + rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), + air=True, + rad=True, + prc=True, + bbox=atm_bbox, + overwrite=True, ) -# hrrr3.HRRR( -# start_date=start_date, -# rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), -# record=2, -# bbox=atm_bbox -# ) -# for i, nc_file in enumerate(sorted(pathlib.Path().glob('*/*.nc'))): -# dst_air = schism_dir / "sflux" / f"sflux_air_2.{i:04d}.nc" -# shutil.move(nc_file, dst_air) -# pathlib.Path(schism_dir / "sflux" / f"sflux_prc_2.{i:04d}.nc").symlink_to( -# dst_air -# ) -# pathlib.Path(schism_dir / "sflux" / f"sflux_rad_2.{i:04d}.nc").symlink_to( -# dst_air -# ) - - - with open(schism_dir / "sflux" / "sflux_inputs.txt", "w") as f: - f.write("&sflux_inputs\n/\n") + # hrrr3.HRRR( + # start_date=start_date, + # rnday=rnday.total_seconds() / timedelta(days=1).total_seconds(), + # record=2, + # bbox=atm_bbox + # ) + # for i, nc_file in enumerate(sorted(pathlib.Path().glob('*/*.nc'))): + # dst_air = schism_dir / "sflux" / f"sflux_air_2.{i:04d}.nc" + # shutil.move(nc_file, dst_air) + # pathlib.Path(schism_dir / "sflux" / f"sflux_prc_2.{i:04d}.nc").symlink_to( + # dst_air + # ) + # pathlib.Path(schism_dir / "sflux" / f"sflux_rad_2.{i:04d}.nc").symlink_to( + # dst_air + # ) + + with open(schism_dir / 'sflux' / 'sflux_inputs.txt', 'w') as f: + f.write('&sflux_inputs\n/\n') copy_meteo_cache(sflux_dir, meteo_cache_path) windrot = gridgr3.Windrot.default(hgrid) - windrot.write(schism_dir / "windrot_geo2proj.gr3", overwrite=True) + windrot.write(schism_dir / 'windrot_geo2proj.gr3', overwrite=True) ## end of workaround # Workaround for bug #30 @@ -376,12 +396,11 @@ def setup_schism_model( coldstart.param.opt.nws = NWSType.CLIMATE_AND_FORECAST.value ## end of workaround - - # Workaround for station bug #32 if coops_stations is not None: coldstart.param.schout.nspool_sta = int( - round(nspool.total_seconds() / coldstart.param.core.dt)) + round(nspool.total_seconds() / coldstart.param.core.dt) + ) ## end of workaround with ExitStack() as stack: @@ -396,13 +415,14 @@ def setup_schism_model( # Workardoun for hydrology param bug #34 nm_list = f90nml.read(schism_dir / 'param.nml') - nm_list['opt']['if_source'] = 1 + if with_hydrology: + nm_list['opt']['if_source'] = 1 nm_list.write(schism_dir / 'param.nml', force=True) ## end of workaround ## Workaround to make sure outputs directory is copied from/to S3 try: - os.mknod(schism_dir / "outputs" / "_") + os.mknod(schism_dir / 'outputs' / '_') except FileExistsError: pass ## end of workaround @@ -410,7 +430,8 @@ def setup_schism_model( if use_wwm: wwm.setup_wwm(mesh_path, schism_dir, ensemble=False) - logger.info("Setup done") + logger.info('Setup done') + def main(args): @@ -424,20 +445,19 @@ def main(args): st_loc_path = args.station_location_file out_dir = args.out nhc_track = None if args.track_file is None else args.track_file - cache_path = get_main_cache_path( - 
args.cache_dir, storm_name, storm_year - ) + cache_path = get_main_cache_path(args.cache_dir, storm_name, storm_year) tpxo_dir = args.tpxo_dir nwm_dir = args.nwm_dir use_wwm = args.use_wwm - if TPXO_LINK_PATH.is_dir(): - shutil.rmtree(TPXO_LINK_PATH) - if NWM_LINK_PATH.is_dir(): - shutil.rmtree(NWM_LINK_PATH) - os.symlink(tpxo_dir, TPXO_LINK_PATH, target_is_directory=True) - os.symlink(nwm_dir, NWM_LINK_PATH, target_is_directory=True) + with_hydrology = args.with_hydrology +# if TPXO_LINK_PATH.is_dir(): +# shutil.rmtree(TPXO_LINK_PATH) +# if NWM_LINK_PATH.is_dir(): +# shutil.rmtree(NWM_LINK_PATH) +# os.symlink(tpxo_dir, TPXO_LINK_PATH, target_is_directory=True) +# os.symlink(nwm_dir, NWM_LINK_PATH, target_is_directory=True) setup_schism_model( mesh_path, @@ -449,7 +469,9 @@ def main(args): parametric_wind=param_wind, nhc_track_file=nhc_track, storm_id=f'{storm_name}{storm_year}', - use_wwm=use_wwm + use_wwm=use_wwm, + tpxo_dir=tpxo_dir, + with_hydrology=with_hydrology, ) @@ -457,67 +479,61 @@ def main(args): parser = argparse.ArgumentParser() - parser.add_argument( - "--parametric-wind", "-w", - help="flag to switch to parametric wind setup", action="store_true") + '--parametric-wind', + '-w', + help='flag to switch to parametric wind setup', + action='store_true', + ) parser.add_argument( - "--mesh-file", - help="path to the file containing computational grid", - type=pathlib.Path + '--mesh-file', help='path to the file containing computational grid', type=pathlib.Path ) parser.add_argument( - "--domain-bbox-file", - help="path to the file containing domain bounding box", - type=pathlib.Path + '--domain-bbox-file', + help='path to the file containing domain bounding box', + type=pathlib.Path, ) parser.add_argument( - "--date-range-file", - help="path to the file containing simulation date range", - type=pathlib.Path + '--date-range-file', + help='path to the file containing simulation date range', + type=pathlib.Path, ) parser.add_argument( - "--station-location-file", - help="path to the file containing station locations", - type=pathlib.Path + '--station-location-file', + help='path to the file containing station locations', + type=pathlib.Path, ) + parser.add_argument('--cache-dir', help='path to the cache directory', type=pathlib.Path) + parser.add_argument( - "--cache-dir", - help="path to the cache directory", - type=pathlib.Path + '--track-file', + help='path to the storm track file for parametric wind setup', + type=pathlib.Path, ) parser.add_argument( - "--track-file", - help="path to the storm track file for parametric wind setup", - type=pathlib.Path + '--tpxo-dir', help='path to the TPXO database directory', type=pathlib.Path ) parser.add_argument( - "--tpxo-dir", - help="path to the TPXO database directory", - type=pathlib.Path + '--nwm-dir', help='path to the NWM stream vector database directory', type=pathlib.Path ) parser.add_argument( - "--nwm-dir", - help="path to the NWM stream vector database directory", - type=pathlib.Path + '--out', help='path to the setup output (solver input) directory', type=pathlib.Path ) parser.add_argument( - "--out", - help="path to the setup output (solver input) directory", - type=pathlib.Path + "--use-wwm", action="store_true" ) parser.add_argument( - "--use-wwm", action="store_true" + "--with-hydrology", action="store_true" ) parser.add_argument( diff --git a/singularity/prep/files/wwm.py b/stormworkflow/prep/wwm.py similarity index 67% rename from singularity/prep/files/wwm.py rename to stormworkflow/prep/wwm.py index 56a060a..0448687 
100644 --- a/singularity/prep/files/wwm.py +++ b/stormworkflow/prep/wwm.py @@ -13,18 +13,18 @@ REFS = Path('/refs') + def setup_wwm(mesh_file: Path, setup_dir: Path, ensemble: bool): - '''Output is + """Output is - hgrid_WWM.gr3 - param.nml - wwmbnd.gr3 - wwminput.nml - ''' + """ - runs_dir = [setup_dir] if ensemble: - spinup_dir = setup_dir/'spinup' + spinup_dir = setup_dir / 'spinup' runs_dir = setup_dir.glob('runs/*') schism_grid = Gr3.open(mesh_file, crs=4326) @@ -44,7 +44,6 @@ def setup_wwm(mesh_file: Path, setup_dir: Path, ensemble: bool): wwm_nml = get_wwm_params(run_name=run.name, schism_nml=schism_nml) wwm_nml.write(run / 'wwminput.nml') - def break_quads(pyschism_mesh: Gr3) -> Gr3 | Gr3Field: @@ -54,27 +53,25 @@ def break_quads(pyschism_mesh: Gr3) -> Gr3 | Gr3Field: new_mesh = deepcopy(pyschism_mesh) else: - tmp = quads[:,2:] + tmp = quads[:, 2:] tmp = np.insert(tmp, 0, quads[:, 0], axis=1) broken = np.vstack((quads[:, :3], tmp)) trias = pyschism_mesh.triangles final_trias = np.vstack((trias, broken)) # NOTE: Node IDs and indexs are the same as before elements = { - idx+1: list(map(pyschism_mesh.nodes.get_id_by_index, tri)) + idx + 1: list(map(pyschism_mesh.nodes.get_id_by_index, tri)) for idx, tri in enumerate(final_trias) } new_mesh = deepcopy(pyschism_mesh) new_mesh.elements = Elements(pyschism_mesh.nodes, elements) - return new_mesh - def get_wwm_params(run_name, schism_nml) -> f90nml.Namelist: - + # Get relevant values from SCHISM setup begin_time = datetime( year=schism_nml['opt']['start_year'], @@ -93,7 +90,7 @@ def get_wwm_params(run_name, schism_nml) -> f90nml.Namelist: wwm_delta_t = nstep_wwm * delta_t # For now just read the example file update relevant names and write - wwm_params = f90nml.read(REFS/'wwminput.nml') + wwm_params = f90nml.read(REFS / 'wwminput.nml') wwm_params.uppercase = True proc_nml = wwm_params['PROC'] @@ -120,21 +117,21 @@ def get_wwm_params(run_name, schism_nml) -> f90nml.Namelist: grid_nml['IGRIDTYPE'] = 3 bouc_nml = wwm_params['BOUC'] - # Begin time of the wave boundary file (FILEWAVE) + # Begin time of the wave boundary file (FILEWAVE) bouc_nml['BEGTC'] = begin_time.strftime(time_fmt) - # Time step in FILEWAVE + # Time step in FILEWAVE bouc_nml['DELTC'] = 1 - # Unit can be HR, MIN, SEC + # Unit can be HR, MIN, SEC bouc_nml['UNITC'] = 'HR' # End time bouc_nml['ENDTC'] = end_time.strftime(time_fmt) # Boundary file defining boundary conditions and Neumann nodes. bouc_nml['FILEBOUND'] = 'wwmbnd.gr3' - bouc_nml['BEGTC_OUT'] = 20030908.000000 + bouc_nml['BEGTC_OUT'] = 20030908.000000 bouc_nml['DELTC_OUT'] = 600.000000000000 bouc_nml['UNITC_OUT'] = 'SEC' - bouc_nml['ENDTC_OUT'] = 20031008.000000 - + bouc_nml['ENDTC_OUT'] = 20031008.000000 + hist_nml = wwm_params['HISTORY'] # Start output time, yyyymmdd. hhmmss; # must fit the simulation time otherwise no output. @@ -169,42 +166,42 @@ def get_wwm_params(run_name, schism_nml) -> f90nml.Namelist: hot_nml = wwm_params['HOTFILE'] # Write hotfile hot_nml['LHOTF'] = False - #'.nc' suffix will be added -# hot_nml['FILEHOT_OUT'] = 'wwm_hot_out' -# #Starting time of hotfile writing. With ihot!=0 in SCHISM, -# # this will be whatever the new hotstarted time is (even with ihot=2) -# hot_nml['BEGTC'] = '20030908.000000' -# # time between hotfile writes -# hot_nml['DELTC'] = 86400. -# # unit used above -# hot_nml['UNITC'] = 'SEC' -# # Ending time of hotfile writing (adjust with BEGTC) -# hot_nml['ENDTC'] = '20031008.000000' -# # Applies only to netcdf -# # If T then hotfile contains 2 last records. 
-# # If F then hotfile contains N record if N outputs -# # have been done. -# # For binary only one record. -# hot_nml['LCYCLEHOT'] = True -# # 1: binary hotfile of data as output -# # 2: netcdf hotfile of data as output (default) -# hot_nml['HOTSTYLE_OUT'] = 2 -# # 0: hotfile in a single file (binary or netcdf) -# # MPI_REDUCE is then used and thus youd avoid too freq. output -# # 1: hotfiles in separate files, each associated -# # with one process -# hot_nml['MULTIPLEOUT'] = 0 -# # (Full) hot file name for input -# hot_nml['FILEHOT_IN'] = 'wwm_hot_in.nc' -# # 1: binary hotfile of data as input -# # 2: netcdf hotfile of data as input (default) -# hot_nml['HOTSTYLE_IN'] = 2 -# # Position in hotfile (only for netcdf) -# # for reading -# hot_nml['IHOTPOS_IN'] = 1 -# # 0: read hotfile from one single file -# # 1: read hotfile from multiple files (must use same # of CPU?) -# hot_nml['MULTIPLEIN'] = 0 + #'.nc' suffix will be added + # hot_nml['FILEHOT_OUT'] = 'wwm_hot_out' + # #Starting time of hotfile writing. With ihot!=0 in SCHISM, + # # this will be whatever the new hotstarted time is (even with ihot=2) + # hot_nml['BEGTC'] = '20030908.000000' + # # time between hotfile writes + # hot_nml['DELTC'] = 86400. + # # unit used above + # hot_nml['UNITC'] = 'SEC' + # # Ending time of hotfile writing (adjust with BEGTC) + # hot_nml['ENDTC'] = '20031008.000000' + # # Applies only to netcdf + # # If T then hotfile contains 2 last records. + # # If F then hotfile contains N record if N outputs + # # have been done. + # # For binary only one record. + # hot_nml['LCYCLEHOT'] = True + # # 1: binary hotfile of data as output + # # 2: netcdf hotfile of data as output (default) + # hot_nml['HOTSTYLE_OUT'] = 2 + # # 0: hotfile in a single file (binary or netcdf) + # # MPI_REDUCE is then used and thus youd avoid too freq. output + # # 1: hotfiles in separate files, each associated + # # with one process + # hot_nml['MULTIPLEOUT'] = 0 + # # (Full) hot file name for input + # hot_nml['FILEHOT_IN'] = 'wwm_hot_in.nc' + # # 1: binary hotfile of data as input + # # 2: netcdf hotfile of data as input (default) + # hot_nml['HOTSTYLE_IN'] = 2 + # # Position in hotfile (only for netcdf) + # # for reading + # hot_nml['IHOTPOS_IN'] = 1 + # # 0: read hotfile from one single file + # # 1: read hotfile from multiple files (must use same # of CPU?) + # hot_nml['MULTIPLEIN'] = 0 return wwm_params @@ -221,16 +218,15 @@ def update_schism_params(path: Path) -> f90nml.Namelist: opt_nml['icou_elfe_wwm'] = 1 opt_nml['nstep_wwm'] = 4 opt_nml['iwbl'] = 0 - opt_nml['hmin_radstress'] = 1. + opt_nml['hmin_radstress'] = 1.0 # TODO: Revisit for spinup support # NOTE: Issue 7#issuecomment-1482848205 oceanmodeling fork -# opt_nml['nrampwafo'] = 0 - opt_nml['drampwafo'] = 0. + # opt_nml['nrampwafo'] = 0 + opt_nml['drampwafo'] = 0.0 opt_nml['turbinj'] = 0.15 opt_nml['turbinjds'] = 1.0 opt_nml['alphaw'] = 0.5 - # NOTE: Python index is different from the NML index schout_nml = schism_nml['schout'] @@ -239,39 +235,39 @@ def update_schism_params(path: Path) -> f90nml.Namelist: schout_nml.start_index.update(iof_hydro=[14], iof_wwm=[1]) - #sig. height (m) {sigWaveHeight} 2D + # sig. 
height (m) {sigWaveHeight} 2D schout_nml['iof_wwm'][0] = 1 - #Mean average period (sec) - TM01 {meanWavePeriod} 2D + # Mean average period (sec) - TM01 {meanWavePeriod} 2D schout_nml['iof_wwm'][1] = 0 - #Zero down crossing period for comparison with buoy (s) - TM02 {zeroDowncrossPeriod} 2D + # Zero down crossing period for comparison with buoy (s) - TM02 {zeroDowncrossPeriod} 2D schout_nml['iof_wwm'][2] = 0 - #Average period of wave runup/overtopping - TM10 {TM10} 2D + # Average period of wave runup/overtopping - TM10 {TM10} 2D schout_nml['iof_wwm'][3] = 0 - #Mean wave number (1/m) {meanWaveNumber} 2D + # Mean wave number (1/m) {meanWaveNumber} 2D schout_nml['iof_wwm'][4] = 0 - #Mean wave length (m) {meanWaveLength} 2D + # Mean wave length (m) {meanWaveLength} 2D schout_nml['iof_wwm'][5] = 0 - #Mean average energy transport direction (degr) - MWD in NDBC? {meanWaveDirection} 2D + # Mean average energy transport direction (degr) - MWD in NDBC? {meanWaveDirection} 2D schout_nml['iof_wwm'][6] = 0 - #Mean directional spreading (degr) {meanDirSpreading} 2D + # Mean directional spreading (degr) {meanDirSpreading} 2D schout_nml['iof_wwm'][7] = 0 - #Discrete peak period (sec) - Tp {peakPeriod} 2D + # Discrete peak period (sec) - Tp {peakPeriod} 2D schout_nml['iof_wwm'][8] = 1 - #Continuous peak period based on higher order moments (sec) {continuousPeakPeriod} 2D + # Continuous peak period based on higher order moments (sec) {continuousPeakPeriod} 2D schout_nml['iof_wwm'][9] = 0 - #Peak phase vel. (m/s) {peakPhaseVel} 2D + # Peak phase vel. (m/s) {peakPhaseVel} 2D schout_nml['iof_wwm'][10] = 0 - #Peak n-factor {peakNFactor} 2D + # Peak n-factor {peakNFactor} 2D schout_nml['iof_wwm'][11] = 0 - #Peak group vel. (m/s) {peakGroupVel} 2D + # Peak group vel. (m/s) {peakGroupVel} 2D schout_nml['iof_wwm'][12] = 0 - #Peak wave number {peakWaveNumber} 2D + # Peak wave number {peakWaveNumber} 2D schout_nml['iof_wwm'][13] = 0 - #Peak wave length {peakWaveLength} 2D + # Peak wave length {peakWaveLength} 2D schout_nml['iof_wwm'][14] = 0 - #Peak (dominant) direction (degr) {dominantDirection} 2D + # Peak (dominant) direction (degr) {dominantDirection} 2D schout_nml['iof_wwm'][15] = 1 - #Peak directional spreading {peakSpreading} 2D + # Peak directional spreading {peakSpreading} 2D schout_nml['iof_wwm'][16] = 0 return schism_nml diff --git a/stormworkflow/refs/input.yaml b/stormworkflow/refs/input.yaml new file mode 100644 index 0000000..1e8141e --- /dev/null +++ b/stormworkflow/refs/input.yaml @@ -0,0 +1,40 @@ +--- +input_version: 0.0.1 + +storm: "florence" +year: 2018 +suffix: "" +subset_mesh: 1 +hr_prelandfall: -1 +past_forecast: 1 +hydrology: 0 +use_wwm: 0 +pahm_model: "gahm" +num_perturb: 2 +sample_rule: "korobov" +spinup_exec: "pschism_PAHM_TVD-VL" +hotstart_exec: "pschism_PAHM_TVD-VL" + +hpc_solver_nnodes: 3 +hpc_solver_ntasks: 108 +hpc_account: "" +hpc_partition: "" + +RUN_OUT: "" +L_NWM_DATASET: "" +L_TPXO_DATASET: "" +L_LEADTIMES_DATASET: "" +L_TRACK_DIR: "" +L_DEM_HI: "" +L_DEM_LO: "" +L_MESH_HI: "" +L_MESH_LO: "" +L_SHP_DIR: "" + +TMPDIR: "/tmp" +PATH_APPEND: "" + +L_SOLVE_MODULES: + - "intel/2022.1.2" + - "impi/2022.1.2" + - "netcdf" diff --git a/docker/pyschism/docker/refs/param.nml b/stormworkflow/refs/param.nml old mode 100755 new mode 100644 similarity index 100% rename from docker/pyschism/docker/refs/param.nml rename to stormworkflow/refs/param.nml diff --git a/docker/pyschism/docker/refs/wwminput.nml b/stormworkflow/refs/wwminput.nml old mode 100755 new mode 100644 similarity index 100% rename 
from docker/pyschism/docker/refs/wwminput.nml rename to stormworkflow/refs/wwminput.nml diff --git a/docker/schism/docker/combine_gr3.exp b/stormworkflow/scripts/combine_gr3.exp old mode 100755 new mode 100644 similarity index 100% rename from docker/schism/docker/combine_gr3.exp rename to stormworkflow/scripts/combine_gr3.exp diff --git a/singularity/solve/files/entrypoint.sh b/stormworkflow/scripts/entrypoint.sh old mode 100755 new mode 100644 similarity index 100% rename from singularity/solve/files/entrypoint.sh rename to stormworkflow/scripts/entrypoint.sh diff --git a/singularity/scripts/workflow.sh b/stormworkflow/scripts/workflow.sh similarity index 56% rename from singularity/scripts/workflow.sh rename to stormworkflow/scripts/workflow.sh index da699a2..e9fab6c 100755 --- a/singularity/scripts/workflow.sh +++ b/stormworkflow/scripts/workflow.sh @@ -1,44 +1,81 @@ #!/bin/bash set -e -# User inputs... -THIS_SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) -source $THIS_SCRIPT_DIR/input.conf - -if [ $use_wwm == 1 ]; then hotstart_exec='pschism_WWM_PAHM_TVD-VL'; fi - -# PATH -export PATH=$L_SCRIPT_DIR:$PATH +export PATH=$PATH:$PATH_APPEND +export TMPDIR # Processing... mkdir -p $TMPDIR +# CHECK VER +### pip install --quiet --report - --dry-run --no-deps -r requirements.txt | jq -r '.install' + +# CHECK BIN +# combine_hotstart7 +# pschism ... +input_file=$1 + +function version { + logfile=$1 + pip list | grep $2 >> $logfile +} + +function add_sbatch_header { + fnm=${2##*\/} + awk '!found && /^#SBATCH/ { print "#SBATCH '$1'"; found=1 } 1' $2 > /tmp/$fnm + mv /tmp/$fnm $2 +} + function init { - local run_dir=/lustre/hurricanes/$1 + local run_dir=$RUN_OUT/$1 mkdir $run_dir -# mkdir $run_dir/downloads + mkdir $run_dir/slurm + mkdir $run_dir/output mkdir $run_dir/mesh mkdir $run_dir/setup mkdir $run_dir/nhc_track mkdir $run_dir/coops_ssh + + for i in $L_SCRIPT_DIR/*.sbatch; do + d=$run_dir/slurm/${i##*\/} + cp $i $d + if [ ! -z $hpc_partition ]; then + add_sbatch_header "--partition=$hpc_partition" $d + fi + if [ ! -z $hpc_account ]; then + add_sbatch_header "--account=$hpc_account" $d + fi + done + + logfile=$run_dir/versions.info + version $logfile stormevents + version $logfile ensembleperturbation + version $logfile ocsmesh + echo "SCHISM: see solver.version each outputs dir" >> $logfile + + cp $input_file $run_dir/input.yaml + echo $run_dir } uuid=$(uuidgen) tag=${storm}_${year}_${uuid} +if [ ! 
-z $suffix ]; then tag=${tag}_${suffix}; fi run_dir=$(init $tag) echo $run_dir -singularity run $SINGULARITY_BINDFLAGS $L_IMG_DIR/info.sif \ +hurricane_data \ --date-range-outpath $run_dir/setup/dates.csv \ --track-outpath $run_dir/nhc_track/hurricane-track.dat \ --swath-outpath $run_dir/windswath \ --station-data-outpath $run_dir/coops_ssh/stations.nc \ --station-location-outpath $run_dir/setup/stations.csv \ $(if [ $past_forecast == 1 ]; then echo "--past-forecast"; fi) \ - --hours-before-landfall $hr_prelandfall \ - --lead-times $L_LEADTIMES_DATASET \ - $storm $year + --hours-before-landfall "$hr_prelandfall" \ + --lead-times "$L_LEADTIMES_DATASET" \ + --preprocessed-tracks-dir "$L_TRACK_DIR" \ + --countries-polygon "$L_SHP_DIR/ne_110m_cultural/ne_110m_admin_0_countries.shp" \ + $storm $year 2>&1 | tee "${run_dir}/output/head_hurricane_data.out" MESH_KWDS="" @@ -64,23 +101,27 @@ else fi MESH_KWDS+=" --out ${run_dir}/mesh" export MESH_KWDS -sbatch --wait --export=ALL,MESH_KWDS,STORM=$storm,YEAR=$year,IMG=$L_IMG_DIR/ocsmesh.sif $L_SCRIPT_DIR/mesh.sbatch +sbatch \ + --output "${run_dir}/output/slurm-%j.mesh.out" \ + --wait \ + --job-name=mesh_$tag \ + --export=ALL,MESH_KWDS,STORM=$storm,YEAR=$year \ + $run_dir/slurm/mesh.sbatch echo "Download necessary data..." # TODO: Separate pairing NWM-elem from downloading! DOWNLOAD_KWDS="" if [ $hydrology == 1 ]; then DOWNLOAD_KWDS+=" --with-hydrology"; fi -singularity run $SINGULARITY_BINDFLAGS $L_IMG_DIR/prep.sif download_data \ +download_data \ --output-directory $run_dir/setup/ensemble.dir/ \ --mesh-directory $run_dir/mesh/ \ --date-range-file $run_dir/setup/dates.csv \ --nwm-file $L_NWM_DATASET \ - $DOWNLOAD_KWDS + $DOWNLOAD_KWDS 2>&1 | tee "${run_dir}/output/head_download_nwm.out" echo "Setting up the model..." -PREP_KWDS="setup_ensemble" PREP_KWDS+=" --track-file $run_dir/nhc_track/hurricane-track.dat" PREP_KWDS+=" --output-directory $run_dir/setup/ensemble.dir/" PREP_KWDS+=" --num-perturbations $num_perturb" @@ -97,10 +138,12 @@ PREP_KWDS+=" --pahm-model $pahm_model" export PREP_KWDS # NOTE: We need to wait because run jobs depend on perturbation dirs! 
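# Editor's note (descriptive only, not part of the patch): the Slurm chain assembled below is
#   prep/setup  - submitted --parsable, and with --wait because the runs need its perturbation dirs
#   spinup      - held with "-d afterok:$setup_id" until setup succeeds
#   ensemble    - each run held with "-d afterok:$spinup_id"; job ids collected in $joblist
#   post        - held with "-d afterok${joblist}" until every run succeeds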
setup_id=$(sbatch \ + --output "${run_dir}/output/slurm-%j.setup.out" \ --wait \ + --job-name=prep_$tag \ --parsable \ --export=ALL,PREP_KWDS,STORM=$storm,YEAR=$year,IMG="$L_IMG_DIR/prep.sif" \ - $L_SCRIPT_DIR/prep.sbatch \ + $run_dir/slurm/prep.sbatch \ ) @@ -110,27 +153,33 @@ SCHISM_SHARED_ENV+="ALL" SCHISM_SHARED_ENV+=",IMG=$L_IMG_DIR/solve.sif" SCHISM_SHARED_ENV+=",MODULES=$L_SOLVE_MODULES" spinup_id=$(sbatch \ + --nodes $hpc_solver_nnodes --ntasks $hpc_solver_ntasks \ --parsable \ + --output "${run_dir}/output/slurm-%j.spinup.out" \ + --job-name=spinup_$tag \ -d afterok:$setup_id \ - --export=$SCHISM_SHARED_ENV,SCHISM_DIR="$run_dir/setup/ensemble.dir/spinup",SCHISM_EXEC="$spinup_exec" \ - $L_SCRIPT_DIR/schism.sbatch + --export="$SCHISM_SHARED_ENV",SCHISM_EXEC="$spinup_exec" \ + $run_dir/slurm/schism.sbatch "$run_dir/setup/ensemble.dir/spinup" ) joblist="" for i in $run_dir/setup/ensemble.dir/runs/*; do jobid=$( sbatch --parsable -d afterok:$spinup_id \ - --export=$SCHISM_SHARED_ENV,SCHISM_DIR="$i",SCHISM_EXEC="$hotstart_exec" \ - $L_SCRIPT_DIR/schism.sbatch + --nodes $hpc_solver_nnodes --ntasks $hpc_solver_ntasks \ + --output "${run_dir}/output/slurm-%j.run-$(basename $i).out" \ + --job-name="run_$(basename $i)_$tag" \ + --export="$SCHISM_SHARED_ENV",SCHISM_EXEC="$hotstart_exec" \ + $run_dir/slurm/schism.sbatch "$i" ) joblist+=":$jobid" done -#echo "Wait for ${joblist}" -#srun -d afterok${joblist} --pty sleep 1 # Post processing sbatch \ --parsable \ + --output "${run_dir}/output/slurm-%j.post.out" \ + --job-name=post_$tag \ -d afterok${joblist} \ --export=ALL,IMG="$L_IMG_DIR/prep.sif",ENSEMBLE_DIR="$run_dir/setup/ensemble.dir/" \ - $L_SCRIPT_DIR/post.sbatch + $run_dir/slurm/post.sbatch diff --git a/stormworkflow/slurm/mesh.sbatch b/stormworkflow/slurm/mesh.sbatch new file mode 100644 index 0000000..757b9cd --- /dev/null +++ b/stormworkflow/slurm/mesh.sbatch @@ -0,0 +1,9 @@ +#!/bin/bash +#SBATCH --parsable +#SBATCH --exclusive +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 + +set -ex + +hurricane_mesh ${STORM} ${YEAR} ${MESH_KWDS} diff --git a/singularity/scripts/post.sbatch b/stormworkflow/slurm/post.sbatch similarity index 54% rename from singularity/scripts/post.sbatch rename to stormworkflow/slurm/post.sbatch index 8376b20..4ef59f8 100644 --- a/singularity/scripts/post.sbatch +++ b/stormworkflow/slurm/post.sbatch @@ -1,17 +1,14 @@ #!/bin/bash #SBATCH --parsable -#SBATCH --exclusive -#SBATCH --mem=0 +#SBATCH --time=05:00:00 #SBATCH --nodes=1 set -ex -singularity run ${SINGULARITY_BINDFLAGS} ${IMG} \ - combine_ensemble \ +combine_ensemble \ --ensemble-dir $ENSEMBLE_DIR \ --tracks-dir $ENSEMBLE_DIR/track_files -singularity run ${SINGULARITY_BINDFLAGS} ${IMG} \ - analyze_ensemble \ +analyze_ensemble \ --ensemble-dir $ENSEMBLE_DIR \ --tracks-dir $ENSEMBLE_DIR/track_files diff --git a/stormworkflow/slurm/prep.sbatch b/stormworkflow/slurm/prep.sbatch new file mode 100644 index 0000000..8beb298 --- /dev/null +++ b/stormworkflow/slurm/prep.sbatch @@ -0,0 +1,9 @@ +#!/bin/bash +#SBATCH --parsable +#SBATCH --exclusive +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 + +set -ex + +setup_ensemble ${PREP_KWDS} ${STORM} ${YEAR} diff --git a/stormworkflow/slurm/schism.sbatch b/stormworkflow/slurm/schism.sbatch new file mode 100644 index 0000000..fef7e43 --- /dev/null +++ b/stormworkflow/slurm/schism.sbatch @@ -0,0 +1,41 @@ +#!/bin/bash +#SBATCH --parsable +#SBATCH --exclusive +#SBATCH --time=03:00:00 + +set -ex + +SCHISM_DIR=$1 +pushd $SCHISM_DIR +mkdir -p outputs + + +if [ ! 
-z "$MODULES" ]; then + module purge + module load $MODULES + module list +fi + +export MV2_ENABLE_AFFINITY=0 +ulimit -s unlimited + +date +${SCHISM_EXEC} -v > outputs/solver.version +mpirun -np $SLURM_NTASKS ${SCHISM_EXEC} 4 + +if [ $? -eq 0 ]; then + echo "Combining outputs..." + date + pushd outputs + if ls hotstart* >/dev/null 2>&1; then + times=$(ls hotstart_* | grep -o "hotstart[0-9_]\+" | awk 'BEGIN {FS = "_"}; {print $3}' | sort -h | uniq ) + for i in $times; do + combine_hotstart7 --iteration $i + done + fi + popd +fi + + +echo "Done" +date diff --git a/terraform/backend/backend.tf b/terraform/backend/backend.tf deleted file mode 100644 index e0d880e..0000000 --- a/terraform/backend/backend.tf +++ /dev/null @@ -1,87 +0,0 @@ -provider "aws" { - region = "us-east-1" -} - - -################### -resource "aws_s3_bucket" "odssm-s3-backend" { - bucket = "tacc-nos-icogs-backend" - - tags = { - Name = "On-Demand Storm Surge Modeling" - Phase = "Development" - POCName = "saeed.moghimi@noaa.gov" - Project = "NOAA ICOGS-C" - LineOffice = "NOS" - DivisionBranch = "CSDL-CMMB" - Reason = "terraform" - } -} - - -################### -resource "aws_s3_bucket_acl" "odssm-s3-backend-acl" { - bucket = aws_s3_bucket.odssm-s3-backend.id - acl = "private" -} - - -################### -resource "aws_s3_bucket_public_access_block" "odssm-s3-backend-accessblock" { - bucket = aws_s3_bucket.odssm-s3-backend.id - - block_public_acls = true - block_public_policy = true - ignore_public_acls = true - restrict_public_buckets = true -} - - -################### -resource "aws_kms_key" "odssm-kms-s3-backend" { - description = "This key is used to encrypt bucket objects" - deletion_window_in_days = 10 -} - - -################### -resource "aws_s3_bucket_server_side_encryption_configuration" "odssm-s3-backend-encrypt" { - bucket = aws_s3_bucket.odssm-s3-backend.bucket - - rule { - apply_server_side_encryption_by_default { - kms_master_key_id = aws_kms_key.odssm-kms-s3-backend.arn - sse_algorithm = "aws:kms" - } - } -} - - -################### -resource "aws_s3_bucket_versioning" "odssm-s3-backend-versioning" { - bucket = aws_s3_bucket.odssm-s3-backend.id - versioning_configuration { - status = "Enabled" - } -} - - -################### -resource "aws_s3_bucket_lifecycle_configuration" "odssm-s3-backend-lifecycle" { - # Must have bucket versioning enabled first - depends_on = [aws_s3_bucket_versioning.odssm-s3-backend-versioning] - - bucket = aws_s3_bucket.odssm-s3-backend.id - - rule { - id = "statefile" - - filter {} - - noncurrent_version_expiration { - noncurrent_days = 60 - } - - status = "Enabled" - } -} diff --git a/terraform/main.tf b/terraform/main.tf deleted file mode 100644 index ad8d550..0000000 --- a/terraform/main.tf +++ /dev/null @@ -1,1100 +0,0 @@ -terraform { - backend "s3" { - bucket = "tacc-nos-icogs-backend" - key = "terraform/state" - region = "us-east-1" - } -} - -locals { - common_tags = { - Name = "On-Demand Storm Surge Modeling" - Phase = "Development" - POCName = "saeed.moghimi@noaa.gov" - Project = "NOAA ICOGS-C" - LineOffice = "NOS" - DivisionBranch = "CSDL-CMMB" - } - docker_user = "ondemand-user" - task_role_arn = "arn:aws:iam::${var.account_id}:role/${var.role_prefix}_ECS_Role" - execution_role_arn = "arn:aws:iam::${var.account_id}:role/${var.role_prefix}_ECS_Role" - ec2_profile_name = "${var.role_prefix}_ECS_Role2" - ecs_profile_name = "${var.role_prefix}_ECS_Role" - subnet_idx = 3 - ec2_ami = "ami-0d5eff06f840b45e9" - ecs_ami = "ami-03fe4d5b1d229063a" - # TODO: Make these to be terraform 
variables - ansible_var_path = "../ansible/inventory/group_vars/vars_from_terraform" - prefect_var_path = "../prefect/vars_from_terraform" - pvt_key_path = "~/.ssh/tacc_aws" - pub_key_path = "~/.ssh/tacc_aws.pub" - dev = "soroosh" -} - -################### -provider "aws" { - region = "us-east-1" -} - - -################ -data "aws_region" "current" {} - -################ -data "aws_caller_identity" "current" {} - -################ -data "aws_availability_zones" "available" { - state = "available" -} - -################ -data "aws_s3_object" "odssm-prep-ud" { - bucket = aws_s3_bucket.odssm-s3["statics"].bucket - key = "userdata/userdata-ocsmesh.txt" -} - -################ -data "aws_s3_object" "odssm-solve-ud" { - bucket = aws_s3_bucket.odssm-s3["statics"].bucket - key = "userdata/userdata-schism.txt" -} - -################ -data "aws_s3_object" "odssm-post-ud" { - bucket = aws_s3_bucket.odssm-s3["statics"].bucket - key = "userdata/userdata-viz.txt" -} - -################ -data "aws_s3_object" "odssm-wf-ud" { - bucket = aws_s3_bucket.odssm-s3["statics"].bucket - key = "userdata/userdata-wf.txt" -} - - -################### -resource "local_file" "odssm-ansible-vars" { - content = yamlencode({ - ansible_ssh_private_key_file = abspath(pathexpand(local.pvt_key_path)) - aws_account_id = data.aws_caller_identity.current.account_id - aws_default_region: data.aws_region.current.name - ec2_public_ip = aws_instance.odssm-local-agent-ec2.public_ip - ecs_task_role = local.task_role_arn - ecs_exec_role = local.execution_role_arn - efs_id = aws_efs_file_system.odssm-efs.id - prefect_image = aws_ecr_repository.odssm-repo["odssm-workflow"].repository_url - }) - filename = local.ansible_var_path -} -################### -resource "local_file" "odssm-prefect-vars" { - content = yamlencode({ - S3_BUCKET = aws_s3_bucket.odssm-s3["prefect"].bucket - OCSMESH_CLUSTER = aws_ecs_cluster.odssm-cluster-prep.name - SCHISM_CLUSTER = aws_ecs_cluster.odssm-cluster-solve.name - VIZ_CLUSTER = aws_ecs_cluster.odssm-cluster-post.name - WF_CLUSTER = aws_ecs_cluster.odssm-cluster-workflow.name - OCSMESH_TEMPLATE_1_ID = aws_launch_template.odssm-prep-instance-1-template.id - OCSMESH_TEMPLATE_2_ID = aws_launch_template.odssm-prep-instance-2-template.id - SCHISM_TEMPLATE_ID = aws_launch_template.odssm-solve-instance-template.id - VIZ_TEMPLATE_ID = aws_launch_template.odssm-post-instance-template.id - WF_TEMPLATE_ID = aws_launch_template.odssm-workflow-instance-template.id - ECS_TASK_ROLE = local.task_role_arn - ECS_EXEC_ROLE = local.execution_role_arn - ECS_SUBNET_ID = aws_subnet.odssm-subnet.id - ECS_EC2_SG = [ - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - WF_IMG = "${aws_ecr_repository.odssm-repo["odssm-workflow"].repository_url}:v0.4" - WF_ECS_TASK_ARN = aws_ecs_task_definition.odssm-flowrun-task.arn - }) - filename = local.prefect_var_path -} - -################### -resource "aws_key_pair" "odssm-ssh-key" { - key_name = "noaa-ondemand-${local.dev}-tacc-prefect-ssh-key" - public_key = file("${local.pub_key_path}") - tags = local.common_tags -} - - -################### -resource "aws_s3_bucket" "odssm-s3" { - for_each = { - statics = "tacc-nos-icogs-static" - prefect = "tacc-nos-icogs-prefect" - results = "tacc-icogs-results" - } - bucket = "${each.value}" - - tags = merge( - local.common_tags, - { - Reason = "${each.key}" - } - ) -} - - -################### -resource "aws_s3_bucket_acl" "odssm-s3-acl" { - for_each = aws_s3_bucket.odssm-s3 - bucket = 
each.value.id - acl = "private" -} - - -################### -resource "aws_s3_bucket_public_access_block" "odssm-s3-accessblock" { - for_each = aws_s3_bucket.odssm-s3 - bucket = each.value.id - - block_public_acls = true - block_public_policy = true - ignore_public_acls = true - restrict_public_buckets = true -} - - -################### -resource "aws_s3_bucket" "odssm-s3-website" { - bucket = "tacc-icogs-results-website" - - tags = merge( - local.common_tags, - { - Reason = "website" - } - ) -} - - -################### -resource "aws_s3_bucket_acl" "odssm-s3-website-acl" { - bucket = aws_s3_bucket.odssm-s3-website.id - acl = "public-read" -} - -################### -resource "aws_s3_bucket_website_configuration" "odssm-s3-website-config" { - bucket = aws_s3_bucket.odssm-s3-website.bucket - - index_document { - suffix = "index.html" - } - -} - -################### -resource "aws_efs_file_system" "odssm-efs" { - tags = local.common_tags -} - - -################### -resource "aws_efs_mount_target" "odssm-efs-mount" { - file_system_id = aws_efs_file_system.odssm-efs.id - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ - aws_security_group.odssm-sg-efs.id - ] -} - - -################### -resource "aws_vpc" "odssm-vpc" { - assign_generated_ipv6_cidr_block = false - cidr_block = "172.31.0.0/16" - enable_dns_hostnames = true - enable_dns_support = true - tags = local.common_tags -} - - -################### -resource "aws_subnet" "odssm-subnet" { - vpc_id = aws_vpc.odssm-vpc.id - - assign_ipv6_address_on_creation = false - - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - - cidr_block = "172.31.0.0/20" - - tags = local.common_tags -} - - -################### -resource "aws_security_group" "odssm-sg-default" { - name = "default" - description = "default VPC security group" - vpc_id = aws_vpc.odssm-vpc.id - - egress = [ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 0 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "-1" - security_groups = [] - self = false - to_port = 0 - } - ] - ingress = [ - { - cidr_blocks = [] - description = "" - from_port = 0 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "-1" - security_groups = [] - self = true - to_port = 0 - } - ] - - tags = local.common_tags -} - - -################### -resource "aws_security_group" "odssm-sg-ecsout" { - name = "ecs" - description = "Allow ecs to access the internet" - vpc_id = aws_vpc.odssm-vpc.id - - egress = [ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 0 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "-1" - security_groups = [] - self = false - to_port = 0 - } - ] - ingress = [] - - tags = local.common_tags -} - - -################### -resource "aws_security_group" "odssm-sg-efs" { - name = "efs" - description = "Allow EFS/NFS mounts" - vpc_id = aws_vpc.odssm-vpc.id - - egress = [ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 2049 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "tcp" - security_groups = [] - self = false - to_port = 2049 - } - ] - ingress = [ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 2049 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "tcp" - security_groups = [] - self = false - to_port = 2049 - } - ] - - tags = local.common_tags -} - -################### -resource "aws_security_group" "odssm-sg-ssh" { - name = "ssh-access" - description = "Allow SSH" - vpc_id = aws_vpc.odssm-vpc.id - - egress = 
[ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 22 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "tcp" - security_groups = [] - self = false - to_port = 22 - } - ] - ingress = [ - { - cidr_blocks = [ - "0.0.0.0/0" - ] - description = "" - from_port = 22 - ipv6_cidr_blocks = [] - prefix_list_ids = [] - protocol = "tcp" - security_groups = [] - self = false - to_port = 22 - } - ] - - tags = local.common_tags -} - - -################### -resource "aws_ecr_repository" "odssm-repo" { - for_each = { - odssm-info = "Fetch hurricane information" - odssm-mesh = "Mesh the domain" - odssm-prep = "Setup SCHISM model" - odssm-solve = "Run SCHISM model" - odssm-post = "Generate visualizations" - odssm-workflow = "Run SCHISM model" - } - name = "${each.key}" - - image_tag_mutability = "IMMUTABLE" - - image_scanning_configuration { - scan_on_push = true - } - - - tags = merge( - local.common_tags, - { - Description = "${each.value}" - } - ) -} - -################### -resource "aws_ecs_cluster" "odssm-cluster-prep" { - - name = "odssm-ocsmesh" - - setting { - name = "containerInsights" - value = "disabled" - } - - tags = merge( - local.common_tags, - { - Description = "Cluster used for model preparation" - } - ) -} - - -################### -resource "aws_ecs_cluster" "odssm-cluster-solve" { - name = "odssm-schism" - - setting { - name = "containerInsights" - value = "disabled" - } - - tags = merge( - local.common_tags, - { - Description = "Cluster used for solving the model" - } - ) -} - -################### -resource "aws_ecs_cluster" "odssm-cluster-post" { - name = "odssm-viz" - - setting { - name = "containerInsights" - value = "disabled" - } - - tags = merge( - local.common_tags, - { - Description = "Cluster used for generating visualizations" - } - ) -} - - -################### -resource "aws_ecs_cluster" "odssm-cluster-workflow" { - name = "odssm-wf" - - setting { - name = "containerInsights" - value = "disabled" - } - - tags = merge( - local.common_tags, - { - Description = "Cluster used for running Prefect Flows on ECS" - } - ) -} - - -################### -resource "aws_ecs_task_definition" "odssm-info-task" { - - family = "odssm-info" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - task_role_arn = local.task_role_arn - execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "info" - image = "${aws_ecr_repository.odssm-repo["odssm-info"].repository_url}:v0.11" - - essential = true - - memoryReservation = 2000 # MB - mountPoints = [ - { - containerPath = "/home/${local.docker_user}/app/io/output" - sourceVolume = "efs_vol" - } - ] - logConfiguration = { - logDriver = "awslogs", - options = { - awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, - awslogs-create-group = "true", - awslogs-region = data.aws_region.current.name, - awslogs-stream-prefix = "odssm-info" - } - } - }]) - - volume { - name = "efs_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/hurricanes" - } - } - - tags = local.common_tags -} - - -################### -resource "aws_ecs_task_definition" "odssm-mesh-task" { - - family = "odssm-mesh" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - task_role_arn = local.task_role_arn - execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "mesh" - image = "${aws_ecr_repository.odssm-repo["odssm-mesh"].repository_url}:v0.11" - - essential = true - - 
memoryReservation = 123000 # MB - mountPoints = [ - { - containerPath = "/home/${local.docker_user}/app/io" - sourceVolume = "efs_vol" - }, - ] - logConfiguration = { - logDriver = "awslogs", - options = { - awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, - awslogs-create-group = "true", - awslogs-region = data.aws_region.current.name, - awslogs-stream-prefix = "odssm-mesh" - } - } - }]) - - volume { - name = "efs_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/" - } - } - - tags = local.common_tags -} - - -################### -resource "aws_ecs_task_definition" "odssm-prep-task" { - - family = "odssm-prep" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - task_role_arn = local.task_role_arn - execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "prep" - image = "${aws_ecr_repository.odssm-repo["odssm-prep"].repository_url}:v0.17" - - essential = true - - memoryReservation = 2000 # MB - mountPoints = [ - { - containerPath = "/home/${local.docker_user}/app/io/" - sourceVolume = "efs_vol" - } - ] - logConfiguration = { - logDriver = "awslogs", - options = { - awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, - awslogs-create-group = "true", - awslogs-region = data.aws_region.current.name, - awslogs-stream-prefix = "odssm-prep" - } - } - }]) - - volume { - name = "efs_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/" - } - } - - tags = local.common_tags -} - - -################### -resource "aws_ecs_task_definition" "odssm-solve-task" { - - family = "odssm-solve" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - task_role_arn = local.task_role_arn - execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "solve" - image = "${aws_ecr_repository.odssm-repo["odssm-solve"].repository_url}:v0.10" - - essential = true - - environment = [ - { - name = "SCHISM_NPROCS" - value = "48" - } - ] - - linuxParameters = { - capabilities = { - add = ["SYS_PTRACE"] - } - } - - memoryReservation = 50000 # MB - mountPoints = [ - { - containerPath = "/home/${local.docker_user}/app/io/hurricanes" - sourceVolume = "hurr_vol" - } - ] - logConfiguration = { - logDriver = "awslogs", - options = { - awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, - awslogs-create-group = "true", - awslogs-region = data.aws_region.current.name, - awslogs-stream-prefix = "odssm-solve" - } - } - }]) - - volume { - name = "hurr_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/hurricanes" - } - } - - tags = local.common_tags -} - - -################### -resource "aws_ecs_task_definition" "odssm-post-task" { - - family = "odssm-post" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - task_role_arn = local.task_role_arn - execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "post" - image = "${aws_ecr_repository.odssm-repo["odssm-post"].repository_url}:v0.7" - - essential = true - - memoryReservation = 6000 # MB - mountPoints = [ - { - containerPath = "/home/${local.docker_user}/app/io/hurricanes" - sourceVolume = "hurr_vol" - } - ] - logConfiguration = { - logDriver = "awslogs", - options = { - awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, - awslogs-create-group = "true", - awslogs-region = data.aws_region.current.name, - 
awslogs-stream-prefix = "odssm-post" - } - } - }]) - - volume { - name = "hurr_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/hurricanes" - } - } - - tags = local.common_tags -} - - -################### -resource "aws_ecs_task_definition" "odssm-flowrun-task" { - - family = "odssm-prefect-flowrun" - network_mode = "bridge" - requires_compatibilities = [ "EC2" ] - # Use the instance profile the task si running on instead -# task_role_arn = local.task_role_arn -# execution_role_arn = local.execution_role_arn - - container_definitions = jsonencode([ - { - name = "flow" - image = "${aws_ecr_repository.odssm-repo["odssm-workflow"].repository_url}:v0.4" - - essential = true - - memoryReservation = 500 # MB - mountPoints = [ - { - containerPath = "/efs" - sourceVolume = "efs_vol" - } - ] -# logConfiguration = { -# logDriver = "awslogs", -# options = { -# awslogs-group = aws_cloudwatch_log_group.odssm-cw-log-grp.name, -# awslogs-create-group = "true", -# awslogs-region = data.aws_region.current.name, -# awslogs-stream-prefix = "odssm-prefect-flowrun" -# } -# } - }]) - - volume { - name = "efs_vol" - efs_volume_configuration { - file_system_id = aws_efs_file_system.odssm-efs.id - root_directory = "/" - } - } - - tags = local.common_tags -} - -################### -resource "aws_instance" "odssm-local-agent-ec2" { - - ami = local.ec2_ami - - associate_public_ip_address = true - - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - - iam_instance_profile = local.ec2_profile_name - - instance_type = "t3.small" # micro cannot handle multiple workflow runs, t2 network is slow - - key_name = aws_key_pair.odssm-ssh-key.id - - subnet_id = aws_subnet.odssm-subnet.id - - vpc_security_group_ids = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - - tags = merge( - local.common_tags, - { - Role = "Workflow management agent" - } - ) - -} - - -################### -resource "aws_launch_template" "odssm-prep-instance-1-template" { - - name = "odssm-ocsmesh-awsall" - description = "Instance with sufficient memory for meshing process" - update_default_version = true - - image_id = local.ecs_ami - - key_name = aws_key_pair.odssm-ssh-key.key_name - - instance_type = "m5.8xlarge" - - iam_instance_profile { - name = local.ecs_profile_name - } - - network_interfaces { - associate_public_ip_address = true - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - } - - - block_device_mappings { - device_name = "/dev/xvda" - - ebs { - volume_size = 300 - } - } - - ebs_optimized = true - - placement { - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - } - - tag_specifications { - resource_type = "instance" - - tags = merge( - local.common_tags, - { - Role = "Run model preparation tasks" - } - ) - } - - user_data = base64encode(data.aws_s3_object.odssm-prep-ud.body) -} - - -################### -resource "aws_launch_template" "odssm-prep-instance-2-template" { - - name = "odssm-ocsmesh-hybrid" - description = "Instance for pre processing in when meshing is done on HPC" - update_default_version = true - - image_id = local.ecs_ami - - key_name = aws_key_pair.odssm-ssh-key.key_name - - instance_type = "m5.xlarge" - - 
iam_instance_profile { - name = local.ecs_profile_name - } - - network_interfaces { - associate_public_ip_address = true - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - } - - - block_device_mappings { - device_name = "/dev/xvda" - - ebs { - volume_size = 300 - } - } - - ebs_optimized = true - - placement { - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - } - - tag_specifications { - resource_type = "instance" - - tags = merge( - local.common_tags, - { - Role = "Run model preparation tasks" - } - ) - } - - user_data = base64encode(data.aws_s3_object.odssm-prep-ud.body) -} - - -################### -resource "aws_launch_template" "odssm-solve-instance-template" { - - name = "odssm-schism" - description = "Instance with sufficient compute power for SCHISM" - update_default_version = true - - image_id = local.ecs_ami - - key_name = aws_key_pair.odssm-ssh-key.key_name - - instance_type = "c5.metal" - - iam_instance_profile { - name = local.ecs_profile_name - } - - network_interfaces { - associate_public_ip_address = true - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - } - - block_device_mappings { - device_name = "/dev/xvda" - - ebs { - volume_size = 30 - } - } - - ebs_optimized = true - - # For ensemble runs where we need many instances, we need to spread - # over multiple az -# placement { -# availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] -# } - - tag_specifications { - resource_type = "instance" - - tags = merge( - local.common_tags, - { - Role = "Run model preparation tasks" - } - ) - } - - user_data = base64encode(data.aws_s3_object.odssm-solve-ud.body) -} - - -################### -resource "aws_launch_template" "odssm-post-instance-template" { - - name = "odssm-viz" - description = "Instance for generating visualization" - update_default_version = true - - image_id = local.ecs_ami - - key_name = aws_key_pair.odssm-ssh-key.key_name - - instance_type = "c5.xlarge" - - iam_instance_profile { - name = local.ecs_profile_name - } - - network_interfaces { - associate_public_ip_address = true - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - } - - block_device_mappings { - device_name = "/dev/xvda" - - ebs { - volume_size = 30 - } - } - - ebs_optimized = true - - placement { - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - } - - tag_specifications { - resource_type = "instance" - - tags = merge( - local.common_tags, - { - Role = "Run visualization generation tasks" - } - ) - } - - user_data = base64encode(data.aws_s3_object.odssm-post-ud.body) -} - -################### -resource "aws_launch_template" "odssm-workflow-instance-template" { - - name = "odssm-wf" - description = "Instance for running Prefect Flows as ECSRun" - update_default_version = true - - image_id = local.ecs_ami - - key_name = aws_key_pair.odssm-ssh-key.key_name - - instance_type = "c5.4xlarge" - - # The workflow ECSRun instance needs to be able to create its own instances - 
iam_instance_profile { - name = local.ec2_profile_name - } - - network_interfaces { - associate_public_ip_address = true - subnet_id = aws_subnet.odssm-subnet.id - security_groups = [ -# aws_security_group.odssm-sg-default.id, - aws_security_group.odssm-sg-efs.id, - aws_security_group.odssm-sg-ecsout.id, - aws_security_group.odssm-sg-ssh.id, - ] - } - - block_device_mappings { - device_name = "/dev/xvda" - - ebs { - volume_size = 30 - } - } - - ebs_optimized = true - - placement { - availability_zone = data.aws_availability_zones.available.names[local.subnet_idx] - } - - tag_specifications { - resource_type = "instance" - - tags = merge( - local.common_tags, - { - Role = "Run ECSRun tasks" - } - ) - } - - user_data = base64encode(data.aws_s3_object.odssm-wf-ud.body) -} - -resource "aws_cloudwatch_log_group" "odssm-cw-log-grp" { - name = "odssm_ecs_task_docker_logs" - -# kms_key_id = - - tags = merge( - local.common_tags, - { - Role = "Watch ECS logs" - } - ) -} diff --git a/terraform/outputs.tf b/terraform/outputs.tf deleted file mode 100644 index 5c5789a..0000000 --- a/terraform/outputs.tf +++ /dev/null @@ -1,30 +0,0 @@ -output "ec2_ip" { - description = "IP of the EC2 instance" - value = aws_instance.odssm-local-agent-ec2.public_ip -} - -output "efs_id" { - description = "ID of the EFS instance" - value = aws_efs_file_system.odssm-efs.id -} - -output "ecr_url" { - description = "URL of the ECR Repositories" - value = toset([ - for repo in aws_ecr_repository.odssm-repo: repo.repository_url - ]) -} - -output "ansible_var_path" { - description = "Path of the Ansible variable file written to local disk" - value = local_file.odssm-ansible-vars.filename -} - -output "prefect_var_path" { - description = "Path of the Prefect variable file written to local disk" - value = local_file.odssm-prefect-vars.filename -} - -output "account_id" { - value = data.aws_caller_identity.current.account_id -} diff --git a/terraform/ud/userdata-ocsmesh.txt b/terraform/ud/userdata-ocsmesh.txt deleted file mode 100644 index 5b98656..0000000 --- a/terraform/ud/userdata-ocsmesh.txt +++ /dev/null @@ -1,3 +0,0 @@ -#!/bin/bash - -echo ECS_CLUSTER=odssm-ocsmesh >> /etc/ecs/ecs.config diff --git a/terraform/ud/userdata-schism.txt b/terraform/ud/userdata-schism.txt deleted file mode 100644 index 0dca752..0000000 --- a/terraform/ud/userdata-schism.txt +++ /dev/null @@ -1,3 +0,0 @@ -#!/bin/bash - -echo ECS_CLUSTER=odssm-schism >> /etc/ecs/ecs.config diff --git a/terraform/ud/userdata-viz.txt b/terraform/ud/userdata-viz.txt deleted file mode 100644 index c27ec09..0000000 --- a/terraform/ud/userdata-viz.txt +++ /dev/null @@ -1,3 +0,0 @@ -#!/bin/bash - -echo ECS_CLUSTER=odssm-viz >> /etc/ecs/ecs.config diff --git a/terraform/ud/userdata-wf.txt b/terraform/ud/userdata-wf.txt deleted file mode 100644 index a51b9a2..0000000 --- a/terraform/ud/userdata-wf.txt +++ /dev/null @@ -1,3 +0,0 @@ -#!/bin/bash - -echo ECS_CLUSTER=odssm-wf >> /etc/ecs/ecs.config diff --git a/terraform/variables.tf b/terraform/variables.tf deleted file mode 100644 index ec9a0ed..0000000 --- a/terraform/variables.tf +++ /dev/null @@ -1,9 +0,0 @@ -variable account_id { - type = string - description = "AWS account ID number used for the system" -} - -variable role_prefix { - type = string - description = "Prefix used for the name of the roles created for this system" -} diff --git a/tests/test.sh b/tests/test.sh new file mode 100755 index 0000000..4d67ed6 --- /dev/null +++ b/tests/test.sh @@ -0,0 +1,102 @@ +#!/bin/bash + +THIS_SCRIPT_DIR=$( cd -- "$( dirname -- 
"${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +source $THIS_SCRIPT_DIR/input.conf + +TEST_OUT=/nhc/Soroosh.Mani/runs/ +SINGULARITY_ROOT=$L_SCRIPT_DIR/../ + +init () { + uuid=$(uuidgen) + tag=test_${2}_${3}_${uuid} + + local run_dir=$1/$tag + mkdir $run_dir + echo $run_dir +} + +test_hurricane_info () { + storm=florence + year=2018 + hr_prelandfall=48 + + run_dir=$(init $TEST_OUT $storm $year) + + this_test_out=$run_dir/info_w_leadjson_preptrack_$hr_prelandfall + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --past-forecast \ + --hours-before-landfall "$hr_prelandfall" \ + --lead-times "$L_LEADTIMES_DATASET" \ + --preprocessed-tracks-dir "$L_TRACK_DIR" \ + $storm $year + + this_test_out=$run_dir/info_w_leadjson_$hr_prelandfall + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --past-forecast \ + --hours-before-landfall "$hr_prelandfall" \ + --lead-times "$L_LEADTIMES_DATASET" \ + $storm $year + + this_test_out=$run_dir/info_w_leadjson_24 + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --past-forecast \ + --hours-before-landfall 24 \ + --lead-times "$L_LEADTIMES_DATASET" \ + $storm $year + + this_test_out=$run_dir/info_w_preptrack_$hr_prelandfall + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --past-forecast \ + --hours-before-landfall "$hr_prelandfall" \ + --preprocessed-tracks-dir "$L_TRACK_DIR" \ + $storm $year + + this_test_out=$run_dir/info_w_leadjson_besttrack_$hr_prelandfall + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --hours-before-landfall "$hr_prelandfall" \ + --lead-times "$L_LEADTIMES_DATASET" \ + $storm $year + + this_test_out=$run_dir/info_w_besttrack_$hr_prelandfall + mkdir $this_test_out + python $SINGULARITY_ROOT/info/files/hurricane_data.py \ + --date-range-outpath $this_test_out/dates.csv \ + --track-outpath $this_test_out/hurricane-track.dat \ + --swath-outpath $this_test_out/windswath \ + --station-data-outpath $this_test_out/stations.nc \ + --station-location-outpath $this_test_out/stations.csv \ + --hours-before-landfall "$hr_prelandfall" \ + $storm $year +} + +$1