Compute API¶
As previously mentioned, one of the beautiful things about the declarative approach to workflows is that we can execute workflows on massive machines just as easily as executing workflows on a local laptop. Concretely, merely changing --run_local
to --run_compute
, we can execute the exact same workflow on the NCATS HPC cluster! That’s it! Absolutely no modifications to the workflow itself are necessary!
Note that if your workflow inputs file references files on your local machine, then
those files will need to be manually transferred to the new machine (this could perhaps be automated, but maybe you have ~1TB of data…)
if you are using absolute file paths, the paths will need to be updated w.r.t. the new machine. (So use relative paths ;) )
Authentication Access Token¶
When using --run_compute
you will also need to use --compute_access_token $ACCESS_TOKEN
. Unfortunately, there is currently no programmatic way of obtaining the access token via an API from the command line. You will need to manually perform the following steps:
Go to https://compute.scb-ncats.io/explorer/
Click the green Authorize button. You will be taken to the NIH login page.
Enter your NIH username and password, then
Authenticate (using Microsoft Authenticator)
You will be returned to https://compute.scb-ncats.io/explorer/
Click Close (NOT Logout!)
Scroll down to HealthController
Click Try It Out and then click Execute.
You should see a massive hash string after “authorization: bearer”.
Copy the massive hash string. Be careful not to copy any other characters (like blank spaces).
In a bash terminal, create the environment variable
export ACCESS_TOKEN=...
(where … means paste in the hash string)
Unfortunately, the access token currently expires after about an hour, so you will need to repeat these steps periodically.
Workflow Status¶
After submitting a workflow, users can check on its status either using https://compute.scb-ncats.io/explorer/ or by directly logging into dali.ncats.nih.gov and using the squeue
command. Currently, the workflows are executed under the placeholder svc-polus user.
ssh <username>@dali.ncats.nih.gov
NOTE: The above server is behind the NIH VPN; you must enable the VPN to access it.
watch -n5 squeue -u svc-polus
After a ~2-3 minute initial delay you should see nodes starting, running, and finishing, corresponding to the individual steps in the workflow. The output files and logs are currently stored under /project/labshare-compute/