Developer Guide¶
See algorithms for a description of the compilation algorithms and some high-level implementation considerations. I hope you like recursion! ;)
Coding Standards¶
See coding standards
Git Etiquette¶
See git etiquette
Known Issues¶
Globbing Unexpected File Order¶
Suppose a workflow step generates a file that associates filenames with numerical values, and the values are extracted row by row (or column by column) from that file. There is no guarantee that retrieving the files with glob will return them in an order consistent with the array of extracted values. The example below illustrates a related case without scattering, in which glob returns the output files in a different order from the inputs. With scattering it is possible in some cases to induce an output order consistent with the input order. However, the best practice is to read the filename indices from an output file rather than rely on glob. That way the developer does not have to think about which cases glob might or might not work in.
input: [3, 2, 1] # Here is the order of the input array.
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
baseCommand: ["python", "example.py"]
requirements:
  - class: InitialWorkDirRequirement
    listing:
      # See https://www.commonwl.org/user_guide/topics/creating-files-at-runtime.html
      - entryname: example.py
        entry: |
          import sys
          from pathlib import Path
          for arg in sys.argv[1:]:
              Path(f'{arg}.txt').touch()
inputs:
  input:
    type: Any[]
    inputBinding:
      position: 0
outputs:
  output:
    type: File[]
    outputBinding:
      glob: "*.txt"
cwltool touch_array.cwl touch_array_inputs.yml
INFO [job test.cwl] /tmp/0x0q86bg$ python \
example.py \
3 \
2 \
1
INFO [job test.cwl] completed success
{
    "output": [
        {
            "location": "file:///home/walkerbd/workflow-inference-compiler/cwl_adapters/1.txt",
            "basename": "1.txt",
            "class": "File",
            "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
            "size": 0,
            "path": "/home/walkerbd/workflow-inference-compiler/cwl_adapters/1.txt"
        },
        {
            "location": "file:///home/walkerbd/workflow-inference-compiler/cwl_adapters/2.txt",
            "basename": "2.txt",
            "class": "File",
            "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
            "size": 0,
            "path": "/home/walkerbd/workflow-inference-compiler/cwl_adapters/2.txt"
        },
        {
            "location": "file:///home/walkerbd/workflow-inference-compiler/cwl_adapters/3.txt",
            "basename": "3.txt",
            "class": "File",
            "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
            "size": 0,
            "path": "/home/walkerbd/workflow-inference-compiler/cwl_adapters/3.txt"
        }
    ]
}
As the output JSON shows, the files are returned as 1.txt, 2.txt, 3.txt, which is not the input order [3, 2, 1].
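The recommended practice of recording the order in an output file, rather than relying on glob, could look roughly like the following Python sketch. It is illustrative only: the function names and the `file_order.lst` manifest name are hypothetical and not part of Sophios or cwltool. The manifest deliberately does not end in `.txt`, so a `*.txt` glob would not pick it up.

```python
import tempfile
from pathlib import Path


def touch_with_manifest(values, workdir, manifest_name="file_order.lst"):
    """Create '<value>.txt' for each value and record the names in input order.

    `manifest_name` is a hypothetical file name chosen for this sketch.
    """
    workdir = Path(workdir)
    names = []
    for value in values:
        name = f"{value}.txt"
        (workdir / name).touch()
        names.append(name)
    # The manifest preserves the input order, which glob does not guarantee.
    manifest = workdir / manifest_name
    manifest.write_text("\n".join(names) + "\n")
    return manifest


def read_in_manifest_order(workdir, manifest_name="file_order.lst"):
    """Return the file paths in the order recorded in the manifest."""
    workdir = Path(workdir)
    lines = (workdir / manifest_name).read_text().splitlines()
    return [workdir / line for line in lines if line]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        touch_with_manifest([3, 2, 1], tmp)
        # Files come back in input order, independent of directory listing order.
        print([p.name for p in read_in_manifest_order(tmp)])
```

A downstream step would then read the manifest to recover the order instead of trusting the order of the globbed File[] array.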
Partial Failures¶
When the partial failures feature is enabled, the post-processing JavaScript can still crash even though the subprocess for the workflow step itself passes, as seen below. The Sophios compiler only understands the semantics of Sophios/CWL, so it cannot, in general, correct mistakes in the JavaScript embedded in an arbitrary workflow. The corresponding CWL snippet is shown first, followed by the error.
outputs:
  topology_changed:
    type: boolean
    outputBinding:
      glob: valid.txt
      loadContents: true
      outputEval: |
        ${
          // Read the contents of the file
          const lines = self[0].contents.split("\n");
          // Read boolean value from the first line
          const valid = lines[0].trim() === "True";
          return valid;
        }
stdout was: ''
stderr was: 'evalmachine.<anonymous>:45
const lines = self[0].contents.split("\n");
^
TypeError: Cannot read properties of undefined (reading 'contents')
To fix this, the developer needs to add a JavaScript check that the globbed object (self[0]) exists before reading its contents, as shown below.
outputs:
  topology_changed:
    type: boolean
    outputBinding:
      glob: valid.txt
      loadContents: true
      outputEval: |
        ${
          // check if self[0] exists
          if (!self[0]) {
            return null;
          }
          // Read the contents of the file
          const lines = self[0].contents.split("\n");
          // Read boolean value from the first line
          const valid = lines[0].trim() === "True";
          return valid;
        }
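To make the three possible outcomes of the guarded expression explicit, here is a small Python model of the same logic. It is only a sketch for reasoning about the behavior and does not replace the JavaScript that CWL evaluates. The `globbed` argument is a hypothetical stand-in for CWL's `self`, represented here as a list of dictionaries with a `contents` key.

```python
def topology_changed(globbed):
    """Model of the guarded outputEval; `globbed` stands in for CWL's `self`."""
    # An empty glob result means valid.txt was never written.
    if not globbed:
        return None
    # Read the contents of the file and take the boolean from the first line.
    lines = globbed[0]["contents"].split("\n")
    return lines[0].strip() == "True"


if __name__ == "__main__":
    print(topology_changed([]))                         # file missing
    print(topology_changed([{"contents": "True\n"}]))   # first line is "True"
    print(topology_changed([{"contents": "False\n"}]))  # any other first line
```

The missing-file case yields no value instead of raising an error, which is the point of the guard.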
Workflow Development¶
When adding new .cwl or .wic files, it's best to remove the ~/wic folder, which contains the paths to .cwl and .yml files.
rm -r ~/wic
Singularity¶
When building images with Singularity, it's best to clean the cache to avoid potential errors with cwltool or cwl-docker-extract.
singularity cache clean
Toil¶
When working with Toil, be sure to clean the working state as well as the configuration file. Otherwise, the configuration file will not be updated when you change input flags.
toil clean
rm -r ~/.toil