Validation Guide
Validation answers one question: does this artifact match the shape we expect?
Sophios uses validation at several layers. Each layer catches a different kind of problem.
Validation Layers
Python API Validation
The Python API validates workflow structure before compilation. For example, workflow outputs must have sources, scatter settings must be valid, and typed explicit links must be compatible.
This catches many authoring mistakes close to the code that created them.
Leaving a required step input unbound is not an error at this layer. The
compiler can fill it in later with the same edge inference it applies to .wic
workflows.
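One of the structural checks described above, that every workflow output has a source, can be sketched in plain Python. This is a minimal illustration and not the Sophios implementation: OutputSpec, WorkflowSpec, and structural_errors are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OutputSpec:
    """Hypothetical stand-in for one declared workflow output."""
    name: str
    source: Optional[str] = None  # e.g. "align/bam"; None means no source was set


@dataclass
class WorkflowSpec:
    """Hypothetical stand-in for a workflow that is still being authored."""
    name: str
    step_outputs: set[str] = field(default_factory=set)        # outputs the steps produce
    outputs: list[OutputSpec] = field(default_factory=list)    # declared workflow outputs


def structural_errors(workflow: WorkflowSpec) -> list[str]:
    """Return every structural problem found, rather than stopping at the first."""
    errors = []
    for out in workflow.outputs:
        if out.source is None:
            errors.append(f"output '{out.name}' has no source")
        elif out.source not in workflow.step_outputs:
            errors.append(f"output '{out.name}' points at unknown source '{out.source}'")
    return errors


workflow = WorkflowSpec(
    name="demo",
    step_outputs={"align/bam"},
    outputs=[OutputSpec("bam", "align/bam"), OutputSpec("report")],
)
for problem in structural_errors(workflow):
    print(problem)
```

Collecting all problems in one pass, instead of raising on the first, lets an author fix several mistakes per run.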
CWL Validation
When you build a CommandLineTool in Python, you can validate the generated CWL:
tool.validate()
tool.save("tool.cwl", validate=True)
This checks that the generated document is valid CWL, not merely a dictionary that looks like CWL.
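To make the distinction concrete, the sketch below compares a dictionary that merely resembles a tool description with one that carries the top-level fields a CWL CommandLineTool document needs. It checks only a small subset of the CWL rules and is not how Sophios validates; missing_tool_fields is a hypothetical helper written for this example.

```python
# Top-level fields a CommandLineTool document is expected to carry.
# A real validator checks far more (types, bindings, requirements, ...).
REQUIRED_TOOL_FIELDS = ("cwlVersion", "class", "inputs", "outputs")


def missing_tool_fields(doc: dict) -> list[str]:
    """List the required top-level fields that are absent from doc."""
    return [key for key in REQUIRED_TOOL_FIELDS if key not in doc]


# Looks plausible at a glance, but is not a complete tool description.
looks_like_cwl = {"class": "CommandLineTool", "baseCommand": "echo"}

# Carries every field the toy check asks for.
minimal_tool = {
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": [],
    "outputs": [],
}

print(missing_tool_fields(looks_like_cwl))
print(missing_tool_fields(minimal_tool))
```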
.wic Schema Validation
For advanced YAML users, Sophios can generate a JSON schema for discovered tools and workflows:
sophios --generate_schemas
The generated schema powers editor validation for .wic files.
Because the schema is based on discovered tools and workflows, it can become stale. Regenerate it when you add, remove, or rename tools.
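One way to notice a stale schema is to record a fingerprint of the tool files at generation time and compare it later. The sketch below is an assumption-laden illustration, not a Sophios feature: it assumes tools are .cwl files under a single directory, and tool_fingerprint, schema_is_stale, and the stamp file are hypothetical.

```python
import hashlib
from pathlib import Path


def tool_fingerprint(tool_dir: Path) -> str:
    """Hash the relative path and contents of every .cwl file under tool_dir."""
    digest = hashlib.sha256()
    for path in sorted(tool_dir.rglob("*.cwl")):
        # Hashing the path as well as the contents means renames change the result.
        digest.update(path.relative_to(tool_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def schema_is_stale(tool_dir: Path, stamp_file: Path) -> bool:
    """True when no stamp exists or the tools changed since the stamp was written."""
    if not stamp_file.exists():
        return True
    return stamp_file.read_text().strip() != tool_fingerprint(tool_dir)
```

A wrapper script could write tool_fingerprint(...) to the stamp file right after regenerating schemas and call schema_is_stale(...) before relying on editor validation. Because the fingerprint covers file names and contents, it changes when a tool is added, removed, renamed, or edited.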
Compute Payload Validation
ComputeWorkflowPayload validates compute submission requests against the
checked-in payload schema before submission:
compute_json = payload.get_compute_payload()
That makes the submission boundary explicit: build the payload, validate it, then submit.
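The build, validate, submit ordering can be sketched as a generic pattern. The example below does not use the Sophios API: build_payload, validate_payload, submit, and the two required fields are hypothetical, and the real payload schema is not reproduced here.

```python
from typing import Callable

REQUIRED_FIELDS = ("name", "inputs")  # hypothetical; not the real payload schema


def build_payload(name: str, inputs: dict) -> dict:
    """Assemble a payload without sending anything."""
    return {"name": name, "inputs": dict(inputs)}


def validate_payload(payload: dict) -> None:
    """Raise ValueError if the payload does not have the expected shape."""
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise ValueError(f"payload is missing required fields: {missing}")
    if not isinstance(payload["inputs"], dict):
        raise ValueError("payload 'inputs' must be a mapping")


def submit(payload: dict, send: Callable[[dict], None]) -> None:
    """Validate first, so an invalid payload never reaches send."""
    validate_payload(payload)
    send(payload)


sent = []  # stands in for the remote service
submit(build_payload("demo", {"threads": 4}), sent.append)

try:
    submit({"name": "broken"}, sent.append)
except ValueError as err:
    print(f"rejected before submission: {err}")
```

Passing the transport in as send keeps the validation step testable without a network connection.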
Strictness
Schemas can be loose or strict.
The .wic schema is intentionally strict: it is generated from the tools and
workflows Sophios can discover. That means it can reject unknown steps before a
workflow reaches compilation.
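The difference between a loose and a strict check can be illustrated with a toy validator. This is a simplified sketch: the workflow is reduced to a list of steps with an id, KNOWN_STEPS stands in for the tools Sophios would discover, and none of these names come from the generated schema.

```python
KNOWN_STEPS = {"align", "sort", "index"}  # hypothetical discovered tools


def loose_errors(workflow: dict) -> list[str]:
    """Loose check: only the overall shape matters."""
    steps = workflow.get("steps")
    if not isinstance(steps, list):
        return ["'steps' must be a list"]
    return [
        f"step {i} must be a mapping with an 'id'"
        for i, step in enumerate(steps)
        if not (isinstance(step, dict) and "id" in step)
    ]


def strict_errors(workflow: dict, known_steps: set[str]) -> list[str]:
    """Strict check: the shape must be right and every step must be known."""
    errors = loose_errors(workflow)
    if errors:
        return errors
    return [
        f"unknown step '{step['id']}'"
        for step in workflow["steps"]
        if step["id"] not in known_steps
    ]


workflow = {"steps": [{"id": "align"}, {"id": "srot"}]}  # "srot" is a typo for "sort"

print(loose_errors(workflow))                # the shape is fine, so nothing is reported
print(strict_errors(workflow, KNOWN_STEPS))  # the typo is caught
```

The same strictness cuts the other way when the known set is out of date: a newly added tool that is missing from KNOWN_STEPS would be rejected just like the typo.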
Strict validation is helpful, but only when the schema is current. If validation errors look wrong, regenerate:
rm -rf autogenerated/schemas
sophios --generate_schemas
Recommended Validation Habits
For Python workflows:
Build or load tools.
Validate generated CommandLineTool objects when authoring new tools.
Compile the workflow before running it.
Inspect generated artifacts when behavior matters.
Validate compute payloads before submission.
For YAML workflows:
Regenerate schemas after tool or workflow changes.
Use editor validation for quick feedback.
Compile before running.
Use Graphviz and generated CWL for review.
Replace ambiguous inference with explicit anchors.
Property-Based Testing
Sophios can use generated schemas for property-based integration testing. The idea is to generate many synthetic workflows and check that the language and compiler behave consistently.
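The general shape of such a test can be sketched with the standard library alone. This is not the Sophios test suite: the step names, the simplified workflow structure, and the two properties are hypothetical, and real projects often use a library such as Hypothesis to generate and shrink inputs.

```python
import random

KNOWN_STEPS = ("align", "sort", "index")  # hypothetical discovered tools


def unknown_steps(workflow: dict) -> list[str]:
    """Toy validator: report step ids that are not in KNOWN_STEPS."""
    return [s["id"] for s in workflow["steps"] if s["id"] not in KNOWN_STEPS]


def random_workflow(rng: random.Random, max_steps: int = 5) -> dict:
    """Generate a synthetic workflow built only from known steps."""
    count = rng.randint(1, max_steps)
    return {"steps": [{"id": rng.choice(KNOWN_STEPS)} for _ in range(count)]}


def check_properties(trials: int = 200, seed: int = 0) -> int:
    """Check two properties on many generated workflows; return the trial count."""
    rng = random.Random(seed)  # seeded so that failures are reproducible
    for _ in range(trials):
        workflow = random_workflow(rng)
        # Property 1: workflows made of known steps are always accepted.
        assert unknown_steps(workflow) == []
        # Property 2: appending an unknown step is always detected.
        corrupted = {"steps": workflow["steps"] + [{"id": "no_such_tool"}]}
        assert unknown_steps(corrupted) == ["no_such_tool"]
    return trials


print(f"checked {check_properties()} synthetic workflows")
```

The value comes from stating properties that must hold for every generated input, rather than listing individual hand-written cases.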
This is most useful for developers and maintainers. Users do not need to understand property-based testing to benefit from schema validation in their editor or from Python API validation in their scripts.