Validation Guide

Validation answers one question: does this artifact match the shape we expect?

Sophios uses validation at several layers. Each layer catches a different kind of problem.

Validation Layers

Python API Validation

The Python API validates workflow structure before compilation. For example, workflow outputs must have sources, scatter settings must be valid, and typed explicit links must be compatible.

This catches many authoring mistakes close to the code that created them. One case is deliberately not treated as an error: required step inputs may be left unbound, so that the compiler can apply the same edge inference it uses for .wic workflows.
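As a rough illustration of one of the structural checks described above, the sketch below flags workflow outputs that have no source. The Workflow and WorkflowOutput classes and the find_unsourced_outputs helper are hypothetical stand-ins written for this example. They are not Sophios classes, and the real checks may be organized differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkflowOutput:
    # Hypothetical stand-in for a workflow output; not a Sophios class.
    name: str
    source: Optional[str] = None  # e.g. "step_name/output_name"


@dataclass
class Workflow:
    # Hypothetical stand-in for a workflow; not a Sophios class.
    name: str
    outputs: list[WorkflowOutput] = field(default_factory=list)


def find_unsourced_outputs(workflow: Workflow) -> list[str]:
    """Return the names of workflow outputs that have no source."""
    return [out.name for out in workflow.outputs if not out.source]


wf = Workflow(
    name="demo",
    outputs=[
        WorkflowOutput("result", source="align/aligned_file"),
        WorkflowOutput("report"),  # authoring mistake: no source given
    ],
)

# Collect problems before any compilation step would run.
problems = find_unsourced_outputs(wf)
print(problems)
```

Running the check before compilation reports the mistake next to the code that built the workflow, rather than later in the compiler.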

CWL Validation

When you build a CommandLineTool in Python, you can validate the generated CWL:

tool.validate()
tool.save("tool.cwl", validate=True)

This checks that the generated document is valid CWL, not merely a dictionary that looks like CWL.

.wic Schema Validation

For advanced YAML users, Sophios can generate a JSON schema for discovered tools and workflows:

sophios --generate_schemas

The generated schema powers editor validation for .wic files.

Because the schema is based on discovered tools and workflows, it can become stale. Regenerate it when you add, remove, or rename tools.

Compute Payload Validation

ComputeWorkflowPayload validates each compute request against the checked-in payload schema before it is submitted:

compute_json = payload.get_compute_payload()

That makes the submission boundary explicit: build the payload, validate it, then submit.
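The build, validate, submit ordering can be sketched in plain Python. Everything in this sketch is an assumption made for illustration: the REQUIRED_KEYS set, the PayloadError class, and the build_payload, validate_payload, and submit functions are hypothetical and do not reflect the actual payload schema or the ComputeWorkflowPayload API.

```python
# Hypothetical required keys; the real payload schema is not shown in this guide.
REQUIRED_KEYS = {"name", "steps"}


class PayloadError(ValueError):
    """Raised when a payload does not match the expected shape."""


def build_payload(name, steps):
    # Step 1: build the payload as plain data.
    return {"name": name, "steps": list(steps)}


def validate_payload(payload):
    # Step 2: reject the payload if any required key is missing.
    missing = sorted(REQUIRED_KEYS - set(payload))
    if missing:
        raise PayloadError(f"missing required keys: {missing}")
    return payload


def submit(payload, outbox):
    # Step 3: submit only what has passed validation.
    # The outbox list stands in for a real submission endpoint.
    outbox.append(validate_payload(payload))


outbox = []
submit(build_payload("demo", ["align"]), outbox)

try:
    submit({"name": "broken"}, outbox)  # no "steps" key
    rejected = False
except PayloadError:
    rejected = True
```

Because submit always calls validate_payload first, a malformed payload never reaches the outbox.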

Strictness

Schemas can be loose or strict. A loose schema tolerates names it does not recognize; a strict schema rejects them.

The .wic schema is intentionally strict: it is generated from the tools and workflows Sophios can discover. That means it can reject unknown steps before a workflow reaches compilation.

Strict validation is helpful, but only when the schema is current. If validation errors look wrong, regenerate:

rm -rf autogenerated/schemas
sophios --generate_schemas
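The interaction between strictness and staleness can be shown with a small sketch. It assumes a simplified model in which a workflow is a list of step names and the schema is the set of names known at generation time. This is not the real .wic layout or the generated JSON schema, and the step names and the unknown_steps helper are hypothetical.

```python
def unknown_steps(step_names: list[str], known_steps: set[str]) -> list[str]:
    """Strict check: return every step name the schema does not know."""
    return [name for name in step_names if name not in known_steps]


# Names captured when the schema was last generated (hypothetical).
stale_schema = {"download", "align"}

# A workflow that uses a tool added after the schema was generated.
workflow_steps = ["download", "align", "plot"]

# With the stale schema, the new step is rejected even though the tool exists.
errors_before = unknown_steps(workflow_steps, stale_schema)

# Regenerating corresponds to rebuilding the set from the current tools.
regenerated_schema = stale_schema | {"plot"}
errors_after = unknown_steps(workflow_steps, regenerated_schema)

print(errors_before, errors_after)
```

The same strictness that catches a misspelled step name also produces the false error for the new tool, which is why regenerating is the first thing to try.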

Property-Based Testing

Sophios can use generated schemas for property-based integration testing. The idea is to generate many synthetic workflows and check that the language and compiler behave consistently.
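The general technique can be sketched with only the standard library. This is a hand-rolled illustration, not how Sophios implements its tests: the step names, the generate_workflow generator, and the toy_compile function are hypothetical, and a real test suite would more typically rely on a dedicated property-based testing library.

```python
import random

# Hypothetical step names standing in for discovered tools.
KNOWN_STEPS = ["download", "align", "plot"]


def generate_workflow(rng: random.Random, max_steps: int = 5) -> dict:
    """Generate a synthetic workflow with a random chain of known steps."""
    count = rng.randint(1, max_steps)
    return {"steps": [rng.choice(KNOWN_STEPS) for _ in range(count)]}


def toy_compile(workflow: dict) -> list[tuple[str, str]]:
    """Stand-in for a compiler: link each step to the one after it."""
    steps = workflow["steps"]
    return [(steps[i], steps[i + 1]) for i in range(len(steps) - 1)]


def check_properties(trials: int = 200, seed: int = 0) -> list[dict]:
    """Return every generated workflow that violates a property."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    failures = []
    for _ in range(trials):
        workflow = generate_workflow(rng)
        edges = toy_compile(workflow)
        # Property 1: compiling the same workflow twice gives the same result.
        deterministic = edges == toy_compile(workflow)
        # Property 2: a chain of n steps produces exactly n - 1 edges.
        right_size = len(edges) == len(workflow["steps"]) - 1
        if not (deterministic and right_size):
            failures.append(workflow)
    return failures


failures = check_properties()
```

The properties are stated over all generated inputs rather than a handful of hand-picked cases, which is what lets this style of test find inconsistencies nobody thought to write an example for.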

This is most useful for developers and maintainers. Users do not need to understand property-based testing to benefit from schema validation in their editor or from Python API validation in their scripts.