Using Tool Builder and the Workflow Python API Together

Sophios has two related Python surfaces:

  • sophios.apis.python.tool_builder for authoring a single CWL CommandLineTool

  • sophios.apis.python.workflow for wiring tools into a workflow with Step and Workflow

Those APIs are intentionally separate, but they can be combined cleanly.

This guide shows the intended end-to-end pattern:

  1. define a new tool in Python,

  2. validate that tool as a real CWL CommandLineTool,

  3. convert it into an in-memory Step,

  4. compose it with the normal Sophios workflow API.

The important part is that the handoff stays in memory. You do not need to write a temporary .cwl file just to use a freshly built tool inside a workflow.

A runnable version of this pattern lives in examples/scripts/tool_builder_workflow.py.

When to use this pattern

This hybrid style is useful when:

  • a tool does not exist yet as a checked-in .cwl file,

  • you want to generate a family of similar tools from Python,

  • you want to validate the generated CLT before putting it into a workflow,

  • or you want a workflow to mix generated tools with ordinary file-backed Step(...) objects.

If you only need to build a single standalone CLT, start with tool_builder_sam3.

If you already have checked-in .cwl tools and only need to compose them, start with the Python Workflow API.

If your next step is compute submission rather than local execution, continue with ichnaea_compact_compute for the larger end-to-end example or compute_payload_workflow for the lower-level compute payload API.

Mental model

The cleanest way to think about the boundary is:

  • CommandLineTool(...) defines a tool contract

  • tool.validate() checks that contract as real CWL

  • Step(tool, step_name=...) turns that contract into a workflow node

  • Workflow(...) composes that node with other steps

That separation is deliberate.

The builder does not need to know about workflows. The workflow API does not need to know how the tool was authored. The handoff is direct: Step(tool, step_name=...) lets client code use a built tool with the same workflow API used for file-backed steps.
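The separation can be pictured with a small stand-alone toy model. The classes below (ToyTool, ToyStep) are hypothetical stand-ins written only for illustration; they are not the Sophios classes and do not reflect how Sophios is implemented. The sketch only shows the idea that a step wrapper can accept either a file path or an in-memory tool object and present the same interface afterwards.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Union


@dataclass
class ToyTool:
    """Hypothetical stand-in for a built tool: a name plus declared ports."""

    name: str
    inputs: List[str]
    outputs: List[str]

    def validate(self) -> None:
        # Tool-level check only: nothing here knows about workflows.
        if not self.name:
            raise ValueError("a tool needs a name")
        if len(set(self.inputs)) != len(self.inputs):
            raise ValueError("input names must be unique")


@dataclass
class ToyStep:
    """Hypothetical workflow node that wraps a file path or a built tool."""

    source: Union[Path, ToyTool]
    step_name: str = ""

    def describe(self) -> str:
        in_memory = isinstance(self.source, ToyTool)
        kind = "in-memory tool" if in_memory else "file-backed"
        # Both Path and ToyTool expose a .name attribute.
        return f"{self.step_name or self.source.name} ({kind})"


tool = ToyTool("emit_text", inputs=["message"], outputs=["file"])
tool.validate()
steps = [
    ToyStep(tool, step_name="emit_text"),
    ToyStep(Path("cwl_adapters") / "cat.cwl"),
]
for step in steps:
    print(step.describe())
```

Once wrapped, both steps are handled through the same describe() call, which is the property the real Step(tool, step_name=...) handoff is meant to provide for inputs and outputs.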

What we will build

We will build a compact example tool called emit_text:

  • it accepts one string input named message,

  • it runs echo,

  • it captures stdout into a file,

  • and it exposes that file as a normal CWL File output.

Then we will:

  • convert that built tool into a Sophios Step,

  • feed its file output into the existing checked-in cat.cwl,

  • and expose a workflow output called result.

So the final workflow shape is:

literal message
    -> emit_text (built in Python)
    -> cat.cwl (file-backed step)
    -> workflow output "result"

Full example

The snippet below assumes you are running from the repository root, so the checked-in adapter path cwl_adapters/cat.cwl is valid as written.

from pathlib import Path

from sophios.apis.python.tool_builder import (
    CommandLineTool,
    Input,
    Inputs,
    Output,
    Outputs,
    cwl,
)
from sophios.apis.python.workflow import (
    Step,
    Workflow,
)


def build_emit_text_tool() -> CommandLineTool:
    """Build the emit_text tool: echo one message and capture stdout as a File."""
    inputs = Inputs(
        # position=1 passes the message as the first positional argument to echo.
        message=Input(cwl.string, position=1)
        .label("Message")
        .doc("Text to print to stdout"),
    )

    outputs = Outputs(
        # The glob must match the file name given to .stdout(...) below.
        file=Output(cwl.file, glob="stdout")
        .label("Captured stdout")
        .doc("The file produced by redirecting stdout"),
    )

    return (
        CommandLineTool("emit_text", inputs, outputs)
        .describe(
            "Emit a message",
            "Example CLT built in Python and consumed by the workflow API.",
        )
        .base_command("echo")
        .stdout("stdout")
    )


def build_workflow() -> Workflow:
    emit_tool = build_emit_text_tool()

    # Validate the generated CLT before composing it into the workflow.
    emit_tool.validate()

    # No temporary file is needed here. The CLT is handed to Step in memory.
    emit_step = Step(emit_tool, step_name="emit_text")

    # This is an ordinary checked-in CWL adapter.
    cat_step = Step(Path("cwl_adapters") / "cat.cwl")

    workflow = Workflow([emit_step, cat_step], "builder_and_pyapi_demo")

    # Recommended explicit binding style: values go into inputs.
    emit_step.inputs.message = "hello from Sophios"
    cat_step.inputs.file = emit_step.outputs.file

    # Expose a workflow output.
    workflow.outputs.result = cat_step.outputs.output
    return workflow


workflow = build_workflow()
compiler_info = workflow.write_artifacts()

Why this example is structured this way

There are a few details worth calling out.

1. The CLT is complete before it becomes a step

The emit_text tool is a real CommandLineTool first:

inputs = Inputs(
    message=Input(cwl.string, position=1),
)

outputs = Outputs(
    file=Output(cwl.file, glob="stdout"),
)

tool = (
    CommandLineTool("emit_text", inputs, outputs)
    .base_command("echo")
    .stdout("stdout")
)

That matters because the builder API is responsible for answering tool-level questions:

  • what are the inputs,

  • what are the outputs,

  • what command runs,

  • how are stdout/stderr/files represented.

The workflow API should not need to rebuild that information later.
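For orientation, the tool-level answers listed above map onto standard CWL fields. The dictionary below is a hand-written approximation of the emit_text tool as a CWL CommandLineTool document. It is not output captured from tool_builder; the cwlVersion value and the exact field layout that Sophios emits are assumptions.

```python
import json

# Hand-written sketch of emit_text as a CWL document (not generated by Sophios).
emit_text_cwl = {
    "cwlVersion": "v1.2",  # assumption: the version Sophios targets may differ
    "class": "CommandLineTool",
    "id": "emit_text",
    # What command runs.
    "baseCommand": "echo",
    # How stdout is represented: redirected into a file named "stdout".
    "stdout": "stdout",
    # What the inputs are: one string, passed as the first positional argument.
    "inputs": {
        "message": {"type": "string", "inputBinding": {"position": 1}},
    },
    # What the outputs are: the redirected stdout file, found by glob.
    "outputs": {
        "file": {"type": "File", "outputBinding": {"glob": "stdout"}},
    },
}

print(json.dumps(emit_text_cwl, indent=2))
```

Note that the output glob and the stdout file name are the same string; that pairing is what lets the captured stdout surface as a File output.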

2. tool.validate() happens at the tool boundary

Validation belongs naturally on the builder side:

emit_tool.validate()

That checks the generated tool as a valid CWL CommandLineTool before it participates in a larger workflow.

For self-authored tools, that is usually the best debugging boundary:

  • first make the tool valid,

  • then compose it into the workflow.

This is more than a syntax check. It verifies the CWL shape, the declared inputs and outputs, and the tool contract that the workflow will consume.

3. Step(tool) is the handoff

The conversion is a single call:

emit_step = Step(emit_tool, step_name="emit_text")

That call:

  • uses the built tool directly,

  • avoids a temporary .cwl file,

  • and gives you a normal Step.

The equivalent convenience form also works:

emit_step = emit_tool.to_step(step_name="emit_text")

After that, you work with the object exactly like any other Step:

emit_step.inputs.message = "hello from Sophios"
cat_step.inputs.file = emit_step.outputs.file

That is the main design goal of the handoff: once a built tool becomes a step, users work with the same inputs and outputs API they use for file-backed steps.

4. Workflow bindings should stay explicit

This guide uses the explicit form:

emit_step.inputs.message = "hello from Sophios"
cat_step.inputs.file = emit_step.outputs.file
workflow.outputs.result = cat_step.outputs.output

That is easier to read than the legacy shorthand and makes directionality obvious:

  • inputs.* are places you can bind values,

  • outputs.* are places you can read values from.

The old shorthand still exists for compatibility, but explicit namespaces are the preferred documentation style.
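Each explicit binding has a natural counterpart in CWL workflow syntax. The dictionary below is a hand-written sketch of that correspondence, not the document Sophios compiles: the step ids are simplified, the run entry for emit_text is abbreviated, and the File type of the cat output is an assumption.

```python
# Hand-written sketch of the workflow in CWL terms (not Sophios compiler output).
workflow_cwl = {
    "cwlVersion": "v1.2",  # assumption
    "class": "Workflow",
    "inputs": {},
    "outputs": {
        # workflow.outputs.result = cat_step.outputs.output
        "result": {"type": "File", "outputSource": "cat/output"},
    },
    "steps": {
        "emit_text": {
            # Abbreviated: stands for the in-memory tool embedded inline.
            "run": {"class": "CommandLineTool", "id": "emit_text"},
            # emit_step.inputs.message = "hello from Sophios"
            "in": {"message": {"default": "hello from Sophios"}},
            "out": ["file"],
        },
        "cat": {
            "run": "cwl_adapters/cat.cwl",
            # cat_step.inputs.file = emit_step.outputs.file
            "in": {"file": "emit_text/file"},
            "out": ["output"],
        },
    },
}

# List every binding so the direction of data flow is visible.
for step_name, step in workflow_cwl["steps"].items():
    for port, source in step["in"].items():
        print(f"{step_name}.inputs.{port} <- {source}")
for port, spec in workflow_cwl["outputs"].items():
    print(f"workflow.outputs.{port} <- {spec['outputSource']}")
```

Reading the sketch alongside the Python bindings shows why the explicit form is easy to follow: every assignment to inputs.* becomes an entry under "in", and every read from outputs.* becomes a "step/port" source.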

What gets written to disk

Only the compiled workflow artifacts are written when you call:

workflow.write_artifacts()

The generated emit_text CLT does not need to be written as a standalone .cwl file first.

That means this pattern is suitable for:

  • generated tools,

  • parameterized tools,

  • short-lived tools used only inside a larger workflow,

  • and tests that want to build tools programmatically.

Validation and inspection points

There are two separate checks here, and they answer different user questions.

1. Tool validation

emit_tool.validate() checks the generated CLT as a real CWL document.

That tells you:

  • the tool structure is valid,

  • the CWL fields are in the right shape,

  • and the generated CLT is ready to be composed into a workflow.
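To make "structure" and "shape" concrete, here is a minimal stand-alone sketch of the kind of check involved. The function check_tool_shape is hypothetical and far narrower than real CWL validation: it only accepts the expanded map form of inputs and outputs and only looks for a type on each port. It is not what emit_tool.validate() runs internally.

```python
from typing import List


def check_tool_shape(doc: dict) -> List[str]:
    """Return a list of problems found in a CommandLineTool-like dict."""
    problems = []
    if doc.get("class") != "CommandLineTool":
        problems.append("class must be 'CommandLineTool'")
    for section in ("inputs", "outputs"):
        ports = doc.get(section)
        if not isinstance(ports, dict):
            problems.append(f"{section} must be a mapping of port name to definition")
            continue
        for port, spec in ports.items():
            # Simplification: require the expanded form {"type": ...}.
            if not isinstance(spec, dict) or "type" not in spec:
                problems.append(f"{section}.{port} is missing a type")
    return problems


good = {
    "class": "CommandLineTool",
    "inputs": {"message": {"type": "string"}},
    "outputs": {"file": {"type": "File"}},
}
bad = {"class": "CommandLineTool", "inputs": {"message": {}}}

print(check_tool_shape(good))
print(check_tool_shape(bad))
```

The first call reports no problems; the second reports the untyped input and the missing outputs section. Catching issues like these on the tool alone keeps them from surfacing later as harder-to-read workflow errors.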

2. Workflow compilation

workflow.compile(...) checks that the generated step can participate in the normal Sophios compilation path.

That tells you:

  • the workflow API can consume the built tool,

  • the step ports are wired correctly,

  • and the result compiles into the same pipeline machinery as any other Sophios workflow.

Those are different checks, and both are useful before a generated tool becomes part of a larger workflow.
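The "step ports are wired correctly" part can also be illustrated in isolation. The sketch below is a hypothetical, simplified wiring check over plain data structures; the names PORTS, EDGES, and check_wiring are invented for this example and say nothing about how workflow.compile(...) is implemented.

```python
from typing import Dict, List, Set, Tuple

Port = Tuple[str, str]  # (step name, port name)

# Declared ports per step: step name -> (input names, output names).
PORTS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "emit_text": ({"message"}, {"file"}),
    "cat": ({"file"}, {"output"}),
}

# Each edge connects a producer output to a consumer input.
EDGES: List[Tuple[Port, Port]] = [(("emit_text", "file"), ("cat", "file"))]


def check_wiring(ports, edges) -> List[str]:
    """Report edges that refer to ports a step does not declare."""
    empty = (set(), set())
    errors = []
    for (src_step, src_port), (dst_step, dst_port) in edges:
        if src_port not in ports.get(src_step, empty)[1]:
            errors.append(f"{src_step} has no output {src_port!r}")
        if dst_port not in ports.get(dst_step, empty)[0]:
            errors.append(f"{dst_step} has no input {dst_port!r}")
    return errors


print(check_wiring(PORTS, EDGES))

# A miswired edge: emit_text declares "file", not "stdout".
broken = [(("emit_text", "stdout"), ("cat", "file"))]
print(check_wiring(PORTS, broken))
```

A tool can pass its own validation and still be wired to the wrong port, which is why the tool-level and workflow-level checks are worth running separately.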

Summary

The combined Python pattern is:

  • use tool_builder to define a proper CWL tool,

  • validate it while it is still a tool,

  • turn it into a Step in memory,

  • compose it with ordinary Sophios workflow steps.

That gives you the best of both worlds:

  • the rigor of a real CWL CommandLineTool,

  • and the composability of the Sophios workflow Python API.

Run the example script

From the repository root:

python examples/scripts/tool_builder_workflow.py

The script validates the generated CLTs and compiles the workflow by default. To run the workflow locally or write the generated CLTs for inspection, edit the configuration constants near the top of the script.