Expose split-output behavior as a configurable CLI option #81

Description

@araikes

ModelArrayIO supports split output internally through the split_outputs parameter in the NIfTI, CIFTI, and MIF conversion functions. However, this behavior is not independently configurable from the CLI.

Currently, to_modelarray() sets:

split_outputs = bool(scalar_columns)

Consequently:

  • Wide cohorts using --scalar-columns always produce one output per scalar.
  • Long-format cohorts always produce a single combined output.
  • --split-files and --split_files are not recognized CLI arguments.
  • Users cannot combine scalars from a wide cohort or split scalars from a long cohort.

Proposed behavior

Add explicit CLI options:

--split-files
--no-split-files

For consistency with existing options, --split_files and --no_split_files could also be accepted as aliases.

When no option is supplied, retain the existing behavior for backward compatibility:

split_outputs = bool(scalar_columns)

This would allow:

# Wide CSV, combined output
modelarrayio to-modelarray \
  --cohort-file cohort.csv \
  --scalar-columns FA MD \
  --no-split-files \
  --output modelarray.h5

# Long CSV, separate outputs
modelarrayio to-modelarray \
  --cohort-file cohort.csv \
  --split-files \
  --output modelarray.h5
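
One way the proposed flags and the backward-compatible default could be wired up is sketched below with argparse. This is only an illustration, not ModelArrayIO's actual CLI code: `build_parser` and `resolve_split_outputs` are hypothetical names, and the parser is a stand-in that includes only the options relevant to this issue. The key idea is a tri-state value: `None` when neither flag is given, so the existing `bool(scalar_columns)` rule still applies.

```python
import argparse


def build_parser():
    """Hypothetical stand-in parser; not the real ModelArrayIO parser."""
    parser = argparse.ArgumentParser(prog="to-modelarray")
    parser.add_argument("--scalar-columns", nargs="+", default=None)

    # Both flags write to the same destination. default=None means
    # "the user did not choose", which is distinct from False.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--split-files", "--split_files",
        dest="split_files", action="store_true", default=None,
    )
    group.add_argument(
        "--no-split-files", "--no_split_files",
        dest="split_files", action="store_false", default=None,
    )
    return parser


def resolve_split_outputs(split_files, scalar_columns):
    """Return the effective split_outputs value.

    An explicit flag wins; otherwise fall back to the current
    behavior, split_outputs = bool(scalar_columns).
    """
    if split_files is None:
        return bool(scalar_columns)
    return split_files


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_split_outputs(args.split_files, args.scalar_columns))
```

Putting the two flags in a mutually exclusive group makes argparse reject a command line that passes both, and keeping the fallback in a separate function makes the default easy to unit test.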
