First CLI Run

This tutorial walks through a first successful psma run from a CSV file using the command-line interface.

Goal

By the end of the tutorial you will:

- run the PSMA workflow from the CLI
- write the standard artifact set to disk
- know which arguments are required for a SMILES-based run

Prerequisites

- The package is installed with its runtime dependencies.
- Your input file is a CSV.
- The CSV contains:
  - an endpoint column (passed as --y-col)
  - a SMILES column (passed as --smiles-col)
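
The column prerequisite can be checked before launching a run. The sketch below is not part of psma: `missing_columns` is a hypothetical helper that reads only the CSV header with Python's standard `csv` module and reports which required names are absent.

```python
import csv


def missing_columns(csv_path, required):
    """Return the required column names that are absent from the CSV header.

    Hypothetical pre-flight helper; it is not part of the psma package.
    """
    # Only the first row (the header) is read, so this stays cheap for large files.
    # "utf-8-sig" also accepts files that start with a byte-order mark.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

For the example file used in this tutorial, `missing_columns("docs/_data/solubility_NCATS-sol.csv", ["low_solubility", "canonical_smiles"])` should return an empty list if both columns are present.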

Example command

pixi run psma run docs/_data/solubility_NCATS-sol.csv \
  --output-dir .tmp/ncats_sol_cli \
  --y-col low_solubility \
  --label-threshold 0.5 \
  --label-direction ge \
  --similarity-method rdkit_morgan_tanimoto \
  --smiles-col canonical_smiles \
  --split-method random
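
If the run needs to be launched from a script rather than typed into a shell, the same command can be assembled as an argument list. This is a minimal sketch, assuming `pixi` is on the PATH; `build_psma_command` and `run_psma` are hypothetical names, and every flag and value is copied from the command above rather than taken from any psma Python API.

```python
import subprocess


def build_psma_command(csv_path, output_dir):
    """Assemble the tutorial's example command as an argument list.

    Hypothetical helper; flags and values mirror the example command verbatim.
    """
    return [
        "pixi", "run", "psma", "run", csv_path,
        "--output-dir", output_dir,
        "--y-col", "low_solubility",
        "--label-threshold", "0.5",
        "--label-direction", "ge",
        "--similarity-method", "rdkit_morgan_tanimoto",
        "--smiles-col", "canonical_smiles",
        "--split-method", "random",
    ]


def run_psma(csv_path, output_dir):
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    return subprocess.run(build_psma_command(csv_path, output_dir), check=True)
```

Passing a list instead of a single string avoids shell quoting problems when paths contain spaces.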

What the command does

The CLI reads the CSV, validates the columns required by the selected similarity backend, computes the PSMA surface, and writes the standard artifacts to the directory given by --output-dir.

Expected outputs

After a successful run, the output directory should contain:

- train_coords.csv
- test_coords.csv
- predictions_test.csv
- surface_grid.npz
- params.json
- metrics.json
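
A quick way to confirm that a run produced the full artifact set is to compare the output directory against the list above. The sketch below uses only `pathlib`; `missing_artifacts` is a hypothetical helper, not a psma function, and it checks only that each file exists, not what it contains.

```python
from pathlib import Path

# File names taken from the expected-outputs list in this tutorial.
EXPECTED_ARTIFACTS = (
    "train_coords.csv",
    "test_coords.csv",
    "predictions_test.csv",
    "surface_grid.npz",
    "params.json",
    "metrics.json",
)


def missing_artifacts(output_dir):
    """Return the expected artifact names that are not regular files in output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]
```

After the example command, `missing_artifacts(".tmp/ncats_sol_cli")` should return an empty list if the run succeeded.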

Next steps

To understand the output files, see Output Artifacts.
To compare split strategies, see Choose Random vs Butina.