# Run With SMILES

Use this guide when your dataset contains SMILES strings and you want to
run the RDKit workflow that compares molecules by Tanimoto similarity of
Morgan fingerprints (`rdkit_morgan_tanimoto`).

## Required inputs

Your CSV should contain:

- an endpoint column (passed with `--y-col`)
- a SMILES column (passed with `--smiles-col`)

If no molecule identifier column is supplied, `psma` will generate one.
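
A quick pre-flight check can catch a missing column before a run starts. The sketch below is not part of `psma`: `missing_columns` is a hypothetical helper, and the column names are the ones used in the CLI example in this guide.

```python
import csv


def missing_columns(csv_path, required):
    """Return the required column names that are absent from the CSV header.

    Hypothetical helper for illustration; not a psma function.
    """
    # utf-8-sig drops a leading byte-order mark if the file has one.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

Calling `missing_columns("input.csv", ["low_solubility", "canonical_smiles"])` returns an empty list when both columns are present, and otherwise lists the names that are missing.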

## CLI example

```bash
pixi run psma run input.csv \
  --output-dir outputs/run1 \
  --y-col low_solubility \
  --label-threshold 0.5 \
  --label-direction ge \
  --similarity-method rdkit_morgan_tanimoto \
  --smiles-col canonical_smiles
```
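
For context on the `rdkit_morgan_tanimoto` method name, here is a minimal sketch of Tanimoto similarity between Morgan fingerprints using RDKit directly. It only illustrates the general technique and is not `psma`'s implementation. The helper name `morgan_tanimoto`, the radius of 2, and the 2048-bit fingerprint size are assumptions chosen because they are common settings; `psma` may use different values.

```python
def morgan_tanimoto(smiles_a, smiles_b, radius=2, fp_size=2048):
    """Tanimoto similarity between Morgan fingerprints of two SMILES strings.

    Illustrative only: radius and fp_size are assumed defaults, not
    confirmed psma settings.
    """
    # Imported here so the module loads even where RDKit is not installed.
    from rdkit import Chem, DataStructs
    from rdkit.Chem import rdFingerprintGenerator

    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=fp_size)
    fingerprints = []
    for smiles in (smiles_a, smiles_b):
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:  # RDKit returns None for SMILES it cannot parse
            raise ValueError(f"could not parse SMILES: {smiles!r}")
        fingerprints.append(generator.GetFingerprint(mol))
    # Shared on-bits divided by the union of on-bits.
    return DataStructs.TanimotoSimilarity(fingerprints[0], fingerprints[1])
```

A molecule compared with itself scores 1.0, and structurally unrelated molecules score close to 0.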

## Notes

- invalid SMILES strings fail validation during similarity
  construction
- RDKit must be installed in the environment for this backend
- to compare split behavior, rerun with a different `--split-method`
  value
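
Because invalid SMILES only surface during similarity construction, it can help to screen the SMILES column beforehand. The sketch below is a standalone check, not a `psma` feature: `rdkit_available` and `find_invalid_smiles` are hypothetical helpers, and treating empty strings as invalid is this sketch's choice (RDKit itself parses an empty string into an empty molecule).

```python
import importlib.util


def rdkit_available():
    """Return True if the rdkit package can be found in this environment."""
    return importlib.util.find_spec("rdkit") is not None


def find_invalid_smiles(smiles_list):
    """Return (index, smiles) pairs for strings that RDKit cannot parse.

    Hypothetical helper; expects a list of strings. Empty strings are
    reported as invalid by this sketch's own convention.
    """
    if not rdkit_available():
        raise RuntimeError("RDKit is not installed in this environment")
    from rdkit import Chem, RDLogger

    RDLogger.DisableLog("rdApp.*")  # silence RDKit's per-molecule parse messages
    invalid = []
    for index, smiles in enumerate(smiles_list):
        # MolFromSmiles returns None when parsing or sanitization fails.
        if not smiles or Chem.MolFromSmiles(smiles) is None:
            invalid.append((index, smiles))
    return invalid
```

Pass the values of the SMILES column as a list of strings. An empty result means every entry parsed, and any returned pairs point to the rows to fix or drop before running `psma`.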