PSMA Workflow

The package implements a probabilistic molecular activity surface workflow as a sequence of small, testable steps.

High-level flow

  1. Validate input data and backend-specific columns.

  2. Build a similarity matrix.

  3. Split compounds into train and test sets.

  4. Transform similarities into distances.

  5. Embed the training compounds into a 2D reference space.

  6. Threshold the endpoint into binary labels.

  7. Fit class-conditional KDE models.

  8. Evaluate density surfaces on a regular grid.

  9. Convert densities plus priors into a posterior surface.

  10. Project test compounds into the learned space.

  11. Score test points and compute metrics.
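Steps 6 to 9 can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the package's implementation: the function names (`threshold_labels`, `gaussian_kde_2d`, `posterior_surface`), the fixed isotropic bandwidth, the "active if value >= cutoff" convention, the use of class frequencies as priors, and the bounding-box grid are all assumptions made for the example. The posterior itself is plain Bayes' rule applied to the two class-conditional densities.

```python
import math


def threshold_labels(values, cutoff):
    """Step 6: binarize a continuous endpoint (assumed: 1 = active if value >= cutoff)."""
    return [1 if v >= cutoff else 0 for v in values]


def gaussian_kde_2d(points, bandwidth):
    """Step 7: build a 2D density function from an isotropic Gaussian kernel."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq_dist = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def posterior_surface(coords, labels, bandwidth=0.5, grid_size=5):
    """Steps 7-9: class-conditional KDEs on a regular grid, combined via Bayes' rule.

    grid_size must be at least 2.
    """
    active = [c for c, lab in zip(coords, labels) if lab == 1]
    inactive = [c for c, lab in zip(coords, labels) if lab == 0]
    if not active or not inactive:
        raise ValueError("both classes must be present in the training set")

    # Assumed prior: the fraction of active compounds in the training set.
    prior_active = len(active) / len(coords)
    f_active = gaussian_kde_2d(active, bandwidth)
    f_inactive = gaussian_kde_2d(inactive, bandwidth)

    def axis(lo, hi):
        step = (hi - lo) / (grid_size - 1)
        return [lo + i * step for i in range(grid_size)]

    # Step 8: regular grid over the bounding box of the embedded compounds.
    grid_x = axis(min(x for x, _ in coords), max(x for x, _ in coords))
    grid_y = axis(min(y for _, y in coords), max(y for _, y in coords))

    # Step 9: P(active | x) = prior * f_active / (prior * f_active + (1 - prior) * f_inactive)
    surface = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            num = prior_active * f_active(gx, gy)
            den = num + (1.0 - prior_active) * f_inactive(gx, gy)
            # Fall back to the prior where both densities underflow to zero.
            row.append(num / den if den > 0.0 else prior_active)
        surface.append(row)
    return grid_x, grid_y, surface


# Toy 2D embedding with two actives near the origin and two inactives near (2, 2).
coords = [(0.0, 0.0), (0.2, 0.1), (2.0, 2.0), (2.1, 1.9)]
labels = threshold_labels([7.2, 6.8, 4.1, 5.0], cutoff=6.0)
grid_x, grid_y, surface = posterior_surface(coords, labels)
```

Here `surface[row][col]` holds the posterior probability of the active class at `(grid_x[col], grid_y[row])`. The explicit `ValueError` reflects the requirement that both classes be present before any KDE is fitted.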

Why the workflow is structured this way

The package mirrors the paper’s workflow while keeping each stage independently testable. That makes it easier to:

  • verify parity with the original method

  • harden failure handling

  • expose a clear public API

Key constraints

  • the package assumes a 2D reference space

  • the workflow needs both classes present in the training set for KDE

  • the projection step uses a pseudoinverse-based mapping

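Keep these constraints in mind when interpreting results: the posterior surface is defined only in the 2D reference space, KDE cannot be fitted for a class that has no training compounds, and test-compound positions come from the pseudoinverse mapping rather than from re-fitting the embedding.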
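To make the pseudoinverse-based projection concrete, the sketch below places one new compound in an existing 2D embedding using only its squared distances to the training compounds. It follows the landmark-MDS style of out-of-sample triangulation, which is one common pseudoinverse-based scheme; whether the package uses this exact formula is an assumption, and `project_point` and its parameters are hypothetical names. For an n x 2 coordinate matrix X of full column rank, the pseudoinverse is (XᵀX)⁻¹Xᵀ, so only a 2 x 2 system has to be solved.

```python
def project_point(train_coords, train_sq_dists, new_sq_dists):
    """Project one compound into a 2D embedding (hypothetical helper).

    train_coords   : n (x, y) coordinates of the training compounds
    train_sq_dists : n x n squared distances among training compounds
    new_sq_dists   : n squared distances from the new compound to each training compound
    """
    n = len(train_coords)

    # Center the training coordinates; the formula assumes a centered embedding.
    mean_x = sum(x for x, _ in train_coords) / n
    mean_y = sum(y for _, y in train_coords) / n
    centered = [(x - mean_x, y - mean_y) for x, y in train_coords]

    # Mean squared distance from each training compound to all training compounds.
    row_means = [sum(row) / n for row in train_sq_dists]
    rhs = [-0.5 * (d - m) for d, m in zip(new_sq_dists, row_means)]

    # Entries of X^T X (symmetric 2 x 2) and of X^T rhs.
    a = sum(x * x for x, _ in centered)
    b = sum(x * y for x, y in centered)
    c = sum(y * y for _, y in centered)
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("training coordinates are degenerate (rank < 2)")
    tx = sum(x * r for (x, _), r in zip(centered, rhs))
    ty = sum(y * r for (_, y), r in zip(centered, rhs))

    # Apply (X^T X)^{-1} to X^T rhs, then undo the centering.
    px = (c * tx - b * ty) / det
    py = (a * ty - b * tx) / det
    return px + mean_x, py + mean_y


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


# Toy check: distances that are exactly Euclidean in 2D are recovered exactly.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_d2 = [[sq_dist(p, q) for q in train] for p in train]
target = (0.25, 0.75)
projected = project_point(train, train_d2, [sq_dist(target, p) for p in train])
```

When the input distances are not exactly Euclidean in two dimensions, which is the usual case for similarity-derived distances, the projected position is a least-squares fit rather than an exact reconstruction. Positions of test compounds, and therefore their scores, should be read with that approximation in mind.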