Similarity Backends

psma currently supports three ways to build the similarity matrix.

RDKit Morgan Tanimoto

This backend takes SMILES strings, or fingerprints you have already computed, derives Morgan fingerprints where needed, and then computes pairwise Tanimoto similarity.

Use it when:

you have SMILES
you want a standard cheminformatics baseline
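
The Tanimoto step can be sketched without RDKit by treating each fingerprint as the set of its on-bit indices. This is only an illustration of the similarity definition (intersection size over union size), not psma's API: the function name `tanimoto_matrix` and the toy bit sets are hypothetical, and in practice the bits would come from Morgan fingerprints generated by RDKit. Scoring two empty fingerprints as 0.0 is an assumption made here.

```python
def tanimoto_matrix(fingerprints):
    """Pairwise Tanimoto similarity for fingerprints given as sets of on-bit indices."""
    n = len(fingerprints)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0  # a molecule is identical to itself
        for j in range(i + 1, n):
            a, b = fingerprints[i], fingerprints[j]
            union = len(a | b)
            # Assumption: two empty fingerprints score 0.0 rather than raising.
            score = len(a & b) / union if union else 0.0
            sim[i][j] = sim[j][i] = score  # the matrix is symmetric
    return sim


# Hypothetical toy fingerprints (on-bit indices), not real Morgan bits.
fps = [{1, 5, 9}, {1, 5, 12}, {3, 7}]
sim = tanimoto_matrix(fps)
```

Here the first two fingerprints share two of four distinct bits, so their similarity is 0.5, while the third shares no bits with either and scores 0.0.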

Embedding cosine

This backend uses dense vector embeddings and computes cosine similarity.

Use it when:

you already have learned molecular embeddings
you want to compare representation-learning workflows
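
As a rough sketch of what computing cosine similarity over dense embeddings involves, the following pure-Python function divides each pairwise dot product by the product of the vector norms. The name `cosine_matrix` and the example vectors are hypothetical and do not reflect psma's interface; assigning 0.0 to any pair involving a zero vector is an assumption made here. A real workflow would more likely use a vectorized array library.

```python
import math


def cosine_matrix(embeddings):
    """Pairwise cosine similarity for a list of equal-length dense vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                # Assumption: similarity with a zero vector is defined as 0.0.
                score = 0.0
            else:
                dot = sum(x * y for x, y in zip(embeddings[i], embeddings[j]))
                score = dot / (norms[i] * norms[j])
            sim[i][j] = sim[j][i] = score  # the matrix is symmetric
    return sim


# Hypothetical 2-dimensional embeddings for three molecules.
emb = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
cos = cosine_matrix(emb)
```

Because cosine similarity ignores vector length, the second vector's magnitude does not matter: it is orthogonal to the first, so that pair scores 0.0.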

Imported triples

This backend reconstructs the full similarity matrix from a sparse triples table in which each row contains:

first molecule id
second molecule id
similarity score

Use it when:

the similarities were computed elsewhere
you want the package to consume a precomputed similarity definition
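
Reconstructing a dense matrix from triples might look like the sketch below. It rests on several assumptions that this page does not confirm: the function name `matrix_from_triples` is hypothetical, similarities are treated as symmetric, the diagonal is set to 1.0, and pairs missing from the table default to 0.0. psma's actual handling of missing pairs may differ.

```python
def matrix_from_triples(triples, ids, default=0.0):
    """Build a dense similarity matrix from (first_id, second_id, score) rows."""
    index = {mol_id: k for k, mol_id in enumerate(ids)}
    n = len(ids)
    # Assumption: pairs absent from the table get `default`.
    sim = [[default] * n for _ in range(n)]
    for k in range(n):
        sim[k][k] = 1.0  # Assumption: self-similarity is 1.0.
    for first_id, second_id, score in triples:
        i, j = index[first_id], index[second_id]
        sim[i][j] = sim[j][i] = score  # Assumption: similarity is symmetric.
    return sim


# Hypothetical triples table with one missing pair (mol_a, mol_c).
triples = [("mol_a", "mol_b", 0.8), ("mol_b", "mol_c", 0.25)]
full = matrix_from_triples(triples, ids=["mol_a", "mol_b", "mol_c"])
```

Passing the id list explicitly fixes the row and column order, so the resulting matrix lines up with whatever molecule ordering the rest of the workflow uses.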

Choosing between them

Start with the backend that matches the representation you already have. The downstream PSMA surface workflow is the same once the similarity matrix exists.