Orbion Team

Astra on Transporters: A 6,203-Protein Benchmark

Membrane transporters move ions, nutrients, neurotransmitters, and drugs across the lipid bilayer — and they are among the most important and most under-exploited drug-target classes in biology. The solute-carrier (SLC) superfamily alone spans roughly 450 human genes and is the second-largest family of membrane proteins in the genome; the ABC transporters govern multidrug resistance in oncology; and individual carriers like the serotonin transporter (SERT) are the direct target of entire drug classes — the SSRIs.


They are also exceptionally hard to characterize computationally: transporters span a wide range of folds and transmembrane-helix counts, they cycle through multiple conformational states by alternating access, and a substantial subset are themselves enzymes. So the practical question for a transporter team is the same one every program faces: can AI say something about my target I can trust enough to spend reagents on?


To answer it, we ran the entire Astra AI Suite across 6,203 reviewed transport-associated proteins from UniProt Swiss-Prot and scored every output against the strongest publicly available experimental reference. This post is the short version; the full per-model breakdown is the Transporters volume of the Orbion Model Performance Series, linked at the end.

Key Takeaways

  • 6,203 transport-associated proteins, every model output scored against the strongest public experimental reference for its task.

  • Family recognition is near-perfect: SLC carriers 97%, ABC transporters 100%, aquaporins 100% — and the model flags the ~15% of the cohort that are enzymatic (P-type ATPases, ABC ATPase domains) rather than mislabeling them.

  • Topology holds across the diversity: transporters span 1–14 transmembrane helices, and per-residue topology agrees with UniProt annotation at AUROC 0.95 (F1 0.86, n=3,524), uniformly across folds.

  • Binding-pocket prediction is strongest where the ligand is a nucleotide: 56% recall on PDB-observed ligand identities across the 833 transporters with co-crystal data, and residue F1 up to 0.81 on trafficking GTPases.

  • Thermostability on SERT (197 mutations, independent experimental data): MAE 2.13 °C, Spearman ρ 0.59.

The Benchmark

  • Cohort. 6,203 reviewed (Swiss-Prot) transport-associated proteins.

  • References. Swiss-Prot annotation for sequence-level features; PDB co-crystal contacts at 4 Å for ligand binding; and 197 mutations on the serotonin transporter with measured thermal-shift data for stability.

  • Reporting. Each capability is shown with its headline metric and its failure modes.
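The 4 Å co-crystal contact reference can be illustrated with a minimal sketch. The post does not describe the actual pipeline, so everything here is an assumption made for illustration: the function name `contact_residues`, the input layout, the use of every supplied atom, and the inclusive `<=` comparison. Only the 4 Å cutoff comes from the text.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

CONTACT_CUTOFF_A = 4.0  # cutoff stated in the post; everything else is assumed


def contact_residues(protein_atoms, ligand_atoms, cutoff=CONTACT_CUTOFF_A):
    """Return the residue ids with at least one atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples, in angstroms
    ligand_atoms:  iterable of (x, y, z) tuples, in angstroms
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue_id, xyz in protein_atoms:
        if residue_id in contacts:
            continue  # residue already counted via another of its atoms
        if any(dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue_id)
    return contacts


# Toy coordinates (hypothetical, not taken from any PDB entry).
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (8.0, 0.0, 0.0)),
    (12, (0.0, 2.0, 0.0)),
]
ligand = [(0.0, 0.0, 3.0)]
print(sorted(contact_residues(protein, ligand)))  # [10, 12]
```

The resulting residue set is the kind of per-residue reference label that a predicted pocket can then be scored against.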


The classification, topology, and PTM metrics describe how the deployed models behave on real transporter targets in production, including proteins drawn from their training corpora, so they should not be read as held-out generalization estimates. Thermostability is the exception: it is scored against independent experimental data.

Figure: Every output is checked against the strongest public experimental reference for that task.

Family and Topology: 97–100% and AUROC 0.95

The strongest, most reliable calls are recognition and topology.

  • Family recognition. SLC carriers are called at 97%, ABC transporters at 100%, aquaporins at 100%. Because the broad Swiss-Prot transport keyword also sweeps in ATP-driven primary active transporters, the model flags the ~15% enzymatic members (P-type ATPases, ABC ATPase domains) via the EC head rather than forcing them into a carrier label.

  • Topology. Transporters range from 1 to 14 transmembrane helices. The model resolves that diversity rather than assuming a fixed architecture, with per-residue agreement at AUROC 0.95, AUPRC 0.89, F1 0.86 (n=3,524) — uniform across folds.
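For readers who want to see what per-residue agreement metrics of this kind measure, here is a minimal sketch in plain Python. It assumes each residue has a binary reference label (1 = transmembrane per the annotation) and a model score between 0 and 1. The post does not say whether residues are pooled across proteins or averaged per protein, and the 0.5 threshold for F1 is an assumption. The function names are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive residue outscores a random negative one.

    Ties count as half a win. This pairwise form is O(n^2): fine for a sketch,
    too slow for a full benchmark.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(labels, scores, threshold=0.5):
    """F1 of the binary call `score >= threshold` against the reference labels."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy per-residue data (hypothetical): 1 = annotated transmembrane residue.
labels = [0, 0, 1, 1, 1, 0, 0, 1]
scores = [0.1, 0.4, 0.8, 0.9, 0.35, 0.2, 0.6, 0.7]
print(auroc(labels, scores))            # 0.875
print(f1_at_threshold(labels, scores))  # 0.75
```

AUROC is threshold-free and summarizes ranking quality, while F1 depends on the chosen cutoff, which is why the two numbers are reported side by side.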

Figure: Topology agrees with UniProt annotation at AUROC 0.95, uniformly across a class that ranges from 1 to 14 transmembrane helices.

PTM Sites and Pockets

  • PTM sites: the model covers 39 modification classes, each at two operating points (high-precision and high-recall). On the strongest classes, F1 reaches 0.88 for disulfide bonds, 0.83 for N-linked glycosylation, and 0.79 for myristoylation.

  • Binding pockets: 56% recall on PDB-observed ligand identities across the 833 transporters with co-crystal data. The model is strongest where the ligand is a nucleotide: on the trafficking GTPases (RAN, Rab, SAR1B), residue F1 reaches 0.81. That figure reflects a specific strength on nucleotide pockets and should not be read as uniform performance across ligand types.
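The idea of two operating points per modification class can be sketched as a threshold sweep over per-site scores. This is an illustration only: the post does not say how the thresholds are chosen, and the 0.9 precision and recall targets, the function names, and the toy data below are all assumptions.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall of the call `score >= threshold`."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def operating_points(labels, scores, min_precision=0.9, min_recall=0.9):
    """Pick one high-precision and one high-recall threshold from the observed scores.

    Returns (high_precision_threshold, high_recall_threshold); an entry is None
    if no observed score meets its target.
    """
    thresholds = sorted(set(scores))

    # Ascending sweep: the lowest threshold meeting the precision target
    # keeps the most recall, because recall never rises as the threshold rises.
    high_precision = None
    for t in thresholds:
        precision, _ = precision_recall(labels, scores, t)
        if precision >= min_precision:
            high_precision = t
            break

    # Descending sweep: the strictest threshold that still meets the recall target.
    high_recall = None
    for t in reversed(thresholds):
        _, recall = precision_recall(labels, scores, t)
        if recall >= min_recall:
            high_recall = t
            break

    return high_precision, high_recall


# Toy per-site data (hypothetical): 1 = annotated modification site.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.8, 0.7, 0.6, 0.3, 0.1]
print(operating_points(labels, scores))  # (0.8, 0.6)
```

In this toy case the high-precision point accepts only the two top-scoring sites, while the high-recall point accepts one false positive in exchange for recovering every annotated site.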

Figure: Pocket prediction is carried by chemically well-defined nucleotide sites.

Thermostability: The Serotonin Transporter

Stability prediction was validated against independent experimental data: 197 mutations of the serotonin transporter, scored against measured thermal-shift values. The result (MAE 2.13 °C, Spearman ρ 0.59) is usable as a pre-screen for ranking stabilizing mutations on a notoriously difficult membrane target, but it should be read against the noise floor of the underlying assay.
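The two stability metrics answer different questions: MAE measures how far predicted shifts are from measured ones in °C, and Spearman ρ measures whether mutations are ranked in the right order. A minimal sketch in plain Python follows; the function names and the toy values are hypothetical and are not the SERT data.

```python
def mean_absolute_error(measured, predicted):
    """Average absolute gap between measured and predicted values (same units)."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy thermal shifts in °C for five mutations (hypothetical values).
measured = [1.0, -2.0, 0.5, 3.0, -1.0]
predicted = [0.5, -1.0, 1.5, 2.0, -2.0]
print(mean_absolute_error(measured, predicted))  # 0.9
print(round(spearman(measured, predicted), 3))   # 0.8
```

For a pre-screen, rank correlation is usually the more relevant number, since the goal is to choose which mutations to test first rather than to predict each shift exactly.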

Figure: Thermostability on 197 SERT mutations: MAE 2.13 °C, Spearman ρ 0.59.

Where It's Uneven

The transporter class is defined by its heterogeneity, and the results reflect that:

  • Recognition and topology are excellent where a family is structurally well-defined — SLC carriers, ABC transporters, aquaporins.

  • The broad transport keyword also captures enzymatic ATPases and trafficking GTPases whose dominant function the model assigns elsewhere; we report that directly rather than averaging it in.

  • Binding-pocket prediction is strongest for nucleotide ligands and weaker for the diverse small-molecule substrates of the SLC carriers.

One Transporter, End-to-End: The Serotonin Transporter

The whitepaper walks the whole suite through a single target — the serotonin transporter (SLC6A4, UniProt P31645), the target of the SSRIs — to show what a program team does with the integrated output:

  1. Resolve the transmembrane topology and conformational-state context.

  2. Flag the modification sites that shape trafficking and regulation.

  3. Triage the central substrate and allosteric pockets.

  4. Pre-screen stabilizing mutations before committing to thermal-shift assays.


The value isn't replacing the experiment — it's compressing the read on a hard membrane target to a prioritized starting point, so reagents and assay time go where they're most likely to pay off.

Figure: One transporter, end to end, and what a program team does with the integrated output.

Why This Matters for Transporter Programs

The SLC superfamily — ~450 genes, the second-largest membrane-protein family in the genome — remains one of the most under-exploited target spaces in drug discovery, precisely because these proteins are hard to express, stabilize, and characterize. A correct read of a transporter's fold, modification sites, pockets, and stabilizing mutations is often the difference between an intractable target and a tractable construct.

Read the Full Benchmark

The Transporters volume reports every model's headline performance, its failure modes, and a per-family breakdown across the SLC, ABC, and aquaporin classes.


→ Read the full Transporters benchmark: https://www.orbion.life/research/transporter-performance


Part of the Orbion Model Performance Series — alongside GPCRs, Enzymes, and Ion Channels.

References & Sources

  • César-Razquin, A. et al. A Call for Systematic Research on Solute Carriers. Cell 162(3):478–487 (2015). doi:10.1016/j.cell.2015.07.022 — SLC superfamily scope.

  • Lin, L., Yee, S. W., Kim, R. B. & Giacomini, K. M. SLC transporters as therapeutic targets: emerging opportunities. Nature Reviews Drug Discovery 14(8):543–560 (2015). doi:10.1038/nrd4626 — transporters as drug targets.

  • Kulandaisamy, A., Sakthivel, R. & Gromiha, M. M. MPTherm: database for membrane protein thermodynamics. Briefings in Bioinformatics 22(2):2119–2125 (2021). doi:10.1093/bib/bbaa064 — membrane-protein thermal-stability reference.

  • Alexandrov, A. I. et al. Microscale fluorescent thermal stability assay for membrane proteins. Structure 16(3):351–359 (2008). doi:10.1016/j.str.2008.02.004 — the CPM thermal-shift assay.

  • The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Research 51(D1):D523–D531 (2023). doi:10.1093/nar/gkac1052 — Swiss-Prot reference annotations.

  • Burley, S. K. et al. RCSB Protein Data Bank. Nucleic Acids Research 47(D1):D464–D474 (2019). doi:10.1093/nar/gky1004 — co-crystal ligand-contact references.

  • Full per-model methodology and metrics: Orbion, Astra AI on Transporters — Model Performance Series (2026) — https://www.orbion.life/research/transporter-performance

