Target discovery

How do you find synthetic lethal targets without screening everything?

Possible gene combinations number in the hundreds of millions for pairs and over a trillion for triplets, and no lab can test them all. Can an AI agent actually prioritise the right ones and then prove them out?

Single-target drugs often fail because tumours develop resistance, so teams turn to combinations. But ~20,000 genes yield ~200M possible pairs and ~1.3 trillion possible triplets. We tested whether an LLM can prioritise synthetic-lethal pairs (73.8% accuracy, beating SLant and MAGICAL), narrowed ~400,000 pairs to a shortlist of 11, then used Hydra to validate those 11 against real omic data.

By PharosBio

Who this is for: Target-discovery and computational-biology teams facing a combinatorial search space too large for wet-lab screening.

  • 1.3T: possible three-gene combinations
  • 73.8%: best LLM accuracy on synthetic lethality (untrained)
  • ~60%: accuracy held on a post-cutoff, unseen screen
  • 400k → 11: pairs shortlisted, then Hydra-validated on omics

The problem with combination target discovery

Most cancers are heterogeneous, so single-target drugs frequently fail: tumours become resistant, or the agent is too toxic. Combinations are the established answer, and synthetic lethality (losing two genes together kills a cell that tolerates losing either alone) is the cleanest version of it. It is how PARP inhibitors work in BRCA-mutant cancers, and newer targets like ATR, WEE1 and WRN follow the same logic.

But the combinatorics explode. With ~20,000 genes there are ~20,000 single-gene candidates, ~200 million pairs, and over 1.3 trillion triplets. CRISPR and RNAi screens are powerful but can't cover that space, so the real question teams search for is how to prioritise: how to spend scarce screening capacity only on the combinations most likely to be real, novel, and feasible.
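The counts above follow directly from the binomial coefficient. A minimal sketch, assuming a round figure of 20,000 genes as in the text (the function name is illustrative):

```python
from math import comb

N_GENES = 20_000  # round figure used in the text, not an exact gene count


def n_combinations(k: int, n_genes: int = N_GENES) -> int:
    """Number of unordered k-gene combinations drawn from n_genes."""
    return comb(n_genes, k)


for k in (1, 2, 3):
    print(f"{k}-gene combinations: {n_combinations(k):,}")
# 1-gene combinations: 20,000
# 2-gene combinations: 199,990,000
# 3-gene combinations: 1,333,133,340,000
```

Each extra gene multiplies the space by several thousand, which is why exhaustive screening stops being an option after pairs.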

What teams in this space search for

  • How do I find synthetic lethal targets without screening everything?
  • Can AI predict which gene combinations are synthetically lethal?
  • How do I prioritise combination targets to beat drug resistance?

The solution

How we solved it with Hydra

The prompt we gave Hydra

Module: Synthetic-Lethality Scoring module

Score these ~10,000 gene pairs for synthetic lethality and tell me how an LLM compares against a known CRISPR screen. Then widen to ~400,000 pairs of clinically relevant genes, add novelty and feasibility scores, and shortlist the strongest candidates. For the survivors, validate them against real omic data: tumour-versus-normal expression, dependencies, and which patient populations would benefit.

What Hydra ran

Initial screening: the Synthetic-Lethality Scoring module scored 10,000 gene pairs with open-weight LLMs on a −1 to +1 scale (lethal to rescue), benchmarked against a published CRISPR ground-truth screen and against purpose-built tools. To rule out memorisation, it re-ran on a screen released after the model's training cutoff.
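As a rough illustration of the benchmarking step, the sketch below turns scores on the −1 (lethal) to +1 (rescue) scale into lethal / not-lethal calls and compares them with screen labels. The cut-off of 0, the function names, and the placeholder gene symbols are all assumptions made for this example; the source does not state how scores were thresholded.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def normalise(pair: Pair) -> Pair:
    """Sort the two gene symbols so (A, B) and (B, A) map to the same key."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def accuracy(scores: Dict[Pair, float], truth: Dict[Pair, bool],
             threshold: float = 0.0) -> float:
    """Fraction of pairs whose thresholded score matches the screen label.

    A score below `threshold` is called synthetic lethal (assumed cut-off).
    """
    truth_by_key = {normalise(p): label for p, label in truth.items()}
    compared = 0
    correct = 0
    for pair, score in scores.items():
        key = normalise(pair)
        if key not in truth_by_key:
            continue  # pair was not measured in the screen
        predicted_lethal = score < threshold
        compared += 1
        correct += predicted_lethal == truth_by_key[key]
    if compared == 0:
        raise ValueError("no scored pair overlaps the ground-truth screen")
    return correct / compared


# Placeholder data: gene names and values are invented for illustration only.
llm_scores = {
    ("GENE_A", "GENE_B"): -0.8,
    ("GENE_C", "GENE_D"): 0.3,
    ("GENE_F", "GENE_E"): -0.2,  # reversed order on purpose
    ("GENE_G", "GENE_H"): 0.6,
}
screen_labels = {
    ("GENE_A", "GENE_B"): True,
    ("GENE_C", "GENE_D"): False,
    ("GENE_E", "GENE_F"): False,
    ("GENE_G", "GENE_H"): False,
}
print(accuracy(llm_scores, screen_labels))  # 0.75
```

Normalising pair order matters in practice, because a model output and a screen table will not always list the two genes in the same order.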

Scaling: it widened to ~400,000 pairs of clinically relevant genes and layered on two more scores: a novelty score (an agent with literature access checks whether a similar pair has been tested before) and a feasibility score (pan-essential genes, for example, are harder to validate). It then filtered to the top candidates on all three axes.
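A three-axis filter of this kind could be sketched as follows. The 0-to-1 ranges for novelty and feasibility, the threshold values, and the `Candidate` type are hypothetical choices for illustration, not the module's actual parameters.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    gene_a: str
    gene_b: str
    lethality: float    # -1 (lethal) to +1 (rescue), as in the text
    novelty: float      # assumed range 0..1, higher = less studied
    feasibility: float  # assumed range 0..1, higher = easier to validate


def shortlist(candidates: Iterable[Candidate],
              max_lethality: float = -0.5,
              min_novelty: float = 0.5,
              min_feasibility: float = 0.5) -> List[Candidate]:
    """Keep pairs that pass all three cut-offs, most lethal first."""
    kept = [
        c for c in candidates
        if c.lethality <= max_lethality
        and c.novelty >= min_novelty
        and c.feasibility >= min_feasibility
    ]
    return sorted(kept, key=lambda c: c.lethality)


# Placeholder candidates: names and scores are invented for illustration only.
pool = [
    Candidate("GENE_A", "GENE_B", -0.90, 0.8, 0.7),  # passes all three
    Candidate("GENE_C", "GENE_D", -0.70, 0.2, 0.9),  # fails novelty
    Candidate("GENE_E", "GENE_F", -0.60, 0.9, 0.3),  # fails feasibility
    Candidate("GENE_G", "GENE_H", 0.40, 0.9, 0.9),   # not predicted lethal
    Candidate("GENE_I", "GENE_J", -0.95, 0.6, 0.6),  # passes all three
]
top = shortlist(pool)
print([(c.gene_a, c.gene_b) for c in top])
# [('GENE_I', 'GENE_J'), ('GENE_A', 'GENE_B')]
```

Requiring every axis to pass, rather than averaging the three scores, keeps a strong lethality score from masking a pair that is already well studied or impractical to test.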

Validation: the 11 surviving pairs were handed to Hydra, which planned and ran the bioinformatics to test each one against real omic data (DepMap dependencies, TCGA tumour-versus-normal expression, and cBioPortal alterations) to see which were worth a wet-lab experiment.
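One common way to check a pair against dependency data is to ask whether cell lines that have lost gene A depend more strongly on gene B than lines where A is intact. The sketch below shows only that idea; it is not a description of the analyses Hydra ran, and the function name, cell-line IDs, and values are hypothetical. It assumes DepMap-style gene-effect scores, where more negative means stronger dependency.

```python
from statistics import mean
from typing import Dict, Set


def dependency_shift(gene_effect_b: Dict[str, float],
                     lines_with_a_loss: Set[str]) -> float:
    """Mean gene-effect of B in A-loss lines minus mean in A-intact lines.

    A clearly negative value is consistent with (not proof of) a
    synthetic-lethal relationship between A and B.
    """
    altered = [v for line, v in gene_effect_b.items() if line in lines_with_a_loss]
    intact = [v for line, v in gene_effect_b.items() if line not in lines_with_a_loss]
    if not altered or not intact:
        raise ValueError("need at least one cell line in each group")
    return mean(altered) - mean(intact)


# Placeholder data: cell-line IDs and scores are invented for illustration only.
effect_of_b = {"LINE_1": -1.0, "LINE_2": -0.8, "LINE_3": -0.1, "LINE_4": 0.1}
a_loss_lines = {"LINE_1", "LINE_2"}
shift = dependency_shift(effect_of_b, a_loss_lines)
print(round(shift, 2))  # -0.9
```

A real analysis would add a significance test and control for lineage, since a difference in means across a handful of lines can easily be driven by tissue type alone.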

What it found

The best untrained model scored synthetic lethality at 73.8% accuracy, outperforming purpose-built tools (SLant, MAGICAL). On the unseen, post-cutoff screen it held ~60%, a modest drop that points to real generalisation, not data leakage.

Scaling and validation collapsed ~400,000 pairs to 11 testable hypotheses. Hydra's omic validation is what makes them actionable: each surviving pair comes with its tumour-versus-normal expression, dependency profile, and the patient populations most likely to benefit, not just a score.

What we learned

An LLM score is a powerful prior, not a verdict: specificity is low, so survivors must be grounded in real data. That is exactly the division of labour: the module ranks the space, Hydra validates the shortlist on omics, and only then does anything reach a wet lab.
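To make the specificity point concrete, the sketch below computes sensitivity and specificity from predicted and true labels. The labels are invented for illustration and do not reproduce the benchmark; the point is only that a model can recover every true pair while still passing many false ones.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary lethal / not-lethal calls."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Placeholder labels: invented for illustration only.
calls = [True, True, True, True, False, False]
labels = [True, True, False, False, False, False]
print(sensitivity_specificity(calls, labels))  # (1.0, 0.5)
```

In this toy case every true pair is found, yet half of the non-lethal pairs are also flagged, which is the situation where downstream validation on real data earns its keep.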

Above-random performance on a screen the model never saw echoes what has been observed in coding, where the ability to spot concurrency bugs emerged without being explicitly taught. The same filter-score-validate loop generalises to any problem where combinatorics outrun lab capacity.

What you get

  • An untrained LLM scored synthetic lethality at 73.8% accuracy, outperforming SLant and MAGICAL
  • ≈60% accuracy held on a post-training-cutoff screen, controlling for data leakage
  • ~400,000 candidate pairs collapsed to 11, each validated by Hydra on real omic data
  • A reusable filter-score-validate loop for any combinatorial discovery problem

Data sources used

  • Published CRISPR synthetic-lethality screens (ground truth)
  • DepMap (gene dependencies & essentiality)
  • TCGA / cBioPortal (tumour expression & alterations)
  • Primary literature (novelty assessment)

Figures reflect analyses PharosBio ran on public datasets and public benchmarks. Named competitors, collaborators, and logos are withheld at this stage; the methods and results shown are real and can be re-applied to your own target.

Sources & methods

  • Ground-truth CRISPR synthetic-lethality screen (Olivieri et al., 2020)
  • Compared tools: SLant; MAGICAL
  • Omic validation: DepMap; TCGA / cBioPortal

Frequently asked questions

What is synthetic lethality, and why use it as the test case?

Synthetic lethality is when two genes can each be lost individually with little effect, but losing both together kills the cell. This is the principle behind PARP inhibitors in BRCA1/2-mutant cancers. It's an ideal benchmark because it's a two-gene problem with a known, experimentally validated answer.

How did Hydra validate the 11 surviving pairs?

Hydra planned and ran the bioinformatics for each pair against real omic data: DepMap dependencies, TCGA tumour-versus-normal expression, and cBioPortal alterations. The aim was to check which pairs were biologically plausible and which patient populations would benefit, turning a ranked list into wet-lab-ready hypotheses.

Isn't 73.8% just data leakage from training?

We re-tested on a CRISPR screen released after the model's training cutoff. Accuracy held at ≈60% on genuinely unseen data, a modest drop that indicates real generalisation rather than memorisation.

Run this analysis on your question

Hydra plans, executes, and validates, so you reach a defensible answer in hours, not weeks.