NAIAD

Machine Learning for Proteomics

Discovering the Hidden Proteome of Cilia

NAIAD identifies novel ciliary proteins in Chlamydomonas reinhardtii using protein language models, achieving a ROC-AUC of 0.95 in predicting ciliary localization.

Exhibit A

Chlamydomonas reinhardtii

A single-celled green alga with two flagella that serves as a model organism for studying ciliary biology. Its genome encodes ~17,000 proteins, but only 187 are confirmed ciliary proteins.

NAIAD combines ESM-2 protein embeddings (650M parameters) with Gene Ontology annotations to predict which of the remaining proteins localize to cilia.
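One way to picture this combination is as a single feature vector per protein: a fixed-length summary of the ESM-2 embedding followed by a multi-hot encoding of the protein's GO terms. The sketch below is illustrative only. Mean pooling over residues, the helper names (`mean_pool`, `go_multi_hot`, `build_features`), and the toy GO vocabulary are assumptions, not NAIAD's documented pipeline.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue vectors into one fixed-length protein vector.

    Assumption: mean pooling; the source does not say how embeddings are pooled.
    """
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n for i in range(dim)]


def go_multi_hot(terms: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Encode GO terms as 1.0/0.0 flags in the order given by `vocabulary`."""
    present = set(terms)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def build_features(
    token_embeddings: Sequence[Sequence[float]],
    go_terms: Sequence[str],
    vocabulary: Sequence[str],
) -> List[float]:
    """Concatenate the pooled embedding with the GO multi-hot vector."""
    return mean_pool(token_embeddings) + go_multi_hot(go_terms, vocabulary)


# Toy input: 3 residues with 4-dimensional embeddings.
# (Real ESM-2 650M embeddings have 1280 dimensions per residue.)
toy_embedding = [
    [0.0, 1.0, 2.0, 3.0],
    [2.0, 1.0, 0.0, 3.0],
    [4.0, 1.0, 1.0, 3.0],
]
# Hypothetical vocabulary: ATP binding, nucleus, mitochondrion.
toy_vocab = ["GO:0005524", "GO:0005634", "GO:0005739"]

features = build_features(toy_embedding, ["GO:0005524"], toy_vocab)
print(features)  # [2.0, 1.0, 1.0, 3.0, 1.0, 0.0, 0.0]
```

A vector like this could then be passed to any standard classifier. GO terms that directly encode ciliary localization would presumably need to be excluded from the vocabulary to avoid leaking the label.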

The Challenge
Total Genes: 17,741
Genes Analyzed: 13,140
Known Ciliary: 187
Novel Candidates Found: 15

Model Performance

ROC-AUC: 0.950
PR-AUC: 0.777
Top-100 Precision: 94%
vs CilioGenics: +14 points (AUC)

Trained on 187 known ciliary proteins and evaluated with 5-fold cross-validation. The model outperforms the CilioGenics database baseline (0.81 AUC) by 14 percentage points (0.95 vs 0.81), suggesting that protein language models capture structural features predictive of subcellular localization.
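For readers unfamiliar with these metrics, the sketch below shows how ROC-AUC and top-k precision can be computed from labels and predicted scores, and how a label-balanced 5-fold split might be built. It is a minimal pure-Python illustration on made-up toy data. The function names are hypothetical, and this is not NAIAD's evaluation code.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic time, fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_precision(labels: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Fraction of true positives among the k highest-scoring items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(y for _, y in top) / len(top)


def stratified_folds(labels: Sequence[int], n_folds: int = 5) -> List[List[int]]:
    """Deal indices of each class round-robin so folds keep a similar class ratio."""
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for cls in (0, 1):
        indices = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(indices):
            folds[j % n_folds].append(i)
    return folds


# Made-up toy data: 3 positives (ciliary) and 5 negatives.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(toy_labels, toy_scores)            # 14 of 15 pairs ranked correctly
p_at_3 = top_k_precision(toy_labels, toy_scores, 3)  # 2 of the top 3 are positives
folds = stratified_folds(toy_labels, n_folds=5)
print(round(auc, 3), round(p_at_3, 3), folds)
```

With only 187 positives among 13,140 analyzed genes, the classes are highly imbalanced, which is why PR-AUC and top-k precision are useful to report alongside ROC-AUC.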

Novel Candidates

Top Discoveries

Query your own →

Gene Query

Look Up Your Genes

Enter gene IDs to retrieve ciliary localization probabilities from our model.

View Full Database →