Rerank

World-Class Medical Reranker

Maximize search relevancy and RAG accuracy for healthcare queries. Outperforms Jina, Cohere, and Vertex AI on medical benchmarks.


The Problem with Generic Rerankers

"Metformin" vs "メトホルミン" (Japanese) not matched
"Diabetes mellitus type 2" vs "糖尿病" context lost
Drug interactions missed due to lack of medical knowledge
Medical abbreviations not understood (HTN, DM, COPD)

Persly Rerank Solution

Trained on 102 healthcare sources + 10M PubMed papers:

Understands medical synonyms & multilingual terms
Cross-attention over query-document pairs
+42% NDCG@10 on medical QA datasets
Recognizes drug names, conditions, and procedures

How Rerank Works

Initial Search

Your app retrieves 100 documents using embeddings or BM25

→

Cross-Attention Analysis

Persly Rerank scores each query-document pair jointly, rather than encoding the query and the document independently as embedding models do

→

Precision Ranking

Returns the top_k most relevant documents, each with a relevance score between 0 and 1
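The three steps above can be sketched as a generic retrieve-then-rerank loop. This is a minimal illustration, not Persly's implementation: `term_overlap` is a toy stand-in for BM25 or embedding retrieval, and `second_stage` is a hypothetical placeholder for whatever function calls the reranker.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def term_overlap(query: str, doc: str) -> float:
    """Toy first-stage score: fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def two_stage_search(
    query: str,
    corpus: List[str],
    first_stage: Scorer,
    second_stage: Scorer,
    candidates: int = 100,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    # Stage 1: cheap scoring over the whole corpus, keep a shortlist.
    shortlist = sorted(
        corpus, key=lambda d: first_stage(query, d), reverse=True
    )[:candidates]
    # Stage 2: the expensive scorer (hypothetical reranker call) runs only
    # on the shortlist, one query-document pair at a time.
    scored = [(d, second_stage(query, d)) for d in shortlist]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The design point is that the second stage sees only the shortlist, so its per-pair cost is bounded by `candidates` rather than by corpus size.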

Why Cross-Attention?

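Embedding models encode the query and each document separately and compare the resulting vectors. Cross-attention encodes the query and document together, so every query token can attend to every document token and capture fine-grained semantic relationships. This is crucial for medical queries with complex terminology.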

Code Example

curl https://api.persly.ai/v1/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the side effects of metformin for diabetes?",
    "documents": [
      "Metformin is a first-line medication for type 2 diabetes...",
      "Common side effects include nausea, diarrhea, and stomach upset...",
      "SGLT2 inhibitors are an alternative class of diabetes medications..."
    ],
    "top_k": 3
  }'
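The same request can be assembled in Python using only the standard library. This is a sketch under assumptions: the endpoint, headers, and field names are copied from the curl example above, while the helper name `build_rerank_request` and the clamping of `top_k` to the number of documents are illustrative choices, not documented API behavior.

```python
import json
import urllib.request
from typing import List

RERANK_URL = "https://api.persly.ai/v1/rerank"  # taken from the curl example


def build_rerank_request(
    api_key: str, query: str, documents: List[str], top_k: int
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    if not documents:
        raise ValueError("documents must not be empty")
    payload = {
        "query": query,
        "documents": documents,
        # Assumption: asking for more results than documents is pointless,
        # so clamp top_k on the client side.
        "top_k": min(top_k, len(documents)),
    }
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. The response schema is not shown on this page, so parsing is left out rather than guessed.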

Medical Reranking Benchmarks

BEIR Medical Subset (NDCG@10)

Persly Rerank: 0.68
Cohere Rerank 4: 0.63
Jina Reranker v3: 0.61
Vertex AI Ranking: 0.59

Medical QA Dataset Comparison

Model	NDCG@10	Recall@10	Latency (p95)
Persly Rerank	0.68	91%	185ms
Cohere Rerank 4	0.63	86%	195ms
Jina Reranker v3	0.61	84%	210ms
Vertex AI Ranking	0.59	79%	240ms

* Benchmarks from internal testing on healthcare QA datasets. Contact us for detailed methodology and datasets.
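For readers who want to reproduce metrics like these on their own data, here is a minimal sketch of NDCG@k and Recall@k. It assumes the common linear-gain form of DCG and computes the ideal ordering from the judged list itself. The exact variant behind the numbers above is not stated, so treat this as one standard definition rather than the methodology used here.

```python
import math
from typing import Collection, Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain: rel_i / log2(i + 1) for 1-based rank i."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Collection[str], k: int = 10) -> float:
    """Fraction of all relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant.intersection(ranked_ids[:k])) / len(relevant)
```

`relevances` lists the graded relevance labels in the order the system returned the documents, so a perfect ranking scores 1.0.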

Use Cases

RAG Pipelines

Boost RAG accuracy by 40%+ with precise document selection

Medical Search Engines

Rerank BM25/embedding results for maximum relevancy

Question Answering

Find the exact paragraph that answers medical questions

Document Classification

Classify medical documents by relevance to specific topics

FAQ

When should I use Rerank vs just Embeddings?

Use Rerank after embeddings when precision matters more than speed. It improves NDCG@10 by ~15% but adds ~50ms latency.

What's the max number of documents I can rerank?

Up to 1,000 documents per request. For best performance, rerank only the top 100-200 results from your initial search.

Is Rerank multilingual?

Yes. Rerank supports 100+ languages with cross-lingual matching (e.g., Korean query → English docs).

How is the relevance score calculated?

Scores range from 0 to 1, representing the probability that a document is relevant to the query. Higher is better.
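Because scores are on a fixed 0 to 1 scale, one common pattern is to drop weak matches before passing documents to a downstream step such as a RAG prompt. The sketch below assumes the results have already been parsed into `(document, score)` pairs. That shape, the helper name, and the 0.5 default cutoff are illustrative assumptions and should be tuned on your own data.

```python
from typing import List, Tuple


def filter_by_score(
    scored_docs: List[Tuple[str, float]], min_score: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep documents scoring at least min_score, best first."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0 and 1")
    kept = [(doc, score) for doc, score in scored_docs if score >= min_score]
    # Sort explicitly rather than relying on the input order.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```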

Ready to Build with Persly?

Let's discuss how our APIs can power your healthcare product.