Embed

Healthcare-Optimized Embeddings

Semantic embeddings that understand medical terminology. About 15% higher average cosine similarity than OpenAI text-embedding-3-large on healthcare queries (0.85 vs 0.74 in internal testing).

Generic Embeddings Fail on Medical Queries

Medical Abbreviation: MI = Motivational Interviewing? or Myocardial Infarction?
Query: "MI treatment guidelines"
Generic: "Motivational Interviewing techniques"
Persly: "Myocardial Infarction emergency protocol"

Brand → Generic Name: Brand names require clinical context translation
Query: "Tylenol overdose what to do"
Generic: "Pain relief medication options"
Persly: "Acetaminophen toxicity: N-acetylcysteine protocol"

Emergency Symptom Combination: Symptom pairs can signal life-threatening conditions
Query: "chest pain with difficulty breathing"
Generic: "General chest discomfort remedies"
Persly: "Acute coronary syndrome vs pulmonary embolism differential"

Model Specifications

Dimensions: 1536 (default), adjustable 256-2048
Context Length: 32,000 tokens
Languages Supported: 100+ languages
Training Data: 102 healthcare domains + 10M PubMed papers
Latency (p95): 120ms
Batch Size: Up to 1,000 texts per request

Code Example

curl https://api.persly.ai/v1/embed \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "What are the side effects of metformin?",
      "Metformin common adverse reactions include nausea..."
    ],
    "input_type": "document",
    "dimensions": 1536
  }'
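The same request can be assembled in Python. The sketch below is a minimal, hedged translation of the curl call using only the standard library: the endpoint, headers, and body fields come from the example above, while the helper name `build_embed_request` is hypothetical. It builds the request without sending it, because the response format is not documented on this page.

```python
import json
import urllib.request

EMBED_URL = "https://api.persly.ai/v1/embed"  # endpoint taken from the curl example


def build_embed_request(texts, api_key, input_type="document", dimensions=1536):
    """Build (but do not send) the POST request shown in the curl example.

    build_embed_request is a hypothetical helper name, not part of any SDK.
    """
    payload = {
        "texts": list(texts),
        "input_type": input_type,
        "dimensions": dimensions,
    }
    return urllib.request.Request(
        EMBED_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embed_request(
    [
        "What are the side effects of metformin?",
        "Metformin common adverse reactions include nausea...",
    ],
    api_key="YOUR_API_KEY",
)

# Sending is left commented out: it needs a real key and network access, and
# the response schema is an unknown here, so the body is only printed raw.
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any network call is made.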

Embedding Similarity Benchmarks

Medical QA Similarity Dataset

Persly Embed: 0.85
Voyage Large 2: 0.78
OpenAI text-embedding-3-large: 0.74
Jina Embeddings v3: 0.71

Dataset: 1,000 medical questions with 5,000 candidate answers. Metric: cosine similarity to ground-truth answers.

Model                          Avg. Cosine Similarity   Recall@10   Cost (per 1M tokens)
Persly Embed                   0.85                     92%         $0.50
Voyage Large 2                 0.78                     85%         $0.80
OpenAI text-embedding-3-large  0.74                     81%         $1.30
Jina Embeddings v3             0.71                     78%         $0.40

* Internal testing on healthcare QA datasets. Pricing estimates as of 2026.
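For readers who want to run a comparable measurement on their own data, the two metrics in the table can be sketched as follows. This assumes the standard definitions of cosine similarity and Recall@k; the exact evaluation protocol behind the numbers above is not described on this page, and the function names and toy vectors are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_at_k(query_vec, candidate_vecs, relevant_indices, k=10):
    """Fraction of the relevant candidates that rank in the top k by cosine similarity."""
    relevant = set(relevant_indices)
    if not relevant:
        return 0.0
    ranked = sorted(
        range(len(candidate_vecs)),
        key=lambda i: cosine_similarity(query_vec, candidate_vecs[i]),
        reverse=True,
    )
    return len(set(ranked[:k]) & relevant) / len(relevant)


# Toy 2-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(recall_at_k(query, candidates, relevant_indices=[0, 2], k=2))  # 1.0
print(recall_at_k(query, candidates, relevant_indices=[0, 2], k=1))  # 0.5
```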

Use Cases

Vector Search

Build semantic search over medical documents, FAQs, and knowledge bases

RAG Context Retrieval

Find relevant context for LLM prompts in RAG pipelines

Semantic Deduplication

Find duplicate medical records with different phrasing
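One simple way to approach this is greedy threshold-based deduplication over the embedding vectors. The sketch below is an illustration under stated assumptions, not a documented Persly feature: the 0.95 threshold is a hypothetical starting point that would need tuning on real records, and the toy vectors stand in for actual embeddings.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(embeddings, threshold=0.95):
    """Return indices of records to keep.

    A record is dropped when its similarity to any already-kept record
    reaches the threshold. The 0.95 default is an assumed value.
    """
    kept = []
    for i, vec in enumerate(embeddings):
        if all(_cosine(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy vectors: the second is a near-duplicate of the first.
vectors = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(deduplicate(vectors))  # [0, 2]
```

This pairwise scan is quadratic in the number of records, so for large collections a vector database's nearest-neighbor query would be the more practical way to find candidates.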

Recommendation Systems

Recommend similar health articles, treatments, or resources

FAQ

Can I adjust the embedding dimensions?

Yes. Use the "dimensions" parameter (256-2048). Lower dimensions give faster search and smaller indexes; higher dimensions generally give more accuracy.
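To make the trade-off concrete, the storage side can be estimated with simple arithmetic. This sketch assumes vectors are stored as 32-bit floats (4 bytes per dimension) and counts raw vector data only; index overhead in a real vector database would come on top.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float


def index_size_bytes(num_vectors, dimensions):
    """Raw storage for the vectors alone, excluding any index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


for dims in (256, 1536, 2048):
    size_mb = index_size_bytes(1_000_000, dims) / 1_000_000
    print(f"{dims} dims: {size_mb:,.0f} MB for 1M vectors")
# 256 dims: 1,024 MB for 1M vectors
# 1536 dims: 6,144 MB for 1M vectors
# 2048 dims: 8,192 MB for 1M vectors
```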

What vector databases are supported?

Works with any vector DB: Pinecone, LambdaDB, Weaviate, Qdrant, etc. Just store the embeddings and query as usual.

What's the difference between query and document types?

Use "query" for search queries and "document" for content to be searched. Different optimizations are applied.
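In practice this means an application sends two kinds of request bodies: one when indexing content and one at search time. The sketch below only builds those bodies, using the field names from the code example on this page; the helper name `embed_payload` is hypothetical. Both use the same "dimensions" value so that query and document vectors have the same length and can be compared.

```python
VALID_INPUT_TYPES = {"query", "document"}  # the two types named in this FAQ


def embed_payload(texts, input_type, dimensions=1536):
    """Build a request body for the embed endpoint (hypothetical helper)."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(VALID_INPUT_TYPES)}")
    return {"texts": list(texts), "input_type": input_type, "dimensions": dimensions}


# Indexing time: content that will be searched.
index_payload = embed_payload(
    ["Metformin common adverse reactions include nausea..."], "document"
)

# Search time: the user's question.
search_payload = embed_payload(["What are the side effects of metformin?"], "query")

print(index_payload["input_type"], search_payload["input_type"])  # document query
```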

Is there a batch limit?

Up to 1,000 texts per request. For larger batches, split into chunks.
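Splitting a larger list into chunks can be sketched as below. The 1,000 limit comes from this page; the function name `chunk_texts` is illustrative, and each resulting batch would be sent as its own request.

```python
MAX_BATCH_SIZE = 1000  # per-request limit stated on this page


def chunk_texts(texts, batch_size=MAX_BATCH_SIZE):
    """Yield consecutive slices of at most batch_size texts, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


texts = [f"note {i}" for i in range(2500)]
batches = list(chunk_texts(texts))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Because the chunks preserve input order, the embeddings returned for each batch can be concatenated to line up with the original list.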
