Healthcare-Optimized Embeddings
Semantic embeddings that understand medical terminology. In internal testing on healthcare queries, average cosine similarity is about 15% higher than OpenAI text-embedding-3-large (0.85 vs. 0.74).
| Specification | Value |
|---|---|
| Dimensions | 1536 (default), adjustable 256-2048 |
| Context Length | 32,000 tokens |
| Languages Supported | 100+ languages |
| Training Data | 102 healthcare domains + 10M PubMed papers |
| Latency (p95) | 120ms |
| Batch Size | Up to 1,000 texts per request |
curl https://api.persly.ai/v1/embed \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "What are the side effects of metformin?",
      "Metformin common adverse reactions include nausea..."
    ],
    "input_type": "document",
    "dimensions": 1536
  }'

1,000 medical questions with 5,000 candidate answers. Measure: cosine similarity to ground-truth answers.
| Model | Avg. Cosine Similarity | Recall@10 | Cost (per 1M tokens) |
|---|---|---|---|
| Persly Embed | 0.85 | 92% | $0.50 |
| Voyage Large 2 | 0.78 | 85% | $0.80 |
| OpenAI text-embedding-3-large | 0.74 | 81% | $1.30 |
| Jina Embeddings v3 | 0.71 | 78% | $0.40 |
* Internal testing on healthcare QA datasets. Pricing estimates as of 2026.
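For readers who want to run a comparable evaluation on their own data, the two metrics in the table can be sketched as follows. This is a minimal illustration, not the harness behind the numbers above: the function names are hypothetical, it assumes exactly one ground-truth answer per question, and the vectors are whatever embeddings you have already retrieved.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def mean_truth_similarity(query_vecs, candidate_vecs, truth_idx):
    """Average cosine similarity between each query and its ground-truth answer."""
    sims = [cosine_similarity(q, candidate_vecs[t])
            for q, t in zip(query_vecs, truth_idx)]
    return sum(sims) / len(sims)


def recall_at_k(query_vecs, candidate_vecs, truth_idx, k=10):
    """Fraction of queries whose ground-truth answer ranks in the top k candidates."""
    hits = 0
    for q, truth in zip(query_vecs, truth_idx):
        # Rank every candidate index by similarity to the query, best first.
        ranked = sorted(range(len(candidate_vecs)),
                        key=lambda i: cosine_similarity(q, candidate_vecs[i]),
                        reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)
```

The brute-force ranking is quadratic in the number of vectors, which is acceptable for a test set of this size but not for production search.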
Build semantic search over medical documents, FAQs, and knowledge bases
Find relevant context for LLM prompts in RAG pipelines
Find duplicate medical records with different phrasing
Recommend similar health articles, treatments, or resources
Yes, the output size is adjustable. Set the "dimensions" parameter to a value from 256 to 2048. Lower dimensions give faster search; higher dimensions give more accuracy.
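To make the storage side of that trade-off concrete, here is a rough back-of-the-envelope calculation. It assumes vectors are stored as uncompressed 32-bit floats, which is an assumption about your vector store rather than anything the API specifies; the function name is hypothetical.

```python
def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for a set of embeddings, ignoring index overhead.

    bytes_per_value=4 assumes float32 storage (an assumption of this sketch).
    """
    if not 256 <= dimensions <= 2048:
        raise ValueError("dimensions must be between 256 and 2048")
    return num_vectors * dimensions * bytes_per_value


# Compare raw storage for one million vectors at three dimension settings.
for dims in (256, 1536, 2048):
    mib = index_size_bytes(1_000_000, dims) / 2**20
    print(f"{dims:>4} dims: {mib:,.0f} MiB per million vectors")
```

Brute-force similarity search also scales linearly with the dimension count, so storage and per-query compute shrink together when you lower it.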
Works with any vector DB: Pinecone, LambdaDB, Weaviate, Qdrant, etc. Just store the embeddings and query as usual.
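The store-then-query pattern looks roughly the same in every vector database. The sketch below uses a tiny in-memory index as a stand-in so that it runs without any external service; the class and its methods are hypothetical and do not correspond to the client API of any of the products named above. The 2-D vectors are placeholders, not real embeddings.

```python
import heapq
import math


class InMemoryVectorIndex:
    """Toy stand-in for a vector DB: stores unit vectors, ranks by cosine similarity."""

    def __init__(self):
        self._items = []  # list of (item_id, unit_vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            raise ValueError("cannot index or query a zero vector")
        return [x / norm for x in vector]

    def add(self, item_id, vector):
        """Store one embedding under an identifier."""
        self._items.append((item_id, self._normalize(vector)))

    def query(self, vector, top_k=3):
        """Return the top_k (item_id, score) pairs, highest cosine similarity first."""
        q = self._normalize(vector)
        # On unit vectors the dot product equals the cosine similarity.
        scored = [(item_id, sum(a * b for a, b in zip(q, unit)))
                  for item_id, unit in self._items]
        return heapq.nlargest(top_k, scored, key=lambda pair: pair[1])


index = InMemoryVectorIndex()
index.add("doc-a", [1.0, 0.0])
index.add("doc-b", [0.0, 1.0])
index.add("doc-c", [1.0, 1.0])
results = index.query([1.0, 0.1], top_k=2)
```

In a real deployment the `add` and `query` calls would be replaced by the upsert and search operations of whichever database you use.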
Set "input_type" to "query" for search queries and to "document" for content to be searched. Each type is optimized differently.
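As an illustration of the two input types, the sketch below builds one request for documents and one for a search query, using only the endpoint and JSON fields shown in the curl example. The helper name is hypothetical, the requests are constructed but not sent, and the response schema is not covered here, so parsing the reply is left out.

```python
import json
import urllib.request

API_URL = "https://api.persly.ai/v1/embed"  # endpoint from the curl example
VALID_INPUT_TYPES = {"query", "document"}


def build_embed_request(texts, input_type, api_key, dimensions=1536):
    """Build (but do not send) a POST request for the embed endpoint."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError('input_type must be "query" or "document"')
    payload = {
        "texts": list(texts),
        "input_type": input_type,
        "dimensions": dimensions,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Index time: embed the content that will be searched.
doc_req = build_embed_request(
    ["Metformin common adverse reactions include nausea..."],
    "document",
    "YOUR_API_KEY",
)

# Search time: embed the user's question with the query type.
query_req = build_embed_request(
    ["What are the side effects of metformin?"],
    "query",
    "YOUR_API_KEY",
)
```

Passing either object to `urllib.request.urlopen` would send it; keep the same "dimensions" value for both so query and document vectors are comparable.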
Up to 1,000 texts per request. For larger batches, split into chunks.
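Splitting a large corpus into request-sized chunks can be done with a few lines of slicing. This is a minimal sketch with a hypothetical helper name; the only figure taken from this page is the 1,000-text limit.

```python
def split_into_batches(texts, max_batch_size=1000):
    """Split a list of texts into consecutive chunks of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [texts[i:i + max_batch_size]
            for i in range(0, len(texts), max_batch_size)]


# 2,500 placeholder texts become three requests' worth of input.
corpus = [f"note {n}" for n in range(2500)]
batches = split_into_batches(corpus)
```

Order is preserved, so concatenating the embeddings returned for each batch lines them up with the original list.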