Vector embeddings are the foundation of modern semantic search and retrieval-augmented generation (RAG) systems. They convert unstructured text into high-dimensional vectors in which similar concepts end up close together.
From Text to Numbers
An embedding is a numerical representation of meaning. When you embed the sentence "The cat sat on the mat," you get a vector of numbers (often a few hundred to a couple of thousand of them, depending on the model) that captures its semantic essence.
The key insight: similar sentences have similar embeddings. "The dog lay on the rug" will be close to "The cat sat on the mat" in vector space.
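To make this concrete, here is a minimal sketch using the open-source sentence-transformers library (one of the free options discussed later). The model name all-MiniLM-L6-v2 is just a common small choice, not something the text above prescribes:

```python
# Minimal sketch: embed a few sentences and compare them.
# Assumes the sentence-transformers package is installed
# (pip install sentence-transformers); all-MiniLM-L6-v2 is one
# common small model, not the only option.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

emb = model.encode([
    "The cat sat on the mat",
    "The dog lay on the rug",
    "Quarterly revenue grew by 4 percent",
])

# Similar meaning -> higher cosine similarity
print(float(util.cos_sim(emb[0], emb[1])))  # higher: both describe a pet resting
print(float(util.cos_sim(emb[0], emb[2])))  # lower: unrelated topic
```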
How Embeddings Work
Transformer Models: Modern embeddings come from transformer models (like BERT or OpenAI's text-embedding-3) that are context-aware, so the same word can get a different vector depending on the sentence around it.
Dimensionality: Typical embeddings have 384 to 1536 dimensions. More dimensions can capture more nuance, but they cost more to store and make similarity search slower.
Cosine Similarity: The most common metric for measuring how similar two embeddings are. It ranges from -1 to 1: values near 1 mean the vectors point in almost the same direction (very similar meaning), values near 0 mean the texts are unrelated, and negative values are rare with typical text embeddings.
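Under the hood, cosine similarity is just the dot product of the two vectors divided by the product of their lengths. A plain-NumPy sketch with toy vectors (real embeddings have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """cos(u, v) = (u . v) / (|u| * |v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-dimensional vectors, just to show the behaviour of the metric.
print(cosine_similarity(np.array([1.0, 0.0, 1.0]), np.array([1.0, 0.1, 0.9])))  # ~0.99, nearly parallel
print(cosine_similarity(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))  # 0.0, orthogonal
```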
Practical Applications
- Semantic Search: Find documents by meaning, not keywords (see the sketch after this list)
- Recommendation Systems: Find similar products/content
- Duplicate Detection: Find similar documents automatically
- Classification: Classify text without labeled training data
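To make the first item concrete, here is a minimal semantic-search sketch, again assuming sentence-transformers and the same hypothetical model as above: embed each document once, embed the query at search time, and rank documents by cosine similarity.

```python
# Minimal semantic-search sketch: rank documents by similarity to a query.
# Assumes sentence-transformers is installed; the documents are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "How to reset your account password",
    "Our refund policy for annual subscriptions",
    "Troubleshooting slow Wi-Fi connections",
]
doc_embeddings = model.encode(documents)  # computed once, then stored

query = "I forgot my login credentials"
query_embedding = model.encode(query)

# Score every document against the query and print them best-first.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
for score, doc in sorted(zip(scores.tolist(), documents), reverse=True):
    print(f"{score:.2f}  {doc}")
```

Note that the query shares no keywords with the best-matching document; the ranking comes entirely from meaning.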
Choosing an Embedding Model
OpenAI text-embedding-3: Strong quality, easy integration, usage-based pricing
Sentence-Transformers: Free, open-source, good quality, can run locally
Cohere Embeddings: Strong multilingual options, competitive pricing
Pro Tip: Start with a free embedding model. Only upgrade if quality is insufficient. The difference is often smaller than you'd expect.
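One way to keep that upgrade cheap is to hide the embedding call behind a single function, so swapping the free local model for a hosted one touches only one place. A sketch under those assumptions (the hosted branch uses the openai Python SDK; the model name and flag are illustrative, not prescribed by this article):

```python
# Sketch: wrap embedding behind one function so the model can be swapped later.
# Local path: sentence-transformers (free). Hosted path: the openai package,
# which needs an OPENAI_API_KEY in the environment.
from typing import List

USE_LOCAL = True  # flip to False only if the free model's quality proves insufficient

def embed(texts: List[str]) -> List[List[float]]:
    if USE_LOCAL:
        from sentence_transformers import SentenceTransformer
        # In real code, cache this model instead of reloading it per call.
        model = SentenceTransformer("all-MiniLM-L6-v2")
        return model.encode(texts).tolist()
    else:
        from openai import OpenAI
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return [item.embedding for item in resp.data]

vectors = embed(["The cat sat on the mat", "The dog lay on the rug"])
print(len(vectors), len(vectors[0]))  # 2 vectors; the dimension depends on the model
```

Because the rest of the system only ever calls embed(), switching providers later means re-embedding your corpus but rewriting almost no code.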