← Back

Glossary

Technical Glossary

Key terms and concepts related to Retrieval-Augmented Generation and vector search.

Try the Interactive Demo

Jump to Letter

C

Chunking

The process of splitting large documents into smaller, semantically meaningful pieces before indexing them in a vector database. Proper chunking preserves context within each chunk while keeping chunks small enough to fit in language model context windows. Chunk size (typically 100-1000 tokens) significantly impacts retrieval quality and must be tuned for your specific domain.

Cosine Similarity

A mathematical measure of similarity between two vectors, calculated as the cosine of the angle between them in multi-dimensional space. Cosine similarity ranges from -1 to 1 (or 0 to 1 for embeddings), with values closer to 1 indicating very similar vectors. It's the standard similarity metric for semantic search because it focuses on direction (meaning) rather than magnitude (length).

E

Embeddings

Numerical vector representations of text that capture semantic meaning in a high-dimensional space (typically 384 to 3072 dimensions). Embeddings enable semantic search by placing conceptually similar text nearby in the vector space. Unlike keyword matching, embeddings recognize that "vehicle" and "car" are semantically related even though they use different words. Embedding quality directly impacts RAG system performance.

H

Hallucination

When a language model generates plausible-sounding but false, unsupported, or outdated information with confidence. Hallucinations can involve fabricating facts, misrepresenting sources, or combining information incorrectly. RAG systems reduce but don't eliminate hallucinations by grounding responses in retrieved documents. Users should always verify critical claims against the cited sources.

P

Prompt Injection

A security attack where malicious users craft prompts designed to manipulate language model behavior, bypass safety guidelines, or extract sensitive information. Examples include embedding hidden instructions in documents that the model retrieves and follows, or asking the model to ignore previous instructions. Defending against prompt injection requires input validation, instruction hierarchies, and careful system prompt design.

R

RAG (Retrieval-Augmented Generation)

An AI architecture that enhances language models by retrieving relevant information from a knowledge base and using it as context when generating responses. Instead of relying solely on training data, RAG systems fetch specific documents matching user queries and ground their answers in those sources. This dramatically improves accuracy, factuality, and currency while enabling the model to answer questions about information it was never trained on.

S

V

Vector Database

A specialized database system optimized for storing and querying high-dimensional vectors (embeddings). Vector databases use algorithms like IVFFLAT, HNSW, or LSH to perform efficient similarity search on millions or billions of vectors. Examples include Pinecone, Weaviate, Chroma, and pgvector (PostgreSQL extension). Vector databases are essential infrastructure for RAG systems, enabling fast retrieval of relevant documents.

Ready to see these concepts in action?