AI System Design Language
2 exercises — use precise technical vocabulary to describe AI architectures: RAG pipelines, vector databases, and inference chains.
0 / 2 completed
RAG pipeline vocabulary
- Embedding model — converts text to a vector (e.g. text-embedding-ada-002)
- Vector store — database for storing and searching embedding vectors (Pinecone, pgvector)
- Chunk — a piece of a document; chunk size affects retrieval quality
- Top-k retrieval — fetch the k most semantically similar chunks
- Grounding context — retrieved chunks injected into the LLM prompt
- Reranker — second-pass model that re-scores retrieved results by relevance
- Inference pipeline — the full query → response processing chain
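The terms above can be tied together in a minimal query-time sketch. This is an illustration, not a production design: the `embed` function here is a toy term-frequency vectorizer standing in for a real embedding model such as text-embedding-ada-002, and an in-memory list stands in for a vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency vector. A real pipeline would call
    # an embedding model API here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Top-k retrieval: rank stored chunks by similarity to the query vector.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Grounding context: the retrieved chunks are injected into the LLM prompt.
    context = "\n".join(f"- {c}" for c in top_k(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "pgvector adds vector similarity search to Postgres.",
    "BM25 is a keyword ranking function.",
    "Rerankers re-score retrieved chunks by relevance.",
]
print(build_prompt("How does Postgres do vector search?", chunks))
```

The full path from query to prompt is the inference pipeline; a real system would swap in a model-based embedding, a vector store with an ANN index, and optionally a reranker between `top_k` and `build_prompt`.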
Question 1 of 2
You are describing a RAG-based AI system to a software architect who is new to AI. Which description is most precise and complete?
Option C is the professional architecture description. It uses precise technical vocabulary:
• Embedding vector — the numeric representation of the query for similarity search
• Semantic similarity search — finds documents by meaning, not keyword match
• Vector store / pgvector — names the specific technology (important for concrete conversations)
• Top-k — a standard parameter meaning "return the k most similar results"
• Context window injection — explains how retrieval output is fed to the LLM
• Grounding context — the industry term for retrieved documents used to anchor the response
• Pre-training knowledge — contrasts RAG with purely parametric (in-weights) knowledge
Standard RAG vocabulary for system design discussions:
— Chunking strategy (fixed-size, semantic, recursive) — how documents are split
— Embedding model (e.g. text-embedding-ada-002) — converts text to vectors
— Reranker — a second-pass model that re-orders retrieved chunks by relevance
— Hybrid search — combining vector similarity with keyword search (BM25)
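Of these, chunking strategy is the easiest to make concrete in code. The sketch below shows fixed-size chunking with overlap; `chunk_size` and `overlap` are illustrative parameters, not values prescribed anywhere above, and real systems often chunk by tokens or semantic boundaries rather than characters.

```python
def chunk_fixed(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    # Fixed-size chunking: slide a window of chunk_size characters, stepping
    # by (chunk_size - overlap) so adjacent chunks share some context and a
    # sentence split across a boundary still appears whole in one chunk.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "RAG systems split documents into chunks before embedding them."
for c in chunk_fixed(doc):
    print(repr(c))
```

Chunk size is a real quality lever: too small and retrieved chunks lack context, too large and the embedding averages over unrelated content, which is why semantic and recursive strategies exist as alternatives.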