Performance: index · query plan · seq scan · explain analyze · covering index · composite index
Scale: sharding · replication · partitioning · read replica · connection pool · vertical vs. horizontal scaling
0 / 5 completed
1 / 5
In database terminology, what is an index?
A database index is a separate data structure (typically a B-tree) that maps column values to row locations, allowing the database to find matching rows in O(log n) instead of scanning all rows O(n). Analogy: a book index vs. reading every page. Trade-off: indexes speed up reads but slow down writes (INSERT/UPDATE/DELETE must also update the index). Types: primary index (unique, usually the primary key), composite index (multiple columns), covering index (includes all columns the query needs — avoids fetching the row at all), full-text index (for text search).
2 / 5
What does ACID stand for in the context of database transactions?
ACID stands for: Atomicity (a transaction either fully completes or fully rolls back — no partial commits), Consistency (the database moves from one valid state to another — all constraints remain satisfied), Isolation (concurrent transactions behave as if they ran sequentially — no dirty reads), Durability (once committed, data survives crashes — written to disk). ACID is guaranteed by most relational databases (PostgreSQL, MySQL, SQL Server). NoSQL databases often relax ACID in favour of availability and scalability, following the BASE model (Basically Available, Soft state, Eventually consistent).
3 / 5
What is database normalization?
Normalization organizes a relational database to minimize redundancy and dependency by decomposing tables into smaller, related ones. Normal forms: 1NF — atomic values, no repeating groups; 2NF — no partial dependencies on a composite key; 3NF — no transitive dependencies. Example: storing a customer's city name in every order row is denormalized — normalization moves city to a separate customers table. But normalization isn't always the goal: denormalization (intentional redundancy) is used for read performance in analytics and data warehouses.
4 / 5
Complete with the correct database term: "The query was taking 8 seconds because it was doing a full table _____ on 50 million rows — adding an index on the user_id column dropped it to 12ms."
A full table scan (or sequential scan) means the database reads every row in the table to find matching results — O(n). This is the worst case for query performance on large tables. An index scan uses the index to jump directly to matching rows — O(log n). EXPLAIN ANALYZE (PostgreSQL) or EXPLAIN (MySQL) show the query plan and tell you whether an index is being used. Key vocabulary for query performance discussions: query plan, cost estimate, seq scan (sequential scan), index scan, bitmap index scan, nested loop join.
5 / 5
What is database sharding?
Sharding is horizontal partitioning: splitting data across multiple database servers (shards), each responsible for a range of records. Example: users with IDs 1–1M on shard 1, 1M–2M on shard 2. Each shard handles both reads and writes for its partition. Compare: replication (all shards have a copy of the same data — for read scaling and failover) vs. sharding (each shard has a unique subset — for write scaling). Sharding adds significant complexity: cross-shard queries, re-sharding when load grows, shard key selection. Famous examples: Cassandra, MongoDB, Vitess (MySQL sharding), CockroachDB.