Sharding (Partitioning)

Sharding, also called partitioning, splits a dataset across multiple machines so that each “shard” holds only a subset of the data. Where replication keeps full copies of the data for availability, sharding divides the data for capacity: it lets a system store and serve more than a single machine could hold or handle alone. MongoDB describes this as horizontal scaling, “dividing the system dataset and load over multiple servers, adding more servers to increase capacity as required,” in contrast to vertical scaling, which buys a bigger single machine and eventually hits a hard limit.

The hard part is choosing how to assign each record to a shard. MongoDB does this with a shard key: “MongoDB uses the shard key to distribute the collection’s documents across shards.” The choice of key determines whether load spreads evenly or piles onto one machine.
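To make the effect of key choice concrete, here is a minimal sketch. It is not MongoDB's actual chunk allocation: the record layout, the `shard_for` helper, and the four-shard cluster are all assumptions made for illustration. It counts how many records each shard receives when the same data is keyed by a skewed, low-cardinality field versus a unique one.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed cluster size, for illustration only


def shard_for(value) -> int:
    """Hypothetical placement rule: hash the key value, take it modulo the shard count."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Made-up records: 900 of the 1000 users share a single country value.
records = [{"user_id": i, "country": "US" if i % 10 else "NZ"} for i in range(1000)]


def load_by_key(field: str) -> Counter:
    """Count records per shard when `field` is used as the shard key."""
    return Counter(shard_for(r[field]) for r in records)


by_country = load_by_key("country")  # two distinct key values, so at most two shards are used
by_user_id = load_by_key("user_id")  # 1000 distinct key values, spread across all shards

print("keyed by country:", dict(by_country))
print("keyed by user_id:", dict(by_user_id))
```

With `country` as the key, at least 900 records land on one shard no matter how good the hash is, because a key with few distinct values cannot be split any further. With `user_id`, the same records spread across all four shards.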

There are two main partitioning strategies, with opposite trade-offs. Hashed sharding assigns documents by a hash of the shard key, so that “hashed values are unlikely to share the same chunk, providing more even data distribution.” The cost is that “range-based queries on the shard key are less likely to target a single shard,” so such queries must be broadcast across the cluster. Ranged sharding keeps nearby keys together so that “a mongos can route the operations to only the shards that contain the required data,” at the cost of possible hot spots when keys grow monotonically, because every new write then lands in the same range.
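The trade-off can be sketched in a few lines. This is a simplified model, not MongoDB's implementation: the split points, the modulo-based hash placement, and all function names are assumptions chosen for illustration. It shows which shards a range query must touch under each strategy, and where a run of monotonically increasing keys ends up.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Assumed range boundaries: shard 0 holds keys < 1000, shard 1 holds 1000..1999,
# shard 2 holds 2000..2999, and shard 3 holds everything from 3000 upward.
SPLIT_POINTS = [1000, 2000, 3000]


def hashed_shard(key: int) -> int:
    """Hashed placement: neighbouring keys scatter across shards."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def ranged_shard(key: int) -> int:
    """Ranged placement: the shard is the range the key falls into."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def ranged_shards_for_query(lo: int, hi: int) -> list[int]:
    """A range query only needs the shards whose ranges overlap [lo, hi]."""
    return list(range(ranged_shard(lo), ranged_shard(hi) + 1))


def hashed_shards_for_query(lo: int, hi: int) -> list[int]:
    """Hashing destroys key order, so any shard may hold a match: ask all of them."""
    return list(range(NUM_SHARDS))


# Monotonically growing keys, such as auto-increment IDs or timestamps.
new_keys = range(5000, 5200)
ranged_load = Counter(ranged_shard(k) for k in new_keys)
hashed_load = Counter(hashed_shard(k) for k in new_keys)

print("range query 1200..1800, ranged:", ranged_shards_for_query(1200, 1800))
print("range query 1200..1800, hashed:", hashed_shards_for_query(1200, 1800))
print("new writes, ranged:", dict(ranged_load))
print("new writes, hashed:", dict(hashed_load))
```

Under the ranged scheme the query touches a single shard, but all 200 new writes land on the last shard. Under the hashed scheme the writes spread out, but the same query has to go to every shard.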

The other hard part is rebalancing: when a machine is added or removed, data must move between shards. Naive hashing, such as taking the key's hash modulo the number of nodes, reassigns most keys whenever the node count changes, so many distributed databases use consistent hashing instead. Cassandra, for example, maps both nodes and keys onto a hash ring so that “when the number of nodes to hash into changes, consistent hashing only has to move a small fraction of the keys.”
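The difference is easy to measure in a small simulation. The sketch below is illustrative only and is not Cassandra's implementation: the `HashRing` class, the node names, and the choice of 100 ring positions per node are assumptions. It compares how many keys change owner when a fifth node joins a four-node cluster.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def mod_owner(key: str, nodes: list[str]) -> str:
    """Naive placement: hash modulo the current node count."""
    return nodes[h(key) % len(nodes)]


class HashRing:
    """Minimal consistent-hash ring; each node is placed at `vnodes` positions."""

    def __init__(self, nodes: list[str], vnodes: int = 100):
        self._ring = sorted(
            (h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, key: str) -> str:
        # The owner is the first node position clockwise from the key,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, h(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(before, after, keys) -> float:
    """Fraction of keys whose owner differs between two placement functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
old_nodes = ["node-a", "node-b", "node-c", "node-d"]
new_nodes = old_nodes + ["node-e"]

naive_moved = moved_fraction(
    lambda k: mod_owner(k, old_nodes), lambda k: mod_owner(k, new_nodes), keys
)
old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
ring_moved = moved_fraction(old_ring.owner, new_ring.owner, keys)

print(f"modulo hashing moved {naive_moved:.0%} of keys")
print(f"consistent hashing moved {ring_moved:.0%} of keys")
```

With modulo placement a key keeps its owner only when its hash gives the same remainder modulo 4 and modulo 5, which is true for about one key in five, so roughly 80% of keys are expected to move. On the ring, only keys that now fall just before one of the new node's positions change owner, which is about one fifth of them, and every moved key goes to the new node rather than being shuffled among the old ones.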