Relational Algebra

Relational algebra is a set of formal operations that take relations as input and produce new relations as output. Because every operation takes one or two relations (tables) and returns a relation, the operations can be composed and chained, so complex queries can be built from a few simple, well-defined building blocks.

The core operations include selection, which keeps only the rows that satisfy a condition; projection, which keeps only certain columns; and the set operations union, intersection, and difference. The join pairs rows from two relations whose values match on the attributes they have in common, which is how separate tables are brought together in a query. Codd introduced these operations as part of the relational model in his 1970 paper, giving database queries a precise mathematical basis.
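
The operations described above can be sketched in a few lines of Python. This is only an illustration, not how a database engine implements them: relations are modeled as lists of dictionaries, and the function names (select, project, natural_join) and the sample employees and departments data are hypothetical.

```python
# A relation is modeled here as a list of rows, each row a dict that maps
# attribute names to values. This representation is an assumption made for
# the example; real systems store relations very differently.

def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def natural_join(left, right):
    """Natural join: pair rows that agree on every attribute they share."""
    result = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                result.append({**l, **r})
    return result

# Hypothetical sample data.
employees = [
    {"emp": "Ada", "dept_id": 1},
    {"emp": "Bob", "dept_id": 2},
    {"emp": "Cy", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept": "Research"},
    {"dept_id": 2, "dept": "Sales"},
]

# Each operation returns a relation, so the calls nest freely:
# join the two tables, keep the Research rows, then keep only the names.
research_staff = project(
    select(natural_join(employees, departments),
           lambda row: row["dept"] == "Research"),
    ["emp"],
)
print(research_staff)  # [{'emp': 'Ada'}, {'emp': 'Cy'}]
```

The nested call at the end shows the chaining property: the output of the join is a valid input to selection, whose output is in turn a valid input to projection.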

Because relational algebra is grounded in mathematics, expressions written in it have well-defined meanings and obey algebraic laws. Two different expressions can be proven to always produce the same result, which means a database system can rewrite a query into an equivalent but cheaper form. This is the foundation of query optimization, the part of a database system that decides how to execute a request efficiently.
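
One well-known law of this kind is that a selection can be moved below a join when its condition refers only to attributes of one input. The sketch below checks that law on a small, made-up example and counts the row pairs a naive nested-loop join compares under each plan. The data, the function names, and the comparison count as a stand-in for cost are all assumptions for illustration; real optimizers use far richer cost models.

```python
def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def natural_join(left, right):
    """Nested-loop natural join; also returns how many row pairs it compared."""
    result, comparisons = [], 0
    for l in left:
        for r in right:
            comparisons += 1
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                result.append({**l, **r})
    return result, comparisons

def as_set(relation):
    """Order-independent view of a relation, used to compare two results."""
    return {tuple(sorted(row.items())) for row in relation}

# Hypothetical sample data: six orders spread across three customers.
orders = [{"order_id": i, "cust_id": i % 3, "total": 10 * i} for i in range(1, 7)]
customers = [{"cust_id": c, "name": n} for c, n in [(0, "Ann"), (1, "Ben"), (2, "Cal")]]

def is_large(row):
    # The condition mentions only an attribute of orders.
    return row["total"] >= 50

# Plan A: join everything first, then filter.
joined, cost_a = natural_join(orders, customers)
plan_a = select(joined, is_large)

# Plan B: filter orders first, then join the smaller input.
plan_b, cost_b = natural_join(select(orders, is_large), customers)

print(as_set(plan_a) == as_set(plan_b))  # True
print(cost_a, cost_b)                    # 18 6
```

Both plans return the same rows, but the second compares a third as many row pairs on this data, which is the kind of saving an optimizer looks for when it rewrites a query.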

In his 1981 Turing Award lecture, Codd emphasized that this firm theoretical footing was not just elegant but practical: it let high-level, non-procedural query languages like SQL be defined, understood, and executed reliably. Relational algebra is what connects the simple table-based view that users see to the efficient machinery that runs underneath.