Distributed Transaction

A distributed transaction is a single logical unit of work that touches more than one database, service, or resource manager, and must finish with all participants in agreement: either every participant commits its part, or every participant aborts. The goal is to preserve atomicity, the all-or-nothing property of a transaction, even though the work is spread across machines that can fail or become unreachable independently.

The classical way to coordinate this is the two-phase commit protocol, in which a coordinator first asks every participant whether it is prepared to commit, and only if all of them vote yes does it tell them to commit in a second phase. This guarantees agreement, but it is expensive: participants must hold locks and stay prepared while waiting for the coordinator’s decision, and a coordinator failure at the wrong moment can leave them blocked indefinitely.

Those blocking and coordination costs are why distributed transactions are often avoided in modern architectures. When a process must span many independently owned services, each with its own database, holding a global lock across all of them harms availability and scalability. The saga pattern was developed precisely to handle such long-lived, cross-service work without distributed locking, by replacing global atomicity with a chain of local commits plus compensating actions.

In service-oriented designs, the common alternatives to a distributed transaction are sagas and eventual consistency. Rather than insisting that every service commit at the same instant, a saga lets each service commit locally and triggers the next step through events, accepting that the overall system is only consistent once the whole sequence completes, and relying on compensation if a step fails partway through.