Byzantine Fault Tolerance Explained


Since the inception of Bitcoin in 2008, as a peer-to-peer electronic cash system, many other cryptocurrencies were created, each one with a particular mechanism. But one thing that nearly all cryptocurrencies have in common is the blockchain, as the core element of their architecture.

With few exceptions, blockchains are intentionally designed to be decentralized, working as a digital ledger that is maintained by a distributed network of computer nodes. For this reason, blockchain technology allowed the creation of trustless economic systems, where transparent and reliable financial transactions could be executed without the need for intermediaries. Cryptocurrencies are being adopted as a viable alternative to traditional banking and payment systems, which are heavily dependent on trust.

Just as most distributed computing systems, the participants of a cryptocurrency network need to regularly agree on the current state of the blockchain, and that is what we call consensus achievement. However, reaching consensus on distributed networks, in a safe and efficient way, is far from being an easy task.

So, how can a distributed network of computer nodes agree on a decision, if some of the nodes are likely to fail or to act dishonestly? This is the fundamental question of the so-called Byzantine Generals’ problem, which gave birth to the concept of Byzantine fault tolerance.

What is the Byzantine Generals’ Problem?

In a few words, the Byzantine Generals’ Problem was conceived in 1982 as a logical dilemma that illustrates how a group of Byzantine generals may have communication problems when trying to agree on their next move.

The dilemma assumes that each general has its own army and that each group is situated in different locations around the city they intend to attack. The generals need to agree on either attacking or retreating. It does not matter whether they attack or retreat, as long as all generals reach consensus, i.e., agree on a common decision in order to execute it in coordination.

Therefore, we may consider the following requirements:

  • Each general has to decide: attack or retreat (yes or no);
  • After the decision is made, it cannot be changed;
  • All generals have to agree on the same decision and execute it in a synchronized manner.

The aforementioned communication problems are related to the fact that one general is only able to communicate with another through messages, which are forwarded by a courier. Consequently, the central challenge of the Byzantine Generals’ Problem is that the messages can get somehow delayed, destroyed or lost.

In addition, even if a message is successfully delivered, one or more generals may choose (for whatever reason) to act maliciously and send a fraudulent message to confuse the other generals, leading to a total failure.

If we apply the dilemma to the context of blockchains, each general represents a network node, and the nodes need to reach consensus on the current state of the system. Putting in another way, the majority of participants within a distributed network have to agree and execute the same action in order to avoid complete failure.

Therefore, the only way to achieve consensus in these types of distributed system is by having at least ⅔ or more reliable and honest network nodes. This means that if the majority of the network decides to act maliciously, the system is susceptible to failures and attacks (such as the 51% attack).

Byzantine Fault Tolerance (BFT)

In a few words, Byzantine fault tolerance (BFT) is the property of a system that is able to resist the class of failures derived from the Byzantine Generals’ Problem. This means that a BFT system is able to continue operating even if some of the nodes fail or act maliciously. 

There is more than one possible solution to the Byzantine Generals’ Problem and, therefore, multiple ways of building a BFT system. Likewise, there are different approaches for a blockchain to achieve Byzantine fault tolerance and this leads us to the so-called consensus algorithms.

Blockchain consensus algorithms

We can define a consensus algorithm as the mechanism through which a blockchain network reach consensus. The most common implementations are Proof of Work (PoW) and Proof of Stake (PoS). But let’s take the Bitcoin case as an example.

While the Bitcoin protocol prescribes the primary rules of the system, the PoW consensus algorithm is what defines how these rules will be followed in order to reach consensus (for instance, during the verification and validation of transactions).

Although the concept of Proof of Work is older than cryptocurrencies, Satoshi Nakamoto developed a modified version of it as an algorithm that enabled the creation of Bitcoin as a BFT system.

Note that the PoW algorithm is not 100% tolerant to the Byzantine faults, but due to the cost-intensive mining process and the underlying cryptographic techniques, PoW has proven to be one of the most secure and reliable implementations for blockchain networks. In that sense, the Proof of Work consensus algorithm, designed by Satoshi Nakamoto, is considered by many as one of the most genius solutions to the Byzantine faults.

Final thoughts

The Byzantine Generals’ Problem is an intriguing dilemma that eventually gave rise to the BFT systems, which are being extensively applied in various scenarios. Beyond the blockchain industry, a few use cases of BFT systems include the aviation, space, and nuclear power industries.

Within the cryptocurrency context, having an efficient network communication along with a good consensus mechanism is vital to any blockchain ecosystem. Securing these systems is an ongoing effort, and the existing consensus algorithms are yet to overcome a few limitations (such as scalability). Nonetheless, PoW and PoS are very interesting approaches as BFT systems, and the potential applications are certainly inspiring widespread innovation.