There are many blockchains out there, but only a handful of independent implementations exist. Tezos is one of them, and as an early architect of the chain, I had the chance to be involved in its creation and development from the beginning. An early and fortuitous decision was to build the chain in the functional programming paradigm, using the OCaml programming language. Throughout this experience, I found that functional programming and blockchains were a great fit for each other. Let’s try and see why!
From the start, it was clear that security should be at the center of technical design choices. Blockchains and cryptocurrencies present an almost worst case environment for bugs:
- Critical bugs can’t be discussed openly because they affect live systems, yet their fixes need to be deployed simultaneously across many participants without the use of a trusted third party. This leaves very few options for addressing them besides covert bug fixes.
- There are large and direct financial incentives for criminal hackers to discover bugs in those systems since they secure real financial value.
While security is critical, there is unfortunately no surefire way to ensure it. Even the most rigorous approaches, like formal verification, remain expensive and are subject to bugs in the specifications themselves. Some technical choices, however, can help.
A major reason for selecting OCaml as a programming language was that it could help eliminate large classes of bugs. Because OCaml is a memory-managed language, there is no need to worry about buffer overflows, for instance, but this only scratches the surface. Tezos leverages OCaml’s very strong static type system to enforce isolation and permissions. The code managing a transaction cannot access the underlying storage of the ledger; it cannot even construct the types it would need to write to the storage. Instead, the type system constrains it to write to a higher-level abstraction that can check and sanitize every action. Encapsulation is not unique to functional programming, of course, but OCaml’s module signature mechanism makes it very straightforward to review and refine permissions.
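To make the idea concrete, here is a minimal sketch of that kind of encapsulation. It is not the Tezos code: the `Ledger` module, its functions, and the string account names are all hypothetical, chosen only to show how a signature can hide the storage type so that callers can reach it only through checked operations.

```ocaml
(* Hypothetical sketch: names are illustrative, not the Tezos API. *)
module Ledger : sig
  type t (* abstract: callers cannot build or inspect the storage *)

  val init : (string * int) list -> t
  val balance : t -> string -> int

  val transfer :
    t -> src:string -> dst:string -> amount:int -> (t, string) result
end = struct
  module M = Map.Make (String)

  (* The concrete representation is visible only inside this module. *)
  type t = int M.t

  let init balances =
    List.fold_left (fun m (account, n) -> M.add account n m) M.empty balances

  let balance t account = Option.value (M.find_opt account t) ~default:0

  (* Every write goes through these checks; there is no other way in. *)
  let transfer t ~src ~dst ~amount =
    if amount <= 0 then Error "amount must be positive"
    else if balance t src < amount then Error "insufficient funds"
    else
      let t = M.add src (balance t src - amount) t in
      Ok (M.add dst (balance t dst + amount) t)
end
```

Code outside `Ledger` sees only the signature, so reviewing who can do what mostly comes down to reading a few lines of `sig ... end`.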
The Tezos protocol embeds an interpreter for Michelson, the virtual machine behind Tezos smart contracts, which is itself statically typed and functional. That interpreter leverages OCaml’s GADTs (generalized algebraic data types) to ensure that mistyped Michelson contracts cannot even be constructed. This is another nice security property that we inherit from the language itself.
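The technique is easier to see on a toy language than on Michelson. The sketch below is a hypothetical miniature expression language, not the real interpreter: each constructor carries the type of value it produces, so an ill-typed program is rejected by the OCaml compiler before any evaluator runs.

```ocaml
(* Hypothetical miniature expression language; not Michelson itself. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | Add : int expr * int expr -> int expr
  | Less : int expr * int expr -> bool expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* The evaluator needs no runtime type checks or error cases:
   the index of [expr] already tells it what each branch returns. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | Less (x, y) -> eval x < eval y
  | If (c, t, e) -> if eval c then eval t else eval e

let example = If (Less (Int 1, Int 2), Add (Int 20, Int 22), Int 0)

(* Something like [Add (Int 1, Bool true)] does not type-check,
   so it can never reach [eval]. *)
```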
An old adage claims that if a program is written in a functional programming language, then it works. The statement is of course flippant, though I do remember that the very first version of Tezos that compiled, after months of development, did run on the first try and was able to process transactions.
None of these properties can guarantee security, but they take care of more obvious flaws, freeing up programmers and security researchers to focus on higher-level matters.
If the gold standard is formal verification, then OCaml is extremely well positioned. Coq, a leading interactive theorem prover and proof checker, is written in OCaml and can extract OCaml code from its definitions. In addition, Coq-of-OCaml can do the reverse and generate Coq code from existing OCaml code.
Blockchains look like a functional programming problem
As Tezos began taking shape, I realized that many of the problems that need to be solved when implementing a blockchain are similar to the type of problems functional programmers are very familiar with. At its heart, a blockchain is a way to represent a mutable state using an append-only data structure. The state is what you get when you fold over the blocks with an accumulator. This is typical of how we handle data and its immutability in the functional world.
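As a rough illustration of "state as a fold", consider the sketch below. The block and transfer types are hypothetical and far simpler than anything in a real chain, and invalid transfers are silently skipped here only to keep the example short (a real protocol would reject the block).

```ocaml
module Balances = Map.Make (String)

(* Hypothetical, heavily simplified block format. *)
type transfer = { src : string; dst : string; amount : int }
type block = transfer list

let balance state account =
  Option.value (Balances.find_opt account state) ~default:0

(* Apply one transfer to the accumulator, returning a new state. *)
let apply_transfer state { src; dst; amount } =
  if amount <= 0 || balance state src < amount then state (* simplification *)
  else
    let state = Balances.add src (balance state src - amount) state in
    Balances.add dst (balance state dst + amount) state

let apply_block state (block : block) =
  List.fold_left apply_transfer state block

(* The current state is the fold of every block over the genesis state. *)
let state_of_chain genesis (chain : block list) =
  List.fold_left apply_block genesis chain
```

Nothing is mutated along the way: each step returns a new map, and the chain itself stays append-only.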
One problem functional programming is well suited to is handling chain reorganizations, when blocks that have been applied to the state need to be rolled back because a different branch ends up being chosen by the consensus. When the data is stored as a functional tree, where each update produces a new version that shares most of its structure with the previous one, network participants can undo the effect of these blocks on the state efficiently. Then, as the chain progresses, you need to clean it up and free the memory with a garbage collector, which is again something that is very familiar in the world of functional programming.
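Here is a small sketch of why rollbacks are cheap with persistent data. It uses OCaml's standard `Map` as a stand-in for the functional tree; the block format and the `extend` and `rollback` helpers are hypothetical and only illustrate the principle, not how a Tezos node stores its context.

```ocaml
module State = Map.Make (String)

(* A block here is just a list of key/value writes (hypothetical format). *)
type block = (string * int) list

let apply state (block : block) =
  List.fold_left (fun s (k, v) -> State.add k v s) state block

(* History: most recent state first, genesis state last.
   Old versions share structure with new ones, so keeping them is cheap. *)
let extend block history =
  match history with
  | [] -> invalid_arg "extend: empty history"
  | tip :: _ -> apply tip block :: history

(* Undo the last [n] blocks by dropping their states; never drops genesis.
   Dropped versions become garbage once nothing references them. *)
let rec rollback n history =
  match history with
  | _ :: (_ :: _ as older) when n > 0 -> rollback (n - 1) older
  | _ -> history
```

Switching branches is then a matter of rolling back to the common ancestor and extending the history with the blocks of the winning branch.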
Additionally, if you’re building smart contracts, then you need a smart contract language, which means you’ll need a compiler. Compilers tend to be handled very well by functional programming in general and OCaml in particular. There are a lot of steps when compiling from a source language to a target language: lexing the text into individual tokens, parsing those into an abstract syntax tree, and transforming various parts of that tree until we get to the target language, sometimes going through a number of intermediate representations, where the type system constrains the transformations. The code for all these steps can be very elegant and efficient when written in OCaml.
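To give a feel for those steps, here is a deliberately tiny pipeline for arithmetic expressions: a lexer, a recursive-descent parser, and a code generator targeting a small stack machine. The source language, the instruction set, and every name below are made up for illustration; this is not how any Tezos smart contract language is compiled.

```ocaml
type token = NUM of int | PLUS | TIMES | LPAREN | RPAREN
type ast = Num of int | Add of ast * ast | Mul of ast * ast
type instr = Push of int | IAdd | IMul

(* Lexing: characters to tokens. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | ' ' -> go (i + 1) acc
      | '+' -> go (i + 1) (PLUS :: acc)
      | '*' -> go (i + 1) (TIMES :: acc)
      | '(' -> go (i + 1) (LPAREN :: acc)
      | ')' -> go (i + 1) (RPAREN :: acc)
      | '0' .. '9' ->
          let j = ref i in
          while !j < n && s.[!j] >= '0' && s.[!j] <= '9' do incr j done;
          go !j (NUM (int_of_string (String.sub s i (!j - i))) :: acc)
      | c -> failwith (Printf.sprintf "unexpected character %c" c)
  in
  go 0 []

(* Parsing: tokens to AST, with * binding tighter than +. *)
let rec parse_sum toks =
  let lhs, rest = parse_product toks in
  match rest with
  | PLUS :: rest ->
      let rhs, rest = parse_sum rest in
      (Add (lhs, rhs), rest)
  | _ -> (lhs, rest)

and parse_product toks =
  let lhs, rest = parse_atom toks in
  match rest with
  | TIMES :: rest ->
      let rhs, rest = parse_product rest in
      (Mul (lhs, rhs), rest)
  | _ -> (lhs, rest)

and parse_atom = function
  | NUM n :: rest -> (Num n, rest)
  | LPAREN :: rest -> (
      match parse_sum rest with
      | e, RPAREN :: rest -> (e, rest)
      | _ -> failwith "expected )")
  | _ -> failwith "expected a number or ("

let parse toks =
  match parse_sum toks with
  | e, [] -> e
  | _ -> failwith "trailing tokens"

(* Code generation: AST to postfix stack instructions. *)
let rec compile = function
  | Num n -> [ Push n ]
  | Add (a, b) -> compile a @ compile b @ [ IAdd ]
  | Mul (a, b) -> compile a @ compile b @ [ IMul ]

(* A tiny stack machine that runs the target code. *)
let run program =
  let step stack instr =
    match (instr, stack) with
    | Push n, _ -> n :: stack
    | IAdd, b :: a :: rest -> (a + b) :: rest
    | IMul, b :: a :: rest -> (a * b) :: rest
    | _ -> failwith "stack underflow"
  in
  match List.fold_left step [] program with
  | [ result ] -> result
  | _ -> failwith "malformed program"
```

Each stage is a plain function from one data type to the next, so the whole compiler is just their composition: `run (compile (parse (tokenize source)))`.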
The compiled smart contract, too, benefits from being written in a functional style. Each contract has its own mutable data associated with it, so you cannot have it be a pure function. What we can do, however, is load that storage and the contract into an isolated virtual machine to execute. It’s the next best thing to a pure function: deterministic and unaffected by external values.
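One way to picture this is to treat the contract as a function from a parameter and the current storage to a list of operations and the new storage, with the host responsible for loading and saving that storage. The sketch below is a hypothetical model of that shape in OCaml; the `operation` type, the `counter` contract, and the `call` host function (which uses a `ref` as a stand-in for on-chain storage) are all illustrative assumptions.

```ocaml
(* Hypothetical operation type: effects the contract asks the host to perform. *)
type operation = Transfer of { dst : string; amount : int }

(* A contract body is pure: parameter and current storage in,
   requested operations and new storage out. *)
type ('param, 'storage) contract =
  'param -> 'storage -> operation list * 'storage

(* Example contract: the storage is a counter, the parameter an increment. *)
let counter : (int, int) contract =
 fun delta storage -> ([], storage + delta)

(* Hypothetical host: loads the storage, runs the contract in isolation,
   and persists the result. The only mutation lives here. *)
let call contract param stored =
  let ops, storage' = contract param !stored in
  stored := storage';
  ops
```

Given the same parameter and the same storage, the contract body always returns the same result, which is what makes execution deterministic across all nodes.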
OCaml is not an obvious choice. As a programming language, it remains somewhat niche. Nonetheless, it is a mature language that offers the security of a strongly typed functional programming language without compromising on performance. Its roots are in French academia, and it is used by companies like Facebook, Jane Street Capital, and Docker in security-sensitive projects. It is also a popular language for writing compilers. You can write very readable, reliable, and efficient code in OCaml, and while it doesn’t prevent programming errors outright, the strong type system and the restrained use of side effects encouraged by functional programming help give you high confidence in the correctness of your code.
Haskell, a more popular functional programming language, offers a very pure paradigm based on lazy evaluation, but it is harder to write Haskell code that is both performant and idiomatic.
A common objection is that using an uncommon programming language like OCaml makes it harder to hire programmers. That argument could hold some weight for companies trying to recruit many thousands of developers, but it was apparent early on that the most effective size for a core protocol development team is much smaller than that. In addition, I found that developers with a knack for building these types of systems had no difficulty picking up the language in a matter of months.
I was inspired early on by WhatsApp’s ability to scale to hundreds of millions of users with a small, focused team of Erlang developers, and I would say that this inspiration stood the test of time.
In conclusion, there is a very natural fit between blockchains and functional programming, and it would be a shame not to use the right tool for the job! There are numerous problems yet to be solved and opportunities for developers everywhere to apply their skills to build better tools, applications, and infrastructure for this nascent (but booming) category. If you’re interested in learning more about blockchain and Tezos, head over to tezos.com and join the growing community of builders in the ecosystem.
The Stack Overflow blog is committed to publishing interesting articles by developers, for developers. From time to time that means working with companies that are also clients of Stack Overflow’s through our advertising, talent, or teams business. When we publish work from clients, we’ll identify it as Partner Content with tags and by including this disclaimer at the bottom.