Web3 is the next iteration of the internet, built upon decentralized technology and spanning three fundamental pillars: consensus, storage, and computation. Blockchain technology has sparked a decentralized revolution and introduced the concept of Web3, which represents not only decentralized consensus but also the use of this technology to decentralize the rest of the internet.
Just as Web2 is a complex combination of various technologies, Web3 is an ecosystem made up of many parts. We can break this ecosystem down into three key infrastructure pillars that need to be developed to fully decentralize the internet: consensus, storage, and computation. Since the launch of Bitcoin in 2009, consensus has matured rapidly, and dozens of other decentralized consensus models have been implemented successfully. Over time, attempts at decentralized storage and computation have emerged to complement consensus and build the remaining pillars of a truly decentralized internet.
In this article, we will explore decentralized storage, which describes peer-to-peer networks where members combine disk space to create what is essentially a global hard drive. It is trustless, immutable, and in some cases, permanent and censorship-resistant.
From a blockchain perspective, the need for decentralized storage can be examined from two main angles: economic and technical. Economically, storing data on-chain is very expensive, so data that does not need to live on the blockchain should not be stored there. Technically, storing data on-chain is highly inefficient, and block sizes are limited. To prevent blocks from being filled with arbitrary data, we need to offload this data elsewhere.
If we were to store the image file of the Bored Ape Yacht Club #3368 NFT on the Bitcoin network, we would need at least 1700 OP_RETURN transactions (a conservative estimate) to save the entire file, assuming standard relay policy and default node settings (80 bytes of arbitrary data per OP_RETURN, with a maximum of one OP_RETURN per transaction). At a fee rate of 12 sats/vB, the total transaction fees would come to roughly 0.028 BTC for a single image from a 10,000-piece collection.
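The arithmetic behind this estimate can be sketched as follows. This is a rough illustration, not a fee calculator: the 80-byte payload and the 12 sats/vB fee rate come from the text, while the file size (136,000 bytes) and the per-transaction virtual size (137 vB) are assumptions chosen so that the result lines up with the figures quoted above.

```python
import math

OP_RETURN_PAYLOAD_BYTES = 80   # from the text: arbitrary data per OP_RETURN
FEE_RATE_SAT_PER_VB = 12       # from the text: fee rate in sats per virtual byte
ASSUMED_TX_VSIZE_VB = 137      # assumption: virtual size of one such transaction
SATS_PER_BTC = 100_000_000


def op_return_tx_count(file_size_bytes: int) -> int:
    """Number of transactions needed if each one carries a single OP_RETURN."""
    return math.ceil(file_size_bytes / OP_RETURN_PAYLOAD_BYTES)


def total_fee_btc(tx_count: int,
                  vsize_vb: int = ASSUMED_TX_VSIZE_VB,
                  fee_rate: int = FEE_RATE_SAT_PER_VB) -> float:
    """Total fee in BTC: transactions x virtual size x fee rate."""
    return tx_count * vsize_vb * fee_rate / SATS_PER_BTC


# Hypothetical file size: 1700 transactions x 80 bytes each.
tx_count = op_return_tx_count(136_000)
print(tx_count, round(total_fee_btc(tx_count), 3))
```

The key point is that the cost grows linearly with file size, because every additional 80 bytes requires another full transaction with its own overhead.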
Storing the same image data permanently on the Ethereum network would cost about 7.9 ETH, at a gas price of approximately 95 gwei, requiring nearly 23 million gas units in a single smart contract deployment. Such storage costs are infeasible for most applications.
When we further compare these costs with the cost of storing the same data on a decentralized storage network, it quickly becomes apparent that dedicated storage networks are more cost-effective in storing files while also ensuring permanence, immutability, and censorship resistance, which will be discussed in more detail later.
Technical Perspective: Why Should We Avoid Storing Data Directly on Public Blockchains?
As the name implies, a blockchain is made up of interconnected blocks that form a chronological sequence. Each block points to the previous one, ensuring that the data in past blocks cannot be altered. The data contained in the blocks is either transactions or state descriptors. Thousands of nodes worldwide maintain consensus and ensure that no one can deceive the system.
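To make the "each block points to the previous one" idea concrete, here is a minimal sketch of a hash-linked chain. It is a toy model under simplifying assumptions (no consensus, no proof of work, no signatures), and the field names `index`, `prev_hash`, and `transactions` are illustrative rather than taken from any real protocol.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block: dict) -> str:
    """SHA-256 over a deterministic JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches: list) -> list:
    """Link each batch of transactions to the hash of the previous block."""
    chain, prev = [], GENESIS_PREV_HASH
    for index, txs in enumerate(batches):
        block = {"index": index, "prev_hash": prev, "transactions": txs}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    prev = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True


chain = build_chain([["alice->bob:1"], ["bob->carol:2"], ["carol->dave:3"]])
print(is_valid(chain))                        # chain as built
chain[0]["transactions"] = ["alice->eve:1"]   # rewrite history in the first block
print(is_valid(chain))                        # the next block's pointer no longer matches
```

Changing any past block changes its hash, which breaks the pointer stored in the block after it. In a real network, nodes reject such a chain, which is what makes past data effectively immutable.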
Each block adds a set of transactions that change the state of the network. Since the size of a block is capped, each block can only process a certain number of transactions. This gives block space an implicit time value, which is reflected in the fees that network participants are willing to pay to have their transactions included in the next block.
When a block is full, the remaining transactions wait in the nodes' memory pools until they can be included in a later block. If a transaction is not confirmed for a long time, it may be affected by slippage or front-running. Storing arbitrary data on the blockchain takes up block space and pushes transactions into subsequent blocks, amplifying this issue.
The limited supply of block space, coupled with the demand for inclusion in a block, therefore drives up transaction fees across the entire network, which can price users out of interacting with it.
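The crowding-out effect described above can be sketched with a simplified block builder that greedily picks transactions by fee rate until the block is full. This is an illustration only: real block construction is more involved, and the transaction names, sizes, fees, and the 1000-byte block limit below are made-up values.

```python
from typing import List, NamedTuple, Tuple


class Tx(NamedTuple):
    txid: str
    size: int  # bytes of block space consumed
    fee: int   # fee offered, in the chain's smallest unit


def fill_block(mempool: List[Tx], max_block_size: int) -> Tuple[List[Tx], List[Tx]]:
    """Greedily include the highest fee-rate transactions that still fit."""
    included, waiting, used = [], [], 0
    for tx in sorted(mempool, key=lambda t: t.fee / t.size, reverse=True):
        if used + tx.size <= max_block_size:
            included.append(tx)
            used += tx.size
        else:
            waiting.append(tx)  # stays in the mempool for a later block
    return included, waiting


payments = [Tx("a", 250, 500), Tx("b", 250, 375), Tx("c", 250, 250), Tx("d", 250, 125)]
data_tx = Tx("image-data", 500, 900)  # hypothetical arbitrary-data transaction

# Without the data transaction, all four payments fit in one block.
included, waiting = fill_block(payments, max_block_size=1000)
print([t.txid for t in included], [t.txid for t in waiting])

# With it, the two lowest fee-rate payments are pushed to a later block.
included, waiting = fill_block(payments + [data_tx], max_block_size=1000)
print([t.txid for t in included], [t.txid for t in waiting])
```

To get back into the next block, the displaced payments would have to raise their fee rate above that of the data transaction, which is how arbitrary data pushes fees up for everyone.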
The amount of arbitrary data on the blockchain can be reduced by using decentralized storage networks, which provide characteristics similar to those of public blockchains while offloading this data from the chain.
But Why Not Store Files on a Centralized Network?
The previous points explained why we should not store data on the blockchain, but the next question becomes: why store data on a decentralized network at all? After all, data can easily be stored on centralized Web2 servers. The answer is simple: to ensure immutability and trustlessness, and to achieve data permanence and censorship resistance.
Let’s look at NFTs (Non-Fungible Tokens): NFTs represent unique (i.e., non-fungible) ownership tokens stored on the blockchain and controlled by smart contracts. The blockchain records who owns the unique token and points to something called metadata, which describes what the token represents. Metadata includes details about the NFT and links to other data such as media files – this is what gives the NFT context and meaning.
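The chain of pointers described here (token, then metadata, then media file) can be sketched as follows. The `name`, `description`, and `image` fields follow the ERC-721 metadata JSON schema; everything else, including the record layout, the owner value, the in-memory `metadata_store`, and the example.com URLs, is a hypothetical stand-in for illustration.

```python
import json

# Hypothetical view of what the chain records: who owns the token
# and where its metadata lives.
token_record = {
    "token_id": 3368,
    "owner": "0xOwnerAddressPlaceholder",
    "token_uri": "https://example.com/metadata/3368.json",
}

# Hypothetical off-chain storage, keyed by URI. In practice this could be
# a Web2 server or a decentralized storage network.
metadata_store = {
    "https://example.com/metadata/3368.json": json.dumps({
        "name": "Example NFT #3368",
        "description": "Placeholder description of what the token represents.",
        "image": "https://example.com/images/3368.png",
    })
}


def resolve_media_url(record: dict, store: dict) -> str:
    """Follow the on-chain pointer to the metadata, then to the media file."""
    metadata = json.loads(store[record["token_uri"]])
    return metadata["image"]


print(resolve_media_url(token_record, metadata_store))
```

Note that the blockchain itself guarantees only the first hop. Whether the metadata and the media file stay available and unchanged depends entirely on where they are stored.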
The metadata can be stored anywhere. As long as the data can be accessed through a pointer embedded in the NFT smart contract, the content will