Does DA not store historical data? Data availability is not equal to permanent availability.
Data availability (DA) does not mean permanent storage, and a blockchain has no obligation to store data permanently. This article clarifies common misunderstandings about the role of data availability: its importance lies in enhancing security, not in storing historical data.
Table of Contents
Saving historical data does not guarantee security?
What is data availability?
What data availability is not
DA: Data availability only ensures complete data publication
DS: Data storage and historical indexing
Why is data availability important?
Fraud-proof mechanism
Validium escape hatch requires the latest state
Data availability is a crucial pillar of network security
The market debates whether data availability should be provided by external projects such as Celestia or kept on Ethereum. That debate is not about where historical data should be stored; it is about network security.
Some readers may question, “Isn’t historical data storage the same as security for Layer2?” In fact, historical data is not the most important consideration for Layer2 security. What Vitalik insists on is not historical data storage.
Questions like this reflect a misunderstanding of the purpose and definition of the data availability layer.
Data availability does not guarantee that all historical data will remain permanently retrievable by users or nodes. Data availability projects such as Celestia or EigenDA provide temporary storage. This is fundamentally different from the permanent, decentralized storage offered by Arweave, even though both ultimately provide disk space.
Because today's mainstream rollups use Ethereum as their DA layer and upload complete, compressed transaction data to it, many people assume that data availability means permanent storage. It does not. With EIP-4844, introduced in the Cancun-Deneb (Dencun) upgrade, rollup data posted as blobs is pruned after a retention window of roughly 18 days, because the purpose of uploading it was never permanent storage.
Data availability only guarantees that data is available until a block is finalized, which gives Ethereum a basis for arbitrating disputed blocks. For this reason, some people argue that the name should be changed to Data Publication (DP).
For example, if a node on Arbitrum discovers an error in a block transmitted by other nodes and submits a fraud proof, Ethereum needs the accurate underlying data to re-compute the result and arbitrate. If DA does not guarantee that this data is available, the fraud-proof mechanism cannot proceed.
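The dependency of fraud proofs on published data can be illustrated with a toy model. This is a minimal sketch, not how Arbitrum actually works: the balance-map state, the SHA-256 "state root", and the names `apply_batch`, `state_root`, and `check_claim` are all hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Optional

Transfer = tuple[str, str, int]  # (sender, receiver, amount), a toy transaction


def apply_batch(balances: dict[str, int], batch: list[Transfer]) -> dict[str, int]:
    """Toy state transition: apply transfers, skipping any that overdraw."""
    new = dict(balances)
    for sender, receiver, amount in batch:
        if new.get(sender, 0) < amount:
            continue  # invalid transfer, ignored in this toy model
        new[sender] -= amount
        new[receiver] = new.get(receiver, 0) + amount
    return new


def state_root(balances: dict[str, int]) -> str:
    """Toy commitment: hash of the sorted balances (real systems use Merkle trees)."""
    encoded = ",".join(f"{k}:{v}" for k, v in sorted(balances.items()))
    return hashlib.sha256(encoded.encode()).hexdigest()


def check_claim(pre_state: dict[str, int],
                published_batch: Optional[list[Transfer]],
                claimed_root: str) -> str:
    """A challenger can only re-execute a batch whose data was published."""
    if published_batch is None:
        return "cannot verify: data unavailable"
    recomputed = state_root(apply_batch(pre_state, published_batch))
    return "ok" if recomputed == claimed_root else "fraud proof"
```

The point of the sketch is the first branch of `check_claim`: when the batch data is withheld, an honest challenger has nothing to re-execute, so it can neither confirm the claimed root nor prove it wrong.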
Once a block is finalized, it cannot be changed. On Ethereum this happens when validators holding more than 2/3 of the staked ETH have attested to it, which under normal conditions takes about two epochs (64 slots, or roughly 13 minutes).
After finalization there are no more disputes, only network consensus, and DA for the relevant transactions is no longer required. This is why EIP-4844 prunes this data periodically: permanent storage is irrelevant to its purpose.
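The time scales involved can be worked out from the consensus-layer constants. The sketch below uses the mainnet values of `SECONDS_PER_SLOT`, `SLOTS_PER_EPOCH`, and `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS` from the Ethereum consensus specs; `EPOCHS_TO_FINALITY` and the helper functions are illustrative names, and the two-epoch finality figure assumes normal network conditions.

```python
# Mainnet constants from the Ethereum consensus specs.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096  # blob retention window (EIP-4844)

# Assumption: finality is reached about two epochs after inclusion.
EPOCHS_TO_FINALITY = 2


def epochs_to_seconds(epochs: int) -> int:
    """Convert a number of epochs to wall-clock seconds."""
    return epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def blob_still_served(blob_epoch: int, current_epoch: int) -> bool:
    """True while nodes are still expected to serve the blob on request."""
    return current_epoch - blob_epoch <= MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS


finality_minutes = epochs_to_seconds(EPOCHS_TO_FINALITY) / 60
retention_days = epochs_to_seconds(MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS) / 86_400

print(f"finality  ~ {finality_minutes:.1f} minutes")  # 12.8
print(f"retention ~ {retention_days:.1f} days")       # 18.2
```

The retention window is far longer than the time to finality, which leaves room for rollup-level dispute periods as well, yet it is still a bounded window rather than permanent storage.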
Some readers may find this strange. If data availability only ensures that complete data is published to the network, and that data may be deleted or become inaccessible after a period of time, what happens when someone needs the complete transaction history of a rollup? This is where data storage comes into play.
However, blockchain historical data storage is not a very important issue.
Data storage and historical indexing rest on what is close to a 1-of-N trust assumption. As long as the network is large enough (N nodes) and at least one node, out of self-interest or for any other reason, preserves and is willing to serve the complete transaction data, anyone can almost certainly obtain the correct historical data from somewhere.
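Why a 1-of-N assumption becomes safer as the network grows can be shown with a short calculation. This is a simplified model, not a measurement: it assumes each node independently keeps the full history with the same probability `p_keep`, and the function name `prob_history_survives` is hypothetical.

```python
def prob_history_survives(n_nodes: int, p_keep: float) -> float:
    """Probability that at least one of n independent nodes keeps full history.

    P(at least one) = 1 - (1 - p_keep) ** n_nodes
    """
    if n_nodes < 0 or not 0.0 <= p_keep <= 1.0:
        raise ValueError("n_nodes must be >= 0 and p_keep must be in [0, 1]")
    return 1.0 - (1.0 - p_keep) ** n_nodes


# Even if each node has only a 1% chance of archiving everything,
# the chance that nobody does shrinks quickly as N grows.
for n in (1, 10, 100, 1000):
    print(n, prob_history_survives(n, 0.01))
```

In practice, nodes are not independent and incentives differ (explorers, indexers, and rollup operators have strong reasons to archive), so the model is only a rough intuition for why the assumption is considered weak.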
Next, let’s look at why DA is so crucial for rollups and Validiums that L2BEAT treats it as one of the five risk categories it assesses.
The fraud-proof mechanism relies on complete transaction information to function.
Consider an extreme case in which all other nodes collude and stop sending data to an honest node. Unless data availability is guaranteed, the honest node cannot tell whether its network connection is unstable or it is under attack, and it cannot fight back. The 1-of-N trust assumption of the fraud-proof mechanism then no longer holds.
Most Layer2 networks have a censorship-resistant withdrawal mechanism such as an escape hatch. It is activated when the sequencer consistently ignores or maliciously refuses a user’s withdrawal request and the nodes also leave the forced-withdrawal request unanswered for several days. Once the escape hatch is activated, the network is frozen and no further transactions can be processed, but users can still withdraw against the state root.
In a Validium, however, the latest state tree is kept off-chain, so at least one node must provide it. A reliable DA layer ensures that users can obtain the state tree and withdraw, which protects their assets.
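The reason a withdrawal needs the state tree, and not just the state root, can be sketched with a toy Merkle tree. This is an illustrative model only: real Validiums use their own hash functions, tree layouts, and leaf encodings, and the names `build_levels`, `make_proof`, and `verify` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaf hashes first, root last.

    Levels with an odd number of nodes are padded by repeating the last node.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def make_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect sibling hashes from leaf to root. Requires the full tree data."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return proof


def verify(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """What an L1 contract could check: it only needs the root and the proof."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Toy off-chain state: one leaf per account balance.
leaves = [b"alice:7", b"bob:3", b"carol:5"]
levels = build_levels(leaves)
root = levels[-1][0]               # the only part stored on L1
proof = make_proof(levels, 1)      # needs the off-chain tree data
print(verify(b"bob:3", 1, proof, root))
```

Verification needs only the root, but `make_proof` needs the sibling hashes from the full tree. If nobody serves that data, a user cannot construct the proof, which is why the escape hatch depends on data availability.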
Therefore, based on the above two scenarios, it can be understood that data availability plays a crucial role in the Layer2 ecosystem. Even though it is not responsible for permanently storing data, it still serves as a vital security component.
The function of the data availability layer is not to provide complete transaction history. Its function is to ensure that the relevant data and state are available until transactions are finalized. This keeps the network running smoothly, protects user assets, and makes DA a critical component of Layer2 security models.
Without DA, in extreme cases, no matter how well-designed the transaction proof mechanisms (fraud proofs, zero-knowledge proofs) are, they will be essentially useless. This is why data availability is so important, and it is not surprising that many Ethereum developers do not agree with external DAs.
While zkEVMs, fraud proofs, and ecosystem development are heavily promoted, it is important to keep asking whether the underlying infrastructure can actually ensure security.