Original Author: Siddarth, IOSG Ventures
What is FVM and why is it needed?
Protocol Labs has laid out a three-stage plan for a decentralized internet:
Build the world's largest decentralized storage network.
Ingest and protect human data.
Bring retrieval and computation to data to build scalable applications.
Filecoin has achieved the first step, becoming the largest decentralized data custodian with more than 661.54 PiB of data stored. Its storage market continued to grow in Q4 2022, with active storage deals up 117% and a year-over-year increase of 1,798%.
The next step for Filecoin is to help ingest and protect humanity's data, which will be an ongoing process. Reaching the third step requires new infrastructure for retrieving and computing over that data; otherwise Filecoin will become nothing more than a worldwide pile of hard disks full of archived data.
This led to the birth of the Filecoin Virtual Machine (FVM). With a virtual machine integrated into the Filecoin protocol, developers can build decentralized applications (dApps) on top of the Filecoin storage network, execute smart contracts securely, and create additional layers of value on top of existing Filecoin data.
FVM is designed to execute smart contracts on the Filecoin network. Native FVM smart contracts, called actors, are compiled to WASM bytecode before they are deployed. This lets developers build applications that behave according to immutable rules, unlocking use cases such as bringing all of Filecoin's markets on-chain, permanent storage, DataDAOs, and more. Promising as this is, many chains have failed for lack of a good developer experience and an established developer community.
To address this, Filecoin decided to launch and support FEVM, an Ethereum Virtual Machine (EVM) runtime built on top of FVM. When an EVM contract is deployed to FEVM, an actor instance is created to run its EVM bytecode; the EVM runtime itself is compiled to WASM. User-defined FEVM actors can then interact with the Filecoin network through the built-in Market and Miner APIs.
This is a very critical step for Filecoin, because EVM-based development has proven to have a good developer experience, and it also has a large existing developer community. By providing a safe and secure environment for executing smart contracts, the Filecoin VM helps unlock the full potential of the Filecoin protocol, bringing new and innovative use cases for decentralized storage and computing.
Use Cases Unlocked by FVM
While many different use cases can be built using FVM, Protocol Labs and the FVM team have come up with a list of "Requests for Startups" that they believe are critical to the prosperity of the Filecoin ecosystem. Here are a few highlights from that list:
DataDAOs🌟
ML model storage and augmentation [via DataDAOs] 🌟🌟
Storage entry [via DataDAOs] 🌟🌟
Pay Per View [via DataDAOs]
Games [via DataDAOs] 🌟🌟🌟
Social [via DataDAOs] 🌟🌟🌟
Decentralized Science [via DataDAOs] 🌟🌟🌟
Track and reduce Filecoin's carbon footprint 🌟
Hardware Collateralized Lending
Permanent storage 🌟🌟
Storage automation (replication and repair) 🌟🌟
Trustless Fil+ Notary 🌟🌟
KYC and Proof of Claim
Decentralized data aggregator 🌟🌟
Access control
According to Protocol Labs, the number of stars indicates how important it is for the project to exist on the Filecoin network. Readers may have noticed the repeated references to DataDAOs, so what are they and why are they important?
DataDAOs
The saying "data is the new oil" has been circulating in recent years. The largest Internet companies in the world have long considered data to be their most valuable resource, whether for internal use (marketing and user insights) or for sale to third parties. Why is data valuable? It is valuable only if it can generate insights, and generating insights requires computation on, and interpretation of, the data itself.
In Filecoin's current state, data cannot be monetized on-chain, because storage deals are made peer-to-peer: the deal is negotiated off-chain and only settled on-chain. Monetizing data requires infrastructure for access control, subscription payments, data augmentation, packaging, and so on.
This is what FVM can unlock, through a decentralized, community-focused model. A DataDAO is a type of DAO whose mission revolves around the protection, generation, enhancement, and promotion of datasets that its stakeholders deem valuable.
Every stakeholder in a DataDAO can now be incentivized in a different way: storage providers (SPs) are responsible for storage and preservation; replication workers ensure local, fast availability; data providers are compensated whenever their data is sold; data experts can take part in packaging data; and ML engineers can be compensated for providing models that run on encrypted data, thereby increasing the data's value.
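To make the incentive model concrete, here is a minimal Python sketch of how a DataDAO might split the proceeds of one dataset sale among the roles listed above. The role names, the percentages, and the function `split_revenue` are illustrative assumptions, not part of any Filecoin or FVM API; a real DataDAO would encode such rules in a smart contract and set them through governance.

```python
from fractions import Fraction

# Hypothetical revenue shares per role (assumed values, not from any real DataDAO).
SHARES = {
    "storage_provider": Fraction(20, 100),
    "replication_worker": Fraction(10, 100),
    "data_provider": Fraction(40, 100),
    "data_expert": Fraction(15, 100),
    "ml_engineer": Fraction(15, 100),
}


def split_revenue(amount_attofil, shares=SHARES, remainder_role="data_provider"):
    """Split an integer payment (in attoFIL) among roles according to `shares`.

    Each payout is rounded down; any leftover units go to `remainder_role`
    so that the payouts always sum exactly to the payment.
    """
    if sum(shares.values()) != 1:
        raise ValueError("shares must sum to 1")
    if remainder_role not in shares:
        raise ValueError("remainder_role must be one of the roles in shares")
    payouts = {role: int(amount_attofil * share) for role, share in shares.items()}
    payouts[remainder_role] += amount_attofil - sum(payouts.values())
    return payouts
```

Working in integer attoFIL (1 FIL = 10^18 attoFIL) rather than floating-point FIL avoids rounding drift, which matters once payouts are settled on-chain.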
We can think of data as a value chain:
Raw data (least valuable)
Packaged data
Computation on data
Verifiable insights (most valuable)
Before FVM, raw and packaged data could be uploaded to Filecoin, but access control could not be enforced on-chain. It is now possible to use FVM for computations on data (e.g. AI/ML models, NFT creation). However, complex and heavy computations still cannot run on-chain, because Filecoin nodes currently do not have enough computing power. This leaves clear room for an off-chain DCompute platform that interoperates with Filecoin.
Solutions like Bacalhau focus on computing over data and interoperate tightly with Filecoin. Bacalhau is a fairly new project that could make the Filecoin network more valuable by increasing the value of the data stored on it.
A DataDAO does not need to cover the complete data value chain from the start. While that may be the ultimate goal, many DataDAOs are already starting out in specific verticals or use cases. Some notable examples include:
Lagrange DAO: A DAO for Data Value Realization and Decentralized Science (DeSci). It provides a data sharing and analysis space for DeSci.
GlacierDAO: A DAO that exposes replicas of Git repositories containing code of public interest. Additionally, it allows users to pool funds together to fund the replication of these repositories on the Filecoin network.
SPN DAO: A DataDAO that enables consumers to turn credit card transaction data into assets, giving them direct control over the usage and monetization of their data.
While DataDAOs currently focus on collecting and managing data, moving up the value chain will require computation on data and DCompute infrastructure. Projects already working on this with a focus on Filecoin include:
Bacalhau: Bacalhau is a platform that enables fast, efficient, and secure computing by running jobs where data is generated and stored. By running arbitrary Docker containers and WebAssembly images as tasks, users can streamline existing workflows without a large scale rewrite.
Shale: Shale is working to bring cloud computing to Filecoin, enabling storage providers to use their existing capacity for computing and compete directly with cloud providers such as AWS and Google Cloud. Public users will be able to rent compute instances from storage providers and access data from Filecoin+ storage deals over a local network. The approach is similar to Akash Network and StackOS, but it focuses more on computing over Filecoin data.
DataDAO is a unique use case that can only be unlocked end-to-end in a decentralized manner through the Filecoin network.
Solving existing problems
In its early stage Filecoin was only a storage protocol, and a protocol that is still developing inevitably has open problems, such as:
Even if my SP is penalized for failing to store my data, how can I repair or retrieve that data?
I have high-quality content for my distributed website that needs a caching layer for delivery, yet all my deals with the SP are made offline.
If I have enterprise-grade data, how can I enforce access control on it in Filecoin? Many other problems remain.
Filecoin is working hard to solve these problems through efforts such as FVM and Retrieval Markets. FVM makes it possible to create and incentivize data replication on Filecoin, so even if one SP fails, the data can be retrieved or repaired from other nodes.
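The replication-and-repair idea can be sketched in a few lines of Python. This is an illustrative model only: the function `plan_repairs`, the provider IDs, and the notion of a fixed target replica count are assumptions made for the example, and real logic would run as an FVM actor reacting to on-chain deal and fault information.

```python
def plan_repairs(replicas, healthy, target):
    """Decide where to create new replicas after storage provider (SP) failures.

    replicas: dict mapping a dataset ID to the set of SP IDs holding a copy
    healthy:  set of SP IDs currently considered healthy
    target:   desired number of healthy replicas per dataset

    Returns (plan, lost): `plan` maps dataset IDs to the SPs that should
    receive a new copy; `lost` lists datasets with no healthy replica left,
    which cannot be repaired from other nodes.
    """
    plan, lost = {}, []
    for data_id, holders in sorted(replicas.items()):
        live = holders & healthy
        if not live:
            lost.append(data_id)
            continue
        missing = target - len(live)
        if missing > 0:
            # Candidates are healthy SPs that do not already hold a copy.
            candidates = sorted(healthy - holders)
            plan[data_id] = candidates[:missing]
    return plan, lost
```

A DataDAO could pay a bounty for each replica in `plan` that is later proven on-chain, which is the incentive half of the mechanism.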
Retrieval Markets will help create a decentralized CDN, which is very beneficial for Web3 social and gaming projects. FVM can also be used to build an access control platform with end-to-end data encryption. Projects like Medusa are working in this direction, while also providing an additional layer of interoperability with other chains.
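As a rough illustration of on-chain access control, the following Python sketch models a registry in which a dataset owner grants time-limited read access. The class `AccessRegistry` and its methods are hypothetical and do not reflect Medusa's design or any FVM API. In a real system the data would be stored encrypted, and a decryption key would be released only to readers for whom a check like `can_read` succeeds.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRegistry:
    """Toy model of an access-control list for one encrypted dataset."""

    owner: str
    grants: dict = field(default_factory=dict)  # reader ID -> last epoch with access

    def _require_owner(self, caller):
        if caller != self.owner:
            raise PermissionError("only the owner can change access rights")

    def grant(self, caller, reader, expiry_epoch):
        """Give `reader` access up to and including `expiry_epoch`."""
        self._require_owner(caller)
        self.grants[reader] = expiry_epoch

    def revoke(self, caller, reader):
        """Remove any access previously granted to `reader`."""
        self._require_owner(caller)
        self.grants.pop(reader, None)

    def can_read(self, reader, current_epoch):
        """Return True if `reader` may read the dataset at `current_epoch`."""
        if reader == self.owner:
            return True
        expiry = self.grants.get(reader)
        return expiry is not None and current_epoch <= expiry
```

Expressing expiry as a chain epoch rather than a wall-clock time is a deliberate choice in this sketch: a contract can compare epochs deterministically, which also makes subscription-style access straightforward to model.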
Strengthening the need for DCompute infrastructure
Computation is difficult in a distributed data storage system like Filecoin, because the data may reside on many different nodes that are far apart. Aggregating the data in one place before computing on it defeats the purpose of a decentralized storage system.
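The alternative to aggregating data is to send the computation to where the data lives, which is the compute-over-data approach that projects such as Bacalhau pursue. The Python sketch below illustrates the idea with a global mean: each node reduces its own shard to a tiny summary, and only the summaries travel over the network. The function names and the in-memory "shards" are assumptions for illustration, not Bacalhau's API.

```python
def local_summary(shard):
    """Runs on the node that stores the shard: reduce it to (sum, count)."""
    return sum(shard), len(shard)


def global_mean(summaries):
    """Runs on the coordinator: combine per-node summaries into one mean."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    if count == 0:
        raise ValueError("no data points in any shard")
    return total / count


def values_moved(shards):
    """Compare how many values cross the network under each strategy."""
    move_data = sum(len(shard) for shard in shards)  # ship every raw value
    move_compute = 2 * len(shards)                   # ship (sum, count) per node
    return move_data, move_compute
```

For shards holding millions of values each, the second strategy moves two numbers per node instead of the whole dataset. Not every computation decomposes this neatly, and model training in particular needs repeated exchanges between nodes.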
As the world moves towards training AI models on large datasets, there is enormous economic value in storing large datasets on Filecoin (since the cost is low), but training models on data held in such systems is currently very difficult.
ChatGPT (GPT-3) was reportedly trained on a cluster of 1,000 V100 GPUs (each operating at about 147 TFLOPs), at an estimated cost of about $10 million.
Stable Diffusion was trained on 256 A100 GPUs, at a cost of close to $600,000.
Training at this scale requires enormous computing power, whereas the current minimum requirements for a Filecoin node are an 8-core CPU and 128 GB of RAM. Filecoin alone cannot cope with the compute-intensive internet age we live in.
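A back-of-the-envelope calculation, using only the figures quoted above, gives a sense of the scale involved. The constants below are the article's own numbers, taken at face value rather than independently verified, and the helper names are made up for this sketch.

```python
# Figures as quoted in this article (not independently verified).
GPT3_GPUS = 1000
GPT3_TFLOPS_PER_GPU = 147
GPT3_COST_USD = 10_000_000

SD_GPUS = 256
SD_COST_USD = 600_000


def aggregate_pflops(gpus, tflops_per_gpu):
    """Total peak throughput of a cluster in PFLOPs (1 PFLOP = 1000 TFLOPs)."""
    return gpus * tflops_per_gpu / 1000


def cost_per_gpu(total_cost_usd, gpus):
    """Training cost attributed to each GPU in the cluster."""
    return total_cost_usd / gpus


gpt3_pflops = aggregate_pflops(GPT3_GPUS, GPT3_TFLOPS_PER_GPU)
gpt3_cost_each = cost_per_gpu(GPT3_COST_USD, GPT3_GPUS)
sd_cost_each = cost_per_gpu(SD_COST_USD, SD_GPUS)
```

With these inputs the GPT-3 cluster comes to 147 PFLOPs of aggregate peak throughput, and the quoted budgets work out to $10,000 per GPU for GPT-3 and roughly $2,344 per GPU for Stable Diffusion.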
There are two ways to solve this problem:
Make Filecoin a computational powerhouse by sharply raising the minimum compute requirements for storage nodes. This transformation cannot be accomplished overnight; it must be a gradual process.
Outsource computation on Filecoin data to third-party DCompute platforms, either Filecoin-centric ones such as Bacalhau and Shale or other solutions such as Akash Network and StackOS.
Another problem is that many models and training pipelines already exist. If data is to be moved to Filecoin, the compute network must also support existing models built with TensorFlow, Ray, and similar frameworks.
In the short term, the author believes that decentralized cloud computing networks such as Shale and Akash will win the market because they are easy to deploy and impose less developer overhead. In the long run, Filecoin must also strive to become a computing powerhouse, either by upgrading existing resources or by finding a more efficient approach to decentralized computing.
