
How does multi-dimensional analysis of DePIN help artificial intelligence?

星球君的朋友们, senior Odaily author
2023-06-16 09:20

Why is AI inseparable from the blockchain?

Original author: Catrina, Portal Ventures, Filecoin Insights contributing writer

Original source: Filecoin Network

In the past, startups, with their speed, agility, and entrepreneurial culture, were free of organizational inertia and led technological innovation for a long time. The era of artificial intelligence has rewritten all of that. So far, the creators of breakthrough AI products have been traditional technology giants: Microsoft-backed OpenAI, Nvidia, Google, and even Meta.

What happened? Why did the giants beat the startups this time? Startups can write great code, but they face several obstacles the tech giants do not:

  • Computing costs remain high

  • AI development suffers from a reverse flywheel: concern and uncertainty about AI's societal impact, combined with a lack of necessary guidelines, hinder innovation

  • The AI black-box problem

  • The "data moats" built by the big technology companies form a barrier to entry


So why is blockchain technology needed? Where does it intersect with artificial intelligence? Although it cannot solve every problem at once, Decentralized Physical Infrastructure Networks (DePIN) create the conditions for addressing the problems above. The following explains how the technology behind DePIN can help artificial intelligence along four dimensions:

  • Reducing infrastructure costs

  • Verifying creators and personhood

  • Filling gaps in AI democracy and transparency

  • Setting up reward mechanisms for data contribution


A few definitions used below:

  • "Web3" refers to the next generation of the internet, of which blockchain and other existing technologies are organic components.

  • "Blockchain" refers to decentralized, distributed ledger technology.

  • "Crypto" refers to cryptocurrencies and the cryptographic technology underlying them.

1. Reduce infrastructure costs (computing and storage)

Every wave of technological innovation starts with something expensive becoming cheap enough to waste.

——Society's Technical Debt and Software's Gutenberg Moment, from SK Ventures

How important is infrastructure affordability? (Here, AI infrastructure means the hardware costs of computing, transmitting, and storing data.) Carlota Perez's theory of technological revolutions offers a framework: every technological revolution passes through two phases.

Source: Carlota Perez's Theory of Technological Revolutions

  • The installation phase is characterized by heavy venture capital and a "push" go-to-market (GTM) strategy, because customers do not yet understand the value proposition of the new technology.

  • The deployment phase is characterized by a massive increase in infrastructure provisioning, which lowers the barrier for new entrants, and by a "pull" GTM strategy, indicating strong product-market fit and customers demanding products that do not yet exist.

Now that products such as ChatGPT have demonstrated market fit and customer demand, AI can be said to have entered the deployment phase.

The problem

Physical infrastructure today is dominated by a vertically integrated oligopoly: AWS, GCP, Azure, Nvidia, Cloudflare, Akamai, and others. The industry enjoys high profit margins; AWS's gross margin on commodity computing hardware is estimated at 61%. New entrants in AI, especially in the LLM field, therefore face extremely high computational costs.

  • A single training run of ChatGPT is estimated to cost $4 million, and running its inference hardware costs roughly $700,000 per day.

  • Training and retraining version 2 of Bloom could cost $10 million.

  • If ChatGPT were integrated into Google Search, Google's revenue would fall by an estimated $36 billion.

Solution

DePIN networks such as Filecoin (the DePIN pioneer, launched in 2014, focused on aggregating internet-scale hardware for distributed data storage), Bacalhau, Gensyn.ai, Render Network, and ExaBits (a coordination layer matching CPU/GPU supply and demand) can save 75% to 90%+ in infrastructure costs in the following ways:

1. Push the supply curve and stimulate market competition

DePIN gives hardware suppliers an equal opportunity to become service providers. It creates a market where anyone can join as a "miner", exchanging CPU/GPU or storage capacity for financial compensation, thereby creating competition for incumbent providers.

While a company like AWS certainly enjoys a 17-year head start in user experience, operations, and vertical integration, DePIN attracts a new customer base that cannot accept centralized suppliers' pricing. Just as eBay does not compete directly with Bloomingdale's but offers a more economical alternative for similar needs, distributed storage networks do not replace centralized suppliers; they are designed to serve price-sensitive user groups.

2. Promote market economic balance through encrypted economic design

The subsidy mechanisms created by DePIN can guide hardware suppliers into the network, thereby reducing costs for end users. To see how, compare the costs and revenues of AWS (Web2) and Filecoin storage providers (Web3).

Customers get a price reduction: DePIN networks create a competitive market and introduce Bertrand competition, lowering the prices customers pay. By comparison, AWS EC2 requires ~55% margins (31% overall) to stay afloat. DePIN networks also give providers a new source of income in the form of token incentives/block rewards. In Filecoin, the more real data a storage provider hosts, the more block rewards (tokens) it earns, so storage providers have an incentive to attract more customers, close more deals, and increase revenue. The token structures of several emerging computing DePIN networks remain undisclosed, but likely follow a similar pattern. Similar networks include:

  • Bacalhau: Brings computation to a coordination layer where data is stored, avoiding moving large amounts of data.

  • exaBITS: A distributed computing network serving AI and compute-intensive applications.

  • Gensyn.ai: A Computing Protocol for Deep Learning Models.
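The block-reward economics described above can be sketched with hypothetical numbers. Every figure below (hardware cost, margins, reward size) is an illustrative assumption, not data from AWS, Filecoin, or any other network:

```python
# Illustrative sketch (hypothetical numbers): how token block rewards let a
# DePIN provider undercut a centralized provider's customer-facing price.

def min_viable_price(hardware_cost: float, margin: float) -> float:
    """Price a provider must charge to cover cost plus a target margin."""
    return hardware_cost * (1 + margin)

def depin_price(hardware_cost: float, margin: float, block_reward: float) -> float:
    """With a token subsidy, part of the required revenue comes from block
    rewards, so the price charged to the customer can drop by that amount."""
    return max(0.0, min_viable_price(hardware_cost, margin) - block_reward)

# Hypothetical unit costs per TB-month:
centralized = min_viable_price(hardware_cost=10.0, margin=0.5)   # stand-in for a high-margin incumbent
subsidized = depin_price(hardware_cost=10.0, margin=0.25, block_reward=5.0)

print(centralized)  # 15.0
print(subsidized)   # 7.5
```

The point of the sketch is the structure, not the numbers: block rewards act as a second revenue stream, so a DePIN provider can sustain a lower sticker price at the same level of profitability.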

3. Reduce overhead costs: Advantages of DePIN networks such as Bacalhau, exaBITS, and IPFS/content-addressed storage include:

  • Unlocking latent data availability: Large volumes of data, such as the massive event data generated by sports stadiums, currently go untapped because of the high bandwidth cost of transmitting large datasets. DePIN projects can process data on-site and transmit only the meaningful output, unlocking this latent data.

  • Reducing operating costs: Lower data ingestion, transfer, and import/export costs by processing data where it is produced.

  • Minimizing manual work in sensitive data sharing: If hospitals A and B need to combine sensitive patient data for analysis, they can use Bacalhau to coordinate GPU power and process the sensitive data locally, without exchanging personally identifiable information (PII) through cumbersome administrative processes.

  • No need to recompute the underlying dataset.

2. Verify creators and personhood

The problem

A recent study showed that 50% of AI scholars believe there is a greater than 10% chance of AI causing devastating harm to humanity.

People should be alert: AI is already causing social disruption, yet regulation and technical standards are still lacking. This situation is the "reverse flywheel".

One example made the rounds in a Twitter video.

Source: Bloomberg

It’s worth noting that AI’s social impact extends far beyond the problems posed by fake blogs, conversations, and images:

  • In the 2024 US election cycle, AI-generated deepfake campaign content has for the first time looked convincingly real.

  • A video of Senator Elizabeth Warren was edited to make her "say" that "Republicans shouldn't be allowed to vote" (since debunked).

  • Synthesized audio of Biden's voice was used to criticize trans women.

  • A group of artists has filed a class action lawsuit against Midjourney and Stability AI, alleging unauthorized use of artists' work to train AI, copyright infringement and threats to artists' livelihoods.

  • The AI-generated song "Heart on My Sleeve", imitating The Weeknd and Drake, went viral on streaming platforms before being pulled. When new technology enters the mainstream without regulation, problems follow; copyright infringement is one such "reverse flywheel" problem.

Solution

Use on-chain cryptographic proofs of origin as proof of personhood and proof of creatorship.

This is where blockchain technology truly shines: as a distributed ledger with an immutable on-chain history, it lets the authenticity of digital content be verified through cryptographic proofs.

Digital signatures as proof of creatorship and personhood

To identify a deepfake, a cryptographic proof can be generated using a digital signature unique to the original content's creator. The signature is created with a private key known only to the creator and verified with a public key available to everyone. A valid signature proves the content was created by the original creator, whether human or AI, and any change to the content can be checked as authorized or unauthorized.
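The sign-then-verify flow can be sketched with a deliberately insecure toy RSA keypair. The tiny primes and unpadded hashing below are illustrative only; real systems use full-size keys and vetted cryptographic libraries:

```python
import hashlib

# Toy RSA signature sketch (NOT secure: tiny primes, no padding). It only
# illustrates the idea from the text: the creator signs with a private key,
# anyone verifies with the public key, and tampering breaks verification.

# Hypothetical small key: n = p*q, e is the public exponent, d the private one.
p, q = 61, 53
n = p * q                           # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # modular inverse of e, here 2753

def digest(content: bytes) -> int:
    # Hash the content, reduced mod n so it fits the toy key size.
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n

def sign(content: bytes) -> int:
    return pow(digest(content), d, n)               # uses the private key

def verify(content: bytes, signature: int) -> bool:
    return pow(signature, e, n) == digest(content)  # uses only the public key

original = b"authentic video transcript"
sig = sign(original)
print(verify(original, sig))                 # True
print(verify(b"tampered transcript", sig))   # False
```

The same pattern underlies real creator-attestation schemes: the signature travels with the content, and any edit invalidates it unless re-signed by the key holder.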

Proof of Authenticity with IPFS and Merkle Trees

IPFS is a distributed protocol for referencing large datasets using content addressing and Merkle trees. To prove whether a file's content has been altered, a Merkle proof is generated: a sequence of hashes showing the position of a specific data block in the Merkle tree. Each change adds a hash to the Merkle tree, providing proof of the modification.
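As a rough illustration of the Merkle-proof mechanics described above (a simplified sketch, not IPFS's actual implementation): a file is split into blocks, the blocks are hashed into a tree, and a short proof of sibling hashes shows that a given block belongs to a file with a given root.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes (with side flags) from leaf `index` up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1                # the paired node at this level
        proof.append((level[sibling], sibling % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_proof(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
root = merkle_root(blocks)
proof = merkle_proof(blocks, 2)
print(verify_proof(b"block-2", proof, root))    # True
print(verify_proof(b"tampered", proof, root))   # False
```

Note the proof is logarithmic in the number of blocks: a verifier needs only the root and a handful of sibling hashes, never the whole file.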

The pain point of cryptographic schemes is the incentive mechanism: identifying deepfake creators reduces negative social impact, but it does not produce a matching economic benefit. That responsibility is likely to fall on mainstream media distribution platforms such as Twitter, Meta, and Google, and in practice it has. So why do we need blockchain?

The answer is that blockchain's cryptographic signatures and proofs of authenticity are more effective, verifiable, and deterministic. Currently, deepfake detection relies mainly on machine learning algorithms (such as Meta's "Deepfake Detection Challenge", Google's Asymmetric Numeral Systems (ANS), and C2PA: https://c2pa.org/) that identify visual regularities and anomalies in content, but these are often not accurate enough and lag behind the pace of deepfake development. Manual review is usually still required to determine authenticity, which is inefficient and expensive.

If one day every piece of content carries a cryptographic signature, anyone will be able to verifiably prove its origin and flag tampering or forgery. That would be a far better world.

3. AI democracy and transparency

The problem

AI today is a black box built from proprietary data and proprietary algorithms. The closed nature of big tech's LLMs kills what I call "AI democracy": the idea that every developer, and even every user, can contribute algorithms and data to an LLM.

AI democracy = visibility (the ability to see the data and algorithms fed into the model) + contribution (the ability to contribute data and algorithms to the model)

Solution

The purpose of AI democracy is to make generative AI models accessible to, relevant to, and owned by everyone. The comparison below contrasts the current state of AI with a future enabled by Web3 blockchain technology.

At present:

For customers:

  • Receive LLM output unidirectionally

  • No control over how personal data is used

For developers:

  • Low composability

  • ETL data processing is not traceable and is difficult to reproduce

  • Sources of data contributions are limited to data owning institutions

  • Closed-source models can only be accessed via API for a fee

  • Shared data outputs lack verifiability, and data scientists spend 80% of their time on low-level data cleaning

After combining with blockchain:

For customers:

  • Users can provide feedback (on bias, content moderation, granular feedback on outputs) as a basis for fine-tuning

  • Users can choose to contribute data in exchange for a share of profits once the model is profitable

For developers:

  • Distributed data management layer: crowdsource repetitive and time-consuming data preparation tasks such as data labeling

  • Visibility into algorithms, plus the ability to combine and fine-tune them, with verifiable provenance (a tamper-proof history of all changes)

  • Data sovereignty (enabled by content addressing/IPFS) and algorithmic sovereignty (e.g., Urbit enables peer-to-peer composition and portability of data and algorithms)

  • Accelerated LLM innovation through variations on underlying open-source models

  • Reproducible training data outputs, enabled by the blockchain's immutable record of past ETL operations and queries (e.g., Kamu)
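The reproducibility point above can be illustrated with a small sketch: a hash-chained ETL log, in the spirit of (but not taken from) systems like Kamu, where each entry commits to every operation before it, so tampering with the recorded pipeline is detectable:

```python
import hashlib
import json

def chain_etl_log(operations):
    """Return entries where each hash commits to the op and all prior ops."""
    entries, prev = [], "0" * 64
    for op in operations:
        payload = json.dumps({"prev": prev, "op": op}, sort_keys=True)
        prev = hashlib.sha256(payload.encode()).hexdigest()
        entries.append({"op": op, "hash": prev})
    return entries

def verify_etl_log(entries):
    """Recompute the chain; any edited entry breaks every hash after it."""
    prev = "0" * 64
    for entry in entries:
        payload = json.dumps({"prev": prev, "op": entry["op"]}, sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = chain_etl_log(["extract:source_a", "transform:dedupe", "load:warehouse"])
print(verify_etl_log(log))         # True
log[1]["op"] = "transform:none"    # tamper with the recorded history
print(verify_etl_log(log))         # False
```

A blockchain provides the same property with stronger guarantees (the log is replicated and append-only), which is what makes ETL histories replayable and auditable by third parties.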

Some argue that Web2 open-source platforms already offer a compromise, but the results have been underwhelming; see the related discussion in exaBITS' blog post.

4. Reward mechanisms for data contribution

The problem

Today, the most valuable consumer data is the exclusive asset of large technology companies, forming a core business barrier. Tech giants have no incentive to share this data with outside parties.

So why can't we get data directly from its creators or users? Why can't we make data a public resource, contribute data and open source it for data scientists to use?

Simply because of a lack of incentive and coordination mechanisms. Maintaining data and performing ETL (extract, transform, load) carries significant overhead costs. Data storage alone is projected to be a $777 billion industry by 2030, not counting computing costs. Nobody takes on the work and cost of data processing for free.

Consider OpenAI. It was founded to be open source and non-profit, but could not cover its costs. In 2019 it had to accept funding from Microsoft, and its algorithms were no longer open to the public.

Solution

Web3 introduces a new mechanism called the "dataDAO", which facilitates the redistribution of revenue between AI model owners and data contributors, creating an incentive layer for crowdsourced data contribution. Space does not permit a full treatment here; for more, read these two articles:

  • How dataDAOs work, by HQ Han of Protocol Labs

  • How data contribution and monetization work in Web3, in which I discuss in depth the mechanisms, shortcomings, and opportunities of dataDAOs
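To make the dataDAO incentive layer concrete, here is a minimal, hypothetical sketch of pro-rata revenue redistribution. The owner share and contribution weights are invented parameters for illustration, not a description of any real dataDAO:

```python
# Hypothetical sketch of a dataDAO-style incentive layer: model revenue is
# split between the model owner and data contributors, with the contributor
# pool divided pro-rata by each contributor's share of contributed data.

def split_revenue(revenue, owner_share, contributions):
    """contributions: {contributor: data_units}. Returns payouts per party."""
    payouts = {"model_owner": revenue * owner_share}
    pool = revenue - payouts["model_owner"]      # remainder goes to contributors
    total = sum(contributions.values())
    for who, units in contributions.items():
        payouts[who] = pool * units / total
    return payouts

payouts = split_revenue(1000.0, 0.4, {"alice": 300, "bob": 100})
print(payouts)  # {'model_owner': 400.0, 'alice': 450.0, 'bob': 150.0}
```

The interesting design questions live outside this sketch: how contributions are measured and verified on-chain, and how the owner/contributor split is governed.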


In general, DePIN takes a new approach, supplying fresh hardware energy for Web3 and AI innovation. While tech giants dominate the AI industry, emerging players can leverage blockchain technology to join the fray: DePIN networks lower barriers to entry by lowering computing costs; blockchain's verifiable, distributed nature makes truly open AI possible; innovative mechanisms such as dataDAOs incentivize data contribution; and the blockchain's immutability and tamper resistance provide proof of creatorship, easing concerns about AI's negative social impact.
