Mapping the Blockchain x AI Landscape
Author: Oliver Jaros
Teams building at the intersection of crypto and AI have generated considerable buzz since the launch of ChatGPT. The prevailing standard for AI models involves closed-source mega models powered by massive compute clusters hosted on AWS or another centralized compute provider. A decentralized ecosystem of open networks and token incentives can enable new innovations and add value to existing AI projects by incentivizing collaboration and crowdsourcing valuable data and compute. This is why we are interested in companies building networks that facilitate the easy exchange of compute, data, and even models themselves, as well as companies building ZK circuits and libraries to prove that model inference and training jobs were run correctly.
Additionally, we believe that the AI agent ecosystem will be coordinated on a blockchain substrate. In a future with millions or even billions of deployed agents, it is natural that they will exist on blockchains and transact with crypto, because it is far easier for an agent to spin up a wallet and transact from it than to go through a KYC process to open a bank account.
For these reasons, we are excited about companies building AI agent frameworks and networks to coordinate these agents, as well as proof-of-personhood protocols that distinguish human users from agent users on this future internet.
It can be challenging to understand all the ways in which crypto and AI intersect. We have put together the map below to help visualize the different sub-sectors within the AI stack.
We’re currently most excited about decentralized data collection, dataset and content provenance, AI agent frameworks, proof of personhood, and zero knowledge machine learning (zkML). We believe that these categories will uniquely enable new innovations in AI via blockchains and crypto. This piece dives further into how these categories will likely play out and why they will be important.
Decentralized Data Collection and Transformation
This sector focuses on the ways blockchains can enhance data collection and transformation. Blockchains enable coordination mechanisms in which participants are rewarded for contributing their data, or their time and attention, towards labeling or otherwise transforming data so that it is suitable for machine learning training datasets. Well-designed token incentives can encourage both high-quality data and high-quality annotations of that data.
Hivemapper’s model exemplifies this approach. Participants install a dashcam in their vehicles to capture and upload street-level imagery to the Hivemapper Network. A high-quality map is created, and customers wishing to consume the map data can purchase access to Hivemapper’s API using the network’s native token, HONEY. Network participants can also contribute to data labeling in a gamified way. AI trainers play games in which they classify objects, for example by confirming that an image shows a 35 mph speed limit sign. Each review contributes to a labeled dataset that helps the Hivemapper AI do a better job of identifying objects. This network empowers anyone equipped with a dashcam to contribute to a dynamic, global dataset rather than relying on a few major companies to map the world. It also enables ordinary users to contribute to the annotation and labeling of datasets used to train AI models.
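The reward mechanics of a labeling game like the one described above can be sketched in a few lines. The following is a minimal illustration, not Hivemapper’s actual mechanism: the function name `settle_label_task`, the 60% agreement threshold, and the equal split of the reward pool are all assumptions chosen for clarity.

```python
from collections import Counter

def settle_label_task(votes, reward_pool, min_agreement=0.6):
    """Pick the majority label and split reward_pool among reviewers who chose it.

    votes: dict mapping reviewer id -> submitted label.
    Returns (accepted_label, payouts). When no label reaches min_agreement,
    returns (None, {}) so the task can be re-queued for more reviews.
    All names and thresholds here are hypothetical.
    """
    if not votes:
        return None, {}
    label, count = Counter(votes.values()).most_common(1)[0]
    if count / len(votes) < min_agreement:
        return None, {}
    share = reward_pool / count  # equal split among agreeing reviewers
    payouts = {reviewer: share for reviewer, vote in votes.items() if vote == label}
    return label, payouts

votes = {
    "alice": "speed_limit_35",
    "bob": "speed_limit_35",
    "carol": "speed_limit_35",
    "dave": "speed_limit_25",
}
label, payouts = settle_label_task(votes, reward_pool=9.0)
print(label, payouts)
```

Paying only reviewers who agree with the consensus is one simple way to make careless or adversarial labeling unprofitable. A production network would likely also weight reviewers by track record and audit a sample of accepted labels.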
Dataset and Content Provenance and Tracing
The proliferation of deepfakes and AI-generated content has amplified misinformation online. Cryptography and blockchains offer solutions by authenticating created content such as photos or videos via cryptographic signatures. While few companies currently offer this functionality on a blockchain substrate, several web2 companies recognize the value of cryptographically signing content to prove its provenance. Adobe’s Content Credentials allow creators to sign their content, attaching a signature to the exported file’s metadata as proof of origin. Leica, a camera maker, has built support for Content Credentials into its camera hardware so that each photo is cryptographically signed when it is taken. Signing does not fully prevent unauthorized copying or sharing of content without credit to the original creator, but adopting a standard for posting content credentials to a blockchain upon creation is a start for creators looking to build their reputation.
Solana Mobile is well positioned to pioneer these standards via its Saga phone, which could be updated so that every photo is cryptographically signed using the Solana Mobile Stack (SMS) Seed Vault SDK. The provenance of a photo could then be proven by writing metadata such as time and location to Solana, attesting that the photo was taken on the device and was not AI-generated. Generative AI tools such as Stable Diffusion and ChatGPT could also integrate public key cryptography to enable creators to demonstrate that these tools were used. This would pair well with a token incentive that rewards creators for cryptographically signing and generating proofs for their content. As this area evolves, we are keeping an eye out for first movers utilizing blockchains for content verification.
Cryptographic signing could also enable the provenance and tracing of datasets, an issue brought to light by the recent lawsuit against OpenAI over training on copyrighted data. ZK proofs could, in theory, validate that a dataset is free of copyrighted material without exposing its actual content. The challenge lies in implementing a cost-effective system for on-chain verification: verifying and recording a hash on a blockchain for every dataset update can be costly, which is a hurdle to practical application.
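One common way to keep the on-chain footprint small is to commit only a single fixed-size digest per dataset version, such as a Merkle root, and keep the records themselves off-chain. The sketch below uses only Python’s standard `hashlib`. The function name `merkle_root`, the leaf and node prefixes, and the sample records are illustrative assumptions, and the sketch does not address the harder ZK question of proving properties of the data.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(records):
    """Return a 32-byte Merkle root for a list of byte-string records."""
    if not records:
        return sha256(b"")
    # Leaves and internal nodes use different prefixes so a leaf can never
    # be reinterpreted as an internal node.
    level = [sha256(b"\x00" + record) for record in records]
    while len(level) > 1:
        next_level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # carry an unpaired node up unchanged
        level = next_level
    return level[0]

# Two hypothetical dataset versions; only the 32-byte roots would go on-chain.
v1 = [b"row-1", b"row-2", b"row-3"]
v2 = v1 + [b"row-4"]
root_v1 = merkle_root(v1).hex()
root_v2 = merkle_root(v2).hex()
print(root_v1)
print(root_v2)
```

Whatever the size of the dataset, each update costs one 32-byte write, and anyone holding the off-chain records can recompute the root to check that the published version has not been altered.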
AI Agent Frameworks and Coordination Mechanisms
An AI agent is a computer program designed to perceive its environment and take actions aimed at achieving specific goals. Agents utilize existing models like GPT, but they are distinguished by their ability to take actions within their environment. For example, while a Large Language Model (LLM) like GPT is a static application that you consult for vacation planning advice, an AI agent goes a step further. It not only provides information but can also autonomously execute tasks such as booking flights and hotels on your behalf.
We believe blockchain rails are the natural substrate on which AI agents will interact with each other and exchange value. If we want AI agents to perform complex actions, they will need access to capital. Cryptocurrencies are the most logical payment medium for AI because they are available 24/7, natively digital, and held in wallets that can be spun up programmatically. It will not be as straightforward for an AI agent to interact with a bank’s interface and complete KYC. Additionally, blockchains are the natural home for agents because their actions need to be provable. Agent actions and decisions can occur on-chain, where users have visibility into them, or an agent requiring more computational resources can execute off-chain and use zkML provers so that its off-chain computation and inputs can be verified.
Promising use cases for on-chain agents include automated trading and portfolio management, fraud detection, DAO governance, and smart contract auditing. Olas is one example of a protocol for developing and coordinating AI agents. Olas provides a framework that allows developers to create agent services, which are off-chain autonomous services that run as a multi-agent system (MAS) and offer enhanced functionality on-chain. Agent services expand the range of operations that traditional smart contracts offer, making it possible to execute arbitrarily complex operations and achieve some of the aforementioned use cases.
Proof of Personhood
Proof of Personhood (PoP) refers to protocols that verify an online user is a unique human. The advent of generative AI and the inevitable proliferation of agents highlight the need for PoP. PoP can be used to counter malicious attacks on online platforms, such as Sybil attacks that rely on many fake virtual identities. PoP can also help limit the spread of misinformation and spam in a future internet era where AI agents drive the bulk of online interactions. The main challenge to adoption today is users’ reluctance to hand over the data required to authenticate identity, typically biometrics or other sensitive personal information. Projects like Worldcoin have adopted token incentives, paying users who authenticate themselves 25 WLD. Proof of Personhood is crucial for advancing secure and ethical AI interactions online and ensuring a reliable digital environment.
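To make the Sybil-resistance idea concrete, here is a minimal sketch of a "one claim per person" check built on a nullifier, a per-application hash derived from a secret tied to a verified person. Everything here is a hypothetical illustration: `derive_nullifier`, `ClaimRegistry`, and the secrets are invented for this example, and a real protocol would also require a proof that the nullifier belongs to a genuinely verified human, which this sketch omits.

```python
import hashlib

def derive_nullifier(person_secret: bytes, app_id: str) -> str:
    """Derive a per-app identifier on the user's side (hypothetical scheme).

    Mixing in app_id means the same person produces unrelated values in
    different apps, so claims cannot be linked across them.
    """
    return hashlib.sha256(app_id.encode() + b"|" + person_secret).hexdigest()

class ClaimRegistry:
    """Toy registry that accepts each nullifier at most once."""

    def __init__(self):
        self._used = set()  # stores nullifiers only, never raw identity data

    def claim(self, nullifier: str) -> bool:
        if nullifier in self._used:
            return False  # same person trying to claim twice
        self._used.add(nullifier)
        return True

registry = ClaimRegistry()
alice = derive_nullifier(b"alice-secret", "airdrop-2024")
bob = derive_nullifier(b"bob-secret", "airdrop-2024")
first_claim = registry.claim(alice)
repeat_claim = registry.claim(alice)
other_claim = registry.claim(bob)
print(first_claim, repeat_claim, other_claim)
```

Because the registry stores only hashes, it can reject duplicate claims without holding biometrics or other personal data, which speaks to the reluctance described above.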
Zero Knowledge Machine Learning (“zkML”)
zkML is the technology that enables smart contracts to use the output of AI/ML model inference. The model is run off-chain, and a ZK proof submitted on-chain shows that the computation was done as intended. Running AI inference directly on-chain is impractical: validator CPUs are poorly suited to the thousands of matrix multiplications required, and the approach does not scale because every validator must repeat the computation. zkML and opML (optimistic machine learning) middleware solve this by computing ML jobs off-chain and submitting an on-chain proof that the job was completed. zkML is a scalable solution because the inference and the proof generation can be done on one beefy machine, and the proof can be cheaply verified across light nodes in the network.
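The scalability argument rests on an asymmetry: checking a result can be much cheaper than computing it. The sketch below illustrates that asymmetry with Freivalds’ algorithm, a classical randomized check for matrix products. It is not a zero-knowledge proof and is not how zkML systems work internally. It only shows that a verifier can check a claimed product with O(n^2) work per round while the prover pays O(n^3). The matrix size and number of rounds are arbitrary choices.

```python
import random

def matmul(a, b):
    """Naive O(n^3) product of two square matrices (lists of lists)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    """O(n^2) matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def freivalds_check(a, b, c, rounds=20, rng=None):
    """Probabilistically check that a @ b == c without recomputing a @ b.

    A correct c always passes. An incorrect c passes all rounds with
    probability at most 2**-rounds.
    """
    rng = rng or random.Random()
    n = len(a)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False
    return True

rng = random.Random(0)
n = 8
a = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
b = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
c = matmul(a, b)                   # expensive step, done once by the "prover"
tampered = [row[:] for row in c]
tampered[2][3] += 1                # a single wrong entry

print(freivalds_check(a, b, c, rng=rng))         # a correct product always passes
print(freivalds_check(a, b, tampered, rng=rng))  # rejected except with probability <= 2**-20
```

ZK proof systems offer far stronger guarantees, including succinct proofs for entire model executions and optional privacy of inputs or weights, but the economic shape is the same: one party does the heavy computation and many parties verify it cheaply.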
If you are building in any of the above sub-sectors or are interested in exploring any of these ideas further, please reach out.