Special thanks to Brian Retford, Sun Yi, Jason Morton, Shumo, Feng Boyuan, Daniel, Aaron Greenblatt, Nick Matthew, Baz, Marcin, and Brent for incredibly valuable insights, feedback, and review of this post.
For us fellow degens living under a crypto rock, AI has been on fire for a while. Fun story, no one wants to see an AI go rogue. Blockchain was invented to prevent the US dollar from going rogue, so we might just give it a shot. Also, we now have a new technology called ZK, which is used to ensure stuff doesn't go wrong. I just assumed that the average degen knows a slight bit about what blockchain and ZK are. However, to tame the AI beast, we must understand how AI works.
AI has gone through a few names from “expert systems“ to “neural nets“, then “graphical models“, and eventually “machine learning“. All of these are subsets of “AI“ which people give different names and we learn more about AI. Let's take a small dive into ML and unveil the mysteries of machine learning.
Note: most machine learning models today are neutral networks for their excellent performance for many tasks. We mainly refer to machine learning as neural networks machine learning.
First, let's join me on a quick journey through the inner workings of machine learning：
Input data Preprocessing:
The input data needs to be processed into a format that can be used as input to the model. This often involves preprocessing and feature engineering to extract useful information and transform the data into a suitable form, such as an input matrix or tensor (high dimensional matrix). This is the expert systems approach. As deep learning comes across, layers appear to take care of the pre-processing automatically.
Set up initial model parameters:
Initial model parameters include several layers, activation functions, initial weights, biases, learning rate, etc. Some can be adjusted during training to improve the model's accuracy using optimization algorithms.
Train the data：
The input is fed into the neural network, typically starting with one or more layers of feature extraction and relationship modeling, such as convolutional layers（CNN）, recurrent layers（RNN）, or self-attention layers. These layers learn to extract relevant features from the input data and model the relationships between these features.
The output from these layers is then passed through one or more additional layers, which perform different computations and transformations on the input data. These layers typically mainly involve matrix multiplication with learnable weight matrices and the application of non-linear activation functions, but they could also include other operations, such as convolutions and pooling in convolutional neural networks or iterations in recurrent neural networks. The output of these layers serves as input to the next layer in the model or as the final output for prediction.
Get the output of the model:
The output of neural network computation is usually a vector or matrix that represents the probability of image classification, sentiment analysis score, or other results, depending on the application of the network. There is usually another error evaluation & parameter update module which allows automatic updates to parameters depending on the purpose of the model.
If the above explanation seems too obscure, you can look at the following example of using a CNN model to recognize images of apples.
An image is loaded into the model as a matrix of pixel values. The matrix can be represented as a 3D tensor with dimensions (height, width, channels).
The initial parameters of the CNN model are set.
The input image is fed through multiple hidden layers in the CNN, where each layer applies convolutional filters to extract increasingly complex features from the image. The output of each layer is passed through a non-linear activation function and then pooled to reduce the dimensionality of the feature maps. The final layer is typically a fully connected layer that produces the output prediction based on the extracted features.
The final output of the CNN is the class with the highest probability. This is the predicted label for the input image.
We can then summarize the above into an ML trust framework, which includes four essential layers of ML that need to be trustworthy for the entire ML process to be reliable:
Input: Raw data needs to be preprocessed, and sometimes it needs to be private.
Integrity: input data is not tampered with, contaminated with adversarial input, and is correctly pre-processed.
Privacy: input data is not leaked if desired.
Output: Needs to be accurately generated and transmitted
Integrity: output is correctly generated.
Privacy: output is not leaked if desired.
Model Type/Algo: models should be correctly computed
Integrity: model is executed correctly.
Privacy: the model itself or the computation is not leaked if desired.
Different neural network models have different algos and layers, catering to different use cases and inputs.
CNNs are commonly used for tasks involving grid-like data, such as images, where local patterns and features can be captured by applying convolutional operations to small input regions.
RNNs, on the other hand, are well-suited for sequential data such as time series or natural language, where the hidden states can capture information from previous time steps and model temporal dependencies.
Self-attention layers are useful for capturing relationships between elements in the input sequence, making them effective for tasks such as machine translation or summarization, where long-range dependencies are essential.
Several other model types exist, including Multilayer Perceptron (MLP), etc.
Model parameters: parameters should be transparent or democratically generated in some cases but not easy to tamper with in all cases.
Integrity: parameters are generated, maintained, and governed in the correct manner.
Privacy: Model owners often keep machine learning model parameters confidential to protect the intellectual property and competitive advantage of the organization that developed the model. This is only prevalent until transformer models become crazy expensive to train but nonetheless a major issue for the industry.
With the explosive growth of Machine Learning (ML) applications (with CAGR exceeding 20%) and their increasing integration into daily life, such as the recent popularity of ChatGPT, issues of trust in ML are becoming increasingly critical and cannot be ignored. Therefore, it is crucial to discover and address these trust issues to ensure the responsible use of AI and prevent its potential misuse. However, what exactly are these issues? Let's dive in.
Trust issues have long plagued machine learning for two main reasons:
Privacy nature: As mentioned above, the model parameters are usually private, and in some cases, the model input also needs to be private, which naturally brings some trust problems between the model owner and the model users.
Blackbox of the algorithm: Machine learning models are sometimes referred to as 'black boxes' because they involve many automated steps in the computational process that can be difficult to understand or interpret. These steps involve complex algorithms and large amounts of data which bring indeterministic and sometimes random output, putting the algorithm in a place to be blamed for biases and even discrimination.
Before going more profound, the bigger assumption in this article is that the model is already "ready to use, " meaning it's well-trained and fit for purpose. The model might not fit all cases, and models improve at a staggering rate that the normal shelf life of an ML model is anywhere from 2 to 18 months, depending on application scenarios.
There are trust issues that underlie the model training process, and Gensyn is currently working on generating effective proof to contribute to this process. However, this article will mainly focus on the model inference process. Now let's use the four building blocks of ML to discover underlying trust issues:
The data source is tamper-proof
Private input data is not stolen by the model operators (privacy problem)
The model itself is accurate as advertised.
The computation process is done correctly.
The parameters of the model are not changed or are as advertised.
Model parameters which are valuable assets to the model owner are not leaked during the process (privacy problem)
Some of the above trust issues can be solved simply by going on-chain; uploading the inputs and ML parameters on-chain and calculating the model on-chain can ensure the correctness of input, parameters and model computations. But this method might sacrifice scalability and privacy. Giza is doing this on Starknet, but due to the cost problem, it only supports simple machine learning models like regression and does not support neural networks. ZK technology can solve the above trust issues more efficiently. Currently, ZKML's ZK usually refers to zkSNARK. First, let's quickly go over some basics of zkSNARKs:
A zkSNARK proof is proof that I know some secret inputs w such that the result of this computation f is OUT is true without telling you what the w is. The proof generation process can be summed up into several steps:
Formulating a statement that needs to be proved: f(x,w)=true
"I have correctly classified this image x using an ML model f with private parameters w."
Converting the statement into a circuit（Arithmetization）: different circuit construction approaches include the R1CS, QAP, Plonkish, etc.
ZKML needs an additional step called quantization compared to other use cases. Neural network inference is typically done in floating-point arithmetic, which is extremely expensive to emulate in the prime field of arithmetic circuits. Different quantization methods are a trade-off between accuracy and equipment requirements.
Some circuit construction methods like R1CS are not efficient for neural networks. This part can be adjusted to improve performance.
Generating a proving key and a verification key
Creating a witness: when w=w*,f(x,w)=true
Creating a hash commitment: The witness w* is committed to using a cryptographic hash function to generate a hash value. This hash value can then be made public.
It helps ensure the private inputs or model parameters have not been tampered with or modified during the computation process. This step is essential as even minor modifications can have significant impacts on the model's behavior and output.
Generating proof: Different proof systems use different proof generation algorithms.
Special zero-knowledge rules need to be designed for machine learning operations, such as matrix multiplication and convolutional layers, allowing for efficient protocols with sub-linear time for these computations.
The general zkSNARK systems like groth16 might not be able to process neural networks efficiently as the calculation workload is too much.
Since 2020, many new ZK proof systems have emerged to optimize the ZK proof for the model inference process, including vCNN, ZEN, ZKCNN, and pvCNN. However, most of them are optimized for CNN models. They can only be applied to some primary datasets, such as MNIST or CIFAR-10.
In 2022, Daniel Kang Tatsunori Hashimoto, Ion Stoica, and Yi Sun(founder of Axiom) proposed a new proof scheme based on Halo2, which first achieved ZK proof generation for the ImageNet dataset. Their optimization mainly falls on the arithmetization part, with novel lookup arguments for non-linearities and the reuse of sub-circuits across layers.
Modulus Labs is benchmarking different proof systems for on-chain inference and finds that in terms of proof time, ZKCNN and plonky2 perform the best; in terms of peak prover memory usage, ZKCNN and halo2 perform well; while plonky, although performing well, sacrifices memory consumption and ZKCNN is only usable for CNN models. It is also developing a new zkSNARK system, especially for ZKML with a new virtual machine.
Verifying the proof: The verifier uses the verification key to verify without requiring the knowledge of the witness.
We can therefore demonstrate that applying zero-knowledge techniques to a machine-learning model can solve a lot of its trust issues. Similar techniques that use interactive verification can achieve similar results but will require more resources on the verifier side and potentially face more privacy issues. It is worth noting that depending on the exact model, generating proofs for them can be time and resource-consuming, so there will be compromises in various aspects when this technique is eventually implemented in real-world use cases.
Next, what's on the table? Bear in mind that there are many reasons why model providers may not want to generate a ZKML proof. For those who are brave enough to try ZKML and when the solutions make sense to be implemented, they can choose from a few different solutions according to where their models and inputs lie:
If the input data is on-chain, Axiom could be considered as a solution:
Axiom is building a ZK co-processor for Ethereum to improve user access to blockchain data and provide more sophisticated views of on-chain data. It is feasible to do reliable machine learning computation on on-chain data:
First, Axiom imports on-chain data by storing Merkle roots of Ethereum block hashes in its smart contract AxiomV0, which are trustlessly verified through a ZK-SNARK verification process. The AxiomV0StoragePf contract then allows for batch verification of arbitrary historic Ethereum storage proofs against the root of trust given by block hashes cached in AxiomV0.
Next, the ML input data can be pulled from imported historical data.
Then Axiom can apply verified machine learning operations on top; The validity of each piece of computing is verified using optimized halo2 as the backend.
Finally, Axiom accompanies the result of each query with the zk proof, and the Axiom smart contract would verify the zk proof. Any related parties who want the proof can access it from the smart contract.
If the model is put on-chain, RISC Zero could be considered as a solution:
The RISC Zero ZKVM is a RISC-V virtual machine that produces zero-knowledge proofs of the code it executes. Using the ZKVM, a cryptographic receipt is produced, which anyone can verify was produced by the ZKVM's guest code. No additional information about code execution (such as, for example, the inputs provided) is revealed by publishing the receipt.
By running a machine learning model in RISC Zero's ZKVM, it can be proven that the exact computation involved in the model is carried out. The computation and the verification process can be done either off-chain in a user's preferred environment, or in the Bonsai Network, which is a universal roll-up.
First, the model's source code would need to be compiled into a RISC-V binary file. When this binary file is executed inside the ZKVM, the output is paired with a computational receipt that contains a cryptographic seal. This seal serves as a zero-knowledge argument of computational integrity and links the cryptographic imageID (which identifies the RISC-V binary file that was executed) to the asserted code output in a way that third parties can quickly verify.
When the model executes inside the ZKVM, the computation about state change is done entirely within the VM. It does not leak any information about the model's internal state to external parties.
Once the model has finished executing, the resultant seal serves as a zero-knowledge proof of computational integrity.
The exact process of generating the ZK proof involves an interactive protocol with a random oracle as the verifier. The seal on a RISC Zero receipt is essentially the transcript of this interactive protocol.
If you want to directly import a model from commonly-used ML software like Tensorflow or Pytorch, ezkl could be considered as a solution:
Ezkl is a library and command-line tool for making inferences for deep learning models and other computational graphs in a zkSNARK.
First, export the final model as a .onnx file and some sample inputs to a .json file.
Then, point ezkl to the .onnx and .json files to generate a ZK-SNARK circuit with which you can prove ZKML statements.
Looks simple, right? It is ezkl's goal to offer an abstraction layer that allows for higher-level operations to be called and laid out in a Halo 2 circuit. Ezkl abstracts away a lot of complexity while staying incredibly flexible. Their quantization model has a scale factor for automated quantization. They support a flexible change to other proving systems as new solutions emerge. They also support several types of virtual machines, including EVM and WASM.
Regarding the proving system, ezkl customs halo2 circuits through aggregation proofs (transform hard-to-verify into easy-to-verify through an intermediary) and recursion(can solve the memory problem, but hard to fit in halo2). Ezkl also optimizes the whole process using fusion and abstraction (can reduce overhead through high-level proof)
It is also worth noticing that, compared to other general-purpose zkml projects, Accessor Labs focuses on providing zkml tooling specially designed for fully onchain gaming, and might involve AI NPC, automatic updates of gameplay, game interface involving natural language, etc.
Solving ML's trust issues with ZK technology means it can now be applied to many more "high-stakes" and "highly deterministic" use cases rather than just keeping up with people's conversations or telling the pictures of cats apart from pictures of dogs. Web3 is already exploring a lot of these use cases. This is no coincidence as most Web3 applications run or intend to run on a blockchain because of the specific nature of a blockchain that can run securely, is difficult to tamper with, and has deterministic computations. A verifiably well-behaving AI should be an AI capable of conducting activities in a trustless and decentralized environment, right?
Many Web3 applications have been sacrificing user experiences for the sake of security & decentralization since that's obviously their priority, and limitations of infrastructures also exist. AI/ML has the potential to enrich user experiences, which will certainly help but previously seemed impossible without compromise. Now, thanks to ZK, we can comfortably see the marriage of AI/ML with Web3 applications without too much sacrifice on security & decentralization.
Essentially, it would be a Web3 application (which may or may not exist when writing this article) implementing ML/AI in a trustless way. By a trustless manner, we mean whether it's operated in a trustless environment/platform or its operation is provably verifiable. Note that not all ML/AI use cases (even in Web3) are needed or preferred to run in a trustless way. We will analyze each part of the ML functionality used in various Web3 fields. Then, we will identify the parts that require ZKML, usually the high-value parts that people are willing to pay extra money for proof.
Most use cases/applications mentioned below are still in the experimental research phase. Hence, they are still far from actual adoption. We will discuss why later.
Defi is one of the few demonstrated product market fit for blockchain protocols and Web3 applications. Being able to create, store and manage wealth & capital in a permissionless way is unprecedented in human history. We have identified many use cases where an AI/ML model would need to operate permissionless to ensure security & decentralization.
Risk Assessment: Modern finance requires AI/ML models for all kinds of risk assessment, from preventing fraud and money laundering to giving out uncollateralized loans. Ensuring this AI/ML model functions in a verifiable way means we can prevent them from being manipulated into censorship, which hinders the permissionless nature of using Defi products.
Asset Management: automated trading strategy has not been new for Tradfi and Defi. Efforts to apply AI/ML-generated trading strategies have been made, but only a few decentralized ones have been successful. Current typical applications of the defi sector include the rocky bot experimented with by Modulus Labs.
The Rocky Bot： Modulus Labs created a trading bot on StarkNet using AI for decision-making.
An L1 contract that holds funds and exchanges WEth / USDC on Uniswap.
An L2 contract implements a simple (but flexible) 3-layer neural network to predict future WEth prices. The contract uses historical WETH price information as input.
A simple frontend for visualization and PyTorch code for training both regressors and classifiers.
Automated MM and liquidity provision: this is in essence a combination of similar efforts conducted in risk assessment and asset management, just done in a different manner when it comes to volume, timeline, and asset types. There have been numerous papers on how ML can be used in market making in the equities market. It might only be a matter of time before some of them are applicable to Defi products.
Warp.cc team has developed a tutorial project on how to deploy a smart contract that runs a trained neural network to predict Bitcoin's price. This falls into the "input" and "model" parts of our framework as the input is fed with the RedStone Oracles feed, and the model is executed as a Warp smart contract on Arweave.
It’s first iteration and ZK is involved so it falls into our honorable mention, but in the future Warp team considers implementing a ZK piece.
Gaming intersects with machine learning a lot:
The grey area in the figure represents our initial assessment of whether the ML functionality in the gaming section needs to be paired with corresponding ZKML proofs. Leela Chess Zero is a very interesting example of applying ZKML to gaming:
Leela Chess Zero (LC0): a fully on-chain AI chess player built by Modulus Labs, playing against a collective of human players from the community.
LC0 and the human collective take turns playing (as it should be in chess).
LC0's move is calculated using a simplified, circuit-friendly LC0 model.
LC0's move has a Halo2 snark proof generated to ensure no human mastermind intervention. Only the simplified LC0 model is there to make the decision.
This fits into the "model" part. The model's execution has a ZK-proof to verify that the computation is not tampered with.
Data Analysis & Prediction: this has been a common use of AI/ML in the Web2 gaming world. However, we find there to be very few reasons to implement ZK into this ML process. For the sake of not having too much value directly involved in the process, it might not be worth the effort. However, if some analytics & predictions are used to determine rewards for users, ZK might be implemented to ensure the results are correct.
AI Arena is an Ethereum native game where players all over the world can design, train and battle NFT characters powered by artificial neural networks. Talented researchers from around the world compete to create the best Machine Learning (ML) model to battle in the game. AI Arena focuses on feedforward neural networks. In general, they have lower computational overhead than convolutional neural networks (CNNs) or recurrent neural networks (RNNs). Though, for now, models are only uploaded to the platform after it's trained, hence it's honorably mentioned.
GiroGiro.AI is building an AI toolkit that enables the masses to create artificial intelligence for personal or commercial use. Users can create various sorts of AI systems based on an intuitive and automated AI workflow platform. Only with a few inputs of data and a choice of algorithms (or models for refinement), users will generate and harness the AI model in mind. Though the project is in a very early stage, we are excitedly anticipating to see what GiroGiro can bring onto the table due to its focus on the gamefi and metaverse-focused products, hence its presence in honorable mentions.
DID and Social
In the realm of DID&social, web3 and ml’s intersection currently mainly lies in the proof of humanity and proof of credentials field; other parts might evolve but will take longer.
Proof of humanity
Worldcoin uses a device called the Orb to determine if someone is an actual living person that is not attempting to defraud the verification. It does this with a variety of camera sensors and machine-learning models that analyze facial and iris features. Once that determination is made, the Orb takes a set of pictures of the person’s irises and uses several machine learning models and other computer vision techniques to create an iris code, which is a numerical representation of the most important features of an individual’s iris pattern. Specific steps to sign up are as follows:
The user generates a Semaphore keypair on her phone and presents the hashed public key (via QR code) to the Orb.
The Orb scans the user’s irises and locally computes the user’s IrisHash. It then sends a signed message containing the hashed public key and the IrisHash to the sign-up sequencer node.
The sequencer node verifies the Orb’s signature, then checks if the IrisHash does not match any already in the database. If the uniqueness checks passes, the IrisHash and the public key are saved.
Worldcoin uses the open-source Semaphore zero-knowledge proof system to transfer the uniqueness of IrisHashes to the uniqueness of user accounts, without ever linking them. This ensures a newly signed-up user can successfully claim his/her WorldCoins. The steps are as follows:
The user’s app locally generates a wallet address.
The app uses Semaphore to prove that it owns the private counterpart to one public key registered previously. Because it’s zero-knowledge proof, it does not reveal which public key.
The proof is sent again to the sequencer, which verifies it and initiates the deposit of tokens to the provided wallet address. A so-called nullifier is sent along with the proof and ensures the user cannot claim the reward twice.
Proof of action
Personalized recommendations and content filtering
Twitter recently open-sourced their algorithm for the "For You" timeline, but users cannot verify if the algorithm is running correctly as the weights for the ML models used to rank tweets are kept private. This has resulted in concerns about bias and censorship.
However, Daniel Kang, Edward Gan, Ion Stoica, and Yi Sun offer a solution using ezkl to help balance privacy and transparency by producing proof that the Twitter algorithm ran honestly without revealing the model weights. By using the ZKML framework, Twitter can commit to a specific version of its ranking model and publish proof that it produces the specific final output ranking for a given user and tweet. This solution enables users to verify that the computations were performed correctly without requiring trust in the system. While there is still work to be done to make ZKML more practical, this is a positive step toward improving transparency in social media. Thus this falls into our ML trust framework's "model" part.
It can be seen that the potential use cases for ZKML in web3 are still in their infancy but can not be ignored; in the future, as the use of ZKML continues to expand, there may be a demand for ZKML providers, forming the closed loop in the following figure:
ZKML service providers largely focus on the "model" and "parameter" part of the ML trust framework. Though most we see nowadays are more "model" related than "parameters". Note that the "input" and "output" parts are more addressed by blockchain-based solutions, either used as data sources or data destinations. ZK or blockchain alone might not achieve full trustworthiness, but probably united they do.
Finally, we can look at the current feasibility status of ZKML and how far we are from a large-scale application of ZKML.
Modulus Labs's paper has given us some data and insights into the feasibility of ZKML applications by testing Worldcoin (with strict precision and memory requirements) and AI Arena (with cost-effectiveness and time requirements):
If Worldcon uses ZKML, the prover's memory consumption will overwhelm any commercially available mobile hardware. If AI Arena's tournament uses ZKML, using ZKCNNs would increase the time and cost to 100x (0.6s vs original 0.008s). So sadly, both are not feasible for directly applying ZKML technology to prove time and prover memory usage.
What about the proof size and verification time? We can refer to Daniel Kang, Tatsunori Hashimoto, Ion Stoica, and Yi Sun's paper. As shown below, their DNN inference solution can achieve up to 79% accuracy on ImageNet（ model type: DCNN,16 layers,3.4 million parameters ） while simultaneously taking as few as 10s and 5952 bytes to verify. Furthermore, the zkSNARKs can be scaled down to take as few as 0.7s to verify at 59% accuracy. These results show the feasibility of zkSNARKing ImageNet-scale models in terms of proof size and verification time.
The main technical bottleneck now lies in proof of time and memory consumption. It is still not technically feasible to apply ZKML in web3 cases. Can ZKML potentially catch the development of AI? We can compare several empirical data:
The development speed of ML models: the GPT-1 model released in 2019 had 150 million parameters, whereas the most recent GPT-3 model released in 2020 has 175 billion parameters, representing a 1,166x increase in the number of parameters in just two years.
The optimization speed of the ZK system: The performance growth of ZK systems basically follows "Moore's Law"-like paces. New ZK systems come out almost yearly, and we expect prover performance's rocket growth to continue for a while.
The outlook is not very optimistic when comparing the improvement rate of cutting-edge ML to ZK. However, with the continuous improvement of rollup performance, ZK hardware, and the tailor-made ZK proving systems based on highly structured neural network operations, hopefully, the development of ZKML can satisfy web3 needs and start by providing some old-fashioned machine learning functions first.
Though we might have a hard time using blockchain + ZK to verify that the information ChatGPT feeds me is trustworthy, we might be able to fit some smaller and older ML models into ZK circuits.
"Power tends to corrupt, and absolute power corrupts absolutely". With the incredible power of AI & ML, there is no surefire way to put it under governance for now. The governments have repeatedly proven to provide either late interference for aftermaths or early outright ban. Blockchain + ZK delivers one of the few solutions that can tame the beast in a provable and verifiable way.
We look forward to seeing more product innovation in the field of ZKML, where ZK and blockchain provide a secure and trustworthy environment for AI/ML to run. We also expect brand new business models to arise from those product innovations as we are not constrained by the go-to SaaS commercialization model here in the permissionless world of crypto. We look forward to supporting more builders to come and build out their exciting ideas in this fascinating overlap of the "wild west anarchies" and "ivory tower elites".
We are still early, but we might save the world on the way.