Introduction
Heurist is a Zero-Knowledge (ZK) Layer 2 network specifically crafted for decentralized hosting and inference of AI models. Unlike traditional platforms, it operates on a distributed network of compute providers, enabling seamless, serverless access to open-source AI models. This approach mirrors the functionality of Hugging Face, a renowned open-source AI model marketplace. However, Heurist distinguishes itself through its unique ownership structure: the network is collectively owned by its users, promoting a more democratic and decentralized model of operation.
The primary ambition of the network is to democratize access to AI technologies by achieving several key objectives: facilitating easy and affordable access to AI models, enhancing transparency across the board, and significantly reducing the biases inherent in AI. This strategy is designed to lower barriers to entry for AI utilization, ensuring a more equitable distribution of AI benefits across various sectors and communities.
Closed-source AI
The internal 'settings' of a model, known as parameters, play a critical role in neural networks. 'Weights' are coefficients applied to the input data; they determine the strength of connections between units across the various layers of the model and are adjusted during training to reduce prediction errors. 'Biases' are constants added just before the activation function; they shift its output, allowing the model to produce meaningful outputs even when all input values are zero, and give it additional flexibility in fitting patterns.
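To make the distinction concrete, here is a minimal NumPy sketch of a single dense layer; the specific numbers are purely illustrative:

```python
import numpy as np

def layer_forward(x, W, b):
    """One dense layer: the weights scale each input's influence, and the
    bias shifts the pre-activation before the nonlinearity is applied."""
    z = W @ x + b                # weighted sum of inputs plus bias
    return np.maximum(z, 0.0)    # ReLU activation

W = np.array([[0.5, -0.2],
              [0.1,  0.8]])      # connection strengths, learned in training
b = np.array([0.3, -0.1])        # per-unit constants, also learned
x = np.zeros(2)                  # an all-zero input

print(layer_forward(x, W, b))    # [0.3 0. ] -- non-zero output thanks to the bias
```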
Proprietary, closed-source models, such as the GPT series developed by OpenAI, keep their training data and architectural details confidential, making the precise configurations of their parameters proprietary. This exclusivity grants the model's owner full authority over its utilization, development, and deployment. Such control can introduce several centralizing influences during the creation of a model:
Censorship - Model owners have the autonomy to dictate the nature of content produced or analyzed by the model, incorporating filters to exclude specific topics, keywords, or concepts. This functionality serves multiple purposes, such as sidestepping contentious issues, adhering to legal mandates, or ensuring alignment with a company's ethical standards and commercial objectives. Since ChatGPT's introduction, there has been a noticeable trend towards more restricted outputs, diminishing its utility in certain contexts. A notable illustration of this trend is observed in China, where interactions with a WeChat chatbot built on OpenAI's core model are heavily censored: it refrains from addressing inquiries about Taiwan or questions related to Xi Jinping. A Wall Street Journal journalist used adversarial techniques to reveal that the chatbot was intentionally programmed to evade topics deemed politically sensitive by the Chinese government or the Communist Party of China.
Bias - In neural networks, weights and biases play a crucial role but can also unintentionally introduce bias, especially if the training data is not diverse. Weights adjust connection strengths between neurons and might emphasize or overlook specific features, which can result in a bias of omission, where essential details or patterns in less represented data are missed. Bias terms, intended to improve learning efficiency, can cause the model to prefer certain types of data over others if they are not adequately adjusted for a wide range of inputs. The proprietary nature of these models exacerbates the issue: because the training data cannot be audited, important patterns related to certain groups or situations may be excluded without outside scrutiny, skewing predictions and reinforcing biases in the outputs. Certain perspectives, voices, or information may thus be underrepresented or inaccurately portrayed. An illustrative case of model owner-induced bias and censorship is Google's advanced language model, Gemini, whose image generation feature was paused in early 2024 after tuning intended to promote diversity produced historically inaccurate depictions.
Verifiability - In a closed-source framework, it is challenging for users to verify whether the model version claimed to be in use, such as GPT-4 rather than GPT-3, truly matches the service provided. The lack of transparency regarding the model's architecture, parameters, and training data prevents external validation. This opacity obscures whether users are benefiting from the latest technological enhancements or whether outdated technology is being misrepresented as new, potentially compromising the AI service's effectiveness and quality. In scenarios like assessing an applicant's eligibility for a loan through an AI model, for instance, questions arise about whether the same model is applied consistently across different applicants, and whether the model strictly adheres to the designated inputs without deviation, highlighting the importance of transparency in ensuring fairness and reliability in AI applications.
Dependency, lock-in and stagnation - Entities that depend on proprietary AI platforms or models are often at the mercy of the corporations controlling these technologies, resulting in a monopolistic aggregation of power that hinders open innovation. This dependence emerges as these corporations can limit access or modify the model at any time, significantly affecting those who rely on these tools for development. Historically, several examples illustrate this trend: Facebook, which initially promoted open development through its public APIs to encourage innovation, later restricted access to emerging competitors like Vine. Voxer, a messaging app that became popular in 2012 for its integration with Facebook to find friends, was cut off from Facebook's 'Find Friends' feature. This phenomenon isn't confined to Facebook; numerous platforms that start with an open-source or open innovation philosophy often shift towards prioritizing shareholder value, to the detriment of their users. For instance, Apple's App Store enforces a 30% commission on app-generated revenues. Similarly, Twitter, which once championed openness and interoperability with the RSS protocol, moved to prioritize its centralized database in 2013, severing ties with RSS. This shift resulted in the loss of data ownership and access to one's social network. Amazon has faced accusations of exploiting its internal data to create and favor its products over third-party sellers. These instances highlight a consistent pattern where platforms transition from open ecosystems to more controlled, centralized frameworks, adversely affecting innovation and the wider digital community.
Privacy - The owners of these centralized models, large corporations such as OpenAI, retain all rights to use prompts and user data to further train their models, which greatly inhibits user privacy. For example, Samsung employees inadvertently exposed highly confidential information by using ChatGPT for assistance with their projects. The organization permitted its semiconductor division engineers to use the AI tool for debugging source code, which led to the accidental disclosure of proprietary information, including the source code of unreleased software, internal discussion notes, and details about their hardware. Given that ChatGPT collects and uses the data inputted into it for its learning processes, Samsung's trade secrets were unintentionally shared with OpenAI.
The Rise of Open-Source AI
Open-source AI is defined by its transparency and accessibility, featuring openly available model parameters and clear disclosures about the datasets (often including the raw data itself) used in pre-training. This approach allows developers, researchers, and users to inspect, modify, and enhance AI models, fostering a collaborative environment that accelerates innovation and improvement. Open-source AI projects disclose the architecture of their models, enabling a deeper understanding of how they operate and the basis of their decision-making processes. By revealing the datasets used for pre-training, these projects also allow users to assess the diversity, breadth, and potential biases within the training data, contributing to more ethical and unbiased AI systems. This openness not only democratizes access to cutting-edge technology but also encourages a global community of contributors to identify flaws, suggest improvements, and adapt the technology for varied applications, ensuring the AI's continuous evolution and relevance.
Despite the approximately seven-year lead that closed-source AI development had over its open-source counterparts, the landscape is rapidly changing. Currently, many open-source language models not only match but occasionally surpass the performance of GPT-3.5. Moreover, in specific areas, these open-source models achieve performance levels comparable to GPT-4. This development signals a significant shift in the AI field, where open-source initiatives are closing the gap with well-funded, proprietary systems. The success of these open-source models can be attributed to a combination of factors, including the collaborative nature of open-source projects, which accelerates innovation and improvement; the availability of extensive datasets and advanced computing resources; and a growing community of developers dedicated to enhancing AI accessibility and capabilities.
In the realm of image generation, the Stable Diffusion models developed by Stability AI have emerged as the leading open-source text-to-image model family, demonstrating quality on par with, and cost-efficiency superior to, their closed-source rival, OpenAI's DALL-E 2. A distinctive advantage of the Stable Diffusion models is the public accessibility of their weights, which empowers artists and developers to fine-tune the models for specific visual styles. This level of customization and adaptability is notably absent in OpenAI's DALL-E models, highlighting a significant flexibility that Stable Diffusion offers in creative and development workflows.
Hugging Face has become the epicenter of a Cambrian explosion in open-source AI innovation, dramatically democratizing the ability of individuals and organizations to host and leverage open-source models for a wide range of inference tasks. The platform's growth in hosted models, datasets, and applications has been meteoric: from fewer than 5,000 models in 2020, Hugging Face has grown to a staggering 574,737 models as of the latest count. This surge reflects not just the escalating interest and investment in AI but also underscores Hugging Face's pivotal role in providing unprecedented access to cutting-edge AI tools and fostering a vibrant, collaborative ecosystem for AI research and development.
Seeing this exponential growth of open-source AI, Heurist is looking to build the Hugging Face of Web3.
Protocol Overview
The Heurist protocol connects various participants, each playing a role in maintaining a decentralized model inference protocol through the coordination of distributed compute. The network participants are:
Consumers: Consumers engage with the Heurist protocol to perform inference tasks, such as generating text or images, utilizing a selection of AI models hosted on the platform. They benefit from a pay-as-you-go model for the computational resources utilized.
Miners/Model Hosts: Individuals with GPU resources can earn Heurist Tokens by hosting AI models on the protocol. They run model operations on their hardware and are compensated through user payments and Heurist Token distributions for completed inference tasks. To ensure a commitment to high-quality service, miners are required to stake a predetermined quantity of tokens.
Model Creators: The dynamism of the Heurist ecosystem is propelled by AI model creators. By uploading their AI models to the protocol's model registry, they gain a share of the transactions made by users. This arrangement motivates creators to innovate and produce more sophisticated models to meet the escalating demands of users.
Application Integrators: These participants develop user-facing interfaces that leverage Heurist’s AI models, including but not limited to chatbots, AI agents, and image generation applications hosted online, as well as SDKs for web service integration. Application integrators receive a portion of the fees in Heurist Tokens when consumers transact through their applications.
Validators: Validators are essential for upholding the Heurist network's integrity and reliability. They conduct regular verifications of the data output by miners to ensure its validity. Miners found to be delivering inaccurate or fraudulent data are penalized: a portion of their staked tokens is confiscated and awarded to the validator who detected the misconduct.
Protocol Mechanics
The Heurist protocol facilitates open access to inference while preserving user privacy via the following mechanism (a code sketch of the full flow follows these steps):
Each miner (model host) generates a unique public-private key pair.
Miners publish their public keys, making them accessible to users.
When accessing inference, a user generates a symmetric encryption key to securely encrypt the input data (their prompt) destined for inference.
The symmetric key is then encrypted using the public key of each miner the user intends to interact with. Should a user choose N miners, this encryption process is repeated N times, individually for each miner's public key.
The user compiles the encrypted data (their prompt), the collection of encrypted symmetric keys, and their public key into a request. This request is then disseminated across the network.
Upon receiving a user's request, a miner utilizes its private key to decrypt the symmetric key.
The miner subsequently uses the symmetric key to decrypt the user's original input data.
Once the user's input data is decrypted, the miner proceeds to execute the model inference task by running the user's unencrypted prompt through the correct model.
After finishing the task, the miner encrypts the model output data with the user's public key.
The encrypted output is then published to the network, where only the original user can decrypt it with their private key.
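The steps above describe standard hybrid encryption. The following is a minimal, illustrative Python sketch using RSA-OAEP for key wrapping and Fernet (an AES-based scheme) for the symmetric layer; the document does not specify Heurist's actual cryptographic primitives, so the particular algorithms here are assumptions:

```python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.fernet import Fernet

# RSA-OAEP padding, reused for every wrap/unwrap below.
OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Steps 1-2: each miner generates a key pair and publishes its public key.
miner_keys = [rsa.generate_private_key(public_exponent=65537, key_size=2048)
              for _ in range(3)]
miner_pubs = [k.public_key() for k in miner_keys]

# Step 3: the user creates a symmetric key and encrypts the prompt with it.
sym_key = Fernet.generate_key()
encrypted_prompt = Fernet(sym_key).encrypt(b"Describe a sunset over the ocean.")

# Step 4: the short symmetric key is wrapped once per chosen miner (N = 3 here).
wrapped_keys = [pub.encrypt(sym_key, OAEP) for pub in miner_pubs]

# Steps 6-8: a miner unwraps the symmetric key, recovers the prompt, and
# would then run it through the requested model.
recovered_key = miner_keys[0].decrypt(wrapped_keys[0], OAEP)
prompt = Fernet(recovered_key).decrypt(encrypted_prompt)
assert prompt == b"Describe a sunset over the ocean."

# Steps 9-10: the miner encrypts the model output for the user. RSA-OAEP can
# only wrap short payloads, so a real implementation would wrap a fresh
# symmetric key instead; a short output is encrypted directly here for brevity.
user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
encrypted_output = user_key.public_key().encrypt(b"<model output>", OAEP)
assert user_key.decrypt(encrypted_output, OAEP) == b"<model output>"
```

The hybrid design matters because asymmetric encryption is slow and limited to short payloads: the potentially long prompt is encrypted once with a fast symmetric key, and only that short key is wrapped N times, once per miner.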
Token Economics
The HUE token (not currently live) has a dynamic supply, influenced by both emission and burning mechanisms inherent in the protocol, with the maximum supply capped at 1 billion tokens. The exact distribution and launch plan are currently TBD, but so far we know the following:
5% of total supply will be rewarded to testnet miners.
2% of total supply will be governed by Heurist Imagineries NFT holders
There will be a DeFi-inspired staking mechanism to align the interests of token holders with miners
Mining
Mining Process: Users have the opportunity to mine HUE tokens by utilizing their GPUs to host AI models.
Staking Criteria: Activating a mining node necessitates staking a minimum of 10,000 HUE or esHUE tokens. Falling below this threshold renders the node inactive, unable to generate rewards.
Mining Rewards: The process of mining awards esHUE tokens, which are automatically added to the miner node's stake. The reward magnitude is influenced by several factors, including the efficiency of the GPU, its availability (uptime), the specific AI model in operation, and the cumulative stake within a mining node.
Enhanced POW Mining: For those staking between 10,000 and 100,000 HUE tokens, mining efficiency improves in direct proportion to the staked amount (sketched just after this list).
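A hypothetical sketch of how these rules might combine into a single efficiency multiplier; the 10,000 and 100,000 HUE thresholds and the linear proportionality come from the text above, while the 1x-10x scale is an assumption for illustration:

```python
def mining_multiplier(stake_hue: float) -> float:
    """Hypothetical efficiency curve: nodes below the 10,000 HUE minimum are
    inactive, and the boost grows linearly with stake up to the 100,000 HUE
    ceiling. Only the thresholds and linearity come from the protocol text."""
    MIN_STAKE, BOOST_CAP = 10_000, 100_000
    if stake_hue < MIN_STAKE:
        return 0.0                                  # inactive node: no rewards
    return min(stake_hue, BOOST_CAP) / MIN_STAKE    # 1.0x up to a 10.0x cap

print(mining_multiplier(9_999))    # 0.0  (below the activation threshold)
print(mining_multiplier(55_000))   # 5.5
print(mining_multiplier(250_000))  # 10.0 (boost capped at 100,000 HUE)
```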
Staking
Staking Mechanism: Users can stake either HUE or esHUE tokens within mining nodes.
Staking Rewards: The rewards from staking are dispensed in either HUE or esHUE, depending on the token variant staked. Staking esHUE tokens yields a higher return than staking HUE.
Withdrawal Lock Period: There is a 30-day withdrawal lock period for unstaking HUE tokens. In contrast, unstaking esHUE tokens is not subject to any lock period.
Vesting Scheme: Rewards earned in esHUE can be vested into HUE over a span of one year on a linear schedule (a worked example follows this list).
Stake Transferability: Stakeholders have the liberty to transfer their stake in HUE or esHUE between mining nodes instantly, fostering a dynamic environment that encourages competition and flexibility among miners.
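As a worked example of the linear vesting schedule, here is a small sketch; a 365-day year is assumed, since the document only states "one year":

```python
def vested_hue(es_hue: float, days_elapsed: float) -> float:
    """Linear vesting of esHUE rewards into HUE over an assumed 365-day year."""
    VESTING_DAYS = 365
    fraction = min(max(days_elapsed, 0.0) / VESTING_DAYS, 1.0)
    return es_hue * fraction

print(vested_hue(1_200, 90))    # ~295.89 HUE vested after 90 days
print(vested_hue(1_200, 400))   # 1200.0 -- fully vested after a year
```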
Incentivised Testnet Information
Starting on the 1st of April 2024, Heurist's incentivised testnet goes live. Heurist has earmarked 5% of the total HUE token supply to serve as rewards for mining activities. Participants will earn these rewards in the form of points, which can be converted into fully liquid HUE tokens at the mainnet's TGE. This conversion opportunity will become available immediately following the conclusion of the incentivised testnet phase, providing a direct pathway for participants to claim their earnings.
Rewards are segmented into two distinct categories, reflecting the type of AI model being provided by the miner (the point arithmetic is sketched after the list):
Llama Points, allocated to miners of LLMs. Each Llama Point is awarded for the processing of 1000 input/output tokens by the Mixtral 8x7b model.
Waifu Points, allocated to miners utilizing Stable Diffusion models. Each Waifu Point is awarded for generating one 512x512 image via the Stable Diffusion 1.5 model, achieved through 20 iteration steps.
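The base point arithmetic reduces to the following sketch; rates for models other than the two named above, as well as rounding behavior, are not specified, so the fractional accrual shown here is an assumption:

```python
def llama_points(tokens_processed: int) -> float:
    """1 Llama Point per 1,000 input/output tokens served via Mixtral 8x7b."""
    return tokens_processed / 1_000

def waifu_points(images_generated: int) -> int:
    """1 Waifu Point per 512x512 Stable Diffusion 1.5 image at 20 steps."""
    return images_generated

print(llama_points(45_500))   # 45.5 Llama Points
print(waifu_points(12))       # 12 Waifu Points
```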
Compute providers hosting the various LLMs will need hardware with the following minimum GPU RAM and will earn the following number of Llama Points per 1,000 tokens:
The HUE allocation ratio of Llama Points to Waifu Points will be finalized as Heurist approaches TGE, based on an analysis of demand and usage patterns for the two model types in the upcoming months. This approach ensures that the distribution of rewards accurately reflects the value and contribution of each mining activity within our ecosystem.
Compute Provision Specifications
To participate as a compute provider in the incentivised testnet, the following GPUs are recommended for hosting either LLMs or Stable Diffusion models:
Nirmaan Partnership
The cornerstone of crypto AI networks is computing power: the compute needed to run inference on computationally demanding models, and the compute required to execute a model and generate a cryptographic proof verifying its correct execution. High-performance GPUs are essential to the operation of such networks, yet not everyone has access to this high-cost hardware or the technical know-how to run it with high performance and uptime.
We at Nirmaan are democratizing access to compute and are excited to enter a strategic partnership with Heurist, providing our miner-as-a-service middleware product to Heurist users who wish to provide compute to earn rewards.
Nirmaan aggregates the most cost-effective compute from web2 and web3 providers, utilizing our partnerships with the largest compute providers in India to secure cheap, effective compute that we can provision to Heurist.
We will initially offer and manage NVIDIA RTX A6000 GPUs to run LLMs on the Heurist network.