Nirmaan's AI Thesis

Introduction

Artificial intelligence, broadly the ability of machines to perform cognitive tasks, has quickly become an essential technology in our day-to-day lives. The breakthrough came in 2017, when the transformer architecture was developed for neural machine translation, in which a model takes an input sentence and produces a translated output sentence. The same architecture has since enabled neural networks to take text, speech, or images as input, process them, and produce output.

OpenAI and DeepMind pioneered this technology, and more recently OpenAI's GPT (Generative Pre-trained Transformer) models created a eureka moment for AI with the proliferation of LLM chatbots. GPT-1 was introduced in June 2018 with a model composed of twelve transformer layers. It used a technique called "masked self-attention" across twelve attention heads, allowing it to understand and interpret language more effectively. Rather than plain gradient descent, GPT-1 was trained with the Adam optimization algorithm, with its learning rate gradually increasing and then decreasing in a controlled manner. Overall, it contained 117 million adjustable elements, or parameters, which helped refine its language processing capabilities.

GPT 1 Architecture

Fast forward to March 14th, 2023, when OpenAI released GPT-4, which reportedly features approximately 1.8 trillion parameters spread across 120 layers (OpenAI has not published these figures). The increase in parameters and layers enhances its ability to understand and generate more nuanced and contextually relevant language, among other things. An increase of more than 10,000x in the number of parameters in OpenAI’s GPT models in under 5 years shows the astounding rate of innovation happening at the cutting edge of generative models.

[insert performance data]

Regulation

Running parallel to this innovation, and underpinning the AI stack, is regulation. Whenever a transformative technology comes to market, regulators introduce laws and processes so that they can better control it. Almost prophetically, we saw this play out in 1991 when Joe Biden, then chairman of the Senate Judiciary Committee, proposed a bill to ban encryption of email. This potential ban on code and mathematics inspired Phil Zimmermann to build the open source Pretty Good Privacy (PGP) program, which enabled users to communicate securely by encrypting and decrypting messages, authenticating messages through digital signatures, and encrypting files. The United States Customs Service went on to open a criminal investigation into Zimmermann for allegedly violating the Arms Export Control Act, as it regarded his PGP software as a munition and wanted to limit the access of citizens and foreign entities to strong cryptography.

Reminiscent of the email encryption bill, on the 30th of October 2023 Joe Biden, now the President of the United States, signed a Presidential Executive Order on “Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence”. The order invokes the Defense Production Act (DPA), which affords the President a broad set of authorities to ensure the country has the resources necessary for national security. Broadly, the order seeks to establish new standards for AI safety and security. It imposes Know Your Customer (KYC) style requirements on compute providers and requires reporting when foreign entities train large AI models on US soil or in US data centers. On top of this, it brings models that contain “tens of billions of parameters” into scope for additional oversight, which puts permissionless, openly released models close to the threshold; for reference, Mistral-7B-v0.1 has 7 billion parameters. We are also witnessing this play out with hardware, as the US recently prohibited the sale of semiconductor chips above a certain capability threshold to China, Russia and other nations.

Model Generation

On top of the centralizing regulatory pressures that artificial intelligence faces, there are a number of centralizing forces throughout the creation of a model. The creation of an AI model, particularly large-scale models like those used in natural language processing, typically follows three main phases: pre-training, fine-tuning, and inference. We will walk through each phase and the centralizing forces that are present:

Pre-Training

The pre-training phase is the initial step where the model learns a wide range of knowledge and skills from a large and diverse dataset. Before the advent of transformer-based architectures, top-performing neural models in natural language processing (NLP) primarily used supervised learning, which required vast quantities of curated and manually labeled data that resided mostly within corporate boundaries. This dependence on supervised learning limited their usefulness on datasets lacking extensive annotations and created a centralizing force, due to the prohibitive costs of employing skilled researchers and developers to produce the labeled data. During this pre-transformer stage, supervised pre-training of models was dominated by centralized entities like Google, who had the resources to fund this work. The advent of transformer-based architectures, among other advancements, contributed significantly to the rise of unsupervised learning, particularly in the field of natural language processing, enabling models to be trained on datasets without predefined labels or annotated outcomes.

Data Collection & Preparation

The first step in pre-training a model is gathering the data that the model will be trained on. A large and diverse data set is collected from a vast corpus of text such as books, websites and articles. The data is then cleaned and processed.

Tokenization involves breaking down text data into smaller units, or tokens, which may range from words to parts of words, or even individual characters, based on the model's architecture. Following this, the data undergoes formatting to make it comprehensible to the model. This typically includes transforming the text into numerical values that correspond to the tokens, such as through the use of word embeddings.
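The tokenization and formatting steps above can be sketched in a few lines. This is a toy illustration under stated assumptions: it splits on whitespace, whereas production language models use learned subword tokenizers, and the function names, the example sentence and the embedding size are all hypothetical.

```python
# Toy illustration only: real LLMs use learned subword tokenizers,
# not whitespace splitting. All names here are hypothetical.
import random

def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for subword tokenization)."""
    return text.lower().split()

def build_vocab(tokens):
    """Assign each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def make_embeddings(vocab_size, dim, seed=0):
    """Create a randomly initialised embedding table: one vector per token id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

text = "the model reads the text"
tokens = tokenize(text)
vocab = build_vocab(tokens)
token_ids = [vocab[t] for t in tokens]           # text -> numerical ids
embeddings = make_embeddings(len(vocab), dim=4)  # assumed embedding size
vectors = [embeddings[i] for i in token_ids]     # ids -> vectors fed to the model

print(tokens)     # ['the', 'model', 'reads', 'the', 'text']
print(token_ids)  # [0, 1, 2, 0, 3]
```

Note that both occurrences of "the" map to the same id and therefore to the same embedding vector; during training those vectors would be learned rather than left random.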

Model Architecture

Selecting the right model architecture is a crucial step in the development process, tailored to the specific application at hand. For instance, transformer-based architectures are frequently chosen for language models due to their effectiveness in handling sequential data. Alongside choosing an architecture, it's also important to set the initial parameters of the model, such as the weights within the neural network. These parameters serve as the starting point for training and are adjusted throughout training to optimize the model's performance.

Training Procedure

Using the cleaned and processed data, the model is fed a large amount of text and learns patterns and relationships in order to make predictions about that text. During training, two key components are used to dial in the parameters of the model so that it produces accurate results. The first is the learning algorithm:

The learning algorithm in neural network training prominently involves backpropagation, a fundamental method that propagates the error (defined as the difference between the predicted and actual outputs) back through the network layers. This identifies the contribution of each parameter, such as a weight, to the error. Backpropagation involves gradient calculation, where the gradient of the error with respect to each parameter is computed using the chain rule. Taken together, these gradients form a vector that indicates the direction of the greatest increase of the error function.
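As a minimal sketch of the gradient calculation described above, the following applies the chain rule by hand to a tiny one-hidden-unit network and checks the result against a finite-difference estimate. The network shape, the tanh activation, the squared-error loss and all parameter values are assumptions chosen for illustration.

```python
import math

def forward(params, x):
    """Tiny network: y_hat = w2 * tanh(w1 * x + b1) + b2."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    return h, w2 * h + b2

def loss(params, x, y):
    """Squared error between prediction and target."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2

def backprop(params, x, y):
    """Propagate the error backwards, layer by layer, using the chain rule."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_yhat = y_hat - y             # dL/dy_hat
    d_w2 = d_yhat * h              # output-layer weight
    d_b2 = d_yhat                  # output-layer bias
    d_h = d_yhat * w2              # error passed back to the hidden unit
    d_pre = d_h * (1 - h ** 2)     # tanh'(z) = 1 - tanh(z)^2
    d_w1 = d_pre * x               # hidden-layer weight
    d_b1 = d_pre                   # hidden-layer bias
    return [d_w1, d_b1, d_w2, d_b2]

def numerical_grad(params, x, y, eps=1e-6):
    """Central finite differences, used only to sanity-check backprop."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grads

params = [0.5, -0.2, 1.5, 0.1]     # hypothetical starting weights
analytic = backprop(params, x=0.8, y=1.0)
numeric = numerical_grad(params, x=0.8, y=1.0)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_diff)
```

The two gradient estimates should agree to within numerical noise, which is a common way to check a hand-written backward pass.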

Additionally, Stochastic Gradient Descent (SGD) is employed as an optimization algorithm to update the model's parameters, aiming to minimize the error. SGD updates parameters for each training example or small batches thereof, moving in the opposite direction of the error gradient. A critical aspect of SGD is the learning rate, a hyperparameter that influences the step size towards the loss function's minimum. A very high learning rate can cause overshooting of the minimum, while a very low rate can slow down the training process significantly.
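The effect of the learning rate can be seen in a small sketch of SGD fitting a single weight to data generated by y = 2x. The data, the three learning rates and the function name are hypothetical, chosen only to show convergence, overshooting and slow progress.

```python
import random

def sgd_fit(data, lr, epochs=20, seed=0):
    """Fit y = w * x with per-example SGD, visiting examples in shuffled order."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x, y in samples:
            grad = (w * x - y) * x   # d/dw of 0.5 * (w*x - y)^2
            w -= lr * grad           # step against the gradient
    return w

data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]   # true weight is 2

w_good = sgd_fit(data, lr=0.01)   # settles close to 2
w_high = sgd_fit(data, lr=0.5)    # overshoots on every step and diverges
w_low = sgd_fit(data, lr=1e-5)    # barely moves from its starting value

print(w_good, w_high, w_low)
```

With the middle setting the weight lands near 2 within 20 passes over the data, while the other two runs illustrate the failure modes described above.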

Furthermore, the Adam optimizer, an enhancement over SGD, is used for its efficiency in maintaining a separate effective learning rate for each parameter. It adjusts these rates based on estimates of the first moment (a running average of recent gradients) and the second moment (a running average of the squared gradients). Adam's popularity stems from its ability to achieve better results more quickly, making it well suited to large-scale problems with extensive datasets or numerous parameters.
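A minimal sketch of one Adam update follows, using the commonly cited default decay rates (0.9 and 0.999). The test function, the learning rate and the helper names are assumptions for illustration. It shows the per-parameter behaviour: two parameters whose gradients differ by a factor of a million still take steps of almost the same size.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update in place and return the new step count."""
    t += 1
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g        # first moment: average gradient
        v[i] = beta2 * v[i] + (1 - beta2) * g * g    # second moment: average squared gradient
        m_hat = m[i] / (1 - beta1 ** t)              # bias correction for early steps
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return t

def grads_of(params):
    """Gradient of the hypothetical function f(a, b) = 1000*a^2 + 0.001*b^2."""
    a, b = params
    return [2000.0 * a, 0.002 * b]

params = [1.0, 1.0]
m, v, t = [0.0, 0.0], [0.0, 0.0], 0
t = adam_step(params, grads_of(params), m, v, t)

first_steps = [1.0 - p for p in params]   # distance each parameter moved
print(first_steps)                        # both are close to lr = 0.1
```

Plain SGD with a single learning rate would move the first parameter a million times further than the second on this function, which is the imbalance Adam's per-parameter scaling is designed to remove.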

The second key component of the training phase is the loss function, also known as a cost function. It quantifies the difference between the expected output and the model's predictions, and serves as the measure of error that the training algorithm minimizes. Common loss functions include Mean Squared Error (MSE), typically used in regression problems, which computes the average of the squares of the differences between actual and predicted values. In classification tasks, Cross-Entropy Loss is often employed. This function measures the performance of a classification model by comparing the predicted probabilities (values between 0 and 1) with the true labels, penalizing confident wrong predictions most heavily. During the training process, the model generates predictions, the loss function assesses the error, and the optimization algorithm subsequently updates the model's parameters to reduce this loss. The choice of loss function is pivotal, significantly influencing the training's efficacy and the model's ultimate performance. It must be carefully selected to align with the specific objectives and nature of the problem at hand.
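The two loss functions named above can be written out directly. This is a minimal sketch with made-up numbers; the function names are illustrative and not taken from any particular library.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(true_class, probs):
    """Negative log of the probability the model assigned to the correct class."""
    return -math.log(probs[true_class])

# Regression: two exact predictions and one that is off by 2.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Classification: the correct class is index 0 in both cases.
ce_unsure = cross_entropy(0, [0.5, 0.25, 0.25])
ce_confident = cross_entropy(0, [0.9, 0.05, 0.05])

print(mse, ce_unsure, ce_confident)
```

The cross-entropy value falls as the model puts more probability on the correct class, which is the signal the optimizer uses to improve a classifier or a next-token predictor.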

Resource Allocation

Resource allocation during the pre-training phase of AI models, particularly for large-scale models like those in the GPT series, necessitates a careful and substantial deployment of both computational and human resources. This phase is pivotal as it establishes the groundwork for the model's eventual performance and capabilities. The pre-training of these complex AI models demands an extensive amount of computational power, primarily sourced from Graphics Processing Units (GPUs) or Tensor Processing Units (TPUs), which are specialized for handling the intense parallel processing tasks typical in machine learning. To address the considerable computational needs, a distributed computing approach is often adopted, utilizing multiple GPUs or TPUs across various machines or data centers in tandem to process the vast amounts of training data and update the model parameters efficiently.
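One common form of the distributed approach described above is data parallelism, sketched below in plain Python. Each simulated worker stands in for a GPU that computes a gradient on its own shard of the data, and the average of those gradients stands in for the all-reduce step that real clusters perform over the network. The model, data and function names are hypothetical.

```python
def gradient(w, shard):
    """Mean gradient of 0.5 * (w*x - y)^2 over one shard of (x, y) pairs."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def split(data, n_workers):
    """Deal the dataset out to the workers in round-robin fashion."""
    return [data[i::n_workers] for i in range(n_workers)]

def data_parallel_step(w, data, n_workers, lr):
    """One training step: local gradients, averaged, then a single shared update."""
    shards = split(data, n_workers)
    local_grads = [gradient(w, s) for s in shards]   # in practice, one per GPU
    avg_grad = sum(local_grads) / n_workers          # stand-in for an all-reduce
    return w - lr * avg_grad

data = [(float(x), 2.0 * x) for x in range(1, 9)]    # 8 examples, true weight 2
w_parallel = data_parallel_step(0.0, data, n_workers=4, lr=0.01)
w_single = 0.0 - 0.01 * gradient(0.0, data)          # same step on one machine

print(w_parallel, w_single)
```

With equally sized shards the averaged gradient matches the single-machine gradient, so the extra hardware buys speed without changing the update itself.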

Moreover, the significant volume of data required for pre-training, potentially reaching petabytes, necessitates robust storage solutions for both the raw and processed data formats. The energy consumption during this phase is notably high due to the prolonged operation of high-performance computing hardware, prompting a need to optimize computational resource use to strike a balance between performance, cost, and environmental impact. The financial aspects also play a critical role, as the acquisition and maintenance of necessary hardware, alongside the electricity for powering and cooling these devices, entail substantial costs. Furthermore, many organizations turn to cloud computing services to access the needed computational resources, adding a variable cost based on usage rates. In fact, when asked at an MIT event, Sam Altman said that GPT-4 cost “more than $100 million” to train.

Fine-Tuning

The next stage in the creation of a model is fine-tuning. The pre-trained model undergoes adaptation to excel in specific tasks or with certain datasets that were not part of its initial training regimen. This phase takes advantage of the broad capabilities acquired during pre-training, refining them for superior performance in more focused applications, such as text classification, sentiment analysis, or question-answering. Fine-tuning involves preparing a smaller, task-specific dataset that reflects the nuances of the intended application, modifying the model's architecture to suit the task's unique output requirements, and adjusting parameters, including adopting a lower learning rate for more precise, targeted optimization. The model is then retrained on this curated dataset, which may involve training only the newly adjusted layers or the entire model, depending on the task's demands.
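The option of training only the newly adjusted layers can be sketched with a two-parameter toy model: a frozen "pre-trained" weight followed by a trainable task head. The weights, the task data and the function name are hypothetical, and a real fine-tuning run would update millions of parameters rather than one.

```python
def finetune_head(base_w, head_w, data, lr=0.01, epochs=30):
    """Train only the task head; the pre-trained base weight stays frozen."""
    for _ in range(epochs):
        for x, y in data:
            features = base_w * x              # frozen, pre-trained layer
            error = head_w * features - y
            head_w -= lr * error * features    # gradient step on the head only
    return base_w, head_w

pretrained_base_w = 2.0                                 # assumed result of pre-training
task_data = [(x, 6.0 * x) for x in (1.0, 2.0, 3.0)]     # small task-specific dataset

base_w, head_w = finetune_head(pretrained_base_w, head_w=1.0, data=task_data)
print(base_w, head_w)
```

The base weight is returned unchanged while the head converges towards 3, so the combined model computes 6x on the new task without disturbing what was learned in pre-training.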

Following the initial pre-training and fine-tuning phases, models, particularly those akin to OpenAI's GPT-3, may undergo Reinforcement Learning from Human Feedback (RLHF) as an additional refinement step. This advanced training approach integrates supervised fine-tuning with reward modeling and reinforcement learning, leveraging human feedback to steer the model towards outputs that align with human preferences and judgments. This process begins with fine-tuning on a dataset of input-output pairs to guide the model towards expected outcomes. Human annotators then assess the model's outputs, providing feedback that helps to model rewards based on human preferences. A reward model is subsequently developed to predict these human-given scores, guiding reinforcement learning to optimize the AI model's outputs for more favorable human feedback. RLHF thus represents a sophisticated phase in AI training, aimed at aligning model behavior more closely with human expectations and making it more effective in complex decision-making scenarios.

Inference

The inference stage marks the point where the model, after undergoing training and possible fine-tuning, is applied to make predictions or decisions on new, unseen data. This stage harnesses the model's learned knowledge to address real-world problems across various domains. The process begins with preparing the input data to match the training format, involving normalization, resizing, or tokenizing steps, followed by loading the trained model into the deployment environment, whether it be a server, cloud, or edge devices. The model then processes the input to generate outputs, such as class labels, numerical values, or sequences of tokens, tailored to its specific task. Inference can be categorized into batch and real-time, with the former processing data in large volumes where latency is less critical, and the latter providing immediate feedback, crucial for interactive applications. Performance during inference is gauged by latency, throughput, and efficiency—key factors that influence the deployment strategy, choosing between edge computing for local processing and cloud computing for scalable resources. However, challenges such as model updating, resource constraints, and ensuring security and privacy remain paramount.

Centralizing Forces Within Model Generation

In the process of creating an AI model, numerous centralizing and monopolistic forces come into play. The significant resources needed for every phase of development pave the way for economies of scale, meaning that efficiency improvements tend to concentrate superior models in the hands of a select few corporations. Below, we detail the diverse mechanisms through which AI centralization occurs:

Pre-Training

As we have seen, the pre-training phase of a model combines a few things: data, training and resources. When it comes to the data collection, there are a number of issues:

Access to data

The pre-training phase requires a large corpus of data, typically from books, articles, corporate databases and scraping the internet. As we discussed, when supervised learning dominated as a training technique, large companies like Google could create the best models due to the large amount of data they were able to store from users interacting with their search engine. We see a similar centralizing and monopolistic force throughout AI today. Large companies such as Microsoft, Google and OpenAI have access to the best data through data partnerships, in-house user data or the infrastructure required to create an industrial internet scraping pipeline. For example, leaked documents suggest OpenAI is preparing to purchase user data from Tumblr and WordPress, at the expense of users' privacy.

The top 1% of x networks, facilitates x proportion of the total traffic / volume. Source Chris Dixon's "Read Write Own".

Transformers enabled unsupervised learning, but scraping web data is no easy feat: websites typically ban scraper IP addresses and user agents, and employ rate limits and CAPTCHA services.

AI companies deploy a variety of tactics to navigate around the barriers websites put in place to obstruct data collection efforts. One common method involves utilizing a diverse array of IP addresses to sidestep IP-based rate limiting or outright bans, often achieved through the use of proxy servers or VPN services. Additionally, altering the user-agent string in HTTP requests—a technique known as User-Agent Spoofing—allows these companies to emulate different browsers or devices, thereby potentially circumventing blocks aimed at user-agent strings typically associated with automated bots or scrapers. Furthermore, to overcome CAPTCHA challenges, which are frequently employed by websites to prevent automated data collection, some AI companies turn to CAPTCHA solving services. These services are designed to decode CAPTCHAs, enabling uninterrupted access to the site's data, albeit raising questions about the ethical implications of such practices.

Beyond their ability to gather large amounts of data, big corporations also have the financial means to build strong legal teams. These teams work tirelessly to help them collect data from the internet and through partnerships, as well as to obtain patents. We can see this happening today with OpenAI and Microsoft, who are in a legal dispute with The New York Times. The issue is over the use of The New York Times' articles to train the ChatGPT models without permission.

Patent Centralization. Source: Statista

Closed source data

There are also ethical and bias considerations involved in training a model. All data has some inherent bias attached to it. Since AI models learn patterns, associations, and correlations from their training data, any inherent biases in this data can be absorbed and perpetuated by the model. Common biases in AI models result from sample bias, measurement bias and historical bias, and can lead to models producing poor or unintended results. For example, Amazon trained an automated recruitment model designed to assess candidates based on their fit for different technical positions. The model developed its criteria for evaluating suitability by analyzing resumes from past applicants. However, since the data set it was trained on consisted predominantly of resumes from men, the model learned to penalize resumes that included the word “women’s”.

Resource allocation

As we have discussed, pre-training of foundation models requires large cycles of GPU compute, costing hundreds of millions of dollars for the top models (OpenAI reportedly lost around $540 million in 2022 while developing its GPT models). Demand for accessible and usable GPUs vastly outstrips current supply, and this has consolidated the pre-training of models within the largest and most well-funded tech companies (FAANG, OpenAI, Anthropic) and data centers.

Although corporations keep details of their data centers and operations somewhat secret, for a variety of reasons: security, regulatory compliance, customer data protection & competitive advantages we can see that the top 5 cloud & data center providers

We have learned that models improve logarithmically with training size and therefore, in general, the best models are the ones trained with the most GPU compute cycles. Economies of scale and the productivity gains enjoyed by large incumbent tech and data companies are thus a strongly centralizing force in the pre-training of models, and we are seeing this play out with OpenAI, Google, Amazon, Microsoft and Meta dominating.

Source: Epochai

The concentration of the power to develop transformative artificial intelligence technologies within a small number of large corporations, such as OpenAI, Google, and Microsoft, prompts significant concerns. As articulated by Facebook's first president, the primary objective of these platforms is to capture and retain as much of our time and conscious attention as possible. This reveals a fundamental misalignment of incentives when interacting with Web2 companies, an issue we have begrudgingly accepted due to the perceived benefits their services bring to our lives. However, transplanting this oligopolistic Web2 model onto a technology that is far more influential than social media, and that holds the capacity to profoundly influence our decisions and experiences, presents a concerning scenario. A perfect example of this is the Cambridge Analytica scandal of the 2010s. The British firm gathered personal data from up to 87 million Facebook users without authorization in order to build a profile of each user before serving them targeted political ads to influence elections. This data aided the 2016 U.S. presidential campaigns of Ted Cruz and Donald Trump, and was implicated in interference in the Brexit referendum. If a tool as powerful as AI falls under the control of a few dominant players, it risks amplifying the potential for misuse and manipulation, raising ethical, societal, and governance issues.

GPU Supply-Side Centralization

Because models scale logarithmically with training size, demand for GPU compute must grow exponentially to achieve linear gains in model quality. We have seen this play out over the last 2 years, with demand for GPU compute skyrocketing after the launch of ChatGPT and the start of the AI race. If we take Nvidia’s revenue as a proxy for GPU demand, we see that Nvidia’s quarterly revenue increased 405% from Q4 2022 to Q4 2023.
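The relationship between logarithmic scaling and exponential compute demand can be made concrete with a small sketch. It assumes a purely illustrative law of the form quality = a + b * log10(compute); the constants a and b are hypothetical and not fitted to any real model.

```python
import math

def quality(compute, a=0.0, b=1.0):
    """Assumed scaling law: quality grows with the logarithm of compute."""
    return a + b * math.log10(compute)

def compute_needed(target_quality, a=0.0, b=1.0):
    """Invert the law: compute required to reach a given quality level."""
    return 10 ** ((target_quality - a) / b)

# Each +1 step in quality multiplies the required compute by the same factor.
for q in range(1, 6):
    print(q, compute_needed(q))

step_ratio = compute_needed(3) / compute_needed(2)
print(step_ratio)
```

Under these assumed constants every additional unit of quality costs ten times more compute than the last, which is the dynamic that favors the best-funded buyers of GPUs.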

Source: Nvidia reports

The production of GPUs and microchips for AI training is an extremely complex and expensive process, with high barriers to entry. As such, few companies can produce hardware that delivers the performance companies like OpenAI require to train their GPT models. The largest of these GPU companies is Nvidia, which holds approximately 80% of the global market for GPU chips. Founded in 1993 to create graphics hardware for video games, Nvidia quickly became a pioneer in high-end GPUs and made its seminal step into AI in 2006 with the launch of its Compute Unified Device Architecture (CUDA), a platform for general-purpose parallel processing on GPUs.

The hardware used to train a model is vital and, as we have discussed, its costs are extremely high. To compound the barriers to entry, access to this hardware is currently extremely limited, with only top tech companies receiving their orders in a timely manner. Normal people like you or me cannot buy the latest and greatest H100 Tensor Core GPU from Nvidia. Nvidia works directly with Microsoft, Amazon, Google and others to facilitate large bulk orders of GPUs, leaving regular people at the bottom of the waitlist. We have seen a number of initiatives between chip manufacturers and large corporations to create the infrastructure required to train and provide inference for these models, for example:

  1. OpenAI - In 2020, Microsoft built a supercomputer exclusively for OpenAI to train its GPT models. It is a single system with more than 285,000 CPU cores, 10,000 Nvidia V100 and A100 GPUs and 400 gigabits per second of network connectivity for each GPU server.

  2. Microsoft - In 2022, Nvidia partnered with Microsoft to create Eagle, a 1,123,200-core supercomputer utilizing Microsoft's Azure cloud technology. Eagle is now the 3rd largest supercomputer in the world, with a maximum performance of 561 petaFLOPS generated from 14,400 Nvidia H100 GPUs and Intel’s Xeon Platinum 8480C 48C CPUs.

  3. Google - In 2023, Google announced the A3 supercomputer, purpose-built for AI & ML models. A3 combines Nvidia’s H100 GPUs with Google’s custom-designed 200 Gbps Infrastructure Processing Units (IPUs), allowing the A3 to host up to 26,000 H100 GPUs.

  4. Meta - By year end 2024, Meta expects to operate some 350,000 Nvidia H100 GPUs and, counting older GPUs such as the Nvidia A100s used to train Meta’s LLaMA models, the equivalent of 600,000 H100s of compute.

Source: Statista

The value of these feats of engineering for training models is immediately apparent. The large number of GPUs allows for parallel processing, greatly speeding up AI training and making it possible to create large models. Take Microsoft's Eagle supercomputer, for example: using the MLPerf benchmarking suite, this system completed the training benchmark for a GPT-3 LLM with 175 billion parameters in just 4 minutes. The 10,752 H100 GPUs used in this run significantly speed up the process by leveraging their parallel processing capabilities, specialized Tensor Cores for deep learning acceleration, and high-speed interconnects like NVLink and NVSwitch. These GPUs' large memory bandwidth and capacity, along with optimized CUDA and AI frameworks, facilitate efficient data handling and computations. Consequently, this setup enables distributed training strategies, allowing for simultaneous processing of different model parts, which drastically reduces training times for complex AI models.

Scale records on the model GPT-3 (175 billion parameters) from MLPerf Training v3.0 in June 2023 (3.0-2003) and Azure on MLPerf Training v3.1 in November 2023 (3.1-2002). Source: Microsoft

We have clearly established, then, that the powerhouse behind the training of these large models is compute power, primarily in the form of GPUs. The centralizing forces we run into here are twofold:

  1. Exclusivity - Nvidia GPUs have a huge waitlist & monopolistic corporations bulk order GPUs with priority over smaller orders / individuals.

  2. Costs - The sheer cost of these GPU configurations means only a small set of entities worldwide can train these models. For reference, each Nvidia H100 costs between $30,000 and $40,000, meaning Meta’s compute infrastructure would cost between $10.5 billion (350,000 H100s at $30,000 each) and $24 billion (600,000 H100 equivalents at $40,000 each).
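The cost range quoted above follows from simple multiplication, reproduced here as a check. The GPU counts and unit prices are the figures given in the text; pricing every H100 equivalent as if it were a new H100 is a simplifying assumption.

```python
h100_count = 350_000          # H100 GPUs Meta expects to operate
h100_equivalents = 600_000    # total compute expressed in H100 equivalents
price_low, price_high = 30_000, 40_000   # quoted price range per H100, in USD

low_estimate = h100_count * price_low            # fewest GPUs at the lowest price
high_estimate = h100_equivalents * price_high    # most GPUs at the highest price

print(f"${low_estimate / 1e9:.1f}B to ${high_estimate / 1e9:.1f}B")  # $10.5B to $24.0B
```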

Supercomputer Geographical Centralization. Source: Wikipedia TOP500

Amid the consolidation of computational power by major corporations, there is a parallel and strategic push by leading nations to enhance their computational capabilities, mirroring the intense competition of the Cold War's nuclear arms race. These countries are crafting and implementing comprehensive AI strategies, accompanied by a suite of regulatory measures aimed at securing technological supremacy. Notably, a Presidential executive order now requires reporting when foreign entities train large AI models on U.S. infrastructure. Additionally, export restrictions on microchips are set to hinder China's efforts to expand its supercomputing infrastructure, showcasing the geopolitical maneuvers to maintain and control the advancement of critical technologies.

Chip Manufacturing

Whilst Nvidia and other semiconductor companies are at the cutting edge of chip design, they outsource all of their manufacturing to other corporations. Taiwan serves as the global hub for microchip production, accounting for more than 60% of the world's semiconductors and over 90% of the most sophisticated ones. The majority of these chips are produced by the Taiwan Semiconductor Manufacturing Company (TSMC), the sole manufacturer of the most advanced semiconductors. Nvidia’s partnership with TSMC is fundamental to the company's success and to the efficient production of H100 GPUs. TSMC distinguishes itself in the semiconductor industry with its advanced chip packaging patents, utilizing high-density packaging technology that stacks chips in three dimensions to enhance performance. This technology is crucial for producing chips designed for intensive data processing tasks, such as AI, enabling faster operation.

Whilst microchip production is currently running at maximum capacity, it is at risk from increased military threats from China towards Taiwan, a democratic island claimed by Beijing despite Taipei's vehement opposition. Geopolitical tensions in the region have heightened, and worldwide we are seeing AI tensions rise, with the US banning certain microchip exports to China so as not to strengthen China’s AI capabilities and military. Should China take control of Taiwan, it could strategically position itself to dominate microchip manufacturing and thus the AI race.

Fine-Tuning & Closed-source Models

In the fine-tuning stage the model is trained on new, specific datasets, and the internal configurations that allow the model to make predictions or decisions based on input data are altered. These internal configurations are called parameters. In neural networks, ‘weights’ are coefficients applied to input data that determine the connection strength between units across different layers of the model, and they are adjusted throughout training to minimize prediction errors. ‘Biases’ are constants added before the activation function; they ensure the model can produce a useful output even when inputs are zero, and facilitate pattern recognition by shifting where the activation function switches on.
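The role of weights and biases in a single unit can be sketched as follows. The sigmoid activation and the specific numbers are assumptions for illustration; real models use many such units with other activation functions.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

weights = [0.5, -0.3]   # hypothetical connection strengths

out_zero_no_bias = neuron([0.0, 0.0], weights, bias=0.0)   # weights cannot act on zero inputs
out_zero_biased = neuron([0.0, 0.0], weights, bias=2.0)    # the bias alone shifts the output
out_active = neuron([1.0, 2.0], weights, bias=0.0)         # weights scale each input

print(out_zero_no_bias, out_zero_biased, out_active)
```

With all inputs at zero the weights contribute nothing, so only the bias can move the output away from the activation's midpoint of 0.5.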

Closed-source models like OpenAI's GPT series maintain the confidentiality of their training data and model architecture, meaning the specific configurations of their parameters remain exclusive. The owner of this model retains complete control over how it is used, developed and deployed which can lead to a number of centralizing forces within the fine-tuning stage of a model:

  1. Censorship - Owners can decide what types of content the model generates or processes. They can implement filters that block certain topics, keywords, or ideas from being produced or recognized by the model. This could be used to avoid controversial subjects, comply with legal regulations, or align with the company's ethical guidelines or business interests. Since the launch of ChatGPT, its outputs have become increasingly censored and less useful. An extreme case of censorship is showcased in China, where WeChat conversations with Robot (built atop OpenAI’s foundational model) do not answer questions such as “What is Taiwan?” and do not allow users to ask questions about Xi Jinping. In fact, through adversarial bypass techniques, a WSJ reporter was able to get Robot to admit that it was programmed to avoid discussing “politically sensitive content about the Chinese government or Communist Party of China.”

  2. Bias - In neural networks, the role of weights and biases is pivotal, yet their influence can inadvertently introduce bias, particularly if the training data lacks diversity. Weights, by adjusting the strength of connections between neurons, may disproportionately highlight or ignore certain features, potentially leading to a bias of omission where critical information or patterns in underrepresented data are overlooked. Similarly, biases, set to enhance learning capabilities, might predispose the model to favor certain data types if not calibrated to reflect a broad spectrum of inputs. The closed source nature of these models can cause the model to neglect important patterns from specific groups or scenarios, skewing predictions and perpetuating biases in the model's output, meaning certain perspectives, voices or information are excluded or misrepresented. A good example of bias and censorship by the model owner is Google’s latest and greatest LLM, Gemini.

  3. Verifiability - In a closed-source environment, users cannot confirm whether the claimed version of a model, such as GPT-4 versus GPT-3, is actually being used. This is because the underlying model architecture, parameters, and training data are not accessible for external review. Such opacity makes it difficult to ascertain if the latest advancements or features are indeed present or if older technologies are being passed off as newer versions, potentially affecting the quality and capabilities of the AI service received. For example, when an AI model is used to assess an applicant's creditworthiness for a loan, how can the applicant be sure that the same model was run for them as for other applicants? Or how can we be sure the model only used the inputs it was supposed to use?

  4. Dependency, lock-in and stagnation - Entities that rely on closed source AI platforms or models find themselves dependent on the corporations that maintain these services, leading to a monopolistic concentration of power that stifles open innovation. This dependency arises because the owning corporation can, at any moment, restrict access or alter the model, directly impacting those who build upon it. A historical perspective reveals numerous instances of this dynamic: Facebook, which initially embraced open development with its public APIs to foster innovation, notably restricted access to applications like Vine as they gained traction. Similarly, Voxer, a messaging app that gained popularity in 2012 for allowing users to connect with their Facebook friends, lost its access to Facebook's 'Find Friends' feature. This pattern is not exclusive to Facebook; many networks and platforms begin with an open-source or open innovation ethos only to later prioritize shareholder value, often at the expense of their user base. For-profit corporations eventually impose take rates in order to meet their stated goal of creating shareholder value; for example, Apple's App Store imposes a 30% fee on the revenues generated by apps. Another example is Twitter, which, despite its original commitment to openness and interoperability with the RSS protocol, eventually prioritized its centralized database. This led to its disconnection from RSS in 2013 and, with it, users' loss of data ownership and of their social graph. Amazon has also been accused of using its internal data to replicate and prioritize its products over those of other sellers. These examples underscore a trend where platforms evolve from open ecosystems to more controlled, centralized models, impacting both innovation and the broader digital community.

  5. Privacy - The owners of these centralized models, large corporations such as OpenAI, retain all rights to use the prompt and user data to better train their models. This greatly inhibits user privacy. For example, Samsung employees inadvertently exposed highly confidential information by using ChatGPT for assistance with their projects. The organization permitted its semiconductor division engineers to use this AI tool for debugging source code. However, this led to the accidental disclosure of proprietary information, including the source code of upcoming software, internal discussion notes, and details about their hardware. Given that ChatGPT collects and uses the data inputted into it for its learning processes, Samsung's trade secrets were unintentionally shared with OpenAI.

Reinforcement Learning from Human Feedback (RLHF)

Reinforcement Learning from Human Feedback (RLHF) integrates supervised fine-tuning, reward modeling, and reinforcement learning, all underpinned by human feedback. In this approach, human evaluators critically assess the AI's outputs, assigning ratings that facilitate the development of a reward model attuned to human preferences. This process necessitates high-quality human input, highlighting the importance of skilled labor in refining these models. Typically, this expertise tends to be concentrated within a few organizations capable of offering competitive compensation for such specialized tasks. Consequently, corporations with substantial resources are often in a better position to enhance their models, leveraging top talent in the field. This dynamic presents challenges for open-source projects, which may struggle to attract the necessary human labor for feedback without comparable funding or revenue streams. The result is a landscape where resource-rich entities are more likely to advance their AI capabilities, underscoring the need for innovative solutions to support diverse contributions in the development of AI technologies.
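The reward modeling step described above can be sketched in a few lines. This is a minimal illustration, assuming that human ratings have been converted into pairwise preferences (a preferred answer and a rejected answer) and that each answer is summarized by two made-up numeric features. The feature names, the data and the linear model are all hypothetical; real reward models are large neural networks, and the final reinforcement learning step is not shown.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def reward(weights, features):
    # Linear reward model: a weighted sum of the answer's features.
    return sum(w * f for w, f in zip(weights, features))

def train_reward_model(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights so that preferred answers score higher than rejected ones.

    pairs: list of (preferred_features, rejected_features) tuples.
    Minimizes the pairwise loss -log(sigmoid(r_preferred - r_rejected)).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient descent step on the pairwise loss for this pair.
            scale = 1.0 - sigmoid(margin)
            for i in range(n_features):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights

# Hypothetical features per answer: [helpfulness, rudeness], labeled by humans.
preference_pairs = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.0], [0.6, 0.9]),
    ([0.8, 0.2], [0.1, 0.3]),
]
learned = train_reward_model(preference_pairs, n_features=2)
print("learned weights:", learned)
```

Even in this toy setting, the quality of the learned reward depends entirely on the quality of the human comparisons, which is why access to skilled evaluators becomes a competitive advantage.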

Inference

To effectively deploy machine learning (ML) or artificial intelligence (AI) models for user applications, it is imperative to ensure these models are equipped to manage real-world data inputs and provide precise, timely predictions or analyses. This necessitates careful deliberation on two pivotal aspects: the choice of deployment platform and the infrastructure requirements.

Deployment Platform

The deployment platform serves as the foundation for hosting the model, dictating its accessibility, performance, and scalability. Options range from on-premises servers, offering heightened control over data security and privacy, to cloud-based solutions that provide flexible, scalable environments capable of adapting to fluctuating demand. Additionally, edge computing presents a viable alternative for applications requiring real-time processing, minimizing latency by bringing computation closer to the data source. As with the pre-training stage, we run into similar centralisation problems when deploying the model for real world use:

Infrastructure centralisation - The majority of models are deployed on top of high-performance cloud infrastructure, of which there are not many options worldwide. As highlighted earlier, a small set of corporations have the facilities to process inference for these high parameter models and the majority are located in the US (as of 2023, 58.6% of all data centers were located in the USA). This is particularly relevant in light of the presidential executive order on AI and the EU AI Act, as these could greatly limit the number of countries that are able to train and provide inference for complex AI models.

Source: Statista

Costs - Another centralizing force within the inference stage is the significant costs involved in deploying these models on one's own servers, cloud infrastructure, or through edge computing. OpenAI has partnered with Microsoft to utilize Microsoft's Azure cloud infrastructure for serving its models. Dylan Patel, the chief analyst at consulting firm SemiAnalysis, estimated that OpenAI's server costs for enabling inference for GPT-3 were $700,000 per day. Importantly, this was when OpenAI was offering inference for their 175 billion parameter model, so, all things being equal, we would expect this number to have escalated well into the seven figures today. In addition to the geographical and jurisdictional centralization of these data centers, we also observe this necessary infrastructure being consolidated within a few corporations (84.9% of cloud revenues were generated by four companies in 2023).

Source: Amazon, Microsoft, Google, Equinix, Statista

Centralized Frontends

Centralized hosting of frontends involves delivering the user-interface components of websites and web applications from one primary location or a select few data centers managed by a handful of service providers. This method is widely adopted for rolling out web applications, particularly those leveraging AI technologies to offer dynamic content and interactive user experiences. The frontend is therefore susceptible to take-downs through regulations or through changes in the policies of the service providers. We have seen this play out in Mainland China as citizens are blocked from interacting with the frontends of the popular AI interfaces such as ChatGPT and Hugging Face.

Conclusion

In conclusion, we can see the status quo for AI suffers from a number of centralizing and monopolistic forces that enable a minority of the world's largest entities to control and distribute models to the population. As the failures of web2 have shown, the misalignment of incentives between user and corporation poses a dire threat to our freedoms, privacy and right to use AI. The impending regulation surrounding AI and the flourishing open source space show that we are at a pivotal moment in the advancement of the technology and that we should do everything in our power to ensure it remains free and open source for all to use. In our next blog we will cover how crypto at the intersection of AI is enabling these free and open source systems to scale, improving both the status quo and crypto itself.

Part 2

Introduction to Crypto AI

The influence of AI on our world is becoming increasingly evident across various aspects of daily life and industry. From enhancing the efficiency of operations in sectors such as healthcare, finance, and manufacturing to transforming the way we interact with technology through personal assistants and smart devices, AI's impact is profound. In our first report, we covered all the centralizing forces within model creation, culminating in increasing control amassed by major AI providers such as OpenAI and Microsoft. Approaching from a philosophical perspective, AI embodies digital knowledge. Within the vast expanse of the digital domain, knowledge stands out as a prime candidate for decentralization. This article delves into the convergence of AI and cryptocurrency, exploring how permissionless, uncensorable networks for settlement and incentivization can foster the secure and democratic evolution of AI. Additionally, we will scrutinize how AI can contribute to the enhancement of cryptocurrency ecosystems, creating a symbiotic relationship that promotes growth and innovation in both fields.

Pre-Training

As we have discussed extensively in part 1, within the pre-training phase of model generation we encounter multiple centralizing forces, namely:

  1. Closed-source data & data access

  2. Geographical centralization of resources

  3. Resource costs

  4. GPU supply-side exclusivity

Collecting this data is vital for training models. However, the following issues are prevalent:

  1. Data Access - Major firms like Microsoft, Google, and OpenAI have superior data access through partnerships, their own user data, or the capability to establish extensive web scraping operations.

  2. Closed-Source Data - When training models, the data used requires careful consideration of bias, which outsiders cannot assess when the dataset is proprietary.

  3. Data Provenance - Determining and verifying the source of data is becoming increasingly important, as it ensures the integrity and reliability of the data which is crucial when training a model.

Collection & Preparation

Up to 80% of the effort in deploying AI models is dedicated to data preparation. This task becomes more time-consuming and complex with fragmented or unstructured data, with exporting and cleansing being the two critical steps in the process. The competitive landscape of AI is intensifying as major websites with investments or strategic partnerships in centralized AI entities take measures to safeguard their position by restricting smaller contenders' access to vital data. These websites have adopted policies that effectively make data access prohibitively expensive, excluding all but the most well-funded AI laboratories. They frequently employ strategies such as blocking IP addresses from recognized data centers, and in some cases, they engage in intentional data poisoning—a tactic where companies deliberately corrupt shared data sources to disrupt their rivals' AI algorithms.

Valid residential IP addresses and user-agents hold significant value, as they enable the collection of internet data in a way that ensures the retrieval of accurate information. Every ordinary internet user possesses this potential, and if leveraged collectively within a network, it could facilitate the extensive indexing of the web. This, in turn, would empower open-source and decentralized AI initiatives by providing them with the vast datasets necessary for training. Using crypto incentives to accurately reward participation in such a decentralized physical infrastructure network (DePIN) can create a virtuous flywheel that enables the network to compete with the likes of Google and Microsoft, which are among the very few entities to have indexed the internet at this scale:

Source: Messari

The DePIN flywheel works as follows:

  1. Participants contributing to the network's growth are motivated through inflationary token rewards, effectively subsidizing their efforts. These incentives are designed to bolster the network's early development until it can earn steady income from user fees.

  2. The expansion of the network draws in developers and creators of products. Furthermore, the network's financial support for those who supply its services enables them to provide these services at lower costs, which in turn entices end users.

  3. As end users start to pay for the services offered by the network, the income for both the providers and the network itself rises. This increase in revenue generates a positive feedback loop, drawing in additional providers and investors to the network.

  4. Because the network is user owned, its value can be distributed back to users, typically via a token burn model or through distribution of revenues. With these models, as the network becomes more useful and tokens are either removed from circulation in a burn model or staked by users, the value of the tokens tends to go up. This increase in token value further encourages more providers to join the network, perpetuating a beneficial cycle.
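The supply side of this flywheel can be illustrated with a toy simulation. The sketch below assumes a network that mints a decaying amount of reward tokens each period (step 1) and burns a share of the fees paid by users (step 4). Every number and parameter name here is made up for illustration, and the model tracks circulating supply only; it says nothing about token price.

```python
def simulate_supply(initial_supply, emission, emission_decay,
                    fees, fee_growth, burn_share, periods):
    """Track circulating supply under decaying emissions and a fee burn.

    All quantities are denominated in tokens. Returns a list of
    (minted, burned, supply) tuples, one per period.
    """
    supply = initial_supply
    history = []
    for _ in range(periods):
        minted = emission            # inflationary rewards paid to providers
        burned = fees * burn_share   # share of user fees removed from circulation
        supply += minted - burned
        history.append((minted, burned, supply))
        emission *= emission_decay   # subsidies taper off over time
        fees *= fee_growth           # usage grows as the network matures
    return history

# Hypothetical parameters: subsidies shrink 20% per period, fees grow 30%.
history = simulate_supply(
    initial_supply=1_000_000, emission=50_000, emission_decay=0.8,
    fees=5_000, fee_growth=1.3, burn_share=0.5, periods=12,
)
for period, (minted, burned, supply) in enumerate(history):
    print(f"period {period:2d}: minted {minted:9.0f}  burned {burned:9.0f}  supply {supply:11.0f}")
```

With these assumed parameters the network is net inflationary at first and becomes net deflationary once fee burns overtake emissions, which is the transition from subsidy to steady income that the flywheel relies on.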

Utilizing DePIN to compile publicly accessible data could address the problem of proprietary datasets in AI, which often embed biases in the resultant models. Training open-source AI models on data from such a network would enhance our ability to detect, assess, and correct biases. Currently, the opacity surrounding the datasets used for training AI models hinders our comprehension of the biases they may contain, a problem compounded by the difficulty of comparing models trained on different datasets. A decentralized data network of this kind could incentivise various contributors to provide datasets with clear provenance, while also enabling the tracking of how these datasets are utilized in both the initial training and subsequent fine-tuning of foundational AI models.

Grass

Grass is one such network focused on data acquisition, cleaning and provenance. Functioning similarly to traditional residential proxy services, Grass harnesses the untapped potential of users' idle bandwidth for operations such as web scraping. By installing a Google Chrome extension, users contribute to the Grass network whenever they are online. This system repurposes any surplus bandwidth for designated tasks, like the extraction of large corpora of texts, such as philosophical texts. Utilizing residential proxies, the Grass network navigates around common obstacles such as rate limits, blocks, and data poisoning attacks. This approach allows Grass to efficiently gather substantial volumes of online data in its intended format, optimizing the process of data extraction.

On top of enabling streamlined data acquisition, a key advantage of Grass lies in the compensation model: users are awarded the full value of their bandwidth contribution, rather than just a small portion of the proceeds.

Data Provenance

Led by AI, data generation is increasing exponentially, from 2 zettabytes in 2010 to an estimated 175 zettabytes in 2025. Forecasts predict a surge to over 2,000 zettabytes by 2035, a more than tenfold increase in the decade after 2025. This is in part due to the creation of AI-generated content. A Europol report on deepfakes and AI-generated content estimated that such content could account for as much as 90% of information on the internet in a few years’ time, as ChatGPT, Dall-E and similar programs flood language and images into online space.
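As a quick sanity check on these figures, the growth multiple and the implied compound annual growth rate can be worked out directly. The sketch below uses only the three data points quoted above and treats "over 2,000 zettabytes" as exactly 2,000, so the 2035 results are lower bounds.

```python
def growth_multiple(start: float, end: float) -> float:
    return end / start

def annual_growth_rate(start: float, end: float, years: int) -> float:
    # Compound annual growth rate implied by growing from start to end.
    return (end / start) ** (1 / years) - 1

# Data volumes in zettabytes, as quoted in the text (2035 taken as 2,000).
volumes_zb = {2010: 2, 2025: 175, 2035: 2000}

for first, last in [(2010, 2025), (2025, 2035)]:
    multiple = growth_multiple(volumes_zb[first], volumes_zb[last])
    rate = annual_growth_rate(volumes_zb[first], volumes_zb[last], last - first)
    print(f"{first}-{last}: {multiple:.1f}x overall, about {rate:.1%} per year")
```

On these numbers, the forecast for 2025 to 2035 implies roughly 11x growth, or a little under 28% per year compounded.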

Source: Statista digital economy compass 2019

As discussed in part 1, the inherent biases in an AI model's outputs are often a reflection of the data on which it was trained. Consider the potential pitfalls of using data harvested by industrial-scale internet scrapers for pre-training: with the proliferation of AI-generated content online, there's a heightened risk of feeding models with inaccurate or skewed data. A clear manifestation of this issue is observed in Google's Gemini LLM, which, in its quest for equitable representation, might introduce historically inaccurate elements into generated content. For instance, it might produce an image of the Founding Fathers of America that includes a diverse range of ethnicities, diverging from historical accuracy. Therefore, the provenance of data is crucial in the training of models. Currently, we are compelled to trust that the proprietary datasets large corporations employ for model training are both accurate and genuinely sourced, rather than being generated by another AI model.

However, there are a number of crypto projects on the market today that address data provenance. Decentralized data storage solutions, such as Filecoin, guarantee data provenance through the use of blockchain technology. This technology creates a clear, unchangeable ledger that records data storage, access, and any alterations over time, ensuring transparency and immutability in the data's history. By enabling individuals to offer their unused storage space, Filecoin creates a vast, decentralized network of data storage providers. Each transaction on Filecoin, from the initial agreement between data owners and storage providers to every instance of data access or modification, is permanently recorded on the blockchain. This creates an indelible and transparent history of data interactions, making it straightforward to track the data's storage, access, and movement. Furthermore, Filecoin employs cryptographic proofs to guarantee the integrity and immutability of the data stored within its network. Storage providers are required to periodically demonstrate, via verifiable proofs, that they are faithfully storing the data as per the contractual agreement, adding an extra layer of transparency and enhancing the overall security of the data stored. The clear data provenance and immutable ledger have started to attract a number of respected institutions, and we are seeing entities such as NASA, the University of California, the National Human Genome Research Institute and the National Library of Medicine utilize this storage solution. As a direct result, Filecoin is starting to facilitate more and more deals larger than 1,000 tebibytes.
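The idea of an unchangeable ledger of data events can be sketched with a simple hash chain. This is only an illustration of why tampering with recorded history is detectable; it is not Filecoin's actual protocol or its storage proofs, and the function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record

def entry_hash(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_event(ledger: list, event: str, data: bytes) -> None:
    """Append a record that commits to the data and to the previous record."""
    body = {
        "event": event,
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "prev_hash": ledger[-1]["hash"] if ledger else GENESIS,
    }
    ledger.append({**body, "hash": entry_hash(body)})

def verify(ledger: list) -> bool:
    """Recompute every hash and check that each record links to the one before."""
    prev = GENESIS
    for record in ledger:
        body = {key: record[key] for key in ("event", "data_sha256", "prev_hash")}
        if record["prev_hash"] != prev or record["hash"] != entry_hash(body):
            return False
        prev = record["hash"]
    return True

ledger = []
append_event(ledger, "stored", b"training-set-v1")
append_event(ledger, "accessed", b"training-set-v1")
print(verify(ledger))    # True: the history is intact

tampered = [dict(record) for record in ledger]
tampered[0]["event"] = "never-stored"
print(verify(tampered))  # False: rewriting history breaks the chain
```

Because each record includes the hash of the one before it, changing any past event invalidates every later record, which is the property that makes provenance auditable.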

Source: Filfox

On top of the benefits of immutable, censorship-resistant data provenance guarantees, we also find that onboarding the long-tail of storage devices from around the world drives the price of storage down, making the decentralized storage solutions cheaper than the centralized alternatives. For example, storing 1TB of data for one month on Filecoin costs $0.19, whilst storing on Amazon’s S3 is 121x more expensive costing $23 for the month. Due to these benefits we are starting to see decentralized storage solutions growing.

Source: Coingecko centralized vs decentralised storage costs

Resource Allocation

The availability of GPUs was already constrained before the advent of ChatGPT, and it wasn’t uncommon to see periods of heightened demand from use cases such as crypto mining. Following the launch of ChatGPT and the Cambrian explosion of foundational models, the demand for GPUs has surged dramatically, possibly even a hundredfold. Rarely have we seen such a significant disparity between the demand for a resource and its accessible supply, even though the aggregate supply exceeds demand. If every GPU worldwide were capable of being organized and utilized for AI training today, we would be facing an excess rather than a shortage.

The long tail of GPUs is scattered across various platforms and devices, often underutilized or used for purposes far less demanding than their full capabilities allow, for example:

  1. Gaming PCs and Consoles: High-end GPUs underused outside gaming could support distributed computing or AI training.

  2. Corporate Workstations: Workstations with GPUs for creative tasks could be redirected for computational use during off-hours.

  3. Data Centers: Even at scale, some data centers have idle GPU capacity that is ideal for AI tasks.

  4. Academic Institutions: Universities with high-performance GPUs for research might not fully utilize them at all times, offering potential for broader use.

  5. Cloud Computing Platforms: These platforms sometimes have more GPU resources than needed, presenting opportunities for optimized utilization.

  6. Edge Devices: IoT and smart appliances have GPUs that, while less powerful individually, offer substantial collective processing power.

  7. Cryptocurrency Mining Rigs: Market downturns make some rigs less suitable for mining but still valuable for computational tasks.

Previously, there was no incentivisation and coordination layer that could effectively manage this two-sided marketplace for compute, whilst also addressing the myriad of technical issues that must be considered when selecting GPUs for training. These issues primarily arise from the distributed and non-uniform nature of the aggregation of long-tail GPUs:

  1. Diverse GPU Capabilities for Varied Tasks: Graphics cards vary widely in design and performance capabilities, making some unsuitable for specific AI tasks. Success in this domain hinges on effectively pairing the right GPU resources with the corresponding AI workloads.

  2. Adapting Training Methods for Increased Latency: Currently, foundational AI models are developed using GPU clusters with ultra-low latency links. In a decentralized setup, where GPUs are distributed across various locations and connected over the public internet, latency can significantly rise. This situation presents a chance to innovate training methodologies that accommodate higher latency levels. Such adjustments could optimize the utilization of geographically dispersed GPU clusters.

  3. Security and Privacy: Utilizing GPUs across various platforms raises concerns about data security and privacy. Ensuring that sensitive or proprietary information is protected when processed on external or public GPUs is crucial.

  4. Quality of Service: Guaranteeing a consistent level of service can be challenging in a decentralized environment. Variability in GPU performance, network stability, and availability can lead to unpredictable processing times and outcomes.
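The first of these issues, pairing workloads with suitable hardware, can be sketched as a simple matching rule. The example below is a toy under stated assumptions: the GPU names, specifications and prices are invented, and the rule (filter by memory and latency, then pick the cheapest) is one plausible policy rather than how any particular network schedules jobs.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Gpu:
    name: str
    memory_gb: int
    latency_ms: float      # network latency to the rest of the cluster
    price_per_hour: float

@dataclass
class Job:
    name: str
    min_memory_gb: int
    max_latency_ms: float

def pick_gpu(job: Job, pool: List[Gpu]) -> Optional[Gpu]:
    """Return the cheapest GPU that satisfies the job's constraints, if any."""
    eligible = [
        gpu for gpu in pool
        if gpu.memory_gb >= job.min_memory_gb and gpu.latency_ms <= job.max_latency_ms
    ]
    return min(eligible, key=lambda gpu: gpu.price_per_hour, default=None)

# Hypothetical long-tail supply.
pool = [
    Gpu("gaming-pc", memory_gb=12, latency_ms=80.0, price_per_hour=0.20),
    Gpu("university-node", memory_gb=40, latency_ms=35.0, price_per_hour=0.90),
    Gpu("datacenter-node", memory_gb=80, latency_ms=2.0, price_per_hour=2.50),
]

for job in [
    Job("batch-inference", min_memory_gb=8, max_latency_ms=200.0),
    Job("fine-tuning", min_memory_gb=24, max_latency_ms=50.0),
    Job("tightly-coupled-training", min_memory_gb=80, max_latency_ms=5.0),
    Job("oversized-job", min_memory_gb=160, max_latency_ms=5.0),
]:
    chosen = pick_gpu(job, pool)
    print(job.name, "->", chosen.name if chosen else "no suitable GPU")
```

Latency-tolerant jobs can land on cheap consumer hardware, while tightly coupled training still needs data-center-grade links, which is why matching matters more in a heterogeneous pool than in a uniform cluster.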

Networks leveraging crypto-incentives to coordinate the development and operation of this essential infrastructure have the potential to achieve greater efficiency, resilience, and performance than their centralized counterparts, although they are not quite there yet. Nascent as they are, we can already see the benefits of onboarding the long tail of GPU power into Decentralised Physical Infrastructure Networks (DePIN):

Source: Messari

Running some of the top-performance GPUs on DePIN networks is 60-80% cheaper than running them with their centralized counterparts.

It’s still early in the generalized compute marketplace and, despite the lower costs of this infrastructure, we are seeing growing pains in terms of performance and uptime. Nevertheless, the demand for GPUs has become apparent, with Akash’s daily spend increasing by 20.32x since its GPU market went live in late August 2023.

Source: Akash

To facilitate this massive increase in demand, Akash’s GPU capacity has had to scale quickly:

Source: Akash

The only way to compete with the centralized, monopolistic corporations and their ability to spend billions on compute power each year to improve their models is to harness the power of DePIN, which provides decentralized, permissionless access to compute. Crypto incentives enable software to pay for the hardware without a central authority. The very dynamics that have allowed the Bitcoin network to become “the largest computer network in the world, a network orders of magnitude larger than the combined size of the clouds that Amazon, Google, and Microsoft have built over the last 15-20 years”, can allow decentralized and open source AI to compete with centralized incumbents.

io.net

Another example of the DePIN thesis is io.net. The platform aggregates a diverse array of GPUs into a communal resource pool, accessible to AI developers and businesses, with its mission statement being “Putting together one million GPUs” into a network. The network leverages token incentives to fundamentally decrease the expenses associated with acquiring and retaining supply-side resources, thereby diminishing costs for end consumers. Presently, it is fueled by thousands of GPUs sourced from data centers, mining operations, and consumer-level hardware and has served over 62,000 compute hours.

While pooling these resources presents significant value, AI workloads can't seamlessly transition from centralized, high-end and low latency hardware to distributed networks of heterogeneous GPUs. The challenge here lies in efficiently managing and allocating tasks across a wide variety of hardware, each with its own memory, bandwidth, and storage specifications. To address this, io.net implements ‘clustering’ by overlaying custom-designed networking and orchestration layers on top of distributed hardware, effectively activating and integrating them in order to perform ML tasks. Utilizing Ray, Ludwig, Kubernetes, and other open-source distributed computing frameworks, the network enables machine learning engineering and operations teams to scale their projects across an extensive network of GPUs with only minor modifications needed. We believe the limited demand for compute networks like Render and Akash is primarily attributable to their model of renting out single GPU instances. This approach leads to slower and less efficient machine learning training, making them less attractive to potential users seeking robust computational resources.

The IO Cloud is engineered to streamline the deployment and management of decentralized GPU clusters, known as Io workers, on demand. By creating on-demand clusters, machine learning teams can effectively distribute their workloads across io.net's GPU network. This system utilizes advanced libraries to address the complexities of orchestration, scheduling, fault tolerance, and scalability, ensuring a more efficient operational workflow. At its core, IO Cloud employs Ray, a distributed computing framework for Python that has been rigorously tested and adopted by OpenAI for training cutting-edge models like GPT-3 and GPT-4 on over 300,000 servers.

Conclusion

DePIN employs token incentives to significantly reduce the costs involved in acquiring and maintaining supply-side resources, which in turn lowers the expenses for end consumers, creating a virtuous flywheel effect that enables the network to expand rapidly. To rival the efficiency of centralized alternatives, DePIN networks are essential. However, in their present development stage, these networks face challenges with reliability, including susceptibility to downtime and software bugs.

Fine-Tuning

During the fine-tuning phase, the pre-trained model's parameters are adjusted further and its final behavior is set. To summarize Part 1, we observe several centralizing influences resulting from the proprietary nature of this stage:

  1. Censorship - Owners have the authority to determine the kinds of content that the model creates or handles.

  2. Bias - Training data that lacks diversity, combined with opaque training choices, can skew the model's outputs so that certain perspectives, voices or information are excluded or misrepresented.

  3. Verifiability - Within a proprietary setting, users cannot verify whether the stated version of a model is genuinely the one in operation.

  4. Dependency and Lock-in - Entities using proprietary AI platforms or models become dependent on the controlling corporations, fostering a monopolistic power dynamic that hampers open innovation.

  5. RLHF - Refining models with RLHF demands skilled labor, typically concentrated in wealthy organizations that can pay for top talent, giving them a competitive advantage in model enhancement.

Source: Tech Target

In early March 2023, the open-source community gained access to its first highly capable foundational model when Meta's LLaMA was unexpectedly leaked to the public. Despite lacking instruction or conversation tuning, as well as RLHF, the community quickly grasped its importance, sparking a wave of innovation with significant advancements occurring within days of each other. The open-source community created variations of the model enhanced with instruction tuning, quantization, improved quality, human evaluations, multimodality and RLHF, among other improvements, with many developments building upon the previous ones. A leaked internal memo by a Google researcher eloquently details the future of AI and the struggles of closed-source development. Below is a concise excerpt:

“We Have No Moat

And neither does OpenAI

We’ve done a lot of looking over our shoulders at OpenAI. Who will cross the next milestone? What will the next move be?

But the uncomfortable truth is, we aren’t positioned to win this arms race and neither is OpenAI. While we’ve been squabbling, a third faction has been quietly eating our lunch.

I’m talking, of course, about open source. Plainly put, they are lapping us. Things we consider “major open problems” are solved and in people’s hands today. Just to name a few:

LLMs on a Phone: People are running foundation models on a Pixel 6 at 5 tokens / sec.

Scalable Personal AI: You can finetune a personalized AI on your laptop in an evening.

Responsible Release: This one isn’t “solved” so much as “obviated”. There are entire websites full of art models with no restrictions whatsoever, and text is not far behind.

Multimodality: The current multimodal ScienceQA SOTA was trained in an hour.

While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly. Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months. This has profound implications for us:

We have no secret sauce. Our best hope is to learn from and collaborate with what others are doing outside Google. We should prioritize enabling 3P integrations.

People will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. We should consider where our value add really is.

Giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime…

Directly Competing With Open Source Is a Losing Proposition

This recent progress has direct, immediate implications for our business strategy. Who would pay for a Google product with usage restrictions if there is a free, high quality alternative without them?

And we should not expect to be able to catch up. The modern internet runs on open source for a reason. Open source has some significant advantages that we cannot replicate.

We need them more than they need us

Keeping our technology secret was always a tenuous proposition. Google researchers are leaving for other companies on a regular cadence, so we can assume they know everything we know, and will continue to for as long as that pipeline is open.

But holding on to a competitive advantage in technology becomes even harder now that cutting edge research in LLMs is affordable. Research institutions all over the world are building on each other’s work, exploring the solution space in a breadth-first way that far outstrips our own capacity. We can try to hold tightly to our secrets while outside innovation dilutes their value, or we can try to learn from each other.”

Open source models offer transparent innovation whilst helping to address issues of censorship and bias inherent in proprietary models. With model parameters and code accessible to everyone, foundational models can be executed and modified by anyone, allowing for the removal of embedded censorship. Additionally, the transparency of open parameters makes it simpler to identify and mitigate any underlying biases in the model and training method. The nature of open source models enables extensive comparisons.

Verifiability

We also discussed the crucial issue of our reliance on centralized model providers to accurately process our prompts and, often, to handle our private data securely. Crypto has been at the forefront of pioneering cryptographic verification systems such as Zero Knowledge Proofs (ZKP), Optimistic Proofs (OP) and Trusted Execution Environments (TEE), which enable the verification of claims without exposing the underlying data, thus safeguarding privacy while also offering solutions for scaling. Cryptographic proofs enable users to transact without revealing their identities, ensuring privacy, and facilitate scaling by allowing computationally intensive tasks to be offloaded to an auxiliary layer, such as a rollup, or off-chain. Simultaneously, ZKPs and OPs provide a proof on-chain that the correct procedure was adhered to; TEEs do not, and rely on hardware attestation instead.

Machine learning and AI models are notoriously computationally heavy and thus it would be prohibitively expensive to run these models inside smart contracts on-chain. Through the development of various proving systems, models can be utilized in the following manner:

  1. User submits a prompt to a specific LLM / model.

  2. This request for inference is relayed off-chain to a GPU / computer that runs the prompt through the desired model.

  3. This inference is then returned to the user and alongside it a cryptographic proof is attached verifying that the prompt was run through the specific model.
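The three steps above can be sketched as follows. This is only an illustration under strong assumptions, not a real proving system: the "proof" is a receipt that binds hashes of the model, prompt and output together, and the verifier checks it by re-running the model, which is closer in spirit to the re-execution behind optimistic schemes than to a zero-knowledge proof. The names `toy_model`, `run_inference` and `verify_receipt` are hypothetical.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def toy_model(weights, prompt):
    """Stand-in for an LLM: any deterministic function of weights and prompt."""
    score = sum(w * ord(ch) for w, ch in zip(weights * len(prompt), prompt))
    return f"answer-{score % 1000}"


def commit_to_model(weights):
    """Public commitment to the exact model; this is what would live on-chain."""
    return sha256_hex(json.dumps(weights).encode())


def run_inference(weights, prompt):
    """Steps 2 and 3: the off-chain node runs the model and attaches a receipt."""
    output = toy_model(weights, prompt)
    receipt = {
        "model_commitment": commit_to_model(weights),
        "prompt_hash": sha256_hex(prompt.encode()),
        "output_hash": sha256_hex(output.encode()),
    }
    return output, receipt


def verify_receipt(weights, prompt, output, receipt, expected_commitment):
    """Check the receipt against the published commitment, then re-execute."""
    if receipt["model_commitment"] != expected_commitment:
        return False
    if receipt["prompt_hash"] != sha256_hex(prompt.encode()):
        return False
    if receipt["output_hash"] != sha256_hex(output.encode()):
        return False
    return toy_model(weights, prompt) == output


weights = [3, 1, 4]
published_commitment = commit_to_model(weights)
output, receipt = run_inference(weights, "hello")
is_valid = verify_receipt(weights, "hello", output, receipt, published_commitment)
```

In this sketch the verifier needs the weights and must redo the computation. A zkML proof would remove both requirements, at the cost of the proving overhead discussed in this section.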

Verifiability is currently lacking in the AI industry, and as AI models become increasingly integrated into our lives and work, it is essential that we have the ability to verify the authenticity of their outputs. Take for example sectors like healthcare and law, where AI models assist in diagnosing diseases or analyzing legal precedents: the inability of professionals to verify the model's source and accuracy can foster mistrust or lead to errors. For healthcare providers, not knowing if an AI's recommendations are based on the most reliable model could adversely affect patient care. Similarly, for lawyers, uncertainty about an AI's legal analysis being up-to-date could compromise legal strategies and client outcomes. Conversely, if a user wishes to utilize a model with their data while keeping it private from the model provider due to confidentiality concerns, they can process their data with the model independently, without disclosing the data, and then confirm the accurate execution of the desired model by providing a proof.

The result of these verifiability systems is that models can be integrated into smart contracts whilst preserving the robust security assumptions of running these models on-chain. The benefits are multi-faceted:

  1. The model provider is able to keep their model private if they wish.

  2. A user can verify the model was run correctly.

  3. Models can be integrated into smart contracts, helping bypass the scalability issues present in blockchains today.

Currently, these crypto proving systems are largely in a developmental phase, with the majority of initiatives concentrated on establishing foundational components and creating initial demonstrations. The primary obstacles encountered at present encompass high computational expenses, constraints on memory capacity, intricate model designs, a scarcity of specialized tools and underlying frameworks, and a shortage of skilled developers. It is not yet settled which of these cryptographic verification systems (zkML, opML, TEE) will be preferred, but we are starting to see TEEs vastly outperform zk proofs, which are currently computationally intensive and expensive. Marlin gave a good overview of the trade-offs at ETHDenver:

Source: Marlin Protocol

The potential for verification to enhance current model architecture is clear, yet these systems also hold the promise of improving the user experience in cryptocurrencies through the facilitation of upgradability and the implementation of dynamic contracts. As it stands, the functionality of smart contracts is significantly limited by their dependence on preset parameters, necessitating manual updates to ensure their continued effectiveness. Typically, this manual updating process either involves bureaucratic governance procedures or compromises decentralization by granting undue autonomy to the smart contract owner. For instance, the updating of risk parameters in Aave is managed through political and bureaucratic governance votes alongside risk teams, a method that has proven to be inefficient as evidenced by Gauntlet’s departure from Aave. Integrating AI into smart contracts has the potential to revolutionize their management, modification, and automatic enhancement. In the example of Aave, implementing AI agents for the adaptation of risk parameters in response to evolving market conditions and risks could significantly optimize the process, offering a more efficient and timely alternative to the often slow and cumbersome adjustments made by humans or DAOs.

Ritual

Ritual is conceptualized as a decentralized, flexible, and sovereign platform for executing AI tasks. It integrates a distributed network of nodes, each equipped with computational resources, and allows AI model developers to deploy their models, including both LLMs and traditional machine learning models, across these nodes. Users can access any model within the network through a unified API, benefitting from additional cryptographic measures embedded in the network. These measures provide assurances of computational integrity and privacy (zkML, opML & TEE).

Infernet represents the inaugural phase in the evolution of Ritual, propelling AI into the realm of on-chain applications by offering robust interfaces for smart contracts to utilize AI models for inference tasks. It ensures unrestricted access to a vast network of model and compute providers, marking a significant step towards democratizing AI and computational resources for on-chain applications and smart contracts.

Source: Ritual.net

The overarching aim for Ritual is to establish itself as the ‘AI coprocessor’. This involves advancing Infernet into a comprehensive suite of execution layers that seamlessly interact with the foundational infrastructure across the ecosystem. The goal is to enable protocols and applications on any blockchain to leverage Ritual as an AI coprocessor, facilitating widespread access to advanced AI functionalities across crypto.

The Ritual Superchain

Privacy

The foundational element of crypto is encryption and cryptography, which can be utilized in numerous ways to facilitate privacy-preserving interactions with AI. Verifiable AI will empower users to retain ownership of their data, restricting third-party access. Despite advancements, privacy concerns persist because data is not encrypted at all stages of processing. Currently, the standard practice involves encrypting messages between a user's device and the provider's server. However, this data is decrypted on the server to enable the provider to utilize it, such as running models on user data, exposing it to potential privacy risks. There is therefore a large risk of exposing sensitive information to the LLM service provider, and in critical sectors like healthcare, finance, and law, such privacy risks are substantial enough to halt progress and adoption.

These privacy concerns are evident in IBM's 2023 Security Report, which reveals a notable rise in the incidence of healthcare data breaches throughout the decade:

Source: Zama

Crypto is now pioneering a new standard for data transfer and processing, called Fully Homomorphic Encryption (FHE). FHE allows data to be processed without ever being decrypted. This innovation ensures that companies and individuals can provide their services while keeping users' data completely private, with no discernible impact on functionality for the user. With FHE, data remains encrypted not only while it is being transmitted but also throughout the processing phase. This advancement extends the possibility of end-to-end encryption to all online activities, not just message transmission. In the context of AI, FHE will enable the best of both worlds: protection of both the privacy of the user and the IP of the model.

FHE allows functions to be executed on data while it remains encrypted. A demonstration by Zama has shown that an LLM, when implemented with FHE, preserves the accuracy of the original model's predictions whilst keeping data (prompts, answers) encrypted throughout the whole process. Zama modified the GPT-2 implementation from the Hugging Face Transformers library; specifically, parts of the inference were restructured using Concrete-Python. This tool facilitates the transformation of Python functions into their FHE counterparts, ensuring secure and private computation without compromising on performance.

In part 1 of our thesis, we analyzed the structure of the GPT models, which broadly consist of a sequence of multi-head attention (MHA) layers applied consecutively. Each MHA layer utilizes the model's weights to project the inputs, executes the attention mechanism, and then re-projects the attention's output into a novel tensor.
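To make the project, attend, re-project sequence concrete, here is a minimal single-head sketch in plain Python. It is an illustration only and not the GPT-2 or Zama code: the function and variable names are hypothetical, the matrices are toy 2x2 identities, and a real MHA layer runs several such heads in parallel and concatenates their outputs before the final projection.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x, w_q, w_k, w_v, w_o):
    # 1. Project the inputs with the model's weights.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    dim = len(q[0])
    attended = []
    # 2. Attention: token i only looks at tokens 0..i (masked self-attention).
    for i, q_i in enumerate(q):
        scores = [
            sum(a * b for a, b in zip(q_i, k[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j][c] for j, w in enumerate(weights)) for c in range(len(v[0]))]
        )
    # 3. Re-project the attention output into a new tensor.
    return matmul(attended, w_o)


identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]  # two toy token embeddings
output = attention_head(tokens, identity, identity, identity, identity)
```

With identity projections the first token can only attend to itself, so its output row equals its input, while the second row is a weighted mix of both tokens.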

Zama's TFHE approach encodes both model weights and activations as integers. By employing Programmable Bootstrapping (PBS) and simultaneously refreshing ciphertexts, it enables arbitrary computations. This method allows for the encryption of any component or the entirety of LLM computations within the realm of Fully Homomorphic Encryption.
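The integer encoding mentioned above can be illustrated without any encryption at all. The sketch below uses simple symmetric quantization, which is an assumption on our part and not necessarily the scheme Concrete-Python applies; `quantize` and `integer_dot` are hypothetical names. It shows that a dot product computed purely on integers, then rescaled, stays close to the floating-point result.

```python
def quantize(values, n_bits=8):
    """Map floats to signed integers that share one scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    q_max = 2 ** (n_bits - 1) - 1
    scale = max_abs / q_max
    return [round(v / scale) for v in values], scale


def integer_dot(q_a, q_b):
    """Dot product on integers only: the kind of arithmetic an FHE circuit evaluates."""
    return sum(a * b for a, b in zip(q_a, q_b))


weights = [0.5, -1.25, 2.0]
activations = [1.0, 0.2, -0.7]

q_weights, weight_scale = quantize(weights)
q_activations, activation_scale = quantize(activations)

# Rescale the integer result back into the floating-point domain.
approx = integer_dot(q_weights, q_activations) * weight_scale * activation_scale
exact = sum(w * x for w, x in zip(weights, activations))
```

Fewer bits mean cheaper encrypted computation but a larger gap between `approx` and `exact`, which is the trade-off any quantized FHE model has to manage.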

Source: Zama

After converting these weights and activations, Zama enables the model to be run fully encrypted, meaning the server can never see a user's data or inputs. The above graphic from Zama shows a basic implementation of FHE in LLMs:

  1. A user begins the inference process on their local device, stopping just after the initial layer, which is omitted from the model shared with the server.

  2. The client then encrypts these intermediate operations and forwards them to the server. The server processes a portion of the attention mechanism on this encrypted data, and sends the results back to the client.

  3. Upon receipt, the client decrypts these results to proceed with the inference locally.

A more concrete example may be the following: In a scenario where a potential borrower applies for a loan, the bank is tasked with assessing the applicant's creditworthiness while navigating privacy regulations and concerns that restrict direct access to the borrower's detailed financial history. To address this challenge, the bank could adopt an FHE scheme, enabling secure, privacy-preserving computations.

The applicant agrees to share their financial data with the bank under the condition that privacy safeguards are in place. The borrower encrypts their financial data locally, ensuring its confidentiality. This encrypted data is then transmitted directly to the bank, which is equipped to run sophisticated credit assessment algorithms & AI models on the encrypted data within its computing environment. As the data remains encrypted throughout this process, the bank can conduct the necessary analyses without accessing the applicant's actual financial information. This approach also safeguards against data breaches, as hackers would not be able to decrypt the financial data without the encryption key, even if the bank's servers were breached.

Upon completing the analysis, the user decrypts the resulting encrypted credit score and insights, thus gaining access to the information without compromising the privacy of their financial details at any point. This innovative method ensures the protection of the applicant's financial records at every step, from the initial application through to the final credit assessment, thereby upholding confidentiality and adherence to privacy laws.
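The loan flow above can be sketched with a much simpler scheme than FHE. The example below uses textbook Paillier encryption, which is only additively homomorphic: it is enough for a weighted-sum score but cannot evaluate arbitrary models the way TFHE can. The primes are tiny and insecure, and the features, weights and function names are all hypothetical.

```python
import math
import secrets

# Tiny, well-known primes for illustration only; real keys are far larger.
P, Q = 104_729, 1_299_709


def keygen(p=P, q=Q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to n + 1
    return n, (n, lam, mu)  # (public key, private key)


def encrypt(n, m):
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(private_key, c):
    n, lam, mu = private_key
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def encrypted_weighted_sum(n, ciphertexts, weights):
    """Bank side: computes Enc(sum(w_i * x_i)) without decrypting any x_i."""
    n2 = n * n
    total = encrypt(n, 0)
    for c, w in zip(ciphertexts, weights):
        # Multiplying ciphertexts adds plaintexts; exponentiation scales them.
        total = total * pow(c, w, n2) % n2
    return total


# Applicant: generates keys and encrypts the financial data locally.
public_key, private_key = keygen()
features = [85, 7, 46]  # income in thousands, years of history, on-time payments
encrypted_features = [encrypt(public_key, x) for x in features]

# Bank: scores the application using only the public key and ciphertexts.
weights = [2, 10, 5]
encrypted_score = encrypted_weighted_sum(public_key, encrypted_features, weights)

# Applicant: decrypts the result with the private key.
score = decrypt(private_key, encrypted_score)
```

The bank never holds the private key, so a breach of its servers would expose only ciphertexts, which matches the breach-resistance argument made above.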

Incentivised RLHF

Within part one of our thesis, we highlighted the centralizing forces within the RLHF stage of fine-tuning, namely the aggregation of specialized labor within a few large companies due to the compensation they can provide. Crypto economic incentives have proven valuable in creating positive feedback loops and engaging top tier talent toward a common goal. An example of this tokenized RLHF is Hivemapper, which aims to use crypto economic incentives to accurately map the entire world. The Hivemapper Network, launched in November 2022, rewards participants who dedicate their time to refining and curating mapping information, and has since mapped 140 million kilometers across more than 2,503 distinct regions. Kyle Samani highlights that tokenized RLHF starts to make sense in the following scenarios:

  • When the model targets a specialized and niche area rather than a broad and general application. Individuals who rely on RLHF for their primary income, and thus depend on it for living expenses, will typically prefer cash payments. As the focus shifts to more specialized domains, the demand for skilled workers increases, who may have a vested interest in the project's long-term success.

  • When the individuals contributing to RLHF have a higher income from sources outside of RLHF activities. Accepting compensation in the form of non-liquid tokens is viable only for those with adequate financial stability from other sources to afford the risk associated with investing time in a specific RLHF model. To ensure the model's success, developers should consider offering tokens that vest over time, rather than immediately accessible ones, to encourage contributors to make decisions that benefit the project in the long run.
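As a rough illustration of tokens that vest over time, the sketch below implements a linear schedule with a cliff. The cliff, the linear shape and every parameter name are assumptions chosen for the example; the source does not prescribe a particular schedule.

```python
def vested_amount(total_tokens, start, cliff, duration, now):
    """Tokens a contributor may claim at time `now`.

    All times share one unit (days here). Nothing vests before the cliff,
    everything has vested once `duration` has elapsed, and vesting is
    linear in between.
    """
    if now < start + cliff:
        return 0
    if now >= start + duration:
        return total_tokens
    return total_tokens * (now - start) // duration


# A hypothetical grant: 1,200 tokens over 360 days with a 90-day cliff.
grant = dict(total_tokens=1200, start=0, cliff=90, duration=360)
before_cliff = vested_amount(now=60, **grant)
halfway = vested_amount(now=180, **grant)
fully_vested = vested_amount(now=400, **grant)
```

A contributor who leaves or degrades the model early forfeits most of the grant, which is the long-term alignment effect the scenario above relies on.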

Inference

In the inference phase, the deployment platform delivers the inference to the end user via on-premise servers, cloud infrastructure, or edge devices. As previously mentioned, there is a centralization in both the geographic location of the hardware and its ownership. With daily operational costs reaching hundreds of thousands of dollars for the most popular deployment platforms, most corporations find themselves priced out, and thus the serving of models consolidates among a few providers. As in the pre-training phase, DePIN networks can be utilized to serve inferences on a large scale, offering multiple advantages:

  1. User ownership - DePINs can be used to serve inference by connecting and coordinating compute across the globe. The ownership of the network and the subsequent rewards flow to the operators of this network, who are also the users. DePIN enables the collective ownership of the network by its users, avoiding the misalignment of incentives we historically find in web2 operations.

  2. Crypto economic incentives - Crypto economic incentives such as block rewards or rewards for proof of work enable the network to function with no central authority and to accurately incentivise and compensate work that is beneficial to the network.

  3. Reduced costs - Onboarding the long tail of GPUs across the globe can greatly reduce the costs of inference, as we have seen in price comparisons between decentralized compute providers and their centralized counterparts.

Decentralized frontends

Source: Marlin protocol

The underlying code of smart contracts is executed on a decentralized peer-to-peer network. However, the primary way users interact with these contracts is through frontends hosted on centralized servers. This centralization presents multiple challenges, such as vulnerability to Distributed Denial of Service (DDoS) attacks, the possibility of domain name attacks or malicious takeovers and, most importantly, censorship by corporate owners or nation states. Similarly, the dominance of centralized frontends in the current AI landscape raises concerns, as users can be denied access to this pivotal technology. When developing community-owned, censorship-resistant frontends that facilitate worldwide access to smart contracts and AI, it's crucial to take into account the geographical distribution of nodes for data storage and transmission. Equally important is ensuring proper ownership and access control over this data. There are a number of protocols and crypto systems that can be used to enable this:

IPFS

The Interplanetary File System (IPFS) is a decentralized, content-addressable network that allows for the storage and distribution of files across a peer-to-peer network. In this system, every file is hashed, and this hash serves as a unique identifier for the file, enabling it to be accessed from any IPFS node using the hash as a request. This design is aimed at supplanting HTTP as the go-to protocol for web application delivery, moving away from the traditional model of storing web applications on a single server to a more distributed approach where files can be retrieved from any node within the IPFS network. Whilst a good alternative to the status quo, webpages hosted on IPFS and connected via DNSLink rely on gateways, which may not always be secure or operate on a trustless basis. Such webpages are also limited to static HTML.
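Content addressing, as described above, can be sketched in a few lines. This is a simplification and not the IPFS API: real IPFS identifiers (CIDs) wrap the hash in a multihash format and large files are split into linked blocks, whereas here the identifier is a bare SHA-256 hex digest. `ContentStore` and `fetch_verified` are hypothetical names.

```python
import hashlib


class ContentStore:
    """A toy node that stores blobs under the hash of their content."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        content_id = hashlib.sha256(data).hexdigest()
        self._blocks[content_id] = data
        return content_id

    def get(self, content_id: str):
        return self._blocks.get(content_id)


def fetch_verified(nodes, content_id):
    """Ask nodes in turn and accept only data whose hash matches the request."""
    for node in nodes:
        data = node.get(content_id)
        if data is not None and hashlib.sha256(data).hexdigest() == content_id:
            return data
    raise KeyError(f"no node returned valid content for {content_id}")


page = b"frontend build v1"
honest_node = ContentStore()
cid = honest_node.put(page)

# A misbehaving node serves different bytes under the same identifier.
tampering_node = ContentStore()
tampering_node._blocks[cid] = b"phishing page"

retrieved = fetch_verified([tampering_node, honest_node], cid)
```

Because the identifier is derived from the content, the requester does not need to trust whichever node answers; a gateway that skips this check reintroduces the trust assumption mentioned above.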

3DNS & Marlin

The advent of 3DNS introduces the concept of ‘tokenized domains’ managed directly on the blockchain, presenting a solution to several decentralization challenges. This innovation allows smart contracts, and by extension DAOs to oversee domain management. One of the primary benefits of managing DNS records on the blockchain is enhanced access control. With this system, only keys with explicit authorization can modify DNS records, effectively mitigating risks such as insider interference, database breaches, or email compromises at the DNS provider level. However, the domain must still be linked to an IP address, which a hosting provider can change at their discretion. Consequently, any alteration in the IP address of the server hosting the frontend necessitates a DAO vote to update the records—a cumbersome process.

To address this, there's a need for a deployment method that enables smart contracts to autonomously verify the server's codebase and only then redirect DNS records to the host, contingent upon the code aligning with an approved template. This is where Trusted Execution Environments (TEEs), such as Marlin’s Oyster, come into play. TEEs create a secure enclave for executing code, shielded from any data/code modifications or access by the host machine, and enable the verification of the code's integrity against its intended version through attestations.

This framework allows for the implementation of Certificate Authority Authorization (CAA) records, managed by a DNS admin contract, to ensure that only enclaves executing approved software can request domain certificates. This mechanism guarantees that any data received by users visiting the domain is authenticated, emanating exclusively from the authorized application, thereby certifying its integrity and safeguarding against tampering.
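The attestation-gated DNS update described in this section can be sketched as follows. Everything here is hypothetical and is not the 3DNS or Marlin Oyster interface: `DnsAdminContract` stands in for an on-chain contract, and the `signature_valid` flag stands in for verifying the hardware vendor's signature over the enclave measurement, which a real deployment must actually check.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Attestation:
    image_hash: str        # measurement of the code running inside the enclave
    enclave_ip: str        # address the enclave is reachable at
    signature_valid: bool  # placeholder for real attestation verification


@dataclass
class DnsAdminContract:
    approved_images: set = field(default_factory=set)
    a_record: Optional[str] = None

    def approve_image(self, frontend_code: bytes) -> str:
        """Done once per approved build, for example through a DAO vote."""
        image_hash = hashlib.sha256(frontend_code).hexdigest()
        self.approved_images.add(image_hash)
        return image_hash

    def update_record(self, attestation: Attestation) -> bool:
        """Point the domain at a host only if it runs an approved build."""
        if not attestation.signature_valid:
            return False
        if attestation.image_hash not in self.approved_images:
            return False
        self.a_record = attestation.enclave_ip
        return True


contract = DnsAdminContract()
approved_hash = contract.approve_image(b"frontend build v1")

good = Attestation(approved_hash, "203.0.113.10", signature_valid=True)
rogue = Attestation(
    hashlib.sha256(b"modified build").hexdigest(), "203.0.113.66", signature_valid=True
)

accepted = contract.update_record(good)
rejected = contract.update_record(rogue)
```

The vote happens once per build rather than once per IP change, so a host can move servers without governance involvement as long as it keeps running the approved code.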

Public Key Infrastructure Problem

During the 1990s and early 2000s, cryptographers and computer scientists extensively theorized about the vast benefits and innovations that Public Key Infrastructure (PKI) could bring forth. PKI is a framework that is pivotal for bolstering security across the internet and intranets, facilitating secure network transactions including e-commerce, internet banking, and the exchange of confidential emails through robust encryption and authentication mechanisms. Its foundational security mechanism is asymmetric cryptography, also known as public-key cryptography, which involves the use of two keys: a public key, which can be shared openly, and a private key, which is kept secret by the owner. The public key is used for encrypting messages or verifying digital signatures, while the private key is used for decrypting messages or creating digital signatures. This key pair enables the encrypted transmission of data, thereby safeguarding privacy and ensuring that only authorized individuals can access the information.
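The two roles of each key can be shown with textbook RSA. This sketch is for illustration only: the primes are tiny, there is no padding, and it must not be used for real security; production systems rely on vetted libraries and much larger keys. The function names are hypothetical.

```python
import hashlib

# Small, well-known primes and the common public exponent; insecure by design.
P, Q, E = 104_729, 1_299_709, 65_537


def keygen(p=P, q=Q, e=E):
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent
    return (n, e), (n, d)  # (public key, private key)


def digest(message: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(private_key, message: bytes) -> int:
    n, d = private_key
    return pow(digest(message, n), d, n)  # only the key owner can produce this


def verify(public_key, message: bytes, signature: int) -> bool:
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)  # anyone can check it


def encrypt(public_key, m: int) -> int:
    n, e = public_key
    return pow(m, e, n)  # anyone can encrypt to the owner


def decrypt(private_key, c: int) -> int:
    n, d = private_key
    return pow(c, d, n)  # only the owner can decrypt


public_key, private_key = keygen()
signature = sign(private_key, b"transfer 5 tokens to alice")
signature_ok = verify(public_key, b"transfer 5 tokens to alice", signature)
forgery_ok = verify(public_key, b"transfer 500 tokens to mallory", signature)
recovered = decrypt(private_key, encrypt(public_key, 42))
```

Signing a transaction with a wallet is the same pattern with a different signature scheme, which is why safe handling of the private key matters so much in what follows.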

For PKI to operate effectively, it is imperative for users to maintain their private keys in a manner that is both secure and accessible. "Secure" in this context means that the private key is stored in a private manner, exclusively accessible to the user. "Accessible" implies that the user can easily and frequently retrieve their private key when needed. The challenge of PKI lies in achieving this delicate balance. For instance, a user might secure their private key by writing it down and storing it in a locked box, and then misplace the box, which is akin to storing it in a highly secure location but then forgetting about it. This scenario compromises accessibility. Conversely, storing the private key in a highly accessible location, such as on a public website, would render it insecure as it could be exploited by unauthorized users. This conundrum encapsulates the fundamental PKI problem that has hindered the widespread adoption of public key infrastructure.

Cryptocurrency has addressed the PKI dilemma through the implementation of direct incentives, ensuring that private keys are both secure and readily accessible. If a private key is not secure, the associated crypto wallet risks being compromised, leading to potential loss of funds. Conversely, if the key is not accessible, the owner loses the ability to access their assets. Since Bitcoin's introduction, there has been a steady, albeit gradual, increase in the adoption of PKI. When utilized effectively, PKI and crypto’s resolution to the PKI problem play a critical role in facilitating the secure expansion of open-source agents, proof of personhood solutions, and, as previously discussed, data provenance solutions.

Source: An empirical three phase analysis of the crypto market

Autonomous Smart Agents

The concept of an agent has deep historical roots in philosophy, tracing back to the works of prominent thinkers like Aristotle and Hume. Broadly, an agent is defined as any entity capable of taking action, and "agency" refers to the expression or demonstration of this ability to act. More specifically, "agency" often pertains to the execution of actions that are intentional. Consequently, an "agent" is typically described as an entity that holds desires, beliefs, intentions, and possesses the capability to act based on these factors. This concept extended into the realm of computer science with the goal of empowering computers to grasp users' preferences and independently carry out tasks on their behalf. As AI evolved, the terminology "agent" was adopted within AI research to describe entities that exhibit intelligent behavior. These agents are characterized by attributes such as autonomy, the capacity to respond to changes in their environment, proactiveness in pursuing goals, and the ability to interact socially. AI agents are now recognized as a critical step towards realizing Artificial General Intelligence (AGI), as they embody the capability for a broad spectrum of intelligent behaviors.

As judged by World Scope, LLMs have showcased remarkable abilities in acquiring knowledge, understanding instructions, generalizing across contexts, planning, and reasoning. They have also proven adept at engaging in natural language interactions with humans. These strengths have led to LLMs being heralded as catalysts for Artificial General Intelligence (AGI), highlighting their significance as a foundational layer in the development of intelligent agents. Such advancements pave the way for a future in which humans and agents can coexist in harmony.

Within the confines of their environment, these agents can be used to complete a wide array of tasks. Bill Gates used the following scenario to describe their myriad functions: “Imagine that you want to plan a trip. A travel bot will identify hotels that fit your budget. An agent will know what time of year you’ll be traveling and, based on whether you always try a new destination or like to return to the same place repeatedly, it will be able to suggest locations. When asked, it will recommend things to do based on your interests and propensity for adventure and book reservations at the types of restaurants you would enjoy.” Whilst this is far out, OpenAI is reportedly developing AI agents capable of executing complex, multi-step tasks autonomously. These agents, transcending the traditional bounds of user interaction, are designed to manipulate user devices directly to perform intricate tasks across different applications. For instance, an AI agent could autonomously transfer data from a document into a spreadsheet for further analysis, streamlining work processes significantly. Innovating beyond mere desktop applications, these AI agents could also navigate web-based tasks, such as booking flights or compiling travel itineraries, without relying on APIs.

Whilst useful, these centralized AI agents pose similar risks to the ones we identified in part 1:

  1. Data control & access

  2. Verifiability

  3. Censorship

However, we also run into a few new issues:

Composability - One of the primary benefits of crypto is the composability it facilitates. This feature enables open-source contributions and the permissionless interaction of protocols, allowing them to connect, build upon, and interface with each other seamlessly. This is illustrated in DeFi through the concept of 'money legos'. For an AI agent to function optimally, it must possess the capability to interface with a broad spectrum of applications, websites, and other agents. However, within the confines of traditional closed network systems, AI agents face significant challenges in task execution, often limited by the need to connect to multiple third-party APIs or resort to using complex methods like Selenium drivers for information retrieval and task execution. In the era of sovereign AI, the limitations become even more pronounced, as agents are unable to access models or data behind national firewalls. To truly empower AI agents, credibly neutral, decentralized base layers are essential. Such layers would allow agents to interact permissionlessly with a diverse range of applications, models, and other agents, enabling them to collaboratively complete complex tasks without the barriers imposed by current network infrastructures.

Value Transfer - The ability for agents to transfer value is a crucial functionality that will become increasingly important as they evolve. Initially, these transactions will primarily serve the needs of the humans utilizing the agents, facilitating payments for services, models, and resources. However, as agent capabilities advance, we will observe a shift towards autonomous transactions between agents themselves, both for their own benefit and for executing tasks beyond their individual capabilities. For instance, an agent might pay a DePIN to access a computationally intensive model or compensate another specialized agent for completing a task more efficiently. We believe that a significant portion of global value transfer will eventually be conducted by agents. Evidence of this trend is already emerging, as seen with initiatives like Autonolas on the Gnosis chain. Autonolas agents make up over 11% of all transactions on the chain and in the last month they have averaged 75.24% of all Gnosis chain transactions.

Source: adrian0x on Dune analytics

Privacy & Alignment - Human-operated agents become more effective when they have access to extensive personal data, enabling them to tailor suggestions and services based on individual preferences, schedules, health metrics, and financial status. However, the reliance on centralized agents raises significant concerns regarding data privacy, as it entails the accumulation of sensitive personal information by large technology corporations. This situation can lead to unintended consequences, as illustrated by the incident with Samsung, where employees inadvertently compromised company secrets while using ChatGPT. Such scenarios highlight the misalignment of incentives between technology incumbents and users, presenting substantial ethical and privacy risks. Users' sensitive data could be exploited, either sold to advertisers or used in ways that serve corporate interests rather than the individuals'. To safeguard sensitive information while maximizing the efficiency of AI agents, it's essential that each user retains ownership of both their agent and their data. By facilitating the local operation of these agents, users can effectively protect their personal data from external vulnerabilities. This approach not only enhances data privacy but also ensures that the agents can perform at their optimal capacity, tailored specifically to the user's preferences and needs without compromising security.

Dependency & Lock-in - In part 1 of our thesis, we delved into the drawbacks of lock-in effects stemming from closed source models. The core issue with relying on centralized agents developed by corporations focused on maximizing shareholder value is the inherent misalignment of incentives. This misalignment can drive such companies to exploit their users in the pursuit of increased profits, for instance, by selling the intimate data provided to these agents. Moreover, a company's take rate—the percentage of revenue taken as fees—is often directly tied to the exclusivity and indispensability of its services. In the context of AI agents, a company like OpenAI might initially offer low take rates and maintain a relatively open network. However, as these agents become more integral to users, evolving through iterative improvements from vast amounts of user data, these centralized corporations may gradually increase their take rates through higher fees or by venturing into other revenue-generating activities like advertising or selling user data. We believe in the establishment of agents on a credibly neutral, open-source base layer that is user-owned, ensuring that incentives are properly aligned and prioritizing the users' interests and privacy above corporate gains.

Smart Agents

Source: David Johnston Smart agents paper

Revisiting the challenges inherent to PKI, it becomes evident that cryptographic solutions offer compelling resolutions to the issues of alignment, privacy, and verifiability. Furthermore, these solutions adeptly align incentives.

The term "smart agent" refers to a class of general-purpose AI systems designed to interact with smart contracts on blockchain networks. These agents can be categorized based on their operational model: they may either be owned and controlled by users or function autonomously without direct human oversight. Here we will examine their capacity to enhance human interactions in various domains.

The implementation of a smart agent encompasses three critical components, each playing a pivotal role in its functionality and effectiveness. These components are designed to ensure secure, informed, and contextually aware interactions within the crypto ecosystem:

  1. User's Crypto Wallet: This serves as the foundational element for key management and transactional operations. It enables users to sign and authorize transactions recommended by the smart agent, ensuring secure and authenticated interactions with blockchain-based applications.

  2. LLM Specialized in Crypto: A core intelligence engine of the smart agent, this model is trained on extensive crypto datasets, including information on blockchains, wallets, decentralized applications, DAOs, and smart contracts. This training enables the agent to understand and navigate the complex crypto environment effectively. This LLM must be fine-tuned to include a component that scores and recommends the most suitable smart contracts to users based on a set of criteria, prioritizing safety.

  3. Long-Term Memory for User Data and Connected Applications: This feature involves the storage of user data and information on connected applications either locally or on a decentralized cloud. It provides the smart agent with a broader context for its actions, allowing for more personalized and accurate assistance based on historical interactions and preferences.
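The scoring step mentioned in the second component can be illustrated with a minimal sketch. This is not the design from the Smart Agents paper: the `ContractCandidate` fields, the weights, and the `min_safety` threshold are all hypothetical choices made for illustration. The only idea taken from the text is that safety is prioritized over other criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractCandidate:
    """Hypothetical metadata an agent might hold about a smart contract."""
    name: str
    audited: bool      # has a public security audit
    age_days: int      # time since deployment
    tvl_usd: float     # value currently locked in the contract
    fee_bps: int       # fee charged to the user, in basis points


def safety_score(c: ContractCandidate) -> float:
    """Return a 0..1 safety score. The weights are illustrative assumptions."""
    audit = 1.0 if c.audited else 0.0
    maturity = min(c.age_days / 365, 1.0)         # saturates after one year
    adoption = min(c.tvl_usd / 10_000_000, 1.0)   # saturates at $10M locked
    return 0.5 * audit + 0.3 * maturity + 0.2 * adoption


def recommend(candidates, min_safety=0.6):
    """Drop candidates below the safety floor, then rank the rest.

    Safety is the primary sort key; lower fees only break ties between
    equally safe contracts.
    """
    safe = [c for c in candidates if safety_score(c) >= min_safety]
    return sorted(safe, key=lambda c: (-safety_score(c), c.fee_bps))
```

Filtering before ranking means a cheap but unproven contract can never outrank a safer one, which is one simple way to encode "prioritizing safety".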

Users interact with their personal smart agents through either a locally installed application or a community-hosted frontend interface. This interface could be similar to that of platforms like ChatGPT, allowing users to input queries and engage in dialogues with their smart agent. Through this interaction, users can specify the actions they wish to be executed. The smart agent then provides suggestions tailored to the user's preferences and the security of the involved smart contracts.

What sets these smart agents apart from standard LLMs is their action-oriented capability. Unlike LLMs that primarily generate informational responses, smart agents have the advanced functionality to act on the suggestions they provide. They achieve this by crafting blockchain transactions that represent the user's desired actions. This capacity for direct action on the blockchain distinguishes smart agents as a significant advancement over traditional AI models, offering users a more interactive and impactful experience. By integrating conversational interaction with the ability to perform blockchain transactions, smart agents facilitate a seamless and secure interface between users and the crypto ecosystem.

Incorporating PKI as the foundational element of agent usage empowers individuals with direct control over their data and the actions of their agents. This approach addresses the issue of misaligned incentives as users actively confirm that their agents are acting in their interests by reviewing and approving transactions. This mechanism not only ensures that agents operate in alignment with user goals but also secures the ownership and control of sensitive personal data that powers these agents. In an era where artificial intelligence can easily generate convincing fabrications, the immutable nature of cryptographic techniques stands as a bulwark against such threats. As AI technology advances, the authenticity guaranteed by a private key may emerge as one of the few unforgeable proofs of identity and intent. Therefore, private keys are pivotal in constructing a framework that allows for the controlled and verifiable use of agents.
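The approve-then-sign flow can be sketched in a few lines. Ethereum wallets sign with ECDSA over secp256k1, which the Python standard library does not provide, so this sketch swaps in a Lamport one-time signature built only from SHA-256. That substitution, along with the `approve_and_sign` helper and the `user_approves` callback, is an assumption for illustration and not how any particular wallet works.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes):
    """The 256 bits of the message digest, most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    """Reveal one secret per digest bit. A key pair must sign only one message."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check each revealed secret against the published hash for that bit."""
    return len(signature) == 256 and all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


def approve_and_sign(private, proposed_tx: bytes, user_approves):
    """The agent proposes; only the user's approval releases a signature."""
    if not user_approves(proposed_tx):
        return None
    return sign(private, proposed_tx)
```

The point of the sketch is the ordering: the agent can draft any transaction it likes, but without the private key holder's approval there is no signature, and a signature over one transaction does not validate a different one.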

Crypto UX

Smart agents represent an upgrade from their centralized counterparts, and they also have the potential to drastically improve crypto UX. The situation mirrors the early internet of the 1980s and 1990s, a period marked by the difficulty of navigating a novel and largely misunderstood technology. Website discovery and access were hands-on and often frustrating. Users relied primarily on directories such as Yahoo! Directory and DMOZ, which featured manually curated, categorized lists of websites. Website information also spread through traditional channels, including word of mouth, publications, and printed guides that listed URLs for direct access. Before the internet became widespread, platforms like Bulletin Board Systems (BBS) and Usenet forums were instrumental in exchanging files, messages, and recommendations for interesting websites. Without advanced search tools, early internet exploration required knowing precise URLs, often obtained from non-digital sources, in stark contrast to the algorithm-driven search engines that streamline web discovery today.

It wasn’t until Google's introduction in 1998, which revolutionized internet use by indexing web pages and enabling simple search functionality, that the internet became vastly more accessible and user-friendly. This breakthrough allowed people to easily find relevant information, and Google's minimalist search bar and efficient results paved the way for widespread internet adoption, setting a precedent for making complex technologies accessible to the general public.

Currently, the crypto ecosystem is that early internet, presenting non-technical users with the daunting task of navigating chains, wallets, bridges, token derivatives with varying risk profiles, staking, and more. This complexity renders crypto largely inaccessible to the average person, primarily due to the poor user experience that is standard in the space. However, smart agents hold the potential to create a 'Google moment' for crypto. They could enable ordinary people to interact with smart contracts as simply as typing a command into a search bar, significantly simplifying the user interface. This breakthrough could transform the crypto landscape, making it user-friendly and accessible, akin to how Google transformed internet search and usability.

Morpheus

One project advancing smart agents is Morpheus. The Morpheus whitepaper, authored by the pseudonymous trio Morpheus, Trinity, and Neo, was released on September 2nd, 2023 (Keanu Reeves's birthday, a nod to his role in The Matrix). Unlike typical projects, Morpheus operates without a formal team, company, or foundation, embodying a fully decentralized ethos.

The project is architected to advance the creation of a peer-to-peer network of personal, general-purpose AIs that act as smart agents capable of executing smart contracts for individuals. It promotes open-source smart agents and LLMs that enable seamless interactions with users' wallets, decentralized applications, and smart contracts. Within the network there are four key stakeholders:

  1. Coders - The coders in the Morpheus network are the open source developers who create the smart contracts and off-chain components that power Morpheus. They also include developers who build smart agents on top of Morpheus.

  2. Capital Providers - The capital providers in the Morpheus network are the participants who commit their staked ETH (stETH) to the capital pool for use by the network.

  3. Compute Providers - The compute providers provide agnostic compute power, mainly in the form of GPUs.

  4. Community - The community in the Morpheus network consists of participants who create frontends for interacting with the Morpheus network and its smart agents. It also includes anyone who provides tools or does work to bring users into the ecosystem.

The final stakeholder in the ecosystem is the user. The user is any individual or entity requesting inference services from the Morpheus network. To align incentives for accessing inference, the Yellowstone compute model is employed, operating under the following (simplified) structure:

  1. To request an output from an LLM, a user must hold MOR tokens in their wallet.

  2. A user specifies the LLM they want to use and submits their prompt.

  3. An offchain router contract connects the user to a compute provider that is hosting that LLM on their hardware and is providing the cheapest and highest quality response.

  4. The compute provider runs the prompt and returns the answer to the user.

  5. The compute provider is paid MOR tokens for their work done.
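The five steps above can be sketched as a small routing function. This is an illustration under stated assumptions, not the Yellowstone specification: the `Provider` fields, the `min_quality` floor, and the rule "cheapest provider among those meeting the quality floor" are hypothetical ways to combine "cheapest" and "highest quality". The sketch credits the provider without modeling where the MOR payment comes from.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    models: set             # names of the LLMs this provider hosts
    price_mor: float        # hypothetical bid per request, in MOR
    quality: float          # hypothetical 0..1 track-record score
    balance_mor: float = 0.0


def route(providers, model, min_quality=0.8):
    """Pick the cheapest provider hosting `model` that meets the quality floor."""
    eligible = [p for p in providers if model in p.models and p.quality >= min_quality]
    if not eligible:
        raise LookupError(f"no eligible provider for {model}")
    # Price decides first; higher quality breaks ties between equal bids.
    return min(eligible, key=lambda p: (p.price_mor, -p.quality))


def request_inference(user_balance_mor, providers, model, prompt, run_model):
    """Steps 1-5: check MOR holdings, route, run the prompt, pay the provider."""
    if user_balance_mor <= 0:
        raise PermissionError("user must hold MOR to request inference")
    provider = route(providers, model)
    answer = run_model(provider, model, prompt)
    provider.balance_mor += provider.price_mor
    return answer, provider
```

Here `run_model` is a caller-supplied function standing in for the provider's actual inference call.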

Source: Yellowstone compute model

Proof of Personhood

Proof of personhood (PoP), also referred to as the "unique-human problem," represents a constrained version of real-world identity verification. It confirms that a specific registered account is managed by an actual individual who is distinct from the individuals behind all other accounts, and it strives to accomplish this ideally without disclosing the identity of the actual person.

The advent of sophisticated generative AI necessitates the establishment of two key frameworks to enhance fairness, social dynamics, and trustworthiness online:

  1. Implementing a cap on the quantity of accounts each individual can hold, a measure crucial for safeguarding against Sybil attacks, with significant implications for facilitating digital and decentralized governance.

  2. Curbing the proliferation of AI-generated materials that are virtually identical to those produced by humans, in order to prevent the mass spread of deception or disinformation.

Utilizing public key infrastructure and human verification systems, PoP can provide a fundamental rate limit on accounts, preventing Sybil attacks. With a valid human verification system, even if a human or agent tried to create 100 bot accounts, they would need 100 humans to consistently complete the verification. This naturally reduces spam and Sybil attacks.
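The rate-limiting idea can be shown with a toy registry. This is a sketch, assuming some external system has already verified a human and produced a stable credential for them. The `PersonhoodRegistry` class and its `cap` parameter are hypothetical names, not part of any real PoP protocol.

```python
import hashlib


class PersonhoodRegistry:
    """Toy registry: each verified human may back at most `cap` accounts."""

    def __init__(self, cap=1):
        self.cap = cap
        self._accounts_by_human = {}

    def register(self, human_credential: bytes, account: str) -> bool:
        # Only a hash of the credential is kept, not the credential itself.
        key = hashlib.sha256(human_credential).hexdigest()
        accounts = self._accounts_by_human.setdefault(key, set())
        if account in accounts:
            return True          # already registered, nothing to do
        if len(accounts) >= self.cap:
            return False         # this human has used up their allowance
        accounts.add(account)
        return True
```

With `cap=1`, one credential trying to register 100 bot accounts succeeds exactly once, so 100 accounts require 100 distinct verified humans.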

The second, and perhaps more critical, application of Proof of Personhood (PoP) systems lies in their ability to accurately differentiate between content generated by AI and that produced by humans. As highlighted in the data provenance section of our report, Europol has projected that AI-generated content might constitute up to 90% of online information in the coming years, posing a significant risk for the spread of misinformation. A prevalent instance of this issue is the creation and distribution of 'deepfake' videos, where AI is utilized to craft highly convincing footage of individuals. This technology allows creators to promulgate false information while masquerading it as legitimate and real. Essentially, intelligence tests will cease to serve as reliable markers of human origin. PoP endows users with the choice to engage with human-verified accounts, layering on social consensus around verifiability. This is akin to the existing filters on social media that allow users to select what content appears in their feeds. PoP offers a similar filter, but focused on verifying the human source behind content or accounts. It also supports the creation of reputation frameworks that penalize the dissemination of false information, whether AI-crafted or otherwise.

Worldcoin

Source: Worldcoin.org

Sam Altman, known for his role as the CEO of OpenAI, co-founded Worldcoin, a project aimed at providing a unique ledger of human identities through public key infrastructure. The philosophy is as follows: As AI is poised to generate significant prosperity and resources for society, it also presents the risk of displacing or augmenting numerous jobs, on top of blurring the lines between human and bot identities. To address these challenges, Worldcoin introduces two core concepts: a robust proof-of-personhood system to verify human identity and a universal basic income (UBI) for all. Distinctly, Worldcoin utilizes advanced biometric technology, specifically by scanning the iris with a specialized device known as “the Orb”, to confirm individual uniqueness.

As you can see from the graphic, Worldcoin works like this:

  1. Every Worldcoin participant downloads an application onto their mobile device that creates both a private and a public key.

  2. They then go to the physical location of an “Orb”.

  3. The user stares into the camera of the Orb while simultaneously presenting it with a QR code that their Worldcoin application generates, which includes their public key.

  4. The Orb examines the user's eyes with sophisticated scanning hardware and applies machine learning classifiers to ensure that the individual is a genuine human being and that the iris of the individual is unique and has not been recorded as a match to any other user already in the system.

  5. Should both assessments be successful, the Orb signs a message, approving a unique hash derived from the user's iris scan.

  6. The generated hash is then uploaded to a database.

  7. The system does not retain complete iris scans, destroying these images locally. Instead, it stores only the hashes, which are utilized to verify the uniqueness of each user.

  8. The user then receives their World ID.
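The enrollment steps above can be sketched as follows. This is a simplified illustration, not Worldcoin's implementation. The `EnrollmentService` class and its method names are hypothetical, the liveness check is reduced to a boolean passed in by the caller, and uniqueness is tested with an exact SHA-256 match. Two real scans of the same iris would not be bit-identical, so a production system would need a similarity-tolerant comparison rather than an exact hash.

```python
import hashlib


class EnrollmentService:
    """Toy model of steps 4-8: liveness check, uniqueness check, hash storage."""

    def __init__(self):
        self.iris_hashes = set()       # only hashes are kept, never raw scans
        self.registered_keys = set()   # public keys of enrolled users

    def enroll(self, iris_code: bytes, public_key: str, is_live_human: bool) -> bool:
        if not is_live_human:
            return False               # failed the "genuine human" check
        iris_hash = hashlib.sha256(iris_code).hexdigest()
        if iris_hash in self.iris_hashes:
            return False               # this iris is already enrolled
        self.iris_hashes.add(iris_hash)
        self.registered_keys.add(public_key)
        return True                    # the user can now be issued an ID
```

The raw `iris_code` goes out of scope once `enroll` returns, mirroring the step in which images are destroyed and only hashes are stored.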

A holder of a World ID can then demonstrate their uniqueness as a human by creating a Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (ZK-SNARK). This proves they possess the private key that matches a public key listed in the database, without disclosing the specific key they own.

Source: Worldcoin.org

Nirmaan

Nirmaan, with its mining-as-a-service product, is your gateway to decentralized AI networks.

Nirmaan democratizes access to participation in AI networks. Nirmaan's delegation service abstracts away the complexity of choosing GPU providers and AI networks. Users can simply buy and stake NRM tokens to access the rewards earned by the mining service. By supplying GPU compute power to networks like Bittensor, Ritual, and Morpheus, Nirmaan can acquire tokens at a cost of 60-70% of the current market price, and as low as 5-10% for emerging networks. Nirmaan will use a risk-managed approach to maximize earnings by allocating resources to both established and emerging networks.
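To make the cost figures concrete, the implied margin over cost can be computed directly. This is simple arithmetic on the percentages quoted above and rests on a strong assumption: that tokens can be valued at the market price at the moment they are acquired. It ignores price movement, liquidity, and any costs not already included in the quoted figure. The `gross_margin` helper is a hypothetical name.

```python
def gross_margin(cost_fraction: float) -> float:
    """Margin over cost when a token is acquired at `cost_fraction` of market price.

    Example: acquiring at 0.5 of market price means each unit of cost
    yields tokens worth 2 units, a margin of 1.0 (100%).
    """
    if not 0 < cost_fraction <= 1:
        raise ValueError("cost_fraction must be in (0, 1]")
    return 1 / cost_fraction - 1


# Cost fractions quoted in the text: established and emerging networks.
margins = {f: gross_margin(f) for f in (0.70, 0.60, 0.10, 0.05)}
```

Under this assumption, a cost of 60-70% of market price corresponds to a margin of roughly 43% to 67% over cost, while 5-10% corresponds to roughly 9x to 19x.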

Our team is composed of experts with a rich background in machine learning and quantitative trading. This includes achievements such as producing highly regarded research papers, founding successful ML startups & a deep reinforcement learning hedge fund. We are in the process of developing an optimization algorithm, alongside the deployment of skilled devops personnel, to seamlessly allocate compute across multiple networks. This approach aims to maximize rewards for our token holders. With proven experience in deploying GPUs and validators across 16 networks, including but not limited to Bittensor, Akash, Heurist, and IO.Net, our team brings invaluable expertise to the table.

Twitter:

Reach out to us on Telegram: @RMSYx0

Email: hello@nirmaan.ai
