Data Tokenization
February 25th, 2025

In the digital age, ensuring the provenance and integrity of critical text data is paramount, and currently a challenge for every industry. Xyxyx is building innovative solutions that harness blockchain technology to transform how we manage and protect critical text data records entirely on-chain, unlocking new frontiers across sectors, from AI and finance to legal and healthcare.

Through Xyxyx API and ERC-721F, Xyxyx is pioneering critical advancements in data tokenization. In this blog post, we introduce an overview of Tokenized Datafile, Tokenized Dataset, and Tokenized Database — three novel concepts coined by Xyxyx. We also propose use case examples that incorporate the application of these concepts.

Xyxyx API, ERC-721F, and Text-Based Tokens

To begin outlining the novel concepts of Tokenized Datafile, Tokenized Dataset, and Tokenized Database, it is essential to first shed light on their foundational building blocks: the Xyxyx API, the ERC-721F protocol, and Text-Based Tokens.

  • Xyxyx API is an interface that enables the integration of text-based tokenization into any existing system.

  • ERC-721F is a variant of the ERC-721 standard, developed by Xyxyx, that enables the individualized issuance of text-based tokens. ERC-721F contracts offer extensive customization options for smart contract authority, including infinite or finite supply, payable or non-payable minting, public or onlyOwner minting, and more.

    At its core, ERC-721F introduces an innovative externally callable mint function, svgString. This function enables individualized token issuance, allowing the content of each mint to be defined externally by the minter. As a result, each token's tokenURI is set at the time of minting, providing a highly flexible and dynamic approach to token creation.

  • Text-Based Tokens serve as the foundation for tokenizing text data into blockchain-based tokens. Xyxyx introduces two tokenization models built on text-based token technology: 1x1 (designed for short text records) and A4 (designed for longer text records).
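As a rough illustration of the two models, here is a minimal Python sketch that routes a text record to one model or the other by length. The 280-character cutoff is purely an assumption for illustration; the actual thresholds are defined by the 1x1 and A4 specifications.

```python
# Hypothetical router between the two Xyxyx tokenization models.
# The short_limit value is an assumption, not part of the 1x1/A4 specs.
def choose_model(text: str, short_limit: int = 280) -> str:
    """Return the tokenization model suited to a text record's length."""
    return "1x1" if len(text) <= short_limit else "A4"
```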

Tokenized Datafile, Tokenized Dataset, Tokenized Database

“A datafile is a computer file that stores data, a dataset is a collection of data, and a database is a system that stores and manages data”.

Today, the effective storage and management of text data have become pivotal to every industry. At the core of this data-centric paradigm lie three fundamental, traditional concepts: a datafile, a computer file that stores data; a dataset, a collection of data; and a database, a system designed to store and manage data.

Building upon this notion, Xyxyx introduces innovative concepts that harness the power of blockchain technology to redefine critical text data integrity, provenance, and management: Tokenized Datafile, Tokenized Dataset, and Tokenized Database.

Tokenized Datafile

We define a Tokenized Datafile as a unique text-based token issued by an ERC-721F contract. Each Tokenized Datafile represents a distinct piece of critical data related to a specific task. These text-based tokens ensure that each data entry is immutable and verifiable.

Tokenized Dataset

We define a Tokenized Dataset as an ERC-721F contract that issues multiple Tokenized Datafiles for a specific task. In other words, a Tokenized Dataset is a collection of these unique text-based tokens (Tokenized Datafiles), each representing a distinct piece of critical data. The aggregate of Tokenized Datafiles within the ERC-721F contract forms a comprehensive and immutable Tokenized Dataset.

Tokenized Database

We define a Tokenized Database as an aggregate of multiple Tokenized Datasets. These multiple Tokenized Datasets (i.e., ERC-721F contracts) are all deployed by the same on-chain address (i.e., the Tokenized Database Unique Identifier). The result is a comprehensive Tokenized Database, where each Tokenized Dataset and its related Tokenized Datafiles contribute to a larger, cohesive whole.

Tokenized Database Unique Identifier

Fundamentally, we define a Tokenized Database Unique Identifier as a public key (e.g., an EOA address) created to serve exclusively as a Tokenized Database. Such a public key is strictly dedicated to deploying the ERC-721F contracts that form the Tokenized Database, and executes no unrelated interactions. This provides a robust tokenized database architecture, ensuring accurate data retrieval for the public key serving as the Tokenized Database.

In other words, while a Tokenized Datafile is identified via tokenID and a Tokenized Dataset is identified via contract address, a Tokenized Database is identified via a public key, called the Tokenized Database Unique Identifier.
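The three-level identification scheme above can be sketched with a minimal in-memory model. This is purely illustrative Python; class and field names are our own, not part of ERC-721F, and a dict stands in for on-chain state.

```python
from dataclasses import dataclass, field

# Illustrative model of the three identification levels:
# datafile -> tokenID, dataset -> contract address, database -> public key.

@dataclass
class TokenizedDatafile:
    token_id: int   # identifies the datafile within its dataset
    data: str       # the critical text record

@dataclass
class TokenizedDataset:
    contract_address: str  # identifies the dataset (ERC-721F contract)
    datafiles: dict = field(default_factory=dict)

    def issue(self, token_id: int, data: str) -> TokenizedDatafile:
        """Mint a text-based token (Tokenized Datafile) under this contract."""
        token = TokenizedDatafile(token_id, data)
        self.datafiles[token_id] = token
        return token

@dataclass
class TokenizedDatabase:
    unique_identifier: str  # deployer public key (e.g., an EOA address)
    datasets: dict = field(default_factory=dict)

    def deploy(self, contract_address: str) -> TokenizedDataset:
        """Deploy an ERC-721F contract (Tokenized Dataset) from this key."""
        dataset = TokenizedDataset(contract_address)
        self.datasets[contract_address] = dataset
        return dataset
```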

Use Cases

Below, we present three comprehensive use case examples that embody the concept of tokenizing datafiles, datasets, and databases using the Xyxyx API, the ERC-721F protocol, and text-based tokens.

Example 1

In this example, we simulate a hospital that tokenizes interactions and health records of its patients.

The entity behind the hospital creates a public key to use exclusively as a Tokenized Database Unique Identifier for deploying ERC-721F contracts for this task.

Each patient is represented by an ERC-721F contract. This contract records and stores all the critical text data related to the interactions between the hospital and the patient.

For each patient interaction, the hospital issues a text-based token with all the critical data under the patient’s ERC-721F contract, forming a comprehensive and transparent record of all medical activities performed for each patient by the hospital.
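The issuance flow above can be simulated with a short sketch, where an in-memory dict stands in for a patient's on-chain contract state. The function name, contract structure, and record format are all hypothetical.

```python
# Illustrative simulation of the per-interaction issuance flow:
# one ERC-721F contract per patient, one text-based token per interaction.
def issue_interaction(contract: dict, record: str) -> int:
    """Mint the next token under a patient's contract; return its token ID."""
    token_id = len(contract["tokens"]) + 1
    contract["tokens"][token_id] = record
    return token_id

contract_bob = {"name": "Contract_Bob", "tokens": {}}
issue_interaction(contract_bob, "2025-02-20: admission, treatment plan recorded")
issue_interaction(contract_bob, "2025-02-21: prescription issued, dosage recorded")
```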

Example Setup

Tokenized Database Unique Identifier: 0x5678...ABCD

Patient A: Bob, Contract Name: Contract_Bob, CA: 0x3210…AAA, Interactions Count: 32, Deployer: 0x5678...ABCD

Patient B: Alice, Contract Name: Contract_Alice, CA: 0x0101…BBB, Interactions Count: 9, Deployer: 0x5678...ABCD

Patient C: John, Contract Name: Contract_John, CA: 0x2131…CCC, Interactions Count: 83, Deployer: 0x5678...ABCD

A theoretical data directory endpoint example for Example 1 ([database]/[dataset]/[datafile]) would be:

0x5678...ABCD/Contract_Bob/#29
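A sketch of parsing that endpoint format follows. The format itself is theoretical, per the post; treating the `#` prefix on the token ID as optional is our assumption.

```python
# Hypothetical parser for the [database]/[dataset]/[datafile] endpoint format.
def parse_endpoint(path: str) -> dict:
    database, dataset, datafile = path.split("/")
    return {
        "database": database,              # Tokenized Database Unique Identifier
        "dataset": dataset,                # ERC-721F contract name or address
        "datafile": datafile.lstrip("#"),  # token ID within the contract
    }
```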

Benefits

Bob, a patient at the hospital, unfortunately passed away after receiving treatment for a chronic illness. His family suspected that his death was caused by a medical error involving the prescription and dosage of drugs administered during his stay in the hospital. Bob’s family initiated a legal process against the hospital, alleging negligence and malpractice. The hospital's innovative system of tokenized interactions and health records became a crucial element in the case. Each interaction between the hospital and Bob was recorded and stored under Bob's ERC-721F contract, providing a transparent and immutable record of all medical activities performed during his time at the hospital.

During the trial, the legal team accessed the EVM to retrieve the comprehensive and detailed records stored under Bob's ERC-721F contract. The tokens included timestamps, prescriptions, dosages, and notes from every interaction between Bob and the hospital. By analyzing this data, experts were able to trace the sequence of events leading up to Bob's death and identify any discrepancies or potential errors. The transparency and integrity of the tokenized records allowed for an objective examination of the hospital's actions, ultimately helping to resolve the legal dispute and determine accountability in Bob's tragic case.

Example 2

In this example, we simulate the development process of an AI model that is tokenized end-to-end using ERC-721F contracts.

The entity building and maintaining the AI model creates a public key to be used exclusively as the Tokenized Database Unique Identifier for the AI model.

This public key will deploy 8 ERC-721F contracts, each dedicated to a specific part of the AI model's development process.

For each part of the process, the entity building the AI model issues text-based tokens under the respective ERC-721F contract.

Example Setup

Tokenized Database Unique Identifier: 0x1234...ABCD

Contract 1: Data Collection, Contract Name: DataCollection, Task: Text-based tokens capturing data sources, data samples, and collection methodologies. Deployer: 0x1234...ABCD

Contract 2: Data Preprocessing, Contract Name: DataPreprocessing, Task: Text-based tokens detailing data cleaning, normalization, and augmentation techniques. Deployer: 0x1234...ABCD

Contract 3: Model Training, Contract Name: ModelTraining, Task: Text-based tokens capturing training configurations, hyperparameters, and training logs. Deployer: 0x1234...ABCD

Contract 4: Model Evaluation, Contract Name: ModelEvaluation, Task: Text-based tokens recording evaluation metrics, validation results, and benchmark scores. Deployer: 0x1234...ABCD

Contract 5: Model Tuning, Contract Name: ModelTuning, Task: Text-based tokens detailing hyperparameter tuning, optimization steps, and tuning outcomes. Deployer: 0x1234...ABCD

Contract 6: Model Deployment, Contract Name: ModelDeployment, Task: Text-based tokens capturing deployment configurations, deployment environments, and deployment logs. Deployer: 0x1234...ABCD

Contract 7: Model Monitoring, Contract Name: ModelMonitoring, Task: Text-based tokens detailing monitoring metrics, performance logs, and issue tracking. Deployer: 0x1234...ABCD

Contract 8: Model Maintenance, Contract Name: ModelMaintenance, Task: Text-based tokens capturing maintenance schedules, update logs, and maintenance outcomes. Deployer: 0x1234...ABCD

Benefits

Data Auditability

Example 2 proposes unparalleled data auditability and transparency in the development of AI models. By issuing text-based tokens under respective ERC-721F contracts for each part of the development process, every data source, preprocessing step, training configuration, evaluation metric, and so on is recorded on the blockchain. These transparent and immutable critical records compel AI developers to adhere to ethical principles, ensuring that every decision and action taken during the AI model's development can be independently verified and audited. This fosters trust and accountability in AI development, promoting ethical AI practices.

Data Accountability

The tokenized records provide a robust legal structure by creating a comprehensive audit trail of the AI model's development process. Each text-based token captures critical information, including the rationale behind decisions, configurations, and outcomes at each stage of development. This audit trail serves as a legally verifiable record that can be used to resolve disputes, demonstrate compliance with regulations, and hold developers accountable for their actions. The transparency and immutability of these records ensure that the underlying thinking and decision-making processes of the AI are preserved, offering a clear and defensible account of the AI's behavior and development.

Example 3

In this example, we simulate an automaker company that records critical data for each vehicle produced.

The automaker company creates a public key to use exclusively as a Tokenized Database Unique Identifier for deploying ERC-721F contracts for this task.

Each car produced by the automaker is represented by an individual ERC-721F contract.

Each ERC-721F contract issues a text-based token every 24 hours, capturing the current critical data of the car (e.g., mileage, fuel levels, service status).
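The 24-hour snapshot cadence can be sketched as follows. The field names (mileage, fuel, service status) follow the example; the snapshot encoding and the gating logic are assumptions for illustration.

```python
import datetime as dt
from typing import Optional

# Sketch of the daily snapshot cadence: mint a new text-based token only
# if at least 24 hours have passed since the last snapshot for this car.
def maybe_snapshot(last_mint: dt.datetime, now: dt.datetime,
                   mileage: int, fuel_pct: int, service: str) -> Optional[str]:
    if now - last_mint < dt.timedelta(hours=24):
        return None  # still within the 24-hour window; no new token
    # Hypothetical text payload for the next token under the car's contract.
    return f"mileage={mileage};fuel={fuel_pct}%;service={service}"
```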

Example Setup

Tokenized Database Unique Identifier: 0x0001...ABCD

Car 1: VIN 1234XYZ, Contract Name: Contract_1234XYZ, Deployer: 0x0001...ABCD

Car 2: VIN 5678ABC, Contract Name: Contract_5678ABC, Deployer: 0x0001...ABCD

Car 3: VIN 9101DEF, Contract Name: Contract_9101DEF, Deployer: 0x0001...ABCD

Benefits

Example 3 proposes a robust, immutable, and transparent audit trail of each vehicle produced by the automaker. By issuing a text-based token every 24 hours, and capturing critical data such as mileage, fuel levels, and service status, each ERC-721F contract ensures that all critical data of a specific vehicle are securely recorded on the blockchain.

These tamper-proof data records are invaluable when the vehicle is sold on the secondary market. Prospective buyers can easily access the vehicle's entire history, confident that the data is accurate and unaltered. The transparency and auditability of these critical records foster trust and accountability, making it easier for buyers to make informed decisions. Additionally, the accessibility of the blockchain ensures that this critical information is readily available, enhancing the overall efficiency and integrity of the vehicle resale process.

Annotations

  • As mentioned earlier in this post, ERC-721F enables infinite token supply. This implies that each ERC-721F contract, when employed as a Tokenized Dataset, has no storage limit. Considering that the EVM imposes a size limit of 24KB per text-based token, every new token issued adds 24KB of storage capacity to the ERC-721F contract. As a result, each ERC-721F contract can expand its storage capacity indefinitely.

  • One of the core fundamentals of text-based tokens is Portability. Text-based tokens are designed to be both human-readable (through token visual output) and system-readable (via token metadata). This dual capability ensures that text-based tokens can seamlessly integrate into any existing system, whether on-blockchain or off-blockchain, while providing seamless accessibility for users through SVG-based token outputs. Read about Portability here.

  • This blog post proposed simple example use cases with a straightforward architecture to aid in comprehending the novel concepts of Tokenized Datafile, Tokenized Dataset, and Tokenized Database. Despite the simplicity of these examples, the range of potential use cases is endless. By leveraging the Xyxyx API, the ERC-721F protocol, and text-based tokens — which feature two distinct tokenization models, 1x1 and A4 — developers can build robust solutions for critical text data provenance, integrity, and management across every industry, from AI and finance to legal sectors.

  • The three use case examples presented in this blog post are envisioned to run upon the ERC-721F protocol, with the Xyxyx API underneath.
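As a back-of-the-envelope sketch of the storage model described in the annotations, the following assumes each token contributes exactly its 24KB maximum; function and constant names are our own.

```python
# Each newly issued text-based token adds up to 24KB of on-chain text
# capacity to its ERC-721F contract, per the annotation above.
TOKEN_SIZE_LIMIT_KB = 24  # size limit per text-based token, per the post

def storage_capacity_kb(tokens_issued: int) -> int:
    """Upper-bound text capacity of an ERC-721F contract after N mints."""
    return tokens_issued * TOKEN_SIZE_LIMIT_KB
```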

Verification
This entry has been permanently stored onchain and signed by its creator.