diff --git a/HIP/hip-1009.md b/HIP/hip-1009.md new file mode 100644 index 000000000..d8dcd9c03 --- /dev/null +++ b/HIP/hip-1009.md @@ -0,0 +1,164 @@ +--- +hip: 1009 +title: Native Schema Registry Service +author: Michael Bennett , Justin Atwell +working-group: Viv Diwakar , Michael Bennett , Ty (Patches) , Justin Atwell <@justin-atwell>, Bram Bosch , Cyndy Montgomery , May Chan +requested-by: Swirlds Labs +type: Standards Track +category: Service +needs-council-approval: Yes +status: Review +created: 2024-07-11 +discussions-to: https://github.com/hashgraph/hedera-improvement-proposal/pull/1009 +updated: 2024-07-29 +--- + +## Abstract + +This document outlines the development of a Web3 Schema Registry. It addresses the need for many data items stored or shared on-chain or around DLT technologies to be used programmatically across a number of business/industry verticals and aims to improve the reuse and accessibility of data stored in HCS/IPFS and other locations. In many scenarios where data is shared (Supply Chain, Sustainability, NFT's, RWA's), knowing details about that data ahead of time allows the user to interact with the data in exciting ways. A Schema Registry allows creators to share data contracts effortlessly whether it's providing Models for AI, describing NFT metadata, or adding structure to HCS messages. + +## Motivation + +**Unstructured / Unvalidated Messages on HCS :** Many messages stored on HCS are data elements of a supply chain or other similar use case (e.g. Sustainability, Defi, NFTs, AI). There is a need for many of these messages to be used in different systems programmatically. + +**Token Metadata:** In many scenarios, it would be beneficial for Token metadata that is stored on-chain or near-chain in IPFS, etc to be also read programmatically in many use cases. From NFT collections to semi-fungible tokens alongside issued tokens to Sustainability platforms and RWAs. + +**Cross-chain data interchange:** We are seeing a rise in systems that need to operate cross-chain or have elements shared/visible beyond their own network. The ability to have this data validated or parseable by many systems is becoming a strict requirement. + +**The ability to have a single source of truth of an Objects Structure:** Having an on-chain location to have immutable data definitions or schemas for objects ensures more transparency. Being able to programmatically exchange data brings our ecosystem more utility across many web2 and web3 use cases. All without requiring the users to provision private infrastructure. + +**Benefits:** + +* Provides a decentralized location to manage and validate schemas for Tokens and Topics. +* Allows Sustainability assets to be Auditable, Data Discoverable and Liquid. +* Enables new products and business models on Hedera. +* Enhances the utility of HCS and HTS within Hedera. +* Brings the look and feel of big data to Hedera. + +## Feature Summary + +* A native Schema Service that allows these schemas to be stored on-chain and versioned in an easily accessible manner by any interested party. + +* Ability to Create, Read, Update, and Delete (soft) Schemas to enable other use cases for users to interact with. + +* Integration of the native schema service with HTS. This integration will give more added value and utility to these services and the data they hold/create. + +## Example +Create Schema + +Given the following example Avro Schema +``` +{ + "type": "record", + "name": "UltraSchema", + "namespace": "com.UltraSchema.avro", + "fields": [ + { + "name": "firstName", + "type": "string" + }, + { + "name": "lastName", + "type": "string" + } + ] +} +``` + +``` + Schema testSchema = new Schema.Parser().parse(new File("UltraSchema.avsc")); + Gson gson = new Gson(); + + SchemaCreateTransaction transaction = SchemaCreateTransaction.builder() + .schema(testSchema) + .adminKey(OPERATOR_KEY) + .build(); + + TransactionResponse txResponse = transaction.execute(client); + TransactionReceipt receipt = txResponse.getReceipt(client); + SchemaId newSchemaId = receipt.schemaId; +``` + +Use the SchemaId for creating an NFT + +``` +TokenCreateTransaction transaction = new TokenCreateTransaction() + .setTokenName("Your Token Name") + .setTokenSymbol("F") + .setSchemaId(newSchemaId) + .setInitialSupply(5000) + .setMetadataKey(OPERATOR_KEY) + .setTokenMetadata(metadata.getBytes()); +``` + +Use the SchemaId for creating a Topic + +``` +TopicCreateTransaction transaction = new TopicCreateTransaction() + .setSchemaId(newSchemaId) + .setAdminKey(OPERATOR_KEY) + .setSubmitKey(OPERATOR_KEY); + +TransactionResponse txResponse = transaction.execute(client); +TransactionReceipt receipt = txResponse.getReceipt(client); +TopicId newTopicId = receipt.topicId; +``` + +REST API + +GET `/api/v1/schemas` + +Response: + +``` +{ + "Schemas": [ + { + "UltraSchema": { + "type": "record", + "name": "UltraSchema", + "namespace": "com.UltraSchema.avro", + "fields": [ + { + "name": "firstName", + "type": "string" + }, + { + "name": "lastName", + "type": "string" + } + ] + } + } + ] +} +``` + +## User Stories + +* As a Schema Creator, I want to publish a schema for shipment data (e.g. shipmentId, origin, status) to ensure all partners can validate and use this data seamlessly. + +* As a Schema User of HCS, I want to be able to utilize schemas to ensure the data my applications publish on a chain or elsewhere are correct and reusable for other systems and users internally and externally to my organization. + +* As a Schema User - HTS, I want to define and update an NFT schema such as the 412 Community Standard or the Music Metadata schema with attributes (e.g. tokenId, creator, attributes) so all applications can correctly interpret and display this metadata. + +* As a Schema User - Cross Chain, I want to be able to utilize schemas to ensure the data associated with my tokens or applications that may involve cross-chain communications or interchange can be used in both on and off-chain systems for any number of use cases, like auditing, compliance and RWA issuance. + +* As a Schema User - Industry Vertical, I want to be able to utilize schemas to ensure the data associated with externally sourced data is easily integrated into my internal systems or data pipelines in an easy programmatic fashion. This will allow me to utilize the plethora of on-chain data in verticals like RWA, Sustainability, and supply chain. + +## Backwards Compatibility + +The change is backwards compatible as Schemas are optional. If no SchemaId is present in the above transactions, there will be no impact to existing network entities. + +## Security Implications + +Schemas will bloat the mirror network over time and will require additional work down the line. Some potential solutions for this might include large file stores for historical data or Elastic Search with keys. + +## Reference implementation + + + +## References + +## Copyright +This document is licensed under the Apache License, Version 2.0 -- see [LICENSE](../LICENSE) or (https://www.apache.org/licenses/LICENSE-2.0)