Efficient binary RDF format #18

Closed
6 tasks done
aaronc opened this issue Mar 27, 2019 · 0 comments
aaronc commented Mar 27, 2019

To store RDF data on-chain, it makes sense to have an efficient binary format that is tied to the schema module (which defines global schemas for RDF data). This format will:

  • enable efficient verification that the data conforms to the global RDF schema
  • enable efficient verification of the graph hash
  • save storage space on-chain

Should:

  • implement the format only for string node names and for data properties that have been registered in the schema (referencing their PropertyID from the schema module, see Property schemas #17)
  • the serializer should write out nodes and properties in normalized form (i.e. alphabetical, no blank nodes), return a "normalized" graph instance to the caller, and return the graph hash
  • the deserializer should verify that the graph has been serialized in normalized form and return the computed graph hash
  • write thorough tests, including generative tests
  • write thorough docs including grammar of format
  • add CHANGELOG entry
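
To make the normalization requirement concrete, here is a minimal sketch, not the module's actual implementation. It assumes a graph is a plain dict mapping node names to dicts of integer property IDs, that "alphabetical" means ascending order of node names and property IDs, and that the graph hash is a SHA-256 over a canonical text form. The issue does not specify the hash construction, so `graph_hash`, `normalize`, and `is_normalized` are hypothetical names chosen for illustration.

```python
import hashlib


def normalize(graph):
    """Return a copy with nodes sorted by name and properties sorted by ID.

    Assumption: graph is {node_name: {property_id: value}}.
    """
    return {name: dict(sorted(graph[name].items())) for name in sorted(graph)}


def is_normalized(node_names):
    """True when the names are strictly ascending (sorted, no duplicates).

    A deserializer could run this check on names in the order it reads them.
    """
    return all(a < b for a, b in zip(node_names, node_names[1:]))


def graph_hash(graph):
    """Placeholder hash: SHA-256 over a canonical text form of the normalized graph.

    Assumption: the real graph hash may be defined differently.
    """
    normalized = normalize(graph)
    canonical = repr([(name, list(props.items())) for name, props in normalized.items()])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because both functions go through `normalize`, two graphs that differ only in insertion order produce the same hash, which is the property the serializer and deserializer tasks rely on.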

DEV NOTES:
The grammar for the data should be roughly as follows:

File = FileVersion Node*
FileVersion = <varint encoding of file format version>
Node = NodeID Property*
NodeID = 0x0 <node-name-string>
Property = PropertyID PropertyValue
PropertyID = 0x0 <integer property id from schema module>
PropertyValue = <binary encoding of property value based on schema type>

  • a special "un-named" root node is allowed in every graph
  • the classes for a node (currently unsupported) should be serialized at the start of every node
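
The rough grammar above can be sketched as an encoder. This is an illustration under several assumptions that the issue does not settle: varints are unsigned base-128 (LEB128-style), strings are UTF-8 with a varint length prefix, and property values are limited to strings and non-negative integers. The prefix bytes are kept as named constants with the value 0x0 exactly as written in the grammar. A decoder would still need some way to tell a NodeID from a PropertyID, so the sketch only encodes. All function names are hypothetical.

```python
NODE_PREFIX = 0x00      # prefix byte for NodeID, as written in the grammar
PROPERTY_PREFIX = 0x00  # prefix byte for PropertyID, as written in the grammar


def encode_uvarint(n):
    """Unsigned base-128 varint: 7 payload bits per byte, high bit means 'more'."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s):
    """Assumed string encoding: varint byte length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_uvarint(len(data)) + data


def encode_value(value):
    """Assumed value encoding for two illustrative schema types only."""
    if isinstance(value, str):
        return encode_string(value)
    if isinstance(value, int) and not isinstance(value, bool):
        return encode_uvarint(value)
    raise TypeError("unsupported property value type in this sketch")


def encode_file(version, graph):
    """File = FileVersion Node*, with nodes and properties in ascending order.

    Assumption: graph is {node_name: {property_id: value}}.
    """
    out = bytearray(encode_uvarint(version))
    for name in sorted(graph):
        out.append(NODE_PREFIX)
        out += encode_string(name)
        for prop_id in sorted(graph[name]):
            out.append(PROPERTY_PREFIX)
            out += encode_uvarint(prop_id)
            out += encode_value(graph[name][prop_id])
    return bytes(out)
```

For example, `encode_file(1, {"a": {2: "x"}})` yields the version byte, a node prefix, the length-prefixed name, a property prefix, the property ID, and the length-prefixed value, in that order. Sorting inside `encode_file` means the output does not depend on the insertion order of the input dict.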
@aaronc aaronc added this to the 0.4 milestone Mar 27, 2019
@aaronc aaronc added the backlog label Mar 27, 2019