
add simple JSON pointer parsing to prevent data duplication #1681

Closed
patrickkettner wants to merge 1 commit

Conversation

@patrickkettner (Contributor) commented Mar 24, 2018

fixes #813

The idea is: given the following

file a.json

```json
{
  "a": {
    "b": {
      "c": "hello!"
    }
  }
}
```

and file b.json

```json
{
  "a": {
    "$ref": "a.b.c"
  }
}
```

then b.json would result in

```json
{
  "a": "hello!"
}
```
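
For illustration, here is a minimal sketch of how such dot-path `$ref` strings could be resolved against the fully merged tree. This is an assumption about the mechanism, not the PR's actual code; `resolveRefs` is a hypothetical name.

```js
// Hypothetical sketch: resolve `$ref` dot-paths against the merged tree.
function resolveRefs(node, root) {
  if (!node || typeof node !== 'object') return node;
  if (typeof node.$ref === 'string') {
    // Follow the dot-separated path from the root, e.g. "a.b.c".
    return node.$ref.split('.').reduce((cur, key) => cur && cur[key], root);
  }
  for (const key of Object.keys(node)) {
    node[key] = resolveRefs(node[key], root);
  }
  return node;
}

const root = { a: { b: { c: 'hello!' } } };
console.log(resolveRefs({ a: { $ref: 'a.b.c' } }, root)); // { a: 'hello!' }
```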

@Elchi3 Elchi3 added the infra 🏗️ Infrastructure issues (npm, GitHub Actions, releases) of this project label Apr 4, 2018
@patrickkettner (Contributor, Author)

@ExE-Boss got an opinion?

@ExE-Boss (Contributor) commented Apr 7, 2018

This is mainly used for mixins/interfaces, which are implemented by other things, and in many cases the version in which a mixin was implemented differs from the versions in which the features that are now part of that mixin were originally implemented on the objects that now implement it.

Based on that, I believe that the way I described in #813 (comment) would allow us to support the cases where certain objects implemented the mixins at different times.

@patrickkettner (Contributor, Author)

@ExE-Boss sure. updated

@Elchi3 (Member) commented Apr 17, 2018

This is a very cool idea. Thank you for this PR!
I will need to test this some more before merging. Also, I'm unsure whether this slows down data access in any way; there are now over 7000 features in over 1000 files in this repo.
Tests and docs for this would be cool, too.

@ExE-Boss (Contributor)

This happens during loading for EVERY FILE, so it ends up being O(n²), and it happens when the tree hasn't been fully parsed (e.g. we might have loaded api.* and css.*, but not yet html.* or webextensions.*), which means that we might get weird stuff happening when a file in api.* imports something from webextensions.*.

@ExE-Boss (Contributor) left a review comment

Also, this violates the current schema: within an identifier, a $ref key would match patternProperties and would therefore itself have to be an identifier, which can't be a string:

"identifier": {
"type": "object",
"properties": {
"__compat": { "$ref": "#/definitions/compat_statement" }
},
"patternProperties":{
"^(?!__compat)[a-zA-Z_0-9-$@]*$" : { "$ref": "#/definitions/identifier" }
},
"additionalProperties": false
},
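
To make the violation concrete, here is a hedged sketch using the ajv validator against a simplified version of the identifier definition (the __compat schema is abbreviated to a plain object; this is not the repo's actual test setup):

```js
const Ajv = require('ajv');
const ajv = new Ajv();

const schema = {
  definitions: {
    identifier: {
      type: 'object',
      properties: {
        __compat: { type: 'object' } // simplified stand-in for compat_statement
      },
      patternProperties: {
        '^(?!__compat)[a-zA-Z_0-9-$@]*$': { $ref: '#/definitions/identifier' }
      },
      additionalProperties: false
    }
  },
  $ref: '#/definitions/identifier'
};

// "$ref" matches the pattern, so its value must itself be an identifier
// (an object); a plain string therefore fails validation.
console.log(ajv.validate(schema, { a: { $ref: 'a.b.c' } })); // false
```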

Doing the following instead would be valid according to the current schema, since "<ref_to_import>" would be an empty object if it’s identical to the source:

"__import": {
  "<ref_to_import>": {
    // Override stuff here when necessary.
  }
}

I'm currently working on implementing the approach described above.
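
For a sense of what resolving such a block could look like, a rough sketch follows; getPath, merge, and expandImports are hypothetical helper names for illustration, not ExE-Boss's actual implementation:

```js
// Hypothetical expansion of `__import` blocks: copy the referenced
// subtree, then deep-merge any local overrides on top of it.
function getPath(tree, ref) {
  return ref.split('.').reduce((node, key) => node && node[key], tree);
}

function merge(base, overrides) {
  const out = { ...base };
  for (const [key, value] of Object.entries(overrides)) {
    out[key] =
      value && typeof value === 'object' && !Array.isArray(value)
        ? merge(out[key] || {}, value)
        : value;
  }
  return out;
}

function expandImports(node, root) {
  if (!node || typeof node !== 'object' || Array.isArray(node)) return node;
  let result = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === '__import') {
      // Each key under __import names the subtree to copy in; its value
      // holds the overrides (an empty object means "import unchanged").
      for (const [ref, overrides] of Object.entries(value)) {
        result = merge(result, merge(getPath(root, ref) || {}, overrides));
      }
    } else {
      result[key] = expandImports(value, root);
    }
  }
  return result;
}
```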

@patrickkettner (Contributor, Author) commented Apr 17, 2018

> Also, I'm unsure whether this slows down data access in any way?

@Elchi3 the object parse takes a couple of ms on my laptop

> it happens when the tree hasn't been fully parsed (e.g. we might have loaded api.* and css.*, but not yet html.* or webextensions.*),

@ExE-Boss how so? It happens after the result object is completely extended out

@ExE-Boss (Contributor)

> > it happens when the tree hasn't been fully parsed (e.g. we might have loaded api.* and css.*, but not yet html.* or webextensions.*),
>
> how so? It happens after the result object is completely extended out

Except that the function load calls itself recursively:

```js
function load() {
  // Recursively load one or more directories passed as arguments.
  var dir, result = {};
  function processFilename(fn) {
    let fp = path.join(dir, fn);
    let extra;
    // If the given filename is a directory, recursively load it.
    if (fs.statSync(fp).isDirectory()) {
      extra = load(fp);
    } else if (path.extname(fp) === '.json') {
      extra = require(fp);
    }
    // The JSON data is independent of the actual file
    // hierarchy, so it is essential to extend "deeply".
    result = extend(true, result, extra);
  }
  for (dir of arguments) {
    dir = path.resolve(__dirname, dir);
    fs.readdirSync(dir).forEach(processFilename);
  }
  return processRefs(result, result);
}
```
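
One possible way out, sketched here as an assumption rather than anything proposed in this PR, is to separate the recursive directory walk from ref resolution so that processRefs runs exactly once, on the fully merged tree:

```js
const fs = require('fs');
const path = require('path');
const extend = require('extend');
// processRefs is the PR's resolver and is assumed to exist in this module.

function loadTree(...dirs) {
  // Recursively deep-merge every .json file under the given directories,
  // without touching refs yet.
  let result = {};
  for (let dir of dirs) {
    dir = path.resolve(__dirname, dir);
    for (const fn of fs.readdirSync(dir)) {
      const fp = path.join(dir, fn);
      let extra;
      if (fs.statSync(fp).isDirectory()) {
        extra = loadTree(fp);
      } else if (path.extname(fp) === '.json') {
        extra = require(fp);
      }
      result = extend(true, result, extra);
    }
  }
  return result;
}

function load(...dirs) {
  const result = loadTree(...dirs);
  // Resolve refs once, after the whole tree has been merged.
  return processRefs(result, result);
}
```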

@patrickkettner (Contributor, Author) commented Apr 17, 2018 via email

@patrickkettner (Contributor, Author)

ping @Elchi3

@Elchi3 (Member) commented May 3, 2018

Sorry, reviewing this and the other approach isn't among my priorities at the moment.
This PR at least needs a schema update and a test in the test/sample-data.json file so that we can run a basic test against the implementation. But then again, I'm currently focusing on getting the PR backlog down and finishing the data migration so we can retire the old static MDN compat tables. Thanks for your understanding.

@patrickkettner (Contributor, Author) commented May 3, 2018 via email

@Elchi3 Elchi3 added the not ready ⛔ This is not yet ready to be merged. It's pending a decision, other PR, or a prerequisite action. label Feb 4, 2019
@Elchi3 (Member) commented Apr 10, 2019

Bulk data updates through external sources and scripts are our preferred approach to better compat data right now and we're not currently considering de-duplicating version data. Therefore I'm closing this issue. Thanks for your thoughts here! Maybe we will reconsider this idea at a later stage.

Successfully merging this pull request may close: Consider a way to avoid duplication for entries that share compat data