Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jsonschema 2.5.0 breaks Merger.get_schema() #20

Closed
avian2 opened this issue May 12, 2016 · 3 comments
Closed

jsonschema 2.5.0 breaks Merger.get_schema() #20

avian2 opened this issue May 12, 2016 · 3 comments
Labels

Comments

@avian2
Copy link
Owner

avian2 commented May 12, 2016

jsonschema 2.5.0 introduced a LRU cache in the reference resolver:

python-jsonschema/jsonschema#203

This breaks get_schema() in some subtle ways, since we're modifying the schema while we're walking through it and dereferencing things. RefResolver.resolving() now sometimes returns outdated parts of the structure, which causes problems.

A proper solution would be to build up a new schema in parallel while walking over the old one (which should be read-only), similar to how we do it for walking over instances. This is not a trivial change. Specifically, it's hard to properly retain cross-references in the new schema.

Related comments in the code (several parts regarding handling of references currently work more-or-less by luck):

https://github.com/avian2/jsonmerge/blob/master/jsonmerge/strategies.py#L186
https://github.com/avian2/jsonmerge/blob/master/tests/test_jsonmerge.py#L1248

@avian2 avian2 added the bug label May 12, 2016
@avian2 avian2 closed this as completed in 9345471 May 12, 2017
@RayPlante
Copy link
Contributor

Hey Tomaz (@avian2). I was wondering if you could say something more about "returns outdated parts of the structure", perhaps by giving an example if its not too complicated.

Upgrading to use jsonmerge 1.4.0 broke my application with the introduction of the LocalRefResolver class. I was using Merger.cache_schema() to preload schemas I would need to avoid downloads; however, it seems the use of is_remote_ref() in 1.4.0 prevents the option of resolving a schema from cache. Consequently, my $refs are not getting resolved.

1.3 still works fine, so I'm still using that, but I'm trying to think about the best way to get around this with 1.4. Are able to say more about what was going wrong with RefResolver? Any other enlightenment would be welcome as well. thanks!

@avian2
Copy link
Owner Author

avian2 commented Oct 27, 2017

My memory is getting a bit fuzzy on this issue. I think the problem was that once you dereferenced a $ref, the result got cached. The issue was that jsomerge descended into a $ref during a get_schema() call and changed something there. When it dereferenced the same $ref again, it got the old, unchanged part of the schema back, not the new, modified one.

The simplest way to see what broke is to install the latest jsonschema 2.5.0 and jsonmerge 1.3.0 and run python setup.py test from jsonmerge. You should see some test failures

I chose not to dereference external references at all in jsonschema 1.4.0. This avoided all sorts of corner cases that were not well supported previously. Can you make a minimal example of what doesn't work for you?

@RayPlante
Copy link
Contributor

Hey, thanks for the response--this will get me started. To tell the truth, I'm not exactly sure what's not working, yet; I just know that a couple of my unit tests were failing.

BTW, you might be interested to know that I'm using jsonmerge in my project, oar-metadata, which is part of a project to build a public data repository at the US National Institute of Standards and Technology. I need a way to merge JSON metadata provided by the user with that generated automatically from other sources. jsonmerge allows me to control that merger on a fine-grained level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants